GPT-3 (2020) Paper Notes
Paper notes on GPT-3 covering its core ideas: 175B scaling, in-context learning (zero/one/few-shot), weighted-sampling training data, headline benchmark numbers, and the data-contamination and bias limitations.
A retrospective on building a macOS desktop app (Tauri + Rust + React) with an AI agent. It covers the decision to make markdown files, not the database, the source of truth; the research I did before writing any code; the features I chose not to build; and the macOS permission bug that took the longest to crack. A story about judgment more than code.
Semble is a Python library that splits code into chunks with tree-sitter, fuses Model2Vec static embeddings with BM25 via Reciprocal Rank Fusion (RRF), and applies code-aware reranking, delivering millisecond code search on CPU alone. Where CodeGraph solves the same problem with an AST knowledge graph, Semble solves it through retrieval. This post analyzes the architecture by contrasting the two approaches.
CodeGraph is a TypeScript tool that uses tree-sitter to parse source code, builds a local SQLite knowledge graph of symbols, edges, and files (with FTS5), and exposes that graph via an MCP server to coding agents like Claude Code, Cursor, Codex, and Hermes Agent. Instead of burning tokens navigating a codebase with grep/Read calls, agents get answers from a single codegraph_explore invocation. This post analyzes the architecture that makes that possible.
WeKnora is a Go-based enterprise knowledge framework open-sourced by Tencent. It bundles document parsing, vectorization, hybrid search, and LLM inference into an event-driven chat pipeline, then layers a ReAct Agent and Wiki Mode on top. This analysis covers how a Python docreader gRPC service, 20+ LLM providers, 7 vector DBs, 7 IM channels, multi-tenant RBAC, and Langfuse observability are all handled as swappable components within a single monorepo.