All Posts

Semble Architecture Analysis: How an Agent-Oriented RAG Solves Code Search with Static Embeddings

Semble is a Python library that splits code into chunks with tree-sitter, fuses Model2Vec static-embedding results with BM25 results via Reciprocal Rank Fusion (RRF), and applies code-aware reranking, delivering millisecond code search on CPU alone. Where CodeGraph solves the same problem with an AST knowledge graph, Semble solves it through retrieval. This post analyzes the architecture by contrasting the two approaches.
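
As a rough illustration of the fusion step mentioned above, here is a minimal sketch of generic Reciprocal Rank Fusion over two ranked lists. The function name `rrf_fuse`, the chunk IDs, and the constant `k=60` (a commonly used default) are assumptions for illustration, not Semble's actual API or settings.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of IDs with Reciprocal Rank Fusion (hypothetical helper).

    Each ID receives 1 / (k + rank) from every list it appears in;
    IDs are returned from highest to lowest fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Break ties by ID so the output order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Illustrative chunk IDs only: one ranking from embedding search, one from BM25.
dense = ["utils.py::parse", "cli.py::main", "io.py::load"]
bm25 = ["cli.py::main", "io.py::load", "utils.py::parse"]
fused = rrf_fuse([dense, bm25])
print(fused)
```

Because RRF uses only rank positions, it can combine cosine similarities and BM25 scores without normalizing their very different score scales.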

CodeGraph Architecture Analysis: How Is the Code Intelligence Layer Beneath Coding Agents Built?

CodeGraph is a TypeScript tool that uses tree-sitter to parse source code, builds a local SQLite knowledge graph of symbols, edges, and files (with FTS5), and exposes that graph via an MCP server to coding agents like Claude Code, Cursor, Codex, and Hermes Agent. Instead of burning tokens navigating a codebase with grep/Read calls, agents get answers from a single codegraph_explore invocation. This post analyzes the architecture that makes that possible.

WeKnora Architecture Analysis: What Does a Framework Look Like When It Combines RAG, ReAct Agent, and Wiki Mode?

WeKnora is a Go-based enterprise knowledge framework open-sourced by Tencent. It bundles document parsing, vectorization, hybrid search, and LLM inference into an event-driven chat pipeline, then layers a ReAct Agent and Wiki Mode on top. This analysis covers how a Python docreader gRPC service, 20+ LLM providers, 7 vector DBs, 7 IM channels, multi-tenant RBAC, and Langfuse observability are all handled as swappable components within a single monorepo.

Dify Project Analysis: How Far Has This LLM App Platform Been Productized?

Dify is an open-source project that brings LLM app development, a workflow canvas, a RAG pipeline, model/tool plugins, MCP, and operational observability together into a single productized platform. This post analyzes its architecture through the lens of the Flask API, Graphon workflow runtime, Celery workers, Next.js console, plugin daemon, and vector backend structure.