LangChain Python Monorepo Architecture Analysis Report

Analysis date: 2026-03-13
Target versions: langchain-core v1.2.18 / langchain v1.2.12
Repository: https://github.com/langchain-ai/langchain

This article is mostly written by Claude Code

1. Project Overview

LangChain is a Python framework for building LLM applications. It lets you combine language models and tools into complex AI pipelines.

Core philosophy: Composability (every component connects like a pipe)
Core values: provider neutrality, type safety, observability, extensibility
Supported LLMs: OpenAI, Anthropic, Google, Groq, Mistral, HuggingFace, Ollama, xAI, DeepSeek, Fireworks, Perplexity, OpenRouter, and others (20+)
Supported vector DBs: Chroma, Qdrant, Pinecone, Weaviate, and others (71+)
License: MIT

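2. Tech Stack

Area	Technology
Language	Python 3.10–3.14
Package management	uv (replaces pip/poetry)
Build backend	hatchling
Linter/formatter	ruff
Type checker	mypy (strict mode)
Testing	pytest, pytest-asyncio, pytest-xdist
Snapshot testing	syrupy
Test recording	vcrpy
Serialization validation	pydantic v2
Observability	LangSmith
Tracing	langsmith SDK

Core Dependencies

Package	Role
pydantic ≥ 2.7.4	Runtime type validation and serialization
langsmith ≥ 0.3.45	Production observability and tracing
tenacity	Retry logic (exponential backoff)
jsonpatch	Incremental stream patches
PyYAML	Config file parsing
langgraph ≥ 1.1.1	Graph-based agent execution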

3. Overall Architecture

╔══════════════════════════════════════════════════════════════════════════╗
║                             LangChain System                             ║
║                                                                          ║
║  ┌──────────────────────────────────────────────────────────────────┐    ║
║  │                        Application layer                         │    ║
║  │  User code / LangGraph graphs / LangServe API server             │    ║
║  └───────────────────────┬──────────────────────────────────────────┘    ║
║                          │ calls                                         ║
║  ┌───────────────────────▼──────────────────────────────────────────┐    ║
║  │                     langchain (main package)                     │    ║
║  │                                                                  │    ║
║  │  agents/ ──┬── chains/         retrievers/ ── memory/            │    ║
║  │            ├── tools/          embeddings/ ── callbacks/         │    ║
║  │            ├── llms/           vectorstores/ ── output_parsers/  │    ║
║  │            └── document_loaders/                                 │    ║
║  └───────────────────────┬──────────────────────────────────────────┘    ║
║                          │ inherits/implements                           ║
║  ┌───────────────────────▼──────────────────────────────────────────┐    ║
║  │                langchain-core (foundation layer)                 │    ║
║  │                                                                  │    ║
║  │  Runnable ──┬── BaseLanguageModel (BaseLLM / BaseChatModel)      │    ║
║  │  Protocol   ├── BaseRetriever                                    │    ║
║  │             ├── BaseTool                                         │    ║
║  │             ├── VectorStore                                      │    ║
║  │             ├── BasePromptTemplate                               │    ║
║  │             ├── BaseOutputParser                                 │    ║
║  │             └── CallbackManager                                  │    ║
║  └──────────┬─────────────────────────────────┬─────────────────────┘    ║
║             │ implements                      │ implements               ║
║             ▼                                 ▼                          ║
║  ┌──────────────────────┐        ┌────────────────────────────────────┐  ║
║  │ Partner packages (15)│        │         External services          │  ║
║  │                      │        │                                    │  ║
║  │  langchain-openai    │        │  OpenAI / Anthropic / Google       │  ║
║  │  langchain-anthropic │  ─────▶│  Groq / Mistral / HuggingFace      │  ║
║  │  langchain-ollama    │        │  Chroma / Qdrant / LangSmith       │  ║
║  │  langchain-chroma    │        │  (including local models)          │  ║
║  │  + 11 more           │        └────────────────────────────────────┘  ║
║  └──────────────────────┘                                                ║
╚══════════════════════════════════════════════════════════════════════════╝

4. Package Structure

langchain/ (monorepo root)
├── libs/
│   ├── core/                # langchain-core v1.2.18: foundational abstractions
│   ├── langchain_v1/        # langchain v1.2.12: active main package
│   ├── langchain/           # langchain-classic v1.0.2: legacy (maintenance mode)
│   ├── partners/            # 15 partner integration packages
│   │   ├── openai/
│   │   ├── anthropic/
│   │   ├── ollama/
│   │   ├── groq/
│   │   ├── mistralai/
│   │   ├── huggingface/
│   │   ├── chroma/
│   │   ├── qdrant/
│   │   └── ... (7 more)
│   ├── text-splitters/      # langchain-text-splitters v1.1.1
│   ├── standard-tests/      # langchain-tests v1.1.5: shared test suite
│   └── model-profiles/      # model configuration profile management
└── .github/
    └── workflows/           # 19+ CI/CD workflows

5. Core Modules (langchain-core)

langchain-core is the foundation layer and keeps its third-party dependencies to a small set (listed in section 11).

Module	Role	Key files
runnables/	Universal invocation protocol	`base.py` (222KB), `passthrough.py`, `branch.py`
language_models/	Base classes for LLMs and chat models	`base.py`, `llms.py`, `chat_models.py`
callbacks/	Event system (tracing, streaming)	`manager.py` (85KB+)
messages/	Message type system	`base.py`, `ai.py`, `human.py`, `tool.py`
prompts/	Prompt templates	`base.py`, `chat.py`, `few_shot.py`
tools/	Base classes for tools	`base.py`, `structured.py`
output_parsers/	Parsing of LLM output	`json.py`, `pydantic.py`, `xml.py`
vectorstores/	Vector store abstraction	`base.py`, `in_memory.py`
retrievers.py	Document retrieval abstraction	`retrievers.py` (single-file module)
embeddings/	Embedding interface	`base.py`, `fake.py`
load/	Serialization/deserialization	`serializable.py`, `dump.py`, `mapping.py`
tracers/	LangSmith integration	`langchain.py`, `log_stream.py`
utils/	Shared utilities (17 modules)	`pydantic.py`, `json_schema.py`

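6. Main Package (langchain)

The langchain package in libs/langchain_v1/ provides the high-level implementations.

Directory	Contents	Role
`agents/`	30 subdirectories	ReAct, OpenAI Functions agents, and others
`chains/`	43 subdirectories	Chain composition and LCEL patterns
`memory/`	21 files	Conversation memory management
`retrievers/`	48 subdirectories	Concrete retriever implementations
`embeddings/`	53 subdirectories	Embedding provider adapters
`chat_models/`	37 subdirectories	Chat model wrappers
`llms/`	86 subdirectories	LLM provider wrappers
`vectorstores/`	71 subdirectories	Vector store implementations
`tools/`	74 subdirectories	Tool implementations
`document_loaders/`	149 subdirectories	Document loader implementations
`evaluation/`	15 subdirectories	LLM evaluation tools
`graphs/`	15 subdirectories	Graph-related features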

7. Partner Packages

All partner packages follow the same structure.

Partner Package List

Package	Provides
langchain-openai (v1.1.11)	ChatOpenAI, OpenAI embeddings, middleware support
langchain-anthropic	Claude models, Tool Use support
langchain-groq	Groq high-speed inference API
langchain-mistralai	Mistral AI models
langchain-huggingface	HuggingFace Hub models
langchain-ollama	Local model execution (llama, gemma, etc.)
langchain-fireworks	Fireworks AI serverless inference
langchain-deepseek	DeepSeek models
langchain-xai	xAI Grok models
langchain-perplexity	Perplexity search-based LLM
langchain-openrouter	OpenRouter multi-provider access
langchain-chroma	Chroma vector database
langchain-qdrant	Qdrant vector database
langchain-exa	Exa neural search
langchain-nomic	Nomic embeddings

Common Partner Package Layout

partners/{provider}/
├── pyproject.toml                  # package metadata
├── langchain_{provider}/
│   ├── __init__.py
│   ├── chat_models.py              # ChatModel implementation
│   ├── llms.py                     # LLM implementation
│   ├── embeddings.py               # embeddings implementation
│   ├── output_parsers/             # custom output parsers
│   ├── tools/                      # tool integrations
│   └── data/                       # model profile JSON
├── tests/
│   ├── unit_tests/
│   └── integration_tests/
├── Makefile
└── uv.lock

8. The Runnable Interface

LangChain's core pattern: a universal invocation protocol shared by every component. (base.py is 222KB)

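from __future__ import annotations  # keeps the illustrative annotations unevaluated

from abc import ABC
from collections.abc import AsyncIterator, Iterator
from typing import Any, Generic, TypeVar

Input = TypeVar("Input")
Output = TypeVar("Output")

# Simplified outline: method bodies and most keyword arguments are omitted.
class Runnable(ABC, Generic[Input, Output]):
    """Universal invocation protocol: the base of every LangChain component."""

    # Synchronous interface
    def invoke(self, input: Input, config: RunnableConfig | None = None) -> Output: ...
    def batch(self, inputs: list[Input]) -> list[Output]: ...
    def stream(self, input: Input) -> Iterator[Output]: ...

    # Asynchronous interface
    async def ainvoke(self, input: Input) -> Output: ...
    async def abatch(self, inputs: list[Input]) -> list[Output]: ...
    async def astream(self, input: Input) -> AsyncIterator[Output]: ...

    # Advanced streaming
    async def astream_log(self, input: Input) -> AsyncIterator[RunLogPatch]: ...
    async def astream_events(self, input: Input) -> AsyncIterator[StreamEvent]: ...

    # Composition operators
    def pipe(self, other: Runnable) -> RunnableSequence: ...
    def __or__(self, other: Runnable) -> RunnableSequence: ...  # the | operator

    # Configuration variants
    def with_config(self, config: RunnableConfig) -> Runnable: ...
    def with_fallbacks(self, fallbacks: list[Runnable]) -> RunnableWithFallbacks: ...
    def with_retry(self, **kwargs: Any) -> RunnableRetry: ...
    # Defined on the RunnableSerializable subclass rather than on Runnable itself
    def configurable_fields(self, **kwargs: Any) -> RunnableConfigurableFields: ...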
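
To make the composition operators concrete, here is a minimal sketch in plain Python. `MiniRunnable` is a hypothetical stand-in written for this article, not LangChain's implementation; it only shows how `invoke`, a default `batch`, and `|` can hang together on one small class.

```python
from __future__ import annotations

from typing import Any, Callable


class MiniRunnable:
    """Hypothetical stand-in for Runnable: wraps a plain function."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def batch(self, values: list[Any]) -> list[Any]:
        # Default batch: call invoke once per input, preserving order.
        return [self.invoke(v) for v in values]

    def pipe(self, other: MiniRunnable) -> MiniRunnable:
        # Composition: feed this step's output into the next step.
        return MiniRunnable(lambda value: other.invoke(self.invoke(value)))

    def __or__(self, other: MiniRunnable) -> MiniRunnable:
        return self.pipe(other)


strip = MiniRunnable(str.strip)
upper = MiniRunnable(str.upper)
exclaim = MiniRunnable(lambda s: s + "!")

chain = strip | upper | exclaim
print(chain.invoke("  hello "))    # HELLO!
print(chain.batch([" a ", "b  "]))  # ['A!', 'B!']
```

Because `|` returns another `MiniRunnable`, a composed chain exposes the same methods as a single step, which is the property the real interface relies on.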

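Runnable Sub-patterns

Class	Role
`RunnableSequence`	Sequential execution (`a | b | c`)
`RunnableParallel`	Parallel fan-out (`{"a": r1, "b": r2}`)
`RunnableMap`	Alias of `RunnableParallel`
`RunnableBranch`	Conditional routing
`RunnablePassthrough`	Passes the input through unchanged
`RunnableWithFallbacks`	Error recovery
`RunnableRetry`	Automatic retry (exponential backoff)
`RunnableConfigurableFields`	Dynamic configuration at runtime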
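
The retry and fallback rows can be illustrated without importing LangChain. The helpers below (`with_retry`, `with_fallbacks`, and the demo functions) are hypothetical names for this sketch. The document lists tenacity as the retry dependency, whereas this version uses a hand-written loop with a doubling delay, so treat it as an illustration of the idea only.

```python
import time
from typing import Any, Callable, Sequence

Func = Callable[[Any], Any]


def with_retry(func: Func, attempts: int = 3, base_delay: float = 0.01) -> Func:
    """Call func again after a failure, doubling the wait each time."""

    def wrapped(value: Any) -> Any:
        for attempt in range(attempts):
            try:
                return func(value)
            except Exception:
                if attempt == attempts - 1:
                    raise  # out of attempts: surface the last error
                time.sleep(base_delay * 2**attempt)

    return wrapped


def with_fallbacks(primary: Func, fallbacks: Sequence[Func]) -> Func:
    """Try primary first, then each fallback in order."""

    def wrapped(value: Any) -> Any:
        errors = []
        for candidate in (primary, *fallbacks):
            try:
                return candidate(value)
            except Exception as exc:
                errors.append(exc)
        raise errors[-1]  # every candidate failed

    return wrapped


calls = {"count": 0}


def flaky(text: str) -> str:
    # Fails on the first two calls, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return text.upper()


def broken(text: str) -> str:
    raise RuntimeError("primary model unavailable")


def backup(text: str) -> str:
    return f"backup:{text}"


retried = with_retry(flaky)
print(retried("ok"))  # OK

robust = with_fallbacks(broken, [backup])
print(robust("ok"))  # backup:ok
```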

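LCEL Chain Example

# LangChain Expression Language (LCEL)
# Assumes retriever, prompt, llm, and output_parser are already constructed.
from langchain_core.runnables import RunnablePassthrough

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | output_parser
)

# Same interface for sync and async
result = chain.invoke("user question")
result = await chain.ainvoke("user question")  # inside an async function

# Streaming
for chunk in chain.stream("user question"):
    print(chunk, end="", flush=True)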

9. Callback System

An event-driven architecture traces and observes LLM execution.

[Runnable.invoke() called]
        │
        ▼
CallbackManager
  ├── on_llm_start(serialized, prompts)     ← LLM call starts
  │       │ streaming
  │       ▼
  ├── on_llm_new_token(token)               ← token received (streaming)
  │       │ done
  │       ▼
  ├── on_llm_end(response)                  ← LLM response complete
  │
  ├── on_tool_start(tool, input)            ← tool execution starts
  ├── on_tool_end(output)                   ← tool execution complete
  │
  ├── on_retriever_start(query)             ← retrieval starts
  ├── on_retriever_end(documents)           ← retrieval complete
  │
  └── → LangChainTracer (LangSmith) / custom handlers

Callback Handler Types

Handler	Purpose
`LangChainTracer`	Production observability and debugging via LangSmith
`StreamingStdOutCallbackHandler`	Streaming output to the console
`AsyncIteratorCallbackHandler`	Asynchronous streaming
`FileCallbackHandler`	Logging to a file
Custom	Subclass `BaseCallbackHandler`
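
The event flow above can be sketched as a small dispatcher. Everything here (`MiniCallbackManager`, `RecordingHandler`, `fake_llm`) is a hypothetical name invented for illustration. The sketch borrows only the hook names from the diagram and simplifies their arguments, so it should not be read as the real `CallbackManager`.

```python
class RecordingHandler:
    """Hypothetical handler that records every event it receives."""

    def __init__(self):
        self.events = []

    def on_llm_start(self, prompt):
        self.events.append(("start", prompt))

    def on_llm_new_token(self, token):
        self.events.append(("token", token))

    def on_llm_end(self, response):
        self.events.append(("end", response))


class MiniCallbackManager:
    """Fans each event out to every handler that defines the hook."""

    def __init__(self, handlers):
        self.handlers = list(handlers)

    def dispatch(self, event_name, *args):
        for handler in self.handlers:
            hook = getattr(handler, event_name, None)
            if hook is not None:
                hook(*args)


def fake_llm(prompt, manager):
    """Pretend model: echoes the prompt back one word at a time."""
    manager.dispatch("on_llm_start", prompt)
    tokens = prompt.split()
    for token in tokens:
        manager.dispatch("on_llm_new_token", token)
    response = " ".join(tokens)
    manager.dispatch("on_llm_end", response)
    return response


handler = RecordingHandler()
result = fake_llm("hello callback world", MiniCallbackManager([handler]))
print([name for name, _ in handler.events])
# ['start', 'token', 'token', 'token', 'end']
```

Handlers that omit a hook are simply skipped, which is why a tracer and a console streamer can be attached to the same run without knowing about each other.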

10. Core Data Structures

Message Type Hierarchy

BaseMessage(Serializable)
├── content: str | list[BaseMessageContentBlock]
│   # BaseMessageContentBlock:
│   # - TextContentBlock    → text
│   # - ImageContentBlock   → image (multimodal)
│   # - FileContentBlock    → file attachment
│   # - ToolUseBlock        → tool call request
│   # - ToolCallBlock       → tool call result
│
├── HumanMessage      ← user input
├── AIMessage         ← model response (includes tool_calls)
│   └── AIMessageChunk  ← streaming fragment
├── SystemMessage     ← system instructions
├── ToolMessage       ← tool execution result
└── FunctionMessage   ← (legacy, deprecated)
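
`AIMessageChunk` exists so that streamed fragments can be accumulated into one message. The sketch below uses a hypothetical `MiniChunk` dataclass to show the idea of merging chunks with `+`. Real chunks also carry tool-call fragments and metadata, which are left out here.

```python
from dataclasses import dataclass


@dataclass
class MiniChunk:
    """Hypothetical stand-in for a streamed message chunk."""

    content: str = ""

    def __add__(self, other: "MiniChunk") -> "MiniChunk":
        # Merging two chunks concatenates their text.
        return MiniChunk(content=self.content + other.content)


def fake_stream():
    """Pretend model stream that yields one fragment at a time."""
    for piece in ["Lang", "Chain ", "streams ", "chunks."]:
        yield MiniChunk(piece)


full = None
for chunk in fake_stream():
    full = chunk if full is None else full + chunk

print(full.content)  # LangChain streams chunks.
```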

LLM Model Hierarchy

BaseLanguageModel (Runnable[LanguageModelInput, str])
├── BaseLLM                       ← text completion (legacy)
│   └── invoke(str) -> str
│
└── BaseChatModel                 ← chat interface (current standard)
    ├── invoke(List[BaseMessage]) -> AIMessage
    ├── stream()  -> Iterator[AIMessageChunk]
    └── with_structured_output()  → Pydantic model output

Tool Hierarchy

BaseTool (RunnableSerializable[ToolInput, ToolOutput])
├── name: str
├── description: str
├── args_schema: Type[BaseModel]         ← Pydantic schema
│
├── StructuredTool                       ← Pydantic-based
├── Tool                                 ← function-based
└── BaseToolkit                          ← bundle of tools (a separate base class, not a BaseTool subclass)
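
The role of `name`, `description`, and `args_schema` can be shown with a small stdlib-only sketch. `MiniTool` is a hypothetical class for illustration: it derives a schema from a function signature with `inspect` and checks the input against it, whereas the real classes use Pydantic models for this. It assumes every parameter is annotated with a plain class such as `int` or `str`.

```python
import inspect
from typing import Any, Callable


class MiniTool:
    """Hypothetical tool wrapper: name, description, and a simple args schema."""

    def __init__(self, func: Callable[..., Any]) -> None:
        self.func = func
        self.name = func.__name__
        self.description = inspect.getdoc(func) or ""
        # Map each parameter name to its annotated type.
        self.args_schema = {
            name: param.annotation
            for name, param in inspect.signature(func).parameters.items()
        }

    def invoke(self, tool_input: dict) -> Any:
        # Validate the input before calling the wrapped function.
        for name, expected in self.args_schema.items():
            if name not in tool_input:
                raise ValueError(f"missing argument: {name}")
            if not isinstance(tool_input[name], expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        return self.func(**tool_input)


def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b


tool = MiniTool(multiply)
print(tool.name, "-", tool.description)  # multiply - Multiply two integers.
print(tool.invoke({"a": 6, "b": 7}))     # 42
```

Exposing the schema as data is what lets a model be told which arguments a tool accepts before it produces a tool call.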

Document

Document(BaseModel)
├── page_content: str          ← document text
└── metadata: dict             ← source, date, etc.

11. Dependencies by Layer

langchain (v1.2.12)
├── langchain-core (≥1.2.10)
├── langgraph (≥1.1.1)
└── pydantic (≥2.7.4)

langchain-core (v1.2.18)
├── langsmith (≥0.3.45)       ← observability
├── tenacity (≥8.1.0)         ← retries
├── pydantic (≥2.7.4)         ← serialization
├── PyYAML                     ← configuration
└── typing-extensions

langchain-classic (v1.0.2)    ← legacy
├── langchain-core
├── langchain-text-splitters
└── SQLAlchemy, requests

langchain-{provider}          ← partner packages
├── langchain-core
├── {provider-sdk}             ← openai≥2.26.0, the anthropic SDK, etc.
└── tiktoken (OpenAI only)

langchain-tests (v1.1.5)      ← test suite
├── langchain-core
├── pytest, pytest-asyncio
└── httpx, vcrpy

12. Build & Development Tooling

Common Makefile Targets per Package

make lock          # regenerate uv.lock
make check-lock    # verify the lockfile is up to date
make test          # run unit tests (no network)
make lint          # lint the code with ruff
make format        # format the code with ruff
make spell_check   # check for typos

Installing Dependencies

# Install the test group
uv sync --group test

# Install all groups
uv sync --all-groups

# Run a specific test
uv run --group test pytest tests/unit_tests/test_specific.py

# Type checking
uv run --group lint mypy .

Local Editable Install Pattern

# In each package's pyproject.toml
[tool.uv.sources]
langchain-core = { path = "../core", editable = true }
langchain-tests = { path = "../standard-tests", editable = true }

Code Quality Tools

Tool	Version	Role
ruff	0.15.0–0.16.0	Linter + formatter
mypy	1.19.1–1.20.0	Static type checking (strict mode)
pytest	-	Test framework
pytest-asyncio	-	Async tests
pytest-xdist	-	Parallel test execution
syrupy	-	Snapshot tests
vcrpy	-	HTTP interception tests

13. CI/CD Infrastructure

Key GitHub Actions Workflows (19+)

Workflow	Role
`_test.yml`	Unit test execution template
`_lint.yml`	Linting workflow template
`_release.yml`	Release orchestration (25KB)
`_test_pydantic.yml`	Pydantic version compatibility checks
`check_diffs.yml`	Detects changed packages (10KB)
`integration_tests.yml`	Scheduled integration tests (12KB)
`pr_lint.yml`	PR title validation (Conventional Commits)
`auto-label-by-package.yml`	Automatically labels PRs by modified package
`refresh_model_profiles.yml`	Automatic model profile refresh
`check_core_versions.yml`	Dependency version checks
`require_issue_link.yml`	Requires an issue link on PRs (10KB)
`v03_api_doc_build.yml`	Automatic API documentation build
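
The table describes `check_diffs.yml` as detecting changed packages. One plausible way to do that, assuming the `libs/` layout shown in section 4, is to map each changed path to its package directory. The function name and the rules below are assumptions for illustration, not the workflow's actual script.

```python
from pathlib import PurePosixPath


def changed_packages(paths):
    """Map changed file paths to the libs/ package directories they touch."""
    packages = set()
    for raw in paths:
        parts = PurePosixPath(raw).parts
        # Only files inside a package directory under libs/ count.
        if len(parts) < 3 or parts[0] != "libs":
            continue
        if parts[1] == "partners":
            # Partner packages live one level deeper: libs/partners/<name>/...
            if len(parts) >= 4:
                packages.add("/".join(parts[:3]))
        else:
            packages.add("/".join(parts[:2]))
    return sorted(packages)


diff = [
    "libs/core/langchain_core/runnables/base.py",
    "libs/partners/openai/langchain_openai/chat_models.py",
    "libs/partners/openai/pyproject.toml",
    ".github/workflows/_test.yml",
    "libs/Makefile",
]
print(changed_packages(diff))
# ['libs/core', 'libs/partners/openai']
```

A CI job could then run lint and tests only for the returned directories instead of for the whole monorepo.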

Commit Conventions (Conventional Commits)

feat(scope): add a new feature
fix(scope): fix a bug
chore(scope): infrastructure changes
docs(scope): documentation changes
refactor(scope): refactoring
test(scope): add or update tests

Examples:
- feat(core): add streaming support to Runnable
- fix(openai): handle rate limit errors gracefully
- chore(anthropic): update infrastructure dependencies
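
A title check of this kind can be sketched with a regular expression. The pattern below covers only the six types listed above and treats the scope as required. Both choices are assumptions made for this sketch, and the rules enforced by `pr_lint.yml` may differ.

```python
import re

ALLOWED_TYPES = ("feat", "fix", "chore", "docs", "refactor", "test")

# type(scope): summary
TITLE_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"\((?P<scope>[a-z0-9_-]+)\): "
    r"(?P<summary>\S.*)$"
)


def parse_title(title):
    """Return the parsed parts of a title, or None if it does not match."""
    match = TITLE_PATTERN.match(title)
    return match.groupdict() if match else None


print(parse_title("feat(core): add streaming support to Runnable"))
# {'type': 'feat', 'scope': 'core', 'summary': 'add streaming support to Runnable'}
print(parse_title("added streaming"))
# None
```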

14. Directory Tree

langchain/
├── libs/
│   ├── core/
│   │   ├── langchain_core/
│   │   │   ├── runnables/          # 18 files (base.py is 222KB)
│   │   │   ├── language_models/
│   │   │   ├── callbacks/          # manager.py is 85KB+
│   │   │   ├── messages/           # 14 files
│   │   │   ├── prompts/            # 14 files
│   │   │   ├── tools/              # 6 files
│   │   │   ├── output_parsers/     # 13 files
│   │   │   ├── vectorstores/
│   │   │   ├── retrievers.py
│   │   │   ├── embeddings/
│   │   │   ├── load/               # 4 files
│   │   │   ├── tracers/
│   │   │   └── utils/              # 17 modules
│   │   └── pyproject.toml
│   │
│   ├── langchain_v1/
│   │   ├── langchain/
│   │   │   ├── agents/             # 30 subdirectories
│   │   │   ├── chains/             # 43 subdirectories
│   │   │   ├── document_loaders/   # 149 subdirectories
│   │   │   ├── llms/               # 86 subdirectories
│   │   │   ├── vectorstores/       # 71 subdirectories
│   │   │   └── tools/              # 74 subdirectories
│   │   └── pyproject.toml
│   │
│   ├── partners/
│   │   ├── openai/
│   │   │   ├── langchain_openai/
│   │   │   │   ├── chat_models.py
│   │   │   │   ├── llms.py
│   │   │   │   ├── embeddings/
│   │   │   │   ├── middleware/
│   │   │   │   └── data/           # model profiles
│   │   │   └── pyproject.toml
│   │   ├── anthropic/
│   │   ├── ollama/
│   │   └── ... (12 more)
│   │
│   ├── standard-tests/
│   │   └── langchain_tests/
│   │       ├── integration_tests/
│   │       │   ├── chat_models.py  # 128KB comprehensive tests
│   │       │   ├── embeddings.py
│   │       │   └── vectorstores.py # 31KB
│   │       └── unit_tests/
│   │
│   ├── text-splitters/
│   ├── model-profiles/
│   └── Makefile
│
└── .github/
    ├── workflows/              # 19+ YAML files
    └── ISSUE_TEMPLATE/

15. Key Concepts

Runnable Protocol

Every LangChain component (models, retrievers, parsers, prompts) implements the Runnable interface. This is the core design choice that lets any component be piped into another with the | operator.

# Every component implements the same interface
prompt    : Runnable[dict, PromptValue]
llm       : Runnable[PromptValue, AIMessage]
parser    : Runnable[AIMessage, str]

# Pipe them together (creates a RunnableSequence)
chain = prompt | llm | parser

LCEL (LangChain Expression Language)

A domain-specific language that overloads Python's | operator so that chains can be composed declaratively.
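
One detail of the operator overloading is worth spelling out: a plain dict can sit on the left of `|` because Python falls back to the right operand's `__ror__` when `dict.__or__` does not accept it. The sketch below shows that mechanism with a hypothetical `Step` class and `coerce` helper. It mirrors the idea of turning a dict into a parallel step and is not LangChain's code.

```python
class Step:
    """Hypothetical pipeline step that wraps a function."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # step | other: run this step, then the coerced right-hand side.
        nxt = coerce(other)
        return Step(lambda value: nxt.invoke(self.invoke(value)))

    def __ror__(self, other):
        # other | step: reached when the left operand (e.g. a dict)
        # does not know how to combine itself with a Step.
        first = coerce(other)
        return Step(lambda value: self.invoke(first.invoke(value)))


def coerce(thing):
    """Turn dicts and plain callables into Step objects."""
    if isinstance(thing, Step):
        return thing
    if isinstance(thing, dict):
        branches = {key: coerce(val) for key, val in thing.items()}
        # A dict fans the same input out to every branch.
        return Step(lambda value: {k: b.invoke(value) for k, b in branches.items()})
    if callable(thing):
        return Step(thing)
    raise TypeError(f"cannot coerce {type(thing).__name__} to Step")


retriever = Step(lambda q: f"docs about {q}")
prompt = Step(lambda d: f"Context: {d['context']} | Question: {d['question']}")

chain = {"context": retriever, "question": lambda q: q} | prompt
print(chain.invoke("LCEL"))
# Context: docs about LCEL | Question: LCEL
```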

Serializable Pattern

Most components inherit from Serializable (built on Pydantic's BaseModel) and support JSON serialization and deserialization. This makes it possible to save a chain and load it back later.
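
The save-and-restore idea can be sketched with dataclasses and a type registry. All names here (`REGISTRY`, `PromptStep`, `SuffixStep`, `dumps`, `loads`, `run`) are hypothetical and defined only for this example. This is an illustration of the round trip under those assumptions, not the `langchain_core.load` implementation.

```python
import json
from dataclasses import asdict, dataclass

REGISTRY = {}


def register(cls):
    """Record the class under its name so it can be rebuilt from JSON."""
    REGISTRY[cls.__name__] = cls
    return cls


@register
@dataclass
class PromptStep:
    template: str

    def invoke(self, value):
        return self.template.format(**value)


@register
@dataclass
class SuffixStep:
    suffix: str

    def invoke(self, value):
        return value + self.suffix


def dumps(steps):
    # Store each step as its type name plus constructor arguments.
    return json.dumps([{"type": type(s).__name__, "kwargs": asdict(s)} for s in steps])


def loads(text):
    # Look each type up in the registry and rebuild the step.
    return [REGISTRY[item["type"]](**item["kwargs"]) for item in json.loads(text)]


def run(steps, value):
    for step in steps:
        value = step.invoke(value)
    return value


chain = [PromptStep("Hello, {name}"), SuffixStep("!")]
restored = loads(dumps(chain))
print(run(restored, {"name": "Ada"}))  # Hello, Ada!
```

Restricting deserialization to registered types is a deliberate choice in this sketch: arbitrary class names in the JSON are rejected with a `KeyError` instead of being imported.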

Standard Test Pattern

langchain-tests provides a standard test suite that every partner package can subclass to run a common set of tests.

# Example partner-package test
from langchain_core.language_models import BaseChatModel
from langchain_openai import ChatOpenAI
from langchain_tests.integration_tests import ChatModelIntegrationTests


class TestChatOpenAI(ChatModelIntegrationTests):
    @property
    def chat_model_class(self) -> type[BaseChatModel]:
        return ChatOpenAI

    @property
    def chat_model_params(self) -> dict:
        return {"model": "gpt-4o-mini"}
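
To show why two properties are enough, here is a stdlib-only sketch of the same inheritance pattern. `ChatModelContractTests`, `EchoModel`, and `run_suite` are hypothetical names for illustration. The real suite is run by pytest and checks far more behavior than these two tests.

```python
class ChatModelContractTests:
    """Hypothetical shared suite: subclasses only say which model to test."""

    @property
    def chat_model_class(self):
        raise NotImplementedError

    @property
    def chat_model_params(self):
        return {}

    def make_model(self):
        return self.chat_model_class(**self.chat_model_params)

    def test_invoke_returns_text(self):
        reply = self.make_model().invoke("ping")
        assert isinstance(reply, str) and reply

    def test_batch_preserves_order(self):
        model = self.make_model()
        assert model.batch(["a", "b"]) == [model.invoke("a"), model.invoke("b")]


class EchoModel:
    """Toy model used in place of a real provider."""

    def __init__(self, prefix):
        self.prefix = prefix

    def invoke(self, text):
        return f"{self.prefix}{text}"

    def batch(self, texts):
        return [self.invoke(t) for t in texts]


class TestEchoModel(ChatModelContractTests):
    @property
    def chat_model_class(self):
        return EchoModel

    @property
    def chat_model_params(self):
        return {"prefix": "echo:"}


def run_suite(suite):
    """Run every inherited test_* method and return the names that passed."""
    names = sorted(n for n in dir(suite) if n.startswith("test_"))
    for name in names:
        getattr(suite, name)()
    return names


print(run_suite(TestEchoModel()))
# ['test_batch_preserves_order', 'test_invoke_returns_text']
```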

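16. How LangChain Differs from a Typical LLM Library

Feature	LangChain	Typical LLM SDK
Provider neutrality	Models can be swapped without changing the code	APIs differ per provider
Component composition	Pipelines built with the `|` operator	Manual function calls
Streaming	One interface across all components	Implementation differs per model
Sync/async	Consistent `invoke`/`ainvoke`	Separate clients
Observability	Built-in LangSmith integration	Must be built separately
Serialization	Entire chains can be saved and restored	None
Tool integration	Standardized tool interface	Manual parsing
Structured output	`with_structured_output()`	Relies on prompt engineering
Error recovery	`with_fallbacks()`, `with_retry()`	Hand-written try/except
Testing	Shared `langchain-tests` suite	Tests written individually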

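Summary: The LangChain Python monorepo is a provider-neutral LLM framework designed around the Runnable protocol. Its three-layer structure (foundational abstractions in langchain-core, high-level implementations in langchain, and concrete integrations in the partner packages) delivers extensibility and consistency together. uv-based monorepo management and the shared test suite (langchain-tests) help keep quality consistent across all 15 partner packages.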