Context Window Management

Beginner, 6 min read

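Claude Code operates within a finite context window, and understanding how to work effectively within that constraint is key to productive sessions. This guide explains how context limits work, presents practical strategies for keeping context lean, and shows how to use subagents and compression techniques to work with codebases far larger than a single context window.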


Understanding Context Limits

The context window is the total amount of text Claude can hold in view at once during a conversation. It includes everything: the system prompt, CLAUDE.md contents, conversation history, files that Claude reads, and the response it generates. As you approach the limit, Claude may lose track of earlier parts of the conversation or be unable to take in additional files.

In practice, context pressure builds gradually. The first prompt in a session has the most context available. Each later exchange adds to the history, and reading large files consumes significant chunks. A single 500-line file can use several thousand tokens, so if Claude reads five such files, a substantial portion of the context window is already occupied.

Recognizing when context is running low matters. Signs include Claude forgetting instructions from earlier in the conversation, giving inconsistent answers, or explicitly warning about context limits. When you notice these signs, apply the strategies covered in the following sections.
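The "several thousand tokens" figure can be checked with back-of-the-envelope arithmetic. The sketch below is only an estimate: the 4-characters-per-token ratio is a common rule of thumb rather than Claude's actual tokenizer, and the 200,000-token window and 40-character average line length are assumed values chosen for illustration.

```python
import math

# Assumption: roughly 4 characters per token (rule of thumb, not a real tokenizer).
CHARS_PER_TOKEN = 4
# Assumption: a 200,000-token context window, used only for illustration.
WINDOW_TOKENS = 200_000


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for a piece of text."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def context_share(token_count: int, window_tokens: int = WINDOW_TOKENS) -> float:
    """Return the fraction of the window that token_count tokens would occupy."""
    return token_count / window_tokens


# A hypothetical 500-line file averaging 40 characters per line (newline included).
sample_file = ("x" * 39 + "\n") * 500
tokens_per_file = estimate_tokens(sample_file)
share_of_five_files = context_share(5 * tokens_per_file)

print(f"one file: ~{tokens_per_file} tokens")
print(f"five files: ~{share_of_five_files:.1%} of the window")
```

Under these assumptions, five such files take about an eighth of the window before any conversation history, system prompt, or generated output is counted.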

Strategies for Efficient Context Management

The most effective strategy is prevention: structure your work to minimize context consumption from the start. Use .claudeignore to exclude directories that Claude should never read, such as node_modules, build outputs, and generated files. This prevents accidental context bloat when Claude searches your project. (If your Claude Code version does not honor a .claudeignore file, Read deny rules under permissions in .claude/settings.json serve the same purpose.)

Break large tasks into focused sub-tasks. Instead of asking Claude to refactor an entire module in one conversation, handle it function by function or file by file. Each focused conversation starts with a fresh context window, giving Claude full capacity to reason about the task at hand.

Use targeted file references instead of letting Claude discover files on its own. When you say 'fix the error in src/utils/parser.ts at line 23,' Claude reads exactly one file. When you say 'find and fix parsing errors,' Claude may read dozens of files before locating the issue, filling the context with irrelevant code along the way.

# .claudeignore - keep these out of context
node_modules/
dist/
build/
coverage/
*.min.js
*.map
.next/
__pycache__/

# Be specific in prompts to minimize file reads
# Instead of: "fix the parsing bug"
# Use: "fix the regex in src/utils/parser.ts parseDate function"

# Start fresh sessions for unrelated tasks
# Session 1: "refactor the auth module"
# Session 2: "update the payment integration"
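To decide what belongs in an exclusion list, it can help to see which files would cost the most context if read. The following is a minimal sketch, not part of Claude Code: `largest_files` is a hypothetical helper, and the token figures use the same assumed 4-characters-per-token rule of thumb, applied to file size in bytes.

```python
from pathlib import Path

# Assumption: about 4 characters per token; real tokenizers vary.
CHARS_PER_TOKEN = 4


def largest_files(root, top_n=5, skip_dirs=("node_modules", ".git")):
    """Return (estimated_tokens, relative_path) pairs for the largest files under root."""
    root = Path(root)
    results = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Skip files inside directories that are already excluded.
        if any(part in skip_dirs for part in rel.parts[:-1]):
            continue
        results.append((path.stat().st_size // CHARS_PER_TOKEN, rel.as_posix()))
    # Largest first; ties broken alphabetically for stable output.
    results.sort(key=lambda item: (-item[0], item[1]))
    return results[:top_n]
```

Calling `largest_files(".")` from a project root lists the heaviest files. Generated bundles, source maps, and lock files tend to surface at the top, and those are natural candidates for exclusion.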

Using Subagents for Large Tasks

Subagents are Claude Code's answer to tasks that exceed a single context window. When Claude spawns a subagent, it delegates a focused sub-task to a separate context and receives back only the result. The parent conversation stays lean while still accomplishing complex, multi-file operations.

Claude may use subagents on its own when it determines that a task benefits from parallel processing, or when it needs to explore parts of the codebase without polluting the main conversation's context. You can also guide this behavior explicitly by structuring prompts to suggest decomposition. For example, asking Claude to 'review each module in src/features/ and summarize the issues' naturally encourages subagent usage: each module review can happen in its own context, and only the summary flows back to the main conversation. This lets you work through a codebase many times larger than the context window.

# Prompts that encourage efficient subagent usage

# Good: naturally decomposes into sub-tasks
claude "for each service in src/services/, check for error handling gaps and report findings"

# Good: parallel exploration with summary
claude "analyze the test coverage of src/features/auth, src/features/payment, and src/features/cart separately, then give me a combined report"

# The /compact command compresses conversation history
# Use it when context is getting full mid-session
/compact
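The reason delegation keeps the parent conversation lean can be illustrated with a toy model. This is a conceptual sketch only and does not reflect how Claude Code implements subagents: `Context` and `review_module` are hypothetical names, and counting TODO markers stands in for a real review.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A toy stand-in for a context window: a list of text entries."""
    entries: list = field(default_factory=list)

    def add(self, text: str) -> None:
        self.entries.append(text)

    def size(self) -> int:
        """Total characters currently held in this context."""
        return sum(len(entry) for entry in self.entries)


def review_module(name: str, source: str) -> str:
    """Hypothetical subagent: read the full source in its own context, return a summary."""
    sub_context = Context()
    sub_context.add(source)  # the full file lands in the subagent's context only
    todo_count = source.count("TODO")
    return f"{name}: {todo_count} TODO marker(s)"


# Made-up module contents for illustration.
modules = {
    "auth": "# TODO: handle expired tokens\n" + "x = 1\n" * 200 + "# TODO: add logging\n",
    "payment": "y = 2\n" * 300,
}

parent = Context()
for module_name, module_source in modules.items():
    # Only the short summary flows back into the parent context.
    parent.add(review_module(module_name, module_source))

total_source_size = sum(len(source) for source in modules.values())
print(parent.entries)
print(f"parent holds {parent.size()} chars; sources total {total_source_size} chars")
```

The parent ends up holding two one-line summaries rather than the full contents of both modules, which is the property that lets the combined work exceed what one context could hold.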

Context Compression Techniques

When you are deep into a productive session and context is running low, compression can extend the conversation without starting over. The /compact command summarizes the conversation history, reducing its token footprint while preserving the key decisions and context Claude needs to continue working.

Time your compression strategically. The best moment is after completing a sub-task but before starting the next one. That way the completed work is summarized and the new task gets the maximum available context. Avoid compressing mid-task, because the summary may drop nuanced details Claude needs to finish the current work.

Another technique is to periodically export important decisions or intermediate results to files. When Claude writes a summary to a file, that information persists outside the context window and can be re-read later if needed. This 'external memory' pattern is especially useful for long refactoring efforts that span multiple sessions.

# Use /compact when context is getting full
/compact

# Export intermediate results to preserve outside context
claude "write the analysis results to docs/refactor-plan.md so we can reference it later"

# Start a new session but reference previous work
claude "read docs/refactor-plan.md and continue with phase 2 of the refactoring"

# Check context usage from inside the session
# (recent Claude Code versions provide the /context command)
/context
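The timing advice above (compress between sub-tasks, not in the middle of one) can be written down as a simple decision rule. This is a sketch under stated assumptions: `should_compact` is a hypothetical helper, the 70% threshold is an arbitrary illustrative value, and the token counts would have to come from whatever usage figures your session reports.

```python
def should_compact(used_tokens: int, window_tokens: int,
                   task_in_progress: bool, threshold: float = 0.7) -> bool:
    """Suggest compaction when context is nearly full and no sub-task is underway.

    threshold is an assumed cutoff (fraction of the window), not an official value.
    """
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    nearly_full = used_tokens / window_tokens >= threshold
    # Compressing mid-task risks losing details needed to finish the work.
    return nearly_full and not task_in_progress


# Context is 75% full and the last sub-task just finished: a good moment.
print(should_compact(150_000, 200_000, task_in_progress=False))
# Same usage, but a sub-task is still underway: wait until it completes.
print(should_compact(150_000, 200_000, task_in_progress=True))
```

The rule deliberately returns False mid-task even when usage is high; in that situation, exporting intermediate results to a file first is the safer step.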
