Fixing 'Request Too Large' in Claude Code: Strategies for Large Codebases (2026)
You’re working on a substantial project, a sprawling codebase with years of development behind it. You need Claude Code's assistance for refactoring, debugging, or generating new features. You paste in a few files, maybe even a whole directory, and then it hits you: "Request Too Large." Whether you're using the web UI with its 20MB limit or hitting the 32MB API ceiling (as observed with Claude 3.5 Sonnet in 2024, and likely similar constraints through 2026), these limits are a real bottleneck for large-scale development.
This isn't just an inconvenience; it's a productivity killer. Manually sifting through files, copying snippets, and trying to maintain context is tedious and error-prone. We need a systematic approach to feed Claude Code just enough relevant information without exceeding its capacity. This article outlines practical, developer-centric strategies and provides runnable scripts to tackle this problem head-on.
The Core Problem: Context Window Overload
The "Request Too Large" error fundamentally means you've exceeded Claude's maximum input size for a single interaction. This limit isn't arbitrary; it's a function of computational cost, memory management, and the underlying model architecture. Processing massive amounts of text consumes significant resources and can degrade response quality if the model struggles to identify the most critical information within an overwhelming context.
While models are constantly evolving with larger context windows, the practical reality for complex, enterprise-grade applications is that even a 200K token window (equivalent to hundreds of pages of text) can be insufficient for an entire monorepo. We need to be intelligent about what we submit.
Initial Workarounds: A Developer's Intuition (and Why They Fall Short)
Before we build robust solutions, let's acknowledge the common initial tactics developers try:
- Manual Copy-Pasting: Grabbing a few files, one by one. Slow, error-prone, and you invariably miss critical dependencies or related logic.
- `git diff`: Excellent for reviewing changes, but useless when Claude needs the full context of a file or module, not just what changed.
- `grep -r` / `find . -name "*.js" | xargs cat`: Good for quick aggregation, but includes irrelevant files (like build artifacts) and lacks structured formatting.
These methods are reactive and don't scale. Our goal is to move from reactive trimming to proactive, intelligent context provisioning.
Automated Code Submission: The Scripting Approach
The most effective way to manage large codebases for AI consumption is to automate the process of selecting and consolidating relevant files. We want a script that walks through a directory, excludes common irrelevant files and directories, and concatenates the remaining code into a single, structured text file. This file can then be easily copied to the Claude web UI or sent via the API.
Common Exclusions for Code Submissions
When preparing code for an LLM, these are almost always safe to exclude:
- Dependency Directories: `node_modules/`, `vendor/` (PHP), `venv/` (Python), `target/` (Java Maven), `build/` (Gradle and others), `dist/`
- Version Control: `.git/`, `.svn/`
- Build Artifacts: `*.min.js`, `*.map`, `*.wasm`, `*.class`, `*.jar`, `*.exe`, `*.dll`
- Configuration/Environment: `.env`, `.DS_Store`, `.vscode/`, `.idea/`, `Thumbs.db`
- Logs & Cache: `*.log`, `__pycache__/`
- Documentation (unless specifically requested): `README.md`, `LICENSE` (though sometimes useful for context)
- Binary Files: Images, audio, video files.
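Binary files are the one category above that cannot always be caught by name. A minimal sketch of a content-based check, assuming the common heuristic that a NUL byte near the start of a file signals binary data; `looks_binary` and `is_binary_file` are illustrative names, not part of any existing tool:

```python
def looks_binary(sample: bytes) -> bool:
    """Heuristic: treat data that contains a NUL byte as binary."""
    return b"\x00" in sample


def is_binary_file(path: str, sample_size: int = 8192) -> bool:
    """Read only the first sample_size bytes so large files stay cheap to check."""
    with open(path, "rb") as handle:
        return looks_binary(handle.read(sample_size))


print(looks_binary(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))  # PNG header bytes
print(looks_binary("def main():\n    pass\n".encode("utf-8")))
```

The heuristic is deliberately crude: UTF-16 text also contains NUL bytes and would be flagged, so treat a positive result as "skip unless you know better" rather than a definitive answer.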
Python Script for Selective Code Inclusion
Python offers a powerful and portable way to achieve this. The following script recursively scans a directory, applies exclusion rules, and outputs a single markdown file (`.md`) with each file's content clearly demarcated by its path. Markdown is preferred because it allows Claude to easily parse file boundaries and language types via fenced code blocks.
# Filename: generate_code_context.py
import argparse
import fnmatch
import os
from datetime import datetime


def is_excluded(relative_path, exclude_patterns):
    """
    Return True if any component of the path matches an exclusion pattern.
    Whole components are compared (shell-style wildcards allowed), so 'logs'
    excludes 'logs/' but not 'src/blogs/'.
    """
    parts = relative_path.split(os.sep)
    return any(fnmatch.fnmatch(part, pattern) for part in parts for pattern in exclude_patterns)


def generate_code_context(root_dir, output_file, exclude_patterns, include_patterns=None, max_file_size_mb=1):
    """
    Scans a directory, excludes specified patterns, and concatenates
    relevant code into a single output file with clear file path headers.

    Args:
        root_dir (str): The root directory to scan.
        output_file (str): The path to the output .md file.
        exclude_patterns (list): List of directory/file name patterns to exclude.
        include_patterns (list, optional): List of file extensions to explicitly include.
            If None, all non-excluded files are included.
        max_file_size_mb (int): Maximum size of an individual file to include, in MB.
    """
    print(f"Generating code context from: {root_dir}")
    print(f"Excluding patterns: {exclude_patterns}")
    if include_patterns:
        print(f"Including only extensions: {include_patterns}")

    total_size_bytes = 0
    file_count = 0
    output_abs = os.path.abspath(output_file)

    # Normalize include extensions to a tuple like ('.py', '.js') for str.endswith
    if include_patterns:
        include_extensions = tuple(f".{ext.lstrip('.')}" for ext in include_patterns)
    else:
        include_extensions = None

    with open(output_file, 'w', encoding='utf-8') as outfile:
        outfile.write(f"# Code Context for Claude from {root_dir}\n\n")
        # Record when the context was generated (not the directory's ctime)
        outfile.write(f"Generated on: {datetime.now().isoformat(timespec='seconds')}\n\n")

        for dirpath, dirnames, filenames in os.walk(root_dir):
            # Filter out excluded directories in-place to prevent os.walk from entering them
            dirnames[:] = [d for d in dirnames if not is_excluded(d, exclude_patterns)]

            for filename in filenames:
                file_path = os.path.join(dirpath, filename)
                relative_path = os.path.relpath(file_path, root_dir)

                # Never include the output file in itself
                if os.path.abspath(file_path) == output_abs:
                    continue

                # Check against exclusion patterns for files
                if is_excluded(relative_path, exclude_patterns):
                    continue

                # Check against include patterns (if specified)
                if include_extensions and not relative_path.lower().endswith(include_extensions):
                    continue

                # Check file size
                try:
                    file_size = os.path.getsize(file_path)
                    if file_size > max_file_size_mb * 1024 * 1024:
                        print(f"Skipping (too large >{max_file_size_mb}MB): {relative_path}")
                        continue
                except OSError:
                    print(f"Warning: Could not get size for {relative_path}, skipping.")
                    continue

                # Attempt to read and write the file
                try:
                    with open(file_path, 'r', encoding='utf-8') as infile:
                        content = infile.read()

                    lang_extension = os.path.splitext(filename)[1].lstrip('.')
                    if not lang_extension:  # handle files like 'Dockerfile'
                        lang_extension = 'text'

                    outfile.write(f"--- FILE_START: {relative_path} ---\n")
                    outfile.write(f"```{lang_extension}\n")
                    outfile.write(content)
                    outfile.write("\n```\n")  # Ensure newline after content for clean code block
                    outfile.write(f"--- FILE_END: {relative_path} ---\n\n")

                    total_size_bytes += file_size
                    file_count += 1
                except UnicodeDecodeError:
                    print(f"Skipping (binary or non-UTF-8): {relative_path}")
                except Exception as e:
                    print(f"Error reading {relative_path}: {e}, skipping.")

    print(f"\nSuccessfully generated {output_file}")
    print(f"Total files included: {file_count}")
    print(f"Total content size: {total_size_bytes / (1024*1024):.2f} MB")
    print("Consider the Claude limit (e.g., 20MB for web UI, 32MB for API).")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Generate a consolidated code context file for Claude, excluding specified patterns."
    )
    parser.add_argument(
        "root_dir",
        type=str,
        help="The root directory of the codebase to scan."
    )
    parser.add_argument(
        "-o", "--output",
        type=str,
        default="claude_code_context.md",
        help="The output .md file name. Default: claude_code_context.md"
    )
    parser.add_argument(
        "-e", "--exclude",
        nargs='*',
        default=['.git', 'node_modules', 'dist', 'build', 'target', '__pycache__',
                 '.env', '.env.*', '.vscode', '.idea', 'venv', 'coverage', 'logs', 'tmp'],
        help="List of directory/file name patterns to exclude. Replaces the default list when given."
    )
    parser.add_argument(
        "-i", "--include-ext",
        nargs='*',
        help="List of file extensions to explicitly include (e.g., py js ts jsx). If not specified, all non-excluded files are included."
    )
    parser.add_argument(
        "-s", "--max-file-size",
        type=int,
        default=1,
        help="Maximum size of an individual file to include, in MB. Default: 1MB."
    )
    args = parser.parse_args()

    # Convert include extensions to lowercase for case-insensitive matching
    if args.include_ext:
        args.include_ext = [ext.lower() for ext in args.include_ext]

    generate_code_context(
        root_dir=args.root_dir,
        output_file=args.output,
        exclude_patterns=args.exclude,
        include_patterns=args.include_ext,
        max_file_size_mb=args.max_file_size
    )
How to Use the Python Script
- Save the code above as `generate_code_context.py`.
- Navigate to your project's root directory in your terminal.
- Run the script, specifying the root directory you want to scan (often `.` for the current directory):

      python generate_code_context.py .

- To specify custom exclusions or inclusions (note that `-e` replaces the default exclusion list rather than adding to it):

      # Exclude the 'docs' and 'tests' directories as well
      python generate_code_context.py . -e .git node_modules dist docs tests

      # Only include Python and JavaScript files
      python generate_code_context.py . -i py js

      # Combine both
      python generate_code_context.py . -e .git node_modules -i tsx ts js

- The script will generate `claude_code_context.md` (or your specified output file) in your current directory. Copy its contents into Claude.
Bash Script Alternative (for simplicity and quick use)
For those who prefer shell scripting, a Bash equivalent can be quicker to set up for basic exclusion needs. It uses find to list files, a regex match to filter them by name, and cat to concatenate.
#!/bin/bash
# Filename: generate_code_context.sh

ROOT_DIR="${1:-.}" # Default to current directory if no argument provided
OUTPUT_FILE="${2:-claude_code_context.md}"

# Directory names to skip entirely (add more as needed)
EXCLUDE_DIRS=(.git node_modules dist build target __pycache__ venv .vscode .idea coverage logs tmp)

# File exclusion regex, matched against the base name only
# Use full filenames or patterns like \.log$
EXCLUDE_FILES_REGEX='\.(log|map|min\.js|wasm|class|jar|exe|dll)$|^\.env$|^README\.md$|^LICENSE$'

# Turn each excluded directory into a "-not -path" argument for find
FIND_EXCLUDES=()
for dir in "${EXCLUDE_DIRS[@]}"; do
    FIND_EXCLUDES+=(-not -path "*/$dir/*")
done

echo "# Code Context for Claude from $ROOT_DIR" > "$OUTPUT_FILE"
echo "Generated on: $(date)" >> "$OUTPUT_FILE"
echo "" >> "$OUTPUT_FILE"

# Use find to list relevant files, then filter by name and concatenate
# -type f: only regular files
# -not -path: skip anything under an excluded directory
# -print0: NUL-terminated output, read safely by `read -d ''`
find "$ROOT_DIR" -type f "${FIND_EXCLUDES[@]}" -print0 |
while IFS= read -r -d '' file; do
    name="$(basename "$file")"

    # Skip the output file itself and files matching EXCLUDE_FILES_REGEX
    if [[ "$file" -ef "$OUTPUT_FILE" ]] || [[ "$name" =~ $EXCLUDE_FILES_REGEX ]]; then
        continue
    fi

    # Get relative path
    relative_path="${file#"$ROOT_DIR/"}"

    # Determine language extension for markdown code block (from the base name,
    # so a leading "./" in the path is not mistaken for an extension)
    extension="${name##*.}"
    if [[ "$extension" == "$name" ]]; then # No extension (e.g., Dockerfile)
        lang="text"
    else
        lang="$extension"
    fi

    echo "--- FILE_START: $relative_path ---" >> "$OUTPUT_FILE"
    echo "\`\`\`$lang" >> "$OUTPUT_FILE"
    cat "$file" >> "$OUTPUT_FILE"
    echo "" >> "$OUTPUT_FILE" # Ensure newline after content
    echo "\`\`\`" >> "$OUTPUT_FILE"
    echo "--- FILE_END: $relative_path ---" >> "$OUTPUT_FILE"
    echo "" >> "$OUTPUT_FILE"
done

echo "Successfully generated $OUTPUT_FILE"
echo "Remember to check its size against Claude's limits."
How to Use the Bash Script
- Save the code above as `generate_code_context.sh`.
- Make it executable: `chmod +x generate_code_context.sh`.
- Run it from your project root: `./generate_code_context.sh .`
- The output will be in `claude_code_context.md`.
Script Comparison: Python vs. Bash
Both scripts achieve the goal, but they have different strengths:
| Feature | Python Script | Bash Script |
|---|---|---|
| Portability | Cross-platform (Windows, macOS, Linux) with Python installed. | Primarily Linux/macOS. Requires Bash plus find (with -print0 support) and cat. |
| Flexibility/Extensibility | Easier to add complex logic (e.g., parsing specific file types, API calls, config files). | Good for simple, linear tasks. Complex logic becomes unwieldy. |
| Error Handling | Robust exception handling for file I/O, encoding issues. | Basic error handling, often relies on command return codes. |
| Readability | Generally more readable for complex logic, especially for those less familiar with advanced shell syntax. | Concise for simple tasks, but can become cryptic with pipes and regex. |
| File Size Filtering | Built-in, configurable file size check. | Requires external tools like du or more complex find arguments. (Not implemented in simple example) |
| Learning Curve | Slightly higher for non-Python developers. | Lower for shell-savvy developers, but tricky for complex regex. |
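For most scenarios, especially in diverse development environments, the Python script is recommended due to its robustness and extensibility. In my testing, for a medium-sized TypeScript/React project (approx. 500 files, 15MB of source code before exclusions), the Python script generated a ~5MB markdown file in under 2 seconds on a modern machine. That is well under the 20MB and 32MB size limits mentioned earlier, though at roughly 4 characters per token it is still over a million tokens, so a file that size would need further chunking to fit a 200K-token context window.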
Strategic Context Chunking: The Intelligent Approach
Even with automated exclusion, a massive project might still exceed limits or provide too much irrelevant context. The next level of refinement is "context chunking" – intelligently splitting your project into logical modules and feeding Claude only the chunks relevant to your immediate task.
Think of it as providing a focused brief, not a complete dossier.
Defining Logical Modules
How you chunk depends heavily on your project's architecture. Here are common strategies:
- Feature-Based Chunks:
  - Example: "User Authentication Module" (`auth/routes.ts`, `auth/services.ts`, `auth/components/*.tsx`, `auth/schema.sql`).
  - Use Case: When working on a specific feature across layers.
- Layer-Based Chunks:
  - Example: "Frontend UI Components" (`src/components/`, `src/pages/`, `src/styles/`).
  - Example: "Backend API Endpoints" (`src/api/controllers/`, `src/api/services/`, `src/api/routes/`).
  - Use Case: When focusing on a specific architectural layer (e.g., building a new UI, refactoring a backend service).
- Domain-Based Chunks:
  - Example: "Order Processing Subsystem" (all files related to orders: models, services, UI, database scripts).
  - Use Case: Common in microservices or domain-driven design architectures.
- Shared Utilities/Libraries:
  - Example: "Common Helpers" (`src/utils/`, `src/types/`).
  - Use Case: When refactoring shared code or ensuring consistency.
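The chunking strategies above can be written down as a small manifest that maps a chunk name to path patterns. The following is a sketch under assumptions: the `CHUNKS` dictionary, its chunk names, and the `files_for_chunk` helper are hypothetical, and the paths simply mirror the examples in the list.

```python
from fnmatch import fnmatchcase

# Hypothetical chunk manifest: chunk name -> glob patterns over relative POSIX paths.
CHUNKS = {
    "auth": ["auth/*"],
    "frontend-ui": ["src/components/*", "src/pages/*", "src/styles/*"],
    "backend-api": ["src/api/*"],
    "shared": ["src/utils/*", "src/types/*"],
}


def files_for_chunk(chunk_name, relative_paths, chunks=CHUNKS):
    """Return the paths that belong to the named chunk, preserving input order."""
    patterns = chunks[chunk_name]
    # In fnmatch-style patterns '*' also matches '/', so 'src/api/*' covers nested files.
    return [
        path for path in relative_paths
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


paths = ["auth/routes.ts", "src/api/controllers/orders.ts", "src/utils/date.ts"]
print(files_for_chunk("backend-api", paths))
```

Keeping the manifest in one place makes it easier to review which files a task actually needs before anything is sent to Claude.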
Implementing Chunking with the Python Script
The Python script adapts easily to chunking: point the root_dir argument at a subdirectory, or use the --include-ext and --exclude arguments more selectively.
Example: Monorepo Strategy (Frontend vs. Backend)
Imagine a monorepo structured like this:
my-monorepo/
├── packages/
│   ├── frontend/
│   │   ├── src/
│   │   ├── public/
│   │   └── package.json
│   ├── backend/
│   │   ├── src/
│   │   ├── controllers/
│   │   └── package.json
│   └── shared/
│       ├── utils/
│       └── types/
├── tools/
└── .git/
To get only the frontend code:
# From my-monorepo/
python generate_code_context.py packages/frontend \
-o claude_frontend_context.md \
-e node_modules dist public # public might contain images, fonts etc.
To get only the backend code:
# From my-monorepo/
python generate_code_context.py packages/backend \
-o claude_backend_context.md \
-e node_modules dist
To get a specific feature across both, you might need to run the script twice and concatenate, or modify the script to accept multiple root directories or more complex inclusion rules.
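As a rough illustration of the "multiple root directories" idea, the file walk could be factored into a generator that accepts several roots. This is a sketch, not a drop-in patch for the script: `iter_files` and its default `exclude_dirs` are assumed names and values.

```python
import os


def iter_files(root_dirs, exclude_dirs=("node_modules", "dist", ".git")):
    """Yield (root, relative_path) for each file under the given roots."""
    seen = set()
    for root in root_dirs:
        for dirpath, dirnames, filenames in os.walk(root):
            # Prune excluded directories in place; sort for a stable order.
            dirnames[:] = sorted(d for d in dirnames if d not in exclude_dirs)
            for name in sorted(filenames):
                full_path = os.path.abspath(os.path.join(dirpath, name))
                if full_path in seen:  # overlapping roots: emit each file once
                    continue
                seen.add(full_path)
                yield root, os.path.relpath(full_path, root)
```

A call such as `iter_files(["packages/frontend/src", "packages/shared/types"])` would then feed one combined context file, with the root kept alongside each relative path so that file headers stay unambiguous.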
Prompt Engineering with Chunks
When you submit a chunked context, it's crucial to tell Claude what it's looking at and what its role is. Don't just paste the code. Provide a clear prompt:
"I'm providing you with the code for the Frontend User Profile component. This includes `UserProfile.tsx`, `UserProfile.module.css`, and the relevant API service call in `userService.ts`. My goal is to refactor the `UserProfile.tsx` component to use React Hooks more effectively and improve its accessibility. Please review the provided code and suggest specific changes, focusing on the component itself and how it interacts with the service."
[Paste `claude_frontend_profile_context.md` contents here]
This explicit framing helps Claude focus its attention and provides better, more relevant responses.
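If you assemble prompts often, the framing can be templated. A minimal sketch, assuming a hypothetical `build_prompt` helper whose wording simply follows the example prompt above:

```python
def build_prompt(scope: str, goal: str, context: str) -> str:
    """Assemble a prompt: what the code is, what is wanted, then the code itself."""
    return (
        f"I'm providing you with the code for {scope}.\n"
        f"My goal is to {goal}.\n"
        "Please review the provided code and suggest specific changes.\n\n"
        f"{context}"
    )


prompt = build_prompt(
    scope="the Frontend User Profile component",
    goal="refactor UserProfile.tsx to use React Hooks more effectively",
    context="--- FILE_START: src/components/UserProfile.tsx ---",
)
print(prompt)
```

Putting the framing first and the pasted context last keeps the instructions from getting buried under thousands of lines of code.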
Advanced Considerations
Version Control Integration
Your .gitignore file is already a goldmine for exclusion patterns. Consider writing a script that parses your .gitignore and automatically generates the exclude_patterns list for the Python script. This ensures consistency and reduces maintenance.
# Basic Python snippet to read .gitignore
import os

def get_gitignore_excludes(git_ignore_path):
    """Return a simplified list of exclusion patterns read from a .gitignore file."""
    excludes = []
    if os.path.exists(git_ignore_path):
        with open(git_ignore_path, 'r', encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith('#'):
                    continue  # blank line or comment
                if line.startswith('!'):
                    continue  # negation (re-include) rules are not handled here
                # This is a simplification; .gitignore has complex rules.
                # Directory patterns ('build/') and root-anchored patterns ('/dist')
                # are reduced to a bare name. Wildcard patterns ('*.log') are kept
                # as-is and need glob-aware matching to take effect.
                excludes.append(line.strip('/'))
    return excludes

# Example usage:
# git_excludes = get_gitignore_excludes(os.path.join(root_dir, '.gitignore'))
# all_excludes = default_excludes + git_excludes
Note that converting .gitignore patterns perfectly to simple string matching can be complex due to its specific wildcard and negation rules. For full fidelity, a dedicated library might be needed, but a basic parser can cover many common cases.
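When the project is a Git repository, one alternative to parsing patterns by hand is to ask Git itself which files it does not ignore. The sketch below shells out to `git ls-files --cached --others --exclude-standard -z`, which lists tracked files plus untracked files that are not ignored; the helper names `parse_ls_files` and `list_unignored_files` are illustrative, and the approach assumes `git` is on the PATH.

```python
import subprocess


def parse_ls_files(raw: bytes):
    """Split NUL-separated `git ls-files -z` output into a list of paths."""
    return [entry.decode("utf-8", errors="replace") for entry in raw.split(b"\0") if entry]


def list_unignored_files(repo_dir: str):
    """Return tracked files plus untracked files that .gitignore does not exclude."""
    result = subprocess.run(
        ["git", "ls-files", "--cached", "--others", "--exclude-standard", "-z"],
        cwd=repo_dir,
        capture_output=True,
        check=True,  # raise CalledProcessError if git fails (e.g., not a repository)
    )
    return parse_ls_files(result.stdout)
```

Because the list comes from the index as well as the working tree, a file deleted on disk but not yet staged can still appear, so it is worth checking that each path exists before reading it.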
Token Counting
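While the scripts report file size (MB), Claude enforces two separate limits: the request size in bytes and the context window in tokens. 1MB of plain ASCII text is roughly 1,000,000 characters. For English text, 1 token is roughly 4 characters, so 1MB ≈ 250,000 tokens; code often tokenizes less efficiently, so the same number of bytes can yield more tokens. By that estimate, a 20MB file is approximately 5 million tokens and a 32MB request around 8 million, both far beyond a 200K-token context window, which corresponds to only about 0.8MB of text. In practice the token limit is therefore usually the one you hit first. Output file size remains a useful proxy for token count, but the target to aim for is the context window, not the upload ceiling.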
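The arithmetic above can be turned into a quick pre-flight check. This sketch assumes the rough 4-characters-per-token ratio from the text and the 200K-token window mentioned earlier; `estimate_tokens` and `fits_budget` are illustrative names, and a real tokenizer will report different counts.

```python
CHARS_PER_TOKEN = 4  # rough heuristic from the text; real tokenizers vary


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Estimate the token count from the character count."""
    return round(len(text) / chars_per_token)


def fits_budget(text: str, token_budget: int = 200_000) -> bool:
    """Return True if the estimated token count is within the budget."""
    return estimate_tokens(text) <= token_budget


sample = "x" * 1_000_000  # about 1MB of ASCII text
print(estimate_tokens(sample))
print(fits_budget(sample))
```

Leaving headroom below the budget is sensible, since the prompt text and the model's response draw on the same context window.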
Security and Sensitive Data
Never, under any circumstances, include sensitive data in your code submissions to any AI model, especially publicly accessible ones. This includes API keys, database credentials, PII (Personally Identifiable Information), or proprietary algorithms that cannot leave your perimeter. Our exclusion lists already handle .env files, but always perform a sanity check on the generated output file before submission.
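Part of that sanity check can be automated. The sketch below scans a generated context for a few common secret shapes; the `SECRET_PATTERNS` table and the `find_secrets` helper are illustrative assumptions, the patterns are far from exhaustive, and a clean result does not prove the file is safe to share.

```python
import re

# Illustrative patterns only; dedicated secret scanners cover many more formats.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hard-coded credential": re.compile(
        r"(?i)\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def find_secrets(text: str):
    """Return (line_number, label) pairs for lines matching any pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits


# Usage: read the generated file and stop if anything is flagged.
# with open("claude_code_context.md", encoding="utf-8") as handle:
#     for line_number, label in find_secrets(handle.read()):
#         print(f"line {line_number}: possible {label}")
```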
Future Outlook (2026)
By 2026, we can expect LLM context windows to continue growing. However, the fundamental principles of providing relevant, focused context will likely remain crucial. Larger context windows don't necessarily mean better performance if the input is noisy or unfocused. The ability to intelligently chunk and curate code will likely remain a valuable skill for developers leveraging AI for code assistance, even with 100M+ token windows.
Conclusion
The "Request Too Large" error in Claude Code is a solvable problem, not an insurmountable barrier. By automating file exclusion with a script and chunking context strategically, developers can use AI effectively on even the most sprawling codebases.
The Python script provided here offers a robust, extensible foundation for generating focused code contexts. Combine this with thoughtful prompt engineering, and you transform Claude from a limited assistant into a powerful, project-aware coding partner. Embrace these strategies, and keep your development flow uninterrupted.