
Optimizing Claude Code Context Window to Avoid Rate Limits on Large Repos

There is no workflow interruption quite as jarring as the "Usage limit exceeded" error from Anthropic in the middle of a complex refactor.

If you are using Claude Code (the CLI agent) on a modern monolith or a large monorepo, you likely burn through your plan's usage caps and token limits significantly faster than peers working on smaller microservice repos. This isn't just about how much you use the tool; it is about how inefficiently the agent navigates your file system.

When an LLM agent creates a plan or searches a codebase, every character it reads consumes token quota. In large repositories, irrelevant context is the silent killer of rate limits.

This guide details exactly how the context window fills up during directory traversal and provides a programmatic solution to identify and eliminate "token hogs" from your agent's view.

The Root Cause: Implicit Context Injection

To understand why your quota evaporates, you must understand how Claude Code interacts with your file system.

When you ask Claude to "Fix the layout bug in the dashboard," the agent does not immediately know where the dashboard code lives. It typically performs the following actions:

  1. File Listing: It runs a variant of ls -R or find to map the directory structure.
  2. Grepping: It searches for keywords like "dashboard" or "layout".
  3. Reading: It opens candidate files to analyze logic.
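Each of those steps bills input tokens before any code is changed. As a rough sketch (using the same ~4 chars/token heuristic the audit script later in this guide relies on), even step 1's file listing has a measurable price on a large tree — the repo shape here is hypothetical:

```javascript
// Estimate the token cost of merely *listing* a directory tree:
// every path the agent reads back is input it pays for.
// Heuristic: ~4 characters per token.
function listingCost(paths) {
  const chars = paths.reduce((sum, p) => sum + p.length + 1, 0); // +1 per newline
  return Math.ceil(chars / 4);
}

// Hypothetical monorepo: 10,000 files with 45-character paths.
const paths = Array.from({ length: 10_000 }, (_, i) =>
  `packages/app/src/components/Component${String(i).padStart(4, '0')}.tsx`
);
console.log(listingCost(paths)); // 115000 — over half a 200k window, spent on file names alone
```

The exact figure depends on your tree, but the point stands: on a big enough repo, the map itself is expensive before a single file is opened.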

The Problem with "Standard" Git Ignores

Claude Code respects your .gitignore file. However, .gitignore is designed for version control, not context optimization.

There are gigabytes of text in your repo that must be version controlled but should never enter an LLM's context window:

  • package-lock.json / yarn.lock: These files can easily exceed 50,000 tokens. If the agent reads this to check a version number, you just burned 25% of a standard context window.
  • Snapshots: Jest __snapshots__ files are dense, repetitive text.
  • SVGs and Assets: Inline SVG paths in source folders are token-dense noise.
  • Database Seeds/Dumps: Large SQL inserts.
  • Minified Vendor Scripts: Legacy libraries often committed to older repos.

If the agent "greps" your repo and hits a 5MB data.json file that isn't gitignored, it attempts to tokenize it. This spikes your usage metric instantly, even if the agent ultimately decides the file is irrelevant.
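Before running a full audit, you can spot-check any single suspect with the same size-based heuristic the script below uses (an estimate, not a real tokenizer count):

```javascript
import fs from 'node:fs';

// Rough token estimate for one file: ~4 characters per token.
function estimateTokens(filePath) {
  const { size } = fs.statSync(filePath);
  return Math.ceil(size / 4);
}

// Example: estimateTokens('package-lock.json') before letting the agent near it.
```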

The Fix: Context Fencing Analysis

To stop hemorrhaging tokens, we need to identify exactly which files are consuming the context window. We cannot optimize what we do not measure.

Below is a specialized Node.js utility script. It scans your repository, respects your existing .gitignore, but calculates the "Token Gravity" of the remaining files. It identifies the files that are technically tracked by Git but are too heavy for an LLM agent.

Prerequisites

  • Node.js (v18 or higher)
  • A project initialized with package.json

Step 1: The Token Audit Script

Create a file named audit-context.mjs in your project root.

import fs from 'node:fs/promises';
import path from 'node:path';
import { glob } from 'glob'; // Requires: npm install glob

/**
 * CONFIGURATION
 * Adjust these thresholds based on your repo size.
 */
const TOKEN_ratio = 4; // Approx 4 chars per token
const FILE_SIZE_WARNING_THRESHOLD = 50 * 1024; // 50KB (~12.5k tokens)
const IGNORE_PATTERNS = [
  'node_modules/**',
  '.git/**',
  'dist/**',
  'coverage/**'
];

// NOTE: .gitignore syntax is only approximately compatible with glob's
// `ignore` option (e.g. a bare `build` entry does not recurse the way
// `build/**` does), so treat the results as an estimate.
async function getGitIgnorePatterns() {
  try {
    const content = await fs.readFile('.gitignore', 'utf-8');
    return content
      .split('\n')
      .map(line => line.trim())
      // Drop blanks, comments, and negation rules ('!pattern'), which
      // glob would otherwise misread as literal ignore patterns.
      .filter(line => line && !line.startsWith('#') && !line.startsWith('!'));
  } catch {
    return []; // no .gitignore present
  }
}

async function analyzeRepo() {
  console.log('🔍 Scanning repository for context bottlenecks...');
  
  const gitIgnores = await getGitIgnorePatterns();
  const allIgnores = [...IGNORE_PATTERNS, ...gitIgnores];

  // Find all files, respecting gitignore
  const files = await glob('**/*', { 
    ignore: allIgnores, 
    nodir: true,
    dot: true 
  });

  console.log(`📂 Found ${files.length} tracked files. Analyzing token density...\n`);

  const heavyFiles = [];
  let totalEstimatedTokens = 0;

  for (const file of files) {
    try {
      const stats = await fs.stat(file);
      
      // Heuristic: 1 token ~= 4 chars (English text)
      // This is an estimation, not an exact tokenizer count
      const estimatedTokens = Math.ceil(stats.size / TOKEN_ratio);
      totalEstimatedTokens += estimatedTokens;

      if (stats.size > FILE_SIZE_WARNING_THRESHOLD) {
        heavyFiles.push({
          file,
          size: (stats.size / 1024).toFixed(2) + ' KB',
          tokens: estimatedTokens,
          rawSize: stats.size
        });
      }
    } catch (err) {
      console.warn(`⚠️ Could not read ${file}`);
    }
  }

  // Sort by heaviest token count
  heavyFiles.sort((a, b) => b.rawSize - a.rawSize);

  console.log('🚨 TOP CONTEXT CONSUMERS (Consider adding to .gitignore or CLAUDE.md):');
  console.table(
    heavyFiles.slice(0, 10).map(f => ({
      File: f.file,
      "Est. Tokens": f.tokens.toLocaleString(),
      Size: f.size
    }))
  );

  console.log('\n📊 SUMMARY');
  console.log(`Total Tracked Context: ~${totalEstimatedTokens.toLocaleString()} tokens`);
  console.log(`(Note: Claude's context window is ~200k. If Total is > 100k, you have a problem.)`);
}

analyzeRepo();

Run this script:

npm install glob
node audit-context.mjs

Step 2: Interpreting the Data

The output will likely surprise you. In a typical legacy monolith, you might see results like this:

File                                 Est. Tokens   Size
package-lock.json                    45,200        180 KB
public/vectors/hero.svg              12,500        50 KB
src/data/mock-users.json             8,000         32 KB
src/__snapshots__/App.test.js.snap   6,500         26 KB

If package-lock.json enters the context, you lose roughly a quarter of the available window in that interaction. If Claude reads four or five of these heavy files during a search, you hit rate limits almost immediately, because Anthropic counts every input token against your tier's quota.
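To make the budget concrete, sum just those four sample files against a ~200k-token window (figures taken from the table above):

```javascript
// Tokens claimed by the four sample files before any source code is read.
const heavy = [45_200, 12_500, 8_000, 6_500];
const total = heavy.reduce((a, b) => a + b, 0);
const share = (total / 200_000) * 100;
console.log(total);                  // 72200
console.log(share.toFixed(1) + '%'); // "36.1%" of the window, gone
```

More than a third of the context is consumed by files that contain no logic the agent needs.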

Step 3: Implementing "Agent Ignores"

Currently, Claude Code relies heavily on .gitignore. To optimize without breaking your actual Git workflow, you have two primary options.

Option A: The CLAUDE.md Exclusion (Soft Limit)

Create a CLAUDE.md file in your root. This file acts as a system prompt for the agent.

# CLAUDE.md

## Context Constraints
When searching or reading files, strictly IGNORE the following directories and files unless explicitly asked to modify them. They are too large for the context window:

- package-lock.json
- yarn.lock
- /public/assets/**
- /src/__snapshots__/**
- /legacy/modules/** (Do not read legacy code)
- **/*.svg

This reduces usage by instructing the agent to prune its own search paths.

Option B: The Temporary Ignore (Hard Limit)

For maximum efficiency (and to strictly prevent the CLI from reading these files), the most robust method is to temporarily hide them from the agent.

Since Claude Code runs locally, we can append these patterns to .gitignore locally (without committing them) or rely on a global ignore file if configured.
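Git also ships a native mechanism for exactly this case: .git/info/exclude holds repo-local ignore patterns that are never committed or shared. I can't promise Claude Code consults this file as part of the ignore chain, so verify in your setup; git check-ignore reports how git itself classifies a path:

```shell
# Append agent-only patterns to the repo-local, never-committed exclude file.
cat >> .git/info/exclude <<'EOF'
package-lock.json
yarn.lock
EOF

# Verify: exits 0 and prints the matching source/pattern if ignored.
git check-ignore -v package-lock.json
```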

However, the cleanest implementation for teams is a dedicated ignore file for agents (Claude Code has no native .claudeignore, so we'll use a conventional .agentignore) plus a script that applies it.

Create a .agentignore file:

# Files tracked by git, but ignored by Agents
package-lock.json
yarn.lock
**/*.snap
**/*.svg
public/
legacy/

First, teach your own tooling (like audit-context.mjs) to respect this list; then deal with Claude Code itself. Since the CLI has no native .agentignore support, the most effective enforcement is either appending the patterns to .gitignore temporarily (without committing the change) or using a wrapper alias.
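If you adopt .agentignore, the audit script from Step 1 can consume it with the same parsing rules it applies to .gitignore, so both tools share one exclusion list — a sketch, assuming you keep that convention:

```javascript
import fs from 'node:fs/promises';

// Read .agentignore exactly the way the audit script reads .gitignore:
// trim lines, drop blanks and comments.
async function getAgentIgnorePatterns(file = '.agentignore') {
  try {
    const content = await fs.readFile(file, 'utf-8');
    return content
      .split('\n')
      .map(line => line.trim())
      .filter(line => line && !line.startsWith('#'));
  } catch {
    return []; // no .agentignore: nothing extra to exclude
  }
}
```

Merge its result into the `allIgnores` array and the audit will report only files an agent could actually see.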

The "Phantom Gitignore" Alias:

Add this function to your .zshrc or .bashrc. It backs up your real gitignore, appends the agent-specific ignores, runs Claude, and then restores the original file.

function claude-safe() {
  # 1. Back up existing gitignore (touch first so cp never fails)
  touch .gitignore
  cp .gitignore .gitignore.bak

  # 2. Append the 'Agent Ignore' list, if one exists
  if [ -f .agentignore ]; then
    # printf is portable here; bash's echo won't expand '\n' without -e
    printf '\n# --- AGENT IGNORES (TEMP) ---\n' >> .gitignore
    cat .agentignore >> .gitignore
  fi

  # 3. Run Claude
  # Using 'command' to bypass any other aliases
  command claude "$@"

  # 4. Restore original gitignore immediately after exit
  mv .gitignore.bak .gitignore
}

Now run claude-safe instead of claude. This physically prevents the CLI tool from indexing massive lockfiles and assets, saving thousands of tokens per prompt.

Deep Dive: Why "Pruning" Beats "Summarizing"

You might ask: "Why not just ask Claude to summarize the files?"

The rate limit calculation happens at the input stage.

  1. Request: "Check dependencies."
  2. Action: Agent reads package-lock.json.
  3. Cost: 45,000 tokens are sent to the API.
  4. Result: The API returns a summary.

By the time the summarization happens, the cost is already incurred. Usage limits are enforced on input volume — tokens per minute (TPM) and total tokens processed.

By implementing the "Phantom Gitignore" or strict CLAUDE.md rules, you prevent step 2 entirely. The agent sees that package-lock.json is ignored and relies on package.json (which is much smaller) or asks you for specific versions, keeping your session lightweight.
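The arithmetic behind that claim, using the 45k-token lockfile figure from the audit and an assumed ~2k-token package.json (the manifest size is hypothetical, for illustration):

```javascript
// Per-read input cost: lockfile vs. the much smaller manifest.
const lockfileTokens = 45_000; // measured in the audit above
const manifestTokens = 2_000;  // assumed typical package.json
console.log(lockfileTokens / manifestTokens); // 22.5 — each lockfile read costs ~22x a manifest read
```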

Common Pitfalls

1. The Monorepo Trap

In a monorepo (e.g., Turborepo/Nx), running Claude from the root is dangerous. It attempts to index every app and package. Fix: Always cd into the specific package directory (e.g., apps/web) before starting the agent. This naturally restricts the file system scope.

2. Ignoring Context You Need

Be careful not to ignore types/ or interfaces/. Even though these can be verbose, they contain the schema definitions the LLM needs to write valid code. If you ignore type definitions, the agent will hallucinate methods.

3. The "Legacy" Folder

Every monolith has a folder of deprecated code. If you don't ignore it, Claude will often "fix" a bug by modifying the deprecated code instead of the active code, wasting tokens and engineer time. Always explicitly ignore legacy paths.

Conclusion

Rate limits in Claude Code are rarely about the complexity of your prompt; they are about the verbosity of your file system.

By auditing your repository for "Token Hogs" and implementing strict context fencing via a phantom .gitignore or CLAUDE.md, you can triple the duration of your coding sessions. Treat your context window like a budget—spend it on logic, not on lockfiles.