A few days ago I cleaned up 200GB from my Mac and deliberately kept my Claude Code history at 2.3GB. Four days later it had grown to 9.5GB. Here’s how I pruned it back to 1.1GB without breaking conversation continuity.
The growth also made ccs - my fuzzy finder for Claude conversations - effectively unusable, since parsing 3GB JSONL files took forever. I’ve since added --max-size filtering to ccs (it defaults to 1GB), but that only hides the problem. Better to prune the files themselves.
The problem
Claude Code stores conversations in ~/.claude/projects/. Each project gets a directory, and each conversation is a JSONL file where every line is a JSON object representing a message, tool result, or progress update.
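The schema isn’t documented, but the lines that matter here look roughly like this - heavily abbreviated and purely illustrative; the type, data.type, message, thinking, and toolUseResult fields are the ones the pruning below keys on:
{"type":"assistant","message":{"content":[{"type":"thinking","thinking":"..."},{"type":"text","text":"..."}]}}
{"type":"progress","data":{"type":"agent_progress","message":{"...":"..."},"normalizedMessages":[{"...":"..."}]}}
{"type":"user","toolUseResult":"...tool output, sometimes tens of MB..."}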
du -sh ~/.claude/projects/
# 9.5G
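A per-project breakdown shows where the bulk sits (sizes in MB so a plain numeric sort is enough):
du -sm ~/.claude/projects/*/ | sort -rn | head -5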
The largest files were 3.3GB and 3.2GB each - single conversations from a complex debugging session involving multiple subagents.
What takes up space
I expected the conversation text to be the culprit. It wasn’t. Looking at a 3.3GB file:
# Find the largest lines
cat file.jsonl | awk '{ print length, NR }' | sort -rn | head -5
171825800 2949
86652311 2979
86593724 2985
...
A single line was 171MB. Inspecting it:
sed -n '2949p' file.jsonl | jq -c 'to_entries | map({key: .key, size: (.value | tostring | length)}) | sort_by(-.size) | .[0:3]'
[{"key":"data","size":171825000}]
The data field in progress messages was enormous. Drilling deeper:
sed -n '2949p' file.jsonl | jq -c '.data | to_entries | map({key: .key, size: (.value | tostring | length)}) | sort_by(-.size)'
[{"key":"normalizedMessages","size":83479687},{"key":"message","size":83477716},...]
Two fields - normalizedMessages and message - were ~83MB each and contained nearly identical data. The normalizedMessages field is a duplicate of message used internally.
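To put a number on what that duplication costs across the whole file, summing the field’s serialized size line by line is enough - a rough sketch, reusing the file.jsonl placeholder from above:
jq '.data.normalizedMessages? // empty | tostring | length' file.jsonl |
  awk '{sum += $1} END {printf "%.1f MB spent on normalizedMessages\n", sum / 1024 / 1024}'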
The main space hogs:
- agent_progress messages with duplicated normalizedMessages and message fields
- toolUseResult fields in subagent files (one was 83MB)
- bash_progress output from commands that produced lots of output
- thinking blocks from extended reasoning
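To see how those categories stack up in a particular file, a rough per-type byte count helps (assuming a top-level type field, falling back to "unknown"; re-serialized lengths won’t exactly match the on-disk line lengths, but they’re close enough to rank by):
jq -r '[.type // "unknown", (tostring | length)] | @tsv' file.jsonl |
  awk '{bytes[$1] += $2} END {for (t in bytes) printf "%14d  %s\n", bytes[t], t}' |
  sort -rn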
The solution
A jq script that:
- Removes normalizedMessages entirely (it’s a duplicate)
- Truncates large message fields in agent_progress
- Truncates large toolUseResult fields
- Truncates large bash output
- Truncates old thinking blocks
#!/bin/bash
# prune-history.sh - Prune large Claude Code history files
set -e
FILE="$1"
if [ -z "$FILE" ] || [ ! -f "$FILE" ]; then
  echo "Usage: $0 <file.jsonl>"
  exit 1
fi
echo "Processing: $FILE"
echo "Original size: $(ls -lh "$FILE" | awk '{print $5}')"
echo "Original lines: $(wc -l < "$FILE")"
cp "$FILE" "${FILE}.bak"
jq -c '
  if .type == "progress" then
    if .data.type == "agent_progress" then
      # normalizedMessages duplicates message, so drop it outright
      del(.data.normalizedMessages) |
      if (.data.message | tostring | length) > 10000 then
        .data.message = "[truncated - was \(.data.message | tostring | length) bytes]"
      else .
      end
    elif .data.type == "bash_progress" then
      # keep only the first 1000 characters of oversized command output
      if (.data.output | type) == "string" and (.data.output | length) > 10000 then
        .data.output = (.data.output | .[0:1000]) + "\n...[truncated]..."
      else .
      end
    else .
    end
  elif .type == "assistant" then
    # trim long thinking blocks but leave the rest of the message untouched
    if .message.content then
      .message.content = [.message.content[] |
        if .type == "thinking" and (.thinking | length) > 20000 then
          .thinking = (.thinking | .[0:2000]) + "\n...[truncated]..."
        else .
        end
      ]
    else .
    end
  elif .toolUseResult and (.toolUseResult | tostring | length) > 10000 then
    # large tool results (often whole files or command output) get summarized away
    .toolUseResult = "[truncated - was \(.toolUseResult | tostring | length) bytes]"
  else .
  end
' "$FILE" > "${FILE}.pruned"
PRUNED_LINES=$(wc -l < "${FILE}.pruned")
ORIG_LINES=$(wc -l < "${FILE}.bak")
if [ "$PRUNED_LINES" -eq "$ORIG_LINES" ]; then
mv "${FILE}.pruned" "$FILE"
rm "${FILE}.bak"
echo "Pruned size: $(ls -lh "$FILE" | awk '{print $5}')"
echo "Success!"
else
echo "ERROR: Line count mismatch!"
mv "${FILE}.bak" "$FILE"
rm -f "${FILE}.pruned"
exit 1
fi
Results
| File | Before | After |
|---|---|---|
| conversation-1 | 3.3GB | 9.7MB |
| conversation-2 | 3.2GB | 6.8MB |
| conversation-3 | 1.6GB | 42MB |
| subagent-file | 83MB | 299KB |
| 4 smaller files | 285MB | 11MB |
Total: 9.5GB → 1.1GB (88% reduction)
The conversations still work - you can resume them with claude --continue. The pruned content was tool output and intermediate state, not the actual conversation.
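If you want extra reassurance beyond the script’s line-count check, confirming that every line of the pruned file still parses as JSON is cheap (jq exits non-zero on the first parse error):
jq empty large-file.jsonl && echo "every line still parses"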
Finding large files
To find conversation files worth pruning:
find ~/.claude/projects -name "*.jsonl" -size +50M -exec ls -lh {} \;
Then run the script on each:
./prune-history.sh /path/to/large-file.jsonl
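Or, to prune everything over the threshold in one pass (null-delimited so unusual paths are handled safely):
find ~/.claude/projects -name "*.jsonl" -size +50M -print0 |
  while IFS= read -r -d '' f; do
    ./prune-history.sh "$f"
  done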
Why this happens
Claude Code’s subagent feature spawns child processes for complex tasks. Each subagent’s full context gets stored in progress messages, including all tool results and intermediate state. When subagents spawn other subagents, this compounds quickly.
A single debugging session with multiple parallel agents can generate gigabytes of stored context. Most of this is redundant - the same information exists in multiple places as context gets passed between agents.
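One way to see that overhead in a specific file is to count progress messages by their data.type - in the bloated files, agent_progress entries are what you’d expect to dominate (a sketch, using the same fields the pruning script keys on):
jq -r 'select(.type == "progress") | .data.type? // "unknown"' file.jsonl | sort | uniq -c | sort -rn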
Automation
I’d recommend running this periodically or when disk space gets tight. A simple cron job:
# Weekly pruning of large Claude history files
0 3 * * 0 find ~/.claude/projects -name "*.jsonl" -size +100M -exec /path/to/prune-history.sh {} \;
Further reading
- jq manual - the JSON processor used for pruning
- Claude Code docs - official documentation