A few days ago I cleaned up 200GB from my Mac and deliberately kept my Claude Code history at 2.3GB. Four days later it had grown to 9.5GB. Here’s how I pruned it back to 1.1GB without breaking conversation continuity.
The growth also made ccs - my fuzzy finder for Claude conversations - effectively unusable, since parsing 3GB JSONL files took forever. I’ve since added --max-size filtering to ccs (it defaults to 1GB), but that only hides the problem. Better to prune the files themselves.
The problem
Claude Code stores conversations in ~/.claude/projects/. Each project gets a directory, and each conversation is a JSONL file where every line is a JSON object representing a message, tool result, or progress update.
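The schema isn’t documented, but the lines that matter here look roughly like this - heavily abbreviated and purely illustrative; the type, data.type, message, thinking, and toolUseResult fields are the ones the pruning below keys on:
{"type":"assistant","message":{"content":[{"type":"thinking","thinking":"..."},{"type":"text","text":"..."}]}}
{"type":"progress","data":{"type":"agent_progress","message":{"...":"..."},"normalizedMessages":[{"...":"..."}]}}
{"type":"user","toolUseResult":"...tool output, sometimes tens of MB..."}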
du -sh ~/.claude/projects/
# 9.5G
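A per-project breakdown shows where the bulk sits (sizes in MB so a plain numeric sort is enough):
du -sm ~/.claude/projects/*/ | sort -rn | head -5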
The largest files were 3.3GB and 3.2GB each - single conversations from a complex debugging session involving multiple subagents.
What takes up space
I expected the conversation text to be the culprit. It wasn’t. Looking at a 3.3GB file:
# Find the largest lines
cat file.jsonl | awk '{ print length, NR }' | sort -rn | head -5
171825800 2949
86652311 2979
86593724 2985
...
A single line was 171MB. Inspecting it:
sed -n '2949p' file.jsonl | jq -c 'to_entries | map({key: .key, size: (.value | tostring | length)}) | sort_by(-.size) | .[0:3]'
[{"key":"data","size":171825000}]
The data field in progress messages was enormous. Drilling deeper:
sed -n '2949p' file.jsonl | jq -c '.data | to_entries | map({key: .key, size: (.value | tostring | length)}) | sort_by(-.size)'
[{"key":"normalizedMessages","size":83479687},{"key":"message","size":83477716},...]
Two fields - normalizedMessages and message - were ~83MB each and contained nearly identical data. The normalizedMessages field is a duplicate of message used internally.
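To put a number on what that duplication costs across the whole file, summing the field’s serialized size line by line is enough - a rough sketch, reusing the file.jsonl placeholder from above:
jq '.data.normalizedMessages? // empty | tostring | length' file.jsonl |
  awk '{sum += $1} END {printf "%.1f MB spent on normalizedMessages\n", sum / 1024 / 1024}'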
The main space hogs:
- agent_progress messages with duplicated normalizedMessages and message fields
- toolUseResult fields in subagent files (one was 83MB)
- bash_progress output from commands that produced lots of output
- thinking blocks from extended reasoning
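To see how those categories stack up in a particular file, a rough per-type byte count helps (assuming a top-level type field, falling back to "unknown"; re-serialized lengths won’t exactly match the on-disk line lengths, but they’re close enough to rank by):
jq -r '[.type // "unknown", (tostring | length)] | @tsv' file.jsonl |
  awk '{bytes[$1] += $2} END {for (t in bytes) printf "%14d  %s\n", bytes[t], t}' |
  sort -rn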
The solution
A jq script that:
- Removes normalizedMessages entirely (it’s a duplicate)
- Truncates large message fields in agent_progress
- Truncates large toolUseResult fields
- Truncates large bash output
- Truncates old thinking blocks
#!/bin/bash
# prune-history.sh - Prune large Claude Code history files
set -e
FILE="$1"
if [ -z "$FILE" ] || [ ! -f "$FILE" ]; then
  echo "Usage: $0 <file.jsonl>"
  exit 1
fi
echo "Processing: $FILE"
echo "Original size: $(ls -lh "$FILE" | awk '{print $5}')"
echo "Original lines: $(wc -l < "$FILE")"
cp "$FILE" "${FILE}.bak"
jq -c '
  if .type == "progress" then
    if .data.type == "agent_progress" then
      # normalizedMessages duplicates message, so drop it outright
      del(.data.normalizedMessages) |
      if (.data.message | tostring | length) > 10000 then
        .data.message = "[truncated - was \(.data.message | tostring | length) bytes]"
      else .
      end
    elif .data.type == "bash_progress" then
      # keep only the first 1000 characters of oversized command output
      if (.data.output | type) == "string" and (.data.output | length) > 10000 then
        .data.output = (.data.output | .[0:1000]) + "\n...[truncated]..."
      else .
      end
    else .
    end
  elif .type == "assistant" then
    # trim long thinking blocks but leave the rest of the message untouched
    if .message.content then
      .message.content = [.message.content[] |
        if .type == "thinking" and (.thinking | length) > 20000 then
          .thinking = (.thinking | .[0:2000]) + "\n...[truncated]..."
        else .
        end
      ]
    else .
    end
  elif .toolUseResult and (.toolUseResult | tostring | length) > 10000 then
    # large tool results (often whole files or command output) get summarized away
    .toolUseResult = "[truncated - was \(.toolUseResult | tostring | length) bytes]"
  else .
  end
' "$FILE" > "${FILE}.pruned"
PRUNED_LINES=$(wc -l < "${FILE}.pruned")
ORIG_LINES=$(wc -l < "${FILE}.bak")
if [ "$PRUNED_LINES" -eq "$ORIG_LINES" ]; then
mv "${FILE}.pruned" "$FILE"
rm "${FILE}.bak"
echo "Pruned size: $(ls -lh "$FILE" | awk '{print $5}')"
echo "Success!"
else
echo "ERROR: Line count mismatch!"
mv "${FILE}.bak" "$FILE"
rm -f "${FILE}.pruned"
exit 1
fi
Results
| File | Before | After |
|---|---|---|
| conversation-1 | 3.3GB | 9.7MB |
| conversation-2 | 3.2GB | 6.8MB |
| conversation-3 | 1.6GB | 42MB |
| subagent-file | 83MB | 299KB |
| 4 smaller files | 285MB | 11MB |
Total: 9.5GB → 1.1GB (88% reduction)
The conversations still work - you can resume them with claude --continue. The pruned content was tool output and intermediate state, not the actual conversation.
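If you want extra reassurance beyond the script’s line-count check, confirming that every line of the pruned file still parses as JSON is cheap (jq exits non-zero on the first parse error):
jq empty large-file.jsonl && echo "every line still parses"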
Finding large files
To find conversation files worth pruning:
find ~/.claude/projects -name "*.jsonl" -size +50M -exec ls -lh {} \;
Then run the script on each:
./prune-history.sh /path/to/large-file.jsonl
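Or, to prune everything over the threshold in one pass (null-delimited so unusual paths are handled safely):
find ~/.claude/projects -name "*.jsonl" -size +50M -print0 |
  while IFS= read -r -d '' f; do
    ./prune-history.sh "$f"
  done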
Why this happens
Claude Code’s subagent feature spawns child processes for complex tasks. Each subagent’s full context gets stored in progress messages, including all tool results and intermediate state. When subagents spawn other subagents, this compounds quickly.
A single debugging session with multiple parallel agents can generate gigabytes of stored context. Most of this is redundant - the same information exists in multiple places as context gets passed between agents.
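One way to see that overhead in a specific file is to count progress messages by their data.type - in the bloated files, agent_progress entries are what you’d expect to dominate (a sketch, using the same fields the pruning script keys on):
jq -r 'select(.type == "progress") | .data.type? // "unknown"' file.jsonl | sort | uniq -c | sort -rn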
Automation
I’d recommend running this periodically or when disk space gets tight. A simple cron job:
# Weekly pruning of large Claude history files
0 3 * * 0 find ~/.claude/projects -name "*.jsonl" -size +100M -exec /path/to/prune-history.sh {} \;
Further reading
- jq manual - the JSON processor used for pruning
- Claude Code docs - official documentation