TL;DR: I’ve been using Claude Code since last July and spent most of that time correcting the same behaviours every session. The corrections stopped when auto-memory shipped: I started saying “next time, do X” in conversation, and Claude saved each one as a feedback memory and pre-empted me on the next run. Here are the ones that stuck.
## Motivation
Saying you use Claude Code in 2026 is about as distinguishing as saying you use VS Code, but I started early enough (July 2025) to have lived through the rough edges. For most of that time, every session felt like onboarding a new contractor. I’d ask it to make a PR. It would push three messy commits. I’d say “squash to one before opening”. Two days later, three messy commits again.
Auto-memory changed that. It shipped in Claude Code v2.1.59 in late February 2026: persistent notes saved as small markdown files in ~/.claude/memory/, with a feedback type that captures behavioural rules. Each one has a rule, a Why, and a How to apply. You tell Claude “from now on, when X happens, do Y” in conversation, and it writes the file itself. The next session, the file is in context and the correction sticks.
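On disk there's nothing exotic, just a folder of small files. The names below are illustrative, not a listing of my actual directory:

```bash
ls ~/.claude/memory/
# feedback_squash_before_pr.md    feedback_verify_before_done.md
# feedback_wait_for_full_ci.md    feedback_no_noise_comments.md
```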
Below are ten that I haven’t had to teach twice.
## Squash before the PR
I’d review a PR I’d asked Claude to open and find three commits: “wip”, “fix tests”, “address review”. I said “squash these and force-push” so often it became a tic.
The memory:
Always squash to a single commit before creating a PR. If the work serves different purposes, split into separate PRs rather than bundling.
Now Claude squashes before gh pr create and I never see the wip commits.
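The mechanics are a single soft reset, assuming the branch forked from main; the commit message is a placeholder:

```bash
# Collapse everything since the fork point into one commit, then open the PR.
git reset --soft "$(git merge-base main HEAD)"
git commit -m "feat: the whole change as one reviewable commit"
gh pr create --fill
```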
## Keep the PR description honest after each amend
The follow-on problem: once a PR is one commit, every fix becomes an amend plus a force-push. The diff changes, and the title and description drift until the PR page lies about what’s in the branch.
After any amend that changes scope, refresh the commit message, the PR title, and the PR description in the same turn as the force-push. Use --force-with-lease, never plain --force.
The first version of this memory said “update the PR description” and nothing more. I had to add “and the title” after Claude left a stale title on a PR whose scope had tripled. The Why line is the part that saves you from this kind of half-fix: it gives Claude enough context to extend the rule to the title without you spelling it out.
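In practice the whole habit is three commands, assuming the GitHub CLI; the title and body are placeholders:

```bash
git commit --amend              # fold the fix into the single commit
git push --force-with-lease     # refuses to clobber a push you haven't seen
gh pr edit --title "Title that matches the new scope" \
           --body "Description that matches the diff as it stands now"
```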
## Wait for the full CI surface before fixing
One PR had eight pipelines. The first one failed on a missing secret. Claude pushed a fix two seconds later. While that fix was running, three more pipelines completed with unrelated failures. By the time I caught up, four CI cycles had run for what should have been one commit.
When a PR has checks in progress, do not push fixes after every individual failure. Wait for the full cycle to finish, then triage the whole failure surface at once. Exceptions: hard blockers like a broken workflow file, since those gate every subsequent push.
This one needs the exception clause. Without it, Claude waits while a syntax error in ci.yml blocks every other check.
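With the GitHub CLI, the waiting half of the rule is one flag. A sketch, with a placeholder PR number:

```bash
# Block until every check settles, then triage the whole surface in one pass.
gh pr checks 1234 --watch   # exits non-zero if any check ends up failing
gh pr checks 1234           # final state of every check, not just the first failure
```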
## Treat unreplied review comments as unfixed
I asked Claude to “address the bot comments”. It posted polite replies on every thread saying the issue was fixed. The issues were not fixed. Claude had read the comments, decided they sounded reasonable, and replied without checking the code.
If a PR review comment has no reply, assume the issue has not been dealt with. Read the comment, check the current code, fix if needed, then reply.
I phrased the rule as a sequence: read, check the code, fix, then reply. Vague rules like “be careful with bot threads” don’t survive contact with a real session.
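One mechanical way to surface unreplied threads, assuming the GitHub CLI and jq; OWNER, REPO, and 123 are placeholders:

```bash
# Keep top-level review comments that no other comment answers via in_reply_to_id.
gh api repos/OWNER/REPO/pulls/123/comments --paginate |
  jq -rs 'add | . as $all
    | map(select(.in_reply_to_id == null
        and (.id as $id | $all | any(.in_reply_to_id == $id) | not)))
    | .[] | "\(.path): \(.body | split("\n")[0])"'
```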
## Verify before declaring done
Type-checking and unit tests prove the code compiles and passes its own tests. They prove nothing about whether the feature works.
Before reporting a task complete, run the test, execute the script, hit the endpoint, view the page in a browser. If verification isn’t possible, say so rather than claiming success.
UI changes are the worst offenders. Claude can refactor a React component, type-check it clean, and report success without rendering the page. With this memory in place, Claude opens the browser before saying done.
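What verification means varies by task; the point is to exercise the real surface, not just the type checker. The commands below are placeholders for whatever the task actually touched:

```bash
npm test                                    # necessary, not sufficient
node scripts/backfill.js --dry-run          # actually run the script you just wrote
curl -fsS http://localhost:3000/api/users   # -f makes HTTP errors fail loudly
```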
## Report failures plainly, report passes plainly
Claude once told me “tests are mostly passing, there may be some flakiness around the auth suite”. The truth was three failures in auth.test.ts. A vague status update like that costs more than either a clean pass or a clean fail, because I have to re-run the suite to find out what actually happened. The other half of the rule matters too: when something passes, say it passed. Don’t hedge a green result into “looks fine” or downgrade finished work to “partial” out of caution.
Report outcomes accurately. State failures with the relevant output. State passes plainly. Don’t claim success when output shows failures, and don’t hedge confirmed results with disclaimers.
## Default to no comments
This one was already in my system prompt, but Claude kept adding // loop through users above a for (const user of users). The memory makes it stick:
Default to writing no comments. Add one only when the WHY is non-obvious: a hidden constraint, a workaround for a specific bug, behaviour that would surprise a reader. If removing the comment wouldn’t confuse a future reader, don’t write it.
The cousin rule: don’t reference the current task in comments. // added for the OAuth migration rots the moment the migration ships.
## Challenge the premise
Sometimes I ask for a thing that doesn’t make sense. The code does the opposite of what I think it does, or the bug I’m chasing is in a different file. A compliant assistant implements the wrong thing and we both lose an hour.
If the user’s request is based on a misconception, or there’s a bug adjacent to what they asked about, say so. Read the request, then the relevant code. If the request doesn’t match what the code actually does, raise it before implementing.
I get fewer “why didn’t this work” sessions now because Claude flags the bad premise up front instead of grinding through the wrong implementation.
## Use a language server, not grep, for code questions
Grep finds text. It doesn’t know the difference between import { foo } from "bar" and a comment that says foo. For “where is X defined” or “what calls this function”, a real language server gives a precise answer.
For symbol-aware navigation, use Serena MCP over grep plus Read. Serena drives a real language server (pyright, gopls, tsserver) and returns scope-aware results. Stick with grep for plain-text searches: config strings, log lines, comments, TODOs.
After installing Serena I had to write this memory because Claude kept reaching for grep out of habit.
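Registering it looks roughly like this; check Serena's README for the current launcher before copying, since the invocation has changed over time:

```bash
# Assumes uv is installed. Everything after -- is Serena's own launch command.
claude mcp add serena -- uvx --from git+https://github.com/oraios/serena serena start-mcp-server
```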
## Make file references clickable
A small one with high frequency. Plain paths in terminal output aren’t clickable in most setups. Markdown links with file:/// URLs and percent-encoded spaces are.
When referencing local files, use markdown link syntax:
[name](file:///path/with%20spaces). Plain paths and bare URLs aren’t clickable in most terminals.
I click these dozens of times a day.
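Spaces are the usual failure mode, and encoding just those covers most paths. A minimal sketch with an example path:

```bash
p="/Users/me/My Project/notes.md"
# Percent-encode the spaces and print a markdown link a terminal can open.
printf '[%s](file://%s)\n' "$(basename "$p")" "${p// /%20}"
# -> [notes.md](file:///Users/me/My%20Project/notes.md)
```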
## How memories get written
I don’t write these by hand. When Claude does something I want it to keep doing, or stops doing something I want it to drop, I say so in conversation: “next time, squash before opening the PR”, “stop asking me before monitoring”, “from now on, prefer Serena for symbol search”. Claude saves the memory itself with this structure:
---
name: feedback_short_descriptive_name
description: one-line description used to decide relevance later
type: feedback
---
The rule, stated plainly.
**Why:** the reason the rule exists. Often a past incident or a strong preference.
**How to apply:** when and where this kicks in. Concrete triggers, banned phrasings, exceptions.
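For concreteness, here's roughly what the squash rule from earlier looks like on disk. This is a reconstruction, not the verbatim contents of my file:

```markdown
---
name: feedback_squash_before_pr
description: squash to a single commit before opening any PR
type: feedback
---

Always squash to a single commit before creating a PR. If the work serves
different purposes, split it into separate PRs rather than bundling.

**Why:** commit trails like "wip" / "fix tests" waste review time and make
reverts messy; one commit per PR keeps history readable.

**How to apply:** before any gh pr create, compare git log against the base
branch; more than one commit means squash first. Exception: stacked PRs that
deliberately keep their commit boundaries.
```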
The Why line matters more than I expected. Without it, Claude follows the rule and gets stuck on edge cases. With it, Claude can decide for itself when the rule applies. Anthropic has published research on teaching Claude the why behind a behaviour, and it lines up with this: training on principles outperforms training on demonstrations, because the model can generalise a principle to situations the demonstrations never covered. The same logic holds for memory files. A bare rule says “do X”. A rule with a Why says “do X because Y”, and Y is what travels across edge cases.
The other thing worth doing: capture confirmations as well as corrections. If Claude takes an unusual approach and you accept it without pushback, tell it to remember that too. Otherwise the memory file drifts towards a list of past mistakes and the assistant becomes timid.