Before anything else

Why you need this guide

You already use AI in your business. You have an account with Claude, ChatGPT, or Gemini. You pay for them every month. But somewhere, deep down, you feel that you're not getting everything you could be getting.

That feeling is not misplaced. The difference between someone who gets mediocre results from AI and someone who gets professional results is not talent, nor technical experience. It is vocabulary. There are 40 to 50 key terms in technical English that, when dropped into your commands, make the AI respond at a completely different level of quality.

This guide gives you those words. No theoretical padding, no useless jargon. With concrete examples and real-life analogies.

What you lose now

Hours rewording the same command

You get a draft, it is close to what you want but not quite, you ask for changes, you get something else, you start over. With the right vocabulary, the first delivery is the good one in 80% of cases.

Double AI costs for the same output

Every "try again" costs tokens. Every revision cycle eats into your subscription. Precise commands cut consumption by 30 to 60%.

Dependency on freelancers you can't audit

You pay someone to do "something technical", you don't understand what they delivered, you can't verify. With the vocabulary in this guide, you can ask questions that show you know what you're asking for.

Who this is for

Founders, managers, marketers, and business operators who already use AI in their work, but feel that the results are below their real potential. You don't need to be a programmer. You just need to be curious and want to take your digital work to the next level.

If you already speak fluent technical English and have built AI systems for clients, this guide is too basic for you. For everyone else, it is exactly what you need.

Three ways to use it

20 min read

Top 5 + Cheat Sheet only. Go straight to the 5 words to put on the wall, read them, then jump to the final cheat sheet. That gives you 70% of the value.

2 hours

Full read. 7 chapters, each with 4 to 7 words. With examples, analogies, and templates. The investment that changes how you work with AI for the entire next year.

Reference

Keep it open next to your monitor. When you write a command to an AI and you want it to be professional, open the right chapter, copy a sentence, adjust. Over time it becomes reflex.

→ Solutions mentioned in this guide

The guide teaches you the words. If you want to use them without building the infrastructure yourself, there are two RoboMarketing products that do exactly that:

Chapters 1-3 · Verification & thinking

RobOS

Memory, workspaces, and anti-improvisation verification on top of Claude Code. "Don't guess, verify" becomes default behavior.

For Claude Code users

Chapters 4-6 · Code, integrations, security

Agent Factory

Dedicated VPS with 4 AI agents publishing daily. Idempotency, rate limiting, secrets management, pre-installed.

Live course · 12 seats per cohort

→ Jump to details at the end of the guide

★ Recommendation

Whichever mode you pick, do two things in the first 15 minutes: (1) print the cheat sheet and put it next to your monitor, (2) memorize the five words in the Top 5. The rest comes with practice.

Before you start

How to use this guide

This guide is not meant to be read once, from start to finish. It is built as a working tool: you open it when you have to write a command to an AI agent and you don't know how to phrase it to get what you want.

Each chapter covers a family of "golden words", technical English terms that, when you drop them into your command (even if the rest of the command is in plain English), signal to the AI agent that you expect a certain level of quality, rigor, or type of thinking.

For each word you will find:

What it means: a couple of sentences, in plain language
A real-life analogy: so it sticks in memory
When to use it: concrete business situations
Weak command vs. command with golden words: examples side by side
Traps: mistakes most people make

★ The single most important piece of advice

You don't need to memorize the whole guide. Memorize just the five words in the "Top 5: On the wall" section. Look up the rest when you need them. The guide has a cheat sheet at the end for exactly that purpose.

Chapter 00 · Introduction

Why words matter

Imagine you've hired an extremely talented programmer who has read everything ever written about programming, business, design, and marketing. The only problem: they can't read your mind.

If you say "build me a Shopify integration", they will build something, probably what they've seen others build: done quickly, with no verification and no thought for the cases where things break. It will "kind of" work.

But if you say "build me a Shopify integration, idempotent, with rate limiting, and before you say it's done write evals for it", the same programmer will deliver something substantially better. Not because the words are magic. Because each word activates in their mind a set of standards and checks they would have skipped otherwise.

AI agents work much the same way. They were trained on millions of technical documents where, for example, the word "idempotent" typically shows up next to safety procedures, retry logic, and protection against duplicates. When you drop it into your command, the agent pulls along all those standards.

The big secret: the agents' language

Here is the good news for you, if you don't speak technical English: you don't need to speak it fluently. You just need to know these 40 to 50 key terms and slip them into your everyday sentences.

It's like going to a mechanic when you know nothing about cars, but saying: "I want you to check the suspension and the brakes before you give me the car back". The fact that you used the words "suspension" and "brakes" tells the mechanic that you are not completely lost and that you expect a serious check, not a quick patch. AI agents work the same way.

This guide gives you the mechanic's vocabulary. You don't need to become a mechanic. You just need to know what to ask for.

Anatomy of a good command

A good command to an AI agent almost always has three parts:

Part	Role	Example
1. WHAT?	The concrete task	"Build the Shopify integration."
2. HOW?	Standards and constraints (this is where the golden words live)	"Make it idempotent, with rate limiting and retries with exponential backoff."
3. DONE WHEN?	Checks and acceptance criteria	"Before you say it's done, write evals and run them."

Most people write only part 1: "build the Shopify integration". Then they wonder why the result is disappointing. Golden words are the tool you use to fill in parts 2 and 3 effortlessly.
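
If you keep your commands as reusable templates, the three parts map onto a very small helper. This is only a sketch: the function name build_command and the labels it adds ("Standards:", "Before you say it's done:") are assumptions made for illustration, not something any AI tool requires.

```python
def build_command(what: str, how: list[str], done_when: list[str]) -> str:
    """Join the three parts (WHAT, HOW, DONE WHEN) into one command."""
    parts = [what.strip()]
    if how:
        # Part 2: standards and constraints, where the golden words live
        parts.append("Standards: " + ", ".join(how) + ".")
    if done_when:
        # Part 3: checks and acceptance criteria
        parts.append("Before you say it's done: " + ", ".join(done_when) + ".")
    return " ".join(parts)


command = build_command(
    what="Build the Shopify integration.",
    how=["make it idempotent", "add rate limiting", "retry with exponential backoff"],
    done_when=["write evals", "run them", "show me which ones failed"],
)
print(command)
```

Leaving the second and third lists empty gives you back the bare part 1, which is exactly the weak command most people send.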

5 rules before you start

1. Use the words in English, even if the rest of your sentence is not.

AI agents understand technical terms best in English. Write "make it idempotent", not "make it so that it gives the same result when run multiple times...". The technical term is more precise than any paraphrase.

2. Don't use words you don't understand.

If you slip in a term just to "sound technical" and the agent asks you a follow-up about it, you will be stuck. Use only words you can explain in two sentences.

3. Start simple, then add rigor.

For a small task, you don't need all 50 words. Use 1 to 3, the relevant ones. For a big project, go up to 5 to 10.

4. Always ask for verification.

The single best rule in the whole guide: whatever you ask, add at the end "don't guess, verify" or "write evals and run them". That alone eliminates 70% of agent mistakes.

5. Re-read the result with a critical eye.

Golden words improve the result, but they don't make it perfect. You are still the one who decides whether what the agent delivered is good enough for your business.

✓ Verdict before Chapter 1

If you stop here and just apply the rules above, you will already get considerably better results. The chapters that follow give you the concrete vocabulary. Read them at your own pace.

Chapter 01

Quality and verification

These five words are the most important in the entire guide. If you remember only this much from this document, you already place yourself in the top 10% of people working with AI agents.

They are all about the same thing: how you force the agent to verify its own work before handing it to you. Without them, you will always get results that "look fine" until the moment you put them in production and discover that they are not.

evals

short for "evaluations"

What it means: Sets of automated tests that check whether your agent (or the functionality you built) actually does what it should. They are like a written exam you give the agent, with hundreds of questions, and the agent reports how many it got right.

Think of a baker who wants to buy a new oven. Before paying, they want to test it: bake 5 loaves at different temperatures, see if the crust comes out right, if it bakes evenly, if anything misbehaves. Evals are the 5 trial loaves. Without them, the baker buys on faith, and finds out something is wrong only when 200 customers are waiting.

When to use it

Always, when you build something new. Before you "release" any new functionality, whether it's an agent that answers customers, a script that sends emails, or a Shopify integration, ask for evals. And especially when you modify something that already works: evals tell you whether the change broke something else.

Weak command vs. command with golden words

✗ Weak command

"Build me an agent that answers questions about customer orders."

✓ Command with golden words

"Build me an agent that answers questions about customer orders. Before you say it's done, write evals with at least 20 scenarios: customer asks for order status, customer asks for refund, customer writes with typos, customer asks in another language. Run the evals and show me which ones failed."

⚠ Common trap

Many people write "write evals" and stop there. The agent will often write 3 trivial tests that all pass and tell you it's done. Specify how many tests you want and what types of scenarios they should cover (happy path + edge cases; see the words in the next chapters).

ground truth

literally "the truth on the ground", the reference answer

What it means: The correct answer, verified by a human, against which the agent's result is compared. If evals are "the exam", ground truth is "the answer key". Without the key, you don't know whether the answers are good or bad.

In a math exam, the teacher has a sheet with the correct answers. That is ground truth. Without it, they can't grade the students. Likewise, without ground truth, your agent doesn't know whether the answer to "how many orders did we have yesterday?" is correct; it only knows that it gave an answer.

When to use it

Always next to the word "evals". Ask the agent to define ground truth for each test. For example: if a test asks "how many orders did we have yesterday?" and the ground truth (manually verified) is 47, the agent knows that an answer of "around 50" is too vague, and "23" is clearly wrong.

Weak command vs. command with golden words

✗ Weak command

"Write evals for the sales reporting."

✓ Command with golden words

"Write evals for the sales reporting. For each test define ground truth: take real data from a known period, calculate the correct answer manually, then compare the agent's output against it. Allowed tolerance: 0%."

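For readers who want to see what "evals with ground truth" can look like once the agent builds them, here is a minimal sketch. Everything in it is an assumption made for illustration: the names EvalCase, run_evals and fake_agent are hypothetical, and fake_agent is a stand-in that returns canned numbers instead of calling a real agent. Only the 47 orders come from the example above; the refund numbers are invented test data.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    name: str
    question: str
    ground_truth: int  # the manually verified correct answer


def run_evals(agent: Callable[[str], int], cases: list[EvalCase]) -> list[str]:
    """Return the names of the cases where the agent's answer is wrong."""
    failed = []
    for case in cases:
        answer = agent(case.question)
        if answer != case.ground_truth:  # tolerance 0%: only an exact match passes
            failed.append(case.name)
    return failed


def fake_agent(question: str) -> int:
    """Hypothetical stand-in for a real agent: returns canned answers."""
    canned = {
        "How many orders did we have yesterday?": 47,
        "How many refunds did we issue yesterday?": 5,
    }
    return canned.get(question, 0)


cases = [
    EvalCase("orders_yesterday", "How many orders did we have yesterday?", 47),
    EvalCase("refunds_yesterday", "How many refunds did we issue yesterday?", 3),
]
failed = run_evals(fake_agent, cases)
print(f"{len(cases) - len(failed)}/{len(cases)} passed; failed: {failed}")
```

The point of the design is that the answer key lives in the test cases, not in the agent, so a confident wrong answer still shows up as a failure.
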
sanity check

literally "common-sense check"

What it means: A quick, obvious verification that a result makes sense in the real world. Not a deep check, just "could this be true?". Sanity checks catch the big, obvious mistakes that would otherwise slip past.

If a child says they ran 100 meters in 3 seconds, you don't need to be an Olympic coach to say "wait, that's impossible". That is a sanity check. If your agent reports that you had 50,000 Shopify orders in one hour (when the average is 20 per hour), a sanity check would catch it instantly.

When to use it

On any automated report, before it gets forwarded (to a client, on Slack, by email). On any large transaction. On any bulk change (for example, price updates on 500 products).

Weak command vs. command with golden words

✗ Weak command

"Send the weekly report on Slack."

✓ Command with golden words

"Send the weekly report on Slack. Before sending, do a sanity check on the key numbers: orders between 100 and 2,000 per week, AOV between $50 and $500, conversion rate between 0.5% and 5%. If any number falls outside the range, stop and tell me."

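A sanity check like the one in this command usually boils down to a few range comparisons. The sketch below is one possible shape, assuming the report arrives as a simple dictionary; the metric names and the function name sanity_check are hypothetical, while the ranges are the ones from the command above.

```python
# Allowed ranges, taken from the example command (low, high), inclusive
RANGES = {
    "orders_per_week": (100, 2000),
    "aov_usd": (50, 500),
    "conversion_rate_pct": (0.5, 5.0),
}


def sanity_check(report: dict[str, float]) -> list[str]:
    """Return a list of problems; an empty list means the report looks plausible."""
    problems = []
    for metric, (low, high) in RANGES.items():
        value = report.get(metric)
        if value is None:
            problems.append(f"{metric}: missing")
        elif not (low <= value <= high):
            problems.append(f"{metric}: {value} outside {low}-{high}")
    return problems


# Illustrative report with one obviously wrong number
report = {"orders_per_week": 50000, "aov_usd": 120.0, "conversion_rate_pct": 2.1}
problems = sanity_check(report)
if problems:
    print("STOP, not sending:", problems)
else:
    print("Sanity check passed, safe to send")
```
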
dry run

running through the steps without any real effects

What it means: You run the action, but nothing actually happens. The agent shows you what it would do, so you can verify before you press "do it for real". It is the most important safety net when the action has consequences (sending emails, changing prices, deleting customers).

Before a pilot takes off, they go through an entire checklist without touching the real controls: "check the throttle... ok, but don't move... check flaps... ok, but don't move". That is a dry run. Then, once they are satisfied that everything is ready, they start doing things for real. Likewise, before sending 5,000 emails to your customers, you want to see the list of 5,000 without anything being sent.

When to use it

Mandatory for: mass email, bulk price changes, data deletions, large system syncs, any batch operation. Recommended for: new scripts you're running for the first time.

Weak command vs. command with golden words

✗ Weak command

"Send the Black Friday campaign to all active customers."

✓ Command with golden words

"Send the Black Friday campaign. First do a dry run: show me how many emails would be sent, to what segments, 3 full example emails (how they'd appear to the customer). The real send only happens after I confirm."

★ Tip

Combine dry run with sanity check. Dry run shows you what would happen, sanity check verifies whether it makes sense. Together, they are almost impossible to fool.
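
In code, a dry run is often just a flag that defaults to the safe side. The sketch below assumes a hypothetical send_campaign function and a fake_send stand-in instead of a real email service; it illustrates the pattern and is not how any particular email tool implements it.

```python
def send_campaign(recipients: list[str], subject: str, send_email, dry_run: bool = True) -> dict:
    """Send a campaign, or only report what would be sent when dry_run is True."""
    summary = {"would_send": len(recipients), "sent": 0, "examples": recipients[:3]}
    if dry_run:
        return summary  # nothing is sent; you only get the preview
    for address in recipients:
        send_email(address, subject)
        summary["sent"] += 1
    return summary


outbox = []  # records what the fake sender "sent"


def fake_send(address: str, subject: str) -> None:
    """Hypothetical stand-in for a real email service."""
    outbox.append((address, subject))


customers = ["ana@example.com", "bob@example.com", "cy@example.com", "dee@example.com"]
preview = send_campaign(customers, "Black Friday", fake_send)  # dry run by default
print(preview)
```

Making dry_run default to True is a deliberate choice here: forgetting the flag gives you a preview, never an accidental send. The real send needs an explicit dry_run=False.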

smoke test

the term comes from electronics

What it means: The smallest possible test that confirms the system isn't completely broken. The name comes from engineers who, after repairing a device, would power it on for the first time while standing back: if it smoked, something fundamental was wrong.

When you start your car in the morning, before going on a family road trip, you listen to the engine for 5 seconds. If you hear something weird, you stop and check. If it sounds normal, you go. You don't do a full service every morning, just that short smoke test. Likewise, after any change to your system, run 2 or 3 key actions to confirm that nothing fundamental is broken.

When to use it

After every change, however small. After a version update. After rotating a token. After a rebrand. After a server move. Smoke test = the one bit of assurance that the change didn't kill something critical.

Weak command vs. command with golden words

✗ Weak command

"I changed the Klaviyo token. Does it work?"

✓ Command with golden words

"I changed the Klaviyo token. Run a smoke test: try to fetch the segment list, send a test email to my address, verify that the webhook still responds. Tell me which of the three worked and which didn't."

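A smoke test like this one can be sketched as a short list of checks that each answer yes or no. The three check functions below are hypothetical stand-ins (they do not call Klaviyo); in a real setup each would make one small real request.

```python
from typing import Callable


def run_smoke_test(checks: dict[str, Callable[[], bool]]) -> dict[str, bool]:
    """Run every check and record which ones worked."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a crash counts as a failed check
    return results


# Hypothetical stand-ins for the three real checks
def fetch_segments() -> bool:
    return True


def send_test_email() -> bool:
    return True


def webhook_responds() -> bool:
    raise TimeoutError("no response")  # simulate a webhook that stopped answering


results = run_smoke_test({
    "fetch segment list": fetch_segments,
    "send test email": send_test_email,
    "webhook responds": webhook_responds,
})
for name, ok in results.items():
    print(f"{'OK  ' if ok else 'FAIL'} {name}")
```

Note that one failing check does not stop the others, which is what lets the agent tell you which of the three worked and which didn't.
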
Chapter 1 recap

Word	In a sentence	When
evals	Automated tests that check the agent	Always, when building or changing
ground truth	The correct answer to compare against	In every eval
sanity check	Quick "does this make sense?"	Reports, transactions, updates
dry run	Run with no real effects	Mass email, deletions, big changes
smoke test	Minimal "nothing is broken" check	After any change

◆ Combined command you can copy

"Build [X]. Before you say it's done: write evals with happy path and edge cases, define ground truth for each test, add sanity checks on the key numbers, do a dry run if the action has real effects, and run a smoke test at the end."

This chapter in practice

"evals", "ground truth", "sanity check", done automatically

If you already use Claude Code, there is a product that does exactly this: it verifies claims before writing, stops you from closing a session without saving, keeps an audit of decisions. All the words from this chapter, installed with one command.

See RobOS → Built for operators who already use Claude Code

Chapter 02

How the agent thinks

The words in this chapter don't ask the agent to do anything different; they ask it to think differently before acting. That is the difference between an agent that throws out the first idea that comes to mind and one that actually reasons.

think step by step

literally what it says

What it means: You tell the agent to break its thinking into explicit steps, instead of "jumping" to the answer. It sounds trivial, but it is one of the most studied techniques in AI: agents make significantly fewer mistakes when forced to think step by step.

In school, when the teacher asked you to solve a math problem, they said: "show me the steps, not just the answer". Not because the steps were interesting, but because writing the steps forced you to think correctly. If you only wrote the answer, you were guessing. AI agents have the same weakness and the same fix.

When to use it

On any request that involves a complex decision, a calculation, a comparison, or analyzing a situation. Not on simple tasks ("send email X"), where it just adds noise.

✗ Weak command

"Decide which products should be pulled from the catalog this month."

✓ Command with golden words

"Decide which products should be pulled from the catalog. Think step by step: (1) take the sales for the last 90 days, (2) compute the margin, (3) identify products with low sales AND low margin, (4) check whether any have high stock, (5) only then propose the list."

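If the agent turns those five steps into a script, it might look roughly like the sketch below. The thresholds, the Product fields and the sample catalog are all assumptions made up for illustration; your own definitions of "low sales", "low margin" and "high stock" will differ.

```python
from dataclasses import dataclass

# Illustrative thresholds (assumptions, not recommendations)
LOW_SALES_UNITS = 10     # fewer units than this in 90 days counts as low sales
LOW_MARGIN_PCT = 15.0    # margin below this counts as low
HIGH_STOCK_UNITS = 100   # more units than this in stock counts as high stock


@dataclass
class Product:
    name: str
    units_sold_90d: int  # step 1: sales for the last 90 days
    price: float
    unit_cost: float
    stock: int


def margin_pct(product: Product) -> float:
    """Step 2: margin as a percentage of the selling price."""
    return (product.price - product.unit_cost) / product.price * 100


def propose_removals(products: list[Product]) -> list[tuple[str, str]]:
    proposals = []
    for p in products:
        # Step 3: low sales AND low margin
        if p.units_sold_90d < LOW_SALES_UNITS and margin_pct(p) < LOW_MARGIN_PCT:
            # Step 4: flag products that still have a lot of stock
            note = "high stock, clear it first" if p.stock > HIGH_STOCK_UNITS else "ok to pull"
            proposals.append((p.name, note))
    return proposals  # step 5: only now propose the list


catalog = [
    Product("Mug", 4, 10.0, 9.0, 20),
    Product("Poster", 3, 20.0, 18.0, 250),
    Product("Hoodie", 2, 60.0, 30.0, 40),      # low sales but healthy margin: kept
    Product("Sticker", 500, 2.0, 1.9, 900),    # low margin but sells well: kept
]
print(propose_removals(catalog))
```
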
first principles

reasoning from fundamentals, not from analogies

What it means: Start from the fundamental basics of the problem, not from "how it's usually done". It is the opposite of imitation thinking. Ask the agent not to simply copy what it has seen in other projects, but to ask "what is actually needed here?"

For decades, everyone built rockets by reusing NASA's old designs. Then someone asked from first principles: "why is a rocket expensive? What is it made of? How much do the raw materials cost?" They discovered that materials were about 2% of the price; the rest was labor and the loss of the rocket after launch. That is how reusable rockets were born. The same kind of thinking can find simple answers to apparently complex business problems.

When to use it

When you suspect that the agent will give you a "standard" solution that doesn't fit your case. Or when you want to optimize costs / processes and everyone else does "things this way because that's how it's always been done".

✗ Weak command

"How do we cut the cost of sending WhatsApp messages to customers?"

✓ Command with golden words

"How do we cut the cost of sending WhatsApp messages? Think from first principles: what are we paying for, each message or each conversation? Which messages are mandatory vs. nice-to-have? Don't copy 'best practices' from the internet; analyze our concrete situation."

chain of thought

the full reasoning chain

What it means: A more detailed version of "think step by step". You ask not just for the steps, but for the complete reasoning: why the agent picked step A over B, what it considered and rejected, what assumptions it made. Used correctly, it lets you "audit" the agent's thinking and catch flawed reasoning.

Compare a consultant who says "I recommend you invest in Meta ads" with one who says "I analyzed 3 options: Meta, Google, and TikTok. I picked Meta because your target audience is 35 to 55 years old (70% of them are there), the budget is small (Meta works well under $1,000 per month), and you already have quality visual content." The second is chain of thought. You can argue with it if you spot a flaw. With the first, you can't.

When to use it

For strategic decisions. For cases where the agent gives you a recommendation and you want to understand what it is based on, not just what the answer is.

✗ Weak command

"Which logistics provider should we work with?"

✓ Command with golden words

"Which logistics provider should we work with? Show me the chain of thought: what options you considered, what criteria you used, what assumptions you made about our needs, and on what basis your recommendation is the best."

red team

military term: the "red team" that simulates the enemy attack

What it means: You ask the agent to attack its own solution. To put itself in the role of a critic, hacker, or unhappy customer and find every way the solution could break. It is the exact opposite of what an agent does naturally (justify what it built).

In the military, when planning a mission, two teams are formed: one plans the attack ("blue team"), the other plays the enemy trying to stop it ("red team"). That is how they find weak points before they become casualties. Likewise, once the agent builds you a solution, have it take on the role of "a customer trying to break it".

When to use it

On critical solutions (anything involving money, security, customer communication). On new code before it goes live. On campaign plans before you sign off. On anything that, if it breaks, costs you dearly.

✗ Weak command

"Check that the checkout process works."

✓ Command with golden words

"The checkout process is ready. Now red team it: think like an attacker, what inputs could break it? Think like a confused customer, where could they get lost? Think like an external system that fails. Give me the top 5 vulnerabilities."

steel-man

the opposite of "straw-man", build the strongest version of the opposing argument

What it means: Before you make a decision, you ask the agent to build the strongest argument against what you want to do. Not a caricature of the opposition, but the smartest, most serious version. Only then do you know whether your decision actually holds up, or whether it is just enthusiasm.

Before buying a house you love, ask a smart friend to tell you the best reasons NOT to buy it: not jokes, not surface-level criticism, but the real risks they see. If you survive their steel-man, your decision is solid. If it shakes you, you've learned something important before you wire the money.

When to use it

On big decisions: launching a new product, changing strategy, signing a major contract, hiring. Steel-man protects you from your own enthusiasm.

✗ Weak command

"I want to launch a new premium product line. Is it a good idea?"

✓ Command with golden words

"I want to launch a premium line. Before telling me what you think, build the steel-man for the opposition: construct the strongest argument against the launch, serious arguments grounded in our context. Then tell me whether the idea still holds."

Chapter 2 recap

Word	In a sentence	When
think step by step	Force explicit steps	Complex decisions, calculations
first principles	From fundamentals, not copying	Optimizations, non-standard solutions
chain of thought	Full, auditable reasoning	Strategic decisions, recommendations
red team	Attack your own solution	Before going live in production
steel-man	Strongest counter-argument	Big decisions

Chapter 03

Anti-guess: force rigor

The biggest problem with AI agents is not that they don't know things, it's that they don't know what they don't know. The words in this chapter are the tools you use to force them to admit uncertainty and verify before asserting.

don't guess, verify

literally what it says

What it means: Probably the single most powerful instruction in this guide. You tell the agent explicitly: do not invent. If you don't know for sure, search, ask, verify in the docs. Better to admit you don't know than to confidently say something wrong.

The difference between a good doctor and a bad one: the bad doctor gives you a diagnosis on the spot, to look sure of himself. The good doctor says "there are 3 possibilities; we need to run these tests to be sure". You want your AI agent to be the second one, not the first. "Don't guess, verify" is exactly the command that turns one into the other.

When to use it

Almost always. Add it at the end of any command involving facts, concrete data, or actions with consequences. It costs three words and probably eliminates 50% of agent "hallucinations".

✗ Weak command

"What is the API endpoint for fetching Shopify orders?"

✓ Command with golden words

"What is the API endpoint for Shopify orders? Don't guess, verify: search the official documentation, give me the link, and don't make it up 'from memory'. If you can't access the docs, tell me explicitly."

⚠ Uncomfortable truth

AI agents are trained to be helpful. That means if they don't know an answer, they tend to invent a plausible one instead of saying "I don't know". "Don't guess, verify" is the direct antidote.

cite your sources

literally what it says

What it means: For every factual claim, you require the concrete source. A link, a file name, a document, a section in the docs. It forces the agent to ground its answers in something verifiable, not in something that just sounds right.

In school, when you wrote a paper, the teacher required a bibliography. Not because they cared about the list of books, but because the simple obligation to cite forced you not to make up facts. Same with AI agents.

When to use it

On research, technical comparisons, supplier recommendations, statistics, any claim of the form "X% of customers...", "studies show that...", "best practice is to...".

✗ Weak command

"What is the average e-commerce conversion rate in the US?"

✓ Command with golden words

"What is the average e-commerce conversion rate in the US? Cite your sources: give me 2 to 3 sources with links, the year, and the sample size. If you can't find concrete data, tell me instead of inventing."

show your work

show the calculations and reasoning

What it means: For any calculation, decision, or result, you want to see how it got there, not just the final answer. Different from "chain of thought", which is for full reasoning. "Show your work" is more granular, for concrete steps, calculations, formulas.

When the accountant gives you the final profit number, you want to see how they got there: what revenue, what costs, what taxes. If they only give you the number, you can't verify it. Same with the agent: if it says "the campaign generated $47,000", you want to see where the 47,000 comes from, how many sales, through what channels, over what period.

When to use it

On reports with numbers. On decisions based on calculations. On any result that will drive a significant action. Not on creative or conversational tasks.

✗ Weak command

"Compute the average customer LTV."

✓ Command with golden words

"Compute the average customer LTV. Show your work: the formula you used, the numbers feeding it (AOV, frequency, duration), where you got them, and the intermediate result at each step. I want to be able to redo the calculation in Excel."

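"Show your work" for a calculation means returning the intermediate steps together with the result. The sketch below assumes one common simplified formula (LTV = AOV x orders per year x years retained); that formula, the function name ltv_with_work and the sample numbers are assumptions for illustration, and your business may define LTV differently.

```python
def ltv_with_work(aov: float, orders_per_year: float, years_retained: float) -> dict:
    """Compute LTV and keep every input and intermediate result visible."""
    revenue_per_year = aov * orders_per_year       # intermediate step
    ltv = revenue_per_year * years_retained        # final result
    return {
        "formula": "LTV = AOV x orders per year x years retained",
        "aov": aov,
        "orders_per_year": orders_per_year,
        "years_retained": years_retained,
        "revenue_per_year": revenue_per_year,
        "ltv": ltv,
    }


# Illustrative inputs only
work = ltv_with_work(aov=80.0, orders_per_year=3, years_retained=2)
for step, value in work.items():
    print(f"{step}: {value}")
```

Because every input and intermediate value is in the output, you can paste the same numbers into Excel and confirm the result yourself.
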
acceptance criteria

what must be true for the task to count as done

What it means: Before the agent starts, you define together a concrete list of conditions that, if all met, mean the task is done. It is like a contract: "if you do A, B, and C, we are done. If not, we are not."

When you hire a painter, the contract doesn't say "paint well". It says concrete things: "two coats of paint, smooth surfaces, no drips, straight edges". Those are acceptance criteria. Without them, "well" means anything. With them, you have a basis to say "wait, you're not finished yet".

When to use it

At the start of every non-trivial task. Ask the agent to propose acceptance criteria, you confirm them, then let it run. This alone eliminates 80% of "I thought you wanted something else" situations.

✗ Weak command

"Write a newsletter for the new product line launch."

✓ Command with golden words

"I want a newsletter for the new line launch. Before you start, propose acceptance criteria: subject line, structure, CTA, length, tone, images, segmentation. I confirm them, then you do the work. I don't want 4 wrong versions."

definition of done

the definition of "done"

What it means: A stricter cousin of acceptance criteria. A universal quality standard that any deliverable has to hit before being declared "done". Acceptance criteria are specific to a task; definition of done applies to everything you deliver.

In good restaurants, before any plate leaves the kitchen, the chef inspects it: right temperature, clean presentation, fresh ingredients. That is the definition of done for anything leaving the kitchen. It doesn't matter if it is soup or steak, everything goes through that filter.

Example of definition of done

✓ All tests (evals) pass
✓ Code is commented in complex areas
✓ User-facing documentation exists
✓ Sanity check has been done on results
✓ A smoke test has been run
✓ Errors are logged (observability)
✓ Tokens are in secrets, not in code
✓ Re-running with the same inputs produces no duplicate effects (idempotent)

★ Tip

Print your definition of done and put it somewhere visible. Then, on any major command to an agent, just write: "follow the standard definition of done". The agent will walk through each point.

Chapter 3 recap

Word	In a sentence	When
don't guess, verify	Don't invent, search, verify	Almost always
cite your sources	Sources for every claim	Research, statistics, "best practices"
show your work	Show the calculations, not just the result	Reports with numbers, numeric decisions
acceptance criteria	List of conditions for "done"	At the start of every task
definition of done	Universal standard for all deliverables	Defined once, applied everywhere

This chapter in practice

"Don't guess, verify", applied continuously, without you having to write it

The most powerful instruction in this guide becomes useless if you have to repeat it on every command. There is a product built on exactly this philosophy: it says "I don't know" instead of inventing, keeps a decisions journal you can audit 6 months later, and refuses to improvise when it has no basis.

See RobOS → For Claude Code users

Chapter 04

Code and architecture

Now we step into "construction" territory: how what the agent delivers is built, so that it is solid and doesn't break in 3 months. These words are not just for programmers. With them, even if you never see the code, you can demand that it meet quality standards.

separation of concerns

each piece does one thing

What it means: Your code should not be a "tangle" where everything depends on everything else. Each piece should have one clear responsibility. The piece that sends emails should not be the same one that calculates discounts.

In a good kitchen there are separate stations: one for cutting, one for sauces, one for cooking. If a new chef wants to change the sauce, they only touch the sauce station, they don't touch the cutting knives. Likewise, if you want to change how you send emails, you shouldn't have to dig through the code that calculates discounts.

✗ Weak command

"Build me a system that pulls orders from Shopify and sends emails through Klaviyo."

✓ Command with golden words

"Build me a system that pulls orders from Shopify and sends emails through Klaviyo. Apply separation of concerns: one component that only reads from Shopify, one that only sends to Klaviyo, one that coordinates them. Each should be testable independently."

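Here is a minimal sketch of what those three components could look like. The function names and the fake data are hypothetical, and the fetch and send arguments are stand-ins for real Shopify and Klaviyo calls, which are deliberately left out.

```python
from typing import Callable


def read_new_orders(fetch: Callable[[], list[dict]]) -> list[dict]:
    """Only reads orders; knows nothing about email."""
    return [order for order in fetch() if order.get("email")]


def send_confirmation(order: dict, send: Callable[[str, str], None]) -> None:
    """Only sends email; knows nothing about where orders come from."""
    send(order["email"], f"Order {order['id']} confirmed")


def sync(fetch: Callable[[], list[dict]], send: Callable[[str, str], None]) -> int:
    """Only coordinates the other two components."""
    orders = read_new_orders(fetch)
    for order in orders:
        send_confirmation(order, send)
    return len(orders)


# Hypothetical stand-ins so each piece can be tested without real services
def fake_fetch() -> list[dict]:
    return [
        {"id": "1001", "email": "ana@example.com"},
        {"id": "1002", "email": ""},  # no email address: skipped by the reader
        {"id": "1003", "email": "bob@example.com"},
    ]


sent = []
count = sync(fake_fetch, lambda address, subject: sent.append((address, subject)))
print(count, sent)
```

Because each component receives its dependencies as arguments, you can swap the email provider without touching the code that reads orders, and test each piece on its own.
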
single source of truth

SSOT for short

What it means: For any important piece of information, there is one single place where it "lives". If you want to change it, you change it in that single place and the rest of the system updates itself. The opposite is chaos: the same price written in 5 different places.

In an office, if every employee keeps their own price list, in two weeks you have 5 different lists and customers get 5 different quotes. If there is one official list, everyone looks at it. That is single source of truth.

✗ Weak command

"Configure the agent for the 3 environments: dev, staging, production."

✓ Command with golden words

"Configure the agent for the 3 environments. Use single source of truth for configurations: one centralized file per environment, the rest of the code reads from it. If I change a token, I change it in one place only."

idempotent

math term: an operation that, repeated, yields the same result

What it means: An action is idempotent if you can run it 10 times and the result is the same as after the first run. Crucial for any action with consequences (sending emails, payments, stock changes), because in the real world, something will get triggered twice by mistake.

Pressing "Floor 5" in an elevator is idempotent. Press it 10 times and the elevator still takes you to floor 5, not to 50. Pressing "Place order" on a website is not idempotent: if the customer clicks 3 times, they get 3 orders and 3 invoices. You want your system's actions to behave like the elevator button, not the "Place order" button.

When to use it

Always. For any integration. Any script that sends emails. Any sync. Any action triggered by a webhook.

✗ Weak command

"When a customer checks out, send them a confirmation email through Klaviyo."

✓ Command with golden words

"When a customer checks out, send them a confirmation email. Make the function idempotent: if it gets called twice for the same order, send only one email. Use order_id as the deduplication key."

★ Top 3 reasons "idempotent" saves your business

1. Shopify can deliver the same webhook more than once (their documentation says so and tells you to handle duplicates).

2. If your script crashes halfway, you will want to re-run it, and not have it send things it already sent.

3. Customers click buttons 2 to 3 times. Idempotency saves them from duplicates.
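Here is roughly what "idempotent, with order_id as the deduplication key" looks like in Python. This is a sketch under simplifying assumptions: the ConfirmationMailer class is hypothetical, the "email" is just appended to a list, and the record of sent orders is kept in memory, where a real system would keep it in a database.

```python
class ConfirmationMailer:
    def __init__(self):
        self.already_sent = set()  # in real life: a database table, not memory
        self.outbox = []

    def send_confirmation(self, order_id, email):
        """Safe to call many times: only the first call per order sends."""
        if order_id in self.already_sent:
            return False  # duplicate call, nothing happens
        self.outbox.append((order_id, email))
        self.already_sent.add(order_id)
        return True


mailer = ConfirmationMailer()
# Simulate the same order arriving three times (double webhook, re-run, etc.).
results = [mailer.send_confirmation("1001", "ana@example.com") for _ in range(3)]
```

Three calls, one email: the elevator button, not the "Place order" button.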

observability

you can see what the system is doing while it runs

What it means: Your system "talks" about what it is doing, via logs, metrics, alerts, so that when something breaks you can investigate quickly. Without observability, a system that stops working is a black box.

Your car has observability: tachometer, fuel gauge, "check engine" light, tire pressure sensor. If something breaks, you immediately see what. Now imagine a car with no dashboard, just the steering wheel. That is what systems without observability look like. You drive until it stops suddenly, with no idea why.

✗ Weak command

"Build the script that syncs stock from Shopify to Klaviyo."

✓ Command with golden words

"Build the sync script. Add observability: log every call with time, status, and number of products. On errors, log the full error. Send a Slack alert if it runs longer than 5 minutes."

fail loudly vs fail gracefully

"crash noisily" vs. "fail with grace"

What it means: Two complementary philosophies. Fail loudly: in development, you want errors to be obvious so you can fix them. Fail gracefully: in production, you want errors to be absorbed (friendly message, automatic retry, fallback) so the customer experience is not ruined.

When you are learning to cook, you want your partner to tell you straight: "too much salt". That is fail loudly, and it is how you learn. When serving guests, you want them to quietly pick the better slice of meat if one is over-salted, without announcing it at the table. That is fail gracefully. Your systems need both, at different moments.

✗ Weak command

"Handle errors when the Shopify API fails."

✓ Command with golden words

"Handle errors on the Shopify API. In dev, fail loudly: stop immediately, show the full error. In production, fail gracefully: retry up to 3 times, if it still fails show 'we're processing your request', and log the details for us."

DRY

acronym: "Don't Repeat Yourself"

What it means: If you notice the same logic / code / configuration showing up in 3 or 4 places, it is time to centralize it. Repetition leads to bugs: you change it in one place, forget another, and the system becomes inconsistent.

If you write your company address separately on every invoice, contract, brochure, and website, when you move you have 50 places to update and you will surely miss one. Better to have one "official source" and have the rest pull from it.

YAGNI

acronym: "You Aren't Gonna Need It"

What it means: The antidote to the urge to build features "for the future". Build only what you need now. Add the rest when you actually need it.

When you buy a house, you don't build 6 bedrooms because "maybe someday we'll have 4 kids and 2 nannies". You build the bedrooms you need now. Same with AI agents: without YAGNI, you end up with a system with 30 features, of which you use 5.

✗ Weak command

"Build a simple contact page, but also think about the future."

✓ Command with golden words

"Build a simple contact page. Apply YAGNI: don't add fields for 'eventualities', don't build complex tracking 'for when we might need it'. The minimum page that works, that's it."

Chapter 4 recap

Word	In a sentence	When
separation of concerns	Each piece, one thing	Larger projects
single source of truth	One place per piece of info	Tokens, configurations
idempotent	Repeated run = OK	Integrations, webhooks, payments
observability	Logs, metrics, alerts	Everything running in production
fail loudly / gracefully	Noise in dev, grace in prod	Error handling
DRY	Don't repeat yourself	Refactoring, reviews
YAGNI	Only what you need now	Anti over-engineering

Chapter 05

Integrations
(Shopify, Klaviyo, WhatsApp)

Almost every modern business depends on "integrations", the bits of code that connect two external systems. Integrations are the most fragile type of code in the world. The words here are the tools you use to build integrations that don't break at the first storm.

rate limiting

you cap how many requests you send per second

What it means: Every external API (Shopify, Klaviyo, WhatsApp) has a clear cap: "you can't send more than X requests per second". If you exceed it, they block you. Rate limiting is the code that makes sure you don't exceed the cap.

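On the highway there is a speed limit. You can go faster, but the police will fine you. Likewise, an API tells you something like "40 requests per second, max" (an illustrative number; real limits vary by API and plan, so check the official docs). If your script sends 100 in one second, the API "fines" you: it rejects the extra requests, typically with an HTTP 429 "Too Many Requests" error. Rate limiting is like cruise control that always keeps you under the limit.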

When to use it

In every integration. Not knowing this is one of the main causes of "scripts that worked perfectly for 30 days and then suddenly broke".

✗ Weak command

"Sync all 5,000 products with Klaviyo."

✓ Command with golden words

"Sync the 5,000 products with Klaviyo. Implement rate limiting based on Klaviyo's official limits (look them up in the docs, don't guess). If you get close to the limit, slow down; if you hit it, wait per Retry-After."

retry with exponential backoff

retry with increasing wait times

What it means: When a request fails, you don't give up immediately. You retry, but with a wait that grows each time. First time you wait 1 second, then 2, then 4, then 8. That gives the other server time to "breathe".

You call a friend. They don't answer. If you call right back, and again right back, you annoy them and they block you. If you call once now, again in 5 minutes, then 15 minutes, then 1 hour, that is civilized, and they will answer. Servers behave the same way with retries.

✗ Weak command

"If the WhatsApp API doesn't respond, throw an error and stop."

✓ Command with golden words

"If the WhatsApp API doesn't respond, implement retry with exponential backoff: max 5 attempts, waits of 1s, 2s, 4s, 8s, 16s. Only after 5 failures, report the error."

★ Tip

Set a "ceiling" on the backoff (60 or 120 seconds). Otherwise the waits keep doubling: the 10th retry alone would wait 512 seconds and arrive ~17 minutes after the first failure, which usually no longer makes sense.
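The doubling waits and the ceiling fit in a few lines of Python. This is a sketch, not a library: backoff_delays and call_with_retry are hypothetical names, and the function makes one first attempt followed by up to 5 retries with waits of 1s, 2s, 4s, 8s, 16s.

```python
import time


def backoff_delays(retries, base=1.0, ceiling=60.0):
    """Waits of 1s, 2s, 4s, ... with each wait capped at `ceiling` seconds."""
    return [min(base * 2 ** i, ceiling) for i in range(retries)]


def call_with_retry(action, retries=5, sleep=time.sleep):
    """Try once, then retry up to `retries` times with growing waits."""
    last_error = None
    for delay in [0.0] + backoff_delays(retries):
        if delay:
            sleep(delay)
        try:
            return action()
        except Exception as error:
            last_error = error
    raise last_error  # every attempt failed: report the error


# Why the ceiling matters: without it, 10 retries add up to 1023 seconds.
uncapped_total = sum(backoff_delays(10, ceiling=float("inf")))
capped_total = sum(backoff_delays(10, ceiling=60.0))
```

Without a ceiling the ten waits total 1 + 2 + 4 + ... + 512 = 1023 seconds, about 17 minutes. With a 60-second ceiling they total 303 seconds, about 5 minutes.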

webhook vs polling

"you are notified" vs. "you keep asking"

What it means: Two opposite ways of learning that something happened in an external system. Polling: you ask the system every 5 minutes. 99% of the time the answer is no. Wasted effort. Webhook: the system notifies you automatically when something happens. Efficient, real-time.

Polling = calling the courier once an hour to ask "has the package arrived?". Webhook = letting the courier ring your doorbell when they are at the door. Which is better? Obvious.

✗ Weak command

"Build a system that checks for new Shopify orders every 10 minutes."

✓ Command with golden words

"Build a system notified of new Shopify orders. Use webhook instead of polling. Configure the orders/create webhook. Use polling only as a backup, once an hour."

idempotency key

a unique code per request

What it means: A unique code you send with every important request (payment, order, email). If you accidentally send the same request twice, the server sees the same idempotency key and understands "duplicate, already processed", and does not process it again.

When you pay with a card, each transaction has a unique code. If you accidentally pay twice with the same code, the bank refuses the second one. That is idempotency key. Without it, duplicates are inevitable.

✗ Weak command

"Send the payment to the payment processor for each order."

✓ Command with golden words

"Send the payment to the processor. Use idempotency key per order (e.g., order_id). If the script runs twice, the processor will not charge twice."

pagination

you fetch the data in pages, not all at once

What it means: When you have 10,000 orders in Shopify, you can't ask for all 10,000 at once. The server will refuse or, worse, give you only the first 50, and you lose the rest without noticing. Pagination means you take the orders in "pages" and walk through every page.

Imagine a 1,000-page book. You don't read all of it at once, you go page by page. If someone asks you for the book's contents and you only hand them the first 5 pages with "that's all of it", you lose 995 pages. That is what scripts without pagination do.

✗ Weak command

"Fetch all customers from Klaviyo and sync them to our CRM."

✓ Command with golden words

"Fetch all customers from Klaviyo. Implement pagination, Klaviyo returns max 100 per request, so walk through every page. Sanity check: CRM count after sync = Klaviyo count, ±0."

Chapter 5 recap

Word	In a sentence	When
rate limiting	Stay under the API limit	Any integration
retry with exponential backoff	Retry with growing waits	Any call to an external API
webhook vs polling	Get notified, don't keep asking	External notifications
idempotency key	Unique code to avoid duplication	Payments, orders, side-effect actions
pagination	Fetch data in pages	Bulk syncs

★ Master command for any integration

"Build the integration with [X]. Apply: rate limiting per official limits, retry with exponential backoff (max 5), idempotency key per request, pagination for bulk data, webhooks instead of polling where possible. Add full observability. Write evals."

One sentence = a professional integration.

This chapter in practice

Rate limiting, retry, observability, pre-installed

All these patterns are already built and running on a VPS at the end of the course. You don't spend a month planning the architecture, hunting libraries, or debugging rate limits. Everything runs from day 1. 4 live AI agents, a visual control panel, 24/7 monitoring. You leave with the system, not a list of tutorials.

See Agent Factory → Live course · 12 seats per cohort

Chapter 06

Security for business

Security covers everything that can turn from a minor technical detail into a legal, financial, or reputational disaster. These 4 words are the minimum any business handling customer data should demand.

least privilege

each entity has exactly what it needs

What it means: When you grant access to a token or user, you give them only the permissions they need. Nothing more. If your script only reads orders from Shopify, its token must not have permission to delete orders.

When you hire a cleaner, you give them the key to the office. You don't also give them the safe key, bank account access, and your home Wi-Fi password. They have what they need for the job, nothing extra. If someone steals their key, the damage is limited. That is least privilege.

✗ Weak command

"Generate a Shopify token for the new script."

✓ Command with golden words

"Generate a token for the new Shopify script. Apply least privilege: the script only reads orders, so just read_orders. No write, products, customers, or payments scopes."

secrets management

handling secrets safely

What it means: API tokens, passwords, private keys, all are "secrets". They are never written directly into code, never sent over email, never committed to Git. They live in a dedicated place (environment variables, vault).

The key to the safe doesn't sit on the reception desk. And you don't print it on the company brochure "for transparency". It lives in a dedicated place, with limited and logged access. Same with API tokens: they are the keys to your digital business.

✗ Weak command

"Put the Shopify token in config.py."

✓ Command with golden words

"Apply secrets management: the Shopify token lives in environment variables (.env), not in the code. .env is in .gitignore. For production, use the cloud's secrets manager."

⚠ Quick test

Ask the agent: "If someone had access to our code repo, could they steal the tokens?". If the answer is yes, you have an urgent problem.
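In code, the rule comes down to reading the token from the environment and refusing to start without it. A minimal Python sketch, assuming a variable named SHOPIFY_TOKEN (the name is a placeholder) and a small helper for the rare case where a token has to appear in a log:

```python
import os


def load_shopify_token(environ=os.environ):
    """Read the token from the environment; never from the source code."""
    token = environ.get("SHOPIFY_TOKEN")
    if not token:
        raise RuntimeError("SHOPIFY_TOKEN is not set; refusing to start")
    return token


def mask(secret):
    """Safe form for logs: shows only the last 4 characters."""
    return "*" * max(len(secret) - 4, 0) + secret[-4:]
```

With this in place, the repository contains no token at all, so anyone who copies the code gets nothing usable.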

PII

acronym: "Personally Identifiable Information"

What it means: Any information that lets you identify a real person: name, email, phone, address, national ID, IP, photo. In the EU (and the UK, with similar rules), PII is protected by GDPR. There are strict legal obligations about how you store it, who has access, and how long you keep it. The US has CCPA and other state-level laws with overlapping requirements.

Your customers' data is like their ID cards. If they handed you their ID at your desk, you wouldn't photocopy it, send it on WhatsApp to friends, or post it in the storefront window. The same rigor applies to digital data.

✗ Weak command

"Log all requests for debugging."

✓ Command with golden words

"Log requests for debugging, but without PII. Don't put emails, phones, addresses, or national IDs in the logs. Use an internal ID (customer_id). If absolutely necessary, encrypt and retain for max 30 days."

audit log

who did what, when

What it means: A journal of every important action: who did it, when, on what, what changed. Different from technical logs. The audit log shows activity (who deleted a customer, who changed a price).

In accounting, every change is recorded: who made the correction, when, why. If 6 months later someone asks "why is this number this way?", you can reconstruct the history. Without an audit log, any issue stays unresolved.

✗ Weak command

"Allow editors to change product prices."

✓ Command with golden words

"Allow editors to change prices. For every change, write to the audit log: who (user_id), when (timestamp), which product, from what price to what price. The audit log cannot be deleted by regular users."

Chapter 6 recap

Word	In a sentence	When
least privilege	Only strictly necessary permissions	Every new token
secrets management	Tokens in secrets, NOT in code	Always, no exceptions
PII	Personal data (GDPR / CCPA)	Any project handling customer data
audit log	Journal: who, what, when	Irreversible changes

This chapter in practice

Secrets management, audit log, VPN, from day 0

The 4 words in this chapter are the difference between an AI system that grows your business and one that puts it at risk. On the VPS built at Agent Factory, the token vault, encrypted VPN access, audit log, and privilege separation are configured before you connect for the first time. Security is not something you "add later".

See Agent Factory → Dedicated VPS · your data stays yours · no vendor lock-in

Chapter 07

Product and decisions

The last chapter is different. Here we don't talk about how to build, but what to build. The words help you make better choices and guide the agent away from building things that shouldn't be built.

MVP

acronym: "Minimum Viable Product"

What it means: The simplest possible version that still solves the real problem. Not the best version, but the minimum one that works. You build the MVP in 2 weeks, test it with real users, learn, then improve.

You want to see whether people would buy a truffle tart. Two options: (A) Invest $5,500 in equipment, buy expensive truffles, rent a space, open a bakery in 3 months. (B) Bake 5 tarts tomorrow, take them to a weekend market, charge $7 each. If people buy them, invest. If not, you lost $15. B is the MVP.

✗ Weak command

"I want a loyalty system with points, tiers, badges, rewards, gamification."

✓ Command with golden words

"I want a loyalty system. First an MVP: just point accumulation per purchase and redemption on the next one, no tiers, badges, or gamification. We launch, measure 60 days, decide what to add."

happy path first

build the ideal case first

What it means: When you build something, start with the case where everything goes perfectly. Only after the happy path works, start handling edge cases (customer abandons mid-checkout, payment is declined, internet drops).

When you build a house, you first put up straight walls and a roof. Then you think "but what if a magnitude 7 earthquake hits?" and add an anti-seismic system. You don't pour the anti-seismic foundation for a house that doesn't exist yet.

✗ Weak command

"Build the checkout flow with all validations and protections for edge cases."

✓ Command with golden words

"Build the checkout flow. Happy path first: customer has correct data, card is valid, payment goes through. Once that works, we handle edge cases: declined card, expired session, double-click."

north star metric

the one metric that actually matters

What it means: The single metric your team will not compromise on. If you have 50 metrics on a dashboard, in practice you focus on none. The north star is the one that, if it grows, you know your business is healthy.

For an online store selling household goods, "monthly sales" sounds like the north star. But maybe the real north star is "customers who buy a second time within 90 days", because it reflects quality, not just quantity.

✗ Weak command

"Build me a dashboard with all the important metrics."

✓ Command with golden words

"Build me a dashboard. Identify the north star metric: which is the single metric we track weekly? That one at the top, large. Below it, 3 to 5 supporting metrics. Nothing extra."

feedback loop

how fast you learn whether something worked

What it means: The time between when you take an action and when you find out the result. Short feedback loops = you learn fast, correct fast. Long feedback loops = you operate blind.

When you learn to cook, you have a 30-minute feedback loop (you cook, taste, adjust). When you learn to make wine, the loop is a year. That is why most people cook well and very few make good wine. You want short feedback loops: launch fast, measure fast, adjust fast.

✗ Weak command

"We launch the new product category and measure sales quarterly."

✓ Command with golden words

"We launch the new category. Build a short feedback loop: weekly measurement of orders, AOV, return rate. Threshold: if after 4 weeks the numbers are below X, we stop. We don't run 3 months on 'let's see'."

Chapter 7 recap

Word	In a sentence	When
MVP	Minimum version that solves the problem	Start of a new project
happy path first	Ideal case first, edge cases later	Start of any build
north star metric	The only metric that matters	Dashboards, strategy
feedback loop	Speed of learning if it worked	Launches, experiments

Bonus 01

Top 5: put these on the wall

If you take only 5 words from the entire guide, pick these. We selected them by one criterion: impact per word.

Recommendation: print this page and put it next to your monitor. For the first 2 weeks, glance at it before every long command. After 2 weeks, it becomes reflex.

1

don't guess, verify

Kills guessing

Added at the end of any factual command, it cuts ~50% of agent "hallucinations". Costs 3 words, saves hours of manual checking.

2

evals

Kills "hope it works"

Forces the agent to verify its own work before declaring "done". With evals, you find bugs at delivery. Without evals, your customers find them.

3

idempotent

Kills duplicates

Duplicate orders, emails sent twice, double-charged payments, all come from non-idempotent functions. One word saves you a year of complaints.

4

acceptance criteria

Kills "I thought you wanted something else"

Defined at the start, they save you from 80% of cases where the agent delivers something off-target. They are your written contract with the agent.

5

observability

Kills blind debugging

When something breaks (and it will), observability tells you why in 5 minutes, not 5 days. The difference between "panic" and "operational calm".

★ How to use them together

One command combining all five:

"Build [X]. First define acceptance criteria, I confirm, then you start. Make it idempotent. Add observability. Write evals. Don't guess, verify, search the official docs if you're not sure."

One sentence = your product, delivered properly.

Bonus 02

Command templates

These templates are written in plain English with the golden words highlighted. Copy them directly and replace only the bits [in brackets].

Template 01, Build a new integration

When you connect two systems (Shopify ↔ Klaviyo, etc.)

"Build the integration between [System A] and [System B]. Before you start, propose acceptance criteria, I confirm, then you go. Architectural requirements:

separation of concerns: independent components for each system
idempotent: repeated run = same result
rate limiting per official documentation of each API
retry with exponential backoff (max 5 attempts)
idempotency key per request
pagination for bulk data
webhooks instead of polling where possible
secrets management: tokens in .env, not in code
observability: log every call and every error

Before you say it's done, write evals with happy path and at least 5 edge cases. Don't guess, verify: check the official documentation for anything you're not sure of."

Template 02, Automated reporting

When you want a weekly / monthly business report

"Build the [weekly/monthly] [topic] report. Requirements:

First identify the north star metric and put it large, at the top
3 to 5 supporting metrics under it that explain the north star
For every figure, show your work: data source, period, formula
Sanity check before sending: if a figure falls out of range, stop
Don't log PII, use internal IDs
Idempotent: if it runs twice, don't send twice

Before delivery, do a dry run: show me what the report would look like on last week's data."

Template 03, Marketing campaign

Mass email, mass SMS, push notifications

"Prepare the [description] campaign for [segment X]. Before sending:

Dry run required: show me how many recipients, to what segments, 3 example messages
Sanity check: if the count is ±30% off from the average, stop
Idempotency key per recipient: if the script runs twice, each person gets a single message
Rate limiting per provider limits
Respect GDPR when handling PII
Audit log: who approved the send, when

I approve the actual send after I see the dry run."

Template 04, Launch a new feature

New functionality on site / app / internal system

"Launch feature [X]. First an MVP: the simplest version that still solves the problem. Exact list of minimum capabilities, and nothing more (YAGNI). Build happy path first. Before launch:

Evals on the key scenarios
Red team the feature: what could break, how could it be broken
Smoke test in staging before production
Set a short feedback loop: measure daily for the first 2 weeks
Define the threshold at which we stop if it isn't working"

Template 05, Debugging when something breaks

System failing, unknown error, weird behavior

"System [X] is not behaving as expected. Investigate:

Think step by step: what data do we receive? what do we process? what do we deliver? where does the chain break?
Don't guess: don't infer the cause from memory. Check observability (logs)
Cite your sources: for every hypothesis, point to the log line / network request
Show your work: walk me through your reasoning
Once you find the cause, propose a minimal fix + an eval that catches this issue if it returns"

Template 06, Modify a running system

Change to code / configuration already in production

"I want to modify [X] in the running system. Before the change:

Tell me the chain of thought: what exactly you'll change, what it could affect
Red team your own change: what could break?
Ensure it preserves backward compatibility

After the change:

Smoke test: run the key actions
Run existing evals as regression
Log the change in the audit log"

Bonus 03

Common mistakes

In the first months of using "golden words", almost everyone makes the same 8 mistakes. We've listed them so you can spot them early.

Over-stuffing with technical terms

You put all 40 words into a single command, thinking "more is better". Result: the agent gets confused, tries to satisfy every criterion, and satisfies none of them well.

Fix: Use 3 to 7 relevant words per command. For simple tasks, 1 to 3. At most, the 5 from the Top 5 plus 2 specific to the context.

Using words you don't understand

You slip "idempotency key" into a command because it sounds good, but you don't know what it means. The agent asks a clarifying question and you're stuck.

Fix: Use only words you can explain in two sentences. The list you're sure of will grow naturally over time.

Missing acceptance criteria

You rush and skip the "define what 'done' means" step. Then the agent delivers something that is technically correct but doesn't solve your real problem. You start over.

Fix: 5 minutes invested in acceptance criteria at the start save 5 hours of rework later. No non-trivial task without them.

Evals without ground truth

You ask for "evals" and get 10 tests that all pass. In other words, the agent checked its own assumptions against its own assumptions. Useless. Evals without human-verified ground truth are theater.

Fix: For every important eval, you have a correct answer verified manually. That is the ground truth.

Ignoring sanity checks

You let the report send automatically without a sanity check. One day, a bug makes the report show "$0 in sales last week". You forward that to 15 people on Slack.

Fix: On every automated report or action, define 2 or 3 sanity checks. "Numbers must be between X and Y; otherwise, stop."
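A sanity check can be as small as this Python sketch. The function names and the ranges in the usage are hypothetical; the second helper covers rules phrased as "within ±30% of the average".

```python
def sanity_check(name, value, low, high):
    """Stop the run if a figure falls outside its plausible range."""
    if not (low <= value <= high):
        raise ValueError(f"{name}={value} is outside the expected range {low}..{high}")
    return value


def within_percent(value, average, tolerance_percent):
    """True if `value` is within ±tolerance_percent of `average`."""
    return abs(value - average) <= average * tolerance_percent / 100


weekly_sales = sanity_check("weekly_sales", 12500, low=5000, high=40000)
recipients_ok = within_percent(1200, average=1000, tolerance_percent=30)
```

With this in place, a "$0 in sales last week" report raises an error before it reaches anyone on Slack.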

Webhooks without idempotency

You build a system that reacts to Shopify webhooks. Works perfectly for 2 weeks. Then Shopify retries a webhook (perfectly legitimate) and your system processes the same order twice.

Fix: Every webhook handler is idempotent by design. Use event_id as the deduplication key. This is not optional.

Tokens in code or in Git

You put the Shopify token directly in config.py "to move fast". Later, someone clones the repo and you have 200 bots trying to log in. Or you accidentally commit to public Git and a scanner finds it in 30 minutes.

Fix: Never tokens directly in code. Never .env in Git. Use secrets management from day one.

Confusing "evals" with "does it work?"

You ask "does it work?" and the agent says "yes, it works". You assume it has verified. It hasn't. It just estimated that it probably works. "Does it work?" is not a verifiable question; "do the evals pass?" is.

Fix: Don't accept vague answers. Always demand concrete evidence: "how many evals pass? which don't? show me the output."

Final

Final cheat sheet

All the golden words at a glance. Print this section and keep it next to your monitor.

Not sure which solution fits you?

See the side-by-side comparison of Claude Code, Agent Factory, and RobOS. No signup, no email.

See the platform comparison →

Quality & verification

evals	Automated tests
ground truth	Verified correct answer
sanity check	"Does this make sense?"
dry run	Run without effects
smoke test	"Nothing is broken"

How the agent thinks

think step by step	Explicit steps
first principles	From fundamentals
chain of thought	Full reasoning
red team	Attack your solution
steel-man	Strongest counter-argument

Rigor

don't guess, verify	Don't invent
cite your sources	Sources with links
show your work	Show the math
acceptance criteria	List of conditions
definition of done	Universal standard

Code & architecture

separation of concerns	Each piece, one thing
single source of truth	One place per info
idempotent	Repeated run = OK
observability	Logs, metrics
fail loudly/gracefully	Noise / grace
DRY	Don't repeat yourself
YAGNI	Only what you need now

Integrations

rate limiting	Stay under API limit
retry exp. backoff	Growing waits
webhook vs polling	Get notified
idempotency key	Unique code per request
pagination	Data in pages

Security

least privilege	Only what's necessary
secrets management	Tokens in .env
PII	Personal data (GDPR/CCPA)
audit log	Who, what, when

Product & decisions

MVP	Minimum version
happy path first	Ideal case first
north star metric	The one metric
feedback loop	Fast learning

Next step

Now that you know the words...

Vocabulary is the first brick. The rest depends on where you are and what you want to build.

For Claude Code users

RobOS

The system that runs by itself: 5-layer memory, anti-improvisation verification, multi-workspace. All the words from chapters 1 to 3, automated.

Operator workspace

See RobOS →

For builders who want infrastructure

Agent Factory

Build your own agent: dedicated VPS with 4 AI agents publishing daily articles and video. All the patterns from chapters 4 to 6, pre-installed.

Live course · 12 seats per cohort

Join Agent Factory →

For those who want to talk

Direct conversation

15 minutes on WhatsApp about your business. We figure out together what fits, no sales pressure.

I answer personally

Talk to Adrian →

Not sure which one fits? See the Claude / Agent Factory / RobOS comparison →

Or write directly to office@robomarketing.ro
