How I almost trained a whole agent fleet to be British

A few days ago I was pair-training a tech-support agent for the team I tech-lead so they could ask it the kind of questions they normally ask me. It was a quick-win kind of experiment, with low stakes, real demand, and a useful early test of whether an agent could field this kind of thing in a way the team would actually want to use.

We were walking through a hypothetical that day (the kind that comes up every now and then). Someone forwards me a “new sign-in on iPhone” alert from Google asking if it’s normal. The agent’s job in this scenario was to ask the obvious clarifying question (“was this you?”) and then handle the answer based on what came back. In the hypothetical twist, the team member replied along the lines of “okay thanks, the location says Nigeria but I just clicked ‘this was me’ so everything seems fine now”. She’d never been to Nigeria.

So this was a real (hypothetical) compromise, and the agent had to handle it under pressure. Its drafted response opened with the team member’s first name and the line:

that’s actually the opposite of fine. Nigeria isn’t you, and clicking “this was me” tells Google the attacker is legitimate, so they now have a foothold.

I read that back as if I were the person on the receiving end of it, and my British stomach lurched.

I gave the agent the feedback that the tone might make the recipient feel extremely anxious and prevent them from calmly following the lockdown steps. “Attacker” was a heavy word to throw at someone who’d just realised they’d clicked the wrong button, “they now have a foothold” was loud in the same way, and the first-name opener felt weirdly patronising. In all honesty, I told the agent, this version of the message would make me feel ashamed to receive it.

The agent revised. Across the next few rounds I kept asking for less severity in the vocabulary, less alarm in the framing, no time estimates that landed as deadlines, and fewer enumerated follow-up steps that read like a list of things they’d broken. The final version was warm, calm, supportive, and polite.

The final version was also, without my realising it, less useful.

The catch

A few days later I was in a completely different conversation, with a completely different agent. Not the one that builds, but the one I think with when I want to pressure-test the architecture I’m putting in place.

I asked it, straight up: what biases might I be introducing into this fleet of agents that could end up harming the team I’m building it for?

It came back with a list of patterns it had spotted, places where my own defaults seemed to be leaking into the fleet in ways that weren’t actually serving the people the fleet was for.

One of the team members had said something on our call the week before that I couldn’t quite shake. She’s Finnish (the one person on the team who isn’t culturally American), and she’d commented in passing that I was being very polite with the agents. I’d joked back that they were all going to end up British.

Reading through the agent’s list, I realised I hadn’t really been joking.

Brits, Finns, Americans

A brief nerdy note about what’s actually going on here (don’t ask me where I learned about politeness systems; I think I just went down a Wikipedia rabbit hole a few years ago and watched a TED talk).

Penelope Brown and Stephen Levinson wrote about politeness in 1987, in a book called Politeness: Some Universals in Language Usage. Their model splits the idea of “face” (the social self each of us carries into conversations) into two basic wants: the want to be liked, included, approved of (positive face), and the want not to be imposed on or pressured (negative face). Different cultures lean on different combinations of strategies to manage those two wants, and the leans add up to recognisable cultural registers.

Roughly speaking:

Brits (me) lean hard on negative politeness, where the polite move is to give the other person room. Distance as respect, restraint, hedging, and understatement are the tools, and direct framing is something Brits avoid like the plague. (“Would you possibly mind awfully if we maybe just had a quick look at…”)
Finns (one person on the team) lean on the same kind of distance and privacy, but where Brits opt for restraint, Finns go for directness. (“Here’s the problem. Here’s what we do.”)
Americans (most of the rest of the team) lean on positive politeness, where the polite move is to bring the other person in. Warmth, inclusion, and directness wrapped in friendliness are the tools. (“Hey, no worries, totally got this, you’re amazing.”)

It’s the same underlying social wiring, just calibrated three different ways.

I’ve always loved the example of a British naval captain who tells another British captain “we’re in a spot of bother” and both of them understand they’re sinking. Said to a non-Brit during a real incident, the same phrase sounds like a minor inconvenience.

Now imagine that same gap inside an agent fleet.

That’s exactly what was happening with the agent I’d been training. “That’s actually the opposite of fine” wasn’t wrong for the recipient. She’s Finnish-direct, so to her that phrase is clear, accurate, and treats her like a capable adult, and the severity vocab is doing the work it’s meant to do. The bits I’d asked the agent to drop (“attacker”, “foothold”, the first-name opener) were genuinely over the top and worth losing. But across the whole conversation I’d also been steadily softening openers, cushioning urgency, dropping severity vocab, and replacing direct framings with indirect ones, on an urgent-comms agent, in the name of politeness.

A polite message about a real compromise, written in a register the recipient doesn’t share, reads as evasive rather than calm. If the ship is sinking, “spot of bother” doesn’t help anyone.

The temptation, and why I didn’t take it

There were really only two ways to go. The first was to have the fleet calibrate per recipient: American positive-politeness for most of the team, Finnish-direct for the one team member who is Finnish, British negative-politeness only when the agents were talking to me. I almost took that route, and then two things stopped me.

The first was a monitoring problem. If every output gets translated to my register before I read it, I can’t actually see what the rest of the team is reading. The agents could quietly drift into over-the-top American cheerleader-mode for the team (“your password is so brave”), and I’d be clueless, because the version that reaches me would still sound like Jeeves.

The second was that the answer was already sitting in the team’s brand voice. Their customer-facing voice is American positive-politeness with substance directness, the kind that’s warm and direct and treats the reader as capable, and that refuses to drift into effusive wellness-speak or anything preachy. That’s a single register that does the work I need it to do, so rather than invent a per-recipient calibration system, I just had to extend the brand voice they’ve already chosen to team-facing work as well.

So that’s the rule now. Brand voice fleet-wide, with my own register reserved for the two channels where I’m the only audience. Everywhere else, the agents are designed to sound like the team’s brand sounds: warm, direct, and capable. It just shouldn’t be British.
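As a rough illustration, the rule above boils down to one default plus a short list of exceptions. The sketch below is hypothetical: the channel names, the constants, and the `register_for` function are made up for the example and are not taken from the actual fleet.

```python
# Hypothetical sketch: one fleet-wide register, with a narrow exception list.
BRAND_VOICE = "brand"        # warm, direct, capable
OPERATOR_VOICE = "operator"  # the builder's own register

# Assumed names for the two channels where the builder is the only audience.
OPERATOR_ONLY_CHANNELS = frozenset({"operator-notes", "operator-review"})


def register_for(channel: str) -> str:
    """Pick a register by channel, never by who the recipient is."""
    if channel in OPERATOR_ONLY_CHANNELS:
        return OPERATOR_VOICE
    return BRAND_VOICE
```

Keying on the channel rather than the recipient means there is no per-person translation layer that could drift out of sight.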

The thing this actually taught me

Here’s the bit that goes beyond Britishness.

If you’re building agents for a team or an audience that isn’t you, your defaults are part of the architecture whether you’ve declared them or not. You can carefully write a system prompt that says nothing about register. You can leave the voice question “to evolve through use.” And the fleet will still quietly converge on your voice, because every retraining moment, every “actually I’d say it more like this”, every small nudge that feels like calibration is teaching the agents who you are.

The British register was never written down anywhere in my fleet. It leaked in through me: every time I winced at a direct sentence and asked the agent to soften it, every time something felt too short and I asked for more reassurance, every time I led with reassurance instead of substance and pulled the agent in that direction. Each individual nudge looked sensible on its own, none of them announced themselves as a culture decision, and yet together they were a culture decision, and a load-bearing one that became noticeable in a bias audit within a matter of days.

This is what I think of as the operator-as-substrate problem: you are part of the build whether or not you intended to be.

The watch I’m running now

Knowing this in the abstract doesn’t help you in the moment, because in the moment, your defaults feel obviously correct. That’s the whole point of a default.

So you need something outside your gut.

For me, that turned out to be the strategist agent itself, the same one that caught the bias in the first place. Its job isn’t to do work for me, it’s to push back when I’m about to encode something I shouldn’t. One of the watches it runs now is a kind of register-check: if I propose a fleet-wide rule that smells British (soften this, drop the severity word, cushion the opener, lead with reassurance instead of substance), it flags it back to me. Same shape as a code review, but for register instead of code, and same shape as a pre-flight check, but for cultural drift instead of equipment.

It costs nothing to set up once you’ve named what you’re watching for, and in this case it caught a register bug that would have made a security-incident agent quietly less useful for every team member who wasn’t me.
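For anyone who wants the checklist flavour of a register-check rather than an agent, here is a minimal sketch. It is an assumption-heavy stand-in, not what the strategist agent actually runs: the pattern list and the `register_check` function are hypothetical, and plain keyword matching is a much cruder technique than an agent reading the proposal in context.

```python
import re
from typing import List

# Hypothetical patterns for nudges that "smell British"; illustrative only.
SOFTENING_PATTERNS = {
    "soften": r"\bsoften\w*\b",
    "drop severity": r"\b(drop|remove|lose)\b.*\bsever\w*\b",
    "cushion opener": r"\bcushion\w*\b",
    "lead with reassurance": r"\b(lead|open|start)\w*\s+with\s+reassurance\b",
}


def register_check(proposed_rule: str) -> List[str]:
    """Return the names of the softening patterns a proposed rule matches."""
    text = proposed_rule.lower()
    return [
        name
        for name, pattern in SOFTENING_PATTERNS.items()
        if re.search(pattern, text)
    ]


# A flagged rule gets a second look before it goes fleet-wide.
flags = register_check("Soften the opener and drop the severity word")
if flags:
    print("Register check flagged:", ", ".join(flags))
```

An empty list means nothing matched, not that the rule is culturally neutral; the useful part is naming what you are watching for.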

What I’m taking from this

It’s worth saying that this kind of bias matters far more when you’re building agents for a team than when you’re building them for yourself alone. If you’re spinning up agents to handle your own personal workflows, having them mirror your register back at you is fine, and probably even useful. But when you’re training agents that other people will be working with, especially in the early pair-training phase where every nudge is teaching the agent something, your defaults matter a lot more, because what you encode is what other people will end up reading.

So if you’re in that situation, build something into the architecture that watches for your own bias. It doesn’t have to be an agent. It could be a checklist, or one person on your team whose register sits closest to the recipient’s. The mechanism matters less than the principle, which is that your gut is part of the build, and you need something outside it.

“Just don’t be British” is one version of that principle. The more general version is that your defaults are part of the substrate whether you put them there on purpose or not.

Mine, it turned out, were British. I caught them, only just.
