Which LLM model are you using? <-- Wrong question!
By far the most common question I get when discussing a project involving LLMs is which model I used.
In my opinion, it reveals a deep misunderstanding of what LLMs are and how to work with them¹.
Choosing the right model is part of a much larger set of questions that is nowadays called context engineering. I like Anthropic's definition:
> Context engineering is the process of considering the holistic state available to the LLM at any given time and what potential behaviors that state might yield.
Context engineering is about asking: given this input and context, can I reliably trust the model to produce the output I need? It is about designing a system where you understand the relationship between your inputs and the model's capabilities well enough to trust the outputs.
Here are the actual questions we should ask ourselves (a minimal sketch for each follows the list):
- What's your prompt structure? LLMs themselves are very good at improving prompts.
- Are there ambiguities? At the very least, copy-paste your complete prompt and just ask another LLM to highlight ambiguities.
- How are you handling context window limits?
- What's your evaluation/testing strategy? You used to conduct evaluations when using traditional ML algorithms, right? Well, it's the same with LLMs. You can't skip that part.
- How are you managing state/memory?
- What tools/functions are you providing?
- How are you handling errors/retries? Without a strategy here, your system will fail in production no matter which model you choose.
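To make these concrete, here are minimal sketches for each question. They all assume a hypothetical `call_llm(prompt: str) -> str` wrapper around whatever provider you use; the task, names, and thresholds are illustrative, not prescriptive. First, prompt structure and the ambiguity check:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around your provider's chat API; swap in a real call."""
    raise NotImplementedError("plug in your provider here")

# A structured prompt: role, task, and output constraints are all explicit,
# which makes the prompt itself reviewable and testable.
PROMPT_TEMPLATE = """\
Role: You are a support-ticket triage assistant.
Task: Classify the ticket below into exactly one of: billing, bug, feature_request.
Constraints: Answer with the category name only, lowercase, no punctuation.

Ticket:
{ticket}
"""

# The ambiguity check: have a second model audit the prompt before shipping it.
AUDIT_PROMPT = (
    "Here is a prompt I intend to send to an LLM. List every ambiguity, "
    "missing constraint, or underspecified edge case you can find:\n\n"
    + PROMPT_TEMPLATE
)
# report = call_llm(AUDIT_PROMPT)
```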
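For context window limits, the simplest honest answer is a token budget you enforce yourself. This sketch uses a crude four-characters-per-token estimate; in practice, count with your provider's tokenizer:

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic (~4 characters per token); replace with a real tokenizer.
    return max(1, len(text) // 4)

def trim_history(messages: list[str], budget: int = 8_000) -> list[str]:
    """Keep the newest messages that fit within the token budget."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order
```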
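For evaluation, even a tiny regression suite beats nothing. Reusing `call_llm` and `PROMPT_TEMPLATE` from the first sketch, with made-up cases:

```python
# Fixed, labeled cases, re-run on every prompt or model change.
CASES = [
    ("I was charged twice for my subscription", "billing"),
    ("The app crashes when I upload a PNG", "bug"),
    ("Please add a dark mode", "feature_request"),
]

def run_eval() -> float:
    passed = sum(
        call_llm(PROMPT_TEMPLATE.format(ticket=ticket)).strip().lower() == expected
        for ticket, expected in CASES
    )
    return passed / len(CASES)

# Gate deployment on the pass rate, e.g.:
# assert run_eval() >= 0.95, "prompt or model change regressed the evals"
```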
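For state and memory, one common pattern (among many) is to keep recent turns verbatim and fold older turns into a running summary:

```python
class SummaryMemory:
    """Keep the last few turns verbatim; compress older ones into a summary."""

    def __init__(self, keep_last: int = 6):
        self.keep_last = keep_last
        self.summary = ""
        self.turns: list[str] = []

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        if len(self.turns) > self.keep_last:
            oldest = self.turns.pop(0)
            self.summary = call_llm(
                "Update this running summary with the new turn.\n"
                f"Summary: {self.summary}\nNew turn: {oldest}"
            )

    def context(self) -> str:
        return f"Summary so far: {self.summary}\n" + "\n".join(self.turns)
```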
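For tools, the context-engineering work is mostly in the description: the model only calls a tool correctly when its purpose and parameters are unambiguous. A definition in the JSON-schema style most providers accept (the exact envelope varies by provider):

```python
GET_ORDER_STATUS = {
    "name": "get_order_status",
    "description": (
        "Look up the shipping status of an order by its ID. "
        "Use only when the user provides an explicit order ID."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {
                "type": "string",
                "description": "The order identifier, e.g. 'ORD-12345'.",
            },
        },
        "required": ["order_id"],
    },
}
```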
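And for errors and retries, a sketch of bounded retries with backoff, where a malformed answer is treated like a transient failure:

```python
import random
import time

def call_with_retries(prompt: str, is_valid, max_attempts: int = 3) -> str:
    for attempt in range(max_attempts):
        try:
            answer = call_llm(prompt)
            if is_valid(answer):
                return answer
        except Exception:
            pass  # provider or network errors are retried too
        time.sleep(2 ** attempt + random.random())  # backoff with jitter
    raise RuntimeError(f"no valid answer after {max_attempts} attempts")

# answer = call_with_retries(
#     PROMPT_TEMPLATE.format(ticket="..."),
#     is_valid=lambda a: a in {"billing", "bug", "feature_request"},
# )
```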
Context engineering goes far beyond providing the right textual context to the model. Again, it is about deeply understanding the problem you are trying to solve, and designing the prompt and the system around it so that the LLM operates at a level where you can trust it (that is, where you can be confident the model will perform appropriately).
Next time someone shows you their LLM project, skip the model question. Instead, ask about context engineering (in particular, evaluation strategies). That's where the real engineering work lives.
---
1. It's as if picking the right (best) LLM would just fix everything, like picking the right ML algorithm would.