Reader Question: Can LLMs really reason?
This is a topical and important question for cyber. Or put another way: can we trust LLMs to reason about security and make trustworthy decisions?
I'm not sure whether LLMs can truly reason.
But from firsthand experience, LLMs definitely show something that strongly resembles reasoning. Especially when you pair non-deterministic reasoning (what LLMs do) with deterministic tools (like code, math, structured APIs).
To me, the real question isn't whether this is "real" reasoning. It's: how much human effort does it take to make this setup trustworthy and useful?
Specifically:
Can LLMs generate solid, deterministic tools, a.k.a. code?
Yes. In my experience, they're very good at this if you get them to follow modern dev best practices and steer them well. Think: tests, structure, clarity -- they can crank it out. (There's a small sketch of what I mean by a deterministic tool after these questions.)
Can they figure out when to use those tools and use them correctly?
Sometimes. This part's squishier. You still need someone in the loop to verify that they're reaching for the right tool at the right time, and adapting if they start off wrong. They're not always great at breaking out of a bad plan on their own and can get stuck in loops.
Do they actually incorporate tool output into their reasoning?
Weirdly, not always. Sometimes they call the right tool, get the right result… and just ignore it. That's where things fall apart.
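To make "deterministic tool" concrete, here's the kind of thing I mean: small, single-purpose, testable, one unambiguous answer. The function name and scope here are made up for illustration, not taken from any particular system.

```python
# A hypothetical deterministic tool of the sort an LLM can generate well:
# check whether an IP falls inside a set of in-scope CIDR ranges.
import ipaddress


def ip_in_scope(ip: str, scope_cidrs: list[str]) -> bool:
    """Return True if `ip` falls inside any of the given CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in scope_cidrs)


# The kind of tests you'd steer the model to write alongside the tool.
assert ip_in_scope("10.0.5.12", ["10.0.0.0/16", "192.168.1.0/24"]) is True
assert ip_in_scope("8.8.8.8", ["10.0.0.0/16"]) is False
```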
So: I'm less interested in whether this is "real" reasoning in the abstract, and more focused on whether it works in practice. I'm happy to fake it til we make it -- and the gap between "fake" and "make" keeps shrinking.
Today, it's often filled by a domain expert.
Tomorrow, maybe an offshore worker with a checklist.
This is basically the shape of tool-augmented agents. Whether we call it reasoning or not, it's what future LLM-powered systems will depend on.
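For anyone who wants the shape spelled out, here's a minimal sketch of that loop in Python. Everything in it is illustrative: `call_llm` is scripted so the example runs end to end, and `is_port_privileged` is a made-up tool. In a real system you'd swap in your actual LLM client and whatever deterministic tools you trust.

```python
import json

# Deterministic tools the model is allowed to reach for (names are hypothetical).
TOOLS = {
    "is_port_privileged": lambda port: int(port) < 1024,
}


def call_llm(messages):
    """Placeholder for a real LLM client. Scripted here so the sketch runs:
    first it asks for a tool, then it answers from the tool's output."""
    last = messages[-1]
    if last["role"] == "user":
        return {"tool": "is_port_privileged", "args": {"port": 22}}
    tool_result = json.loads(last["content"])["result"]
    return {"answer": f"Port 22 privileged: {tool_result}"}


def run_agent(question, max_steps=5):
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = call_llm(messages)  # non-deterministic: the model decides what to do
        if "answer" in reply:
            return reply["answer"]
        result = TOOLS[reply["tool"]](**reply["args"])  # deterministic: we run the tool
        # Feed the tool output back so the model has to incorporate it --
        # the step that, as noted above, sometimes gets ignored.
        messages.append({"role": "tool",
                         "content": json.dumps({"tool": reply["tool"], "result": result})})
    return "No answer within the step budget."


print(run_agent("Is SSH's default port a privileged port?"))
```

The human-in-the-loop work described above lives around this loop: checking that the right tool got picked, that it got called correctly, and that the final answer actually reflects the tool's output.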