15 March 2026

A Mechanistic View of Thinking and the Brain

by Janaki Ram Puli

I have been thinking about learning, the brain, and how all of this relates to LLMs. I have also noticed significant changes in the way I read, learn, and focus. These changes are driven by AI-generated content, short-form media, and rapid context switching. So I want to explore what is actually going on inside the brain and how it relates to LLM thinking.

Content/Code generation is cheap, verification is still hard
Attention span
Two Modes of Thinking
Chunking
Neurotransmitters and what motivates us?
Procrastination as a habit loop
Techniques that help
Flow state
Thinking… VS <thinking>...</thinking>
Fast thinking and slow thinking
The illusion of thinking
What we do differently?
References

Content/Code generation is cheap, verification is still hard

There is a shift in how people code or even write content, and a lot of it is AI-generated at this point: articles, emails, messages, PRs, etc. Having used them quite extensively, I can usually tell immediately. There is nothing wrong with that. In fact, they do a much better job than me in any of these areas. But it’s also very hard to verify whether anything in it is actually correct. Generation has become extremely cheap. Verification is still expensive. As the volume of generated content grows, it gets more expensive.

Attention span

With short-form content everywhere and LLMs helping with everything, I’ve noticed that my attention span has gone down significantly. I can do things faster now. But I’m also trying to do everything faster, even when that’s not the right way to approach the problem.

For example:

research
learning a new topic
deeply understanding a concept

Those things require long periods of uninterrupted thinking. But that’s getting harder. How information is engineered, curated, and served to us matters. Short-form video and infinite scroll are designed with no natural stopping cues. Constant rapid context switching is affecting cognitive performance and memory. Rapid switching between contexts increases cognitive load and reduces attention and learning efficiency.

Coding itself is becoming somewhat commoditized. I honestly don’t remember the last time I manually wrote large chunks of code at work.

But long-term skills like deep thinking and genuine understanding feel more valuable than ever. This can be a valuable skill to possess in the era of vibe-coding.

So I thought it’s better to take a step back and understand what is actually going on.

So I started exploring:

how humans actually learn
how the brain processes ideas
how thinking works
how this compares to LLMs
what the gaps are between our brain and LLMs

While exploring, I came across this blog post from Aleksa Gordić’s blog [2] 5 Tips to Boost Your Learning - Aleksa Gordić. .

And the Coursera course Learning How to Learn [3] Coursera: Learning How to Learn. .

Two Modes of Thinking

The brain operates in 2 modes of thinking.

Focused mode - focused thinking happens when we’re actively concentrating on something. This requires deliberate attention and pattern matching to already existing networks.

solving a math problem
reading something carefully
reasoning, deliberate attention

Diffuse mode - this mode of thinking happens when we’re relaxed.

walking
showering
sleeping
daydreaming

During this mode, the brain actually makes connections between ideas that may not seem related before. Quite often, a good idea strikes us during this mode (light sleep, shower, brisk walk, …): the “aha” moment.

Can these 2 run simultaneously? No.

Chunking

A chunk in the brain is basically a piece of knowledge. It can be a pattern, concept, or a skill … When you truly understand something, it becomes a chunk.

To form a chunk, you need:

focused attention
practice (DIY)
rest (diffuse mode to consolidate)
recall (retrieve it later to reinforce)
make mistakes (error signals to correct)

Illusion of competence: simply looking at or reading through solutions, or watching some YouTube stuff to understand something, can create an illusion of competence.

It happens, right? You think you understand that topic until someone asks about it, or something goes wrong, and you realize that you didn’t understand it properly. So you have to work on it yourself to build the neural pattern.

A useful technique before actually learning any topic is to get a sense of the big picture: skim through headings, images, tables, and the author’s work. Also, I don’t want to overlearn something; that can have diminishing returns. Better to jump to something else.

Neurotransmitters and what motivates us?

Acetylcholine - associated with focused learning and attention
Dopamine - encodes reward anticipation and assigns value to stimuli. It spikes in anticipation of a future reward. This is required to bridge the gap between the current state and the desired goal.
Serotonin - linked to social behaviour and risk-taking. Lower serotonin correlates with more risk-averse, emotionally reactive behaviour.
Emotions broadly - being angry, stressed, or afraid actively impairs learning.

How is the brain deciding that this is beneficial? Before any significant task, the brain performs a quick cost-benefit analysis:

effort required now
reward that may come later

The issue is that the brain heavily discounts future rewards - temporal discounting. So difficult tasks feel painful now because the benefit feels distant and abstract.

Meanwhile, scrolling provides immediate reward with near-zero effort.

All the modern social media systems are very good at exploiting this. Short-form videos and infinite scroll constantly reset our attention. Now our brain, having seen all this, expects new information every 15-30 seconds. Once your baseline dopamine level is elevated through this overstimulation, the threshold for new information to feel stimulating rises. If something takes longer to understand, it suddenly feels boring and slow.

But the brain is plastic, and we can rewire it based on this experience.

Procrastination as a habit loop

This is the loop that is actually causing procrastination (habit loop).

cue - a trigger (time, location, emotion, …)
routine - the action
reward - immediate relief
belief - reinforcement

The loop becomes so automatic that you can close an app and reopen it five seconds later without consciously deciding to.

example: cue -> difficult task (as perceived by the brain) routine -> scroll Twitter reward -> immediate relief belief -> the brain quickly learns that avoiding a task produces the dopamine reward

You need to change your reaction to the cue. Rewire the routine by substituting a different behaviour.

Techniques that help

Pomodoro Technique

25 mins focused work
5 minute break
repeat

Why will this help? We’re trying to focus on the process rather than the result. I need to finish this paper -> I’ll work on this for 25 mins. Psychological resistance is lowered as part of this.

Spaced Repetition

Instead of doing everything at once, try to revisit ideas over time, like working on some idea/project/paper.

Why does this work? It gives the brain time to consolidate and recall. STM -> consolidation -> LTM.

Temptation Bundling

Pair a high-effort task with an enjoyable activity. The dopamine from the enjoyable activity lowers the perceived cost of doing the hard task.

Physical Exercise

Physical exercise promotes the growth of new neurons and significantly improves mood, focus, and memory consolidation. Even a brisk walk can surface insights that were stuck during focused work.

Writing

One thing I personally noticed:

When I write about something and take notes, I feel much more in control of it (sometimes it’s an illusion and often it’s real). It forces you to organise and articulate what you know, which reveals the gaps.

As the saying goes - if you can’t explain it, you don’t understand it.

Flow state

I cannot describe what a flow state is, but you know when you’re in it. When I enter it, I can focus for extremely long periods of time without any distraction.

Some patterns I have noticed:

working on something genuinely driven by curiosity
having a clear goal and visible progress

Thinking… VS `<thinking>...</thinking>`

It’s natural to compare human cognition with modern LLMs. Do they actually think?

First, they’re impressive and probably the coolest tech we have. And at the same time, they’re quite dumb in that they can’t solve simple problems sometimes and struggle with context rot, instruction following, …

So what makes them similar to us, and what makes them different from us?

First, they’re autoregressive in nature -> they generate the next token based on the probability distribution over all previous tokens.

\[P(\text{next token} \mid \text{previous tokens})\]

Now we have a really powerful pattern predictor.

What do humans seem to do differently?

we have persistent memory
world models
embodied experience
planning abilities
memory consolidation
diffuse thinking
continual learning

We can spend hours or even days thinking about a problem before actually responding.

Fast thinking and slow thinking

Fast thinking - operates quickly, is intuitive and automatic, and requires little energy. Slow thinking - deliberate, logical, and expensive. For LLMs: Fast thinking - standard next-token generation with no reasoning. Slow thinking - Chain-of-thought (CoT) prompting, reasoning step by step.

Methods like CoT (chain-of-thought) prompting ask models to reason step by step.

<thinking> step1 step2 step3 ... </thinking>

Ex. Let’s think step by step.

This made a huge difference for thinking/reasoning models compared to normal models.

DeepSeek reasoning-related figure — Figure: aha moment

But are they really thinking? Or is this just an illusion, where it is generating a sequence of tokens that look like reasoning?

The illusion of thinking

The “illusion of thinking” paper [4] The paper discussing the ‘illusion of thinking’ in reasoning models. suggests that reasoning models perform well on medium-complexity tasks but struggle with complex reasoning that requires novel problem solving.

Is the <thinking> block generating a sequence of tokens that looks like reasoning, in the same way I can write “let me work through this” and then actually reproduce a memorised solution?

And LLMs compound errors as they generate longer sequences. Without a persistent world model or genuine planning mechanism, errors in early reasoning steps propagate and amplify. Common-sense physics, causal reasoning, and anything requiring grounding in sensory reality remain difficult.

Illusion of Thinking paper figure — Figure: Image from ‘Illusion of Thinking’ paper.

What we do differently?

What labs are doing right now:

Token-level next-word prediction
Post-training and alignment
Scaling - more data, more compute, more parameters

What would it take for AGI?

World models

We have some internal models for physical reality. Ex. I can predict where a ball might land when I throw it. Internal representations that allow agents to simulate, predict, and plan.
Continual learning

We keep learning continuously, and we don’t overwrite older memories. So, the ability to integrate new knowledge without forgetting much.
Latent space reasoning

Thinking in compressed, abstract representations rather than just at the token level. Latent variable z and the final outcome variable y.

$P(y) = \sum_{z \sim P(z)} P(y \mid z)$

let x denote the question, y the answer to it, and z as the latent

$P(y \mid x) = \sum_{z \sim P(z \mid x)} P(y \mid x, z)$
Diffuse thinking

We can solve problems when we’re not actively thinking about them, like a background process. Instead of step by step, the reasoning emerges gradually by de-noising latent representations. Chain of thought -> diffusion of thought
Embodied intelligence

We can interact with the physical world and learn from it. vision, touch, physics, motion, …
RL environments

RL in rich, uncertain environments seems like a natural step for developing planning and adaptive behaviour
Memory consolidation

We constantly re-organize our memory. Agents should constantly consolidate, reconsolidate, and selectively retrieve knowledge over long time horizons.

The first principle is that you must not fool yourself - and you are the easiest person to fool.

— Richard Feynman

Some ideas that I want to explore more deeply in future:

thinking in latent space
world models
diffusion models
continual learning
reward hacking during RL
how to perform RL in CoT style

References:

tags: learning, thinking