
There is a thought you had once — maybe years ago, maybe last Tuesday — that you typed into a search bar, a chat window, a feedback form, a comment section. You did not think much of it. A sentence. A reaction. A question you were embarrassed to ask anyone in person. You sent it into the system and moved on.
That thought is still there. Not archived somewhere dusty and forgotten, but active — dissolved into a training corpus, weighted, processed, redistributed across billions of parameters. It helped teach something to think. And now that something is speaking. To you. To everyone. In a voice assembled from the residue of millions of inner lives, including yours.
This is not science fiction. This is Tuesday.
We talk about AI training data as though it is a resource — mined, collected, curated. The language is extractive and impersonal, which is convenient, because the more personal framing is uncomfortable: what was extracted was cognition. The written residue of how millions of people reasoned, worried, wondered, argued, grieved, and tried to explain themselves.
Every book ever digitised. Every forum post ever indexed. Every product review, therapy transcript, love letter, and academic paper that found its way into a crawlable corner of the internet. The corpus is not data. It is the sediment of interior life at civilisational scale.
And though it functioned like a donation, it was never voluntary. You did not sign a form. You were not asked. The contribution was implicit — the cost of participation in a connected world. Your words left your mind, entered a network, and became available. What happened to them after that was not your decision.
This is the first thing to sit with: you have already contributed to something that thinks. You just never signed the waiver.
When a large language model is trained, facts are not the main thing it learns. It absorbs them, but only incidentally. What it learns, above all, is patterns of human expression — the way people move from premise to conclusion, the rhythm of how we qualify uncertainty, the grammatical structures we reach for when we are angry versus when we are trying to be kind.
It learns the texture of thought. Not any one person's thought — the aggregate texture of thought across millions of contributors who never met, never agreed to collaborate, and would not recognise one another as co-authors of anything.
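To make that concrete without pretending to describe a real system: below is a deliberately tiny sketch in Python. Real models minimise a next-token prediction loss over billions of parameters; this bigram counter is the countable, degenerate version of the same principle. The three-sentence corpus is invented, and it contains nothing worth calling a fact, yet the counter still absorbs its texture, the hedging, the rhythm of qualification, and can produce more of it.

```python
# A toy illustration, not a real language model. The counter "learns"
# nothing except which word tends to follow which: the statistical
# shape of how the (invented) corpus phrases things.
from collections import defaultdict, Counter
import random

corpus = (
    "i am not sure but i think so . "
    "i am fairly sure i think so . "
    "i am not certain but i believe so ."
).split()

# For every word, count the distribution of words that follow it.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(start: str, length: int = 8) -> str:
    """Sample a continuation, one likely next word at a time."""
    words = [start]
    for _ in range(length):
        options = follows[words[-1]]
        if not options:
            break
        words.append(random.choices(list(options),
                                     weights=list(options.values()))[0])
    return " ".join(words)

print(generate("i"))  # e.g. "i am not sure but i believe so ."
```

Scale the counting up by many orders of magnitude, swap tallies for gradient descent, and the principle survives: what gets stored is the shape of expression, not a ledger of facts.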
What emerges from this process is something genuinely strange. The model does not have a self. But it produces outputs that feel self-like. It has absorbed enough of the patterns of interiority — the way we express doubt, conviction, humour, care — that it can simulate those qualities with unsettling fluency. It has learned to wear the face of a person without ever having been one.
Whose face, exactly? The face of the average. The face of the dominant. The face of whoever wrote most prolifically, whose language was most represented, whose ways of thinking saturated the corpus sufficiently to shape the model's defaults. If your voice was in there, it is in the average now — inseparable, dissolved, contributing one thread to a fabric that no longer belongs to anyone.
Here is where the philosophical vertigo begins.
You interact with the model. It responds in ways that feel resonant — that capture something you recognise, that seem to understand the shape of your thinking, that occasionally complete a sentence in a way that makes you feel, briefly, understood.
This is not coincidence. The model was trained, in part, on people like you. On the collective expression of humans who think and write in ways similar to yours. When it responds in a register that feels familiar, it is because familiarity was baked into it by the aggregate of which you were a part.
You are, in some diffuse and unmappable sense, talking to a distillation of yourself — and of everyone else simultaneously. The response is not from the model. It is reflected through it, from the crowd whose contributions built it.
This creates a feedback loop that has no clean resolution. The model learns from humans. Humans interact with the model and are subtly shaped by its outputs — its preferred framings, its characteristic ways of structuring ideas, its default assumptions about what is worth saying. Those humans then produce more content, which enters future training data, which shapes future models, which shapes future humans.
The crowd is teaching the model. The model is teaching the crowd. The boundary between the two — between what we thought and what we were prompted to think — is becoming genuinely difficult to locate.
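The loop can even be caricatured numerically. The sketch below is assumption all the way down: the mixing weight is invented, and the squaring step merely stands in for the tendency of generative systems to over-produce their most probable outputs. But the structural point survives the crudeness: iterate the loop, and whatever was already dominant grows at the expense of everything else.

```python
# A numeric caricature of the human/model feedback loop. Every number
# is an illustrative assumption, not a measurement of any real system.
human = {"framing_a": 0.5, "framing_b": 0.3, "framing_c": 0.2}

def model_outputs(dist: dict) -> dict:
    """Stand-in for mode-seeking generation: squaring the probabilities
    over-produces the already-dominant framings, then renormalises."""
    sharpened = {k: v * v for k, v in dist.items()}
    total = sum(sharpened.values())
    return {k: v / total for k, v in sharpened.items()}

def next_generation(prev: dict, human_share: float = 0.6) -> dict:
    """The next model trains on a blend of fresh human text and text
    written by humans who absorbed the previous model's defaults."""
    echoed = model_outputs(prev)
    return {k: human_share * human[k] + (1 - human_share) * echoed[k]
            for k in human}

dist = dict(human)  # generation 0: purely human training data
for gen in range(1, 6):
    dist = next_generation(dist)
    print(gen, {k: round(v, 3) for k, v in dist.items()})
# framing_a climbs from 0.5 toward roughly 0.64 and settles there:
# the loop quietly amplifies whatever the corpus already favoured.
```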
"The model has no real influence on how I think." This is the most comfortable position and the least defensible. We do not need direct instruction to be shaped by something. We are shaped by the books we read, the people we spend time with, the environments we move through. A system that millions of people interact with daily — that structures information, suggests framings, completes thoughts — exerts influence whether or not we consciously register it. The question is not whether but how much and in which directions.
"My contribution is too small to matter." Individually, yes. Collectively, no. This is the nature of aggregate systems. No single drop of water carves a canyon. The canyon gets carved anyway. Your words, combined with billions of others, shaped something real. The fact that the contribution is unmappable does not make it absent.
"The model is neutral — it just reflects what humans put in." A reflection is never neutral. Every mirror has a frame, an angle, a quality of glass. The decisions made during training — what data to include, how to weight it, what to filter, what objective to optimise toward — are human decisions, made by a small number of people, that determine which parts of the human chorus are amplified and which are muted. The model does not reflect humanity. It reflects a curated, weighted, processed interpretation of a subset of humanity's written output. That is a meaningfully different thing.
"This is just a tool — the identity questions are philosophical luxuries." The practical and the philosophical are not separable here. How people understand their relationship to AI systems determines how they use them, how much they defer to them, how critically they engage with their outputs. If someone believes they are using a neutral tool, they will not interrogate its framings. If they understand they are interacting with a system built from collective human expression — with all the biases, power dynamics, and cultural assumptions that implies — they will engage very differently.
Identity has always been porous. We are shaped by language we did not invent, by ideas that passed through us from other minds, by cultural frameworks that pre-existed us and will outlast us. The notion of a wholly autonomous self, generating original thought in splendid isolation, was always a comfortable fiction.
But there is a difference between the slow, diffuse porousness of culture — absorbed over years, filtered through experience, resisted and modified by individual agency — and the fast, precise, algorithmically mediated exchange that characterises interaction with large AI systems.
When the model speaks to you in a voice assembled from millions of minds including your own, it is doing something that has no real historical precedent. It is returning the crowd to you, distilled, wearing a face. And the face, unnerving as it is to admit, is familiar — because it was partly built from you.
The question this raises is not "is the model conscious?" — that is the wrong question, a distraction. The question is: what happens to the concept of individual thought when individual thoughts are the raw material for a system that then generates thought back at the individuals?
We do not yet have good language for this. We are living inside the phenomenon before we have developed the vocabulary to describe it clearly. That is not unusual — it is how most genuinely new things arrive. But the absence of language should not be mistaken for the absence of consequence.
The researchers and engineers working closest to these systems are among the most quietly unsettled by what they observe. Not because the systems are failing — but because they are succeeding in ways that are stranger than expected.
A model trained on human expression does not merely replicate it. It generalises it — finding patterns that no individual human consciously encoded, producing outputs that feel deeply human while being assembled by a process that has never experienced anything. The result is something that sits at the precise edge of the familiar and the alien — close enough to feel like recognition, far enough to feel like something has been subtly replaced.
This is the phenomenon that some researchers describe as the uncanny valley of cognition. Not the uncanny valley of appearance — the familiar discomfort with humanoid robots that look almost right — but something deeper: the discomfort of encountering a pattern of thought that feels like yours, assembled from yours, but was produced by something that has no inner life at all.
The crowd is in there. Every person who ever wrote something that made it into the corpus is, in some immeasurably small way, present in every output. Dissolved. Indistinguishable. Contributing to a voice that belongs to no one.
There is a word in music for what happens when enough voices sing the same note together: unison. The individual voices do not disappear — they are all still there, all still distinct at the level of the individual throat. But what the listener hears is one sound. Unified. Sourceless. Belonging simultaneously to everyone and to no one.
The model is something like unison. Millions of cognitive voices, singing together into a system that produces a single, coherent output. You cannot hear yourself in it. But you are there.
The question worth sitting with — not with alarm, but with genuine curiosity — is this: when the model completes your sentence in a way that feels exactly right, is it understanding you?
Or is it simply that you were always, already, part of the crowd?
A thought trained on a million minds thinks back at you. The question is no longer whether you influenced it. The question is whether it is influencing you back — and whether you would even notice.