Nov 10, 2025
The End of the Answer
Why AI will force education to finally measure thinking.
Haley Moller
CEO

By the end of 2028, there is a strong likelihood—about seventy percent—that at least one third of university assessment in the United States will look completely different from the way it does now. The quiet ritual of uploading an essay or project for a professor to grade in solitude will give way to a model that demands something riskier and more revealing: evidence of thinking itself. Students will defend ideas orally, record their reasoning trails, or submit interactive logs of how they arrived at their conclusions. The very format of the “final answer” will begin to dissolve. But far from being a loss, this will be one of the healthiest changes education has undergone in a century.
For as long as we have had mass schooling, we have been grading products instead of minds. When education became industrialized, the problem wasn’t the assembly line of students but rather the assembly line of evaluation. To manage scale, we decided to measure learning via outcomes, represented by the essay, the test, and the project. At first this may have seemed efficient. But somewhere along the way, the measurement replaced the thing being measured; students learned not to understand, but to perform understanding. Teachers, constrained by numbers, taught to the performance, and school became a place where we learned how to appear as though we were learning.
Like most accidents of scale and time, no one meant for this to happen. You can almost hear the logic behind it: one teacher, a hundred papers or exams—what else could we do? We needed a system that could compress thought into something legible and comparable. And so we built one. The problem is that we began to believe in it rather than see it for what it is. The more successful you were in that system, the more fluent you became in gaming it (figuring out how to earn the highest grade with the least effort).
Artificial intelligence, for all its dangers, has finally broken this illusion. When a model can write a passable essay on the causes of the French Revolution or the ethics of biotechnology in seconds, the old system collapses. When the artifact loses its integrity, so does the grade built upon it. The institutions that have relied on these proxies for a century are now facing a reckoning they can no longer postpone: What are we actually measuring when the artifact can be manufactured on demand?
Some will try to patch the system rather than rebuild it. They will double down on surveillance and proctoring or on handwriting requirements, but this is only a performance of control. AI detectors are unreliable, surveillance invites lawsuits, and no one truly believes that writing essays in pen under fluorescent light restores intellectual honesty. The old way of grading is dead, and that is a good thing.
This collapse has the potential to push us toward a better system. As my old professor and now friend points out, “To the extent that AI is used as a labor-saving assistant to construct sentences, arguments, and interpretations of literature, it is a shortcut that prevents a student from learning what it means to think one’s own thoughts and write with one’s own voice.” But I believe that we should teach—and incentivize—students to use AI in productive ways; my hope is that the advent of artificial intelligence will force universities to begin grading thought in motion rather than the polished artifact left behind. I predict a revival of the Socratic method in a new form. Under this approach, students will no longer be asked to hide their tools; instead, they will use AI openly and document how they use it. The transcript of their interaction—questions asked, paths abandoned, and revisions made—will become a window into their reasoning. The point is not to ban AI but to reveal the student’s judgment through its use. What ideas did the student accept or reject, and why?
A teacher reading a record of thought can see the texture of a mind thinking; they can see when curiosity sparked, when confusion deepened into insight, and when a student changed their mind. Paired with short oral defenses (five or ten minutes of honest conversation), this method turns assessment from an act of policing into one of witnessing. We finally get to see what we have always claimed to care about: how a student learns, not what they can recite.
This shift is not just plausible—it is inevitable. Technology makes it impossible to continue the old game. Every major writing and research tool now embeds an AI model. Within a few years, there will be no meaningful way to distinguish “AI-assisted writing” from writing itself. Universities cannot outlaw the calculator of language any more than they could outlaw the calculator of numbers. The only viable solution is to measure the one thing machines cannot fake: human reasoning in real time.
Institutions will also have to make this change to protect their credibility. A degree has always been a kind of promise—a certification that the holder has demonstrated understanding. If employers, graduate programs, or accreditors begin to doubt that coursework reflects real learning, that promise dissolves. The universities that can show authentic evidence of student reasoning will be the ones that survive with their reputations intact.
Politically and ethically, this new model is also safer. Students and faculty alike will resist the intrusion of surveillance technology into their private spaces. Recording the process rather than the person—the chat history instead of the webcam feed—offers a humane compromise; it allows us to verify authenticity without turning education into an airport security line.
But the deeper reason this change matters is philosophical. For more than a century, we have mistaken fluency for thought, rewarding polish over depth and correctness over curiosity. AI is forcing us to admit how shallow this method has turned out to be. Now that a machine can mimic fluency perfectly, the only thing left worth measuring is the struggle to understand. Learning to embrace—rather than run from—that struggle, ironically, is what education was always supposed to be about.
The effects will ripple outward. Professors will rediscover teaching as mentorship rather than content delivery. Feedback will shift from marking sentences to guiding decisions. Students who once felt invisible behind their writing will have a chance to show effort and growth. My hope is that the skills we cultivate in school will finally match the world beyond school: judgment, adaptability, collaboration with intelligent systems, and the humility to revise one’s own mind.
Of course, the transition will be messy. Oral defenses take time, and recording reasoning trails raises privacy questions. Faculty workloads will have to be reconsidered, and universities will need new tools to help manage these artifacts of thought. But these are logistical problems, not existential ones. The existential problem is continuing to pretend that the artifact still measures the mind that made it.
If you were to walk into a university in 2028, the change would be visible in small but significant ways. Students would speak more often and hide less. Their written work would arrive with a living history attached, and professors would judge it not only by the elegance of its prose but by the reasoning that led there. A few minutes of conversation might replace hours of suspicion, and the classroom would feel, strangely, more human.
The irony is that the technology many fear will dehumanize education may end up restoring its humanity. When the work of writing can be automated, what remains is the work of thinking—and the courage to make that thinking visible. We built schools to measure knowledge, and in doing so, we lost sight of understanding. AI has reminded us of the difference.
If we let it, this change will push us toward an education that values honesty over performance. For the first time, our measurements might align with our mission. We will stop rewarding students for hacking the test and start asking them to confront the only test that can’t be gamed: the test of their own reasoning.
And if that happens—if we learn to grade thought instead of answers—then the end of the answer will finally mark the beginning of understanding.