Since OpenAI introduced its “Strawberry” AI models, something intriguing has unfolded. The o1-preview and o1-mini models have quickly gained attention for their superior step-by-step reasoning, offering a structured glimpse into problem-solving. However, behind this polished façade, a hidden layer of the AI’s mind remains off-limits—an area OpenAI is determined to keep out of reach.
Unlike previous models, the o1 series conceals its raw thought processes. Users only see the refined, final answer, produced by a secondary summarizing model, while the deeper, unfiltered reasoning is locked away. Naturally, this secrecy has only fueled curiosity.
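To make that asymmetry concrete, here is a minimal sketch using the OpenAI Python SDK. The model name, prompt, and usage fields are assumptions drawn from the public API documentation, not from OpenAI's statements quoted in this article: the caller gets the polished answer and a count of hidden "reasoning tokens," but never the raw chain of thought itself.

```python
# Minimal sketch, assuming the openai Python SDK and an API key with o1 access.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1-preview",
    messages=[{"role": "user", "content": "How many r's are in 'strawberry'?"}],
)

# Only the refined, final answer comes back in the payload.
print(response.choices[0].message.content)

# The hidden reasoning surfaces only as a token count (field names assumed
# from the public API docs); the raw chain of thought is never returned.
details = response.usage.completion_tokens_details
print("Reasoning tokens billed but not shown:", details.reasoning_tokens)
```

In other words, users pay for the hidden reasoning tokens and can see how many were consumed, but the content of those tokens stays on OpenAI's side of the wall.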
Hackers, researchers, and enthusiasts are already working to break through this barrier. Using jailbreak techniques and careful prompt manipulation, they are trying to coax the models into revealing the raw chain of thought that OpenAI has concealed.
Rumors of partial breakthroughs have circulated, though nothing definitive has emerged. Meanwhile, OpenAI closely monitors these efforts, issuing warnings and threatening account bans to those who dig too deep. On platforms like X, users have reported receiving warnings merely for mentioning terms like “reasoning trace” in their interactions with the o1 models. Even casual inquiries into the AI’s thinking process seem to trigger OpenAI’s defenses.
The company’s warnings are explicit: any attempt to expose the hidden reasoning violates its policies and can result in revoked access to the AI. Marco Figueroa, leader of Mozilla’s GenAI bug bounty program, publicly shared his experience after attempting to probe the model’s thought process through jailbreaks; he quickly found himself flagged by OpenAI.
“Now I’m on their ban list,” Figueroa revealed.
So, why all the secrecy? In a blog post titled “Learning to Reason with LLMs,” OpenAI explained that keeping the raw chain of thought hidden lets the company monitor the model’s unfiltered thinking without training policy compliance or user preferences onto it, which would distort that reasoning. Revealing the raw data, they argue, could lead to unintended consequences, such as the model being misused to manipulate users or its internal workings being copied by competitors.
OpenAI acknowledges that the raw reasoning process is valuable, and exposing it could give rivals an edge in training their own models. However, critics, such as independent AI researcher Simon Willison, have condemned this decision. Willison argues that concealing the model’s thought process is a blow to transparency.
“As someone working with AI systems, I need to understand how my prompts are being processed,” he wrote. “Hiding this feels like a step backward.”
Ultimately, OpenAI’s decision to keep the AI’s raw thought process hidden is about more than just user safety—it’s about control. By retaining access to these concealed layers, OpenAI maintains its lead in the competitive AI race. Yet, in doing so, they’ve sparked a hunt. Researchers, hackers, and enthusiasts continue to search for what remains hidden. And until that veil is lifted, the pursuit won’t stop.