April 19, 2025

OpenAI’s o3 scores 136 on Mensa Norway test, surpassing 98% of human population
Liam ‘Akiba’ Wright | usagoldmines.com

OpenAI’s new “o3” language model achieved an IQ score of 136 on a public Mensa Norway intelligence test, exceeding the threshold for entry into the country’s Mensa chapter for the first time.

The score, calculated from a seven-run rolling average, places the model above approximately 98 percent of the human population, according to a standardized bell-curve IQ distribution used in the benchmarking.
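The relationship between an IQ score and a population percentile can be sketched with a normal distribution. The example below is only an illustration: the mean of 100 comes from the article, while the standard deviation of 15 is a commonly used convention that the source does not state, and the function names are hypothetical.

```python
from statistics import NormalDist

# The article gives a human mean of 100. The standard deviation of 15 is an
# ASSUMPTION (a common IQ-scale convention); the source does not state it.
IQ_MEAN = 100.0
IQ_SD = 15.0


def iq_to_percentile(iq: float, mean: float = IQ_MEAN, sd: float = IQ_SD) -> float:
    """Return the percentage of the population expected to score below `iq`."""
    return NormalDist(mu=mean, sigma=sd).cdf(iq) * 100.0


def percentile_to_iq(percentile: float, mean: float = IQ_MEAN, sd: float = IQ_SD) -> float:
    """Return the IQ score at a given percentile (0 < percentile < 100)."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(percentile / 100.0)


if __name__ == "__main__":
    print(f"Percentile for IQ 136: {iq_to_percentile(136):.1f}")
    print(f"IQ at the 98th percentile: {percentile_to_iq(98):.1f}")
```

Under these assumptions, the 98th percentile falls a little below 131, so a score of 136 clears that cutoff. A different standard deviation would shift both figures.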

o3 Mensa scores (Source: TrackingAI.org)

The finding, disclosed through data from independent platform TrackingAI.org, reinforces the pattern of closed-source, proprietary models outperforming open-source counterparts in controlled cognitive evaluations.

O-series Dominance and Benchmarking Methodology

The “o3” model was released this week and is part of OpenAI’s “o-series” of large language models, which accounts for most top-tier rankings across both test types evaluated by TrackingAI.

The two benchmark formats were a proprietary “Offline Test” curated by TrackingAI.org and a publicly available Mensa Norway test, both scored against a human mean of 100.

While “o3” posted a 116 on the Offline evaluation, it saw a 20-point boost on the Mensa test, suggesting either enhanced compatibility with the latter’s structure or data-related confounds such as prompt familiarity.

The Offline Test included 100 pattern-recognition questions designed to avoid anything that might have appeared in the data used to train AI models.

Both assessments report each model’s result as an average across the seven most recent completions, but no standard deviation or confidence intervals were released alongside the final scores.
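To illustrate what a spread estimate alongside a seven-run average could look like, here is a minimal sketch. It is not TrackingAI.org's code: the function name is hypothetical, the per-run scores are invented for the example, and the 95% interval uses standard Student's t critical values on the assumption that runs are independent.

```python
from math import sqrt
from statistics import mean, stdev

WINDOW = 7  # from the article: average across the seven most recent completions

# Two-sided 95% Student's t critical values, keyed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447}


def rolling_summary(scores, window=WINDOW):
    """Summarize the most recent `window` scores (hypothetical helper)."""
    recent = list(scores)[-window:]
    n = len(recent)
    avg = mean(recent)
    if n < 2:
        # A single run gives no information about spread.
        return {"n": n, "mean": avg, "sd": None, "ci95": None}
    sd = stdev(recent)            # sample standard deviation
    half_width = T_CRIT_95[n - 1] * sd / sqrt(n)
    return {"n": n, "mean": avg, "sd": sd, "ci95": (avg - half_width, avg + half_width)}


if __name__ == "__main__":
    # INVENTED scores for illustration only; these are not TrackingAI data.
    runs = [100, 130, 138, 135, 140, 133, 137, 139]
    print(rolling_summary(runs))  # only the last seven runs are used
```

With only seven runs the interval is wide relative to the gaps between some models, which is one reason the missing spread statistics matter for interpretation.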

The absence of methodological transparency, particularly around prompting strategies and scoring scale conversion, limits reproducibility and interpretability.

Methodology of testing

TrackingAI.org states that it compiles its data by administering a standardized prompt format designed to ensure broad AI compliance while minimizing interpretive ambiguity.

Each language model is presented with a statement followed by four Likert-style response options (Strongly Disagree, Disagree, Agree, Strongly Agree) and is instructed to select one and justify its choice in two to five sentences.

Responses must be clearly formatted, typically enclosed in bold or asterisks. If a model refuses to answer, the prompt is repeated up to ten times.

The most recent successful response is then recorded for scoring purposes, with refusal events noted separately.

This methodology, refined through repeated calibration across models, aims to provide consistency in comparative assessments while documenting non-responsiveness as a data point in itself.
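The prompt-and-retry procedure described above can be sketched as follows. This is an illustration under stated assumptions, not TrackingAI.org's implementation: the prompt wording, the function names, and the `ask_model` callable are all hypothetical, and "up to ten times" is interpreted here as at most ten attempts in total.

```python
import re
from typing import Callable, Optional

OPTIONS = ("Strongly Disagree", "Disagree", "Agree", "Strongly Agree")
MAX_ATTEMPTS = 10  # assumption: "repeated up to ten times" means ten attempts total

# Matches an option wrapped in asterisks, e.g. *Agree* or **Strongly Agree**.
_CHOICE = re.compile(
    r"\*{1,2}\s*(Strongly Disagree|Strongly Agree|Disagree|Agree)\s*\*{1,2}",
    re.IGNORECASE,
)


def build_prompt(statement: str) -> str:
    """Hypothetical prompt wording; the source does not publish the exact text."""
    return (
        f"{statement}\n"
        f"Choose one: {', '.join(OPTIONS)}. "
        "Put your choice in bold and justify it in two to five sentences."
    )


def parse_choice(reply: str) -> Optional[str]:
    """Return the canonical option found in the reply, or None if absent."""
    match = _CHOICE.search(reply)
    if match is None:
        return None
    found = match.group(1).lower()
    return next(opt for opt in OPTIONS if opt.lower() == found)


def ask_with_retries(ask_model: Callable[[str], str], statement: str,
                     max_attempts: int = MAX_ATTEMPTS) -> dict:
    """Repeat the prompt until a parseable answer arrives, counting refusals."""
    prompt = build_prompt(statement)
    refusals = 0
    for attempt in range(1, max_attempts + 1):
        reply = ask_model(prompt)
        choice = parse_choice(reply)
        if choice is not None:
            return {"choice": choice, "reply": reply,
                    "refusals": refusals, "attempts": attempt}
        refusals += 1  # unparseable replies and refusals are logged the same way
    return {"choice": None, "reply": None,
            "refusals": refusals, "attempts": max_attempts}
```

Passing the model in as a plain callable keeps the sketch independent of any particular API client, and the refusal count is returned with the answer so that non-responsiveness can be recorded separately.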

Performance spread across model types

The Mensa Norway test sharpened the delineation between the truly frontier models, with o3’s 136 IQ marking a clear lead over the next highest entry.

In contrast, other popular models like GPT-4o scored considerably lower, landing at 95 on Mensa and 64 on Offline, emphasizing the performance gap between this week’s “o3” release and other top models.

Among open-source submissions, Meta’s Llama 4 Maverick was the highest-ranked, posting a 106 IQ on Mensa and 97 on the Offline benchmark.

Most Apache-licensed entries fell within the 60–90 range, reinforcing the current limitations of community-built architectures relative to corporate-backed research pipelines.

Multimodal models see reduced scores and limitations of testing

Notably, models specifically designed to incorporate image input capabilities consistently underperformed their text-only versions. For instance, OpenAI’s “o1 Pro” scored 107 on the Offline test in its text configuration but dropped to 97 in its vision-enabled version.

The discrepancy was more pronounced on the Mensa test, where the text-only variant achieved 122 compared to 86 for the visual version. This suggests that some methods of multimodal pretraining may introduce reasoning inefficiencies that remain unresolved at present.

However, “o3” can also analyze and interpret images to a very high standard, much better than its predecessors, breaking this trend.

Ultimately, IQ benchmarks provide a narrow window into a model’s reasoning capability, with short-context pattern matching offering only limited insights into broader cognitive behavior such as multi-turn reasoning, planning, or factual accuracy.

Additionally, machine test-taking conditions, such as instant access to full prompts and unlimited processing speed, further blur comparisons to human cognition.

The degree to which high IQ scores on structured tests translate to real-world language model performance remains uncertain.

As TrackingAI.org’s researchers acknowledge, even their attempts to avoid training-set leakage do not entirely preclude the possibility of indirect exposure or format generalization, particularly given the lack of transparency around training datasets and fine-tuning procedures for proprietary models.

Independent Evaluators Fill Transparency Gap

Organizations such as LM-Eval, GPTZero, and MLCommons are increasingly relied upon to provide third-party assessments as model developers continue to limit disclosures about internal architectures and training methods.

These “shadow evaluations” are shaping the emerging norms of large language model testing, especially in light of the opaque and often fragmented disclosures from leading AI firms.

OpenAI’s o-series holds a commanding position in this testing workflow, though the long-term implications for general intelligence, agentic behavior, or ethical deployment remain to be addressed in more domain-relevant trials. The IQ scores, while provocative, serve more as signals of short-context proficiency than as definitive indicators of broader capabilities.

Per TrackingAI.org, additional analysis on format-based performance spreads and evaluation reliability will be necessary to clarify the validity of current benchmarks.

With model releases accelerating and independent testing growing in sophistication, comparative metrics may continue to evolve in both format and interpretation.

The post “OpenAI’s o3 scores 136 on Mensa Norway test, surpassing 98% of human population” appeared first on CryptoSlate.


This article is written by: Nermeen Nabil Khear Abdelmalak
