Tech
Study Finds AI Models Get Basic Math Wrong Around 40 Percent of the Time
Artificial intelligence (AI) tools are increasingly used for everyday calculations, but a new study suggests users should approach their answers with caution. Researchers from the Omni Research on Calculation in AI (ORCA) found that when tested on 500 real-world math prompts, AI models had roughly a 40 percent chance of producing an incorrect result.
The study evaluated five widely used AI systems in October 2025: ChatGPT-5 (OpenAI), Gemini 2.5 Flash (Google), Claude 4.5 Sonnet (Anthropic), DeepSeek V3.2 (DeepSeek AI), and Grok-4 (xAI). None of the models scored above 63 percent overall, with Gemini leading at 63 percent, Grok close behind at 62.8 percent, and DeepSeek at 52 percent. ChatGPT-5 scored 49.4 percent, while Claude trailed at 45.2 percent. The average accuracy across all five models was 54.5 percent.
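The reported average can be checked directly from the five overall scores quoted above. The following is a minimal sketch; the dictionary name and layout are illustrative and not part of the ORCA study.

```python
from statistics import mean

# Overall accuracy (percent) per model, as quoted in the article.
overall_scores = {
    "Gemini 2.5 Flash": 63.0,
    "Grok-4": 62.8,
    "DeepSeek V3.2": 52.0,
    "ChatGPT-5": 49.4,
    "Claude 4.5 Sonnet": 45.2,
}

# Unweighted mean across the five models.
average = mean(list(overall_scores.values()))
print(round(average, 1))  # prints 54.5
```

The unweighted mean is 54.48, which rounds to the 54.5 percent figure given in the study.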
“Although the exact rankings might shift if we repeated the benchmark today, the broader conclusion would likely remain the same: numerical reliability remains a weak spot across current AI models,” said Dawid Siuda, co-author of the ORCA Benchmark.
Performance varied across categories. AI models performed best in basic math and conversions, where Gemini achieved 83 percent accuracy, Grok 76.9 percent, and ChatGPT-5 66.7 percent. The category's average across the models was 72.1 percent, the highest of the seven tested categories. Physics proved the most challenging, with overall accuracy dropping to 35.8 percent. Grok led this category at 43.8 percent, while Claude scored just 26.6 percent.
Some AI systems struggled more than others in specific fields. DeepSeek recorded only 10.6 percent accuracy in biology and chemistry, meaning it failed nearly nine out of ten questions. In finance and economics, Gemini and Grok reached 76.7 percent, while the other three models scored below 50 percent.
The study also categorized the types of mistakes the models made. “Sloppy math” errors, including miscalculations or rounding issues, accounted for 68 percent of mistakes. Faulty logic errors, reflecting incorrect formulas or assumptions, represented 26 percent. Misreading instructions accounted for 5 percent, while in some cases a model simply refused to answer. Siuda noted that multi-step calculations with rounding were particularly prone to error.
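To illustrate why multi-step calculations with rounding are fragile, here is a small sketch in which rounding an intermediate value changes the final answer. The numbers and the helper `round_to_cents` are hypothetical and are not taken from the ORCA prompts.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_cents(value: Decimal) -> Decimal:
    """Round a Decimal to two places, half up (hypothetical helper)."""
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

total = Decimal("100")

# Rounding early: split into thirds, round the share, then recombine.
share_rounded_early = round_to_cents(total / 3)               # 33.33
recombined_early = round_to_cents(share_rounded_early * 3)    # 99.99

# Rounding late: keep full working precision until the last step.
recombined_late = round_to_cents(total / 3 * 3)               # 100.00

print(recombined_early, recombined_late)
```

Rounding only at the final step recovers the original 100.00, while rounding the intermediate share loses a cent. Small discrepancies of this kind can add up across longer chains of steps.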
The research highlights the importance of verifying AI-generated calculations. “If the task is critical, use calculators or proven sources, or at least double-check with another AI,” Siuda advised.
All 500 prompts used in the study had one correct answer and were designed to reflect everyday math tasks, including statistics, finance, physics, and basic arithmetic. The findings indicate that while AI can assist with calculations, it is still unreliable for precise numerical work, and users should be cautious when relying on these tools.
Tech
ESA and GSMA Launch €100 Million Initiative to Advance Europe’s 6G and AI Ambitions
Europe has stepped up its push to lead in next-generation connectivity with a new partnership between the European Space Agency and the GSMA aimed at strengthening 6G and artificial intelligence capabilities through satellite-based communications.
The two organisations announced at the Mobile World Congress a joint funding programme worth up to €100 million to accelerate the integration of satellite and terrestrial mobile networks, known as non-terrestrial networks (NTN). The initiative marks one of Europe’s most significant public investments to date in hybrid satellite-mobile infrastructure.
Antonio Franchi, head of the 5G/6G NTN Programme Office at ESA, described connectivity as the backbone for unlocking advanced technologies. He said the funding would support the development of networks, services and digital tools that could benefit industries and society at large as digital transformation expands.
The programme is open to companies and organisations based in EU member states, which can apply by submitting formal proposals to ESA. Projects will be selected following an evaluation process.
Funding will focus on four core areas: artificial intelligence-driven management of multi-orbit satellite and ground networks; direct-to-device connectivity for smartphones and Internet of Things devices; collaborative 5G and 6G testing platforms; and early research into edge intelligence and advanced IoT systems.
The types of applications envisioned include telemedicine and telesurgery, autonomous driving systems and precision agriculture, all of which depend on reliable, high-capacity connectivity. By merging satellite coverage with mobile infrastructure, the initiative aims to extend high-speed communication even to remote regions.
Alex Sinclair, chief technology officer at GSMA, said combining the mobile industry’s global reach with ESA’s expertise in space technology would help usher in a new era of connectivity and deliver transformative benefits.
The move comes as global competition intensifies in satellite internet and advanced communications, with US companies currently holding a strong position. European officials say the continent’s strength in high-tech manufacturing and specialised software can offer an independent and competitive alternative.
Several European firms are showcasing their work under the programme at MWC, including Nokia, Filtronic, OQ Technology and MinWave Technologies. Demonstrations include live displays of hybrid network architectures and orchestration of satellite-terrestrial systems.
A centrepiece of the exhibition highlights Europe’s space ambitions through a mixed-reality model of ESA’s Argonaut lunar lander, designed to deliver cargo to the Moon. Visitors can remotely operate a training rover via a live satellite link, underscoring how Europe’s connectivity infrastructure is intended to support not only terrestrial innovation but also future lunar missions.
Tech
Transatlantic Tensions on Digital Rules Highlight Need for Cooperation
Discussions between Europe and the United States over digital regulation continue to be marked by miscommunication and frustration: Europeans and Americans talk past each other while rivals watch from the sidelines. The European Union can set its own standards, but in an interconnected economy, decoupling fantasies and grandstanding won’t help.
The debate often centres on “free speech” concerns voiced by U.S. tech companies and policymakers in response to the EU’s legislative framework for digital platforms. In Europe, such narratives typically prompt defensive reactions. Some Europeans respond with a blunt message: “This is our land, our Union, our laws, follow them, or leave the EU—we’ll find alternative products to use!” Public awareness of American constitutional amendments is low across Europe, just as Americans pay little attention to European digital acts and regulations.
The transatlantic dialogue is further complicated by the global nature of social media platforms. Any EU legislation affecting user experience inevitably influences the functioning of these platforms worldwide, touching on what Americans see as free speech rights. The EU also seeks to extend its influence through the “Brussels effect,” ensuring that European rules shape global standards, while the U.S. maintains a large trade surplus in services and competes technologically with China. This mix of economic, political, and regulatory factors explains why U.S. attention is sharply focused on Europe’s digital policies.
Europeans argue that their 450-million-consumer market has the right to set rules that reflect local principles and values. Attempts to adjust or simplify regulations are difficult, with efforts often met with political resistance and scrutiny. The regulatory ecosystem in Europe supports industries of lawyers, consultants, and experts whose work depends on maintaining complex rules, making reform a sensitive topic.
On the American side, anti-EU rhetoric by public figures has sometimes compounded the problem, drowning out moderates and reinforcing defensive European responses. Analysts note that both regions have seen productive voices sidelined as grandstanding and negative statements dominate public discourse.
Observers argue that long-term thinking is necessary. By evaluating the EU-U.S. tech partnership in the broader context of global alliances, including China and Russia, policymakers can better assess priorities and avoid unnecessary disruption. Blank-slate decoupling between Europe and the United States is unrealistic, and delaying constructive dialogue risks broader economic consequences.
Experts warn that continued transatlantic infighting benefits other global powers and weakens the ability of both regions to set coherent standards in emerging technologies. The message from analysts is clear: cooperation, not confrontation, will determine whether the EU and U.S. can maintain leadership in digital regulation while safeguarding economic and technological interests.
