Tech

Study Finds AI Models Get Basic Math Wrong Around 40 Percent of the Time

Published

December 30, 2025

Artificial intelligence (AI) tools are increasingly used for everyday calculations, but a new study suggests users should treat their answers with caution. Researchers behind the Omni Research on Calculation in AI (ORCA) benchmark found that, across 500 real-world math prompts, AI models had roughly a 40 percent chance of producing an incorrect result.

The study evaluated five widely used AI systems in October 2025: ChatGPT-5 (OpenAI), Gemini 2.5 Flash (Google), Claude 4.5 Sonnet (Anthropic), DeepSeek V3.2 (DeepSeek AI), and Grok-4 (xAI). None of the models scored above 63 percent overall, with Gemini leading at 63 percent, Grok close behind at 62.8 percent, and DeepSeek at 52 percent. ChatGPT-5 scored 49.4 percent, while Claude trailed at 45.2 percent. The average accuracy across all five models was 54.5 percent.
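
The reported average can be reproduced from the five overall scores. The short Python check below uses only the figures quoted above. The variable and function names are illustrative and do not come from the study.

```python
# Overall accuracy (percent) per model, as quoted in the article.
overall_scores = {
    "Gemini 2.5 Flash": 63.0,
    "Grok-4": 62.8,
    "DeepSeek V3.2": 52.0,
    "ChatGPT-5": 49.4,
    "Claude 4.5 Sonnet": 45.2,
}


def mean_accuracy(scores):
    """Return the arithmetic mean of the accuracy values in `scores`."""
    return sum(scores.values()) / len(scores)


average = mean_accuracy(overall_scores)
print(f"average accuracy: {average:.1f} percent")
```

Rounded to one decimal place, the mean matches the 54.5 percent figure in the article.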

“Although the exact rankings might shift if we repeated the benchmark today, the broader conclusion would likely remain the same: numerical reliability remains a weak spot across current AI models,” said Dawid Siuda, co-author of the ORCA Benchmark.

Performance varied across categories. The models performed best in basic math and conversions, where Gemini achieved 83 percent accuracy, Grok 76.9 percent and ChatGPT-5 66.7 percent. The category average was 72.1 percent, the highest across the seven tested categories. Physics proved the most challenging, with overall accuracy dropping to 35.8 percent. Grok led this category at 43.8 percent, while Claude scored just 26.6 percent.

Some AI systems struggled more than others in specific fields. DeepSeek recorded only 10.6 percent accuracy in biology and chemistry, meaning it failed nearly nine out of ten questions. In finance and economics, Gemini and Grok reached 76.7 percent, while the other three models scored below 50 percent.

The study also categorized the types of mistakes the models made. “Sloppy math” errors, including miscalculations or rounding issues, accounted for 68 percent of mistakes. Faulty logic, meaning incorrect formulas or assumptions, represented 26 percent. Misreading instructions accounted for 5 percent, while some models simply refused to answer. Siuda noted that multi-step calculations with rounding were particularly prone to error.
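
A toy case shows why multi-step calculations with rounding are fragile. The sketch below is not taken from the ORCA benchmark. The bill amount, the three-way split and the function names are hypothetical, chosen only to show how rounding an intermediate value too early changes the final answer.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def total_with_early_rounding(amount, parts):
    """Round each share to the cent first, then add the shares back up."""
    share = (amount / parts).quantize(CENT, rounding=ROUND_HALF_UP)
    return share * parts


def total_with_late_rounding(amount, parts):
    """Keep full working precision and round only the final result."""
    share = amount / parts
    return (share * parts).quantize(CENT, rounding=ROUND_HALF_UP)


# Hypothetical example: a 100.00 bill split three ways, then re-totalled.
bill = Decimal("100.00")
early = total_with_early_rounding(bill, 3)
late = total_with_late_rounding(bill, 3)
print(early)  # 99.99
print(late)   # 100.00
```

Rounding each share to 33.33 before multiplying loses a cent, while carrying full precision to the last step recovers the original total. Longer chains of operations give such discrepancies more opportunities to accumulate.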

The research highlights the importance of verifying AI-generated calculations. “If the task is critical, use calculators or proven sources, or at least double-check with another AI,” Siuda advised.

All 500 prompts used in the study had a single correct answer and were designed to reflect everyday math tasks, including statistics, finance, physics, and basic arithmetic. The findings indicate that while AI can assist with calculations, it remains unreliable for precise numerical work, and users should be cautious when relying on these tools.

Tech

ESA and GSMA Launch €100 Million Initiative to Advance Europe’s 6G and AI Ambitions

Published

March 2, 2026

Web Reporter

Europe has stepped up its push to lead in next-generation connectivity with a new partnership between the European Space Agency and the GSMA aimed at strengthening 6G and artificial intelligence capabilities through satellite-based communications.

The two organisations announced at the Mobile World Congress a joint funding programme worth up to €100 million to accelerate the integration of satellite-based non-terrestrial networks (NTN) with terrestrial mobile networks. The initiative marks one of Europe’s most significant public investments to date in hybrid satellite-mobile infrastructure.

Antonio Franchi, head of the 5G/6G NTN Programme Office at ESA, described connectivity as the backbone for unlocking advanced technologies. He said the funding would support the development of networks, services and digital tools that could benefit industries and society at large as digital transformation expands.

The programme is open to companies and organisations based in EU member states, which can apply by submitting formal proposals to ESA. Projects will be selected following an evaluation process.

Funding will focus on four core areas: artificial intelligence-driven management of multi-orbit satellite and ground networks; direct-to-device connectivity for smartphones and Internet of Things devices; collaborative 5G and 6G testing platforms; and early research into edge intelligence and advanced IoT systems.

The types of applications envisioned include telemedicine and telesurgery, autonomous driving systems and precision agriculture, all of which depend on reliable, high-capacity connectivity. By merging satellite coverage with mobile infrastructure, the initiative aims to extend high-speed communication even to remote regions.

Alex Sinclair, chief technology officer at GSMA, said combining the mobile industry’s global reach with ESA’s expertise in space technology would help usher in a new era of connectivity and deliver transformative benefits.

The move comes as global competition intensifies in satellite internet and advanced communications, with US companies currently holding a strong position. European officials say the continent’s strength in high-tech manufacturing and specialised software can offer an independent and competitive alternative.

Several European firms are showcasing their work under the programme at MWC, including Nokia, Filtronic, OQ Technology and MinWave Technologies. Demonstrations include live displays of hybrid network architectures and orchestration of satellite-terrestrial systems.

A centrepiece of the exhibition highlights Europe’s space ambitions through a mixed-reality model of ESA’s Argonaut lunar lander, designed to deliver cargo to the Moon. Visitors can remotely operate a training rover via a live satellite link, underscoring how Europe’s connectivity infrastructure is intended to support not only terrestrial innovation but also future lunar missions.

Tech

Mobile World Congress Opens in Barcelona With Focus on AI and 5G Concerns

Published

March 2, 2026

Web Reporter

The Mobile World Congress opens its doors on Monday, marking its 20th year in Barcelona and showcasing the latest developments in global connectivity. Once known primarily as a launchpad for new smartphones, the annual technology gathering has evolved into a broader platform for artificial intelligence, next-generation networks and emerging digital infrastructure.

This year’s event is set to spotlight AI innovations and what organisers describe as the “IQ Era,” referring to the deeper integration of artificial intelligence into daily life and industry. Exhibitions will also explore the future of airport travel and advances in robotics, alongside discussions of 5G and early 6G development.

Vivek Badrinath, director general of the GSMA, which hosts the conference, issued a warning about Europe’s lagging 5G deployment in remarks to Euronews. He said that while the United States and China have advanced in standalone 5G networks, enabling industrial automation in ports and factories, Europe has reached only about 3 percent deployment of 5G standalone technology.

Badrinath described the situation as a “chicken and egg” problem. Without broad network coverage, European companies are reluctant to invest in robotics or AI systems that depend on 5G. At the same time, limited demand slows infrastructure rollout. “If we don’t roll out 5G properly, you’re out of the game,” he said, arguing that digital competitiveness depends on strong network foundations.

Regulatory reform is expected to be a central topic at the conference, particularly around the European Union’s proposed Digital Network Act, which aims to modernise and harmonise connectivity rules. Telecom operators have called for changes that would allow greater consolidation and investment capacity. Industry leaders point to Europe’s fragmented market of roughly 200 operators, many serving around five million customers each, compared with the far larger scale of major providers in the US and China.

Government participation at the event remains strong. Last year’s ministerial programme drew dozens of ministers and regulatory agency heads, and similar high-level attendance is expected this year, offering a forum for dialogue between policymakers and industry executives.

Beyond policy debates, organisers say MWC will continue to highlight consumer and enterprise technologies. Among the anticipated product showcases is a foldable robotic phone from Chinese brand Honor. The exhibition will also introduce “Airport of the Future,” demonstrating how connectivity is reshaping aviation systems, and “New Frontiers,” a space dedicated to quantum computing, robotics and satellite-based non-terrestrial networks.

As the conference enters its third decade in Barcelona, organisers aim to balance technological ambition with urgent discussions about Europe’s digital future.

Tech

Transatlantic Tensions on Digital Rules Highlight Need for Cooperation

Published

February 22, 2026

Web Reporter

Discussions between Europe and the United States over digital regulation continue to be marked by miscommunication and frustration, with the two sides talking past each other while rivals watch from the sidelines. The European Union can set its own standards, but in an interconnected economy, decoupling fantasies and grandstanding won’t help.

The debate often centres on “free speech” concerns voiced by U.S. tech companies and policymakers in response to the EU’s legislative framework for digital platforms. In Europe, such narratives typically prompt defensive reactions. Some Europeans respond with a blunt message: “This is our land, our Union, our laws, follow them, or leave the EU—we’ll find alternative products to use!” Public awareness of American constitutional amendments is low across Europe, just as Americans pay little attention to European digital acts and regulations.

The transatlantic dialogue is further complicated by the global nature of social media platforms. Any EU legislation affecting user experience inevitably influences the functioning of these platforms worldwide, touching on what Americans see as free speech rights. The EU also seeks to extend its influence through the “Brussels effect,” ensuring that European rules shape global standards, while the U.S. maintains a large trade surplus in services and competes technologically with China. This mix of economic, political, and regulatory factors explains why U.S. attention is sharply focused on Europe’s digital policies.

Europeans argue that their 450-million-consumer market has the right to set rules that reflect local principles and values. Attempts to adjust or simplify regulations often meet political resistance and scrutiny. The regulatory ecosystem in Europe supports industries of lawyers, consultants, and experts whose work depends on maintaining complex rules, making reform a sensitive topic.

On the American side, anti-EU rhetoric by public figures has sometimes compounded the problem, drowning out moderates and reinforcing defensive European responses. Analysts note that both regions have seen productive voices sidelined as grandstanding and negative statements dominate public discourse.

Observers argue that long-term thinking is necessary. By evaluating the EU-U.S. tech partnership in the broader context of global alliances, including China and Russia, policymakers can better assess priorities and avoid unnecessary disruption. Blank-slate decoupling between Europe and the United States is unrealistic, and delaying constructive dialogue risks broader economic consequences.

Experts warn that continued transatlantic infighting benefits other global powers and weakens the ability of both regions to set coherent standards in emerging technologies. The message from analysts is clear: cooperation, not confrontation, will determine whether the EU and U.S. can maintain leadership in digital regulation while safeguarding economic and technological interests.