Tech
Study Finds AI Models Get Basic Math Wrong Around 40 Percent of the Time
Artificial intelligence (AI) tools are increasingly used for everyday calculations, but a new study suggests users should approach their answers with caution. Researchers from the Omni Research on Calculation in AI (ORCA) found that when tested on 500 real-world math prompts, AI models had roughly a 40 percent chance of producing an incorrect result.
The study evaluated five widely used AI systems in October 2025: ChatGPT-5 (OpenAI), Gemini 2.5 Flash (Google), Claude 4.5 Sonnet (Anthropic), DeepSeek V3.2 (DeepSeek AI), and Grok-4 (xAI). None of the models scored above 63 percent overall, with Gemini leading at 63 percent, Grok close behind at 62.8 percent, and DeepSeek at 52 percent. ChatGPT-5 scored 49.4 percent, while Claude trailed at 45.2 percent. The average accuracy across all five models was 54.5 percent.
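The 54.5 percent average follows directly from the five overall scores quoted above. The short Python sketch below simply recomputes it; the variable names are illustrative and are not taken from the ORCA study.

```python
# Overall accuracy (percent) per model, as quoted in the article.
overall_accuracy = {
    "Gemini 2.5 Flash": 63.0,
    "Grok-4": 62.8,
    "DeepSeek V3.2": 52.0,
    "ChatGPT-5": 49.4,
    "Claude 4.5 Sonnet": 45.2,
}

# Unweighted mean across the five models.
mean_accuracy = sum(overall_accuracy.values()) / len(overall_accuracy)

# Rounded to one decimal place, this matches the reported 54.5 percent.
print(f"mean accuracy: {mean_accuracy:.1f}%")
```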
“Although the exact rankings might shift if we repeated the benchmark today, the broader conclusion would likely remain the same: numerical reliability remains a weak spot across current AI models,” said Dawid Siuda, co-author of the ORCA Benchmark.
Performance varied across categories. The models performed best in basic math and conversions, where Gemini achieved 83 percent accuracy, Grok 76.9 percent, and ChatGPT-5 66.7 percent. The category average across the tested models was 72.1 percent, the highest of the seven tested categories. Physics proved the most challenging, with overall accuracy dropping to 35.8 percent. Grok led this category at 43.8 percent, while Claude scored just 26.6 percent.
Some AI systems struggled more than others in specific fields. DeepSeek recorded only 10.6 percent accuracy in biology and chemistry, meaning it failed nearly nine out of ten questions. In finance and economics, Gemini and Grok reached 76.7 percent, while the other three models scored below 50 percent.
The study also categorized the types of mistakes the models made. “Sloppy math” errors, including miscalculations or rounding issues, accounted for 68 percent of mistakes. Faulty logic errors, reflecting incorrect formulas or assumptions, represented 26 percent. Misreading instructions accounted for 5 percent, and in some cases a model simply refused to answer. Siuda noted that multi-step calculations with rounding were particularly prone to error.
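To see why multi-step calculations with rounding are fragile, consider a toy example that is not drawn from the study's prompts: splitting a 100.00 bill three ways and adding the shares back up. The Python sketch below, with hypothetical function names, compares rounding after the first step with rounding only at the end.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def total_rounding_early(total: Decimal, people: int) -> Decimal:
    """Round each share to cents first, then add the shares back up."""
    share = (total / people).quantize(CENT, rounding=ROUND_HALF_UP)
    return share * people


def total_rounding_late(total: Decimal, people: int) -> Decimal:
    """Keep full precision through the steps and round only the final result."""
    share = total / people
    return (share * people).quantize(CENT, rounding=ROUND_HALF_UP)


bill = Decimal("100.00")
early = total_rounding_early(bill, 3)  # 33.33 * 3
late = total_rounding_late(bill, 3)    # rounded once, at the end

print(f"rounded early: {early}, rounded late: {late}")
```

Rounding early loses a cent (99.99 instead of 100.00). Over longer chains of steps, such small discrepancies can accumulate.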
The research highlights the importance of verifying AI-generated calculations. “If the task is critical, use calculators or proven sources, or at least double-check with another AI,” Siuda advised.
All 500 prompts used in the study had one correct answer and were designed to reflect everyday math tasks, including statistics, finance, physics, and basic arithmetic. The findings indicate that while AI can assist with calculations, it is not yet reliable for precise numerical work, and users should be cautious when relying on these tools.
Europe’s 2025 App Market Shows Divide Between Downloads and Revenue
Europe’s app market in 2025 reveals a clear gap between what users download and what generates the most revenue. While utility, shopping, and AI apps lead in downloads, entertainment, subscription, and dating apps dominate earnings.
According to estimates from AppFigures shared with Euronews Next, the most downloaded app in the EU in 2025 was ChatGPT, with just over 64 million downloads. It was followed by shopping platform Temu with nearly 44 million. Downloads for the other top apps drop to around 27 million or fewer, including Threads (27.3 million), TikTok (26.8 million), CapCut (25.5 million), and Google Gemini (25.2 million). Rounding out the top ten were WhatsApp Messenger, Revolut, Vinted, and Lidl Plus, each exceeding 22 million downloads.
Productivity apps emerged as a major category, driven largely by artificial intelligence. ChatGPT and Google Gemini signal AI tools moving from niche use to mainstream adoption, as Europeans increasingly rely on AI for work, study, and personal tasks. Shopping apps also featured prominently, with Temu, SHEIN, Vinted, Lidl Plus, and Klarna ranking high. Photo and video apps reflect the rising importance of content creation for social media and small businesses.
Despite dominating downloads, these apps do not always generate the highest revenue. AppFigures data show that TikTok earned an estimated €740 million in Europe, making it the top-grossing app, even though it ranked fourth in downloads. ChatGPT followed with €448 million, demonstrating that AI subscriptions are converting users into paying customers.
Dating apps also ranked high by revenue despite not appearing among the top downloads. Tinder generated €429 million, while Bumble and Badoo recorded €125 million and €81 million, respectively. Subscription services such as Disney+ (€351 million), Amazon Prime Video (€323 million), Google One (€283 million), and YouTube (€243 million) highlight the continued strength of subscription-based digital content and services.
“The drivers behind spending in top-earning EU apps show a more diverse mix than several years ago, when most spending outside of mobile games went to entertainment and dating apps, such as Disney+, Spotify, Tinder, and Hulu,” Randy Nelson, head of market insights at AppFigures, told Euronews Next.
App rankings also vary significantly by country. In the UK, domestic finance and government services are popular, with GOV.UK ID Check and HMRC among the most downloaded apps. Local retail and finance platforms such as Monzo and Tesco also rank highly. In Turkey, state-backed digital services like e-Devlet Kapısı and e-Nabız, alongside local e-commerce platforms Trendyol and sahibinden, dominate downloads, reflecting a preference for national platforms over cross-border alternatives.
Revenue estimates focus on in-app spending, including subscriptions and digital content, and do not account for physical goods or services. The figures are also stated before the roughly 30 percent platform fee taken by Apple and Google is deducted, meaning actual developer earnings are lower than the reported totals.
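As a rough illustration of what that fee implies, the Python sketch below applies a flat 30 percent deduction to two of the revenue estimates quoted above. The flat rate is an assumption made for simplicity; the fee actually paid may differ by app and platform, and the helper name is hypothetical.

```python
# Assumed flat platform fee, in percent (the article says "roughly 30 percent").
ASSUMED_FEE_PERCENT = 30

# Estimated in-app revenue in Europe, in millions of euros, as quoted in the article.
reported_revenue_m_eur = {
    "TikTok": 740,
    "ChatGPT": 448,
}


def estimated_net_revenue(gross_m_eur: float, fee_percent: float = ASSUMED_FEE_PERCENT) -> float:
    """Return revenue remaining after deducting the assumed platform fee."""
    return gross_m_eur * (100 - fee_percent) / 100


for app, gross in reported_revenue_m_eur.items():
    net = estimated_net_revenue(gross)
    print(f"{app}: reported €{gross}M, about €{net:.0f}M after an assumed {ASSUMED_FEE_PERCENT}% fee")
```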
The data underline the evolving European app market, where popularity does not always translate into revenue, and local preferences shape user behaviour in individual countries.
European Commission Closes Better Regulation Consultation, Public Calls for Strong Impact Assessments
On February 4, the European Commission concluded its public consultation on the Better Regulation framework, seeking feedback on how the initiative could be improved. Among the 286 respondents, representing industry, consumer groups, public sector organizations, and transparency advocates, the majority urged the Commission to maintain robust impact assessments and consultation tools rather than weakening them.
The feedback comes as the EU seeks to speed up decision-making while maintaining transparency and stakeholder engagement. Responses ranged from detailed proposals to ensure focused stakeholder involvement to criticisms of the Commission’s Omnibus approach to legislation.
In its response, Consumer Choice Center Europe (CCCE) suggested that the Commission take stronger action to prevent the overuse of exemptions from Better Regulation guidelines. “Nothing motivates Europeans more than fact-based evidence,” the organization said, calling for disclosure of all exemptions requested since 2021. Current rules allow exemptions for political imperatives, emergencies, or deadlines, but critics warn that such flexibility fosters a culture of loophole-seeking.
Another concern raised during the consultation is the structure of public consultations. Critics note that some surveys, such as those for the Digital Fairness Act, provide detailed answer options for supporters of proposals while offering limited space for opponents to explain their views. Respondents called for more rigorous methodological standards to ensure all stakeholders can express their opinions equally.
The consultation also highlighted the need for faster, clearer feedback. The CCCE recommended that statistical summaries on the “Have Your Say” portal include information on whether respondents support, oppose, or remain neutral on proposals. Currently, summaries are released up to two months after consultations, and critics warn that results can be framed subjectively. Shorter, more readable synopses of the most common arguments, emailed to participants, could increase transparency and trust.
Transparency was another central theme. Respondents suggested that the Commission publish factual summaries not only for formal public consultations but also for targeted consultations and stakeholder meetings. While current guidelines recommend this as “good practice,” advocates argue it should be mandatory to prevent decision-making behind closed doors.
The consultation responses signal a clear message from the European public: while the EU seeks efficiency in legislative processes, citizens and organizations want consultative mechanisms and impact assessments to remain strong and accessible.
The Commission now faces the challenge of balancing faster policy adoption with transparency and accountability, ensuring that citizens can continue to engage meaningfully in shaping EU law.
France Ranks Last in Global AI Adoption Among Public Servants, Study Finds
France ranks last in a new global index measuring artificial intelligence adoption in government, with nearly half of its public servants reporting that they have never used AI at work, despite substantial government investment in the technology.
The Public Sector AI Adoption Index 2026, released on Monday by Public First for the Center for Data Innovation with support from Google, surveyed 3,335 civil servants across 10 countries: France, the United States, Japan, Germany, the United Kingdom, Brazil, South Africa, India, Singapore, and Saudi Arabia. The study highlights a gap between ambitious AI strategies and actual implementation in European governments.
According to the index, 74 percent of French public servants said AI cannot perform any part of their work, and about 45 percent reported never using AI on the job. Only 27 percent said their organisations had invested in AI tools, and many said guidance from leadership on AI use was unclear.
“While France positions AI as a strategic tool for competitiveness and modernisation, without hands-on experience, its value remains abstract for many workers,” the report stated. Researchers warned that 70 percent of employees who actively use AI in organisations with limited guidance are doing so in “shadow” mode, meaning they operate AI tools without informing their employers.
Across Europe, adoption of AI in public services remains cautious. Germany and France were grouped as risk-averse countries, where AI is limited to specialists and pilot projects. The United Kingdom showed more progress, with 37 percent of public servants receiving AI training, but adoption remains uneven across departments and access to approved tools is limited.
By contrast, countries such as Singapore, Saudi Arabia, and India led the index. Public servants in these countries report widespread, everyday use of AI in government work, supported by clear leadership guidance and training programmes. Globally, 74 percent of public servants now use AI, and 80 percent say it empowers them, but only 18 percent believe their governments are using AI very effectively.
The survey assessed adoption across five areas: attitudes toward AI, confidence in using it, access to approved tools, integration of AI in daily work, and access to training. Experts said these factors determine whether governments can translate ambitious AI strategies into tangible improvements in public services.
“Many governments have ambitious plans for AI in the public sector, but some are creating better conditions for real‑world use than others,” said Rachel Wolf, CEO of Public First. “Our research shows who is succeeding and where improvement is needed. This matters because effective AI enables better public services, stronger outcomes for citizens, and more resilient public institutions.”
The findings raise questions about the effectiveness of France’s AI initiatives, which have included significant investment in infrastructure and ethical frameworks aimed at guiding responsible AI deployment in government. Analysts said closing the gap between strategy and practical use will be critical for the country to realise the benefits of AI for public services.
