Tech
Study Finds AI Models Get Basic Math Wrong Around 40 Percent of the Time
Artificial intelligence (AI) tools are increasingly used for everyday calculations, but a new study suggests users should approach their answers with caution. Researchers behind the Omni Research on Calculation in AI (ORCA) benchmark found that, when tested on 500 real-world math prompts, AI models produced an incorrect result roughly 40 percent of the time.
The study evaluated five widely used AI systems in October 2025: ChatGPT-5 (OpenAI), Gemini 2.5 Flash (Google), Claude 4.5 Sonnet (Anthropic), DeepSeek V3.2 (DeepSeek AI), and Grok-4 (xAI). None of the models scored above 63 percent overall, with Gemini leading at 63 percent, Grok close behind at 62.8 percent, and DeepSeek at 52 percent. ChatGPT-5 scored 49.4 percent, while Claude trailed at 45.2 percent. The average accuracy across all five models was 54.5 percent.
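The reported average lines up with the five individual scores. As a quick arithmetic sanity check on the figures quoted above (an illustrative sketch, not code from the study):

```python
# Quick sanity check: mean of the five overall accuracy scores quoted above.
scores = {
    "Gemini 2.5 Flash": 63.0,
    "Grok-4": 62.8,
    "DeepSeek V3.2": 52.0,
    "ChatGPT-5": 49.4,
    "Claude 4.5 Sonnet": 45.2,
}

average = sum(scores.values()) / len(scores)
print(f"Average accuracy: {average:.1f}%")  # Average accuracy: 54.5%
```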
“Although the exact rankings might shift if we repeated the benchmark today, the broader conclusion would likely remain the same: numerical reliability remains a weak spot across current AI models,” said Dawid Siuda, co-author of the ORCA Benchmark.
Performance varied across categories. AI models performed best in basic math and conversions, with Gemini achieving 83 percent accuracy and Grok 76.9 percent. ChatGPT-5 scored 66.7 percent in the same category, and the category average across all five models was 72.1 percent, the highest of the seven tested categories. Physics proved the most challenging, with overall accuracy dropping to 35.8 percent. Grok led this category at 43.8 percent, while Claude scored just 26.6 percent.
Some AI systems struggled more than others in specific fields. DeepSeek recorded only 10.6 percent accuracy in biology and chemistry, meaning it failed nearly nine out of ten questions. In finance and economics, Gemini and Grok reached 76.7 percent, while the other three models scored below 50 percent.
The study also categorized the types of mistakes AI makes. “Sloppy math” errors, including miscalculations and rounding issues, accounted for 68 percent of mistakes. Faulty logic errors, reflecting incorrect formulas or assumptions, represented 26 percent. Misreading instructions accounted for 5 percent, and in the remaining cases the models simply refused to answer. Siuda noted that multi-step calculations involving rounding were particularly prone to error.
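To illustrate the kind of rounding drift Siuda describes, the short sketch below (a hypothetical example, not taken from the benchmark) compounds interest over five years once at full precision and once with each intermediate result rounded to whole units:

```python
# Hypothetical illustration: rounding intermediate steps in a multi-step
# calculation (5 years of 7% compound interest on 1,000) drifts away from
# the exact answer.
principal, rate, years = 1_000.00, 0.07, 5

exact = principal * (1 + rate) ** years  # full precision throughout

rounded = principal
for _ in range(years):
    rounded = round(rounded * (1 + rate))  # round to whole units each year

print(f"exact:   {exact:.2f}")    # exact:   1402.55
print(f"rounded: {rounded:.2f}")  # rounded: 1403.00
```

The gap here is small, but each extra step and each coarser rounding compounds the error, which is why long chains of rounded arithmetic are a common failure mode.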
The research highlights the importance of verifying AI-generated calculations. “If the task is critical, use calculators or proven sources, or at least double-check with another AI,” Siuda advised.
All 500 prompts used in the study had a single correct answer and were designed to reflect everyday math tasks, including statistics, finance, physics, and basic arithmetic. The findings suggest that while AI can assist with calculations, it is still unreliable for precise numerical work, and users should treat its answers with caution.
Tech
Google Removes Some AI Health Summaries After Accuracy Concerns
Google has reportedly removed certain AI-generated summaries for health-related searches after an investigation found that some of the information provided could be misleading.
The summaries, known as AI Overviews, appear at the top of search results and are designed to provide concise answers to user questions. A report by the Guardian newspaper found that several AI Overviews contained inaccurate health information, raising concerns about potential harm to users.
The investigation highlighted cases where the AI supplied numbers with little context in response to queries such as “what is the normal range for liver blood tests?” and “what is the normal range for liver function tests?” The results did not account for differences based on age, sex, ethnicity, or nationality. In some cases, Google’s AI extracted data from Max Healthcare, an Indian hospital chain based in New Delhi, rather than providing verified global medical guidance.
Featured snippets, which also appear at the top of Google search results, differ from AI Overviews in that they extract existing text from relevant websites rather than generating new content. Even so, the Guardian noted that variations of the liver test queries, such as “[liver function test] lft reference range,” still produced AI-generated summaries. Liver function tests measure proteins and enzymes in the blood to evaluate how well the liver is performing.
In one example, Google’s AI reportedly advised pancreatic cancer patients to avoid high-fat foods. Experts told the Guardian that such guidance could be dangerous and potentially increase the risk of mortality, as pancreatic cancer patients often lose weight rapidly and are typically advised to eat calorie-dense foods.
The Guardian’s findings come amid broader concerns about AI chatbots “hallucinating,” a term for when AI systems generate false or misleading information and present it as fact. Experts have warned that reliance on AI for medical information could pose serious risks if users treat these responses as authoritative.
Euronews Next contacted Google to confirm whether AI Overviews had been removed from certain health queries but did not receive an immediate response. Separately, Google announced over the weekend that it would expand AI Overviews to Gmail, allowing users to ask questions about their emails and receive automated answers without searching through messages manually.
The development underscores ongoing tensions between AI innovation and accuracy, particularly in sensitive areas such as healthcare. As AI tools become more integrated into search engines and email platforms, experts emphasize the importance of verifying information with trusted medical sources and caution users against relying solely on machine-generated summaries.
Tech
ChatGPT Launches Health Feature to Help Users Manage Medical Information
OpenAI has unveiled a new health-focused feature for ChatGPT, aimed at helping users better understand their well-being and prepare for medical conversations. The tool, called ChatGPT Health, connects to users’ personal health data, such as medical records and wellness apps, to deliver more personalized guidance.
The feature is designed as a standalone experience within ChatGPT, with health-related chats, files, and connected apps kept separate from users’ other conversations. OpenAI said health information is not shared with non-health chats, and users can view or delete Health memories at any time through the platform’s settings.
“ChatGPT Health is another step toward turning ChatGPT into a personal super-assistant that can support you with information and tools to achieve your goals across any part of your life,” Fidji Simo, OpenAI’s applications CEO, said in a post on Substack.
Users can connect apps such as Apple Health, MyFitnessPal, and Function to ChatGPT Health. The AI can then help interpret recent test results, offer guidance for doctor appointments, and provide insights on diet, exercise routines, or healthcare choices. OpenAI emphasized that all app connections require explicit user permission and undergo additional privacy and security reviews.
OpenAI stressed that the tool is not intended to replace medical care. ChatGPT Health is designed to help users understand patterns in their health and answer everyday wellness questions. The company said the platform was developed with input from more than 260 physicians across 60 countries, who provided feedback on model outputs over 600,000 times.
Health-related queries are already a major reason people use ChatGPT, with the company reporting that over 230 million questions about health and wellness are asked globally each week. ChatGPT Health aims to make these interactions more personalized by leveraging data from users’ medical and wellness apps.
Access to ChatGPT Health is initially limited to a small group of early users with Free, Go, Plus, or Pro accounts. Users in the European Economic Area, Switzerland, and the United Kingdom are not included in the early rollout due to stricter local health and data regulations. Some app integrations and medical record access are currently only available in the United States.
OpenAI said it plans to expand ChatGPT Health to all users on web and iOS in the coming weeks as the platform is refined.
The company highlighted that the feature is meant to complement, not replace, professional medical advice. By providing insights from personal health data and helping users track trends over time, ChatGPT Health seeks to make individuals better prepared for discussions with their healthcare providers.
Tech
CES 2026 Set to Showcase AI Everywhere, Next-Gen Laptops, and Robotics
The world’s largest technology exhibition, CES 2026, opens Tuesday in Las Vegas, following two days of media previews. The event will feature over 4,500 exhibitors, including 1,400 start-ups and major companies such as Meta, Lenovo, Samsung, and Nvidia, offering a glimpse into the latest developments in artificial intelligence, consumer electronics, and robotics.
Last year, CES attracted more than 140,000 attendees across multiple venues, amid economic uncertainty and discussions on tariffs under the Trump Administration. Paolo Pescatore, a tech analyst, said the focus has shifted from simply showcasing connected devices to exploring how people interact with them and the content they access.
Artificial intelligence will play a central role at this year’s show. Industry leaders are integrating AI into nearly every category of technology. Nvidia CEO Jensen Huang is expected to present the company’s latest productivity-focused AI solutions, while AMD CEO Lisa Su will outline her vision for future AI developments. Lenovo CEO Yuanqing Yang will also address AI integration in consumer devices and enterprise solutions.
Competition over next-generation chips is likely to dominate attention. Intel has unveiled its Core Ultra “Panther Lake” platform, while Qualcomm has introduced the Snapdragon X2 Elite, a mobile processor for Windows on Arm devices. These advancements are expected to spark a wave of new laptop announcements. LG has teased its 2026 Gram Pro line, including what it claims is the “world’s lightest 17-inch RTX laptop.”
Tim Danton, editor of TechFinitive.com, said, “CES 2026 won’t be short of laptops. Intel’s new chips promise high performance and long battery life, and we’re likely to see innovative designs, including rollable screens and more repairable models.”
CES will also feature innovations across healthcare, wearables, vehicles, and gaming. Sony Honda Mobility is expected to present a production version of its Afeela electric vehicle. Domestic robotics will be highlighted as well, with LG unveiling its helper bot “CLOiD,” designed for indoor household tasks. Ben Bajarin, CEO of Creative Strategies, said humanoid robots will increasingly appear at the show, marking the rise of “physical AI”—artificial intelligence that manifests in real-world applications such as autonomous cars and home assistants.
Samsung will showcase new uses of OLED technology, integrating it into AI-powered devices to enhance displays, including an AI OLED Bot that functions as a teaching assistant in educational settings.
Bajarin added that concerns over an AI bubble have eased. “This is more of a build-out than a bubble. We are at the start of a major industrial investment cycle, laying the groundwork for future computing capabilities,” he said.
CES 2026 promises a wide array of AI-driven gadgets, next-generation laptops, and autonomous robotics, reflecting the growing influence of artificial intelligence across industries and everyday life.