Tech

AI Tools Boost Paper Production but Raise Quality Concerns in Scientific Research

Large language models such as ChatGPT are increasing research output, particularly for scientists who are not native English speakers, but a new study warns that many AI-assisted papers are less likely to pass peer review.

Researchers at Cornell University, United States, analysed more than two million research papers posted between 2018 and 2024 on three major preprint servers, which host early versions of scientific work prior to formal review. Their findings, published in the journal Science, show that AI tools are reshaping how scientific papers are written and disseminated.

To identify AI-assisted papers, the team trained an AI system to detect text likely generated by large language models. Comparing papers posted before 2023 with those written after tools like ChatGPT became widely available, the researchers measured publication output and subsequent acceptance rates in scientific journals.

The analysis revealed a significant productivity boost for AI users. On a major preprint server for physics and computer science, researchers using AI produced about one-third more papers than those who did not. In biology and the social sciences, the increase exceeded 50 percent. The largest gains were seen among scientists whose first language is not English. In some Asian institutions, researchers published between 40 percent and nearly 90 percent more papers after adopting AI writing tools, depending on the discipline.

AI tools also appear to aid in literature review. Researchers using AI were more likely to identify newer studies and relevant books rather than relying on older, frequently cited works. “People using LLMs are connecting to more diverse knowledge, which might be driving more creative ideas,” said Keigo Kusumegi, a doctoral student and first author of the study.

Despite the productivity gains, the study highlights quality concerns. Many AI-written papers, while linguistically polished, were less likely to be accepted by journals. Papers written by humans that scored high on writing complexity were more likely to be accepted, whereas AI-generated papers with similar scores often failed to meet scientific standards.

“Already now, the question is not, ‘Have you used AI?’ The question is, ‘How exactly have you used AI and whether it’s helpful or not,’” said Yian Yin, assistant professor at Cornell and corresponding author of the study. Yin added that the widespread adoption of AI tools across disciplines—including physical sciences, computer science, biology, and social sciences—requires careful consideration by reviewers, funders, and policymakers.

The researchers stress that AI-assisted tools are reshaping the academic ecosystem, offering opportunities to improve productivity and access to scientific knowledge, but they also call for guidelines to ensure that the technology is used responsibly and that scientific contributions maintain their integrity.

As AI becomes increasingly integrated into research practices, the challenge for the scientific community will be balancing efficiency and innovation with rigorous evaluation standards to maintain the quality and credibility of published science.

Study Finds AI Models Get Basic Math Wrong Around 40 Percent of the Time

Artificial intelligence (AI) tools are increasingly used for everyday calculations, but a new study suggests users should approach their answers with caution. Researchers behind the Omni Research on Calculation in AI (ORCA) benchmark found that when tested on 500 real-world math prompts, AI models had roughly a 40 percent chance of producing an incorrect result.

The study evaluated five widely used AI systems in October 2025: ChatGPT-5 (OpenAI), Gemini 2.5 Flash (Google), Claude 4.5 Sonnet (Anthropic), DeepSeek V3.2 (DeepSeek AI), and Grok-4 (xAI). None of the models scored above 63 percent overall, with Gemini leading at 63 percent, Grok close behind at 62.8 percent, and DeepSeek at 52 percent. ChatGPT-5 scored 49.4 percent, while Claude trailed at 45.2 percent. The average accuracy across all five models was 54.5 percent.
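The reported average can be checked directly from the five overall scores. The short Python sketch below is illustrative only: the variable and function names are assumptions rather than anything from the ORCA benchmark, and the implied error rate is simply 100 minus the average accuracy.

```python
# Overall accuracy (percent) per model, as reported in the article.
overall_accuracy = {
    "Gemini 2.5 Flash": 63.0,
    "Grok-4": 62.8,
    "DeepSeek V3.2": 52.0,
    "ChatGPT-5": 49.4,
    "Claude 4.5 Sonnet": 45.2,
}


def mean_accuracy(scores):
    """Return the unweighted mean of the per-model accuracy scores."""
    return sum(scores.values()) / len(scores)


average = mean_accuracy(overall_accuracy)
# Assumption: every prompt is scored either right or wrong, so the
# error rate is the complement of the accuracy.
error_rate = 100 - average

print(f"Average accuracy: {average:.1f} percent")
print(f"Implied average error rate: {error_rate:.1f} percent")
```

Run as written, the sketch gives an average accuracy of 54.5 percent, matching the figure cited in the study, and an implied average error rate of 45.5 percent.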

“Although the exact rankings might shift if we repeated the benchmark today, the broader conclusion would likely remain the same: numerical reliability remains a weak spot across current AI models,” said Dawid Siuda, co-author of the ORCA Benchmark.

Performance varied across categories. AI models performed best in basic math and conversions, where Gemini achieved 83 percent accuracy, Grok 76.9 percent, and ChatGPT-5 66.7 percent. The combined average for the category was 72.1 percent, the highest across the seven tested categories. Physics proved the most challenging, with overall accuracy dropping to 35.8 percent. Grok led this category at 43.8 percent, while Claude scored just 26.6 percent.

Some AI systems struggled more than others in specific fields. DeepSeek recorded only 10.6 percent accuracy in biology and chemistry, meaning it failed nearly nine out of ten questions. In finance and economics, Gemini and Grok reached 76.7 percent, while the other three models scored below 50 percent.

The study also categorized the types of mistakes the models made. “Sloppy math” errors, including miscalculations or rounding issues, accounted for 68 percent of mistakes. Faulty logic errors, reflecting incorrect formulas or assumptions, represented 26 percent. Misreading instructions accounted for 5 percent, and in some cases the models simply refused to answer. Siuda noted that multi-step calculations with rounding were particularly prone to error.

The research highlights the importance of verifying AI-generated calculations. “If the task is critical, use calculators or proven sources, or at least double-check with another AI,” Siuda advised.

All 500 prompts used in the study had one correct answer and were designed to reflect everyday math tasks, including statistics, finance, physics, and basic arithmetic. The findings indicate that while AI can assist with calculations, it remains unreliable for precise numerical work and users should remain cautious when relying on these tools.

Generative AI Adoption Varies Widely Across Europe, Survey Finds

The use of generative artificial intelligence (Gen AI) tools such as ChatGPT, Gemini, and Grok has grown significantly across Europe, with millions of people now relying on the technology for personal, work, and educational purposes. These tools can generate new content, including text, images, code, and videos, based on user prompts and patterns learned from existing data.

According to Eurostat, about one-third of Europeans aged 16 to 74 used AI tools at least once in 2025. However, adoption rates vary widely across the continent, with usage ranging from 17 percent in Turkey to 56 percent in Norway. Within the European Union, Denmark leads with 48 percent of people reporting AI use, while Romania has the lowest rate at 18 percent.

Thirteen countries reported that at least two in five people had used Gen AI tools in the three months prior to the survey. In addition to Norway and Denmark, these include Switzerland and Estonia (47 percent each), Malta (46 percent), Finland (46 percent), Ireland (45 percent), the Netherlands (45 percent), Cyprus (44 percent), Greece (44 percent), Luxembourg (43 percent), Belgium (42 percent), and Sweden (42 percent).

Conversely, eight countries recorded usage below 25 percent: Turkey (17 percent), Romania (18 percent), Serbia (19 percent), Italy (20 percent), Bosnia and Herzegovina (20 percent), North Macedonia (22 percent), Bulgaria (23 percent), and Poland (23 percent). Among major EU economies, Germany (32 percent) and Italy (20 percent) remain below the EU average, while Spain (38 percent) and France (37 percent) slightly exceed it.

Experts say the differences reflect the broader digital landscape and skill levels in each country. Colin van Noordt, a researcher at KU Leuven in Belgium, told Euronews Next that nations with strong digital foundations, like Denmark and Switzerland, have higher adoption rates because their populations already possess digital skills, use the internet frequently, and are familiar with technology.

“In countries with lower adoption, people often don’t know generative AI exists or are unsure how to use it,” van Noordt said. He added that understanding how AI can be applied in daily life or work, often referred to as “AI literacy,” is a major factor in adoption. Government policies may encourage use, but underlying digital culture and practical skills appear to have a greater impact, he said.

The survey also highlighted differences in how AI is used. Personal use exceeds work-related use in every country, though the gap varies; across the EU, the figures are 25 percent and 15 percent, respectively. In the Netherlands, personal and work use are nearly equal at 28 percent and 27 percent. In Greece, 41 percent use AI personally, compared with just 16 percent at work.

Use of AI in formal education is limited, with only 9 percent of Europeans reporting educational use. Sweden and Switzerland lead at 21 percent, while Hungary records just 1 percent. Analysts suggest that uncertainty over practical applications of AI continues to limit workplace and educational adoption.

The Eurostat data underscores a clear north–south and west–east divide in Gen AI adoption, with Nordic and digitally advanced countries leading the way and southern, central-eastern, and Balkan nations trailing.

As AI Hype Fades, Analysts Say ‘Boring’ Tools May Last Longer Online

After a year of intense attention on flashy AI applications, analysts are noting a shift in user experience, with practical, low-profile tools likely to have a longer-term impact than more sensational AI offerings.

In 2025, “AI slop”—low-quality or unwanted AI-generated content—became a major feature of the Internet. From confusing chatbots to nonsensical product summaries, AI slop appeared across search engines, e-commerce platforms, and even official communications. Online media and consumer intelligence firm Meltwater reported that mentions of “AI slop” grew ninefold this year compared to 2024, with negative sentiment peaking at 54 percent in October. According to SEO firm Graphite, AI-generated content now represents more than half of all English-language material online. The term was even named Word of the Year 2025 by Merriam-Webster and Australia’s national dictionary.

Analysts warn that much of this content reflects “solution-led design,” where technology is added first and products are then built to justify it. Kate Moran, vice president of research at Nielsen Norman Group, said companies have often introduced AI in ways that confuse users rather than solve problems. She cited Meta’s AI search feature on Instagram, which replaced the traditional search bar and was quickly rolled back after user backlash. Consumer AI hardware, such as the Humane AI Pin, also received negative reviews, a sign that, in the words of Logitech CEO Hanneke Faber, “solutions are being built for problems that don’t exist.”

Even as some firms continue to launch flashy AI apps, user engagement has been muted. Meta introduced its AI video app “Vibes” in Europe this year, but early reports indicate just 23,000 daily users across the continent, concentrated in France, Italy, and Spain. This contrasts with the company’s previous efforts to prioritize “authentic storytelling” over low-value AI-generated content.

Experts say that practical, low-interaction AI features may be more effective in improving user experience. Moran highlighted Amazon’s AI-generated summaries of product reviews as a valuable example, providing quick insights without requiring user input. Similarly, Daniel Mügge, a researcher at the University of Amsterdam, argued that European tech investment should prioritize AI applications that solve concrete problems in robotics, manufacturing, or other sectors, rather than tools that amplify advertising or create low-quality content.

Platforms like Pinterest and YouTube are already responding to user frustration by allowing people to limit AI-generated content. Analysts say these “boring” but useful tools are shaping a more intentional approach to AI design.

“Smaller, specialized AI products can make a real difference for users without grabbing headlines,” Moran said. Mügge added that focusing on practical applications allows smaller companies to contribute meaningfully while avoiding a direct race with dominant AI developers.

As the AI hype cools, analysts agree that thoughtful, problem-focused tools are likely to outlast flashy applications, shaping the future of the Internet in ways that matter to everyday users.
