GPT-4 performed close to the level of expert doctors in eye assessments

2024-04-18 15:15:17 | Engadget

As large language models (LLMs) continue to advance, so do questions about how they can benefit society in areas such as the medical field. A recent study from the University of Cambridge's School of Clinical Medicine found that OpenAI's GPT-4 performed nearly as well in an ophthalmology assessment as experts in the field, the Financial Times first reported.

In the study, published in PLOS Digital Health, researchers tested the LLM, its predecessor GPT-3.5, Google's PaLM 2 and Meta's LLaMA with 87 multiple-choice questions. Five expert ophthalmologists, three trainee ophthalmologists and two unspecialized junior doctors received the same mock exam. The questions came from a textbook for testing trainees on everything from light sensitivity to lesions. The contents aren't publicly available, so the researchers believe the LLMs couldn't have been trained on them previously. ChatGPT, equipped with GPT-4 or GPT-3.5, was given three chances to answer definitively; otherwise, its response was marked as null.

GPT-4 scored higher than the trainees and junior doctors, getting 60 of the 87 questions right. While this was significantly higher than the junior doctors' average of 37 correct answers, it only just beat the three trainees' average of 59.7. While one expert ophthalmologist answered only 56 questions accurately, the five had an average score of 66.4 right answers, beating the machine. PaLM 2 scored 49, and GPT-3.5 scored 42. LLaMA scored the lowest at 28, falling below the junior doctors. Notably, these trials occurred in mid-2023.

While these results have potential benefits, there are also quite a few risks and concerns. The researchers noted that the study offered a limited number of questions, especially in certain categories, meaning the actual results might vary. LLMs also have a tendency to "hallucinate," or make things up. That's one thing if it's an irrelevant fact, but claiming there's a cataract or cancer is another story. As is the case in many instances of LLM use, the systems also lack nuance, creating further opportunities for inaccuracy.

This article originally appeared on Engadget at https://www.engadget.com/gpt-4-performed-close-to-the-level-of-expert-doctors-in-eye-assessments-131517436.html?src=rss
