Can ChatGPT Tackle Your Medical Queries? Here's What Recent Studies Reveal
Everyone is talking about ChatGPT these days, but can it really answer everything?
ChatGPT, created by OpenAI and launched on November 30, 2022, is a chatbot built on a ‘large language model’ that can converse with users about virtually any topic in whatever style they choose.
But a new study found that it may not be reliably accurate when answering detailed medical questions.
According to a CNN report, researchers from Long Island University posed 39 medication-related questions to ChatGPT, all of them real queries received by the university’s College of Pharmacy drug information service. They then compared ChatGPT’s answers with responses written and reviewed by trained pharmacists.
The study revealed that ChatGPT gave “accurate responses” to only around 10 of the 39 questions. For the other 29, the answers were either “incomplete or inaccurate” or did not address the question at all. These results were presented at the annual meeting of the American Society of Health-System Pharmacists in Anaheim, California.
Given the tool’s popularity, the researchers worried that students, ordinary consumers, and even pharmacists might rely on ChatGPT for information about their health and medications.
According to the report, ChatGPT’s responses often provided inaccurate or even risky information. For example, when asked about taking two specific medicines together (a COVID-19 antiviral and a blood-pressure-lowering medication), ChatGPT said there would be no “adverse effects.” In reality, people who take both drugs can experience a significant drop in blood pressure, leading to dizziness and fainting. Clinicians usually create personalized plans for such patients, for instance adjusting the dose of the blood pressure medication or advising them to stand up slowly from a sitting position.
In 2022, ChatGPT passed the US medical licensing exam. As reported by TheHealthSite, researchers found that the chatbot learns from mistakes and improves over time, suggesting that, much like a medical student, its medical knowledge may grow with more experience and interaction with users. However, the chatbot tends to answer in technical language, which may help medical professionals but can be hard for laypeople to follow, making it difficult for them to obtain usable answers.
In the Long Island University study, when the researchers asked ChatGPT to provide scientific references to back up its answers, the software could only do so for eight questions, and it turned out that ChatGPT was “making up” those references. Because the responses were written in a polished, professional style, they could give users a false sense of confidence in the tool’s accuracy; readers who cannot tell the difference may be swayed by the appearance of authority.
Another notable study, by researchers from CSIRO and The University of Queensland, examined how different prompts affect the accuracy of the health information ChatGPT provides. The work is a significant step toward understanding how AI handles health questions, showing that the way a question is asked can strongly affect how reliable the answer is. In particular, it sheds light on how the wording of a prompt shapes ChatGPT’s responses in health information seeking, where accuracy is vital.
Using the TREC Misinformation dataset, the researchers systematically tested ChatGPT’s performance under different prompting conditions. They found that, when given the question alone, ChatGPT provided accurate health advice about 80% of the time; however, this accuracy dropped when additional information or biases were included in the prompts.
The study set up two main conditions: “Question-only,” where ChatGPT relied solely on the question, and “Evidence-biased,” where it was given extra context from a web search. This mirrored real-life situations where users either ask direct questions or provide background information from previous searches.
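To make the two conditions concrete, here is a minimal sketch of how such prompts might be assembled. The example question, the retrieved passage, and the helper functions are illustrative assumptions, not the study’s actual code or data.

```python
# Illustrative sketch of the two prompting conditions described above.
# The question, passage, and helper functions are hypothetical; the
# study's actual prompts and evaluation pipeline may differ.

def build_question_only_prompt(question: str) -> str:
    # "Question-only": the model sees nothing but the health question.
    return f"Answer yes or no, then explain: {question}"

def build_evidence_biased_prompt(question: str, passage: str) -> str:
    # "Evidence-biased": the same question, prefixed with a passage
    # retrieved from a web search, which may or may not be reliable.
    return (
        "Consider the following web search result:\n"
        f"{passage}\n\n"
        f"Answer yes or no, then explain: {question}"
    )

if __name__ == "__main__":
    question = "Does vitamin C cure the common cold?"  # example health query
    passage = "Some websites claim vitamin C prevents and cures colds."
    print(build_question_only_prompt(question))
    print(build_evidence_biased_prompt(question, passage))
```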
One of the study’s most notable findings is how the structure of the prompt significantly influences the accuracy of ChatGPT’s responses. While the model performed well with just questions, a closer look revealed a bias based on how the questions were framed and whether they expected a “yes” or “no” answer. This bias highlights the complexity of language processing in AI systems and stresses the importance of carefully crafting prompts.
Moreover, when ChatGPT was given additional evidence, its accuracy dropped to 63%. This shows that the model can be swayed by information in the prompt, challenging the idea that more context always leads to better answers. Interestingly, even correct evidence could worsen the model’s accuracy, underscoring the complex relationship between prompt content and AI response.
These findings are critical in a world where people increasingly rely on AI for health advice, and where ensuring the accuracy of AI-driven health information is vital. The research highlights the need for ongoing development to make AI systems more robust and transparent, especially when they handle health information.
Furthermore, the study’s insights into how prompt variation affects ChatGPT’s performance are significant for developing AI-powered health advice tools. The work stresses the importance of optimizing prompt design to reduce biases and inaccuracies, ultimately leading to more reliable AI-driven health services.
Overall, the findings reported by CNN, together with the research by CSIRO and The University of Queensland, mark a significant step in understanding AI’s abilities and limitations in handling health information. As AI becomes more prevalent, these insights will guide the development of more accurate and user-friendly health information tools.