ChatGPT’s Accuracy Is Getting Worse Over Time, Study Shows

By Samson Ononeme

Jul 20, 2023

Key Insights

  • Researchers found that later versions of ChatGPT answered the same questions less accurately over several months, and they could not explain the decline.
  • In light of the findings, analysts recommend that ChatGPT users monitor the chatbot's output to confirm that it remains reliable.
  • A Coinbase study found that ChatGPT fell short of the required level of analysis accuracy, mistakenly identifying high-risk assets as low-risk for listing on the platform.

Researchers at Stanford and UC Berkeley found that later versions of ChatGPT were much less likely to provide accurate answers to the same questions over several months. However, they were unable to explain why this happened.

In light of the findings, analysts are urging everyone who uses ChatGPT to put some form of ongoing monitoring in place to ensure that the chatbot's answers remain reliable.
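
As a rough illustration of what such monitoring could look like, the sketch below re-scores a fixed set of prompts and flags a drop in accuracy against a stored baseline. It is a minimal sketch under assumptions: the function names `accuracy` and `has_regressed`, the 5-point tolerance, and the sample answers are all hypothetical and are not taken from the study.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], expected: Sequence[str]) -> float:
    """Return the share of predictions that match the expected answers."""
    if len(predictions) != len(expected):
        raise ValueError("predictions and expected must have the same length")
    if not expected:
        raise ValueError("need at least one test case")
    correct = sum(p.strip().lower() == e.strip().lower()
                  for p, e in zip(predictions, expected))
    return correct / len(expected)


def has_regressed(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """Flag a regression when accuracy falls more than `tolerance` below baseline."""
    return (baseline - current) > tolerance


# Hypothetical answers from two snapshots of the same model to the same prompts.
expected = ["yes", "no", "yes", "no"]
march_answers = ["yes", "no", "yes", "no"]
june_answers = ["no", "no", "no", "no"]

baseline = accuracy(march_answers, expected)
current = accuracy(june_answers, expected)
print(f"baseline={baseline:.2f} current={current:.2f} "
      f"regressed={has_regressed(baseline, current)}")
```

In practice the stored baseline would come from an earlier run on the same prompts, so that a change in the score points to a change in the model rather than in the test set.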

How was the research on ChatGPT’s accuracy conducted?

To test how reliable the various ChatGPT models are, the researchers asked the GPT-3.5 and GPT-4 models to solve several mathematical problems, answer sensitive questions, and write lines of code.

The results showed that in March, GPT-4 correctly identified prime numbers in 97.6% of cases. In the same test in June, GPT-4's accuracy dropped to 2.4%. Over the same period, the earlier model, GPT-3.5, became better at identifying prime numbers.
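
The prime-number test can be approximated with a small scoring harness. The following is a sketch under assumptions, not the researchers' code: `always_no` is a hypothetical stand-in for a real chatbot call (it simply answers "No" every time), and the list of numbers is made up for illustration.

```python
from typing import Callable, Iterable


def is_prime(n: int) -> bool:
    """Ground truth by trial division, adequate for small test numbers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def score_prime_benchmark(numbers: Iterable[int],
                          ask_model: Callable[[int], str]) -> float:
    """Return the fraction of numbers whose yes/no answer matches the truth."""
    numbers = list(numbers)
    if not numbers:
        raise ValueError("need at least one number")
    correct = 0
    for n in numbers:
        answer = ask_model(n).strip().lower()
        said_prime = answer.startswith("yes")
        if said_prime == is_prime(n):
            correct += 1
    return correct / len(numbers)


def always_no(n: int) -> str:
    """Hypothetical stand-in for a model call; always denies primality."""
    return "No"


# Illustrative input: three primes and two composites.
test_numbers = [7, 9, 11, 15, 17]
print(score_prime_benchmark(test_numbers, always_no))

# The drop reported in the article, in percentage points.
drop_points = round(97.6 - 2.4, 1)
print(drop_points)
```

Running the same numbers through the same scorer at two points in time is what makes the March and June figures comparable.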

When it came to generating lines of new code, the capabilities of both models deteriorated significantly in the three months from March to June.

ChatGPT’s responses to sensitive questions – with some examples emphasizing ethnicity and gender – later became more concise.

An earlier version of the chatbot provided detailed explanations of why some sensitive questions could not be answered.


ChatGPT misidentified high-risk assets as low-risk

Another study by cryptocurrency exchange Coinbase showed that ChatGPT was unable to achieve the required level of analysis accuracy.

In five out of eight cases, the chatbot identified high-risk assets as low-risk and approved them for listing on the platform.


In addition, the AI bot cannot recognize situations in which it lacks enough data for a quality analysis.

However, 75% of traders are still willing to trust ChatGPT’s financial advice.

The Investor Index study showed that users trust the financial advice of an artificial intelligence chatbot. At the same time, experts note that people rely less on the recommendations of professional consultants and prefer to study the situation on their own.

How can the findings of the research impact the use of ChatGPT in various applications, including financial advice? Let us know your thoughts in the comments.

