ChatGPT, a large language model (LLM), has proven highly capable in a range of professional domains. It has passed challenging exams, including medical school exams, law school bar exams, and an MBA exam from the Wharton School of Business. However, there is one area where ChatGPT falls short: accounting.
Users have observed that ChatGPT struggles with even basic math. To investigate further, David Wood, a professor of accounting at Brigham Young University (BYU), led a comprehensive study of ChatGPT’s accounting abilities. Wood recruited collaborators from the global accounting community through social media, and 327 co-authors from 186 educational institutions across 14 countries took part. Together they supplied more than 25,000 accounting exam questions to test ChatGPT’s knowledge.
The results were clear: ChatGPT scored 47.4%, while human students averaged 76.7%, outperforming the machine by 29.3 percentage points. ChatGPT struggled particularly with tax, financial, and managerial assessment problems, which involved complex mathematical calculations.
This raises the question of how an AI system that is often portrayed as a potential threat to humanity can struggle with basic math. The answer is that ChatGPT is primarily a language model. It has been trained on vast amounts of text to model language patterns, not to perform numerical calculations. Its output is driven by probability rather than computation: it gives the answer with the highest statistical likelihood, which is not necessarily the correct one.
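As a rough illustration of that distinction, the toy Python sketch below contrasts picking the most frequent continuation with actually doing the arithmetic. It is not how ChatGPT is implemented: the counts are invented for the example, and the names `continuation_counts`, `most_likely_continuation`, and `compute_product` are hypothetical.

```python
from collections import Counter

# Hypothetical counts of how often each continuation followed the prompt
# "17 * 24 =" in an imaginary training corpus (invented for illustration).
continuation_counts = Counter({"418": 5, "408": 3, "400": 2})


def most_likely_continuation(counts):
    """Return the continuation with the highest relative frequency."""
    total = sum(counts.values())
    probabilities = {text: n / total for text, n in counts.items()}
    return max(probabilities, key=probabilities.get)


def compute_product(a, b):
    """Perform the multiplication exactly instead of predicting text."""
    return a * b


predicted = most_likely_continuation(continuation_counts)
exact = compute_product(17, 24)

print("Most likely continuation:", predicted)
print("Computed answer:", exact)
```

In this made-up corpus the most frequent continuation is a wrong answer, so the frequency-based choice and the computed product disagree. A real language model is far more sophisticated, but the sketch shows why "most likely" and "correct" can come apart.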
Paulo Shakarian, an associate professor at Arizona State University, studied ChatGPT’s performance on mathematical word problems. ChatGPT’s accuracy was significantly below that of state-of-the-art algorithms designed for math word problems. The results indicated that the model lacks the logical reasoning needed to solve multi-step math problems.
Despite its shortcomings in accounting and math, ChatGPT has demonstrated strengths in other areas. Christian Terwiesch, a professor at the Wharton School of Business, found that ChatGPT excelled at a case study on troubleshooting a process bottleneck at an iron ore factory. Its answers were correct and well explained, earning high praise.
ChatGPT also proves valuable in automating tedious tasks such as invoice processing, expense categorization, and data entry. Additionally, it offers educators like Professor Wood an opportunity to reflect on their teaching methods and improve the learning process for students.
While ChatGPT shows promise in various fields, it is not yet capable of handling complex accounting tasks or filing taxes. However, as researchers continue to refine and improve AI models, it may only be a matter of time before AI becomes proficient in these areas as well.