Across my two blog posts, Using ChatGPT As A Data Analysis Assistant and Data Analysis By ChatGPT, I covered seven examples of data analysis via ChatGPT:
- Retail Format Price Comparison
- Stock Portfolio Prediction Tracker
- Pizza Coke Sales Correlation
- Marketing Campaign Strategy
- ETF Composition
- Is G7 Obsolete?
- ChatGPT And The Art Of Motorcar Maintenance
Since then, I’ve heard several objections about hallucination, data sources, data quality, and accuracy of analysis.
In this post, I’ll address them.
1. Hallucination-1
Objection: Articles cited by ChatGPT don’t exist. Seems like ChatGPT is hallucinating.
Rebuttal: When you click a link cited by ChatGPT, you might get a “Page Not Found” error. You might conclude that ChatGPT is hallucinating. However, this might have to do with linkrot on the Internet rather than hallucination in GenAI. According to a Pew Research study cited by FORTUNE magazine, 38% of webpages that were accessible in 2013 are no longer accessible today.
In other words, 38% of links have rotted in 10 years. So it’s quite possible that the webpage was there when ChatGPT trained on its content but had disappeared by the time you and I clicked the link provided in ChatGPT’s answer.
Linkrot affects the entire Internet and is not unique to ChatGPT.
2. Hallucination-2
Objection: Citations provided by ChatGPT don’t contain the content it claims. Seems like ChatGPT is making things up as it goes along, aka hallucinating.
Rebuttal: I’ve experienced this occasionally in ChatGPT’s summaries of large documents, but I want to share a firsthand experience suggesting that the hallucination charge is perhaps exaggerated. Many times, what J6P (Joe Six Pack / Jane Six Pack aka Common Man / Common Woman) calls hallucination is a sign that s/he cannot access the cited source as deeply as ChatGPT / the LLM can.
In a recent answer, ChatGPT provided some valuable data. But I could not find any mention of it in the source it cited. At first blush, like everyone else, I blamed hallucination. But the information given by ChatGPT was too juicy for me to give up on. So, I reached out to the source.
They admitted that a past version of their article did carry the data mentioned by ChatGPT but that, over time, they found it to be too juicy to be given away for free, so they put it behind a paywall. This is what I call a “pinch point” in Six Best Practices To Convert Freemium To Premium. But I digress.
Not being paid subscribers, you and I may not have access to all the content on the source webpage, but ChatGPT may not face such restrictions.
Stumbled upon yet another paywall bypass hack: Ask ChatGPT to summarize an article. ChatGPT 3.5 keeps saying it cannot access the Internet but somehow this works. pic.twitter.com/NRFRrAv33r
— Ketharaman Swaminathan (@s_ketharaman) December 12, 2023
3. Source of Data
Objection: Somebody should develop an AI to look for stats, it is absolutely frustrating the way it is now, we spend hours looking for the most basic number.
Rebuttal: ChatGPT works for me for this. It can not only source basic numbers but also compute user-defined composite numbers.
To cite an example, I asked Chat, “Can you plot the contribution to global GDP of USA and G7 minus USA during the last 10 years?” ChatGPT understood that I’d defined a new variable, “G7 minus USA”, and was still able to compute the figures and give me a ready-made chart.
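For the curious, here’s a minimal Python sketch of the kind of computation ChatGPT’s data analysis feature performs behind the scenes. The use of the World Bank API’s NY.GDP.MKTP.CD indicator (GDP in current US$) is my assumption about a plausible data source, not something ChatGPT disclosed:

```python
# A minimal sketch, assuming the World Bank API's NY.GDP.MKTP.CD
# indicator (GDP, current US$) as the data source.
import requests
import matplotlib.pyplot as plt

G7 = ["USA", "CAN", "DEU", "FRA", "GBR", "ITA", "JPN"]

def gdp_series(codes, start=2014, end=2023):
    """Fetch GDP (current US$) per country per year from the World Bank API."""
    url = (f"https://api.worldbank.org/v2/country/{';'.join(codes)}"
           f"/indicator/NY.GDP.MKTP.CD?format=json&date={start}:{end}&per_page=1000")
    meta, rows = requests.get(url, timeout=30).json()
    out = {}
    for r in rows:
        if r["value"] is not None:
            out.setdefault(int(r["date"]), {})[r["countryiso3code"]] = r["value"]
    return out

g7 = gdp_series(G7)
world = gdp_series(["WLD"])  # "WLD" is the World Bank's aggregate for the world

years = sorted(set(g7) & set(world))
usa_share = [100 * g7[y]["USA"] / world[y]["WLD"] for y in years]
# The user-defined composite variable: "G7 minus USA".
g7_minus_usa_share = [100 * (sum(g7[y].values()) - g7[y]["USA"]) / world[y]["WLD"]
                      for y in years]

plt.plot(years, usa_share, marker="o", label="USA")
plt.plot(years, g7_minus_usa_share, marker="o", label="G7 minus USA")
plt.ylabel("Share of global GDP (%)")
plt.legend()
plt.show()
```

The point is not the code itself but that the “G7 minus USA” composite is just one extra line of arithmetic once the underlying series are in hand, which is why ChatGPT handles such ad-hoc variables so easily.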
4. Unofficial Data Sources
Objection: ChatGPT is not trained on official statistics databases and scientific articles, and reports only what is in the media, which is often wrong.
Rebuttal: ChatGPT admits that OpenAI has not explicitly listed official statistics databases (e.g., from agencies like the UN or World Bank) or scientific databases (e.g., PubMed or IEEE) as part of its training data, and that the model has likely been exposed to general information drawn from public sources and literature that discusses or references such data. However, it refutes the charge that this methodology makes its data / analysis wrong.
I happened to notice the following line in Chat’s reply:
OpenAI has not explicitly listed official statistics databases as part of the training data…
I was intrigued by ChatGPT’s choice of words. It did not unequivocally say, “NO, I’m not trained on official statistics databases”. Going by its hedging tone, I can’t help wondering if ChatGPT is actually trained on official stats but OpenAI does not want to go on record as having done so.
5. Accuracy
Objection: Have you done a fact check on the data analysis carried out by ChatGPT?
Rebuttal: IMO, factchecking is pointless. Google is God even though it makes bloomers. There was a time when I was at Marina Beach (a beach in Chennai, a city in South India on the shores of the Bay of Bengal) and the Bay of Bengal was less than 500m away from me, but Google Maps placed me at Gemini Flyover and showed the beach to be three kilometers away. People believed Google Maps, not what I saw with my own eyes.
So, I’d rather spend my time on accelerating the day when ChatGPT / GenAI is accepted as Gospel Truth (assuming that day has not already come). Because, notwithstanding what I do and say, people will look up ChatGPT and tell me I’m wrong if my results vary from its answer.
And, why not? From what I know of Consumer Behavior, trust is proportional to popularity, and J6P will always trust the most popular product in their lifetime over unknown official data sources.
Obviously, the above is true only in B2C contexts.
When it comes to businesses, there are several possibilities.
Some businesses might be satisfied with first-cut data analysis via ChatGPT because their real-world use cases don’t analyze data for the sake of analyzing data but only to get actionable insights. As long as the actions they take on the basis of ChatGPT’s output work out seven times out of 10, they’re happy – because, as the saying goes, “no one bats a thousand”.
Other businesses, especially those that deploy AI Agents to take autonomous actions based on GenAI’s output, cannot afford to take this expedient approach and will need accurate results. They can get that by licensing official statistics sources and using them to finetune and / or RAG their GenAI platforms.
For the uninitiated:
- Finetuning is the implementation-time activity of retraining the LLM on additional data. Finetuning modifies model weights.
- Retrieval Augmented Generation (RAG) is the runtime activity of augmenting the prompt with additional data. RAG does not modify model weights.
To give a crude analogy with ERP, finetuning is like customization, which changes the source code of the base product, whereas RAG is like extension, which leaves the source code intact.
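To make the distinction concrete, here’s a toy sketch of RAG. The llm() helper is a hypothetical placeholder for a real model call, and retrieval here is a crude bag-of-words cosine similarity; real systems use embedding models and vector databases:

```python
# A toy RAG sketch. llm() is a hypothetical stand-in for a real model call.
import math
from collections import Counter

# A stand-in for a licensed corpus; the first entry is a fact cited
# earlier in this post, the others are placeholders.
docs = [
    "Pew Research: 38% of webpages accessible in 2013 are no longer accessible today.",
    "Placeholder: a licensed official statistics record.",
    "Placeholder: a scientific article abstract.",
]

def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    return dot / (math.sqrt(sum(v * v for v in a.values())) *
                  math.sqrt(sum(v * v for v in b.values())) or 1)

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    q = vectorize(query)
    return sorted(docs, key=lambda d: cosine(q, vectorize(d)), reverse=True)[:k]

def llm(prompt):
    # Hypothetical placeholder: in practice, a call to an LLM API.
    return f"[LLM answer based on prompt: {prompt!r}]"

query = "What share of 2013-era webpages are no longer accessible?"
context = "\n".join(retrieve(query))
# RAG augments the prompt at runtime; the model's weights are untouched.
answer = llm(f"Answer using only this context:\n{context}\n\nQuestion: {query}")
print(answer)
```

Note where the licensed data enters: it’s stitched into the prompt at answer time. That’s why RAG is the cheaper path for businesses that license official statistics sources, while finetuning requires a retraining run.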
6. Explainability
Objection: ChatGPT is a black box; how do we know if its analysis complies with regulations around data privacy, non-discrimination, etc.?
Rebuttal: Even traditional analytics platforms are answerable to questions around bias, discrimination, etc. For example, banks cannot deny a loan to someone on the basis of a protected class like religion. If sued by a rejected borrower, lenders must prove in a court of law (or to a banking ombudsman or arbitrator) that their decision to deny the loan was not based on the borrower’s religion. They cannot simply throw up their hands and say “Our algorithm told us to reject this application”. Analytics software helps them address this via a feature called “explainability”.
GenAI will do the same. While explainability requires extra work around data governance and safety guardrails, it’s not a totally new challenge for GenAI.
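Here’s a minimal sketch of what explainability can look like, assuming a lender’s scoring model is a simple logistic regression over non-protected features; the feature names and data are illustrative, not from any real lender:

```python
# A minimal explainability sketch; features and data are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

features = ["income", "debt_ratio", "missed_payments"]  # no protected classes
X = np.array([[60, 0.2, 0], [25, 0.6, 3], [40, 0.4, 1], [80, 0.1, 0],
              [30, 0.7, 4], [55, 0.3, 0], [20, 0.8, 5], [70, 0.2, 1]])
y = np.array([1, 0, 1, 1, 0, 1, 0, 1])  # 1 = approved, 0 = denied

model = LogisticRegression().fit(X, y)

def explain(applicant):
    """Per-feature contribution to the log-odds: coefficient x feature value.
    This is the kind of evidence a lender can produce: the decision rests
    on these inputs, none of which is a protected class."""
    contributions = model.coef_[0] * applicant
    for name, c in sorted(zip(features, contributions), key=lambda t: t[1]):
        print(f"{name:16s} {c:+.2f}")
    print(f"{'intercept':16s} {model.intercept_[0]:+.2f}")

explain(np.array([22, 0.75, 4]))  # a denied applicant
```

Real deployments use more sophisticated attribution methods (e.g., SHAP values), but the principle is the same: every decision can be decomposed into documented, non-protected inputs.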
7. Pedestrian
Objection: The analysis in the car repair estimate is pedestrian.
Rebuttal: That’s a feature, not a bug, of a lot of technology. One of the things that hasn’t changed in my entire career in the IT industry is the line “XYZ frees you up from mundane activities and lets you focus on strategic areas”. Only the nature of the strategic areas has changed; the goal of technology has remained constant.
The whole point of technology is to save people from mundane tasks and pedestrian insights so that they’re freed up to work on strategic areas.
— Ketharaman Swaminathan (@s_ketharaman) November 9, 2024
While solving the data analysis problems described in Using ChatGPT As A Data Analysis Assistant and Data Analysis By ChatGPT, I also checked out Microsoft Office Help, Stack Overflow, and other traditional sources of help with formulas and other facets of data analysis. In every case, ChatGPT’s step-by-step guides were far clearer and easier to implement than what I got from the traditional sources. At every stage, it offered to drill down deeper and moron-proof its instructions, unlike members of forums, who are in a rush to end the conversation and often exhibit racist and xenophobic tendencies. Another great thing about ChatGPT data analysis is that it automatically cleansed the data I uploaded – no other data analysis platform I’ve used in the past has done that.
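For a sense of what that cleansing involves, here’s a sketch of the kind of routine hygiene ChatGPT plausibly applied to my upload. The file name and column names are hypothetical, and the steps are my guess at typical practice, not a disclosed pipeline:

```python
# A guess at typical data hygiene; file and column names are hypothetical.
import pandas as pd

df = pd.read_csv("sales.csv")  # hypothetical uploaded file

# Normalize column names: strip whitespace, lowercase, snake_case.
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

# Drop exact duplicate rows.
df = df.drop_duplicates()

# Coerce types; unparseable values become NaT/NaN instead of crashing.
df["date"] = pd.to_datetime(df["date"], errors="coerce")
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")

# Drop rows that failed parsing.
df = df.dropna(subset=["date", "amount"])
```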