Exploratory data analysis (EDA) in qualitative research is a time-consuming endeavor that requires us to work through mountains of text. Most experienced researchers have developed their own hacks, little tools, and tricks for this, but hey, there is still no getting away from the effort needed to distill early insights.
Since November of last year, ChatGPT has been revealing its bag of tricks, and many people are doing surprising things with it. Bard is also making its presence felt, as are Hugging Face and Xiaoice (in China). And there are multiple derivative apps that use the same LLMs for their own chatbots.
The accessibility of these chatbots, and their ability to create content and even hold conversations, is going to trigger a transformative shift in qualitative data analysis (QDA); dare I say, it already has. They offer researchers a new way to engage with and derive deeper meaning from qualitative data.
EDA has been a cornerstone of analytics for years, and that holds true for qualitative data as well. Traditional methods for EDA have included
I’m sure I have missed a few, but essentially this is what I have been exposed to. These manual techniques emphasize close reading, inductive reasoning, and understanding context to get to early insights and iterate for the full scope of the project.
Enter ChatGPT, Bard, and all the rest. These tools harness the power of natural language processing (NLP) to facilitate dynamic interactions with text-based data. The difference between NLP of maybe three or four years ago and NLP now is what powers it: large language models (LLMs). These are models trained on massive datasets of almost unimaginable size, covering a huge variety of topics. But enough geek-speak.
The emergence of ChatGPT, Bard, et al. introduces a novel approach to qualitative research. It offers researchers an innovative set of tools to navigate the complexities of textual data. Here is a list of emerging ways in which we can redefine EDA using these tools.
Let's bring these concepts to life with a few examples:
Summarizing Complexity: Imagine you're conducting research on mental health perceptions. By utilizing ChatGPT, you can summarize a plethora of individual stories into a concise overview: "The majority of participants value the role of community support in their mental health journey, while some highlight the need for improved access to professional help."
Tracking Repeated Mentions: Consider a study on sustainable living. ChatGPT can identify recurring mentions of "eco-friendly habits" and "impact on future generations," revealing prominent trends in participants' perspectives.
Naming Emerging Themes: In healthcare research, ChatGPT could propose thematic labels like "Concept - Overall Reaction" and "Concept - Positive Reactions", aiding researchers in framing their analysis.
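If ChatGPT tells you a phrase keeps coming up, it is worth a quick sanity check against the raw text. Here is a minimal sketch of how that could look in Python, assuming your responses are already loaded as a list of strings. The function name `count_mentions` and the sample responses are made up for illustration; they are not from any real study.

```python
import re
from collections import Counter


def count_mentions(responses, phrases):
    """Count how many responses mention each phrase at least once (case-insensitive)."""
    counts = Counter()
    for text in responses:
        lowered = text.lower()
        for phrase in phrases:
            # Word boundaries stop a phrase from matching inside a longer word.
            pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
            if re.search(pattern, lowered):
                counts[phrase] += 1
    return counts


# Hypothetical responses, invented for illustration only.
responses = [
    "I try to build eco-friendly habits, like composting.",
    "My main worry is the impact on future generations.",
    "Eco-friendly habits are hard to keep when they cost more.",
]
phrases = ["eco-friendly habits", "impact on future generations"]

mentions = count_mentions(responses, phrases)
for phrase in phrases:
    print(f"{phrase}: {mentions[phrase]} of {len(responses)} responses")
```

An exact-phrase count like this will miss paraphrases ("greener routines", say), which is precisely where the chatbot earns its keep. The count is a cross-check, not a substitute.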
But all this is theory. At flowres.io we decided to run a simple test and share it with you folks.
We took a couple of publicly available transcripts, fed them to ChatGPT (with all personally identifiable information, or PII, removed), and asked questions using different prompts.
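For anyone wondering what stripping PII can look like in practice, here is a rough sketch in Python. It is not the process we used for this test, just one simple way to do a first pass. The patterns, the `redact` function, and the sample sentence are all assumptions for illustration, and pattern matching like this will not catch everything, so a human read-through is still needed.

```python
import re

# Illustrative patterns for two common PII types. These are assumptions,
# not an exhaustive or production-grade list.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text, names=()):
    """Replace emails, phone-like numbers, and known participant names with placeholders."""
    # Emails go first, so a name inside an address is not half-redacted.
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    for name in names:
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, "[NAME]", text, flags=re.IGNORECASE)
    return text


# Hypothetical transcript line, invented for illustration only.
sample = "Priya said to email her at priya@example.com or call 555-123-4567."
print(redact(sample, names=["Priya"]))
```

Names are passed in explicitly because there is no reliable pattern for them. You know who your participants are, so a list taken from your recruitment sheet is a sensible starting point.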
Here are the results for you to see first-hand.
As you can see, this can become quite involved, very quickly, but more on that a little later.
While ChatGPT introduces an array of possibilities for qualitative data analysis, it's crucial to emphasize that AI is a tool that complements, rather than replaces, human expertise. The nuances, cultural sensitivities, and contextual understandings that researchers bring to the table remain invaluable. The synergy between AI capabilities and human insights enhances the quality and depth of research outcomes.
Even to draw early insights, all researchers need to learn the art of writing prompts, at the very least. We must remember that the answer you get depends heavily on how you structure the question.
Here is a quick screen grab of some weak prompting from the example.
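By contrast, a more structured prompt spells out the role, the ground rules, the data, and the question. Here is a minimal sketch in Python of how you might assemble one. The `build_prompt` function, the wording of the rules, and the one-line transcript are all hypothetical. This only builds the text you would paste or send to a chatbot; it does not call any chatbot API.

```python
def build_prompt(transcript, question, role="a qualitative research assistant"):
    """Assemble a structured prompt: role, ground rules, data, then the question."""
    return "\n".join([
        f"You are {role}.",
        "Answer using only the transcript between the <transcript> tags.",
        "Quote the participant's words to support each point.",
        "If the transcript does not contain the answer, say so.",
        "",
        "<transcript>",
        transcript.strip(),
        "</transcript>",
        "",
        f"Question: {question.strip()}",
    ])


# A weak prompt gives the model nothing to anchor on.
weak_prompt = "What do people think?"

# Hypothetical transcript line, invented for illustration only.
strong_prompt = build_prompt(
    transcript="P1: I lean on my neighbours when things get hard.",
    question="What sources of support does the participant mention?",
)
print(strong_prompt)
```

Asking for supporting quotes and giving the model permission to say "not in the transcript" are two small habits that make the answers easier to verify against the source.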
Luckily, the models are evolving at a rapid pace, so stronger control over context is within reach.
In my experience, integration of ChatGPT into our research processes can usher in tremendous time savings and deeper early analysis.