The new normal? – AI chatbots for exploratory QDA

Aug 30, 2023, Maniish Karve

Exploratory data analysis (EDA) within qualitative research is a time-consuming endeavor, requiring us to navigate through mountains of text. Most experienced researchers have developed their own hacks, little tools and tricks for this, but hey, there is still no getting away from the effort to distill early insights. 

Since November of last year, ChatGPT has been revealing its bag of tricks, and many people are doing surprising things with it. Bard is also making its presence felt, as are HuggingFace and Xiaoice (in China). And there are multiple derivative apps that use the same LLMs for their own chatbots.

The accessibility of these chatbots, and their ability to create content and even hold conversations, is going to trigger a transformative shift in qualitative data analysis (QDA) – dare I say, it already has – offering researchers an innovative way to engage with and derive deeper meaning from qualitative data.

Exploratory Data Analysis (EDA) in Qualitative Research

EDA has been the cornerstone of analytics for years, and that holds for qualitative data as well. Traditional methods for EDA have included:

  • Open coding and analysis, which involves assigning codes to significant statements and iteratively quantifying words, emotions, semantics and themes. 
  • Memo writing or note taking, which documents insights
  • Comparative content analysis, which organizes data into matrices in order to compare cases

I’m sure I have missed a few, but essentially this is what I have been exposed to. These manual techniques emphasize close reading, inductive reasoning, and an understanding of context to reach early insights and then iterate across the full scope of the project.

The Arrival of ChatGPT: A New Paradigm

Enter ChatGPT, Bard and the rest – these harness the power of natural language processing (NLP) to facilitate dynamic interactions with text-based data. The difference between NLP three or four years ago and NLP now is what powers it: Large Language Models (LLMs). These are models trained on massive datasets of unimaginable size, covering an endless variety of topics. But enough geek-speak.

The emergence of ChatGPT, Bard et al. introduces a novel approach to qualitative research. It offers researchers an innovative set of tools to navigate the complexities of textual data. Here is a list of emerging ways in which we can redefine EDA using these tools:

  • Summarization: extracting the essence
  • Identifying Trends: suggesting recurring patterns
  • Theme Labeling: categorization into thematic codes
  • Thought-Provoking Questions: Q&A with the data
  • Visualizing Connections: mapping connections visually
  • Multilingual Insights: working across multiple languages with automated translation

Examples: not talking through my hat!

Let's bring these concepts to life with a few examples:

Summarizing Complexity: Imagine you're conducting research on mental health perceptions. By utilizing ChatGPT, you can summarize a plethora of individual stories into a concise overview: "The majority of participants value the role of community support in their mental health journey, while some highlight the need for improved access to professional help."

Tracking Repeated Mentions: Consider a study on sustainable living. ChatGPT can identify recurring mentions of "eco-friendly habits" and "impact on future generations," revealing prominent trends in participants' perspectives.

Naming Emerging Themes: In healthcare research, ChatGPT could propose thematic labels like "Concept - Overall Reaction" and "Concept - Positive Reactions", aiding researchers in framing their analysis.

But all this is theory. At flowres.io we decided to run a simple test and share it with you folks.

We took a couple of publicly available transcripts, gave them as input to ChatGPT (minus all personally identifiable information, or PII), and asked questions using a variety of prompts.
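
For anyone who prefers to work through the API rather than the chat window, a minimal sketch of that kind of round is below. To be clear, this is illustrative only – the file name, model choice and questions are my placeholders, not the exact setup we used.

```python
# A minimal sketch (not our exact setup): one de-identified transcript
# excerpt, several exploratory questions asked against it.
# Uses the pre-1.0 openai package interface (openai.ChatCompletion).
import openai

openai.api_key = "YOUR_API_KEY"  # read from an environment variable in practice

with open("transcript_excerpt.txt", encoding="utf-8") as f:
    excerpt = f.read()  # PII stripped before this point

questions = [
    "Summarize this discussion in five sentences.",
    "What recurring concerns do participants raise, and how do they phrase them?",
    "Suggest three thematic labels for this excerpt, with one supporting quote each.",
]

for question in questions:
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0.2,  # keep the model close to the source text
        messages=[
            {"role": "system",
             "content": "You are a qualitative research assistant. "
                        "Answer only from the transcript provided."},
            {"role": "user", "content": f"Transcript:\n{excerpt}\n\n{question}"},
        ],
    )
    print(question, "\n", response["choices"][0]["message"]["content"], "\n")
```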

Here are the results for you to see first-hand.

EDA Here

As you can see, this can become quite involved, very quickly, but more on that a little later.

Harmonizing AI and Human Expertise: smart prompting and use of tools

While ChatGPT introduces an array of possibilities for qualitative data analysis, it's crucial to emphasize that AI is a tool that complements, rather than replaces, human expertise. The nuances, cultural sensitivities, and contextual understandings that researchers bring to the table remain invaluable. The synergy between AI capabilities and human insights enhances the quality and depth of research outcomes.


Even for drawing early insights, all researchers need to learn at least the art of crafting prompts. Remember that the answer elicited from the input you provide depends entirely on how you structure the question.
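
To make that concrete, here is an illustrative pair of prompts (not lifted from our test) showing the difference that structure makes:

```python
# Illustrative only: the same need expressed two ways. The first leaves the
# model free to ramble; the second defines role, scope, evidence and brevity.
weak_prompt = "What do people think about mental health support?"

structured_prompt = (
    "You are a qualitative research assistant.\n"
    "Using ONLY the transcript below, list the three concerns participants "
    "raise most often about mental health support.\n"
    "For each concern, give one short verbatim quote with the respondent "
    "label (e.g. R3).\n"
    "Keep the whole answer under 150 words.\n\n"
    "Transcript:\n{transcript_excerpt}"
)
# Fill the template with .format(transcript_excerpt=...) before sending.
```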


Here is a quick screen grab of some weak prompting from the example.



Luckily, the models are evolving at a rapid pace, and strong control over context is within reach.

In my experience, integration of ChatGPT into our research processes can usher in tremendous time savings and deeper early analysis.


Here is an example of summarization, themes and sentiment analysis on flowres.io, using APIs.
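
For the technically inclined, here is a rough sketch of what such a call can look like. It is purely illustrative – not the flowres.io implementation – and the model, prompt wording and output fields are my assumptions.

```python
# Illustrative sketch: asking for summary, themes and sentiment in one
# structured request. The JSON schema below is an assumption for this example.
import json
import openai

openai.api_key = "YOUR_API_KEY"

def analyse_excerpt(excerpt: str) -> dict:
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,  # deterministic-ish output makes the JSON easier to parse
        messages=[
            {"role": "system",
             "content": "You analyse qualitative research transcripts. "
                        "Reply with valid JSON only."},
            {"role": "user",
             "content": "Return a JSON object with keys 'summary' (3 sentences), "
                        "'themes' (a list of short labels) and 'sentiment' "
                        "(positive/neutral/negative plus a one-line rationale), "
                        f"based only on this excerpt:\n{excerpt}"},
        ],
    )
    # The model usually honours the JSON instruction; re-prompt if it does not.
    return json.loads(response["choices"][0]["message"]["content"])
```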

Watch out: Control and Hallucination

Having done these experiments, I can attest to the efficiency of using ChatGPT to aid EDA of my transcripts, and indeed across multiple interactions. However, there are some things we must always keep in mind.

Hallucination: AI models do at times invent data that doesn’t exist, i.e. content that is factually incorrect, irrelevant or just plain nonsensical in the context of what you are working on, and obviously this can lead to enormous complications around veracity.

Lack of Contextual Understanding: AI models may struggle to grasp the nuanced context of qualitative data. They might misinterpret sarcasm, cultural references, or idiomatic expressions.

Inherent Bias: AI models can unintentionally learn and reinforce biases present in the training data. This can lead to skewed or unfair interpretations.

Loss of Human Sensitivity: Relying solely on AI-generated insights might sideline the nuanced interpretations that human researchers provide. AI lacks empathy and a deep understanding of human emotions, potentially missing subtle cues within narratives.

Generalization: AI-generated summaries, themes, or questions may oversimplify complex and unique individual stories.

Unintended Suggestions: AI models might propose codes, themes, or questions that align with their training data but do not fit the specific research context. This could lead researchers down incorrect analytical paths.

Data Privacy Concerns: Sharing sensitive or confidential data with AI models raises concerns about data security and privacy breaches. Researchers must carefully manage data access and storage.

Transparency: The inner workings of AI models like ChatGPT and Bard are complex and not always transparent. It can be challenging to understand how certain suggestions or insights are generated, making it difficult to validate or replicate findings.

Here are some tips I can share from my experience.

  • Be very clear in your prompts: use simple words and define output parameters such as creativity, brevity and role
  • Always ask the tool to cite specifics from the actual input that support its summary
  • Don’t go down the rabbit hole during Q&A: evaluate the responses from your first couple of questions, and reuse the prompts that work, as applicable, in every interaction
  • Follow a path of exploration; never ask too many questions in a single prompt
  • There are input limits, so break long transcripts into parts and prompt in a way that covers every part (see the sketch after this list)
  • Always provide complete and relevant context, for example the relevant section of the transcript rather than just two responses
  • Before you feed any data into these tools directly, strip away all PII (again, see the sketch below)
  • If possible, use APIs and sign DPAs (Data Protection Agreements) with tool providers
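
On the last few points, here is a minimal sketch of what chunking and PII stripping can look like in practice. The character budget and redaction patterns are assumptions – a real project needs a far more careful de-identification pass.

```python
# Minimal sketch of two tips above, with assumed limits and patterns:
# split a long transcript into chunks that fit the model's input limit,
# and crudely redact obvious PII before anything leaves your machine.
import re

MAX_CHARS = 8000  # assumed budget per prompt; tune to the model's real limit

def redact_pii(text: str) -> str:
    """Very rough redaction of emails and phone numbers. Real projects also
    need to handle names, places, employers and other identifiers."""
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)
    text = re.sub(r"\+?\d[\d\s()-]{7,}\d", "[PHONE]", text)
    return text

def chunk_transcript(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Greedily group paragraphs so chunks stay under max_chars
    (a single oversized paragraph would still exceed it; fine for a sketch)."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current.strip())
            current = ""
        current += para + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks

# Usage: redact first, then prompt the model chunk by chunk (or summarize each
# chunk and then summarize the summaries) so every part of the data is covered.
```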

In conclusion: collaboration with AI

Integrating ChatGPT into our QDA processes enriches our analysis toolkit. There are multiple ways to leverage AI, but ultimately AI has its limitations. There is no shortage of conversation in the media outlining both the pros and the cons of this integration.

From our perspective, if we deploy this fantastic tool mindfully, we can understand our own data better, allowing researchers to focus where they can make the greatest impact – generating actionable insights.

Yet, remember that while ChatGPT provides powerful assistance, your expertise and nuanced interpretation remain indispensable. By harmonizing AI capabilities with human insights, you can navigate the intricate world of qualitative data with speed, precision and depth.

Maniish Karve

In next week's blog, I will share what tools myMRPlace has developed for Qualitative Data Analysis.

PS: If you’d like to know about how myMRPlace is building tools to greatly improve efficiency while dealing with some of the conundrums, ethical and procedural, do get in touch. Happy to discuss and discover together.  

Get in Touch
