For a researcher, extracting insights from qualitative, guided interactions is an intensive and time-consuming process. Even if we are working with notes rather than full transcripts, the sheer volume of text to go through is enormous. We need to extract not just the obvious themes but also the underlying subtext, subtleties and nuances of the data. And there is no real alternative to reading, and reading again, is there? This is how most of us started out.
Manual Coding
Let’s discuss how we all probably used to code earlier in our careers.
- We engaged with the data: Listened to the initial interactions, made our notes
- We created a structure of analysis: Built a code map from the discussion guide (DG), notes and transcripts
- We identified themes: Grouped codes into themes, revealing underlying patterns.
- We iterated: Marked up transcripts and refined codes/themes as we dug deeper into the data
- We sorted the data: Copied into Excel, created tables, and maybe did some math
- We summarized our findings: Various cuts by cohort, some numbers, and illustrative verbatims (a rough sketch of these last two steps follows this list)
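To make those last two steps concrete, here is a minimal sketch in Python with pandas of the sort-and-summarize stage. The respondents, cohorts, codes and themes are invented for illustration, not taken from any real study.

```python
# Hypothetical coded excerpts: one row per code applied to a transcript.
import pandas as pd

coded = pd.DataFrame([
    {"respondent": "R01", "cohort": "Urban", "theme": "Value",        "code": "price concern"},
    {"respondent": "R02", "cohort": "Urban", "theme": "Social proof", "code": "peer influence"},
    {"respondent": "R03", "cohort": "Rural", "theme": "Loyalty",      "code": "brand trust"},
    {"respondent": "R04", "cohort": "Rural", "theme": "Value",        "code": "price concern"},
])

# The cut-by-cohort table we used to build by hand in Excel:
# how many coded excerpts fall under each theme, per cohort.
summary = coded.pivot_table(index="theme", columns="cohort",
                            values="respondent", aggfunc="count", fill_value=0)
print(summary)
```

That table, a pile of verbatims per theme, and a few illustrative numbers were usually all we needed for the summary.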
There are definite benefits of this manual effort.
- Hands-on and in-depth: You develop a connection with the data, and are able to explore the depth of the conversations, rather than just the volume.
- Contextual Nuances: By directly engaging with the data, you're better equipped to capture context, emotions, and subtle nuances.
- Adaptability: You can adapt your analysis to unexpected discoveries and findings quickly.
But there are also challenges.
- Time-Consuming: It demands significant time and effort.
- Subjective: Researcher bias can inadvertently influence the coding process.
- Scalability: For large datasets, manual coding becomes laborious and less efficient.
Over time, all of us have developed our own methods and hacks to work around these challenges. However, I personally always have anxiety about missing some key piece of information, or about text overload. So, more hacks: ask a junior researcher or intern to copy data into Excel and sort it, mark the transcripts for completeness, visualize cohorts in tabular form, and more. And over time, third-party firms have cropped up that help me do just that at scale.
I have also followed the development of automated tools closely, and have been an early adopter. Let's take a look at how this has changed my world.
Automated Coding
It is more than obvious that technology should help us solve some, if not all, of the challenges we faced with manual effort. This view of how technology has shaped the coding journey is an attempt to summarize what my colleagues and I have seen over the last couple of decades.
- MS Office: Word and Excel were our go-tos. Transcripts were digitized, and these applications let us annotate, tabulate, comment, track changes, and the works. Even now, they are still commonly used, at least for exchanging information and analysis.
- Open-ended coding tools like Ascribe and Clarabridge: Though most industry-standard tools have since been acquired by large enterprises, these went a step beyond Office. You could build and iterate codeframes more easily, and collaborate when many individuals were involved. Over time, they also gained the ability to automatically tag codes and text.
- CAQDAS packages like NVivo, Delve and Atlas.ti: Computer-assisted qualitative data analysis software that could take on more complex tasks and reduce manual effort significantly. They evolved over time, and also became more complex themselves.
- Machine Learning and AI: This is where NLP, complex algorithms (LLMs, for one), self-learning systems, and hardware came together, at scale, to take on more research and reporting tasks. I’m sure everyone has heard of ChatGPT, DALL-E and Midjourney, which operate at unbelievable scale. Many CAQDAS packages now leverage these in various ways, and it promises a revolution (a simplified sketch of automated code tagging follows this list).
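To give a flavour of what automatic tagging looks like under the hood, here is a minimal, tool-agnostic sketch that matches open-ended responses to a codeframe using TF-IDF similarity with scikit-learn. The codeframe, code labels and responses are invented for illustration, and real CAQDAS and LLM pipelines are considerably more sophisticated.

```python
# A toy illustration of automated code assignment (not any vendor's actual method):
# match each open-ended response to the closest code by TF-IDF cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical codeframe: code label -> indicative phrases (not from any real project).
codeframe = {
    "price concern":  "too expensive, high price, cannot afford, cost increase",
    "brand trust":    "reliable brand, trust the company, consistent quality",
    "peer influence": "friends recommended it, saw it on social media, good reviews",
}
responses = [
    "My friends kept recommending it, so I finally tried it.",
    "The price is just too high for what you get.",
]

# Build one shared vocabulary across code descriptions and responses.
vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(list(codeframe.values()) + responses)
code_vecs, resp_vecs = matrix[: len(codeframe)], matrix[len(codeframe):]

codes = list(codeframe)
for text, sims in zip(responses, cosine_similarity(resp_vecs, code_vecs)):
    print(f"{codes[sims.argmax()]:<15} <- {text}")
```

Even with far better models doing the matching, a researcher still reviews and corrects the assignments, which is where much of the nuance is preserved.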
We all have experimented with these in some shape or form, sometimes successfully, sometimes not. Here are some observations from what friends and colleagues have experienced.
On the plus side –
- Speed and Efficiency: Automated coding significantly accelerates the process, ideal for time-sensitive research.
- Consistency: Automated methods reduce the potential for human bias, ensuring consistent results.
- Scalability: Handling large datasets becomes more manageable with automation.
The downsides –
- Features: Some of these tools have a truckload of features, many of which force you to change your established processes. And often you pay for things that you do not need.
- Ease of use: There is almost always a learning curve involved.
- Flexibility: Once you start using a tool, you are stuck with its algorithms, and its shortcomings too.
- Contextual loss: While they are getting better, nuances and subtleties still pose challenges. Incorrect data can also filter in.
- Human intelligence: Machines cannot think like a researcher, at least, not yet.
Machines also pose some interesting ethical conundrums:
- Data privacy
- Developer bias
- Consent and transparency
- Environmental impact
Our fundamental desire to generate better, faster and more actionable insights will require us to stay aware of new technologies, and of the boundaries we must draw between human and artificial intelligence. There have been books and movies that have excited us about the potential of AI, and also about its downsides. Debates will rage on for at least some years to come.
As for me, I am excited about the evolving role of the qualitative researcher in today’s world. And to this end, please check in next Thursday for more on advances in generative AI specifically that can help qualitative researchers stay relevant and on top of their game.
PS: If you’d like to know how myMRPlace is building tools to greatly improve efficiency while dealing with some of these conundrums, ethical and procedural, do get in touch. Happy to discuss and discover together.