Making the Most of Qualitative Data: The Story of Text Explorer

By Adam McMaster

This is a guest post by Adam McMaster and Meirin Evans, our friends at The Brilliant Club, whom we have partnered with since 2021. Adam joined The Brilliant Club in October 2022 for a three-month internship as part of his doctoral training program. Adam is in the third year of his Ph.D. at the Open University and his research focuses on black holes and variable stars. Meirin works in The Brilliant Club’s Research and Impact team as the Impact and Analysis Officer and supported Adam during his internship.

In working with The Brilliant Club’s Research and Impact team, Adam created an app to improve how we analyze qualitative data. In this blog, we share insights from Adam’s project and how it has impacted the work we do at The Brilliant Club.  

The Brilliant Club runs the U.K.’s largest university access program, The Scholars Programme. In the academic year 2021-22, we supported over 22,000 students through the program which, for our Research and Impact team, means a lot of data to process and analyze!

Alongside reporting on the numbers, an integral part of our evaluation work is understanding how pupils and Ph.D. tutors experience the program: what do they want to tell us about it? Collecting qualitative feedback via surveys is one mechanism that we use to listen to the voices of the young people who take part in our programs. But what do we do with the data once we have it?

Until now, working with the qualitative responses to our surveys meant having to read every response. While reading the responses, the usual requirement is to code each response. That means labeling or categorizing the responses in some way, to organize them and make it easier to refer back to subsets of responses later. Combined, our surveys receive thousands of responses per school term, so reading and coding everything is a lot of manual work. To make this process easier, we built an app that allows Brilliant Club staff to interactively browse and search the written answers to our surveys. It's called Text Explorer.

So, What Can Text Explorer Do?

In the next sections, we’ll detail the value of Text Explorer, features and capabilities in Dataiku that made the process easier and faster, and reasons other organizations might want to build their own version of Text Explorer.

Text Explorer is built in Dataiku, the platform for Everyday AI, which allows us to build data processing workflows that carry out data cleaning and analysis. It even includes plugins that can do a number of sophisticated things, such as natural language processing (NLP). Integrating these built-in features with Adam’s own Python code was simple and, as a result, we were able to put together a workflow that takes a set of surveys as input and produces a set of outputs that tell us about the answers to those surveys. 

Key Features of Text Explorer

Text Explorer has many useful features:

  • Text Explorer combines all the written survey answers into a unified data structure. Answers to our surveys are stored as a set of comments, with metadata recording which survey and which question each comment relates to. This way, comments can be explored from multiple surveys at once.
  • Text Explorer has a basic search interface where the user can enter a search query, along with various filtering options so that the results can be limited to particular surveys and to a given timeframe.
  • NLP techniques are also used to identify "trending" words in the most recent term's comments, as well as the most common phrases that appear in all of the surveys (using Dataiku’s ability to extract n-grams from sentences). Dataiku’s sentiment analysis is performed on all comments, allowing the user to filter results to, for instance, just the positive comments or only neutral comments. A similarity analysis is also performed, so that the user can view similar comments. Now, nobody has to read through every single comment!
  • Comments can be coded ad hoc, either individually or by adding a tag to all of the comments matching the current search and filtering options. This allows the user to code comments according to whatever criteria they need, or to organize the comments however they wish, in a flexible way.
  • Finally, there is a data export facility. The user can add comments to the export (like adding items to the basket on a shopping website) — again either individually or en masse from the current search/filtering options. Once the comments have been added, the user can click a button to download a file containing the comments along with their tags.
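
To make the features above a little more concrete, here is a minimal sketch in plain Python of the same ideas: a unified comment structure, search with filters, bulk tagging, phrase (n-gram) counting, and export. This is not Text Explorer's actual code, which runs inside Dataiku. The Comment class, the function names, the CSV layout, and the sample comments are all hypothetical and exist only for illustration. Sentiment and similarity analysis are left out because they rely on Dataiku's NLP capabilities.

```python
import csv
import io
from collections import Counter
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Comment:
    """One written survey answer plus metadata (hypothetical structure)."""
    survey: str
    question: str
    submitted: date
    text: str
    tags: set = field(default_factory=set)


def search(comments, query="", surveys=None, start=None, end=None):
    """Return comments containing `query`, limited to given surveys and dates."""
    needle = query.lower()
    return [
        c for c in comments
        if needle in c.text.lower()
        and (surveys is None or c.survey in surveys)
        and (start is None or c.submitted >= start)
        and (end is None or c.submitted <= end)
    ]


def tag_all(comments, tag):
    """Code every comment in a result set with the same tag."""
    for c in comments:
        c.tags.add(tag)


def top_ngrams(comments, n=2, k=3):
    """Count the k most common n-word phrases across the comments."""
    counts = Counter()
    for c in comments:
        words = c.text.lower().split()
        counts.update(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return counts.most_common(k)


def export_csv(comments):
    """Serialize comments and their tags as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["survey", "question", "submitted", "text", "tags"])
    for c in comments:
        writer.writerow([c.survey, c.question, c.submitted.isoformat(),
                         c.text, ";".join(sorted(c.tags))])
    return buffer.getvalue()


# Made-up sample data, not real survey responses.
comments = [
    Comment("Pupil survey", "Q1", date(2022, 11, 1), "My tutor was really helpful"),
    Comment("Tutor survey", "Q3", date(2022, 11, 5), "The pupils were really helpful too"),
    Comment("Pupil survey", "Q1", date(2021, 6, 1), "I enjoyed the final assignment"),
]

hits = search(comments, query="helpful", start=date(2022, 9, 1))
tag_all(hits, "support")
print(top_ngrams(comments, n=2, k=1))
print(export_csv(hits))
```

Because every comment carries its survey and question as metadata, a single search can span several surveys at once, and tagging or exporting "everything matching the current filters" is just a loop over the search results.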

What Difference Has Text Explorer Made?

Thanks to Text Explorer, built in Dataiku, it’s now possible to search, code, and export written responses from surveys far more easily than it was before. Not only has Text Explorer increased efficiency (i.e., saved hours of manual qualitative data processing) but, even more importantly, it has enhanced how we use the data. For example, we can readily compare responses between different surveys, extrapolate different sentiments, and identify meta-themes based on feedback from pupils and Ph.D. tutors.

The Brilliant Club’s Impact and Analysis Officer, Dr Meirin Evans, said about the tool: "I was able to compare the positivity of survey responses across two of our programs, which wouldn't have been possible before Text Explorer."

Adam McMaster said about his experiences of the internship: “I’m pleased I was able to create something that The Brilliant Club will find useful after my time here, while in the process I got to experience working on something very different to the data I usually use.” 

“We are so glad to see how this partnership with The Brilliant Club embraces the potential of Everyday AI via concrete applications of how a data project can accelerate a nonprofit’s mission in its day-to-day operations. I'm so glad to see how the team has been able to run this project successfully and can't wait to see what's next. I also hope it will inspire other nonprofits to join the Ikig.AI program.”

-Emilie Stojanowski, CSR & Ikig.AI Manager at Dataiku

We hope that this blog shows that, with the right tools, qualitative data can be systematically analyzed and used to understand the impact of large-scale programs, such as The Scholars Programme. Most importantly, qualitative analysis, just like quantitative analysis, needs investment: it takes time and needs the right tools and people behind it. For any questions about the article, please contact meirin.evans@thebrilliantclub.org.
