Uncovering themes in qualitative data can be daunting and difficult. Summarizing a quantitative study is relatively clear: you scored 25% better than the competition, let’s say. But how do you summarize a collection of qualitative observations?

In the early stages of a project, exploratory research is often carried out. This research often produces a lot of qualitative data, which can include:

  • Qualitative attitudinal data, such as people’s thoughts, beliefs, and self-reported needs, obtained from user interviews, focus groups, and even diary studies


Thematic analysis, which anyone can do, renders important aspects of qualitative data visible and makes uncovering themes easier.


Definition: Thematic analysis is a systematic method of breaking down and organizing rich data from qualitative research by tagging individual observations and quotations with appropriate codes, to facilitate the discovery of significant themes.

As the name implies, a thematic analysis involves finding themes.

Definition: A theme:

  • is a description of a belief, practice, need, or another phenomenon that is discovered from the data
  • emerges when related findings appear multiple times across participants or data sources

Challenges with Analyzing Qualitative Data

Many researchers feel overwhelmed by qualitative data from exploratory research conducted in the early stages of a project. The table below highlights some common challenges and resulting issues.


Challenge: Large quantity of data. Qualitative research results in long transcripts and extensive field notes that can be time-consuming to read; you may have a hard time seeing patterns and remembering what’s important.
Resulting issue: Superficial analysis. Analysis is often done very superficially, just skimming topics, focusing only on memorable events and quotes, and missing large sections of notes.

Challenge: Rich data. There is a lot of detail in every sentence or paragraph. It is hard to see which details are useful and which are superfluous.
Resulting issue: Analysis becomes a description of many details. The analysis simply becomes a regurgitation of what participants may have said or done, without any analytical thinking applied to it.

Challenge: Contradicting data. Sometimes the data from different participants, or even from the same participant, contains contradictions that researchers have to make sense of.
Resulting issue: Findings are not definitive. The analysis is not definitive because participant feedback is conflicting or, worse, viewpoints that don’t fit the researcher’s beliefs are ignored.

Challenge: No goals set for the analysis. The aims of the initial data collection are lost because researchers can easily become too absorbed in the detail.
Resulting issue: Wasted time and misdirected analysis. The analysis lacks focus and the research reports on the wrong thing.

Without some form of systematic process, the problems outlined easily arise when analyzing qualitative data. Thematic analysis keeps researchers organized and focused and gives them a general process to follow when analyzing qualitative data.

Tools and Methods for Conducting Thematic Analysis

A thematic analysis can be done in many different ways. The best tool or method for this process is determined based on the:

  • data
  • context and constraints of the data-analysis phase
  • researcher’s personal work style


Common methods include:

  • Using software
  • Journaling
  • Using affinity-diagramming techniques

Using Software

To analyze large amounts of qualitative data, qualitative researchers often use software known as CAQDAS (Computer-Aided Qualitative-Data-Analysis Software, pronounced “cak∙das”). Researchers upload transcripts and field notes into a software program and then analyze the text systematically through formal coding. The software helps with the discovery of themes by offering various visualization tools, such as word trees or word clouds, that allow the coded data to be manipulated in many different ways.


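Benefits:

  • The analysis is very thorough.
  • The physical project file (containing the raw data and the analysis) can be shared with others. (This method is popular for student projects in academic institutions.)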


Drawbacks:

  • Time-consuming, as it results in many codes that need to be condensed into a small, manageable list
  • Expensive
  • Hard to analyze with others synchronously
  • Requires some learning of the software
  • Can be restrictive


Journaling

Writing down the thought processes and ideas you have about a text is common among researchers practicing grounded-theory methodology. Journaling as a form of thematic analysis is based on this methodology and involves manual annotation and highlighting of the data, followed by writing down the researchers’ ideas and thought processes. The notes are known as memos (not to be confused with the office memo delivering news to employees).


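Benefits:

  • The process encourages reflection through the writing of detailed notes.
  • Researchers have a record of how they arrived at their themes.
  • The analysis is cheap and flexible.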


Drawbacks:

  • Hard to do collaboratively


Using Affinity-Diagramming Techniques

The data is highlighted, cut out physically or digitally, and reassembled into meaningful groups until themes emerge on a physical or digital board. (See a video demonstrating affinity diagramming.)


Benefits:

  • Can be done collaboratively
  • Quick to arrive at themes
  • Cheap and flexible
  • Visual, and supports an iterative analysis process


Drawbacks:

  • Not as thorough as the other methods, because segments of text are often not coded multiple times
  • Hard to do when the data is very varied or there is a lot of data

Codes and Coding


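Definition: A code is a word or phrase that acts as a label for a segment of text.

A code describes what the text is about and is shorthand for more complicated information. (A good analogy is that a code describes data the way a keyword describes an article or a hashtag describes a tweet.) Often, qualitative researchers will not only have a name for each code but will also keep a description of what the code means and examples of text that fit or don’t fit the code. These descriptions and examples are especially useful if more than one person is responsible for coding the data or if coding is done over a longer period of time.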
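For teams that keep their codebook in a script rather than in dedicated software, a code's name, description, and fit examples could be held in a small record. The following is a minimal Python sketch, not a prescribed format; the class name `CodebookEntry`, its field names, and the sample code and quotations are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CodebookEntry:
    """One code plus the notes that help coders apply it consistently."""
    name: str
    description: str
    fits: list[str] = field(default_factory=list)          # example text that fits the code
    does_not_fit: list[str] = field(default_factory=list)  # example text that does not fit


# Hypothetical entry; the code and both quotations are made up for this sketch.
entry = CodebookEntry(
    name="pain points",
    description="Anything that stops the participant from doing the task or makes it harder.",
    fits=["I never have enough time to plan ahead."],
    does_not_fit=["My grandmother taught me how to do this."],
)

# The codebook itself can be a plain dictionary keyed by code name.
codebook = {entry.name: entry}
print(codebook["pain points"].description)
```

Keeping the non-fitting examples next to the fitting ones gives a second coder a concrete boundary to check against, which is the situation the paragraph above describes.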

Definition: Coding refers to the process of labeling segments of text with the appropriate codes.

Once codes are assigned, it’s easy to identify and compare segments of text that are about the same thing. The codes allow us to sort information easily and to analyze data to uncover similarities, differences, and relationships among segments. We can then arrive at an understanding of the essential themes.
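To make the idea of sorting by code concrete, here is a small Python sketch of how coded segments might be grouped so that everything tagged with the same code can be read side by side. It is an illustration under stated assumptions rather than how any particular tool works: the tuple layout, the function name `group_by_code`, the participant IDs, and the quotations are all hypothetical.

```python
from collections import defaultdict

# Each coded segment is (participant, text, codes). All values are invented.
segments = [
    ("P1", "I burned the rice the first time I cooked for guests.", ["memorable experiences"]),
    ("P2", "Recipes that fit my diet are hard to find.", ["pain points"]),
    ("P3", "A weekly plan keeps me within budget.", ["things that help", "pain points"]),
]


def group_by_code(coded_segments):
    """Return a dict mapping each code to the (participant, text) pairs tagged with it."""
    groups = defaultdict(list)
    for participant, text, codes in coded_segments:
        for code in codes:  # a segment may carry more than one code
            groups[code].append((participant, text))
    return dict(groups)


groups = group_by_code(segments)
for participant, text in groups["pain points"]:
    print(participant, text)
```

Storing the participant with each segment keeps the way back to the full transcript open and makes comparisons across participants straightforward.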

A visualization showing coding of qualitative data leads to codes, and an iterative comparison of codes leads to themes.
A thematic analysis starts with coding qualitative data. Through a systematic process of comparing segments of text within and between codes, the researcher arrives at themes.


Codes can be:

  • Descriptive: They describe what the data is about.
  • Interpretive: They are an analytical reading of the data, adding the researcher’s interpretive lens to it.


For example, consider this excerpt from an interview:

“I was petrified about facilitating a meeting and my company offered a day-and-a-half-long course. So, I went in there and the instructor did something that I felt was horrible at the time, but I've since really come to appreciate it. The first thing that we did was we filled out a sheet of paper with our name and wrote down our worst fear of moderating or facilitating and we turned it in and then he said, okay, tomorrow you're going to act out this situation (…) the next day we came back and I would leave the room while the rest of the team read, they read my worst fear, figured out how they'd act it out, and then I'd walk in and facilitate for 10 minutes with that. And that really helped me realize that there isn't anything to be afraid of, that our fears are really in our head most of the time and facing that made me realize I can handle these situations.”

Here are possible descriptive and interpretive codes for the text above:

Descriptive code: how skills are acquired
Rationale behind the code label: Participants were asked to describe how they came to possess certain skills.

Interpretive code: self-reflection
Rationale behind the code label: The participant describes how this experience changed her beliefs about facilitation and how she reflected on her fear.

Steps to Conduct a Thematic Analysis

Regardless of which tool you use (software, journaling, or affinity diagramming), the act of conducting a thematic analysis can be broken down into 6 steps.

A roadmap illustration overview of 6 steps to perform a thematic analysis. Step 1: Gather your data. Step 2: Read all your data from beginning to end. Step 3: Code the text based on what it's about. Step 4: Create new codes that encapsulate potential themes. Step 5: Take a break for a day. Step 6: Evaluate your themes for good fit.
A thematic analysis involves 6 different phases: gathering the data, reading all the data from beginning to end, coding the text based on what it’s about, creating new codes that encapsulate candidate themes, taking a break and coming back to the analysis later, and evaluating your themes for good fit.

Step 1: Gather All Your Data

Start with the raw data, such as interview or focus-group transcripts, field notes, or diary-study entries. I recommend transcribing audio recordings from interviews and using the transcriptions for analysis instead of relying on patchy memory.

Step 2: Read All Your Data from Beginning to End

Familiarize yourself with the data before you begin the analysis, even if you were the one to perform the research. Read all your transcripts, field notes, and other data sources before analyzing them. At this step, you can involve your team in the project. Involving your team instills knowledge of users and empathy for them and their needs.

Run a workshop (or a series of workshops if your team is very large or you have a lot of data). Follow these steps:

  1. Before your team members engage with the data, write your research questions on a whiteboard or piece of flipchart paper in order to make the questions easy to refer to while working.
  2. Give each member a transcript or one field- or diary-study entry. Tell people to highlight anything they think is important.
  3. Once team members have completed reading their entries, they can pass their transcript or entry to someone else and receive a new one from another team member. This step is repeated until all team members have engaged with all the data.
  4. Discuss as a group what you noticed or found surprising.
A workshop where each team member reads each diary- or field-study entry and highlights important bits is a good way of getting team members to actively engage with the text, as opposed to just reading it and letting it wash over them.
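The pass-it-along step of the workshop can be planned ahead of time. Below is a minimal Python sketch of one way to build the rotation, assuming there are exactly as many entries as team members; the function name `rotation_schedule` and the member and entry names are hypothetical.

```python
def rotation_schedule(members, entries):
    """Return one dict per round mapping each member to the entry they read.

    Assumes one entry per member; each round, every entry moves one seat along,
    so after len(members) rounds everyone has read every entry.
    """
    if len(members) != len(entries):
        raise ValueError("this sketch assumes one entry per team member")
    n = len(members)
    return [
        {members[i]: entries[(i + r) % n] for i in range(n)}
        for r in range(n)
    ]


members = ["Ana", "Ben", "Chloe"]  # hypothetical team
entries = ["transcript 1", "transcript 2", "field note 1"]
for round_number, assignment in enumerate(rotation_schedule(members, entries), start=1):
    print(round_number, assignment)
```

With more entries than people, the same idea would apply to bundles of entries rather than single ones.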

While it’s best if your team observes all your research sessions, that may not be possible if you have a lot of sessions or a big team. When individual team members observe only a handful of sessions, they sometimes walk away with an incomplete understanding of the findings. The workshop can solve that problem, since everyone will read all the session transcripts.

Step 3: Code the Text Based on What It’s About

In the coding step, highlighted sections need to be categorized so that they can be easily compared.

At this stage, remind yourself of your research objectives. Print your research questions out. Stick them up on a wall or on a whiteboard in the room where you’re conducting the analysis.

If you have adequate time, you can involve your team in this initial coding step. If time is limited and there is a lot of data to work through, then do this step by yourself and invite your team later to review your codes and help flesh out the themes.

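As you are coding, review each segment of text and ask yourself, “What is this about?” Give the segment a name that describes the data (a descriptive code). You can also add interpretive codes to the text at this stage. However, these typically become easier to assign later.

The code can be created before or after you have grouped the data. The next two sections of this step describe how and when you may add the codes.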

Traditional Method: Create Codes Before Grouping


Once all the text has been coded, you can group all the data that has the same code.

If you are using CAQDAS for this process, then the software automatically logs the codes you assign while coding, so you can use them again. It then provides a way for you to view all text coded with the same code.

A screenshot from Nvivo, a software tool for analyzing qualitative data. The screenshot shows a transcript and how it has been coded.
An example from Nvivo (a CAQDAS tool) is shown above. The coding stripes on the right show which parts of the text have been coded. All codes used throughout all the raw data in this project are displayed in the node panel (Nvivo refers to codes as nodes). Double-clicking on a node will display all the raw data coded with this word.

Quick Method: Group Segments of Text, Then Assign a Code

Rather than coming up with a code when you highlight text, you cut up (physically or digitally) and cluster all the similar highlighted segments (similar to how different stickies may be grouped in an affinity map). The groupings are then given a code. If you are doing the clustering digitally, you might pull coded sections into a new document or a visual collaboration platform.

In the pictures below, the grouping was done manually. Transcripts were cut up, fixed to stickies, and moved around the board until they fell into natural topic groups. The researcher then assigned a pink sticky with a descriptive code to the grouping.

A photograph of a highlighted transcript being cut up into sections.
The highlighted sections were physically cut up with scissors and taped to stickies.
The participant number or the data type (i.e., interview vs. field study) was written on the sticky (but could also be communicated through the color of the sticky). This practice facilitates an easy return to the full data, as well as comparisons across participants and data sources. Stickies allow the segments of text to be easily moved around a board or wall.
A photograph of a researcher naming the groups of stickies by writing a label on a new sticky and placing it by each group.
The highlighted segments were clustered by the text topic and given a descriptive code.

At the end of this step, you should have data grouped by topics and codes for each topic.


After grouping the highlighted clippings from my interviews by topic, I ended up with 3 broad descriptive codes and corresponding groupings:

  • Cooking experiences: memorable positive and negative experiences related to cooking
  • Pain points: anything that stops someone from cooking or makes cooking difficult (including navigating dietary restrictions, limited budgets, etc.)
  • Things that help: what helps (or is believed to possibly help) someone overcome specific challenges or pain points

Step 4: Create New Codes that Encapsulate Potential Themes

Look across all the codes and explore any causal relationships, similarities, differences, or contradictions to see if you can uncover underlying themes. While doing so, some of the codes will be set aside (either archived or deleted) and new interpretive codes will be created. If you’re using a physical-mapping approach like that discussed in step 3, then some of these initial groupings may collapse or expand as you look for themes.


Ask yourself the following questions:

  • What’s going on in each group?
  • How are these codes related?
  • How do these relate to my research questions?

Returning to our cooking topic, when analyzing the text within each grouping and looking for relationships between the data, I noticed that two participants said that they liked ingredients that can be prepared in different ways and go well with other different ingredients. A third participant talked about wishing she could have a set of ingredients that can be used for many different meals throughout the week, rather than having to buy separate ingredients for each meal plan. Thus, a new theme about the flexibility of ingredients emerged. For this theme, I came up with the code one ingredient fits all, and I then wrote a detailed description of it.

A photograph of a researcher creating a new grouping on the wall.
In this research example, a new grouping was formed; the grouping included quotes mentioning a need for ingredients that can be flexibly used — either because they can be prepared in several ways or because they can be used in several different meals throughout a week. The grouping was labeled with the interpretive code one ingredient fits all. The researcher then fleshed out the description of this code.

Step 5: Take a Break for a Day, then Return to the Data

It is almost always a good idea to take a break and come back to look at the data with a fresh pair of eyes. Doing so sometimes helps you see significant patterns in the data more clearly and derive breakthrough insights.

Step 6: Evaluate Your Themes for Good Fit

In this step, it can be useful to have others involved to help you review your codes and emerging themes. Not only are new insights drawn out, but your conclusions can be challenged and critiqued by fresh eyes and brains. This practice reduces the potential for your interpretation to be colored by personal biases.

Put your themes under scrutiny. Ask yourself these questions:

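  • Is the theme well supported by the data? Or can you find data that doesn’t support your theme?
  • Is the theme saturated with lots of instances?
  • Do others agree with the themes you found in the data after analyzing the data separately?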

If the answer to these questions is no, it might mean that you need to return to the analysis board. Assuming you collected sound data, there is almost always something to be learned, so spending more time with your team repeating steps 4–6 will be worthwhile.
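If your groupings are kept digitally, a rough first check on how well a theme is supported is to count how many segments and how many distinct participants stand behind it. The Python sketch below shows one possible way to do that. It is an aid to judgment, not a substitute for it: the function name `theme_support`, the threshold of two participants, the quotations (loosely paraphrased from the cooking example), and the second theme are all assumptions made for illustration.

```python
def theme_support(theme_segments):
    """Summarize how many segments and distinct participants back each theme."""
    summary = {}
    for theme, segs in theme_segments.items():
        participants = {participant for participant, _ in segs}
        summary[theme] = {"instances": len(segs), "participants": len(participants)}
    return summary


# Hypothetical data: participant IDs and quotations are invented.
theme_segments = {
    "one ingredient fits all": [
        ("P1", "I like ingredients I can prepare in different ways."),
        ("P2", "I buy things that go well with lots of other foods."),
        ("P3", "I wish one set of ingredients covered the whole week."),
    ],
    "cooking as relaxation": [  # made-up theme with thin support
        ("P2", "Chopping vegetables calms me down."),
    ],
}

MIN_PARTICIPANTS = 2  # arbitrary threshold chosen for this sketch

support = theme_support(theme_segments)
weak_themes = [t for t, s in support.items() if s["participants"] < MIN_PARTICIPANTS]
print(weak_themes)
```

A theme flagged this way is a prompt to look again at the data, and counts alone say nothing about contradicting evidence, which still has to be checked by reading.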


