Introduction

Previous papers in this series have introduced readers to qualitative research and identified approaches to collecting qualitative data. However, for those new to this approach, one of the most bewildering aspects of qualitative research is, perhaps, how to analyse and present the data once they have been collected. This final paper therefore considers a method of analysing and presenting textual data gathered during qualitative work.

Approaches to analysing qualitative data

There are two fundamental approaches to analysing qualitative data (although each can be handled in a variety of different ways): the deductive approach and the inductive approach.1,2 Deductive approaches involve using a structure or predetermined framework to analyse data. Essentially, the researcher imposes their own structure or theories on the data and then uses these to analyse the interview transcripts.3

This approach is useful in studies where researchers are already aware of probable participant responses. For example, if a study explored patients' reasons for complaining about their dentist, the interview may explore common reasons for patients' complaints, such as trauma following treatment and communication problems. The data analysis would then consist of examining each interview to determine how many patients had complaints of each type and the extent to which complaints of each type co-occur.3 However, while this approach is relatively quick and easy, it is inflexible and can potentially bias the whole analysis process as the coding framework has been decided in advance, which can severely limit theme and theory development.
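As a rough illustration of the deductive tallying described above, the sketch below counts how many interviews mention each predetermined complaint type and how often two types appear in the same interview. The patient identifiers and coded data are hypothetical, invented purely for illustration; only the two complaint types come from the example in the text.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: the predetermined complaint types coded in each interview.
interviews = {
    "patient_01": {"trauma following treatment", "communication problems"},
    "patient_02": {"communication problems"},
    "patient_03": {"trauma following treatment", "communication problems"},
    "patient_04": {"trauma following treatment"},
}


def count_complaints(coded):
    """Count how many interviews mention each complaint type."""
    counts = Counter()
    for codes in coded.values():
        counts.update(codes)
    return counts


def count_co_occurrences(coded):
    """Count how many interviews mention each pair of complaint types together."""
    pairs = Counter()
    for codes in coded.values():
        # Sort so that each pair is always recorded in the same order.
        pairs.update(combinations(sorted(codes), 2))
    return pairs


complaint_counts = count_complaints(interviews)
pair_counts = count_co_occurrences(interviews)
```

A tally of this kind is only as good as the framework imposed on the data: a complaint that does not fit a predetermined type is never counted, which is the inflexibility noted above.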

Conversely, the inductive approach involves analysing data with little or no predetermined theory, structure or framework and uses the actual data itself to derive the structure of analysis. This approach is comprehensive and therefore time-consuming and is most suitable where little or nothing is known about the study phenomenon. Inductive analysis is the most common approach used to analyse qualitative data2 and is, therefore, the focus of this paper.

Whilst a variety of inductive approaches to analysing qualitative data are available, the method of analysis described in this paper is thematic content analysis, which is, perhaps, the most common method of data analysis used in qualitative work.4,5 This method arose out of the approach known as grounded theory,6 although it can be used in a range of other types of qualitative work, including ethnography and phenomenology (see the first paper in this series7 for definitions). Indeed, the process of thematic content analysis is often very similar in all types of qualitative research, in that it involves analysing transcripts, identifying themes within those data and gathering together examples of those themes from the text.

Data collection and data analysis

Interview transcripts, field notes and observations provide a descriptive account of the study, but they do not provide explanations.4 It is the researcher who has to make sense of the data that have been collected by exploring and interpreting them.

Quantitative and qualitative research differ somewhat in their approach to data analysis. In quantitative research, data analysis often only occurs after all or much of the data have been collected. However, in qualitative research, data analysis often begins during, or immediately after, the first data are collected, although this process continues and is modified throughout the study. Initial analysis of the data may also further inform subsequent data collection. For example, interview schedules may be slightly modified in light of emerging findings, where additional clarification may be required.

Computer software for data analysis

The method of analysis described in this paper involves managing the data 'by hand'. However, there are several computer-assisted qualitative data analysis software (CAQDAS) packages available that can be used to manage and help in the analysis of qualitative data. Common programmes include ATLAS.ti and NVivo. It should be noted, however, that such programmes do not 'analyse' the data – that is the task of the researcher – they simply manage the data and make handling of them easier.

For example, computer packages can help to manage, sort and organise large volumes of qualitative data, store, annotate and retrieve text, locate words, phrases and segments of data, prepare diagrams and extract quotes.8 However, whilst computer programmes can facilitate data analysis, making the process easier and, arguably, more flexible, accurate and comprehensive, they do not confirm or deny the scientific value or quality of qualitative research, as they are merely instruments, as good or as bad as the researcher using them.

Stages in the process

Regardless of whether data are analysed by hand or using computer software, the process of thematic content analysis is essentially the same, in that it involves identifying themes and categories that 'emerge from the data'. This involves discovering themes in the interview transcripts and attempting to verify, confirm and qualify them by searching through the data and repeating the process to identify further themes and categories.4

In order to do this, once the interviews have been transcribed verbatim, the researcher reads each transcript and makes notes in the margins of words, theories or short phrases that sum up what is being said in the text. This is usually known as open coding. The aim is to offer a summary statement or word for each element that is discussed in the transcript. The exception to this is when the respondent has clearly gone off track and begun to move away from the topic under discussion. Such deviations (as long as they really are deviations) can simply be left uncoded. Such 'off the topic' material is sometimes known as 'dross'.9

Table 1 is an example of the initial coding framework used in the data generated from an actual interview with a child in a qualitative dental public health study, exploring primary school children's understanding of food.10

Table 1 An example of an initial coding framework

In the second stage, the researcher collects together all of the words and phrases from all of the interviews onto a clean set of pages. These can then be worked through and all duplications crossed out. This will have the effect of reducing the numbers of 'categories' quite considerably.11,12 Using a section of the initial coding framework from the above study,10 such a list of categories might read as follows:

  • Children's perception of food

  • Positive notions of food and their consequences

  • Negative notions of food and their consequences

  • Peer influence

  • Copying

  • Healthy/unhealthy foods

  • Effects of sweets and chocolates

  • Effects of 'junk food'

  • Food choices in school

  • Diet in childhood

  • Food preferences

  • Expected diet as a 'grown up'

  • Food choices and preferences of friendship groups

  • Effects of fizzy drinks

  • Perceptions of adult/child diets

  • The need to be 'healthy' as an adult.
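The 'crossing out' of duplications in this second stage can be sketched in a few lines of code. This is a minimal illustration rather than a prescribed procedure: the function name `merge_codes` and the sample margin notes are assumptions, with the notes loosely based on the categories listed above.

```python
def merge_codes(codes_per_interview):
    """Collect codes from all interviews, dropping exact duplicates
    while keeping the order in which codes were first seen."""
    seen = set()
    merged = []
    for codes in codes_per_interview:
        for code in codes:
            # Treat differences in case or surrounding spaces as duplicates.
            key = code.strip().lower()
            if key not in seen:
                seen.add(key)
                merged.append(code.strip())
    return merged


# Hypothetical margin notes from three interviews.
margin_notes = [
    ["Peer influence", "Copying", "Food preferences"],
    ["copying", "Effects of fizzy drinks", "Peer influence "],
    ["Food preferences", "Diet in childhood"],
]

categories = merge_codes(margin_notes)
```

Note that a script of this kind removes only literal duplicates. Deciding that two differently worded codes overlap remains a matter of the researcher's judgement.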

Once this second, shorter list of categories has been compiled, the researcher goes a stage further and looks for overlapping or similar categories. Informed by the analytical and theoretical ideas developed during the research, these categories are further refined and reduced in number by grouping them together.4 A list of several categories (perhaps up to a maximum of twelve) can then be compiled. If we consider the above example, we might eventually come up with the reduced list shown in Table 2.

Table 2 An example of a final coding framework after reduction of the categories in the initial coding framework

This reduced list forms the final category system that can be used to divide up all of the interviews.12 The next stage is to allocate each of the categories its own coloured marking pen; each transcript is then worked through and data that fit under a particular category are marked with the corresponding colour. Finally, all of the sections of data under each of the categories (and thus assigned a particular colour) are cut out and pasted onto A4 sheets. Subject dividers can then be labelled with each category label and the corresponding coloured snippets, on each of the pages, are filed in a lever arch file. What the researcher has achieved is an organised dataset, filed in one folder. It is from this folder that the report of the findings can be written.
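For readers who prefer to keep their coded extracts electronically, the cut-and-paste filing described above might be mirrored as follows. This is a sketch under stated assumptions: the excerpts are taken from the quotations presented later in this paper, but the categories assigned to them here, and the function name `file_by_category`, are illustrative only.

```python
from collections import defaultdict

# Each tuple holds (respondent, category, excerpt).
# The category assignments are illustrative, not the study's actual coding.
coded_segments = [
    ("Girl, school 1, age 7", "Peer influence",
     "They say 'copy me and what I have.'"),
    ("Girl, school 1, age 7", "Peer influence",
     "Because they are my friends."),
    ("Girl, school 3, age 11", "Healthy/unhealthy foods",
     "My mother says drink juice because it's healthy"),
]


def file_by_category(segments):
    """Group excerpts under their category, like snippets behind a subject divider."""
    folder = defaultdict(list)
    for respondent, category, excerpt in segments:
        folder[category].append((respondent, excerpt))
    return dict(folder)


folder = file_by_category(coded_segments)

# Print the contents of the 'folder', one category at a time.
for category, snippets in folder.items():
    print(category)
    for respondent, excerpt in snippets:
        print(f"  {excerpt} ({respondent})")
```

Keeping the respondent label alongside each excerpt means that quotations can still be attributed when the findings are written up.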

As discussed earlier, computer programmes can be used to manage this process and may be particularly useful in qualitative studies with larger datasets. However, researchers wishing to use such software should first undertake appropriate training and should be aware that many programmes do not abide by normal MS Windows conventions (eg, most interview transcripts have to be converted from MS Word into rich text format before they can be imported into the programme for analysis).

Verification

The analysis of qualitative data does, of course, involve interpreting the study findings. However, this process is arguably more subjective than the process normally associated with quantitative data analysis, since a common belief amongst social scientists is that a definitive, objective view of social reality does not exist. For example, some quantitative researchers claim that qualitative accounts cannot be held straightforwardly to represent the social world; thus, different researchers may interpret the same data somewhat differently.4 This leads to the issue of the verifiability of qualitative data analysis.

There is, therefore, a debate as to whether qualitative researchers should have their analyses verified or validated by a third party.13,14 It has been argued that this process can make the analysis more rigorous and reduce the element of bias. There are two key ways of having data analyses validated by others: respondent validation (or member check), which involves returning to the study participants and asking them to validate analyses, and peer review (or peer debrief, also referred to as inter-rater reliability), whereby another qualitative researcher analyses the data independently.13,14,15

Participant validation involves returning to respondents and asking them to carefully read through their interview transcripts and/or data analysis in order to validate, or refute, the researcher's interpretation of the data. Whilst this can arguably help to refine theme and theory development, the process is hugely time-consuming and, if it does not occur relatively soon after data collection and analysis, participants may also have changed their perceptions and views because of temporal effects and potential changes in their situation, health, and perhaps even as a result of participation in the study.15

Some respondents may also want to modify their opinions on re-presentation of the data if they now feel that, on reflection, their original comments are not 'socially desirable'. There is also the problem of how to present such information to people who are likely to be non-academics. Furthermore, it is possible that some participants will not recognise some of the emerging theories, as each of them will probably have contributed only a portion of the data.16

The process of peer review involves at least one other suitably experienced researcher independently reviewing and exploring interview transcripts, data analysis and emerging themes. It has been argued that this process may help to guard against the potential for lone researcher bias and help to provide additional insights into theme and theory development.14,16,17 However, many researchers also feel that the value of this approach is questionable, since it is possible that each researcher may interpret the data, or parts of it, differently.8 Also, if both perspectives are grounded in and supported by the data, is one interpretation necessarily stronger or more valid than the other?
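Where two researchers have coded the same extracts independently, one modest way of supporting the peer review described above is simply to list the extracts they categorised differently, so that these can be discussed. The sketch below assumes each coder's work is stored as a mapping from an extract identifier to a category; the identifiers, the category assignments and the function name `find_disagreements` are all hypothetical. It does not judge which interpretation is stronger.

```python
def find_disagreements(coder_a, coder_b):
    """Return extracts that both coders categorised, but under different categories."""
    shared = coder_a.keys() & coder_b.keys()
    return {
        extract: (coder_a[extract], coder_b[extract])
        for extract in sorted(shared)
        if coder_a[extract] != coder_b[extract]
    }


# Hypothetical independent codings of the same three extracts.
first_coder = {
    "extract_01": "Peer influence",
    "extract_02": "Food preferences",
    "extract_03": "Effects of fizzy drinks",
}
second_coder = {
    "extract_01": "Peer influence",
    "extract_02": "Food choices in school",
    "extract_03": "Effects of fizzy drinks",
}

to_discuss = find_disagreements(first_coder, second_coder)
```

The output is a prompt for discussion rather than a measure of validity; as noted above, two differing interpretations may both be grounded in the data.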

Unfortunately, despite perpetual debate, there is no definitive answer to the issue of validity in qualitative analysis. However, to ensure that the analysis process is systematic and rigorous, the whole corpus of collected data must be thoroughly analysed. Therefore, where appropriate, this should also include the search for and identification of relevant 'deviant or contrary cases' – ie, findings that are different or contrary to the main findings, or are simply unique to some or even just one respondent. Qualitative researchers should also utilise a process of 'constant comparison' when analysing data. This essentially involves reading and re-reading data to search for and identify emerging themes in the constant search for understanding and the meaning of the data.18,19 Where appropriate, researchers should also provide a detailed explication in published reports of how data were collected and analysed, as this helps the reader to critically assess the value of the study.

It should also be noted that qualitative data cannot be usefully quantified given the nature, composition and size of the sample group, and ultimately the epistemological aim of the methodology.

Writing and presenting qualitative research

There are two main approaches to writing up the findings of qualitative research.20 The first is to simply report key findings under each main theme or category, using appropriate verbatim quotes to illustrate those findings. This is then accompanied by a linking, separate discussion chapter in which the findings are discussed in relation to existing research (as in quantitative studies). The second is to do the same but to incorporate the discussion into the findings chapter. Below are brief examples of the two approaches, using actual data from a qualitative dental public health study that explored primary school children's understanding of food.10

Example a (the traditional approach):

FINDINGS

Contrasts and contradictions

The interviews demonstrated that children are able to operate contrasts and contradictions about food effortlessly. These contradictions are both sophisticated and complex, incorporating positive and negative notions relating to food and its health and social consequences, which they are able to fluently adopt when talking about food:

'My mother says drink juice because it's healthy and she says if you don't drink it you won't get healthy and you won't have any sweets and you'll end up having to go to hospital if you don't eat anything like vegetables because you'll get weak'. (Girl, school 3, age 11 years).

If this approach was used, the findings chapter would subsequently be followed by a separate supporting discussion and conclusion section in which the findings would be critically discussed and compared to the appropriate existing research. As in quantitative research, these supporting chapters would also be used to develop theories or hypothesise about the data and, if appropriate, to make realistic conclusions and recommendations for practice and further research.

Example b (combined findings and discussion chapter):

Copying friends

In this study, as with others (eg Ludvigsen & Sharma21 and Watt & Sheiham22), peer influence is a strong factor, with children copying each other's food choices at school meal times:

Girl: 'They say “copy me and what I have.”'

Interviewer: 'And do you copy them if they say that?'

Girl: 'Yes.'

Interviewer: 'Why do you copy them if they say that?'

Girl: 'Because they are my friends.'

(Girl, school 1, age 7).

Children also identified friendship groups according to the school meal type they have. Children have been known to have school dinners, or packed lunches if their friends also have the same.21

If this approach was used, the combined findings and discussion section would simply be followed by a concluding chapter. Further guidance on writing up qualitative reports can be found in the literature.20

Conclusion

This paper has described a pragmatic process of thematic content analysis as a method of analysing qualitative data generated by interviews or focus groups. Other approaches to analysis are available and are discussed in the literature.23,24,25 The approach described here offers a way of generating categories under which similar themes can be collated. The paper has also briefly illustrated two different ways of presenting qualitative reports once the data have been analysed.

This analysis process, when done properly, is systematic and rigorous and therefore labour-intensive and time-consuming.4 Consequently, for those undertaking this process for the first time, we recommend seeking advice from experienced qualitative researchers.