Open-text responses offer a number of benefits: they provide context and meaning when interpreting the results of closed questions, and they can point to the causes behind the statistics. They allow respondents to elaborate on their answers and may surface themes not covered by the closed questions. One of the challenges of open-text responses is how to uncover the key insights.
The most common way of incorporating text data is text analysis, which identifies patterns and themes in the responses. These themes can be used to create text buckets, allowing responses to be categorised and analysed as numeric data.
Cleaning text data
Just like numeric data, text data needs to be cleaned. The researcher needs to review all the text responses and remove the invalid data, that is, any responses that do not relate to or add value to the research. A common example of invalid data is respondents who answer “no”, “none” or “.” to an open-ended or general text question. The other common example is a response that is not on topic, which is more common than one might think. Recent examples have included comments on the Flag Referendum in a customer satisfaction survey, general thoughts about the Government of the day, and opinions about cyclists in a survey on proposed changes to dog control by-laws. This is often a manual process, as some relevant data may be hidden within what at first glance looks invalid, but it can have its rewards in terms of “huh” moments.
There are software options which can do the data cleaning based on programmed or learned algorithms; however, these have limitations. Once the data has been cleaned, one is ready to analyse and code it.
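The first cleaning pass described above can be sketched in a few lines. This is a minimal illustration, assuming responses arrive as a plain list of strings; the set of non-answer markers is illustrative, not exhaustive, and genuinely off-topic responses would still need a human reviewer.

```python
# Minimal sketch of a first-pass clean: drop blanks and "no"/"none"/"." style
# non-answers. The INVALID set below is an illustrative assumption.
INVALID = {"no", "none", "n/a", "na", "-", "."}

def clean_responses(responses):
    """Drop blank and non-answer responses; keep the rest for manual review."""
    cleaned = []
    for text in responses:
        stripped = text.strip()
        if not stripped:
            continue  # blank response
        if stripped.lower().rstrip(".") in INVALID or stripped in INVALID:
            continue  # "no", "none", "." style non-answers
        cleaned.append(stripped)
    return cleaned

sample = ["Staff were helpful", "no", ".", "  ", "Too slow to reply", "None"]
print(clean_responses(sample))  # → ['Staff were helpful', 'Too slow to reply']
```

Anything that survives this automated pass would still go to the manual review, which is where the off-topic responses (and the occasional “huh” moment) are caught.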
Manual coding involves an individual reading each comment and assigning it to the appropriate text bucket or buckets. While this method is considered best practice, it does have its disadvantages: it is labour intensive and requires a reasonable level of skill.
The way text responses are coded can also be open to researcher bias, as the researcher interprets what the respondent said and what they were trying to communicate. Let's be honest: humans are imperfect machines.
Best practice includes ongoing review of the text buckets. Every response should fit into at least one bucket, even if it is a generic “not applicable” bucket. As you code the responses, notice the patterns and themes emerging and let these guide the code frame. Review the code frame on a regular basis: you may be able to merge some buckets, while others may have to be split as the data emerges.
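To make the bucket idea concrete, here is a toy keyword-based code frame. The bucket names and trigger words are invented for illustration; note how a response can land in more than one bucket, and how a generic "other" bucket guarantees every response is coded somewhere.

```python
# Hypothetical code frame: each bucket is a set of trigger keywords.
CODE_FRAME = {
    "staff": {"staff", "service", "helpful", "rude"},
    "speed": {"slow", "wait", "quick", "delay"},
}

def code_response(text, frame=CODE_FRAME):
    """Assign a response to every bucket whose keywords it mentions;
    fall back to a generic 'other' bucket so nothing is left uncoded."""
    words = set(text.lower().split())
    buckets = [name for name, keys in frame.items() if words & keys]
    return buckets or ["other"]

print(code_response("staff were helpful but slow"))  # → ['staff', 'speed']
print(code_response("great location"))               # → ['other']
```

A real manual coder does far more than match keywords, of course; the sketch only shows the multi-bucket and catch-all conventions described above.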
There are a number of programs which will do text coding for you. Again, this automation is based on programmed or learned algorithms, which have their own limitations and disadvantages. In Q Research Software one sets up the code frame, manually codes a portion of the data (15–20%), and then the software codes the rest based on both its programmed algorithms and what it has learned from the manual coding. The advantage of automatic coding is that it is quick and cost-effective.
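The learn-from-a-manually-coded-sample idea can be illustrated with a deliberately simple word-frequency scorer. This is not Q Research Software's algorithm; the training pairs and bucket names are invented, and a real tool would use far more sophisticated models.

```python
from collections import Counter, defaultdict

# Toy illustration of learning from a manually coded sample. This is NOT the
# algorithm any real product uses; data and buckets are invented.
def train(coded):
    """coded: list of (response, bucket) pairs from the manual coding pass."""
    word_counts = defaultdict(Counter)
    for text, bucket in coded:
        word_counts[bucket].update(text.lower().split())
    return word_counts

def auto_code(text, word_counts):
    """Score each bucket by how often its training words appear in the text."""
    words = text.lower().split()
    scores = {b: sum(c[w] for w in words) for b, c in word_counts.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"

model = train([
    ("friendly helpful staff", "staff"),
    ("the staff were rude", "staff"),
    ("too slow to respond", "speed"),
    ("long wait and slow service", "speed"),
])
print(auto_code("the wait was far too long", model))  # → 'speed'
```

Even this toy version shows the trade-off: coding the remaining 80–85% is nearly free, but the quality of the automatic pass depends entirely on how well the manually coded sample represents the data.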
I will be covering automatic coding in a future blog.
Once you have completed the analysis, the question is how to present the data in a meaningful way.
Text analysis transforms raw text data into numeric data, making it much easier to incorporate into a report. Instead of hundreds of individual pieces of data, the responses have been categorised into themes. From this point, all the display options available for numeric data are at the researcher's disposal; it is up to the researcher to pick the option which best summarises the data and meets the needs of the client. Your report can include quotes from the text data to highlight important themes or messages. Quotes can also be used to increase the general appeal and reliability of the report, as well as liven it up.
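The text-to-numeric step amounts to counting theme labels. A minimal sketch, assuming each coded response carries one or more bucket labels (the labels below are invented):

```python
from collections import Counter

# Each response may carry several theme labels after coding.
coded = [["staff"], ["staff", "speed"], ["speed"], ["other"], ["speed"]]

counts = Counter(label for labels in coded for label in labels)
total = len(coded)
for theme, n in counts.most_common():
    print(f"{theme}: {n} ({n / total:.0%} of responses)")
```

Because a response can fall into several buckets, the percentages are per theme and can sum to more than 100%; from here the counts feed straight into whatever chart or table format best suits the client.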