Automation and the role of the analyst

November 7, 2018

Back in 2011, Marc Andreessen published an essay on the economy's shift from one based on hardware to one where products and services are managed through software and delivered online. This is where the phrase “Software is eating the world” comes from. Automation via software has reshaped, and continues to reshape, whole industries, including the research sector.

Cartoon. Caption at the top "Software is eating up the world". Image below of Pac-Man about to eat Earth

One of the biggest ways that software has reshaped the research industry is the automation of processes that researchers carry out. Automation is defined as “the use of systems with minimal human intervention, reducing or eliminating unnecessary human labour activities and thus allowing people to focus on high-intelligence processes”. Many of us will remember the days of scribbling equations on paper as we analysed data. Later this was aided by scientific calculators. Today, using software, analysis can be done with the click of a button.

Split image. On the left a whiteboard with a maths equation on it. On the right a piece of paper with an equation on it and a calculator

The bad old days of manual analysis.

In a webinar on improving quantitative researcher productivity, it was argued that automation occurs on a continuum, with blood, sweat, and tears at one end and full “turn-key” automation at the other. Between the two ends sits a range of software options with different levels of automation: SPSS can require scripting, R requires programming, and Q Research Software offers both point-and-click automation and coding with R and JavaScript. Amy Eborall notes that the advantage of automation is that it frees the researcher from time-consuming and tedious tasks, allowing them to focus on generating research insights. Tasks that are now automated include cleaning, updating, analysing and charting data, which I explore below.

Data cleaning

Cleaning data is very important to the validity and reliability of your findings. Long gone are the days of manually scanning responses for suspicious patterns, one-word answers or, even worse, gibberish. You often found yourself buried in a mountain of survey forms that could take days to check. Most survey platforms now have a built-in tool that checks responses and identifies suspicious answers. Software can even check the consistency of answers within a response, which further increases the quality of the data.

Cartoon. Mountain of papers, a hand reaching out of the top of the mountain holding a sign saying "help"

Cleaning the data

Amy Eborall notes a potential limitation of automated data cleaning is the lack of human oversight. This can result in issues with the quality and accuracy of the data due to the algorithms used. It is good practice to review the responses flagged during data cleaning.
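To make the idea of flagging, rather than silently deleting, suspicious responses concrete, here is a minimal sketch in Python. It is not how any particular survey platform works. The field names (`comment`, `ratings`, `age`, `years_employed`) and the thresholds are hypothetical, chosen only to illustrate the kinds of checks described above: one-word answers, gibberish, repeated patterns and inconsistent answers within a response.

```python
import re


def flag_response(response, min_words=2):
    """Return a list of reasons why a survey response looks suspicious.

    `response` is a hypothetical dict with an open-ended "comment",
    a list of "ratings", and two related answers, "age" and
    "years_employed". All names and thresholds are illustrative.
    """
    reasons = []
    comment = response.get("comment", "").strip()

    # Empty or one-word open-ended answers.
    if len(comment.split()) < min_words:
        reasons.append("short answer")

    # Crude gibberish check: no vowels anywhere in the answer, or the
    # same letter repeated four or more times in a row.
    letters = re.sub(r"[^a-z]", "", comment.lower())
    if letters and (
        not re.search(r"[aeiou]", letters)
        or re.search(r"(.)\1{3,}", letters)
    ):
        reasons.append("possible gibberish")

    # Pattern check: every rating question given the same answer.
    ratings = response.get("ratings", [])
    if len(ratings) >= 4 and len(set(ratings)) == 1:
        reasons.append("straight-lined ratings")

    # Consistency check between two answers in the same response.
    age = response.get("age")
    years = response.get("years_employed")
    if age is not None and years is not None and years > age:
        reasons.append("inconsistent answers")

    return reasons


responses = [
    {"id": 1, "comment": "The staff were friendly and helpful",
     "ratings": [4, 5, 3, 4], "age": 40, "years_employed": 15},
    {"id": 2, "comment": "good",
     "ratings": [3, 3, 3, 3], "age": 25, "years_employed": 30},
    {"id": 3, "comment": "sdfkj zzzzzz",
     "ratings": [1, 2, 1, 5], "age": 33, "years_employed": 5},
]

# Nothing is removed: each response keeps its list of flags for a human to review.
flagged = {r["id"]: flag_response(r) for r in responses}
for response_id, reasons in flagged.items():
    print(response_id, reasons or "ok")
```

The function only reports reasons and leaves the decision to keep or drop a response to the analyst, which is the human oversight step recommended above.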

Data Analysis

Data analysis today can be done by clicking a button or selecting an option from a drop-down menu. In some packages, you can analyse statistics in 32 different ways using drop-downs. The options range from basic n values through to mean, median, mode, percentiles, t and z statistics, standard error and confidence levels, all from a single drop-down menu. The software automatically calculates the result and presents it in a table. You don’t have to remember the appropriate equation or work out the answer manually, which can introduce errors, as humans are imperfect machines. An additional benefit is the saving of time and, in turn, money. Analysis that could once take weeks can now be done in days, allowing for faster reporting. The downside of automated analysis is that the details behind the data, and how the data relates to the question, aren’t provided.
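For readers curious about what sits behind such a drop-down, the sketch below computes a few of the statistics listed above using only Python's standard library. It is an illustration, not the calculation any specific package performs. The function name `summarise`, the sample `scores`, and the choice of a normal (z-based) confidence interval are all assumptions made for the example.

```python
import math
import statistics


def summarise(values, confidence=0.95):
    """Compute basic descriptive statistics for a list of numbers.

    The confidence interval uses the normal approximation (a z value),
    which is an assumption; for small samples a t-based interval is
    usually more appropriate.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)        # sample standard deviation
    se = sd / math.sqrt(n)               # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    p25, p50, p75 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "p25": p25,
        "p75": p75,
        "standard_error": se,
        "confidence_interval": (mean - z * se, mean + z * se),
    }


# Hypothetical ratings on a 1 to 5 scale.
scores = [3, 4, 4, 5, 2, 4, 3, 5, 4, 4]
summary = summarise(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```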

Two screen captures, one above the other. Top image shows the equation to work out the Standardized Residual (z value). The bottom image shows a drop down menu box with option to show significance

The equation used to calculate significance vs the automated drop-down menu
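The equation in the screenshot is not reproduced in the text, so the sketch below uses one common form of the standardized residual for a table of counts, the adjusted standardized residual. Treat this as an assumption, not as the formula any particular package applies. The function name and the example counts are hypothetical.

```python
import math


def standardized_residuals(table):
    """Adjusted standardized residual (z value) for each cell of a
    contingency table given as a list of rows of observed counts.

    z = (observed - expected) / sqrt(expected * (1 - row share) * (1 - column share))
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    residuals = []
    for i, row in enumerate(table):
        z_row = []
        for j, observed in enumerate(row):
            # Count expected if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            variance = (
                expected
                * (1 - row_totals[i] / total)
                * (1 - col_totals[j] / total)
            )
            z_row.append((observed - expected) / math.sqrt(variance))
        residuals.append(z_row)
    return residuals


# Hypothetical counts: rows are two groups, columns are "yes" and "no".
counts = [[30, 20],
          [20, 30]]
z = standardized_residuals(counts)
print(z)  # [[2.0, -2.0], [-2.0, 2.0]]
```

A cell whose z value is beyond roughly ±1.96 differs from what independence would predict at the 5% significance level. A "show significance" option is plausibly automating a comparison of this kind, although the exact test used varies between packages.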


Automatic Charting

Automation of charting is not new. Some of you will remember the stressful hours of creating charts in Excel and pasting them into Word or PowerPoint. Often the formatting would change along the way, leaving you frantically re-formatting charts as deadlines loomed. The advent of chart templates made the process easier, as you could apply the same formatting across several charts.

A big change software has made is the range of ways in which data can be visualised. We are no longer limited to simple bar, line, or pie charts. We now have scatter and ranking plots, bar charts with trend lines, heat maps, and customisable pictographs. These newer charts allow for deeper analysis. You could produce them before, but it required writing code, which was time-consuming and prone to error. Today many charts are accessible via a drop-down menu and are fully automated. You can use R if you want further customisation, e.g. of the icons used in pictographs.

Updating Data

One of the huge advantages of software is the ability to update and incorporate new data as it comes in. This process used to require entering the new data, redoing the analysis and modifying the charts. With software, what was a laborious task is now relatively pain-free. Many software packages not only refresh the data but also update the analysis and charts automatically.
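One way to picture this is to treat the analysis and the chart as a single function of the data, so that re-running it on the updated data regenerates everything. The sketch below is a toy illustration of that idea, not a description of how any package does it. The function `build_report` and the sample answers are hypothetical, and the "chart" is just a text bar chart.

```python
from collections import Counter


def build_report(responses):
    """Recompute counts, percentages and a text bar chart from scratch."""
    counts = Counter(responses)
    total = sum(counts.values())
    lines = []
    for answer, count in counts.most_common():
        pct = 100 * count / total
        lines.append(f"{answer:<8} {'#' * count} {pct:.0f}%")
    return lines


# First wave of (hypothetical) survey answers.
responses = ["Yes", "No", "Yes", "Yes"]
first_report = build_report(responses)

# New data arrives: re-running the same function updates both the
# percentages and the chart, with no manual rework.
responses.extend(["No", "Unsure"])
updated_report = build_report(responses)

print("\n".join(first_report))
print()
print("\n".join(updated_report))
```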

The Modern Data Analyst

So, with all this automation where does this leave data analysts?

There is still a role for researchers and data analysts. A lot of knowledge and skill goes into designing a survey, including what questions to ask and how to ask them. While data collection, cleaning, analysis, and reporting can be carried out by software, an analyst needs to decide on the most appropriate method of analysis and instruct the software to carry it out. The same applies to charting and reporting: a software package can’t decide whether a pie chart or a bar graph is the best way to present the data. According to Chris Wallbridge, the future of the research industry will be less about data collection and increasingly focused on data narration.

Adriana Rocha points out that the analyst is still much better at looking at the big picture than any software is and advises analysts to use their intuition and creativity to tell the story behind the data.

Venn diagram. Top circle is titled Analyse, the one on the left, Narrate and the one on the right is Design. In the overlapping segments are interpretation, clarity and storytelling. In the center is data literacy

The roles of a researcher or data analyst

At PublicVoice, we use several software packages to help with the grunt work so that we humans can spend more time delivering insights. Freeing up our time means we can discover important, useful and interesting patterns. By working smarter, rather than harder, we are able to save time and reduce costs for our clients. With this in mind, we welcome automation and actively seek opportunities to merge human and machine capabilities as we strive for excellence.

Happy surveying!