Experience Blog

The Ethnography of Experience

Data + Context = Information: The Fallacy of Quantitative Concreteness and the COVID-19 Moment

“Data is real; model is hypothesis.” Dr. Anthony Fauci, COVID-19 Task Force Press Briefing

As an instructor of ethnography and a teacher of qualitative data analysis, I am invariably asked the same question about subjectivity and bias in qualitative research and data. After all, even books on qualitative methods talk about the ‘interpretation of the qualitative data,’ where the researcher inserts herself or himself into the process by trying to divine what the ‘data means.’ Qualitative data is assumed to be interpretive, while quantitative data is treated as objective, or “real,” as Dr. Fauci’s quote indicates.

However, even in the most basic course that introduces t-tests and multiple regression, we are taught that there is a final moment where we have to “interpret” the results. It is funny that while qualitative data is criticized for involving interpretation, the word “interpret” is actually part of the quantitative process. I do know that what is meant here is interpreting how the results compare to our model. However, the interpretation of quantitative results actually needs to go deeper, into what the quantitative data actually represents.

Sociologist Harold Garfinkel

A fundamental point to remember is that quantitative or numeric data is just a reflection of the practices that led to its creation. For instance, crime data is not a representation of the total amount of crime being committed in a particular location. Rather, it is a reflection of the police practices and criminal justice recording processes which result in measurable instances. As one police officer told me, if we made an arrest for every crime committed, we would overrun the system. Thus, crime data represents an undercount of the total amount of law-breaking activity, and for good reason. And, as sociologist Harold Garfinkel explicated decades ago, sometimes there are good reasons for bad records. At the same time, sometimes the records are a reflection of poor practices. We don’t know which it is unless we examine those practices.

The point that metrics reflect record-keeping or counting practices should not come as news. Even though Dewey was predicted to win the 1948 presidential election based on polling data, he did not defeat Truman. The polls were wrong because the data collection practices did not create an accurate picture. Thus, polling practices were changed to better capture voter intention.

Interestingly, while polls did improve, they have become increasingly unreliable. So, rather than just questioning polling practices, we might cast our attention to the practices of holding elections. A simple point is that vote totals are a reflection of who was allowed to participate and which votes were actually counted. Those who control the voting processes may make it harder for certain groups to vote. Poorer people tend to have a more difficult time getting to the polls, and their precincts often have older machines and fewer polling places, resulting in longer lines. There can also be intentional acts of subverting the vote through forms of suppression or even vote manipulation. Thus, you can’t talk about what vote totals represent without talking about the larger social and structural context in which voting happens. In other words, to understand voting data, you need to look at the totality of voting practices.

For another example, we can turn to the state of Florida (you can always turn to Florida for an example of something). The current COVID-19 moment is highlighting how the Florida unemployment system was designed to make it almost impossible to register for benefits, with the goal of producing very low ‘rates’ of unemployment. As an article in the New Yorker pronounces, “Florida GOP Realizes Deliberately Impoverishing the Unemployed Has Downsides.” A Politico article puts it more succinctly: “‘It’s a sh—sandwich.’” A key quote in this article lays out the game:

“It’s a sh—sandwich, and it was designed that way by (former Governor Rick) Scott,” said one (Gov. Ron) DeSantis advisor. “It wasn’t about saving money. It was about making it harder for people to get benefits or keep benefits so that the unemployment numbers were low to give the governor something to brag about.”

Therefore, Florida’s unemployment rate is not a reflection of the number of unemployed in the state; it is a reflection of a system designed to hide unemployment.

The same applies to the COVID-19 rates being discussed in the news on a daily basis. The number of people represented as being infected with COVID-19 is simply a reflection of testing practices, and of the quality and accuracy of those tests. It is not a reflection of how many people are infected, but of how many were accurately tested. Again, pretty simple, and this point is repeatedly stated in news coverage.
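To make this concrete, here is a minimal sketch with entirely hypothetical numbers. Holding the true infection rate fixed, the confirmed case count still moves with how many tests are run and how accurate those tests are; the sensitivity and specificity values below are illustrative assumptions, not properties of any real test.

```python
def confirmed_cases(tests_run, prevalence, sensitivity=0.90, specificity=0.98):
    """Expected number of positive results among those tested.

    The count mixes true positives (infected people the test catches)
    with false positives (uninfected people the test mislabels).
    """
    infected_tested = tests_run * prevalence
    uninfected_tested = tests_run * (1 - prevalence)
    true_positives = infected_tested * sensitivity
    false_positives = uninfected_tested * (1 - specificity)
    return true_positives + false_positives

# Same hypothetical prevalence (5%), different testing volumes:
low_testing = confirmed_cases(tests_run=1_000, prevalence=0.05)
high_testing = confirmed_cases(tests_run=10_000, prevalence=0.05)
print(low_testing, high_testing)  # the "case count" scales with tests run
```

The prevalence never changes between the two calls, yet the reported count differs tenfold, which is exactly why a case count stripped of its testing context is not actionable information.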

At the same time, we have the quote from Dr. Fauci of data being real. The data may be real, but it is not real in the sense that most people believe. We have to figure out what that data represents by looking at the practices and context of testing. For instance, do COVID-19 numbers represent an intentional undercount intended to paint a better picture? Do they represent the fact that most people who are infected are asymptomatic and don’t require testing? Do they represent the fact that the tests we have are inaccurate, producing false positives or negatives? Do they represent the enormity of a moment for which no society could have been ready with enough tests to manage the situation? Do they represent a failure of pandemic planning and a failure to heed warnings? Or is it all of the above to some extent? Without knowing the answers to these questions, we don’t know what the data means. COVID-19 data devoid of the context of how that data was created does not provide us with actionable information.

When asked what my company name “ethno-analytics” means, my answer is simply “Data + Context = Information.” ‘Ethno’ refers to context, and ‘analytics’ refers to data. Put together, ethno-analytics means the integration of data and context in order to create actionable information. In terms of how to put more context into your data, here are some points to consider:

  1. The word “data” includes qualitative and quantitative varieties. For your company, make sure you have and make use of both types.

  2. Make sure you understand the processes that led to the data. Don’t take any data at face value. Look more deeply into where it comes from, and how that impacts what you see in front of you.

  3. Layer different contexts onto the same data. See what happens when you look at the same data from different standpoints and perspectives. Explore how this process can lead to different findings and conclusions.

  4. Don’t fall in love with your preferred data type. Those of us in social science often will describe ourselves as “qualitative” or “quantitative.” While we might have tendencies and preferences, it is important to have a more ‘open relationship’ with your data options.

  5. Be fluent in many types of data. Saying you are a ‘qualitative’ or ‘quantitative’ person also says you speak only one language. If you can’t learn another data language, work with or employ those who can.

  6. Think in terms of integrative mixed methods. Social science is increasingly moving toward mixed-methods approaches for research and grants. Explore how you can do the same in your organization to capture the data that is necessary to take the action that you need.
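Point 3 in the list above, layering different contexts onto the same data, can be sketched with a small and entirely hypothetical example: the same raw counts support opposite conclusions depending on the denominator you bring to them.

```python
# Hypothetical figures for two made-up cities.
reported_crimes = {"City A": 5_000, "City B": 2_000}
population = {"City A": 1_000_000, "City B": 200_000}

# Context 1: raw counts -- City A looks like the bigger problem.
worst_by_count = max(reported_crimes, key=reported_crimes.get)

# Context 2: rate per 1,000 residents -- City B looks like the bigger problem.
rate = {c: reported_crimes[c] / population[c] * 1_000 for c in reported_crimes}
worst_by_rate = max(rate, key=rate.get)

print(worst_by_count, worst_by_rate)  # City A by count, City B by rate
```

Neither answer is the “real” one; each reflects a choice of context, and a third layer (say, recording practices in each police department) could shift the picture again.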

Or, you can just reach out to me to talk about how to apply the ethno-analytics framework to your organization.