Making Sense of Big Data

First-of-its-kind tool offers new lens to study culture and human behavior

September 26, 2018   |   By Karen Rivedal

It happens with every credit-card swipe and GPS ping. It’s part of writing an email, sending a text or posting on Instagram. It can be triggered by playing an online game, ordering dinner on a smartphone or surfing for best buys on a tablet.

Even showing up on security footage while shopping will do it.

Active or passive, all these activities leave a digital record that, moment by moment, is driving the largest collection of information ever accumulated. According to former Google CEO Eric Schmidt, the amount of information generated online every two days equals the total recorded in all of human history prior to 2003, with the growth rate doubling every two years.

That information represents a bonanza of potential new insights for those who study human behavior.

But even as technological advances speed this data explosion, making sense of what’s being gathered grows more difficult, notes University of Wisconsin-Madison scholar David Williamson Shaffer, the Vilas Distinguished Achievement Professor of Learning Sciences.

So Shaffer is offering researchers a new way to help find meaning in the digital clamor.

He and his team have developed a first-of-its-kind research method for the study of culture and human behavior that blends the two main types of data analysis used by researchers—quantitative and qualitative—to reveal uniquely rich, real-time results with the validity of a statistically sound random sample. He calls the new method quantitative ethnography (QE).

“Our tools let researchers look in more depth at their data,” Shaffer says. “It’s like we’ve invented a new microscope—a new lens that will let you see things that you couldn’t see before.”

Why it matters

Settling for an incomplete view risks getting the meaning of big data wrong, a mistake that grows more perilous as the data is used increasingly in automated, impersonal ways that can negatively affect lives.

“Computer algorithms screen job applicants, determine whether prisoners will be granted parole, evaluate the work of teachers and influence a host of other decisions that used to be made by human beings,” says Shaffer, who also is a data philosopher in the UW–Madison School of Education’s Wisconsin Center for Education Research.

Big data brings this mix of promise and peril to the field of education as surely as it does to any area transformed by the revolution in internet and mobile technology.

“Massive amounts of rich data get collected all the time in education,” says Andrew Ruis, associate director for research in Shaffer’s Epistemic Analytics Lab at the Wisconsin Center for Education Research. “Online educational interventions and the people who are in those communities generate huge amounts of information about teachers and students.”

“We have more information than ever about what students are doing and how they are thinking,” Shaffer agrees. “But the sheer volume of data available can overwhelm traditional research methods. Existing methods are not well designed to look at big data by themselves.”

Best for complex, collaborative thinking

Shaffer’s 2017 book “Quantitative Ethnography” is a formal introduction to this new way of analyzing data. The approach has been in development by Shaffer and his UW associates for the past decade with key collaborators at Arizona State University, Michigan State University, the University of Memphis, Aalborg University in Denmark and the University of Edinburgh in Scotland.

But the tools of QE have traveled well beyond that extended circle. The Epistemic Analytics Lab, which interacts with users around the world, supports more than 250 researchers, with more than 80 ongoing collaborations between the lab at WCER and researchers at more than 50 institutions in 16 countries.

Developed through research grants from the National Science Foundation, QE is especially well-suited to the analysis of complex, collaborative thinking, and Shaffer sees big potential for QE to be adopted more widely, especially as teamwork and collaboration become more important in education and the workforce.

“The vast majority of work that people do is in teams of one kind or another,” Shaffer said. “But assessing that work is really hard, and the tools of quantitative ethnography make that possible.”

Already, QE has been used to analyze data on topics as varied as surgical education, a global after-school program to support STEM learning, communication patterns among successful students in large online classes in Europe and Australia, and simulation-based training of U.S. Navy teams responsible for detecting incoming missiles.

Statistics find connections in culture

Quantitative Ethnography, published in 2017, took a decade to develop.

Perhaps a headscratcher for many researchers, the term “Quantitative Ethnography” reflects the approach’s blend of data methods: ethnography is the qualitative study of culture, now combined with statistical tools to analyze culture in big data.

“Quantitative Ethnography is a set of research methods that weave the study of culture together with statistical tools to understand human behavior,” Shaffer says, “a way to go beyond looking for arbitrary patterns in mountains of data and begin telling textured stories at scale.”

The goal of QE is to help researchers develop a richer understanding than broad, survey-based quantitative methods alone can offer, by blending them with qualitative tools for analyzing "thick data": what people say or do as captured in interviews, field notes, focus groups or video, for example.

“It’s a different way of thinking about your work,” says Aroutis Foster, a professor of learning technologies at Drexel University in Philadelphia, Pennsylvania, and an early adopter of QE. “To me it’s a method that allows you to think of (qualitative data) in a whole different way – how can we segment and quantify it? It’s a methodology that’s pretty flexible.”

Crucially, the method involves applying statistical techniques to determine whether conclusions drawn from a small sample of thick data apply to a much larger set of data. It can answer whether the findings of an in-depth analysis represent a meaningful pattern, within a defined margin of error, rather than one of the many random patterns that exist in any set of big data.

“The tools of quantitative ethnography give you a sense as to whether what you’re seeing by looking very closely at some small group is representative of something larger,” Shaffer said. “Quantitative ethnography takes ethnographic, qualitative data and marries that to statistical techniques.”
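As a rough illustration of the generalization step Shaffer describes (a simplified sketch, not the lab's actual tooling), a researcher who hand-codes a small sample of excerpts can attach a margin of error to the rate at which a pattern appears, to gauge whether the finding plausibly holds in the full dataset:

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion.

    Given that `successes` of `n` hand-coded excerpts show a pattern,
    estimate the range in which the true rate in the larger dataset
    likely falls.
    """
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical example: a researcher hand-codes 200 excerpts
# and finds the pattern of interest in 120 of them.
low, high = proportion_ci(120, 200)
print(f"Estimated rate: 0.60, 95% CI: [{low:.3f}, {high:.3f}]")
```

If the interval is narrow and well away from chance levels, the close reading of the small sample is more likely to reflect a meaningful pattern rather than one of the many random patterns in any large dataset.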

QE answers not just what, but how and why

As a result, QE can answer more complex questions – revealing, for example, not just whether a certain educational program does a good job teaching civics to students, but also how and why it does or doesn’t.

“Quantitative ethnography preserves the depth of qualitative analysis and the breadth of quantitative analysis,” Ruis says. “It gives you different kinds of evidence for believing something is true.”

QE does that through a data-modeling technique known as Epistemic Network Analysis, a kind of interactive grid tracing connections between what people said and did in the order it all occurred, looking at how people make sense of the world by connecting ideas and actions.

“ENA looks at windows of time to see what’s happening—and how the things that are going on are related to one another,” Shaffer says. “As we look at many windows over time, we can see which things occur together, which things are most strongly related.”

“Connections are the most important piece,” says Brendan Eagan, a graduate student in Shaffer’s lab. “The ideas and actions don’t stand apart from each other. The meaning is in how they are connected.”
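The window-based connection counting that Shaffer and Eagan describe can be sketched in a few lines of Python. This is only an illustrative toy (the codes and transcript below are hypothetical, and the lab's actual ENA software does considerably more, including building and visualizing the resulting networks), but it shows the core idea: slide a window over a coded transcript and count which ideas occur together.

```python
from collections import Counter
from itertools import combinations

def cooccurrences(coded_lines, window=3):
    """Count how often pairs of codes appear within a moving window.

    `coded_lines` is a chronological list of sets of codes, one set
    per utterance. Each pair of codes appearing together within any
    window of `window` consecutive utterances is counted once per
    window.
    """
    counts = Counter()
    for i in range(len(coded_lines) - window + 1):
        codes_in_window = set().union(*coded_lines[i:i + window])
        for pair in combinations(sorted(codes_in_window), 2):
            counts[pair] += 1
    return counts

# Hypothetical coded transcript: each utterance tagged with the
# concepts it touches on.
transcript = [
    {"data"}, {"data", "ethics"}, {"teamwork"},
    {"ethics"}, {"data", "teamwork"},
]
print(cooccurrences(transcript, window=3))
```

The pairs with the highest counts are the most strongly connected ideas, which is the structure ENA then represents as a network.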

Uses beyond field of education

Shifting from an earlier focus on designing educational computer games for young people, Shaffer's Epistemic Analytics Lab (formerly the Epistemic Games Group) works to refine and share this potentially powerful new way to spot meaningful patterns in large data collections.

“The term ‘epistemic’ is important,” Shaffer says, “because epistemology is the study of how we know what we know—that is, how we make meaning of things. That’s what ENA is designed to do.”

“It shows relationships between heterogeneous constructs,” says early adopter Eric Hamilton of the approach’s bridge-building capacities between different types of data in an analysis. “It gives a method for drawing connections and making them visible.”

Hamilton is an associate dean in the Graduate School of Education and Psychology at Pepperdine University in Los Angeles, California, and a researcher at the United Nations Educational, Scientific, and Cultural Organization (UNESCO). He says QE’s ability to analyze data from collaborative learning was a key part of winning a recent federal grant application.

“I was dreaming of a project where we could get kids from different countries to do projects together,” Hamilton said. “They would be collaborating across national and cultural boundaries, which is a very complicated proposal. The darn thing got funded.” And thanks, in part, to QE, he noted, “It hit the right notes.”

Foster, who also helps lead Drexel’s Office of Research, says he’s taking steps to have Shaffer’s approach added to the university’s curriculum on research methods.

“I read the (QE) book before it was published, and since it was published, I’ve given it to my graduate students, and they want to use it in their dissertations,” Foster explains.

Shaffer wants to share the approach with other researchers, including the tools his lab has developed to implement the ideas of QE, which are available free online.

His team at the Epistemic Analytics Lab also helps QE adopters through the basic set-up, and with adapting the methods for fields beyond education.

“At the core, anybody who has interview data, who has field notes from their research, who’s looking at Twitter or Facebook or blogs at scale, anybody who is analyzing the mountains of big data that are recorded every day,” Shaffer says, “can use our tools to get insight into their data and support the claims they are making.”