New Sources, Innovative Applications of Big Data Explored at Conference

UCLA Luskin Center for Innovation researchers demonstrate how digital technology is transforming humans into sensors that generate behavioral data on an unprecedented scale

By Stan Paul

If your 2016 Thanksgiving dinner was shorter than usual, the turkey on your dining table may not have been to blame.

Who you had dinner with and their political affiliations following last year’s divisive election may have shortened the holiday get-together by about 25 minutes — or up to an hour depending on how many campaign/political messages saturated your market area. It’s all in the data.

“It’s not that conservatives and [liberals] don’t like eating Thanksgiving dinner with each other; they don’t like eating Thanksgiving dinner together after an incredibly polarizing period,” said Keith Chen, associate professor of economics at the UCLA Anderson School of Management. Chen was among a group of scholars and data researchers who presented recent findings on Aug. 25, 2017, at a daylong conference about computational social science and digital technology hosted by the UCLA Luskin Center for Innovation.

Information gleaned from social media and from cellphone tracking data can reveal and confirm political polarization and shed light on other topics, such as poverty or protest, said researchers who gathered for “The Future of Humans as Sensors,” a conference held at the UCLA Luskin School of Public Affairs.

The event brought together social scientists and data researchers to look for “ways to either extend what we can do with existing data sets or explore new sources of ‘big data,’” said Zachary Steinert-Threlkeld, assistant professor of public policy at UCLA Luskin and the leader of the program.

Steinert-Threlkeld presented his latest research, which was motivated by the Women’s March in the United States, as an example of measuring protest with new data sources that include geo-located Twitter accounts. While conducting research, Steinert-Threlkeld has observed that working with social media data has actually become more difficult of late.

“While Facebook lets you use data from profiles that are public, most people have private profiles,” Steinert-Threlkeld said. Seeing private data requires researchers to work directly with Facebook, which has become more cautious in the wake of a controversial 2014 paper, thus impacting what scholars can publish. In addition, Instagram previously provided much more data, but since 2016 it has followed the Facebook model and that data has been severely restricted despite Instagram’s norm of having public profiles, he said.

“This workshop will discuss how ‘humans as sensors’ can continue to yield productive research agendas,” Steinert-Threlkeld told conference attendees.

Talking about new and innovative ways to do this, Michael Macy, a sociologist and director of the Social Dynamics Laboratory at Cornell, began his presentation by pointing out the innate difficulties of observing human behavior and social interaction, as well as both the potential and the limitations of social media data.

“There are privacy concerns; the interactions are fleeting. You have to be right there at the time when it happens.” He added, “They’re usually behind closed doors, and the number of interactions increases exponentially with the size of the population.”

But, Macy said, new technologies in various scientific fields have opened up research opportunities that were previously inaccessible.

“We can see things that we could never see before. In fact, not only can we see things, the web sees everything and it forgets nothing.”

He tempered the potential of digital data with the fact that for the past several decades the main instrument of social science observation has been the survey, which comes with its own limitations, including unreliability when people recount their own behavior or rely on memories of past events. But, he said, “In some ways I see these social media data as being really nicely complementary with the survey. They have offsetting strengths and weaknesses.”

Macy provided examples of how political polarization can be tracked, not by looking at extreme positions on a single issue but by inferring an individual’s position on one issue from the position that person holds on another. The signals can range from choices of books on politics and science to preferences for cars, fast food and music.

“The method seems to recover something real about political alignments … political alignment can be inferred from those purchases, and then we can look to see what else they’re purchasing,” Macy said.

“What I think we’re really looking at is not the era of explanation, at least for now … it’s the era of measurement, and what we are now able to do is to test theories that we could not test before because we can see things that we could not see before.”

The day’s presentations also included the ways in which data can be used to provide rapid policy evaluation with targeted crowds and how demographic sampling weights from Twitter data could be used to improve public opinion estimates. Data could also help fight poverty worldwide.

The world seems awash in information and data, but “most of [the] world doesn’t live in a data-rich environment,” said presenter Joshua Blumenstock, an assistant professor at UC Berkeley’s School of Information and director of the school’s Data-Intensive Development Lab.

“You can use Twitter data to measure unemployment in Spain. The problem is that these methods don’t port very well in developing countries,” Blumenstock said. “There’s these big black holes in Africa for Twitter.”

Blumenstock discussed how data from billions of mobile phone calls in countries such as Rwanda could be used in conjunction with survey data to create a composite of where individuals fall on the socioeconomic spectrum. In turn, the information collected could be “aggregated up” to a much larger regional or national level.

“And when you aggregate up, you start to get things that might be conceivably useful to someone doing research or some policymaker,” such as being able to respond instantaneously to economic shocks, Blumenstock said. In addition, instead of costing millions of dollars and taking years, he said this methodology could potentially cost thousands of dollars and be conducted in weeks or months.

“For researchers like me who are interested in understanding the causes and consequences of poverty … just measuring the poverty is the first step. For people designing policy for these countries, their hands are tied if they don’t even know where poverty is,” Blumenstock explained. “It’s hard to think about how to fix it.”