Ever wonder how much happiness a word contains? Data can be used to study aspects of our lives that people may not have previously thought possible, even the type of emotion that our words convey. Such applications are the focus of the conference sponsored by the University’s new Data Science Initiative.
Chris Danforth and Peter Dodds, professors at the University of Vermont, discussed the research and goals of their work at that school’s Computational Story Lab in two of the University’s data science colloquia Feb. 2. The first lecture, given by Danforth, centered on the team’s flagship program, which they call “Hedonometrics.” The team gathered the 10,000 most frequently used words from scans of Google’s books project, tweets, lyrics and the New York Times, Danforth said. Then, using Amazon’s Mechanical Turk Project, people rated the words on a “happy-to-sad” scale.
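As a rough illustration of how crowd ratings like these could be turned into one score per word, here is a minimal sketch. The numeric scale, the sample words and the function name `average_word_ratings` are assumptions made for this example and are not drawn from the lab’s actual pipeline.

```python
from collections import defaultdict
from statistics import mean


def average_word_ratings(ratings):
    """Collapse (word, score) pairs from many raters into one mean score per word."""
    by_word = defaultdict(list)
    for word, score in ratings:
        by_word[word.lower()].append(score)
    return {word: mean(scores) for word, scores in by_word.items()}


# Hypothetical ratings on an assumed scale where higher means happier.
sample_ratings = [
    ("laughter", 9),
    ("laughter", 8),
    ("funeral", 2),
    ("funeral", 1),
    ("the", 5),
]
word_scores = average_word_ratings(sample_ratings)
```

Averaging over many raters smooths out individual quirks, which is presumably why a crowdsourcing platform suits this kind of task.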
The inspiration for the project came from Danforth’s desire to develop a more holistic approach to measuring people’s well-being. “Scientists tend to focus on measuring things that are easier. It’s hard to measure how well people are doing,” Danforth said. “Because it’s hard, (research on well-being) sort of diminished over time, and the focus tends to be put on these economic indicators.”
“Originally, our goal was to develop instruments that could quantify how well people were doing on a population scale so that investments could be made by the government in the right things,” Danforth said. There is not enough investment in research on mental health conditions, and data like this will allow people to develop software that can help in many ways, such as in diagnosis and medical treatment, he added.
Dodds, the other main researcher for the project, discussed the Computational Story Lab’s efforts to study the data in people’s stories by algorithmically measuring different aspects of these narratives with tools like the Hedonometer, which measures people’s happiness. He discussed the potential for lexical analysis to develop even more scales, such as the “Lexicocalorimeter,” which measures the number of calories people burn or consume in an area by tracking the use of words that have been assigned caloric values, like “eating” or “butter.”
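One simple way to picture a word-based instrument of this kind is to score a passage as the average value of the scored words it contains. The sketch below assumes that approach; the tiny lexicon `HYPOTHETICAL_SCORES` and the function `text_happiness` are invented for illustration, and the lab’s real tools may work differently.

```python
import re
from collections import Counter

# Hypothetical word scores (higher means happier); not the lab's real data.
HYPOTHETICAL_SCORES = {"happy": 8.0, "love": 8.5, "sad": 2.0, "butter": 6.0}


def text_happiness(text, scores):
    """Return the frequency-weighted mean score of the scored words in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in scores)
    total = sum(counts.values())
    if total == 0:
        return None  # no scored words, so no estimate
    return sum(scores[w] * n for w, n in counts.items()) / total


example_score = text_happiness("Happy happy sad day", HYPOTHETICAL_SCORES)
```

Swapping the happiness values for assigned caloric values would give a comparable sketch of a calorie-style scale built on the same word counts.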
The lab wants to take this data and apply it to everyday life, Dodds said. For example, they are able to geographically map happiness by examining the locations from which words are sent. They can also track fluctuations in happiness over time on social media, for example by looking at a spike in negative words used during a tragedy.
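Both the map and the timeline come down to grouping scored messages by a key, such as a location or a date, and averaging within each group. The following is a minimal sketch under that assumption; the records, dates and the function name `mean_score_by_group` are hypothetical.

```python
from collections import defaultdict


def mean_score_by_group(records):
    """Average scores per group key, e.g. per day or per region."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for key, score in records:
        totals[key][0] += score
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical (date, message score) pairs; a sharp drop could mark a tragedy.
daily_records = [
    ("2017-02-01", 6.0),
    ("2017-02-01", 7.0),
    ("2017-02-02", 3.0),
]
daily_means = mean_score_by_group(daily_records)
```

Using a state or city name as the key instead of a date would produce the values needed to color a map.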
Dodds also noted the ability of their methods to analyze literature. Using the Hedonometer, the researchers were able to track the mood of the text as the narrative progresses in various novels and movie scripts.
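Tracking mood across a narrative can be pictured as sliding a fixed-size window over the text and scoring each window in turn. The sketch below assumes that method; the window size, step, word scores and the function name `windowed_mood` are illustrative choices, not details reported from the talk.

```python
def windowed_mood(words, scores, window=4, step=2):
    """Return the mean score of scored words in each sliding window of the text."""
    series = []
    last_start = max(len(words) - window + 1, 1)
    for start in range(0, last_start, step):
        chunk = [scores[w] for w in words[start:start + window] if w in scores]
        # None marks a window that contains no scored words.
        series.append(sum(chunk) / len(chunk) if chunk else None)
    return series


# Hypothetical scores and a toy "story" that moves from joy to grief.
toy_scores = {"joy": 8.0, "grief": 2.0}
toy_story = ["joy"] * 4 + ["grief"] * 4
mood_arc = windowed_mood(toy_story, toy_scores)
```

Plotting the resulting series against position in the text gives the kind of emotional arc the researchers described for novels and scripts.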
Danforth said that developing apps is a main priority for the future of the lab, adding that the researchers will pursue funding to build these tools that will be helpful in daily life. “Part of our goal is to show what is possible using this type of data and make the code available so that individuals can try and build tools that will be useful for public health applications,” he said. “We want to try and make technology that helps people be aware of what their phone knows about them.”
Brown’s Data Science Initiative, which offers a one-year master’s program, strives to create opportunities for students to engage with data science and learn about its applications in projects like those underway at the Computational Story Lab. “We’re trying to ramp up the different kinds of activities that we’re sponsoring on campus,” said Professor Jeffrey Brock, the director of the Data Science Initiative. “We’re using the colloquia as a kind of seminar series to build community across these departments … and engage local and nearby researchers in similar areas to come speak on campus.”
The four departments central to the Data Science Initiative — mathematics, applied mathematics, computer science and biostatistics — are trying to reach out to involve even more departments, Brock said. “We’ve been working maybe more in isolation than is appropriate, so we saw an opportunity … to build across these different units,” he said. Brock sees potential for Brown in the theoretical aspects of data science as well as the applied ones and hopes to emphasize applications in public health, neuroscience and physics.
Brock encourages interested students to take relevant courses, which include ECON 1660: “Big Data,” data fluency courses and APMA 0110: “What’s the big deal with Data Science?” The department is also planning to host a Hackathon and a Music Hack Day.
Brock emphasized the variety of companies moving to Rhode Island that may create opportunities for Brown students interested in data science. “It’s a very broad spectrum of career opportunities for students now if you’ve got just a little bit of technical know-how and an interest in a particular domain area,” Brock said. “It’s hard to find a field where (data analysis) doesn’t have some sort of impact.”