Big Data: Changing the Way We Do Science

Data science relies on software programs to manage and analyze large data sets. Big data has become a buzzword and is helping both industry and academia to become data-driven. Researchers started using data science because of the need to analyze complex data sets. For instance, Karthik Ram, a trained ecologist, learned programming and information management to analyze ecosystem data. He now works at the Berkeley Institute for Data Science. His colleagues include neuroscientists, social scientists, and biologists working as data scientists!

Ram’s case highlights the fact that many different types of researchers now work in data science. It also shows that many fields of research are now generating a huge amount of data. In order to analyze this data, researchers may need to acquire additional programming and statistical skills. Many organizations now offer short training courses to help researchers acquire these skills.

Getting Data Science Savvy

Training in data science can help scientists manage their data sets. It can also help them to interpret their data in new ways by using computational tools. Data science has applications in many fields. This means that data scientists are in demand in industry as well as academia. According to IBM, there could be more than 2.7 million jobs in data science and analytics by 2020. According to the European Data Science Academy, more than three million ads have been placed for data scientists since 2015.

Data science can be used to drive positive change. One example of this is a computational model developed for healthcare research. Created by scientists from the University of Chicago, it is called the Research Opportunity Index. This tool measures the difference between the resources spent on a disease and its relative burden on society. It provides an unbiased, data-driven assessment of where investments need to be made to address unmet medical needs. This index estimates the societal burden of 1,400 medical conditions. Data analysis can also reveal information in unexpected ways. For example, it is now possible to determine the heart rates of people from YouTube videos. Data embedded in digital photos can pinpoint a photographer’s movement or home location. Oceanographic data can alter land risk profiles that affect property values.
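To make the idea behind the Research Opportunity Index a little more concrete, here is a minimal Python sketch of a "spending versus burden" comparison. It is not the method published by the University of Chicago team. It simply assumes that each condition's share of total funding can be compared with its share of total burden. The function name, the condition names, and all of the numbers are hypothetical and were invented purely for illustration.

```python
def opportunity_gap(funding, burden):
    """Return burden share minus funding share for each condition.

    A positive value means the condition's share of the burden is larger
    than its share of the funding (under this simplified assumption).
    """
    total_funding = sum(funding.values())
    total_burden = sum(burden.values())
    return {
        name: burden[name] / total_burden - funding[name] / total_funding
        for name in funding
    }


# Hypothetical inputs, not real data: funding and burden in arbitrary units.
funding = {"condition_a": 60.0, "condition_b": 30.0, "condition_c": 10.0}
burden = {"condition_a": 20.0, "condition_b": 30.0, "condition_c": 50.0}

gaps = opportunity_gap(funding, burden)

# List conditions from most under-resourced to most over-resourced.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {gap:+.2f}")
```

In this made-up example, condition_c carries half of the burden but receives a tenth of the funding, so it appears at the top of the list. A real index would need carefully chosen measures of both spending and burden.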

Using Big Data Responsibly

It is important and useful to make data-driven decisions. However, big data often involves personal information. This could include social media activity, medical records, or patterns of movement. Often the data are collected in the background by smart devices.

It is for this reason that big data must be used ethically. Ten rules for responsible big data research have been published recently. Researchers in both academia and industry can use these guidelines.

In many instances, ethics committees may not be equipped to handle big data questions. The best people to evaluate the ethics of big data research may be the researchers themselves. It is therefore important to encourage open discussions within the community about potential risks. Annette Markham (a digital social scientist and ethicist) said ethics is “…about choices we make at critical junctures; choices that will invariably have impact.” Big data projects are often interdisciplinary, which opens up the possibility of robust debate.

Data science has evolved as a way to make sense of big data. These analyses allow researchers and companies to make data-driven decisions. Getting trained in data science can help researchers analyze and manage their data sets. Data science can also be used to better allocate research resources. In all of this, it is important that data science be used ethically.

What do you think about the usefulness of developing data skills in scientists? Share your thoughts in the comments section!