What makes someone a data scientist? This question keeps coming up, so we’ve finally decided to write a blog post to make it clear.
It is difficult to come up with a single definition, because data scientists perform a wide variety of activities and have a wide variety of skills.
Typically, a data scientist is most effective working alongside an engineering and product management team to help develop products and to provide valuable insights to the business. Some data scientists will even refer to themselves as “data detectives.”
Here are some of the values/services/skills that a data scientist can provide:
Product Analytics: To understand how users interact with the product through analysis of the logs
Data Engineering / Data Pipeline: To automate data collection, aggregation, and visualization to produce metrics that are actionable
Experimentation (A/B Testing): To establish causality that is otherwise too messy or not possible through observational study
Data Modeling: To build predictive models
However, it is important to note that each of these is a distinct skill and a distinct job. A single data scientist cannot fill all of these positions.
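To make the experimentation item above a little more concrete, here is a minimal sketch of how an A/B test result might be checked, assuming a simple two-proportion z-test on conversion counts. The function name and the example numbers are hypothetical and only for illustration; real experiments usually need more care (sample size planning, multiple metrics, and so on).

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B (hypothetical helper).

    Returns the z statistic and a two-sided p-value, using the normal
    approximation with a pooled standard error.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value for a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up example: 100 of 1000 users convert on A, 150 of 1000 on B.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the two variants is unlikely to be chance alone, which is the kind of causal evidence that observational log analysis cannot easily provide.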
Here is a great podcast where a data scientist explains in her own words what a data scientist is:
Big Data has become very popular, and with that popularity has come hype. The hype gives the impression that Big Data is a magical tool that can be dropped into any company and, voila, deliver instant solutions to all problems. Society has started to place unrealistic expectations on Big Data and the results it can produce.
It is human nature to project our interests onto others. Like the football dad who never made it to the big leagues, companies project their desires onto their data. This is why studies and surveys are supposed to be objective: as soon as someone biases the data, the results are compromised. Big Data takes a long time to analyze properly and to find relationships that can be translated into valuable information. Big Data becomes biased when companies jump at the first correlation because it is in line with their desired result.
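The danger of jumping at the first correlation can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not a description of any real data set: it generates purely random "metrics" and a purely random "target", then reports the strongest correlation found by chance. All names and parameters are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def strongest_chance_correlation(n_metrics=200, n_samples=20, seed=0):
    """Largest |correlation| between a random target and random metrics."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_metrics):
        # Each metric is pure noise, unrelated to the target by construction.
        metric = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(metric, target)))
    return best

print(f"strongest correlation found in pure noise: "
      f"{strongest_chance_correlation():.2f}")
```

With enough metrics and few enough samples, some noise column will look strongly related to the target. A company looking for confirmation of a desired result can find one this way without any real relationship existing.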
It is extremely important to ensure your data is clean, or a lot of money will be wasted. Companies buy into the Big Data hype, purchase expensive big data technologies, and hope that these will magically confirm all of their dreams and hand them a golden staircase. This is not how it works.
This is why we have data scientists: they are the geniuses who can look through massive amounts of data and find correlations without biasing them. If a company is going to spend the money on implementing big data, it must also hire data scientists if it hopes to profit from its investment.
The takeaway is that companies must allow the data to generate new ideas rather than trying to steer the data with their own thinking.
If you want to do Big Data you need a Data Scientist.
Data scientists look for meaning and knowledge in the data. Advanced algorithms help data scientists visualize and look for patterns in the data.
“The data scientist is someone who has fantastic communication and empathy ability, as well a lot of mathematical skill, and then the engineering skill they need to do the math that they want to,” says Hilary Mason, Founder at Fast Forward Labs.
Data science is considered one of the sexiest fields to pursue, and data scientists are now helping companies of all shapes and sizes make sense of these massive data sets to better inform business outcomes.