According to this article, which references LinkedIn, data scientists saw a 56% increase in job openings in the US in 2019. IBM also predicts that demand for data scientists will increase by 28% by 2020. So what’s the fuss all about? Never has it been so easy and cheap to record and store so much data about ourselves. Companies ultimately want to predict what customers are going to do and buy next, and how to target them more effectively. Research institutions want to gather all the data they might possibly need. Governments wish to gather as much data about their citizens as possible. Naturally, therefore, those who can manipulate this data are in demand.
What does a data scientist do?
This seems hugely variable depending on the organization, but it can range from working closely with software engineers to collect the right data from customers, to pulling and cleaning that data from its source, building and using statistical models to fit data and predict outcomes, or visualizing the data in graphs and tables. The common theme is a command of basic programming in languages such as Python and R, plus some elementary statistics. Many of the job openings also mention experience with the database query language SQL and with data visualization software such as Tableau.
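To make the "clean the data, fit a model, predict an outcome" part of the job a bit more concrete, here is a minimal sketch in plain Python. The numbers, the column meanings (ad spend and sales) and the function names are all made up for illustration; in practice this would more likely be done with a library rather than by hand.

```python
# Hypothetical raw records as (ad_spend, sales) pairs; None marks a missing value.
raw_records = [
    (1.0, 2.1),
    (2.0, 3.9),
    (None, 5.0),
    (3.0, 6.2),
    (4.0, None),
    (5.0, 9.8),
]


def clean(records):
    """Cleaning step: drop any row that has a missing value."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def fit_line(points):
    """Modeling step: ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Prediction step: evaluate the fitted line at a new x."""
    intercept, slope = model
    return intercept + slope * x


cleaned = clean(raw_records)
model = fit_line(cleaned)
print("rows kept:", len(cleaned))
print("predicted sales at ad_spend=6.0:", predict(model, 6.0))
```

Dropping incomplete rows is only one way to handle missing values, and a straight line is about the simplest model there is, but the shape of the workflow is the same with fancier tools.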
What about all the people already trained to do some of these things like economists, physicists and operations research people?
I don’t think that these roles will become obsolete, but data science is certainly the subject of the moment, along with machine learning. Having trained as an economist, I covered a lot of the relevant topics such as statistics and basic programming (though mostly with simpler packages such as Stata), but I have gaps to fill before I’d be eligible to apply for data science roles. If you’ve had some technical training that covers areas such as modeling and statistics, there is something to build on if you want to switch over. I’ve also come across quite a few people who trained as physicists and have made the transition to data scientist or applied scientist roles.
Where is data science going in the future?
As this Forbes article says, there will surely be a more refined classification of sub-roles within data science, as there are currently so many different definitions.
Whilst the labels of “machine learning” and “data scientist” may fade, the ability to gather data, ask the right questions about what we can know and infer, and model rigorously will always be useful.