The evolution of data science over the past decade has been nothing short of thrilling. The buzzwords are ever-changing, the job titles are getting more whimsical, and the approaches are evolving more rapidly than ever before, accelerated by continued advancements in computing.
This progress has been echoed by the growth of big data and enterprise analytics/AI companies such as Snowflake, DataRobot, and Databricks. This raises the question: what is happening, and what is changing? It turns out, a lot.
We are at a pivotal moment for AI and data science. More universities are offering graduate (and even undergraduate) programs in these disciplines to complement an education in statistics and mathematics, and more professionals are switching careers into the field, which welcomes individuals from diverse backgrounds with open arms. Universities that already offer these courses of study are pivoting from treating them as professional or continuing education programs to formally integrating them into their physical sciences divisions.
So, as the field evolves, the skillset required to succeed in it is evolving as well. Instead of creating a data science department with a small army of data scientists, companies are looking for “Full Stack Data Scientists”: jacks-of-all-trades who are able to do more than just run your data through an algorithm. According to Google Trends, the growth in interest is most apparent in the U.S. and India.
What is a “Full Stack Data Scientist”?
Simply put, a “Full Stack Data Scientist” can translate a business problem into an actionable work plan buttressed by data, and can also identify additional data to include (whether internal or external), explore the data thoroughly enough to understand it and tease out insights, apply machine/deep learning to the data, test the code written, and deploy and monitor the models on cloud-based infrastructure. Of course, data engineering, statistical, and algorithm development principles are baked into this process by default.
The industry now demands that data scientists do more than simply work with algorithms and explore data: they need to understand how their model can be “productionalized” (i.e., made ready to run, per business as usual, for a client) and which cloud/distributed computing components are necessary for their model’s success. This flows naturally into the topic of MLOps, or Machine Learning Operations, which borrows concepts from traditional DevOps, a set of practices that combines IT operations and software development/engineering.
The ever-changing skillset required to “succeed in data” these days is aptly reflected in LinkedIn’s 2020 Emerging Jobs Report, where 11 of the 15 hottest jobs (over 73%) have some sort of data/engineering/coding component. A brief look at the job descriptions makes it easy to see the types of technologies and skills necessary to “succeed in data” in 2020 and beyond. Many of these relate to doing machine learning at scale.
Given the above context and the need to streamline this process, it would be no surprise if 2021 sees a handful more IPOs for enterprise AI, big data cloud analytics, and/or MLOps-focused companies. I, for one, am more than excited for the ride!
About the Author
Michael N. Colella is the Chief Data Scientist at ADC, where he is responsible for providing the data science perspective on everything we do and infusing our solutions with Artificial Intelligence. Prior to joining the ADC team, Michael was a Director of Data Science at dunnhumby, a global leader in customer-centric retail analytics. At dunnhumby, Michael was responsible for leading analysis for some of the biggest names in global retail, focusing on innovative initiatives. Before his time at dunnhumby, Michael was a Senior Manager of Advanced Analytics at Kraft Heinz, where he led the global advanced analytics innovation hub under the CIO. Michael has worked/consulted for various other retailers and Consumer Packaged Goods companies, including Bayer Consumer Care and Wilton Brands, to name a few. Michael has 10 years of experience working on innovative analytic projects spanning Supply Chain Planning & Optimization, FMCG, Big Data, predictive and prescriptive analytics, as well as behavioral and cognitive neuroscience.