Data has played a significant role in business for a long time, but in recent decades, the amount of data available to companies has skyrocketed. The internet and its connectivity have left behind a data trail that savvy analysts can use in several ways. A bookstore in the pre-internet era could leverage data about what books people were buying. An online bookstore today could collect data about who is buying what books, what readers are saying about those books on social media, what time of day or year they’re most likely to buy a new book, and reams upon reams of other data.
This massive quantity of information is called Big Data, and it has created an entire field of employment at the intersection of data analysis and computer science. The field is data science, yet many people who work in data science don’t call themselves data scientists. A 2021 report from Anaconda, a data science and machine learning firm, found that only 11 percent of data science workers described “data scientist” as their primary role. Another 11 percent identified as business analysts, and 7 percent identified as data engineers.
This diverse range of job titles is reflected in job postings as well. According to data from Lightcast, a job market analysis firm, data analyst was the most commonly posted job title in data science for the year between May 2021 and April 2022. Data scientist and data engineer were also common titles. Other less common jobs had domain-specific titles, such as IT data analyst or clinical data manager.
In the future of data science, graduates will be able to enter many different roles, depending on their backgrounds. The online Master of Science in Data Science (MSDS) program at the Tufts University School of Engineering prepares working professionals and students with STEM experience to move into any role in data science, whatever their interests or career objectives.
Why Are There So Many Job Titles in Data Science?
One reason for the range of job titles in data science may be confusion among employers and hiring managers who are not data scientists and don’t know much about the discipline. The field is also experiencing skill segmentation as it continues to grow. Smaller companies may still hire a single data scientist who handles data at every stage of its lifecycle, but companies that can afford larger data science teams will split the duties into various specializations. They might have a team of data engineers who create sustainable data collection pipelines and data warehouses, analysts and statisticians who draw insights from data, and machine learning engineers who develop and deploy computer programs that comb over massive quantities of data and improve their analysis techniques as they go.
The segmentation also reflects the value of subject-matter expertise for data scientists. It’s not enough to have a pure data science background; to get the most out of data, data scientists must also understand the subjects they analyze. Thus we see calls for analysts with, for example, IT or healthcare backgrounds. Future data scientists are more likely to be working professionals with data science skills on top of relevant industry experience.
Data Scientist vs. Data Analyst vs. Data Engineer
Data scientist, data analyst, and data engineer are three common occupations that work closely with data. They aren’t the only data science jobs, but we can get a picture of the field’s future by looking at them and their associated responsibilities.
What Do They Have in Common?
Data scientists, analysts, and engineers all share the same high-level objective: to deliver actionable insights to their organization using Big Data. They also share some of the same fundamental data science skills. All three professions must understand the foundations of data science. That includes statistical analysis, databases, algorithms, and computer programming skills. These skills are part of the data science curriculum at Tufts University by design.
Data science professionals frequently use Python, a computer programming language. Anaconda’s State of Data Science report found that only 4 percent of data science professionals claimed never to use Python. Data from Lightcast shows that Python proficiency frequently appears on job postings for data scientists, data analysts, and data engineers. Applicants to the Tufts online MSDS program should have some experience with Python because they will use the language in some data science classes.
The jobs also share a demand for professionals with advanced degrees. A report from IBM and the Business Higher Education Forum found that 42 percent of job postings for data scientists, data analysts, and data engineers called for a master’s degree or higher.
How Do These Roles Differ?
There is no clear consensus among data science professionals about which duties belong to which job titles. However, it is still common for different jobs to interact with data differently and at different stages.
There is some evidence that analysts in specific subfields do less technical work. For example, Anaconda’s State of Data Science report found that 43 percent of self-described business analysts didn’t deploy predictive models as a part of their jobs. Only 17 percent of data engineers said the same. Some data analyst roles work more closely with the business side than data scientists do. They communicate with stakeholders; build information dashboards; use data visualization tools, such as Google Charts and Tableau; and drive business decision-making.
Data engineers work to collect, store, and convert unstructured data into usable formats for data analysts and other data science professionals. Raw data rarely comes in convenient forms ready for data modeling or predictive analytics methods, so data engineers build systems that can transform unstructured datasets into something ready for data analytics. Data engineers use more database management skills, such as SQL, than other data science professionals.
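As a rough illustration of that kind of transformation, the following Python sketch parses a few raw, free-text purchase log lines into a structured SQLite table and then summarizes them with SQL. The log format, the `purchases` table, and the `parse_line` and `load_purchases` helpers are all assumptions invented for this example; a real pipeline would be far larger and would depend on an organization's own data sources.

```python
import re
import sqlite3

# Hypothetical raw log lines; the format is invented for illustration only.
RAW_LINES = [
    "2022-03-01 09:15 | customer=alice | book=Dune | price=9.99",
    "2022-03-01 21:40 | customer=bob | book=Emma | price=7.50",
    "corrupted line without the expected fields",
    "2022-03-02 20:05 | customer=alice | book=Emma | price=7.50",
]

# Pattern describing the assumed log format.
LINE_PATTERN = re.compile(
    r"^(?P<day>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}) \| "
    r"customer=(?P<customer>\w+) \| book=(?P<book>[^|]+?) \| "
    r"price=(?P<price>\d+\.\d{2})$"
)


def parse_line(line):
    """Return a structured row for a well-formed line, or None otherwise."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    return (
        match["day"],
        match["time"],
        match["customer"],
        match["book"],
        float(match["price"]),
    )


def load_purchases(lines):
    """Load every parseable line into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE purchases "
        "(day TEXT, time TEXT, customer TEXT, book TEXT, price REAL)"
    )
    parsed = (parse_line(line) for line in lines)
    rows = [row for row in parsed if row is not None]  # skip malformed lines
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


conn = load_purchases(RAW_LINES)

# Once the data is structured, an analyst can query it with ordinary SQL.
sales_by_book = conn.execute(
    "SELECT book, COUNT(*), SUM(price) FROM purchases "
    "GROUP BY book ORDER BY book"
).fetchall()
print(sales_by_book)
```

The malformed line is silently skipped here to keep the sketch short; a production system would more likely log or quarantine such records so they can be inspected later.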
The main differences between data scientists and other data professions are harder to pin down. Some experts say that data scientists work on more abstract research and unstructured data than data analysts. Others believe that data scientist as a job title is going extinct, and we will see field-specific data scientists and analysts in the future. So instead of pure data scientists working to solve industry-specific problems, there will be marketing, healthcare, IT, and other professionals with data science skills who work on solutions for their organizations.
However, while there may be some key differences between subfields in data science, data science professionals should look to individual job postings to get an idea about the associated duties and skills. Remember that these job descriptions are often written by hiring managers who aren’t data-literate and know only that they need to hire someone to work with data.
What About Other Data Science Job Titles?
The presence of other job titles in the data science profession supports the idea that the future of data science will be more segmented based on skills (as we’re seeing with data engineers today) and industry knowledge (as we see with data analyst positions in specific industries).
More and more job postings exist for cloud engineers, cloud security managers, data architects, data mining specialists, data visualization developers, and machine learning engineers. These job titles reflect specific skill sets within data science and computer science more broadly.
Other emerging job titles, such as business intelligence analyst, decision scientist, and other domain expert analysts, reflect the need for data science professionals who also have a deep understanding of a specific industry and the challenges it faces.
What Does This Mean for the Future of Data Science?
These job titles and their work can tell us a few things about where data science might be heading in the future.
Automation Is the Future
Machine learning engineering is one of the fastest-growing careers in any field. In 2019, Indeed reported that the number of machine learning jobs increased by 344 percent between 2015 and 2018. While growth hasn’t kept up at quite that pace, it is still significant. According to Lightcast, the number of job postings for machine learning engineers almost doubled between 2019 and 2022.
Machine learning is just one facet of automation in data science. Automating simple tasks has long been a feature of the field and something that Python does well. However, as artificial intelligence becomes more advanced, data scientists can automate more sophisticated work.
This does not mean that data scientist roles will disappear and machines will take over their duties. Instead, greater automation will increase the demand for highly trained data science professionals who can develop machine learning models. It will also free up data scientists’ time, allowing them to work on more interesting problems instead of repetitive tasks. Anaconda’s State of Data Science report found that 55 percent of professionals thought automation in data science was a good thing. Only 4 percent had negative attitudes about automation.
Professionals interested in a data science career should develop artificial intelligence and machine learning skills to prepare for these emerging jobs. Both topics are part of the online MSDS curriculum at Tufts.
Data Science Is Still Growing
The need for professionals specializing in particular skills suggests that data science is still growing. In the past, companies may have hired a single data scientist or a small team to perform almost every task related to data collection and analysis. Today, a large company could have multiple teams segmented along their roles in the data lifecycle. We should expect this trend to continue, creating even more opportunities for data science professionals with cutting-edge skills to find jobs in the field.
Experts in the industry support this view. The Quant Crunch report from IBM and the Business Higher Education Forum recommends that “workforce development and higher education must look beyond the data scientist to develop talent for a variety of roles.” A survey from McKinsey & Company identified data analytics as the largest skill gap in business. LinkedIn’s 2020 Emerging Jobs report labeled data science as the third fastest-growing field, with 37 percent annual growth.
There is plenty of room for aspiring data scientists to enter the field. There are opportunities for working professionals with a background in a particular industry who augment their knowledge with data science. There are also opportunities for computer science professionals to develop expertise in a specific data science topic, such as data warehousing or machine learning. The online MS in Data Science from Tufts sets students up for success in this growing field.
Data Storage Is Changing
We are seeing more and more job postings for roles such as cloud engineer, data warehousing engineer, and data architect. This tells us that data storage is a growing part of data science. It’s no longer possible to leave data collection and management to a single data scientist or software engineer, who handles those duties alongside their regular work. Instead, massive quantities of data now require specialized engineers who can deliver it to dedicated analysts in usable and secure forms.
Today, data engineers surveyed for Anaconda’s State of Data Science report say that “meeting IT security standards” is one of their primary job functions. In the future, data scientist roles will increasingly emphasize efficient data storage and security as both the total volume of data and the threats against it increase. A Seagate report predicts that the portion of data under secure storage conditions will increase from 67 percent in 2020 to 87 percent in 2025.
Titles Also Confirm Advanced Education Is Still Crucial to Success in Data Science
In the past, many junior data science professionals were stuck doing “data janitor work,” or repetitive tasks to clean data and prepare it for more sophisticated analysis. As machine learning automates more of these tasks, we can expect entry-level data janitor jobs to disappear in favor of more skilled roles. We’re already seeing this trend in the data. In 2021, respondents to Anaconda’s State of Data Science survey spent about 17 percent of their time on data cleansing. That’s down from 26 percent in 2020.
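To make “data janitor work” concrete, here is a minimal Python sketch of the kind of repetitive cleanup the term refers to: trimming whitespace, normalizing case, dropping records with missing values, and removing duplicates. The sample records and the `clean_records` helper are hypothetical and exist only for illustration; they are not drawn from the Anaconda survey or any real dataset.

```python
# Hypothetical messy records, invented for illustration only.
RAW_RECORDS = [
    {"name": "  Alice ", "email": "ALICE@example.com"},
    {"name": "Bob", "email": ""},                        # missing email
    {"name": "alice", "email": "alice@example.com "},    # duplicate of Alice
    {"name": "Carol", "email": "carol@example.com"},
]


def clean_records(records):
    """Normalize fields, drop incomplete rows, and de-duplicate by email."""
    seen_emails = set()
    cleaned = []
    for record in records:
        name = record.get("name", "").strip().title()
        email = record.get("email", "").strip().lower()
        if not name or not email:
            continue  # drop rows with a missing value
        if email in seen_emails:
            continue  # drop repeated customers
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


cleaned = clean_records(RAW_RECORDS)
for row in cleaned:
    print(row)
```

Rules like these are simple to write but tedious to maintain by hand across many datasets, which is one reason this kind of work is a natural target for automation.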
An advanced degree will serve future data science professionals in two ways. First, for current data and computer science professionals, it will build the skill sets necessary for more advanced roles, such as data engineer or machine learning engineer. Second, for professionals who want to apply data science to their industry, it will develop the foundation for that work.
Advanced degrees, such as the online Master of Science in Data Science from Tufts University, prepare professionals to excel along multiple data science career paths. The program delivers a student-focused experience with small class sizes and a rigorous curriculum that teaches skills essential for the future of data science. Core courses give students skills that prepare them for success in emerging data science jobs, from industry-focused data analysts to machine learning engineers.
No one can say precisely what data science will look like in 10 years. Technology is changing at an unprecedented rate, and data science will likely change with it. But what is clear is that aspiring data science professionals or current professionals who want job security should seek out advanced degree programs to prepare them for this increasingly segmented field, however it develops in the future.