The Role of Data Scientists and What We Saw at Rev 3

June 27, 2022
by Matthew Miller

The world is full of conferences, bright lights, relatively comfortable chairs, and friendly faces. Throw in some decent food there, and you’re in for an all-around great experience.

Last month, I had the pleasure of attending Rev 3 by Domino Data Lab in the Big Apple, which did not disappoint. Besides the camaraderie and the swag, I learned about their product updates and the broader machine learning (ML) space, which I will share here.  

Data science is not what it used to be

When one considers the history of data science, two things always come up: 

  • Data
  • Compute

The basic story goes as follows: the rise of artificial intelligence (AI) as a way to gain insight into data and drive predictions was fueled by a significant increase in the amount of data produced and by more powerful, cheaper computing.

Initially, tools focused on crafting algorithms to fit that data and produce sensible outputs. Not much thought was put into reproducibility and systematization. The name of the game was “throw something at the wall and see if it sticks.” The times, however, are a-changin'…   

MLOps to the rescue

Gone are the haphazard days. Now, the notion of machine learning operationalization (MLOps) is in full force. MLOps fosters a culture and practice that aims to unify machine learning system development and machine learning system operation. Through MLOps software, businesses can be systematic about their AI efforts by monitoring and maintaining their models. With this technology, they can gain visibility into their machine learning projects, put their models into production, and understand how those models perform.

MLOps is helping to tackle the two biggest challenges related to AI: data and computing. Indeed, when we look at the popular topics mentioned for the top-rated MLOps product on G2, Databricks Lakehouse Platform, we find that data comes up often, as seen below.

A snapshot of popular topics mentioned related to Databricks Lakehouse Platform

How is MLOps helping take control of one’s data and computing?

With MLOps, businesses can ensure a tight and clear connection between their data and models. Domino Data Lab announced its partnership with data management company Snowflake in January 2021 and elaborated on strengthening that connection at Rev 3 (May 2022) with Domino 5.2. The relationship went a step further in June 2022 with Snowflake’s investment in Domino Data Lab to unite ML models and cloud data in one platform. With Domino 5.2, users can access autonomous model performance monitoring in Snowflake’s Data Cloud.

Stig Pedersen, head of data science CoE at Topdanmark, notes:

“Domino’s Snowflake Data Cloud integration helps our team focus on data science, not complicated data regulatory requirements, with improved efficiencies, such as rapidly discovering model drift to minimize the potential business impact of suboptimal predictions.”
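For readers wondering what "discovering model drift" can look like in practice, here is a minimal sketch using the population stability index (PSI), a common way to compare a feature's distribution at training time with its distribution in production. This is not Domino's or Snowflake's implementation; the function name, the bin count, and the sample data are assumptions made for illustration.

```python
import math

def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline sample and a recent sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = max(0, min(bins - 1, idx))  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bucket is empty
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bucket_shares(expected), bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical data: a baseline sample and a production sample shifted upward
baseline = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in baseline]

print(population_stability_index(baseline, baseline))  # identical samples, no drift
print(population_stability_index(baseline, shifted))   # a large value signals drift
```

A widely used rule of thumb treats a PSI above roughly 0.25 as a significant shift worth investigating, though the right threshold depends on the model and the business context.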

In addition, they announced their new IntelliSize capability. This feature helps businesses manage their costs and operational burden by recommending the optimal size for an environment. With this cost-optimization feature, IT teams and business leaders can ensure that just the right amount of computing and storage is utilized for the task at hand.
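To make the idea of size recommendation concrete, here is a rough sketch of one simple right-sizing heuristic: pick the smallest environment tier that covers the 95th percentile of observed usage plus some headroom. How IntelliSize actually works is not described here, so everything below (the function name, the tier names, the 20% headroom) is a hypothetical illustration and not Domino's method.

```python
import math

def recommend_size(usage_cores, tiers, headroom=0.2):
    """Return the smallest tier whose capacity covers p95 usage plus headroom.

    usage_cores: observed CPU usage samples, in cores (hypothetical input)
    tiers: mapping of tier name to core capacity (hypothetical tier names)
    """
    ordered = sorted(usage_cores)
    # Nearest-rank 95th percentile
    rank = max(1, math.ceil(0.95 * len(ordered)))
    p95 = ordered[rank - 1]
    needed = p95 * (1 + headroom)

    # Walk tiers from smallest to largest and stop at the first one that fits
    for name, cores in sorted(tiers.items(), key=lambda kv: kv[1]):
        if cores >= needed:
            return name
    # Nothing fits: fall back to the largest tier available
    return max(tiers, key=tiers.get)

tiers = {"small": 1, "medium": 2, "large": 4}
print(recommend_size([0.5] * 20, tiers))
```

Using a high percentile rather than the peak keeps a single spike from forcing an oversized environment, while the headroom leaves room for normal variation.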

These integrations and features help bring data and data science together, allowing the technology to focus on the data so that the data scientists can focus on the data science.

Read now: Data Trends in 2022

What even is data science anyways?

Besides the feature updates announced, Rev 3 had some great high-level insights from industry leaders, such as Cassie Kozyrkov (chief decision scientist at Google), who spoke about making data science useful. Her talk was chock full of great illustrations, metaphors, and jokes, but one that stuck out was her analogy to cooking.

Cooking is not a one-object job. It has multiple ingredients, utensils, and steps, just like data science. If we were to compare, it would be: 

  • Data=ingredients
  • Algorithms=appliances
  • Models=recipes
  • Predictions=dishes

We cannot forget about or neglect any of these key components. By fostering collaboration and innovation, businesses can unlock the power of their data and talent.

Read more: How to Choose a Data Science and Machine Learning Platform That’s Right For Your Business

Stop calling them soft skills

Another key takeaway from Kozyrkov’s talk was her discussion of what happens when tools get easier. As MLOps products like Domino become easier to use and data science tasks, such as data preparation, become automated, where does that leave the humans involved? One hot topic is the role and place of soft skills, those non-technical skills that are ever-so-helpful and ever-so-human.

Kozyrkov remarked: 

“Don’t call them soft skills—call them the hardest to automate.”

This is a great place to end. As we think about the evolution of these platforms, how data and models are coming closer together, and how computing is becoming increasingly easy to optimize, we need to be constantly thinking:

What is the place of a human in the process?

Want to learn more about MLOps Platforms? Explore MLOps Platforms products.

Matthew Miller

Matthew Miller is a research and data enthusiast with a knack for understanding and conveying market trends effectively. With experience in journalism, education, and AI, he has honed his skills in various industries. Currently a Senior Research Analyst at G2, Matthew focuses on AI, automation, and analytics, providing insights and conducting research for vendors in these fields. He has a strong background in linguistics, having worked as a Hebrew and Yiddish Translator and an Expert Hebrew Linguist, and has co-founded VAICE, a non-profit voice tech consultancy firm.