Why Is It So Difficult to Develop Data Science as a Product?


Developing machine learning models as products that add value to a business is still a new field with plenty of uncharted territory. Applying well-established software development methods, such as agile, is not easy here, but it can provide a strong foundation for success.

Picnic sees itself as the tech equivalent of a supermarket. Technology and the underlying data are critical to everything Picnic does, from the app-only store to the 20-minute delivery windows and the just-in-time supply chain.

It is the responsibility of the Picnic Data Science team to take data-driven decision making to the next level. We’ve been tasked with developing automated systems with the intelligence, context, and authority to make business decisions worth tens of millions of euros each year.

Building these systems is hard. Getting them into production and adopted by the business is even harder. Let’s take a look at what it takes for Picnic to productionize our data science projects, an effort we’ve dubbed “Data Science as a Product.”

Why is this so difficult?
According to a report published in July 2019, 87 percent of data science projects never make it to production. A lack of leadership support, siloed data sources, and a lack of collaboration are some of the reasons given. Beyond these issues, data science and machine learning projects have a number of inherent characteristics that set them apart from other forms of software development.

To begin with, data science, and especially machine learning, deals in probabilities and uncertainties. A machine learning-based payment fraud model might tell us the likelihood of an order being fraudulent is 73 percent +/- 5 percent, with a 95 percent confidence interval. Our business counterparts live in a deterministic world: “we want to block all fraudulent orders.” Communicating between these two worlds is not easy.
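To make that gap concrete, here is a minimal sketch of how a probabilistic fraud score gets collapsed into the deterministic block/allow decision the business asks for. The numbers, threshold, and the normal-approximation interval are all illustrative assumptions, not Picnic’s actual model:

```python
import math

def fraud_decision(p_fraud: float, n_samples: int, block_threshold: float = 0.5):
    """Turn a probabilistic fraud score into a deterministic decision.

    p_fraud: the model's estimated probability that the order is fraudulent.
    n_samples: number of observations behind the estimate (drives the CI width).
    Returns the decision plus a 95% confidence interval on the estimate.
    """
    # Normal-approximation 95% confidence interval for a proportion.
    margin = 1.96 * math.sqrt(p_fraud * (1 - p_fraud) / n_samples)
    interval = (max(0.0, p_fraud - margin), min(1.0, p_fraud + margin))
    decision = "block" if p_fraud >= block_threshold else "allow"
    return decision, interval

# An order scored at 73% (+/- roughly 5% at these sample sizes) is blocked,
# even though the model is not certain the order is fraudulent.
decision, ci = fraud_decision(p_fraud=0.73, n_samples=300)
```

The deterministic answer the business sees hides the uncertainty entirely, which is exactly why the two worlds talk past each other.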

Furthermore, data science projects have a non-linearity that (usually) does not exist in ‘traditional’ software development. Before we start building a model, we have no idea how well it will perform. It could take a week, three months, or even longer to reach a satisfactory level of performance. That makes it extremely difficult to put together a tidy project plan with the deadlines and deliverables the business expects.

Finally, when it comes to releasing a model to production, the importance of model trust cannot be overstated. When we partner with the business to productionize a concept, we’re entering a domain where they are the experts. In many cases, we’re trying to automate a manual process or replace a collection of carefully designed business rules. Those rules aren’t perfect, but they were developed by people with a thorough understanding of the domain. It’s difficult to hand over a black-box machine learning algorithm and tell the business it will replace their existing way of working. In the end, the profit (or loss) from whatever process the model is meant to automate belongs to the business, and we as data scientists must persuade them to place their livelihood in the hands of our models.

The following considerations, in our experience, will help you successfully productionize models across a wide range of domains:

  1. Use case selection
  2. Business alignment
  3. Agile data science development

Use case selection
“I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.” — Abraham Maslow

The number of problems machine learning could solve is enormous. There are applications in customer success, supply chain, distribution, finance, and beyond. With high-quality data readily available in Picnic’s well-kept data warehouse, it’s difficult to know where to begin. A data science project’s success hinges on selecting the right use case.

So, how can you choose which use case to pursue?

  - The one with the most commercial value?
  - The ‘low-hanging fruit’ for a quick win?
  - The one in line with the company’s strategic goals?

We consider all of those factors at Picnic, but the deciding factor is one question:

How certain are we that machine learning is the most effective solution to this problem?

(Remember how I said we data scientists are used to working in probabilistic terms?)

We want to make sure our data scientists’ time is well spent. Say there’s a compelling problem with the potential to produce a lot of value, but a few well-crafted business rules can capture 80% of that value. Is having the data science team spend months chasing an extra 10% the most efficient use of resources? Most likely not.

We can break down the use case selection criteria into several components using our Zen of Data Science principles as a guide:

  1. Do we have enough high-quality, clean data to model the problem?
  2. Is there a specific objective criterion (or loss function) we’re optimizing for?
  3. Is the business ready to automate this process?
  4. How will it fit into the production process? Does the product team have the capacity to implement it?
  5. Are there case studies, research papers, or other resources on solving this type of problem with machine learning?
  6. Do we need to address any biases or ethical concerns?

If any of those questions give us pause, we rethink whether this is the right project for our team to take on.
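As an illustration, that screening rule (“any single concern is enough to rethink”) could be encoded as a trivial helper. The item names here are hypothetical shorthand for the six questions above, not anything Picnic actually uses:

```python
CHECKLIST = [
    "clean_data",           # 1. enough high-quality, clean data
    "objective_criterion",  # 2. a clear metric or loss function
    "business_ready",       # 3. the business is ready to automate
    "fits_production",      # 4. fits the production process, team has capacity
    "prior_art",            # 5. known approaches for this problem type
    "ethics_cleared",       # 6. biases and ethical concerns addressed
]

def assess_use_case(answers):
    """Flag a use case for rethinking if any checklist item fails.

    answers: dict mapping checklist item -> bool (True means 'no concern').
    """
    concerns = [item for item in CHECKLIST if not answers.get(item, False)]
    return {"proceed": not concerns, "concerns": concerns}

# A use case that clears every item proceeds; one failed item is enough to pause.
assess_use_case({item: True for item in CHECKLIST})
```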

Without the right use case, no matter how much money you have, the chances of success are slim.

Business alignment
“You can create the ideal project plan if you first make a list of all the unknowns.” — Bill Langley

Making sure everyone is on the same page about the project’s goal seems straightforward and obvious. The business wants more accurate predictions. You’re confident you can beat the current system. So what exactly is the problem?

The problem is that it isn’t just about the model’s performance.

Let’s say you build a fantastic model and schedule it as a daily job. It turns out the business needs to be able to update predictions during the day; suddenly, you need a real-time service. Your model does a good job on the majority of articles/segments/regions, but a new product launches this quarter, and your model is now making predictions with no prior data to learn from (the cold start problem says hello).
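One common way to hedge against the cold start is to fall back to a coarser level of aggregation when an item has no history of its own. A minimal sketch, where the product/category names and the simple averaging scheme are illustrative assumptions:

```python
from collections import defaultdict

class ColdStartAwarePredictor:
    """Per-product average demand, with a category-level fallback for new products."""

    def __init__(self):
        self.product_sales = defaultdict(list)   # product -> observed sales
        self.category_sales = defaultdict(list)  # category -> observed sales

    def observe(self, product, category, sales):
        self.product_sales[product].append(sales)
        self.category_sales[category].append(sales)

    def predict(self, product, category=None):
        history = self.product_sales.get(product)
        if history:
            # Warm start: the product has its own history.
            return sum(history) / len(history)
        if category and self.category_sales.get(category):
            # Cold start: fall back to the average of the product's category.
            fallback = self.category_sales[category]
            return sum(fallback) / len(fallback)
        return 0.0  # no signal at all

pred = ColdStartAwarePredictor()
pred.observe("apples", "produce", 100)
pred.observe("bananas", "produce", 80)
pred.predict("apples")            # uses the product's own history
pred.predict("kiwis", "produce")  # cold start: falls back to the category average
```

The point is not the specific fallback, but that the behavior for unseen items is an explicit design decision the business needs to understand.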

Machine learning projects require a certain degree of business understanding of how the systems work. The business must understand the inherent strengths and weaknesses of machine learning models, how edge cases are handled, and which features are used.

Furthermore, you must understand how the model will be used. What is the expected output? What will be done with the predictions? Does a fallback mechanism need to be in place in case the model fails to run? Knowing the answers to these questions before you go to production will save you a lot of headaches, heated conversations, and late nights of rework.
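One way to picture such a fallback mechanism is a thin wrapper that reverts to the hand-crafted business rules whenever the model errors out or returns something implausible. All names here are hypothetical, not from Picnic’s stack:

```python
def rule_based_forecast(order):
    # The carefully designed business rules the model is meant to replace;
    # a 5% week-over-week growth assumption, purely for illustration.
    return order.get("last_week_demand", 0) * 1.05

def predict_with_fallback(model_predict, order):
    """Call the model, but fall back to business rules on any failure."""
    try:
        prediction = model_predict(order)
        if prediction is None or prediction < 0:
            raise ValueError("implausible prediction")
        return prediction, "model"
    except Exception:
        # Model down or misbehaving: the business still gets an answer.
        return rule_based_forecast(order), "rules"

def broken_model(order):
    raise RuntimeError("model service unavailable")

value, source = predict_with_fallback(broken_model, {"last_week_demand": 200})
# source == "rules": the rules kept the process running
```

Agreeing on such a safety net up front is far cheaper than negotiating it during an outage.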

Then the issue of model trust arises once more. What if the business doesn’t trust your model’s performance?

You can show all the ROC curves, F1 scores, and test set performance you like, but if the first few predictions your model makes are wrong, will it be given a chance to recover? The existing business rules weren’t perfect, but the business understood which cases they handled well and which they didn’t, and could intervene accordingly. Your models will (hopefully) have an operational impact, and they won’t be used if the business doesn’t trust them. It’s as simple as that.

Model trust discussions are difficult to have, but they are necessary. You must know ahead of time what it will take for the business to use your model in production. At the very least, all parties must agree on, and sign off on, an evaluation period with clear performance indicators.
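Such a sign-off criterion can be surprisingly simple to state. In this sketch, the mean absolute error metric and the 10% required improvement over the existing rules are illustrative assumptions the parties would agree on up front:

```python
def mean_absolute_error(actuals, predictions):
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)

def evaluate_trial(actuals, model_preds, rule_preds, required_improvement=0.10):
    """Agreed sign-off criterion: the model's MAE must beat the rules' MAE by 10%."""
    model_mae = mean_absolute_error(actuals, model_preds)
    rule_mae = mean_absolute_error(actuals, rule_preds)
    passed = model_mae <= rule_mae * (1 - required_improvement)
    return {"model_mae": model_mae, "rule_mae": rule_mae, "passed": passed}

# Hypothetical evaluation-period numbers: actual demand vs. the two systems.
report = evaluate_trial(
    actuals=[100, 120, 90],
    model_preds=[98, 118, 93],
    rule_preds=[110, 105, 80],
)
```

Writing the criterion down before the trial starts turns “do we trust it?” into a question with an agreed answer.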

Many data science projects fail due to a misalignment of priorities between the data scientists and the business. Have that dialogue before months of development work have been spent. It could mean the difference between life and death for your model.

Agile data science development

MVP beats POC.

“When you’re fundraising, it’s AI. When you’re hiring, it’s ML. When you’re implementing, it’s linear regression. When you’re debugging, it’s printf().” — Baron Schwartz

Agile software development has become the de facto norm, but it hasn’t made its way into the data science community (yet). Most data science initiatives today operate on the premise of “build it and they will come.” A data scientist meets with the business to discuss the problem, determines which metric to optimize, and asks for data access. Then they go off and spend a few months crafting a beautiful, robust model, which they proudly present. And then…

Nothing. The model never sees the light of day. Data science needs the same central principle that agile software development has: it must be customer-focused.

What works is focusing on building a minimum viable product (MVP) rather than a proof of concept (POC), which usually never leaves the data scientist’s laptop.

The aim of an MVP is to build an end-to-end solution as quickly as possible. You build the data pipeline, start with a simple baseline model (think linear or logistic regression), and show the results to the end user. How we use machine learning to find the best drop times is a real-world example.
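A baseline MVP pipeline can be almost embarrassingly simple. This sketch (the feature names and the mean-predictor baseline are illustrative assumptions) shows the end-to-end shape that a fancier model can later slot into:

```python
class MeanBaseline:
    """The simplest possible baseline: always predict the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]

def run_pipeline(raw_rows):
    # 1. Data pipeline: extract features and target from raw records.
    X = [[row["temp"]] for row in raw_rows]
    y = [row["sales"] for row in raw_rows]
    # 2. Baseline model: trivially simple, but the system runs end to end.
    model = MeanBaseline().fit(X, y)
    # 3. Serve predictions to the end user; a real model replaces step 2 later.
    return model.predict(X)

rows = [{"temp": 20, "sales": 100}, {"temp": 25, "sales": 140}]
run_pipeline(rows)  # [120.0, 120.0]
```

Because `fit`/`predict` mirrors the common scikit-learn interface, swapping the baseline for a real regression model is a one-line change.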

Machine learning stands to benefit for the same reasons this approach became the de facto norm in software development. We do everything we can to adhere to the agile manifesto’s main principles:

Working software

Time spent fine-tuning a model that might never be used is time wasted. Spend it on building something functional and sustainable instead.

Customer collaboration

Reduce the time to market so that the ‘customer’ sees output from the system as early as possible. From there, you can iterate and improve.

Responding to change

It’s better to find out what works in week two than in month five. Perhaps the in-house system you planned to integrate with has no way to expose the data you need. Flexibility with specifications is essential, as is shipping working code early and often.
The model isn’t the hardest part of data science projects; it’s everything else. By concentrating on an MVP, you quickly get a functioning system into production and start making predictions. You’ll find problems faster and be able to offer your customers a new, shiny model in weeks rather than months.

In the end, we don’t want to build a model just for the sake of building a model. We’re developing a product with a model as one of its components. And we’ll get there by applying what the industry has learned over decades of product development.

Final Thoughts
Creating machine learning-based products is not easy. All of the challenges of software development are present, with the added complexity of machine learning at the heart of it. It’s a young field with few proven best practices. You can set yourself up for success by choosing the right use case, aligning with the business, and following tried-and-true agile software development practices.

Technology, data, and machine learning are at the heart of Picnic. To get our models into production, we have an incredibly talented team of data scientists, a scalable, self-service data science platform, and the full support of the business.