Barry Boehm, perhaps the most influential software engineer of his generation, was delivering software to a military submarine in the early 1970s in the form of a three-foot-long metal tray filled with paper punch cards. An officer asked if the software would be on the boat when they sailed. Boehm said yes, it was part of a new system they would be testing on this voyage. The officer said that, in that case, he needed to weigh the software, since the overall weight of a submarine was critical to its performance. Boehm said, “No need, sir; it weighs nothing.” The officer, puzzled, looked at the metal tray full of cards and said, “That’s nuts, it must be 40 pounds or more!” to which Boehm replied, “We only use the holes.”
Trading in Atoms for Bits
Fifteen years later, Ford Motor Company competed with Japanese newcomers and needed to increase the safety of its vehicles. Many options were evaluated including adding metal to frames, changing crumple zone shapes, and upgrading airbag software. Airbag software was selected because of the speed and cost-effectiveness of the change.
Thirty years later, Google started training cars to drive themselves. They built a fake city in the desert for it but wanted to go faster, since even fake cities take a long time to reconfigure. So they built a virtual simulator in which their cars’ software learns faster than in real time and in which Google engineers create hundreds of variations of a situation, a practice they simply call “fuzzing.”
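To make the idea of creating hundreds of variations of a situation concrete, here is a minimal sketch in Python. It is not Google's simulator: the scenario fields, the 20% spread, and the `fuzz_scenario` helper are all hypothetical, chosen only to show how one base situation can be turned into many slightly different ones.

```python
import random

# Hypothetical base scenario; field names and values are illustrative only.
BASE_SCENARIO = {
    "pedestrian_speed_mps": 1.4,
    "oncoming_car_speed_mps": 13.0,
    "gap_to_crosswalk_m": 25.0,
}


def fuzz_scenario(base, n_variants, spread=0.2, seed=0):
    """Return n_variants copies of base, each value scaled by a random
    factor drawn uniformly from [1 - spread, 1 + spread]."""
    rng = random.Random(seed)  # fixed seed so the variations are reproducible
    variants = []
    for _ in range(n_variants):
        variant = {
            key: value * rng.uniform(1 - spread, 1 + spread)
            for key, value in base.items()
        }
        variants.append(variant)
    return variants


variants = fuzz_scenario(BASE_SCENARIO, n_variants=300)
print(len(variants), "variations generated")
print(variants[0])
```

Each variation would then be run through the simulator in place of a physical reconfiguration of the test city.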
GE Aviation uses Dataiku and machine learning to predict the performance of new jet engine part designs. It’s so much faster than their previous process that they too began “fuzzing” new designs. Overall, they reduced design time by an astonishing 50%.
Boehm’s punch card holes, Ford’s airbag software, Google’s fuzzing, and GE Aviation’s fuzzing are all examples of a macro trend: trading atoms for bits. Digital twins are the application of that trend to industrial processes and are generating staggering ROI in many industries. So what is a digital twin?
Digital Twins and Their Value
A digital twin is a digital representation of a physical object or process to which we can apply AI to get fast, cheap answers about the real-world object or process. Dataiku customers are building digital twins for:
- Customer behavior
- Jet engines
- Municipal water grids
- Water treatment plants
- Municipal electricity grids
- City traffic
- Aluminum mining and smelting
- Plant-based oils and fats (lipids)
- Chicken deboning machines
- Black holes
- Semiconductor manufacturing
- Oil well drills
GE Aviation wanted to accelerate the design of new engine parts to make them more sustainable by reducing soot emissions and increasing fuel efficiency. Two generations ago, they would fabricate a new part, put it in an engine in the shop, run it for a while, and measure efficiency and emissions. The process took weeks. As computing costs dropped, they switched to computational fluid dynamics, a digital simulation of physical properties, which used expensive computers and took two days. As computing costs plummeted further, they developed deep learning models on Dataiku’s platform to predict fuel efficiency and soot emissions. Those models are 190 million times faster than computational fluid dynamics.
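The pattern behind that speedup is often called a surrogate model: run the slow simulation a limited number of times, fit a fast model to the results, and query the fast model for new designs. The sketch below illustrates the pattern only. It swaps in a plain least-squares line for deep learning, and `slow_simulation`, its toy formula, and the `airflow` input are hypothetical stand-ins, not GE Aviation's physics or Dataiku's API.

```python
def slow_simulation(airflow: float) -> float:
    """Hypothetical stand-in for an expensive physics run such as CFD."""
    # Toy formula for illustration only; not real engine physics.
    return 2.0 + 0.5 * airflow + 0.01 * airflow ** 2


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# 1. Pay for a handful of expensive runs once.
train_x = [i * 0.25 for i in range(9)]  # 0.0, 0.25, ..., 2.0
train_y = [slow_simulation(x) for x in train_x]

# 2. Fit the cheap surrogate to those runs.
slope, intercept = fit_line(train_x, train_y)


def surrogate(airflow: float) -> float:
    """Cheap prediction used in place of a new simulation run."""
    return slope * airflow + intercept


# 3. Screen a new design with the surrogate and compare it to the slow run.
candidate = 1.1
print("surrogate: ", round(surrogate(candidate), 4))
print("simulation:", round(slow_simulation(candidate), 4))
```

Because each surrogate call is just arithmetic, many candidate designs can be screened this way, with the slow simulation reserved for the most promising ones.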
More recently, a 130-year-old food sciences company, Bunge Loders Croklaan, wanted to accelerate their design process to keep up with changing consumer behavior caused by COVID-19 lockdowns. The company specializes in plant-based oils and fats and has prototyped thousands of products over its history. The company’s goal was to leverage data from those past prototypes to quickly predict attributes like color and viscosity in new designs. Using Dataiku, they built predictive models and a web app in just two weeks. “Today’s tools are amazing. Lightning speed results. They enable play, enable fun,” said Renee Boerefijn, Ph.D., Bunge’s Director of Innovation.
A third example is from semiconductor manufacturing and yielded one of the highest ROIs in the history of machine learning (excluding trading). The chip industry is competitive so there’s great value in getting new products to market quickly. They’re like movies and fast-fashion apparel: a product is released, sells well or doesn’t, and is gone soon. Shortening time to market can be worth billions of dollars.
Computer chips are perhaps the most complex devices made and their manufacturing process is equally complex, with tens of thousands of steps and thousands of parameters to be tuned. The machines that make them generate terabytes of data each day containing 50,000 variables. The design of the whole process is simply called “the recipe.” Key steps to getting to market are designing and testing the recipe in an R&D lab, and rolling out the recipe in production factories. During rollout, factories are offline and not making money so reducing the duration of these two steps is valuable. One semiconductor equipment manufacturer used Dataiku machine learning to rapidly estimate the effect of recipe changes without testing them in the physical machines, cut three to six months off the process, and saved millions of dollars.
So, where should you start when it comes to digital twins? Simulating the real world can seem daunting, but cloud computing and today’s machine learning algorithms are well up to it. Best practices for starting a digital twins initiative are similar to those for any AI initiative:
- Set clear goals and business ROI expectations
- Get a champion in the executive suite
- Avoid moonshots as your first project
- Get quick wins, document them, and evangelize them widely
- Use interdisciplinary teams, not subject matter experts in one corner, data scientists in another, and business analysts in a third
- Collect data from every possible source and hold data product owners responsible for quality, velocity, and other SLAs