The following remarks were given by Mark Elszy, Dataiku RVP Federal, at a U.S. Chamber of Commerce Artificial Intelligence Commission hearing on July 21, 2022.
I would like to thank the commission for this opportunity to share some insights from Dataiku that we have gained from advising and working with over 500 customers worldwide to scale Responsible AI of all kinds. At Dataiku, we see a future of Everyday AI, where everyone, across government agencies, in civilian industry, and across job functions, has the tools and training to contribute to Artificial Intelligence in a way that improves everyday decision-making as well as enables game-changing new products and services. The Everyday AI future will require all of us and not just some of us.
In the context of national security, we typically think of safeguarding IP, cyber threats, or supply chain security, all of which are critical issues. But when it comes to Artificial Intelligence, I would ask you to consider some foundational challenges that, if not addressed, could prevent us from leading in AI and from becoming safer.
The overriding challenge is readiness.
At Dataiku, we are focused on three foundational issues when it comes to readiness. The first and most important factor is our people. Within the government, we simply don’t have enough skilled data science experts and we lack a coherent, broad-based plan to upskill the federal workforce that we have.
The second key factor is the trustworthiness of AI predictions. Paramount to trustworthiness is transparency: transparency of the data used and of the predictive models developed, together with continuous, always-on monitoring of models for changes in accuracy and bias.
The third key factor driving AI readiness is speed. Speed of mission. Quite simply, we must go faster in industry, within federal agencies, and as a nation. The number of AI patents filed worldwide is growing 77% annually, yet last year China filed three times more patents than we did. Throwing people and money at the problem is insufficient. We need an industrial-scale production model for AI, analogous to what Henry Ford did for automobiles. Such a model enables automation, collaboration, and continuous improvement.
Now let me go into a bit more detail on each of these.
1. People
We cannot hire or outsource our way to a solution. Some report that the demand for AI experts is three to five times the supply. In that kind of labor market, federal agencies will have a difficult time competing with Google and Facebook for the number of AI workers needed.
MITRE estimates that 20% of the Department of Defense’s civilian workforce could be enabled with some degree of AI skills, which is approximately 157,000 people! That doesn’t mean that we are going to turn every Excel and PowerPoint user into a coder. We need broad adoption of low-code and no-code capabilities as a way to get more domain experts involved. For anyone who might not be familiar with those terms, low-code and no-code refer to software development functions that require just a click of the mouse rather than writing software code. With these kinds of tools, it’s possible to have a business expert working side by side with a Ph.D. data scientist on the same project. Many non-coder roles are required for AI industrialization, including:
- Data domain experts
- Data engineers and stewards
- Model catalog managers
- AI operations architects, and
- AI Governance policy managers
Just to name a few.
We know that upskilling works. We have seen massive upskilling programs firsthand at companies like Pfizer, GE, and Schlumberger. Some of the best practices from those programs are:
- Branding and publicizing the upskilling program
- Providing multiple, self-service learning paths
- Creating multiple levels of certification so that everyone can get started now regardless of their current skills
Upskilling hundreds of thousands of workers will create a flywheel effect both within the national security industry and across the U.S. economy in general, as upskilled workers move to other industries.
2. Trustworthiness
The trustworthiness of AI has multiple layers to it and is of great concern to just about everyone. When it comes to readiness, it is critical that frontline workers have trust in AI products or they won’t use them. Without trust, even the best AI products cannot generate value. For example, let’s say that an AI model predicts that we have sufficient spare parts on hand to service mission-critical aircraft for the next 12 months. If sailors or airmen don’t trust the prediction, then we overstock and an efficiency opportunity is lost. Boston Consulting Group estimates that less than 20% of Global 2000 companies have generated significant positive ROI from AI. A primary reason is a lack of trust among frontline workers. So what can we do about this? Transparency, traceability, and monitoring generate trust.
Transparency includes the data features that a model uses to make predictions, and an explanation of each individual prediction when needed. For example, if a satellite image processing application estimates that a certain factory is operating at only 20% capacity, then it should be able to explain how it came to that conclusion.
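To make the idea of explaining an individual prediction concrete, here is a minimal sketch in Python. It assumes a simple linear scoring model, which is only one of many possible model types, and the feature names, weights, and values are hypothetical and chosen purely for illustration. For a linear model, each feature's contribution to a prediction is its weight times its value, so the explanation can be read off directly.

```python
from typing import Dict, List, Tuple


def explain_linear_prediction(
    weights: Dict[str, float],
    bias: float,
    features: Dict[str, float],
) -> Tuple[float, List[Tuple[str, float]]]:
    """Return a prediction and its per-feature contributions.

    Assumes a linear model: prediction = bias + sum(weight * value).
    Contributions are sorted by absolute size, largest first.
    """
    contributions = {
        name: weights[name] * value for name, value in features.items()
    }
    prediction = bias + sum(contributions.values())
    ranked = sorted(
        contributions.items(), key=lambda item: abs(item[1]), reverse=True
    )
    return prediction, ranked


# Hypothetical inputs for a factory capacity estimate (illustrative only).
example_weights = {
    "parked_vehicles": 0.002,
    "thermal_signature": 0.5,
    "active_loading_docks": 0.05,
}
example_features = {
    "parked_vehicles": 40.0,
    "thermal_signature": 0.1,
    "active_loading_docks": 1.0,
}

estimate, reasons = explain_linear_prediction(
    example_weights, bias=0.02, features=example_features
)
print(f"Estimated capacity: {estimate:.0%}")
for name, contribution in reasons:
    print(f"  {name}: {contribution:+.3f}")
```

More complex models need dedicated explanation techniques, but the goal is the same: every prediction can be accompanied by the features that drove it.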
Next, traceability, auditability, and automatic quality checks are needed for both datasets and models at every stage of their lifecycle.
Lastly, monitoring both quality and bias in data and models should be an always-on, automatic process that runs at the speed of AI, not the speed of humans. If it’s a separate process that’s run once a quarter, or a guideline like an HR manual that’s never used, then trust is at risk.
AI bias can take many forms. What’s important to one application might not be important to another. Thus trustworthy AI must search for bias in any subgroup that subject matter experts identify, such as gender and race in consumer credit, or age and comorbidities in vaccine testing.
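As a rough illustration of an automatic subgroup check, the Python sketch below compares a model's accuracy within each subgroup against its overall accuracy and flags any subgroup that falls behind by more than a chosen margin. The record layout (`label`, `prediction`, and a grouping field), the function names, and the 0.1 margin are all assumptions made for this example. An accuracy gap is also only one of several possible bias measures; which one matters depends on the application.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Mapping


def accuracy_by_subgroup(
    records: List[Mapping], group_key: str
) -> Dict[Hashable, float]:
    """Compute prediction accuracy separately for each subgroup value."""
    correct: Dict[Hashable, int] = defaultdict(int)
    total: Dict[Hashable, int] = defaultdict(int)
    for record in records:
        group = record[group_key]
        total[group] += 1
        correct[group] += int(record["prediction"] == record["label"])
    return {group: correct[group] / total[group] for group in total}


def flag_lagging_subgroups(
    records: List[Mapping], group_key: str, max_gap: float = 0.1
) -> List[Hashable]:
    """Return subgroups whose accuracy trails overall accuracy by > max_gap."""
    if not records:
        raise ValueError("records must not be empty")
    overall = sum(r["prediction"] == r["label"] for r in records) / len(records)
    per_group = accuracy_by_subgroup(records, group_key)
    return sorted(
        group for group, acc in per_group.items() if overall - acc > max_gap
    )


# Hypothetical scored batch: the model is right for group "A" every time,
# but only half the time for group "B".
batch = (
    [{"age_band": "A", "label": 1, "prediction": 1}] * 4
    + [{"age_band": "B", "label": 1, "prediction": 1}] * 2
    + [{"age_band": "B", "label": 1, "prediction": 0}] * 2
)
print(flag_lagging_subgroups(batch, "age_band"))
```

Running a check like this on every scored batch, for every subgroup that subject matter experts name, is what turns bias monitoring from a quarterly exercise into an always-on process.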
3. Speed of Mission
The third key factor in AI readiness is speed, or speed of mission. Speed comes from involving more people in the process, collaborating in real time, and employing reuse and continuous improvement. Fifty years ago, at the beginning of software engineering as its own discipline separate from coding, Brooks’ Law was observed. It said that adding more people to a late software project just makes it later. There were no economies of scale because software was largely developed by artisanal experts: highly skilled individuals often working alone or in small groups. AI is mostly developed that way today, by artisanal gurus, and that approach does not scale.
We have learned a lot about software engineering over the past 50 years and it’s time we apply those learnings to AI to create industrial-scale processes that include:
- Data and model catalogs
- Data and model operations
- Interdisciplinary team collaboration tools
- Continuous development and continuous integration pipelines, and
- Always-on quality monitoring
In conclusion, Everyday AI is upon us, but we need to address and strengthen the fundamental building blocks of AI readiness by focusing not on moonshots like self-driving cars but on:
- Upskilling the hundreds of thousands of people that we already have
- Developing transparent processes, and
- Industrializing AI development and operations for scale and speed