Across nearly every major industry, enterprises are investing more than ever before in advanced analytics and AI-driven data processes. While this is cause for celebration, it also creates a major challenge for visibility, control, and process management. With more teams developing, deploying, and deriving insights from data projects, it is becoming increasingly difficult to maintain comprehensive documentation and monitoring, and to mitigate operational and legal risks.
For this reason, data and analytics stakeholders and business leaders alike have begun to focus intently on governance processes and capabilities. Ideally, an enterprise’s data platform enables all users, from data engineers to business analysts, to work collaboratively within a recognized structure of accountability. From small tweaks to a model to major sign-offs, well-built governance operations should grant all stakeholders visibility into every stage of a data project’s development.
Getting to Good Governance
Good governance is like a great family recipe: it requires several key ingredients, each as important as the last, and it is greater than the sum of its parts. A well-built analytics and AI governance capability enables users to safely scale AI with oversight and to prioritize the data projects and models that deliver the most value. Let’s run through some of the key ingredients that good governance tools, like Dataiku Govern, should provide.
A Central Watchtower
When a project contains dozens, if not hundreds, of moving parts, and when multiple users with different degrees of data access, different coding skill sets, and different objectives are all working on the same processes and datasets, it is essential that everything revert to a single source. As your company scales its AI footprint, centralized program oversight is crucial for maintaining visibility and reducing risk.
With platforms like Dataiku, users gain access to a single place where data and analytics leaders and project managers can track the progress of multiple AI and analytics initiatives and ensure the right workflows and processes are in place to deliver Responsible AI. This hub serves as a central watchtower over your AI and analytics portfolio, allowing you to review all of the models, bundles, and projects across your design instances and determine which assets to explicitly govern.
Standardized Governance Plans and Workflows
While each team will have unique governance requirements for each project, you shouldn’t be reinventing the wheel every time you want to set up a well-governed workflow. Users should be able to track project statuses across all business initiatives, standardize their approach to AI, create project plans, and leverage workflow blueprints with clear steps and gates to explore, build, test, and deploy each governed project with optimized speed and value.
Project owners, for their part, should be able to document and communicate the project’s objectives, its scope, and its potential use cases. They should also be able to attach additional information for all to see and review, such as model documentation or the details of any business sponsors on the project.
Platforms like Dataiku allow you to tailor your governance processes to each project, while also providing you with pre-built plans and workflows to support your operations from the get-go. With Dataiku, users can leverage standardized project and workflow templates with clear steps and gates to explore, build, test, deploy, and maintain AI projects. Assign stakeholders, capture notes, and attach relevant documentation to each stage of a workflow to ensure the process is documented and tracked, from design to delivery.
Structured Sign-off and Approvals
Reviews and sign-offs might be the true core of any good governance operation. Getting stakeholder approval for analytics and AI projects can be challenging to manage and track, but is necessary to ensure both projects and models align with business needs, are auditable, and follow responsible AI best practices.
But there’s an essential yet difficult balance to be struck between efficiency and diligence. If processes are too byzantine and bureaucratic, projects will never make it into production; if processes are geared purely for speed, they risk being error-prone. In governed workflows, project owners request and collect sign-offs on models or project bundles prior to promoting them to production. Because this happens within the central Govern watchtower, where it is visible to all stakeholders, transparency stays high and bottlenecks are kept to a minimum, all while ensuring audit-readiness on deployment decisions. Without appropriate reviews and sign-off, a deployment will be blocked until proper approval is obtained.
With Dataiku Govern, it’s easy to assign relevant colleagues from different departments to ensure proper reviews and approvals take place across the process. This way, you maintain transparency and avoid the misaligned priorities or undefined performance metrics that can expose your organization to operational, reputational, and legal issues.
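To make the gating idea concrete, here is a minimal Python sketch of a sign-off check that blocks deployment until every assigned reviewer has approved. It is an illustration only and not Dataiku Govern's API: the `SignOffRequest` class, the `can_deploy` function, and the reviewer names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SignOffRequest:
    """Hypothetical record of the approvals requested for one model or bundle."""
    artifact: str
    required_reviewers: set[str]
    approvals: set[str] = field(default_factory=set)

    def approve(self, reviewer: str) -> None:
        # Only reviewers assigned to this request may sign off.
        if reviewer not in self.required_reviewers:
            raise ValueError(f"{reviewer} is not an assigned reviewer")
        self.approvals.add(reviewer)

    def pending(self) -> set[str]:
        # Reviewers who have not yet approved.
        return self.required_reviewers - self.approvals


def can_deploy(request: SignOffRequest) -> bool:
    """Deployment stays blocked until no approvals are pending."""
    return not request.pending()
```

In this sketch the gate is deliberately strict: a single missing approval keeps the artifact out of production, and `pending()` tells the project owner exactly whose review is still outstanding.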
Model and Bundle Registries
Aside from a smooth and reliable approval and sign-off structure, governed projects will also make it possible to follow a robust documentation protocol regarding what’s been done and what exists under the umbrella of the project. For data and analytics stakeholders — like risk and delivery managers, and machine learning engineers — it is especially important that their platform enables them to keep a Model Registry; that is, a central inventory of all their models. The idea is to have a centralized way to see all models in one place, versioned, and with performance metrics and project summaries for leaders and project managers.
With Dataiku Govern, the Model Registry includes not only those models built within the Dataiku instance, but also models imported from outside it, such as those developed using MLflow. Dataiku tracks not only the version in production, but also subflows for potential challenger or replacement models that are still under development and review. Users can simply select a model version to review its creation date and status, as well as the full history of its performance and drift.
The governance workflow comes into play when data teams want to push a new model or project version to production. They need to gather feedback and comments from stakeholders, or in some cases receive final sign-off before deployment. In this way, the Model Registry and the structured approval flows work hand-in-hand to ensure a well-governed process.
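As a rough illustration of what such an inventory holds, the following Python sketch models a registry that keeps every version of each model along with its creation date, status, and metrics. It does not reflect how Dataiku Govern stores this information; the class names, the status labels ("production", "challenger"), and the metric names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ModelVersion:
    version: str
    created: date
    status: str  # assumed labels, e.g. "production" or "challenger"
    metrics: dict[str, float] = field(default_factory=dict)


@dataclass
class ModelRegistry:
    # Maps a model name to its versions, in registration order.
    models: dict[str, list[ModelVersion]] = field(default_factory=dict)

    def register(self, name: str, version: ModelVersion) -> None:
        self.models.setdefault(name, []).append(version)

    def by_status(self, name: str, status: str) -> list[ModelVersion]:
        # All versions of one model that currently carry the given status.
        return [v for v in self.models.get(name, []) if v.status == status]


registry = ModelRegistry()
registry.register("churn", ModelVersion("v1", date(2023, 1, 10), "production", {"auc": 0.81}))
registry.register("churn", ModelVersion("v2", date(2023, 3, 2), "challenger", {"auc": 0.84}))
```

Keeping challengers in the same structure as the production version is what lets a reviewer compare them side by side before any promotion decision.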
Project Value & Risk Qualification
With a central hub providing a view into the many moving parts that make up model development and deployment, governance processes should enable stakeholders to assess project value and risk using a standardized qualification framework. With limited resources to execute a growing number of AI and analytics project requests, a single value-risk matrix can help leaders compare initiatives, determine oversight requirements, and decide which projects should be prioritized for investment.
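A value-risk matrix can be sketched in a few lines of Python. The example below is an assumption-laden illustration, not the qualification framework used by any particular product: the 1-to-5 scoring scale, the threshold of 3, and the quadrant labels are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str
    value: int  # assumed scale: 1 (low) to 5 (high)
    risk: int   # assumed scale: 1 (low) to 5 (high)


def quadrant(project: Project, threshold: int = 3) -> str:
    """Place a project in one of four matrix cells (labels are illustrative)."""
    high_value = project.value >= threshold
    high_risk = project.risk >= threshold
    if high_value and not high_risk:
        return "prioritize"
    if high_value and high_risk:
        return "prioritize with added oversight"
    if high_risk:
        return "deprioritize"
    return "backlog"


def rank(projects: list[Project]) -> list[Project]:
    # Highest value first; among equal values, lower risk first.
    return sorted(projects, key=lambda p: (-p.value, p.risk))
```

Scoring every request on the same two axes is what makes initiatives comparable; the cell a project lands in can then drive both its funding priority and the level of oversight it receives.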
With Dataiku Govern, dashboards and gauges show the current status of multiple projects at a glance. You can also compare existing projects in terms of risk and value with a comprehensive heat map, or else zoom out with a business initiative overview. With the help of a kanban view, project leaders can easily compare projects and make informed decisions about resource and investment prioritization across their organization's entire AI portfolio.
Avoid AI and Analytics Chaos
As companies and organizations invest more heavily in data analytics and continue to increase their AI maturity, the need for governance on AI and analytics projects will only become more pressing. Like a finely tuned watch, the best data operations will comprise many moving parts, some more visible and some less. So it’s essential that teams find the data platforms that best enable them to gain a comprehensive and reliable command over their processes, ensuring that all stakeholders can help scale projects that minimize risk and maximize value.