With the advent of the Anthropocene, the physical territory is subject to dramatic transformation and ecological degradation due to human action. Regular and detailed maps are required for us to find our way in this new historical period.
The Dataiku AI Lab sat down with the French National Institute of Geographic and Forest Information (Institut national de l'information géographique et forestière, or IGN) as part of its ML Research, In Practice series to understand how the organization is leveraging deep learning techniques and what challenges it faces in land cover mapping at scale.
The Importance of Land Cover Mapping
To make and inform policies about how we as humans live in this changing world, it’s critical to have descriptions of the land. This includes land cover, which is a physical description of the land: is it buildings, vegetation (and if so, what kind), or road? It also includes land use, which describes the purpose of that land, whether forestry, residential, industrial, etc.
More specifically, the IGN is concerned with a concept they call “artificialization,” which is a change in land cover and use that can result in ecosystem loss. For example, think about vegetated land that is converted to urban use, like a road. France has specific goals around “zero net artificialization,” meaning the country needs a way to track both land cover and use nationally, as well as any change in cover and use over time. That’s where IGN comes in.
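To make the idea of tracking a change in land cover concrete, here is a minimal sketch, not IGN's actual method. It assumes two aligned land cover maps of the same area, stored as small grids of class names. The class names, the `ARTIFICIAL` set, and both function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical set of classes treated as "artificial" (an assumption for this sketch).
ARTIFICIAL = {"building", "road"}

def transition_counts(before, after):
    """Count pixels for each (old_class, new_class) pair across two aligned maps."""
    if len(before) != len(after) or any(len(a) != len(b) for a, b in zip(before, after)):
        raise ValueError("maps must have the same shape")
    counts = Counter()
    for row_before, row_after in zip(before, after):
        counts.update(zip(row_before, row_after))
    return counts

def net_artificialized_pixels(counts):
    """Pixels that became artificial minus pixels that left the artificial classes."""
    gained = sum(n for (old, new), n in counts.items()
                 if old not in ARTIFICIAL and new in ARTIFICIAL)
    restored = sum(n for (old, new), n in counts.items()
                   if old in ARTIFICIAL and new not in ARTIFICIAL)
    return gained - restored

# Toy maps (made-up data): the same 2x3 area at an earlier and a later date.
earlier = [
    ["vegetation", "vegetation", "road"],
    ["vegetation", "building", "road"],
]
later = [
    ["vegetation", "road", "road"],
    ["building", "building", "vegetation"],
]

counts = transition_counts(earlier, later)
net = net_artificialized_pixels(counts)
print(dict(counts))
print("net artificialized pixels:", net)
```

In this toy example two vegetation pixels become artificial and one road pixel returns to vegetation, so the net change is one pixel. A real pipeline would also convert pixel counts to surface area and account for classification errors in both maps.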
Their goal is to produce a national land cover and land use database that is compatible with the images of the territory that IGN acquires every three years. Notably, such a database already existed in some form, but mostly at a regional level, meaning it was not exhaustive enough to inform policy at a national level.
The Role of Machine Learning
One might ask: if much of the database already exists from previous years, and there is new imagery every three years, why doesn’t the team train models to predict land use and cover directly from imagery? Unfortunately, this naive approach (as the team discovered) does not work. One of the main problems is that the land cover objects in the existing data are too generic, so small objects are not well represented.
Since IGN could not rely on historical data to build useful models in this setting, they focused the use of AI on a narrower task: semantic segmentation for land cover mapping. At the algorithmic level, the team explains, there’s not really anything very fancy. Most of the effort goes into the data, more specifically into building relevant and sufficient training data so that the models generalize well enough at the national scale.
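For readers unfamiliar with semantic segmentation, the sketch below illustrates only its final step: turning per-pixel class scores into a land cover map. It is an illustration under stated assumptions, not IGN's pipeline. In practice the scores would come from a trained deep learning model. Here they are made-up numbers, and the class list and function names are hypothetical.

```python
# Hypothetical class list; real land cover nomenclatures have more classes.
CLASSES = ["building", "vegetation", "road"]

def label_map(scores):
    """Pick the highest-scoring class index for every pixel.

    scores[r][c] is a list holding one score per class for pixel (r, c).
    """
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in scores]

def class_fractions(labels, n_classes):
    """Share of pixels assigned to each class."""
    total = sum(len(row) for row in labels)
    counts = [0] * n_classes
    for row in labels:
        for k in row:
            counts[k] += 1
    return [c / total for c in counts]

# Made-up scores for a 2x2 image tile (stand-in for a model's output).
scores = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.5, 0.3], [0.1, 0.3, 0.6]],
]

labels = label_map(scores)
fractions = class_fractions(labels, len(CLASSES))
for name, share in zip(CLASSES, fractions):
    print(f"{name}: {share:.0%}")
```

Because every pixel receives a label, the result is a dense map rather than a single tag per image, which is what makes the output usable as a land cover layer.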
The team is also very focused on openness and aims to open source everything they have done, including shared datasets, predictions, and models, to gather a community around the subject and challenges of land cover mapping, not just in France but around the world.
Keeping Humans in the Loop
Importantly, IGN is not trying to automate the whole process with AI. They still rely on a lot of information that exists in geographical databases and that is collected by many different means. Because of that, the process still involves a lot of human intervention, from providing training data to completing data and inspecting the final quality of the product.
Their underlying mission is to identify where deep learning methods are most useful and integrate them into a broader technical and social system for the periodic production of national fine-scale land cover maps that will, quite possibly, shape the policies that decide the fate of our world.