KI-Absicherung: Proof of Project Concept conducted
Michael Mock, Fraunhofer IAIS, Consortium Co-Lead and Scientific Coordinator
Stephan Scholz, Volkswagen AG, Consortium Lead
Loren Schwarz, BMW AG, TP1 Lead
Thomas Stauner, BMW AG, TP2 Lead
Fabian Hüger, Volkswagen AG, TP3 Lead
Frédérik Blank, Robert Bosch GmbH, TP4 Lead
Andreas Rohatschek, Robert Bosch GmbH, TP4 Lead
Developing a stringent safety argumentation for AI-based perception functions requires a complete methodology to systematically organize the complex interplay between specifications, data and training of AI functions, safety measures and metrics, risk analysis, safety goals, and safety requirements. The project KI Absicherung has successfully conducted a "Proof of Project Concept" (PoPC) in order to define and exemplify, in a minimalistic example, the detailed technical workflow for developing such a safety argumentation.
The goal of the "Proof of Project Concept" is to give a minimalistic but complete example of all required steps in a workflow leading to a stringent safety argumentation for AI-based functions, using concise, agreed-upon notions that define and explain the relationships and dependencies between the workflow steps. The major results are an exemplary mini safety argumentation (represented in GSN, Goal Structuring Notation) and the methodology itself, which explains how to start from a safety requirement, proceed through an analysis of DNN insufficiencies and DNN-specific safety concerns, mitigate them with DNN safety measures, and measure the success with metrics that provide evidence. In addition, the "Proof of Project Concept" has been implemented at the operational level, thus serving as a blueprint for cooperation and interaction between all sub-projects of KI Absicherung.
The workflow described in Figure 1 below (KI Absicherung Approach to Safety Argumentation for AI-based Functions - Big Picture) consists of three major blocks:
- Specification and development steps of the AI-based function (left-hand side),
- Safety measures and metrics applied to specific steps (middle part),
- and a safety argumentation making use of the evidence provided by the measures and metrics (right-hand side).
Figure 1: KI Absicherung Approach to Safety Argumentation for AI-based Functions (Big Picture)
Given the specifications provided in the upper specification block (left-hand side), the safety analysis formulates safety requirements based on a risk analysis that takes the specification of the function, the architecture, and the ODD into account. The safety requirements systematically address the DNN-specific safety concerns that, in particular, lead to DNN insufficiencies. A formal assurance case argumentation in GSN (Goal Structuring Notation) is then developed that shows that the applied safety measures provide sufficient evidence, measured with quantitative and qualitative metrics, that the risks raised by the DNN insufficiencies and DNN-specific safety concerns are mitigated such that the safety requirements are fulfilled.
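To make the structure of such a GSN argumentation more tangible, the following minimal Python sketch models the three element types used here (goals, strategies, and solutions/evidence) as plain data classes. The node texts are illustrative placeholders for the PoPC argument, not the project's actual GSN nodes.

```python
from dataclasses import dataclass, field
from typing import List

# Minimal stand-ins for the GSN element types used in this article:
# goals (claims), strategies (decompositions of a goal), and
# solutions (the evidence items that close the argument).

@dataclass
class Solution:          # GSN solution: a single piece of evidence
    evidence: str

@dataclass
class Strategy:          # GSN strategy: how a goal is decomposed
    rationale: str
    subgoals: List["Goal"] = field(default_factory=list)

@dataclass
class Goal:              # GSN goal: a claim to be supported
    claim: str
    strategies: List[Strategy] = field(default_factory=list)
    solutions: List[Solution] = field(default_factory=list)

# Illustrative placeholder argument for the PoPC safety requirement.
top_goal = Goal(
    claim="The AI-based function detects when input data leaves the training data distribution",
    strategies=[
        Strategy(
            rationale="Argue over data representativeness, detector capability, "
                      "and correlation with task performance",
            subgoals=[
                Goal(claim="The ODD is sufficiently represented by the training data",
                     solutions=[Solution("coverage metrics on the training data set")]),
                Goal(claim="The auto-encoder detects leaving the training distribution",
                     solutions=[Solution("reconstruction-error statistics on out-of-ODD validation data")]),
                Goal(claim="The OOD signal correlates with pedestrian-detection performance",
                     solutions=[Solution("correlation analysis between OOD score and detection quality")]),
            ],
        )
    ],
)
```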
The PoPC was implemented with a minimalistic example filling this workflow while restricting ourselves to a single DNN insufficiency, to only one of the induced safety requirements, and to the use of only one safety measure. We are fully aware that this example does not aim at providing sufficient evidence for a complete safety argumentation that addresses all risks related to the specified AI-based function. Nevertheless, it shows how the different steps of the workflow depicted in Figure 1 interact and should serve as a template for providing the evidence still missing from a complete safety argumentation. The basic elements of the PoPC example are as follows:
- We address the DNN insufficiency "lack of generalization" and the resulting safety requirement "the AI-based function should know when it encounters data that is outside the training data distribution". To meet this safety requirement, we use one exemplary safety measure, namely an auto-encoder (we implemented and tested a variational auto-encoder and a GAN-based auto-encoder), along with suitable effectiveness metrics. The purpose of the auto-encoder is to detect when we leave the training data distribution; a minimal sketch of this detection mechanism is given after this list.
- Note that the training data distribution is not necessarily a perfect representation of the ODD (operational design domain); there can be a gap between the two. Note also that the VAE does not provide any evidence on the safety of the AI-based function within the training data distribution.
- We restrict the ODD in a few particular dimensions (e.g., lighting conditions such as brightness/darkness), making the simplistic assumption that this is exactly how we want our customer-facing functionality to be. The other dimensions of the ODD are allowed to vary within given boundaries.
- We create training data representing the ODD as well as possible and train both the AI-based function and the auto-encoder on this "within-ODD" data.
- We create validation data that leaves the ODD in exactly those dimensions that were restricted. This validation data is used to check the behavior of the VAE at the ODD boundaries. When run on this validation data together with the AI-based function, the VAE should detect the violation.
- The formal, evidence-based safety argumentation in the GSN relies on evidence from three strategies:
- The ODD is sufficiently represented by the training data
- The auto-encoder is sufficiently capable of detecting when the training distribution is left
- The detection of leaving the training distribution correlates with the performance of the AI-based function under test (pedestrian detection); see the correlation sketch after this list
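As a rough illustration of the auto-encoder safety measure referenced above, the following Python sketch trains a plain auto-encoder (a simplified stand-in for the project's variational and GAN-based variants) on within-ODD data, calibrates a threshold on held-out within-ODD samples, and flags samples whose reconstruction error exceeds that threshold as leaving the training distribution. The network size, data format, and the percentile-based threshold are illustrative assumptions, not the project's prescribed configuration.

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Plain auto-encoder over flattened image crops (stand-in for the VAE/GAN variants)."""
    def __init__(self, input_dim: int = 64 * 64 * 3, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 512), nn.ReLU(),
                                     nn.Linear(512, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(),
                                     nn.Linear(512, input_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


def reconstruction_error(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Per-sample mean squared reconstruction error, used as the OOD score."""
    with torch.no_grad():
        recon = model(x)
    return ((recon - x) ** 2).mean(dim=1)


def train(model: nn.Module, within_odd_loader, epochs: int = 10, lr: float = 1e-3) -> None:
    """Train the auto-encoder on within-ODD data only."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x in within_odd_loader:        # x: (batch, input_dim) within-ODD samples
            optimizer.zero_grad()
            loss = loss_fn(model(x), x)
            loss.backward()
            optimizer.step()


def calibrate_threshold(model: nn.Module, heldout_within_odd: torch.Tensor,
                        percentile: float = 99.0) -> float:
    """Choose the OOD threshold as a high percentile of within-ODD reconstruction errors."""
    errors = reconstruction_error(model, heldout_within_odd)
    return torch.quantile(errors, percentile / 100.0).item()


def flag_out_of_distribution(model: nn.Module, samples: torch.Tensor,
                             threshold: float) -> torch.Tensor:
    """Flag validation samples whose reconstruction error exceeds the calibrated threshold."""
    return reconstruction_error(model, samples) > threshold
```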
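The third strategy (correlation between the out-of-distribution signal and the performance of the pedestrian detector) could, for instance, be checked with a rank correlation between per-image OOD scores and per-image detection quality, as sketched below. The choice of Spearman correlation and the synthetic numbers in the usage example are illustrative assumptions, not the project's effectiveness metrics.

```python
import numpy as np
from scipy.stats import spearmanr

def correlation_evidence(ood_scores: np.ndarray, detection_quality: np.ndarray) -> float:
    """Rank correlation between per-image OOD score (e.g., reconstruction error)
    and per-image pedestrian-detection quality (e.g., recall or 1 - miss rate).

    A strongly negative correlation supports the claim that leaving the training
    distribution coincides with degraded detection performance.
    """
    rho, p_value = spearmanr(ood_scores, detection_quality)
    print(f"Spearman rho = {rho:.3f} (p = {p_value:.3g})")
    return rho

# Illustrative usage with synthetic numbers (placeholders, not project data):
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ood = rng.uniform(0.0, 1.0, size=200)                  # higher = further out of distribution
    quality = 0.9 - 0.5 * ood + rng.normal(0, 0.05, 200)   # quality degrades with OOD score
    correlation_evidence(ood, quality)
```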
The PoPC has shown the validity and applicability of the overall approach depicted in Figure 1, even though we were not able to produce all relevant evidence for the safety argumentation. Developing a stringent safety argumentation for AI-based perception functions requires a complete methodology linking all involved aspects of AI-function development and safety analysis. The PoPC of the KI Absicherung project defines and exemplifies this methodology. The methodology is currently being further developed and applied in the project to cover a multi-dimensional ODD definition and to include multiple DNN safety mechanisms, providing a safety argumentation that takes the inherently multi-factorial nature of DNN failures into account. One of the central challenges in the project will be to jointly identify and evaluate assurance methods that can serve as a basis for suitable evidence. These are to be applied in an adequate safety argumentation, thus allowing for a better assessment of the capabilities and limitations of this approach.