Despite tremendous progress in learning patterns from huge swaths of data, something has been amiss. One such thing is the data representation used to describe driving scenarios. Driving data, especially camera data, carries rich information such as road geometry. While using raw pixel values, as in RGB channels, has helped, it does not capture the structural and relational information that we as humans use to make driving judgments: for example, too short an inter-vehicle distance, or a pedestrian at risk of being run over. A big step towards solving this problem is scene-graphs.
Current methods for autonomous mobility are not enough
The most common methods for building and evaluating safe mobility systems fall short of delivering adequate safety guarantees for any autonomous mobility solution. These techniques are either tedious (for example, collecting driving data) or cannot guarantee completeness (for example, scenario-based validation).
Getting to some specifics of the problem
Let’s get into some of the specifics of the problem.
- Tedious statistical validation: Statistical validation techniques, which rely on collecting millions of miles of real driving footage, are impossibly tedious. Based on a study, to achieve 95% confidence, a fleet of about 100 vehicles would have to be driven for 225 years.
- Inadequate representation: Many ML-based representation techniques, such as CNNs and MLPs, are inadequate for developing a human-like understanding of high-level scenario information, such as explicitly capturing inter-object relationships and the road state.
- Incomplete scenario-based validation: Scenario-based validation techniques are handy but have problems of their own around completeness over an operational design domain (ODD), such as
- identifying the relevant scenarios for a given ODD, especially the long-tailed ones,
- handling variations of relevant scenarios between ODDs.
Requirements for an ideal representation for scenarios
This blog post is not about solving the above-mentioned problems but about a data representation that gives us the best chance of understanding and fixing them. Let’s first explore the characteristics such a representation has to have:
- an abstraction that encapsulates the state of a traffic scene and, over time, a scenario. It should contain
- spatial information: a representation of the objects in the scene, for example, pedestrians, cars, road structure, traffic sign state, etc.,
- relational information: a representation of the relationships between the spatial objects and states, such as inter-object distances and directions,
- temporal information: a representation of how the spatial and relational states evolve over time, for example, a dangerously close cut-in or a jaywalking pedestrian at risk of being run over.
- Flexibility, to have the best chance of representing infinitely diverse and arbitrarily complex driving situations,
- Attributes for all three types of state information (spatial, relational, and temporal) enumerated above,
- Accounting for the domain knowledge of AV and ADAS applications.
Scene-graphs are the answer
It turns out that an ideal data structure that fits the bill is the scene-graph. Scene-graphs are a kind of knowledge graph in which nodes are the objects in a scene and edges are the relationships between those objects. A lot of the motivation comes from the vision-and-language world, where graph-based scene representations are relatively mature and power techniques such as image captioning and visual question answering.
Scene-graphs can help us build the right kind of scenario description framework. Such a framework has many benefits, such as:
- Explicit representation of the important spatial, relational, and temporal state of a driving scenario,
- Support for different abstraction levels in the form of a hierarchy, including the Pegasus-style hierarchical arrangement of scenarios (abstract, functional, and concrete),
- Modular scenarios that can be reused wherever they are applicable,
- Compatibility with graph neural networks, another powerful new tool.
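To make the idea concrete, here is a minimal sketch of a scene-graph built from plain Python data structures. All node names, relations, and attributes below are illustrative assumptions, not Rydesafely’s actual schema; they show how the spatial, relational, and temporal state described above can live in one structure.

```python
# A minimal sketch of a scene-graph for a single traffic scene, using plain
# Python data structures. Names, relations, and attributes are illustrative
# assumptions, not an actual production schema.

def build_scene_graph():
    """Nodes are objects in the scene; edges are relationships between them."""
    return {
        # Spatial information: objects in the scene and their attributes.
        "nodes": {
            "ego":    {"kind": "car", "speed_mps": 13.0},
            "car_1":  {"kind": "car", "speed_mps": 12.0},
            "ped_1":  {"kind": "pedestrian"},
            "lane_1": {"kind": "lane"},
        },
        # Relational information: directed edges with their own attributes.
        "edges": [
            {"src": "ego",   "rel": "following", "dst": "car_1",  "distance_m": 8.0},
            {"src": "ego",   "rel": "in_lane",   "dst": "lane_1"},
            {"src": "ped_1", "rel": "near",      "dst": "lane_1", "distance_m": 1.5},
        ],
    }

# Temporal information: a scenario is a sequence of per-frame scene-graphs.
scenario = [build_scene_graph() for _ in range(3)]

# Simple relational reasoning, e.g. flagging a dangerously short following gap.
def too_close(graph, threshold_m=10.0):
    return [
        (e["src"], e["dst"]) for e in graph["edges"]
        if e["rel"] == "following" and e.get("distance_m", float("inf")) < threshold_m
    ]

print(too_close(scenario[0]))  # → [('ego', 'car_1')]
```

In practice, a graph library or graph database would replace the raw dicts, but the core point survives: queries like `too_close` operate on explicit relationships rather than on raw pixels.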
Rydesafely’s Driving Reasoning Engine and a scene-graph-based scenario description methodology
Rydesafely extensively uses scene-graphs to represent driving data through its Driving Reasoning Engine. Scene-graphs allow complex reasoning over driving insights to generate tests, identify useful scenarios, and fulfill the requirements of the intended use-cases. Support for ontologies is motivated by the W3C Web Ontology Language (OWL) standard. Furthermore, the engine supports hierarchical abstraction, including the abstract, functional, and concrete scenario levels prescribed by the Pegasus project.
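As an illustration of the Pegasus-style hierarchy, one hypothetical way to express a single cut-in scenario at the three abstraction levels is sketched below. Every field name and value is an illustrative assumption, not Rydesafely’s or Pegasus’s actual format.

```python
# Hypothetical sketch of one scenario at the three Pegasus abstraction levels.
# Field names and values are illustrative assumptions.

# Abstract: a natural-language description, no parameters at all.
abstract = "A vehicle cuts in closely ahead of the ego vehicle."

# Functional: actors, road, and maneuver are fixed; parameters are named
# but left as open ranges rather than bound to values.
functional = {
    "actors": ["ego", "cut_in_vehicle"],
    "road": "two-lane highway",
    "maneuver": "lane change into ego lane",
    "parameters": ["gap_at_cut_in_m", "ego_speed_mps"],  # names only, no values
}

# Concrete: every parameter bound to a specific value, ready for simulation.
concrete = {**functional,
            "parameters": {"gap_at_cut_in_m": 8.0, "ego_speed_mps": 25.0}}

print(concrete["parameters"]["gap_at_cut_in_m"])  # → 8.0
```

The value of the hierarchy is that one abstract scenario can fan out into many functional variants, and each functional variant into many concrete test cases.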
Representing scenarios is a critical problem that cannot be neglected in building a robust AV, ADAS, or any autonomous mobility solution. Scene-graphs are a great starting point for creating the right kind of scenario description language.
Kalra, Nidhi, and Susan M. Paddock. “Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability?” Transportation Research Part A: Policy and Practice 94 (2016): 182-193.
Winner, Hermann, Karsten Lemmer, Thomas Form, and Jens Mazzega. “Pegasus—first steps for the safe introduction of automated driving.” In Road Vehicle Automation 5, pp. 185-195. Springer, Cham, 2019.
McGuinness, Deborah L., and Frank Van Harmelen. “OWL web ontology language overview.” W3C Recommendation 10, no. 10 (2004): 2004.
This post was originally published on Rydesafely’s blog.