During Tesla AI Day the public gets a glimpse of the huge investments Tesla has made in human resources, software and hardware. To solve a problem as complex as Tesla’s, they need to invest heavily and push the entire AI field forward. Not everyone has these resources, but other companies can draw inspiration from Tesla’s strategies and apply them to their own challenges, albeit with more tangible solutions.
Brains – Bold decisions in the modelling phase
A major redesign of their machine learning models includes a multi-camera neural network that makes predictions directly in vector space. Previous models made predictions in image space for each camera, and the results were then transformed to vector space in post-processing.
This holistic approach allows the model to simultaneously make predictions on views from all cameras, similar to how the human brain uses vision from both eyes when making decisions.
Figure 1: “It’s basically night and day, you can actually drive on this,” says Andrej Karpathy, while also highlighting the huge engineering effort required to get this right.
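To make the older image-space pipeline more concrete, here is a minimal sketch of the kind of post-processing step that maps a detected pixel to a position on the road. It is not Tesla's code. It assumes an idealized pinhole camera mounted level at a known height above perfectly flat ground, and every name and default value (`pixel_to_ground`, `focal_px`, `cam_height_m`, and so on) is hypothetical.

```python
from typing import Optional, Tuple


def pixel_to_ground(
    u: float,
    v: float,
    focal_px: float = 1000.0,   # assumed focal length in pixels
    cx: float = 640.0,          # assumed principal point (image centre)
    cy: float = 360.0,
    cam_height_m: float = 1.5,  # assumed camera height above the road
) -> Optional[Tuple[float, float]]:
    """Map an image pixel to (forward, lateral) metres on flat ground.

    Camera frame: x right, y down, z forward. Returns None for pixels at or
    above the horizon, whose viewing rays never hit the ground.
    """
    ray_x = (u - cx) / focal_px
    ray_y = (v - cy) / focal_px
    if ray_y <= 0.0:
        return None
    # Scale the ray so that it drops exactly cam_height_m below the camera.
    scale = cam_height_m / ray_y
    # The ray's z component is 1, so the forward distance equals the scale.
    return scale, scale * ray_x


if __name__ == "__main__":
    print(pixel_to_ground(640, 460))  # a pixel below the horizon
    print(pixel_to_ground(640, 300))  # a pixel above the horizon
```

In this simplified model the estimated distance grows rapidly as the pixel approaches the horizon, so a small error in image space can become a large error on the road. That sensitivity gives some intuition for why predicting directly in vector space is attractive.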
Tesla also explains how a lack of memory was a problem for their previous AI models, citing issues such as temporary occlusion. Redesigning the models to incorporate so-called hidden states, which can memorize information from previous frames in the video, improved accuracy.
Figure 2: Andrej Karpathy demonstrating how using the context in video reduces false detections due to occlusions.
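The idea of carrying information across frames can be illustrated with a toy example. The sketch below is a hand-written stand-in for a learned recurrent network, not Tesla's architecture: a single number acts as the hidden state, it is pulled towards each new detection score, and it fades slowly when the object is occluded. The function name and the `gain` and `decay` parameters are assumptions made for illustration.

```python
from typing import List, Optional


def track_with_memory(
    observations: List[Optional[float]],
    gain: float = 0.6,   # assumed weight given to a fresh observation
    decay: float = 0.9,  # assumed fraction of belief kept per occluded frame
) -> List[float]:
    """Return a per-frame belief that an object is present.

    Each observation is a detection score in [0, 1], or None when the
    object is occluded in that frame.
    """
    hidden = 0.0  # the "memory" carried from frame to frame
    beliefs = []
    for obs in observations:
        if obs is None:
            # No evidence this frame: keep most of what we remembered.
            hidden = decay * hidden
        else:
            # Blend the remembered belief with the new evidence.
            hidden = (1.0 - gain) * hidden + gain * obs
        beliefs.append(hidden)
    return beliefs


if __name__ == "__main__":
    # Object seen twice, hidden for two frames, then seen again.
    frames = [1.0, 1.0, None, None, 1.0]
    for index, belief in enumerate(track_with_memory(frames)):
        print(index, round(belief, 3))
```

A model without memory would have nothing to report during the two occluded frames, whereas here the belief stays well above zero until the object reappears.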
Muscles – Huge investments in data collection, labeling infrastructure and digital twins
After working with third-party labelers, Tesla now favors hiring professional labelers who are co-located with machine learning engineers. This allows Tesla to further develop and improve annotation software, as showcased in their 3D annotation tool.
Figure 3: Tesla’s 3D annotation tool, capable of simultaneously annotating multiple camera views by making use of data captured from different cars.
Data is collected from the huge participating fleet of cars in real-world traffic. Data can be collected on demand based on queries specified by the engineers, allowing specific hard cases to be improved within weeks rather than months. Offline analytics allow Tesla to automatically label much of their data, while still allowing human intervention where necessary.
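Query-driven collection can be pictured as filtering clips by a predicate. The sketch below is purely illustrative and says nothing about how Tesla's fleet tooling works: the `Clip` fields, the tag names and the `collect` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Clip:
    """A recorded video clip with hypothetical metadata."""
    clip_id: str
    tags: FrozenSet[str]
    speed_kmh: float


# A query is any function that decides whether a clip is interesting.
Query = Callable[[Clip], bool]


def collect(clips: Iterable[Clip], query: Query) -> List[Clip]:
    """Return the clips that match the engineer's query."""
    return [clip for clip in clips if query(clip)]


def occluded_pedestrian_at_speed(clip: Clip) -> bool:
    """Example query: a hard case worth gathering more data for."""
    return {"pedestrian", "occlusion"} <= clip.tags and clip.speed_kmh > 30.0


if __name__ == "__main__":
    fleet_clips = [
        Clip("a", frozenset({"pedestrian", "occlusion"}), 42.0),
        Clip("b", frozenset({"pedestrian"}), 50.0),
        Clip("c", frozenset({"pedestrian", "occlusion"}), 12.0),
    ]
    for clip in collect(fleet_clips, occluded_pedestrian_at_speed):
        print(clip.clip_id)
```

Keeping the query as a plain function means a new hard case only requires writing a new predicate, which is one way the turnaround from problem to fresh training data could be kept short.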
For traffic scenarios not easily captured from their fleet, Tesla uses simulation software to create realistic scenarios. These simulations model sensor noise, motion blur and other optical distortions.
Figures 4 & 5: Realistic simulation allows Tesla to render scenarios with perfect labels, saving annotation effort. In this simulated intersection the not-yet-released Tesla Cybertruck can be spotted.
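As a rough illustration of what adding camera imperfections to a rendered image involves, the sketch below degrades a single row of pixel intensities. It is a simplified assumption rather than a description of Tesla's simulator: motion blur is approximated with a moving average along the row, and sensor noise with Gaussian noise. The function names and parameters are hypothetical.

```python
import random
from typing import List


def motion_blur(row: List[float], length: int = 3) -> List[float]:
    """Average each pixel with its neighbours along the motion direction.

    `length` is the blur width in pixels and must be a positive odd number.
    Pixels past the border are replaced by the nearest edge pixel.
    """
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd number")
    half = length // 2
    last = len(row) - 1
    blurred = []
    for i in range(len(row)):
        window = [row[min(max(j, 0), last)] for j in range(i - half, i + half + 1)]
        blurred.append(sum(window) / length)
    return blurred


def add_sensor_noise(row: List[float], sigma: float = 5.0, seed: int = 0) -> List[float]:
    """Add zero-mean Gaussian noise and clamp the result to [0, 255]."""
    rng = random.Random(seed)  # seeded so that a scenario is reproducible
    return [min(255.0, max(0.0, value + rng.gauss(0.0, sigma))) for value in row]


if __name__ == "__main__":
    clean_edge = [0.0, 0.0, 0.0, 255.0, 255.0, 255.0]  # a sharp rendered edge
    blurred = motion_blur(clean_edge, length=3)
    print([round(value) for value in blurred])
    print([round(value) for value in add_sensor_noise(blurred)])
```

Because the clean rendering is known before it is degraded, the labels stay exact even though the resulting image looks like imperfect camera output.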
Speed – Computing power on the edge and in the Dojo
Making use of these efforts requires a huge amount of compute power. To provide it, Tesla has developed its own scalable, distributed processing architecture under Project Dojo. The Dojo supercomputer will feature their in-house designed D1 chip.
More info can be found here
Inspired? We can help you with your next step!
Whether you are beginning your AI journey or already have models in production, you can turn to Combine for expert help. Feel free to check out our blog posts about how to use virtual modeling for AI:
Road friction estimation using AI and Digital Twin Simulation
Large scale annotation using crowd-sourced data annotation and image classification using deep learning