
Tesla's humanoid robot can carry packages!

Musk showed the prototype of Tesla's Optimus robot. Judging from the live stream, this first public appearance really was a prototype: wires were exposed all over its body, with no covering at all. Even so, the robot could walk on its own, wave to the audience, and even dance.

Musk said that Optimus can actually do far more than what could be shown on stage. In video played at the event, Optimus could not only walk around but also carry items and water plants.

In a factory setting, the robot can pick a long object off a workbench and place it neatly into a box holding identical objects. In a rendering from the robot's point of view, it distinguishes objects in the real world by color: the elongated object it holds is rendered purple, the workbench yellow, and so on.

Tesla then wheeled out a sleeker version of Optimus, which looks somewhat like the model shown at last year's AI Day, with a more human-like appearance and more degrees of freedom.

In use it can do more: its fingers move freely, it can operate many tools, its right hand can grip implements, and it can even take on repetitive factory work.

Still, Musk noted that over the past year the robot team has worked seven days a week, ten-plus hours a day. Despite all that effort, the robot remains at an early stage, and Tesla expects it to do much better in the future.

According to the presentation, the full Tesla humanoid robot Optimus weighs 73 kg, draws about 100 W of electrical power while sitting still and about 500 W while walking briskly. Its body has 28 degrees of freedom and each hand 11 (Tesla's slide contrasted this with the human body, which has more than 200 degrees of freedom, 27 of them in the hand).

Musk also previewed cost and production details. Other humanoid robots exist on the market today, he said, but their cost is very high. Optimus is designed for low-cost mass production: Musk expects future output to reach several million units, at a cost of probably less than US$20,000 (about 140,000 yuan).

Musk is clearly very optimistic about the product. Robots, he said, can cut labor costs and grow the economy to the point where poverty could be eliminated: people could freely choose their type of work, physical labor would no longer be a necessity, and humans could devote themselves more to intellectual work.

Self-driving cars matter, he argued, because they can improve transport capacity by an order of magnitude; but robots can lower economic costs more broadly and make social development more dynamic. Musk hopes robots can help humanity in a safer way.


At the event, Musk did not forget to make a recruiting pitch: "The purpose of our event is to attract more AI talent to join us and make better products."

After Musk's brief introduction, Tesla's design team walked through the Optimus design.

A Tesla robotics lead then covered the robot's specific progress. At last year's AI Day, Tesla gave only a brief introduction to the robot; since then the design has gone through three iterations, culminating in the version shown now.

The core sensor on Optimus is the camera, similar to those used in Tesla's FSD system. Tesla is currently collecting large amounts of data to train the robot.

Optimus's power system is integrated into its torso. This design borrows from Tesla's vehicles: it reduces wiring harnesses by concentrating power distribution and compute in the center of the torso. The torso houses a 2.3 kWh battery pack, enough to run the robot for a full day on a single charge.

The robot runs on a single Tesla-designed SoC and supports LTE (4G) connectivity. Unlike the chips in the cars, however, the robot must react quickly based on multiple sensory inputs and communications, so it also includes radio connectivity, audio support, and the safety features needed to protect both the robot's body and the humans around it.

For actuation, Optimus again draws on the powertrain work from Tesla's electric cars. Tesla first analyzed which motions the robot needs to perform — walking, climbing stairs, and so on. By analyzing the dynamics of the robot's walking, engineers derived the time, energy consumption, and trajectory each motion requires, and designed the joints and actuators around those data.

On safety, Tesla has also put thought into the design. To protect the robot, researchers optimized its structure so that the transmission and arms are not damaged when the humanoid falls; after all, repairing a robot is expensive.

The developers used the same underlying simulation technology as for the cars, which lets them model the stresses across all components and makes the robot's walking control easier to tune and less rigid.

Take the knee as an example: the humanoid robot's knee was designed to mimic the structure of a real human knee.

Researchers linearized the forces the human knee experiences during motion to learn how to build a robotic knee that needs less force, achieves better force control, and wraps the associated mechanism tightly around the knee joint.

There are six types of actuators, including rotary motors and linear units. Tesla also showed a video in which a single linear actuator hoisted a piano.

Another focus of the humanoid robot is the hand. Tesla wants Optimus's hands to be as dexterous as a human's: able to grasp and manipulate objects, with sensors to feel them.

Optimus's hand is also biologically inspired. With six actuators, it moves with 11 degrees of freedom, can carry a 20-pound bag, and can operate tools or pick up small objects.

Tesla's engineers quipped on stage that a car is just a wheeled robot — to some extent, Optimus is simply a car stood upright.

For perception during movement, Optimus uses the same neural network as Tesla's electric cars — the "occupancy network" — to identify traversable areas.

For walking, the software perceives and analyzes the external environment, draws a movement trajectory, plans a foothold for each step along that trajectory, and then has the actuators execute the plan.
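The "trajectory first, then footholds" step can be illustrated with a toy planner. This is a minimal sketch under assumed parameters (fixed stance width, waypoints pre-spaced one stride apart), not Tesla's implementation:

```python
# Minimal footstep-placement sketch (illustrative only, not Tesla's planner):
# alternate left/right footholds along the planned trajectory.

def footholds(path, stance_width=0.2):
    """path: (x, y) waypoints spaced one stride apart along the trajectory."""
    steps = []
    for i, (x, y) in enumerate(path):
        side = 1 if i % 2 == 0 else -1      # even steps: left foot; odd: right
        steps.append((x, y + side * stance_width / 2))
    return steps

print(footholds([(0.0, 0.0), (0.4, 0.0), (0.8, 0.0)]))
# [(0.0, 0.1), (0.4, -0.1), (0.8, 0.1)]
```

A real planner would also adapt stride length and foot orientation to terrain; this sketch only shows how footholds are derived from the trajectory rather than planned directly.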

A key requirement for a humanoid robot is staying upright and not falling over easily. How? Through its sensors and its perception of the outside world, the robot adjusts its own joint torques to keep its balance when disturbed by external forces.
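The torque-correction idea can be sketched with a toy feedback loop. This is a minimal illustration under simplifying assumptions (a single lean angle, no gravity term, made-up gains and inertia), not Tesla's actual controller:

```python
# Toy balance loop (illustrative assumptions, not Tesla's controller): a PD law
# drives the sensed lean angle back toward upright after an external push.

def pd_torque(angle, angular_velocity, kp=80.0, kd=12.0):
    """Corrective torque proportional to the lean angle and its rate of change."""
    return -kp * angle - kd * angular_velocity

def simulate_push(angle=0.2, velocity=0.0, inertia=5.0, dt=0.01, steps=400):
    """Integrate a simple rigid-body lean (semi-implicit Euler) after a push."""
    for _ in range(steps):
        torque = pd_torque(angle, velocity)
        velocity += (torque / inertia) * dt
        angle += velocity * dt
    return angle

print(abs(simulate_push()) < 0.01)  # the lean has decayed back near upright
```

A real humanoid balances across many joints and contact forces at once; the loop above only shows the core pattern of torque correction driven by sensed state.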

For grasping, Tesla first collects trajectory data from human grasping motions, then maps it onto the robot so that it can perform similar grasps.

Going forward, Tesla hopes to make Optimus more agile and to move well beyond the prototype, improving it across the board with better navigation and mobility.

▲ Tesla Optimus imitates motions captured from real human grasping demonstrations

02.

Pushing autonomous driving: the capability to launch FSD globally by year-end

On autonomous driving, Tesla first reviewed the state of FSD. In 2021 the FSD test program had 2,000 participating customers; by 2022 it had expanded to 160,000. Tesla has so far accumulated 4.8 million data clips, trained 75,000 neural network models, and shipped 35 FSD version updates on that basis.

Tesla's technical architecture works like this: an automated labeling system labels the collected data; the labeled data is processed and used to train neural networks; and the trained AI models are deployed to the FSD computer, which computes the perception of the external environment and plans the vehicle's driving behavior.

For the technology demo, Tesla first showed an unprotected left turn. When pedestrians and other vehicles are present during the turn, Tesla computes the most suitable driving trajectory after accounting for the predicted trajectories of the different traffic participants.

Specifically, Tesla uses a technique it calls interaction search.

It starts from visual perception: detect the traffic participants, infer their likely trajectories, generate several candidate strategies, and finally select the best driving trajectory.
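The "generate candidates, score them, pick the best" step can be sketched as a cost minimization. The cost terms, weights, and data layout below are illustrative assumptions, not Tesla's actual formulation:

```python
# Hedged sketch of candidate selection in an interaction-search-style planner.
# All cost terms and weights here are made up for illustration.

def score(candidate, agent_probs):
    """Lower is better: penalize collision risk, jerk, and goal deviation."""
    collision = sum(o * p for o, p in zip(candidate["overlap"], agent_probs))
    return 2.0 * collision + 1.0 * candidate["jerk"] + 0.5 * candidate["goal_dev"]

def select_trajectory(candidates, agent_probs):
    """Pick the candidate trajectory with the lowest total cost."""
    return min(candidates, key=lambda c: score(c, agent_probs))

candidates = [
    {"name": "yield", "overlap": [0.0, 0.1], "jerk": 0.2, "goal_dev": 1.0},
    {"name": "go",    "overlap": [0.6, 0.0], "jerk": 0.1, "goal_dev": 0.2},
]
best = select_trajectory(candidates, agent_probs=[0.9, 0.5])
print(best["name"])  # "yield": crossing in front of a likely agent costs too much
```

This also makes the scaling problem concrete: each extra traffic participant adds predicted trajectories to score every candidate against, so compute grows with the number of targets.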

Note that as the number of external targets grows, so does the required computation.

Tesla perceives the outside world with surround-view cameras, builds a 3D model of the environment, and uses the occupancy network to find drivable areas and locate obstacles.

At runtime, the first step is to calibrate the camera images, fuse them into a 3D space, extract the data, feed it into the neural network, and construct spatial features with the corresponding algorithms.

One problem remains: a 3D space alone, without precise positions for the various objects, is still not enough for path planning. Tesla's approach is to compute position data by analyzing key features.

Tesla's fleet has amassed a huge amount of video footage from its daily journeys. Training a neural network takes on the order of 1.4 billion frames and about 100,000 GPU-hours (one GPU working for one hour), so the training workload is enormous.
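To put those figures in perspective, a back-of-envelope calculation using only the numbers quoted above:

```python
# Back-of-envelope check on the training figures quoted above.
frames = 1_400_000_000
gpu_hours = 100_000

print(frames / gpu_hours)               # 14000.0 frames processed per GPU-hour
print(round(gpu_hours / 1000 / 24, 2))  # ≈ 4.17 days of wall clock on 1,000 GPUs
```

Even spread across a thousand GPUs, a single training run ties up the cluster for days, which is the motivation for dedicated training hardware discussed next.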

This is where supercomputers and AI accelerators come in, and why Tesla developed the Dojo supercomputer, which can speed up network training by 30%.

Regarding the behavior prediction of other traffic participants, Tesla also introduced its own approach.

Camera images first enter a RegNet network; the processed features then go into a Transformer model. The combined networks may have on the order of a billion parameters and are jointly optimized. The goal is to maximize compute utilization while minimizing latency.

Cars will generate a large amount of data during operation, and these data also need to be labeled.

For labeling, Tesla first tried manual annotation, but it was slow and labor-intensive. It then tried outsourcing to vendors, but in the end neither the turnaround nor the quality was good enough; Tesla needs labeling that is both highly efficient and scalable.

Tesla's current approach combines human and machine annotation, but overall the machine is more efficient: what a machine finishes in 30 minutes can take a human much longer. That is why Tesla is building an automatic labeling system.

With efficient annotation, spatiotemporal clips from the real world can be turned into usable training data, making FSD smarter and more efficient.

The auto-labeling pipeline itself also needs curation. Tesla did not invest much in this area before, but many engineers now work on it.

In addition, a very important part of autonomous driving is the simulation system, which improves the vehicle's ability to handle long-tail scenarios.

Tesla built a scene generator that can produce a scene in as little as five minutes — about 1,000 times faster than before. It can scan real objects and project them into the simulation, and it simulates traffic lights, stop signs, and more, staying as close to the real world as possible.

▲ Simulation of Tesla Optimus's walking gait

This is very important for training.

Through the data engine, the neural networks become more robust, bringing more certainty and resolving real-world ambiguity. For example, when turning at an intersection, the car must determine whether a vehicle stopped across the lane is parked or merely moving slowly; only by building more networks to evaluate such cases can this kind of scenario be solved.

Tesla's current data set — part fed back from the fleet, part generated in simulation — makes it easier to judge such scenes.

As for the rollout of FSD Beta, Tesla expects to have the technical capability to launch FSD globally by the end of this year. Outside North America, however, it must still work with regulators, and in some countries and regions regulation lags behind.

03.

Dojo’s continuous iteration is promoting Tesla’s development

In the previous introduction to robots and autonomous driving, Tesla engineers have mentioned the Dojo supercomputing platform many times.

At last year's inaugural Tesla AI Day, Tesla showed off its first AI training chip, the Dojo D1, and ExaPOD, a complete Dojo cluster built from that chip, designed to support the enormous video-processing demands of training on footage from its vehicles.

Currently, Tesla already has a large supercomputer based on Nvidia GPUs and a data center that stores 30PB of video material.

Tesla also showed a set of photos over the past two years, from the delivery of a custom cold liquid distribution unit (CDU) to the installation of the first integrated Dojo cabinet, to the load testing of the 2.2MW unit.

Tesla has been optimizing the scalability of the Dojo design and overcoming challenges with a fail-fast, trial-and-error mentality. The Dojo accelerator features a single scalable compute plane, globally addressable fast memory, and uniform high bandwidth with low latency.

Tesla's engineers specifically highlighted the voltage regulator module, which combines high performance, high current density (0.86 A/mm²), and complex integration.

The voltage regulator module has gone through 14 versions in 24 months.

The coefficient of thermal expansion (CTE) matters here, so Tesla worked with suppliers on the power solution: CTE was reduced by more than 50%, and Dojo now performs three times as fast as the initial version.

At the event, the Dojo team showed an image of a Cybertruck driving on Mars, generated by Stable Diffusion running on Dojo.

According to the presentation, just 4 Dojo cabinets can replace 72 GPU racks containing 4,000 GPUs, and Dojo can compress work that normally takes months into a week.

Tesla's self-developed D1 chip is at the heart of this. Built on TSMC's 7 nm process, it packs 50 billion transistors into an area of 645 mm². Its BF16/CFP8 compute reaches 362 TFLOPS, its FP32 compute reaches 22.6 TFLOPS, and its TDP (thermal design power) is 400 W.
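Two quick sanity checks on those D1 figures (simple arithmetic on the numbers quoted above, nothing more):

```python
# Derived D1 metrics from the quoted specs (illustrative arithmetic only).
transistors = 50_000_000_000
area_mm2 = 645
bf16_tflops = 362
tdp_w = 400

print(round(transistors / area_mm2 / 1e6, 1))  # ≈ 77.5 million transistors/mm²
print(round(bf16_tflops / tdp_w, 3))           # ≈ 0.905 BF16 TFLOPS per watt
```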

Based on the D1, Tesla built a wafer-level system solution: using TSMC's InFO_SoW packaging, 25 D1 dies are integrated into a single training tile. Each Dojo training tile consumes 15 kW and integrates compute, I/O, power, and liquid-cooling modules.

The Dojo System Tray features high-speed connectivity and dense integration: at just 75 mm tall, it can support 135 kg. Its peak BF16/CFP8 compute reaches 54 PFLOPS, with power consumption above 100 kW.

The Dojo interface processor is a PCIe card with high bandwidth memory that utilizes Tesla’s own TTP interface.

The Tesla Transport Protocol (TTP) can also bridge to standard Ethernet: TTPOE converts standard Ethernet traffic to the Z-plane topology, which offers very high connectivity.

Since last year's Tesla AI Day, Dojo development has hit a series of milestones, including installation of the first Dojo cabinet and the 2.2 MW load test. Tesla is now building one tile per day.

Tesla also announced that its first ExaPOD is expected to be completed in the first quarter of 2023, with plans to build a total of seven ExaPODs in Palo Alto.

According to the presentation, at 10 cabinets per system, the Dojo ExaPOD cluster breaks into exascale computing.

Its peak BF16/CFP8 compute reaches 1.1 EFLOPS (over a quintillion floating-point operations per second), alongside 1.3 TB of high-speed SRAM and 13 TB of high-bandwidth DRAM.
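A sanity check shows the ExaPOD-level figure lines up with the per-chip D1 figure quoted earlier. (The 120-tiles-per-ExaPOD breakdown below is Tesla's separately stated composition, not from this article's text.)

```python
# Does 1.1 EFLOPS per ExaPOD match the per-chip D1 number? (Tesla has described
# an ExaPOD as 120 training tiles of 25 D1 dies each, i.e. ~3,000 chips.)
exapod_tflops = 1.1e6   # 1.1 EFLOPS expressed in TFLOPS
d1_tflops = 362         # per-chip BF16/CFP8 peak from the D1 section above

print(round(exapod_tflops / d1_tflops))  # ≈ 3039, close to 120 × 25 = 3,000 dies
```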

04.
Conclusion: Tesla is not just a car company

In the eyes of much of the general public, Tesla is the global leader in electric vehicles: the first company to promote EVs at scale, and a great car company.

But in Tesla's own view, "car company" is not its ultimate identity; it positions itself as a hard-core technology company. Hence its efforts — and real results — in autonomous driving, AI, robotics, and even supercomputing.

In order to achieve these achievements, Tesla has also made a lot of efforts internally, fully respecting talented employees and jointly creating valuable products.

What do you think?
