Stanford University's prestigious AI Index Report highlights Skoltech research

June 05, 2024

A team of scientists from Skoltech, AIRI Institute, MEPhI National Research Nuclear University, and foreign universities presented a study that was included in the prestigious Stanford University AI Index Report, which summarizes development trends in artificial intelligence in 2023. The article introduces Skoltech3D, a large dataset that can help improve methods for obtaining three-dimensional models of real-world objects. Stanford experts pointed out that the dataset is qualitatively different from existing solutions. It is the first large-scale collection of data from sensors of various types, including cameras and depth cameras (lidars), with a high-precision reference.

The existing methods for 3D reconstruction of objects, according to the authors of the study, are mainly based on one type of data: either from cameras or depth cameras. As both these types have their drawbacks, scientists aimed to create a more comprehensive solution.

“There were no algorithms that could effectively utilize both types of data because there was no dataset that allowed these algorithms to be developed and tested. We have collected such a dataset. It is the first of its kind, containing combinations of data from different sensors and having highly accurate data that can be used as a reference when testing algorithms. When collecting it, we focused on the types of object surfaces that cause difficulties for modern reconstruction methods: glossy or uniformly colored surfaces, as well as translucent materials,” said Oleg Voynov, the lead author of the study and a research engineer at Skoltech Applied AI.

Examples of 3D reconstruction made with modern methods. Areas that were reconstructed with errors or were not reconstructed at all are shown in black or as white gaps. Source: Multi-sensor large-scale dataset for multi-view 3D reconstruction.

The Stanford University report highlighted the work in the second chapter, which examines a wide range of artificial intelligence capabilities — from language processing to reinforcement learning. The authors of the report stressed that the new dataset contains 1.4 million images taken from 100 different angles with 14 types of lighting.

“The key technology for automating dataset collection is the use of a collaborative robot with 6 degrees of freedom. The robot positioned the cameras in space with an accuracy of 0.1 mm and generated 100 camera angles for each object. We have also developed a lighting system that provides 14 modes of operation. The movement of the manipulator, shooting with multimodal cameras, and lighting were synchronized and controlled from the server. The process of digitizing 3D objects can now be fully automated with these technologies, and in the future, they will aid in the expansion of the Skoltech3D three-dimensional image dataset,” said Dzmitry Tsetserukou, an associate professor and the head of the Skoltech Intellectual Space Robotics Laboratory.
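As a rough illustration of the capture process described above, the sketch below enumerates one object's shooting schedule from the two figures given in the article: 100 camera angles and 14 lighting modes. The function name `capture_schedule` and the ordering (all lighting modes are cycled before the robot moves to the next angle) are assumptions made for this example, not details of the team's actual control software.

```python
from itertools import product

NUM_ANGLES = 100    # camera angles per object (figure from the article)
NUM_LIGHTING = 14   # lighting modes (figure from the article)


def capture_schedule(num_angles=NUM_ANGLES, num_lighting=NUM_LIGHTING):
    """Return (angle_index, lighting_index) pairs for one object.

    Hypothetical ordering: the lighting index changes fastest, so a
    robot following this schedule would visit each angle only once.
    """
    return list(product(range(num_angles), range(num_lighting)))


schedule = capture_schedule()
# Number of shots per object for a single sensor: 100 * 14
print(len(schedule))
```

Under these assumptions, each sensor records 1,400 frames per object, one for every combination of camera angle and lighting mode.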

3D reconstruction of objects is popular in many areas today. For example, a team of scientists led by the director of the AI Center, Professor Evgeny Burnaev, together with the Russian State Historical Museum took part in a project to create 3D digital copies of the most interesting objects in St. Basil’s Cathedral. The project aimed to make a digital 3D model of the cathedral to organize virtual exhibitions.

“In addition to preserving cultural and historical heritage, 3D reconstruction holds significant implications in diverse fields, including medicine. It can help make a three-dimensional model of an organ and plan an operation on the patient based on this model rather than on photographs with markings. Such methods are also popular in business and on marketplaces. Instead of a photo of the product, you can show its 3D model, which you can rotate and examine. A potential buyer can even walk around the model of their future apartment,” added Oleg Voynov.

As the authors noted, the team continues to work on using this unique dataset to develop new, comprehensive 3D reconstruction methods.