Specialists from the University of Washington and Google have developed a system that takes a YouTube recording of a football match and generates a three-dimensional model of it for viewing in augmented reality. Trained on data from a FIFA video game, the algorithm predicts a depth map for each player and recreates the match over time. The one exception is the ball: the developers have left ball tracking for future work.
In their article, the scientists noted that other techniques for reconstructing matches in 3D exist, but all of them require installing a large number of synchronized cameras at the venue, each transmitting high-definition video.
The system is based on a convolutional neural network. The developers compiled a data set of 12,000 "image and depth map" pairs from the FIFA game and trained the network to learn the relationship between the two types of data. When processing a video of a match, the algorithm detects each player and builds a model of the player's skeleton and movement. Then, relying on what it learned from the training data, it predicts a depth map for each individual image and recreates the game in virtual space. Using Microsoft HoloLens augmented reality glasses, the developers were able to place the 3D model on the surface of a table.
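To illustrate why a depth map is enough to place a player in 3D space, here is a minimal sketch that converts a depth map into 3D points. It assumes a simple pinhole camera model; the function name `backproject_depth` and the camera parameters `fx`, `fy`, `cx`, `cy` are hypothetical and are not taken from the developers' system, whose actual reconstruction step the article does not describe.

```python
def backproject_depth(depth, fx, fy, cx, cy):
    """Turn a depth map into a list of 3D points (x, y, z).

    depth  -- 2D list of per-pixel distances along the camera axis
    fx, fy -- focal lengths in pixels (assumed known, hypothetical values)
    cx, cy -- principal point in pixels (assumed known, hypothetical values)
    """
    points = []
    for v, row in enumerate(depth):        # v is the pixel row
        for u, z in enumerate(row):        # u is the pixel column
            # Pinhole model: a pixel offset from the principal point,
            # scaled by depth over focal length, gives the 3D offset.
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# A tiny 2x3 depth map in which every pixel is 2 units from the camera.
depth_map = [[2.0, 2.0, 2.0],
             [2.0, 2.0, 2.0]]
cloud = backproject_depth(depth_map, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
```

Each pixel with a predicted depth becomes one point, so a per-player depth map yields a small point cloud that can be positioned on the virtual pitch.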
The developers have not yet worked on optimization, so video processing currently requires a lot of computing power. They tested the algorithm on a desktop computer with a Core i7 processor, 32 GB of RAM and a 6 GB GTX 1080 video card. Under these conditions, analyzing each 4K frame took about 15 seconds.
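To put the figure of 15 seconds per frame in perspective, the short calculation below estimates how long a full match would take to process. The match length of 90 minutes and the frame rate of 30 frames per second are assumptions made for illustration; the article states only the per-frame time.

```python
SECONDS_PER_FRAME = 15      # per-frame analysis time reported in the article
SECONDS_PER_DAY = 86400


def processing_days(match_minutes, fps, seconds_per_frame=SECONDS_PER_FRAME):
    """Estimate the processing time in days for a video of the given length."""
    frames = match_minutes * 60 * fps
    return frames * seconds_per_frame / SECONDS_PER_DAY


# Assumed values: a 90-minute match recorded at 30 frames per second.
days = processing_days(match_minutes=90, fps=30)
print(days)  # 28.125
```

Under these assumptions, a single match would need roughly four weeks of computation on the test machine, which shows why the lack of optimization matters.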
Strong motion blur caused by fast movement, as well as low-resolution recordings, can degrade the quality of the model. In addition, the scientists noted that they considered only the horizontal movement of players, so the algorithm cannot handle jumping.
In early June 2018, a group of scientists working on the Face2Face project presented a system that recreates human movements in virtual reality in real time, including turns of the head, body and neck and even facial expressions. It can transfer these movements to a model of the user or of another person.