Anyone getting into a Tesla or using a navigation app such as Sygic will know that augmented reality has been incorporated for some time as part of navigation aids. It is not just these companies: increasingly across the automotive and other sectors, augmented reality and computer vision have become critical to enhancing how we move through given spaces.
We can think of augmented reality (AR) as an interactive experience that enhances objects and scenes from the real world with computer-generated perceptual information. Using AR for navigation means the technology must understand the physical environment and add virtual components that follow the rules of physics, so that the objects displayed on a screen are visually comprehensible. Computer vision often supports this enhancement through techniques such as segmentation, which detect and identify pedestrians, buildings, and other objects needed to inform navigation devices.
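As a rough illustration, the snippet below runs an off-the-shelf semantic segmentation model from torchvision over a single camera frame and lists the object classes it finds. The model choice, input file name, and class set are assumptions made for the sketch; production navigation systems use their own models and labels.

```python
# Sketch: label navigation-relevant objects in one camera frame with a
# pretrained semantic segmentation model (torchvision's DeepLabV3).
import torch
from torchvision.io import read_image
from torchvision.models.segmentation import (
    deeplabv3_resnet50,
    DeepLabV3_ResNet50_Weights,
)

weights = DeepLabV3_ResNet50_Weights.DEFAULT
model = deeplabv3_resnet50(weights=weights).eval()
preprocess = weights.transforms()

frame = read_image("dashcam_frame.jpg")   # hypothetical input frame
batch = preprocess(frame).unsqueeze(0)

with torch.no_grad():
    logits = model(batch)["out"][0]       # shape: (num_classes, H, W)

mask = logits.argmax(0)                   # per-pixel class index
categories = weights.meta["categories"]

# Classes present in the frame -- "person" here would prompt a caution overlay.
print(sorted({categories[i] for i in mask.unique().tolist()}))
```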
The field of AR and computer vision is changing rapidly and, as Tory Smith from Mapbox discusses in a MapScaping podcast episode, there are many other ways AR and computer vision could be used in navigation.
What makes AR and computer vision attractive is that simple equipment can provide those navigating a space with real benefits. Ordinary cameras, which often come as standard in automobiles, could be used to provide AR experiences.
With lidar, Internet of Things (IoT) devices, and other sensors now built into vehicles, automobiles have access to even more data that can enhance the navigation experience. We can think of computer vision and AR working together like this: computer vision identifies the relevant objects and surfaces, including where the horizon and landscape features lie in a given scene, and AR then visualizes that information on navigation devices by combining it with GPS and camera data.
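One small, concrete piece of that scene understanding is locating the horizon. The function below estimates which image row the horizon falls on from the IMU-reported camera pitch, assuming an idealized pinhole camera with level roll; the numbers in the example are illustrative only.

```python
# Sketch: estimate the horizon's image row from camera pitch so overlays
# can respect the ground/sky split. Assumes a pinhole camera and zero roll.
import math

def horizon_row(pitch_deg: float, image_height_px: int, vfov_deg: float) -> int:
    # Focal length in pixels, derived from the vertical field of view.
    f = (image_height_px / 2) / math.tan(math.radians(vfov_deg / 2))
    # Pitching up (+) pushes the horizon below image center (rows grow downward).
    offset = f * math.tan(math.radians(pitch_deg))
    return round(image_height_px / 2 + offset)

# A camera pitched 5 degrees down, 1080 px tall, 60 degree vertical FOV:
print(horizon_row(-5.0, 1080, 60.0))  # ~row 458, above the image center
```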
While AR provides the visually enhanced object, GPS is required to render that object in the right place in a scene. The data used to create and display AR objects include latitude, longitude, elevation, pitch, roll, and yaw. The last three describe a camera's orientation in three-dimensional space, which inertial measurement units (IMUs) track to determine camera positioning and object location as movement occurs. In short, these six elements are required for AR to render a visually accurate object in the correct location.[1]
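Those six quantities can be thought of as a single camera pose. The sketch below bundles them into one structure and derives the direction the camera is looking; it is schematic only (real AR frameworks such as ARKit or ARCore manage this state internally), and the sample coordinates are arbitrary.

```python
# Sketch: the six pose elements named above, plus the view direction they
# imply in a local east-north-up frame. Schematic, not a real AR SDK API.
import math
from dataclasses import dataclass

@dataclass
class CameraPose:
    lat: float        # latitude, degrees
    lon: float        # longitude, degrees
    elevation: float  # meters
    pitch: float      # degrees, tilt up (+) / down (-)
    roll: float       # degrees, rotation about the view axis
    yaw: float        # degrees, heading clockwise from north

    def forward_vector(self) -> tuple[float, float, float]:
        """Unit vector the camera looks along (east, north, up)."""
        p, y = math.radians(self.pitch), math.radians(self.yaw)
        return (math.cos(p) * math.sin(y),
                math.cos(p) * math.cos(y),
                math.sin(p))

pose = CameraPose(51.5007, -0.1246, 35.0, pitch=-2.0, roll=0.0, yaw=90.0)
print(pose.forward_vector())  # looking roughly due east, tilted slightly down
```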
In AR for navigation, there are typically two problems to solve. First, determine where someone is relative to the Earth, which means using GPS and sensor data with IMUs to establish the location and trajectory of a user in motion. Second, that data are then used to work out where given objects are relative to the navigation user.
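The second step is largely spherical geometry. Assuming the user's position and heading come from GPS and the IMU, the sketch below computes how far away a point of interest is and how many degrees it sits left or right of straight ahead; the coordinates and heading are made up for illustration.

```python
# Sketch: locate a point of interest relative to the user from a GPS fix
# and an IMU heading, using standard haversine/bearing formulas.
import math

EARTH_RADIUS_M = 6_371_000

def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (m) and initial bearing (deg) from point 1 to 2."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    dist = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    return dist, math.degrees(math.atan2(y, x)) % 360

dist, bearing = distance_and_bearing(40.7580, -73.9855, 40.7614, -73.9776)
heading = 45.0                                    # user's IMU-derived heading
relative = (bearing - heading + 180) % 360 - 180  # negative = left, positive = right
print(f"{dist:.0f} m away, {relative:+.0f} degrees from straight ahead")
```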
Computer vision is mainly applied to object identification, which provides contextual information as you navigate a space. For instance, a cyclist can be identified as you drive, warning you to be careful along a given road. Recent navigation tools use the interface to present relevant information about the environment you are in, including the objects identified and the context of a given location. Augmenting a location with more information helps navigators better understand a space and can enhance safety, which is also why such technologies are used in autonomous vehicles.
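Turning identified objects into useful context can be as simple as filtering detections against a safety rule. The sketch below is a made-up example: the detection format, class names, and distance threshold are all assumptions for illustration, not any vendor's actual interface.

```python
# Sketch: convert raw detections into driver-facing caution messages.
from dataclasses import dataclass

VULNERABLE = {"pedestrian", "cyclist"}   # classes that warrant extra care
CAUTION_DISTANCE_M = 30.0                # illustrative alert threshold

@dataclass
class Detection:
    label: str
    distance_m: float  # e.g. estimated from stereo depth or lidar

def caution_messages(detections: list[Detection]) -> list[str]:
    return [
        f"Caution: {d.label} {d.distance_m:.0f} m ahead"
        for d in detections
        if d.label in VULNERABLE and d.distance_m <= CAUTION_DISTANCE_M
    ]

print(caution_messages([Detection("cyclist", 18.0), Detection("building", 40.0)]))
# ['Caution: cyclist 18 m ahead']
```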
Going forward, Smith sees potential hardware challenges as AR becomes more common for navigation. In particular, most AR currently uses screens to display information; ideally, a more immersive experience would not require a screen at all.
We can expect AR to become not only more common in navigation but also more useful as we navigate. In the future, we might use AR to determine which turns to make when, for instance, an area is not yet clearly visible.
Other uses include determining whether parking spaces are available before arriving at a location, and what type of parking there might be. Conveniently, many of these applications may not require specialist information or technologies, although moving beyond touch screens could be the next big hardware change.
For pedestrians, we might expect in the near future to simply point a phone at a scene, highlight a path, and see what lies ahead before walking it. In fact, GPS devices may not even be critical, at least for pedestrian uses: data available online, such as photographs, combined with onboard cameras may be sufficient for a navigation tool to determine a location, render AR objects, and provide enhanced data as you walk through a space.
Reference
[1] For more on AR and computer vision in navigation, see: Sergiyenko, O., Flores-Fuentes, W., & Mercorelli, P. (2020). Machine Vision and Navigation (affiliate link). Springer, Cham.
Disclaimer: This site contains affiliate links to products. When you buy something through our retail links, we earn an affiliate commission.