by Anna Leida Mölder (NTNU)
What is it that motivates researchers to advance knowledge in a particular field? What is it that makes an engineer develop a tool for one specific purpose, but not another? And how do these choices of today affect the future?
In the era of artificial intelligence (AI), one of the most researched topics is how to teach computers to see, and to implement this computer vision in a range of applications, such as drones, microscopes and cars. At the same time we have at least 2.2 billion people in the world suffering from some type of vision impairment [L1].
Many technological advances have been made to help people with visual impairment. Voice navigation is available for most map functions, and a number of mobile phone applications exist, such as iMove around [L2] and RightHear [L3]. But just as with vision technology in cars, knowing where to go and how to get there is not enough. Users must also respond to changes that appear dynamically in their surroundings. In self-driving cars, intelligent systems based on a combination of sensors, such as LIDAR, cameras and time-of-flight measurements, have been implemented to prevent the vehicle from running into people, animals or other cars. Depth-sensing cameras are being further developed and used recreationally by some of the world’s largest corporations. But for people with vision impairments, the go-to aid for depth sensing is still a guide dog or a white cane.
While there is nothing wrong with a guide dog, which is probably cheaper, more accurate and a better companion than any self-driving car, how can we explain the strong bias of technological development? A search on Google Scholar for “vision impairment LIDAR” turns up 7,220 hits. A search for “car LIDAR” turns up 126,000, and a search for “self-driving car” 2,760,000. This represents a more than 10-fold difference (roughly 17-fold for the two LIDAR searches) in the quantity of research being undertaken on the same technology for two very different purposes.
Image recognition and identification are also increasingly accessible on mobile devices, thanks to the construction of less complex AI models, such as MobileNet (Howard et al., 2017). It is easy to find applications that classify anything from flowers to car number plates and cat and dog breeds, using images captured with a smartphone. The same technology could be used to help people with vision impairments identify specific items in their surroundings, things that the seeing community takes for granted. Imagine being able to shop for food and actually find the Fair Trade logo on the package, read the “best before” date, or figure out whether the product contains the lactose you cannot tolerate, without having to ask for assistance every time.
QR codes, although originally developed for tracking parts in manufacturing, are well suited to people with vision impairments. They are easy to access via a handheld device, which can in turn be configured to deliver the output by whatever means the user chooses, as audio or on screen. They can even be used to help customers make phone calls or send messages to a support function or store owner, a great help for any person with vision impairments entering a store to do their shopping. However, very few retailers place QR codes on their products for any purpose other than promotion or discount management. Blind people often use Braille tags to identify items of clothing in their wardrobe. Similarly, retailers could use existing technology to help people with vision impairments and make the store accessible to all customers.
In this era of computer vision, facilitated by improved mobile device resources, assisted vision for many humans is lagging. As AI encroaches on our daily lives, it is important that we consider who we include in this new technological era, and how. Many applications will never be developed for the simple reason that researchers, engineers and developers are not spending their time working on them right now, today. Are we all really doing the best we can be doing, every day?
[L1] https://www.who.int/news-room/fact-sheets/detail/blindness-and-visual-impairment (8 Oct 2019)
A. G. Howard et al.: “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications”, 2017. https://arxiv.org/abs/1704.04861
Anna Leida Mölder, NTNU, Norway