Have you ever wondered how in the world self-driving cars do it? Is there someone remotely operating them? How can a car sense a stop sign? How does it react to bicyclists? I have never trusted the concept of a self-operated machine because I never understood the systems behind it. This is why I chose to look into where the answers to my questions come from – one amazing technology, decades in the making, called computer vision.
Computer vision is the science of creating systems that can identify, sense motion from, and react to visual stimuli. It is making its way into a variety of fields, from self-driving vehicles to robotics to selfie filters on Snapchat. The technology behind computer vision is essentially an attempt to simulate the human brain's ability to see and comprehend objects. This is incredibly complex, as are most tech projects that attempt to recreate or improve on a function of the human brain, but research in this field has made huge strides toward everyday use of computer vision.
The first step in the process of successful computer vision was recreating the human eye. As Devin Coldewey describes, “With larger, more optically perfect lenses and semiconductor subpixels fabricated at nanometer scales, the precision and sensitivity of modern cameras is nothing short of incredible” (Coldewey, 2016). These cameras can record the thousands of images they are exposed to in mere seconds, as well as accurately detect distances (Coldewey, 2016). This technology sat on standby until recently because computation systems were not advanced enough to make use of it. The second step was to ensure that computers could describe what they were seeing through shape recognition, which called for computers to differentiate shapes from one another and then give each shape a name (Coldewey, 2016).
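The naming half of that shape-recognition step can be illustrated with a toy sketch. This is not how a real vision system works – production systems (OpenCV's contour analysis, for instance) operate on raw pixels – but it shows the basic idea of mapping a differentiated shape to a name. The function and vertex lists here are hypothetical, invented purely for illustration:

```python
# Toy illustration of the "name each shape" step: given a shape's
# outline as a list of (x, y) corner points, count the corners and
# map that count to a name. Real systems extract these outlines
# from pixel data first; here we hand-write them.

def name_shape(vertices):
    """Name a polygon by its number of corners (a deliberate simplification)."""
    names = {3: "triangle", 4: "quadrilateral", 5: "pentagon", 6: "hexagon"}
    return names.get(len(vertices), "polygon")

triangle = [(0, 0), (4, 0), (2, 3)]
square = [(0, 0), (2, 0), (2, 2), (0, 2)]

print(name_shape(triangle))  # triangle
print(name_shape(square))    # quadrilateral
```

The hard part in practice, of course, is not the naming but reliably extracting those corner points from noisy camera images.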
The next step in the process was to give shapes meaning. This is possible through the use of training data: the computers learn from a bank of data that “has to be first annotated by hand, which is prohibitively expensive for all but the highest-demand applications” (Hardesty, 2016). Researchers at the University of Washington are working on a system called YOLO (You Only Look Once) that can identify shapes, track motion, and even determine the breed of a dog in a given picture. In his TED Talk, Joseph Redmon explains his research on YOLO and the Darknet project. The system his team has developed can identify and analyze objects in real time with over 99% accuracy.
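The core idea behind YOLO's name – looking at the image only once – can be sketched in a few lines. In a real detector, a single neural-network pass over the whole image produces, for every cell of a grid laid over the image, a confidence score and a class prediction, and only confident cells are kept. The grid values below are hand-written stand-ins for that network output (an assumption for illustration, not real model predictions):

```python
# Toy sketch of YOLO's single-pass idea: divide the image into a grid,
# predict (confidence, class) for every cell in one pass, then keep
# only the cells whose confidence clears a threshold.

# Fake 3x3 grid of (confidence, class) predictions, standing in for
# the output of one forward pass of a detection network.
predictions = [
    [(0.02, "dog"), (0.91, "dog"),     (0.05, "cat")],
    [(0.10, "car"), (0.88, "bicycle"), (0.03, "dog")],
    [(0.01, "cat"), (0.07, "car"),     (0.95, "stop sign")],
]

def detect(grid, threshold=0.5):
    """Return (row, col, label) for every cell above the confidence threshold."""
    return [
        (r, c, label)
        for r, row in enumerate(grid)
        for c, (conf, label) in enumerate(row)
        if conf >= threshold
    ]

for r, c, label in detect(predictions):
    print(f"cell ({r},{c}): {label}")
```

Because all of this happens in one pass rather than by re-scanning the image region by region, the approach is fast enough for the real-time detection Redmon demonstrates.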
Throughout his presentation, Redmon reiterates that this kind of technology is relevant not only in the car industry, but also in robotics and the healthcare sector. Researchers are working on a system that can identify cancer cells through computer vision – one example of how computer vision has penetrated, and will continue to penetrate, numerous fields. The British Machine Vision Association and Society for Pattern Recognition lists a number of applications for computer vision, including:
- Artificial Intelligence
- Augmented Reality
- Industrial Quality Inspection
- Medical Image Analysis
- Pollution Monitoring
- Security and Surveillance
It may be years before computer vision is fully established in the fields above, but the list still emphasizes the dramatic impact this technological advancement could have. For now, most of the research is geared toward self-driving vehicles. For example, researchers at MIT are developing depth sensors so sensitive to scattered light that they may be the solution for self-driving cars that currently cannot operate in fog (Hardesty, 2017).
The potential for computer vision is essentially limitless. The advancements that can be made in the fields of robotics, medicine, and transportation are not only exciting, but have the potential to be disruptive enough to change the way we live.
By Rachel Brady
British Machine Vision Association. (n.d.). “What is computer vision?” BMVA. Retrieved from http://www.bmva.org/visionoverview.
Hardesty, L. (2017, December 21). “New depth sensors could be sensitive enough for self-driving cars.” MIT News. Retrieved from http://news.mit.edu/2017/new-depth-sensors-could-be-sensitive-enough-self-driving-cars-1222.
Hardesty, L. (2016, December 2). “Computer learns to recognize sounds by watching video.” MIT News. Retrieved from http://news.mit.edu/2016/computer-learns-recognize-sounds-video-1202.
Coldewey, D. (2016, November 13). “WTF is computer vision?” TechCrunch. Retrieved from https://techcrunch.com/2016/11/13/wtf-is-computer-vision/.