Vision technology is becoming increasingly commonplace as companies compete to develop and release solutions that push the boundaries of what machines can do. Colleges and universities are following suit, offering courses and modules that specialize in vision. The field requires a substantial understanding of logic, algorithms, machine learning, and computer science, and it is gaining popularity as an area of study. You may hear this burgeoning technology referred to as machine vision, or you may hear the term computer vision – so, what’s the difference? Both refer to machines “seeing” and interpreting visual data using specially programmed cameras and sensors. Some institutions use the two terms interchangeably, but as vision technology and its associated industries continue to grow rapidly, the contextual differences are becoming more pronounced.
In industry, machine vision most often refers to procedural applications of vision technology, such as robotic, manufacturing, or mechanical systems that include a vision component. One example is robot-guided vision, where a robot uses a camera to determine the offset required to pick up a part whose position or orientation is not consistent, such as when parts travel along an assembly line facing different directions. In such a case, the machine compares the input it receives against the images and code it has been taught, determines the orientation of the part, and picks it up correctly.
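As a rough illustration of the pick-up offset described above, the sketch below estimates a part's position and orientation from a binary image (1 = part, 0 = background) using image moments, then compares the result with a taught pose. The function names `part_pose` and `pick_offset`, the mask format, and the use of moments are assumptions made for this example; production systems typically rely on calibrated cameras and vendor pattern-matching tools rather than anything this simple.

```python
import math

def part_pose(mask):
    """Return (cx, cy, angle_deg) for the foreground pixels of a binary mask.

    The angle is the principal-axis direction computed from second-order
    central moments, measured in image coordinates (y grows downward).
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        raise ValueError("no part found in image")
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pts) / n
    mu02 = sum((y - cy) ** 2 for _, y in pts) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pts) / n
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return cx, cy, math.degrees(angle)

def pick_offset(mask, taught_pose):
    """Return (dx, dy, dtheta_deg) between the observed part and a taught pose."""
    cx, cy, angle = part_pose(mask)
    tx, ty, t_angle = taught_pose
    # A principal axis is only defined modulo 180 degrees, so wrap to [-90, 90).
    dtheta = (angle - t_angle + 90) % 180 - 90
    return cx - tx, cy - ty, dtheta

# Hypothetical 5x5 image: a short horizontal bar, shifted from the taught spot.
observed = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
dx, dy, dtheta = pick_offset(observed, taught_pose=(2.0, 2.0, 0.0))
print(dx, dy, dtheta)
```

The robot controller would then translate and rotate its gripper by the returned offset before picking. Note that a principal axis cannot tell a part from the same part rotated by 180 degrees, so a real system would need an additional feature to resolve that ambiguity.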
Another common application is defect inspection, in which the machine interprets the visual input and identifies defects, such as flaws on the edge of a part. It compares the input it receives against input it has been taught is correct and determines whether there is a match. This type of application is particularly useful in manufacturing, warehousing, or assembly scenarios where work is carried out by machinery such as robots, because the machine itself can detect errors and reject the part rather than relying on manual labour to perform checks.
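The comparison against a taught reference can be sketched in a few lines. This is a minimal illustration, assuming grayscale images stored as lists of pixel rows and a single known-good "golden" image. The names `mismatch_fraction` and `inspect`, and both threshold values, are hypothetical and chosen only for the example.

```python
def mismatch_fraction(image, reference, pixel_tol=30):
    """Fraction of pixels that differ from the reference by more than pixel_tol."""
    if len(image) != len(reference) or any(
        len(a) != len(b) for a, b in zip(image, reference)
    ):
        raise ValueError("image and reference must have the same shape")
    total = sum(len(row) for row in reference)
    if total == 0:
        raise ValueError("reference image is empty")
    differing = sum(
        abs(p - q) > pixel_tol
        for img_row, ref_row in zip(image, reference)
        for p, q in zip(img_row, ref_row)
    )
    return differing / total

def inspect(image, reference, max_mismatch=0.02):
    """Accept the part if it matches the taught reference closely enough."""
    if mismatch_fraction(image, reference) <= max_mismatch:
        return "accept"
    return "reject"

# Hypothetical 4x4 reference image of a good part (uniformly bright).
golden = [[200] * 4 for _ in range(4)]

# Same part with a dark chip on its right-hand edge.
chipped = [row[:] for row in golden]
chipped[0][3] = 20
chipped[1][3] = 35

print(inspect(golden, golden))   # accept
print(inspect(chipped, golden))  # reject
```

A pixel-by-pixel comparison like this only works when lighting and part placement are tightly controlled, which is typical of the industrial settings described above.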
Computer vision is a broader term describing the field of computer science that concentrates on developing techniques to identify and understand the content of images and videos. Once the machine has received visual input in the form of still images or video, it is able to interpret what it “sees” and produce an output accordingly. Some examples are facial recognition, object detection, image segmentation, and image classification. In these cases, the machine learns from a combination of historical data and programming, then applies what it has learned to new visual inputs to produce a result. For example, a system may receive video from a series of cameras and use pose tracking to provide live positional information about athletes on a soccer field.
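To make the idea of learning from historical data and applying it to new input concrete, here is a deliberately tiny sketch of image classification using a nearest-centroid rule. It is not how modern computer vision systems are built (those generally use trained neural networks). The functions `train_centroids` and `classify`, the labels, and the toy 2x2 images are all assumptions made for illustration.

```python
def flatten(image):
    """Turn a 2D grayscale image (list of rows) into a flat list of pixels."""
    return [p for row in image for p in row]

def train_centroids(labelled_images):
    """Average the pixel vectors of each label: the 'historical data' step."""
    sums, counts = {}, {}
    for label, image in labelled_images:
        vec = flatten(image)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }

def classify(image, centroids):
    """Return the label whose centroid is closest to the new image."""
    vec = flatten(image)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Hypothetical training set of 2x2 images.
history = [
    ("bright", [[220, 210], [230, 215]]),
    ("bright", [[200, 205], [190, 210]]),
    ("dark", [[20, 35], [10, 25]]),
    ("dark", [[40, 30], [35, 15]]),
]
centroids = train_centroids(history)
print(classify([[210, 190], [205, 220]], centroids))  # bright
```

The structure is the same as in far larger systems: a training step summarizes past examples, and a prediction step compares new visual input against that summary.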
The overlap between machine vision and computer vision continues to grow. Both refer to the process of providing a machine with historical data and code, and allowing the machine to use that information to produce a desired output. However, when employing either term, it is important to understand the contextual implications of each. Machine vision describes a controlled application, often in the industrial space, designed to isolate and extract data points from individual image features, such as the width of a machined gap or whether a clip has been installed. Computer vision, on the other hand, is often employed to extract higher-level insight from high-variance images, such as what something is, where it is in 3D space, or what action it is taking. Importantly, both machine vision and computer vision represent a significant advance in what available technology can accomplish, with the promise of stretching these capabilities even further.