AI is expected to take over a growing range of tasks in the future, including the classification of objects. Computer vision is concerned, among other things, with teaching AI-based systems to “see”. One of our software developers and AI engineers offers some insight into the topic.
The term computer vision is largely self-explanatory: “The goal is to give systems the ability to capture, interpret and understand visual data and images in a similar way to human perception,” explains the expert. This involves using algorithms to recognise patterns and features in images. “However, it works differently than with humans. We visually grasp an object as a whole with our eyes, usually unconsciously and effortlessly, because we can place it in context through learned knowledge. In contrast, a machine recognises an object only by means of numerical values. In concrete terms, this means that we humans perceive images by processing the light that falls on the retina as a signal. Machines, on the other hand, capture images with the help of sensors and store them in the form of pixels. Each pixel consists of one or more numerical values (usually a red, a green and a blue value), which can be evaluated by the computer vision system.”
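To make the idea of "images as numerical values" concrete, here is a minimal sketch in plain Python. The 2x2 image and the helper names `brightness` and `to_grayscale` are illustrative assumptions, not part of any particular computer vision library.

```python
# A tiny 2x2 RGB image: each pixel is a (red, green, blue) triple
# with values between 0 and 255. The values are made up for illustration.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 0)],
]


def brightness(pixel):
    """Average of the three channel values, a crude measure of brightness."""
    r, g, b = pixel
    return (r + g + b) / 3


def to_grayscale(img):
    """Replace every RGB triple with a single brightness value."""
    return [[brightness(p) for p in row] for row in img]


gray = to_grayscale(image)
for row in gray:
    print(row)
```

Everything a computer vision system later does, from edge detection to classification, starts from grids of numbers like these.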
Methods & Techniques in Computer Vision
But how does a system recognise an object? Recognising an object in an image can be achieved through various methods, including image recognition, face recognition, image segmentation, motion detection and 3D reconstruction. “Since images are considered unstructured data, features need to be defined and programmed, unlike with structured data such as data in tabular form. With classical computer vision methods, this is very time-consuming because it is essentially a manual process. But as machine learning (ML) becomes more and more advanced, computer vision is also becoming more feasible, because feature generation has become much easier with ML,” says the MVI PROPLANT colleague.
Different techniques and algorithms are used to enable computer vision, including deep neural networks. Convolutional neural networks (CNNs) are designed specifically for image data. “These networks have different feature levels and work their way from the concrete spatial perception of an object to the abstract technical data level.” In practice, the early layers respond to simple patterns such as edges and colour transitions, while deeper layers combine them into increasingly abstract features.
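The core operation of a CNN can be sketched in a few lines of plain Python. In this example a small filter slides over a grayscale image and produces a feature map. The image values and the vertical-edge filter are hand-picked assumptions for illustration; in a trained CNN the filter weights would be learned from data.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most CNN implementations, the kernel is not flipped,
    so strictly speaking this is a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[y + i][x + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


# Grayscale 3x6 image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# Hand-picked filter that responds to dark-to-bright transitions.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 27, 27, 0]]
```

The feature map is zero in the uniform regions and large where the window covers the edge, which is the kind of simple, low-level feature that the first layers of a CNN typically pick up.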
Using the cat image as an example (see header image), this means that colour features such as fur and eye colour must be represented in the system in all their variants. Unlike a human, a cat can have yellow or reddish eyes. So if the object’s eyes are yellow, the AI can rule out the class ‘human’. “And that is only one feature for object recognition. A computer vision dataset must contain thousands of entries for it to make meaningful evaluations. We could hardly do that without machine learning. So high-quality datasets have to be fed into ML so that a system can complete the training that enables computer vision.”
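The eye-colour reasoning above can be written down as a single hand-crafted feature rule. This is only a sketch of the classical, manual approach: the function names and the RGB thresholds are invented for illustration, and a real system would need many such features or would learn them from data.

```python
def is_yellowish(pixel):
    """Hypothetical hand-written rule: strong red and green, weak blue."""
    r, g, b = pixel
    return r > 180 and g > 150 and b < 100


def candidate_classes(eye_pixel, classes=("cat", "human")):
    """Drop 'human' from the candidates when the eye colour looks yellow."""
    if is_yellowish(eye_pixel):
        return [c for c in classes if c != "human"]
    return list(classes)


print(candidate_classes((230, 200, 40)))  # yellow eye -> ['cat']
print(candidate_classes((90, 60, 40)))    # brown eye  -> ['cat', 'human']
```

Writing and tuning rules like this for every feature and every variant is what makes the classical approach so time-consuming, and it is the work that machine learning largely automates.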

Possible applications in industry
Computer vision systems are used in the manufacturing industry, for example, to help ensure compliance with safety standards. “The systems can detect whether safety clothing is being worn in accordance with regulations in order to minimise possible safety risks in advance. In addition, such systems can classify the severity of accidents and take the necessary steps to help affected persons, for example by switching off certain machines or informing third parties,” he explains.
Current market developments show that new architectures are emerging that can reduce the amount of training data needed. “The reusability of already learned features of existing neural networks also plays a role here, as images are hierarchical and you can usually adapt the first layers with the simple object features for other systems or project requirements. There is a lot of potential in using computer vision systems for different use cases.” Robotic applications in production and even autonomous driving rely on object and image recognition that can be realised through computer vision. The AI engineers at MVI in Berlin are working on this topic and are driving development in this field.
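Reusing already learned features is commonly done by copying the first layers of an existing network, keeping them fixed, and training only a new final layer for the new task. The following is a conceptual sketch in plain Python rather than code for a specific deep learning framework; the `Layer` class, the layer names, the weights and the gradients are all made-up assumptions.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    weights: list
    trainable: bool = True


def reuse_for_new_task(pretrained, n_frozen, new_head):
    """Copy the first n_frozen layers as fixed feature extractors
    and attach a new, trainable head for the new task."""
    reused = [
        Layer(layer.name, list(layer.weights), trainable=False)
        for layer in pretrained[:n_frozen]
    ]
    return reused + [new_head]


def sgd_step(model, grads, lr=0.1):
    """Gradient step that updates only the layers still marked trainable."""
    for layer, grad in zip(model, grads):
        if layer.trainable:
            layer.weights = [w - lr * g for w, g in zip(layer.weights, grad)]


# Hypothetical network trained on a different task.
pretrained = [
    Layer("edges", [1.0, -1.0]),
    Layer("textures", [0.5, 0.5]),
    Layer("cat_vs_dog_head", [0.2, 0.8]),
]

# Reuse the two early layers; train a new head, e.g. for safety-helmet detection.
model = reuse_for_new_task(
    pretrained, n_frozen=2, new_head=Layer("helmet_head", [0.0, 0.0])
)

# Made-up gradients, one list per layer.
sgd_step(model, grads=[[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]])

for layer in model:
    print(layer.name, layer.weights, layer.trainable)
```

Because only the small new head is updated, far fewer labelled images are typically needed than when training the whole network from scratch.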