How Deep Learning Gives Machines Vision
When we look at a Christmas tree, our brain instantly recognises it as one. Even if it is a modern (and admittedly ugly) one built out of aluminium, we still understand that it is meant to represent a Christmas tree. We rarely stop to think about how good we are at processing visual information. That is, until we try to teach machines to do the same thing with images. Whereas we can interpret an image made up of countless pixels at a glance, a computer only sees 0s and 1s.
Herein lies the main challenge in the field known as computer vision (CV): how do you teach a computer to make sense of what is, to it, effectively a meaningless mass of numbers? How does your phone know it is you when it runs facial recognition, and how does the AI of a sorting system tell one product from another on the conveyor belt?
Let’s take a look at the traditional, ‘hand-crafted’ approach to computer vision, and how the deep learning revolution allowed us to make computer vision truly ‘intelligent’. We also highlight the power of deep learning-based CV for companies and showcase a few of the numerous deep learning-based vision applications we have developed for our customers.
Traditional Computer Vision
When artificial intelligence (AI) was first proposed as a field in the mid-1950s, researchers soon started to ask themselves how they could teach machines to understand and ‘make sense of’ the visual world. As AI in the latter part of the 20th century was fundamentally based on hard-coded rules, the same approach was used for computer vision (CV).
In this approach, the engineer tells the machine which features it should extract from an image: edges, colours, eyes and so on. If we go back to our Christmas tree, we could tell the machine to look for green shapes and classify those as Christmas trees. In effect, humans have to manually code every instruction for the machine, and the system only performs a set of predefined steps. A simple example is a system whose camera detects the position of a computer chip on a belt through edge detection, after which it tells a robotic arm to pick the chip up.
Although not intelligent by any stretch of the imagination, such CV systems work incredibly well for a specific subset of mostly manufacturing-related tasks, where every product is identical and rules can be consistently applied. However, the traditional approach cannot deal with one key problem: variability. Our algorithm would not be able to classify the aluminium tree as a Christmas tree. We would have to add a specific rule for that. The same problem applies when you try to teach a system to sort fruits or identify diseases in medical images—products of nature are inherently variable. Likewise, it is impossible to use hard-coded rules for identifying defects in a product, since no two cracks are the same.
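To make the variability problem concrete, here is a minimal sketch of the hand-coded "green shape" rule from the Christmas tree example. Everything in it is an illustrative assumption: the tiny images are made-up lists of (R, G, B) pixels, and the function names and thresholds are hypothetical rather than taken from any real system.

```python
# A hand-crafted rule: "a Christmas tree is a mostly green shape".
# Images are toy lists of (R, G, B) pixels; all thresholds are arbitrary
# illustrative choices, not values from a real vision system.

def is_green(pixel):
    """A pixel counts as green when its green channel clearly dominates."""
    r, g, b = pixel
    return g > 100 and g > 1.5 * r and g > 1.5 * b

def looks_like_christmas_tree(pixels, min_green_fraction=0.5):
    """Apply the fixed rule: at least half of the pixels must be green."""
    green = sum(1 for p in pixels if is_green(p))
    return green / len(pixels) >= min_green_fraction

# 70 foliage pixels plus 30 brown trunk pixels per toy image.
classic_tree = [(20, 140, 30)] * 70 + [(120, 80, 40)] * 30
aluminium_tree = [(190, 190, 195)] * 70 + [(120, 80, 40)] * 30

print(looks_like_christmas_tree(classic_tree))    # True
print(looks_like_christmas_tree(aluminium_tree))  # False
```

The rule does exactly what it was told and nothing more. To accept the aluminium tree, someone would have to write an extra rule for silver shapes, and then another one for the next variation.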
The Deep Learning Revolution
In the 2010s, a new form of machine learning took the world by storm: deep learning. Deep learning-based applications began to conquer fields that were previously thought too complex for AI, like the game of Go, where the number of possible board positions is vastly greater than the number of atoms in the observable universe. Yet deep learning pulled it off: in 2016, AlphaGo defeated world champion Lee Sedol 4 to 1 in a five-game match. That same year, Google Translate started to employ neural networks, vastly improving the quality of its translations. Businesses and consumers were suddenly awash with a whole raft of other applications, from speech recognition software to financial transaction anomaly detectors.
In deep learning, models take the form of deep neural networks: many stacked layers of simple computing units, loosely inspired by the neurons in the human brain. No longer do humans have to manually code all of the features beforehand. Much like us, deep learning systems learn and extract patterns by themselves from examples. Or in machine terms: from good and representative data.
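The idea of learning from examples can be sketched in a few lines. The code below trains a perceptron, a single artificial neuron, which is only a very small stand-in for a deep network with many layers. The data is invented for illustration: each "image" is a strip of four pixels where 1 marks a dark pixel, and a strip is labelled 1 when it contains a crack. The names `train_perceptron`, `predict` and `training_strips` are hypothetical.

```python
# Learning from labelled examples instead of hand-written rules.
# A perceptron (one artificial neuron) is used here as a toy stand-in
# for a deep neural network; the data is made up for illustration.

def predict(weights, bias, pixels):
    """Return 1 (crack) if the weighted sum of pixels is positive, else 0."""
    score = sum(w * p for w, p in zip(weights, pixels)) + bias
    return 1 if score > 0 else 0

def train_perceptron(examples, epochs=20):
    """Nudge the weights after every mistake (classic perceptron rule)."""
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(epochs):
        for pixels, label in examples:
            error = label - predict(weights, bias, pixels)
            weights = [w + error * p for w, p in zip(weights, pixels)]
            bias += error
    return weights, bias

# Labelled examples: 1 = dark pixel; label 1 = strip contains a crack.
training_strips = [
    ([0, 0, 0, 0], 0),
    ([1, 0, 0, 0], 1),
    ([0, 1, 0, 0], 1),
    ([0, 0, 1, 0], 1),
    ([0, 0, 0, 1], 1),
]

weights, bias = train_perceptron(training_strips)

# A crack pattern that never appeared in the training data.
print(predict(weights, bias, [1, 1, 0, 0]))  # 1
print(predict(weights, bias, [0, 0, 0, 0]))  # 0
```

Nobody told the program what a crack looks like. It adjusted its own weights until the labelled examples were classified correctly, and the result also handles a crack pattern it never saw. Real deep learning systems apply the same principle with millions of weights and far richer images.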
Unsurprisingly, computer vision was one of the very first fields to be transformed by deep learning. After all, self-learning offered a way to solve the core problem of variability in CV. As long as a deep learning algorithm is fed enough representative examples, it can identify each product type with very high accuracy. With deep learning technology, researchers could finally tackle complex tasks that had stubbornly resisted efforts at automation. And it did not take long for innovative companies to translate these stunning academic results into real-world products.
Use Cases of Deep Learning for Computer Vision
Since then, intelligent computer vision applications have revolutionised entire industries and created new markets. Deep learning underpins some of the most exciting innovations of our day: self-driving cars, delivery drones, agri-bots and much more.
As one of the first companies to embrace deep learning for CV, we at Robovision know first-hand how important this technology can be for businesses. To give you a glimpse of what deep learning can do, here are a few examples of the AI projects we’ve completed.
Defect Detection on a Conveyor Belt
For the Unilin Group, we developed a deep-learning application that can spot minute defects in laminate boards moving on a conveyor belt at high speeds. Traditional CV would never have been able to distinguish the subtle difference between a laminate print and a real crack, nor would it have been capable of handling different kinds of defects at different spots.
Tulip Bulb Planter
For ISO Group, we developed a robot capable of grabbing and planting tulip bulbs. Since each tulip bulb is different from the previous one, automation through traditional CV is impossible. On top of that, grabbing the bulb involves three-dimensional space, an additional layer of complexity that cannot be fully captured by 2D deep learning. To account for this, we trained the robot using our cutting-edge 3D deep learning.
Wafer Inspection in Semiconductor Manufacturing
For Hitachi, we inspect semiconductors for defects. This is quality control at its finest: we need to detect defects inline in nodes as small as 10 nanometres (nm). To put that into perspective, a human hair is roughly 80,000 nm thick. Our algorithms have been able to reach 100% accuracy at 40,000 views per hour.
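A quick back-of-the-envelope calculation puts these figures into perspective. It only uses the numbers quoted above; the per-view time budget assumes that views are processed one after another, which is an assumption made here for illustration and not something stated about the actual system.

```python
# Back-of-the-envelope arithmetic using the figures quoted in the text.
node_size_nm = 10          # smallest node size mentioned
hair_width_nm = 80_000     # approximate thickness of a human hair
views_per_hour = 40_000    # quoted inspection throughput

# How many times thinner than a hair is a 10 nm node?
scale_factor = hair_width_nm / node_size_nm

# Throughput per second, and time per view if views are handled
# sequentially (an assumption made only for this illustration).
views_per_second = views_per_hour / 3600
ms_per_view = 3600 * 1000 / views_per_hour

print(scale_factor)                 # 8000.0
print(round(views_per_second, 1))   # 11.1
print(ms_per_view)                  # 90.0
```

In other words, the features being inspected are about 8,000 times thinner than a hair, and at this throughput each view gets roughly 90 milliseconds on average.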
Weed Detection Drone
Partnering with ILVO Vlaanderen (Flanders Research Institute for Agriculture, Fisheries & Food) for a pilot project, we built a weed detection drone equipped with 5G and our AI model. The drone flies over fields and sends images via 5G to the cloud. Our AI model then analyses the images and sends the coordinates of the weed patches to a GPS-controlled tractor. The tractor then sprays pesticide on those patches only. Thanks to deep learning, this AI-based weed detection system is capable of detecting differently shaped patches of weeds. By cutting down on pesticides, farmers reduce health risks for the consumer and also improve their bottom line in the process.
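The step from detections to coordinates can be illustrated with a small sketch. The pilot's actual method is not described here, so the code shows just one possible approach under stated assumptions: the model output is taken to be a grid mask where 1 marks a cell with weeds, neighbouring cells are grouped into patches with a standard flood fill, and each patch is reduced to a centre point in metres relative to one corner of the field. The names `find_patches`, `patch_centres` and `cell_size_m` are hypothetical, and converting to real GPS coordinates is left out.

```python
from collections import deque

def find_patches(mask):
    """Group neighbouring weed cells (up, down, left, right) into patches."""
    rows, cols = len(mask), len(mask[0])
    seen = set()
    patches = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or (r, c) in seen:
                continue
            # Breadth-first flood fill starting from this unvisited weed cell.
            queue = deque([(r, c)])
            seen.add((r, c))
            cells = []
            while queue:
                cr, cc = queue.popleft()
                cells.append((cr, cc))
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and mask[nr][nc] == 1 and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            patches.append(cells)
    return patches

def patch_centres(mask, cell_size_m=1.0):
    """Return one (x, y) centre per patch, in metres from the top-left corner."""
    centres = []
    for cells in find_patches(mask):
        mean_row = sum(r for r, _ in cells) / len(cells)
        mean_col = sum(c for _, c in cells) / len(cells)
        centres.append(((mean_col + 0.5) * cell_size_m, (mean_row + 0.5) * cell_size_m))
    return centres

# Hypothetical model output: an L-shaped patch and a vertical strip.
weed_mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]

centres = patch_centres(weed_mask)
print(len(centres))  # 2
print(centres[1])    # (4.5, 2.5)
```

The grouping works regardless of the shape of each patch, so an L-shaped patch and a straight strip are handled by the same code. The hard part, deciding which cells contain weeds in the first place, is the job of the deep learning model.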
Automate Complex Processes with Robovision
Make no mistake, traditional CV still has its uses for specific tasks where clear-cut instructions can be applied consistently. This is why traditional CV is still in widespread use today. But for complex tasks that cannot be grasped with manual rules, deep learning-based CV is the way forward.
Deep learning for CV will become indispensable as more and more industries catch on to its many uses, from detecting COVID-19 in lung images in healthcare to recognising unscanned products at self-checkouts in retail. However, most companies still struggle to translate this massive AI potential into actual real-world use cases.
Experienced AI partners can provide both the expertise and the technology these companies need to implement their AI projects successfully. This is where Robovision, an AI pioneer, can help you. Our name says it all: we are an AI company specialising in deep learning for computer vision. At the core of our business lies Robovision Platform, a powerful yet accessible deep learning CV platform we developed to enable anyone to create and maintain AI applications.
Curious about what deep learning vision can do for your business? Do not hesitate to contact us. With a huge number of complex AI projects under our belt, we can help you implement human-level vision intelligence into your processes, whether you are active in manufacturing, healthcare, agriculture or another industry.