The 6th Sense?
AIs need senses if they are ever going to challenge human intelligence and creativity, and all the tokenization of literature, images, videos, and Instagram posts cannot help. Yes, they might be able to answer a set of test questions better than a high-school student or even a college professor, but AI art is not art; it’s copying without emotion. That said, what would happen if AI systems were given senses? What if they could ‘see’ the outside world in real time? What if they could hear the sound of a real laughing child and, at the same time, see what made the child laugh? Could they begin to develop an understanding of context? It’s a difficult question to answer, and it would likely require all of the human senses working together for an AI to truly understand context. We are certainly not there yet, and finding ways to grant AI systems the senses of touch and smell is challenging at best.
There are plans to give AI systems a better sense of sight by collecting data from moving vehicles. The data becomes part of a real-world model that strives to identify ‘how things work’, which can then be used to train other models. However, for a model to be ‘real-world’, it has to be huge. Humans take in massive amounts of sensory ‘noise’ that has only a minute influence on decisions but is essential to understanding how the world works. Much of that ‘noise’ is incomplete, ambiguous, or distracting, yet it is part of the context we need to handle the uncertainty that our complex environment brings. Of course, efficiency is also important. Humans have the sometimes dubious ability to filter out noise while retaining the essence of a situation, something AI systems would have to be programmed to do. And with all that noise being fed into a real-world model, the storage and processing needed would be astronomical.
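To make ‘astronomical’ concrete, a rough back-of-the-envelope calculation helps. Every figure below is an illustrative assumption, not a measurement from any real fleet, but even modest numbers add up quickly:

```python
# Rough estimate of raw sensory data volume for a vehicle fleet feeding
# a real-world model. All figures are illustrative assumptions.

CAMERA_MBPS = 8          # assumed bitrate of one compressed 1080p camera stream
CAMERAS_PER_VEHICLE = 6  # assumed surround-view camera rig
LIDAR_MBPS = 70          # assumed lidar point-cloud stream
AUDIO_MBPS = 1.5         # assumed multi-channel microphone array
HOURS_PER_DAY = 8        # assumed daily driving time
FLEET_SIZE = 100_000     # assumed number of data-collecting vehicles

per_vehicle_mbps = CAMERA_MBPS * CAMERAS_PER_VEHICLE + LIDAR_MBPS + AUDIO_MBPS
seconds_per_day = HOURS_PER_DAY * 3600

# Megabits -> terabytes: divide by 8 (bits -> bytes), then by 1e6 (MB -> TB)
tb_per_vehicle_day = per_vehicle_mbps * seconds_per_day / 8 / 1e6
tb_per_fleet_day = tb_per_vehicle_day * FLEET_SIZE

print(f"Per vehicle: {tb_per_vehicle_day:.2f} TB/day")
print(f"Fleet of {FLEET_SIZE:,}: {tb_per_fleet_day:,.0f} TB/day")
```

Under these assumptions a single vehicle produces roughly 0.4 TB per day, and the fleet roughly 43,000 TB (43 petabytes) per day, before any of the ‘noise’ has been filtered or processed at all.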
Ethics is a hard concept to explain, and algorithms built to encode ethics are prone to bias. Humans are also prone to bias, but they are typically taught lessons in ethics by others around them. Whether they respond with their own interpretation of those ‘lessons’ or just mimic what they see is a human issue. AI systems form biases based only on their training data and their algorithms, so while a real-world model might tell an AI that driving a vehicle into a concrete wall triggers a rule of physics, it doesn’t tell it that it should feel regret for doing so after borrowing that vehicle from its parents. Humans also continue to learn, at least most do, so AI real-world models must be ever expanding to be effective, and that requires more power, more processing speed, and more storage.
So the idea of a general real-world model has a number of missing parts. That said, real-world ‘mini-models’ are a bit more feasible. Rather than trying to model the unbelievable complexity of the real world, building smaller models that contain sensory data relevant to a particular application is, at least, more realistic. We can use visual (camera) data to control stoplights, but those systems react poorly to anomalies, and that is where additional sensory data is needed. Someone crossing against the light might be looking at the street to avoid traffic, but at the same time can hear (no earbuds) the faint sound of a speeding car that has yet to enter their peripheral vision. That information, as insignificant as it might be to someone walking on the sidewalk, becomes very important to the person crossing against traffic.
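A minimal sketch shows what fusing those two senses in a narrow ‘mini-model’ might look like. The detector inputs and the threshold here are hypothetical placeholders, not a real perception API; the point is only that the microphone catches a hazard the camera alone would miss:

```python
from dataclasses import dataclass

@dataclass
class CameraReading:
    pedestrian_in_crosswalk: bool  # hypothetical output of a vision detector
    vehicle_visible: bool          # is an approaching car in the camera's view?

@dataclass
class AudioReading:
    approaching_engine_db: float   # estimated loudness of an approaching engine

ENGINE_ALERT_DB = 60.0  # assumed threshold; real tuning would need field data

def crossing_hazard(cam: CameraReading, audio: AudioReading) -> bool:
    """Fuse sight and hearing: the camera misses a car still outside its
    field of view, but a loud approaching engine plus a pedestrian in the
    crosswalk is enough to raise a warning."""
    if not cam.pedestrian_in_crosswalk:
        return False
    return cam.vehicle_visible or audio.approaching_engine_db >= ENGINE_ALERT_DB

# The camera sees no vehicle, but the microphone hears one coming.
print(crossing_hazard(
    CameraReading(pedestrian_in_crosswalk=True, vehicle_visible=False),
    AudioReading(approaching_engine_db=72.0),
))  # -> True
```

Even this toy example shows the pattern: each added sense covers a failure mode of the others, which is exactly the role the faint engine sound plays for the inattentive pedestrian.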
Real-world models that try to mimic real-world situations must have sensory information and the ability to filter that information in a ‘human’ way. Developing real-world models without more complete sensory information will not produce the human-like ability to react to the endless number of potential scenarios that every second of our lives provides. Networks could help AIs gather data, but until an AI can feel the stroke of camel hair on canvas, smell the paint, and see the yellow of a sunflower, it cannot understand context in the truest sense, something humans begin to learn before their first birthday. We expect mini-real-world models to be effective for many applications, but without sensory input, real-world context is a dream.