important for our evolution, since our ancestors had only a split second to determine if a tiger was lurking in the bushes, even before they were fully aware of it.) For the first time, a robot consistently scored higher than a human on a specific vision recognition test.
The contest between me and the machine was simple. First, I sat in a chair and stared at an ordinary computer screen. Then a picture flashed on the screen for a split second, and I was supposed to press one of two keys as fast as I could, depending on whether or not I saw an animal in the picture. I had to make a decision as quickly as possible, even before I had a chance to digest the picture. The computer would also make a decision for the same picture.
Embarrassingly enough, after many rapid-fire tests, the machine and I performed about equally. But there were times when the machine scored significantly higher than I did, leaving me in the dust. I was beaten by a machine. (It was some consolation when I was told that the computer gets the right answer only 82 percent of the time, while humans score 80 percent on average.)
The key to Poggio’s machine is that it copies lessons from Mother Nature. Many scientists are realizing the truth in the statement, “The wheel has already been invented, so why not copy it?” For example, normally when a robot looks at a picture, it tries to divide it up into a series of lines, circles, squares, and other geometric shapes. But Poggio’s program is different.
When we see a picture, we might first see the outlines of various objects, then various features within each object, then the shading within those features, and so on. Poggio’s program likewise splits the image into many layers. As soon as the computer processes one layer of the image, it integrates it with the next layer, and so on. In this way, step by step, layer by layer, it mimics the hierarchical way that our brains process images. (Poggio’s program cannot perform all the feats of pattern recognition that we take for granted, such as visualizing objects in 3-D or recognizing thousands of objects from different angles, but it does represent a major milestone in pattern recognition.)
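To make the idea of layered processing concrete, here is a minimal sketch in Python (using the numpy library). It is not Poggio’s actual program; the two layers, the filters, and the final matching step are illustrative assumptions, meant only to show how each stage feeds the one above it.

```python
# A minimal sketch (not Poggio's actual code) of hierarchical, layer-by-layer
# image processing: each stage extracts features from the output of the
# previous one, loosely mimicking how vision builds up from edges to objects.
import numpy as np

def edge_layer(image):
    """Layer 1: respond to local intensity changes (rough 'outlines')."""
    gx = np.abs(np.diff(image, axis=1))[:-1, :]   # horizontal gradients
    gy = np.abs(np.diff(image, axis=0))[:, :-1]   # vertical gradients
    return gx + gy                                # simple edge-strength map

def pooling_layer(features, size=2):
    """Layer 2: pool nearby responses, gaining tolerance to small shifts."""
    h, w = features.shape
    h, w = h - h % size, w - w % size
    blocks = features[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))                # strongest response per block

def describe(image):
    """Run the layers in sequence and return a compact feature vector."""
    return pooling_layer(edge_layer(image)).flatten()

# Usage: compare a new image against stored examples by nearest feature vector.
# The random arrays are placeholders; a real system would store features
# computed from labeled photographs.
rng = np.random.default_rng(0)
stored = {"animal": describe(rng.random((32, 32))),
          "no animal": describe(rng.random((32, 32)))}
query = describe(rng.random((32, 32)))
best = min(stored, key=lambda label: np.linalg.norm(stored[label] - query))
print("Closest stored category:", best)
```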
Later, I had an opportunity to see both the top-down and bottom-up approaches in action. I first went to Stanford University’s artificial intelligence center, where I met STAIR (Stanford artificial intelligence robot), which uses the top-down approach. STAIR is about 4 feet tall, with a huge mechanical arm that can swivel and grab objects off a table. STAIR is also mobile, so it can wander around an office or home. The robot has a 3-D camera that locks onto an object and feeds the 3-D image into a computer, which then guides the mechanical arm to grab the object. Robots have been grabbing objects like this since the 1960s, and we see them in Detroit auto factories.
But appearances are deceptive. STAIR can do much more. Unlike the robots in Detroit, STAIR is not scripted. It operates by itself. If you ask it to pick up an orange, for example, it can analyze a collection of objects on a table, compare them with the thousands of images already stored in its memory, then identify the orange and pick it up. It can also identify objects more precisely by grabbing them and turning them around.
To test its ability, I scrambled a group of objects on a table, and then watched what happened after I asked for a specific one. I saw that STAIR correctly analyzed the new arrangement and then reached out and grabbed the correct thing. Eventually, the goal is to have STAIR navigate in home and office environments, pick up and interact with various objects and tools, and even converse with people in a simplified language. In this way, it will be able to do anything that a gofer can in an office. STAIR is an example of the top-down approach: everything is programmed into STAIR from the very beginning. (Although STAIR can recognize objects from different angles, it is still limited in the number of objects it can recognize. It would be paralyzed if it had to walk outside and recognize random objects.)
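A rough sketch of that recognize-then-grab loop might look like the following. This is not STAIR’s real software: the feature vectors stand in for the 3-D camera output, the stored “memory” for its library of known images, and grab() for the arm controller. Only the overall structure, matching a requested object against stored examples and then acting on the best match, reflects the process described above.

```python
# Toy sketch of a top-down recognize-then-grab loop (illustrative only).
import numpy as np

rng = np.random.default_rng(1)

# "Memory": placeholder feature vectors for objects the robot already knows.
memory = {name: rng.random(8) for name in ["orange", "mug", "stapler"]}

def segment_scene(num_objects=3):
    """Pretend camera output: one feature vector per object on the table."""
    return {f"object_{i}": rng.random(8) for i in range(num_objects)}

def identify(target, scene):
    """Pick the scene object whose features best match the stored target."""
    goal = memory[target]
    return min(scene, key=lambda obj: np.linalg.norm(scene[obj] - goal))

def grab(obj_id):
    """Stand-in for the arm controller."""
    print(f"Arm moving to {obj_id} and closing gripper.")

scene = segment_scene()
grab(identify("orange", scene))
```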
Later, I had a chance to visit New York University, where Yann LeCun is experimenting with an entirely different design, the LAGR (learning applied to ground robots). LAGR is an example of the bottom-up approach: it has to learn everything from scratch, by bumping into things. It is the size of a small golf cart and has two stereo color cameras that scan the landscape, identifying objects in its path. It then moves