Mobile computer vision is set to touch our lives in a tangible way. To continue the parallel, the big bang has started and the universe is expanding fast enough for us to experience the magic. In my view, three primary factors have contributed to the recent advances.
1) Powerful local descriptors: The early 2000s marked an exciting development in the field of image recognition, one that has now touched every aspect of computer vision. The publication of SIFT descriptors will certainly go down in the history of computer vision in the same light as that of Turbo codes in digital communications. Using SIFT or a SIFT-like framework, engineers can now robustly and accurately describe and match local regions in an image or video. A big leap forward, thanks to David Lowe and his team!
2) Machine learning: Although artificial intelligence, with its rule-based deduction, did not deliver on its 1980s promise of solving all our problems, a related discipline, machine learning, has come to our rescue. Rather than depending on hand-coded rules, we let machines learn from many examples by solving large optimization problems, and that, it turns out, is the way to go. And we are figuring this out now!
3) Faster machines: Computer vision and image analysis problems are among the few that are always hungry for computing resources. Any computer vision researcher will tell you that having a powerful desktop is better than a laptop, having a cluster is better than a desktop, and having a cloud with thousands of computers is like having a vacation home on the moon. Needless to say, Moore’s law has helped.
This is just the beginning. There are many unsolved problems and exciting challenges. This post sets out the perspective; I’ll defer the question of what can and can’t be done today to my next post.