Search engines that can understand the text in images are something everyone has wanted for at least 10 years. TechCrunch published a post today titled Google Lodges Patent For Reading Text In Images And Video:
"…information Week suggests
that privacy issues raised by Google Maps Street View will get more complicated as eventually YouTube videos will be indexable via the text that appears within them.
A full copy of the patent application “Recognizing Text In Images” can be viewed here.
Some choice lines from the patent:
“Digital images can include a wide variety of content…For example, digital images can illustrate landscapes, people, urban scenes, and other objects. Digital images often include text. Digital images can be captured, for example, using cameras or digital video recorders. Image text (i.e., text in an image) typically includes text of varying size, orientation, and typeface. Text in a digital image derived, for example, from an urban scene (e.g., a city street scene) often provides information about the displayed scene or location. A typical street scene includes, for example, text as part of street signs, building names, address numbers, and window signs.”
I may be stating the blatantly obvious when I say that if Google has found a way to index text in static images and video this is a great leap forward in the progression of search technology. This will make every book in the Google Books database really searchable, with the next step being YouTube, Flickr (or Picasa Web) and more. The search capabilities of the future just became seriously advanced."
Chances are the quality of an initial scan of a Flash page, or a Flash movie, will leave something to be desired. Getting the text in an image 90% correct means it's 10% incorrect, so what you're data mining via Google may not yield as much as you think, because it will be full of noise.
If you wrote a sentence and every 10th letter was illegible, how easy would it be to understand? It would depend on the content: if it's mathematics, chemistry or biology, a 10% error rate would make the information unusable (e.g., a formula read in with parts of it not understood).
But if it's a paragraph of ordinary words, and Google has enough information about the letters and the words around them, it might be able to compensate enough.
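To make the 90%/10% argument concrete, here is a rough Python sketch. It is only an illustration of the reasoning above and says nothing about how Google's patent actually works. The tiny `VOCABULARY` list, the function names, and the assumption that each character fails independently are all made up for this example; the only library call is `difflib.get_close_matches` from the Python standard library.

```python
import difflib

# Hypothetical vocabulary standing in for whatever language knowledge a real
# search engine would have; chosen only for this illustration.
VOCABULARY = ["google", "patent", "image", "search", "text", "video"]


def mask_every_nth(text, n=10, placeholder="?"):
    """Replace every n-th character (counting from 1) with a placeholder."""
    return "".join(
        placeholder if (i + 1) % n == 0 else ch for i, ch in enumerate(text)
    )


def word_survival_probability(word_length, char_accuracy=0.9):
    """Chance that a whole word is read correctly, assuming each character
    fails independently (a simplification)."""
    return char_accuracy ** word_length


def repair_word(garbled, vocabulary=VOCABULARY, cutoff=0.75):
    """Return the closest vocabulary word, or the input unchanged if nothing
    in the vocabulary is similar enough."""
    matches = difflib.get_close_matches(
        garbled.lower(), vocabulary, n=1, cutoff=cutoff
    )
    return matches[0] if matches else garbled


if __name__ == "__main__":
    # A sentence with every 10th character made illegible.
    print(mask_every_nth("Search engines that can understand the text in images"))
    # Probability that a five-letter word has no misread characters.
    print(round(word_survival_probability(5), 2))
    # An ordinary word can be recovered from its neighbours in the vocabulary...
    print(repair_word("pat?nt"))
    # ...but a damaged chemical formula has nothing to be matched against.
    print(repair_word("C6H1?O6"))
```

Under these assumptions a five-letter word comes through intact only about 59% of the time. The damaged word "pat?nt" is matched back to "patent", while the damaged formula "C6H1?O6" comes back unchanged because nothing in the vocabulary resembles it.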
I think Duncan Riley oversimplifies the amount of technical horsepower and resources that would be needed to scan and interpret images and rich media. It's not that Google doesn't have the resources, but the effort of doing so will probably leave this patent, and the technology around it, at least a few years away from implementation, if not more.

