Tuesday 5 May 2009

Simple text extraction from images

You need a screenshot tool and an OCR tool. I use ImageMagick's import to take the screenshot and gocr to recognize the characters.

import -quality 100 /tmp/scan.png && gocr /tmp/scan.png
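
If you use this often, the one-liner above could be wrapped in a small script that first checks both tools are installed. This is only a rough sketch: the function names have_tools and scan_text are made up for illustration, and it assumes import and gocr are on your PATH and that an X display is available for selecting the region.

```shell
#!/bin/sh
# Sketch only: have_tools and scan_text are hypothetical helper names.

# Return 0 if every named command is on PATH; otherwise report the
# first missing one on stderr and return 1.
have_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            return 1
        fi
    done
    return 0
}

# Capture a region selected with the mouse, then print the recognized text.
# The output path defaults to the one used in the one-liner above.
scan_text() {
    out=${1:-/tmp/scan.png}
    have_tools import gocr || return 1
    import -quality 100 "$out" && gocr "$out"
}
```

After sourcing the file, run scan_text (or scan_text /some/other/file.png), select the area with the mouse, and the recognized text is printed to the terminal.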


Testing with this example, the output was:
Linux, ubuntu, gentoo, scripts, security, networks, digg, sIashdot, engadget, youtube.

It doesn't work 100%, but it can be useful :)