I needed a tool that could convert a rasterised image into text. There are a few out there, but I don't think there's any that match Tesseract OCR for cross-language capability, community support and freedom (it's Free as in freedom and beer).
The setup isn't super-obvious, but once you've got it figured out, all of that can be automated. On top of that, there's lots of programming language-specific libraries out there that'll help plug your stuff into it.