Hello community,
I have a large number of files that I want to parse. They look like this one, for example:
http://www.foundationfinder.ch/ShowDetails.php?Id=134&InterfaceLanguage=&Type=Image
http://www.foundationfinder.ch/ShowDetails.php?Id=134&InterfaceLanguage=&Type=Html
I guess Image::OCR::Tesseract could be interesting here: I think I could run the image version through Tesseract and then parse the resulting text. (Image::OCR::Tesseract is available on search.cpan.org.)
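use strict;
use warnings;
use Image::OCR::Tesseract 'get_ocr';
my $image = './hi.jpg';        # path to the image file to read
my $text  = get_ocr($image);   # run Tesseract on it and get the recognized text back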
What do you think?