Hello community,
I have a large number of files that I want to parse; they look like this one. See an example:
Well, I guess that using Image::OCR::Tesseract could be interesting. I think I can parse these with Tesseract (Image::OCR::Tesseract - search.cpan.org):
use Image::OCR::Tesseract 'get_ocr';
my $image = './hi.jpg';
my $text = get_ocr($image);
What do you think?
I would write a small bash script.
It shouldn't be more than 3 or 4 lines if you have all the images in one folder.
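A minimal sketch of what such a script could look like, assuming the images all end in .jpg and the tesseract command-line tool is installed and on the PATH. The function name ocr_folder is made up for illustration; the only external call is `tesseract <image> <outputbase>`, which writes the recognized text to <outputbase>.txt.

```shell
#!/usr/bin/env bash
# Sketch: OCR every .jpg in a folder with the tesseract command-line tool.
# Assumptions: files end in .jpg, and `tesseract` is on the PATH.
# ocr_folder is a hypothetical helper name, not part of tesseract.
ocr_folder() {
    local dir="${1:-.}"   # folder to scan; defaults to the current one
    local img
    for img in "$dir"/*.jpg; do
        # Skip the unexpanded pattern when the folder has no .jpg files
        [ -e "$img" ] || continue
        # tesseract appends .txt to the output base on its own,
        # so hi.jpg ends up as hi.txt next to the image
        tesseract "$img" "${img%.jpg}"
    done
}
```

Calling `ocr_folder /path/to/images` after sourcing the file would leave one .txt file next to each image. Other extensions (.png, .tif) would need their own pattern.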
Many thanks for the quick reply. Indeed, I have all the files in one folder. Tesseract is supposed to be one of the three most powerful OCR engines. I am a bit unfamiliar with TA, but I will try to write the script.
BTW, which one should I take: the Google OCR Tesseract or the Perl one (Image::OCR::Tesseract - search.cpan.org)?
Note: The Google one should fit into openSUSE 11.3 with ease, at least I guess so!
I would love to hear from you.