[sane-devel] Configuring OCR tool

Jeff jffry at posteo.net
Sun Jul 28 10:09:49 BST 2019


On 26/07/2019 16:16, Business Kid wrote:
> I have sane(1.0.27) & xsane(0.999) working here on my HP LaserJet MFP
> 130nw Multifunction printer. I wanted to use it for OCR (at which I have
> some commercial experience). gocr seems to be the only OCR tool; but
> that project seems to be dying, or dead.
>  
> This query is about OCR. How do I set the ocr program & options in
> xsane? I would like to be able to choose tesseract, or ABBYY and pass
> options. I think tesseract has a 'stdout' option, which allows you to
> junk the original file. In commercial work, 500G disks were being
> swapped around regularly as they filled up and were queued for OCR.
>  
> I did a test of GPL linux tools a few years back, and *tesseract* came
> out best, with a new OCR engine in Beta. I was able to scan & then edit
> one of my father's plays which had been typewritten for him by a novice
> in the 1960s. He then corrected it by hand. Having done work for a firm
> here 10 years back, I knew that *ABBYY* was probably the best
> (commercial) package, then only available in M$Windoze. ABBYY now have
> a (commercial) Linux package, with a one month free trial :-D.

I can't help you with xsane, but I can suggest another scanning tool
that supports OCR, in particular tesseract (but I am biased, because I
am the author):

gscan2pdf
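
On the 'stdout' option mentioned in the question: as far as I know, the
tesseract command line accepts the literal word "stdout" in place of an
output file base, so the recognised text is printed rather than written
to disk. A minimal sketch of driving it from a script, assuming tesseract
is installed and on the PATH; the function names and the default language
"eng" are only illustrative, and this is not how xsane or gscan2pdf call
it:

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build a tesseract command line that prints recognised text.

    Hypothetical helper for illustration only.
    """
    # Passing "stdout" as the output base asks tesseract to print the
    # text instead of creating an output file next to the image.
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_to_text(image_path, lang="eng"):
    """Run tesseract on one image and return the recognised text."""
    # Assumption: the tesseract binary is installed and on the PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,  # collect the text rather than printing it
        text=True,
        check=True,           # raise if tesseract exits with an error
    )
    return result.stdout
```

Calling ocr_to_text("page.png") on a scanned page (the file name is just
an example) would return the text as a string, so no intermediate text
file needs to be kept.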

Regards

Jeff
