[sane-devel] searching for business card scanner compatible with sane

Jelle de Jong jelledejong at powercraft.nl
Thu Dec 24 22:02:26 UTC 2009


Hi Allan, thank you for taking the time to respond to my questions.

m. allan noah wrote, on 22-12-09 21:18:
>>>>> # bug02:
>>>>> It seems "ADF Back" and "ADF Front" are outputing something that is
>>>>> not on the original paper, only "ADF Duplex" seem to work here...
>>> I would need a sample image. Are you sure you are running libsane 1.0.20?
>> I tried to reproduce the results I encountered yesterday, but the
>> scanner is now responding in a different way. (see the attachment)
> 
> Did you connect the scanner to a windows machine in between these two
> tests? It is possible the windows driver updated the firmware...

Nope, there was no Windows near the machine; I try to live in a
Windows-free zone. However, I have not seen the behaviour again, so I
am hoping it was just an anomaly.

>>>>> # bug03:
>>>>> It seems around 15mm of text is missing from the bottom of all pages
>>>>> when trying to scan simple A4 pages. Page geometry: 5096x6600+0+0
>>> please use the page width and page height options to set the actual
>>> size of the paper. if you are using a command line program, you should
>>> set those options before you set the x/y image size.
>> The datasheet(1) is saying it can scan A4 paper and automatically
>> recognizes document size, is this not supported in the device driver?
> 
> the 'datasheet' does not describe the hardware, but rather the
> hardware+windows device driver. Some features, like auto cropping are
> software only features. This is common. You need a very smart ($1000+)
> machine to do this in hardware. That said- the git development branch
> of sane-backends does have a first attempt at a software
> implementation of these features.

The device comes with a Fujitsu ScanSnap carrier sheet to help guide
documents through the scanning process. It seems the scanner is
designed not to produce clean scans itself, but to scan the largest
area the size of the input allows and to leave all the processing to
the software afterwards. So the output of this device is especially
messy because it seems intended for post-processing.

>> Is it really that hard to scan A4 paper with a more then 500 Euro
>> scanner? So far I only have been trying to get the basis functionality
>> of the 100 Euro all in one HP scanner that I previous used, let alone
>> get the functionality I really wanted like automatic different size
>> business card scanning. Am I now just bitching and making no sense or
>> is there something wrong? I know I am a perfectionist but I need it
>> for the quality of the work I do.
> 
> You are overreacting a bit, yes. It is not particularly hard to set an
> extra page-length option. I could have set the default to A4 length,
> but I'm in the US :)

I can understand the settings being predefined for US Letter. I have
no issues with that, but having to manually measure and set
--page-width and --page-height is something I did not expect to be
necessary to get a good scan result. Having to predefine this for A4
is still a disappointment but acceptable. Having to do the same for
every business card or specially sized input paper would become
extremely annoying.
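
To illustrate the manual step discussed here, a minimal Python sketch
that turns a named paper size into the --page-width/--page-height and
-x/-y options, with the page size set first as advised above. It
assumes the backend takes these values in millimetres; the helper name
page_size_args and the size table are hypothetical, not part of sane.

```python
# Paper sizes in millimetres (ISO A and C series).
PAPER_SIZES_MM = {
    "A4": (210, 297),
    "C5": (162, 229),
    "A8": (52, 74),
}


def page_size_args(paper):
    """Return command line options for a named paper size.

    The page size options come first and the x/y image size after
    them, following the ordering advice quoted in this thread.
    Assumption: the backend expects millimetres for all four options.
    """
    width, height = PAPER_SIZES_MM[paper.upper()]
    return [
        "--page-width", str(width),
        "--page-height", str(height),
        "-x", str(width),
        "-y", str(height),
    ]


if __name__ == "__main__":
    # Only prints the command; check the options your backend
    # actually offers before running it.
    print(" ".join(["scanadf"] + page_size_args("A4")))
```

This still needs one table entry per paper size, so it does not remove
the annoyance for arbitrary business cards; it only shows what has to
be supplied by hand today.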

>> # big black bars on the bottom of the pages, unusable for archiving and unprofessional for sending to other business partners
> 
> big? wow, you are hard to please :) You could try the current
> development version of sane, and see if the software based cropping
> and length detection features help. If not, we accept patches...

I know I can be demanding, sorry for this. I want to be able to set up
sane based scanner solutions at multiple locations, and the usability
and features should first reach my quality threshold before I can
deploy it.

I have the feeling the driver is doing its job, but the scanner needs
a complete additional layer to become productively usable.

This layer should be optional and invisible to the end user. The
little experience I gathered over the last few days of testing the
scanner gives me the impression that the following features should be
added as some sort of filter that can be enabled and that sits between
the driver and the scanadf (sane) output image.

By default the scanner should scan in duplex mode and scan all
available input (no -x, -y, --page-width or --page-height needed);
this should ensure no source data is missed. (This is currently not
the case.)

This data is sent to the filter, which does the following:

1) recognizes document size
2) removes blank pages (backsides of input with nothing on it)
3) auto rotation and auto deskew

extra "like to have" possibilities:
4) auto color detection
5) auto resolution selection depending on input sizes

After this there will be a simple but usable image that can serve as
input for everything else. This output is what I expected when using
scanadf without any options on a simple A4, A8 or C5 sized source.
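
A rough sketch of what steps 1 and 2 of such a filter could look like,
in plain Python on a grayscale image given as a list of pixel rows
(0 = black, 255 = white). It assumes the scanner background beyond the
paper is dark, as the black bars suggest, and that paper is light. All
function names and thresholds are hypothetical; rotation and deskew
(step 3) are not covered.

```python
def page_bbox(image, dark_threshold=60):
    """Step 1: find the paper as the bounding box of all pixels
    brighter than the dark scanner background.

    Returns (top, left, bottom, right) with bottom/right exclusive,
    or None when no bright pixel is found.
    """
    rows = [y for y, row in enumerate(image)
            if any(p > dark_threshold for p in row)]
    if not rows:
        return None
    cols = [x for x in range(len(image[0]))
            if any(row[x] > dark_threshold for row in image)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(image, bbox):
    """Cut the image down to the detected paper area."""
    top, left, bottom, right = bbox
    return [row[left:right] for row in image[top:bottom]]


def is_blank(page, ink_threshold=128, max_ink_fraction=0.005):
    """Step 2: treat a page as blank when almost no pixel is dark
    enough to count as ink. Both thresholds are guesses."""
    total = sum(len(row) for row in page)
    if total == 0:
        return True
    ink = sum(1 for row in page for p in row if p < ink_threshold)
    return ink / total <= max_ink_fraction


def filter_pages(images):
    """Crop every scanned side to its paper and drop blank sides."""
    kept = []
    for image in images:
        bbox = page_bbox(image)
        if bbox is None:
            continue  # nothing but background was scanned
        page = crop(image, bbox)
        if not is_blank(page):
            kept.append(page)
    return kept
```

A bounding box is the simplest possible size detection and will leave
dark corners on skewed paper, which is why deskew would have to run
before the crop in a real filter.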

I no longer have time to do the coding myself, but I would really
like some advice from you and other sane developers on what it would
take to make the above work and be available under GNU-compatible
licensing, compliant with the Debian Free Software Guidelines, so it
can be included and distributed with Debian and derivatives. If there
is enough motivation to make it happen I can arrange for some
sponsoring in the form of money donations to an individual or project.

The output of scanadf can then go to the other post-processing systems
that I think are not sane developer territory (DjVu and PDF with OCR
and meta-data tags detected from highlighting in the input images).
This is something I try to do myself by connecting all kinds of other
FLOSS tools, with some smart data provided by the scanadf process.
When the FLOSS tools do not work well enough I set out developer
bounties (see the pct-scanner-scripts tools) (got one bounty for
tesseract and hocr).

I also ran into some other issues when testing with the correct
options; please see the attachment.

Thanks in advance for any help and advice,

Kind regards,

Jelle


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fujitsu-scansnap-s1500-debugging04.txt
URL: <http://lists.alioth.debian.org/pipermail/sane-devel/attachments/20091224/1635c823/attachment.txt>

