[Debian-med-packaging] Bug#995406: Bug#995406: bbmap: package does not ship resource files

Étienne Mollier emollier at emlwks999.eu
Thu Sep 30 21:45:07 BST 2021


Control: found -1 38.90+dfsg-1
Control: tag -1 confirmed

Hi all,

Andreas Tille, on 2021-09-30:
> On Thu, Sep 30, 2021 at 01:22:23PM -0400, Robert wrote:
> > The bbmap package does not ship the needed resource files which causes some of
> > the included tools not to work, e.g. bbduk when trying to process some fastq
> > data, crashes with output like [1].
> 
> Thanks a lot for the report.  It's extremely helpful since several of our
> maintainers are not using this software and we really need to rely on
> user input.

Thank you Robert!  Your report is very useful indeed!

[…]
> > $ bbduk.sh in1=fwd.fastq in2=rev.fastq ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo threads=48 out=out.fastq
> > java -ea -Xmx76702m -Xms76702m -cp /usr/share/java/bbmap.jar jgi.BBDuk in1=fwd.fastq in2=rev.fastq ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo threads=48 out=out.fastq
> > Executing jgi.BBDuk [in1=fwd.fastq, in2=rev.fastq, ktrim=r, k=21, mink=8, hdist=2, ftm=5, tpe, tbo, threads=48, out=out.fastq]
> > Version 38.90
> > 
> > Set threads to 48
> > maskMiddle was disabled because useShortKmers=true
> > Warning!  Cannot find primes.txt.gz /tmp/bbduk_test/file:/usr/share/java/bbmap.jar!/primes.txt.gz
> > 	at jgi.BBDuk.main(BBDuk.java:78)
> 
> If we could turn this into a test, I could upload with the test included.

Andreas, I pulled some data files from python-biopython-doc,
and I think I managed to reproduce the problem on my end:

	$ bbduk.sh \
		in1=/usr/share/doc/python-biopython-doc/Tests/Quality/example.fastq \
		in2=/usr/share/doc/python-biopython-doc/Tests/Quality/solexa_example.fastq \
		ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo threads=48 \
		out=out.fastq
	java -ea -Xmx7195m -Xms7195m -cp /usr/share/java/bbmap.jar jgi.BBDuk in1=/usr/share/doc/python-biopython-doc/Tests/Quality/example.fastq in2=/usr/share/doc/python-biopython-doc/Tests/Quality/solexa_example.fastq ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo threads=48 out=out.fastq
	Executing jgi.BBDuk [in1=/usr/share/doc/python-biopython-doc/Tests/Quality/example.fastq, in2=/usr/share/doc/python-biopython-doc/Tests/Quality/solexa_example.fastq, ktrim=r, k=21, mink=8, hdist=2, ftm=5, tpe, tbo, threads=48, out=out.fastq]
	Version 38.93
	
	Set threads to 48
	maskMiddle was disabled because useShortKmers=true
	Warning!  Cannot find primes.txt.gz /home/emollier/tmp/bbduk_test/file:/usr/share/java/bbmap.jar!/primes.txt.gz
	java.lang.Exception
		at dna.Data.findPath(Data.java:1247)
		at dna.Data.findPath(Data.java:1194)
		at shared.Primes.fetchPrimes(Primes.java:167)
		at shared.Primes.<clinit>(Primes.java:177)
		at kmer.ScheduleMaker.<clinit>(ScheduleMaker.java:155)
		at jgi.BBDuk.<init>(BBDuk.java:964)
		at jgi.BBDuk.main(BBDuk.java:78)
	Exception in thread "main" java.lang.ExceptionInInitializerError
		at kmer.ScheduleMaker.<clinit>(ScheduleMaker.java:155)
		at jgi.BBDuk.<init>(BBDuk.java:964)
		at jgi.BBDuk.main(BBDuk.java:78)
	Caused by: java.lang.NullPointerException
		at fileIO.ByteFile.<init>(ByteFile.java:43)
		at fileIO.ByteFile1.<init>(ByteFile1.java:98)
		at fileIO.ByteFile1.<init>(ByteFile1.java:94)
		at shared.Primes.fetchPrimes(Primes.java:169)
		at shared.Primes.<clinit>(Primes.java:177)
		... 3 more
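
To spot this failure mode automatically, here is a minimal sketch that
scans a captured log for the warning shown above.  It assumes the
warning text stays as in this trace; the function name
log_reports_missing_resource and the log file are hypothetical, not
something bbmap provides:

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if a captured BBDuk log contains the
# "Cannot find <resource>" warning seen in the trace above.
#   $1: resource name, e.g. primes.txt.gz
#   $2: path to the captured log file
log_reports_missing_resource() {
	# -F matches the text literally, -q only sets the exit status
	grep -F -q "Cannot find $1" "$2"
}
```

With the output of a run saved through "2>&1 | tee bbduk.log", calling
"log_reports_missing_resource primes.txt.gz bbduk.log" would succeed on
the trace above.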

I tested the patch written by Robert and applied by Andreas, and
processing now gets much further.  For the autopkgtest, note that I
had to pick a dataset with the same dimensions in both files;
otherwise the processing fails, presumably because of intrinsic
inconsistencies between the two inputs:

	$ bbduk.sh \
		in1=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_sanger.fastq \
		in2=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_solexa.fastq \
		ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo threads=48 \
		out=out.fastq
	java -ea -Xmx7140m -Xms7140m -cp /usr/share/java/bbmap.jar jgi.BBDuk in1=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_sanger.fastq in2=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_solexa.fastq ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo threads=48 out=out.fastq
	Executing jgi.BBDuk [in1=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_sanger.fastq, in2=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_solexa.fastq, ktrim=r, k=21, mink=8, hdist=2, ftm=5, tpe, tbo, threads=48, out=out.fastq]
	Version 38.93
	
	Set threads to 48
	maskMiddle was disabled because useShortKmers=true
	0.018 seconds.
	Initial:
	Memory: max=7486m, total=7486m, free=7467m, used=19m
	
	******  WARNING! A KMER OPERATION WAS CHOSEN BUT NO KMERS WERE LOADED.  ******
	******  YOU NEED TO SPECIFY A REFERENCE FILE OR LITERAL SEQUENCE.       ******
	
	Input is being processed as paired
	Changed from ASCII-33 to ASCII-64 on input 9: 57 -> 26
	Started output streams:	0.032 seconds.
	Processing time:   		0.148 seconds.
	
	Input:                  	6 reads 		820 bases.
	FTrimmed:               	4 reads (66.67%) 	10 bases (1.22%)
	Trimmed by overlap:     	0 reads (0.00%) 	0 bases (0.00%)
	Total Removed:          	0 reads (0.00%) 	10 bases (1.22%)
	Result:                 	6 reads (100.00%) 	810 bases (98.78%)
	
	Time:                         	0.182 seconds.
	Reads Processed:           6 	0.03k reads/sec
	Bases Processed:         820 	0.00m bases/sec

Maybe this can be used as a stub for autopkgtest?  By the way,
this problem is also reproducible in bullseye.
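
A rough sketch of what such a stub could look like, reusing the second
command above.  This is not the packaged test: the BBDUK and DATA
override variables and the function name run_bbduk_smoke are
hypothetical, and it assumes python-biopython-doc is installed in the
test bed:

```shell
#!/bin/sh
# Sketch of an autopkgtest-style smoke test for bbduk.sh.
# BBDUK and DATA are hypothetical overrides, handy for local testing.
BBDUK="${BBDUK:-bbduk.sh}"
DATA="${DATA:-/usr/share/doc/python-biopython-doc/Tests/Quality}"

run_bbduk_smoke() {
	# autopkgtest sets AUTOPKGTEST_TMP; fall back to a fresh scratch dir
	workdir="${AUTOPKGTEST_TMP:-$(mktemp -d)}"
	out="$workdir/out.fastq"
	log="$workdir/bbduk.log"

	# Same parameters as the successful run above; keep all output in a log
	"$BBDUK" \
		in1="$DATA/wrapping_as_sanger.fastq" \
		in2="$DATA/wrapping_as_solexa.fastq" \
		ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo threads=48 \
		out="$out" >"$log" 2>&1 || { cat "$log"; return 1; }

	# Fail if a resource file could not be found, whatever the exit status
	if grep -F -q "Cannot find" "$log"; then
		cat "$log"
		return 1
	fi

	# Pass only if a non-empty output file was written
	[ -s "$out" ]
}
```

Calling run_bbduk_smoke at the end of a script under debian/tests/
would make the test fail both on a non-zero exit status and on the
"Cannot find" warning.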

Have a nice day,  :)
-- 
Étienne Mollier <emollier at emlwks999.eu>
Fingerprint:  8f91 b227 c7d6 f2b1 948c  8236 793c f67e 8f0d 11da
Sent from /dev/pts/2, please excuse my verbosity.