Bug#867302: licensecheck: incorrectly parses multi-line copyright notices

Ximin Luo infinity0 at debian.org
Wed Jul 5 19:17:00 UTC 2017


Jonas Smedegaard:
> [..]
> 
> Have a look (if interested) at /usr/share/perl5/String/Copyright.pm and 
> in particular the (huge when expanded) $signs_and_more_re at line 138.
> 
> [..]

Thanks for the tips! I'm not sure whether you got my other follow-ups to the bug report. I did in fact find String::Copyright, but I didn't know about its history or the plans for it, so thanks for filling me in on that.

At any rate, here is an updated version of my patch, along with some test cases for Sage's copyright notices.

I did try to think of a way to achieve the same logic *inside* the massive $re regexes. However, I don't think this is possible, at least with my current approach, which tries to be conservative in order to cope with humans being annoyingly inconsistent.

The patch joins subsequent lines onto the main line (the one with the "Copyright" part) only when their indent is greater than the main line's. This means I have to call length() in an expression-replacement, which I don't think is possible inside a normal regex...
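
For illustration, the indent-based joining rule described above can be sketched as follows. This is a minimal Python sketch, not the attached Perl patch: the function name join_continuation_lines is hypothetical, comment prefixes such as "#" are assumed to have been stripped already, and indent is measured in characters (a tab counts as one), all of which may differ from what copyright.diff actually does.

```python
import re


def join_continuation_lines(text):
    """Join lines onto a preceding "Copyright" line when indented deeper.

    Hypothetical helper: assumes comment markers were already removed and
    measures indent as the number of leading whitespace characters.
    """
    out = []
    main_indent = None  # indent of the current "Copyright" line, if any
    for line in text.splitlines():
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        if main_indent is not None and stripped and indent > main_indent:
            # Deeper indent than the main line: treat as a continuation.
            out[-1] = out[-1] + " " + stripped.rstrip()
            continue
        # Any other line (including a blank one) ends the current notice.
        if re.match(r"copyright\b", stripped, re.IGNORECASE):
            main_indent = indent
        else:
            main_indent = None
        out.append(line.rstrip())
    return "\n".join(out)
```

The rule is deliberately conservative: a line with the same or smaller indent, or a blank line, is never merged, so unrelated text following a notice is left alone.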

As for speed, the patched run takes about 4 seconds (roughly 13%) longer:

# with the patch
$ time debian/rules debian/licensecheck.copyright
licensecheck -l250 -i ^sage/build/ -r --deb-machine --merge-licenses sage > "debian/licensecheck.copyright"

real	0m35.318s
user	0m35.204s
sys	0m0.056s

# without the patch
$ time debian/rules debian/licensecheck.copyright
licensecheck -l250 -i ^sage/build/ -r --deb-machine --merge-licenses sage > "debian/licensecheck.copyright"

real	0m31.168s
user	0m31.040s
sys	0m0.076s

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git
-------------- next part --------------
A non-text attachment was scrubbed...
Name: copyright.diff
Type: text/x-diff
Size: 1463 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/pkg-perl-maintainers/attachments/20170705/905d465c/attachment-0001.diff>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: copyright-test.sh
Type: application/x-shellscript
Size: 978 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/pkg-perl-maintainers/attachments/20170705/905d465c/attachment-0001.bin>

