evil RAR files

Alexandre Detiste alexandre.detiste at gmail.com
Thu Apr 30 18:00:02 UTC 2015


On Thursday 30 April 2015, 16:09:53 Simon McVittie wrote:
> > accept all "unknown", "non/official" RAR file.
> 
> I would rather not do this, but perhaps not for the reasons you think.

It's the kind of knowledge I might have picked up online during
the last 10 years I wasted at work; I felt it was wrong,
but could not say precisely why.
 
> I don't want to unpack unknown files (those without a known-safe
> cryptographic hash) in general, for a few reasons:

When is an md5 enough? When is a sha1 needed?

> [Zip files]
> ...
> which is why I didn't object to you adding this code for them. However,
> rar files don't have all of those advantages.

python3-rarfile answers some questions, but not everything.

The functionality of "unrar l" is entirely rewritten in Python code.

For extraction it uses this:
| #: Command line args to use for extracting file to disk.
| EXTRACT_ARGS = ('x', '-y', '-idq')
 
/usr/lib/python3/dist-packages/rarfile.py itself acknowledges these limitations:

| Basic logic:
| - Parse archive structure with Python.
| - Extract non-compressed files with Python
| - Extract compressed files with unrar.
| - Optionally write compressed data to temp file to speed up unrar,
|   otherwise it needs to scan whole archive on each execution.

"unrar e" (junk paths) is presumably safe against directory traversal,
but that won't work if the files are identified in the yaml with a path component,
like the Quake games.
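
To illustrate the directory traversal concern: a minimal sketch of the kind
of check one could run on each archive member name before extracting with
paths kept. The function name is_safe_member is hypothetical, not part of
rarfile or game-data-packager, and this says nothing about exploits inside
the unpacker itself.

```python
import os


def is_safe_member(name, dest_dir):
    """Return True if extracting `name` under `dest_dir` stays inside it.

    Hypothetical helper, only a sketch of the idea.
    """
    # Archive member names may use backslashes as separators.
    name = name.replace('\\', '/')
    # Reject absolute paths outright.
    if name.startswith('/') or os.path.isabs(name):
        return False
    dest = os.path.realpath(dest_dir)
    target = os.path.realpath(os.path.join(dest, name))
    # The resolved target must still live under the destination directory.
    return os.path.commonpath([dest, target]) == dest
```

A member such as "id1/pak0.pak" passes, while "../../etc/passwd" or an
absolute path is rejected.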


> However, if they accidentally run malware as a result of unpacking
> warez'd game data that contains a successful exploit for the unpacker, I
> want it to be unambiguously not our fault.

I agree, too much risk for too little gain.

GDP is currently built like a tank, and I want to keep it that way too :-)
Especially after having read so many fragile, case-sensitive packaging shell scripts.

> Regarding the performance point specifically, here is something to bear
> in mind. One day, I would like to be able to do something like
> 
>     game-data-packager anything ~/Downloads ~/GameMedia

I gave it a try:
https://github.com/a-detiste/game-data-packager/blob/stash/game_data_packager/stash.py

I first match files by size, and only those with size > 100000, to get a list of possible games.
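
The size-first matching can be sketched roughly as follows. This is not the
code from stash.py; the KNOWN_FILES table, the game names and the sizes are
made up for illustration, and only the 100000 threshold comes from the text
above.

```python
import os

MIN_SIZE = 100000  # threshold mentioned above; smaller files are ignored

# Hypothetical data: game name -> {filename: size in bytes}
KNOWN_FILES = {
    'game_a': {'data/big.dat': 200000, 'readme.txt': 50},
    'game_b': {'main.pak': 350000},
}


def build_size_index(known_files, min_size=MIN_SIZE):
    """Map each sufficiently large known size to the games that use it."""
    index = {}
    for game, files in known_files.items():
        for _filename, size in files.items():
            if size > min_size:
                index.setdefault(size, set()).add(game)
    return index


def candidate_games(directory, index):
    """Walk `directory` and return games with at least one size match."""
    found = set()
    for root, _dirs, names in os.walk(directory):
        for name in names:
            try:
                size = os.path.getsize(os.path.join(root, name))
            except OSError:
                continue  # unreadable or vanished file, skip it
            found |= index.get(size, set())
    return found
```

A size match only makes a game a candidate; the hash still has to be checked
before anything is packaged.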

This led me to discover some bugs in the .yaml files.

To make it easier to always have both a hash and a size for each file,
would a "compact" yaml data format be OK? Something like:
   size   md5  filename

When I was cobbling together Dreamweb_uk & Dreamweb_us,
I kept forgetting to append a "_uk" to some file;
this would make authoring easier.
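
A minimal sketch of how such "size md5 filename" lines could be parsed,
assuming whitespace-separated fields with the filename last (so it may
contain spaces). The function name parse_compact and the output layout are
hypothetical, not an existing game-data-packager format.

```python
def parse_compact(text):
    """Parse lines of the form 'size md5 filename' into a dict.

    Hypothetical format: blank lines and lines starting with '#' are skipped.
    """
    entries = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        try:
            # Split on the first two runs of whitespace only,
            # so the filename keeps any embedded spaces.
            size, md5, filename = line.split(None, 2)
        except ValueError:
            raise ValueError('line %d: expected "size md5 filename"' % lineno)
        if len(md5) != 32:
            raise ValueError('line %d: md5 must be 32 hex digits' % lineno)
        int(md5, 16)  # raises ValueError if not hexadecimal
        entries[filename] = {'size': int(size), 'md5': md5.lower()}
    return entries
```

Keeping the size and the hash on the same line as the filename means a
suffix like "_uk" only has to be typed once per file.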

> or the GUI equivalent, and have it produce any sensible .deb(s) that it
> possibly can from those inputs.

I imagine some GUI with a lot of "LEDs" that progressively turn green
as games are identified asynchronously, plus checkboxes to decide
what to package.




More information about the Pkg-games-devel mailing list