Bug#601383: Zookeeper cleanup cronjob swaps dataDir and snapDir

Paul Paradise paul.paradise at socrata.com
Mon Oct 25 16:42:21 UTC 2010


Package: zookeeper
Version: 3.3.1+dfsg1-2

The cronjob installed as part of the Zookeeper package parses the 
zoo.cfg file to determine the arguments to pass when running:

java -cp $CLASSPATH $JVMFLAGS \
      org.apache.zookeeper.server.PurgeTxnLog $DATADIR $DATALOGDIR \
      -c $KEEPCOUNT

These variable names are consistent with the parameter names in zoo.cfg, 
and the command works fine if you don't have a separate dataDir and 
dataLogDir. However, if dataDir and dataLogDir are set to separate 
paths, the two arguments are passed in the wrong order: thanks to some 
confusing naming conventions on Zookeeper's part, they're backwards from 
what you would expect. From the relevant documentation at 
http://hadoop.apache.org/zookeeper/docs/r3.3.1/zookeeperAdmin.html#sc_configuration

dataDir
the location where ZooKeeper will store the in-memory database snapshots 
and, unless specified otherwise, the transaction log of updates to the 
database.

dataLogDir
This option will direct the machine to write the transaction log to the 
dataLogDir rather than the dataDir. This allows a dedicated log device 
to be used, and helps avoid competition between logging and snapshots.

The first parameter to PurgeTxnLog should be where the transaction logs 
are (dataLogDir, falling back to dataDir if dataLogDir is empty) and the 
second parameter should be where the snapshots live (dataDir).
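
As a rough sketch of the argument order described above (not the 
packaged cronjob itself), something like the following could be used. 
The purge_dirs helper and the example paths are hypothetical; the -c 
flag and the variable names are copied unchanged from the cronjob 
command quoted earlier in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the two PurgeTxnLog directory arguments in
# the order described above (transaction log dir first, snapshot dir
# second). $1 is the dataDir value, $2 is the dataLogDir value.
purge_dirs() {
    datadir="$1"
    datalogdir="$2"
    # Fall back to dataDir when dataLogDir is unset or empty.
    txnlogdir="${datalogdir:-$datadir}"
    printf '%s %s\n' "$txnlogdir" "$datadir"
}

# Example values only; the real cronjob reads these from zoo.cfg.
DATADIR=/var/lib/zookeeper
DATALOGDIR=/var/log/zookeeper-txn

# Split the helper's output into $1 and $2 (assumes paths without spaces).
set -- $(purge_dirs "$DATADIR" "$DATALOGDIR")

# Print the command rather than running it, so the order can be checked.
echo java -cp "$CLASSPATH" $JVMFLAGS \
    org.apache.zookeeper.server.PurgeTxnLog "$1" "$2" -c "$KEEPCOUNT"
```

With dataLogDir empty, both arguments come out as dataDir, which matches 
the case that already works today.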






