Go Back   { mindfrost82.com } > Gadget Corner > Tech Newsgroups > Linux > Slackware

07-22-2008, 06:06 AM
Grant
 
info: using lzma compression

Hi there,

After seeing filename.tar.lzma tarballs for dnsmasq recently, I looked
into using lzma here, with slackware-11.0.

Firstly, why bother? Well, on a 2.3MB datafile the lzma tarball comes out
at less than half the size of the bzip2 one:

-rw-r--r-- 1 grant wheel 3995 2008-07-22 10:12 ip2c-names
-rw-r--r-- 1 grant wheel 2304896 2008-07-22 10:13 ip2c-data
-rw-r--r-- 1 grant wheel 612749 2008-07-22 15:23 ip2c-database.tar.bz2
-rw-r--r-- 1 grant wheel 293557 2008-07-22 15:35 ip2c-database.tar.lzma
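
For anyone wanting the ratios rather than raw byte counts, here is a minimal sketch using the sizes from the listing above. The pct_of helper is my own hypothetical name, not part of any tool, and the last two figures compare against ip2c-data only, so they ignore ip2c-names and the tar headers.

```shell
# pct_of PART WHOLE: print PART as a percentage of WHOLE with one decimal
# place, truncated (not rounded) because the shell only does integer maths.
pct_of() {
    permille=$(( $1 * 1000 / $2 ))
    printf '%d.%d%%\n' $(( permille / 10 )) $(( permille % 10 ))
}

pct_of 293557 612749     # lzma tarball relative to the bzip2 tarball: 47.9%
pct_of 612749 2304896    # bzip2 tarball relative to ip2c-data:        26.5%
pct_of 293557 2304896    # lzma tarball relative to ip2c-data:         12.7%
```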

Lzma compression comes from the Windows world's 7-Zip archiver. 7-Zip publishes
an SDK under the LGPL. I downloaded the GPL'd unix source from:

http://tukaani.org/lzma/
http://tukaani.org/lzma/lzma-4.32.6.tar.gz

I had no problems compiling and installing the lzma utilities. The next step
was to add lzma to tar. There are patches in the source tarball but they don't
match the tar versions included with slack-11.0 or slack-12.1. Another
wrinkle is that slackware has two versions of tar installed, one for
pkgtools and the other for userspace (slackware-12.1):

grant@pooh:~$ ls -l /bin/tar*
-rwxr-xr-x 1 root root 233196 2006-12-14 16:37 /bin/tar*
-rwxr-xr-x 1 root root 115036 2006-12-14 16:37 /bin/tar-1.13*
lrwxrwxrwx 1 root root 3 2008-05-26 13:45 /bin/tar-1.16.1 -> tar*

Before patching tar myself, I checked for a newer release and found that the
latest, tar-1.20, does support lzma, though not via the single letter option
(-a) that the lzma utilities author used.

I ran the usual ./configure; make; su; make install and let tar-1.20 install
under /usr/local so it doesn't interfere with the slack tar; tar-1.20 is seen
first on the $PATH.
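
To double-check which tar wins, something like the following can help. It is only a sketch: list_on_path is a hypothetical helper name of mine, and it simply walks $PATH in order and prints every executable match, so the first line printed is the one the shell will run.

```shell
# list_on_path NAME: print every executable called NAME found on $PATH,
# in lookup order. Hypothetical helper, not part of tar or Slackware.
list_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.          # an empty PATH entry means "."
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

list_on_path tar
```

If the new tar is picked up, its path under /usr/local should appear before /bin/tar in the output.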

See: http://www.gnu.org/software/tar/ for the latest tar source.

The new tar -a option picks the compressor from the suffix of the target
archive name:

grant@deltree:~/ip2c$ time tar cvaf ip2c-database.tar.bz2 ip2c-data ip2c-names
ip2c-data
ip2c-names

real 0m4.452s
user 0m4.230s
sys 0m0.130s
grant@deltree:~/ip2c$ time tar cvaf ip2c-database.tar.lzma ip2c-data ip2c-names
ip2c-data
ip2c-names

real 0m16.886s
user 0m16.549s
sys 0m0.253s

So you can see lzma takes much longer to compress the same files (about 3.8
times as long here), but decompression is much faster, at well under half the
bzip2 time (these times are on a 500MHz Celeron).

grant@deltree:~/ip2c/xxx$ time bzcat ../ip2c-database.tar.bz2 |tar xv
ip2c-data
ip2c-names

real 0m1.306s
user 0m1.150s
sys 0m0.153s
grant@deltree:~/ip2c/xxx$ time lzcat ../ip2c-database.tar.lzma |tar xv
ip2c-data
ip2c-names

real 0m0.484s
user 0m0.347s
sys 0m0.140s
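
To repeat the size comparison on other data without relying on tar -a, a rough sketch like this may do. compare_compressors is a hypothetical helper of my own; it pipes tar through each compressor it finds installed (gzip is included only as an extra baseline) and prints the resulting archive size in bytes.

```shell
# compare_compressors BASENAME FILE...
# Writes BASENAME.tar.gz / .tar.bz2 / .tar.lzma for whichever compressors
# are installed and prints "compressor size-in-bytes" for each one.
compare_compressors() {
    base=$1; shift
    for comp in gzip bzip2 lzma; do
        # Skip compressors that are not on this machine.
        command -v "$comp" >/dev/null 2>&1 || continue
        case $comp in
            gzip)  ext=gz ;;
            bzip2) ext=bz2 ;;
            lzma)  ext=lzma ;;
        esac
        tar cf - "$@" | "$comp" -c > "$base.tar.$ext" || return 1
        printf '%s %d\n' "$comp" $(( $(wc -c < "$base.tar.$ext") ))
    done
}

# Example: compare_compressors ip2c-database ip2c-data ip2c-names
```

Timing each compressor separately still needs the individual commands run under time, as in the transcripts above.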

Unfortunately tar has no single letter option for lzma decompression, like
the j in 'tar xvjf' for bzip2, and 'tar xvf ../ip2c-database.tar.lzma --lzma'
looks clumsier to me than the lzcat pipeline above.
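
One way around the missing short option is a small shell function that picks the decompressor from the suffix, much as tar -a does when compressing. This is only a sketch; xtar is a made-up name, and the .gz case is there for convenience rather than anything tested in this post.

```shell
# xtar ARCHIVE: extract a compressed tarball into the current directory,
# choosing the decompressor from the file suffix. Hypothetical helper.
xtar() {
    case $1 in
        *.tar.lzma)      decomp='lzcat' ;;
        *.tar.bz2)       decomp='bzcat' ;;
        *.tar.gz|*.tgz)  decomp='gzip -dc' ;;
        *) echo "xtar: unrecognised suffix: $1" >&2; return 1 ;;
    esac
    # $decomp is left unquoted on purpose so 'gzip -dc' splits into two words.
    $decomp "$1" | tar xvf -
}
```

With that in ~/.profile or similar, 'xtar ../ip2c-database.tar.lzma' does the same job as the lzcat pipeline.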

The large datafile I'm compressing is very repetitive, with about 92k
records like:

117440512 134217727 US
134217728 150994943 US
150994944 167772159 US
167772160 184549375 ZZ
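
Going by the ip2c file names, I am assuming each record is a start address, an end address and a country code, with the addresses stored as plain 32-bit integers. Under that assumption, a short sketch to turn a number back into dotted-quad form (int_to_ip is a hypothetical name):

```shell
# int_to_ip N: print the 32-bit integer N as a dotted-quad IPv4 address.
# Assumes the records hold addresses as unsigned 32-bit integers.
int_to_ip() {
    n=$1
    printf '%d.%d.%d.%d\n' \
        $(( (n >> 24) & 255 )) $(( (n >> 16) & 255 )) \
        $(( (n >> 8) & 255 ))  $(( n & 255 ))
}

int_to_ip 117440512    # 7.0.0.0
int_to_ip 134217727    # 7.255.255.255
```

Read that way, each sample record covers one /8 block, which fits with how repetitive the file is.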

The lzma web page claims: "Average compression ratio of LZMA is about
30% better than that of gzip, and 15% better than that of bzip2."

Grant.
--
http://bugsplatter.mine.nu/