Report forwarded to debian-bugs-dist@lists.debian.org, Anthony Towns <ajt@debian.org>:
Bug#50731; Package cruft.

Received: (at submit) by bugs.debian.org; 20 Nov 1999 15:04:42 +0000
To: submit@bugs.debian.org
From: pmaydell@chiark.greenend.org.uk
Subject: cruft: uses too much disk space
Date: Sat, 20 Nov 1999 02:27:16 +0000
Sender: pm215@watchdragon.demon.co.uk

Package: cruft
Version: 0.9.5
Severity: wishlist

The way cruft is designed causes it to use rather a lot of disk space.
For example, when producing a report on my fairly small system
(filesystems 1GB, 250MB, 250MB) it put 15MB of stuff into
/var/spool/cruft/. Surely this could be reduced somewhat by a cleverer
algorithm?

For example, /usr/lib/cruft/explain/dev is just a 'find /dev' command.
This means that all the files in /dev are listed in a file in the spool.
It would be better to allow something like a wildcard or regexp syntax
so you didn't have to list all the files in /dev explicitly. This gets
much worse if you use this strategy for directories like /usr/local,
naturally.

I think that a better way to do the job would be to first create the
files to be used to filter the file names, as we do now (expl_* and
need_*), but then to do the weeding out of 'OK' files on the fly as part
of the pipeline 'find $DRIVE...', rather than creating large file_* files
and then processing them. Obviously this would be quite a bit of work :->

Peter Maydell

Acknowledgement sent to pmaydell@chiark.greenend.org.uk:
New Bug report received and forwarded. Copy sent to Anthony Towns <ajt@debian.org>.

From: owner@bugs.debian.org (Debian Bug Tracking System)
To: pmaydell@chiark.greenend.org.uk
Subject: Bug#50731: Acknowledgement (cruft: uses too much disk space)
X-Debian-PR-Message: ack 50731

Thank you for the problem report you have sent regarding Debian.
This is an automatically generated reply, to let you know your message
has been received. It is being forwarded to the developers mailing list
for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 Anthony Towns

If you wish to submit further information on your problem, please send
it to 50731@bugs.debian.org (and *not* to bugs@bugs.debian.org).

Please do not reply to the address at the top of this message,
unless you wish to report a problem with the Bug-tracking system.

Darren Benham
(administrator, Debian Bugs database)

Tags added: confirmed
Request was from Marcin Owsiany <porridge@debian.org> to control@bugs.debian.org.

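Note on the original report: the on-the-fly filtering it proposes could look roughly like the sketch below. This is not cruft's code. The function name `unexplained` and the idea of a pattern file holding one extended regexp per line are assumptions made for illustration only; the report itself only asks for "something like a wildcard or regexp syntax".

```shell
#!/bin/sh
# Hypothetical sketch: weed out "explained" paths while find is still
# running, so no complete file_* list has to be written to the spool.
# $1 = directory to scan
# $2 = pattern file, one extended regexp per line (assumed format),
#      for example:  ^/dev(/|$)
unexplained() {
    # grep -v drops every path matching any pattern in the file;
    # it exits non-zero when nothing is left over.
    find "$1" -xdev -print | grep -v -E -f "$2"
}

# Illustrative call (paths are placeholders):
#   unexplained / /path/to/patterns > unexplained.list
```

With a single pattern such as `^/dev(/|$)`, the whole of /dev is covered by one line in the pattern file instead of one spool entry per device node.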
Received: (at control) by bugs.debian.org; 31 Aug 2005 09:48:46 +0000
From: Marcin Owsiany
To: control@bugs.debian.org
Subject: tagging 50731
Date: Wed, 31 Aug 2005 11:48:32 +0200
X-BTS-Version: 2.9.5

# Automatically generated email from bts, devscripts version 2.9.5
# obviously cruft is just a POC now
tags 50731 confirmed

Information forwarded to debian-bugs-dist@lists.debian.org, Anthony Towns <ajt@debian.org>:
Bug#50731; Package cruft.

Received: (at 50731) by bugs.debian.org; 4 Feb 2007 18:57:15 +0000
To: 50731@bugs.debian.org
Subject: cruft: suggest using compression (gzip?) for files in /var/spool/cruft
From: Joe Wells
Date: Sun, 04 Feb 2007 18:49:37 +0000
Message-ID: <861wl5pxku.fsf@macs.hw.ac.uk>

I agree that cruft uses too much disk space.

For example, on my system, "du -s /var/spool/cruft" says 45940 (KiB),
but after running "gzip /var/spool/cruft/*" the output of "du -s
/var/spool/cruft" changes to 4252, which is only 9% of the
uncompressed size.

How hard would it be to change cruft to save its files in compressed
(e.g., maybe using gzip) format?

Note that this would be a much simpler change than the earlier
suggestion from 1999 in this bug report.

-- 
Joe Wells

Acknowledgement sent to Joe Wells <jbw@macs.hw.ac.uk>:
Extra info received and forwarded to list. Copy sent to Anthony Towns <ajt@debian.org>.

From: owner@bugs.debian.org (Debian Bug Tracking System)
To: Joe Wells
Subject: Bug#50731: Info received (cruft: suggest using compression (gzip?) for files in /var/spool/cruft)
References: <861wl5pxku.fsf@macs.hw.ac.uk>
X-Debian-PR-Message: ack-info 50731
Reply-To: 50731@bugs.debian.org

Thank you for the additional information you have supplied regarding
this problem report. It has been forwarded to the package maintainer(s)
and to other interested parties to accompany the original report.

Your message has been sent to the package maintainer(s):
 Anthony Towns

If you wish to continue to submit further information on this problem,
please send it to 50731@bugs.debian.org, as before.

Please do not reply to the address at the top of this message,
unless you wish to report a problem with the Bug-tracking system.

Debian bug tracking system administrator
(administrator, Debian Bugs database)

Information forwarded to debian-bugs-dist@lists.debian.org, Anthony Towns <ajt@debian.org>:
Bug#50731; Package cruft.

Received: (at 50731) by bugs.debian.org; 4 Feb 2007 20:20:41 +0000
Date: Sun, 4 Feb 2007 20:20:05 +0000
From: Marcin Owsiany
To: Joe Wells, 50731@bugs.debian.org
Subject: Re: Bug#50731: cruft: suggest using compression (gzip?) for files in /var/spool/cruft
Message-ID: <20070204202005.GB14578@kufelek>
References: <861wl5pxku.fsf@macs.hw.ac.uk>
In-Reply-To: <861wl5pxku.fsf@macs.hw.ac.uk>

On Sun, Feb 04, 2007 at 06:49:37PM +0000, Joe Wells wrote:
> I agree that cruft uses too much disk space.
>
> For example, on my system, "du -s /var/spool/cruft" says 45940 (KiB),
> but after running "gzip /var/spool/cruft/*" the output of "du -s
> /var/spool/cruft" changes to 4252, which is only 9% of the
> uncompressed size.

This sounds like a clever idea. If you feel like it, could you please
do some size and time change measurements for this data, for different
compression levels (say, --fast, default and --best)?

> How hard would it be to change cruft to save its files in compressed
> (e.g., maybe using gzip) format?

Saving would be trivial - just filtering data through gzip in a shell
script. However, generating a report from such files would not be so
straightforward, as an important part of it is done by a small program
written in C, which opens the files on its own, so this would require
adding zlib support to it.

I'm not saying that it's impossible, but I'm currently looking at
reimplementing some of cruft's guts, so it's going to wait at least
until I finish that.

Marcin
-- 
Marcin Owsiany http://marcin.owsiany.pl/
GnuPG: 1024D/60F41216 FE67 DA2D 0ACA FC5E 3F75 D6F6 3A0D 8AA0 60F4 1216

Acknowledgement sent to Marcin Owsiany <porridge@debian.org>:
Extra info received and forwarded to list. Copy sent to Anthony Towns <ajt@debian.org>.

From: owner@bugs.debian.org (Debian Bug Tracking System)
To: Marcin Owsiany
Subject: Bug#50731: Info received (Bug#50731: cruft: suggest using compression (gzip?) for files in /var/spool/cruft)
References: <20070204202005.GB14578@kufelek>
X-Debian-PR-Message: ack-info 50731
Reply-To: 50731@bugs.debian.org

Thank you for the additional information you have supplied regarding
this problem report. It has been forwarded to the package maintainer(s)
and to other interested parties to accompany the original report.

Your message has been sent to the package maintainer(s):
 Anthony Towns

If you wish to continue to submit further information on this problem,
please send it to 50731@bugs.debian.org, as before.

Please do not reply to the address at the top of this message,
unless you wish to report a problem with the Bug-tracking system.

Debian bug tracking system administrator
(administrator, Debian Bugs database)

Information forwarded to debian-bugs-dist@lists.debian.org, Anthony Towns <ajt@debian.org>:
Bug#50731; Package cruft.

Received: (at 50731) by bugs.debian.org; 5 Feb 2007 16:30:40 +0000
To: Marcin Owsiany
Cc: 50731@bugs.debian.org
Subject: Bug#50731: cruft: suggest using compression (gzip?) for files in /var/spool/cruft
References: <861wl5pxku.fsf@macs.hw.ac.uk> <20070204202005.GB14578@kufelek>
From: Joe Wells
Date: Mon, 05 Feb 2007 16:30:04 +0000
In-Reply-To: <20070204202005.GB14578@kufelek> (Marcin Owsiany's message of "Sun, 4 Feb 2007 20:20:05 +0000")
Message-ID: <86veigo9df.fsf@macs.hw.ac.uk>

Marcin Owsiany writes:

> On Sun, Feb 04, 2007 at 06:49:37PM +0000, Joe Wells wrote:
>> I agree that cruft uses too much disk space.
>>
>> For example, on my system, "du -s /var/spool/cruft" says 45940 (KiB),
>> but after running "gzip /var/spool/cruft/*" the output of "du -s
>> /var/spool/cruft" changes to 4252, which is only 9% of the
>> uncompressed size.
>
> This sounds like a clever idea. If you feel like it, could you please
> do some size and time change measurements for this data, for different
> compression levels (say, --fast, default and --best)?

Sure. Output included below. Quick summary:

-9 (same as --best) gets a compressed result that is 8.5% of the
uncompressed size. -1 (same as --fast) gets a compressed result that is
10.3% of the uncompressed size. The -9 compressed result is 83% of the
size of the -1 compressed result. The default (same as -6) compressed
size is 9% of the uncompressed size.

Compressing with -9 takes 460% of the time used to compress with -1 and
280% of the time used to compress with -6. The compression time with -9
is only 6 seconds, though, so it is probably much smaller than the time
taken to gather the data being compressed.

Uncompression time seems to depend only on the compressed size, so it
did not vary much.

bzip2 compresses better (7.4% of uncompressed size) but takes a *lot*
longer to compress and uncompress. (bzip2 also uses tons of memory when
compressing and uncompressing.)

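Note: the percentages in the summary can be rechecked from the byte counts and times reported in this message. A small sketch of that arithmetic follows; the helper name `pct` is illustrative and not part of cruft or of the original message.

```shell
#!/bin/sh
# Print $1 as a percentage of $2, rounded to one decimal place.
pct() {
    LC_ALL=C awk -v part="$1" -v whole="$2" \
        'BEGIN { printf "%.1f\n", 100 * part / whole }'
}

pct 3909493 45760120   # gzip -9 size vs. uncompressed  -> 8.5
pct 4723511 45760120   # gzip -1 size vs. uncompressed  -> 10.3
pct 4108964 45760120   # gzip -6 size vs. uncompressed  -> 9.0
pct 3380903 45760120   # bzip2 size vs. uncompressed    -> 7.4
pct 3909493 4723511    # gzip -9 size vs. gzip -1 size  -> 82.8
pct 6.26 1.35          # gzip -9 time vs. gzip -1 time  -> 463.7
pct 6.26 2.23          # gzip -9 time vs. gzip -6 time  -> 280.7
```

The results agree with the rounded figures quoted in the summary (83%, 460% and 280%).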
Here are the stats:

uncompressed size:                 45760120
size compressed with "gzip -9":     3909493
size compressed with "gzip -1":     4723511
size compressed with "gzip -6":     4108964
size compressed with "bzip2":       3380903

time to compress with "gzip -9":      6.17u + 0.09s =  6.26
time to compress with "gzip -1":      1.26u + 0.09s =  1.35
time to compress with "gzip -6":      2.15u + 0.08s =  2.23
time to compress with "bzip2":       26.83u + 0.18s = 27.01

time to uncompress with "gzip -9":    0.49u + 0.16s =  0.65
time to uncompress with "gzip -1":    0.52u + 0.16s =  0.68
time to uncompress with "gzip -6":    0.48u + 0.17s =  0.65
time to uncompress with "bzip2":      3.41u + 0.21s =  3.62

>> How hard would it be to change cruft to save its files in compressed
>> (e.g., maybe using gzip) format?
>
> Saving would be trivial - just filtering data through gzip in a shell
> script. However, generating a report from such files would not be so
> straightforward, as an important part of it is done by a small program
> written in C, which opens the files on its own, so this would require
> adding zlib support to it.
>
> I'm not saying that it's impossible, but I'm currently looking at
> reimplementing some of cruft's guts, so it's going to wait at least
> until I finish that.

Okay. Hope it can be done, because it would be really nice to save all
that space (my disks are always nearly full!).

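Note: the "filtering data through gzip in a shell script" step quoted above might look like the sketch below. It is only an illustration under assumptions: the function names `save_list` and `read_list` are made up here, and nothing in this log shows how cruft actually writes its spool files. Level -6 is gzip's default, which the measurements put at about 9% of the uncompressed size.

```shell
#!/bin/sh
# Hypothetical sketch of writing and reading a compressed file list.

# $1 = directory to scan, $2 = output file (written gzip-compressed)
save_list() {
    find "$1" -xdev -print | gzip -6 > "$2"
}

# $1 = gzip-compressed list; prints the original lines on stdout
read_list() {
    gzip -dc "$1"
}

# Illustrative calls (paths are placeholders):
#   save_list /home /var/spool/cruft/file_in_home.gz
#   read_list /var/spool/cruft/file_in_home.gz | sort
```

Shell consumers can read the list back through a pipe as shown. The C report generator opens the files itself, so it would still need zlib support (for example zlib's `gzopen` and `gzgets`), which this sketch does not cover.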
-- 
Joe Wells

======================================================================
Script started on 2007-02-05 T 15:47:02

> sudo time --verbose gzip -9 *
        Command being timed: "gzip -9 expl_alternatives expl_dev expl_diversions expl_dpkg expl_users file_in_dev file_in_dev-pts file_in_dev-shm file_in_home file_in_mnt-shared-data2 file_in_proc-bus-usb file_in_sys file_in_var-lock file_in_var-run file_root miss_alternatives miss_dev miss_diversions miss_dpkg miss_users need_alternatives need_link_dests report unex_in_home unex_in_mnt-shared-data2 unex_in_proc-bus-usb unex_in_sys unex_in_var-lock unex_in_var-run unex_root want_alternatives want_link_dests"
        User time (seconds): 6.17
        System time (seconds): 0.09
        Percent of CPU this job got: 93%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.67
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 215
        Voluntary context switches: 1
        Involuntary context switches: 6353
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

> gzip -l *
compressed uncompressed  ratio uncompressed_name
      1458         8928  84.1% expl_alternatives
      4046        25080  84.0% expl_dev
       349         1992  84.2% expl_diversions
    847072      8510446  90.0% expl_dpkg
    944074     12772141  92.6% expl_users
        86          179  70.9% file_in_dev-pts
        45            9 -22.2% file_in_dev-shm
      2465        12114  79.9% file_in_dev
    941235     12755631  92.6% file_in_home
       833         9286  91.5% file_in_mnt-shared-data2
        99          258  76.7% file_in_proc-bus-usb
     21455       214812  90.0% file_in_sys
        71           50  28.0% file_in_var-lock
       312          907  69.3% file_in_var-run
    903342      9035497  90.0% file_root
       107           95  25.3% miss_alternatives
      1387        12796  89.4% miss_dev
       239         1215  83.1% miss_diversions
       511         3048  84.2% miss_dpkg
        82           95  44.2% miss_users
       748         3779  81.2% need_alternatives
     48503       407126  88.1% need_link_dests
     97358      1081703  91.0% report
       325          904  67.5% unex_in_home
       831         9268  91.5% unex_in_mnt-shared-data2
        99          258  76.7% unex_in_proc-bus-usb
     21453       214807  90.0% unex_in_sys
        59           22  -9.1% unex_in_var-lock
       294          810  67.9% unex_in_var-run
     54888       501895  89.1% unex_root
        71           36   2.8% want_alternatives
     15596       174933  91.1% want_link_dests
   3909493     45760120  91.5% (totals)

> sudo time --verbose gunzip *
        Command being timed: "gunzip expl_alternatives.gz expl_dev.gz expl_diversions.gz expl_dpkg.gz expl_users.gz file_in_dev-pts.gz file_in_dev-shm.gz file_in_dev.gz file_in_home.gz file_in_mnt-shared-data2.gz file_in_proc-bus-usb.gz file_in_sys.gz file_in_var-lock.gz file_in_var-run.gz file_root.gz miss_alternatives.gz miss_dev.gz miss_diversions.gz miss_dpkg.gz miss_users.gz need_alternatives.gz need_link_dests.gz report.gz unex_in_home.gz unex_in_mnt-shared-data2.gz unex_in_proc-bus-usb.gz unex_in_sys.gz unex_in_var-lock.gz unex_in_var-run.gz unex_root.gz want_alternatives.gz want_link_dests.gz"
        User time (seconds): 0.49
        System time (seconds): 0.16
        Percent of CPU this job got: 88%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.74
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 173
        Voluntary context switches: 1
        Involuntary context switches: 672
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

> sudo time --verbose gzip -1 *
        Command being timed: "gzip -1 expl_alternatives expl_dev expl_diversions expl_dpkg expl_users file_in_dev file_in_dev-pts file_in_dev-shm file_in_home file_in_mnt-shared-data2 file_in_proc-bus-usb file_in_sys file_in_var-lock file_in_var-run file_root miss_alternatives miss_dev miss_diversions miss_dpkg miss_users need_alternatives need_link_dests report unex_in_home unex_in_mnt-shared-data2 unex_in_proc-bus-usb unex_in_sys unex_in_var-lock unex_in_var-run unex_root want_alternatives want_link_dests"
        User time (seconds): 1.26
        System time (seconds): 0.09
        Percent of CPU this job got: 93%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.45
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 217
        Voluntary context switches: 1
        Involuntary context switches: 1386
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

> gzip -l *
compressed uncompressed  ratio uncompressed_name
      1661         8928  81.8% expl_alternatives
      4695        25080  81.4% expl_dev
       412         1992  81.0% expl_diversions
   1042721      8510446  87.7% expl_dpkg
   1120052     12772141  91.2% expl_users
        87          179  70.4% file_in_dev-pts
        45            9 -22.2% file_in_dev-shm
      2885        12114  76.4% file_in_dev
   1116679     12755631  91.2% file_in_home
      1003         9286  89.7% file_in_mnt-shared-data2
       103          258  75.2% file_in_proc-bus-usb
     25651       214812  88.1% file_in_sys
        71           50  28.0% file_in_var-lock
       332          907  67.1% file_in_var-run
   1112710      9035497  87.7% file_root
       107           95  25.3% miss_alternatives
      1607        12796  87.7% miss_dev
       277         1215  80.0% miss_diversions
       613         3048  80.8% miss_dpkg
        82           95  44.2% miss_users
       818         3779  79.3% need_alternatives
     58538       407126  85.6% need_link_dests
    118862      1081703  89.0% report
       331          904  66.8% unex_in_home
       995         9268  89.7% unex_in_mnt-shared-data2
       103          258  75.2% unex_in_proc-bus-usb
     25643       214807  88.1% unex_in_sys
        59           22  -9.1% unex_in_var-lock
       315          810  65.3% unex_in_var-run
     67241       501895  86.6% unex_root
        71           36   2.8% want_alternatives
     18742       174933  89.3% want_link_dests
   4723511     45760120  89.7% (totals)

> sudo time --verbose gunzip *
        Command being timed: "gunzip expl_alternatives.gz expl_dev.gz expl_diversions.gz expl_dpkg.gz expl_users.gz file_in_dev-pts.gz file_in_dev-shm.gz file_in_dev.gz file_in_home.gz file_in_mnt-shared-data2.gz file_in_proc-bus-usb.gz file_in_sys.gz file_in_var-lock.gz file_in_var-run.gz file_root.gz miss_alternatives.gz miss_dev.gz miss_diversions.gz miss_dpkg.gz miss_users.gz need_alternatives.gz need_link_dests.gz report.gz unex_in_home.gz unex_in_mnt-shared-data2.gz unex_in_proc-bus-usb.gz unex_in_sys.gz unex_in_var-lock.gz unex_in_var-run.gz unex_root.gz want_alternatives.gz want_link_dests.gz"
        User time (seconds): 0.52
        System time (seconds): 0.16
        Percent of CPU this job got: 89%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.76
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 174
        Voluntary context switches: 1
        Involuntary context switches: 710
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

> sudo time --verbose gzip *
        Command being timed: "gzip expl_alternatives expl_dev expl_diversions expl_dpkg expl_users file_in_dev file_in_dev-pts file_in_dev-shm file_in_home file_in_mnt-shared-data2 file_in_proc-bus-usb file_in_sys file_in_var-lock file_in_var-run file_root miss_alternatives miss_dev miss_diversions miss_dpkg miss_users need_alternatives need_link_dests report unex_in_home unex_in_mnt-shared-data2 unex_in_proc-bus-usb unex_in_sys unex_in_var-lock unex_in_var-run unex_root want_alternatives want_link_dests"
        User time (seconds): 2.15
        System time (seconds): 0.08
        Percent of CPU this job got: 92%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.43
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 217
        Voluntary context switches: 1
        Involuntary context switches: 2249
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

> gzip -l *
compressed uncompressed  ratio uncompressed_name
      1498         8928  83.6% expl_alternatives
      4235        25080  83.2% expl_dev
       350         1992  84.1% expl_diversions
    890947      8510446  89.5% expl_dpkg
    989479     12772141  92.3% expl_users
        86          179  70.9% file_in_dev-pts
        45            9 -22.2% file_in_dev-shm
      2486        12114  79.7% file_in_dev
    986484     12755631  92.3% file_in_home
       943         9286  90.3% file_in_mnt-shared-data2
        99          258  76.7% file_in_proc-bus-usb
     22579       214812  89.5% file_in_sys
        71           50  28.0% file_in_var-lock
       310          907  69.6% file_in_var-run
    952426      9035497  89.5% file_root
       107           95  25.3% miss_alternatives
      1542        12796  88.2% miss_dev
       240         1215  83.0% miss_diversions
       512         3048  84.1% miss_dpkg
        82           95  44.2% miss_users
       753         3779  81.0% need_alternatives
     50052       407126  87.7% need_link_dests
    104366      1081703  90.4% report
       325          904  67.5% unex_in_home
       938         9268  90.3% unex_in_mnt-shared-data2
        99          258  76.7% unex_in_proc-bus-usb
     22576       214807  89.5% unex_in_sys
        59           22  -9.1% unex_in_var-lock
       293          810  68.0% unex_in_var-run
     58314       501895  88.4% unex_root
        71           36   2.8% want_alternatives
     16597       174933  90.5% want_link_dests
   4108964     45760120  91.0% (totals)

> sudo time --verbose gunzip *
        Command being timed: "gunzip expl_alternatives.gz expl_dev.gz expl_diversions.gz expl_dpkg.gz expl_users.gz file_in_dev-pts.gz file_in_dev-shm.gz file_in_dev.gz file_in_home.gz file_in_mnt-shared-data2.gz file_in_proc-bus-usb.gz file_in_sys.gz file_in_var-lock.gz file_in_var-run.gz file_root.gz miss_alternatives.gz miss_dev.gz miss_diversions.gz miss_dpkg.gz miss_users.gz need_alternatives.gz need_link_dests.gz report.gz unex_in_home.gz unex_in_mnt-shared-data2.gz unex_in_proc-bus-usb.gz unex_in_sys.gz unex_in_var-lock.gz unex_in_var-run.gz unex_root.gz want_alternatives.gz want_link_dests.gz"
        User time (seconds): 0.48
        System time (seconds): 0.17
        Percent of CPU this job got: 88%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.74
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 172
        Voluntary context switches: 2
        Involuntary context switches: 663
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

> sudo time --verbose bzip2 -v *
  expl_alternatives: 6.498:1, 1.231 bits/byte, 84.61% saved, 8928 in, 1374 out.
  expl_dev: 7.632:1, 1.048 bits/byte, 86.90% saved, 25080 in, 3286 out.
  expl_diversions: 5.298:1, 1.510 bits/byte, 81.12% saved, 1992 in, 376 out.
  expl_dpkg: 11.138:1, 0.718 bits/byte, 91.02% saved, 8510446 in, 764122 out.
  expl_users: 16.365:1, 0.489 bits/byte, 93.89% saved, 12772141 in, 780470 out.
  file_in_dev: 5.802:1, 1.379 bits/byte, 82.76% saved, 12114 in, 2088 out.
  file_in_dev-pts: 2.157:1, 3.709 bits/byte, 53.63% saved, 179 in, 83 out.
  file_in_dev-shm: 0.184:1, 43.556 bits/byte, -444.44% saved, 9 in, 49 out.
  file_in_home: 16.388:1, 0.488 bits/byte, 93.90% saved, 12755631 in, 778375 out.
  file_in_mnt-shared-data2: 12.498:1, 0.640 bits/byte, 92.00% saved, 9286 in, 743 out.
  file_in_proc-bus-usb: 2.745:1, 2.915 bits/byte, 63.57% saved, 258 in, 94 out.
  file_in_sys: 13.765:1, 0.581 bits/byte, 92.74% saved, 214812 in, 15606 out.
  file_in_var-lock: 0.667:1, 12.000 bits/byte, -50.00% saved, 50 in, 75 out.
  file_in_var-run: 2.660:1, 3.008 bits/byte, 62.40% saved, 907 in, 341 out.
file_root: 11.130:1, 0.719 bits/byte, 91.02% saved, 9035497 in, 811819 out. miss_alternatives: 0.856:1, 9.347 bits/byte, -16.84% saved, 95 in, 111 out. miss_dev: 11.274:1, 0.710 bits/byte, 91.13% saved, 12796 in, 1135 out. miss_diversions: 4.483:1, 1.784 bits/byte, 77.70% saved, 1215 in, 271 out. miss_dpkg: 5.655:1, 1.415 bits/byte, 82.32% saved, 3048 in, 539 out. miss_users: 1.105:1, 7.242 bits/byte, 9.47% saved, 95 in, 86 out. need_alternatives: 4.694:1, 1.704 bits/byte, 78.70% saved, 3779 in, 805 out. need_link_dests: 8.842:1, 0.905 bits/byte, 88.69% saved, 407126 in, 46046 out. report: 12.195:1, 0.656 bits/byte, 91.80% saved, 1081703 in, 88699 out. unex_in_home: 2.354:1, 3.398 bits/byte, 57.52% saved, 904 in, 384 out. unex_in_mnt-shared-data2: 12.872:1, 0.621 bits/byte, 92.23% saved, 9268 in, 720 out. unex_in_proc-bus-usb: 2.745:1, 2.915 bits/byte, 63.57% saved, 258 in, 94 out. unex_in_sys: 13.783:1, 0.580 bits/byte, 92.74% saved, 214807 in, 15585 out. unex_in_var-lock: 0.355:1, 22.545 bits/byte, -181.82% saved, 22 in, 62 out. unex_in_var-run: 2.477:1, 3.230 bits/byte, 59.63% saved, 810 in, 327 out. unex_root: 9.718:1, 0.823 bits/byte, 89.71% saved, 501895 in, 51644 out. want_alternatives: 0.500:1, 16.000 bits/byte, -100.00% saved, 36 in, 72 out. want_link_dests: 11.343:1, 0.705 bits/byte, 91.18% saved, 174933 in, 15422 out. 
Command being timed: "bzip2 -v expl_alternatives expl_dev expl_diversions expl_dpkg expl_users file_in_dev file_in_dev-pts file_in_dev-shm file_in_home file_in_mnt-shared-data2 file_in_proc-bus-usb file_in_sys file_in_var-lock file_in_var-run file_root miss_alternatives miss_dev miss_diversions miss_dpkg miss_users need_alternatives need_link_dests report unex_in_home unex_in_mnt-shared-data2 unex_in_proc-bus-usb unex_in_sys unex_in_var-lock unex_in_var-run unex_root want_alternatives want_link_dests" User time (seconds): 26.83 System time (seconds): 0.18 Percent of CPU this job got: 94% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:28.65 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 0 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 0 Minor (reclaiming a frame) page faults: 11610 Voluntary context switches: 1 Involuntary context switches: 28080 Swaps: 0 File system inputs: 0 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 > du -s 3424 . 
> sudo time --verbose bunzip2 * Command being timed: "bunzip2 expl_alternatives.bz2 expl_dev.bz2 expl_diversions.bz2 expl_dpkg.bz2 expl_users.bz2 file_in_dev-pts.bz2 file_in_dev-shm.bz2 file_in_dev.bz2 file_in_home.bz2 file_in_mnt-shared-data2.bz2 file_in_proc-bus-usb.bz2 file_in_sys.bz2 file_in_var-lock.bz2 file_in_var-run.bz2 file_root.bz2 miss_alternatives.bz2 miss_dev.bz2 miss_diversions.bz2 miss_dpkg.bz2 miss_users.bz2 need_alternatives.bz2 need_link_dests.bz2 report.bz2 unex_in_home.bz2 unex_in_mnt-shared-data2.bz2 unex_in_proc-bus-usb.bz2 unex_in_sys.bz2 unex_in_var-lock.bz2 unex_in_var-run.bz2 unex_root.bz2 want_alternatives.bz2 want_link_dests.bz2" User time (seconds): 3.41 System time (seconds): 0.21 Percent of CPU this job got: 56% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.40 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 0 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 0 Minor (reclaiming a frame) page faults: 6219 Voluntary context switches: 1 Involuntary context switches: 3681 Swaps: 0 File system inputs: 0 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 > exit Script done on 2007-02-05 T 15:55:14   Acknowledgement sent to Joe Wells <jbw@macs.hw.ac.uk>:
Extra info received and forwarded to list. Copy sent to Anthony Towns <ajt@debian.org>.

Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.417 (Entity 5.417)
Content-Type: text/plain; charset=utf-8
X-Loop: owner@bugs.debian.org
From: owner@bugs.debian.org (Debian Bug Tracking System)
To: Joe Wells
Subject: Bug#50731: Info received (Bug#50731: cruft: suggest using compression (gzip?) for files in /var/spool/cruft)
Message-Id:
References: <86veigo9df.fsf@macs.hw.ac.uk>
X-Debian-PR-Message: ack-info 50731
X-Debian-PR-Package: cruft
X-Debian-PR-Keywords: confirmed
X-Debian-PR-Source: cruft
Reply-To: 50731@bugs.debian.org

Thank you for the additional information you have supplied regarding
this problem report. It has been forwarded to the package maintainer(s)
and to other interested parties to accompany the original report.

Your message has been sent to the package maintainer(s):
 Anthony Towns

If you wish to continue to submit further information on this problem,
please send it to 50731@bugs.debian.org, as before.

Please do not reply to the address at the top of this message,
unless you wish to report a problem with the Bug-tracking system.

Debian bug tracking system administrator
(administrator, Debian Bugs database)

Received: (at 50731) by bugs.debian.org; 5 Feb 2007 16:30:40 +0000
From jbw@macs.hw.ac.uk Mon Feb 05 08:30:40 2007
Return-path:
Received: from izanami.macs.hw.ac.uk ([137.195.13.6]) by spohr.debian.org with esmtp (Exim 4.50) id 1HE6jn-0004Br-4z for 50731@bugs.debian.org; Mon, 05 Feb 2007 08:30:40 -0800
Received: from lxultra1.macs.hw.ac.uk ([137.195.27.173]:60286 helo=127.0.0.1) by izanami.macs.hw.ac.uk with smtp (Exim 4.51) id 1HE6jD-00068q-Oh; Mon, 05 Feb 2007 16:30:04 +0000
Received: (nullmailer pid 16394 invoked by uid 1001); Mon, 05 Feb 2007 16:30:04 -0000
To: Marcin Owsiany
Cc: 50731@bugs.debian.org
Subject: Re: Bug#50731: cruft: suggest using compression (gzip?) for files in /var/spool/cruft
References: <861wl5pxku.fsf@macs.hw.ac.uk> <20070204202005.GB14578@kufelek>
From: Joe Wells
Date: Mon, 05 Feb 2007 16:30:04 +0000
In-Reply-To: <20070204202005.GB14578@kufelek> (Marcin Owsiany's message of "Sun, 4 Feb 2007 20:20:05 +0000")
Message-ID: <86veigo9df.fsf@macs.hw.ac.uk>
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Spam-Checker-Version: SpamAssassin 2.60-bugs.debian.org_2005_01_02 (1.212-2003-09-23-exp) on spohr.debian.org
X-Spam-Level:
X-Spam-Status: No, hits=-1.6 required=4.0 tests=BAYES_44,HAS_BUG_NUMBER, RCVD_NUMERIC_HELO autolearn=no version=2.60-bugs.debian.org_2005_01_02

Marcin Owsiany writes:

> On Sun, Feb 04, 2007 at 06:49:37PM +0000, Joe Wells wrote:
>> I agree that cruft uses too much disk space.
>>
>> For example, on my system, "du -s /var/spool/cruft" says 45940 (KiB),
>> but after running "gzip /var/spool/cruft/*" the output of "du -s
>> /var/spool/cruft" changes to 4252, which is only 9% of the
>> uncompressed size.
>
> This sounds like a clever idea. If you feel like it, could you please
> do some size and time change measurements for this data, for different
> compression levels (say, --fast, default and --best)?

Sure. Output included below. Quick summary:

-9 (same as --best) gets a compressed result that is 8.5% of the
uncompressed size. -1 (same as --fast) gets a compressed result that
is 10.3% of the uncompressed size. The -9 compressed result is 83% of
the size of the -1 compressed result. The default (same as -6)
compressed size is 9% of the uncompressed size.

Compressing with -9 takes 460% of the time used to compress with -1
and 280% of the time used to compress with -6. The compression time
with -9 is only 6 seconds though, so it is probably much smaller than
the time taken to gather the data being compressed.

Uncompression time seems to depend only on the compressed size, so it
did not vary much.

bzip2 compresses better (7.4% of uncompressed size) but takes a *lot*
longer to compress and uncompress. (bzip2 also uses tons of memory
when compressing and uncompressing.)

Here are the stats:

  uncompressed size:                45760120
  size compressed with "gzip -9":    3909493
  size compressed with "gzip -1":    4723511
  size compressed with "gzip -6":    4108964
  size compressed with "bzip2":      3380903

  time to compress with "gzip -9":     6.17u + 0.09s =  6.26
  time to compress with "gzip -1":     1.26u + 0.09s =  1.35
  time to compress with "gzip -6":     2.15u + 0.08s =  2.23
  time to compress with "bzip2":      26.83u + 0.18s = 27.01

  time to uncompress with "gzip -9":   0.49u + 0.16s =  0.65
  time to uncompress with "gzip -1":   0.52u + 0.16s =  0.68
  time to uncompress with "gzip -6":   0.48u + 0.17s =  0.65
  time to uncompress with "bzip2":     3.41u + 0.21s =  3.62

>> How hard would it be to change cruft to save its files in compressed
>> (e.g., maybe using gzip) format?
>
> Saving would be trivial - just filtering data through gzip in a shell
> script. However generating a report from such files would not be so
> straightforward, as an important part of it is done by a small program
> written in C, which opens the files on its own, so this would require
> adding zlib support to it.
>
> I'm not saying that it's impossible, but I'm currently looking at
> reimplementing some of the cruft's guts, so it's going to wait at least
> until I finish that.

Okay. Hope it can be done, because it would be really nice to save all
that space (my disks are always nearly full!).
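
The trade-off Marcin describes (writing compressed files is easy, but whatever reads them back must understand gzip too) can be illustrated with a small sketch. This is not cruft's code: cruft's reader is a C program and would need zlib, whereas the Python below is only an illustration. The helper names `write_list` and `read_list` are hypothetical, and the sketch assumes the spool files hold one UTF-8 file name per line.

```python
import gzip
import os
import tempfile


def write_list(path, names, level=6):
    """Write one file name per line through gzip (hypothetical helper)."""
    with gzip.open(path, "wt", encoding="utf-8", compresslevel=level) as out:
        for name in names:
            out.write(name + "\n")


def read_list(path):
    """Yield file names from a spool file, gzip-compressed or plain."""
    # Peek at the two-byte gzip magic number to choose the right opener,
    # so old uncompressed spool files would keep working.
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rt", encoding="utf-8") as src:
        for line in src:
            yield line.rstrip("\n")


if __name__ == "__main__":
    names = ["/dev/null", "/dev/zero", "/home/user/.profile"]
    with tempfile.TemporaryDirectory() as spool:
        packed = os.path.join(spool, "file_in_dev.gz")
        write_list(packed, names)
        for name in read_list(packed):
            print(name)
```

Checking the magic number rather than the file extension is a design choice made for this sketch only; a real change to cruft might just as well decide by file name.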
--
Joe Wells

======================================================================
Script started on 2007-02-05 T 15:47:02
> sudo time --verbose gzip -9 *
    Command being timed: "gzip -9 expl_alternatives expl_dev expl_diversions expl_dpkg expl_users file_in_dev file_in_dev-pts file_in_dev-shm file_in_home file_in_mnt-shared-data2 file_in_proc-bus-usb file_in_sys file_in_var-lock file_in_var-run file_root miss_alternatives miss_dev miss_diversions miss_dpkg miss_users need_alternatives need_link_dests report unex_in_home unex_in_mnt-shared-data2 unex_in_proc-bus-usb unex_in_sys unex_in_var-lock unex_in_var-run unex_root want_alternatives want_link_dests"
    User time (seconds): 6.17
    System time (seconds): 0.09
    Percent of CPU this job got: 93%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.67
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 0
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 215
    Voluntary context switches: 1
    Involuntary context switches: 6353
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
> gzip -l *
compressed  uncompressed   ratio  uncompressed_name
      1458          8928   84.1%  expl_alternatives
      4046         25080   84.0%  expl_dev
       349          1992   84.2%  expl_diversions
    847072       8510446   90.0%  expl_dpkg
    944074      12772141   92.6%  expl_users
        86           179   70.9%  file_in_dev-pts
        45             9  -22.2%  file_in_dev-shm
      2465         12114   79.9%  file_in_dev
    941235      12755631   92.6%  file_in_home
       833          9286   91.5%  file_in_mnt-shared-data2
        99           258   76.7%  file_in_proc-bus-usb
     21455        214812   90.0%  file_in_sys
        71            50   28.0%  file_in_var-lock
       312           907   69.3%  file_in_var-run
    903342       9035497   90.0%  file_root
       107            95   25.3%  miss_alternatives
      1387         12796   89.4%  miss_dev
       239          1215   83.1%  miss_diversions
       511          3048   84.2%  miss_dpkg
        82            95   44.2%  miss_users
       748          3779   81.2%  need_alternatives
     48503        407126   88.1%  need_link_dests
     97358       1081703   91.0%  report
       325           904   67.5%  unex_in_home
       831          9268   91.5%  unex_in_mnt-shared-data2
        99           258   76.7%  unex_in_proc-bus-usb
     21453        214807   90.0%  unex_in_sys
        59            22   -9.1%  unex_in_var-lock
       294           810   67.9%  unex_in_var-run
     54888        501895   89.1%  unex_root
        71            36    2.8%  want_alternatives
     15596        174933   91.1%  want_link_dests
   3909493      45760120   91.5%  (totals)
> sudo time --verbose gunzip *
    Command being timed: "gunzip expl_alternatives.gz expl_dev.gz expl_diversions.gz expl_dpkg.gz expl_users.gz file_in_dev-pts.gz file_in_dev-shm.gz file_in_dev.gz file_in_home.gz file_in_mnt-shared-data2.gz file_in_proc-bus-usb.gz file_in_sys.gz file_in_var-lock.gz file_in_var-run.gz file_root.gz miss_alternatives.gz miss_dev.gz miss_diversions.gz miss_dpkg.gz miss_users.gz need_alternatives.gz need_link_dests.gz report.gz unex_in_home.gz unex_in_mnt-shared-data2.gz unex_in_proc-bus-usb.gz unex_in_sys.gz unex_in_var-lock.gz unex_in_var-run.gz unex_root.gz want_alternatives.gz want_link_dests.gz"
    User time (seconds): 0.49
    System time (seconds): 0.16
    Percent of CPU this job got: 88%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.74
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 0
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 173
    Voluntary context switches: 1
    Involuntary context switches: 672
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
> sudo time --verbose gzip -1 *
    Command being timed: "gzip -1 expl_alternatives expl_dev expl_diversions expl_dpkg expl_users file_in_dev file_in_dev-pts file_in_dev-shm file_in_home file_in_mnt-shared-data2 file_in_proc-bus-usb file_in_sys file_in_var-lock file_in_var-run file_root miss_alternatives miss_dev miss_diversions miss_dpkg miss_users need_alternatives need_link_dests report unex_in_home unex_in_mnt-shared-data2 unex_in_proc-bus-usb unex_in_sys unex_in_var-lock unex_in_var-run unex_root want_alternatives want_link_dests"
    User time (seconds): 1.26
    System time (seconds): 0.09
    Percent of CPU this job got: 93%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.45
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 0
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 217
    Voluntary context switches: 1
    Involuntary context switches: 1386
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
> gzip -l *
compressed  uncompressed   ratio  uncompressed_name
      1661          8928   81.8%  expl_alternatives
      4695         25080   81.4%  expl_dev
       412          1992   81.0%  expl_diversions
   1042721       8510446   87.7%  expl_dpkg
   1120052      12772141   91.2%  expl_users
        87           179   70.4%  file_in_dev-pts
        45             9  -22.2%  file_in_dev-shm
      2885         12114   76.4%  file_in_dev
   1116679      12755631   91.2%  file_in_home
      1003          9286   89.7%  file_in_mnt-shared-data2
       103           258   75.2%  file_in_proc-bus-usb
     25651        214812   88.1%  file_in_sys
        71            50   28.0%  file_in_var-lock
       332           907   67.1%  file_in_var-run
   1112710       9035497   87.7%  file_root
       107            95   25.3%  miss_alternatives
      1607         12796   87.7%  miss_dev
       277          1215   80.0%  miss_diversions
       613          3048   80.8%  miss_dpkg
        82            95   44.2%  miss_users
       818          3779   79.3%  need_alternatives
     58538        407126   85.6%  need_link_dests
    118862       1081703   89.0%  report
       331           904   66.8%  unex_in_home
       995          9268   89.7%  unex_in_mnt-shared-data2
       103           258   75.2%  unex_in_proc-bus-usb
     25643        214807   88.1%  unex_in_sys
        59            22   -9.1%  unex_in_var-lock
       315           810   65.3%  unex_in_var-run
     67241        501895   86.6%  unex_root
        71            36    2.8%  want_alternatives
     18742        174933   89.3%  want_link_dests
   4723511      45760120   89.7%  (totals)
> sudo time --verbose gunzip *
    Command being timed: "gunzip expl_alternatives.gz expl_dev.gz expl_diversions.gz expl_dpkg.gz expl_users.gz file_in_dev-pts.gz file_in_dev-shm.gz file_in_dev.gz file_in_home.gz file_in_mnt-shared-data2.gz file_in_proc-bus-usb.gz file_in_sys.gz file_in_var-lock.gz file_in_var-run.gz file_root.gz miss_alternatives.gz miss_dev.gz miss_diversions.gz miss_dpkg.gz miss_users.gz need_alternatives.gz need_link_dests.gz report.gz unex_in_home.gz unex_in_mnt-shared-data2.gz unex_in_proc-bus-usb.gz unex_in_sys.gz unex_in_var-lock.gz unex_in_var-run.gz unex_root.gz want_alternatives.gz want_link_dests.gz"
    User time (seconds): 0.52
    System time (seconds): 0.16
    Percent of CPU this job got: 89%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.76
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 0
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 174
    Voluntary context switches: 1
    Involuntary context switches: 710
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
> sudo time --verbose gzip *
    Command being timed: "gzip expl_alternatives expl_dev expl_diversions expl_dpkg expl_users file_in_dev file_in_dev-pts file_in_dev-shm file_in_home file_in_mnt-shared-data2 file_in_proc-bus-usb file_in_sys file_in_var-lock file_in_var-run file_root miss_alternatives miss_dev miss_diversions miss_dpkg miss_users need_alternatives need_link_dests report unex_in_home unex_in_mnt-shared-data2 unex_in_proc-bus-usb unex_in_sys unex_in_var-lock unex_in_var-run unex_root want_alternatives want_link_dests"
    User time (seconds): 2.15
    System time (seconds): 0.08
    Percent of CPU this job got: 92%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.43
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 0
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 217
    Voluntary context switches: 1
    Involuntary context switches: 2249
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
> gzip -l *
compressed  uncompressed   ratio  uncompressed_name
      1498          8928   83.6%  expl_alternatives
      4235         25080   83.2%  expl_dev
       350          1992   84.1%  expl_diversions
    890947       8510446   89.5%  expl_dpkg
    989479      12772141   92.3%  expl_users
        86           179   70.9%  file_in_dev-pts
        45             9  -22.2%  file_in_dev-shm
      2486         12114   79.7%  file_in_dev
    986484      12755631   92.3%  file_in_home
       943          9286   90.3%  file_in_mnt-shared-data2
        99           258   76.7%  file_in_proc-bus-usb
     22579        214812   89.5%  file_in_sys
        71            50   28.0%  file_in_var-lock
       310           907   69.6%  file_in_var-run
    952426       9035497   89.5%  file_root
       107            95   25.3%  miss_alternatives
      1542         12796   88.2%  miss_dev
       240          1215   83.0%  miss_diversions
       512          3048   84.1%  miss_dpkg
        82            95   44.2%  miss_users
       753          3779   81.0%  need_alternatives
     50052        407126   87.7%  need_link_dests
    104366       1081703   90.4%  report
       325           904   67.5%  unex_in_home
       938          9268   90.3%  unex_in_mnt-shared-data2
        99           258   76.7%  unex_in_proc-bus-usb
     22576        214807   89.5%  unex_in_sys
        59            22   -9.1%  unex_in_var-lock
       293           810   68.0%  unex_in_var-run
     58314        501895   88.4%  unex_root
        71            36    2.8%  want_alternatives
     16597        174933   90.5%  want_link_dests
   4108964      45760120   91.0%  (totals)
> sudo time --verbose gunzip *
    Command being timed: "gunzip expl_alternatives.gz expl_dev.gz expl_diversions.gz expl_dpkg.gz expl_users.gz file_in_dev-pts.gz file_in_dev-shm.gz file_in_dev.gz file_in_home.gz file_in_mnt-shared-data2.gz file_in_proc-bus-usb.gz file_in_sys.gz file_in_var-lock.gz file_in_var-run.gz file_root.gz miss_alternatives.gz miss_dev.gz miss_diversions.gz miss_dpkg.gz miss_users.gz need_alternatives.gz need_link_dests.gz report.gz unex_in_home.gz unex_in_mnt-shared-data2.gz unex_in_proc-bus-usb.gz unex_in_sys.gz unex_in_var-lock.gz unex_in_var-run.gz unex_root.gz want_alternatives.gz want_link_dests.gz"
    User time (seconds): 0.48
    System time (seconds): 0.17
    Percent of CPU this job got: 88%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.74
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 0
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 172
    Voluntary context switches: 2
    Involuntary context switches: 663
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
> sudo time --verbose bzip2 -v *
  expl_alternatives: 6.498:1, 1.231 bits/byte, 84.61% saved, 8928 in, 1374 out.
  expl_dev: 7.632:1, 1.048 bits/byte, 86.90% saved, 25080 in, 3286 out.
  expl_diversions: 5.298:1, 1.510 bits/byte, 81.12% saved, 1992 in, 376 out.
  expl_dpkg: 11.138:1, 0.718 bits/byte, 91.02% saved, 8510446 in, 764122 out.
  expl_users: 16.365:1, 0.489 bits/byte, 93.89% saved, 12772141 in, 780470 out.
  file_in_dev: 5.802:1, 1.379 bits/byte, 82.76% saved, 12114 in, 2088 out.
  file_in_dev-pts: 2.157:1, 3.709 bits/byte, 53.63% saved, 179 in, 83 out.
  file_in_dev-shm: 0.184:1, 43.556 bits/byte, -444.44% saved, 9 in, 49 out.
  file_in_home: 16.388:1, 0.488 bits/byte, 93.90% saved, 12755631 in, 778375 out.
  file_in_mnt-shared-data2: 12.498:1, 0.640 bits/byte, 92.00% saved, 9286 in, 743 out.
  file_in_proc-bus-usb: 2.745:1, 2.915 bits/byte, 63.57% saved, 258 in, 94 out.
  file_in_sys: 13.765:1, 0.581 bits/byte, 92.74% saved, 214812 in, 15606 out.
  file_in_var-lock: 0.667:1, 12.000 bits/byte, -50.00% saved, 50 in, 75 out.
  file_in_var-run: 2.660:1, 3.008 bits/byte, 62.40% saved, 907 in, 341 out.
  file_root: 11.130:1, 0.719 bits/byte, 91.02% saved, 9035497 in, 811819 out.
  miss_alternatives: 0.856:1, 9.347 bits/byte, -16.84% saved, 95 in, 111 out.
  miss_dev: 11.274:1, 0.710 bits/byte, 91.13% saved, 12796 in, 1135 out.
  miss_diversions: 4.483:1, 1.784 bits/byte, 77.70% saved, 1215 in, 271 out.
  miss_dpkg: 5.655:1, 1.415 bits/byte, 82.32% saved, 3048 in, 539 out.
  miss_users: 1.105:1, 7.242 bits/byte, 9.47% saved, 95 in, 86 out.
  need_alternatives: 4.694:1, 1.704 bits/byte, 78.70% saved, 3779 in, 805 out.
  need_link_dests: 8.842:1, 0.905 bits/byte, 88.69% saved, 407126 in, 46046 out.
  report: 12.195:1, 0.656 bits/byte, 91.80% saved, 1081703 in, 88699 out.
  unex_in_home: 2.354:1, 3.398 bits/byte, 57.52% saved, 904 in, 384 out.
  unex_in_mnt-shared-data2: 12.872:1, 0.621 bits/byte, 92.23% saved, 9268 in, 720 out.
  unex_in_proc-bus-usb: 2.745:1, 2.915 bits/byte, 63.57% saved, 258 in, 94 out.
  unex_in_sys: 13.783:1, 0.580 bits/byte, 92.74% saved, 214807 in, 15585 out.
  unex_in_var-lock: 0.355:1, 22.545 bits/byte, -181.82% saved, 22 in, 62 out.
  unex_in_var-run: 2.477:1, 3.230 bits/byte, 59.63% saved, 810 in, 327 out.
  unex_root: 9.718:1, 0.823 bits/byte, 89.71% saved, 501895 in, 51644 out.
  want_alternatives: 0.500:1, 16.000 bits/byte, -100.00% saved, 36 in, 72 out.
  want_link_dests: 11.343:1, 0.705 bits/byte, 91.18% saved, 174933 in, 15422 out.
    Command being timed: "bzip2 -v expl_alternatives expl_dev expl_diversions expl_dpkg expl_users file_in_dev file_in_dev-pts file_in_dev-shm file_in_home file_in_mnt-shared-data2 file_in_proc-bus-usb file_in_sys file_in_var-lock file_in_var-run file_root miss_alternatives miss_dev miss_diversions miss_dpkg miss_users need_alternatives need_link_dests report unex_in_home unex_in_mnt-shared-data2 unex_in_proc-bus-usb unex_in_sys unex_in_var-lock unex_in_var-run unex_root want_alternatives want_link_dests"
    User time (seconds): 26.83
    System time (seconds): 0.18
    Percent of CPU this job got: 94%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:28.65
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 0
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 11610
    Voluntary context switches: 1
    Involuntary context switches: 28080
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
> du -s
3424    .
> sudo time --verbose bunzip2 *
    Command being timed: "bunzip2 expl_alternatives.bz2 expl_dev.bz2 expl_diversions.bz2 expl_dpkg.bz2 expl_users.bz2 file_in_dev-pts.bz2 file_in_dev-shm.bz2 file_in_dev.bz2 file_in_home.bz2 file_in_mnt-shared-data2.bz2 file_in_proc-bus-usb.bz2 file_in_sys.bz2 file_in_var-lock.bz2 file_in_var-run.bz2 file_root.bz2 miss_alternatives.bz2 miss_dev.bz2 miss_diversions.bz2 miss_dpkg.bz2 miss_users.bz2 need_alternatives.bz2 need_link_dests.bz2 report.bz2 unex_in_home.bz2 unex_in_mnt-shared-data2.bz2 unex_in_proc-bus-usb.bz2 unex_in_sys.bz2 unex_in_var-lock.bz2 unex_in_var-run.bz2 unex_root.bz2 want_alternatives.bz2 want_link_dests.bz2"
    User time (seconds): 3.41
    System time (seconds): 0.21
    Percent of CPU this job got: 56%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.40
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 0
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 6219
    Voluntary context switches: 1
    Involuntary context switches: 3681
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
> exit
Script done on 2007-02-05 T 15:55:14
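
The quick-summary percentages in this message can be re-derived from the totals and timings it reports. The sketch below copies only those figures (total sizes in bytes, user plus system seconds for compression); the names `UNCOMPRESSED`, `RESULTS` and `percent_of` are introduced here for illustration and are not part of cruft or of the original message.

```python
# Total uncompressed size in bytes, from the "(totals)" row of "gzip -l".
UNCOMPRESSED = 45760120

# Compressed total in bytes and compression CPU time (user + system seconds),
# as reported in the message for each compressor setting.
RESULTS = {
    "gzip -1": (4723511, 1.26 + 0.09),
    "gzip -6": (4108964, 2.15 + 0.08),
    "gzip -9": (3909493, 6.17 + 0.09),
    "bzip2": (3380903, 26.83 + 0.18),
}


def percent_of(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    for name, (size, cpu) in RESULTS.items():
        share = percent_of(size, UNCOMPRESSED)
        print(f"{name:8s} {share:5.1f}% of uncompressed size, {cpu:5.2f}s CPU")

    best_size, best_cpu = RESULTS["gzip -9"]
    fast_size, fast_cpu = RESULTS["gzip -1"]
    default_cpu = RESULTS["gzip -6"][1]
    print(f"-9 size vs -1 size: {percent_of(best_size, fast_size):.0f}%")
    print(f"-9 time vs -1 time: {percent_of(best_cpu, fast_cpu):.0f}%")
    print(f"-9 time vs -6 time: {percent_of(best_cpu, default_cpu):.0f}%")
```

The time ratios come out slightly above the rounded 460% and 280% quoted in the summary, which is consistent with those figures having been rounded down.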