Mailing List Archive
Re: [tlug] how to tune reiser4 for millions of files?
- Date: Sun, 31 Jan 2010 15:43:16 +0900
- From: Curt Sampson <cjs@example.com>
- Subject: Re: [tlug] how to tune reiser4 for millions of files?
- References: <20100128095957.GB24344@example.com> <20100128132701.GI13095@example.com> <87wrz2s08d.fsf@example.com> <20100128180431.GA10687@example.com> <20100129134613.GA18169@example.com> <20100128073847.GH13095@example.com> <20100128095957.GB24344@example.com> <20100128132701.GI13095@example.com> <87wrz2s08d.fsf@example.com> <20100128180431.GA10687@example.com>
- User-agent: Mutt/1.5.18 (2008-05-17)
On 2010-01-28 13:04 -0500 (Thu), Patrick Bernier wrote:

> I would tend to guess that his ls is trying to SORT the entire file listing
> before printing anything, which of course cannot be done before having
> loaded the ENTIRE file list in memory.

That's a good thought, but loading the entire file list into memory is a relatively cheap operation, since it's a linear read of a file (on UFS, anyway--it's probably still a relatively cheap tree scan on Reiser), and the sort will be quite cheap assuming it stays in main memory. The expensive (i.e., time-consuming) thing will be the I/O to pull in all of the inode information, since that's going to have to go all over the disk looking for it.

You can quite easily compare the effects of the various operations:

    ls -1 -U       # no sort, no inode lookup
    ls -1          # sorted, no inode lookup
    ls -1 -l -U    # no sort, inode lookups
    ls -1 -l       # sorted, inode lookups

Note that I use -1 here to avoid the buffering and various processing necessary to columnize the output.

> ...using "find" is always the fastest way to go...

Actually, as well as being simpler, "ls -1 -U" will be faster due to avoiding the overhead of forking off all these "ls" processes. On modern machines it's not likely you will notice this.

On 2010-01-29 14:46 +0100 (Fri), Michal Hajek wrote:

> > find . -print0 | xargs -0 -r -- ls -l
>
> this yields the result in something like 30 min, which is a great
> improvement compared to those 6+ hours which I spent with simple "ls -l"...

Interesting. What you're doing here is the equivalent of "ls -1 -l -U" above. Why is it so much faster when unsorted, though it is doing inode lookups?

Let me guess: the files were created in sequence without a large amount of other filesystem activity in between, and were not created in alphabetical order.
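For anyone who wants to run that comparison, here is a rough sketch of a wrapper that times the four variants in turn. The function name time_listing is made up for illustration, and it assumes a GNU-style ls where -U means "do not sort" (on some BSDs -U means something else and -f is the unsorted flag). Timing uses date +%s, so resolution is one second, which should be enough when the differences are minutes versus hours. Note that the numbers only mean much on a cold cache: after the first -l run the inode blocks will probably be in memory, so the later variants look faster than they really are.

```shell
# time_listing DIR: run each ls variant against DIR (default: current
# directory), discard the listing, and print the elapsed whole seconds.
# Assumes GNU ls, where -U disables sorting.
time_listing() {
    dir=${1:-.}
    for opts in "-1 -U" "-1" "-1 -l -U" "-1 -l"; do
        start=$(date +%s)
        # $opts is deliberately unquoted so it splits into separate flags.
        ls $opts -- "$dir" > /dev/null
        end=$(date +%s)
        printf '%-12s %ss\n' "ls $opts" "$((end - start))"
    done
}
```

Call it as "time_listing /path/to/big/directory". It prints one line per variant, in the same order as the list above.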
If you create, write and close a large number of small files in sequence on a UFS-style file system, the inode entries for two files created one after the other will tend to be written side by side in the same block. Thus, when you look up those two inode entries one after the other, the second is already in a block in the buffer cache, and you need not go to disk for it.

When you read the inode entries in order of filename, rather than creation, you're effectively reading them back in random order and thus you have the standard "random I/O is slow" problem.

My thoughts on solutions in a separate post.

cjs
--
Curt Sampson <cjs@example.com> +81 90 7737 2974
http://www.starling-software.com
The power of accurate observation is commonly called cynicism
by those who have not got it. --George Bernard Shaw
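One rough way to check the guess about filename order versus creation order is to walk a directory in name order and count how often the inode number goes backwards. This is only a sketch: the function name inode_jumps is made up, and it assumes that the inode number is a reasonable proxy for where the inode sits on disk. That holds for UFS-style inode tables but may well not hold on Reiser. It also assumes no file names contain newlines.

```shell
# inode_jumps DIR: list DIR in name order with inode numbers (ls -i) and
# count the entries whose inode number is lower than the previous one.
# Many backward steps suggest that name order scatters the inode lookups.
inode_jumps() {
    ls -1 -i -- "${1:-.}" | awk '
        NR > 1 && $1 < prev { back++ }
        { prev = $1 }
        END { printf "%d entries, %d backward steps in inode number\n", NR, back + 0 }'
}
```

If the files were created in name order, the count should be close to zero. A count near half the number of entries would suggest that name order is effectively random with respect to inode number.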
- References:
- Re: [tlug] how to tune reiser4 for millions of files?
- From: Curt Sampson
- Re: [tlug] how to tune reiser4 for millions of files?
- From: Michal Hajek
- Re: [tlug] how to tune reiser4 for millions of files?
- From: Stephen J. Turnbull
- Re: [tlug] how to tune reiser4 for millions of files?
- From: Patrick Bernier
- Re: [tlug] how to tune reiser4 for millions of files?
- From: Michal Hajek
- [tlug] how to tune reiser4 for millions of files?
- From: Michal Hajek