
Mailing List Archive
Re: [tlug] how to tune reiser4 for millions of files?
* Curt Sampson (cjs@example.com) [100131 07:43]:
> ls -1 -U # no sort, no inode lookup
> ls -1 # sorted, no inode lookup
> ls -1 -l -U # no sort, inode lookups
> ls -1 -l # sorted, inode lookups
Thank you for all the explanations and help.
Here are some measurements.
The size of the data is not precisely known, since:
# time du -sk /mnt/polea/out/data-0000001255522556768705-dat
^C
real 115m44.009s
user 0m23.448s
sys 1m54.327s
But the used part of the filesystem is 113GB, which means the data take
up something like 80GB. File sizes are about 10-12KB. So once again, Curt
made quite a good guess. I may try to run "du -sk" overnight.
# time ls -1 -U /mnt/polea/out/data-0000001255522556768705-dat/ | wc -l
7032035 (produces output immediately)
real 11m12.190s
user 0m6.695s
sys 0m47.079s
# time ls -1 /mnt/polea/out/data-0000001255522556768705-dat/ | wc -l
ls: memory exhausted
0
real 12m7.636s
user 0m1.929s
sys 0m47.453s
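Since the sorted listing runs out of memory, the entries can also be counted without sorting and without holding the names in memory. A minimal sketch, assuming GNU find (for -printf); count_entries is a made-up helper name, and whether this avoids per-file inode lookups on reiser4 is an assumption, not something measured here:

```shell
# Count directory entries without sorting them and without printing names.
# Assumes GNU find (-printf is a GNU extension); count_entries is a
# hypothetical helper name used only for illustration.
count_entries() {
    # Emit one dot per entry, so file names containing newlines
    # cannot skew the count, then count the dots.
    find "$1" -mindepth 1 -maxdepth 1 -printf '.' | wc -c | tr -d ' '
}
```

It would be called as "count_entries /mnt/polea/out/data-0000001255522556768705-dat/" and should report the same number as "ls -1 -U | wc -l" for a directory whose file names contain no newlines.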
kfk-64 ~ # time ls -1 -l -U /mnt/polea/out/data-0000001255522556768705-dat/ | wc -l
^C (I killed it, not being patient enough; anyway, it demonstrates
the difference)
real 249m38.253s
user 0m20.212s
sys 5m53.706s
# time sh -c 'find /mnt/polea/out/data-0000001255522556768705-dat/ -type f -print0 | xargs -0 -r -- ls -l > /dev/null'
real 542m21.087s
user 1m47.939s
sys 10m41.090s
Here I let it run overnight :) I did not find a better way to measure
the time. Also, I do not know why it took so much more time. The only
difference is "-type f" which I added. Hopefully >/dev/null makes no
difference. I added it because the terminal made the output somewhat
slow in the previous cases. Suggestions are welcome.
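One possible way to time the inode lookups alone, offered as a sketch rather than a tested recipe for this system: with xargs, many separate ls processes are started, and each one sorts its batch and resolves owner and group names for -l. The function below does one lookup per regular file in a single find process instead. It assumes GNU find, any POSIX awk, and a flat directory (hence -maxdepth 1); sum_file_sizes is a made-up name.

```shell
# Force one inode lookup per regular file in a single process, with no
# sorting and no user/group name resolution.
# Assumes GNU find (-printf) and a flat directory; sum_file_sizes is a
# hypothetical helper name used only for illustration.
sum_file_sizes() {
    # %s prints each file's size in bytes; awk adds them up.
    # "%.0f" avoids integer overflow in awk implementations with 32-bit %d.
    find "$1" -maxdepth 1 -type f -printf '%s\n' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Usage in bash, where "time" can wrap a shell function:
#   time sum_file_sizes /mnt/polea/out/data-0000001255522556768705-dat/
```

The total printed is the apparent size in bytes, not the disk usage that du reports, so it would also give a rough answer to the data size question.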
Now the whole thing is only a matter of academic discussion or
personal interest. The analysis application is doomed for sure. But I
do not mind playing with the system for a while out of curiosity.
One side note, in dmesg I have:
[ 1778.148042] ls used greatest stack depth: 5084 bytes left
[73594.665219] ls used greatest stack depth: 5064 bytes left
[85121.212329] tee used greatest stack depth: 4912 bytes left
[85121.212415] sdcfedf used greatest stack depth: 4516 bytes left
sdcfedf - this is the analysis program.
All these messages came before the above measurements took place.
* Bruno Raoult (braoult@example.com) [100131 09:00]:
> Second *if*: Maybe you may know the filenames at first? Are filenames
> date-based, or something you could compute?
Unfortunately, that is not the case. But it is surely a nice idea
that had not come up before; I will make a note :) Thank you.
Best regards
michal