[unisog] large volume of files per filesystems
jim at pegasus.cc.ucf.edu
Thu Dec 27 13:58:45 GMT 2001
I did some more testing with a Solaris 8 system with 11 million files on
it. The full backup runs in about 5 hours and 15 minutes. The machine
with the backup performance problem is running Solaris 7 and the
application is active during the backup. Since I am seeing a roughly
sixfold reduction in backup time (for more files), either I am getting I/O
contention from &((*& webct or Solaris 7 has some file system performance
problems.
The backups were done to the same backup server (a Sun E450 with NetBackup
3.2 and a Sun L1800 tape library with 4 DLT7000 tape heads).
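For scale, the Solaris 8 figures above work out to roughly 580 files per
second. A quick Python check of that arithmetic follows; only the file
count and the elapsed time come from the test, and the function name is
just for illustration:

```python
# Figures from the Solaris 8 test: 11 million files in 5 h 15 min.
FILES = 11_000_000
ELAPSED_SECONDS = 5 * 3600 + 15 * 60  # 18,900 seconds


def files_per_second(files: int, seconds: int) -> float:
    """Average number of files backed up per second."""
    return files / seconds


rate = files_per_second(FILES, ELAPSED_SECONDS)
print(f"{rate:.0f} files/s, about {1000 / rate:.2f} ms per file")
```

At that rate the per-file overhead matters more than raw tape bandwidth,
which is why the file count keeps coming up in this thread.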
I am trying to get some feedback from Veritas and Sun before working up an
upgrade plan. Due to academic schedules, my next real window for a major
change would be May or, more likely, August. But it looks like an OS
upgrade will be part of the upgrade plan.
Jim Ennis | jim at pegasus.cc.ucf.edu
Systems Administrator | (407) 823-1701 | Fax: (407) 823-5476
University of Central Florida | Murphy's paradox:
| Doing it the hard way is always easier.
On Wed, 26 Dec 2001, Patrick Darden wrote:
> Good points.
> As far as file system enhancements go, here are some parallels that are
> already in operation:
> Qmail derives a huge speed enhancement from moving inboxes from one
> directory (/var/spool/mail) to many (/home/user). Large sites enhance
> this further by spreading home directories out (/home/a/ausers
> /home/b/busers /home/c/cusers etc.).
> Squid uses 16 top level directories, each holding 256 subdirectories. This
> speeds file access tremendously. 10M files is small time for Squid.
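A minimal Python sketch of that kind of two-level fan-out, for anyone who
wants to try it on their own data. Assumptions: the names bucket_path,
L1_DIRS and L2_DIRS are made up for this example, and placing files by a
hash of the name is my own choice here, not necessarily how Squid itself
assigns files to directories.

```python
import hashlib
from pathlib import Path

L1_DIRS = 16    # top-level directories, as in the layout described above
L2_DIRS = 256   # subdirectories under each top-level directory


def bucket_path(root: str, name: str) -> Path:
    """Map a file name to root/XX/YY/name using a hash of the name.

    Assumption: hashing is used only for illustration; it is not
    claimed to be Squid's actual placement scheme.
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()
    level1 = digest[0] % L1_DIRS   # 0..15
    level2 = digest[1] % L2_DIRS   # 0..255
    return Path(root) / f"{level1:02X}" / f"{level2:02X}" / name


if __name__ == "__main__":
    print(bucket_path("cache", "index.html"))
```

With 16 x 256 = 4096 leaf directories, 10 million files average out to
about 2,400 entries per directory instead of 10 million in one.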
> INN switched to what it calls a Cylinder file system. Instead of each news
> article being a separate file, it now just rams the new article in at the
> end of the appropriate cylinder (alt or comp or rec...). Each cylinder is
> a file. This saves on inodes, reduces wastage of blocks due to lots of
> small files, makes disk access faster because you can use large blocks
> without huge space wastage.
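The general idea (many small records appended to one big file, located by
offset) can be sketched in a few lines of Python. This is only an
illustration of the technique, not INN's on-disk format; the RecordFile
class is hypothetical, and the index lives in memory only, so a real
version would have to persist it.

```python
import os


class RecordFile:
    """Append-only container holding many small records in one file."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.index = {}            # key -> (offset, length), in memory only
        open(path, "ab").close()   # create the container if it is missing

    def append(self, key: str, payload: bytes) -> None:
        """Write the payload at the end of the file and remember where."""
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(payload)
        self.index[key] = (offset, len(payload))

    def read(self, key: str) -> bytes:
        """Fetch one record by seeking straight to its offset."""
        offset, length = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)
```

The backup then sees one large file per container rather than one inode
per record.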
> Storing files in a database could be the answer. MySQL should be able to
> store and index files much much more efficiently than flat inode files.
> Oracle actually boasts about this capability. Then you just back up one
> database file.
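A rough Python sketch of the files-in-a-database idea. Assumptions: it
uses SQLite from the standard library as a stand-in so the example is
self-contained (the message above talks about MySQL and Oracle), and the
table layout and the function names open_store, put_file and get_file are
invented for illustration.

```python
import sqlite3
from typing import Optional


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store that keeps file contents as BLOB rows."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "name TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def put_file(conn: sqlite3.Connection, name: str, data: bytes) -> None:
    """Insert or overwrite one file; the with-block commits the change."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO files (name, data) VALUES (?, ?)",
            (name, data),
        )


def get_file(conn: sqlite3.Connection, name: str) -> Optional[bytes]:
    """Return the stored bytes, or None if the name is unknown."""
    row = conn.execute(
        "SELECT data FROM files WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else bytes(row[0])
```

Whether this beats a plain file system depends on file sizes and access
patterns, so it is worth measuring before committing to it.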
> Storing the info in a database instead of files might be cleaner, and it
> works very well. If you get more than about 3000 users it pays bigtime to
> turn your passwd file into a database.
> Finally, although I have never used this backup program, the backup progs
> I have used sometimes allow file indexing for fast individual file/dir
> restores. This is tremendously useful, but slows backups--especially if
> you have a lot of files vs. a lot of gigabytes. I would check to see if
> indexing is turned on, and turn it off for a trial.
> --Patrick Darden Internetworking Manager
> -- 706.475.3312 darden at armc.org
> -- Athens Regional Medical Center
> On Wed, 26 Dec 2001 lbuchana at csc.com wrote:
> > Hi,
> > In the responses so far, I have not noticed any mention of the issue of the
> > tape drive being a bottleneck. If you cannot feed data to the tape drive
> > to keep it streaming, you will have horrible performance. Any interruption
> > in the data stream causes the tape drive to stop, rewind, and wait for the
> > next tape block. There is at least one tape drive on the market that has
> > a variable write speed to reduce or eliminate this problem, but I have no
> > idea of how well it (they) work as I have never seen one.
> > One method that I have used to reduce the number of times a tape drive has
> > to rewind during a backup is to use very large tape blocks. How well this
> > works with modern hardware compression boards is something I have never
> > tested.
> > Another issue to consider is reworking the application to reduce the number
> > of files. At a user group meeting several years ago, a sys admin described
> > an application that was dealing with small gene fragments, and the user was
> > putting each fragment into a separate file. The thrashing of opening and
> > closing thousands of files was killing system performance. The sys admin
> > rewrote the user's application to only use two or three files. The
> > application ran on the order of a thousand times faster and did not
> > interfere with other users of the system.
> > My real point is that you need to look at the entire system.
> > B Cing U
> > Buck