[unisog] large volume of files per filesystems

Jim Ennis jim at pegasus.cc.ucf.edu
Thu Dec 27 13:58:45 GMT 2001

I did some more testing with a Solaris 8 system with 11 million files on
it.  The full backup runs in about 5 hours and 15 minutes.  The machine
with the backup performance problem is running Solaris 7 and the
application is active during the backup.  Since I am seeing a 600%
reduction in backup time (for more files), either I am getting I/O
contention from &((*& WebCT or Solaris 7 has some file system performance
problems.

The backups were done to the same backup server (a Sun E450 with NetBackup
3.2 and a Sun L1800 tape library with 4 DLT7000 tape heads).
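
For comparing the two machines, a per-file rate is easier to reason about
than a wall-clock time.  A minimal sketch using only the Solaris 8 figures
above (11 million files, 5 hours 15 minutes); the function name is just
for illustration:

```python
def files_per_second(files: int, hours: int, minutes: int) -> float:
    """Average number of files backed up per second over the whole run."""
    elapsed_seconds = hours * 3600 + minutes * 60
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return files / elapsed_seconds


# Solaris 8 test run described above: 11 million files in 5h15m.
solaris8_rate = files_per_second(11_000_000, hours=5, minutes=15)
```

Running the same calculation on the Solaris 7 backup would give a number
that can be compared directly, independent of how many files each system
holds.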

I am trying to get some feedback from Veritas and Sun before working up an
upgrade plan.  Due to academic schedules, my next real window for a major
change would be May or, more likely, August.  But it looks like an OS
upgrade will be part of the upgrade plan.

Jim Ennis                        | jim at pegasus.cc.ucf.edu
Systems Administrator            | (407) 823-1701  |  Fax: (407) 823-5476
University of Central Florida    | Murphy's paradox:
                                 | Doing it the hard way is always easier.

On Wed, 26 Dec 2001, Patrick Darden wrote:

> Good points.
> As far as file system enhancements go, here are some parallels that are
> already in operation:
> Qmail derives a huge speed enhancement from moving inboxes from one
> directory (/var/spool/mail) to many (/home/user).  Large sites enhance
> this further by spreading home directories out (/home/a/ausers
> /home/b/busers /home/c/cusers etc.).
> Squid uses 16 top level directories, each holding 256 subdirectories.  This
> speeds file access tremendously.  10M files is small time for Squid.
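
A minimal sketch of the directory fan-out idea described above.  This is
not Squid's or Qmail's actual placement algorithm; the hash-based mapping
and the names fanout_path and home_path are assumptions made for
illustration.  Only the 16 x 256 layout and the /home/a/... scheme come
from the text.

```python
import hashlib
from pathlib import PurePosixPath

L1_DIRS = 16    # top-level directories, as in the layout described above
L2_DIRS = 256   # subdirectories under each top-level directory


def fanout_path(root: str, name: str) -> PurePosixPath:
    """Map a file name to root/XX/YY/name using a hash of the name.

    Assumption: an MD5 digest is used only to spread names evenly;
    any stable hash would do.
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()
    level1 = digest[0] % L1_DIRS
    level2 = digest[1] % L2_DIRS
    return PurePosixPath(root) / f"{level1:02X}" / f"{level2:02X}" / name


def home_path(user: str) -> PurePosixPath:
    """Spread home directories by first letter, e.g. /home/a/alice."""
    if not user:
        raise ValueError("user name must not be empty")
    return PurePosixPath("/home") / user[0].lower() / user
```

With 16 x 256 = 4096 leaf directories, 10 million files average out to
roughly 2,400 entries per directory instead of 10 million in one.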
> INN switched to what it calls a Cylinder file system. Instead of each news
> article being a separate file, it now just rams the new article in at the
> end of the appropriate cylinder (alt or comp or rec...).  Each cylinder is
> a file.  This saves on inodes, reduces wastage of blocks due to lots of
> small files, makes disk access faster because you can use large blocks
> without huge space wastage.
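
The "many small items in one big file" approach can be sketched as below.
This is not INN's storage code or on-disk format; the AppendStore class,
its method names, and the in-memory index are assumptions made for
illustration only.

```python
import os
from typing import Dict, Tuple


class AppendStore:
    """Keep many small records in one container file.

    Each record is appended to the end of the file; an index maps a key
    to the (offset, length) of its record.  Assumption: the index lives
    only in memory here, so a real store would also have to persist it.
    """

    def __init__(self, path: str) -> None:
        self.path = path
        self.index: Dict[str, Tuple[int, int]] = {}
        # Create the container file if it does not exist yet.
        open(path, "ab").close()

    def put(self, key: str, data: bytes) -> None:
        """Append one record and remember where it landed."""
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(data)
        self.index[key] = (offset, len(data))

    def get(self, key: str) -> bytes:
        """Read one record back using its stored offset and length."""
        offset, length = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)
```

However many records are stored, the filesystem (and the backup program)
sees a single file and a single inode.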
> Storing files in a database could be the answer.  MySQL should be able to
> store and index files much much more efficiently than flat inode files.
> Oracle actually boasts about this capability.  Then you just back up one
> database file.
> Storing the info in a database instead of files might be cleaner, and it
> works very well.  If you get more than about 3000 users it pays bigtime to
> turn your passwd file into a database.
> Finally, although I have never used this backup program, the backup progs
> I have used sometimes allow file indexing for fast individual file/dir
> restores.  This is tremendously useful, but slows backups--especially if
> you have a lot of files vs. a lot of gigabytes.  I would check to see if
> indexing is turned on, and turn it off for a trial.
> --
> --Patrick Darden                Internetworking Manager
> --                              706.475.3312    darden at armc.org
> --                              Athens Regional Medical Center
> On Wed, 26 Dec 2001 lbuchana at csc.com wrote:
> > Hi,
> >
> > In the responses so far, I have not noticed any mention of the issue of the
> > tape drive being a bottleneck.  If you cannot feed data to the tape drive
> > to keep it streaming, you will have horrible performance.  Any interruption
> > in the data stream causes the tape drive to stop, rewind, and wait for the
> > next tape block.  There is at least one tape drive on the market that has
> > a variable write speed to reduce or eliminate this problem, but I have no
> > idea how well it works, as I have never seen one.
> >
> > One method that I have used to reduce the number of times a tape drive has
> > to rewind during a backup is to use very large tape blocks.  How well this
> > works with modern hardware compression boards is something I have never
> > tested.
> >
> > Another issue to consider is reworking the application to reduce the number
> > of files.  At a user group meeting several years ago, a sys admin described
> > an application that was dealing with small gene fragments, and the user was
> > putting each fragment into a separate file.  The thrashing of opening and
> > closing thousands of files was killing system performance.  The sys admin
> > rewrote the user's application to use only two or three files.  The
> > application ran on the order of a thousand times faster and did not
> > interfere with other users of the system.
> >
> > My real point is that you need to look at the entire system.
> >
> > B Cing U
> >
> > Buck
> >
