[unisog] large volume of files per filesystems

Patrick Darden darden at armc.org
Thu Dec 27 15:53:08 GMT 2001


If the difference is sol7 vs. sol8, then it could be NFS2 vs. NFS3.  NFS3
incorporates many performance optimizations....

--
--Patrick Darden                Internetworking Manager             
--                              706.475.3312    darden at armc.org
--                              Athens Regional Medical Center


On Thu, 27 Dec 2001, Jim Ennis wrote:

> I did some more testing with a Solaris 8 system with 11 million files on
> it.  The full backup runs in about 5 hours and 15 minutes.  The machine
> with the backup performance problem is running Solaris 7 and the
> application is active during the backup.  Since I am seeing a 600%
> reduction in backup time (for more files) either I am getting I/O
> contention from &((*& webct or Solaris 7 has some file system performance
> issues.
> 
> The backups were done to the same backup server (Sun E450 with Netbackup
> 3.2 and Sun L1800 tape library (4 DLT7000 tape heads)).
> 
> I am trying to get some feedback from Veritas and Sun before working up an
> upgrade plan.  Due to academic schedules, my next real window for a major
> change would be May or, more likely, August.  But it looks like an OS
> upgrade will be part of the upgrade plan.
> 
> 
> Jim Ennis                        | jim at pegasus.cc.ucf.edu
> Systems Administrator            | (407) 823-1701  |  Fax: (407) 823-5476
> University of Central Florida    | Murphy's paradox:
>                                  | Doing it the hard way is always easier.
> 
> 
> On Wed, 26 Dec 2001, Patrick Darden wrote:
> 
> >
> > Good points.
> >
> > As far as file system enhancements go, here are some parallels that are
> > already in operation:
> >
> > Qmail derives a huge speed enhancement from moving inboxes from one
> > directory (/var/spool/mail) to many (/home/user).  Large sites enhance
> > this further by spreading home directories out (/home/a/ausers
> > /home/b/busers /home/c/cusers etc.).
> >
> > Squid uses 16 top level directories, each holding 256 subdirectories.  This
> > speeds file access tremendously.  10M files is small time for Squid.
> >
> > INN switched to what it calls a Cylinder file system. Instead of each news
> > article being a separate file, it now just rams the new article in at the
> > end of the appropriate cylinder (alt or comp or rec...).  Each cylinder is
> > a file.  This saves on inodes, reduces wastage of blocks due to lots of
> > small files, and makes disk access faster because you can use large blocks
> > without huge space wastage.
> >
> > Storing files in a database could be the answer.  MySQL should be able to
> > store and index files much much more efficiently than flat inode files.
> > Oracle actually boasts about this capability.  Then you just back up one
> > database file.
> >
> > Storing the info in a database instead of files might be cleaner, and it
> > works very well.  If you get more than about 3000 users it pays bigtime to
> > turn your passwd file into a database.
> >
> > Finally, although I have never used this backup program, the backup progs
> > I have used sometimes allow file indexing for fast individual file/dir
> > restores.  This is tremendously useful, but slows backups--especially if
> > you have a lot of files vs. a lot of gigabytes.  I would check to see if
> > indexing is turned on, and turn it off for a trial.
> >
> > --
> > --Patrick Darden                Internetworking Manager
> > --                              706.475.3312    darden at armc.org
> > --                              Athens Regional Medical Center
> >
> >
> > On Wed, 26 Dec 2001 lbuchana at csc.com wrote:
> >
> > > Hi,
> > >
> > > In the responses so far, I have not noticed any mention of the issue of the
> > > tape drive being a bottleneck.  If you cannot feed data to the tape drive
> > > to keep it streaming, you will have horrible performance.  Any interruption
> > > in the data stream causes the tape drive to stop, rewind, and wait for the
> > > next tape block.  There is at least one tape drive on the market that has
> > > a variable write speed to reduce or eliminate this problem, but I have no
> > > idea of how well it (they) work as I have never seen one.
> > >
> > > One method that I have used to reduce the number of times a tape drive has
> > > to rewind during a backup is to use very large tape blocks.  How well this
> > > works with modern hardware compression boards is something I have never
> > > tested.
> > >
> > > Another issue to consider is reworking the application to reduce the number
> > > of files.  At a user group meeting several years ago, a sys admin described
> > > an application that was dealing with small gene fragments, and the user was
> > > putting each fragment into a separate file.  The thrashing of opening and
> > > closing thousands of files was killing system performance.  The sys admin
> > > rewrote the user's application to only use two or three files.  The
> > > application ran on the order of a thousand times faster and did not
> > > interfere with other users of the system.
> > >
> > > My real point is that you need to look at the entire system.
> > >
> > > B Cing U
> > >
> > > Buck
> > >
> >
> >
> 
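
For anyone wanting to try the directory fan-out idea from the thread (16 top
level directories, each holding 256 subdirectories), here is a minimal sketch
in Python.  The function name, the use of an MD5 hash, and the two-hex-digit
directory names are assumptions made for illustration; this is not Squid's
actual mapping, only the same general layout.

```python
import hashlib
from pathlib import PurePosixPath

L1_DIRS = 16    # top level directories, as in the layout described in the thread
L2_DIRS = 256   # subdirectories under each top level directory


def fanout_path(root: str, name: str) -> PurePosixPath:
    """Map a file name to root/XX/YY/name using a hash of the name.

    Hypothetical helper: the hash only needs to spread names evenly,
    so any stable hash would do in place of MD5.
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()
    level1 = digest[0] % L1_DIRS   # first hash byte picks the top level dir
    level2 = digest[1] % L2_DIRS   # second hash byte picks the subdirectory
    return PurePosixPath(root) / f"{level1:02X}" / f"{level2:02X}" / name
```

With 10 million files spread over 16 x 256 = 4096 leaf directories, each
directory would hold roughly 2,400 entries instead of one directory holding
all of them.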

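The "many small items in one big file" approach mentioned in the thread (INN's
cylinder files, or the rewritten gene fragment application) can be sketched
the same way.  This is only an illustration under assumptions: the in-memory
index and the function names are hypothetical, and it is not INN's on-disk
format.

```python
import io
from typing import BinaryIO, Dict, Tuple

# Hypothetical index type: key -> (offset, length) within the container file.
Index = Dict[str, Tuple[int, int]]


def append_record(container: BinaryIO, index: Index, key: str, data: bytes) -> None:
    """Append data to the end of the container and remember where it went."""
    container.seek(0, io.SEEK_END)
    offset = container.tell()
    container.write(data)
    index[key] = (offset, len(data))


def read_record(container: BinaryIO, index: Index, key: str) -> bytes:
    """Read one record back using its stored offset and length."""
    offset, length = index[key]
    container.seek(offset)
    return container.read(length)
```

The backup then sees one large file (plus whatever holds the index) rather
than one inode per item, which is the point made in the thread.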

