[unisog] large volume of files per filesystems
darden at armc.org
Thu Dec 27 15:53:08 GMT 2001
If the difference is sol7 vs. sol8, then it could be NFS2 vs. NFS3. NFS3
incorporates many performance optimizations....
--Patrick Darden Internetworking Manager
-- 706.475.3312 darden at armc.org
-- Athens Regional Medical Center
On Thu, 27 Dec 2001, Jim Ennis wrote:
> I did some more testing with a Solaris 8 system with 11 million files on
> it. The full backup runs in about 5 hours and 15 minutes. The machine
> with the backup performance problem is running Solaris 7 and the
> application is active during the backup. Since I am seeing a 600%
> reduction in backup time (for more files) either I am getting I/O
> contention from &((*& webct or Solaris 7 has some file system performance
> problems.
> The backups were done to the same backup server (Sun E450 with Netbackup
> 3.2 and a Sun L1800 tape library with 4 DLT7000 tape heads).
> I am trying to get some feedback from Veritas and Sun before working up an
> upgrade plan. Due to academic schedules, my next real window for a major
> change would be May or more likely, August. But it looks like an OS
> upgrade will be part of the upgrade plan.
> Jim Ennis | jim at pegasus.cc.ucf.edu
> Systems Administrator | (407) 823-1701 | Fax: (407) 823-5476
> University of Central Florida | Murphy's paradox:
> | Doing it the hard way is always easier.
> On Wed, 26 Dec 2001, Patrick Darden wrote:
> > Good points.
> > As far as file system enhancements go, here are some parallels that are
> > already in operation:
> > Qmail derives a huge speed enhancement from moving inboxes from one
> > directory (/var/spool/mail) to many (/home/user). Large sites enhance
> > this further by spreading home directories out (/home/a/ausers
> > /home/b/busers /home/c/cusers etc.).
> > Squid uses 16 top level directories, each holding 256 subdirectories. This
> > speeds file access tremendously. 10M files is small time for Squid.
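The fan-out idea above (16 top-level directories with 256 subdirectories each) can be sketched as follows. This is a minimal illustration, not Squid's actual code: the function name `fanout_path`, the use of a SHA-1 hash of the file name, and the two-hex-digit directory names are all assumptions made for the example.

```python
import hashlib
from pathlib import PurePosixPath

L1_DIRS = 16    # top-level directories, matching the layout described above
L2_DIRS = 256   # subdirectories under each top-level directory


def fanout_path(root, name):
    """Map a file name to root/XX/YY/name using a stable hash of the name.

    Hypothetical helper: spreading files over 16 * 256 = 4096 directories
    keeps each directory small, so lookups stay cheap.
    """
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    n = int.from_bytes(digest[:4], "big")
    level1 = n % L1_DIRS                # which top-level directory
    level2 = (n // L1_DIRS) % L2_DIRS   # which subdirectory inside it
    return PurePosixPath(root) / f"{level1:02X}" / f"{level2:02X}" / name
```

With 10 million files spread evenly over 4096 directories, each directory would hold roughly 2,400 entries instead of one directory holding all of them.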
> > INN switched to what it calls a Cylinder file system. Instead of each news
> > article being a separate file, it now just rams the new article in at the
> > end of the appropriate cylinder (alt or comp or rec...). Each cylinder is
> > a file. This saves on inodes, reduces wastage of blocks due to lots of
> > small files, and makes disk access faster because you can use large blocks
> > without huge space wastage.
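The container-file approach described above (appending each new item to the end of one large file rather than creating a file per item) might look roughly like this. It is a sketch only and does not reflect INN's real on-disk format; the names `append_record` and `read_record` are hypothetical, and a real system would also need to persist the (offset, length) index somewhere.

```python
import os


def append_record(container_path, payload):
    """Append payload (bytes) to the container file; return (offset, length)."""
    with open(container_path, "ab") as f:
        f.seek(0, os.SEEK_END)   # make the write position explicit
        offset = f.tell()
        f.write(payload)
    return offset, len(payload)


def read_record(container_path, offset, length):
    """Read one record back using the (offset, length) from append_record."""
    with open(container_path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

Thousands of small items then cost one inode and one open file, and the backup software sees a single large file instead of a deep directory tree.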
> > Storing files in a database could be the answer. MySQL should be able to
> > store and index files much much more efficiently than flat inode files.
> > Oracle actually boasts about this capability. Then you just back up one
> > database file.
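As a rough illustration of keeping many small files in a database, here is a sketch using SQLite from the Python standard library as a stand-in for MySQL or Oracle. The table name `files`, its columns, and the helper names are assumptions for the example; whether this beats a plain file system depends on file sizes and access patterns and would need measuring.

```python
import sqlite3


def open_store(db_path):
    """Open (or create) a single-file store keyed by file name."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "name TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def put_file(conn, name, data):
    """Insert or overwrite one file's contents (bytes)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO files (name, data) VALUES (?, ?)",
            (name, data),
        )


def get_file(conn, name):
    """Return the stored bytes for name, or None if it is not present."""
    row = conn.execute(
        "SELECT data FROM files WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]
```

The primary key gives an indexed lookup by name, and the whole store lives in one file on disk, which is the property the backup argument above relies on.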
> > Storing the info in a database instead of files might be cleaner, and it
> > works very well. If you get more than about 3000 users it pays bigtime to
> > turn your passwd file into a database.
> > Finally, although I have never used this backup program, the backup progs
> > I have used sometimes allow file indexing for fast individual file/dir
> > restores. This is tremendously useful, but slows backups--especially if
> > you have a lot of files vs. a lot of gigabytes. I would check to see if
> > indexing is turned on, and turn it off for a trial.
> > --
> > --Patrick Darden Internetworking Manager
> > -- 706.475.3312 darden at armc.org
> > -- Athens Regional Medical Center
> > On Wed, 26 Dec 2001 lbuchana at csc.com wrote:
> > > Hi,
> > >
> > > In the responses so far, I have not noticed any mention of the issue of the
> > > tape drive being a bottleneck. If you cannot feed data to the tape drive
> > > to keep it streaming, you will have horrible performance. Any interruption
> > > in the data stream causes the tape drive to stop, rewind, and wait for the
> > > next tape block. There is at least one tape drive on the market that has
> > > a variable write speed to reduce or eliminate this problem, but I have no
> > > idea of how well it (they) work as I have never seen one.
> > >
> > > One method that I have used to reduce the number of times a tape drive has
> > > to rewind during a backup is to use very large tape blocks. How well this
> > > works with modern hardware compression boards is something I have never
> > > tested.
> > >
> > > Another issue to consider is reworking the application to reduce the number
> > > of files. At a user group meeting several years ago, a sys admin described
> > > an application that was dealing with small gene fragments, and the user was
> > > putting each fragment into a separate file. The thrashing of opening and
> > > closing thousands of files was killing system performance. The sys admin
> > > rewrote the user's application to only use two or three files. The
> > > application ran on the order of a thousand times faster and did not
> > > interfere with other users of the system.
> > >
> > > My real point is that you need to look at the entire system.
> > >
> > > B Cing U
> > >
> > > Buck
> > >