Dear dirtech,
Although I have no direct experience with GFS or VxFS, I am familiar
with the inadequacies of NFS in a terascale distributed environment. At
the information-retrieval laboratory where I worked previously, we used
NFS to share files among our 24 Linux boxes, half of which were hosting a
terabyte corpus and its associated index. We consistently had performance
problems, even after virtual-memory thrashing was cured by upgrading to
the 2.6 kernel. Ultimately, we migrated the terabyte corpus to a novel
distributed storage system that was designed by one of our researchers
and implemented by a graduate student for his thesis.
I was interested to read that GFS also began life as a thesis
project. This is neither good nor bad in itself, but the recent provenance
of GFS suggests that it may still be maturing.
GFS [was] originally developed as part of a thesis project at the
University of Minnesota. At some point it made its way to
Sistina Software, where it lived for a time as an open source
project. Sometime in 2001 Sistina made the choice to make GFS a
commercial product -- not under an open source license. OpenGFS
forked from the last public release of GFS.
In December 2003 Red Hat purchased Sistina. In late June 2004,
Red Hat released GFS and many cluster infrastructure pieces under
the GPL. Red Hat's current goal for the project (aside from the
normal bug fixing and stabilization) envisages inclusion in the
mainline Linux kernel. GFS now forms part of the Fedora Core
4 distribution and can be purchased as a commercial product on
top of Red Hat Enterprise Linux.
Wikipedia: Global File System
http://en.wikipedia.org/wiki/Global_File_System
There are several academic papers by the authors of GFS outlining its
theoretical advantages, but the few snippets of practical feedback I
have found on the web have been negative. Part of this, of course, is
merely due to the human propensity to raise one's voice in irritation
and to stay quiet in contentment. Observe, however, that the author of
the following passage compares GFS unfavorably to the SGI file system,
XFS, about which I shall say more later.
> does anybody out there have a configuration with at least a 2
> terabyte filesystem? i am using gfs 5.1 and in even doing simple
> things like 'ls -l' it takes MINUTES to return a result. basically
> we have a large ftp site currently running on an sgi. the same
> command on the same directory structure behaves normally, or as
> one would expect it to, on the sgi. on our linux box, running
> gfs 5.1, the results take up to two minutes to return a listing.
Global File System general discussion: "RE: really poor performance on
a 2 terabyte gfs"
http://permalink.gmane.org/gmane.comp.file-systems.gfs.user/29
The following research paper points out a weakness in the GFS design
that may explain gripes such as the one above.
In GFS, whenever a transaction modifies a buffer, a copy is
made to preserve its old contents. If the transaction must be
aborted, GFS simply restores all affected buffers by using their
frozen copies. Such a scheme is expensive in terms of its memory
footprint and copying overhead.
USENIX: "yFS: A Journaling File System Design for Handling Large Data
Sets with Reduced Seeking"
http://www.usenix.org/events/fast/tech/full_papers/zhang/zhang_html/index.html
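To make the criticized scheme concrete, here is a minimal Python sketch
-- illustrative only, not GFS source -- of the general "frozen copy"
rollback technique the paper describes: every buffer a transaction
touches is duplicated up front so that an abort can restore the old
contents, which is exactly the memory and copying overhead the authors
object to.

    # Illustrative sketch only, not GFS code: the copy-before-write
    # rollback scheme the yFS paper criticizes.

    class Transaction:
        def __init__(self):
            self.frozen = {}  # buffer id -> copy of original contents

        def modify(self, buffers, buf_id, new_data):
            if buf_id not in self.frozen:
                # Copy before the first write, doubling memory for this buffer.
                self.frozen[buf_id] = bytes(buffers[buf_id])
            buffers[buf_id] = new_data

        def abort(self, buffers):
            # Roll back by restoring every frozen copy.
            for buf_id, old_data in self.frozen.items():
                buffers[buf_id] = old_data

        def commit(self):
            # On commit the frozen copies are simply discarded.
            self.frozen.clear()

    buffers = {0: b"old block 0", 1: b"old block 1"}
    tx = Transaction()
    tx.modify(buffers, 0, b"new block 0")
    tx.abort(buffers)  # buffers[0] is b"old block 0" again

The cost grows with the number of dirty buffers held by open
transactions, which is why the paper calls the scheme expensive in both
memory footprint and copying overhead.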
The following report states outright that GFS is ill-suited to accessing
large files and to intra-file sharing.
Another example of a shared-disk cluster file system is the
Global File System (GFS) [20], which originated as an open source
file system for Linux. The newest version (GFS-4) implements
journaling, and uses logging, locking, and recovery algorithms
similar to those of GPFS and Frangipani. Locking in GFS is closely
tied to physical storage. Earlier versions of GFS [21] required
locking to be implemented at the disk device via extensions to
the SCSI protocol. Newer versions allow the use of an external
distributed lock manager, but still lock individual disk blocks
of 4kB or 8kB size. Therefore, accessing large files in GFS
entails significantly more locking overhead than the byte-range
locks used in GPFS. Similar to Frangipani/Petal, striping in GFS
is handled in a "Network Storage Pool" layer; once created,
however, the stripe width cannot be changed (it is possible to add
new "sub-pools", but striping is confined to a sub-pool,
i.e., GFS will not stripe across sub-pools). Like Frangipani,
GFS is geared more towards applications with little or no intra-
file sharing.
IBM Almaden Research Center: "GPFS: A Shared-Disk File System for Large
Computing Clusters"
http://www.almaden.ibm.com/StorageSystems/File_Systems/GPFS/Fast02.pdf
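To illustrate the distinction the paper draws, the byte-range approach
it credits to GPFS is the same idea that POSIX exposes to applications
as advisory record locks. The Python sketch below -- again illustrative
only, not GPFS internals -- locks just the region of a file that a
writer touches, so writers working on disjoint ranges of one large file
need not serialize, whereas locking whole 4 kB or 8 kB blocks, as GFS
does, forces far more contention on big shared files.

    # Illustrative only: POSIX byte-range advisory locking from user space.
    # GPFS performs its range locking inside the file system; this merely
    # shows the byte-range idea, as opposed to locking fixed disk blocks.

    import fcntl
    import os

    def write_range(path, offset, data):
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            # Lock only [offset, offset + len(data)); other ranges stay free.
            fcntl.lockf(fd, fcntl.LOCK_EX, len(data), offset, os.SEEK_SET)
            os.pwrite(fd, data, offset)
            fcntl.lockf(fd, fcntl.LOCK_UN, len(data), offset, os.SEEK_SET)
        finally:
            os.close(fd)

    # Two cooperating writers can update disjoint regions of the same
    # large file concurrently without blocking each other.
    write_range("/tmp/shared.dat", 0, b"header")
    write_range("/tmp/shared.dat", 1 << 20, b"payload at 1 MiB")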
I haven't found much user feedback on VxFS, but what I have seen has
been positive. In the following message, one fellow opines that VxFS
works well for terascale file systems.
> So I would like to ask for your experiences with filesystems
> larger than 4 TB.
[...]
Doug Hughes said "not a problem with something like VxFS. Less
of a problem with UFS+ with logging turned on, but VxFS has a
marginal edge with larger sizes. The larger you get, the faster
it gets for crash recovery (in comparison). It goes up to 32TB
and many sites use the whole thing (or more, depending upon OS
version and Veritas version)"
SunManagers: Summaries: January 2005
http://www.sunmanagers.org/pipermail/summaries/2005-January.txt
More praise here.
Most of us at this point are [used] to building file[s]ystems with
UFS. However[,] Veritas offers the Veritas File System[, a]
journaling filesystem with [a] performance advantage over UFS. My
favorite use of VxFS, however, is that very large filesystems
can be created very quickly.
Cuddletech: A Brief Discussion of VxFS
http://www.cuddletech.com/veritas/advx/x69.html
Based on this feedback, which I admit is insufficient in quantity to
make for a statistically valid sampling, I would venture to guess
that VxFS is more reliable, for the time being, as a terascale SAN
solution. Before shelling out any bucks, however, I would certainly look
into the possibility of running Clustered XFS, or CXFS, on Linux.
SGI: CXFS
http://www.sgi.com/products/storage/tech/file_systems.html
The core of CXFS is XFS, which has a long history of supporting large
storage systems. The guys at Gelato, who were early advocates for 64-bit
Linux, are among those who have plumped for XFS.
At present XFS looks like the most appropriate file system for
large file work (but check out JFS, reiserfs version 4, and ext3
with large block sizes).
Gelato: Large File System support in Linux 2.5.x
http://www.gelato.unsw.edu.au/~peterc/lfs.html
Another word of support.
A modern journaling file system designed for large disks is XFS,
which is included in Linux and has no real limitation in disk
or file size (multi exabyte). So that's the configuration we're
running now.
Volker Gaibler: Large disks with Linux (multi-terabyte)
http://www.lsw.uni-heidelberg.de/users/vgaibler/comp.html
CERN has interesting things to say about CXFS.
The Clustered XFS file system technology
(http://www.sgi.com/products/storage/software.html) is developed
by Silicon Graphics for high-performance computing environments
like their Origin. It is supported on IRIX 6.5, and also Linux
and Windows NT. CXFS is designed as an extension to their XFS
file system, and its performance, scalability and properties are
for the main part similar to XFS; for instance, there is API
support for hierarchical storage management. Quite good.
Like XFS, CXFS is a high-performance and scalable file system,
journaled for fast recovery, and has 64-bit scalability to support
extremely large files and file systems. Size limits are similar
to XFS: maximum file size 9 EB, maximum file system size 18 EB,
block and extent (contiguous data) sizes are configurable at
file system creation, block size from 512 B to 64 kB for normal
data and up to 1 MB for real-time data, and single extents can
be up to 4 GB in size. There can be up to 64k partitions, 64k
wide stripes and dynamic configurations.
CXFS differs from XFS by being a distributed, clustered shared
access file system, allowing multiple computers to share large
amounts of data. All systems in a CXFS file system have the same,
single file system view, i.e. all systems read and write all
files at the same time at near-local file system speeds. CXFS
performance approaches the speed of standalone XFS even when
multiple processes on multiple hosts are reading from and writing
to the same file. This makes CXFS suitable for applications
with large files, and even with real-time requirements like
video streaming. Dynamic allocation algorithms ensure that a
file system can store and a single directory can contain millions
of files without wasting disk space or degrading performance.
CXFS extends XFS to Storage Area Network (SAN) disks, working with
all storage devices and SAN environments supported by SGI. CXFS
provides the infrastructure allowing multiple hosts and operating
systems to have simultaneous direct access to shared disks,
and the SAN provides high-speed physical connections between
the hosts and disk storage.
CERN: DataGrid: Data Access and Mass Storage Systems
http://edg-wp2.web.cern.ch/edg-wp2/docs/DataGrid-02-D2.1-0105-2_0.doc
I have found it challenging but instructive to work on your question. I
hope you are pleased with my findings. If you are not, please advise me
through a Clarification Request and give me a chance to fully meet your
needs before you rate this answer.
Regards,
leapinglizard |