Dropping XFS from My Workstation
3 Jan 2009
My dual Opteron workstation has been around for nearly 5 years now. It’s had some bumps and bruises along the way (some of which were due to my own actions), but it has been a great machine. It still has very good performance, especially given its age.
When I first built it in May of 2004, Fedora Core 2 was barely out and was the first Fedora to sport an AMD64 (x86_64) 64-bit version. That was the first and last time that I installed Linux on this box from scratch. Since then, I’ve upgraded it to FC3, FC4, FC5, FC6, F7, F8 and now F9 (I will upgrade to F10 in a week or so).
When I installed FC2, I used the ext3 filesystem for the root volume (I use LVM). I "converted" the root volume to the XFS filesystem on 2006/08/03. I also created a few volumes using XFS and reiserfs (v3.6) filesystems.
Over time, I’ve had a few minor problems with XFS. Recently, the problems with the root volume grew to the point where I needed to convert it to something else, which I did the other day. The root volume is now on reiserfs. That leaves just 3 volumes that are still XFS.
After upgrading to F9 and installing updates, there were a couple of weird issues that I was dealing with. I also kept seeing some filesystem corruption messages (on the terminal, in the logs) for XFS volumes (but they don’t tell you which one). That’s it, I’m done with this XFS thing, so I’m going to convert those filesystems over to something else and get rid of XFS on this workstation.
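Since the corruption messages don’t name the volume, it helps to at least list which mounted volumes are XFS. Here is a minimal sketch in Python, assuming a Linux system where `/proc/mounts` is available; the function name `mounts_of_type` is my own invention for illustration. It only narrows down the candidates, it does not tell you which one is corrupt.

```python
def mounts_of_type(mounts_text, fstype):
    """Return (device, mount_point) pairs whose filesystem type equals fstype."""
    matches = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each /proc/mounts line is: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == fstype:
            matches.append((fields[0], fields[1]))
    return matches
```

To use it, read the mount table and filter it, for example `mounts_of_type(open("/proc/mounts").read(), "xfs")`, then check each volume it returns.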
The three volumes are for /usr/, /var/ and /var/log/. I could just drop to single user mode and convert /var/ and /var/log/ without any difficulty. For whatever reason, on Fedora and derivatives (including RHEL, CentOS, etc.), I have never been able to umount /usr/ successfully once it’s mounted. So, I’m using a rescue environment (to convert all 3) like I did for the root filesystem conversion, just so I don’t have to muck with it.
I haven’t had a single problem with the two XFS volumes I have on my home file server (one for /music/ and one for /video/). That server has been running openSUSE for a few years now. It’s also a much more complicated setup on that hardware, which I’ll talk about more in a later article.
For those who want to post comments that I’m an idiot for using reiserfs, please, don’t bother. I’ve heard every reason why this filesystem or that filesystem sucks and you should only use, “the other one,” instead. Look, it’s this simple: since the filesystem is the one piece of software where we just don’t tolerate bugs, when something does go wrong the stories live on for years. I’ve heard horror stories, of kinds that you might never be able to imagine, describing data loss at the hands of ext2, ext3, reiserfs, XFS, JFS and many other filesystems. I’ve only experienced data loss with ext2 and ext3. XFS has given me problems, but thankfully not with files that I couldn’t easily replace. I’ve hardly used JFS, but I do have a volume or two on my home file server that are JFS and there’s been zero trouble there.
Here’s my philosophy about filesystem type selection: use the right tool for the job.
It’s not always easy to say with certainty that you should always use this filesystem here and that one there. Benchmarks show all 4 of ext3, reiserfs, JFS and XFS as having statistically equal performance for general use cases (like workstations). There are some rules by which I can say, from experience, that one of these will outperform the others for a particular use. I’ve been meaning to run another series of performance benchmarks on as many viable Linux filesystem types as I can. I’ll post results and talk about use cases then. For now, here are some basic tips from my experiences:
- Always check the “expiration date” on the horror stories that people tell you. Chances are the story is old, as reiserfs, ext3, JFS and XFS have all been quite stable for many years now.
- ext3 will almost never outperform the others for a specialized task.
- reiserfs, JFS and XFS will almost always have roughly equal performance for most specialized tasks. This is primarily because they share very similar basic filesystem design concepts, though, obviously, the implementations vary. I’ve thought for many years that XFS was derived, in part, from reiserfs (due to some very hard to discount coincidences in XFS structures and code) but that it also shares some design elements with MacOS filesystems.
- If you’re going to have lots of files, big or small, then move away from ext3. Newer versions of ext3 (that are not backward compatible with older ext2/3 drivers) implemented some features (like hash-indexing) from reiserfs in order to improve performance in this area. Still, put more than about one thousand files in a directory and ext3 starts to bog down quickly (when working in that directory). So, for example, ext3 is a really poor choice for spooling directories on busy servers or for proxy stores or any other application where tens if not hundreds of thousands of files will be created.
- ext3 has the worst file deletion performance of the group. Thus, for applications like print and mail servers, ext3 is a very poor choice. I have personally seen anywhere from 7 to 9 times better performance for print servers and from 10 to 23 times better performance for mail servers by simply converting the spool (and log, in the case of the mail servers) directories from ext3 to reiserfs.
- XFS and JFS have some specialized features that are very useful in high throughput applications. XFS has a bandwidth guarantee feature that is very useful with large media operations (like audio/video editing, compositing, etc.) and streaming. JFS has some sustained high throughput features that provide excellent performance for some types of databases (not database servers, but data operations by the servers).
- When it comes to databases, it’s very hard to predict which of these 4 will provide the best performance. It is very rare that ext3 is the winner, but it does happen. The only way to really know has been to create 4 volumes, one formatted with each filesystem type, and run some benchmarks against the same DB on top of them. If you’re going to do this, make sure to use the same database structures that you will be using in production. You don’t have to have “real” data, but make sure it is representative of the types and sizes of records that your database will be working with. Also, be sure the benchmarking test runs “real” queries in the “right” ratios that you see (or expect to see) in your production environment. After all of that testing, you’ll probably see that one of the filesystem types outshines the rest.
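To make the “right ratios” idea concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: it uses SQLite (from the standard library) as a stand-in for whatever database you actually run, and the `orders` table, the three queries and the 70/25/5 weights are hypothetical. Replace them with your production structures and your observed query mix.

```python
import random
import sqlite3
import time

# Hypothetical schema; substitute the structures you use in production.
SCHEMA = ("CREATE TABLE IF NOT EXISTS orders "
          "(id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")

# (weight, SQL, parameter factory): weights mirror the share of each query type.
WORKLOAD = [
    (70, "SELECT id, amount FROM orders WHERE customer_id = ?",
     lambda rng: (rng.randrange(100),)),
    (25, "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
     lambda rng: (rng.randrange(100), round(rng.uniform(1, 500), 2))),
    (5, "DELETE FROM orders WHERE customer_id = ?",
     lambda rng: (rng.randrange(100),)),
]


def benchmark(db_path, workload=WORKLOAD, total_queries=1000, seed=0):
    """Run the weighted query mix against db_path and return elapsed seconds."""
    rng = random.Random(seed)  # fixed seed: every volume sees the same sequence
    weights = [weight for weight, _, _ in workload]
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(SCHEMA)
        conn.commit()
        start = time.perf_counter()
        for _ in range(total_queries):
            # Pick the next query according to the configured ratios.
            _, sql, make_params = rng.choices(workload, weights=weights, k=1)[0]
            conn.execute(sql, make_params(rng)).fetchall()
            conn.commit()  # commit each statement so writes reach the disk journal
        return time.perf_counter() - start
    finally:
        conn.close()
```

Point `db_path` at a file on each of the test volumes in turn and compare the timings. Run it several times per volume, since caching can easily skew a single run.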
Once, while I was consulting with a Fortune 500 company that will remain nameless, we saw that certain tables experienced huge performance benefits on one filesystem and other tables were significantly better on another. They actually reworked the application code to split the database into two databases, which were then stored on two different filesystems in order to take advantage of this.
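Going back to the spool directory tips, a crude way to see the file creation and deletion differences for yourself is to time a burst of small files on each volume. This is only a sketch under my own assumptions: the function name `churn`, the file count and the file size are made up for illustration, and it says nothing about how a real mail or print spool behaves under concurrent load.

```python
import os
import shutil
import tempfile
import time


def churn(directory, count=10000, size=512):
    """Create, then delete, `count` small files in a scratch dir under `directory`.

    Returns (create_seconds, delete_seconds).
    """
    payload = b"x" * size
    workdir = tempfile.mkdtemp(prefix="churn-", dir=directory)
    try:
        start = time.perf_counter()
        for i in range(count):
            with open(os.path.join(workdir, f"msg{i:08d}"), "wb") as f:
                f.write(payload)
        create_seconds = time.perf_counter() - start

        start = time.perf_counter()
        for name in os.listdir(workdir):
            os.unlink(os.path.join(workdir, name))
        delete_seconds = time.perf_counter() - start
    finally:
        # Remove the scratch directory even if something failed part way through.
        shutil.rmtree(workdir, ignore_errors=True)
    return create_seconds, delete_seconds
```

Call `churn()` once per mount point you want to compare, passing a directory that lives on that volume, and repeat each run a few times before drawing conclusions.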
Basically, each of these filesystem types has its advantages and disadvantages. There are other journalling and log filesystems available for Linux that are worth looking at for some applications. If you have a strong bias towards just one filesystem type and won’t even look at the others, then you are very likely missing out on some benefits that you could have. If nothing else, it’s certainly an interesting topic … to some of us geeks.





