
Article Data
Article Ref: 5125-EUZV-3897
Written By: Philip Slater
Date Created: Fri, 23rd Oct 2009
Updated By: Philip Slater
Date Modified: Fri, 23rd Oct 2009
 

OCFS Information

Question 

What have been the issues and features of running OCFS as part of a Dynamic Cluster?

Answer 

The following information was provided by CGS Sales Engineer Christian Ellsworth, based on his experience with his installation.

I have set up several CGP Clusters, including a 1 Million Users Cluster in Santiago, Chile. Please let me share some of that experience with you.
This project started in early 2006, while I was working at a systems integration company in Chile (a partner company of CommuniGate Systems).
During the technical evaluation for that project we searched for the best performance, rock-solid stability, and an affordable price tag. We tested several cluster filesystems, running over Red Hat Enterprise Linux 4 or SuSE Enterprise Server 8/9. The cluster filesystems (CFS) tested were:
* Lustre (from ClusterFS, later acquired by Sun)
* OCFS (official Oracle builds, devel builds, beta builds and SuSE builds)
* HP Clustered File Gateway (NFS on steroids)
* GFS (Red Hat supported)
* PolyServe
 
The platform had to support 10,000 connected accounts in normal operation and 25,000 at peak load, and offer a 2 GB quota per account.
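
As a quick back-of-the-envelope check, the storage commitment implied by those figures can be worked out as below. This is an illustrative Python sketch; the account and quota constants come from the text above, everything else is hypothetical.

    # Worst-case storage commitment for the sizing above:
    # 25,000 peak accounts, each with a 2 GB quota.
    PEAK_ACCOUNTS = 25_000
    QUOTA_GB = 2

    committed_tb = PEAK_ACCOUNTS * QUOTA_GB / 1024
    print(f"Worst-case committed quota: {committed_tb:.1f} TB")  # ~48.8 TB

Actual usage sits far below the full-quota worst case; as noted later in this article, the platform ended up holding about 25 TB of mail data, roughly half the theoretical commitment.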
Our tests showed the following on the CFS:
Lustre: required a lot of kernel surgery (kernel patches, special modules, binary compiles and fixes). The kernel modifications alone were a turn-off, because they voided any support we could have had from Red Hat.
OCFS2: has "issues" with millions of files in a single FS, such as slowdowns, nodes falling out of the cluster under high load, and weird fencing signals leading to kernel panics. But the key issue that disqualified it as a candidate was its very high metadata overhead, which ate a lot of filesystem space: at one point over 28% of the filesystem was used for metadata rather than real data, and the higher the file count, the higher the metadata use (see the measurement sketch after this list). OCFS2 is made for handling Oracle DB's large data files (a few large files per filesystem). We tested several builds and versions... with similar results.
HP Clustered File Gateway: very expensive, and the performance is not great... it uses kernel NFS access on the clients.
Red Hat Enterprise Linux 4 and Red Hat GFS: the performance is very solid, it runs on anything supported by Red Hat, the metadata use is very low (2-4%), and burn-in testing showed 5 days at 100% I/O without crashes or hiccups. The price is not "gratis", but the performance and support are worth it, and it is certified and supported by HP.
PolyServe: pretty much the same as the HP Clustered File Gateway... expensive, taints the kernel, not really "sturdy".
OCFS2: I personally don't recommend using it; I would stick with a well-tuned GFS rather than start guessing with OCFS2. CommuniGate should work on top of OCFS2, but I'm not sure OCFS2 is suited to and designed for running a production email environment through a high-load peak.
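
A rough way to observe the kind of metadata overhead described for OCFS2 above is to compare what the filesystem reports as used against the sum of the file payload it actually holds. The Python sketch below is a minimal illustration (the mount point is hypothetical, and block-rounding slack is counted together with true metadata):

    import os

    def non_payload_fraction(mount_point):
        """Fraction of used space that is not file contents
        (metadata, directories, block-allocation slack)."""
        st = os.statvfs(mount_point)
        used = (st.f_blocks - st.f_bfree) * st.f_frsize  # space the FS says is used

        payload = 0
        for root, _dirs, files in os.walk(mount_point):
            for name in files:
                try:
                    payload += os.path.getsize(os.path.join(root, name))
                except OSError:
                    pass  # file vanished mid-walk; normal on a live mail spool

        return (used - payload) / used

    # Hypothetical mount point; on the OCFS2 volumes described above this
    # figure reportedly climbed past 28% as the file count grew.
    print(f"non-payload overhead: {non_payload_fraction('/mnt/maildata'):.1%}")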
 
After 40+ months the setup still runs with no issues, with over 99.98% uptime for the whole platform and 25 TB of mail data. I was the lead support engineer for the whole platform for over a year, and I slept well every night of that time.
We used Red Hat EL4 with GFS.
General recommendations:
- No cluster filesystem will perform well with less than 15% free space on the filesystem (see the check script after this list)
- Get a platform certified by both the hardware and OS vendors
- Define your sizing with a technical worst-case scenario and a best-case commercial scenario in mind
- Tuning is key for stability and performance
- Before going live with a new service, test it thoroughly: stress it, hammer it, run SPECmail
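
The free-space rule is easy to automate. Below is a minimal monitoring sketch in Python; the mount points are hypothetical, and the 15% floor is the figure from the first recommendation:

    import os
    import sys

    FREE_FLOOR = 0.15  # per the recommendation above

    def low_space_volumes(mount_points):
        """Return the mount points with less than 15% free space."""
        low = []
        for mp in mount_points:
            st = os.statvfs(mp)
            free = st.f_bavail / st.f_blocks  # fraction available to non-root users
            print(f"{mp}: {free:.1%} free")
            if free < FREE_FLOOR:
                low.append(mp)
        return low

    if __name__ == "__main__":
        low = low_space_volumes(["/mnt/gfs01", "/mnt/gfs02"])  # substitute your volumes
        if low:
            print("WARNING: below 15% free:", ", ".join(low), file=sys.stderr)
            sys.exit(1)

Run from cron on each node, a check like this catches a shrinking volume before performance starts to degrade.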
