Thursday, March 28, 2013

"ZFS on Linux" ready for wide scale deployment


Quoting lead developer Brian Behlendorf from Lawrence Livermore National Lab (LLNL):

"Today the ZFS on Linux project reached an important milestone with the official 0.6.1 release! Over two years of use by real users has convinced us ZoL is ready for wide scale deployment on everything from desktops to super computers." Read the full announcement

ZFSOnLinux (or ZoL) is a high-performance implementation of ZFS as a kernel module, and performance is on par with Solaris (especially on new hardware).
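For anyone who wants to kick the tires, creating a pool takes only a couple of commands once the ZoL packages are installed; the device names and pool layout below are just an example, not a recommendation:

# load the kernel module (the packages normally take care of this)
modprobe zfs

# create a raidz2 pool from six example drives plus a filesystem on it
# (device names are placeholders; /dev/disk/by-id paths are safer in production)
zpool create tank raidz2 sdb sdc sdd sde sdf sdg
zfs create tank/scratch
zfs set compression=on tank/scratch

# check pool health
zpool status tank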

It is used by the LLNL Sequoia HPC Cluster with 55PB of storage:

http://arstechnica.com/information-technology/2012/06/with-16-petaflops-and-1-6m-cores-doe-supercomputer-is-worlds-fastest/

The porting of ZFS to Linux was funded by the DOE and started in 2008. It is important to understand that ZoL is not an unstable beta but the result of a more than five-year effort.

Please see presentations from 2011 and 2012 that provide additional details:
http://zfsonlinux.org/docs.html

Sunday, February 17, 2013

Ubuntu 12.04.2 LTS Kernel confusion

Recently there were some changes in Ubuntu: with the release of 12.04.2, new installs from CD/DVD will use the lts-quantal 3.5 kernel from Ubuntu 12.10 by default, while apt-get upgrade and dist-upgrade will continue to default to the old 3.2 kernel. The 3.5 kernel gets the same level of support the 3.2 kernel has, but only until the next LTS release (14.04), whereas the 3.2 kernel will be supported for the full five years. Canonical recommends leaving VMs and cloud installs on 3.2.
https://wiki.kubuntu.org/Kernel/LTSEnablementStack
Since we want to keep our scientific computing stack fresh, this means we would upgrade to 14.04 next year anyway. That puts one question on the table: do we want to upgrade our current compute systems to kernel 3.5 or stay on 3.2?
I am not yet sure whether there are any direct benefits beyond better support for the Micron SSD controller in the Dell R720 hardware we use. One benefit I can see is TCP connection repair, which is useful for HPC checkpointing.
Another interesting feature is improved performance debugging:
http://kernelnewbies.org/Linux_3.5#head-95fccbb746226f6b9dfa4d1a48801f63e11688de
and a network priority cgroup:
http://kernelnewbies.org/Linux_3.3#head-f0a57845639c0fbc242438e4cb76d44d1f103c24
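A quick sketch of how the net_prio controller could be used once we are on a 3.3+ kernel; the group name, interface, priority value, and PID below are arbitrary examples:

# mount the net_prio controller and create a group for bulk transfers
mkdir -p /sys/fs/cgroup/net_prio
mount -t cgroup -o net_prio none /sys/fs/cgroup/net_prio
mkdir /sys/fs/cgroup/net_prio/bulk

# tag traffic from processes in this group with priority 2 on eth0
# (how that priority is honored depends on the queueing discipline)
echo "eth0 2" > /sys/fs/cgroup/net_prio/bulk/net_prio.ifpriomap

# move a process, e.g. PID 4242, into the group
echo 4242 > /sys/fs/cgroup/net_prio/bulk/tasks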
We would probably leave most of our virtual systems on kernel 3.2 to enjoy the full five-year support; our desktop deployments may go to 3.5 if the hardware requires it.
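For reference, this is roughly how you would check what a machine is running and opt an existing 12.04 install into the newer stack; the meta package names are the ones listed on the wiki page above, so double-check them before rolling anything out:

# what are we running right now?
uname -r
lsb_release -d

# opt an existing 12.04 install into the quantal enablement stack
# (meta package names as listed on the LTSEnablementStack wiki page)
sudo apt-get update
sudo apt-get install linux-generic-lts-quantal xserver-xorg-lts-quantal
sudo reboot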

Wednesday, August 29, 2012

OpenStack Swift vs AWS Glacier costs

Since AWS Glacier hit the road there have been some interesting discussions and blog posts comparing costs between Glacier and local solutions such as OpenStack Swift. Glacier is hard to beat if you do TCO calculations that include everything: datacenter, power, cooling, and staff. For many of us these costs vary a lot depending on things like being in a Fortune 500 company or a government-funded agency, or residing in a location with low power and cooling costs vs. LA or NYC. Some of us even have the notion of sunk costs .....

If we just look at the plain storage hardware, we know for example that we can get standard 36-drive Supermicro storage servers for less than $5k, and we have seen the latest 4TB Hitachi Deskstar for $239 on Google Shopping. The Hitachi Deskstar seems to have an excellent reputation, and folks who know what they are doing recommend it as well (albeit the older 3TB version).
So we seem to be getting 144TB raw, which might roughly translate to 130TiB usable in this box, and it costs ($5000 + 36 * $239) / 130TiB = $105-$115/TiB depending on your sales tax... let's say $110/TiB. Swift needs 2-3 replicas, so with 3 replicas your actual costs end up around $330/TiB, or $66/TiB/Y if we assume the whole system will run for 5 years. That's not too bad compared to Glacier, which runs minimally at about $120/TB/Y.
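A little sketch to redo the arithmetic with your own numbers; the values below are just the assumptions from this post:

# back-of-the-envelope cost estimate; tweak for your own pricing,
# tax, replica count and usable-capacity estimate
chassis=5000         # 36-bay Supermicro server, USD
drives=36
drive_price=239      # 4TB Hitachi Deskstar, USD
usable_tib=130       # rough usable capacity of the box
replicas=3           # Swift default replica count
years=5              # assumed system lifetime

awk -v c=$chassis -v n=$drives -v p=$drive_price -v u=$usable_tib \
    -v r=$replicas -v y=$years 'BEGIN {
        t = (c + n * p) / u
        printf "$/TiB raw: %.0f  with %d replicas: %.0f  per year: %.0f\n", t, r, t * r, t * r / y
    }'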
If Swift sounds compelling to you, you still have to operate and support it, but you can actually get tech support from a number of vendors such as www.swiftstack.com.

Amar here has another idea which I find intriguing. LTFS allows you to mount each tape drive (up to 3TB capacity each) into an individual folder on your Linux box. Just using LTFS is probably painful since you may end up with hundreds of small 3TB storage buckets... but if there were a way to use Swift with LTFS, this could possibly push storage costs down to under $20/TB/Y. I'd like to learn more about this.
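I have not tried this yet, but mounting a cartridge with the LTFS FUSE tooling looks roughly like the sketch below; device paths and option syntax differ between vendor implementations, so treat this purely as an illustration:

# format a cartridge for LTFS, then mount the drive into a directory
# (device path and tool options vary by vendor; these are placeholders)
mkltfs -d /dev/sg3
mkdir -p /mnt/tape01
ltfs -o devname=/dev/sg3 /mnt/tape01

# from here on it behaves like a (slow) local filesystem
cp bigdata.tar /mnt/tape01/
umount /mnt/tape01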





Sunday, May 13, 2012

OpenStack Swift vs Gluster

As I am trying to get my head around OpenStack Swift storage, I need to compare it to something we already know. We have been using GlusterFS for years in our shop and are reasonably happy with it for data that does not require high-performance disk and high uptime. Gluster sounds like a simple solution, but its codebase has grown over the years and has not been free of bugs. As of 2012 it is really quite stable, though.

Let's look at the 2 codebases:

git clone https://github.com/gluster/glusterfs.git
git clone https://github.com/openstack/swift.git

>du -h --summarize glusterfs/
44M     glusterfs/
>du -h --summarize swift/
15M     swift

Well, Gluster is three times the size; let's take a more detailed look at the code:

>cloc --by-file-by-lang glusterfs/


---------------------------------------------------
Language         files       comment           code
---------------------------------------------------
C                  272         14179         256462
C/C++ Header       214          5289          23208
XML                 24             2           6544
Python              25          1836           5114
m4                   3            85           1447
Bourne Shell        34           359           1419
Java                 7           168            988
make               107            36            965
yacc                 1            15            468
Lisp                 2            59            124
vim script           1            49             89
lex                  1            15             64
---------------------------------------------------
SUM:               691         22092         296892
---------------------------------------------------


>cloc --by-file-by-lang swift/


---------------------------------------------------
Language         files       comment           code
---------------------------------------------------
Python             101          6137          32575
CSS                  3            59            627
Bourne Shell         8           138            251
HTML                 2             0             82
Bourne Again Shell   3             0             23
---------------------------------------------------
SUM:               117          6334          33558
---------------------------------------------------

Hm, Gluster has 8 times more lines of C code (SLOC) than Swift has Python code. I'm not in a position to compare Python with C (other than stating that as of 2012 they seem to be similarly popular), but if we simply assume that the number of errors per line of code is similar, Swift may at some point have a stability advantage over Gluster. Gluster has been in development for many years and took a long time to mature; Swift has only been around for two years, and some really big shops already seem to be betting on it.
Of course this is somewhat an apples-to-oranges comparison, because Gluster is accessible as a POSIX file system and as an object store and has its own protocol stack (NFS/glusterfs), while Swift just uses HTTP. Performance considerations are also not discussed here.
As a comparison, the Linux kernel has roughly 25 million lines of code, and a tool like GNU make has about 33,000. Make is not a very complex piece of software. Is OpenStack Swift?
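To make the "Swift just uses HTTP" point concrete, here is a minimal sketch of talking to a Swift cluster with curl; the storage URL and token are made-up placeholders you would normally get back from the auth middleware:

TOKEN="AUTH_tk0123456789abcdef"                       # placeholder token
SURL="http://swift.example.com:8080/v1/AUTH_test"     # placeholder storage URL

# list containers in the account
curl -i -H "X-Auth-Token: $TOKEN" $SURL

# create a container and upload an object into it
curl -i -X PUT -H "X-Auth-Token: $TOKEN" $SURL/backups
curl -i -X PUT -H "X-Auth-Token: $TOKEN" -T data.tar.gz $SURL/backups/data.tar.gz

# fetch the object back
curl -H "X-Auth-Token: $TOKEN" -o data.tar.gz $SURL/backups/data.tar.gz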





Saturday, May 12, 2012

Starting to research OpenStack Swift

As we are always looking at lowering our storage costs while still trying to manage petabytes of data, we have been hearing about "object storage" for a few years. To a traditional Linux/Unix-heavy scientific computing shop this buzzword sounds a bit like a bad disease: something that could break in all sorts of ways, have unbearable latency, and so on.


On the other hand we see almost every day that storage and other IT vendors are jumping on the object and cloud storage bandwagon. Is it all just cloud hype, or is there something more to it? One platform that sticks out in particular is OpenStack, after more than a dozen companies (AT&T, IBM, Red Hat, SUSE, Cisco, Dell, Canonical, etc.) pledged to support the OpenStack foundation. OpenStack was created by Rackspace and NASA (here is the story behind it), and the storage component Swift was originally developed at Rackspace. As we are most interested in storage, Swift is the thing we are looking at.
Now, is this really an OSS project with broad support and many contributors? To date Rackspace appears to be doing most of the real work, but there is a fair number of other big names contributing code as well.


We work quite a bit with Dell hardware, and it is nice to see that they have created a deployment solution called Crowbar that uses an OSS DevOps approach to push OpenStack onto their servers. Their cloud dude seems to be a bit of an OpenStack enthusiast. There are also a few startups betting on OpenStack Swift, such as SwiftStack.com, which sells you a customized Ubuntu image with a web management tool that lets you deploy a Swift storage cluster in a few minutes. The SwiftStack people are core contributors to the OpenStack Swift project, so they know the code base very well.
How about end-user adoption at universities and other research places? The San Diego Supercomputer Center brought its OpenStack storage cloud online last year and is offering pretty reasonable pricing (about 1/3 of the price of S3).
Why are all these large companies joining OpenStack? Well, of course they are all way behind Amazon EC2/S3, and joining forces can be seen either as a good strategy or as a desperate attempt to catch up.

From a storage technology perspective, there are maybe three reasons for this push that come to my mind. First, it takes a very long time to develop a storage platform. For BlueArc, 3PAR, Compellent, Isilon, etc., it took almost 10 years to convince many IT managers that they were viable options; HP and Dell needed to buy up one of those manufacturers to get the know-how. Second, customers are increasingly wary of vendor lock-in and lack of scalability, because big data capacity and especially performance needs are very unpredictable. And third, traditional storage techniques such as RAID will not be viable in the future, and the alternatives (examples are GPFS and Panasas, but also 3PAR with its chunklet approach) take a very long time to develop (again, see the first point).

But why does OpenStack seem to have more followers than CloudStack, Eucalyptus, or others? It is extremely scalable, but I could not (yet) find any strong hints that it is more scalable than the other stacks.
From a developer and system integrator view, the OpenStack trump card seems to be modularity, which is important for keeping up development speed and for allowing a large community of developers to participate.

What strikes me from a systems management perspective is the simplicity of the underlying toolset. Every Unix admin is familiar with Python, SQLite, rsync, and Linux/XFS. At first you might think: what, that's what they are using? After all, rsync is more than 15 years old; is this the tool that is supposed to help conquer the storage world in the 21st century?
Then you think: oh, if our sysadmins ever have to do a root cause analysis on performance issues they already know rsync, and if they ever have to throttle the replication engine they already know what --bwlimit is. That does not sound too bad... but we will have to take a deeper look at this. To be continued.
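For the record, throttling a transfer with plain rsync is a one-liner; the host and paths below are made up, and --bwlimit takes KB/s:

# push a directory tree to another node, capped at roughly 10 MB/s
rsync -av --bwlimit=10000 /srv/data/ storage02:/srv/data/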




Random Links & Blogs:
http://programmerthoughts.com/openstack/swift-tech-overview/
http://searchstorage.techtarget.com/news/2240105808/Caringo-CAStor-integrates-object-storage-with-OpenStack-Swift
http://www.slideshare.net/HuiCheng2/integrating-open-stack
http://www.buildcloudstorage.com/
http://www.cloudconnectevent.com/santaclara/2012/presentations/free/99-john-dickinson.pdf
http://www.buildcloudstorage.com/2012/01/can-openstack-swift-hit-amazon-s3-like.html

Consultants:
http://www.talkincloud.com/it-consultants-build-openstack-cloud-business-practices/
http://www.griddynamics.com/ or http://openstackgd.wordpress.com/

Friday, November 21, 2008

how to remove spikes from rrd graphs (cacti, mrtg, zenoss, ganglia)

If you have been using open source tools for performance monitoring of your IT infrastructure, combine harvester, or space station, you have most likely come across rrdtool. This Swiss army knife is used to create the pretty graphs for most monitoring tools. Sometimes graphs get out of whack because some unplanned event causes a spike that pushes the graph out of proportion. In our case this was a crash of one of our storage head units, which rendered the annual IOPS graph almost unreadable.

Fortunately there is a little tool called "removespikes" on the rrdtool website that can take care of this little problem:

http://oss.oetiker.ch/rrdtool/pub/contrib/

Download and unpack the latest removespikes-xxxxxxx-mkn.tar.gz from the website, copy the script removespikes.pl into your rrdtool graph directory, and execute it like this:

./removespikes.pl -d -l 0.1 netappa0_ops.rrd

The -l parameter defines how aggressively spikes are chopped off. Start with 0.1. The default is 0.6.
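Putting the steps together, the whole procedure looks roughly like this; the tarball name keeps the placeholder from above, the cacti rra path is just an example, and backing up your .rrd files first is a good idea:

# make a backup copy of the rrd files before modifying them
cp -a /var/lib/cacti/rra /var/lib/cacti/rra.bak     # path is an example

# fetch and unpack the contrib script
wget http://oss.oetiker.ch/rrdtool/pub/contrib/removespikes-xxxxxxx-mkn.tar.gz
tar xzf removespikes-xxxxxxx-mkn.tar.gz

# copy the script next to the graphs and run it, starting with -l 0.1
# (assuming the tarball unpacks into a directory of the same name)
cp removespikes*/removespikes.pl /var/lib/cacti/rra/
cd /var/lib/cacti/rra && ./removespikes.pl -d -l 0.1 netappa0_ops.rrd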

In my case it seemed to do the right thing and chopped off two peaks, at 2008-01-25 and 2008-03-03.

Using the "rrdtool tune" command is another option; rather than removing existing outliers, it caps the values a data source will accept going forward. I think removespikes is the better fit here, but I have not seen the results of "rrdtool tune" yet.
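For completeness, that would look something like this; the data source name "ops" is an assumption, so check yours with rrdtool info first:

# find the data source names and their current limits
rrdtool info netappa0_ops.rrd | grep 'ds\['

# cap the (hypothetical) "ops" data source at 100000; larger values are
# stored as unknown from now on, existing spikes are left untouched
rrdtool tune netappa0_ops.rrd --maximum ops:100000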

dipe