dipe's log

Sunday, September 25, 2016

SQL performance of many big data environments

Recently I have been enjoying Mark Litwintschik's no-nonsense blog posts on the query performance of different big data environments. For these he ingests the records of more than one billion NYC taxi cab rides, measures the time it takes to ingest the data, and runs 4 different (simple to complex) SQL queries against the database. He tells us what hardware or cloud config his tests run on and how much it costs. His tests are easily reproducible and he invites people to tell him if they think a test could be improved to be fair. Pretty awesome!

Here is a summary of his results:

Test Environment	median (sec)	factor slower
MapD & 8 x Nvidia Pascal Titan X	0.109	1
MapD & 4 x AWS g2.8xlarge	0.2185	2
MapD & 4 x Nvidia Titan X	0.285	3
MapD & 8 x Nvidia Tesla K80	0.123	1
AWS Redshift on 6 x ds2.8xlarge	1.905	17
AWS Redshift on 1 x ds2.xlarge	73.5	674
Google BigQuery	2	18
Presto & EMR 5 x unknown instance	79	725
Presto & EMR 50 x unknown instance	43.5	399
Presto & EMR EMRFS 5 x m3.xlarge	81	743
Presto & EMR HDFS 5 x m3.xlarge	51.5	472
ElasticSearch & 1 x 4 core, 16GB, ssd	63.2	580
SparkSQL & EMRFS 5 x unknown	466.5	4280
Postgres & 1 x 4 core, 16GB, ssd	205	1881
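
The "factor slower" column appears to be each median divided by the fastest median (0.109 s), rounded to the nearest integer. Here is a minimal Python sketch that reproduces it for a few rows; the dictionary and function names are my own and not taken from Mark's posts:

```python
# Median query times in seconds, copied from the table above (subset of rows).
medians = {
    "MapD & 8 x Nvidia Pascal Titan X": 0.109,
    "MapD & 4 x Nvidia Titan X": 0.285,
    "AWS Redshift on 6 x ds2.8xlarge": 1.905,
    "Google BigQuery": 2.0,
    "Presto & EMR 50 x unknown instance": 43.5,
    "SparkSQL & EMRFS 5 x unknown": 466.5,
}


def factor_slower(medians):
    """Return each median divided by the fastest one, rounded to an integer."""
    fastest = min(medians.values())
    return {name: round(sec / fastest) for name, sec in medians.items()}


factors = factor_slower(medians)
# Print the environments from fastest to slowest.
for name, factor in sorted(factors.items(), key=lambda item: item[1]):
    print(f"{name}: {factor}x")
```

Adding the remaining rows to the dictionary should give the other values in the column the same way.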

A couple of observations:

GPU based databases such as MapD will take over the world (if you can fit your data in them).
GPU based databases can be inexpensive (the Titan X setup is 10x cheaper than the one using the Tesla), yet its performance is about 400x that of the fastest Hadoop cluster in the test (0.109 s vs 43.5 s).
Google BigQuery and AWS Redshift are very fast as well. BigQuery has the advantage that you don't have to set up a Redshift server farm.
Spark is slow (4280 times slower than MapD!) and should be avoided for SQL-only operations (it has other strengths, and those can be combined very well with SparkSQL).
Presto was used because pure Hive was 3-6x slower.

Please continue the great work, Mark. Perhaps you can do a test with Apache Drill, Impala or SlamData on Mongo?

Sunday, January 25, 2015

modern compression tools are fast !

This morning I played with some compression tools on a new 28 core machine (Xeon E5-2695 v3 @ 2.30GHz).

I used Python to create a 1GB string that consists of fake random DNA and dumped that to a text file:

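   import random
   import socket
   import sys

   dnalist = list('ACGTACGTACGTACGT')   # alphabet to draw from (A, C, G, T equally likely)
   bytesize = 1024 * 1024 * 1024        # 1 GB, one byte per base
   hostname = socket.gethostname()      # name of the test machine (not used below)
   # join() builds the string in one pass instead of repeated concatenation
   dnastr = ''.join(random.choice(dnalist) for _ in range(bytesize))
   sys.stdout.write(dnastr)             # redirect stdout to a file to save the result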

Then I tried standard gzip as well as the newer lz4 and lzo, and the highly parallel pigz compressor, which produces gzip compatible archives:

Tool	compression level	file size (MB)	run time (s)
gzip	6	293	111
lz4	1	693	15
lz4	6	466	69
lzo	6	500	6
lzo	7	399	412
pigz	6	292	5

lz4 performance is certainly an improvement over gzip, at the price of a lower compression ratio. However, in this test it is not quite as impressive as in these benchmarks: https://code.google.com/p/lz4/

lzo is doing really well and is actually much faster than lz4 while delivering similar compression. Levels 7-9 are really not that useful though (level 7 took 412 seconds).

lz4 claims to have much faster decompression times than lzo but I cannot confirm this here. Both tools take about 6 seconds to decompress and restore the 1GB file.

pigz shows what can be done with raw compute power. top showed 2800% CPU utilization on this 28 core Linux system. It seems to scale almost linearly with the number of cores. Decompression takes about 3 seconds. Here the local RAID array may be a limiting factor, since it can write 300-400 MB/s.
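
To put the table into perspective, here is a small Python sketch that derives the compression ratio and the input throughput from the measured sizes and run times. It assumes the uncompressed file is exactly 1024 MB; the names are my own.

```python
# (tool, level, compressed size in MB, run time in s), copied from the table above.
results = [
    ("gzip", 6, 293, 111),
    ("lz4", 1, 693, 15),
    ("lz4", 6, 466, 69),
    ("lzo", 6, 500, 6),
    ("lzo", 7, 399, 412),
    ("pigz", 6, 292, 5),
]
INPUT_MB = 1024  # assumption: the uncompressed test file is exactly 1024 MB


def summarize(results, input_mb=INPUT_MB):
    """Return (tool, level, compression ratio, input MB processed per second)."""
    return [
        (tool, level, input_mb / size_mb, input_mb / seconds)
        for tool, level, size_mb, seconds in results
    ]


summary = summarize(results)
for tool, level, ratio, throughput in summary:
    print(f"{tool:5s} level {level}  ratio {ratio:.2f}  {throughput:6.1f} MB/s")

# Speedup of pigz over single-threaded gzip at the same compression level.
pigz_speedup = 111 / 5
print(f"pigz is about {pigz_speedup:.0f}x faster than gzip")
```

A speedup of about 22x on 28 cores is roughly in line with the near-linear scaling that top suggested.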

Monday, October 14, 2013

Linux Kernel Roadmap for Ubuntu LTS

Since we are mostly an Ubuntu shop we are somewhat interested in which kernel will be in the next version, 14.04 LTS. Since there is no official road map we have to do a little crystal balling. Let's see how long each release actually takes in the 3.x series. Taking the release dates from Wikipedia we see an average of 67 days in the last 2 years:

Version	Release Date	Days
3.0	7/22/2011	64
3.1	10/24/2011	94
3.2	1/05/2012	73
3.3	3/19/2012	74
3.4	5/21/2012	63
3.5	7/12/2012	52
3.6	10/01/2012	81
3.7	12/11/2012	71
3.8	2/19/2013	70
3.9	4/29/2013	69
3.10	6/30/2013	62
3.11	9/02/2013	64

Assuming that Linus turns into a robot and releases every 67 days, the future may look like this:

3.12	11/8/2013	67
3.13	1/14/2014	67
3.14	3/22/2014	67
3.15	5/28/2014	67
3.16	8/3/2014	67
3.17	10/9/2014	67
3.18	12/15/2014	67
3.19	2/20/2015	67
3.20	4/28/2015	67
3.21	7/4/2015	67
3.22	9/9/2015	67
3.23	11/15/2015	67
3.24	1/21/2016	67
3.25	3/28/2016	67

For Ubuntu 14.04 this means that the kernel will be either 3.13 or 3.14; the former is perhaps more likely.
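
The projection above is plain date arithmetic. A minimal Python sketch, assuming a fixed 67 day cycle starting from the 3.11 release date in the table (the function name and structure are my own):

```python
from datetime import date, timedelta

LAST_MINOR = 11                   # 3.11 is the last actual release in the table
LAST_DATE = date(2013, 9, 2)      # release date of 3.11 as listed above
CYCLE = timedelta(days=67)        # assumed fixed release interval


def project(last_minor, last_date, count, cycle=CYCLE):
    """Project `count` future 3.x releases at a fixed interval."""
    return [
        (f"3.{last_minor + i}", last_date + i * cycle)
        for i in range(1, count + 1)
    ]


projected = project(LAST_MINOR, LAST_DATE, 14)
for version, day in projected:
    print(version, day.isoformat())
```

Changing CYCLE to another value shows how sensitive the guess for 14.04 is to the assumed release interval.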

Thursday, March 28, 2013

"ZFS on Linux" ready for wide scale deployment

Quoting lead developer Brian Behlendorf from Lawrence Livermore National Lab (LLNL):

"Today the ZFS on Linux project reached an important milestone with the official 0.6.1 release! Over two years of use by real users has convinced us ZoL is ready for wide scale deployment on everything from desktops to super computers." Read the full announcement

ZFSOnLinux (or ZoL) is a high performance implementation of ZFS as a Kernel module and performance is on par with Solaris (especially on new Hardware).

It is used by the LLNL Sequoia HPC Cluster with 55PB of storage:

http://arstechnica.com/information-technology/2012/06/with-16-petaflops-and-1-6m-cores-doe-supercomputer-is-worlds-fastest/

The porting of ZFS to Linux has been funded by DOE and started in 2008. It is important to understand that ZoL is not an unstable beta but the result of an effort of more than 5 years.

Please see presentations from 2011 and 2012 that provide additional details:
http://zfsonlinux.org/docs.html

Sunday, February 17, 2013

Ubuntu 12.04.2 LTS Kernel confusion

Recently there were some changes in Ubuntu: with the release of 12.04.2 it seems new installs from CD/DVD will use the lts-quantal kernel 3.5 from Ubuntu 12.10 by default. However, apt-get upgrade and dist-upgrade will continue to default to the old 3.2 kernel. The 3.5 kernel will enjoy the same level of support the 3.2 kernel had, but only until the next LTS release 14.04, while the 3.2 kernel will be supported for the full 5 years. Canonical recommends leaving VMs and cloud installs at 3.2.
https://wiki.kubuntu.org/Kernel/LTSEnablementStack
Since we want to keep our Scientific Computing stack fresh this would mean that we upgrade to 14.04 next year. This puts one question on the table: Do we want to upgrade our current compute systems to kernel 3.5 or stay on 3.2?
I am not yet sure if there are any direct benefits other than better support for the Micron SSD controller in the Dell R720 hardware we use. One that I could see is that it supports TCP connection repair, which is useful for HPC checkpointing.
Another interesting feature is improved performance debugging.
http://kernelnewbies.org/Linux_3.5#head-95fccbb746226f6b9dfa4d1a48801f63e11688de
and a network priority cgroup:
http://kernelnewbies.org/Linux_3.3#head-f0a57845639c0fbc242438e4cb76d44d1f103c24
We would probably leave most of our virtual systems on kernel 3.2 to enjoy the full 5 year support; our desktop deployment should maybe go to 3.5 if the hardware requires it.

Wednesday, August 29, 2012

OpenStack Swift vs AWS Glacier costs

Since AWS Glacier hit the road there have been some interesting discussions and blog posts comparing costs between Glacier and local solutions such as OpenStack Swift. Glacier is hard to beat if you do TCO calculations that include everything like datacenter, power, cooling & staff. For many of us these costs vary a lot depending on things like being in a Fortune 500 company or a government funded agency, or residing in a location with low power and cooling costs vs LA or NYC. Some of us even have the notion of sunk costs ...

If we just look at the plain storage hardware we know for example that we can get 36 drive standard Supermicro storage servers for less than $5k and we have seen the latest and greatest 4TB Hitachi Deskstar for $239 on Google Shopping. The Hitachi Deskstar model seems to have an excellent reputation and folks who know what they are doing recommend it as well. (albeit the older 3TB version).
So we seem to be getting 144TB raw, which might roughly translate to 130TiB usable in this box, and it costs ($5000+36*$239)/130TiB = $13,604/130TiB, or about $105-$115/TiB depending on your sales tax... let's say $110/TiB. Swift needs 2-3 replicas, so with 3 replicas your actual costs would end up at $330/TB, or $66/TB/Y if we assume that the whole system will run for 5 years. That's not too bad compared to Glacier, which runs minimally at $120/TB/Y.
If Swift sounds compelling to you, you still have to operate and support it, but you can actually get tech support from a number of vendors such as www.swiftstack.com.
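
The back-of-the-envelope numbers above can be written down as a short Python sketch. All inputs are the figures quoted in this post; the variable and function names are my own, and the $110/TiB figure is the rounded value that allows for sales tax.

```python
SERVER_USD = 5000              # 36 drive Supermicro storage server
DRIVE_USD = 239                # 4TB Hitachi Deskstar
DRIVES = 36
USABLE_TIB = 130               # rough usable capacity of one box
REPLICAS = 3                   # Swift needs 2-3 replicas; 3 is the expensive case
LIFETIME_YEARS = 5             # assumed service life of the hardware
GLACIER_USD_PER_TB_YEAR = 120  # Glacier's minimum price quoted above


def swift_cost_per_tb_year(per_tib_usd, replicas=REPLICAS, years=LIFETIME_YEARS):
    """Hardware cost per usable TB and year, with every object stored `replicas` times."""
    return per_tib_usd * replicas / years


hardware_usd = SERVER_USD + DRIVES * DRIVE_USD   # 13604
per_tib_usd = hardware_usd / USABLE_TIB          # about 104.6 before sales tax

print(f"hardware: ${hardware_usd}, ${per_tib_usd:.1f}/TiB before tax")
print(f"Swift: ${swift_cost_per_tb_year(110):.0f}/TB/Y vs Glacier: ${GLACIER_USD_PER_TB_YEAR}/TB/Y")
```

Like the text, this treats TB and TiB as interchangeable and leaves out power, cooling and staff.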

Amar here has another idea which I find intriguing. LTFS allows you to mount each tape drive (up to 3TB capacity each) into an individual folder on your Linux box. Just using LTFS is probably painful since you may have hundreds of small 3TB storage buckets ... but if there was a way to use Swift with LTFS, this could possibly push storage costs down to under $20/TB/Y. I'd like to learn more about this.

Sunday, May 13, 2012

OpenStack Swift vs Gluster

As I am trying to get my head around OpenStack Swift storage I need to compare this to something we already know. We have been using GlusterFS for years in our shop and are reasonably happy with it for data that does not require high performance disk and high uptime. Gluster sounds like a simple solution but its codebase has grown over the years and it has not been free of bugs. As of 2012 it is really quite stable.

Let's look at the 2 codebases:

git clone https://github.com/gluster/glusterfs.git

git clone https://github.com/openstack/swift.git

>du -h --summarize glusterfs/

44M glusterfs/

>du -h --summarize swift/

15M swift

Well, Gluster is about 3 times the size. Let's take a more detailed look at the code:

>cloc --by-file-by-lang glusterfs/

---------------------------------------------------
Language files comment code
---------------------------------------------------
C 272 14179 256462
C/C++ Header 214 5289 23208
XML 24 2 6544
Python 25 1836 5114
m4 3 85 1447
Bourne Shell 34 359 1419
Java 7 168 988
make 107 36 965
yacc 1 15 468
Lisp 2 59 124
vim script 1 49 89
lex 1 15 64
---------------------------------------------------
SUM: 691 22092 296892
---------------------------------------------------

>cloc --by-file-by-lang swift/

---------------------------------------------------
Language files comment code
---------------------------------------------------
Python 101 6137 32575
CSS 3 59 627
Bourne Shell 8 138 251
HTML 2 0 82
Bourne Again Shell 3 0 23
---------------------------------------------------
SUM: 117 6334 33558
---------------------------------------------------

Hm, Gluster has about 8 times more lines of C code (SLOC) than Swift has Python code. I'm not in a position to compare Python with C (other than stating that as of 2012 they seem to be similarly popular), but if we simply assume that the number of errors per line of code is similar, Swift may at some point have a stability advantage over Gluster. Gluster has been developed for many years and it took a long time to come along. Swift has only been around for 2 years and some really big shops seem to be betting on it. Of course this is somewhat of an apples to oranges comparison because Gluster is accessible as a POSIX file system and as an object store and also has its own protocol stack (NFS/glusterfs), while Swift just uses HTTP. Also, performance considerations are not discussed here.
As a comparison, the Linux kernel has roughly 15 million lines of code and a tool like GNU make has about 33000 lines of code. Make is not a very complex piece of software. Is OpenStack Swift?
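
The "similar errors per line" argument can be made explicit with a few lines of Python. The defect density below is a purely hypothetical placeholder, not a measured value for either project; only the line counts come from the cloc output above.

```python
GLUSTER_C_SLOC = 256462      # C code lines from the cloc output above
SWIFT_PYTHON_SLOC = 32575    # Python code lines from the cloc output above
DEFECTS_PER_KLOC = 1.0       # hypothetical density, for illustration only


def expected_defects(sloc, per_kloc=DEFECTS_PER_KLOC):
    """Expected defects if defects scale linearly with lines of code."""
    return sloc / 1000 * per_kloc


ratio = GLUSTER_C_SLOC / SWIFT_PYTHON_SLOC
print(f"Gluster C code is {ratio:.1f}x the size of Swift's Python code")
print(f"expected defects: Gluster {expected_defects(GLUSTER_C_SLOC):.0f}, "
      f"Swift {expected_defects(SWIFT_PYTHON_SLOC):.0f}")
```

Whatever density is plugged in, the ratio of expected defects stays at the ratio of the line counts, which is the whole (and admittedly crude) argument.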