Cassandra, meet Hadoop. Hadoop, meet Cassandra.
I just noticed that somebody has already made a product/service out of the idea of integrating Hadoop and Cassandra, two of the most hyped names in the Big Data/NoSQL space, which is exactly what I had been planning as the new architecture for my last project.
DataStax now offers Brisk:
- CassandraFS has the same interface as HDFS. So, in particular, you should be able to use most Hadoop add-ons with Brisk.
- CassandraFS has comparable performance to HDFS on sequential scans. That’s without predicate pushdown to Cassandra, which is Coming Soon but won’t be in the first Brisk release.
- Brisk/CassandraFS is much easier to administer than HDFS. In particular, there are no NameNodes, JobTracker single points of failure, or any other form of head node. Brisk/CassandraFS is strictly peer-to-peer.
- Cassandra is far superior to HBase for short-request use cases, specifically with 5-6X the random-access performance.
Check out their white paper.