Wednesday, January 21, 2015

Hadoop and Security

Hadoop security is a long-standing topic with no really good solutions so far. Hortonworks has published this article about Ranger + Dataguise: http://hortonworks.com/blog/hadoop-security-different-paradigm/?mkt_tok=3RkMMJWWfF9wsRovuq%2FOZKXonjHpfsX66%2B8uWaW%2BlMI%2F0ER3fOvrPUfGjI4JSsJhI%2BSLDwEYGJlv6SgFT7TMMbFh1rgNUxc%3D

Monday, December 15, 2014

HDInsight Essentials 2nd edition is coming soon

This is the 2nd edition of my book HDInsight Essentials. This one is more in-depth and goes through the journey of building an enterprise data lake. It is up to date with Hadoop 2.X and HDInsight 3.1.

I also take a real-life project and walk through the ingestion, organization, transformation, and reporting phases.

https://www.packtpub.com/big-data-and-business-intelligence/hdinsight-essentials-second-edition





Monday, December 8, 2014

Hive 0.14 released with useful features for RDBMS offload use cases

Hive 0.14 adds great features that bring Hive on Hadoop much closer to an RDBMS solution: http://hortonworks.com/blog/announcing-apache-hive-0-14/

Key features:

  • Transactions with ACID semantics
  • Cost Based Optimizer
  • SQL Temporary Tables

Design docs for Hive, if you are interested in the details: 

Saturday, October 25, 2014

Strata Hadoop World 2014 Conference Speaker Notes and Links

I attended the 2014 Strata Hadoop World conference in New York, which was a great success and saw increased participation compared to last year. Spark was a key highlight of the conference from a technology perspective. There are several new products and tools trying to capitalize on the predicted $40 billion market. Links to slides and videos: http://strataconf.com/stratany2014/public/schedule/proceedings?imm_mid=0c5096&cmp=em-strata-na-info-stny14_thankyou

Monday, October 20, 2014

HBase and Hive Integration

HBase has been the key database in the Hadoop ecosystem, providing transactional support that enables real-time applications to be built on top of HDFS. The following is a good article from Hortonworks that describes the roadmap for HBase and Hive. This integration should help streamline architectures in Hadoop: http://hortonworks.com/blog/hbase-hive-better-together/