Monday, December 15, 2014

HDInsight Essentials, 2nd edition, is coming soon

This is the 2nd edition of my book, HDInsight Essentials. This one is more in-depth and goes through the journey of building an enterprise data lake. It is up to date with Hadoop 2.X and HDInsight 3.1.

I also take a real-life project and walk through its ingestion, organization, transformation, and reporting phases.

https://www.packtpub.com/big-data-and-business-intelligence/hdinsight-essentials-second-edition





Monday, December 8, 2014

Hive 0.14 released with useful features for RDBMS offload use cases

Hive 0.14 has great features that bring it really close to an RDBMS solution based on Hadoop: http://hortonworks.com/blog/announcing-apache-hive-0-14/

Key features:

  • Transactions with ACID semantics
  • Cost Based Optimizer
  • SQL Temporary Tables

Hive design docs, if you are interested in getting into the details: