Sunday, August 28, 2016

How to avoid passwords in your Sqoop scripts

For anyone who uses Sqoop and is looking for a way to keep passwords out of scripts, here is a good article on using a JCEKS credential store: https://www.mapr.com/blog/key-tips-managing-passwords-sqoop
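As a rough sketch of the idea (not taken from the linked article), the Python helper below builds the two command lines usually involved: one that stores the password in a JCEKS keystore with `hadoop credential create`, and one that runs a Sqoop import with `--password-alias` instead of a plain-text password. The function names, the keystore path, the alias, the JDBC URL, and the table are all hypothetical placeholders. It assumes a Sqoop version that supports `--password-alias` (1.4.5 or later).

```python
import shlex


def build_credential_create(alias, provider_path):
    """Command that stores a password under `alias` in a JCEKS keystore.

    Hadoop prompts for the password interactively, so it never appears
    in the script or in shell history.
    """
    return ["hadoop", "credential", "create", alias, "-provider", provider_path]


def build_sqoop_import(jdbc_url, username, table, target_dir,
                       provider_path, alias):
    """Sqoop import command that reads the password from the keystore."""
    return [
        "sqoop", "import",
        # Generic Hadoop -D options must come directly after the tool name.
        f"-Dhadoop.security.credential.provider.path={provider_path}",
        "--connect", jdbc_url,
        "--username", username,
        # The alias replaces --password / --password-file.
        "--password-alias", alias,
        "--table", table,
        "--target-dir", target_dir,
    ]


if __name__ == "__main__":
    # All values below are made-up placeholders for illustration.
    provider = "jceks://hdfs/user/alice/mydb.jceks"
    alias = "mydb.password.alias"
    print(shlex.join(build_credential_create(alias, provider)))
    print(shlex.join(build_sqoop_import(
        "jdbc:mysql://db.example.com/sales", "alice",
        "orders", "/data/orders", provider, alias)))
```

The script only prints the commands; running them requires a Hadoop client and Sqoop on the path, and the keystore file should be readable only by the user who runs the job.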

Sunday, August 21, 2016

Amazon QuickSight can now be used to analyze your billing data

AWS now lets customers load their Cost and Usage Report data directly into Amazon Redshift and the new QuickSight product for analysis. Here is the announcement from AWS: https://aws.amazon.com/about-aws/whats-new/2016/08/aws-cost-and-usage-report-data-is-now-easy-to-upload-directly-into-amazon-redshift-and-amazon-quicksight/

Wednesday, August 10, 2016

Monday, August 8, 2016

Big Data Roles and Responsibilities

Here is a good list of roles and responsibilities for Big Data: http://www.kdnuggets.com/2015/11/different-data-science-roles-industry.html

Friday, May 20, 2016

SQL on Hadoop - Trafodion

http://trafodion.apache.org/quickstart.html http://trafodion.apache.org/architecture-overview.html

Thursday, March 10, 2016

My Kafka Blog

https://datafloq.com/read/realize-real-time-analytics-iot-monetization-kafka/1930

PL/SQL

This is a great addition to Hive: you can now get data from Hive and an RDBMS at the same time. http://www.hplsql.org/home

Sunday, February 28, 2016

Kafka and Spotify

https://labs.spotify.com/2016/02/25/spotifys-event-delivery-the-road-to-the-cloud-part-i/

Friday, February 26, 2016

Apache NiFi aka DataFlow

http://www.infoworld.com/article/2975833/hadoop/hortonworks-buys-better-hadoop-data-flow-management.html

Thursday, February 25, 2016

Hive Streaming

https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest This builds on Hive's transaction capability, but it is limited to tables in ORC format and currently supports only Storm and Flume.

Spark Streaming vs Flink


Tuesday, February 23, 2016

Thursday, February 4, 2016

Sunday, January 24, 2016

ClickStream Data Analytics end-to-end

http://hortonworks.com/hadoop-tutorial/how-to-visualize-website-clickstream-data/

Coursera Big Data Training

Affordable online training from a reputable university. https://www.coursera.org/specializations/big-data?utm_medium=onlineads&utm_campaign=Big+Data&utm_source=fb&nan_pid=1846868993