Hadoop Simplified: Avro Data Serialization in Hadoop

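Avro is a data serialization system: http://avro.apache.org/
Its key advantage is that it supports schema evolution.
The schema is stored along with the data.
Schemas are expressed in JSON.
The writer uses a schema to write an Avro file and the reader uses a schema to read it; the two schemas do not have to be identical.
Avro resolves the differences between the writer's schema and the reader's schema when the data is read, which is how it handles schema evolution.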
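
To make the reader/writer schema idea concrete, here is a minimal sketch in plain Python. It does not use the Avro library. The User schemas, the field names, and the resolve_record helper are all hypothetical, made up for illustration. The helper only imitates, in a simplified way, the record rule from Avro schema resolution: fields are matched by name, a field the reader does not know is ignored, and a field the writer never wrote takes the reader's default.

```python
import json

# Hypothetical writer schema (version 1), as it would be stored with the data.
WRITER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "favorite_number", "type": ["null", "int"], "default": null}
  ]
}
""")

# Hypothetical reader schema (version 2): drops favorite_number and
# adds favorite_color with a default value.
READER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "favorite_color", "type": "string", "default": "unknown"}
  ]
}
""")


def resolve_record(record, writer_schema, reader_schema):
    """Simplified illustration of record resolution (not the Avro library).

    Only top-level fields are handled; type promotion, aliases, and
    nested records are deliberately left out.
    """
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            # Field exists on both sides: take the written value.
            resolved[name] = record[name]
        elif "default" in field:
            # Field is new in the reader schema: fall back to its default.
            resolved[name] = field["default"]
        else:
            # No written value and no default: resolution fails.
            raise ValueError(f"no value or default for field {name!r}")
    # Writer-only fields (here favorite_number) are simply not copied.
    return resolved


old_record = {"name": "Alyssa", "favorite_number": 256}
new_view = resolve_record(old_record, WRITER_SCHEMA, READER_SCHEMA)
print(new_view)  # {'name': 'Alyssa', 'favorite_color': 'unknown'}
```

In real code the Avro library performs this resolution when a file is opened with a reader schema. The sketch is only meant to show why adding a field with a default keeps old files readable.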
YouTube video: http://www.youtube.com/watch?v=EBV4C-P3G94
IBM article: http://www.ibm.com/developerworks/library/bd-avrohadoop/
MapReduce and Avro example: http://avro.apache.org/docs/current/mr.html#Example%3A+ColorCount
Other links:

Hive has a SerDe for Avro, so it can query data stored in Avro format: http://www.michael-noll.com/blog/2013/07/04/using-avro-in-mapreduce-jobs-with-hadoop-pig-hive/
MSDN article on schema evolution in Avro with C#: http://code.msdn.microsoft.com/Schema-Evolution-In-Avro-240f0a7a
