Tuesday, August 12, 2014

Hadoop and Metadata

Excerpts from recent research about metadata use in the enterprise. From Hortonworks, with more in InfoQ:

" ... One of the most attractive qualities of Hadoop is its flexibility to work with semi-structured and unstructured data without schemas. Unstructured data represents 80% of the overall data in most organizations and is growing at 10-50x structured data. Indeed, Hadoop excels at extracting structured data from unstructured data. HCatalog helps Hadoop deliver value from the output of its labor, by providing access to mined, structured data by those who would consume it: analysts, systems and applications.

HCatalog is a metadata and table management system for Hadoop. It is based on the metadata layer found in Hive and provides a relational view through a SQL like language to data within Hadoop. HCatalog allows users to share data and metadata across Hive, Pig, and MapReduce. It also allows users to write their applications without being concerned with how or where the data is stored, and insulates users from schema and storage format changes. ... " 
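To make the "insulates users from schema and storage format changes" point concrete, here is a minimal toy sketch in Python. It is not HCatalog's API: `ToyCatalog`, `TableInfo`, `register` and `read` are hypothetical names invented for illustration, and an in-memory string stands in for a file in HDFS. The idea it tries to show is only that readers ask for a table by name, and the catalog's metadata decides how the bytes are decoded.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass
class TableInfo:
    """Metadata a catalog might keep for one table (hypothetical fields)."""
    columns: list   # ordered column names
    fmt: str        # storage format, "csv" or "json" in this toy
    payload: str    # stands in for a storage location such as an HDFS path


class ToyCatalog:
    """Illustrative stand-in for a shared metadata layer, not the real HCatalog."""

    def __init__(self):
        self._tables = {}

    def register(self, name, info):
        # Record (or replace) the metadata for a table.
        self._tables[name] = info

    def read(self, name):
        # Callers pass only a table name; format and location come from metadata.
        info = self._tables[name]
        if info.fmt == "csv":
            rows = csv.reader(io.StringIO(info.payload))
            return [dict(zip(info.columns, row)) for row in rows]
        if info.fmt == "json":
            records = [json.loads(line) for line in info.payload.splitlines()]
            return [{c: str(rec[c]) for c in info.columns} for rec in records]
        raise ValueError(f"unsupported format: {info.fmt}")


catalog = ToyCatalog()

# The table starts out stored as CSV.
catalog.register("clicks", TableInfo(["user", "page"], "csv",
                                     "alice,/home\nbob,/cart\n"))
before = catalog.read("clicks")

# The storage format changes to JSON lines; the reading code does not.
catalog.register("clicks", TableInfo(["user", "page"], "json",
                                     '{"user": "alice", "page": "/home"}\n'
                                     '{"user": "bob", "page": "/cart"}\n'))
after = catalog.read("clicks")
```

In this sketch `before` and `after` hold the same rows even though the underlying format changed, which is the kind of decoupling the excerpt attributes to HCatalog for Hive, Pig and MapReduce users.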
