Written by Light4Data Blog, 24 April 2017

Data Pipelines in Hadoop – Silicon Valley Data Science

In this post we’ll look at some real-world examples of managing headaches while moving to Hadoop. Source: Data Pipelines in Hadoop – Silicon Valley Data Science

Written by Light4Data Blog, 22 April 2017 (updated 24 April 2017)

“Architecture of Giants: Data Stacks at Facebook, Netflix, Airbnb, and Pinterest”

Source: @michellewetzler, https://blog.keen.io/architecture-of-giants-data-stacks-at-facebook-netflix-airbnb-and-pinterest-9b7cd881af54

Written by Light4Data Blog, 6 July 2016

Untangling Apache Hadoop YARN

Part 1: Cluster and YARN Basics. Ray Chiang is a Software Engineer at Cloudera. Dennis Dawson is a Senior Technical Writer at Cloudera. Categories: Hadoop, MapReduce, YARN. In this multipart series, fully explore the tangled ball of thread that is YARN. YARN (Yet Another Resource Negotiator) is the resource management layer for the Apache Hadoop […]

Written by Light4Data Blog, 5 July 2016

Using Apache Hive on Docker

Apache Hive is a data warehouse framework for storing, managing, and querying large data sets. The Hive query language, HiveQL, is a SQL-like language. Hive stores data in HDFS by default, and a Hive table may be used to define structure on the data. Hive supports two kinds of tables: managed tables and external tables. A managed table is […]

Senhadji's Blog…

About Software stuff, big data, analytics, docker, container, social interactions, neurosciences, brain, other mindful things, thoughts on everything…

Category: hadoop
