Apache Pig is a high-level data flow language for the Hadoop ecosystem.
Pig facilitates defining simple to complex workflows that can operate
on data sizes ranging from gigabytes to petabytes. The simplicity
of Pig scripting is a big plus compared to Java MapReduce. Pig was
originally developed at Yahoo; it is now heavily used by companies
like Netflix, LinkedIn and Yahoo.
This workshop will introduce Apache Pig to students. We will go
through the Pig concepts and learn the Pig Latin language. Students will
learn by working on hands-on labs using Hadoop and Pig. The workshop
will focus on solving practical, real-world problems (no toy labs).
This is a HANDS-ON workshop. Estimated run time: 2 hours.
Note to attendees:
Attendees *must* have a working Apache Hadoop + Pig environment
pre-installed on their laptops. We recommend using one of the Hadoop
virtual machines offered by Cloudera or Hortonworks. Since these are *BIG*
downloads, please download and install them well in advance.
Cloudera VM:
http://www.cloudera.com/content/support/en/downloads/quickstart_vms/cdh-5-1-x1.html
Hortonworks VM:
http://hortonworks.com/products/hortonworks-sandbox/