Reading and Writing data with Pig
- Loading data
  - http://pig.apache.org/docs/r0.9.2/basic.html#load
  - http://ofps.oreilly.com/titles/9781449302641/intro_pig_latin.html#pl_load
- Pig Schemas
  - http://pig.apache.org/docs/r0.9.2/basic.html#Data+Types+and+More
  - http://ofps.oreilly.com/titles/9781449302641/data_model.html
- Writing data
Pig Latin in depth
- FILTERing data
  - http://pig.apache.org/docs/r0.9.2/basic.html#filter
  - http://ofps.oreilly.com/titles/9781449302641/intro_pig_latin.html#filter
- Grouping and Sorting Data
  - http://pig.apache.org/docs/r0.9.2/basic.html#GROUP
  - http://ofps.oreilly.com/titles/9781449302641/intro_pig_latin.html#group_by
  - http://pig.apache.org/docs/r0.9.2/basic.html#order-by
  - http://ofps.oreilly.com/titles/9781449302641/intro_pig_latin.html#order_by
- Pig Expressions and Functions
  - http://pig.apache.org/docs/r0.9.2/basic.html#expressions
  - http://pig.apache.org/docs/r0.9.2/basic.html#udf-statements
  - http://pig.apache.org/docs/r0.9.2/func.html
  - http://pig.apache.org/docs/r0.9.2/udf.html
  - http://ofps.oreilly.com/titles/9781449302641/intro_pig_latin.html#udfs
  - http://ofps.oreilly.com/titles/9781449302641/writing_udfs.html
  - http://agiletesting.blogspot.com/2012/02/handling-datetime-in-apache-pig.html
  - http://engineering.linkedin.com/open-source/introducing-datafu-open-source-collection-useful-apache-pig-udfs
- Joining Multiple Datasets
  - http://pig.apache.org/docs/r0.9.2/basic.html#join-inner
  - http://pig.apache.org/docs/r0.9.2/basic.html#join-outer
  - http://ofps.oreilly.com/titles/9781449302641/intro_pig_latin.html#join_basic
- Validating Datasets
- Advanced topics like COGROUP and STREAM
Debugging Pig scripts
- Strategies for debugging Pig programs
- Handling bad data
- Using ILLUSTRATE
Best Practices for Pig
- General best practices
- Achieving Optimal Pig Performance in Production