White scenery @showyou, hatena

If you have any comments, you can also send them to @shsub or @showyou on Twitter.

Pig

http://d.hatena.ne.jp/yokkuns/20110426

Since we're using Cloudera:
$ sudo apt-get install hadoop-pig
The executables are installed under /usr/lib/pig.

Pig environment setup

$ vim .bashrc
export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.24/ # added
export PIG_HOME=/usr/lib/pig # added
export PATH=$PATH:$PIG_HOME/bin # added
$ source .bashrc
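The third export appends `$PIG_HOME/bin` to the existing PATH rather than replacing it, so the system binaries stay reachable and the new directory is simply searched last. A minimal sketch of the idiom, using a stand-in variable (MYPATH) so the live PATH is left untouched:

```shell
#!/bin/sh
# Append-to-PATH idiom from the .bashrc above, demonstrated with a
# stand-in variable and the same directory values as the article.
MYPATH="/usr/bin"
PIG_HOME="/usr/lib/pig"
MYPATH="$MYPATH:$PIG_HOME/bin"   # append, don't overwrite
echo "$MYPATH"                   # prints /usr/bin:/usr/lib/pig/bin
```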
$ sudo vim /usr/lib/pig/conf/pig-env.sh
$ cat /usr/lib/pig/conf/pig-env.sh

PIG_CLASSPATH=$HADOOP_HOME/conf

$ sudo chown -R hadoop:hadoop /usr/lib/pig/
$ pig

$ bin/hadoop fs -put /etc/hadoop-0.20/conf/*.xml input
$ bin/hadoop fs -ls input

-rw-r--r-- 1 hadoop supergroup 5041 2011-06-11 11:11 /user/hadoop/input/capacity-scheduler.xml
-rw-r--r-- 1 hadoop supergroup  565 2011-06-11 11:11 /user/hadoop/input/core-site.xml
...
-rw-r--r-- 1 hadoop supergroup 1065 2011-06-11 11:11 /user/hadoop/input/hdfs-site.xml
-rw-r--r-- 1 hadoop supergroup 2033 2011-06-11 11:11 /user/hadoop/input/mapred-queue-acls.xml
-rw-r--r-- 1 hadoop supergroup  582 2011-06-11 11:11 /user/hadoop/input/mapred-site.xml

grunt> ls input
hdfs://localhost/user/hadoop/input/capacity-scheduler.xml 5041
...
hdfs://localhost/user/hadoop/input/mapred-site.xml 582

grunt> A = LOAD 'input';
grunt> B = FILTER A BY $0 MATCHES '.*dfs[a-z.]+.*';
grunt> DUMP B;
...
( dfsadmin and mradmin commands to refresh the security policy in-effect. )
( dfs.replication)
( dfs.permissions)
( dfs.name.dir)
( dfs.namenode.plugins)
( dfs.datanode.plugins)
( dfs.thrift.address)
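In Pig, MATCHES applies a Java regular expression that must match the entire field ($0 here is the first tab-delimited field of each line, effectively the whole line for these XML files), which is why the pattern is wrapped in `.*`. The same selection can be reproduced locally with `grep -E`; a sketch, where the sample property names are hypothetical stand-ins for lines from the config files:

```shell
#!/bin/sh
# '.*dfs[a-z.]+.*' keeps any line containing "dfs" followed by one or
# more lowercase letters or dots, e.g. dfs.replication. grep matches
# substrings, so the leading/trailing .* are redundant here but are
# kept for parity with the Pig expression.
printf '%s\n' 'dfs.replication' 'mapred.job.tracker' 'dfs.permissions' \
  | grep -E '.*dfs[a-z.]+.*'
# prints:
#   dfs.replication
#   dfs.permissions
```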

Reference: http://pig.apache.org/docs/r0.8.0/tutorial.html