pig
http://d.hatena.ne.jp/yokkuns/20110426
Since I'm using Cloudera (CDH),
$ apt-get install hadoop-pig
The executables are installed under /usr/lib/pig.

Pig environment setup
$ vim .bashrc
export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.24/ # added
export PIG_HOME=/usr/lib/pig # added
export PATH=$PATH:$PIG_HOME/bin # added
$ source .bashrc
$ sudo vim /usr/lib/pig/conf/pig-env.sh
$ cat /usr/lib/pig/conf/pig-env.sh
PIG_CLASSPATH=$HADOOP_HOME/conf
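pig-env.sh only needs to point Pig at the Hadoop configuration directory so grunt talks to the cluster instead of the local filesystem. A minimal sketch (the HADOOP_HOME path is my assumption based on the CDH hadoop-0.20 packages used above, not something from the original post):

export HADOOP_HOME=/usr/lib/hadoop-0.20 # assumed CDH install path
export PIG_CLASSPATH=$HADOOP_HOME/conf  # lets Pig pick up core-site.xml etc.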
$ sudo chown -R hadoop:hadoop /usr/lib/pig/
$ bin/hadoop fs -put /etc/hadoop-0.20/conf/*.xml input
$ bin/hadoop fs -ls input
-rw-r--r-- 1 hadoop supergroup 5041 2011-06-11 11:11 /user/hadoop/input/capacity-scheduler.xml
-rw-r--r-- 1 hadoop supergroup 565 2011-06-11 11:11 /user/hadoop/input/core-site.xml
...
-rw-r--r-- 1 hadoop supergroup 1065 2011-06-11 11:11 /user/hadoop/input/hdfs-site.xml
-rw-r--r-- 1 hadoop supergroup 2033 2011-06-11 11:11 /user/hadoop/input/mapred-queue-acls.xml
-rw-r--r-- 1 hadoop supergroup 582 2011-06-11 11:11 /user/hadoop/input/mapred-site.xml
$ pig
grunt> ls input
hdfs://localhost/user/hadoop/input/capacity-scheduler.xml 5041
...
hdfs://localhost/user/hadoop/input/mapred-site.xml 582
grunt> A = LOAD 'input';
grunt> A = LOAD 'input';
grunt> B = FILTER A BY $0 MATCHES '.*dfs[a-z.]+.*';
grunt> DUMP B;
...
( dfsadmin and mradmin commands to refresh the security policy in-effect. )
(dfs.replication )
(dfs.permissions )
(dfs.name.dir )
(dfs.namenode.plugins )
(dfs.datanode.plugins )
(dfs.thrift.address )
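Instead of DUMPing to the console, the filtered relation can also be written back to HDFS with STORE and inspected from grunt — a quick sketch (the 'output' path here is just an example name, not from the post):

grunt> STORE B INTO 'output';
grunt> cat output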