{"id":640,"date":"2011-05-05T00:39:58","date_gmt":"2011-05-04T15:39:58","guid":{"rendered":"http:\/\/www.musiclogs.org\/blog\/?p=640"},"modified":"2020-04-13T08:41:32","modified_gmt":"2020-04-12T23:41:32","slug":"hadoop-sample-wordcount","status":"publish","type":"post","link":"https:\/\/www.musiclogs.org\/blog\/2011\/05\/05\/hadoop-sample-wordcount\/","title":{"rendered":"Running the Hadoop wordcount sample"},"content":{"rendered":"<p>It is Golden Week, so I have been playing with Hadoop, which I had been curious about for a while, using the book below as a reference.<br \/>\n<a onclick=\"javascript:pageTracker._trackPageview('\/outgoing\/www.amazon.co.jp\/gp\/product\/4798122335\/ref=as_li_ss_tl?ie=UTF8&amp;tag=musiclogs-22&amp;linkCode=as2&amp;camp=247&amp;creative=7399&amp;creativeASIN=4798122335');\"  href=\"https:\/\/www.amazon.co.jp\/gp\/product\/4798122335\/ref=as_li_ss_tl?ie=UTF8&amp;tag=musiclogs-22&amp;linkCode=as2&amp;camp=247&amp;creative=7399&amp;creativeASIN=4798122335\">Hadoop\u5fb9\u5e95\u5165\u9580<\/a><\/p>\n<p>As an infrastructure engineer, I now have a picture of how it works. As a developer, though, I still cannot quite picture how I would use it at work.<br \/>\nBelow is the log from running the sample.<\/p>\n<p>The environment is Oracle Enterprise Linux 6 (64-bit) and Hadoop 0.21.0.<\/p>\n<p>I tried to run wordcount on the README.txt that ships with Hadoop, but it reported Input path does not exist: 
hdfs:\/\/\u30fb\u30fb\u30fb. Apparently it cannot find the README.txt I want counted.<\/p>\n<blockquote><p>[hadoop@myhost hadoop]$ <em>.\/bin\/hadoop jar hadoop-mapred-examples-0.21.0.jar wordcount README.txt \/tmp\/mapreduce2<\/em><br \/>\n11\/05\/05 00:06:49 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000<br \/>\n11\/05\/05 00:06:50 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id<br \/>\n11\/05\/05 00:06:50 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.<br \/>\n11\/05\/05 00:06:51 INFO mapreduce.JobSubmitter: Cleaning up the staging area hdfs:\/\/localhost:54310\/hadoop\/mapred\/staging\/hadoop\/.staging\/job_201105041812_0017<br \/>\norg.apache.hadoop.mapreduce.lib.input.InvalidInputException: <strong>Input path does not exist: hdfs:\/\/localhost:54310\/user\/hadoop\/README.txt<\/strong><br \/>\nat org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:245)<br \/>\nat org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:271)<\/p><\/blockquote>\n<p>So where is hdfs:\/\/localhost:54310\/user\/hadoop\/README.txt in the first place? Looking into it more closely, the job apparently looks on HDFS rather than on the OS file system; a relative path such as README.txt is resolved against the user's HDFS home directory, here \/user\/hadoop.<\/p>\n<p>Workaround 1: put the file into HDFS beforehand, as shown below, and pass that HDFS path (here \/tmp\/README.txt) as the input<\/p>\n<blockquote><p>[hadoop@myhost hadoop]$ <em>hdfs dfs -put .\/README.txt \/tmp<\/em><br \/>\n11\/05\/05 00:04:19 INFO security.Groups: Group mapping 
impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000<br \/>\n11\/05\/05 00:04:19 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id<\/p><\/blockquote>\n<p>Listing the directory with the HDFS equivalent of ls (hdfs dfs -ls [Dir Name]) shows the result of the put:<\/p>\n<blockquote><p>[hadoop@myhost hadoop]$ <em>hdfs dfs -ls \/tmp<\/em><br \/>\n11\/05\/05 00:04:24 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000<br \/>\n11\/05\/05 00:04:24 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id<br \/>\n<strong>Found 1 items<br \/>\n-rw-r--r--&nbsp;&nbsp; 3 hadoop supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1366 2011-05-05 00:04 \/tmp\/README.txt<\/strong><\/p><\/blockquote>\n<p>Workaround 2: specify the paths with the file: scheme<\/p>\n<blockquote><p>[hadoop@myhost hadoop]$<em> .\/bin\/hadoop jar hadoop-mapred-examples-0.21.0.jar wordcount <strong>file:<\/strong>\/usr\/local\/hadoop\/README.txt <strong>file:<\/strong>\/tmp\/mapreduce5<\/em><br \/>\n11\/05\/05 00:27:03 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000<br \/>\n11\/05\/05 00:27:03 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id<br \/>\n11\/05\/05 00:27:03 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.<br \/>\n11\/05\/05 00:27:04 INFO input.FileInputFormat: Total input paths to process : 1<br \/>\n11\/05\/05 00:27:04 WARN conf.Configuration: mapred.map.tasks is deprecated. 
Instead, use mapreduce.job.maps<br \/>\n11\/05\/05 00:27:04 INFO mapreduce.JobSubmitter: number of splits:1<br \/>\n11\/05\/05 00:27:04 INFO mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:null<br \/>\n11\/05\/05 00:27:04 INFO mapreduce.Job: Running job: job_201105041812_0022<br \/>\n11\/05\/05 00:27:05 INFO mapreduce.Job:&nbsp; map 0% reduce 0%<br \/>\n11\/05\/05 00:27:12 INFO mapreduce.Job:&nbsp; map 100% reduce 0%<br \/>\n11\/05\/05 00:27:18 INFO mapreduce.Job:&nbsp; map 100% reduce 100%<br \/>\n11\/05\/05 00:27:20 INFO mapreduce.Job: Job complete: job_201105041812_0022<\/p><\/blockquote>\n<p>With the file: prefix, the job apparently reads from and writes to the local file system; without it, paths are resolved on HDFS.<\/p>\n<p>Next I plan to look into the programming side and Hive.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It is Golden Week, so I have been playing with Hadoop, which I had been curious about for a while, using the book below as a reference. Hadoop\u5fb9\u5e95\u5165\u9580 
As an infrastructure engineer, I now have a picture of how it works. As a developer, though, I still cannot quite picture how I would use it at work. Below is the log from running the sample. The environment is Oracle Enterprise Linux 6 (64-bit) and Hadoop 0.21.0. Had<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[11,12,8],"tags":[83,84,49,60],"_links":{"self":[{"href":"https:\/\/www.musiclogs.org\/blog\/wp-json\/wp\/v2\/posts\/640"}],"collection":[{"href":"https:\/\/www.musiclogs.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.musiclogs.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.musiclogs.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.musiclogs.org\/blog\/wp-json\/wp\/v2\/comments?post=640"}],"version-history":[{"count":14,"href":"https:\/\/www.musiclogs.org\/blog\/wp-json\/wp\/v2\/posts\/640\/revisions"}],"predecessor-version":[{"id":1554,"href":"https:\/\/www.musiclogs.org\/blog\/wp-json\/wp\/v2\/posts\/640\/revisions\/1554"}],"wp:attachment":[{"href":"https:\/\/www.musiclogs.org\/blog\/wp-json\/wp\/v2\/media?parent=640"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.musiclogs.org\/blog\/wp-json\/wp\/v2\/categories?post=640"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.musiclogs.org\/blog\/wp-json\/wp\/v2\/tags?post=640"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}