grouping - How can I segregate data into groups using Apache Pig?
I have data in CSV format with columns "movie name" and price. The output should be:

under 5 : 5200
5-10 : 500
10-15 : 5140
and so on.

I tried the code below:

a = LOAD '/root/pig-0.13.0/scripts/dvd_data/dvd_csv.txt' USING PigStorage(',');
b = FOREACH a GENERATE REPLACE($0, '\\"', ''), $2, $6;

I am unable to work out the logic for the desired output. I am still looking into it.
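If the use case is to count the movies that fall into a fixed set of price buckets (less than 5, 5 up to 10, 10 up to 15, etc.), you can make use of the bincond operator (condition ? value_if_true : value_if_false), nesting one bincond per bucket boundary.

Pig script: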
a = LOAD 'a.csv' USING PigStorage(',') AS (movie_name:chararray, price:long);
b = FOREACH a GENERATE ((price < 5) ? '5' : ((price < 10) ? '5-10' : ((price < 15) ? '10-15' : '>15'))) AS key, price;
c = GROUP b BY key;
d = FOREACH c GENERATE group, COUNT(b);
DUMP d;

Sample input (a.csv):

movie1,1
movie2,2
movie3,3
movie4,4
movie5,5
movie7,7
movie9,9
movie10,10
movie11,11
movie12,12

Output of DUMP d:

(5,4)
(5-10,3)
(10-15,3)
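If the buckets all have the same width and continue "and so on" past 15, nesting one bincond per bucket gets unwieldy. A possible alternative, sketched here as an assumption rather than as part of the answer above, is to compute the lower bound of each bucket with integer division. It assumes price is loaded as long (so that price / 5 truncates) and that a fixed width of 5 is what is wanted; the field names bucket_start, bucket_end and movies are made up for illustration.

```pig
-- Same input file and schema as in the script above.
a = LOAD 'a.csv' USING PigStorage(',') AS (movie_name:chararray, price:long);

-- Integer division truncates, so prices 0-4 map to 0, 5-9 to 5, 10-14 to 10, ...
b = FOREACH a GENERATE (price / 5) * 5 AS bucket_start, price;

-- One group per bucket, identified by its lower bound.
c = GROUP b BY bucket_start;

-- Emit lower bound, exclusive upper bound, and the number of movies in the bucket.
d = FOREACH c GENERATE group AS bucket_start, group + 5 AS bucket_end, COUNT(b) AS movies;

DUMP d;
```

For the sample input above this should give counts of 4, 3 and 3 for the buckets starting at 0, 5 and 10. The buckets are labelled by their numeric bounds instead of strings such as '5-10', and a bucket with no movies in it does not appear in the output at all.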