grouping - How can I segregate data into groups using Apache Pig?
I have data in CSV format with the columns "movie name" and price.
The output should look like this:
under 5 : 5200
5-10 : 500
10-15 : 5140
and so on.
I tried the code below:
a = LOAD '/root/pig-0.13.0/scripts/dvd_data/dvd_csv.txt' USING PigStorage(',');
b = FOREACH a GENERATE REPLACE($0, '\\"', ''), $2, $6;
I am unable to work out the logic for the desired output. I am still looking into it.
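If the use case is a count of movies under a fixed set of price buckets (under 5, 5 to 10, 10 to 15, and so on), you can make use of the bincond operator (condition ? value_if_true : value_if_false), nesting one bincond inside another for each additional bucket.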
Pig script:
a = LOAD 'a.csv' USING PigStorage(',') AS (movie_name:chararray, price:long);
-- label each row with its price bucket; the first matching condition wins
b = FOREACH a GENERATE ((price < 5) ? '5' : ((price < 10) ? '5-10' : ((price < 15) ? '10-15' : '>15'))) AS key, price;
c = GROUP b BY key;
d = FOREACH c GENERATE group, COUNT(b);
DUMP d;
Sample input (a.csv):
movie1,1
movie2,2
movie3,3
movie4,4
movie5,5
movie7,7
movie9,9
movie10,10
movie11,11
movie12,12
Output of DUMP d:
(5,4)
(5-10,3)
(10-15,3)
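To sanity-check the bucket boundaries without a Pig installation, the same nested conditional can be mirrored in plain Python. This is only an illustrative sketch: the function name price_bucket and the hard-coded rows are assumptions made for the example, not part of the Pig script.

```python
from collections import Counter


def price_bucket(price):
    """Mirror the nested bincond: the first matching range wins."""
    if price < 5:
        return "5"
    if price < 10:
        return "5-10"
    if price < 15:
        return "10-15"
    return ">15"


# Sample rows from a.csv as (movie_name, price) pairs.
rows = [
    ("movie1", 1), ("movie2", 2), ("movie3", 3), ("movie4", 4),
    ("movie5", 5), ("movie7", 7), ("movie9", 9),
    ("movie10", 10), ("movie11", 11), ("movie12", 12),
]

# Equivalent of GROUP b BY key followed by COUNT(b).
counts = Counter(price_bucket(price) for _, price in rows)
print(dict(counts))
```

Because each comparison is a strict less-than, a price of exactly 5 lands in the 5-10 bucket and a price of exactly 10 lands in 10-15, which is why the sample input gives 4, 3 and 3.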