apache pig - JOIN two data sets on the basis of a string matching condition in Pig


I am new to Pig and have two data sets, "highspender" and "feedback".

highspender:

price,fname,lname
$50,jack,brown
$30,rovin,pall

feedback:

date,name,rate
2015-01-02,jack b brown,5
2015-01-02,pall,4

Now I have to join these two data sets on the basis of name. The condition is that fname or lname of highspender should match name of feedback. How can I join these two data sets? Any ideas?

You can try the script below; you will need to replace the names to match your data.

highs = LOAD 'highs' USING PigStorage(',') AS (price:chararray, fname:chararray, lname:chararray);
feedback = LOAD 'feeds' USING PigStorage(',') AS (date:chararray, name:chararray, rate:chararray);
-- rows where the first name equals the feedback name exactly
out = JOIN highs BY fname, feedback BY name;
-- rows where the last name equals the feedback name exactly
out1 = JOIN highs BY lname, feedback BY name;
final_out = UNION out, out1;

For further details, refer to the Pig reference manual.

Edit

As per the comment, the script below joins the data using a string function:

highs = LOAD 'highs' USING PigStorage(',') AS (price:chararray, fname:chararray, lname:chararray);
feedback = LOAD 'feeds' USING PigStorage(',') AS (date:chararray, name:chararray, rate:chararray);
-- pair every highspender row with every feedback row
crossout = CROSS highs, feedback;
-- REPLACE changes the string only when lname (or fname) occurs in name;
-- note that REPLACE treats its second argument as a regular expression
final_lname = FILTER crossout BY (REPLACE(feedback::name, highs::lname, '') != feedback::name);
final_fname = FILTER crossout BY (REPLACE(feedback::name, highs::fname, '') != feedback::name);
final = UNION final_lname, final_fname;
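
One thing to watch with the UNION above: a feedback name such as "jack b brown" contains both the fname and the lname of the same highspender row, so that pair would appear twice in the result. As a rough alternative sketch, the two filters can be folded into a single FILTER with OR, which keeps each pair at most once. This assumes your Pig version ships the INDEXOF built-in, which does a plain substring search rather than a regex match; the alias `matched` is just a name chosen for illustration.

```pig
highs = LOAD 'highs' USING PigStorage(',') AS (price:chararray, fname:chararray, lname:chararray);
feedback = LOAD 'feeds' USING PigStorage(',') AS (date:chararray, name:chararray, rate:chararray);

-- pair every highspender row with every feedback row
crossout = CROSS highs, feedback;

-- keep a pair when fname OR lname occurs anywhere inside the feedback name;
-- INDEXOF returns -1 when the substring is not found
matched = FILTER crossout BY (INDEXOF(feedback::name, highs::fname, 0) >= 0)
                          OR (INDEXOF(feedback::name, highs::lname, 0) >= 0);

DUMP matched;
```

Both versions match on substrings, so a short name like "pall" would also match a longer feedback name that merely contains it. CROSS also produces every combination of rows, so it can be expensive on large inputs.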
