regex - How to filter out alphanumeric values using a regular expression in Pig
Pig code:
relation2 = FILTER relation1 BY column1 not matches '.*[a-z0-9.*].*'
Hive logic:
column1 not like '%[a-z0-9]%'
I want to implement the same logic in Pig.
I think you don't need the alphanumeric records; you need the records that contain either only letters or only digits.
Could you try this:
Input:
(123aet) (123) (aet) (236met)
Expected output:
(123) (aet)
Pig script: this script is generic, so you can keep the letters-only records, the digits-only records, the alphanumeric records, or any combination of them for further processing.
records = LOAD '/home/dir/alphanumeric.txt' USING PigStorage(',') AS (c1:chararray);
-- Tag each value as letters-only, digits-only, or mixed
records_each = FOREACH records GENERATE c1,
    (REGEX_EXTRACT(c1, '(^[a-zA-Z]+$)', 1) IS NOT NULL ? 'alphabets' :
    (REGEX_EXTRACT(c1, '(^[0-9]+$)', 1) IS NOT NULL ? 'numbers' : 'alphanumerics')) AS c1_type;
-- Keep only the letters-only and digits-only records
records_filter = FILTER records_each BY c1_type IN ('alphabets', 'numbers');
records_output = FOREACH records_filter GENERATE c1;
DUMP records_output;
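If the type tag is not needed, the same filter can probably be written with MATCHES directly. This is only a sketch: relation1 and column1 are the names from the question, and column1 is assumed to be a chararray. MATCHES takes a Java regular expression and has to match the whole value, so no leading or trailing .* is needed. That also explains why the pattern in the question does not work: the character class [a-z0-9.*] matches any single lowercase letter, digit, dot or asterisk, so every value containing at least one of those is rejected.

```pig
-- Sketch only: relation1 and column1 are the names used in the question;
-- column1 is assumed to be a chararray.
-- Keep a record when the whole value is letters only or digits only.
relation2 = FILTER relation1 BY (column1 MATCHES '[a-zA-Z]+') OR (column1 MATCHES '[0-9]+');
DUMP relation2;
```

With the sample input above this should keep 123 and aet and drop 123aet and 236met.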