Perl - Regex to extract only the comma-separated strings
I have a question that I am hoping someone can help with...
I have a variable that contains the content of a webpage (scraped using WWW::Mechanize).
The variable contains data such as these:
$var = "ewrfs sdfdsf cat_dog,horse,rabbit,chicken-pig"
$var = "fdsf iiukui aawwe dffg elephant,mouse_rat,spider,lion-tiger hdsfds jdlkf sdf"
$var = "dsadp poids pewqwe antelope-giraffe,frog,fish,crab,kangaroo-koala sdfdsf hkew"
The bits I am interested in from the above examples are:
@array = ("cat_dog","horse","rabbit","chicken-pig")
@array = ("elephant","mouse_rat","spider","lion-tiger")
@array = ("antelope-giraffe","frog","fish","crab","kangaroo-koala")
The problem I am having:
I am trying to extract only the comma-separated strings from the variables and store them in an array for use later on.
But what is the best way to make sure I also get the strings at the start (i.e. cat_dog) and end (i.e. chicken-pig) of the comma-separated list of animals, given that they are not prefixed/suffixed by a comma?
Also, as the variables contain webpage content, it is inevitable that there will be instances where a comma is immediately followed by a space and then another word, as that is the correct way of using commas in paragraphs and sentences...
For example:
Saturn was long thought to be the only ringed planet, however, this is now known not to be the case.
                                                     ^        ^
                                                     |        |
                                        note the spaces here and here
I am not interested in cases where the comma is followed by a space (as shown above).
I am only interested in cases where the comma does NOT have a space after it (i.e. cat_dog,horse,rabbit,chicken-pig).
I have tried a number of ways of doing this but cannot work out the best way to go about constructing the regular expression.
How about:
[^,\s]+(,[^,\s]+)+
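This matches one or more characters that are not whitespace or a comma ([^,\s]+),
followed by a comma and one or more characters that are not whitespace or a comma, repeated one or more times. Because the character after each comma must not be whitespace, ordinary prose such as "planet, however" does not match.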
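To make the pieces of the pattern easier to see, here is a minimal sketch that runs it against the sample strings from the question. The helper name extract_list and the variable $list_re are made up for this illustration, and the group is written as non-capturing, (?:...), which is an assumption of convenience rather than part of the pattern above.

```perl
use strict;
use warnings;

# Sample strings taken from the question.
my @samples = (
    "ewrfs sdfdsf cat_dog,horse,rabbit,chicken-pig",
    "fdsf iiukui aawwe dffg elephant,mouse_rat,spider,lion-tiger hdsfds jdlkf sdf",
    "dsadp poids pewqwe antelope-giraffe,frog,fish,crab,kangaroo-koala sdfdsf hkew",
);

# Same pattern, spelled out with the x modifier so each part can be commented.
my $list_re = qr/
    [^,\s]+            # first item: characters that are neither comma nor whitespace
    (?: , [^,\s]+ )+   # then a comma and another item, one or more times
/x;

# Hypothetical helper: return the items of the first list found in a string,
# or an empty list if the string contains no such list.
sub extract_list {
    my ($text) = @_;
    return $text =~ /($list_re)/ ? split(/,/, $1) : ();
}

for my $sample (@samples) {
    my @array = extract_list($sample);
    print join(" | ", @array), "\n";
}

# Commas followed by a space (normal prose) are not picked up.
my @none = extract_list("ringed planet, however, known");
print "items found in prose: ", scalar(@none), "\n";
```

Each sample yields exactly the items listed in the question, including the first and last ones that have no comma on their outer side, while the prose fragment yields nothing.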
Further comments
To match more than one sequence, add the g modifier for global matching.
The following splits each match ($&) on , and pushes the results onto @matches.
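my $str = "sdfds cat_dog,horse,rabbit,chicken-pig more pig,duck,goose";
my @matches;

# With /g, each pass through the loop resumes after the previous match
while ($str =~ /[^,\s]+(,[^,\s]+)+/g) {
    # $& holds the whole matched list; split it into individual items
    push(@matches, split(/,/, $&));
}

print join("\n", @matches), "\n";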
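As a variation on the loop above, the whole list can be captured with parentheses instead of read from $&, and each list can be kept as its own array rather than flattened into one. This is only a sketch: the names @lists and @groups are made up for the example, and grouping per list is an assumption about what might be wanted later.

```perl
use strict;
use warnings;

my $str = "sdfds cat_dog,horse,rabbit,chicken-pig more pig,duck,goose";

# In list context with /g, the match returns every captured list at once.
# The inner group is non-capturing so only the whole list is returned.
my @lists = $str =~ /([^,\s]+(?:,[^,\s]+)+)/g;

# Keep each list as its own array reference instead of flattening them.
my @groups = map { [ split /,/ ] } @lists;

for my $group (@groups) {
    print scalar(@$group), " items: @$group\n";
}
```

Here @groups ends up with two array references, one for the four-item list and one for the three-item list, so the lists stay separate when a string contains more than one.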