bash - grep-ing a variable vs. a file - execution time
I made an interesting observation. I was storing the output of a curl statement in a text file and grep-ing it for strings. Later I changed the code to store the output in a variable instead. It turns out that this change caused the script to run slower, which is counterintuitive to me, since I thought I/O operations were more expensive than in-memory operations. Here is the code:
#!/bin/bash

url="http://m.cnbc.com"

while read line; do
    ua=$line
    curl -s --location --user-agent "$ua" $url > raw.txt
    #raw=`curl --location --user-agent "$ua" $url`

    l=`grep -c -e "advertise us" raw.txt`
    #l=`echo $raw | grep -c -e "advertise us"`
    m=`grep -c -e "id='menu'><button>menu</button>" raw.txt`
    #m=`echo $raw | grep -c -e "id='menu'><button>menu</button>"`
    d=`grep -c -e "careers" raw.txt`
    #d=`echo $raw | grep -c -e "careers"`

    if [[ ( $l == 1 && $m == 0 ) && ( $d == 0 ) ]]; then
        ac="legacy"
    elif [[ ( $l == 0 && $m == 1 ) && ( $d == 0 ) ]]; then
        ac="modern"
    elif [[ ( $l == 0 && $m == 0 ) && ( $d == 1 ) ]]; then
        ac="desktop"
    else
        ac="unable to determine"
    fi

    echo $ac >> results.txt
done < useragents.txt
The commented lines represent the store-in-a-variable approach. Any ideas why this is happening? Also, are there ways to further speed up the script? Right now it takes 8 minutes to process 2000 input entries.
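For what it's worth, the two grep approaches can be timed in isolation. A minimal sketch, assuming a local file page.html standing in for the curl output (the filename is hypothetical):

    # page.html is a hypothetical stand-in for the downloaded page
    raw=$(cat page.html)

    # variant 1: grep reads the file directly
    time grep -c -e "careers" page.html

    # variant 2: grep reads the variable through echo; the unquoted
    # $raw is word-split and re-joined with single spaces, so grep
    # receives the whole document as one long line via an extra pipe
    time echo $raw | grep -c -e "careers"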
chepner is correct. Read the output of each call to curl once, flagging each of the 3 desired strings as you go. Here's example code using awk. Untested:
url="http://m.cnbc.com" while ifs= read -r line; raw=$(curl --location --user-agent "$line" $url) awk ' /advertise us/ { l=1 } /id='\''menu'\''><button>menu<\/button>/ { m=1 } /careers/ { d=1 } end { if (l==1 && m==0 && d==0) { s = "legacy" } else if (l==0 && m==1 && d==0) { s = "modern" } else if (l==0 && m==0 && d==1) { s = "desktop" } else { s = "unable determine" } print s >> "results.txt" }' "$raw" done < useragents.txt