I have a bash script that iterates over a list of links, curl's down an html page per link, greps for a particular string format (syntax is: CVE-####-####), removes the surroun
HTML files can contain carriage returns at the ends of lines, so you need to filter those out.
curl -s "$link" | sed -n '/CVE-/s/<[^>]*>//gp' | tr -d '\r' | while read cve; do
Notice that there's no need to use grep: the /CVE-/ address in the sed command already restricts it to matching lines. (You can also remove the carriage return inside sed with a substitution, but sed has no portable way to write \r, which makes that cumbersome, so I piped to tr instead.)
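To see why the carriage return matters, here is a minimal sketch that runs the same sed filter on made-up sample HTML instead of curl output. The sample line, the variable names, and the temporary changelog are all assumptions for illustration; only the sed and tr commands come from the answer above.

```shell
# Made-up sample standing in for the curl output: HTML lines with CRLF endings.
html=$(printf '<li><a href="#">CVE-2014-0160</a></li>\r\n<p>unrelated</p>\r\n')

# -n suppresses automatic printing; the p flag prints only lines where the
# substitution ran, which are the lines selected by the /CVE-/ address.
raw=$(printf '%s\n' "$html" | sed -n '/CVE-/s/<[^>]*>//gp')

# Same text with the trailing carriage return deleted.
clean=$(printf '%s\n' "$raw" | tr -d '\r')

# Hypothetical changelog to search.
changelog=$(mktemp)
printf 'Fixed CVE-2014-0160 in libfoo\n' > "$changelog"

# grep -c prints the number of matching lines; "|| true" keeps grep's
# non-zero exit status on zero matches from aborting a script under set -e.
hits_raw=$(grep -cF -- "$raw" "$changelog" || true)
hits_clean=$(grep -cF -- "$clean" "$changelog" || true)

printf 'with CR: %s, without CR: %s\n' "$hits_raw" "$hits_clean"
rm -f "$changelog"
```

With the carriage return still attached, the search string ends in an invisible character that the changelog line does not contain, so the count should be 0; after tr it should be 1.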
It should look like this:
# First: Care about quoting your variables!
# Use read to read the file line by line
while read -r link ; do
    # No grep required. sed can do that.
    curl -s "$link" | sed -n '/CVE-/s/<[^>]*>//gp' | while read -r cve; do
        echo "$cve"
        # grep -F searches for fixed strings instead of patterns
        grep -F "$cve" ./changelog.txt
    done
done < links.txt
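If only the identifier itself is wanted rather than the whole tag-stripped line, one possible variant is to extract it with grep -o. This is a sketch under assumptions, not the method from either answer: lookup_cves is a hypothetical helper name, the pattern assumes IDs of the form CVE-, four digits, a dash, then one or more digits, and -o is supported by GNU and BSD grep but is not required by POSIX.

```shell
# Hypothetical helper: reads HTML on stdin and prints the changelog lines
# that mention any CVE ID found in it. $1 is the path to the changelog.
lookup_cves() {
    # -o prints only the matched text, one match per line; sort -u drops
    # IDs that appear more than once on the page.
    grep -oE 'CVE-[0-9]{4}-[0-9]+' |
        sort -u |
        while IFS= read -r cve; do
            # -F treats the ID as a fixed string; -- ends option parsing.
            grep -F -- "$cve" "$1"
        done
}

# Self-contained demo with made-up data in place of the curl output.
changelog=$(mktemp)
printf 'Fixed CVE-2014-0160 in libfoo\nUnrelated entry\n' > "$changelog"
matches=$(printf '<td>CVE-2014-0160</td>\r\n<td>CVE-2014-0160</td>\r\n' |
    lookup_cves "$changelog")
printf '%s\n' "$matches"
rm -f "$changelog"
```

Inside the loop over links.txt this would be called as curl -s "$link" | lookup_cves ./changelog.txt. Because only the matched ID is printed, surrounding tags and trailing carriage returns never reach the inner loop.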