Get data from one file to another (Bash) - Web Scraping

Question

I am doing web scraping with bash. I have these URLs saved in a file called URL.txt.

?daypartId=1&amp;catId=1
?daypartId=1&amp;catId=11
?daypartId=1&amp;catId=2

I want to read these URLs into an array in another file, main.sh, and append each one to the base URL https://www.mcdelivery.com.pk/pk/browse/menu.html**(append here)**. Every entry in URL.txt should be appended to the end of the base URL, one at a time.

I have written code that reads the URLs from URL.txt, but it fails to append them to the base URL one by one.

#!/bin/bash
ARRAY=()
while read -r LINE
do
    ARRAY+=("$LINE")
done < URL.txt

for LINE in "${ARRAY[@]}"
do    
    echo $LINE
    curl https://www.mcdelivery.com.pk/pk/browse/menu.html$LINE | grep -o '<span class="starting-price">.*</span>' | sed 's/<[^>]\+>//g' >> price.txt 
done

I just need help with the loop, so that each entry in URL.txt is appended to the end of the base URL in main.sh.

Answer 1:

I can't help with your grep | sed pipeline, because I don't know the expected output.
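
For reference, here is a minimal sketch of what that pipeline does to a single matching line. The sample HTML and the price text are made up for illustration, and the real page markup may differ. The function name extract_prices is hypothetical. The sed call is written with -E, which should be equivalent to the question's 's/<[^>]\+>//g' without relying on the GNU-only \+ form.

```shell
#!/bin/bash
# Hypothetical helper: read HTML on stdin, print the text of starting-price spans.
extract_prices() {
    # grep -o keeps only the matched span; sed then strips every tag from it.
    grep -o '<span class="starting-price">.*</span>' | sed -E 's/<[^>]+>//g'
}

# Made-up sample line, not taken from the real site.
sample='<div><span class="starting-price">Rs. 250</span></div>'
printf '%s\n' "$sample" | extract_prices
```

Note that .* is greedy: if one line contains several such spans, grep -o returns everything from the first opening tag to the last closing tag, and the texts end up joined on one output line.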

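This example demonstrates why the URL is passed to curl without the URI appended: blank lines in the file become empty array elements, so for those iterations curl receives only the base URL.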

#!/bin/bash

# just for the demo: create (or truncate) URI.txt
> URI.txt
URI='?daypartId=1&amp;catId='
URL=https://www.mcdelivery.com.pk/pk/browse/menu.html

# just for the demo: write the three URIs to URI.txt
for id in 1 11 2
  do
    echo -e "${URI}${id}" | tee -a URI.txt
    # the reason it fails: blank lines end up in the file
    echo -e "\n\n\n" >> URI.txt
done

ARRAY=()
while read -r LINE || [[ -n $LINE ]]
do
    ## how to prevent it: uncomment the next line so empty lines are not stored
    #[ "$LINE" ] && \
    ARRAY+=("$LINE")
done < URI.txt

for LINE in "${ARRAY[@]}"
  do
    # just for the demo: show every element, including the empty ones
    echo -e "LINE='$LINE'"
    # skip empty lines
    [ "$LINE" ] && curl "${URL}${LINE}" | grep -o '<span class="starting-price">.*</span>' | sed 's/<[^>]\+>//g' >> price.txt 
done

exit 0
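
The same idea can be sketched more compactly. This is only an illustration under assumptions: the question does not say whether URL.txt contains blank lines or Windows-style line endings, so the sketch guards against both. The function name build_urls is hypothetical. It only prints the full URLs, so the result can be checked before anything is handed to curl.

```shell
#!/bin/bash
# Hypothetical helper: print BASE followed by each non-empty line of FILE.
build_urls() {
    local base=$1 file=$2 line
    # "|| [[ -n $line ]]" also handles a last line without a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line%$'\r'}          # drop a trailing carriage return (CRLF files)
        [[ -z $line ]] && continue  # skip empty lines
        printf '%s%s\n' "$base" "$line"
    done < "$file"
}

# Example usage (commented out so the sketch makes no network requests):
# build_urls https://www.mcdelivery.com.pk/pk/browse/menu.html URL.txt
```

Each printed line can then be passed to curl in a loop, for example by piping the output into while IFS= read -r u; do curl "$u"; done.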

Source: https://stackoverflow.com/questions/62235280/get-data-from-one-file-to-another-bash-web-scraping

Tags

bash

loops

url

web-scraping

scripting