How to parse $QUERY_STRING from a bash CGI script

为君一笑 提交于 2019-11-26 22:39:28

Try this:

saveIFS=$IFS
IFS='=&'
parm=($QUERY_STRING)
IFS=$saveIFS

Now you have this:

parm[0]=a
parm[1]=123
parm[2]=b
parm[3]=456
parm[4]=c
parm[5]=ok

In Bash 4, which has associative arrays, you can do this (using the array created above):

declare -A array
for ((i=0; i<${#parm[@]}; i+=2))
do
    array[${parm[i]}]=${parm[i+1]}
done

which will give you this:

array[a]=123
array[b]=456
array[c]=ok

Edit:

To use indirection in Bash 2 and later (using the parm array created above):

for ((i=0; i<${#parm[@]}; i+=2))
do
    declare var_${parm[i]}=${parm[i+1]}
done

Then you will have:

var_a=123
var_b=456
var_c=ok

You can access these directly:

echo $var_a

or indirectly:

for p in a b c
do
    name="var$p"
    echo ${!name}
done

If possible, it's better to avoid indirection since it can make code messy and be a source of bugs.

you can break $QUERY down using IFS. For example, setting it to &

$ QUERY="a=123&b=456&c=ok"
$ echo $QUERY
a=123&b=456&c=ok
$ IFS="&"
$ set -- $QUERY
$ echo $1
a=123
$ echo $2
b=456
$ echo $3
c=ok

$ array=($@)

$ for i in "${array[@]}"; do IFS="=" ; set -- $i; echo $1 $2; done
a 123
b 456
c ok

And you can save to a hash/dictionary in Bash 4+

$ declare -A hash
$ for i in "${array[@]}"; do IFS="=" ; set -- $i; hash[$1]=$2; done
$ echo ${hash["b"]}
456

Please don't use the evil eval junk.

Here's how you can reliably parse the string and get an associative array:

declare -A param   
while IFS='=' read -r -d '&' key value && [[ -n "$key" ]]; do
    param["$key"]=$value
done <<<"${QUERY_STRING}&"

If you don't like the key check, you could do this instead:

declare -A param   
while IFS='=' read -r -d '&' key value; do
    param["$key"]=$value
done <<<"${QUERY_STRING:+"${QUERY_STRING}&"}"

Listing all the keys and values from the array:

for key in "${!param[@]}"; do
    echo "$key: ${param[$key]}"
done

To converts the contents of QUERY_STRING into bash variables use the following command:

eval $(echo ${QUERY_STRING//&/;})

The inner step, echo ${QUERY_STRING//&/;}, substitutes all ampersands with semicolons producing a=123;b=456;c=ok which the eval then evaluates into the current shell.

The result can then be used as bash variables.

echo $a
echo $b
echo $c

The assumptions are:

  • values will never contain '&'
  • values will never contain ';'
  • QUERY_STRING will never contain malicious code

I packaged the sed command up into another script:

$cat getvar.sh

s='s/^.*'${1}'=\([^&]*\).*$/\1/p'
echo $QUERY_STRING | sed -n $s | sed "s/%20/ /g"

and I call it from my main cgi as:

id=`./getvar.sh id`
ds=`./getvar.sh ds`
dt=`./getvar.sh dt`

...etc, etc - you get idea.

works for me even with a very basic busybox appliance (my PVR in this case).

A nice way to handle CGI query strings is to use Haserl which acts as a wrapper around your Bash cgi script, and offers convenient and secure query string parsing.

I would simply replace the & to ;. It will become to something like:

a=123;b=456;c=ok

So now you need just evaluate and read your vars:

eval `echo "${QUERY_STRING}"|tr '&' ';'`
echo $a
echo $b
echo $c
badc0de

Following the correct answer, I've done myself some changes to support array variables like in this other question. I added also a decode function of which I can not find the author to give some credit.

Code appears somewhat messy, but it works. Changes and other recommendations would be greatly appreciated.

function cgi_decodevar() {
    [ $# -ne 1 ] && return
    local v t h
    # replace all + with whitespace and append %%
    t="${1//+/ }%%"
    while [ ${#t} -gt 0 -a "${t}" != "%" ]; do
        v="${v}${t%%\%*}" # digest up to the first %
        t="${t#*%}"       # remove digested part
        # decode if there is anything to decode and if not at end of string
        if [ ${#t} -gt 0 -a "${t}" != "%" ]; then
            h=${t:0:2} # save first two chars
            t="${t:2}" # remove these
            v="${v}"`echo -e \\\\x${h}` # convert hex to special char
        fi
    done
    # return decoded string
    echo "${v}"
    return
}

saveIFS=$IFS
IFS='=&'
VARS=($QUERY_STRING)
IFS=$saveIFS

for ((i=0; i<${#VARS[@]}; i+=2))
do
  curr="$(cgi_decodevar ${VARS[i]})"
  next="$(cgi_decodevar ${VARS[i+2]})"
  prev="$(cgi_decodevar ${VARS[i-2]})"
  value="$(cgi_decodevar ${VARS[i+1]})"

  array=${curr%"[]"}

  if  [ "$curr" == "$next" ] && [ "$curr" != "$prev" ] ;then
      j=0
      declare var_${array}[$j]="$value"
  elif [ $i -gt 1 ] && [ "$curr" == "$prev" ]; then
    j=$((j + 1))
    declare var_${array}[$j]="$value"
  else
    declare var_$curr="$value"
  fi
done

To bring this up to date, if you have a recent Bash version then you can achieve this with regular expressions:

q="$QUERY_STRING"
re1='^(\w+=\w+)&?'
re2='^(\w+)=(\w+)$'
declare -A params
while [[ $q =~ $re1 ]]; do
  q=${q##*${BASH_REMATCH[0]}}       
  [[ ${BASH_REMATCH[1]} =~ $re2 ]] && params+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})
done

If you don't want to use associative arrays then just change the penultimate line to do what you want. For each iteration of the loop the parameter is in ${BASH_REMATCH[1]} and its value is in ${BASH_REMATCH[2]}.

Here is the same thing as a function in a short test script that iterates over the array outputs the query string's parameters and their values

#!/bin/bash
QUERY_STRING='foo=hello&bar=there&baz=freddy'

get_query_string() {
  local q="$QUERY_STRING"
  local re1='^(\w+=\w+)&?'
  local re2='^(\w+)=(\w+)$'
  while [[ $q =~ $re1 ]]; do
    q=${q##*${BASH_REMATCH[0]}}
    [[ ${BASH_REMATCH[1]} =~ $re2 ]] && eval "$1+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})"
  done
}

declare -A params
get_query_string params

for k in "${!params[@]}"
do
  v="${params[$k]}"
  echo "$k : $v"
done          

Note the parameters end up in the array in reverse order (it's associative so that shouldn't matter).

why not this

    $ echo "${QUERY_STRING}"
    name=carlo&last=lanza&city=pfungen-CH
    $ saveIFS=$IFS
    $ IFS='&'
    $ eval $QUERY_STRING
    $ IFS=$saveIFS

now you have this

    name = carlo
    last = lanza
    city = pfungen-CH

    $ echo "name is ${name}"
    name is carlo
    $ echo "last is ${last}"
    last is lanza
    $ echo "city is ${city}"
    city is pfungen-CH

@giacecco

To include a hiphen in the regex you could change the two lines as such in answer from @starfry.

Change these two lines:

  local re1='^(\w+=\w+)&?'
  local re2='^(\w+)=(\w+)$'

To these two lines:

  local re1='^(\w+=(\w+|-|)+)&?'
  local re2='^(\w+)=((\w+|-|)+)$'

For all those who couldn't get it working with the posted answers (like me), this guy figured it out.

Can't upvote his post unfortunately...

Let me repost the code here real quick:

#!/bin/sh

if [ "$REQUEST_METHOD" = "POST" ]; then
  if [ "$CONTENT_LENGTH" -gt 0 ]; then
      read -n $CONTENT_LENGTH POST_DATA <&0
  fi
fi

#echo "$POST_DATA" > data.bin
IFS='=&'
set -- $POST_DATA

#2- Value1
#4- Value2
#6- Value3
#8- Value4

echo $2 $4 $6 $8

echo "Content-type: text/html"
echo ""
echo "<html><head><title>Saved</title></head><body>"
echo "Data received: $POST_DATA"
echo "</body></html>"

Hope this is of help for anybody.

Cheers

While the accepted answer is probably the most beautiful one, there might be cases where security is super-important, and it needs to be also well-visible from your script.

In such a case, first I wouldn't use bash for the task, but if it should be done on some reason, it might be better to avoid these new array - dictionary features, because you can't be sure, how exactly are they escaped.

In this case, the good old primitive solutions might work:

QS="${QUERY_STRING}"
while [ "${QS}" != "" ]
do
  nameval="${QS%%&*}"
  QS="${QS#$nameval}"
  QS="${QS#&}"
  name="${nameval%%=*}"
  val="${nameval#$name}"
  val="${nameval#=}"

  # and here we have $name and $val as names and values

  # ...

done

This iterates on the name-value pairs of the QUERY_STRING, and there is no way to circumvent it with any tricky escape sequence - the " is a very strong thing in bash, except a single variable name substitution, which is fully controlled by us, nothing can be tricked.

Furthermore, you can inject your own processing code into "# ...". This enables you to allow only your own, well-defined (and, ideally, short) list of the allowed variable names. Needless to say, LD_PRELOAD shouldn't be one of them. ;-)

Furthermore, no variable will be exported, and exclusively QS, nameval, name and val is used.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!