How to Parse HTTP Headers Using Bash

How to parse HTTP headers using Bash?

A full bash solution. It demonstrates how to easily parse other headers without requiring awk:

shopt -s extglob # Required to trim whitespace; see below

while IFS=':' read -r key value; do
    # trim whitespace in "value" (this also removes the trailing CR)
    value=${value##+([[:space:]])}; value=${value%%+([[:space:]])}

    case "$key" in
        Server) SERVER="$value"
                ;;
        Content-Type) CT="$value"
                ;;
        # The status line has no colon, so it normally lands in "key" whole;
        # re-join key and value in case the message itself contained a colon.
        HTTP*) read -r PROTO STATUS MSG <<< "$key${value:+:$value}"
                ;;
    esac
done < <(curl -sI http://www.google.com)
echo "$STATUS"
echo "$SERVER"
echo "$CT"

Producing:

302
GFE/2.0
text/html; charset=UTF-8
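
One caveat with the case patterns above: HTTP field names are case-insensitive, and curl prints them in lowercase for HTTP/2 responses, so `Server)` would not match `server:`. A minimal sketch of one way around this is to lowercase the key before matching. The canned response in the here document is made-up sample data standing in for `curl -sI`, and `lkey` and `cr` are names introduced here for illustration.

```shell
# Sketch: match header names case-insensitively by lowercasing the key first.
# The here document holds made-up sample data in place of real curl output.
cr=$(printf '\r')
SERVER='' CT='' STATUS=''
while IFS=':' read -r key value; do
    key=${key%"$cr"}           # drop the CR of a CRLF line ending, if present
    value=${value%"$cr"}
    value=${value# }           # drop the single space that usually follows the colon
    lkey=$(printf '%s' "$key" | tr '[:upper:]' '[:lower:]')
    case $lkey in
        server)       SERVER=$value ;;
        content-type) CT=$value ;;
        http/*)       # status line, e.g. "HTTP/2 200": keep the second word
                      STATUS=${key#* }
                      STATUS=${STATUS%% *} ;;
    esac
done <<'EOF'
HTTP/2 200
Content-Type: text/html; charset=UTF-8
server: demo
EOF
printf '%s\n' "$STATUS" "$SERVER" "$CT"
```

In bash 4 or later, `${key,,}` gives the same lowercase result without spawning `tr`; the `tr` form is used here only so the sketch also runs under /bin/sh.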

According to RFC 2616, HTTP headers are modeled as described in "Standard for the Format of ARPA Internet Text Messages" (RFC 822), which states clearly in section 3.1.2:

The field-name must be composed of printable ASCII characters
(i.e., characters that have values between 33. and 126.,
decimal, except colon). The field-body may be composed of any
ASCII characters, except CR or LF. (While CR and/or LF may be
present in the actual text, they are removed by the action of
unfolding the field.)

So the above script should catch any RFC [2]822-compliant header, with the notable exception of folded headers (a header whose value continues on following lines that start with a space or tab).
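
If folded headers do matter, one option is to unfold them before the parsing loop sees them. The following is only a sketch under that assumption: `unfold_headers` is a hypothetical helper name, and it sticks to POSIX shell features so it behaves the same in bash and /bin/sh. It joins every line that starts with a space or tab onto the previous line and strips the CR of each CRLF line ending along the way.

```shell
# Sketch: join folded continuation lines onto the preceding header line.
# unfold_headers is a hypothetical helper, not part of the original script.
unfold_headers() {
    tab=$(printf '\t')
    cr=$(printf '\r')
    prev=''
    started=0
    while IFS= read -r line || [ -n "$line" ]; do
        line=${line%"$cr"}                  # strip the CR of a CRLF line ending
        case $line in
            " "*|"$tab"*)
                # Continuation: drop leading blanks, append to the pending line.
                while :; do
                    case $line in
                        " "*|"$tab"*) line=${line#?} ;;
                        *) break ;;
                    esac
                done
                prev="$prev $line"
                ;;
            *)
                # New logical line: emit the pending one first.
                if [ "$started" -eq 1 ]; then
                    printf '%s\n' "$prev"
                fi
                prev=$line
                started=1
                ;;
        esac
    done
    if [ "$started" -eq 1 ]; then
        printf '%s\n' "$prev"
    fi
}

# Example with made-up headers; the X-Example value is folded over two lines.
printf 'X-Example: first part,\r\n\tsecond part\r\nServer: demo\r\n' | unfold_headers
```

With the sample input, the two physical `X-Example` lines come out as the single line `X-Example: first part, second part`. In the bash script above, the filter could sit inside the process substitution, for example `done < <(curl -sI http://www.google.com | unfold_headers)`.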

How to parse HTTP headers with /bin/sh?

You can trim whitespace by using read with a here document. Use a named pipe to "simulate" the process substitution. (Process substitution may in fact be implemented with named pipes on some operating systems.)

mkfifo headers
curl -sI http://www.google.com > headers &

{
    # This line is guaranteed to be first, before any headers.
    # Read it separately.
    read -r PROTO STATUS MSG
    while IFS=':' read -r key value; do
        # trim whitespace in "value"
        read -r value <<EOF
$value
EOF

        case $key in
            Server) SERVER="$value"
                    ;;
            Content-Type) CT="$value"
                    ;;
        esac
    done
} < headers
rm headers

I leave it as an exercise to research how to indent the body of the here document properly.
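
A variant of the same named-pipe technique, offered as a sketch rather than a replacement: it keeps the FIFO in a private temporary directory, removes it on exit via `trap`, and strips the trailing CR that `read` with the default IFS leaves on each value. `printf` with made-up headers stands in for `curl -sI ...` so the example runs offline; `tmpdir`, `fifo`, and `cr` are names introduced here, and `mktemp -d` is assumed to be available (it is common but not required by POSIX).

```shell
# Sketch: named pipe in a private temp directory, cleaned up on exit.
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT
fifo=$tmpdir/headers
mkfifo "$fifo"

# Stand-in for: curl -sI http://www.google.com > "$fifo" &
printf 'HTTP/1.1 200 OK\r\nServer: demo\r\n\r\n' > "$fifo" &

cr=$(printf '\r')
{
    # The status line comes first; read it separately.
    read -r PROTO STATUS MSG
    while IFS=':' read -r key value; do
        value=${value%"$cr"}   # drop the CR of the CRLF line ending
        # trim leading and trailing blanks in "value"
        read -r value <<EOF
$value
EOF
        case $key in
            Server) SERVER=$value ;;
        esac
    done
} < "$fifo"
wait                           # reap the background writer
echo "$STATUS $SERVER"
```

Keeping the FIFO out of the current directory avoids clashing with an existing file named `headers`, and the `trap` removes it even if the script stops early.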

Parse and format a header into a variable from a curl response?

One awk idea that replaces the grep and returns just the desired token:

curl ... | awk -F'[=;]' '/Set-Cookie/{gsub(" ","",$2);print $2;exit}'

Where:

  • -F'[=;]' - use = and ; as input field delimiters
  • /Set-Cookie/ - match any line with the string Set-Cookie (I'm assuming there will only be 1 such line generated by the curl call)
  • gsub(" ","",$2) - remove spaces from field #2
  • print $2 - print field #2 (the desired token)
  • exit - we found what we want so exit

Applying this awk code to the sample response from the question (the text block under Terminal response) generates:

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJXQS0yMjAwIiwibmJmIjoxNjA3NDUzNDY5LCJleHAiOjE2MDc0NTM3Njl9.KG4GXVLaTQ1TCe2nxOIVjLAHZyGNizbgM0Wb94-dkZI

To store in a variable:

$ result=$(curl ... | awk -F'[=;]' '/Set-Cookie/{gsub(" ","",$2);print $2;exit}')
$ echo "${result}"
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJXQS0yMjAwIiwibmJmIjoxNjA3NDUzNDY5LCJleHAiOjE2MDc0NTM3Njl9.KG4GXVLaTQ1TCe2nxOIVjLAHZyGNizbgM0Wb94-dkZI
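
To try the awk command without a live server, it can be fed canned input. This is a small sketch: the headers in `sample`, including the cookie name `token` and value `abc123`, are made up for illustration and do not come from the original question.

```shell
# Made-up response headers standing in for the real curl output.
sample='HTTP/1.1 200 OK
Content-Type: application/json
Set-Cookie: token=abc123; Path=/; HttpOnly'

# Split on "=" and ";": field 1 is "Set-Cookie: token", field 2 is the value.
result=$(printf '%s\n' "$sample" |
    awk -F'[=;]' '/Set-Cookie/{gsub(" ","",$2);print $2;exit}')
echo "${result}"
```

Because `=` is a field delimiter, a cookie value that itself contains `=` (base64 padding, for example) would be cut off at the first one, so this approach suits tokens like the JWT above that have none.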

Curl write out value of specific header

The variables available to "-w" are not directly connected to the HTTP headers.
So it looks like you have to parse the headers on your own:

curl -I "server/some/resource" | grep -Fi etag
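
The grep above prints the whole header line, name and trailing CR included. If only the value is wanted, a sketch along these lines may help; `sample_headers` is a hypothetical stand-in for `curl -sI "server/some/resource"` with made-up data, so the example runs offline.

```shell
# Hypothetical stand-in for: curl -sI "server/some/resource"
sample_headers() {
    printf 'HTTP/1.1 200 OK\r\nETag: "abc123"\r\nContent-Length: 0\r\n\r\n'
}

# Strip CRs, find the ETag line regardless of case, remove "name:" and blanks.
etag=$(sample_headers | tr -d '\r' |
    awk 'tolower($0) ~ /^etag:/ { sub(/^[^:]*:[ \t]*/, ""); print; exit }')
echo "$etag"
```

The value keeps its surrounding double quotes, since they are part of the ETag itself.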

