How to Get the Contents of a Webpage in a Shell Variable

How to get the contents of a webpage in a shell variable?

You can use the wget command to download the page and read it into a variable:

content=$(wget google.com -q -O -)
echo "$content"

We use wget's -O option, which lets us specify the name of the file into which wget writes the page contents. Passing - sends the dump to standard output, which we then capture into the variable content. The -q (quiet) option turns off wget's own output.
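The quotes around "$content" matter: unquoted, the shell word-splits the expansion, so runs of whitespace (including the page's newlines) collapse into single spaces. A small offline demonstration, using a literal string in place of a downloaded page:

```shell
# Simulated page content (stands in for the output of wget/curl)
content='<html>
  <body>hello</body>
</html>'

unquoted=$(echo $content)     # word-split: newlines and indentation collapse
quoted=$(echo "$content")     # value preserved verbatim

echo "$unquoted"    # one line:  <html> <body>hello</body> </html>
echo "$quoted"      # three lines, exactly as stored
```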

You can use the curl command for this as well:

content=$(curl -s -L google.com)   # -s silences curl's progress meter
echo "$content"

We need the -L option because the page we are requesting might have moved, in which case we need to fetch it from its new location. The -L (or --location) option tells curl to follow such redirects.
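Since network fetches can fail (DNS errors, timeouts, and so on), it is worth checking the command's exit status before trusting the variable. A hedged sketch; fetch_page here is a hypothetical stand-in for the real curl invocation so the example runs offline:

```shell
# fetch_page is a placeholder for: curl -f -s -L "$1"
fetch_page() {
    echo "<html>stub page for $1</html>"
}

url="http://example.com"
if content=$(fetch_page "$url"); then
    echo "fetched ${#content} bytes from $url"
else
    echo "failed to fetch $url" >&2
    exit 1
fi
```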

How to read/search specific content in a webpage using shell scripting

You can use wget or curl to fetch the page. Then you'll need a regex or some other string manipulation to pull out the information you need. For anything beyond trivial extraction, it is much easier to use an HTML parsing library to do this for you.
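For simple, well-formed pages, a grep/sed pipeline can be enough. A minimal sketch that extracts a page's &lt;title&gt;; the content here is a literal string standing in for the output of curl:

```shell
# Stand-in for: content=$(curl -s -L http://example.com)
content='<html><head><title>Example Page</title></head><body></body></html>'

# grep -o prints only the matching part; sed strips the surrounding tags
title=$(printf '%s' "$content" \
    | grep -o '<title>[^<]*</title>' \
    | sed -e 's/<title>//' -e 's/<\/title>//')
echo "$title"    # Example Page
```

This only works when the whole tag sits on one line, which is one reason a real HTML parser is the safer choice.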

Webpage to take input from shell script and display result

First of all, do not use $USERNAME in your BASH script. $USERNAME is a BASH variable that contains the current user's name. In fact, it is generally a bad idea to use UPPERCASE variable names in BASH: most environment variables are upper case, and reusing those names leads to confusion. It is good practice to keep your own variables lower case.

Also, since I imagine you want to do this through an HTML form, you cannot have BASH read from STDIN. Modify your script to take the user name as an argument:

BASH:

#!/bin/bash
user=$1
display_name=$(ldapsearch -p xxx -LLL -x -w test -h abc.com -D abc -b dc=abc,dc=com "sAMAccountName=$user" | grep displayName)
if [ -z "$display_name" ]; then
    echo "No entry found for $user"
else
    echo "Entry found for $user"
fi

Perl:

#!/usr/bin/perl
use CGI qw(:standard);
use CGI::Carp qw(warningsToBrowser fatalsToBrowser);
use strict;
use warnings;
## Create a new CGI object
my $cgi = CGI->new;
## Collect the value of 'user_name' submitted by the webpage
my $name=$cgi->param('user_name');

## Run a system command, your display_name.sh,
## and save the result in $result
my $result = `./display_name.sh "$name"`;  # quote $name to avoid shell injection

## Print the HTML header
print header;
## Print the result
print "$result<BR>";

HTML:

<html>
<body>
<form ACTION="./cgi-bin/display_name.pl" METHOD="post">
<INPUT TYPE="text" NAME="user_name">
<INPUT TYPE="submit" VALUE="Submit">
</form>
</body>
</html>

This should do what you need. It assumes that both scripts are in the ./cgi-bin/ directory of your webpage and are called display_name.sh and display_name.pl. It also assumes that you have set their permissions correctly (they need to be executable by apache2's user, www-data). Finally, it assumes that you have set up apache2 to allow execution of scripts in ./cgi-bin.

Is there a specific reason you want to use BASH? You could just do everything directly from the Perl script:

#!/usr/bin/perl
use CGI qw(:standard);
use CGI::Carp qw(warningsToBrowser fatalsToBrowser);
use strict;
use warnings;
## Create a new CGI object
my $cgi = CGI->new;
## Collect the value of 'name' submitted by the webpage
my $name=$cgi->param('user_name');

## Run the ldapsearch system command
## and save the result in $result
my $result = `ldapsearch -p xxx -LLL -x -w test -h abc.com -D abc -b dc=abc,dc=com "sAMAccountName=$name" | grep displayName`;  # quote $name to avoid shell injection

## Print the HTML header
print header;
## Print the result
print $result
    ? "Entry found for $name<BR>"
    : "No entry found for $name<BR>";

How to use sed to extract text from a webpage

Although sed is generally not the right tool for extracting text from web pages, it may be sufficient for simple tasks. sed is a line-oriented tool, so each line is handled separately.

If you really want to do it with sed, this will give you some output:

curl -s http://example.com | sed -n -e 's/.*<h1>\(.*\)<\/h1>/\1 \n/p' -e 's/<p>\(This.*\)/\1 \n/p'
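Because sed handles one line at a time, these patterns only match when the whole tag sits on a single line. A self-contained variant of the same idea, where the printf stands in for the downloaded page so the example runs offline:

```shell
# Simulated page body in place of: curl -s http://example.com
printf '<h1>Hello</h1>\n<p>This paragraph.</p>\n' \
    | sed -n \
        -e 's/.*<h1>\(.*\)<\/h1>.*/\1/p' \
        -e 's/.*<p>\(This[^<]*\)<\/p>.*/\1/p'
# prints:
#   Hello
#   This paragraph.
```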

