Is There a PHP Equivalent of Perl's WWW::Mechanize?

Is there a PHP equivalent of Perl's WWW::Mechanize?

SimpleTest's ScriptableBrowser can be used independently from the testing framework. I've used it for numerous automation jobs.
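
For example, a rough sketch of a login script using the scriptable browser might look like the following. This assumes SimpleTest's SimpleBrowser class and its setField()/clickSubmit() helpers; the URL, field names and button label are placeholders, so check the ScriptableBrowser documentation for the exact API:

<?php
// load SimpleTest's standalone browser (the path depends on your installation)
require_once('simpletest/browser.php');

$browser = new SimpleBrowser();
$browser->get('http://www.test.com/login');

// fill in the login form and press the submit button (labelled 'Login' here)
$browser->setField('user', 'test');
$browser->setField('passwrd', 'test');
$browser->clickSubmit('Login');

// save the page we end up on after logging in
file_put_contents('logged_in.html', $browser->getContent());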

Tiny runnable WWW::Mechanize examples for the beginner

Could you be a little more specific about what exactly you are after? For instance, this is a script to log into a website:

use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
my $url  = "http://www.test.com";

# seed a cookie before the first request (version, name, value, path, domain)
$mech->cookie_jar->set_cookie(0, "start", 1, "/", ".test.com");
$mech->get($url);

# select the login form by name, fill in the credentials and submit
$mech->form_name("frmLogin");
$mech->set_fields(user => 'test', passwrd => 'test');
$mech->click();

# save the page we end up on after logging in
$mech->save_content("logged_in.html");

This is a script to perform Google searches:

use strict;
use warnings;
use 5.10.0;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();

# the last command-line argument is the maximum number of results per query
my $option = $ARGV[$#ARGV];

# you may customize your google search by editing this url (always end it with "q=" though)
my $google = 'http://www.google.co.uk/search?q=';

my @dork = ( "inurl:dude", "cheese" );

# start the main loop, one iteration for every google search
for my $i ( 0 .. $#dork ) {

    # reset the result counter for each query
    my $max = 0;

    # loop until the chosen maximum number of results is reached
    while ( $max <= $option ) {
        $mech->get( $google . $dork[$i] . "&start=" . $max );

        # print every result link, skipping relative urls and google's own links
        foreach my $link ( $mech->links() ) {
            my $google_url = $link->url;
            if ( $google_url !~ /^\// && $google_url !~ /google/ ) {
                say $google_url;
            }
        }
        $max += 10;
    }
}
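
As written, the script takes its result cutoff from the last command-line argument, so running it as, say, perl google.pl 30 (if you saved it under that name) would print roughly the first 30 result links for each query before moving on to the next one.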

A simple site crawler extracting information (HTML comments) from every page:

use strict;
use warnings;
use 5.10.0;
use WWW::Mechanize;

# create the mechanize object with autocheck switched off,
# so we don't get a fatal error when a bad/malformed url is requested
my $mech = WWW::Mechanize->new( autocheck => 0 );
my %comments;
my %links;
my @comment;

my $target = "http://google.com";

# store the first target url as not yet checked
$links{$target} = 0;

# initiate the search
my $url = get_url();

# start the main loop
while ( $url ne "" ) {

    # fetch the target url
    $mech->get($url);

    # search the source for any html comments
    my $res = $mech->content;
    @comment = $res =~ /<!--[^>]*-->/g;

    # store the comments in the %comments hash and print them, if any were found
    if (@comment) {
        $comments{$url} = "@comment";
        say "\n$url \n---------------->\n $comments{$url}";
    }

    # loop through all the links on the current page (only urls contained in an html anchor)
    foreach my $link ( $mech->links() ) {
        $link = $link->url();

        # exclude some irrelevant stuff, such as javascript functions or external links;
        # you might want to add a domain-name check to ensure relevant links aren't excluded
        if ( $link !~ /^(#|mailto:|(f|ht)tp(s)?\:|www\.|javascript:)/ ) {

            # check whether the link has a leading slash so we can build the whole url properly
            $link = $link =~ /^\// ? $target . $link : $target . "/" . $link;

            # store it in our hash of links to be searched, unless it's already present
            $links{$link} = 0 unless $links{$link};
        }
    }

    # mark this url as searched and start over
    $links{$url} = 1;
    $url = get_url();
}

sub get_url {
    my ( $key, $value );

    # loop through the links hash and return the next target url, unless it has
    # already been searched; if all urls have been searched, return an empty
    # string, ending the main loop
    while ( ( $key, $value ) = each(%links) ) {
        return $key if $value == 0;
    }

    return "";
}
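
Note that the %links hash doubles as the crawl queue and the visited set: a value of 0 marks a url that still has to be fetched and 1 marks one that has already been searched, so get_url() simply returns the first entry it finds that is still set to 0, and the main loop ends once none are left.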

It really depends on what you are after, but if you want more examples I would refer you to perlmonks.org, where you can find plenty of material to get you going.

Definitely bookmark the WWW::Mechanize module man page though; it is the ultimate resource.

How to use output from WWW::Mechanize?

The documentation points out that the links are returned as WWW::Mechanize::Link objects. Therefore:

my @links = $m->find_all_links(url_regex => qr/google/);
print $_->url, "\n" for @links;

Perl Mechanize Returns Jumbled Text

You probably want $result->decoded_content(), not $result->content().

See https://metacpan.org/pod/HTTP::Response#r-content-bytes
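
As a minimal sketch (the URL is just a placeholder): $mech->get() returns an HTTP::Response object, and calling decoded_content() on it undoes any Content-Encoding (e.g. gzip) and character-set encoding before you look at the text, whereas content() hands back the raw bytes:

use strict;
use warnings;
use WWW::Mechanize;

my $mech   = WWW::Mechanize->new();
my $result = $mech->get('http://www.example.com/');   # an HTTP::Response object

# raw bytes, possibly gzip-compressed and in the page's native charset
my $raw = $result->content();

# decompressed and decoded to Perl's internal text representation
my $text = $result->decoded_content();

print $text;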


