Retrieve Contents of Url as String

Retrieve contents of URL as string

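With Ruby's open-uri library loaded, the open method passes an IO-like representation of the resource to your block when it yields. You can read the whole body from it using the IO#read method. On Ruby 3.0 and later, call URI.open, because Kernel#open no longer accepts URLs.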

open([mode [, perm]] [, options]) [{|io| ... }] 
require 'open-uri'
# the block's return value becomes the result, so data is usable afterwards
data = URI.open(path) { |io| io.read }

PHP Get URL Contents And Search For String

Just read the contents of the page as you would read a file. PHP does the connection stuff for you. Then just look for the string via regex or simple string comparison.

$url = 'http://my.url.com/';
$data = file_get_contents( $url );

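// strpos( $haystack, $needle ): the page contents come first, the search string second
if ( strpos( $data, 'maybe baby love you' ) === false )
{

// string not found: do something

}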

How to extract XML data to a string via URL

This fragment may help. It reads the URL on a background thread and collects the response into a String:

// Needs java.io.BufferedReader, java.io.InputStreamReader,
// java.io.IOException and java.net.URL.
new Thread() {
    public void run() {
        BufferedReader in = null;
        try {
            URL url = new URL("your url");

            // UTF-8 is the most common encoding; change it if the server uses another
            in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"));

            String inputLine;
            StringBuilder builder = new StringBuilder();
            while ((inputLine = in.readLine()) != null) {
                // readLine() strips the line terminator, so put one back
                builder.append(inputLine).append('\n');
            }
            String urlContent = builder.toString();
            // process your received data somehow
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (in != null) {
                try {
                    in.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}.start();

Is there a way to retrieve the HTML content of a web page by casting it into a string in Python?

It may be better to use Beautiful Soup, because it parses the HTML for you.
If you don't have the module, install it with pip install bs4 on Windows, or pip3 install bs4 on Mac or Linux. requests and lxml are also third-party packages and do not ship with Python 3, so install them the same way with pip install if they are missing.

import requests
from bs4 import BeautifulSoup

res = requests.get('website-url-here')
src = res.content
soup = BeautifulSoup(src, 'lxml')
markup = soup.prettify()
print(markup)

This prints the markup of the entire page. From there it may be easier to extract the useful parts by finding the elements that you want:

soup.find_all('div', {'class': 'classname'})

This returns a list of every matching element, while the following returns only the first match:

soup.find('div', {'class': 'classname'})

Which one to use is up to you.
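If all you need is the raw HTML as a string, with no parsing, the standard library alone may be enough. The following is a minimal sketch, not the approach from the answer above: fetch_text is a hypothetical helper name, and it assumes UTF-8 whenever the server does not declare a charset in its Content-Type header.

```python
from urllib.request import urlopen


def fetch_text(url, default_encoding="utf-8", timeout=10):
    """Return the body of `url` decoded to a str (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        raw = response.read()  # the body arrives as bytes
        # Prefer the charset declared in the Content-Type header, if any;
        # falling back to UTF-8 is an assumption, not a guarantee.
        encoding = response.headers.get_content_charset() or default_encoding
    # Replace undecodable bytes instead of raising on a wrong charset guess.
    return raw.decode(encoding, errors="replace")


if __name__ == "__main__":
    # A data: URL keeps this demo offline; pass an http(s) URL for a real page.
    print(fetch_text("data:text/plain;charset=utf-8,hello%20world"))
```

The returned string can be handed to BeautifulSoup in the same way as res.content in the example above.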

Return HTML content as a string, given URL. Javascript Function

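A return inside the onreadystatechange handler only returns from that handler, not from httpGet. Because the request is synchronous (the third argument to open is false), the response is complete (readyState==4) once send() returns, so return the text from there, e.g.

function httpGet(theUrl)
{
    var xmlhttp;
    if (window.XMLHttpRequest)
    {// code for IE7+, Firefox, Chrome, Opera, Safari
        xmlhttp = new XMLHttpRequest();
    }
    else
    {// code for IE6, IE5
        xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
    }
    xmlhttp.open("GET", theUrl, false); // false = synchronous request
    xmlhttp.send(); // blocks until the response has arrived
    if (xmlhttp.readyState == 4 && xmlhttp.status == 200)
    {
        return xmlhttp.responseText;
    }
    return null; // request failed
}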

