Retrieve contents of URL as string
The open method passes an IO representation of the resource to your block when it yields. You can read from it using the IO#read method.
open([mode [, perm]] [, options]) [{|io| ... }]
require 'open-uri'
data = URI.open(path) { |io| io.read }  # use URI.open on Ruby 2.5+; bare open(url) was removed in Ruby 3.0
PHP Get URL Contents And Search For String
Just read the contents of the page as you would read a file; PHP handles the connection for you, provided allow_url_fopen is enabled. Then look for the string with a regex or a simple string comparison.
$url = 'http://my.url.com/';
$data = file_get_contents( $url );
// strpos() takes the haystack first, then the needle
if ( strpos( $data, 'maybe baby love you' ) === false )
{
    // the string was not found; do something
}
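For comparison, here is a minimal Python sketch of the same two options, a plain substring test and a regex search. The helper names fetch_page, contains_phrase and matches_pattern are made up for this illustration, and the sketch assumes the page is UTF-8 encoded.

```python
import re
from urllib.request import urlopen


def fetch_page(url):
    """Download url and decode the body, assuming UTF-8 (an assumption)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def contains_phrase(data, phrase):
    """Simple string comparison: True if phrase occurs anywhere in data."""
    return phrase in data


def matches_pattern(data, pattern):
    """Regex search: True if pattern matches somewhere in data."""
    return re.search(pattern, data) is not None
```

You would call it along the lines of contains_phrase(fetch_page('http://my.url.com/'), 'maybe baby love you'). The substring test is enough for a fixed phrase; the regex version is only worth it when the text you are looking for varies.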
How to extract XML data to a string via URL
This fragment can help you
new Thread() {
    @Override
    public void run() {
        BufferedReader in = null;
        try {
            URL url = new URL("your url");
            // Decode the stream as UTF-8, the most common encoding for XML
            in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"));
            String inputLine;
            StringBuilder builder = new StringBuilder();
            // readLine() strips line terminators, so add them back
            while ((inputLine = in.readLine()) != null) {
                builder.append(inputLine).append('\n');
            }
            String urlContent = builder.toString();
            // process your received data somehow
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (in != null) {
                try {
                    in.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}.start();
Is there a way to retrieve the HTML content of a web page by casting it into a string in Python?
It may be better to use Beautiful Soup, because it helps you parse the HTML.
If you don't have this module, install it with pip install bs4
on Windows, or pip3 install bs4
on Mac or Linux. Note that requests is a third-party package and does not ship with Python 3, so install it with pip install requests if it is missing; likewise, if you don't have the lxml module, install it with pip install lxml
import requests
from bs4 import BeautifulSoup
res = requests.get('website-url-here')
src = res.content
soup = BeautifulSoup(src, 'lxml')
markup = soup.prettify()
print(markup)
This gives you the markup of the entire page. From there it may be easier
to extract the useful parts
by finding the elements you want:
soup.find_all('div', {'class': 'classname'})
This returns a list of every matching element, whereas this does not:
soup.find('div', {'class': 'classname'})
It returns only the first match. The choice is yours.
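If all you need is the page as a string, without parsing it, the standard library may be enough. Below is a minimal sketch; fetch_text is a hypothetical helper name, and falling back to UTF-8 when the server declares no charset is an assumption, not a rule.

```python
from urllib.request import urlopen


def fetch_text(url, default_encoding="utf-8"):
    """Return the body of url decoded to a str."""
    with urlopen(url) as response:
        raw = response.read()  # bytes, comparable to res.content in requests
        # Prefer the charset declared in the Content-Type header, if any;
        # otherwise fall back to default_encoding (an assumption).
        charset = response.headers.get_content_charset() or default_encoding
    # errors="replace" avoids an exception on bytes that do not decode cleanly
    return raw.decode(charset, errors="replace")
```

Calling fetch_text('website-url-here') then gives you a str you can search or pass on to BeautifulSoup. With requests, res.text plays a similar role, while res.content stays as raw bytes.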
Return HTML content as a string, given URL. Javascript Function
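You need to return the value once readyState == 4. A return inside the onreadystatechange handler only returns from the handler, not from httpGet. Since this request is synchronous, check the state after send() instead, e.g.
function httpGet(theUrl)
{
    var xmlhttp;
    if (window.XMLHttpRequest)
    {// code for IE7+, Firefox, Chrome, Opera, Safari
        xmlhttp = new XMLHttpRequest();
    }
    else
    {// code for IE6, IE5
        xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
    }
    // third argument false = synchronous: send() blocks until the response arrives
    xmlhttp.open("GET", theUrl, false);
    xmlhttp.send();
    if (xmlhttp.readyState == 4 && xmlhttp.status == 200)
    {
        return xmlhttp.responseText;
    }
    return null; // request did not complete successfully
}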