How to Use cURL to Fetch Specific Data from a Website and Then Save It to My Database Using PHP

How to use cURL to fetch specific data from a website and then save it to my database using PHP

Using cURL:

$ch = curl_init();
curl_setopt( $ch, CURLOPT_URL, 'http://www.something.com');
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true);

$content = curl_exec($ch);

You can then load the response into a DOMDocument object and parse the DOM for the specific data. You could also try to parse the data with string searches, but using regex on HTML is strongly discouraged.

$dom = new DOMDocument();
$dom->loadHTML( $content );

// Parse the dom for your desired content
  • http://www.php.net/manual/en/class.domdocument.php
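As a rough illustration of the parsing step, the sketch below pulls text out of an HTML string with DOMXPath. The function name extractTexts, the sample markup, and the XPath expression are made up for this example; in practice you would pass in the $content returned by curl_exec().

```php
<?php
// Hypothetical helper: return the trimmed text of every node matching $query.
function extractTexts(string $html, string $query): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Sample markup standing in for $content from curl_exec().
$sample = '<html><body><h1 class="title">Hello</h1><p class="price">42 USD</p></body></html>';
$prices = extractTexts($sample, '//p[@class="price"]');
```

Returning a plain array keeps the parsing step separate from whatever you do next with the values, such as writing them to the database.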

Post data to database from external server using curl or webhooks

Very simply put, you can use this to get values posted to your server:

$email = $_POST['email'];
$phone = $_POST['phone'];

Note that your code is not production-ready: it is vulnerable to SQL injection and doesn't validate incoming data. Be sure to address this before using it on your live website.
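One possible way to address both points is sketched below: validate the posted values first, then insert them with a PDO prepared statement. The helper names, the contacts table with its email and phone columns, and the phone length bounds are all assumptions made for this example, not something taken from the original question.

```php
<?php
// Hypothetical helper: return cleaned values, or null if the input is unusable.
function validateContact(array $post): ?array
{
    $email = $post['email'] ?? '';
    $phone = $post['phone'] ?? '';
    if (!is_string($email) || !is_string($phone)) {
        return null;
    }
    $email = filter_var(trim($email), FILTER_VALIDATE_EMAIL);
    // Strip everything except digits and a leading "+".
    $phone = preg_replace('/(?!^\+)\D/', '', trim($phone));
    // The 7 to 16 character bound is an arbitrary assumption for this sketch.
    if ($email === false || strlen($phone) < 7 || strlen($phone) > 16) {
        return null;
    }
    return ['email' => $email, 'phone' => $phone];
}

// Hypothetical helper: the values are bound as parameters, never concatenated
// into the SQL string. Table and column names are assumptions.
function saveContact(PDO $pdo, array $contact): bool
{
    $stmt = $pdo->prepare('INSERT INTO contacts (email, phone) VALUES (:email, :phone)');
    return $stmt->execute([':email' => $contact['email'], ':phone' => $contact['phone']]);
}

// Usage, assuming $pdo is an existing PDO connection:
// $contact = validateContact($_POST);
// if ($contact !== null) { saveContact($pdo, $contact); }
```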

How to automatically save specific content of a website to my MySQL database using PHP

There are a few ways. I'd avoid file_get_contents() if I were you. Try cURL.

If you want a wrapper for cURL, check out the REST client of Spoon Library. You can make easy GET requests with it:
SpoonRESTClient::execute($url, $parameters)

fetching content from a webpage using curl

The algorithm is pretty straightforward:

  • download www.zedge.net/txts/4519 with curl
  • parse it with DOM (or alternative) for links
  • either store them all into text file/database or process them on the fly with "subrequest"

 

// Load main page
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, "http://www.zedge.net/txts/4519");
$contents = curl_exec($ch);

// Suppress warnings caused by malformed real-world HTML
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($contents);

// Filter all the links
$xPath = new DOMXPath($dom);
$items = $xPath->query('//a[@class="myLink"]');

foreach ($items as $link) {
    $url = $link->getAttribute('href');
    if (strncmp($url, 'http', 4) != 0) {
        // Prepend http:// or something
    }

    // Open sub request for the extracted link
    curl_setopt($ch, CURLOPT_URL, $url);
    $subContent = curl_exec($ch);
}

See the documentation and examples for DOMXPath::query. Note that DOMNodeList implements Traversable, and therefore you can iterate over it with foreach.

Tips:

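  • Use the cURL options CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE to keep cookies between requests
  • Use sleep(...) between requests to avoid flooding the server
  • Set the PHP time and memory limits (set_time_limit() and the memory_limit setting)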
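Taken together, the tips above might look something like the following sketch. The helper names, the limit values, the timeout, and the one-second delay are assumptions chosen for illustration and should be tuned for your own crawl.

```php
<?php
// Assumed values; adjust them for your own crawl.
set_time_limit(300);              // allow up to 5 minutes of execution
ini_set('memory_limit', '256M');  // raise the memory ceiling

// Hypothetical helper: one reusable handle that keeps cookies between requests.
function makeCrawlerHandle(string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are written here
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are read from here
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 15,
    ]);
    return $ch;
}

// Hypothetical helper: fetch each URL in turn, pausing between requests.
function crawl(array $urls, string $cookieFile, int $delaySeconds = 1): array
{
    $ch = makeCrawlerHandle($cookieFile);
    $pages = [];
    $first = true;
    foreach ($urls as $url) {
        if (!$first) {
            sleep($delaySeconds); // be polite to the remote server
        }
        $first = false;
        curl_setopt($ch, CURLOPT_URL, $url);
        $pages[$url] = curl_exec($ch); // false on failure
    }
    unset($ch); // release the handle once the crawl is done
    return $pages;
}

// Usage: $pages = crawl(['http://www.zedge.net/txts/4519'], '/tmp/cookies.txt');
```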

how to use CURL to request a php page on another server and then process the response

On server A

$post_fields = array(
'variable_name' => 'variable_value',
'variable' => $variable,
);
$ch = curl_init('http://www.serverB.com/example.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt($ch, CURLOPT_POST, 1);
$result = curl_exec($ch);

$result now contains the response body returned by server B (or false if the request failed).

On server B

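// Sketch: assumes $pdo is an existing PDO connection; table and column names are placeholders
$variable = $_POST['variable'] ?? '';
$variable_name = $_POST['variable_name'] ?? '';
$stmt = $pdo->prepare('SELECT * FROM `table` WHERE `variable` = ?');
$stmt->execute(array($variable));
// Send the matching rows back to server A as JSON
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));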

Is it safe?

This depends. Does the information coming from the DB need to be protected from public view? With the setup above, the information is simply echoed out to a page on server B. If someone were to find that page, they would be able to see the information.

If that does not matter, then it's perfectly safe and does not open any particular doors (you own both sites, right?).

If you need to protect against that then I suggest sending a token from server A to server B to authenticate that the correct script is attempting to access the information. Something like an API key, which you could pass as a header in your curl request and then get out and verify from $_SERVER on server B.
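A minimal sketch of that token check follows. The header name X-Api-Key and both helper names are assumptions for this example; any header name works as long as both servers agree on it and share the same secret.

```php
<?php
// Hypothetical helper for server B: compare the presented key with the expected one.
function isAuthorized(array $server, string $expectedKey): bool
{
    // PHP exposes a request header "X-Api-Key" as $_SERVER['HTTP_X_API_KEY'].
    $given = $server['HTTP_X_API_KEY'] ?? '';
    // hash_equals() compares in constant time, which avoids timing leaks.
    return is_string($given) && $given !== '' && hash_equals($expectedKey, $given);
}

// Hypothetical helper for server A: build the header list for CURLOPT_HTTPHEADER.
function apiKeyHeaders(string $key): array
{
    return ['X-Api-Key: ' . $key];
}

// Server A usage, assuming $secret holds the shared key:
// curl_setopt($ch, CURLOPT_HTTPHEADER, apiKeyHeaders($secret));
//
// Server B usage, before running any query:
// if (!isAuthorized($_SERVER, $secret)) { http_response_code(403); exit; }
```

Sending the key over HTTPS rather than plain HTTP keeps it from being read in transit.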

cURL specific data from a webpage

Using a combination of cURL and DOMDocument, you can (A) download the remote page and (B) parse the DOM document using selectors or XPath expressions.


