Getting title and meta tags from external website
One way to do it is to fetch the page with cURL and parse it with DOMDocument:
function file_get_contents_curl($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
    $data = curl_exec($ch); // false on failure
    curl_close($ch);
    return $data;
}

$html = file_get_contents_curl("http://example.com/");
if ($html === false || $html === '') {
    die('Could not fetch the page.');
}

// parsing begins here:
$doc = new DOMDocument();
@$doc->loadHTML($html); // @ suppresses warnings caused by malformed HTML

// get and display what you need:
$nodes = $doc->getElementsByTagName('title');
$title = $nodes->length > 0 ? $nodes->item(0)->nodeValue : '';

// default to empty strings so the echoes below never use undefined variables
$description = '';
$keywords = '';
$metas = $doc->getElementsByTagName('meta');
for ($i = 0; $i < $metas->length; $i++) {
    $meta = $metas->item($i);
    if ($meta->getAttribute('name') == 'description') {
        $description = $meta->getAttribute('content');
    }
    if ($meta->getAttribute('name') == 'keywords') {
        $keywords = $meta->getAttribute('content');
    }
}

echo "Title: $title" . '<br/><br/>';
echo "Description: $description" . '<br/><br/>';
echo "Keywords: $keywords";
Get meta description, title and image from URL like Facebook link sharing
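Why are you using a regular expression to parse the <meta> tags?
PHP has a built-in function for parsing meta information: get_meta_tags(). Note that it only reads <meta> tags that carry a name attribute, so tags declared only with a property attribute are not returned.
Illustration: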
<?php
$tags = get_meta_tags('http://www.stackoverflow.com/');
echo "<pre>";
print_r($tags);
OUTPUT:
Array
(
[twitter:card] => summary
[twitter:domain] => stackoverflow.com
[og:type] => website
[og:image] => http://cdn.sstatic.net/stackoverflow/img/apple-touch-icon@2.png?v=fde65a5a78c6
[og:title] => Stack Overflow
[og:description] => Q&A for professional and enthusiast programmers
[og:url] => http://stackoverflow.com/
)
As you can see, the title, image and description you want are all parsed.
Get meta description from external website
You can use the find method on the soup object to find tags with specific attributes. Here we need the meta tag whose name attribute equals og:description or description, or whose property attribute equals description.
# First get the meta description tag
description = (
    soup.find('meta', attrs={'name': 'og:description'})
    or soup.find('meta', attrs={'property': 'description'})
    or soup.find('meta', attrs={'name': 'description'})
)
# If a description meta tag was found, get its content attribute and save it to the db entry
if description:
    entry.description = description.get('content')
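The same fallback lookup can be sketched without BeautifulSoup. This is a minimal illustration, not the answer's own method. It swaps in the standard library's html.parser, and the names MetaCollector, extract_description and SAMPLE_HTML are hypothetical. It indexes each meta tag by its name attribute or, failing that, its property attribute, so it also matches property="og:description", the form Open Graph markup normally uses. It prefers og:description over the plain description.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Hypothetical helper: collects the <title> text and every <meta> tag."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attrs = dict(attrs)
            # Index the tag by name, or by property when name is absent.
            key = attrs.get("name") or attrs.get("property")
            content = attrs.get("content")
            if key and content is not None:
                # Keep the first occurrence of each key.
                self.meta.setdefault(key.lower(), content)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_description(html):
    """Return (title, description); description is None if no tag matched."""
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    # Assumed priority order: Open Graph first, plain description second.
    for key in ("og:description", "description"):
        if parser.meta.get(key):
            return parser.title.strip(), parser.meta[key]
    return parser.title.strip(), None


# Made-up sample page, used only to exercise the sketch.
SAMPLE_HTML = """<html><head><title> Example page </title>
<meta name="keywords" content="a, b">
<meta property="og:description" content="Open Graph text">
<meta name="description" content="Plain text">
</head><body></body></html>"""

title, description = extract_description(SAMPLE_HTML)
print(title, "|", description)
```

Returning None when nothing matches keeps the caller's check the same as in the snippet above: test the result before saving it to the db entry.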