How to get results from the Wikipedia API with PHP?
The problem you are running into here is the MediaWiki API's User-Agent policy: you must supply a User-Agent header, and that header must include some means of contacting you.
You can do this with file_get_contents() and a stream context:
$opts = array('http' =>
array(
'user_agent' => 'MyBot/1.0 (http://www.mysite.com/)'
)
);
$context = stream_context_create($opts);
$url = 'https://en.wikipedia.org/w/api.php?action=query&titles=Your_Highness&prop=revisions&rvprop=content&rvsection=0';
var_dump(file_get_contents($url, FALSE, $context));
Having said that, it might be considered more "standard" to use cURL, and this will certainly give you more control:
$url = 'https://en.wikipedia.org/w/api.php?action=query&titles=Your_Highness&prop=revisions&rvprop=content&rvsection=0';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_USERAGENT, 'MyBot/1.0 (http://www.mysite.com/)');
$result = curl_exec($ch);
if ($result === false) {
exit('cURL Error: '.curl_error($ch));
}
var_dump($result);
PHP: How to retrieve extract text from Wiki API
There are various possible solutions to this.
You could use reset() / current() on the pages property to get the first / current item in that array, or you could loop over that property with a foreach and ignore the keys. You could also use array_values() on the pages property to force sequential indices, or use array_keys() on it to get a list of the page ids and use those to access each item. (There are other ways.)
The foreach option is going to be your best bet:
foreach ($wiki_array['query']['pages'] as $page)
Inside the loop, $page will be the array that you're after.
You should then make sure you can deal with multiple results properly.
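A minimal sketch of the foreach approach, assuming the response was decoded with json_decode($json, true). The JSON below is a hand-written stand-in shaped like an action=query response, not real API output, and $extracts / $first are illustrative names.

```php
<?php
// Hand-written sample shaped like an action=query response (not real API output).
$json = '{"query":{"pages":{"12345":{"pageid":12345,"title":"Example","extract":"Sample text."}}}}';
$wiki_array = json_decode($json, true);

$extracts = [];
foreach ($wiki_array['query']['pages'] as $page) {
    // The key (the page id) is ignored; $page is the per-page array.
    // 'extract' may be absent (e.g. for a missing page), so fall back to null.
    $extracts[$page['title']] = $page['extract'] ?? null;
}

// Alternative when only one result is expected: take the first element.
// reset() needs a variable, so copy the pages array first.
$pages = $wiki_array['query']['pages'];
$first = reset($pages);

print_r($extracts);
```

Keying the result by title keeps the loop usable when several titles are requested at once.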
Extracting data from Wikipedia API
$pageid was returning an array with one element. If you only want to get the first one, you should do this:
$pageid = $data->query->pageids[0];
You were probably getting this warning:
Array to string conversion
Full code:
$url = 'https://en.wikipedia.org/w/api.php?action=query&prop=extracts|info&exintro&titles=google&format=json&explaintext&redirects&inprop=url&indexpageids';
$json = file_get_contents($url);
$data = json_decode($json);
$pageid = $data->query->pageids[0];
echo $data->query->pages->$pageid->title;
Getting Wikipedia API
Sorry to bother you, but you could do this (get() is not a PHP built-in; it stands for your own HTTP helper):
$ua = array();
$ua[] = 'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0';
$ua[] = 'content-type:application/json; charset=utf-8';
$data = json_decode(get("https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1&titles=$query",$ua));
foreach ($data->query->pages as $pid) {
echo 'Your pageid = ' . $pid->pageid . PHP_EOL;
echo 'Title = ' . $pid->title . PHP_EOL;
echo 'extract = ' . $pid->extract . PHP_EOL;
}
RESULT
Your pageid = 7529378
Title = Facebook
extract = Facebook (stylized as facebook) is an American online social media and social networking service based in Menlo Park, California, and a flagship service of the namesake company Facebook, Inc. It was founded by Mark Zuckerberg, along with fellow Harvard College students and roommates Eduardo Saverin, Andrew McCollum, Dustin Moskovitz, and Chris Hughes.
The founders of Facebook initially limited membership to Harvard students. Membership was expanded to Columbia, Stanford, and Yale before being expanded to the rest of the Ivy League, MIT, and higher education institutions in the Boston area, then various other universities, and lastly high school students. Since 2006, anyone who claims to be at least 13 years old has been allowed to become a registered user of Facebook, though this may vary depending on local laws. The name comes from the face book directories often given to American university students.
Facebook can be accessed from devices with Internet connectivity, such as personal computers, tablets and smartphones. After registering, users can create a profile revealing information about themselves. They can post text, photos and multimedia which is shared with any other users that have agreed to be their "friend", or, with a different privacy setting, with any reader. Users can also use various embedded apps, join common-interest groups, buy and sell items or services on Marketplace, and receive notifications of their Facebook friends' activities and activities of Facebook pages they follow. Facebook claimed that it had 2.74 billion monthly active users as of September 2020, and it was the most downloaded mobile app of the 2010s globally.Facebook has been the subject of numerous controversies, often involving user privacy (as with the Cambridge Analytica data scandal), political manipulation (as with the 2016 U.S. elections), mass surveillance, psychological effects such as addiction and low self-esteem, and content such as fake news, conspiracy theories, copyright infringement, and hate speech. Commentators have accused Facebook of willingly facilitating the spread of such content and also exaggerating its number of users in order to appeal to advertisers. As of January 21, 2021, Alexa Internet ranks Facebook seventh in global internet usage.
Note: I removed the ,true argument from json_decode() so the JSON object gets converted to a PHP object.
Or simply:
echo 'PageId = ' . array_keys((array)$data->query->pages)[0];
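The answer above calls get() without showing it. One possible shape for such a helper, sketched with cURL, is below. The name, signature and error handling are assumptions, not the original author's code.

```php
<?php
// Hypothetical helper: the original answer uses get($url, $headers) but does not define it.
function get(string $url, array $headers = []): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); // e.g. the $ua array built above
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects, if any

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_exec() returns false on failure; surface the cURL error message.
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }

    return $body;
}
```

With a helper like this, it is probably safer to pass the title as rawurlencode($query) when building the URL, since titles can contain spaces and non-ASCII characters.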
Get page id from title with wiki API (non-English)
You are making the request to the English Wikipedia instead of the Vietnamese one. Change the en to vi in your call and you will get results. See here:
https://vi.wikipedia.org/w/api.php?action=query&titles=Trung%20%C4%90%C3%B4ng&prop=iwlinks&format=json
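As a sketch, the language switch and the percent-encoding of the title can both be handled when the URL is built. wikiApiUrl() is a hypothetical helper name, and the code assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper: builds an API URL for a given Wikipedia language edition.
function wikiApiUrl(string $lang, array $params): string
{
    // http_build_query() percent-encodes the values; PHP_QUERY_RFC3986
    // encodes spaces as %20 rather than +.
    return 'https://' . $lang . '.wikipedia.org/w/api.php?'
        . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

$url = wikiApiUrl('vi', [
    'action' => 'query',
    'titles' => 'Trung Đông',
    'prop'   => 'iwlinks',
    'format' => 'json',
]);

echo $url, PHP_EOL;
```

This should reproduce the URL shown above, with vi as the subdomain.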
Is there a Wikipedia API just for retrieve the content summary?
There's a way to get the entire "introduction section" without any HTML parsing! Similar to AnthonyS's answer, but with an additional explaintext parameter, you can get the introduction section as plain text.
Query
Getting Stack Overflow's introduction in plain text:
Using the page title:
https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1&titles=Stack%20Overflow
Or using pageids:
https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1&pageids=21721040
JSON Response
(warnings stripped)
{
"query": {
"pages": {
"21721040": {
"pageid": 21721040,
"ns": 0,
"title": "Stack Overflow",
"extract": "Stack Overflow is a privately held website, the flagship site of the Stack Exchange Network, created in 2008 by Jeff Atwood and Joel Spolsky, as a more open alternative to earlier Q&A sites such as Experts Exchange. The name for the website was chosen by voting in April 2008 by readers of Coding Horror, Atwood's popular programming blog.\nIt features questions and answers on a wide range of topics in computer programming. The website serves as a platform for users to ask and answer questions, and, through membership and active participation, to vote questions and answers up or down and edit questions and answers in a fashion similar to a wiki or Digg. Users of Stack Overflow can earn reputation points and \"badges\"; for example, a person is awarded 10 reputation points for receiving an \"up\" vote on an answer given to a question, and can receive badges for their valued contributions, which represents a kind of gamification of the traditional Q&A site or forum. All user-generated content is licensed under a Creative Commons Attribute-ShareAlike license. Questions are closed in order to allow low quality questions to improve. Jeff Atwood stated in 2010 that duplicate questions are not seen as a problem but rather they constitute an advantage if such additional questions drive extra traffic to the site by multiplying relevant keyword hits in search engines.\nAs of April 2014, Stack Overflow has over 2,700,000 registered users and more than 7,100,000 questions. Based on the type of tags assigned to questions, the top eight most discussed topics on the site are: Java, JavaScript, C#, PHP, Android, jQuery, Python and HTML."
}
}
}
}
Documentation: API: query/prop=extracts
How to use Wikipedia API to search for values input by a user?
There are a few things to know about the Wikipedia API.
Consider the url that you have shared:
var url = "https://en.wikipedia.org/w/api.php?action=opensearch&search="+ searchTerm + "&format=json&callback=?";
There are two parts in the API URL.
- The API entry point: https://en.wikipedia.org/w/api.php - This is the URL to which you make all your API calls, i.e. it is the part common to all API calls.
- Parameters: The rest of the URL consists of parameters. In the parameters, you specify exactly what you want from the API call. Some of them are explained below:
The action parameter: there are many action values available in the Wikipedia API. action=query is used to get information about a Wikipedia article. Another common one is action=opensearch, which is used to search Wikipedia and is the one in the URL above. The MediaWiki API documentation covers the action parameter in more detail.
Each action may also have its own sub-parameters, for example the search parameter used in the URL above. It tells the API what term to search for.
The format parameter tells the API which format you want the result in. It is usually json, though php and xml are also supported but deprecated.
callback=? may have been added to your query to trigger a JSONP response and avoid violating the Same Origin Policy. More information on cross-site requests to the Wikipedia API is available in the MediaWiki API documentation.
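The URL in the question is built in JavaScript. To stay in line with the PHP examples above, here is a rough server-side sketch of the same opensearch call, where no callback parameter is needed. The JSON is a hand-written sample shaped like an opensearch response (search term, titles, descriptions, URLs), not real API output, and the variable names are illustrative.

```php
<?php
$searchTerm = 'php'; // stand-in for the value entered by the user

// Build the opensearch URL; http_build_query() percent-encodes the search term.
$url = 'https://en.wikipedia.org/w/api.php?' . http_build_query([
    'action' => 'opensearch',
    'search' => $searchTerm,
    'format' => 'json',
]);

// Hand-written sample shaped like an opensearch response:
// [search term, [titles], [descriptions], [urls]]. Not real API output.
$json = '["php",["PHP","PHP License"],["",""],'
      . '["https://en.wikipedia.org/wiki/PHP","https://en.wikipedia.org/wiki/PHP_License"]]';

[$term, $titles, $descriptions, $urls] = json_decode($json, true);

// Pair each title with its article URL.
$results = array_combine($titles, $urls);

foreach ($results as $title => $link) {
    echo $title, ' => ', $link, PHP_EOL;
}
```

In a real request, $json would come from fetching $url with a User-Agent header set, as described at the top of this page.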