How to Determine If a Url Object Returns '404 Not Found'

How to know if it's actually a 404 page?

You can check the HTTP status code, and see if it is 404 or not. The status code is on the first line of the response:

HTTP/1.1 404 Not Found
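
As a rough illustration of what that first line contains, the sketch below splits a status line into its three parts. The helper name parse_status_line is hypothetical and not part of any library.

```python
def parse_status_line(line):
    """Split an HTTP status line into (version, status_code, reason)."""
    # The reason phrase may contain spaces, so split at most twice.
    parts = line.strip().split(" ", 2)
    version = parts[0]
    status_code = int(parts[1])
    # The reason phrase is optional; fall back to an empty string.
    reason = parts[2] if len(parts) > 2 else ""
    return version, status_code, reason


version, status_code, reason = parse_status_line("HTTP/1.1 404 Not Found")
is_not_found = status_code == 404
```

In practice an HTTP library does this parsing for you; the point is only that the numeric code is the part to compare against 404.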

If you are using Python's httplib (http.client in Python 3), you can just read the status attribute of the HTTPResponse object.
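
A minimal sketch of reading that status with Python 3's http.client, assuming a plain HTTP (not HTTPS) server. The function name fetch_status and its parameters are illustrative choices, not something from the original answer.

```python
import http.client


def fetch_status(host, path="/", port=80, timeout=10):
    """Send a HEAD request and return (status, reason) from the response."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # HEAD asks for the headers only, which is enough to see the status.
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.reason
    finally:
        conn.close()
```

Calling fetch_status("www.example.com", "/notfound") would then return a tuple whose first element can be compared against 404. For HTTPS sites, http.client.HTTPSConnection plays the same role.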

However, it is the server that decides which HTTP status code to send. Just because 404 is defined to mean "page not found" does not mean the server cannot lie to you. It is quite common to do things like this:

  • Send 404 instead of 403, to hide the resource that requires authentication.
  • Send 404 instead of 500, to hide the fact something is not working.
  • Send 404 when your IP is blocked for some reason.

Without access to the server, it is impossible to know what is really going on behind the scenes.

JSON POST and GET 404 (Not Found)

You call edit_email without an id here:

button.addEventListener('click', () => edit_email());

Of course, after that call id is undefined, so you request /edit/undefined on this line:

fetch(`/edit/${id}`)

You never pass an id to the function. I imagine it should be something like this:

button.addEventListener('click', (event) => edit_email(event.target.value));

You will also need to set the button's value attribute to post.id, assuming each post object in your for loop has an id key.

If you are getting a reference error, check that every post in page_obj.object_list has an id key.

How to check if a URL exists or not - error 404? (using PHP)

If you have allow_url_fopen, you can do:

$exists = ($fp = fopen("http://www.faressoft.org/", "r")) !== FALSE;
if ($fp) fclose($fp);

Strictly speaking, though, fopen fails for reasons other than a 404 as well, so a false result does not prove the page was not found. It's possible to use stream contexts to get that information, but a better option is to use the curl extension:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/notfound");
curl_setopt($ch, CURLOPT_NOBODY, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_exec($ch);
$is404 = curl_getinfo($ch, CURLINFO_HTTP_CODE) == 404;
curl_close($ch);

Getting a 404 error on a page that was accessible

I tried it and didn't get any error, so I think the problem is your User-Agent header. Try it like this:

import requests
from bs4 import BeautifulSoup as bs

# Some sites reject the default requests User-Agent, so send a browser-like one.
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36'}

url = "https://finance.yahoo.com/quote/NXPI/options?p=NXPI&date=1629417600&guccounter=1"

response = requests.get(url, headers=headers)
print(response.status_code)  # check the status before parsing the body
soup = bs(response.text, 'html.parser')
print(soup)

HTTP Error 404: Not Found with an existing url

Some sites require a valid "User-Agent" header. In your urllib example, the first argument of urlopen can also be a Request object, so you can specify the headers in the Request object along with the URL, as below:

from urllib.request import Request, urlopen

index = 'MSFT'
url_is = 'https://finance.yahoo.com/quote/' + index + '/financials?p=' + index
req = Request(url_is, headers={'User-Agent': 'Mozilla/5.0'})
html = urlopen(req).read()
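
Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses rather than returning them. A hedged sketch of using that to test for a 404 follows; the function name is_404 and its default arguments are assumptions made for illustration.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def is_404(url, user_agent="Mozilla/5.0", timeout=10):
    """Return True only if the server answers the request with HTTP 404."""
    req = Request(url, headers={"User-Agent": user_agent})
    try:
        # A successful open means the server did not report an error status.
        with urlopen(req, timeout=timeout):
            return False
    except HTTPError as err:
        # Other error codes (403, 500, ...) are not a 404.
        return err.code == 404
```

Connection-level failures such as DNS errors raise urllib.error.URLError instead; this sketch deliberately lets those propagate so they are not mistaken for a missing page.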

