Why Do Browsers Still Inject <tbody> in HTML5

Why do browsers still inject tbody in HTML5?

The answer of "backwards compatibility" makes absolutely zero sense
because I specifically opted in for an HTML5 doctype.

However, browsers don't differentiate between versions of HTML. HTML documents with an HTML5 doctype and with an HTML4 doctype (with the small exception of the HTML4 transitional doctype without a URL in the FPI) are parsed and rendered the same way.

I'll quote the relevant part of the HTML5 parser description:

8.2.5.4.9 The "in table" insertion mode

...

A start tag whose tag name is one of: "td", "th", "tr"

Act as if a start tag token with the tag name "tbody" had been
seen, then reprocess the current token.
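
The quoted rule can be illustrated with a toy tree builder. This is only a minimal sketch on top of Python's standard html.parser module, not a real HTML5 parser: the class name ImpliedTbodyOutline and the helper outline are made up for this example, it implements nothing but the rule quoted above (a real parser would, for instance, also imply a tr around a bare td), and it assumes every element in the input has an explicit end tag.

```python
from html.parser import HTMLParser

# Start tags that trigger the quoted "in table" rule.
ROW_LEVEL_TAGS = {"td", "th", "tr"}


class ImpliedTbodyOutline(HTMLParser):
    """Toy tree builder (hypothetical name) that applies only the quoted rule."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # stack of currently open tag names
        self.lines = []          # indented outline of the resulting tree

    def _open(self, tag, implied=False):
        label = tag + (" (implied)" if implied else "")
        self.lines.append("  " * len(self.open_elements) + label)
        self.open_elements.append(tag)

    def handle_starttag(self, tag, attrs):
        # td/th/tr directly inside <table>: act as if <tbody> had been seen,
        # then process the current start tag.
        if tag in ROW_LEVEL_TAGS and self.open_elements[-1:] == ["table"]:
            self._open("tbody", implied=True)
        self._open(tag)

    def handle_endtag(self, tag):
        # Close everything up to and including the matching open element,
        # which also closes an implied tbody when </table> arrives.
        if tag in self.open_elements:
            while self.open_elements.pop() != tag:
                pass


def outline(markup):
    """Return the element outline of markup as a list of indented lines."""
    parser = ImpliedTbodyOutline()
    parser.feed(markup)
    parser.close()
    return parser.lines


if __name__ == "__main__":
    print("\n".join(outline("<table><tr><td>1</td></tr></table>")))
```

For the markup in the last line, the outline contains a tbody marked "(implied)" between table and tr, even though the source has no tbody tag and regardless of any doctype.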

Why do browsers insert tbody element into table elements?

http://htmlhelp.com/reference/html40/tables/tbody.html:

The TBODY element defines a group of data rows in a table. A TABLE must have one or more TBODY elements, which must follow the optional TFOOT. The TBODY end tag is always optional. The start tag is optional when the table contains only one TBODY and no THEAD or TFOOT.

So there always is a tbody there (albeit sometimes with both the start and end tags optional and omitted), and the tools you are using are correct in showing it to you.

thead or tfoot, on the other hand, are never present unless you explicitly include them, and if you do that, the tbody(s) must be explicit too.

Browser sometimes adds in tbody

When creating a table element with JavaScript, you need to insert the tbody element in order to match the structure of the table element as created by the HTML parser. The HTML markup need not have tbody tags, but the tbody element is there.

So you simply need to modify the JavaScript code so that after creating the table element, it creates a tbody element, makes it a child of the table, and later makes all tr elements children of the tbody, not the table.
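
The order of operations described above can be sketched as follows. To stay in the same language as the other code on this page, the sketch uses Python's standard xml.dom.minidom as a stand-in for the browser DOM; in a browser the corresponding calls would be document.createElement and appendChild. The function name build_table and the sample rows are assumptions made up for this illustration.

```python
from xml.dom.minidom import Document


def build_table(rows):
    """Build table > tbody > tr > td, mirroring what the HTML parser produces."""
    doc = Document()
    table = doc.createElement("table")
    doc.appendChild(table)

    # Create the tbody explicitly and make it a child of the table.
    tbody = doc.createElement("tbody")
    table.appendChild(tbody)

    for row in rows:
        # Rows become children of the tbody, not of the table.
        tr = doc.createElement("tr")
        tbody.appendChild(tr)
        for cell in row:
            td = doc.createElement("td")
            td.appendChild(doc.createTextNode(str(cell)))
            tr.appendChild(td)
    return table


if __name__ == "__main__":
    print(build_table([["a", "b"], ["c", "d"]]).toxml())
```

Built this way, the generated tree has the same shape as a parsed table, so code that looks for rows under tbody works on both.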

Can I ignore the tbody element in HTML 5?

The HTML5 spec states the content model of the table element clearly:

In this order: optionally a caption element, followed by zero or more
colgroup elements, followed optionally by a thead element, followed
optionally by a tfoot element, followed by either zero or more tbody
elements or one or more tr elements, followed optionally by a tfoot
element (but there can only be one tfoot element child in total).
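
The quoted ordering can be turned into a small checker. This is a rough sketch that encodes only the sentence quoted above, as quoted; it ignores everything else the spec allows inside a table (such as script-supporting elements), and the function name matches_quoted_content_model is hypothetical.

```python
import re

# The quoted order, with each child tag name followed by a single space.
TABLE_CHILDREN = re.compile(
    r"(caption )?(colgroup )*(thead )?(tfoot )?((tbody )*|(tr )+)(tfoot )?"
)


def matches_quoted_content_model(child_tags):
    """Check a list of direct child tag names of a table against the quoted order."""
    child_tags = list(child_tags)
    # "there can only be one tfoot element child in total"
    if child_tags.count("tfoot") > 1:
        return False
    sequence = "".join(tag + " " for tag in child_tags)
    return TABLE_CHILDREN.fullmatch(sequence) is not None


if __name__ == "__main__":
    print(matches_quoted_content_model(["tr", "tr"]))     # rows without tbody
    print(matches_quoted_content_model(["tbody", "tr"]))  # tbody mixed with bare tr
```

A list of bare tr children passes the check, which is why the tbody tags are optional in markup; mixing tbody with bare tr children does not.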

It's optional, but it is good practice to add it explicitly, as another answer also mentioned; I agree with zzzzBov.

Browser inspector adds tbody to table not present in raw HTML

@ggorlen suggested using a different parser, because the tbody that I saw in the browser's inspector is added by the browser itself and is not present in the raw HTML.
After switching to html5lib it works fine. This means the parser repairs the content from the website and automatically adds the missing elements. The recommendation is still to skip the tbody call anyway, even if you find a parser that injects it.

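import requests
from bs4 import BeautifulSoup  # the "html5lib" parser needs the html5lib package installed

web_content = requests.get('https://koniewyscigowe.pl/wyscig?w=14222-tor-partynice-nagroda-cheval-francais')
# html5lib builds the tree the way a browser does, so the implied tbody is present
soup = BeautifulSoup(web_content.text, "html5lib")
for index, table in enumerate(soup.find_all('div', {'class': 'table-responsive'})):
    if index == 0:
        # the first table-responsive block is skipped
        pass
    elif index == 1:
        # table.tbody only works because html5lib inserted the tbody
        for starts_stats in table.tbody.find_all('tr'):
            print('HERE WE ARE')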

