Why do browsers insert tbody element into table elements?
http://htmlhelp.com/reference/html40/tables/tbody.html:
The TBODY element defines a group of data rows in a table. A TABLE must have one or more TBODY elements, which must follow the optional TFOOT. The TBODY end tag is always optional. The start tag is optional when the table contains only one TBODY and no THEAD or TFOOT.
So there always is a tbody there (albeit sometimes with both the start and end tags optional and omitted), and the tools you are using are correct in showing it to you.
thead or tfoot, on the other hand, are never present unless you explicitly include them, and if you do that, the tbody(s) must be explicit too.
Tbody tag in xpath produced by fire bug
In order to take into account and avoid this problem, use XPath expressions of the following kind:
/locStep1/locStep2/.../table/YourSubExpression
|
/locStep1/locStep2/.../table/tbody/YourSubExpression
If the table doesn't have a tbody child, then the second argument of the union operator (|) selects no nodes and the first argument of the union selects the wanted nodes.
Alternatively, if the table has a tbody child, then the first argument of the union operator selects no nodes and the second argument of the union selects the wanted nodes.
The end result: in both cases the wanted nodes are selected.
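As a minimal sketch of the same either/or logic in Python: the standard library's xml.etree.ElementTree supports only a subset of XPath and has no union operator, so the two branches are run separately and concatenated instead. The sample markup and the helper names select_rows and first_cells are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: one table without and one with an explicit tbody.
WITHOUT_TBODY = "<table><tr><td>a</td></tr><tr><td>b</td></tr></table>"
WITH_TBODY = "<table><tbody><tr><td>a</td></tr><tr><td>b</td></tr></tbody></table>"


def select_rows(table):
    """Emulate `table/tr | table/tbody/tr` for a single table element."""
    # For a table that either has or lacks a tbody, only one branch matches,
    # so concatenating both results yields the wanted rows in both cases.
    return table.findall("tr") + table.findall("tbody/tr")


def first_cells(markup):
    """Return the text of the first cell of every selected row."""
    table = ET.fromstring(markup)
    return [row.find("td").text for row in select_rows(table)]
```

Calling first_cells on either sample string gives the same list of cell texts, which is the point of the union.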
Why does my XPath query (scraping HTML tables) only work in Firebug, but not the application I'm developing?
The Problem: DOM Requires <tbody/> Tags
Firebug, Chrome's Developer Tool, XPath functions in JavaScript and others work on the DOM, not the basic HTML source code.
The DOM for HTML requires that all table rows not contained in a table header or footer (<thead/>, <tfoot/>) are included in table body tags <tbody/>. Thus, browsers add this tag if it's missing while parsing (X)HTML. For example, Microsoft's DOM documentation says:
The tbody element is exposed for all tables, even if the table does not explicitly define a tbody element.
There is an in-depth explanation in another answer on Stack Overflow.
On the other hand, HTML does not necessarily require that tag to be used:
The TBODY start tag is always required except when the table contains only one table body and no table head or foot sections.
Most XPath Processors Work on raw XML
Excluding JavaScript, most XPath processors work on raw XML, not the DOM, and thus do not add <tbody/> tags. Also, HTML parser libraries like tag-soup and htmltidy only output XHTML, not "DOM-HTML".
This is a common problem posted on Stackoverflow for PHP, Ruby, Python, Java, C#, Google Docs (Spreadsheets) and lots of others. Selenium runs inside the browser and works on the DOM -- so it is not affected!
Reproducing the Issue
Compare the source shown by Firebug (or Chrome's Dev Tools) with the one you get by right-clicking and selecting "Show Page Source" (or whatever it's called in your browser) -- or by using curl http://your.example.org on the command line. The latter will probably not contain any <tbody/> elements (they're rarely used), while Firebug will always show them.
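The difference can also be checked programmatically. The sketch below lists the start tags that Python's standard html.parser sees in raw source; unlike a browser, it only reports what is literally in the markup. The RAW_SOURCE string is a made-up stand-in for the downloaded page, and TagCollector and start_tags are names invented for this example.

```python
from html.parser import HTMLParser

# Hypothetical raw page source, as curl or "Show Page Source" would return it.
RAW_SOURCE = '<table id="example"><tr><td>cell</td></tr></table>'


class TagCollector(HTMLParser):
    """Record every start tag that literally appears in the source."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(source):
    collector = TagCollector()
    collector.feed(source)
    collector.close()
    return collector.tags


tags = start_tags(RAW_SOURCE)
print(tags)
print("tbody in raw source:", "tbody" in tags)
```

If "tbody" is absent here but visible in the browser's inspector, the element was added during DOM construction and a /tbody step will not match the raw source.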
Solution 1: Remove /tbody Axis Step
Check whether the table you're stuck at really does not contain a <tbody/> element (see the previous paragraph). If it does, you've probably got another kind of problem.
Now remove the /tbody axis step, so your query will look like
//table[@id="example"]/tr[2]/td[1]
Solution 2: Skip <tbody/> Tags
This is a rather dirty solution and likely to fail for nested tables (it can jump into inner tables). I would recommend doing this only in very rare cases.
Replace the /tbody axis step by a descendant-or-self step:
//table[@id="example"]//tr[2]/td[1]
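To see why the descendant step is risky, here is a small sketch with Python's xml.etree.ElementTree. The nested markup is invented for the example; the point is only that the descendant search also returns rows of the inner table.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed markup: an outer table whose second row
# contains a nested table.
NESTED = """
<html><body>
<table id="example">
  <tr><td>outer 1</td></tr>
  <tr><td><table><tr><td>inner</td></tr></table></td></tr>
</table>
</body></html>
""".strip()

root = ET.fromstring(NESTED)
table = root.find('.//table[@id="example"]')

direct_rows = table.findall("tr")   # child rows of the outer table only
all_rows = table.findall(".//tr")   # rows at any depth, inner table included

print(len(direct_rows), len(all_rows))
```

The second count is larger than the first, so a positional predicate such as tr[2] after a descendant step can end up selecting a row the query author never intended.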
Solution 3: Allow Both Input With and Without <tbody/> Tags
If you're not sure in advance whether your table contains a <tbody/> element, or you use the query in both "HTML source" and DOM contexts, and you don't want to or cannot use the hack from solution 2, provide an alternative query (for XPath 1.0) or use an "optional" axis step (XPath 2.0 and higher).
- XPath 1.0:
//table[@id="example"]/tr[2]/td[1] | //table[@id="example"]/tbody/tr[2]/td[1]
- XPath 2.0:
//table[@id="example"]/(tbody, .)/tr[2]/td[1]
Why do browsers still inject tbody in HTML5?
The answer of "backwards compatibility" makes absolutely zero sense because I specifically opted in for a HTML5 doctype.
However, browsers don't differentiate between versions of HTML. HTML documents with HTML5 doctype and with HTML4 doctype (with the small exception of HTML4 transitional doctype without URL in FPI) are parsed and rendered the same way.
I'll quote the relevant part of HTML5 parser description:
8.2.5.4.9 The "in table" insertion mode
...
A start tag whose tag name is one of: "td", "th", "tr"
Act as if a start tag token with the tag name "tbody" had been
seen, then reprocess the current token.
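To make the quoted rule concrete, here is a toy model in Python. It is not a real HTML5 tree builder: it implements only this one rule, drops attributes, and ignores everything else the parsing algorithm does. The class name ImpliedTbody and the helper normalize are made up for this sketch.

```python
from html.parser import HTMLParser


class ImpliedTbody(HTMLParser):
    """Toy model of a single tree-construction rule (assumption-laden sketch)."""

    def __init__(self):
        super().__init__()
        self.stack = []  # names of currently open elements
        self.out = []    # serialized result with implied tags made explicit

    def _open(self, tag):
        self.stack.append(tag)
        self.out.append(f"<{tag}>")

    def handle_starttag(self, tag, attrs):
        # The rule: a td/th/tr start tag seen directly inside a table
        # behaves as if a tbody start tag had been seen first.
        if tag in ("td", "th", "tr") and self.stack and self.stack[-1] == "table":
            self._open("tbody")
        self._open(tag)  # attributes are dropped in this toy model

    def handle_endtag(self, tag):
        # Close open elements up to and including `tag`; this also closes
        # an implied tbody when </table> arrives.
        if tag in self.stack:
            while self.stack:
                top = self.stack.pop()
                self.out.append(f"</{top}>")
                if top == tag:
                    break

    def handle_data(self, data):
        self.out.append(data)


def normalize(markup):
    parser = ImpliedTbody()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Feeding it a table without tbody yields markup with an explicit tbody around the rows, while a table that already has one passes through unchanged. The doctype plays no role in either case.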
Browser sometimes adds in tbody
When creating a table element with JavaScript, you need to insert the tbody element in order to match the structure of the table element as created by the HTML parser. The HTML markup need not have tbody tags, but the tbody element is there.
So you simply need to modify the JavaScript code so that after creating the table element, it creates a tbody element, makes it a child of the table, and later makes all tr elements children of the tbody, not the table.
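The order of operations can be sketched with Python's xml.dom.minidom, whose createElement and appendChild methods carry the same names as the browser DOM methods you would call from JavaScript. This is only an illustration of the structure, not browser code; the build_table helper and the sample data are invented for the example.

```python
from xml.dom.minidom import getDOMImplementation


def build_table(rows):
    """Build table > tbody > tr > td, mirroring what the HTML parser produces."""
    doc = getDOMImplementation().createDocument(None, "table", None)
    table = doc.documentElement

    # Create the tbody first and make it a child of the table ...
    tbody = doc.createElement("tbody")
    table.appendChild(tbody)

    for cells in rows:
        tr = doc.createElement("tr")
        for text in cells:
            td = doc.createElement("td")
            td.appendChild(doc.createTextNode(text))
            tr.appendChild(td)
        # ... then attach every row to the tbody, not to the table.
        tbody.appendChild(tr)
    return table


table = build_table([["a", "b"]])
print(table.toxml())
```

In the browser, the equivalent calls would go through document.createElement and appendChild in the same order.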
Trying to minimize the table height (TABLE, TBODY and offset)
If I understood your question correctly, it looks like you want to do this on all your table elements:
padding:0px;
border-collapse:collapse;
http://www.w3schools.com/Css/pr_tab_border-collapse.asp
In general using a good reset.css should help you.
Colgroup tag in code but missing from view source or debugger tool
The following question Table caption does not show when it is runat=server explains that
A complex table model is not supported. You cannot have an HtmlTable control with nested caption, col, colgroup, tbody, thead, or tfoot elements. These elements are removed without warning and do not appear in the output HTML. MSDN
When I create the following HTML
<table border="1">
<colgroup>
<col span="2" style="background-color:orange"></col>
</colgroup>
<tr>
<td>column 1</td>
<td>column 2</td>
<td>column 3</td>
</tr>
</table>
The colgroup/col tags are still there.
My best guess is that since your table tag is marked with runat="server", the server-side parser must be removing them. You can prove that by looking at the actual HTML source that is sent to the client, that is, use "view source" instead of looking at the generated DOM.
One of the reasons I can't stand this mix of server and client code writing my HTML for me....
Creating tables with jQuery without tbody
Why would you try to remove tbody? Your browser is adding the part of valid HTML that you are missing. Isn't that nice?
jquery says tbody.length = 1 even though no tbody tag is present
The tbody is added to the table automatically. You can see that it is there by right-clicking the table and inspecting the element (Chrome Developer Tools):
<table id="test">
<tbody>
<tr>
<td>some table cell</td>
</tr>
</tbody>
</table>