What's The Key Difference Between HTML 4 and HTML 5

What are the key differences between HTML 4 and HTML 5? How do I know which version of HTML to use?

The easiest way to check whether a website uses HTML5 is to look at its doctype. HTML5 has a very simple doctype: <!DOCTYPE html>. If the page's doctype does not mention HTML 4.01 and is just <!DOCTYPE html>, the page is declared as HTML5.

Please use this link Hemdip provided to get the key differences between HTML4 and HTML 5.

HTML5 is the new standard. It introduces many new features, such as the <canvas>, <video>, <audio>, and <track> elements, and updates many existing elements (more on that here). Hence you should definitely try to use HTML 5 wherever possible.

You can follow this link to get a complete tabular comparison of the two. They are compared tag by tag, showing which tags are new in HTML5 and which have been removed from HTML5.

What's the key difference between HTML 4 and HTML 5?

HTML5 has several goals which differentiate it from HTML4.

Consistency in Handling Malformed Documents

The primary one is consistent, defined error handling. As you know, HTML purposely supports 'tag soup', or the ability to write malformed code and have it corrected into a valid document. The problem is that the rules for doing this aren't written down anywhere. When a new browser vendor wants to enter the market, they just have to test malformed documents in various browsers (especially IE) and reverse-engineer their error handling. If they don't, then many pages won't display correctly (estimates place roughly 90% of pages on the net as being at least somewhat malformed).

So, HTML5 is attempting to discover and codify this error handling, so that browser developers can all standardize and greatly reduce the time and money required to display things consistently. As well, long in the future after HTML has died as a document format, historians may still want to read our documents, and having a completely defined parsing algorithm will greatly aid this.

Better Web Application Features

The secondary goal of HTML5 is to develop the browser's ability to be an application platform, via HTML, CSS, and JavaScript. Many elements have been added directly to the language that in HTML4 require Flash or JS-based hacks, such as <canvas>, <video>, and <audio>. Other useful additions include Local Storage (a JS-accessible key-value database built into the browser, for storing information beyond what cookies can hold), new input types such as date, for which the browser can expose an easy user interface (so that we don't have to use our JS-based calendar date-pickers), and browser-supported form validation. These will make developing web applications much simpler for developers and make the applications much faster for users, since many things will be supported natively rather than hacked in via JavaScript.

Improved Element Semantics

There are many other smaller efforts taking place in HTML5, such as better-defined semantic roles for existing elements (<strong> and <em> now actually mean something different, and even <b> and <i> have vague semantics that should work well when parsing legacy documents) and adding new elements with useful semantics - <article>, <section>, <header>, <aside>, and <nav> should replace the majority of <div>s used on a web page, making your pages a bit more semantic, but more importantly, easier to read. No more painful scanning to see just what that random </div> is closing - instead you'll have an obvious </header>, or </article>, making the structure of your document much more intuitive.

HTML 4 vs HTML 5

If you're doing line-of-business (LOB) stuff and nothing too fancy graphics-wise, then probably the biggest change would simply be using the HTML5 doctype declaration:

<!DOCTYPE html>

Even on browsers that don't support HTML5 directly (e.g. IE7) this is interpreted as a valid DOCTYPE and the browser stays in "standards" mode. So as a starting point, that's probably the simplest you can do.

Then you can start looking at some of the additional attributes and so on that HTML5 brings to the table. Support for HTML5 forms is quite lacking at the moment (mostly it's just Chrome/Safari/WebKit and Opera that support most of them), but it doesn't hurt to add them (they're backwards compatible).

How does a Web browser differentiate between HTML5 and HTML4?

According to w3schools, the following DOCTYPE defines a document as HTML5:

<!DOCTYPE html>

And the three types of HTML4 are defined by the following DOCTYPE declarations:

HTML 4.01 Strict

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

HTML 4.01 Transitional

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

HTML 4.01 Frameset

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">  
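As a rough illustration, the declarations above can be told apart with a few lines of Python. This is only a sketch: classify_doctype and its labels are hypothetical names chosen for this example, it recognises only the four declarations listed here, and it is not how a browser inspects a doctype.

```python
import re

# Public identifiers taken from the HTML 4.01 declarations listed above.
HTML4_PUBLIC_IDS = {
    "-//W3C//DTD HTML 4.01//EN": "HTML 4.01 Strict",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "HTML 4.01 Transitional",
    "-//W3C//DTD HTML 4.01 Frameset//EN": "HTML 4.01 Frameset",
}

# Group 1: optional public identifier. Group 2: whatever else precedes ">".
DOCTYPE_RE = re.compile(
    r'\s*<!DOCTYPE\s+html\b(?:\s+PUBLIC\s+"([^"]*)")?([^>]*)>',
    re.IGNORECASE,
)


def classify_doctype(document):
    """Return a label for the doctype at the top of `document` (hypothetical helper)."""
    match = DOCTYPE_RE.match(document)
    if match is None:
        return "no doctype"
    public_id, rest = match.group(1), match.group(2)
    if public_id is None:
        # Only the bare <!DOCTYPE html> form is reported as HTML5 here.
        return "HTML5" if rest.strip() == "" else "other"
    return HTML4_PUBLIC_IDS.get(public_id, "other")
```

Anything the helper does not recognise, such as an XHTML doctype, is simply reported as "other".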

Is it considered okay (standards-wise) to mix HTML 4 and HTML 5?

The elements you specified are both HTML 5, hence you aren't mixing anything.

(If you give a better example of perhaps what issue you're facing, or which elements you are thinking of, perhaps we can elaborate.)

Using HTML 4 elements is safe in HTML 5, since HTML 5 mostly adds new elements and drops only a few deprecated ones.

How will the key features of HTML5 work, in contrast to HTML4?

Read: HTML 5 differences from HTML 4

HTML 4, HTML 5, XHTML, MIME types - the definitive resource

Contents.

  • Terminology
  • Languages and Serializations
  • Specifications
  • Browser Parsers and Content (MIME) Types
  • Browser Support
  • Validators and Document Type Definitions
  • Quirks, Limited Quirks, and Standards modes.

Terminology

One of the difficulties in describing this clearly is that the terminology within the official specifications has changed over the years since HTML was first introduced. What follows is based on HTML5 terminology. Also, "file" is used as a generic term to mean a file, document, input stream, octet stream, etc., to avoid having to make fine distinctions.

Languages and Serializations

HTML and XHTML are defined in terms of a language and a serialization.

The language defines the vocabulary of the elements and attributes, and their content model, i.e. which elements are permitted inside which other elements, which attributes are allowed on which element, along with the purpose and meaning of each element and attribute.

The serialization defines how mark-up is used to describe these elements and attributes within a text document. This includes which tags are required and which can be inferred, and the rules for those inferences. It describes such things as how void elements should be marked up (e.g. “>” vs “/>”) and when attribute values need to be quoted.

Specifications

The HTML 4.01 specification is the current specification that defines both the HTML language and the HTML serialization.

The XML 1.0 specification defines a serialization but leaves the language to be defined by other specifications, which are termed “XML applications”.

The XHTML 1.0 and 1.1 specifications are both in use. Essentially, they use the same language as HTML 4.01 but a different serialization, one that is compatible with the XML 1.0 specification. That is, XHTML is an XML application.

The HTML5 specification (a draft as of 2010-04-18) describes a new language for both HTML and XHTML. This language is mostly a superset of the HTML 4.01 language, but where differences arise it is intended to be backward compatible only with existing web tools (e.g. browsers, search engines and authoring tools), not with previous specifications. So the meanings of some elements have occasionally changed from the earlier specifications. Similarly, each of the serializations is backward compatible with the current tools.

Browser Parsers and Content (MIME) Types

When a text file is sent to a browser, the browser parses it into an internal memory structure (object model). To do so, it uses a parser that follows either the HTML serialization rules or the XML serialization rules. Which parser it uses depends on what it deduces the content type to be; for non-local files, this is based on the “content-type” HTTP header. Once the file has been parsed, the browser treats the object model in almost the same way, regardless of whether it was originally supplied using an HTML or XHTML serialization.

For a browser to use its XHTML parser, the content type HTTP header must be one of the XML content types. Most commonly, this is either application/xml or application/xhtml+xml. Any non-XML content type will mean that the file, regardless of whether it meets all the XHTML language and serialization rules or not, will not be processed by the browser as XHTML.

Using an HTTP content type of text/html (or, in most fallback scenarios, a missing content type or any other non-XML type) will cause the browser to use its HTML serialization parser.
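The selection rule described above can be sketched in Python. The function name choose_parser and the XML_CONTENT_TYPES set are assumptions made for this illustration. Treating text/xml and any type ending in "+xml" as XML is an added simplification, and real browsers also apply content sniffing and handle non-document types differently, none of which is modelled here.

```python
# XML content types named in the text, plus text/xml (an assumption of this sketch).
XML_CONTENT_TYPES = {"application/xml", "application/xhtml+xml", "text/xml"}


def choose_parser(content_type):
    """Return "xml" or "html" for a Content-Type header value (hypothetical helper)."""
    if not content_type:
        # Missing header: fall back to the HTML parser, as described in the text.
        return "html"
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_CONTENT_TYPES or media_type.endswith("+xml"):
        return "xml"
    return "html"
```

Note that the decision depends only on the header, never on the markup itself: a file written entirely to XHTML rules but served as text/html still goes to the HTML parser.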

One key difference between the two parsers is that the HTML serialization parser performs error recovery. If the input file to the parser does not meet the HTML serialization rules, the parser will recover in ways reverse engineered from previous browsers and carry on building its object model until it reaches the end of the file. HTML5 contains the first normative definition of the recovery but no mainstream browser has shipped an implementation of the algorithm enabled in a release version as of 2010-04-26.

In contrast, the XML serialization parser will stop when it encounters anything that it cannot interpret as XML (i.e. when it discovers that the file is not XML well-formed). This is required of parsers by the XML 1.0 specification.
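The contrast can be observed with Python's standard library, used here as a stand-in for browser parsers. This is a sketch under stated assumptions: html.parser is a lenient tokenizer and does not implement the HTML5 recovery algorithm, so it only shows that malformed input is tolerated rather than how a browser would repair it. TagCollector, parse_as_html and parse_as_xml are names invented for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Records start and end tags in the order the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def parse_as_html(markup):
    """Feed markup to the lenient HTML parser and return the tags it saw."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.events


def parse_as_xml(markup):
    """Feed markup to the strict XML parser, which stops at the first error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as error:
        return f"rejected: {error}"
    return "accepted"


# Neither <p> nor <b> is ever closed.
tag_soup = "<p>Unclosed <b>bold<p>Next paragraph"
```

Calling parse_as_html(tag_soup) runs to the end of the input and reports all three start tags, while parse_as_xml(tag_soup) returns a "rejected" message because the input is not well-formed.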

Browser Support

Most modern browsers contain support for both an HTML parser and an XML parser. However, in Microsoft Internet Explorer versions 8.0 and earlier, the XML parser cannot directly create an object model for rendering as an HTML page. The XML structure can, however, be processed with an XSLT file to create a stream, which in turn can be parsed using the HTML parser to create an object model that can be rendered.

Starting with Internet Explorer 9 Platform Preview, XHTML supplied using an XML content type can be parsed directly in the same way as the other modern browsers.

When their XML parsers detect that an input file is not XML well-formed, some browsers display an error message, others show the page as constructed up to the point where the error was detected, and some offer the user the opportunity to have the file re-parsed using their HTML parser.

Validators and Document Type Definitions

HTML and XHTML files can begin with a document type declaration that references a Document Type Definition (DTD) and so indicates the language and serialization being used in the document. Validators, such as the one at http://validator.w3.org/, use this information to match the language and serialization used within the file against the rules defined in the DTD. They then report errors wherever the rules in the DTD are violated by markup in the file.

Not all HTML serialization and language rules can be described in a DTD, so validators only test for a subset of all the rules described by the specifications.

HTML 4.01 and XHTML 1.0 define Strict, Transitional, and Frameset DTDs which differ in the language elements and attributes that are permitted in compliant files.

Validators based on HTML5, such as validator.nu, behave more like browsers: they process the page according to the HTTP content type and use a non-DTD-based rule set, so they catch errors that cannot be described by DTDs.

Quirks, Limited Quirks, and Standards modes.

Browsers do not validate the files sent to them. Nor do they use any DTD declaration to determine the language or serialization of the file. However, they do use it to guess the era in which the page was created, and therefore the likely parsing and rendering behaviour the author would have expected of a browser at that time. Accordingly, they define three parsing and rendering modes, known as Quirks mode, Limited Quirks (or Almost Standards) mode and Standards mode.

Any file served using an XML content type is always processed in standards mode. For files parsed using the HTML parser, if there is no DTD provided or the DTD is determined to be very old, browsers use their quirks mode. Broadly speaking, HTML 4.01 and XHTML files processed as text/html will be processed with limited quirks mode if they contain a transitional DTD and with standards mode if using a strict DTD.

Where the DTD is not recognised, the mode is determined by a complex set of rules. One special case is where the public and system identifiers are omitted and the declaration is simply <!DOCTYPE html>. This is known to be the shortest doctype declaration for which current browsers will process the file in standards mode. For that reason, it is the declaration specified for HTML5-compliant files.
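The broad rules in this section can be written down as a small Python sketch. It is deliberately simplified and is not the algorithm browsers implement: it assumes the doctype is passed in as a string (or None when absent), it only knows the strict and transitional HTML 4.01 and XHTML 1.0 public identifiers, and it returns "undetermined" for everything that falls under the "complex set of rules". The real rules have more cases, for example whether a system identifier is present. The names rendering_mode and MODE_BY_PUBLIC_ID are hypothetical.

```python
import re

XML_CONTENT_TYPES = {"application/xml", "application/xhtml+xml"}

# Only the cases the text describes "broadly speaking".
MODE_BY_PUBLIC_ID = {
    "-//W3C//DTD HTML 4.01//EN": "standards",
    "-//W3C//DTD XHTML 1.0 Strict//EN": "standards",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "limited quirks",
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "limited quirks",
}

PUBLIC_ID_RE = re.compile(r'PUBLIC\s+"([^"]*)"', re.IGNORECASE)


def rendering_mode(content_type, doctype):
    """Guess the rendering mode from the content type and doctype (simplified)."""
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if media_type in XML_CONTENT_TYPES:
        return "standards"  # XML content types are always standards mode
    if not doctype:
        return "quirks"  # no doctype at all
    if re.fullmatch(r"<!DOCTYPE\s+html\s*>", doctype.strip(), re.IGNORECASE):
        return "standards"  # the short HTML5 declaration
    match = PUBLIC_ID_RE.search(doctype)
    if match and match.group(1) in MODE_BY_PUBLIC_ID:
        return MODE_BY_PUBLIC_ID[match.group(1)]
    return "undetermined"  # would need the full rule set
```

For instance, rendering_mode("text/html", "<!DOCTYPE html>") gives "standards", while the same content type with no doctype gives "quirks".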

What is the difference between these HTML5 form elements?

Briefly, some terminology: Confusingly, "HTML" now means two things:

  • The definition of the various kinds of elements that make up what we use in web pages and such. This is what tells us that there is an element called div and what it's for.
  • One of the two serializations of it (the written form), which tells us we write div elements like this: <div>content</div>.

The other serialization of HTML is XHTML. The two serializations differ in places, because XHTML is XML.

HTML defines some elements that never have content, like <br>, and in the HTML serialization they're usually written just like that, <br>. In the XHTML serialization that's a problem, because XML requires that all tags be closed and <br> is just a start tag. Putting the slash ("solidus") just before the ending > closes the tag, so in XHTML, <br> becomes <br/>. The / is tolerated in the HTML serialization, but it serves no purpose. It only serves a purpose in XHTML. (Note that in really, really old browsers, you may need a space before the solidus, e.g. <br />, but we're talking very old indeed.)

This is only true for void elements like <br> and <input> that never have any content, and foreign elements (MathML and SVG). You never write <div/>, for instance, even if the div is going to be empty. The correct form of an empty div is always <div></div> (whether in the HTML or XHTML serialization).
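To make the rule concrete, here is a minimal Python sketch of how an element with no content would be written in each serialization. The helper name empty_element is hypothetical, VOID_ELEMENTS is only a subset of the list in the specification, and foreign elements (MathML and SVG) are not handled.

```python
# A subset of the void elements; the specification has the full list.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


def empty_element(tag, xhtml=False):
    """Write an element with no content in the HTML or XHTML serialization."""
    if tag in VOID_ELEMENTS:
        # Void elements: a bare start tag in HTML, a self-closing tag in XHTML.
        return f"<{tag}/>" if xhtml else f"<{tag}>"
    # Every other element needs an explicit end tag in both serializations.
    return f"<{tag}></{tag}>"
```

With this helper, empty_element("br") yields <br>, empty_element("br", xhtml=True) yields <br/>, and empty_element("div") yields <div></div> in either serialization.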

Full detail in the specification, and in particular §8.1.2.1.

So regarding your two specific examples: The first is only valid in the HTML serialization. The second is also valid in the HTML serialization, and would be valid in the XHTML serialization if the autofocus attribute had a value (in XML, attributes must have a value, so you have to write autofocus="autofocus").
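The point about attribute values can be checked with Python's XML parser. The fragments below are made up for illustration and are not the two examples from the question; is_well_formed_xml is a hypothetical helper, and passing this check only means a fragment is well-formed XML, not that it is valid XHTML.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(fragment):
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


# HTML-style: no end tag and no attribute value. Rejected by an XML parser.
html_style = '<input type="text" autofocus>'

# Closed with a solidus, but the attribute still has no value. Also rejected.
minimized = '<input type="text" autofocus/>'

# Closed and with an explicit attribute value. Accepted.
xhtml_style = '<input type="text" autofocus="autofocus"/>'
```

Only xhtml_style passes is_well_formed_xml; the other two fail because XML requires every element to be closed and every attribute to have a value.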


