What Is CDATA in HTML

What is CDATA in HTML?

All text in an XML document will be parsed by the parser.

But text inside a CDATA section is not parsed as markup; the parser passes it through as literal character data.

CDATA - (Unparsed) Character Data

The term CDATA refers to text data that should not be parsed as markup by the XML parser.

Characters like "<" and "&" are illegal as literal text in XML element content.

"<" will generate an error because the parser interprets it as the start of a new element.

"&" will generate an error because the parser interprets it as the start of a character entity reference.

Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid errors, script code can be defined as CDATA.

Everything inside a CDATA section is treated by the parser as literal text, not as markup.

A CDATA section starts with "<![CDATA[" and ends with "]]>".
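The behaviour described above can be checked with any XML parser. Here is a minimal sketch using Python's standard xml.etree.ElementTree; the script body and the helper name parses are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical script body containing both problem characters.
SCRIPT = "if (a < b && c) { run(); }"


def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


bare = "<script>" + SCRIPT + "</script>"
wrapped = "<script><![CDATA[" + SCRIPT + "]]></script>"

# The bare "<" and "&" make the first document malformed;
# the same text inside a CDATA section is accepted.
print(parses(bare))
print(parses(wrapped))

# The parser hands the CDATA content back as literal text.
wrapped_text = ET.fromstring(wrapped).text
print(wrapped_text == SCRIPT)
```

The CDATA markers themselves do not appear in the parsed text; only the characters between them do.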

Use of CDATA in program output

CDATA sections in XHTML documents are liable to be parsed differently by web browsers if they render the document as HTML, since HTML parsers do not recognise the CDATA start and end markers, nor do they recognise HTML entity references such as &lt; within <script> tags. This can cause rendering problems in web browsers and can lead to cross-site scripting vulnerabilities if used to display data from untrusted sources, since the two kinds of parsers will disagree on where the CDATA section ends.
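To see the disagreement between the two kinds of parsers, here is a rough sketch that feeds the same markup to Python's html.parser and to xml.etree.ElementTree. Python's HTML parser is only a stand-in for a browser's HTML parser, and the class name ScriptText and the sample script lines are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class ScriptText(HTMLParser):
    """Collect the raw text found inside <script> elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.chunks.append(data)


def html_script_text(markup):
    """Return the script text as an HTML parser sees it."""
    parser = ScriptText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


with_cdata = "<script><![CDATA[\nvar ok = a < b;\n]]></script>"
with_entity = '<script>var s = "&lt;";</script>'

# HTML view: the CDATA markers stay in the script text,
# and the entity reference is not decoded.
html_cdata = html_script_text(with_cdata)
html_entity = html_script_text(with_entity)

# XML view of the second snippet: the entity reference is decoded.
xml_entity = ET.fromstring(with_entity).text

print(repr(html_cdata))
print(repr(html_entity))
print(repr(xml_entity))
```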

A brief SGML tutorial.

Also, see the Wikipedia entry on CDATA.

Should I use <![CDATA[...]]> in HTML5?

The CDATA structure isn't really for HTML at all; it's for XML.

People sometimes use CDATA sections in XHTML inside script tags because it removes the need to escape <, > and & characters. It's unnecessary in HTML though, since script tags in HTML are already parsed like CDATA sections.

Edit: This is where we open that really mouldy old can of worms from 2002 over whether you're sending XHTML as text/html or as application/xhtml+xml like you’re “supposed” to :-)

When is a CDATA section necessary within a script tag?

A CDATA section is required if you need your document to parse as XML (e.g. when an XHTML page is interpreted as XML) and you want to be able to write literal i<10 and a && b instead of i&lt;10 and a &amp;&amp; b, because XHTML parses the JavaScript code as parsed character data rather than character data by default. This is not an issue with scripts that are stored in external source files, but for any inline JavaScript in XHTML you will probably want to use a CDATA section.

Note that many XHTML pages were never intended to be parsed as XML, in which case this will not be an issue.

For a good writeup on the subject, see https://web.archive.org/web/20140304083226/http://javascript.about.com/library/blxhtml.htm

What does <![CDATA[]]> in XML mean?

CDATA stands for Character Data and it means that the data in between these strings includes data that could be interpreted as XML markup, but should not be.

The key differences between CDATA and comments are:

  • As Richard points out, CDATA is still part of the document, while a comment is not.
  • In CDATA you cannot include the string ]]> (CDEnd), while in a comment -- is invalid.

One thing they have in common: parameter entity references are not recognized inside comments or inside CDATA sections, because outside the DTD the % character has no special meaning.

This means given these four snippets of XML from one well-formed document:

<!ENTITY % MyParamEntity "Has been expanded">


<!--
Within this comment I can use ]]>
and other reserved characters like <
&, ', and ", but %MyParamEntity; will not be expanded
(if I retrieve the text of this node it will contain
%MyParamEntity; and not "Has been expanded")
and I can't place two dashes next to each other.
-->


<![CDATA[
Within this Character Data block I can
use double dashes as much as I want (along with <, &, ', and ").
%MyParamEntity; will not be expanded here either: the text of
this node will contain %MyParamEntity; as written. I can't use
the CDEnd sequence. If I need to use CDEnd I must escape one of the
brackets or the greater-than sign using concatenated CDATA sections.
]]>


<description>An example of escaped CDEnds</description>
<!-- This text contains a CDEnd ]]> -->
<!-- In this first case we put the ]] at the end of the first CDATA block
and the > in the second CDATA block -->
<data><![CDATA[This text contains a CDEnd ]]]]><![CDATA[>]]></data>
<!-- In this second case we put a ] at the end of the first CDATA block
and the ]> in the second CDATA block -->
<alternative><![CDATA[This text contains a CDEnd ]]]><![CDATA[]>]]></alternative>
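The first splitting strategy (]] at the end of one CDATA block, > at the start of the next) is easy to automate. Here is a small sketch in Python; the helper name to_cdata and the sample text are hypothetical, and xml.etree.ElementTree is used only to confirm that the text survives a round trip.

```python
import xml.etree.ElementTree as ET


def to_cdata(text):
    """Wrap text in CDATA, splitting every "]]>" across two sections.

    Each "]]>" becomes "]]", the end of the current section, the start
    of a new section, and then ">".
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


original = "a ]]> b"
markup = "<data>" + to_cdata(original) + "</data>"
print(markup)
# <data><![CDATA[a ]]]]><![CDATA[> b]]></data>

# Adjacent CDATA sections are reported as one run of character data,
# so the parser gives back the original string.
recovered = ET.fromstring(markup).text
print(recovered == original)
```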

how to comment in HTML <![CDATA[

It is not possible to have a comment inside a CDATA section. In XML, and hence in HTML when using XHTML syntax, a CDATA section is used “to escape blocks of text containing characters which would otherwise be recognized as markup”. It has simple syntax: it begins with <![CDATA[ and ends with ]]>. No markup of any kind is recognized between the delimiters. In HTML5, CDATA sections in HTML syntax are defined in an ad hoc manner and can only be used within foreign content, namely SVG or MathML; they may contain comments as per general XML rules, but these are not comments from the HTML viewpoint, just data.

The above answers the question in the title. The question in the body seems to be different and not quite clear. The sample code contains no CDATA section at all. It contains only a head element containing a comment and the character data .... (and whitespace). There is no script element, since data that would otherwise constitute such an element is wrapped inside a comment, hence ignored. And neither is there any CDATA section, since when parsing a comment, only the comment terminator string is recognized, no markup.

Why use //<![CDATA[ and //]]> in a jQuery script?

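CDATA is used to allow the document to be loaded as straight XML. You can embed JS in XML documents without replacing special XML characters like <, >, and & with the XML entities &lt;, &gt;, and &amp;, which would otherwise be needed to keep the XML well-formed.

The double slash // in front of the markers means nothing to the XML parser, which sees it as ordinary text and still recognizes <![CDATA[ as the start of a CDATA section. It is there so that a JavaScript engine that receives the markers as part of the script treats those lines as comments.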

Wikipedia says:

In an XML document or external parsed entity, a CDATA section is a
section of element content that is marked for the parser to interpret
as only character data, not markup. A CDATA section is merely an
alternative syntax for expressing character data; there is no semantic
difference between character data that manifests as a CDATA section
and character data that manifests as in the usual syntax in which <
and & would be represented by &lt; and &amp;, respectively.
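The "no semantic difference" point can be checked directly. In this sketch (the sample code string is an assumption, and Python's xml.sax.saxutils.escape stands in for escaping by hand), the same text is written once as a CDATA section and once with entity references; an XML parser reports identical character data for both.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical inline script using the characters discussed above.
code = "if (i<10) { ok = a && b; }"

as_cdata = "<script><![CDATA[" + code + "]]></script>"
as_escaped = "<script>" + escape(code) + "</script>"

print(as_escaped)
# <script>if (i&lt;10) { ok = a &amp;&amp; b; }</script>

# Both spellings produce the same character data after parsing.
text_cdata = ET.fromstring(as_cdata).text
text_escaped = ET.fromstring(as_escaped).text
print(text_cdata == text_escaped == code)
```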

What is the meaning of CDATA

With <![CDATA[ you can embed JS in XML (and XHTML) documents without having to replace special XML characters like <, >, and & with the XML entities &lt;, &gt;, and &amp;. Without it, the XML would be malformed and you would get errors like "The entity name must immediately follow the '&' in the entity reference." The general recommendation, however, is to put JS code in its own .js file, which you then include with a <script src>.

The <![CDATA[ is not needed in plain HTML documents. Unless you're developing with an XML-based view technology like Facelets (for JSF) or ASP.NET MVC, there's absolutely no need to declare your HTML as XHTML. Just a <!DOCTYPE html> would suffice.

what actually is PCDATA and CDATA?

From Wikipedia:

PCDATA

Simply speaking, PCDATA stands for Parsed Character Data. That means the characters are to be parsed by the XML, XHTML, or HTML parser. (&lt; will be changed to <, <p> will be taken to mean a paragraph tag, etc). Compare that with CDATA, where the characters are not to be parsed by the XML, XHTML, or HTML parser.

CDATA

The term CDATA, meaning character data, is used for distinct, but related purposes in the markup languages SGML and XML. The term indicates that a certain portion of the document is general character data, rather than non-character data or character data with a more specific, limited structure.

Why is CDATA commented out inside script tags?

XHTML is supposed to be served as XML by using media type application/xhtml+xml. In HTML5, the markup is only XHTML if it is served with an XML media type. When served like this, the contents of script elements are not CDATA.

So to get the XML parser to treat the script contents as CDATA, they can be wrapped in <![CDATA[ ]]>.

While few people have historically served markup as application/xhtml+xml, many have validated their pages as if it was XHTML. The XHTML validator equally expects that the script contents are not ordinarily CDATA, and so will typically reject tags and other scraps of markup embedded in the JavaScript, unless they are escaped with <![CDATA[ ]]>.

Having validated their pages as XHTML, they'd then serve their pages with a text/html media type to browsers, which meant that the browser treats the markup as HTML, not XHTML. In this case, the HTML parser is used, which does treat the script contents as CDATA automatically, so the <![CDATA[ and ]]> become part of the script to be run by the JavaScript engine. Therefore, to hide those strings from the JavaScript engine, they are preceded with // on the same line, which means that the JavaScript engine thinks the lines are comments.

Finally, some people serve the same markup as both application/xhtml+xml and text/html, switching based on the information found in the HTTP request message. For the same reasons as above, to get the script contents to be processed correctly in both modes, the //<![CDATA[ and //]]> pattern is a very effective technique.
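As a rough illustration of why the pattern works in both modes, this sketch passes one piece of markup through an XML parser and through an HTML parser and compares the script text each one produces. Python's xml.etree.ElementTree and html.parser stand in for a browser's two parsers; the class name ScriptText and the sample script line are assumptions.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

MARKUP = "<script>//<![CDATA[\nif (a < b && c) { run(); }\n//]]></script>"


class ScriptText(HTMLParser):
    """Collect the raw text found inside <script> elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        self.in_script = self.in_script or tag == "script"

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.chunks.append(data)


# XML mode: the CDATA markers are consumed by the parser,
# leaving two bare "//" comment lines around the code.
xml_view = ET.fromstring(MARKUP).text

# HTML mode: the markers stay in the script text,
# but each one sits behind "//" on its own line.
html_parser = ScriptText()
html_parser.feed(MARKUP)
html_parser.close()
html_view = "".join(html_parser.chunks)

print(xml_view.splitlines())
print(html_view.splitlines())
```

In both views, every line other than the actual code starts with //, so a JavaScript engine would skip it as a comment.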


