What Characters Are Allowed in an HTML Attribute Name

What characters are allowed in an HTML attribute name?

It depends on what you mean by "allowed". Each tag has a fixed list of valid attribute names, and in HTML they are case-insensitive. In one important sense, only those names, spelled correctly, are "allowed".

Another way of looking at it is to ask which characters browsers will treat as part of a valid attribute name. The best advice here comes from the parser spec of HTML 5, which can be found here: https://html.spec.whatwg.org/multipage/syntax.html#attributes-2

It says that all characters except tab, line feed, form feed, space, solidus, greater-than sign, quotation mark, apostrophe and equals sign will be treated as part of the attribute name. Personally, I wouldn't push the edge cases of this, though.
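As a rough illustration of that list, the check below flags any candidate name containing one of the listed characters. It is only a sketch of the rule as summarized above, not a full implementation of the HTML parser, and the names NAME_BREAKERS and staysOneAttributeName are made up for this example.

```javascript
// Characters that, per the list above, are NOT absorbed into an attribute name:
// tab, line feed, form feed, space, "/", ">", '"', "'" and "=".
const NAME_BREAKERS = /[\t\n\f \/>"'=]/;

// Hypothetical helper: true if the candidate is non-empty and contains
// none of the characters listed above.
function staysOneAttributeName(candidate) {
  return candidate.length > 0 && !NAME_BREAKERS.test(candidate);
}

console.log(staysOneAttributeName("data-user.id")); // true
console.log(staysOneAttributeName("my attr"));      // false
console.log(staysOneAttributeName("a=b"));          // false
```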

What characters are allowed (not allowed) in the names of custom data-* attributes?

See the definition of the data-* attribute in the W3C HTML5 Recommendation:

  • In HTML5, the name must be XML-compatible (and it gets ASCII-lowercased automatically).

  • In XHTML5, the name must be XML-compatible and must not contain uppercase ASCII letters.

The definition of XML-compatible says that it

  • must not contain : characters
  • must match the Name production in the XML 1.0 specification

This Name production lists which characters are allowed.


tl;dr: For the part after data-, you may use the following characters:

  • 0-9
  • a-z
  • A-Z (not in XHTML5)
  • - _ . ·
  • and characters from these Unicode ranges:

    • [#x0300-#x036F] (Combining Diacritical Marks)
    • [#x203F-#x2040] (UNDERTIE, CHARACTER TIE)
    • [#xC0-#xD6]
    • [#xD8-#xF6]
    • [#xF8-#x2FF]
    • [#x370-#x37D]
    • [#x37F-#x1FFF]
    • [#x200C-#x200D] (ZERO WIDTH NON-JOINER, ZERO WIDTH JOINER)
    • [#x2070-#x218F]
    • [#x2C00-#x2FEF]
    • [#x3001-#xD7FF]
    • [#xF900-#xFDCF]
    • [#xFDF0-#xFFFD]
    • [#x10000-#xEFFFF]

So the @ (U+0040) is not allowed.
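The character list above can be turned into a regular expression. The following is a minimal sketch, assuming a JavaScript engine that supports the u flag; the names NAME_CHARS_WITHOUT_COLON, DATA_SUFFIX and isValidDataSuffix are hypothetical, and the xhtml option simply models the "no uppercase ASCII letters" rule.

```javascript
// Character class built from the ranges listed above
// (the XML Name characters, minus the colon).
const NAME_CHARS_WITHOUT_COLON = [
  "\\-", "_", ".", "\\u00B7", "0-9", "a-z", "A-Z",
  "\\u0300-\\u036F", "\\u203F-\\u2040",
  "\\u00C0-\\u00D6", "\\u00D8-\\u00F6", "\\u00F8-\\u02FF",
  "\\u0370-\\u037D", "\\u037F-\\u1FFF", "\\u200C-\\u200D",
  "\\u2070-\\u218F", "\\u2C00-\\u2FEF", "\\u3001-\\uD7FF",
  "\\uF900-\\uFDCF", "\\uFDF0-\\uFFFD", "\\u{10000}-\\u{EFFFF}",
].join("");

const DATA_SUFFIX = new RegExp("^[" + NAME_CHARS_WITHOUT_COLON + "]+$", "u");

// Hypothetical helper: checks the part after "data-".
// With { xhtml: true }, uppercase ASCII letters are rejected as well.
function isValidDataSuffix(suffix, { xhtml = false } = {}) {
  if (!DATA_SUFFIX.test(suffix)) return false;
  return !(xhtml && /[A-Z]/.test(suffix));
}

console.log(isValidDataSuffix("user-id"));                 // true
console.log(isValidDataSuffix("user@id"));                 // false
console.log(isValidDataSuffix("userId", { xhtml: true })); // false
```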

What values can I put in an HTML attribute value?

If your attribute value is quoted (starts and ends with double quotes, "), then any characters are allowed except double quotes and ampersands, which must be escaped as &quot; and &amp; respectively (or as the equivalent numeric entity references, &#34; and &#38;).

You can also use single quotes around an attribute value. If you do this, you may use literal double quotes within the attribute: <span title='This is a "good" title.'>...</span>. In order to escape single quotes within such an attribute value, you must use the numeric entity reference &#39; since some browsers don't support the named entity, &apos; (which was not defined in HTML 4.01).

Furthermore, you can also create attributes with no quotes, but that restricts the set of characters you can have within it much further, disallowing the use of spaces, =, ', ", <, >, ` in the attribute.

See the HTML5 spec for more details.
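To make the quoting rules concrete, here is a small sketch of an escaping helper plus a check for whether a value could be written without quotes. Both function names are invented for this example, and the unquoted check only covers the characters mentioned in this answer.

```javascript
// Hypothetical helper: escape a value for a double-quoted (default)
// or single-quoted attribute. Ampersands are escaped first so that the
// entities added afterwards are not escaped a second time.
function escapeAttributeValue(value, quote = '"') {
  const escaped = value.replace(/&/g, "&amp;");
  return quote === '"'
    ? escaped.replace(/"/g, "&quot;")
    : escaped.replace(/'/g, "&#39;");
}

// Hypothetical helper: true if the value is non-empty and contains no
// whitespace, =, ', ", <, > or backtick, so it could be left unquoted.
function canBeUnquoted(value) {
  return value.length > 0 && !/[ \t\n\f\r=<>"'`]/.test(value);
}

console.log(escapeAttributeValue('Tom & "Jerry"')); // Tom &amp; &quot;Jerry&quot;
console.log(escapeAttributeValue("it's", "'"));     // it&#39;s
console.log(canBeUnquoted("a b"));                  // false
```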

What characters are allowed in the HTML Name attribute inside input tag?

The only real restriction on which characters can appear in form control names applies when a form is submitted with GET:

"The "get" method restricts form data set values to ASCII characters." (HTML 4.01 specification, on form submission methods)

There's a good thread on it here.

Valid characters in custom data- attribute name in HTML5

From your spec link 2, the allowed characters come from the Name production in XML, which is, given that the attribute already starts with data-

":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] |
[#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
[#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | "-" |
"." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]

_ is explicitly listed as OK

! (#x21) and $ (#x24) are not allowed

Spec link 1 is irrelevant. That's for user-agents, to describe how the characters should be processed, regardless of whether they are valid or not.

What characters are allowed in the value attribute of a form's input?

HTML 5

Except where otherwise specified, attributes on HTML elements may have any string value, including the empty string. Except where explicitly stated, there is no restriction on what text can be specified in such attributes.

HTML 5

The value content attribute gives the default value of the input element. When the value content attribute is added, set, or removed, if the control's dirty value flag is false, the user agent must set the value of the element to the value of the value content attribute, if there is one, or the empty string otherwise, and then run the current value sanitization algorithm, if one is defined.

So there are no restrictions but the value might get altered by the value sanitization algorithm.
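As one example of such an alteration, the sanitization step for a plain text input strips newlines from the value. The sketch below imitates only that one case; the function name is made up, and other input types apply different rules that are not modeled here.

```javascript
// Hypothetical helper imitating the "strip newlines" sanitization
// applied to <input type="text">: remove every LF and CR character.
function sanitizeTextInputValue(value) {
  return value.replace(/[\r\n]/g, "");
}

console.log(sanitizeTextInputValue("line1\r\nline2")); // line1line2
```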


For instance, if I enclose this attribute in single quotes, I can not safely use single quotes in it.

You can. You just can't use literal single quotes. You have to use character references.

Valid value for the name attribute in HTML

By HTML rules, the name attribute may have any value: it is declared with CDATA type. Do not confuse this attribute with the references to attributes declared as having NAME type. See 17.4 The INPUT element, name = cdata [CI].

When using $_POST[...] in PHP, you need to note this PHP rule: “Dots and spaces in variable names are converted to underscores. For example <input name="a.b" /> becomes $_REQUEST["a_b"].” See Variables From External Sources.

So $_POST['1'] should work as is and does work, but instead of $_POST['1.1'] you need to write $_POST['1_1'].
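To predict the key you will need on the PHP side, the quoted rule can be imitated in a few lines. This is only a sketch of the dots-and-spaces rule quoted above, with an invented function name; it does not cover any other name handling PHP may perform.

```javascript
// Hypothetical helper: apply the quoted PHP rule
// "dots and spaces in variable names are converted to underscores".
function phpRequestKey(controlName) {
  return controlName.replace(/[ .]/g, "_");
}

console.log(phpRequestKey("1"));   // 1
console.log(phpRequestKey("1.1")); // 1_1
console.log(phpRequestKey("a.b")); // a_b
```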

What are valid values for the id attribute in HTML?

For HTML 4, the answer is technically:

ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").

HTML 5 is even more permissive, saying only that an id must contain at least one character and may not contain any space characters.

The id attribute is case sensitive in XHTML.
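The two rules are easy to compare side by side. The sketch below encodes the HTML 4 wording as a regular expression and the HTML 5 wording as "non-empty and no space characters", taking space characters to be space, tab, line feed, form feed and carriage return. The function names are hypothetical.

```javascript
// HTML 4: a letter, then any number of letters, digits, "-", "_", ":" and ".".
const HTML4_ID = /^[A-Za-z][A-Za-z0-9\-_:.]*$/;

// HTML 5: at least one character and no space characters.
const SPACE_CHARACTER = /[ \t\n\f\r]/;

function isValidHtml4Id(id) {
  return HTML4_ID.test(id);
}

function isValidHtml5Id(id) {
  return id.length > 0 && !SPACE_CHARACTER.test(id);
}

console.log(isValidHtml4Id("1abc"), isValidHtml5Id("1abc")); // false true
console.log(isValidHtml4Id("a b"), isValidHtml5Id("a b"));   // false false
```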

As a purely practical matter, you may want to avoid certain characters. Periods, colons and '#' have special meaning in CSS selectors, so you will have to escape those characters using a backslash in CSS or a double backslash in a selector string passed to jQuery. Think about how often you will have to escape a character in your stylesheets or code before you go crazy with periods and colons in ids.

For example, the HTML declaration <div id="first.name"></div> is valid. You can select that element in CSS as #first\.name and in jQuery like so: $('#first\\.name'). But if you forget the backslash, $('#first.name'), you will have a perfectly valid selector looking for an element with id first and also having class name. This is a bug that is easy to overlook. You might be happier in the long run choosing the id first-name (a hyphen rather than a period), instead.
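If ids with such characters cannot be avoided, the escaping can be done in one place instead of by hand. Browsers offer CSS.escape for this; the function below is only a simplified stand-in with an invented name, and it handles just the three characters mentioned above.

```javascript
// Hypothetical helper: backslash-escape ".", ":" and "#" so the id
// can be used after "#" in a CSS selector string.
function escapeIdForSelector(id) {
  return id.replace(/[.:#]/g, "\\$&");
}

const selector = "#" + escapeIdForSelector("first.name");
console.log(selector); // #first\.name
```

The resulting string can be passed directly to a selector API such as document.querySelector or jQuery's $().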

You can simplify your development tasks by sticking strictly to a naming convention. For example, if you limit yourself entirely to lower-case characters and always separate words with either hyphens or underscores (pick one and never use the other), then you have an easy-to-remember pattern. You will never wonder "was it firstName or FirstName?" because you will always know that you should type first_name. Prefer camel case? Then limit yourself to that, with no hyphens or underscores, and consistently use either upper-case or lower-case for the first character; don't mix them.


A now very obscure problem was that at least one browser, Netscape 6, incorrectly treated id attribute values as case-sensitive. That meant that if you had typed id="firstName" in your HTML (lower-case 'f') and #FirstName { color: red } in your CSS (upper-case 'F'), that buggy browser would have failed to set the element's color to red. At the time of this edit, April 2015, I hope you aren't being asked to support Netscape 6. Consider this a historical footnote.

What characters are allowed in DOM IDs?

Actually, there is a difference between HTML and XHTML.
As XHTML is XML, the rules for XML IDs apply:

Values of type ID MUST match the Name production.

NameStartChar ::=   ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] |
[#xD8-#xF6] | [#xF8-#x2FF] |
[#x370-#x37D] | [#x37F-#x1FFF] |
[#x200C-#x200D] | [#x2070-#x218F] |
[#x2C00-#x2FEF] | [#x3001-#xD7FF] |
[#xF900-#xFDCF] | [#xFDF0-#xFFFD] |
[#x10000-#xEFFFF]

NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 |
[#x0300-#x036F] | [#x203F-#x2040]

Source: Extensible Markup Language (XML) 1.0 (Fifth Edition) 2.3
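The two productions above translate directly into a regular expression: one NameStartChar followed by any number of NameChar. This is a sketch assuming an engine with support for the u flag, and the names used are hypothetical.

```javascript
// NameStartChar ranges, copied from the production above.
const NAME_START_CHAR =
  ":A-Z_a-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF" +
  "\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F" +
  "\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD" +
  "\\u{10000}-\\u{EFFFF}";

// NameChar adds "-", ".", digits, #xB7 and two more ranges.
const NAME_CHAR =
  NAME_START_CHAR + "\\-.0-9\\u00B7\\u0300-\\u036F\\u203F-\\u2040";

const XML_NAME = new RegExp(
  "^[" + NAME_START_CHAR + "][" + NAME_CHAR + "]*$", "u");

// Hypothetical helper: true if the string matches the Name production.
function isXmlName(value) {
  return XML_NAME.test(value);
}

console.log(isXmlName("first.name")); // true
console.log(isXmlName("1abc"));       // false
console.log(isXmlName("-x"));         // false
```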

For HTML the following applies:

id = name [CS]

This attribute assigns a name to an element. This name must be unique in a document.

ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").

Source: HTML 4 Specification, Chapter 6, ID Token


