What characters are allowed in DOM IDs?
Actually there is a difference between HTML and XHTML.
As XHTML is XML the rules for XML IDs apply:
Values of type ID MUST match the Name production.
NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] |
[#xD8-#xF6] | [#xF8-#x2FF] |
[#x370-#x37D] | [#x37F-#x1FFF] |
[#x200C-#x200D] | [#x2070-#x218F] |
[#x2C00-#x2FEF] | [#x3001-#xD7FF] |
[#xF900-#xFDCF] | [#xFDF0-#xFFFD] |
[#x10000-#xEFFFF]
NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 |
[#x0300-#x036F] | [#x203F-#x2040]
Source: Extensible Markup Language (XML) 1.0 (Fifth Edition) 2.3
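If you want to check a candidate ID against the Name production above, the ranges can be transcribed into a JavaScript regular expression. This is only a sketch: the names NAME_START_CHAR, NAME_CHAR, XML_NAME and isValidXmlId are made up for illustration, and it assumes an engine that supports the "u" flag.

```javascript
// Character ranges copied from the NameStartChar production (XML 1.0, section 2.3).
const NAME_START_CHAR =
  ":A-Z_a-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF" +
  "\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F" +
  "\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD" +
  "\\u{10000}-\\u{EFFFF}";

// NameChar adds "-", ".", digits, #xB7 and two further ranges.
const NAME_CHAR =
  NAME_START_CHAR + "\\-.0-9\\u00B7\\u0300-\\u036F\\u203F-\\u2040";

// The "u" flag makes \u{10000}-\u{EFFFF} a range of code points.
const XML_NAME = new RegExp(
  "^[" + NAME_START_CHAR + "][" + NAME_CHAR + "]*$",
  "u"
);

// Hypothetical helper: true when value matches Name ::= NameStartChar (NameChar)*.
function isValidXmlId(value) {
  return XML_NAME.test(value);
}
```

With this, isValidXmlId("first.name") and isValidXmlId("_x") pass, while a leading digit or hyphen is rejected.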
For HTML the following applies:
id = name [CS]
This attribute assigns a name to an element. This name must be unique in a document.
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
Source: HTML 4 Specification, Chapter 6, ID Token
What are legal characters for an HTML element id?
In HTML5, the only restrictions are that the ID must be unique within the document, contain at least one character and contain no spaces. See http://www.w3.org/TR/2014/REC-html5-20141028/dom.html#the-id-attribute
As other answers have pointed out, HTML 4 is more restrictive and specifies that
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
What are valid values for the id attribute in HTML?
For HTML 4, the answer is technically:
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
HTML 5 is even more permissive, saying only that an id must contain at least one character and may not contain any space characters.
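The HTML 5 rule is short enough to sketch as a check. The function name isValidHtml5Id is an assumption for illustration; "space characters" is taken here to mean space, tab, line feed, form feed and carriage return. Uniqueness within the document is a separate requirement that this does not test.

```javascript
// Hypothetical helper: at least one character, and no HTML space characters
// (space, tab, line feed, form feed, carriage return).
function isValidHtml5Id(id) {
  return typeof id === "string" && id.length > 0 && !/[ \t\n\f\r]/.test(id);
}
```

So "99%", "first.name" and "#a" all pass under HTML 5, while "" and "my id" do not.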
The id attribute is case sensitive in XHTML.
As a purely practical matter, you may want to avoid certain characters. Periods, colons and '#' have special meaning in CSS selectors, so you will have to escape those characters using a backslash in CSS or a double backslash in a selector string passed to jQuery. Think about how often you will have to escape a character in your stylesheets or code before you go crazy with periods and colons in ids.
For example, the HTML declaration <div id="first.name"></div> is valid. You can select that element in CSS as #first\.name and in jQuery like so: $('#first\\.name'). But if you forget the backslash, $('#first.name'), you will have a perfectly valid selector looking for an element with id first and also having class name. This is a bug that is easy to overlook. You might be happier in the long run choosing the id first-name (a hyphen rather than a period), instead.
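If you are stuck with such ids, the escaping can be done in one place instead of by hand. The helper below is a simplified sketch (the name escapeIdForSelector is made up): it backslash-escapes ASCII punctuation only and does not handle ids that start with a digit or contain whitespace. In browsers, the built-in CSS.escape() is the more robust choice.

```javascript
// Hypothetical helper: put a backslash in front of every ASCII punctuation
// character that has (or may have) a special meaning in a selector.
function escapeIdForSelector(id) {
  return id.replace(/[!"#$%&'()*+,.\/:;<=>?@[\\\]^`{|}~]/g, "\\$&");
}

// The returned string contains a single backslash per escaped character,
// which is what the selector engine expects at run time.
const selector = "#" + escapeIdForSelector("first.name");
console.log(selector); // #first\.name
```

You would then pass the result to your selector engine, e.g. $(selector), rather than typing the double backslashes yourself.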
You can simplify your development tasks by strictly sticking to a naming convention. For example, if you limit yourself entirely to lower-case characters and always separate words with either hyphens or underscores (but not both; pick one and never use the other), then you have an easy-to-remember pattern. You will never wonder "was it firstName or FirstName?" because you will always know that you should type first_name. Prefer camel case? Then limit yourself to that, no hyphens or underscores, and always, consistently use either upper-case or lower-case for the first character; don't mix them.
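A convention like this is easy to enforce mechanically. As one possible sketch, assuming you picked lower-case words separated by single hyphens (the names KEBAB_ID and followsIdConvention are made up for illustration):

```javascript
// Lower-case letters and digits, words separated by single hyphens,
// starting with a letter. Swap "-" for "_" if underscores are your convention.
const KEBAB_ID = /^[a-z][a-z0-9]*(?:-[a-z0-9]+)*$/;

// Hypothetical helper: true when the id follows the chosen convention.
function followsIdConvention(id) {
  return KEBAB_ID.test(id);
}
```

Here "first-name" passes, while "firstName", "first_name" and "first--name" are flagged.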
A now very obscure problem was that at least one browser, Netscape 6, incorrectly treated id attribute values as case-sensitive. That meant that if you had typed id="firstName" in your HTML (lower-case 'f') and #FirstName { color: red } in your CSS (upper-case 'F'), that buggy browser would have failed to set the element's color to red. At the time of this edit, April 2015, I hope you aren't being asked to support Netscape 6. Consider this a historical footnote.
Javascript regex to remove illegal characters from DOM ID
var str = "99% of People are not the 1%";
str = str.replace(/^[^a-z]+|[^\w:.-]+/gi, "");
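Wrapped in a function, the same replacement might look like the sketch below. The name sanitizeId and the fallback parameter are assumptions added for illustration; the fallback covers input such as "123" where nothing survives the cleanup.

```javascript
// Hypothetical helper built around the regex above:
//  - ^[^a-z]+   strips everything before the first letter
//  - [^\w:.-]+  strips any later character that is not a letter, digit,
//               underscore, colon, period or hyphen
function sanitizeId(str, fallback = "id") {
  const cleaned = str.replace(/^[^a-z]+|[^\w:.-]+/gi, "");
  return cleaned || fallback; // an empty result is not a usable id
}

console.log(sanitizeId("99% of People are not the 1%")); // ofPeoplearenotthe1
```

Note that different inputs can collapse to the same output, so uniqueness still has to be checked separately.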
Can a DOM element have an ID that contains a space?
According to the HTML 4.0 specification for basic types:
ID and NAME tokens must begin with a
letter ([A-Za-z]) and may be followed
by any number of letters, digits
([0-9]), hyphens ("-"), underscores
("_"), colons (":"), and periods
(".").
And even if spaces were valid, an id attribute with spaces would be interpreted by jQuery as an ancestor descendant selector with the current selector syntax.
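If you inherit markup that already has a space in an id, an attribute selector with a quoted value is one way to still reach the element, since the value is then not parsed as part of the selector syntax. A sketch, with the helper name idAttributeSelector made up for illustration:

```javascript
// Hypothetical helper: build an attribute selector such as [id="my id"].
// Double quotes and backslashes inside the id are backslash-escaped so the
// quoted string stays well-formed.
function idAttributeSelector(id) {
  return '[id="' + id.replace(/["\\]/g, "\\$&") + '"]';
}

console.log(idAttributeSelector("my id")); // [id="my id"]
```

The result can be passed to $(...) or document.querySelector(...) in place of a #-style selector.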
Allowed HTML 4.01 id values regex
You can use this regex
^[a-zA-Z][\w:.-]*$
^ matches the start of the string
[a-zA-Z] matches one uppercase or lowercase letter
[\w:.-] matches one word character, colon, period, or hyphen
\w is equivalent to [a-zA-Z\d_]
* matches the preceding character class zero or more times
$ matches the end of the string
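In JavaScript, that pattern could be used as in the following sketch (the names HTML4_ID and isValidHtml4Id are made up for illustration):

```javascript
// One letter, then any number of letters, digits, underscores,
// colons, periods or hyphens.
const HTML4_ID = /^[a-zA-Z][\w:.-]*$/;

// Hypothetical helper: true when id is a legal HTML 4.01 ID token.
function isValidHtml4Id(id) {
  return HTML4_ID.test(id);
}
```

A single letter such as "a" is accepted, while "9lives", "_private" and "my id" are rejected.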
Allowed characters for CSS identifiers
The charset doesn't matter. The allowed characters matter more. Check the CSS specification. Here's a relevant quote:
In CSS, identifiers (including element names, classes, and IDs in selectors) can contain only the characters [a-zA-Z0-9] and ISO 10646 characters U+00A0 and higher, plus the hyphen (-) and the underscore (_); they cannot start with a digit, two hyphens, or a hyphen followed by a digit. Identifiers can also contain escaped characters and any ISO 10646 character as a numeric code (see next item). For instance, the identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F".
Update: As to the regex question, you can find the grammar here:
ident -?{nmstart}{nmchar}*
Which consists of the parts:
nmstart [_a-z]|{nonascii}|{escape}
nmchar [_a-z0-9-]|{nonascii}|{escape}
nonascii [\240-\377]
escape {unicode}|\\[^\r\n\f0-9a-f]
unicode \\{h}{1,6}(\r\n|[ \t\r\n\f])?
h [0-9a-f]
This can be translated to a Java regex as follows (I only added parentheses to parts containing the OR and escaped the backslashes):
String h = "[0-9a-f]";
String unicode = "\\\\{h}{1,6}(\\r\\n|[ \\t\\r\\n\\f])?".replace("{h}", h);
String escape = "({unicode}|\\\\[^\\r\\n\\f0-9a-f])".replace("{unicode}", unicode);
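String nonascii = "[\\xA0-\\xFF]"; // octal 240-377 written as hex; java.util.regex octal escapes need a leading 0, so \240 is not accepted here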
String nmchar = "([_a-z0-9-]|{nonascii}|{escape})".replace("{nonascii}", nonascii).replace("{escape}", escape);
String nmstart = "([_a-z]|{nonascii}|{escape})".replace("{nonascii}", nonascii).replace("{escape}", escape);
String ident = "-?{nmstart}{nmchar}*".replace("{nmstart}", nmstart).replace("{nmchar}", nmchar);
System.out.println(ident); // The full regex.
Update 2: oh, you're more of a PHP'er; well, I think you can figure out how/where to do str_replace?
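For what it's worth, the same grammar can be sketched in JavaScript as well. The variable names mirror the grammar; isCssIdent is a made-up helper, the "i" flag stands in for the grammar's case-insensitivity, and nonascii is kept to the U+00A0 to U+00FF range that the octal [\240-\377] denotes.

```javascript
// Building blocks, transcribed from the grammar above.
const h = "[0-9a-f]";
const unicode = "\\\\" + h + "{1,6}(\\r\\n|[ \\t\\r\\n\\f])?";
const escape = "(" + unicode + "|\\\\[^\\r\\n\\f0-9a-f])";
const nonascii = "[\\u00a0-\\u00ff]";
const nmstart = "([_a-z]|" + nonascii + "|" + escape + ")";
const nmchar = "([_a-z0-9-]|" + nonascii + "|" + escape + ")";

// ident: -?{nmstart}{nmchar}*, anchored so the whole string must match.
const ident = new RegExp("^-?" + nmstart + nmchar + "*$", "i");

// Hypothetical helper: true when s is a CSS identifier under this grammar.
function isCssIdent(s) {
  return ident.test(s);
}
```

Under this grammar "first-name" and the escaped form B\&W\? are identifiers, while "first.name" and "1abc" are not.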
Selecting elements with special characters in the ID
Try escaping it:
$('#abc\\@def\\.com').val();
First paragraph of http://api.jquery.com/category/selectors/
What is a practical maximum length for HTML id?
Just tested: 1M characters works on every modern browser: Chrome 1, FF 3, IE 7, Konqueror 3, Opera 9, Safari 3.
I suspect even longer IDs could become hard to remember.