Are Dollar-Signs Allowed in Identifiers in C++03

Are dollar-signs allowed in identifiers in C++03?

A C++ identifier can be composed of any of the following: _ (underscore), the digits 0-9, and the letters a-z (both upper and lower case). It cannot start with a digit.

There are exceptions, however: C99 permits implementation-defined characters in identifiers, and some compilers (e.g. Visual Studio) accept extra characters such as $ as an extension.

Does C++11 allow dollar signs in identifiers?

This is implementation-defined behavior: $ is not named explicitly in the grammar for identifiers, so it is up to the compiler whether to accept it. The rules for identifier names in C++11 are:

  1. It cannot start with a digit
  2. It can be composed of letters, digits, underscores, universal character names and implementation-defined characters
  3. It cannot be a keyword

Implementation-defined characters are allowed, and many compilers support $ as an extension, including gcc, clang, Visual Studio and, as noted in a comment, apparently the DEC C++ compilers.

The grammar is covered in the draft C++ standard, section 2.11 Identifiers. I have added notes starting with <-:

identifier:
    identifier-nondigit             <- Can only start with a non-digit
    identifier identifier-nondigit  <- Next two rules allow subsequent
    identifier digit                <- characters to be those outlined in 2 above
identifier-nondigit:
    nondigit                        <- a-z, A-Z and _
    universal-character-name
    other implementation-defined characters
[...]
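As a rough illustration, the grammar above can be turned into a small checker. This is only a sketch under stated assumptions: the names is_identifier and allow_dollar are made up for this example, "other implementation-defined characters" is modelled as just $ behind a flag, universal character names and the keyword rule are left out, and the letter test assumes an ASCII-compatible character set.

```cpp
#include <string>

// nondigit: a-z, A-Z and _ (assumes an ASCII-compatible encoding)
bool is_nondigit(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

bool is_digit(char c) {
    return c >= '0' && c <= '9';
}

// Hypothetical stand-in for "other implementation-defined characters":
// only '$' is accepted, and only when the caller opts in.
bool is_impl_defined(char c, bool allow_dollar) {
    return allow_dollar && c == '$';
}

// Returns true if `s` matches the identifier grammar sketched above.
// Universal character names and the keyword check are deliberately omitted.
bool is_identifier(const std::string& s, bool allow_dollar) {
    if (s.empty()) return false;
    // First character must be an identifier-nondigit, never a digit.
    if (!is_nondigit(s[0]) && !is_impl_defined(s[0], allow_dollar)) return false;
    // Every later character may be an identifier-nondigit or a digit.
    for (std::string::size_type i = 1; i < s.size(); ++i) {
        char c = s[i];
        if (!is_nondigit(c) && !is_digit(c) && !is_impl_defined(c, allow_dollar)) {
            return false;
        }
    }
    return true;
}
```

With allow_dollar set to false this models a strictly portable compiler; with it set to true it models one that accepts $ as an extension.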

If we compile this code using clang with the -pedantic-errors flag, it will not compile:

int $ = 0;

and generates the following error:

error: '$' in identifier [-Werror,-Wdollar-in-identifier-extension]
int $ = 0;
    ^

dollar sign in variable name?

Strictly speaking, the only legal characters according to the standard
are alphanumerics and the underscore, plus just about anything Unicode
considers alphabetic (but only as single code-point characters). In
practice, implementations offer both extensions (some do accept a $)
and restrictions (most don't accept all of the required Unicode
characters). If you want your code to be portable, restrict identifiers
to the 26 unaccented letters, upper or lower case, the ten digits, and
the '_'.

$ symbol in c++

It is being used as part of an identifier.

[C++11: 2.11/1] defines an identifier as "an arbitrarily long sequence of letters and digits." It defines "letters and digits" in a grammar given immediately above, which names only numeric digits, lower- and upper-case roman letters, and the underscore character explicitly, but does also allow "other implementation-defined characters", of which this is presumably one.

In this scenario the $ has no special meaning other than as part of an identifier — in this case, the name of a variable. There is no special significance with it being at the start of the variable name.

What are the '@' and '$' for in C/C++?

Neither $ nor @ is part of standard C's character set (C11 5.2.1 Character sets, paragraph 3):

Both the basic source and basic execution character sets shall have the following members: the 26 uppercase letters of the Latin alphabet

    A B C D E F G H I J K L M
    N O P Q R S T U V W X Y Z

the 26 lowercase letters of the Latin alphabet

    a b c d e f g h i j k l m
    n o p q r s t u v w x y z

the 10 decimal digits

    0 1 2 3 4 5 6 7 8 9

the following 29 graphic characters

    ! " # % & ' ( ) * + , - . / :
    ; < = > ? [ \ ] ^ _ { | } ~

the space character, and control characters representing horizontal tab, vertical tab, and form feed.

The C++ standard says about the same (2.2 Character sets, paragraph 1):

The basic source character set consists of 96 characters: the space character, the control characters representing horizontal tab, vertical tab, form feed, and new-line, plus the following 91 graphical characters:

a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9
_ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '

So whether you can use them at all, or for a specific purpose, is up to your implementation.
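To make the point concrete, here is a small sketch that checks a character against the C++ basic source character set as quoted above. The names kBasicGraphical and in_basic_source_charset are invented for this illustration; they are not part of any standard library.

```cpp
#include <cstring>

// The 91 graphical characters of the basic source character set,
// copied from the standard text quoted above.
static const char kBasicGraphical[] =
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";

// Returns true if `c` is one of the 96 basic source characters:
// space, horizontal tab, vertical tab, form feed, new-line,
// or one of the 91 graphical characters.
bool in_basic_source_charset(char c) {
    if (c == '\0') return false;  // strchr would otherwise match the terminator
    if (c == ' ' || c == '\t' || c == '\v' || c == '\f' || c == '\n') return true;
    return std::strchr(kBasicGraphical, c) != nullptr;
}
```

Under this check, $ and @ both come back as false, which is why their meaning (if any) is left to the implementation.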

In your case it sounds like you're probably using GCC, which allows $ in identifiers as an extension, but doesn't allow @ - probably because GCC also compiles Objective-C code, where @ has special meaning.

From the GCC documentation:

In GNU C, you may normally use dollar signs in identifier names. This is because many traditional C implementations allow such identifiers. However, dollar signs in identifiers are not supported on a few target machines, typically because the target assembler does not allow them.

Do the at symbol (@) and dollar sign ($) have any special meaning in C or C++?

@ is generally invalid in C; it is not used for anything. It is used for various purposes by Objective-C, but that's a whole other kettle of fish.

$ is invalid as well, but many implementations allow it to appear in identifiers, just like a letter. (In these implementations, for instance, you could name a variable or function $$$ if you liked.) Even there, though, it doesn't have any special meaning.

Why can identifiers contain '$' in C?

This is not good practice. Generally, you should only use alphanumeric characters and underscores in identifiers ([A-Za-z0-9_]), and never start one with a digit.

Surface Level

Unlike some other languages (bash, Perl), C does not use $ to mark a variable reference, so nothing in the language stops a compiler from treating it as an ordinary identifier character. Whether that is standards-conformant depends on the implementation: C11 6.4.2.1 allows "other implementation-defined characters" in identifiers, and for C++17 see draft N4659. In practice, modern compilers do support it.

As for your C++ question, let's test it!

int main(void) {
    int $ = 0;
    return $;
}

On GCC/G++/Clang/Clang++, this indeed compiles, and runs just fine.

Deeper Level

Compilers take source code, lex it into a token stream, parse that into an abstract syntax tree (AST), and then use the AST to generate code (e.g. assembly or LLVM IR). Your question really only concerns the first part (i.e. lexing).

The grammar (and thus the lexer implementation) of C/C++ does not treat $ as special, unlike commas, periods, arrows (->), etc. As such, take the following C code:

int i_love_$ = 0;

After lexing, this becomes a token stream like this:

["int", "i_love_$", "=", "0"]

If you were to take this code:

int i_love_$,_and_.s = 0;

The lexer would output a token stream like:

["int", "i_love_$", ",", "_and_", ".", "s", "=", "0"]

As you can see, because C/C++ doesn't treat characters like $ as special, it is processed differently than other characters like periods.
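To illustrate the idea, here is a deliberately tiny lexer sketch. It is not how GCC or Clang actually lex; toy_lex and is_ident_char are made-up names, every non-identifier character is treated as a one-character token, and $ is assumed to be an identifier character, as in the compilers mentioned above.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Assumption for this toy: '$' counts as an identifier character,
// alongside letters, digits and the underscore.
static bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '$';
}

// Splits `src` into runs of identifier characters and
// single-character punctuation tokens; whitespace is skipped.
std::vector<std::string> toy_lex(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        char c = src[i];
        if (std::isspace(static_cast<unsigned char>(c))) {
            ++i;  // whitespace only separates tokens
        } else if (is_ident_char(c)) {
            std::size_t start = i;
            while (i < src.size() && is_ident_char(src[i])) ++i;
            tokens.push_back(src.substr(start, i - start));
        } else {
            tokens.push_back(std::string(1, c));  // punctuation such as , . = ;
            ++i;
        }
    }
    return tokens;
}
```

Feeding it int i_love_$,_and_.s = 0; yields the same split as shown above, plus a final ";" token for the terminating semicolon, which the lists above leave out.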


