Why Can't Variable Names Start With Numbers

Why can't variable names start with numbers?

Because then a string of digits would be a valid identifier as well as a valid number.

int 17 = 497;
int 42 = 6 * 9;
String 1111 = "Totally text";

Why can't Java variable names start with a number?

Because the Java Language specification says so:

IdentifierChars:

JavaLetter {JavaLetterOrDigit}

So - yes, an identifier must start with a letter; it can't start with a digit.

The main reasons behind that:

  • it is simply what most people expect
  • it makes parsing source code (much) easier when you restrict the "layout" of identifiers; for example it reduces the possible ambiguities between literals and variable names.
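The grammar rule quoted above can be approximated with a regular expression. The sketch below is written in Python and is ASCII-only by assumption: the real Java rule is wider (JavaLetter covers any character accepted by Character.isJavaIdentifierStart), and keywords are not excluded here. The name is_java_like_identifier is made up for illustration.

```python
import re

# ASCII-only approximation of: JavaLetter {JavaLetterOrDigit}
# Assumption: JavaLetter is narrowed to A-Z, a-z, '_' and '$'.
IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def is_java_like_identifier(text):
    """Return True if text matches the simplified identifier pattern."""
    return IDENTIFIER.fullmatch(text) is not None
```

With this pattern, n23 is accepted and 23n is rejected, because the first character class contains no digits.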

Why can't a variable name start with a number when it can end with one?

It simplifies the parser. It can tell from the first character in a token whether it's an identifier or a number.
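As a rough illustration of that first-character decision, here is a minimal Python sketch. It is not taken from any real parser; classify_token and its three category names are hypothetical, and only ASCII digits and letters are considered.

```python
def classify_token(token):
    """Classify a token by looking only at its first character."""
    if not token:
        raise ValueError("empty token")
    first = token[0]
    if "0" <= first <= "9":
        # A leading digit can only mean a numeric literal.
        return "number"
    if first.isalpha() or first == "_":
        # A leading letter or underscore can only mean an identifier.
        return "identifier"
    return "other"
```

Because the two categories never share a first character, the lexer does not have to read the rest of the token before deciding which rule applies.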

Also, the syntax for floating point can look like this:

123e45

This means 123 × 10^45. If identifiers could start with a digit, this could be mistaken for a variable name.
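Python uses the same exponent notation, so its standard tokenize module can serve as a stand-in to show how a real lexer treats such a token. The helper name token_kinds is an assumption made for this sketch.

```python
import io
import tokenize


def token_kinds(source):
    """Return (kind, text) pairs for the meaningful tokens in source."""
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER}
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type not in skip
    ]
```

Calling token_kinds("123e45\n") reports a single NUMBER token, while token_kinds("e45\n") reports a NAME. The leading digit is what separates the two readings.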

Some languages don't have this prohibition, Common Lisp for instance. Its rule is essentially that a token is a symbol unless it can be parsed as a number. Since it also allows the input radix to be customized, whether a token is a number or a symbol depends on the setting of a variable (it also has escaping mechanisms that let you force it one way or the other).
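The "symbol unless it parses as a number" rule, including its dependence on the radix, can be sketched in Python. This only imitates the idea and is not the Common Lisp reader: it handles integers only, Python's int() accepts a few spellings a Lisp reader would not, and the function name classify and its radix parameter are assumptions standing in for the reader and its radix variable.

```python
def classify(token, radix=10):
    """Treat token as a number if it parses in the given radix, else a symbol."""
    try:
        int(token, radix)
    except ValueError:
        return "symbol"
    return "number"
```

Under this sketch the token ff is a symbol when the radix is 10 and a number when the radix is 16, and a token such as 1+ stays a symbol in either case.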

Why can't a variable name start with a number?

This is how PHP was designed. You can't start a variable name with a number.

What you can do is use underscore:

$_6 = $_REQUEST['6'];

EDIT:
Since variables start with $ in PHP, it would be possible to allow variable names that start with digits, but that would confuse most people, since I don't know of any other language that allows it.

But let's imagine variables starting with numbers are allowed.

Can you imagine a coworker saying to you: 23 equals 74? That is a bit confusing. Hearing n23 equals 74 makes more sense. You know n23 is a variable without having to explicitly say that.

Why can you start a variable name with $ in C?

In the C 2018 standard, clause 6.4.2, paragraph 1 permits implementations to accept additional characters in identifiers.

It defines an identifier to be an identifier-nondigit character followed by any number of identifier-nondigit or digit characters. It defines digit to be “0” to “9”, and it defines the identifier-nondigit characters to be:

  • a nondigit, which is one of underscore, “a” to “z”, or “A” to “Z”,
  • a universal-character-name, or
  • other implementation-defined characters.

Thus, implementations may define other characters that are allowed in identifiers.

The characters included as universal-character-name are those listed in ranges in Annex D of the C standard.
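To make the role of the implementation-defined characters concrete, here is a small Python sketch of the rule as described above. It is a simplification: universal character names are left out, and the function name is_c_identifier and its extra_chars parameter are hypothetical, standing in for whatever a given compiler chooses to accept.

```python
import string

NONDIGIT = set(string.ascii_letters + "_")
DIGIT = set(string.digits)


def is_c_identifier(text, extra_chars=""):
    """Check text against a simplified C identifier rule.

    extra_chars models implementation-defined additions such as "$".
    Universal character names are not handled in this sketch.
    """
    start = NONDIGIT | set(extra_chars)
    rest = start | DIGIT
    return bool(text) and text[0] in start and all(c in rest for c in text[1:])
```

With the default settings, $x is rejected. Passing extra_chars="$" models an implementation that allows it, and $x is then accepted, while 1tmp is rejected either way.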

The resource you link to is wrong in several places:

Variable names in C are made up of letters (upper and lower case) and digits.

This is false; identifiers may include underscores and the above universal characters in every conforming implementation and other characters in implementations that permit them.

$ not allowed -- only letters, and _

This is incorrect. The C standard does not require an implementation to allow “$”, but it does not disallow an implementation from allowing it. “$” is allowed by some implementations and not others. It can be said not to be a part of strictly conforming C programs, but it may be a part of conforming C programs.

Can variable names in Python start with an integer?

The Python parser forbids naming variables that way so that numbers and variable names can be parsed separately. Naming a variable 1e1 would create chaos: is it the number 10.0 or the variable 1e1?

"Python, please output for me 1e1!" - "Why is it 10.0? I stored 100 over there!"

But variables are actually stored in a way that allows binding a value to a string that starts with a digit, because such keys do no harm in a hash map of any kind. Using this "trick" (writing to globals() directly), you can get your digit-prefixed variable name without compromising the parser's ability to tell numbers from names.

I would say that, technically, naming variables in that manner does not violate Python's guidelines, but it is highly discouraged and, as a rule, unnecessary. Using globals() for injecting variables is known as very bad practice, and this case should be no exception.
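For concreteness, the trick referred to above looks roughly like this. It is shown only to illustrate the mechanism, not as a recommendation, and the names name, is_valid_name and stored are chosen just for this sketch.

```python
name = "1a"

# The parser would reject `1a = 100`; the string is not a valid identifier.
is_valid_name = name.isidentifier()

# The module namespace is an ordinary dict, so any string can be used as a key.
globals()[name] = 100
stored = globals()[name]
```

The value can only be reached again through globals(), since writing 1a in source code is still a syntax error.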


Of course, Python could have required a delimiter around such names, as it does for strings, say *123*, but I believe the intent behind Python was to make programming easier, not to stretch the limits of the variable naming space.


Practically speaking, if you must use names that start with a digit, you had better do it with your own dictionary rather than globals():

>>> number_headed_vars = {'1a': 100}
>>> number_headed_vars['1a']
100

That way you can create your own variables system - and avoid abusing globals().

Can table variable names start with a number character?

No, table variable names can't start with a number since they follow the same naming conventions as normal variables. The 'x' is automatically added by readtable to create a valid name. You should have noticed this warning when calling readtable:

Warning: Variable names were modified to make them valid MATLAB identifiers.
The original names are saved in the VariableDescriptions property.

So, you can't get rid of the 'x' within the table. But, if you need to do any comparisons, you can do them against the original values saved in the VariableDescriptions property, which will have this format:

>> T.Properties.VariableDescriptions

ans =

1×2 cell array

'Original column heading: '99BM'' 'Original column heading: '105CL''

You can parse these with a regular expression, for example:

originalNames = regexp(T.Properties.VariableDescriptions, '''(.+)''', 'tokens', 'once');
originalNames = vertcat(originalNames{:});

originalNames =

2×1 cell array

'99BM'
'105CL'

And then use these in any string comparisons you need to do.

Why can't variable names have spaces in them?

There’s no fundamental reason, apart from the decisions of language designers and a history of single-token identifiers. Some languages in fact do allow multi-token identifiers: MultiMedia Fusion’s expression language, some Mac spreadsheet/notebook software whose name escapes me, and I’m sure of others. There are several considerations that make the problem nontrivial, though.

Presuming the language is free-form, you need a canonical representation, so that an identifier like account name is treated the same regardless of whitespace. A compiler would probably need to use some mangling convention to please a linker. Then you have to consider the effect of that on foreign exports, which is why C++ has the extern "C" linkage specifier to disable mangling.

Keywords are an issue, as you have seen. Most C-family languages have a lexical class of keywords distinct from identifiers, which are not context-sensitive. You cannot name a variable class in C++. This can be solved by disallowing keywords in multi-token identifiers:

if account age < 13 then child account = true;

Here, if and then cannot be part of an identifier, so there is no ambiguity with account age and child account. Alternatively, you can require punctuation everywhere:

if (account age < 13) {
child account = true;
}

The last option is to make keywords pervasively context-sensitive, leading to such monstrosities as:

IF IF = THEN THEN ELSE = THEN ELSE THEN = ELSE

The biggest issue is that juxtaposition is an extremely powerful syntactic construct, and you don’t want to occupy it lightly. Allowing multi-token identifiers prevents using juxtaposition for another purpose, such as function application or composition. Far better, I think, just to allow most nonwhitespace characters and thereby permit such identifiers as canonical-venomous-frobnicator. Still plenty readable but with fewer opportunities for ambiguity.
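The keyword-delimited approach to multi-token identifiers described above, together with the canonical whitespace form, can be sketched in Python. Everything here is an assumption made for illustration: the keyword set, the token pattern, and the lex function are hypothetical and do not describe any existing language.

```python
import re

KEYWORDS = {"if", "then", "else", "true", "false"}  # hypothetical keyword set
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def lex(source):
    """Merge runs of adjacent non-keyword words into one identifier."""
    tokens = []
    words = []  # words of the identifier currently being collected

    def flush():
        if words:
            # Canonical form: words joined by exactly one space.
            tokens.append(("ident", " ".join(words)))
            words.clear()

    for text in TOKEN.findall(source):
        if text[0].isalpha() or text[0] == "_":
            if text in KEYWORDS:
                flush()
                tokens.append(("keyword", text))
            else:
                words.append(text)
        else:
            flush()
            kind = "number" if text[0].isdigit() else "punct"
            tokens.append((kind, text))
    flush()
    return tokens
```

Applied to the example statement, this yields account age and child account as single identifiers no matter how many spaces separate the words. It also shows the cost mentioned above: two adjacent words can no longer mean anything other than one longer name.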

Why can't a number be used as the first character of a variable name?

Imagine a C-derived language where numbers can begin identifiers. Now compile:

int main(int argc, char **argv) {
int 42L = 42;
long foo = 42L;
/* compiler: is that a long literal or an identifier?
* aaaaaaargh!!!
*/
}

It is extremely hard to make a compiler that can figure that out.
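The clash can be made explicit with two regular expressions, sketched here in Python. The literal pattern is a simplification of C's integer syntax, RELAXED_IDENT is the hypothetical rule from the thought experiment, and the function name readings is an assumption.

```python
import re

# C-style decimal integer literal with an optional long suffix (simplified).
INT_LITERAL = re.compile(r"[0-9]+[lL]?")
# The usual rule: an identifier may not begin with a digit.
STRICT_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Hypothetical relaxed rule that lets digits come first.
RELAXED_IDENT = re.compile(r"[A-Za-z0-9_]+")


def readings(token, ident=RELAXED_IDENT):
    """List every lexical category the token could belong to."""
    found = []
    if INT_LITERAL.fullmatch(token):
        found.append("integer literal")
    if ident.fullmatch(token):
        found.append("identifier")
    return found
```

Under the relaxed rule, 42L has two readings, so the lexer alone cannot decide. Under the strict rule it has exactly one.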

It is, however, possible to have languages where identifiers can begin with numbers. In your average Lisp dialect, for example, the rules are very different from a C-derived language. Lisp code is made up primarily of parenthesized lists of symbols/lists, like this sample:

(defun foo (x y z)
(* (+ x y) (1+ (log z)))) ; Yes, that function is named 1+

which, for those of you unfamiliar with Lisp, is equivalent to:

double foo(double x, double y, double z) {
return (x + y) * (log(z) + 1);
}

Lisp identifiers can contain nearly anything. In Common Lisp (my dialect of choice), the exceptions are parentheses ( ), backslashes \, pipes |, whitespace (it separates list elements), and a few others. And you can actually include them - just prefix with a backslash or surround with pipes. This is a legal Lisp identifier:

\\foo-|(bar)|-baz\ frobnicator

(Although I most definitely wouldn't use it as an identifier!)


