Allowed Characters in Linux Environment Variable Names

From The Open Group:

These strings have the form name=value; names shall not contain the character '='. For values to be portable across systems conforming to IEEE Std 1003.1-2001, the value shall be composed of characters from the portable character set (except NUL and as indicated below).

So names may contain any character except = and NUL, but:

Environment variable names used by the utilities in the Shell and Utilities volume of IEEE Std 1003.1-2001 consist solely of uppercase letters, digits, and the '_' (underscore) from the characters defined in Portable Character Set and do not begin with a digit. Other characters may be permitted by an implementation; applications shall tolerate the presence of such names.

So while the names may be valid, your shell might not support anything besides letters, numbers, and underscores.

What are valid characters for Windows environment variable names and values?

About variable values: you can use most characters as variable values, including white space. If you use the special characters <, >, |, &, or ^, you must precede them with the escape character (^) or quotation marks. If you use quotation marks, they are included as part of the value because everything following the equal sign is taken as the value.

Check section "Setting environment variables".

About variable names: in my opinion, for best compatibility with every application, you should limit yourself to letters, numbers, underscore (_) and minus (-).

I'm quite sure that all POSIX valid characters for files are ok, but I did not find any evidence of this.

Concerning variable names, we also need to accept parentheses, since %ProgramFiles(x86)% is a well-known envar. From my experiments it seems that, in addition to letters and digits, these characters are valid: _(){}[]$*+-\/"#',;.@!? while these are not valid: % < > ^ & | = :

I didn't do an exhaustive search but just tested most common non alphanumeric characters.

And just for the fun of it you can name an envar %_(){}[]$*+-\/"#',;.@!?%:

C:\>set _(){}[]$*+-\/"#',;.@!?=xyz

C:\>echo %_(){}[]$*+-\/"#',;.@!?%
xyz

What Unicode symbols are acceptable in BASH variable names?

I would say none.

á=3
á=3: command not found

If you like unicode symbol names, use Perl:

perl -e 'use utf8; $á = 42; print $á'
42

Is the equals sign character = allowed in the value of an environment variable in Linux?

Environment variables can contain any character except \0, since the null byte is the C string terminator character. When parsing the environment, the first = in each environment variable separates the name from the value, but additional = characters have no impact.

barmar@dev:~$ export myDN=OU=MY_OU,DC=MYDC,DC=local
barmar@dev:~$ echo $myDN
OU=MY_OU,DC=MYDC,DC=local
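
The "first = separates name from value" rule is easy to reproduce with parameter expansion. A minimal sketch, where entry, name and value are just illustrative variable names and the string is the one from the example above:

```sh
entry='myDN=OU=MY_OU,DC=MYDC,DC=local'

name=${entry%%=*}    # drop the longest suffix starting at '=' -> text before the first '='
value=${entry#*=}    # drop the shortest prefix ending at '='  -> text after the first '='

printf 'name:  %s\nvalue: %s\n' "$name" "$value"
```

The later = characters end up in the value untouched, which is the same split a program makes when it walks its environment.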

Create environment variable with dot in the current environment

Bash does not allow environment variables with non-alphanumeric characters in their names (aside from _). While the environment may contain a line such as A.B=D, there is no requirement that a shell be able to make use of it, and bash will not. Other shells may be more flexible.

Utilities which make use of oddly-named environment variables are discouraged, but some may exist. You will need to use env to create such an environment variable. You could avoid the subprocess with exec env bash but it won't save you much in the way of time or resources.
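
For illustration, a sketch of the env approach. The name A.B and the value D are the ones from the example above; printenv is only standing in for whatever utility actually wants the oddly-named variable:

```sh
# env puts A.B=D into the environment of the command it runs;
# the invoking shell never has to treat A.B as a variable name.
result=$(env 'A.B=D' printenv 'A.B')
echo "$result"   # prints: D
```

The variable exists only for the command started by env, not in the calling shell.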

Why can't environment variables with dashes be accessed in bash 4.1.2?

The "why" is almost irrelevant: The POSIX standard makes it very clear that export is only required to support arguments which are valid names, and anything with a dash is not a valid name. Thus, no POSIX shell is required to support exporting or expanding variable names with dashes, via indirect expansion or otherwise.

It's worth noting that ShellShock -- a major security bug caused by sloppy handling of environment contents -- is fixed in the bash 4.1 present in the current CentOS 6 updates repo; increased rigor in an area which spawned security bugs should be no surprise.

The remainder of this answer will focus on demonstrating that the new behavior of bash 4.1 is explicitly allowed, or even required, by POSIX -- and thus that the prior behavior was an undefined implementation artifact.

To quote POSIX on environment variables:

These strings have the form name=value; names shall not contain the character '='. For values to be portable across systems conforming to IEEE Std 1003.1-2001, the value shall be composed of characters from the portable character set (except NUL and as indicated below). There is no meaning associated with the order of strings in the environment. If more than one string in a process' environment has the same name, the consequences are undefined.

Environment variable names used by the utilities in the Shell and Utilities volume of IEEE Std 1003.1-2001 consist solely of uppercase letters, digits, and the '_' (underscore) from the characters defined in Portable Character Set and do not begin with a digit. Other characters may be permitted by an implementation; applications shall tolerate the presence of such names. Uppercase and lowercase letters shall retain their unique identities and shall not be folded together. The name space of environment variable names containing lowercase letters is reserved for applications. Applications can define any environment variables with names from this name space without modifying the behavior of the standard utilities.

Note: Other applications may have difficulty dealing with environment variable names that start with a digit. For this reason, use of such names is not recommended anywhere.

Thus:

Tools (including the shell) are required to fully support environment variable names with uppercase and lowercase letters, digits (except in the first position), and the underscore.
Tools (including the shell) may modify their behavior based on environment variables with names that comply with the above and additionally do not contain lowercase letters.
Tools (including the shell) should tolerate other names -- meaning they shouldn't crash or misbehave in their presence -- but are not required to support them.

Finally, shells are explicitly allowed to discard environment variable names which are not also shell variable names. From the relevant standard:

It is unspecified whether environment variables that were passed to the shell when it was invoked, but were not used to initialize shell variables (see Shell Variables) because they had invalid names, are included in the environment passed to execl() and (if execl() fails as described above) to the new shell.

Moreover, what defines a valid shell name is well-defined:

Name - In the shell command language, a word consisting solely of underscores, digits, and alphabetics from the portable character set. The first character of a name is not a digit.

Notably, only underscores (not dashes) are considered part of a valid name in a POSIX-compliant shell.
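
If you want to check a candidate against that definition before trying to export it, something like the following should do. This is a sketch only: is_valid_name is a hypothetical helper name, not a builtin, and it assumes the ASCII letters are the "alphabetics from the portable character set".

```sh
# is_valid_name is a hypothetical helper, not part of any shell.
# The body runs in a subshell (note the parentheses) so that LC_ALL=C,
# which keeps the A-Z and a-z ranges limited to ASCII, does not leak
# into the caller.
is_valid_name() (
  LC_ALL=C
  case $1 in
    '' | [0-9]* | *[!A-Za-z0-9_]*) exit 1 ;;  # empty, leading digit, or a character outside the set
    *) exit 0 ;;
  esac
)

for candidate in MY_VAR _x1 1ABC my-var A.B; do
  if is_valid_name "$candidate"; then
    echo "$candidate: valid"
  else
    echo "$candidate: not a valid shell name"
  fi
done
```

Here MY_VAR and _x1 pass, while 1ABC, my-var and A.B are rejected.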

...and the POSIX specification for export explicitly uses the word "name" (which it defined in the text quoted above), and describes it as applying to "variables" (shell variables, whose names are subject to the restrictions quoted elsewhere in this document):

The shell shall give the export attribute to the variables corresponding to the specified names, which shall cause them to be in the environment of subsequently executed commands. If the name of a variable is followed by = word, then the value of that variable shall be set to word.

All the above being said -- if your operating system provides a /proc/self/environ which represents the state of your environment variables at process startup (before a shell has, as it's allowed to do, potentially discarded any variables which don't have valid names in shell), you can extract content with invalid names like so:

# using a lower-case name where possible is in line with POSIX guidelines, see above
aws_access_key_id_var="AWS_${BUCKET_NAME}_ACCESS_KEY_ID"
while IFS= read -r -d '' var; do
  [[ $var = "$aws_access_key_id_var"=* ]] || continue
  val=${var#"${aws_access_key_id_var}="}
  break
done </proc/self/environ
echo "Extracted value: $val"

Bash - export environment variables with special characters ($)

You don't need eval at all; just use the declare builtin in bash to create variables on the fly!

case "$key" in
  '#'*) ;;                   # skip comment lines
  *)
    declare "$key=$value"    # quoted, so spaces, globs and $ in the value stay literal
    export "$key"
    ;;
esac
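
The snippet above assumes key and value were already split out somewhere. A sketch of what the surrounding loop might look like, under a few assumptions: the input is a file of name=value lines with # comments (the temp file and its contents are made up for the example), and export "$key=$value" is used in place of declare plus export, a variation that does both steps at once and also works in plain sh:

```sh
# Hypothetical input; the file and its contents are assumptions for this sketch.
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
# comment lines are skipped
PRICE=$5.00
GREETING=hello world
EOF

while IFS= read -r line; do
  case "$line" in
    '' | '#'*) continue ;;   # skip blank lines and comments
    *=*) ;;                  # looks like name=value, keep going
    *) continue ;;           # no '=' at all, ignore the line
  esac
  key=${line%%=*}            # text before the first '='
  value=${line#*=}           # everything after the first '='
  export "$key=$value"       # quoted, so '$', spaces and globs stay literal
done < "$env_file"
rm -f "$env_file"

printenv PRICE      # prints: $5.00
printenv GREETING   # prints: hello world
```

Nothing is ever re-parsed as shell code, which is why the $ survives. A key that is not a valid shell name will make export fail, so validate keys first if the file is not under your control.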

Export variable containing special characters(args)

Exporting isn't something you do with a value; it's something you do to a name.

export CELERY_ARGS

adds CELERY_ARGS to a list of variable names (if it isn't already there) whose values should be added to the environment of a child process when it is started.

Note that this means it doesn't matter what the value of CELERY_ARGS is when you export the name (or even if the variable is defined yet); the value received by the child process is the value at process creation.

For example,

$ printenv bar
$ bar=7
$ printenv bar
$ export bar
$ bar=9
$ printenv bar
9

What characters are forbidden in Windows and Linux directory names?

A “comprehensive guide” of forbidden filename characters is not going to work on Windows because it reserves filenames as well as characters. Yes, characters like * " ? and others are forbidden, but there are an infinite number of names composed only of valid characters that are forbidden. For example, spaces and dots are valid filename characters, but names composed only of those characters are forbidden.

Windows does not distinguish between upper-case and lower-case characters, so you cannot create a folder named A if one named a already exists. Worse, seemingly-allowed names like PRN and CON, and many others, are reserved and not allowed. Windows also has several length restrictions; a filename valid in one folder may become invalid if moved to another folder. The rules for naming files and folders are in the Microsoft docs.

You cannot, in general, use user-generated text to create Windows directory names. If you want to allow users to name anything they want, you have to create safe names like A, AB, A2 et al., store user-generated names and their path equivalents in an application data file, and perform path mapping in your application.

If you absolutely must allow user-generated folder names, the only way to tell if they are invalid is to catch exceptions and assume the name is invalid. Even that is fraught with peril, as the exceptions thrown for denied access, offline drives, and out of drive space overlap with those that can be thrown for invalid names. You are opening up one huge can of hurt.
