Is argv[0] = name-of-executable an accepted standard or just a common convention?
Guesswork (even educated guesswork) is fun but you really need to go to the standards documents to be sure. For example, ISO C11 states (my emphasis):
If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment.
So no, it's only the program name if that name is available. And it "represents" the program name; it is not necessarily the program name itself. The section before that states:
If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup.
This is unchanged from C99, the previous standard, and means that even the values are not dictated by the standard - it's up to the implementation entirely.
This means that the program name can be empty if the host environment doesn't provide it, and anything else if the host environment does provide it, provided that "anything else" somehow represents the program name. In my more sadistic moments, I would consider translating it into Swahili, running it through a substitution cipher then storing it in reverse byte order :-).
However, implementation-defined does have a specific meaning in the ISO standards - the implementation must document how it works. So even UNIX, which can put anything it likes into argv[0] with the exec family of calls, has to (and does) document it.
Windows vs. Linux GCC argv[0] value
No, there isn't. Under most shells on Linux, argv[0] contains exactly what the user typed to run the binary. This allows binaries to do different things depending on what the user types.
For example, a program that provides several commands may install the binary once and then hard-link each command name to that same binary. On my system:
$ ls -l /usr/bin/git*
-rwxr-xr-x 109 root wheel 2500640 16 May 18:44 /usr/bin/git
-rwxr-xr-x 2 root wheel 121453 16 May 18:43 /usr/bin/git-cvsserver
-rwxr-xr-x 109 root wheel 2500640 16 May 18:44 /usr/bin/git-receive-pack
-rwxr-xr-x 2 root wheel 1021264 16 May 18:44 /usr/bin/git-shell
-rwxr-xr-x 109 root wheel 2500640 16 May 18:44 /usr/bin/git-upload-archive
-rwxr-xr-x 2 root wheel 1042560 16 May 18:44 /usr/bin/git-upload-pack
-rwxr-xr-x 1 root wheel 323897 16 May 18:43 /usr/bin/gitk
Notice how some of these files have exactly the same size. More investigation reveals:
$ stat /usr/bin/git
234881026 459240 -rwxr-xr-x 109 root wheel 0 2500640 "Oct 29 08:51:50 2011" "May 16 18:44:05 2011" "Jul 26 20:28:29 2011" "May 16 18:44:05 2011" 4096 4888 0 /usr/bin/git
$ stat /usr/bin/git-receive-pack
234881026 459240 -rwxr-xr-x 109 root wheel 0 2500640 "Oct 29 08:51:50 2011" "May 16 18:44:05 2011" "Jul 26 20:28:29 2011" "May 16 18:44:05 2011" 4096 4888 0 /usr/bin/git-receive-pack
The inode number (459240) is identical and so these are two links to the same file on disk. When run, the binary uses the contents of argv[0] to determine which function to execute. You can see this (sort of) in the code for Git's main().
When can argv[0] have null?
With the exec class of calls, you specify the program name and program executable separately so you can set argv[0] to NULL then.
But that quote is actually from the ISO standard (possibly paraphrased) and that standard covers an awfully large range of execution environments from the smallest micro-controller to the latest z10 Enterprise-class mainframe.
Many of those embedded systems would be in the situation where an executable name makes little sense.
From the latest c1x draft:
The value of argc shall be nonnegative. The value argv[argc] shall be a null pointer. If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup.
This means that, if argc is zero (and it can be), argv[0] is NULL.
But, even when argc is not 0, you may not get the program name, since the standard also states:
If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment. If the value of argc is greater than one, the strings pointed to by argv[1] through argv[argc-1] represent the program parameters.
So, there is no requirement under the standard that a program name be provided. I've seen programs use a wide selection of options for this value:
- no value at all (for supposed security).
- a blatant lie (such as sleep for a malicious piece of code).
- the actual program name (such as sleep).
- a slightly modified one (such as -ksh for the login shell).
- a descriptive name (e.g., progname - a program for something).
What is the type of command-line argument `argv` in C?
Directly quoting from C11, chapter §5.1.2.2.1/p2, program startup, (emphasis mine)
int main(int argc, char *argv[]) { /* ... */ }
[...] If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, [...]
and
[...] and the strings pointed to by the argv array [...]
So, basically, argv is a pointer to the first element of an array of strings (see the NOTE below). This can be made clearer from the alternative form,
int main(int argc, char **argv) { /* ... */ }
You can rephrase that as pointer to the first element of an array of pointers to the first element of null-terminated char arrays, but I'd prefer to stick to strings.
NOTE:
To clarify the usage of "pointer to the first element of an array" in the above answer, here is §6.3.2.1/p3:
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. [...]
Can argv[0] contain an empty string?
It's implementation defined. §5.1.2.2.1 abridged:
If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup. The intent is to supply to the program information determined prior to program startup from elsewhere in the hosted environment. [...] If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment. [...]
So if argc is greater than zero, it's quite the intention that argv[0] never be an empty string, but it could happen. (Note that with argc equal to n, argv[0] through argv[n - 1] are never null and always point to a string. The string itself may be empty, though. If n is zero, argv[0] is null.)
In practice, of course, you just need to make sure the platforms you're targeting behave as needed.