How Does the Main() Method Work in C

How does the main() method work in C?

Some of the features of the C language started out as hacks which just happened to work.

Multiple signatures for main, as well as variable-length argument lists, are among those features.

Programmers noticed that they could pass extra arguments to a function, and nothing bad happened with their given compiler.

This is the case if the calling conventions are such that:

  1. The calling function cleans up the arguments.
  2. The leftmost arguments are closer to the top of the stack, or to the base of the stack frame, so that spurious arguments do not invalidate the addressing.

One set of calling conventions which obeys these rules is stack-based parameter passing whereby the caller pops the arguments, and they are pushed right to left:

;; pseudo-assembly-language
;; main(argc, argv, envp); call

push envp ;; rightmost argument
push argv ;;
push argc ;; leftmost argument ends up on top of stack

call main

pop ;; caller cleans up
pop
pop

In compilers that use this type of calling convention, nothing special needs to be done to support the two kinds of main, or even additional kinds. main can be a function of no arguments, in which case it is oblivious to the items that were pushed onto the stack. If it's a function of two arguments, then it finds argc and argv as the two topmost stack items. If it's a platform-specific three-argument variant with an environment pointer (a common extension), that will work too: it will find that third argument as the third element from the top of the stack.

And so a fixed call works for all cases, allowing a single, fixed start-up module to be linked to the program. That module could be written in C, as a function resembling this:

/* I'm adding envp to show that even a popular platform-specific variant
can be handled. */
extern int main(int argc, char **argv, char **envp);

void __start(void)
{
/* This is the real startup function for the executable.
It performs a bunch of library initialization. */

/* ... */

/* And then: */
exit(main(argc_from_somewhere, argv_from_somewhere, envp_from_somewhere));
}

In other words, this start module just calls a three-argument main, always. If main takes no arguments, or only int and char **, it still works, just as it does when main takes all three, because of the calling conventions.

If you were to do this kind of thing in your program, it would be nonportable and considered undefined behavior by ISO C: declaring and calling a function in one manner, and defining it in another. But a compiler's startup trick does not have to be portable; it is not guided by the rules for portable programs.

But suppose that the calling conventions are such that it cannot work this way. In that case, the compiler has to treat main specially. When it notices that it's compiling the main function, it can generate code which is compatible with, say, a three-argument call.

That is to say, you write this:

int main(void)
{
/* ... */
}

But when the compiler sees it, it essentially performs a code transformation so that the function which it compiles looks more like this:

int main(int __argc_ignore, char **__argv_ignore, char **__envp_ignore)
{
/* ... */
}

except that names such as __argc_ignore don't literally exist. No such names are introduced into your scope, and there won't be any warning about unused arguments.
The code transformation causes the compiler to emit code with the correct linkage, which knows that it has to clean up three arguments.

Another implementation strategy is for the compiler or perhaps linker to custom-generate the __start function (or whatever it is called), or at least select one from several pre-compiled alternatives. Information could be stored in the object file about which of the supported forms of main is being used. The linker can look at this info, and select the correct version of the start-up module which contains a call to main which is compatible with the program's definition. C implementations usually have only a small number of supported forms of main so this approach is feasible.

Compilers for C99 and later always have to treat main specially, to some extent, to support the hack that if the function terminates without a return statement, the behavior is as if return 0 were executed. This, again, can be treated by a code transformation. The compiler notices that a function called main is being compiled. Then it checks whether the end of the body is potentially reachable. If so, it inserts a return 0; there.

What is the purpose of the arguments to main in C?

The signature of main is:

int main(int argc, char **argv);

Here argc is the number of command-line arguments passed in, which includes the name of the program as invoked by the user.

argv contains the actual arguments, starting with index 1, since index 0 is the program name.

So, if you run your program like this:

./program hello world

Then:

argc would be 3.

argv[0] would be ./program.

argv[1] would be hello.

argv[2] would be world.

I hope this is clear enough for you.


Return type of main function

Only book authors seem to be privy to the place where a return type of void for main() is allowed. The C++ standard forbids it completely.

The C standard says that the standard forms are:

int main(void) { ... }

and

int main(int argc, char **argv) { ... }

allowing alternative but equivalent forms of declaration for the argument types (and the names are completely discretionary, of course, since they're local variables to the function).

The C standard does make a small provision for 'in some other implementation-defined manner'. The ISO/IEC 9899:2011 standard says:

5.1.2.2.3 Program termination


If the return type of the main function is a type compatible with int, a return from the
initial call to the main function is equivalent to calling the exit function with the value
returned by the main function as its argument;11) reaching the } that terminates the
main function returns a value of 0. If the return type is not compatible with int, the
termination status returned to the host environment is unspecified.

11) In accordance with 6.2.4, the lifetimes of objects with automatic storage duration declared in main
will have ended in the former case, even where they would not have in the latter.

This clearly allows for non-int return types, but makes it clear that the termination status is then unspecified. So, void might be allowed as the return type of main() by some implementation, but you can only find that out from its documentation.

(Although I'm quoting the C2011 standard, essentially the same words were in C99, and I believe C89 though my text for that is at the office and I'm not.)

Incidentally, Annex J of the standard mentions:

J.5 Common extensions


The following extensions are widely used in many systems, but are not portable to all
implementations. The inclusion of any extension that may cause a strictly conforming
program to become invalid renders an implementation nonconforming. Examples of such
extensions are new keywords, extra library functions declared in standard headers, or
predefined macros with names that do not begin with an underscore.

J.5.1 Environment arguments


In a hosted environment, the main function receives a third argument, char *envp[],
that points to a null-terminated array of pointers to char, each of which points to a string
that provides information about the environment for this execution of the program
(5.1.2.2.1).

Why does void main() work?

The question observes that void main() works. It 'works' because the compiler does its best to generate code for programs. Compilers such as GCC will warn about non-standard forms for main(), but will process them. The linker isn't too worried about the return type; it simply needs a symbol main (or possibly _main, depending on the system) and when it finds it, links it into the executable. The start-up code assumes that main has been defined in the standard manner. If main() returns to the startup code, it collects the returned value as if the function returned an int, but that value is likely to be garbage. So, it sort of seems to work as long as you don't look for the exit status of your program.

What should main() return in C and C++?

The return value for main indicates how the program exited. Normal exit is represented by a 0 return value from main. Abnormal exit is signaled by a non-zero return, but there is no standard for how non-zero codes are interpreted. As noted by others, void main() is prohibited by the C++ standard and should not be used. The valid C++ main signatures are:

int main()

and

int main(int argc, char* argv[])

which is equivalent to

int main(int argc, char** argv)

It is also worth noting that in C++, int main() can be left without a return statement, in which case it defaults to returning 0. The same is true in C99 and later. Whether return 0; should be omitted or not is open to debate. The range of valid C program main signatures is much greater.

Efficiency is not an issue with the main function. It can only be entered and left once (marking the program's start and termination) according to the C++ standard. For C, re-entering main() is allowed, but should be avoided.

How to run a function with empty main() method in C?

Running a program with no main() is quite possible, but it involves defining the _start() function yourself.

/*main.c*/

#include <stdio.h>
#include <stdlib.h>

void _start(void)
{
printf("No main function!\n");
exit(0);
}

Compile with:

For Windows (10, gcc 8.1.0) and Ubuntu (18.04, gcc 9.2.0):

gcc main.c -nostartfiles

For macOS (10.14.6, Xcode 11.3):

clang -Wl,-e,-Wl,__start main.c

More information about program start-up on Linux: "Linux x86 Program Start Up".

Why do we need to use `int main` and not `void main` in C++?

The short answer is that the C++ standard requires main() to return int.

As you probably know, the return value from the main() function is used by the runtime library as the exit code for the process. Both Unix and Win32 support the concept of a (small) integer returned from a process after it has finished. Returning a value from main() provides one way for the programmer to specify this value.

Is a main() required for a C program?

No, the ISO C standard states that a main function is only required for a hosted environment (such as one with an underlying OS).

For a freestanding environment like an embedded system (or an operating system itself), it's implementation-defined. From C99 5.1.2:

Two execution environments are defined: freestanding and hosted. In both cases, program startup occurs when a designated C function is called by the execution environment.

In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined.

As to how Linux itself starts, the start point for the Linux kernel is start_kernel.


