Initialization of Instance Fields vs. Local Variables

For local variables, the compiler has a good idea of the flow - it can see a "read" of the variable and a "write" of the variable, and prove (in most cases) that the first write will happen before the first read.

This isn't the case with instance variables. Consider a simple property - how do you know if someone will set it before they get it? That makes it basically infeasible to enforce sensible rules - so either you'd have to ensure that all fields were set in the constructor, or allow them to have default values. The C# team chose the latter strategy.
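The getter-before-setter situation can be sketched in Java, whose rules for fields are analogous. The `Temperature` class and its member names are hypothetical and exist only for this illustration. Nothing in the class tells the compiler which accessor a caller will invoke first, so the field simply starts at its default value:

```java
// Hypothetical class for illustration: one field exposed through a getter and a setter.
final class Temperature {
    // No initializer: the field starts at the default value for double, 0.0.
    private double degrees;

    double getDegrees() {
        return degrees;
    }

    void setDegrees(double degrees) {
        this.degrees = degrees;
    }

    public static void main(String[] args) {
        Temperature t = new Temperature();
        // The getter runs before any setter; this is legal and reads the default value.
        System.out.println(t.getDegrees());
        t.setDegrees(21.5);
        System.out.println(t.getDegrees());
    }
}
```

Whether reading before writing is a bug here depends entirely on the caller, which the compiler of this class may never see.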

Why do local variables require initialization, but fields do not?

Yuval and David's answers are basically correct; summing up:

  • Use of an unassigned local variable is a likely bug, and this can be detected by the compiler at low cost.
  • Use of an unassigned field or array element is less likely a bug, and it is harder to detect the condition in the compiler. Therefore the compiler makes no attempt to detect the use of an uninitialized variable for fields, and instead relies upon the initialization to the default value in order to make the program behavior deterministic.

A commenter to David's answer asks why it is impossible to detect the use of an unassigned field via static analysis; this is the point I want to expand upon in this answer.

First off, for any variable, local or otherwise, it is in practice impossible to determine exactly whether a variable is assigned or unassigned. Consider:

bool x;
if (M()) x = true;
Console.WriteLine(x);

The question "is x assigned?" is equivalent to "does M() return true?" Now, suppose M() returns true if Fermat's Last Theorem is true for all integers less than eleventy gajillion, and false otherwise. In order to determine whether x is definitely assigned, the compiler must essentially produce a proof of Fermat's Last Theorem. The compiler is not that smart.

So what the compiler does instead for locals is implement an algorithm which is fast, and which overestimates when a local is not definitely assigned. That is, it has some false positives, where it says "I can't prove that this local is assigned" even though you and I know it is. For example:

bool x;
if (N() * 0 == 0) x = true;
Console.WriteLine(x);

Suppose N() returns an integer. You and I know that N() * 0 will be 0, but the compiler does not know that. (Note: the C# 2.0 compiler did know that, but I removed that optimization, as the specification does not say that the compiler knows that.)
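Java's definite assignment rules are conservative in the same way, and a Java sketch is used here only as an illustration. The class name and the body of `n()` are assumptions. The rejected form is left in a comment so that the sketch compiles:

```java
final class ConservativeCheck {
    // Hypothetical stand-in for N(): returns some integer.
    static int n() {
        return 7;
    }

    static boolean describe() {
        boolean x;
        // Rejected by javac, although n() * 0 == 0 always holds at run time:
        //     if (n() * 0 == 0) x = true;
        //     return x;   // error: variable x might not have been initialized
        // Assigning on every path satisfies the analysis:
        if (n() * 0 == 0) {
            x = true;
        } else {
            x = false;
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The analysis never evaluates the condition; it only checks that every path reaching the read has passed through an assignment.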

All right, so what do we know so far? It is impractical for locals to get an exact answer, but we can overestimate not-assigned-ness cheaply and get a pretty good result that errs on the side of "make you fix your unclear program". That's good. Why not do the same thing for fields? That is, make a definite assignment checker that overestimates cheaply?

Well, how many ways are there for a local to be initialized? It can be assigned within the text of the method. It can be assigned within a lambda in the text of the method; that lambda might never be invoked, so those assignments are not relevant. Or it can be passed as "out" to another method, at which point we can assume it is assigned when the method returns normally. Those are very clear points at which the local is assigned, and they are right there in the same method that the local is declared. Determining definite assignment for locals requires only local analysis. Methods tend to be short -- far less than a million lines of code in a method -- and so analyzing the entire method is quite quick.

Now what about fields? Fields can be initialized in a constructor of course. Or a field initializer. Or the constructor can call an instance method that initializes the fields. Or the constructor can call a virtual method that initializes the fields. Or the constructor can call a method in another class, which might be in a library, that initializes the fields. Static fields can be initialized in static constructors. Static fields can be initialized by other static constructors.

Essentially the initializer for a field could be anywhere in the entire program, including inside virtual methods that will be declared in libraries that haven't been written yet:

// Library written by BarCorp
public abstract class Bar
{
    // Derived class is responsible for initializing x.
    protected int x;
    protected abstract void InitializeX();
    public void M()
    {
        InitializeX();
        Console.WriteLine(x);
    }
}

Is it an error to compile this library? If yes, how is BarCorp supposed to fix the bug? By assigning a default value to x? But that's what the compiler does already.

Suppose this library is legal. If FooCorp writes

public class Foo : Bar
{
    protected override void InitializeX() { }
}

is that an error? How is the compiler supposed to figure that out? The only way is to do a whole program analysis that tracks the initialization state of every field on every possible path through the program, including paths that involve choice of virtual methods at runtime. This problem can be arbitrarily hard; it can involve simulated execution of millions of control paths. Analyzing local control flows takes microseconds and depends on the size of the method. Analyzing global control flows can take hours because it depends on the complexity of every method in the program and all the libraries.
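For readers more at home in Java, here is a rough Java rendering of the same situation. The names `EmptyFoo`, `GoodFoo` and `VirtualInitDemo`, and the value 42, are additions for illustration, and `m()` returns x instead of printing it. Both subclasses compile without complaint, and only running the program reveals which one actually assigned the field:

```java
final class VirtualInitDemo {
    public static void main(String[] args) {
        System.out.println(new EmptyFoo().m());
        System.out.println(new GoodFoo().m());
    }
}

// Library side: relies on a subclass to initialize x.
abstract class Bar {
    protected int x;

    protected abstract void initializeX();

    int m() {
        initializeX();
        return x;
    }
}

// Client side: overrides the method but never assigns x, so x keeps its default.
class EmptyFoo extends Bar {
    @Override
    protected void initializeX() { }
}

// Client side: assigns x as the library author intended.
class GoodFoo extends Bar {
    @Override
    protected void initializeX() {
        x = 42;
    }
}
```

When `Bar` is compiled, neither subclass needs to exist yet, which is why a per-class check cannot decide whether x is assigned before it is read.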

So why not do a cheaper analysis that doesn't have to analyze the whole program, and just overestimates even more severely? Well, propose an algorithm that works that doesn't make it too hard to write a correct program that actually compiles, and the design team can consider it. I don't know of any such algorithm.

Now, the commenter suggests "require that a constructor initialize all fields". That's not a bad idea. In fact, it is such a not-bad idea that C# already has that feature for structs. A struct constructor is required to definitely-assign all fields by the time the ctor returns normally; the default constructor initializes all the fields to their default values.

What about classes? Well, how do you know that a constructor has initialized a field? The ctor could call a virtual method to initialize the fields, and now we are back in the same position we were in before. Structs don't have derived classes; classes might. Is a library containing an abstract class required to contain a constructor that initializes all its fields? How does the abstract class know what values the fields should be initialized to?

John suggests simply prohibiting calling methods in a ctor before the fields are initialized. So, summing up, our options are:

  • Make common, safe, frequently used programming idioms illegal.
  • Do an expensive whole-program analysis that makes the compilation take hours in order to look for bugs that probably aren't there.
  • Rely upon automatic initialization to default values.

The design team chose the third option.

Why does Java initialize only class variables by default, but not local variables?

Static/Non-static fields that are not primitives, like your Node, are initialized to null by default.
Static/Non-static fields that are primitives get their default values.
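A minimal sketch of these defaults follows. The `FieldDefaults` class and its field names are made up for illustration, and the nested `Node` is only a stand-in for the type from the question:

```java
final class FieldDefaults {
    // Hypothetical reference type, standing in for the Node from the question.
    static final class Node { }

    static Node head;   // static reference field: null
    Node next;          // instance reference field: null
    int count;          // 0
    boolean visited;    // false
    double weight;      // 0.0
    char label;         // '\u0000'

    public static void main(String[] args) {
        FieldDefaults d = new FieldDefaults();
        // None of these fields was ever assigned, yet every read is legal.
        System.out.println(head + " " + d.next + " " + d.count + " " + d.visited + " " + d.weight);
    }
}
```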

There's also another case where variables are initialized with defaults: when you instantiate an array. Each cell gets a default value, depending on the element type:

  • 0 for int
  • null for Integer
  • etc.
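The array case can be checked with a short sketch. The class and helper method names are hypothetical:

```java
final class ArrayDefaults {
    // Each helper allocates a fresh array and leaves every cell untouched.
    static int[] newInts(int n) {
        return new int[n];
    }

    static Integer[] newBoxed(int n) {
        return new Integer[n];
    }

    static boolean[] newFlags(int n) {
        return new boolean[n];
    }

    public static void main(String[] args) {
        System.out.println(newInts(3)[0]);    // primitive int cell
        System.out.println(newBoxed(3)[0]);   // reference cell
        System.out.println(newFlags(3)[0]);   // primitive boolean cell
    }
}
```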

However, inside a method, the compiler does not assign default values to local variables.

That's why your IDE warns: "may not be initialized!".

To understand why, you may be interested in this post.

Un-initialized final local variable vs un-initialized final instance variable

import java.util.*;
public class test
{
    final int x = 5;
    public static void main(String[] args)
    {
        final int y;
        System.out.println("test program");
        y=6;
        y=7;
    }
}

y=7 will give the error: The final local variable y may already have been assigned. This is because y is a final variable and it has already been assigned 6.

IMHO, a final local variable means that once assigned, it cannot be re-assigned. But with final int y you are only declaring a final variable without assignment (initialization), which is legal in Java. (But in order to use it you still have to initialize it, or an error occurs.)
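A small sketch of a legitimate use of such a blank final local follows. The class and method names are invented for illustration. The variable is declared without a value and then assigned exactly once on each path:

```java
final class BlankFinalLocal {
    static String classify(int n) {
        final String label;   // declared without a value: legal
        if (n < 0) {
            label = "negative";
        } else {
            label = "non-negative";
        }
        // A second assignment here would be rejected by the compiler:
        //     label = "other";
        return label;
    }

    public static void main(String[] args) {
        System.out.println(classify(-1));
        System.out.println(classify(3));
    }
}
```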

Update:

As commented below, you have noticed the difference between a class field final variable and a local final variable.

From Java Language Specification:

  1. A final field must be definitely assigned in the static initializer or the constructor:

    8.3.1.2 final Fields
    A field can be declared final (§4.12.4). Both class and instance variables (static
    and non-static fields) may be declared final.
    A blank final class variable must be definitely assigned by a static initializer of
    the class in which it is declared, or a compile-time error occurs (§8.7, §16.8).
    A blank final instance variable must be definitely assigned at the end of every
    constructor of the class in which it is declared, or a compile-time error occurs (§8.8,
    §16.9).

(Note that a non-final field can be left un-initialized)

  2. A local variable (whether final or not) must be explicitly given a value before it is used (chapter 4.12.5, p. 88):

    • A local variable (§14.4, §14.14) must be explicitly given a value before it is
    used, by either initialization (§14.4) or assignment (§15.26), in a way that can be
    verified using the rules for definite assignment (§16 (Definite Assignment)).
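The field rules quoted from §8.3.1.2 can be sketched as follows. The class and its members are hypothetical. Removing either constructor assignment, or the static initializer, should turn this into a compile-time error:

```java
final class BlankFinalFields {
    static final int LIMIT;   // blank final class variable
    final int id;             // blank final instance variable
    int hits;                 // non-final field: may be left at its default value

    static {
        LIMIT = 10;           // assigned by a static initializer
    }

    BlankFinalFields() {
        this.id = 0;          // every constructor must assign id
    }

    BlankFinalFields(int id) {
        this.id = id;
    }

    public static void main(String[] args) {
        System.out.println(LIMIT + " " + new BlankFinalFields(7).id);
    }
}
```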

Why must local variables be initialized, while instance variables need not be initialized before use?

As far as I know,

  • Instance variable: it is initialized at run time, when the object is created, and the default value of a reference-type instance variable is null. Using it before it is assigned therefore fails only at run time, for example with a NullPointerException when the null reference is dereferenced.
  • Local variable: unlike class and instance variables, a local variable is fussy about where you position its declaration: you must place the declaration before the first statement that actually uses the variable. Using a local variable before it is assigned is therefore rejected at compile time.

    ref: Local variable in java

Why must local variables, including primitives, always be initialized in Java?

Basically, requiring a variable to be assigned a value before you read it is a Good Thing. It means you won't accidentally read something you didn't intend to. Yes, variables could have default values - but isn't it better for the compiler to be able to catch your bug instead, if it can prove that you're trying to read something which might not have been assigned yet? If you want to give a local variable a default value, you can always assign that explicitly.

Now that's fine for local variables - but for instance and static variables, the compiler has no way of knowing the order in which methods will be called. Will a property "setter" be called before the "getter"? It has no way of knowing, so it has no way of alerting you to the danger. That's why default values are used for instance/static variables - at least then you'll get a known value (0, false, null etc) instead of just "whatever happened to be in memory at the time." (It also removes the potential security issue of reading sensitive data which hadn't been explicitly wiped.)

There was a question about this very recently for C#... - read the answers there as well, as it's basically the same thing. You might also find Eric Lippert's recent blog post interesting; it's at least around the same area, even though it has a somewhat different thrust.

Array declaration as class field vs local variable

There is a clear answer to this, and you can find it in the specification:

A variable initializer for an instance field cannot reference the instance being created. Thus, it is a compile-time error to reference this in a variable initializer, as it is a compile-time error for a variable initializer to reference any instance member through a simple_name. In the example:

class A
{
    int x = 1;
    int y = x + 1; // Error, reference to instance member of this
}

the variable initializer for y results in a compile-time error because it references a member of the instance being created.

What is the difference between local and instance variables in Java?

One extra thing I can think of:

Instance variables are given default values, e.g., null if it's an object reference, and 0 if it's an int.

Local variables don't get default values, and therefore need to be explicitly initialized (and the compiler usually complains if you fail to do this).

Java variable initialization different ways of handling?

From the Oracle documentation on Java Primitive Data Types:

Local variables are slightly different; the compiler never assigns a
default value to an uninitialized local variable. If you cannot
initialize your local variable where it is declared, make sure to
assign it a value before you attempt to use it. Accessing an
uninitialized local variable will result in a compile-time error.

So this is an interesting nuance. If a variable of a primitive type is declared locally, you must give it a value before you use it.

Why doesn't a final variable require initialization in the main method in Java?

At the instance variable level

  • A final variable can be initialized only once.

  • A final variable at class level must be initialized before the end of the constructor.

At the local (method) level

  • A final variable at method level can be initialized only once.
  • It must be initialized before it is used.

So basically, if you don't use a local final variable, you can also skip its initialization.

If the variable is at instance level, you have to initialize it in the definition or in the constructor body.

In your code you have an instance variable final int b that is never initialized, so you have an error.

You also have a local variable final int a that is never used, so there is no error for that variable.
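Since the code from the question is not reproduced here, the following is only a guess at its shape. The class name and the `run` method are made up, and b is given a value so that the sketch compiles:

```java
final class UnusedFinalLocal {
    // Dropping "= 1" without assigning b in a constructor is a compile-time error.
    final int b = 1;

    static String run() {
        final int a;   // never assigned and never read: accepted by the compiler
        return "compiled";
    }

    public static void main(String[] args) {
        System.out.println(run() + " " + new UnusedFinalLocal().b);
    }
}
```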


