Lambdas: Local Variables Need Final, Instance Variables Don'T

Lambdas: local variables need final, instance variables don't

The fundamental difference between a field and a local variable is that the local variable is copied when JVM creates a lambda instance. On the other hand, fields can be changed freely, because the changes to them are propagated to the outside class instance as well (their scope is the whole outside class, as Boris pointed out below).

The easiest way of thinking about anonymous classes, closures and labmdas is from the variable scope perspective; imagine a copy constructor added for all local variables you pass to a closure.

Why does variables in lambdas have to be final or effectively final?

It is related to multi-thread programming.

Local variables in Java have until now been immune to race conditions
and visibility problems because they are accessible only to the thread
executing the method in which they are declared. But a lambda can be
passed from the thread that created it to a different thread, and that
immunity would therefore be lost if the lambda, evaluated by the
second thread, were given the ability to mutate local variables. - Source

Why don't instance fields need to be final or effectively final to be used in lambda expressions?

Instance variables are stored in the heap space whereas local variables are stored in the stack space. Each thread maintains its own stack and hence the local variables are not shared across the threads. On the other hand, the heap space is shared by all threads and therefore, multiple threads can modify an instance variable. There are various mechanisms to make the data thread-safe and you can find many related discussions on this platform. Just for the sake of completeness, I've quoted below an excerpt from http://web.mit.edu/6.005/www/fa14/classes/18-thread-safety/

There are basically four ways to make variable access safe in
shared-memory concurrency:
Confinement. Don’t share the variable between threads. This idea is called confinement, and we’ll explore it today.
Immutability. Make the shared data immutable. We’ve talked a lot about immutability already, but there are some additional constraints
for concurrent programming that we’ll talk about in this reading.
Threadsafe data type. Encapsulate the shared data in an existing threadsafe data type that does the coordination for you. We’ll talk
about that today.
Synchronization. Use synchronization to keep the threads from accessing the variable at the same time. Synchronization is what you
need to build your own threadsafe data type.

Why variable used in lambda expression should be final or effectively final

I'd like to preface this answer by saying what I show below is not actually how lambdas are implemented. The actual implementation involves java.lang.invoke.LambdaMetafactory if I'm not mistaken. My answer makes use of some inaccuracies to better demonstrate the point.

Let's say you have the following:

public static void main(String[] args) {
  String foo = "Hello, World!";
  Runnable r = () -> System.out.println(foo);
  r.run();
}

Remember that a lambda expression is shorthand for declaring an implementation of a functional interface. The lambda body is the implementation of the single abstract method of said functional interface. At run-time an actual object is created. So the above results in an object whose class implements Runnable.

Now, the above lambda body references a local variable from the enclosing method. The instance created as a result of the lambda expression "captures" the value of that local variable. It's almost (but not really) like you have the following:

public static void main(String[] args) {
  String foo = "Hello, World!";

  final class GeneratedClass implements Runnable {
    
    private final String generatedField;

    private GeneratedClass(String generatedParam) {
      generatedField = generatedParam;
    }

    @Override
    public void run() {
      System.out.println(generatedField);
    }
  }

  Runnable r = new GeneratedClass(foo);
  r.run();
}

And now it should be easier to see the problems with supporting concurrency here:

Local variables are not considered "shared variables". This is stated in §17.4.1 of the Java Language Specification:
Memory that can be shared between threads is called shared memory or heap memory.
All instance fields, static fields, and array elements are stored in heap memory. In this chapter, we use the term variable to refer to both fields and array elements.
Local variables (§14.4), formal method parameters (§8.4.1), and exception handler parameters (§14.20) are never shared between threads and are unaffected by the memory model.
In other words, local variables are not covered by the concurrency rules of Java and cannot be shared between threads.
At a source code level you only have access to the local variable. You don't see the generated field.

I suppose Java could be designed so that modifying the local variable inside the lambda body only writes to the generated field, and modifying the local variable outside the lambda body only writes to the local variable. But as you can probably imagine that'd be confusing and counterintuitive. You'd have two variables that appear to be one variable based on the source code. And what's worse those two variables can diverge in value.

The other option is to have no generated field. But consider the following:

public static void main(String[] args) {
  String foo = "Hello, World!";
  Runnable r = () -> {
    foo = "Goodbye, World!"; // won't compile
    System.out.println(foo);
  }
  new Thread(r).start();
  System.out.println(foo);
}

What is supposed to happen here? If there is no generated field then the local variable is being modified by a second thread. But local variables cannot be shared between threads. Thus this approach is not possible, at least not without a likely non-trivial change to Java and the JVM.

So, as I understand it, the designers put in the rule that the local variable must be final or effectively final in this context in order to avoid concurrency problems and confusing developers with esoteric problems.

Why is the Variable used in Lambda expression must be final or effectively final warning ignored for instance variables

We tend to forget that instanceCounter is actually this.instanceCounter, you are capturing this which is well, effectively final. As to why this is needed, the answer is obviously here

Why do java 8 lambdas allow access to non-final class variables?

We all agree that the first example won't work as local variables or parameters must be final or effectively final to be used within a lambda expression body.

But your second example does not involve local variables or parameters, as str is an instance field. Lambda expressions can accessing instance fields the same way as instance methods:

15.27.2. Lambda Body

A lambda body is either a single expression or a block (§14.2). Like a method body, a lambda body describes code that will be executed whenever an invocation occurs.

In fact, the java compiler creates a private method lambda$0 out of your lambda expression, that simply accesses the instance field str:

private java.lang.String lambda$0() {
    0 aload_0;                /* this */
    1 getfield 14;            /* .str */
    4 areturn;
}

Another point of view: You could also implement the Supplier using a plain-old anonymous inner class:

public Supplier<String> makeSupplier() {
    return new Supplier<String>() {
        public String get() { return str; }
    };
}

Accessing instance fields from inner classes is very common and not a speciality of Java 8.

Why does java allow class level variables to be reassigned in anonymous inner class, whereas same is not allowed for local variables

To understand why non-local variables are allowed to change, we first need to understand why local variables aren't. And that's because local variables are stored on the stack (which instance (or static) variables aren't).

The problem with stack variables is that they're going to disappear once their containing method returns. However the instance of your anonymous class might live longer than that. So if accessing local variables were implemented naively, using the local variable from inside the inner class after the method returned would access a variable on a stack frame that no longer exists. That would either lead to a crash, an exception or undefined behavior depending on the exact implementation. Since that's clearly bad, access to local variables is implemented via copying instead. That is, all the local variables that are used by the class (including the special variable this) are copied into the anonymous object. So when a method of the inner class accesses a local variable x, it's not actually accessing that local variable. It's accessing a copy of it stored inside the object.

But what would happen if a local variable changed after the object was created or if a method of the object changed the variable? Well, the former would cause the local variable to change, but not the copy in the object, and the latter would change the copy, but not the original. So either way the two versions of the variable would no longer be the same, which would be very counter-intuitive to any programmer who doesn't know about the copying going on. So to avoid this problem, you're only allowed to access local variables if their value is never changed.

Instance variables don't need to be copied because they won't disappear until their containing object is garbage collected (and static variables never disappear) - since the anonymous object will contain a reference to the outer this, this won't happen until the anonymous object is garbage collected as well. So since they aren't copied, modifying them doesn't cause any issues and there's no reason to disallow it.

Variable used in lambda expression should be final or effectively final

A final variable means that it can be instantiated only one time.
in Java you can't reassign non-final local variables in lambda as well as in anonymous inner classes.

You can refactor your code with the old for-each loop:

private TimeZone extractCalendarTimeZoneComponent(Calendar cal,TimeZone calTz) {
    try {
        for(Component component : cal.getComponents().getComponents("VTIMEZONE")) {
        VTimeZone v = (VTimeZone) component;
           v.getTimeZoneId();
           if(calTz==null) {
               calTz = TimeZone.getTimeZone(v.getTimeZoneId().getValue());
           }
        }
    } catch (Exception e) {
        log.warn("Unable to determine ical timezone", e);
    }
    return null;
}

Even if I don't get the sense of some pieces of this code:

you call a v.getTimeZoneId(); without using its return value
with the assignment calTz = TimeZone.getTimeZone(v.getTimeZoneId().getValue()); you don't modify the originally passed calTz and you don't use it in this method
You always return null, why don't you set void as return type?

Hope also these tips helps you to improve.

Lambdas: Local Variables Need Final, Instance Variables Don'T