Case Statement Block Level Declaration Space in C#

UPDATE: This question was used as the inspiration for this blog post; see it for further details.

http://ericlippert.com/2009/08/13/four-switch-oddities/

Thanks for the interesting question.

There are a number of confusions and mis-statements in the various other answers, none of which actually explain why this is illegal. I shall attempt to be definitive.

First off, to be strictly correct, "scope" is the wrong word to use to describe the problem. Coincidentally, I wrote a blog post last week about this exact mis-use of "scope"; that will be published after my series on iterator blocks, which will run throughout July.

The correct term to use is "declaration space". A declaration space is a region of code in which no two different things may be declared to have the same name. The scenario described here is symptomatic of the fact that a switch section does not define a declaration space, though a switch block does. Since the OP's two declarations are in the same declaration space and have the same name, they are illegal.

(Yes, the switch block also defines a scope but that fact is not relevant to the question because the question is about the legality of a declaration, not the semantics of an identifier lookup.)

A reasonable question is "why is this not legal?" A reasonable answer is "well, why should it be?" You can have it one of two ways. Either this is legal:

switch(y)
{
case 1:  int x = 123; ... break;
case 2:  int x = 456; ... break;
}

or this is legal:

switch(y)
{
case 1:  int x = 123; ... break;
case 2:  x = 456; ... break;
}

but you can't have it both ways. The designers of C# chose the second way as seeming to be the more natural way to do it.
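To make the chosen behaviour concrete, here is a minimal sketch of the second form as a complete program. The names SwitchDeclarationSpaceDemo and Describe are made up for illustration; the error number in the comment is what I would expect current compilers to report for the duplicate declaration.

```csharp
using System;

static class SwitchDeclarationSpaceDemo
{
    // Describe is a hypothetical helper used only for this illustration.
    public static string Describe(int y)
    {
        switch (y)
        {
            case 1:
                int x = 123;     // x is declared once, in the switch block's declaration space
                return "x = " + x;
            case 2:
                x = 456;         // legal: this is the same x, but it must be assigned before it is read
                return "x = " + x;
            default:
                // int x = 789;  // would not compile (CS0128): x is already defined in this block
                return "no x";
        }
    }

    static void Main()
    {
        Console.WriteLine(Describe(1));
        Console.WriteLine(Describe(2));
        Console.WriteLine(Describe(3));
    }
}
```

Note that the initializer in case 1 does not run when control enters at case 2, which is why case 2 has to assign x itself before using it.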

This decision was made on July 7th, 1999, just shy of ten years ago. The comments in the notes from that day are extremely brief, simply stating "A switch-case does not create its own declaration space" and then giving some sample code that shows what works and what does not.

To find out more about what was in the designers' minds on this particular day, I'd have to bug a lot of people about what they were thinking ten years ago -- and bug them about what is ultimately a trivial issue; I'm not going to do that.

In short, there is no particularly compelling reason to choose one way or the other; both have merits. The language design team chose one way because they had to pick one; the one they picked seems reasonable to me.

Variable declaration in a C# switch statement

I believe it has to do with the overall scope of the variable: it is a block-level scope that is defined at the switch level.

Personally, I think that if you are assigning a value inside a switch, as in your example, you would want to declare the variable outside the switch anyway for it to be of any real benefit.

C# switch case behaviour for declared vars

Because you're not starting a new scope. Personally, I almost exclusively use block scopes in my case statements:

switch (test)
{
    case "hello":
    {
        string demo = "123";
        break;
    }
    case "world":
    {
        var demo = "1234";
        break;
    }
    case "hello world":
    {
        var demo = 34;
        break;
    }
}

In my opinion, the main reasons for this are 1) simplicity, and 2) compatibility with C. There already is a syntax for starting a new block scope, and that's using { ... }. No need to add another rule "just because". In C#, there's not much point in not having a separate scope for each of the case statements, since you are prohibited from reading possibly unassigned variables.

For example, the following isn't allowed in C#:

switch (test)
{
  case 1: string demo = "Hello"; goto case 2;
  case 2: demo += " world"; break;
}

Of course, the solution to this is rather easy - just declare the local outside of the switch scope and give it a default value if needed.
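As a rough sketch of that fix (GotoCaseDemo and Greet are names invented for this example, and the empty-string default is just one possible choice):

```csharp
using System;

static class GotoCaseDemo
{
    // Greet is a hypothetical name chosen for this sketch.
    public static string Greet(int test)
    {
        string demo = "";           // declared outside the switch, with a default value
        switch (test)
        {
            case 1:
                demo = "Hello";
                goto case 2;        // explicit jump; C# has no implicit fall-through
            case 2:
                demo += " world";   // fine now: demo is definitely assigned on every path
                break;
        }
        return demo;
    }

    static void Main()
    {
        Console.WriteLine(Greet(1));
        Console.WriteLine(Greet(2));
    }
}
```

Because demo has a value before the switch is entered, the compiler no longer sees a path into case 2 on which it is unassigned.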

How is switch variable declaration scoped?

The question is: how is x scoped inside each case without any visible blocks? Meanwhile, the variable invalid can't be declared in different switch cases; it has to be inside a block.

Variables introduced via pattern matching in case labels only have the scope of the body of that case.

Variables introduced "normally" in the case bodies have the scope of the whole switch statement.

Yes, it's inconsistent - but I'd argue that:

It's particularly useful to be able to introduce multiple variables with the same name via pattern matching
The scoping of variables introduced in case statements was a design mistake to start with, and this is just preventing the mistake from going any further
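The two scoping rules described above can be seen side by side in a small sketch. It assumes C# 7 or later; PatternScopeDemo, Classify and label are illustrative names, not anything from the question.

```csharp
using System;

static class PatternScopeDemo
{
    // Classify is a hypothetical helper used only for this illustration.
    public static string Classify(object o)
    {
        switch (o)
        {
            case int n when n < 0:
                return "negative int " + n;
            case int n:                  // a second n is fine: each pattern variable is scoped to its own section
                string label = "int ";   // label, by contrast, belongs to the whole switch block
                return label + n;
            case string s:
                label = "string ";       // reuses the label declared above; redeclaring it here would be an error
                return label + s;
            default:
                return "other";
        }
    }

    static void Main()
    {
        Console.WriteLine(Classify(-5));
        Console.WriteLine(Classify(7));
        Console.WriteLine(Classify("hi"));
    }
}
```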

Note that you can't declare the same variable multiple times using pattern matches for cases which use the same case block. For example, with a simplification of your code, this is fine:

object o = null;
switch (o)
{
    case Type x when x == typeof(byte):
        break;
    case Type x when x == typeof(short):
        break;
}

But this isn't:

object o = null;
switch (o)
{
    case Type x when x == typeof(byte):
    case Type x when x == typeof(short):
        break;
}

Arguably the compiler could have some rules to allow you to introduce multiple variables so long as they're of the same type - that could be really handy for common code. But it would definitely make the language even more complicated...
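In the meantime, one possible way to share a body between the two conditions is to keep a single pattern variable and fold both tests into one when clause. This is only a sketch of that workaround (SharedCaseDemo and IsSmallInteger are made-up names), not something the compiler does for you:

```csharp
using System;

static class SharedCaseDemo
{
    // IsSmallInteger is a hypothetical helper used only for this illustration.
    public static bool IsSmallInteger(object o)
    {
        switch (o)
        {
            // One pattern variable, with both conditions combined in a single guard.
            case Type x when x == typeof(byte) || x == typeof(short):
                return true;
            default:
                return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(IsSmallInteger(typeof(byte)));
        Console.WriteLine(IsSmallInteger(typeof(int)));
    }
}
```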

As an example of the "design mistake" point, the C# 5 specification actually has an error due to it. The C# 5 spec (8.7.2) claims:

The “no fall through” rule prevents a common class of bugs that occur in C and C++ when break statements are accidentally omitted. In addition, because of this rule, the switch sections of a switch statement can be arbitrarily rearranged without affecting the behavior of the statement.

This "arbitrary rearrangement" is untrue in C# 7 due to pattern matching ordering anyway, but it's always been untrue. Consider this code:

class Test
{
    static void Main(string[] args)
    {
        switch (args.Length)
        {
            case 0:
                string x;
                break;
            case 1:
                x = args[0];
                break;
        }
    }
}

That's valid due to the odd scoping rules - x is in scope and usable in the "case 1" block. If you rearrange the cases, however:

class Test
{
    static void Main(string[] args)
    {
        switch (args.Length)
        {
            case 1:
                x = args[0]; // Invalid
                break;
            case 0:
                string x;
                break;
        }
    }
}

... this now gives a compile-time error. The variable is still in scope (the compiler knows what you mean by x) but you can't assign a value to a local variable before its declaration.

As far as I'm aware, no-one ever wants to use a variable declared in an earlier case section - it would have made much more sense either for each case block to introduce a new variable declaration space, or for C# to require braces for the case block anyway.

Why doesn't C# allow declaring variables with the same name inside different case blocks for switch statements?

A case statement does not define a variable scope. You could add something in curly braces inside your case statement to define a new variable scope.

Why does C# allow statements after a case but not before it?

Because your indentation is misleading. The first snippet is actually:

var s = "Nice";
switch (s)
{
    case "HI":
        break;
        const string x = "Nice";
    case x:
        Console.Write("Y");
        break;
}

That is, x is declared inside a case statement (though after a break), where it is valid. However, directly inside a switch statement it’s invalid – the only valid statements there are case and default.

Furthermore, const declarations are evaluated at compile time, so x is defined even though there’s a break statement before.

Note, however, that the Mono C# compiler will not compile this code; it complains that “the name ‘x’ does not exist in the current scope”, so Mono seems to implement more checks than the .NET compiler. I can’t find any rule in the C# standard that forbids this use of the const declaration, so I assume that the .NET compiler is right and the Mono compiler is wrong.

variable scope in statement blocks

By my understanding of scope, the first example should be fine.

Your understanding of scope is fine. This is not a scoping error. It is an "inconsistent use of a simple name" error.

int i = 10; // error, 'i' already exists

That is not the error that is reported. The error that is reported is "a local variable named i cannot be declared in this scope because it would give a different meaning to i which is already used in a child scope to denote something else"

The error message is telling you what the error is; read the error message again. It nowhere says that there is a conflict between the declarations; it says that the error is because that changes the meaning of the simple name. The error is not the redeclaration; it is perfectly legal to have two things in two different scopes that have the same name, even if those scopes nest. What is not legal is to have one simple name mean two different things in nested local variable declaration spaces.

You would get the error "a local variable named i is already defined in this scope" if instead you did something like

int i = 10;
int i = 10;

Surely 'i' is either in scope or not.

Sure -- but so what? Whether a given i is in scope or not is irrelevant. For example:

class C 
{
    int i;
    void M()
    {
        string i;

Perfectly legal. The outer i is in scope throughout M. There is no problem at all with declaring a local i that shadows the outer scope. What would be a problem is if you said

class C 
{
    int i;
    void M()
    {
        int x = i;
        foreach(char i in ...

Because now you've used i to mean two different things in two nested local variable declaration spaces -- a loop variable and a field. That's confusing and error-prone, so we make it illegal.
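For contrast, the legal shadowing case mentioned earlier (a local i hiding a field i) can be written out in full. This is a minimal sketch; the field's initial value, the return type of M and the use of this.i are additions for illustration only.

```csharp
using System;

class C
{
    int i = 42;                      // field; the value is arbitrary

    public string M()
    {
        string i = "local";          // legal: the local shadows the field throughout M
        return i + " / " + this.i;   // the field is still reachable through this.i
    }

    static void Main()
    {
        Console.WriteLine(new C().M());
    }
}
```

Here the simple name i means the local everywhere in M's body, so it never takes on two meanings inside one declaration space.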

Is there something non-obvious about scope I don't understand which means the compiler genuinely can't resolve this?

I don't understand the question. Obviously the compiler is able to completely analyze the program; if the compiler could not resolve the meaning of each usage of i then how could it report the error message? The compiler is completely able to determine that you've used 'i' to mean two different things in the same local variable declaration space, and reports the error accordingly.
