Variable Scope and Order of Parsing VS. Operations: Assignment in an "If"

Variable scope and order of parsing vs. operations: Assignment in an if

It only happens when you try to assign a literal value, if you call a function it works.

def foo(a)
  a
end

p 'not shown' if(value = foo(false))
p 'shown' if(value = foo(true))

# This outputs a Warning in IRB
p 'shown' if(value = false)
(irb):2: warning: found = in conditional, should be ==

If you turn on debugging (-d) you will see a warning about an used variable value

warning: assigned but unused variable - value

This "works" because the statement does evaluate to true as far as if is concerned, allowing the code preceeding it to run.

What is happening here is that if() when used as a modifier has it's own binding scope, or context. So the assignment is never seen outside of the if, and therefore makes no sense to perform. This is different than if the control structure because the block that the if statement takes is also within the same scope as the assignment, whereas the line that preceeded the if modifier is not within the scope of the if.

In other words, these are not equivelant.

if a = some(value)
  puts a
end

puts a if(a = some(value))

The former having puts a within the scope of the if, the latter having puts a outside the scope, and therefore having different bindings(what ruby calls context).

Ruby Order of Operations

Ruby: Use the return of the conditional for variable assignment and comparison

What the warning is telling you to do is:

res = if block_given?
        yield(array[i], array[i+1])
      else
        array[i] - array[i+1]
      end

That is, having a single assignment instead of two (or even more).

NameError: undefined - have parsing rules for local variables changed in Ruby 2.1.2?

Yes it changed in ruby 2.1.2

In 1.8.7, 1.9.3, 2.0.0 and even 2.1.1 I get 2 warnings and no errors:

2.0.0-p247 :007 > bar if bar = false
(irb):7: warning: found = in conditional, should be ==
 => nil 
2.0.0-p247 :008 > bar if bar = true
(irb):8: warning: found = in conditional, should be ==
 => true

whereas in the 2.1.2 version you mention I get 2 warnings and 1 NameError error.

2.1.2 :001 > bar if bar = true
(irb):1: warning: found = in conditional, should be ==
NameError: undefined local variable or method `bar' for main:Object
        from (irb):1
        from /home/durrantm/.rvm/rubies/ruby-2.1.2/bin/irb:11:in `<main>'
2.1.2 :002 > bar if bar = false
(irb):2: warning: found = in conditional, should be ==
 => nil

This is on my Ubuntu 14

Ruby forgets local variables during a while loop?

I think this is because message is defined inside the loop. At the end of the loop iteration "message" goes out of scope. Defining "message" outside of the loop stops the variable from going out of scope at the end of each loop iteration. So I think you have the right answer.

You could output the value of message at the beginning of each loop iteration to test whether my suggestion is correct.

Why is a statement like 1 + n *= 3 allowed in Ruby?

Checking ruby -y output, you can see exactly what is happening. Given the source of 1 + age *= 2, the output suggests this happens (simplified):

tINTEGER found, recognised as simple_numeric, which is a numeric, which is a literal, which is a primary. Knowing that + comes next, primary is recognised as arg.

+ found. Can't deal yet.

tIDENTIFIER found. Knowing that next token is tOP_ASGN (operator-assignment), tIDENTIFIER is recognised as user_variable, and then as var_lhs.

tOP_ASGN found. Can't deal yet.

tINTEGER found. Same as last one, it is ultimately recognised as primary. Knowing that next token is \n, primary is recognised as arg.

At this moment we have arg + var_lhs tOP_ASGN arg on stack. In this context, we recognise the last arg as arg_rhs. We can now pop var_lhs tOP_ASGN arg_rhs from stack and recognise it as arg, with stack ending up as arg + arg, which can be reduced to arg.

arg is then recognised as expr, stmt, top_stmt, top_stmts. \n is recognised as term, then terms, then opt_terms. top_stmts opt_terms are recognised as top_compstmt, and ultimately program.

On the other hand, given the source 1 + age * 2, this happens:

tINTEGER found, recognised as simple_numeric, which is a numeric, which is a literal, which is a primary. Knowing that + comes next, primary is recognised as arg.

+ found. Can't deal yet.

tIDENTIFIER found. Knowing that next token is *, tIDENTIFIER is recognised as user_variable, then var_ref, then primary, and arg.

* found. Can't deal yet.

tINTEGER found. Same as last one, it is ultimately recognised as primary. Knowing that next token is \n, primary is recognised as arg.

The stack is now arg + arg * arg. arg * arg can be reduced to arg, and the resultant arg + arg can also be reduced to arg.

What's the critical difference? In the first piece of code, age (a tIDENTIFIER) is recognised as var_lhs (left-hand-side of assignment), but in the second one, it's var_ref (a variable reference). Why? Because Bison is a LALR(1) parser, meaning that it has one-token look-ahead. So age is var_lhs because Ruby saw tOP_ASGN coming up; and it was var_ref when it saw *. This comes about because Ruby knows (using the huge state transition table that Bison generates) that one specific production is impossible. Specifically, at this time, the stack is arg + tIDENTIFIER, and next token is *=. If tIDENTIFIER is recognised as var_ref (which leads up to arg), and arg + arg reduced to arg, then there is no rule that starts with arg tOP_ASGN; thus, tIDENTIFIER cannot be allowed to become var_ref, and we look at the next matching rule (the var_lhs one).

So Aleksei is partly right in that there is some truth to "when it sees a syntax error, it tries another way", but it is limited to one token into future, and the "attempt" is just a lookup in the state table. Ruby is incapable of complex repair strategies we humans use to understand sentences like "the horse raced past the barn fell", where we happily parse till the last word, then reevaluate the whole sentence when the first parse turns out impossible.

tl;dr: The precedence table is not exactly correct. There is no place in Ruby source where it exists; rather, it is the result of the interplay of various parsing rules. Many of the precedence rules break in when left-hand-side of an assignment is introduced.

What are the differences between = and - assignment operators?

What are the differences between the assignment operators = and <- in R?

As your example shows, = and <- have slightly different operator precedence (which determines the order of evaluation when they are mixed in the same expression). In fact, ?Syntax in R gives the following operator precedence table, from highest to lowest:

…
‘-> ->>’           rightwards assignment
‘<- <<-’           assignment (right to left)
‘=’                assignment (right to left)
…

But is this the only difference?

Since you were asking about the assignment operators: yes, that is the only difference. However, you would be forgiven for believing otherwise. Even the R documentation of ?assignOps claims that there are more differences:

The operator <- can be used anywhere,
whereas the operator = is only allowed at the top level (e.g.,
in the complete expression typed at the command prompt) or as one
of the subexpressions in a braced list of expressions.

Let’s not put too fine a point on it: the R documentation is wrong. This is easy to show: we just need to find a counter-example of the = operator that isn’t (a) at the top level, nor (b) a subexpression in a braced list of expressions (i.e. {…; …}). — Without further ado:

x
# Error: object 'x' not found
sum((x = 1), 2)
# [1] 3
x
# [1] 1

Clearly we’ve performed an assignment, using =, outside of contexts (a) and (b). So, why has the documentation of a core R language feature been wrong for decades?

It’s because in R’s syntax the symbol = has two distinct meanings that get routinely conflated (even by experts, including in the documentation cited above):

The first meaning is as an assignment operator. This is all we’ve talked about so far.
The second meaning isn’t an operator but rather a syntax token that signals named argument passing in a function call. Unlike the = operator it performs no action at runtime, it merely changes the way an expression is parsed.

So how does R decide whether a given usage of = refers to the operator or to named argument passing? Let’s see.

In any piece of code of the general form …

‹function_name›(‹argname› = ‹value›, …)
‹function_name›(‹args›, ‹argname› = ‹value›, …)

… the = is the token that defines named argument passing: it is not the assignment operator. Furthermore, = is entirely forbidden in some syntactic contexts:

if (‹var› = ‹value›) …
while (‹var› = ‹value›) …
for (‹var› = ‹value› in ‹value2›) …
for (‹var1› in ‹var2› = ‹value›) …

Any of these will raise an error “unexpected '=' in ‹bla›”.

In any other context, = refers to the assignment operator call. In particular, merely putting parentheses around the subexpression makes any of the above (a) valid, and (b) an assignment. For instance, the following performs assignment:

median((x = 1 : 10))

But also:

if (! (nf = length(from))) return()

_{Now you might object that such code is atrocious (and you may be right). But I took this code from the base::file.copy function (replacing <- with =) — it’s a pervasive pattern in much of the core R codebase.}

The original explanation by John Chambers, which the the R documentation is probably based on, actually explains this correctly:

[= assignment is] allowed in only two places in the grammar: at the top level (as a complete program or user-typed expression); and when isolated from surrounding logical structure, by braces or an extra pair of parentheses.

In sum, by default the operators <- and = do the same thing. But either of them can be overridden separately to change its behaviour. By contrast, <- and -> (left-to-right assignment), though syntactically distinct, always call the same function. Overriding one also overrides the other. Knowing this is rarely practical but it can be used for some fun shenanigans.

Does the || operator evaluate the second argument even if the first argument is true?

This happens because the ruby interpreter defines a variable when it sees an assignment to it (but before it executes the actual line of code). You can read more about it in this answer.

Boolean OR (||) expression will evaluate to the value of left hand expression if it is not nil and not false, else || will evaluate to the value of right hand expression.

In your example the ruby interpreter sees an assignment to a and rr (but it doesn't execute this line yet), and initializes (defines, creates) a and rr with nil. Then it executes the || expression. In this || expression, a is assigned to 10 and 10 is returned. r=20 is not evaluated, and rr is not changed (it is still nil). This is why in the next line rr is nil.

Variable Scope and Order of Parsing VS. Operations: Assignment in an "If"