Why Can't We Override '||' and '&&'

Why can't we override `||` and `&&`?

Unlike some other operators on objects, whose behavior can logically depend on the class, the boolean operators are part of the language. With an operator like ==, it makes sense for the behavior to depend on the type of object: a String should compare character by character, a Hash key-value pair by key-value pair, and so on. However, the behavior of && and || is based on the language's definition of true and false, not on anything object-specific. If the language allowed you to override these operators, there could be no consistent boolean model, and the operators would become useless.

There is also a performance consideration. && and || are short-circuit operators: if the first operand of && evaluates to false, the second is never evaluated, and if the first operand of || evaluates to true, the second is never evaluated. This behavior would not be possible if you could override these operators, because in Ruby operators are overloaded as methods, and all arguments must be evaluated before a method is called. The performance boost and programming convenience of short-circuit evaluation would be lost.

How do I override && (logical and operator)?

In C#, the conditional logical operators cannot be overloaded directly.

According to the documentation:

The conditional logical operators cannot be overloaded, but they are evaluated using & and |, which can be overloaded.

This article provides more information on how to implement your own custom && and || operators.

Why '&&' and not '&'?

In most cases, && and || are preferred over & and | because the former are short-circuited, meaning that the evaluation is canceled as soon as the result is clear.

Example:

if(CanExecute() && CanSave())
{
}

If CanExecute returns false, the complete expression will be false, regardless of the return value of CanSave. Because of this, CanSave is not executed.

This is very handy in the following circumstance:

string value;
if(dict.TryGetValue(key, out value) && value.Contains("test"))
{
// Do Something
}

TryGetValue returns false if the supplied key is not found in the dictionary. Because && short-circuits, value.Contains("test") is executed only when TryGetValue returns true and value is therefore not null. If you used the non-short-circuiting & operator instead, you would get a NullReferenceException when the key is not found, because the second part of the expression is executed in any case.

A similar but simpler example of this is the following code (as mentioned by TJHeuvel):

if(op != null && op.CanExecute())
{
// Do Something
}

CanExecute is only executed if op is not null. If op is null, the first part of the expression (op != null) evaluates to false and the evaluation of the rest (op.CanExecute()) is skipped.
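The same guard pattern can be sketched in Python, which is used here only for illustration. The answer above is about C#, and the function names and the `settings` dictionary are made up for this example. Python's `and` short-circuits like `&&`, while `&` evaluates both operands first:

```python
def contains_test_short_circuit(d, key):
    value = d.get(key)  # None when the key is missing
    # `and` stops at the first falsy operand, so `in` never sees None.
    return value is not None and "test" in value


def contains_test_eager(d, key):
    value = d.get(key)
    # `&` evaluates both operands before combining them,
    # so `"test" in None` raises TypeError for a missing key.
    return (value is not None) & ("test" in value)


settings = {"name": "unit test"}  # hypothetical data

print(contains_test_short_circuit(settings, "name"))
print(contains_test_short_circuit(settings, "missing"))
try:
    contains_test_eager(settings, "missing")
except TypeError as exc:
    print("eager version failed:", exc)
```

The eager version fails for the same reason the C# `&` version does: the right-hand operand is evaluated even though the left-hand one already decided the result.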

Apart from this, technically, they are different, too:

&& and || can only be used on bool, whereas & and | can be used on bool and on any integral type (int, long, sbyte, ...). On integral types, & is the bitwise AND operator and | is the bitwise OR operator.

To be very exact, in C#, those operators (&, | [and ^]) are called "Logical operators" (see the C# spec, chapter 7.11). There are several implementations of these operators:

  1. For integers (int, uint, long and ulong, chapter 7.11.1):

    They are implemented to compute the bitwise result of the operands and the operator, i.e. & is implemented to compute the bitwise logical AND etc.
  2. For enumerations (chapter 7.11.2):

    They are implemented to perform the logical operation of the underlying type of the enumeration.
  3. For bools and nullable bools (chapter 7.11.3 and 7.11.4):

    The result is not computed using bitwise calculations. The result is basically looked up based on the values of the two operands, because the number of possibilities is so small.

    Because both values are used for the lookup, this implementation isn't short-circuiting.
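As a rough illustration of the lookup idea for nullable bools, here is a Python sketch with None standing in for a C# null. The table follows the documented `bool?` results for &; the names `AND_TABLE` and `nullable_and` are invented for this example and say nothing about how the C# compiler actually implements the operator.

```python
# None plays the role of a C# null. Each entry is the result of `left & right`
# for nullable bools: false wins, otherwise null "infects" the result.
AND_TABLE = {
    (True, True): True,
    (True, False): False,
    (True, None): None,
    (False, True): False,
    (False, False): False,
    (False, None): False,
    (None, True): None,
    (None, False): False,
    (None, None): None,
}


def nullable_and(left, right):
    # Both operands are already computed values by the time we get here,
    # which is why this form cannot short-circuit.
    return AND_TABLE[(left, right)]


print(nullable_and(None, False))
print(nullable_and(True, None))
```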

Is there actually a reason why overloaded && and || don't short circuit?

All design processes result in compromises between mutually incompatible goals. Unfortunately, the design process for the overloaded && operator in C++ produced a confusing end result: that the very feature you want from && -- its short-circuiting behavior -- is omitted.

The details of how that design process ended up in this unfortunate place, those I don't know. It is however relevant to see how a later design process took this unpleasant outcome into account. In C#, the overloaded && operator is short circuiting. How did the designers of C# achieve that?

One of the other answers suggests "lambda lifting". That is:

A && B

could be realized as something morally equivalent to:

operator_&& ( A, ()=> B )

where the second argument uses some mechanism for lazy evaluation so that when evaluated, the side effects and value of the expression are produced. The implementation of the overloaded operator would only do the lazy evaluation when necessary.
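A minimal sketch of that idea, written in Python rather than C++ or C# purely for illustration. `lazy_and` is a hypothetical stand-in for `operator_&&`, and the lambda plays the role of `()=> B`:

```python
def lazy_and(left, right_thunk):
    """Hypothetical stand-in for operator_&&(A, () => B)."""
    # The thunk is called only when the left operand does not already
    # decide the result, so B's side effects happen only when needed.
    return right_thunk() if left else left


calls = []


def expensive():
    calls.append("expensive")
    return "right value"


r1 = lazy_and(0, lambda: expensive())  # left is falsy: thunk never runs
r2 = lazy_and(1, lambda: expensive())  # left is truthy: thunk runs once

print(r1, r2, calls)
```

The cost is that every use of the operator has to allocate and call a closure, which is the "heavyweight" part mentioned below.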

This is not what the C# design team did. (Aside: though lambda lifting is what I did when it came time to do expression tree representation of the ?? operator, which requires certain conversion operations to be performed lazily. Describing that in detail would however be a major digression. Suffice to say: lambda lifting works but is sufficiently heavyweight that we wished to avoid it.)

Rather, the C# solution breaks the problem down into two separate problems:

  • should we evaluate the right-hand operand?
  • if the answer to the above was "yes", then how do we combine the two operands?

Therefore the problem is solved by making it illegal to overload && directly. Rather, in C# you must overload two operators, each of which answers one of those two questions.

class C
{
// Is this thing "false-ish"? If yes, we can skip computing the right
// hand side of an &&
public static bool operator false (C c) { whatever }

// If we didn't skip the RHS, how do we combine them?
public static C operator & (C left, C right) { whatever }
...

(Aside: actually, three. C# requires that if operator false is provided then operator true must also be provided, which answers the question: is this thing "true-ish?". Typically there would be no reason to provide only one such operator so C# requires both.)

Consider a statement of the form:

C cresult = cleft && cright;

The compiler generates code for this as though you had written this pseudo-C#:

C cresult;
C tempLeft = cleft;
cresult = C.false(tempLeft) ? tempLeft : C.&(tempLeft, cright);

As you can see, the left hand side is always evaluated. If it is determined to be "false-ish" then it is the result. Otherwise, the right hand side is evaluated, and the eager user-defined operator & is invoked.

The || operator is defined in the analogous way, as an invocation of operator true and the eager | operator:

cresult = C.true(tempLeft) ? tempLeft : C.|(tempLeft , cright);

By defining all four operators -- true, false, & and | -- C# allows you to not only say cleft && cright but also non-short-circuiting cleft & cright, and also if (cleft) if (cright) ..., and c ? consequence : alternative and while(c), and so on.
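To make the lowering concrete, here is a Python emulation of it. This is a sketch under assumptions: the class `Tri`, its `is_false`/`is_true` methods (standing in for operator false and operator true), and the helpers `cond_and`/`cond_or` are all invented for illustration. The thunk argument is only there because Python has no way to delay evaluating an operand; the C# compiler emits the conditional inline and needs no closure.

```python
class Tri:
    """Hypothetical type with C#-style operator false/true and eager & and |."""

    def __init__(self, name, truthy):
        self.name = name
        self.truthy = truthy

    def is_false(self):  # plays the role of `operator false`
        return not self.truthy

    def is_true(self):  # plays the role of `operator true`
        return self.truthy

    def __and__(self, other):  # eager `&`: both operands already exist
        return Tri(f"({self.name} & {other.name})", self.truthy and other.truthy)

    def __or__(self, other):  # eager `|`
        return Tri(f"({self.name} | {other.name})", self.truthy or other.truthy)


def cond_and(left, right_thunk):
    # Mirrors: C.false(tempLeft) ? tempLeft : C.&(tempLeft, cright)
    temp_left = left
    return temp_left if temp_left.is_false() else temp_left & right_thunk()


def cond_or(left, right_thunk):
    # Mirrors: C.true(tempLeft) ? tempLeft : C.|(tempLeft, cright)
    temp_left = left
    return temp_left if temp_left.is_true() else temp_left | right_thunk()


evaluated = []


def make_right():
    evaluated.append("right")
    return Tri("right", True)


no = Tri("no", False)
yes = Tri("yes", True)

a = cond_and(no, make_right)   # left is false-ish: right is never built
b = cond_and(yes, make_right)  # right is built, then combined with &
c = cond_or(yes, make_right)   # left is true-ish: right is never built

print(a.name, b.name, c.name, evaluated)
```

Note that the result of the short-circuited case is the left operand itself, not a boolean, exactly as in the pseudo-C# above.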

Now, I said that all design processes are the result of compromise. Here the C# language designers managed to get short-circuiting && and || right, but doing so requires overloading four operators instead of two, which some people find confusing. The operator true/false feature is one of the least well understood features in C#. The goal of having a sensible and straightforward language that is familiar to C++ users was opposed by the desire to have short circuiting and the desire to not implement lambda lifting or other forms of lazy evaluation. I think that was a reasonable compromise position, but it is important to realize that it is a compromise position. Just a different compromise position than the designers of C++ landed on.

If the subject of language design for such operators interests you, consider reading my series on why C# does not define these operators on nullable Booleans:

http://ericlippert.com/2012/03/26/null-is-not-false-part-one/

Uses of & and && operator

If you're asking about all languages then I don't think it's reasonable to talk about "the & operator". The token & could have all sorts of meanings in different languages, operator and otherwise.

For example in C alone there are two distinct & operators (unary address-of and binary bitwise-and). Unary & in C and related languages is the only example I can immediately think of, of a use I've encountered that meets your criteria.

However, C++ adds operator overloading so that they can mean anything you like for user-defined classes, and in addition the & character has meaning in type declarations. In C++0x (since standardized as C++11) the && token has meaning in type declarations too.

A language along the lines of APL or J could "reasonably" use an & operator to mean pretty much anything, since there is no expectation that code in those languages bears any resemblance at all to C-like languages. Not sure if either of those two does in fact use either & or &&.

What meanings it's "reasonable" for a binary & operator overload to have in C++ is a matter of taste - normally it would be something that's analogous to bitwise & in some way, because the values represented by your class can be considered as a sequence of bits in some way. Doesn't have to be, though, as long as it's something that makes sense in the domain. Normally it's fairly "unreasonable" to use an & overload just because & happens to be unused. But if your class represents something fairly abstruse in mathematics and you need a third binary operator after + and *, I suppose you'd start looking around. If what you want is something with even lower precedence than +, binary & is a candidate. I can't for the moment think of any structures in abstract algebra that want such a thing, but that doesn't mean there aren't any.

Overloading operator&& in C++ is moderately antisocial, since the un-overloaded version of the operator short-circuits and overloaded versions don't. C++ programmers are used to writing expressions like if (p && *p != 0), so by overloading operator&& you're in effect messing with a control structure.

Overloading unary operator& in C++ is extremely antisocial. It stops people taking pointers to your objects. IIRC there are some awkward cases where common implementations of standard templates require of their template parameters that unary operator& results in a pointer (or at least a very pointer-like thing). This is not documented in the requirements for the argument, but is either almost or completely unavoidable when the library-writer comes to implement the template. So the overload would place restrictions on the use of the class that can't be deduced from the standard, and there'd better be a very good reason for that.

[Edit: what I didn't know when I wrote this, but do know now, is that template-writers could work around the need to use unary operator& with template parameters where the standard doesn't specify what & does for that type (i.e. all of them). You can do what boost::addressof does, which is:

reinterpret_cast<Foo*>(&reinterpret_cast<char&>(foo))

The standard doesn't require much of reinterpret_cast, but since we're talking about standard templates they know exactly what it does in the implementation, and anyway it's legal to reinterpret an object as chars. I think this is guaranteed to work - but if not the implementation can ensure that it does work if necessary to write fully conforming standard templates.

But, if your implementation doesn't go to these lengths to avoid calling an overloaded operator&, the original problem remains.]

Any way to override the and operator in Python?

You cannot override the and, or, and not boolean operators.

redefine __and__ operator

__and__ is the binary (bitwise) & operator, not the logical and operator.

Because the and operator is a short-circuit operator, it can't be implemented as a function. That is, if the first argument is false, the second argument isn't evaluated at all. If you try to implement that as a function, both arguments have to be evaluated before the function can be invoked.
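A short sketch of the difference. The `Flags` class and the helper names are made up for this example. `__and__` customizes `&`, `__bool__` only influences how `and` judges truthiness, and a plain function cannot reproduce the short-circuit:

```python
class Flags:
    def __init__(self, bits):
        self.bits = bits

    def __and__(self, other):
        # Invoked for `a & b`; both operands already exist as objects.
        return Flags(self.bits & other.bits)

    def __bool__(self):
        # `and`, `or` and `not` only ask for truthiness; their control
        # flow cannot be redefined.
        return self.bits != 0


def and_function(left, right):
    # A plain function: the caller has evaluated both arguments already.
    return left and right


log = []


def noisy(value):
    log.append(value)
    return value


combined = Flags(0b110) & Flags(0b011)    # calls Flags.__and__
short = noisy(0) and noisy(1)             # noisy(1) is skipped
eager = and_function(noisy(0), noisy(2))  # noisy(2) runs before the call

print(combined.bits, short, eager, log)
```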

Why && in return and casting?

It's a logical short-circuiting and operator. It is a shorter (and faster) way to write

if (name.equals(other.name)) {
    if (salary == other.salary) {
        return hireDay.equals(other.hireDay);
    }
}
return false;

(note that the original does not involve branches). As for why it needs a cast from otherObject to Employee: it is precisely because the compiler does not know that otherObject is an Employee. In fact, you have

public boolean equals(Object otherObject)

which means otherObject is an Object (as required by Object.equals(Object)). You need a cast to tell the compiler that at runtime otherObject is an Employee (or throw a class cast exception).

If you expected the compiler to "know" that after

// if the classes don't match, they can't be equal
if (getClass() != otherObject.getClass())
    return false;

it's safe to infer otherObject is an Employee, I'm sorry to inform you that Java does not make any such inference (currently). Compilers aren't sentient (despite seeming like it sometimes).

|| and && aren't methods on Object -- what are they?

Both | and || are operators. || is part of the language, while | is implemented as a method by some classes (Array, FalseClass, Integer, NilClass and TrueClass).

In programming languages, | is used in general as the bitwise OR operator. It combines the bits of its integer operands and produces a new integer value. When used with non-integer operands, some languages convert them to integer, others prohibit such usage.

|| is the logical OR operator. It combines two boolean values (true or false) and produces another boolean value. When its operands are not boolean values, some languages convert them to boolean. Ruby (like JavaScript and other languages) evaluates the first operand as a boolean: the value of the expression is the value of the first operand if that operand is truthy, or the value of the second operand otherwise. The resulting value keeps its original type; it is not converted to boolean.

Each language uses its own rules to decide what non-boolean values are converted to false (usually the number 0, the empty string '' and null or undefined); all the other values are converted to true. The only "false" values in Ruby are false (boolean) and nil (non-boolean); all the other values (including 0) are "true".
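The "result is an operand, not a boolean" rule can be seen in any language that works this way. The sketch below uses Python, not Ruby, so the set of falsy values differs: Python treats 0 and the empty string as false, whereas Ruby treats only false and nil as false. The variable names are arbitrary.

```python
default_name = "" or "anonymous"   # "" is falsy in Python: right operand is the result
first_truthy = [1, 2] or "unused"  # a non-empty list is truthy and is returned as-is
and_result = "left" and 42         # both truthy: `and` yields the right operand
zero_case = 0 or "fallback"        # Python treats 0 as falsy; Ruby would keep the 0

print(default_name, first_truthy, and_result, zero_case)
```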

Because true || anything is true and false && anything is false, many programming languages including Ruby implement short-circuit evaluation for logical expressions.

Using short-circuit evaluation, a logical expression is evaluated from left to right, one operand at a time, until the value of the expression can be determined without computing the remaining operands. In the examples above, the value of anything doesn't change the value of the entire expression, so it is not computed at all. If anything is a method call that takes considerable time to execute, short-circuit evaluation avoids calling it and saves execution time.

As others already mentioned in comments to the question, implementing || as a method of some class is not possible. The value of its second operand must be evaluated in order to be passed as argument to the method and this breaks the short-circuiting behaviour.

The usual representation of the logical values in programming languages uses only one bit (and I guess Ruby does the same.) Results of | and || are the same for operands stored on one bit.

Ruby uses the | symbol to implement different flavors of the OR operation as follows:

  • bitwise OR for integers;
  • non-short-circuit logical OR for booleans and nil;
  • union for arrays.

An expression like:

x = false | a | b | c

ensures that all a, b and c expressions are evaluated (no short-circuit) and the value of x is the logical OR of the logical values of a, b and c.

If a, b and c are method calls, to achieve the same result using the logical OR operator (||) the code needs to look like this:

aa = a
bb = b
cc = c
x = aa || bb || cc

This way each method is called no matter what values are returned by the methods called before it.

For TrueClass, FalseClass and NilClass, the | operator is useful when short-circuit evaluation is not desired.

Also, for Array (an array is just an ordered set), the | operator implements union, an operation that is the semantic equivalent of logical OR for sets.
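Python happens to use | in a similar spread of roles, so the behaviour described above can be sketched there. This is an analogy, not Ruby code: Python lists do not support |, so a set stands in for the array union, and the `check` helper is invented to record which operands were evaluated.

```python
calls = []


def check(name, result):
    calls.append(name)
    return result


# `|` on bools: every operand is evaluated, left to right.
eager = False | check("a", True) | check("b", False) | check("c", True)
eager_calls = list(calls)

calls.clear()
# `or`: evaluation stops at the first truthy operand.
lazy = check("a", True) or check("b", False) or check("c", True)
lazy_calls = list(calls)

bitwise = 0b1010 | 0b0110  # integer bitwise OR
union = {1, 2} | {2, 3}    # set union

print(eager, eager_calls, lazy, lazy_calls, bitwise, union)
```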

Why doesn't c++ have &&= or ||= for booleans?

A bool may only be true or false in C++. As such, using &= and |= is relatively safe (even though I don’t particularly like the notation). True, they will perform bit operations rather than logical operations (and thus they won’t short-circuit) but these bit operations follow a well-defined mapping, which is effectively equivalent to the logical operations, as long as both operands are of type bool.¹

Contrary to what other people have said here, a bool in C++ must never have a different value such as 2. When assigning that value to a bool, it will be converted to true as per the standard.

The only way to get an invalid value into a bool is by using reinterpret_cast on pointers:

int i = 2;
bool b = *reinterpret_cast<bool*>(&i);
b |= true; // MAY yield 3 (but doesn’t on my PC!)

But since this code results in undefined behaviour anyway, we may safely ignore this potential problem in conforming C++ code.


¹ Admittedly this is a rather big caveat as Angew’s comment illustrates:

bool b = true;
b &= 2; // yields `false`.

The reason is that b & 2 performs integer promotion such that the expression is then equivalent to static_cast<int>(b) & 2, which results in 0, which is then converted back into a bool. So it’s true that the existence of an operator &&= would improve type safety.
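The same trap exists outside C++. As an analogy only (Python's bool is a subclass of int, which is a different mechanism from C++ integer promotion), the bitwise form and the logical form disagree as soon as one operand is not a real boolean:

```python
b = True
b &= 2      # bool is an int subclass: this computes 1 & 2
masked = b  # the int 0, which is falsy

logical = bool(True and 2)  # the logical reading of the same operands
both_bools = True & True    # stays a bool when both operands are bools

print(masked, logical, both_bools)
```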


