Why Should Casting Be Avoided

Why should casting be avoided?

You've tagged this with three languages, and the answers are really quite different between the three. Discussion of C++ more or less implies discussion of C casts as well, and that gives (more or less) a fourth answer.

Since it's the one you didn't mention explicitly, I'll start with C. C casts have a number of problems. One is that they can do any of a number of different things. In some cases, the cast does nothing more than tell the compiler (in essence): "shut up, I know what I'm doing" -- i.e., it ensures that even when you do a conversion that could cause problems, the compiler won't warn you about those potential problems. Just for example, char a=(char)123456;. The exact result of this is implementation defined (it depends on the size and signedness of char), and except in rather strange situations, probably isn't useful. C casts also vary in whether they're something that happens only at compile time (i.e., you're just telling the compiler how to interpret/treat some data) or something that happens at run time (e.g., an actual conversion from double to long).
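A minimal sketch of those three flavours, written so it compiles as C++ but using only C-style casts. The function names are made up for illustration, and unsigned char is used instead of char so that the narrowing result is well defined (the value is reduced modulo UCHAR_MAX + 1).

```cpp
#include <cassert>
#include <climits>

// "Shut up, I know what I'm doing": the cast suppresses any narrowing
// warning; the value is silently reduced modulo UCHAR_MAX + 1.
unsigned char narrow_to_uchar(int value) {
    return (unsigned char)value;
}

// Compile-time only: no conversion code is generated, the compiler is just
// told to treat the same bytes as unsigned char (which may alias any object).
unsigned char first_byte(const unsigned int* p) {
    return *(const unsigned char*)p;
}

// Run-time conversion: the double is actually converted, and the
// fractional part is discarded (truncation toward zero).
long truncate_to_long(double value) {
    return (long)value;
}
```

All three casts look identical at the call site, which is exactly the problem: nothing in the syntax says which kind of operation is being requested.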

C++ attempts to deal with that to at least some extent by adding a number of "new" cast operators, each of which is restricted to only a subset of the capabilities of a C cast. This makes it more difficult to (for example) accidentally do a conversion you really didn't intend -- if you only intend to cast away constness on an object, you can use const_cast, and be sure that the only thing it can affect is whether an object is const, volatile, or not. Conversely, a static_cast is not allowed to affect whether an object is const or volatile. In short, you have most of the same types of capabilities, but they're categorized so one cast can generally only do one kind of conversion, where a single C-style cast can do two or three conversions in one operation. The primary exception is that you can use a dynamic_cast in place of a static_cast in at least some cases and despite being written as a dynamic_cast, it'll really end up as a static_cast. For example, you can use dynamic_cast to traverse up or down a class hierarchy -- but a cast "up" the hierarchy is always safe, so it can be done statically, while a cast "down" the hierarchy isn't necessarily safe so it's done dynamically.
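A short sketch of that division of labour. The types Base, Derived and Other and the helper functions are hypothetical, chosen only to show which operator does which job.

```cpp
#include <cassert>

struct Base { virtual ~Base() {} };   // polymorphic, so dynamic_cast can be used
struct Derived : Base { int tag = 7; };
struct Other : Base {};

// const_cast can only add or remove const/volatile; it cannot change the type.
// Reading through the alias is fine; writing through it would be undefined
// behaviour if the referenced object had been defined const.
int read_through_nonconst(const int& value) {
    int& alias = const_cast<int&>(value);
    return alias;
}

// A cast "up" the hierarchy is always safe, so it is resolved statically.
Base* up(Derived* d) {
    return static_cast<Base*>(d);
}

// A cast "down" the hierarchy is checked at run time;
// for pointers, failure is reported as a null pointer.
Derived* down(Base* b) {
    return dynamic_cast<Derived*>(b);
}
```

Trying to write static_cast<int&>(value) in the first function would be rejected by the compiler, because static_cast is not allowed to drop the const.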

Java and C# are much more similar to each other. In particular, with both of them casting is (virtually?) always a run-time operation. In terms of the C++ cast operators, it's usually closest to a dynamic_cast in terms of what's really done -- i.e., when you attempt to cast an object to some target type, the compiler inserts a run-time check to see whether that conversion is allowed, and throws an exception if it's not. The exact details (e.g., the name used for the "bad cast" exception) vary, but the basic principle remains mostly similar (though, if memory serves, Java does make casts applied to the few non-object types like int much closer to C casts -- but these types are used rarely enough that 1) I don't remember that for sure, and 2) even if it's true, it doesn't matter much anyway).

Looking at things more generally, the situation's pretty simple (at least IMO): a cast (obviously enough) means you're converting something from one type to another. When/if you do that, it raises the question "Why?" If you really want something to be a particular type, why didn't you define it to be that type to start with? That's not to say there's never a reason to do such a conversion, but anytime it happens, it should prompt the question of whether you could re-design the code so the correct type was used throughout. Even seemingly innocuous conversions (e.g., between integer and floating point) should be examined much more closely than is common. Despite their seeming similarity, integers should really be used for "counted" types of things and floating point for "measured" kinds of things. Ignoring the distinction is what leads to some of the crazy statements like "the average American family has 1.8 children." Even though we can all see how that happens, the fact is that no family has 1.8 children. They might have 1 or they might have 2 or they might have more than that -- but never 1.8.

Can casts be completely avoided with a good design?

If by code smell you mean that it should raise a flag in a code review, then they are a code smell. If you mean that they should never appear in code, then no, there are some fine uses of casts.

For an interesting example (I always find type erasure interesting), take a look at the implementation of boost::any, where dynamic_cast is required to safely read from the stored value (unlike unions, where you must guess the type and are limited).

Sketch:

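#include <typeinfo>   // std::bad_cast

// Thrown when the stored type does not match the requested one.
struct invalid_cast : std::bad_cast {};

struct any_base {
   virtual ~any_base() {}
};
template <typename T>
struct any_data : any_base {
   T value;
   any_data( T const & value ) : value(value) {}
};
struct any {
   any_base * data;
   any() : data() {}
   ~any() { delete data; }
   // Sketch only: copy construction and assignment are omitted,
   // so copying an any would delete the same pointer twice.

   template <typename T>
   any( T const & v ) : data( new any_data<T>(v) ) {}
};
template <typename T>
T any_cast( any const & a ) {
   // dynamic_cast yields a null pointer unless a.data really is an any_data<T>
   any_data<T> * p = dynamic_cast< any_data<T>* >( a.data );
   if ( !p ) throw invalid_cast();
   return p->value;
}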
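For comparison, C++17 ships std::any, which offers the same kind of checked read through std::any_cast. The sketch below assumes a C++17 compiler; the helper name read_int_or is hypothetical, and nothing here says how a given standard library performs the check internally.

```cpp
#include <any>
#include <cassert>
#include <string>

// Returns the stored int, or fallback when the any is empty
// or holds a value of some other type.
int read_int_or(const std::any& box, int fallback) {
    try {
        return std::any_cast<int>(box);   // throws std::bad_any_cast on mismatch
    } catch (const std::bad_any_cast&) {
        return fallback;
    }
}
```

As with the sketch above, the caller has to name the expected type, and a wrong guess is reported instead of silently reinterpreting the bytes the way a union would.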

Need some clarification regarding casting in C

There are several situations that require perfectly valid casting in C. Beware of sweeping assertions like "casting is always bad design", since they are obviously and patently bogus.

One huge group of situations that critically relies on casts is arithmetic operations. A cast is required when you need to force the compiler to evaluate an arithmetic expression in a type different from the "default" one. As in

unsigned i = ...;
unsigned long s = (unsigned long) i * i;

to avoid overflow. Or in

double d = (double) i / 5;

in order to make the compiler switch to floating-point division. Or in

s = (unsigned) d * 3 + i;

in order to take the whole part of the floating point value. And so on (the examples are endless).
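The three expressions above can be wrapped into small functions to see the effect of each cast. This is a sketch in C++ syntax with C-style casts; the function names are invented, and fixed-width types from <cstdint> are used on the assumption that unsigned int is 32 bits wide, since unsigned long is not guaranteed to be wider than unsigned on every platform.

```cpp
#include <cassert>
#include <cstdint>

// The cast widens the first operand, so the multiplication is done in 64 bits.
std::uint64_t square_wide(std::uint32_t i) {
    return (std::uint64_t)i * i;
}

// Without the cast the product is computed in 32 bits and wraps around
// before it is widened for the return value.
std::uint64_t square_narrow(std::uint32_t i) {
    return i * i;
}

// The cast switches the division from integer to floating point.
double fifth(unsigned i) {
    return (double)i / 5;
}

// The cast binds to d alone: the fractional part is dropped first,
// then the whole part is multiplied.
unsigned whole_part_times_three(double d) {
    return (unsigned)d * 3;
}
```

Note that in each case the cast applies to a single operand, not to the whole expression; writing (unsigned long)(i * i) would widen the result only after the overflow had already happened.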

Another group of valid uses is idioms, i.e. well-established coding practices. For example, there is the classic C idiom in which a function takes a const pointer as an input and returns a non-const pointer to the same (potentially constant) data, as the standard strstr does. Implementing this idiom usually requires a cast in order to cast away the constness of the input. Someone might call it bad design, but in reality there's no better design alternative in C. Otherwise, it wouldn't be a well-established idiom :)
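A rough sketch of that idiom, using a hypothetical function find_first modelled on the signature of the C library's strchr. It is written in C++ syntax so it can sit next to the other examples, but the cast is spelled as a C-style cast, the way it would be in C.

```cpp
#include <cassert>

// Accepts const input so it can be called on read-only strings, but returns
// a non-const pointer so callers that own a writable buffer can modify it.
char* find_first(const char* text, char wanted) {
    for (; *text != '\0'; ++text) {
        if (*text == wanted) {
            return (char*)text;   // casts away const on purpose
        }
    }
    return nullptr;               // not found
}
```

The caller stays responsible for not writing through the result when the original data really was constant (a string literal, for instance).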

Also it is worth mentioning, as an example, that a pedantically correct use of the standard printf function might require casts on the arguments in the general case. (For example, the %p format specifier expects a void * pointer as an argument, which means that an int * argument has to be converted to void * in one way or another. An explicit cast is the most logical way to perform the conversion.)
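A small sketch of the %p case, using snprintf so the result can be inspected instead of printed. The wrapper format_pointer is hypothetical, and the text produced for a pointer is implementation defined, so no particular output is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Writes the textual form of p into out and returns the number of
// characters snprintf wanted to write (negative on error).
int format_pointer(char* out, std::size_t size, int* p) {
    // Variadic arguments get no implicit conversion to void *,
    // so the int * is converted explicitly to match %p.
    return std::snprintf(out, size, "%p", (void*)p);
}
```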

Of course, there are numerous other examples of perfectly valid situations when casts are required.

The problems with casts usually arise when people use them thoughtlessly, even where they are not required (like casting the return of malloc, which is bad for more reasons than one). Or when people use casts to force the compiler to accept their bad code. Needless to say, it takes a certain level of expertise to tell a valid cast situation from a bad one.

In some cases casts are used to make the compiler stop issuing some annoying and unnecessary warning messages. These casts belong to the gray area between the good and the bad casts. On the one hand, unnecessary casts are bad. On the other hand, the user might not have control over the compilation settings, thus making the casts the only way to deal with the warnings.

Java How To Avoid Type Casting

Short answer

Generics is not the right tool here. You can make the casting explicit:

public class CompanyPDFGenerator implements EntityPDFGenerator
{
    public void generate(Entity entity)
    {
        if (! (entity instanceof Company)) {
            throw new IllegalArgumentException("CompanyPDFGenerator works with Company object. You provided " + (entity == null ? "null" : entity.getClass().getName()));
        }
        Company company = (Company) entity;
        System.out.println(company);
        // create Company related PDF
    }
}

Or you can define some sort of data structure in the entity class and use only that in the printer:

public abstract class Entity
{
    int id;
    public abstract EntityPdfData getPdfData();
}

// ...

public class CompanyPDFGenerator implements EntityPDFGenerator
{
    public void generate(Entity entity)
    {
        EntityPdfData entityPdfData = entity.getPdfData();
        // create Company related PDF
    }
}

Long answer

Generics are useful if you know the types at compile time, i.e. if you can write the actual type into your program. For lists this is simple:

// now you know at compile time that you need a list of integers
List<Integer> list = new ArrayList<>();

In your example you don't know that:

public void generate(Entity entity)
{
    // either Article or Company can come in. It's a general method
    EntityPDFGenerator pdfGenerator = getConcretePDFGenerator(entity);
    pdfGenerator.generate(entity);

}

Suppose you want to add a type parameter to EntityPDFGenerator, like this:

public static interface EntityPDFGenerator<T extends Entity>
{
    void generate(T entity);
}

public static class ArticlePDFGenerator implements EntityPDFGenerator<Article>
{
    public void generate(Article entity)
    {
        Article article = entity; // no cast needed: the parameter is already an Article
        // create Article related PDF from entity
    }
}

public static class CompanyPDFGenerator implements EntityPDFGenerator<Company>
{
    public void generate(Company entity)
    {
        Company company = entity; // no cast needed: the parameter is already a Company
        // create Company related PDF
    }
}

This looks nice. However, getting the right generator will be tricky. Java generics are invariant. Even ArrayList<Integer> is not a subtype of ArrayList<Number>. So, ArticlePDFGenerator is not a subtype of EntityPDFGenerator<T> for an arbitrary T extends Entity. I.e. this will not compile:

<T extends Entity> EntityPDFGenerator<T> getConcretePDFGenerator(T entity, Class<T> classToken)
{
    if(entity instanceof Article){
        return new ArticlePDFGenerator();
    }else{
        return new CompanyPDFGenerator();
    }
}

How to avoid casting in child classes in java

The problem is that answer and answerDto are both declared with the interface type, not the implementation class. You do not need a cast if the interface declares the methods getValue and setValue.

Additionally you can create an interface HasValue<T> which has the methods public T getValue() and public void setValue(T value), and let Answer and AnswerDto implement this interface. That way you need to write the code only once.
