Is C++ Considered Weakly Typed? Why

Is C strongly typed?

"Strongly typed" and "weakly typed" are terms that have no widely agreed-upon technical meaning. Terms that do have a well-defined meaning are:

  • Dynamically typed means that types are attached to values at run time, and an attempt to mix values of different types may cause a "run-time type error". For example, if in Scheme you attempt to add one to true by writing (+ 1 #t) this will cause an error. You encounter the error only if you attempt to execute the offending code.

  • Statically typed means that types are checked at compile time, and a program that does not have a static type is rejected by the compiler. For example, if in ML you attempt to add one to true by writing 1 + true, the program will be rejected with a (probably cryptic) error message. You always get the error even if the code might never be executed.
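To make "types are attached to values at run time" concrete, here is a rough sketch in C of how a dynamically typed implementation might tag its values and check the tags when adding. The names `struct value`, `enum tag`, and `dynamic_add` are hypothetical, invented for illustration, and no real Scheme implementation is claimed to work exactly this way.

```c
#include <stdbool.h>

/* Hypothetical tagged value: the tag plays the role of the run-time type. */
enum tag { TAG_INT, TAG_BOOL };

struct value {
    enum tag tag;
    int      payload;
};

/* Adds two values only if both carry the integer tag. A mismatch is the
   analogue of a run-time type error and is reported by returning false. */
bool dynamic_add(struct value a, struct value b, struct value *result) {
    if (a.tag != TAG_INT || b.tag != TAG_INT) {
        return false;   /* the "run-time type error" */
    }
    result->tag = TAG_INT;
    result->payload = a.payload + b.payload;
    return true;
}
```

The check only fires when `dynamic_add` is actually called with a mismatched pair, which mirrors the point that the error shows up only if the offending code is executed.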

Different people prefer different systems according in part to how much they value flexibility and how much they worry about run-time errors.

Sometimes "strongly typed" is used loosely to mean "statically typed", and "weakly typed" is used incorrectly to mean "dynamically typed". A better use for the term "strongly typed" is that "you cannot work around or subvert the type system", whereas "weakly typed" means "there are loopholes in the type system". Perversely, most languages with static type systems have loopholes, while many languages with dynamic type systems have no loopholes.

None of these terms are connected in any way with the number of implicit conversions available in a language.

If you want to talk precisely about programming languages, it is best to avoid the terms "strongly typed" and "weakly typed". I would say that C is a language that is statically typed but that has a lot of loopholes. One loophole is that you can freely cast any pointer type to any other pointer type. You can also create a loophole between any two types of your choice by declaring a C union that has two members, one for each of the types in question.
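A minimal sketch of the two loopholes just mentioned, assuming `float` is a 32-bit IEEE 754 type. The names `float_bits`, `bits_via_union`, and `first_byte_via_cast` are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

/* Hypothetical union with one member per type: it lets the bytes of a
   float be read back as an integer. Assumes float is 32 bits wide. */
union float_bits {
    float    f;
    uint32_t u;
};

uint32_t bits_via_union(float x) {
    union float_bits fb;
    fb.f = x;      /* store as a float ...                 */
    return fb.u;   /* ... read the same bytes as uint32_t  */
}

/* An explicit cast turns an int pointer into an unsigned char pointer,
   and the compiler accepts it without complaint. */
unsigned char first_byte_via_cast(const int *p) {
    return *(const unsigned char *)p;
}
```

Under the stated assumption, `bits_via_union(1.0f)` yields `0x3F800000`, the IEEE 754 bit pattern of 1.0. Neither function triggers a type error at compile time or at run time.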

I have written more about static and dynamic typing at why-interpreted-langs-are-mostly-ducktyped-while-compiled-have-strong-typing.

Is C++ considered weakly typed? Why?

That paper first claims:

In contrast, a language is weakly-typed if type-confusion can occur silently (undetected), and eventually cause errors that are difficult to localize.

And then claims:

Also, C and C++ are considered weakly typed since, due to type-casting, one can interpret a field of a structure that was an integer as a pointer.

This seems like a contradiction to me. In C and C++, the type-confusion that can occur as a result of casts will not occur silently -- there's a cast! This does not demonstrate that either of those languages is weakly-typed, at least not by the definition in that paper.

That said, by the definition in the paper, C and C++ may still be considered weakly-typed. There are, as noted in the comments on the question already, cases where the language supports implicit type conversions. Many types can be implicitly converted to bool, a literal zero of type int can be silently converted to any pointer type, there are conversions between integers of varying sizes, etc, so this seems like a good reason to consider C and C++ weakly-typed for the purposes of the paper.
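The three kinds of implicit conversion listed above can be sketched in C as follows. The function names are hypothetical, and the narrowing example assumes an 8-bit `unsigned char`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Any scalar converts to bool implicitly: nonzero becomes true. */
bool int_to_bool(int n) {
    bool b = n;
    return b;
}

/* A literal zero converts silently to a null pointer of any pointer type. */
char *zero_to_pointer(void) {
    char *p = 0;
    return p;
}

/* A wider integer converts silently to a narrower one. For an unsigned
   target the value is reduced modulo 2^N, so with an 8-bit unsigned char
   300 becomes 44. */
unsigned char int_to_narrow(int n) {
    unsigned char c = n;
    return c;
}
```

None of these functions contains a cast, which is the sense in which the conversions are silent.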

For C (but not C++), there are also more dangerous implicit conversions that are worth mentioning:

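int main(void) {
    int i = 0;
    void *v = &i;   /* any object pointer converts to void * implicitly */
    char *c = v;    /* C, unlike C++, converts void * back without a cast */
    return *c;      /* reads the first byte of i through a char pointer */
}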

For the purposes of the paper, that must definitely be considered weakly-typed. The reinterpretation of bits happens silently, and can be made far worse by modifying it to use completely unrelated types, which has silent undefined behaviour that typically has the same effect as reinterpreting bits, but blows up in mysterious yet sometimes amusing ways when optimisations are enabled.

In general, though, I think there isn't a fixed definition of "strongly-typed" and "weakly-typed". There are various grades: a language that is strongly-typed compared to assembly may be weakly-typed compared to Pascal. To determine whether C or C++ is weakly-typed, you first have to ask what you want weakly-typed to mean.

Seeking clarification on apparent contradictions regarding weakly typed languages

UPDATE: This question was the subject of my blog on the 15th of October, 2012. Thanks for the great question!


What does it really mean for a language to be "weakly typed"?

It means "this language uses a type system that I find distasteful". A "strongly typed" language by contrast is a language with a type system that I find pleasant.

The terms are essentially meaningless and you should avoid them. Wikipedia lists eleven different meanings for "strongly typed", several of which are contradictory. This indicates that the odds of confusion being created are high in any conversation involving the term "strongly typed" or "weakly typed".

All that you can really say with any certainty is that a "strongly typed" language under discussion has some additional restriction in the type system, either at runtime or compile time, that a "weakly typed" language under discussion lacks. What that restriction might be cannot be determined without further context.

Instead of using "strongly typed" and "weakly typed", you should describe in detail what kind of type safety you mean. For example, C# is a statically typed language and a type safe language and a memory safe language, for the most part. C# allows all three of those forms of "strong" typing to be violated. The cast operator violates static typing; it says to the compiler "I know more about the runtime type of this expression than you do". If the developer is wrong, then the runtime will throw an exception in order to protect type safety. If the developer wishes to break type safety or memory safety, they can do so by turning off the type safety system by making an "unsafe" block. In an unsafe block you can use pointer magic to treat an int as a float (violating type safety) or to write to memory you do not own (violating memory safety).

C# imposes type restrictions that are checked at both compile time and runtime, thereby making it a "strongly typed" language compared to languages that do less compile-time checking or less runtime checking. C# also allows you, in special circumstances, to do an end-run around those restrictions, making it a "weakly typed" language compared with languages which do not allow you to do such an end-run.

Which is it really? It is impossible to say; it depends on the point of view of the speaker and their attitude towards the various language features.

Is there a statically weak typed language?

"Strongly typed" and "weakly typed" are not well defined, especially in the context of rating just one language. They form a commonly used axis on which to compare languages, and in that context strong and weak typing gain more meaning, but it is important to understand that there is no rigorous definition as there is for static and dynamic. What makes a type system weak or strong comes down to the ways in which the programmer is able to create type errors.

Unchecked explicit casting

A lot of people would consider C weakly typed because a programmer is allowed to cast types. I can add a pointer to a character if I just tell C that they are both integers.

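#include <stdint.h>

int main(void) {
    char c = 'a';
    void *p = &c;   /* give p a defined value before reading it */
    /* The casts tell C to treat both operands as integers. uintptr_t is
       wide enough to hold a pointer; plain int may not be. */
    uintptr_t sum = (uintptr_t)c + (uintptr_t)p;
    return sum == 0;
}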

In Haskell, however, I can explicitly cast from one type to another, but only certain types will work.

ord('c') + 10
fromIntegral (2::Int) + 4.13

Java has static type casting as well which allows the programmer to, for example, downcast objects. This makes the static type system not sound. But Java has dynamic type checking for just this reason. Yes, Java has dynamic and static type checking. For this reason, however, I think many people would consider Java to be strongly typed.

Automatic casting

Perl and JavaScript will take strings and consider them to be numbers if they look enough like numbers, and automatically make it work.

'2 is my favorite number' + 413 == 415 # in Perl

If you want to convert a string to a number in, say, Scheme, you have to convert explicitly using a function that checks the string and returns #f if it is not a number, so that a later attempt to do arithmetic on the result raises an error.

(= (+ (string->number "2") 413) 415) ; In Scheme

For this reason a lot of people would consider Scheme strongly typed.
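C has no `string->number`, but the same idea of an explicit, checked conversion can be sketched with the standard `strtol` function. The helper name `parse_long` is hypothetical and not taken from the answer above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: converts text to a long only if the whole string is
   a base-10 number. Returns false and leaves *out untouched otherwise. */
bool parse_long(const char *text, long *out) {
    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE) {
        return false;   /* no digits, trailing junk, or out of range */
    }
    *out = value;
    return true;
}
```

With this helper, "2" converts and can be added to 413, while "2 is my favorite number" is rejected instead of being quietly treated as 2.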

No types at all

In some languages there aren't any types. The untyped Lambda Calculus is one such example. This is clearly not strongly typed. Everything is a function. I can have numbers using Church Numerals or pairs or strings or whatever using various encodings but values only mean what I agree they mean and there is certainly overlap.

Comparison

Like I said, the terms are not well defined but they are a little more useful when used in a relative manner. For example I could make a good claim that OCaml is more strongly typed than Java, because Java allows for explicit static down casting whereas OCaml does not.

Conclusion

The terms aren't rigorous but they are useful. To answer your original question, in my opinion, C and C++ are statically and weakly typed, so they fit the description.

What are the benefits (and drawbacks) of a weakly typed language?

The cited advantage of static typing is that there are whole classes of errors caught at compile time, that cannot reach runtime. For example, if you have a statically-typed class or interface as a function parameter, then you are darn well not going to accidentally pass in an object of the wrong type (without an explicit and incorrect cast, that is).

Of course, this doesn't stop you passing in the wrong object of the right type, or an implementation of an interface where you've given it the right functions but they do the wrong things. Furthermore, if you have 100% code coverage, say the PHP/Python/etc folks, who cares whether you catch the error at compile time or at run time?

Personally, I've had fun times in languages with static typing, and fun times in languages without. It's rarely the deciding issue, since I've never had to choose between two languages which are identical other than their kind of typing, and there are normally more important things to worry about. I do find that when I'm using statically typed languages I deliberately "lean on the compiler", trying to write code in such a way that if it's wrong, it won't compile. For instance, there are certain refactors which you can perform by making a change in one place and then fixing all the compilation errors which result, repeating until the code compiles cleanly. Doing the same thing by running a full test suite several times might not be very practical. But it's not unheard-of for IDEs to automate the same refactors in other languages, or for tests to complete quickly, so it's a question of what's convenient, not what's possible.

The folks who have a legitimate concern beyond convenience and coding style preference are the ones working on formal proofs of the correctness of code. My ignorant impression is that static type deduction can do most (but not all) of the work that explicit static typing does, and saves considerable wear and tear on the keyboard. So if static typing forces people to write code in a way that makes it easier to prove, then there could well be something to it from that POV. I say "if": I don't know, and it's not as if most people prove their statically-typed code anyway.

changing variable types on the fly and such

I think that's of dubious value. It's always so tempting to do something like (Python/Django):

user = request.GET['username']
# do something with the string variable, "user"
user = get_object_or_404(User, username=user)
# do something with the User object variable, "user"

But really, should the same name be used for different things within a function? Maybe. Probably not. "Re-using", for example, integer variables for other things in statically typed languages isn't massively encouraged either. The desire not to have to think of concise, descriptive variable names probably shouldn't, 95% of the time, override the desire for unambiguous code...

Btw, usually weak typing means that implicit type conversions occur, and strong typing means they don't. By this definition, C is weakly typed as far as the arithmetic types are concerned, so I assume that's not what you mean. I think it's widely considered that full strong typing is more of a nuisance than a help, and "full weak typing" (anything can be converted to anything else) is nonsensical in most languages. So the question there is about how many and what implicit conversions can be tolerated before your code becomes too difficult to figure out. See also, in C++, the ongoing difficulty in deciding whether to implement conversion operators and non-explicit one-arg constructors.
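As a small sketch of what "weakly typed as far as the arithmetic types are concerned" looks like in C, the functions below rely only on the language's usual arithmetic conversions. The function names are hypothetical.

```c
#include <stdbool.h>

/* The usual arithmetic conversions turn -1 into a large unsigned value
   before the comparison, so this returns false. */
bool minus_one_less_than_one_unsigned(void) {
    return -1 < 1u;
}

/* The int operand is converted to double before the division. */
double half(int numerator) {
    return numerator / 2.0;
}

/* Both operands are int, so the division truncates first and only the
   truncated result is converted to double. */
double truncated_half(int numerator) {
    return numerator / 2;
}
```

None of these needs a cast, and a compiler will typically accept all three, at most with a warning.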

Static/Dynamic vs Strong/Weak

  • Static/Dynamic Typing is about when type information is acquired (either at compile time or at runtime).

  • Strong/Weak Typing is about how strictly types are distinguished (e.g. whether the language tries to do an implicit conversion from strings to numbers).

See the wiki-page for more detailed information.

Difference between strongly and weakly typed languages?

Check Eric Lippert's blog out. There's an entry about just what you're looking for here.

From the looks of his blog, those terms are subjective, so "speak more precisely about type system features."


