Should I use byte or int?
Usually int: a 32-bit integer will perform slightly better because it is already properly aligned for native CPU instructions. You should only use a smaller numeric type when you actually need to store something of that size.
Why should I use int instead of a byte or short in C#
Performance-wise, an int is faster in almost all cases. The CPU is designed to work efficiently with 32-bit values.
Shorter values are complicated to deal with. To read a single byte, say, the CPU has to read the 32-bit block that contains it, and then mask out the upper 24 bits.
To write a byte, it has to read the destination 32-bit block, overwrite the lower 8 bits with the desired byte value, and write the entire 32-bit block back again.
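The shift-and-mask mechanics described above can be made concrete in code. This is a minimal illustration (written in Java for runnability, not what any particular compiler actually emits) of reading one byte out of a 32-bit word and writing one back:

```java
public class ByteInWord {
    public static void main(String[] args) {
        int word = 0xCAFEBABE;  // a 32-bit memory word holding four packed bytes
        int index = 0;          // which byte inside the word (0 = lowest)

        // "Read a byte": fetch the whole word, shift, mask off the other 24 bits.
        int b = (word >>> (8 * index)) & 0xFF;
        System.out.printf("read: 0x%02X%n", b);    // 0xBE

        // "Write a byte": read the word, clear the target 8 bits, merge, store back.
        int newByte = 0x42;
        word = (word & ~(0xFF << (8 * index))) | (newByte << (8 * index));
        System.out.printf("word: 0x%08X%n", word); // 0xCAFEBA42
    }
}
```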
Space-wise, of course, you save a few bytes by using smaller datatypes. So if you're building a table with a few million rows, then shorter datatypes may be worth considering. (And the same may be a good reason to use smaller datatypes in your database.)
And correctness-wise, an int doesn't overflow easily. What if you think your value is going to fit within a byte, and then at some point in the future some harmless-looking change to the code means larger values get stored into it?
Those are some of the reasons why int should be your default datatype for all integral data. Only use byte if you actually want to store machine bytes. Only use shorts if you're dealing with a file format or protocol or similar that actually specifies 16-bit integer values. If you're just dealing with integers in general, make them ints.
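To make the overflow hazard concrete, here is a small sketch (in Java for runnability; C# byte arithmetic is likewise unchecked by default): a compound assignment on a byte compiles without complaint but silently wraps.

```java
public class OverflowDemo {
    public static void main(String[] args) {
        byte count = 100;
        count += 50;               // compiles (compound assignment implies a cast) but wraps
        System.out.println(count); // -106, not 150

        int safe = 100;
        safe += 50;                // an int has plenty of headroom
        System.out.println(safe);  // 150
    }
}
```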
In java, is it more efficient to use byte or short instead of int and float instead of double?
Am I wrong in assuming it should be faster and more efficient? I'd hate to go through and change everything in a massive program to find out I wasted my time.
Short answer
Yes, you are wrong. In most cases, it makes little difference in terms of space used.
It is not worth trying to optimize this ... unless you have clear evidence that optimization is needed. And if you do need to optimize memory usage of object fields in particular, you will probably need to take other (more effective) measures.
Longer answer
The Java Virtual Machine models stacks and object fields using offsets that are (in effect) multiples of a 32-bit primitive cell size. So when you declare a local variable or object field as (say) a byte, the variable / field will be stored in a 32-bit cell, just like an int.
There are two exceptions to this:
- long and double values require 2 primitive 32-bit cells
- arrays of primitive types are represented in packed form, so that (for example) an array of bytes holds 4 bytes per 32-bit word
So it might be worth optimizing use of long and double ... and large arrays of primitives. But in general no.
In theory, a JIT might be able to optimize this, but in practice I've never heard of a JIT that does. One impediment is that the JIT typically cannot run until after instances of the class being compiled have been created. If the JIT optimized the memory layout, you could have two (or more) "flavors" of object of the same class ... and that would present huge difficulties.
Revisitation
Looking at the benchmark results in @meriton's answer, it appears that using short and byte instead of int incurs a performance penalty for multiplication. Indeed, if you consider the operations in isolation, the penalty is significant. (You shouldn't consider them in isolation ... but that's another topic.)
I think the explanation is that the JIT is probably doing the multiplications using 32-bit multiply instructions in each case. But in the byte and short case, it executes extra instructions to convert the intermediate 32-bit value to a byte or short in each loop iteration. (In theory, that conversion could be done once at the end of the loop ... but I doubt that the optimizer would be able to figure that out.)
Anyway, this does point to another problem with switching to short and byte as an optimization. It could make performance worse ... in an algorithm that is arithmetic- and compute-intensive.
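The conversion overhead is visible even at the source level. A minimal Java sketch: the 32-bit multiply result must be explicitly narrowed back on every assignment to a byte, mirroring the extra conversion instructions described above.

```java
public class NarrowingCost {
    public static void main(String[] args) {
        byte b = 3;
        // byte c = b * b;       // does not compile: the multiply yields an int
        byte c = (byte) (b * b); // explicit narrowing back to byte, every time
        System.out.println(c);   // 9

        int i = 3;
        int j = i * i;           // no conversion needed: arithmetic is already 32-bit
        System.out.println(j);   // 9
    }
}
```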
Secondary questions
I know java doesn't have unsigned types but is there anything extra I could do if I knew the number would be positive only?
No. Not in terms of performance, anyway. (There are some methods in Integer, Long, etc. for dealing with int, long, etc. as unsigned. But these don't give any performance advantage. That is not their purpose.)
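For reference, these are the kind of Java 8+ unsigned helper methods on Integer being alluded to. They reinterpret the bit pattern; they don't speed anything up:

```java
public class UnsignedHelpers {
    public static void main(String[] args) {
        int x = -1; // bit pattern 0xFFFFFFFF, i.e. 4294967295 when read as unsigned

        System.out.println(Integer.toUnsignedString(x));       // 4294967295
        System.out.println(Integer.toUnsignedLong(x));         // 4294967295
        System.out.println(Integer.divideUnsigned(x, 2));      // 2147483647
        System.out.println(Integer.compareUnsigned(x, 1) > 0); // true: -1 is "largest" unsigned
    }
}
```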
(I'd assume the garbage collector only deals with Objects and not primitives, but still deletes all the primitives in abandoned objects, right?)
Correct. A field of an object is part of the object. It goes away when the object is garbage collected. Likewise the cells of an array go away when the array is collected. When the field or cell type is a primitive type, then the value is stored in the field / cell ... which is part of the object / array ... and that has been deleted.
If I use byte instead int, will my loop iterate faster?
No, not at all; if anything, it will be slower, because the underlying hardware generally has instructions for working with the native "int" type (32-bit two's complement integer) but not for working with 8-bit signed bytes.
Any reason to use byte/short etc.. in C#?
A single byte compared to a long won't make a huge difference memory-wise, but when you start having large arrays, those 7 extra bytes per element will make a big difference. What's more, smaller data types help communicate the developer's intent much better: when you encounter a byte length; you know for sure that length's range is that of a byte.
byte/short Vs int as for loop counter variable
It is more likely to be confusing than helpful. Most developers expect to see an int value, and you only have 32-bit or 64-bit registers in your CPU, so it won't change how your program works or performs.
There are many options which work and are not harmful to your program but you need to think about the poor developer who has to read it and understand it later, this could be you 6 months from now. ;)
It is also not worth making such a change even if it were faster, unless it were dramatically faster. Consider this change:
for (byte i = 1; i <= 200; i++)
or
for (byte i = 1; i <= x; i++)
You might think this is fine since 200 < 2^8 and it compiles just fine, but it's actually an infinite loop: a (signed) byte can never exceed 127, so the condition never becomes false.
You have to ask the question; How much faster does it have to be, if you increase the risk of introducing a bug later?
Usually the answer is: it has to make the whole program significantly faster in a way I have measured (not just the bit I changed), AND I have to actually need it to be significantly faster.
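A minimal Java demonstration of why such a byte counter never terminates: incrementing past Byte.MAX_VALUE wraps silently to a negative value, so a loop condition like i <= 200 stays true forever.

```java
public class ByteWrap {
    public static void main(String[] args) {
        byte i = Byte.MAX_VALUE;      // 127, the largest value a byte can hold
        i++;                          // overflows silently ...
        System.out.println(i);        // ... and wraps to -128
        System.out.println(i <= 200); // true: the loop condition can never fail
    }
}
```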
Why does the Java API use int instead of short or byte?
Some of the reasons have already been pointed out. For example, the fact that "...(Almost) All operations on byte, short will promote these primitives to int". However, the obvious next question would be: WHY are these types promoted to int?
So to go one level deeper: The answer may simply be related to the Java Virtual Machine Instruction Set. As summarized in the Table in the Java Virtual Machine Specification, all integral arithmetic operations, like adding, dividing and others, are only available for the type int and the type long, and not for the smaller types.
(An aside: The smaller types (byte and short) are basically only intended for arrays. An array like new byte[1000] will take 1000 bytes, whereas an array like new int[1000] will take 4000 bytes.)
Now, of course, one could say that "...the obvious next question would be: WHY are these instructions only offered for int (and long)?".
One reason is mentioned in the JVM Spec mentioned above:
If each typed instruction supported all of the Java Virtual Machine's run-time data types, there would be more instructions than could be represented in a byte
Additionally, the Java Virtual Machine can be considered an abstraction of a real processor. And introducing dedicated arithmetic logic units for the smaller types would not be worth the effort: they would need additional transistors, but could still only execute one addition per clock cycle. The dominant architecture when the JVM was designed was 32 bits, just right for a 32-bit int. (The operations that involve a 64-bit long value are implemented as a special case.)
(Note: The last paragraph is a bit oversimplified, considering possible vectorization etc., but should give the basic idea without diving too deep into processor design topics)
EDIT: A short addendum, focusing on the example from the question, but in a more general sense: One could also ask whether it would be beneficial to store fields using the smaller types. For example, one might think that memory could be saved by storing Calendar.DAY_OF_WEEK as a byte. But here, the Java Class File Format comes into play: All the fields in a class file occupy at least one "slot", which has the size of one int (32 bits). (The "wide" fields, double and long, occupy two slots.) So explicitly declaring a field as short or byte would not save any memory either.
Why would I use byte, double, long, etc. when I could just use int?
Sometimes, when you're building massive applications that could take up 2+ GB of memory, you really want to be restrictive about what primitive type you want to use. Remember:
- int takes up 32 bits of memory
- short takes up 16 bits of memory, 1/2 that of int
- byte is even smaller, 8 bits
See this java tutorial about primitive types: http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
The space taken up by each type really matters if you're handling large data sets. For example, if your program has an array of 1 million ints, then you're taking up 3.81 MB of RAM. Now let's say you know for certain that those 1,000,000 numbers are only going to be in the range of 1-10. Why not, then, use a byte array? 1 million bytes only take up 976 kilobytes, less than 1 MB.
You always want to use the number type that is just "large" enough to fit, just as you wouldn't put an extra-large T-shirt on a newborn baby.
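The arithmetic behind those figures, as a runnable Java sketch (assuming 4 bytes per int element and 1 byte per byte element, and ignoring the array object's own header overhead):

```java
public class ArraySizes {
    public static void main(String[] args) {
        int n = 1_000_000;
        long intArrayBytes  = (long) n * Integer.BYTES; // 4,000,000 bytes of element data
        long byteArrayBytes = (long) n * Byte.BYTES;    // 1,000,000 bytes of element data

        System.out.printf("int[%d]:  %.2f MB%n", n, intArrayBytes / (1024.0 * 1024.0)); // 3.81 MB
        System.out.printf("byte[%d]: %.2f KB%n", n, byteArrayBytes / 1024.0);           // 976.56 KB
    }
}
```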