Why Does InputStream#read() Return an Int and Not a Byte

Why does InputStream#read() return an int and not a byte?

Because a Java byte can only hold values from -128 to 127, while read() has to return values from 0 to 255, plus -1 when there is no byte left (i.e. EOF). Even if it returned a byte, all 256 byte values would already be taken by data, so there would be no room to represent EOF.

A more interesting question is why it doesn't return short.

Why does InputStream read() return an int and not a short?

The most important reason to prefer int over short is that short is kind of a second-class citizen: integer literals are int-typed by default, and arithmetic on short operands is carried out as int, so you've got short->int promotion happening all over the place. Plus there is very little or no argument against the usage of int.

What does an int value returned by InputStream.read() represent?

In two's complement, the number -1 is represented by all bits set to a 1.

Here is a byte value of -1:*

1111 1111

Here is an int value of -1:*

1111 1111 1111 1111 1111 1111 1111 1111

If we take the byte value of -1 and store it in an int without extending the sign, it becomes 255:

0000 0000 0000 0000 0000 0000 1111 1111

This is just the behavior of two's complement. Storing bytes in an int leaves us leftover bits we can use to indicate other things.
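The difference between sign extension and zero extension can be seen in a minimal sketch. The class and method names here are made up for illustration; only the widening conversion and the `& 0xFF` mask are standard Java behavior.

```java
class SignExtensionDemo {
    // Plain widening from byte to int sign-extends: the top bit is copied upward.
    static int signExtended(byte b) {
        return b;
    }

    // Masking with 0xFF keeps only the low 8 bits, so the result is in 0..255.
    static int unsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF; // bit pattern 1111 1111, which is -1 as a byte

        System.out.println(signExtended(b));                      // -1
        System.out.println(unsigned(b));                          // 255
        System.out.println(Integer.toBinaryString(unsigned(b)));  // 11111111
    }
}
```

The value that read() hands back for a byte with this bit pattern corresponds to the masked form, 255, which is why it never collides with the -1 used for end of stream.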

So input stream returns byte values from 0-255, and -1 to indicate end of stream. To get the byte values, cast the int to a byte:

int byteAsInt = inputStream.read();
if(byteAsInt > -1) {
     byte byteValue = (byte)byteAsInt;

     // use the byte
}

The int values 128-255 are interpreted as negative numbers (-128 to -1) once they are cast to a byte. Alternatively, you can keep the value in the int and perform "unsigned" arithmetic on it.
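As a rough illustration of that round trip, the sketch below narrows a value in the 128-255 range to a byte and then recovers the unsigned value. The class and method names are hypothetical; `Byte.toUnsignedInt` is a standard library method available since Java 8.

```java
class ByteRoundTrip {
    // Narrowing keeps the low 8 bits; values 128..255 come out negative.
    static byte toByte(int valueFromRead) {
        return (byte) valueFromRead;
    }

    // Recovers the 0..255 value from a signed byte (same as b & 0xFF).
    static int toUnsigned(byte b) {
        return Byte.toUnsignedInt(b);
    }

    public static void main(String[] args) {
        int valueFromRead = 200;          // a value read() could return
        byte b = toByte(valueFromRead);

        System.out.println(b);             // -56
        System.out.println(toUnsigned(b)); // 200
    }
}
```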

The bytes themselves can mean anything. For example if they are from a .txt file, they are probably straight ASCII codes. Other formats can be much more complicated.

* Java Language Specification, section 4.2:

The integral types are byte, short, int, and long, whose values are 8-bit, 16-bit, 32-bit and 64-bit signed two's-complement integers, respectively […].

Why does the FileInputStream read method in Java return an int, not a short?

Java documentation on primitive types suggests that shorts should be used instead of ints to "save memory in large arrays":

short: The short data type is a 16-bit signed two's complement integer. It has a minimum value of -32,768 and a maximum value of 32,767 (inclusive). As with byte, the same guidelines apply: you can use a short to save memory in large arrays, in situations where the memory savings actually matters.

Since in this situation memory savings do not actually matter, using int is a more consistent choice.

Why does the read() in FileInputStream return an integer?

The reason for it returning the value as an int is that it needs to return a value between 0-255, as well as being able to indicate when there are no more bytes to read from the file. By using an int, you can return the full range of unsigned values 0-255, as well as indicate when the file is complete. It wouldn't be able to provide this with only the 256 distinct values of a byte, half of which are negative because Java's byte type is signed.

At which instance does reading an InputStream.read() return a positive integer and when does it return -1?

doSomething(recvBuffer);

That should be

doSomething(recvBuffer, i);

The method needs to know how many bytes were actually received.

I read that read(byte[] b, int off, int len) method returns an int that is supposed to represent the number of bytes read from the stream.

Correct.

Doesn't that mean whenever bytes are available, it sets the value of
i to the respective number of bytes?

Yes.

Once it reads x number of bytes
and reaches the end of the stream, wouldn't it return a positive
integer representing the number of bytes, instead of -1?

No, it transfers the bytes and returns the count, then the next time there are no bytes, only end of stream, so it returns -1.

Then, when would the check against -1 happen for i? I know I'm interpreting this
wrong but I can't say how.

See above.
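The sequence of return values can be sketched with an in-memory stream. This is only an illustration under assumed inputs: the 10-byte payload, the 4-byte buffer, and the class name are made up, and a network stream may split the data into different chunk sizes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class ReadCountsDemo {
    // Records every value returned by read(byte[], int, int), including the final -1.
    static List<Integer> readCounts(InputStream in, int bufferSize) throws IOException {
        byte[] recvBuffer = new byte[bufferSize];
        List<Integer> counts = new ArrayList<>();
        int i;
        while ((i = in.read(recvBuffer, 0, recvBuffer.length)) != -1) {
            counts.add(i); // only the first i bytes of recvBuffer are valid here
        }
        counts.add(i);     // the -1 that ended the loop
        return counts;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[10]);
        System.out.println(readCounts(in, 4)); // [4, 4, 2, -1]
    }
}
```

The last chunk of data is reported with a positive count (2 here); the -1 only arrives on the following call, when nothing is left.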

Also, I know the max amount of bytes that a sender can push onto a stream. In this case, is it sufficient to specify the size of recvBuffer as the max amount of bytes or is it prudent to allocate a bit more than that?

Most people use 4096 or 8192 bytes. In truth, there's not a lot of point in specifying a buffer larger than the path MTU, which is normally no more than 1500 bytes, unless you are slow at reading, so that the kernel socket receive buffer fills up.

Why is the return type of InputStream read() an int?

A "byte" is an 8-bit value in a file. There are 256 possible combinations of those 8-bits; those are all the values from 0 to 255, or from -128 to 127, however you want to view it. read() has to be able to return all 256 of those values, since any one of them could be in a file. read() also has to be able to return some special marker to indicate end-of-file. Therefore, read() has to have the ability to return 257 distinct values, and it cannot do this if it returns a byte, since byte has only 256 possible values.

why return type of read() is integer?

from the manual:

the next byte of data, or -1 if the end of the file is reached.

if read() returned a byte, the -1 marker would be indistinguishable from a valid data byte (0xFF is -1 as a signed byte), so that's the reason.
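A small sketch of what goes wrong if the result is narrowed to a byte before the end-of-stream check. The class name, method names, and the three-byte sample data are assumptions for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class EofVersusFfDemo {
    // Correct: compare the int against -1 before doing anything else with it.
    static int countCorrect(byte[] data) throws IOException {
        InputStream in = new ByteArrayInputStream(data);
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // Buggy: after the cast, a data byte 0xFF also compares equal to -1.
    static int countBuggy(byte[] data) throws IOException {
        InputStream in = new ByteArrayInputStream(data);
        int count = 0;
        byte b;
        while ((b = (byte) in.read()) != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x41, (byte) 0xFF, 0x42};
        System.out.println(countCorrect(data)); // 3
        System.out.println(countBuggy(data));   // 1, stops at the 0xFF byte
    }
}
```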

How much data does InputStream.read read in Java?

The whole point of this interface (or to be precise: abstract class): you can absolutely not rely on assuming how many bytes were read. You always always always have to check the return value of that method to know.

Background: there are many different implementations of this abstract class. Some may buffer, some may not. Some read "fixed" input (maybe from existing data in memory). Somebody might decide to give you a stream that turns to the internet, downloads a 10 GB file and then starts sending you one byte after the other.

The only thing you know is: the method returns

the total number of bytes read into the buffer

End of story.
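In practice this usually means looping until either the buffer is full or -1 comes back. The following is a sketch under stated assumptions: `TricklingStream` is a made-up wrapper that hands out at most one byte per call to mimic a slow source, and `readUpTo` is a hypothetical helper, not a library method.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadFullyDemo {
    // Hypothetical stream that returns at most one byte per bulk read call.
    static class TricklingStream extends FilterInputStream {
        TricklingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, 1));
        }
    }

    // Keeps reading until the buffer is full or the stream ends;
    // returns how many bytes of the buffer are valid.
    static int readUpTo(InputStream in, byte[] buffer) throws IOException {
        int total = 0;
        while (total < buffer.length) {
            int n = in.read(buffer, total, buffer.length - total);
            if (n == -1) {
                break; // end of stream before the buffer filled up
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] buffer = new byte[8];

        InputStream once = new TricklingStream(new ByteArrayInputStream(new byte[5]));
        System.out.println(once.read(buffer, 0, buffer.length)); // 1

        InputStream looped = new TricklingStream(new ByteArrayInputStream(new byte[5]));
        System.out.println(readUpTo(looped, buffer));             // 5
    }
}
```

A single call reports only one byte here even though five are available in total, which is why the return value has to be checked on every call.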
