Too Many Open Files Error But Lsof Shows a Legal Number of Open Files

Too many open files error but lsof shows a legal number of open files

It turns out the problem was that my program was being run from an upstart init script, and that the exec stanza does not invoke a shell. ulimit and the settings in /etc/security/limits.conf only apply to processes started through a user's login session, so a daemon launched directly by init never picks them up.

I verified this by changing the exec stanza to

exec sudo -u username java $JAVA_OPTS -jar program.jar

which runs java as username; even though sudo does not actually start the user's default shell, it does apply that user's resource limits. That allowed the program to use as many open files as it needs.

I have seen it mentioned that you can also call ulimit -n prior to invoking the command; for an upstart script I think you would use a script stanza instead.
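For reference, a script stanza along those lines might look like this (only a sketch; the 65536 value is a placeholder, not taken from my actual config):

script
    ulimit -n 65536
    exec sudo -u username java $JAVA_OPTS -jar program.jar
end script

The contents of a script stanza are run by /bin/sh, which is why ulimit is available there.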

I found a better diagnostic than lsof to be ls /proc/{pid}/fd | wc -l, which gives a precise count of the open file descriptors. By monitoring that I could see that the failures occurred right at 4096 open fds. I don't know where that 4096 comes from; it's not in /etc anywhere, so I guess it's a default compiled into the kernel.
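If you want to watch the count change while the program runs, something like the following works (same {pid} placeholder as above):

watch -n 1 'ls /proc/{pid}/fd | wc -l'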

Socket accept - Too many open files

There are multiple places where Linux can have limits on the number of file descriptors you are allowed to open.

You can check the following:

cat /proc/sys/fs/file-max

That will give you the system-wide limit on the number of open file descriptors.
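If that system-wide limit ever needs raising, it is the fs.file-max sysctl (the value below is only an example):

sysctl -w fs.file-max=2097152

Put fs.file-max = 2097152 in /etc/sysctl.conf to make the change survive a reboot.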

At the shell level, this will tell you your personal (per-process) limit:

ulimit -n

This can be changed in /etc/security/limits.conf - it's the nofile param.
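For example, entries like these raise the soft and hard nofile limits for one user (the user name and the numbers are placeholders):

username  soft  nofile  8192
username  hard  nofile  16384

The new values take effect the next time that user starts a session (i.e. logs in again).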

However, if you're closing your sockets correctly, you shouldn't receive this unless you're opening a lot of simultaneous connections. It sounds like something is preventing your sockets from being closed appropriately; I would verify that they are being handled properly.

Java IOException Too many open files

On Linux and other UNIX / UNIX-like platforms, the OS places a limit on the number of open file descriptors that a process may have at any given time. In the old days, this limit used to be hardwired¹, and relatively small. These days it is much larger (hundreds / thousands), and subject to a "soft" per-process configurable resource limit. (Look up the ulimit shell builtin ...)

Your Java application must be exceeding the per-process file descriptor limit.

You say that you have 19 files open, and that after a few hundred times you get an IOException saying "too many files open". Now this particular exception can ONLY happen when a new file descriptor is requested; i.e. when you are opening a file (or a pipe or a socket). You can verify this by printing the stacktrace for the IOException.

Unless your application is being run with a small resource limit (which seems unlikely), it follows that it must be repeatedly opening files / sockets / pipes, and failing to close them. Find out why that is happening and you should be able to figure out what to do about it.

FYI, the following pattern is a safe way to write to files that is guaranteed not to leak file descriptors.

Writer w = new FileWriter(...);
try {
    // write stuff to the file
} finally {
    try {
        w.close();
    } catch (IOException ex) {
        // Log error writing file and bail out.
    }
}

¹ Hardwired, as in compiled into the kernel. Changing the number of available fd slots required a recompilation ... and could result in less memory being available for other things. In the days when Unix commonly ran on 16-bit machines, these things really mattered.

UPDATE

The Java 7 way is more concise:

try (Writer w = new FileWriter(...)) {
    // write stuff to the file
} // the `w` resource is automatically closed

UPDATE 2

Apparently you can also encounter a "too many files open" error while attempting to run an external program. The basic cause is as described above. However, the reason that you encounter this in exec(...) is that the JVM is attempting to create "pipe" file descriptors that will be connected to the external application's standard input / output / error.
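As a rough sketch of how to avoid leaking those pipe descriptors (the ProcessBuilder usage and the "ls -l" command below are purely illustrative, not taken from the question): close the streams you don't use, read the child's output with try-with-resources, and wait for the process to finish.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class ExecExample {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Every started child process gets pipe fds for its stdin / stdout / stderr.
        ProcessBuilder pb = new ProcessBuilder("ls", "-l");
        pb.redirectErrorStream(true);   // merge stderr into stdout: one less pipe to drain

        Process p = pb.start();
        p.getOutputStream().close();    // we never write to the child's stdin, so release that pipe

        // Reading via try-with-resources closes the stdout pipe when we're done.
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println(line);
            }
        }
        p.waitFor();                    // reap the child so its remaining resources can be freed
    }
}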

Tomcat too many open files error (Ubuntu 18.04)

The problem was with the Tomcat version. Version 9.0.13 would show the "Too many open files" error even when the total number of open files was only around 5000 out of the 100000 limit. I updated to version 9.0.14 and everything seems fine now.

Kafka Too many open files

In Kafka, every topic is (optionally) split into many partitions. For each partition the broker maintains several files (for the index and for the actual data).

kafka-topics --zookeeper localhost:2181 --describe --topic topic_name

will give you the number of partitions for topic topic_name. The default number of partitions per topic, num.partitions, is defined in /etc/kafka/server.properties.

The total number of open files can be very large if the broker hosts many partitions and a particular partition has many log segment files.
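To see how many descriptors the broker process itself is holding, the /proc trick from the first answer works here too; the pgrep pattern below assumes a single broker on the host, started by the stock scripts (whose main class is kafka.Kafka):

ls /proc/$(pgrep -f kafka.Kafka)/fd | wc -l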

You can see the current file descriptor limit by running

ulimit -n

You can also get a rough, system-wide count of open files using lsof:

lsof | wc -l

To solve the issue you either need to change the limit of open file descriptors:

ulimit -n <noOfFiles>

or somehow reduce the number of open files (for example, by reducing the number of partitions per topic). Keep in mind that ulimit -n only affects the current shell and the processes started from it; if the broker is launched by an init system, the limit has to be raised in the service definition instead (the first answer above shows the same issue with an upstart job).


