I've run into what appears to be a 2K (2,048) limit on the number of concurrently open files.
Windows 2000 or 2003; JDK 1.4.2. The error occurs in a native method: java.io.FileInputStream.open(). The error does not occur in JDK 1.5.
I'm assuming it's something along the lines of the JVM using some old NT4 limit as a default. I need to find the setting and change it; no luck so far.
The real situation is more complex than the example code below; I am limited to the OS and JDK version specified.
The code: FileLeak.java. Like the name says, it leaks handles, to demonstrate the limit. The sleep at the end is just so I can use the process viewer in Task Manager to verify the number of handles opened by the Java process.
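The original FileLeak.java isn't shown in the post, so here is a hedged reconstruction of what it describes: the class name comes from the post, but the method name `leak` and the structure are assumptions. It opens the same file repeatedly without closing, then sleeps so the handle count can be inspected in Task Manager.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch of the FileLeak.java described above (original source not shown).
public class FileLeak {

    // Open 'path' repeatedly without closing, until open() fails or 'cap'
    // handles have been leaked; returns how many opens succeeded.
    static int leak(String path, int cap) {
        List streams = new ArrayList(); // keep references so streams stay open
        int opened = 0;
        try {
            while (opened < cap) {
                // Deliberately never closed: each iteration leaks one handle.
                streams.add(new FileInputStream(path));
                opened++;
            }
        } catch (IOException e) {
            System.out.println("open() failed after " + opened + " handles: " + e);
        }
        return opened;
    }

    public static void main(String[] args) throws Exception {
        int opened = leak(args[0], 50000);
        System.out.println("Leaked " + opened + " handles; sleeping...");
        // Pause so the process's handle count can be checked in Task Manager.
        Thread.sleep(120000);
    }
}
```

On the 1.4.2 JVM described in the post, the failure would be expected around the 2,048th open.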
As I understand it, the maximum number of open files is set by the operating system. If the problem is in a native method (i.e., java.io.FileInputStream.open()), then it's happening outside the VM. This number is a hard limit on Unix-like OSes, but on Windows the limit depends on how much paging space exists and how big the files are, so it's a little harder to nail down.
I wrote a C app to test the OS, calling the OpenFile() Win32 API. It works fine and doesn't stop until the 50,000 cap I put in the code. JDK 1.5 works too. So, yes, there is a limit imposed by the OS, but it isn't 2K; that's something the 1.4 JVM is deciding to do.
I did find an answer. I had to register to search Sun's bug database; basically I'm screwed. It's a limit in the Microsoft implementation of the C runtime library.
The real problem started in WebLogic 8 when we went to SP4, on new hardware in a new data center. There is a lot of network lag connecting to the new hardware; we think that may be a factor. I don't think it's the application code alone, because we don't have the same problem in our test environment or in prod. Maybe something in the code was not exposed until the lag was introduced; there is a lot of I/O between some parts of the app and a back-end repository, but mostly just DB access.
It only takes like 6 people hitting the site to cause the error, so it should be easy enough to profile and figure out the cause.
I was just trying to find a way to raise the limit as a fallback plan. If raising it to 5,000 were to eliminate the errors, then it would be easier to just up the limit for now and do the research as a side task. It would also be nice to have a higher threshold for monitoring the leak, to see how bad it gets, because I don't think 2,048 is an insanely high number for an I/O-oriented server process. [ July 15, 2005: Message edited by: Scott Clark ]