
Mike Fourier

Greenhorn
since Apr 02, 2008

Recent posts by Mike Fourier

So I've inherited a codebase. And pretty much everywhere that an object returns an int, where a String is desired, I find code like this:



Is there some sort of advantage to that code, that I don't understand? I mean: why not just this:
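The two snippets from this post did not survive, so the following is only a guess at the usual candidates, not the inherited code itself. A minimal sketch of the common ways to turn an int into a String in Java; the class and method names are made up for illustration:

```java
// Hypothetical helper class; none of these names come from the inherited codebase.
final class IntToStringSketch {

    // Boxes the int into an Integer first, then calls toString() on the wrapper.
    static String viaWrapper(int value) {
        return Integer.valueOf(value).toString();
    }

    // String concatenation with an empty string.
    static String viaConcatenation(int value) {
        return "" + value;
    }

    // Direct conversions with no intermediate wrapper object.
    static String viaStringValueOf(int value) {
        return String.valueOf(value);
    }

    static String viaIntegerToString(int value) {
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        int id = 42;
        System.out.println(viaWrapper(id));
        System.out.println(viaConcatenation(id));
        System.out.println(viaStringValueOf(id));
        System.out.println(viaIntegerToString(id));
    }
}
```

All four produce the same text for the same int; the last two just state the intent without the detour through a wrapper object or a concatenation.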

13 years ago
I forgot to add these:

and


Tenured gen is only 11%, so not sure why a full GC is being called for. Perhaps though, the 11% represents what the tenured gen was able to be drained to, at the time the JVM fell over. (that is: it *was* higher, like 90% used, and the VM was able to get it down to 11%, and then experienced a fatal error).

Yes/No?
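One way to check the "it was higher before the crash" theory is to log pool usage while the JVM is still alive, instead of relying on the snapshot in the crash file. A minimal sketch, assuming a Java 5 or later JVM (the post does not say which version is in production); it uses the standard java.lang.management API, and the class name is made up:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reporting helper; call it periodically from a timer or a log statement.
final class HeapPoolReport {

    // One line per heap pool: name, bytes used, and percent of max where a max is defined.
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long used = usage.getUsed();
            long max = usage.getMax(); // -1 when the maximum is undefined
            String percent = max > 0 ? (100 * used / max) + "%" : "n/a";
            lines.add(pool.getName() + ": used=" + used + " bytes, max=" + max + ", " + percent);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeHeapPools()) {
            System.out.println(line);
        }
    }
}
```

Pool names differ between collectors, so the sketch prints whatever the JVM reports rather than looking for a pool with a particular name.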
13 years ago
In the last couple weeks, our production system's JVM has twice gone AWOL overnight. The java process is simply not there anymore, nothing to be found in any application log(4j) files.

In the bin directory, we found a couple of those hs_err_pid.log files, and they contain all sorts of information about the state of the JVM when it decided to die.

(more from me, after this dump)


and looking up that thread, I find it is:
(the middle one)

The above snips are found in one of the dump files, and are representative of those found in the other. The current VM operation is "full generation collection" and the current thread is "GC Daemon", in both. So, I assume the JVM is dying, when it attempts a full GC.

Question 1: Is that a reasonable deduction to make (that it dies attempting a full GC)?

What changed in the past few weeks?
No JVM updates.
No hardware updates.
No OS config changes.
A small code update (we patched a few bugs)

My thoughts so far:
1) the small code updates we did, while not *in themselves* buggy, now exercise a previously existing condition in our JVM/environment. For example, if the new code is slightly more efficient with memory, then GCs don't happen as often as they used to, and perhaps that causes problems "down the road".

2) hardware? (is one of our RAM sticks dying?)

3) a genuine bug in the JVM (ha!)

Question 2: What else should I be looking at/thinking of?
[ August 20, 2008: Message edited by: Mike Fourier ]
13 years ago
More Info:

Inside struts-config-bin.xml, I've also tried these action paths:


I've also noticed that when I view source on all my pages, stuff that looks like this in JSP:


is turned into URLs to the wrong module:



The thing is, I would have expected this to break, but it doesn't. Submit that form, and all is well: it goes to the default module (as if the call had been for /appName/headerMenu).

Will Struts do that? If it doesn't recognize a module name, will it try to send the request through the default module anyway?

If so: why does Struts recognize the 'bin' module enough to output it to HTML, but not enough to route through it on a request?
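For what it's worth, here is a simplified model of prefix-based module selection as I understand Struts 1.x to do it: strip trailing path segments from the context-relative request path until one matches a configured module prefix, and fall back to the default module (empty prefix) if none does. This is an illustration under that assumption, not Struts source code, and the class and method names are hypothetical:

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of module selection; not the framework's implementation.
final class ModulePrefixSketch {

    // Returns the matching module prefix, or "" (the default module) when nothing matches.
    static String selectModulePrefix(String contextRelativePath, List<String> modulePrefixes) {
        String matchPath = contextRelativePath;
        int lastSlash;
        // Drop one trailing segment at a time: "/bin/chart.do" -> "/bin".
        while ((lastSlash = matchPath.lastIndexOf('/')) > 0) {
            matchPath = matchPath.substring(0, lastSlash);
            if (modulePrefixes.contains(matchPath)) {
                return matchPath;
            }
        }
        return "";
    }

    public static void main(String[] args) {
        List<String> prefixes = Arrays.asList("/bin");
        System.out.println("[" + selectModulePrefix("/bin/chart.do", prefixes) + "]");
        System.out.println("[" + selectModulePrefix("/headerMenu.do", prefixes) + "]");
    }
}
```

Under this model, a path whose leading segment is not a registered prefix quietly lands in the default module, which would match the behaviour described above. Note that the path being matched is relative to the web application, so it does not include /appName.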

The other thing I've noticed is that when I *want* it to output 'bin', it doesn't. For example, a dynamic image that in JSP looks like:



Doesn't even get the appName portion of the URL, never mind the module name:

13 years ago
I'm having trouble getting my request to be serviced by the correct module.

The short version of my web.xml is:



and inside struts-config-bin.xml:


And in my html page, I'm outputting the following:



Now, in the default module, we've created a RequestProcessor subclass, and one of the things it does is reject GET requests. So that's why I'm using a different module with no special RequestProcessor specified. I want this GET request for the image to work.

But in the log files, I'm seeing all my debug statements from the specialized request processor. So... my request is being put to the wrong module (the default one).

What is wrong with my URLs?
13 years ago
Just thought I'd bump this.

Is anyone experienced in finding connection leaks in JBoss-hosted (EJB) applications?

Googling, I see things that surround those classes I mentioned:
CachedConnectionManager
and
ManagedConnectionPool

But I'm not sure why I'm getting 500+ listings when I click the 'InUseConnections'. Clearly this doesn't mean what I think it means.
13 years ago
JBoss 3.2.6 on Linux 2.6.8 using JVM 1.4.2_06-b03

Ok, I'm not sure of the proper terminology for these things, so I'll just hope this is enough info.

JMX Console:

Domain: jboss.jca
service: LocalTxCM
name: fooDS

On that page, there are two links to other MBeans, CachedConnectionManager and ManagedConnectionPool.

Clicking through to the ManagedConnectionPool, I see that slowly but surely, as time goes by, the AvailableConnectionCount approaches zero. If I ever do let it reach zero, the Help Desk emails start up immediately with "I can't login!", etc, etc. All I need to do, at the bottom of the page, is click the 'Invoke' button for the flush() method. Then it shoots back up to 100. So I obviously have some sort of leak.

I used to have to do this at least 2-3 times per week. We've just deployed an upgrade, and all day yesterday the pool sat at 100. Ooops, this morning I see it's at 93. So there must still be some leaks.

Now I turn to CachedConnectionManager. There is an 'invoke' button for the method listInUseConnections(). Hopefully this will give me some clues as to where the "in use" (but actually leaked) connections are.

Holy doodle! The 'InUseConnections' attribute says 536!! And the listing I get is huge! (536 stack traces).

Our connection pool is supposed to be 100. But 'in use connections' are far in excess of that. What am I not understanding?
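For reference while hunting through those stack traces, the pattern that avoids this kind of leak is to close every borrowed connection in a finally block. A minimal sketch that works on a 1.4 JVM (no try-with-resources); the class and interface names are hypothetical, and the DataSource is whatever the application looks up for fooDS:

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper; the point is only the try/finally shape.
final class PooledWork {

    // Callback that uses a borrowed connection.
    interface ConnectionCallback {
        void doWith(Connection connection) throws SQLException;
    }

    static void run(DataSource dataSource, ConnectionCallback callback) throws SQLException {
        Connection connection = dataSource.getConnection();
        try {
            callback.doWith(connection);
        } finally {
            // With a pooled DataSource, close() hands the connection back to the pool.
            // Skipping this on an exception path is a classic source of leaks.
            connection.close();
        }
    }
}
```

Any code path in the listed traces that obtains a connection and can exit without reaching a close() like this one is a candidate for the leak.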
13 years ago
I have a production app that runs just fine under Tomcat 4.1.31. I'm making changes to it, and on my dev server, I am using Tomcat 5.0.28

Between TC 4.1 and TC 5.0.28, the commons-dbcp and commons-pool upgrades look like:


Why provide this info? Well, the code in question is a long-running report that uses temp tables in a Sybase database. That would be tables defined with a "#" as the first character of their name.

Now, *as far as I can tell*[1] the code in question obtains a single connection, and passes it back and forth and "up and down" method calls.

First, the method issues a statement such as:

"Drop table #groupingTempTable"

Which resulted in:
com.sybase.jdbc2.jdbc.SybSQLException: Cannot drop the table '#groupingTempTable', because it doesn't exist in the system catalogs.


So, I then commented out the part of the code that "cares" that this statement fails (don't cry to me that a temp table doesn't exist... just go ahead and create it in your next step).

So then, it presumably still failed, but was quiet about it, and then made this statement:

"Create table #groupingTempTable (grouping numeric(10))"

But then it said:

com.sybase.jdbc2.jdbc.SybSQLException: #groupingTempTable not found. Specify owner.objectname or use sp_help to check whether the object exists (sp_help may produce lots of output).

After looking at it sideways and tracing through method calls and trying to find where it's using two different connections (and failing to find that incorrect coding, but I'm still looking)... I came to the theory that something about the connection itself was somehow.... not stable. That somehow, my connection was 'losing' track of its temp tables. That is: that somehow, I have a "pointer to a pool-managed connection object" that is very stable in the production version, but is unreliable in my dev version.

I then began to think about "what changed in DBCP?" and that's why I listed the versions above.

Except.. that seems like madness, and a sure recipe for calamitous bug reports against DBCP. How could using temp tables ever work, if the pool kept swapping connections on you?

Anyways... I thought I'd throw that out there, in case anyone else has experienced something similar, or has other theories.



[1] I've looked and looked, but maybe that just means I haven't looked long enough/hard enough yet. But... I'm reasonably certain the code does *not* use two connections. Otherwise, for the years this app has been in production, every single time the improperly coded method (the one that obtains a connection, creates a temp table, returns that connection, then gets a new connection, and expects the temp table to be there) is being used, wouldn't it need to be lucky every time, that on subsequent calls to "getConnection()" that the pool returned "the right" connection every time? The error I'm seeing *every* time in dev, would have to occur at least *some* of the time in production, wouldn't you think? So while possible, it's unlikely to be improper application coding (and plus, I still can't find the improper dual connections yet).

I've tried looking here: http://commons.apache.org/dbcp/changes-report.html, but alas, the change report doesn't go back far enough.

I thought I'd ask here, because Tomcat users (in my experience) tend to encounter lots of DBCP issues, and so might be aware of DBCP changes.
13 years ago
I'm going to have to take more care in how I post questions, I guess.


With this code:
The program threw an OOME before records ever reached 10k (10 thousand records)

With this code:
The program ran to more than 50 thousand records, without any OOME.

Note, the only thing that changed, was how often I logged. This made no sense to me, so I switched back and forth between 10,000 and 1,000 three times. It was consistent. The application did not make it to 10k records, logging every 10k records, but made it past 50k records, logging every 1k.

So I ask again: What might explain that? Or: did I not wait long enough? Was it just a fluke?

Look, I realize I'm going to have to figure out what *exactly* is going on myself. But I wanted to know if anyone has seen something this "strange" before, or has tips on where to start looking.
[ May 27, 2008: Message edited by: Mike Fourier ]
13 years ago
I've got a program that reads in 400k database rows, and by the end of it, there are 400k new rows in a second table. So it reads a row, does all sorts of lookups/calculations, and inserts one row in a second table.

There was no previous logging code in the method, other than "method begins" and "method ends". Somewhere between those two, I was getting "OutOfMemoryError" (OOME).

After adding a method variable named 'records', within the "while (rs.next())" processing loop, I added this code:


After about 2 minutes, somewhere between records 0 and 10,000, I got the OutOfMemoryError (OOME). So then I thought to increase the granularity of the logging, and switched the mod to "% 1000". I waited about 4 minutes, and it got to record 50,000... with no OOMEs.

Perhaps I didn't wait long enough. Perhaps if I had waited even more time, the OOME that was *going* to happen, would have happened.

But nevertheless, in 3 trials, switching code back and forth between 1k and 10k, it *always* threw an OOME fairly quickly with 10k, and never threw an OOME at 1k (within the limits of my patience in the test).

So that's a bit strange, right?
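The logging snippet from this post was lost, so the following is an assumption about its general shape, not the original code. It also prints the heap in use on each progress line (which the original most likely did not do), so the log shows whether memory climbs steadily or jumps. All names here are hypothetical:

```java
// Hypothetical stand-in for the progress logging inside the while (rs.next()) loop.
final class ProgressLogSketch {

    // Heap currently in use, in megabytes, as seen by this JVM.
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
    }

    // True when a progress line is due for this record count.
    static boolean shouldLog(int records, int interval) {
        return records > 0 && records % interval == 0;
    }

    public static void main(String[] args) {
        int records = 0;
        int interval = 1000; // the value being switched between 1000 and 10000
        for (int row = 0; row < 5000; row++) { // stands in for while (rs.next())
            records++;
            if (shouldLog(records, interval)) {
                System.out.println("processed " + records
                        + " records, heap in use ~" + usedHeapMb() + " MB");
            }
        }
    }
}
```

Running the real job once with each interval and comparing the heap figures at the same record counts would show whether the two runs actually diverge in memory use before the OOME.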
13 years ago
Sorry, this will be a bit ranty.

I'm stuck between replying "most unhelpful reply ever" and "ask a stupid question..." (but mostly that second thing). So all of this, is with a certain amount of chagrin. Perhaps there's just no "smart" way to ask this question...

The title of my post was totally misleading. I wasn't actually ever asking for what the batch size was supposed to be. For that, I already know the answer ("it depends"). Anyone that asks "what should my batch size be?" does deserve "depends". But even then, the *generic* advice is (seemingly) "between 1/4 and 1/2 the size of the expected result size." So clearly (to some people) there seems to be some general advice one could give. And, of course, all the usual caveats apply to general advice (that being: it is *general* advice, and one's mileage will vary).

Another example of general advice that could be given, even though everyone will have different exact experiences: Given an expected resultset in the millions, then someone saying "In general, a batch size of 100 will be more performant than a batch size of 10" would not be incorrect. They would, in general, be giving good advice.
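The arithmetic behind that kind of general statement can be sketched as follows, under the simplifying assumption that each fetch costs one network round trip (real drivers may buffer or prefetch differently). The class name and the row count in main are made up for illustration:

```java
// Hypothetical back-of-the-envelope helper; assumes one round trip per fetch.
final class RoundTripEstimate {

    // Number of fetches needed to pull totalRows in chunks of fetchSize (ceiling division).
    static long fetchesNeeded(long totalRows, int fetchSize) {
        if (totalRows < 0 || fetchSize <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and fetchSize > 0");
        }
        return (totalRows + fetchSize - 1) / fetchSize;
    }

    public static void main(String[] args) {
        long rows = 1000000L;
        System.out.println(fetchesNeeded(rows, 10));  // 100000
        System.out.println(fetchesNeeded(rows, 100)); // 10000
    }
}
```

A tenfold larger batch means a tenfold drop in round trips, which is why the general advice holds even though the best size still depends on row width, memory, and network latency.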

So:

Is there not some similar statement that one can make that answers the question: "do server side cursors generally underperform client-side resultsets?" (which is what I was, in my befuddled post, asking).

As for trying it out myself: Yes, I'll be doing that. And if I switch to a cursor and batchsize, I will be testing what is the 'best' size for my hardware / schema / network, etc, etc. But what I was looking for was an experienced person to say something like:

a) "know what? forget it. you're already using the best way"
or
b) "I've had mixed results, you'll really have to try both"
or
c) "batches are, in my experience, always faster for large resultsets"

Something to let me know if the time involved is even worthwhile. (How much time could it be??) Well... the current operation takes longer than my day at work. So testing several scenarios (batch sizes) would take several days. It also does a number on our test server, which other people share.
13 years ago
I'm looking at code that has about 9 million rows to process, and I see no attempts at optimization (in the code). I say that because I don't see any attempts at setting a fetch size or using cursors. Now: perhaps that is because the defaults are already the most performant.

There seem to be two approaches:
1) retrieve the entire resultset to the client
2) use server-side cursors, and a fetch size

Is this statement true:

1) if the client has unlimited (ie: "enough") memory, fetching "all at once" to the client outperforms a solution where cursors are used and multiple fetches are performed.

I would tend to think so, because you're hitting the db and network just once. Granted, the Resultset will be massive on the client, but assuming you have the memory and CPU to handle it....

I suppose when you really get *right* down to measuring seconds in an hours-long process, perhaps there's a performance benefit (or only a perceived one?) to doing fetches. That being: you can start processing the resultset much sooner (after the first fetch) rather than waiting for all of it to arrive. But... it's not like the driver is doing a background fetch, right? So now it's a discussion between "wait a long while, then never again" vs "wait less time, but many times over".

My database is Sybase: from googling, I think this tends to matter.

For example, here's an Oracle post that makes it clear that anyone not setting their batchsize is asking for sucky performance:
http://blog.lishman.com/2008/03/jdbc-fetch-size.html

But... jTDS seems to indicate that for Sybase, it (by default) fetches everything anyways (note #4)...
http://jtds.sourceforge.net/resultSets.html

So.. I'm thinking that the Oracle speed-boost is really only about " *IF* you are using cursors, then set your batch size correctly" But if I'm not using that, then I'm using a faster/fastest resultset possible already...

Do I have that sort of correct?
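For the second approach, the standard JDBC knob is Statement.setFetchSize. A minimal sketch of processing rows with a fetch-size hint; whether the Sybase driver in use honours the hint (or needs cursors switched on through a connection property) is something I can't promise, and the class, interface, and method names are hypothetical:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper; only setFetchSize and the other java.sql calls are real API.
final class FetchSizeSketch {

    // Called once per row while the cursor is positioned on it.
    interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    // Runs the query with a fetch-size hint and returns the number of rows processed.
    static long streamRows(Connection conn, String sql, int fetchSize, RowHandler handler)
            throws SQLException {
        Statement stmt = conn.createStatement();
        try {
            stmt.setFetchSize(fetchSize); // a hint only; the driver is free to ignore it
            ResultSet rs = stmt.executeQuery(sql);
            try {
                long rows = 0;
                while (rs.next()) {
                    handler.handle(rs);
                    rows++;
                }
                return rows;
            } finally {
                rs.close();
            }
        } finally {
            stmt.close();
        }
    }
}
```

Timing this with a few fetch sizes against a smaller slice of the data would give a cheaper first answer than a full multi-hour run per setting.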

edit: changed title from:
JDBC - Resultsets - What is the right 'size' for best performance?
[ May 26, 2008: Message edited by: Mike Fourier ]
13 years ago
thanks Ernest,

If I could ask for a clarification:

Any time I've ever tried to do multiple merges like this, it's been a mess



Did you mean "Any time I've ever tried to do this (multiple merges), it's been a mess"

or did you mean to suggest there was some other way (other than "... like this...") to do multiple merges from branch to trunk, that works 'better'?


I think what compounded my initial confusion was that I was reading the 'bible' here at my new job, and it describes tagging *trunk* with an "after_last_merge" tag, and then using it, and the branch name (effectively, branch "HEAD") for the two 'j' parameters. And so I thought for sure *that* was correct, and somehow I was misreading the cederqvist et al.

Wow... I will definitely be testing this tomorrow.

I wonder how they (my new job) haven't been screwing up all this time. Maybe it's something they've rarely (or never?) done in reality. We do the "junk in the trunk" style, so each time we cut a release, it's tagged and branched off right then, in case patches are required, or UAT turns up something.

Maybe they've always just done the "easier" thing of manually committing fixes on both trunk and branch, and haven't really done merging. Or... perhaps they've done merging when "it didn't matter you did that wrong, because there were no changes on trunk...". Odd


thanks for your help.
I've been using CVS for 6 years now, but in all that time, I've never had to (or just never did) do any branching and merging. And I'm a bit confused about the 'merging'.

I've been reading O'Reilly's "Essential CVS" by Vesperman and also "the cederqvist" and I think I've turned myself around somewhere along the way.

In the O'Reilly book, Chapter 10 (the command reference) says this about the -j switch for "update":

if two -j options are used, determine the changes between the first -j revision and the second -j revision and merge those changes to the sandbox.



Then, in the cederqvist, section 5.7 "Merging from a branch several times" it says:


... you need to specify that you only want to merge the changes on the branch which have not yet been merged into the trunk. To do that you specify two '-j' options, and CVS merges the changes from the first revision to the second revision.




So... from these two samples, what I *think* it means is this:

"pick the two tags on the *branch* you wish to find the differences between. Then take that set of differences (from tag 1, to tag 2) and apply that diff to your working copy (which you would ensure is 'trunk'). "

ie: They are *not* saying "it merges *from* branch_tag_1 *to* trunk_tag_2". They are saying "it merges the difference between branch_tag_1 and branch_tag_2, *to* trunk".

Yes or no?


(I will be trying this out as well, but thought I'd ask for clarification from anyone that's been in the same spot as me).