Mike Simmons

Master Rancher
since Mar 05, 2008
Cows and Likes
Cows: 53 received (1 in the last 30 days), 1 given
Likes: 747 received (25 in the last 30 days), 132 given (5 in the last 30 days)

Recent posts by Mike Simmons

You haven't contradicted the rule.  You have created a file with 2 classes, but 0 public classes.  Since 0 is less than 1, you have not violated the rule that at most one of the classes is allowed to be public.

The rule is talking about .java source files, not compiler-generated .class files.
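For instance, here's a sketch of a perfectly legal source file (names made up) with two classes and zero public classes:

    // Notes.java -- two top-level classes, zero public classes:
    // the "at most one public class per file" rule is satisfied
    class Note {
        String text;
    }

    class NoteFormatter {
        String format(Note note) {
            return "* " + note.text;
        }
    }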
14 hours ago
Well, your command is looking for a directory named "java.io".  If nothing is output, that suggests there is no directory named "java.io".

Which is not too surprising.  I've never seen a directory named "java.io".  Why do you think one should exist?  Why are you looking for this thing?

Are you perhaps looking for the source code for classes in the java.io package?  Or something else?
16 hours ago
So, for anyone new coming to this thread, please note two things:

1. The original string was "28,5892.22", which has a comma in it.  That's the problem.

2. The problem has already been solved several ways.  We don't really need another solution, especially one that doesn't work.

Thank you.
17 hours ago
No, for several reasons.

As for setting unknClass to null, that's pointless.  It's a local variable, about to go out of scope, so there's no point in setting it to null - once it's out of scope, no one can access the reference anyway, and it can't prevent garbage collection.  The reference effectively does not exist after that.

As for calling the clear() method, that's also pointless, for similar reasons.  The list should be eligible for garbage collection right after this anyway, so why waste time clearing it first?  The memory will be recycled soon anyway.

However...

It may really depend on what happens in that "......" part of the code.  Without something happening there, nothing is done with the list anyway.  You might as well not build the list at all... unless something interesting happens in the "......".  So, what happens there?  Most likely, some code reads each element in the list and does something with it.  In which case, after you're done, there's no use for the list anyway, and you could clear it, but there's no need; it will be garbage collected soon anyway.

But... what if you call some other method, and that other method stores a reference to the list somewhere?  What if it starts other threads that aren't done with the list yet?  In that case, calling clear() is not just pointless, it could actively interfere with whatever the other method was supposed to do.

So, calling clear() here is either harmless but pointless, or possibly, it's harmful.  Either way, why do it?

Best practice would be to eliminate both the = null assignment and the clear() call.  There's nothing else to do here.  If you call some other method, let the other method decide when it's done with the data.  You don't need to worry about it in this method.
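To make it concrete, here's a minimal sketch (the names and the "......" are stand-ins for your code):

    import java.util.ArrayList;
    import java.util.List;

    public class ClearDemo {
        static void process() {
            List<String> unknClass = new ArrayList<>();
            unknClass.add("example");
            // ...... presumably something reads the list here ......

            unknClass.clear();  // pointless: the list becomes unreachable moments later anyway
            unknClass = null;   // pointless: the local variable dies at the end of the method
        }  // after this brace nothing can reach the list; the garbage collector takes care of it
    }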
1 day ago
Perhaps it compiled originally, and if he's not doing a clean build, it's still got the old .class files from the successful compilation the first time?  Depends how the build is being done.
Another option is to use DecimalFormat, a subclass of NumberFormat:
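    // A sketch only - the pattern and input are examples, and this assumes
    // a locale where ',' is the grouping separator and '.' the decimal point.
    import java.text.DecimalFormat;
    import java.text.ParseException;

    public class ParseDemo {
        public static void main(String[] args) throws ParseException {
            DecimalFormat df = new DecimalFormat("#,##0.00");
            Number n = df.parse("28,589.22");     // grouping commas are accepted
            System.out.println(n.doubleValue());  // 28589.22
        }
    }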
1 week ago
Can you show us some code that will compile?  I still see several errors in the parentheses that make it hard to be sure what you're trying to say.

sai rama krishna wrote:


Are you casting to List<Employee>?  That would look like this:
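    // a sketch, using the names from your posts; the unchecked cast is assumed safe here
    List<Employee> employees = (List<Employee>) resultMap.get("rd");
    Employee emp = employees.get(0);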

sai rama krishna wrote:


Sooo... what's happening here?  What class is get("rd") returning?  I thought you were casting it to List<Employee>, but List doesn't have a findFirst() method.  So what class are you working with now?  A List, a Stream, or something else?  Are you trying to cast it to something?

I also agree with Paul Clapham's question.  Do you want to return one employee (or EmployeeData or whatever), or a list/collection/array of many employees?  Can you change the return type of your method to a List<EmployeeData> or something similar?
1 week ago

sai rama krishna wrote:I have database execute method returning a resultMap with a key as rd and value as list of custom object Employee data.


If in fact the value were a list of data, you'd already be done, right?  But apparently not...

sai rama krishna wrote:emp=resultMap.get("rd")).findFirst().get()


It looks like you've got an extra ")" in there, or a missing "(".  But, is it possible the thing returned by resultMap.get("rd") is a Stream of some sort, rather than a List?  In that case, you have a few options.  Given:
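    // a sketch: assuming the value under "rd" really is a Stream (names from your posts)
    Stream<Employee> stream = (Stream<Employee>) resultMap.get("rd");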

You can do:
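    Employee emp = stream.findFirst().get();  // throws NoSuchElementException if empty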

or
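    Employee emp = stream.findFirst().orElse(null);  // null if the stream is empty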

or, if using JDK 16 or later:
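    List<Employee> employees = stream.toList();  // Stream.toList() was added in JDK 16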
1 week ago

Carey Brown wrote:Here's a possible reason for keeping a List<T> for each unique weight (assuming duplicate weights would appear often otherwise).
Here the while loop will repeat until the set has been completely filled. For a very large amount of data waiting for random numbers to hit all the slots may take a while. Whereas with fewer slots and each slot manages a List<T> it's not a big deal.


So, on the one hand, this is the thing that was bothering me about the first version of the code.  The new version periodically recalculates the map in order to get rid of all the entries for elements that have already been chosen.  It's a tradeoff, but I think it's worth it.

On the other hand... this comment makes me wonder if I've misunderstood your requirements.  If you have a list of elements with the same priority, once you choose that list, do you need to keep everything in the list together in the shuffle?  So if you have

A -> 10
B -> 1
C -> 1

Are you saying that you can have results like

ABC
ACB
BCA
CBA

But not

BAC
CAB

?

That would change the results substantially, I think.  But if that's not what you mean, I'm not sure how it would work, putting all elements with the same weight in a list.
1 week ago
New and improved (I think):

"quantileMap" is the renamed inverseCumulativeDistribution.  The Quantile function is another term for inverse cumulative distribution function.  

With a RECALCULATION_RATIO of 2, this will recalculate the quantileMap and related fields whenever more than half (by weight) of the map is also in the usedSet. Each time we recalculate, we discard all entries marked "used". That way we do more work along the way, but don't have a bunch of retries at the end.  I haven't measured to see how it compares to the original version; I'm just going by gut instinct here. It's entirely possible the first version performs better overall.
1 week ago
Ah, I see.  Your explanation was probably fine; I just read too quickly.  Yes, if there are many duplicate weights, that might be a more efficient way to do it.  Though I don't think it's much of a problem to have duplicates, really, unless we're running up against memory requirements.
1 week ago

Carey Brown wrote:It seems that your cumulative weights, to work properly, would need the initial Map of weights in increasing weight order. Correct? Otherwise it would seem like your "cumulativeWeight" would have large jumps and gaps.



No, while the individual weights are not assumed to be in any order, the cumulative weight is always increasing.  Assuming all weights are positive, at least.  Each time we put a new cumulativeWeight into the map, we've added another weight to it, so it keeps going up.

Imagine the following map of weights:

A -> 6
B -> 3
C -> 1

In this order, when we make the inverse cumulative distribution, we get:

0 -> A
6 -> B
9 -> C

And total weight is 10.

Which means that when we generate a random number in the range 0 <= x < 10 it will be interpreted as:


0 <= x < 6  --> A (60% chance)
6 <= x < 9  --> B (30% chance)
9 <= x < 10 --> C (10% chance)

So the probabilities are as we require.
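In code, that lookup might be sketched like this (just the single-draw part, not the full shuffle):

    import java.util.Map;
    import java.util.TreeMap;
    import java.util.concurrent.ThreadLocalRandom;

    public class WeightedPick {
        public static void main(String[] args) {
            Map<String, Double> weights = Map.of("A", 6.0, "B", 3.0, "C", 1.0);

            // Build the inverse cumulative distribution: cumulative weight -> element.
            // (Entry order doesn't matter for the probabilities, as discussed above.)
            TreeMap<Double, String> quantileMap = new TreeMap<>();
            double totalWeight = 0;
            for (Map.Entry<String, Double> e : weights.entrySet()) {
                quantileMap.put(totalWeight, e.getKey());
                totalWeight += e.getValue();
            }

            // Generate 0 <= x < totalWeight; the greatest key <= x selects the element.
            double x = ThreadLocalRandom.current().nextDouble(totalWeight);
            System.out.println(quantileMap.floorEntry(x).getValue());  // A 60%, B 30%, C 10%
        }
    }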



What if the initial weight map had been in the reverse order?

C -> 1
B -> 3
A -> 6

Then our inverse cumulative distribution would be:

0 -> C
1 -> B
4 -> A

And the total weight is again 10.

Which means that when we generate a random number in the range 0 <= x < 10 it will be interpreted as:

0 <= x <  1 --> C (10% chance)
1 <= x <  4 --> B (30% chance)
4 <= x < 10 --> A (60% chance)

With the result that the probabilities of each outcome are the same as with the first sort order.  The cumulative distribution is different, but the overall probabilities for each outcome would be the same.

The same applies to any other sort order you can imagine for the original map.

Now I think there is some benefit to sorting the map in increasing order - the calculation of total weight will be more accurate, because with floating point it's better to add small numbers before large numbers.  But that should be a minor effect; I'm not sure it's worth the trouble to do the sort.

Carey Brown wrote:I see "cumulativeWeight" as "a" way of distributing keys on a floating point scale. I could see using some function f(w) that might compute a weight key based on some formula, e.g. squared(w).



Yep!  Technically the cumulative function is the integral of the weight function.  Lots of fun calculus problems behind this, if you're interested.

Carey Brown wrote:The scenarios I've imagined, e.g. playlist, should have unique keys but the value, the weight, is probably not unique for a number of different scenarios. In which case it seems like you wouldn't want to accumulate duplicate weights and for the current cumulativeWeight key the value would need to be a List<T>.



No, the cumulative weight, always increasing, will be unique as long as the individual weights are positive.

Carey Brown wrote:Thanks for your effort it gave me a new way to view the problem.



You're welcome!

I still want to improve it to minimize the number of retries it needs while completing the shuffle...
1 week ago

I dislike the repeated retries inside the shuffle() method - as we get close to the end, set.add(value) usually has no effect since the value is already in there.  But it works, and is reasonably simple...
1 week ago
Good example.  If your enum is nested inside a Book, why call it BookType?  Why not just Type?  Then when you call it from elsewhere, it's a Book.Type.  Easy.

In other cases, if for some reason you can't shorten the name like that, you can also use static imports.  E.g.
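    import static com.example.Book.BookType;  // "com.example" is a placeholder package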

lets you replace Book.BookType.BOTH with BookType.BOTH.  And
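    import static com.example.Book.BookType.BOTH;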

lets you replace it with BOTH.  Which may be too short, in some contexts.  But you can decide for yourself how much you want to shorten things, and how much context to provide.
2 weeks ago
Well, those numbers look very close to me, and the time seems too short to mean much... Java performance times vary substantially in different circumstances, and you generally need to repeat the operation many times to have a meaningful result.  And if you're doing IO with a method like System.out.println(), that will be far slower than the calculation.  So I don't think those numbers are very meaningful.  Try doing millions of calculations, with no IO, to get a better idea of the performance.
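A rough sketch of that kind of measurement, where compute() stands in for whatever you're actually testing:

    public class TimingDemo {
        static double compute(int i) { return Math.sqrt(i); }  // stand-in for the real work

        public static void main(String[] args) {
            long start = System.nanoTime();
            double sum = 0;
            for (int i = 0; i < 10_000_000; i++) {
                sum += compute(i);  // no IO inside the loop
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(elapsedMs + " ms (sum=" + sum + ")");  // print once, and use sum
        }
    }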

Moreover, you seem to think it's very important to be as accurate as possible, and also to be as fast as possible.  The thing is, those two things can be in conflict, to some extent.  And you haven't really given any clear idea of which one is more important.  As an example, I pointed out the difference between casting to int, and using the round() method to reduce roundoff error.  Well, calling the round method may be a little slower than simply casting to int - is it worth it?  Personally I think it is, but that's really your decision; we don't know what you're planning to use this method for.  You need to decide how much error is acceptable to you.
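To illustrate the difference:

    double d = 2.9999999;
    int  a = (int) d;        // 2: casting truncates toward zero
    long b = Math.round(d);  // 3: rounds to the nearest whole number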

Generally, it would be best to first make sure your code is as accurate as possible, ignoring the performance.  Then make sure you have a good set of unit tests to verify that your code is achieving the necessary accuracy.  Then write a performance test that gives you a good measure of how fast it is.  Only after you have those, try changing the code to make it faster.  Otherwise you won't know when your changes are introducing unacceptable problems with accuracy.
2 weeks ago