It receives a list of bitset objects, ORs them together and returns the resulting bitset. This method is very slow, mainly because of the loop that runs through each bitset object. I clone the first bitset to not to affect original. Do you see anything wrong with this approach and is there a faster alternative to achieve the same result?
I don't see any point to the clone() method there. None of the data in bitSets is ever modified directly in your code. Only residualBitSet is being changed. There's nothing special about the first element. You might as well simplify and skip the unnecessary clone():
This code also fixes another potential issue in your code - using list.get(i) in a loop. This will work fine for an ArrayList, but can be very poor for a large LinkedList. Do you know which it is? In general you're best off avoiding this issue entirely by using an Iterator any time you need to iterate a list. That's what the new for loop does for you automatically.
Aside from these issues, I don't see anything wrong with the code. How long does it take to run? How many BitSets are in the list, and how many bits do they have, generally?
One other optimization you could do is to use multiple threads. This will probably only help if you have multiple processors, but it's something. For example, if you have five processors, and 10000 BitSets to or together, you could create a thread pool with five threads, and give each one 2000 BitSets to or. Then you or together the five resulting BitSets into one final result. It doesn't really matter what order the bits are ored together; the result should be the same
Joined: Jul 10, 2008
Thanks Mike for pointing out few issues with code. You are right, cloning is not required. Actually I have a similar method for AND operation, where I guess cloning is required at the beginning. But in this code not, you are right.
Multithreading will make the whole thing more complicated, especially because this code is part of server side code of a web app. I just noticed that for loop takes like half a second roughly for ORing 5000 bitsets of 100 bits each. I was hoping for a single java-provided method to play with whole collection, instead of running a loop.
The first thing I'd do is check for an existing standard bitset class. If not, I'd be strongly tempted to write a Native class myself.
I don't do native code lightly. It usually isn't necessary. But this is a case where you can use simple machine instructions to test and set bits at the machine-word level instead of extracting each bit and operating on it separately. Plus, modern processors may further optiimize tight word-logic loop in the pipeline.
But that kind of operation is just common enough that I'd first look around for an existing implementation before breaking out the assembler or C compiler.
Customer surveys are for companies who didn't pay proper attention to begin with.
Joined: Mar 05, 2008
Tim Holloway wrote:The first thing I'd do is check for an existing standard bitset class.
Varun Chopra wrote:I was hoping for a single java-provided method to play with whole collection, instead of running a loop.
Well, BitSet is already taking care of one loop for you, because you don't need to OR each bit individually. It loops over all of them. Internally, it optimizes this by storing the bits as an array of longs, looping over the longs, and doing a bitwise OR of each long. That's roughly equivalent to Tim's "use simple machine instructions to test and set bits at the machine-word level instead of extracting each bit and operating on it separately". If you're on a 32-bit machine, then doing it with ints might be faster, but not by a lot, I think.
Are these BitSets completely unique each time you get a request? Or is there a way to cache results from previous computations?
This is the best way to go. The JVM implementors probably optimized for the hardware. Sometimes that may work best per-bit, sometimes per-word, sometimes it may depend. The IBM System/370 had memory-to-memory bit operations, but on some models, fetching (32-bit) words and doing register-to-memory operations might have been faster, providing the data was properly aligned in RAM (so that the entire data bus was employed). However, even there you might have seen some cases where one approach outperformed another. Modern-day CPUs can optimize at the pipeline level, so predicting performance gets less obvious.
The java.util.BitSetmodus operandi is a 2-address model, where the source gets modified instead of producing a distinct output. However, since it's a core class, I wouldn't be surprised to find some of the JIT compilers optimizing a clone+and (or or, xor) into a 3-operand machine instruction operation on hardware where that's supported.
There are just such an incredibly large number of evil optimizations that can be done. That's why running metrics is so important!
Joined: Jul 10, 2008
Thanks for reply my friends......I guess for now I will have to be content with what java provides. Thanks for your time.