Which will be called from another part of the program. I have the threading issues and many other matters well thought out, and this is a long-awaited moment. I need to be able to read and code the toArray method of ArrayList in order to populate the supplied new ArrayList with work completed in the class where the getWords() method is coded.
I will end up with a java.util.HashSet; the ArrayList is an interim collection obtained by using Set<K> TreeMap.keySet(). I can see there are several intervening steps and objects that can be dispensed with.
I am guessing the call can reference an Object of the Collection type I need to end up with and just do WordCount.getWords(HashSet hs);.
The HashSet hs will then be passed to another part of the code where the real work gets done.
In practice, it means that if your parameter is declared as a String, the return value will be a String, if your parameter is declared as an Integer, the return value will be an Integer, etc. [ January 26, 2008: Message edited by: Rob Prime ]
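To make the point about the return type following the parameter type concrete, here is a minimal sketch. The class name ToArrayDemo and the helpers echo and wordsAsArray are made up for illustration and are not from the original post; the only library call relied on is the toArray(T[]) overload that every Collection has.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ToArrayDemo {
    // Hypothetical helper: T is inferred from the argument,
    // so the compiler gives the return value the same type.
    static <T> T echo(T value) {
        return value;
    }

    // toArray(T[]) works the same way: pass a String[], get a String[] back.
    static String[] wordsAsArray(List<String> words) {
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String s = echo("hello");              // no cast needed
        Integer i = echo(Integer.valueOf(7));  // no cast needed
        System.out.println(s + " " + i);

        List<String> words = new ArrayList<String>();
        words.add("alpha");
        words.add("beta");
        System.out.println(Arrays.toString(wordsAsArray(words)));
    }
}
```

The no-argument toArray() returns Object[] instead, which is why the typed overload is usually the one you want.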
Okay, I tried a recode (all new code for this node in the logic of my pet project). This is all new, and I have decided to implement strict copy semantics here, though at other points in the program I may be relying on that having been done here.
Here is the code I have so far, implemented as a default scope class in the source code file for my controller class.
Rather than go back and rework major portions, I show what I have so far to avoid getting too far off on a fundamental design point. It is the syntax for generics that I am requesting help with. Secondary comments are welcome, if wished, on the following: this is THE place where we build data structures that will be used on massive data sets, where shaving 2 ms in two seconds off a decision of if(Collection.find()) can result in substantial gains in the utility of the work to the customer. Nuances of exactly what we look for will be done as wrappers that call into this code; I am prototyping those as regexes right now.
Glasspacks and Duntov's here, no 3208's - just whap/bap/next, raw, screaming velocity.
This is actually a lot simpler than you think. First of all, ditch the HashSet return type - make it Set only. Next, all maps have a method called keySet(), which returns a Set with all the keys of the map. Be careful though, if you remove from this set you remove from the map too. Also, most Collection implementations have a constructor that takes any other Collection - as long as the types are compatible. If it's not present, there's still the addAll method that does the same. So:
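The code that originally followed is not preserved here, so what follows is only a sketch of the idea under stated assumptions: the map is taken to be a TreeMap<String, Integer> of word counts, and the names KeySetCopyDemo and copyKeys are invented for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeMap;

class KeySetCopyDemo {
    // Declared to return Set, not HashSet. The copy constructor gives an
    // independent set, so later changes to it do not touch the map.
    static Set<String> copyKeys(TreeMap<String, Integer> wordCounts) {
        return new HashSet<String>(wordCounts.keySet());
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> wordCounts = new TreeMap<String, Integer>();
        wordCounts.put("alpha", 3);
        wordCounts.put("beta", 1);

        Set<String> copy = copyKeys(wordCounts);
        copy.remove("alpha");  // only the copy changes
        System.out.println(wordCounts.containsKey("alpha"));  // true

        // keySet() itself is a live view: removing from it removes the entry.
        wordCounts.keySet().remove("beta");
        System.out.println(wordCounts.containsKey("beta"));   // false
    }
}
```

If the target set already exists, target.addAll(wordCounts.keySet()) copies the keys in the same way.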
Note: if you want the order too, use LinkedHashSet. It is (nearly) as fast as HashSet but keeps the insertion order intact.
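A small sketch of the ordering difference, with made-up names (InsertionOrderDemo, inOrder) and sample words chosen only for illustration:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class InsertionOrderDemo {
    // Adds the words to the given set and returns them in the set's iteration order.
    static List<String> inOrder(Set<String> set, String... words) {
        for (String w : words) {
            set.add(w);  // a duplicate is simply not added again
        }
        return new ArrayList<String>(set);
    }

    public static void main(String[] args) {
        // LinkedHashSet iterates in first-insertion order; duplicates are dropped.
        System.out.println(
            inOrder(new LinkedHashSet<String>(), "pear", "apple", "fig", "apple"));
        // prints [pear, apple, fig]
    }
}
```

A plain HashSet given the same input holds the same three words but makes no promise about the order in which they come back.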
This can be handled similarly to the previous code, using the fact (documented in the API) that an Integer's hashCode is the integer value itself:
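The original code for this step is missing as well. As a hedged illustration only, assuming a map keyed by Integer (the names IntegerHashDemo and copyIntKeys are invented), the hashCode fact and the same copy pattern look like this:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class IntegerHashDemo {
    // Same copy-constructor pattern as for String keys, here with Integer keys.
    static Set<Integer> copyIntKeys(Map<Integer, String> source) {
        return new HashSet<Integer>(source.keySet());
    }

    public static void main(String[] args) {
        Integer n = Integer.valueOf(12345);
        // Integer.hashCode() is specified to return the primitive int value.
        System.out.println(n.hashCode() == n.intValue());  // true

        Map<Integer, String> source = new TreeMap<Integer, String>();
        source.put(10, "ten");
        source.put(20, "twenty");
        Set<Integer> ints = copyIntKeys(source);
        System.out.println(ints.contains(20));  // true
    }
}
```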
Just like keySet(), values() returns a Collection with all values. It's a Collection and not a Set because values can occur multiple times.
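A quick sketch of why values() cannot be a Set, using an assumed word-count map. ValuesDemo, countsOf and the sample data are illustrative only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ValuesDemo {
    // Copies the values into a List, keeping repeats.
    static List<Integer> countsOf(Map<String, Integer> wordCounts) {
        return new ArrayList<Integer>(wordCounts.values());
    }

    public static void main(String[] args) {
        Map<String, Integer> wordCounts = new TreeMap<String, Integer>();
        wordCounts.put("a", 2);
        wordCounts.put("b", 2);  // same count as "a"
        wordCounts.put("c", 5);

        List<Integer> counts = countsOf(wordCounts);
        System.out.println(counts);  // [2, 2, 5], in key order for a TreeMap
        // Funnelling the values through a Set collapses the repeated 2.
        System.out.println(new HashSet<Integer>(counts).size());  // 2
    }
}
```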
This is great, and is exactly the sort of advice I had hoped for.
See, I am trying to 'plan ahead'. I believe I should iterate the keys of the TreeMap (that part is working already, so best left alone for now) and populate a fast-find collection with new'd Integers from iterating the entire key set returned by the TreeMap.keySet() call. This guards against the "if you remove from this set you remove from the map too" problem, as there is somewhat of a logical split in my mind here, and the TreeMap may (emphasis on may) be modified structurally later, when I get to coding and potentially prune the TreeMap for efficiency or for some other unforeseen reason. When I get the keys (possibly I should do this in the constructor or in a method call), it is already stipulated as a design decision from contract discussions that the 80/20 rule should be applied: if we lose as much as 20% of possible finds in an unrestrained battle for raw power, the customer will be grinning like a cat at the gate of a tuna factory. In other words, Hashtable and related cousins. To this end, I decided on some sort of Set, but it is a lot of reading to find which one disallows dups and does not take a null.
If it takes a null, I will init() it (the collection upon which find() will be done) to have one null, so that if some new somewhere returns a null, maybe we get fail-fast behaviour. The design drawback that caused me to look elsewhere than Hashtable for my FastFind was to avoid buckets (which the hashtable design paradigm generally implements to avoid losing data). What I want is this: if I try to .add(Integer val) and there is already one in FastFind, just the hell with it and move on to the next one. No throw style: just abandon the individual (val) and increment the counter to the "next()".
Thus, I came to the conclusion (on which I invite comments, and which I will likely get around to testing) that a sorted data structure of smaller data types (lower memory footprint) could be binary searched, and was thus my true optimum design. I left the (your recode) Set< String > getStringHashSet() to return a set of strings, as in the original code I posted ( HashSet< String > getStringHashSet() ), as a practical matter: over ten thousand iterations, something like string-based finds may well have been optimized to run just as well as a binary search on a sorted data structure of Integers. Ideally, I first looked at HashMap and HashSet, but I want to eliminate dupes, buckets and nulls.
It's a lot of reading to dig through all of the collections to see exactly how they do things like ints.addAll(sourceData.keys()); and to see whether it is copy semantics, or whether sorting is done, or whether there are checks for nulls. Trying to do that while becoming comfortable with the Object<Type> syntax is not the most effective approach right at the moment. The voice of experience could be used really well here as well; I will move any speculative coding to some other region of the codebase.
The datatypes returned by the getter methods of the FindFast class should abandon all safeguards in singular dedication to raw clock reductions when dispatched to a challenge in a Thread.start() call; there may be several other areas of the program waiting to have one of these finish. All the customer wants is raw power: even a 50/50 loss would be a profound and dramatic improvement over what they now have.
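For what it is worth, the "no throw, just move on" behaviour described here is what Set.add already does: it returns false when the value is already present and leaves the set unchanged. A minimal sketch, with invented names (DuplicateSkipDemo, addAllCountingSkips) and sample values:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class DuplicateSkipDemo {
    // Adds every value to the target set and counts how many were already there.
    static int addAllCountingSkips(Set<Integer> target, Iterable<Integer> values) {
        int skipped = 0;
        for (Integer v : values) {
            if (!target.add(v)) {  // false means duplicate; nothing is thrown
                skipped++;
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(7, 3, 7, 9, 3);
        Set<Integer> fastFind = new HashSet<Integer>();
        System.out.println(addAllCountingSkips(fastFind, input));  // 2
        System.out.println(fastFind.size());                       // 3

        // On nulls: HashSet accepts a single null element, while a TreeSet
        // with natural ordering rejects null on current JDKs.
        try {
            new TreeSet<Integer>().add(null);
        } catch (NullPointerException e) {
            System.out.println("TreeSet rejected null");
        }
    }
}
```

As for buckets, HashSet still uses them internally, but they only hold distinct values that collide on hash; they never store duplicates of the same value.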
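The sorted-structure idea can be tried with nothing more than a primitive int[] and Arrays.binarySearch. This is a sketch for experimenting, not a recommendation; SortedLookupDemo, toSortedArray and contains are invented names.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.TreeSet;

class SortedLookupDemo {
    // Unboxes the keys into a primitive array and sorts it.
    // A null key would fail here with a NullPointerException.
    static int[] toSortedArray(Collection<Integer> keys) {
        int[] out = new int[keys.size()];
        int i = 0;
        for (Integer k : keys) {
            out[i++] = k.intValue();
        }
        Arrays.sort(out);
        return out;
    }

    // Arrays.binarySearch returns a non-negative index only when the key is found.
    static boolean contains(int[] sorted, int key) {
        return Arrays.binarySearch(sorted, key) >= 0;
    }

    public static void main(String[] args) {
        Collection<Integer> keys = new TreeSet<Integer>(Arrays.asList(42, 7, 19));
        int[] sorted = toSortedArray(keys);
        System.out.println(contains(sorted, 19));  // true
        System.out.println(contains(sorted, 20));  // false
    }
}
```

Binary search is O(log n) per lookup while HashSet.contains is O(1) on average, so the int[] mainly saves memory; whether it is also faster for a given data set is something only a measurement will show.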
Harnessing and safety will be done in other parts of the code. [ January 27, 2008: Message edited by: Nicholas Jordan ]