So you have a datum that matches a customer in one repository, a more recent datum for the same customer in the other repository, and you want to update the older repository? Is it something like that?
When I see customer and data, the word Map immediately comes into my head . . .
Is there any way you can take some sort of index out of the data to use as a "key"? Then you can use the remainder of the datum as a "value". Then, maybe, you can use the put or putAll methods of your destination Map to transfer the data. Have a look at the Map interface and the HashMap class and see whether you think that would work.
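A minimal sketch of the key/value idea, assuming a customer ID string can serve as the key. The `Details` record, its fields and the `update` method are made up for illustration; only `HashMap` and `putAll` come from the suggestion above.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// Hypothetical "value" part of a datum; the field names are assumptions.
record Details(LocalDate date, String description, long priceInCents) {}

class MapUpdateSketch {
    // Returns a copy of 'older' in which every key also present in 'newer'
    // has been overwritten, and keys found only in 'newer' have been added.
    static Map<String, Details> update(Map<String, Details> older,
                                       Map<String, Details> newer) {
        Map<String, Details> result = new HashMap<>(older);
        result.putAll(newer);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Details> older = new HashMap<>();
        older.put("C001", new Details(LocalDate.of(2005, 1, 1), "Widget", 1000L));
        Map<String, Details> newer = new HashMap<>();
        newer.put("C001", new Details(LocalDate.of(2005, 1, 20), "Widget", 1200L));
        System.out.println(update(older, newer));
    }
}
```

This only works when equal keys really do mean "same customer", because `putAll` silently replaces the old value.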
Thanks for your reply, Campbell.
I see what you're saying, and that would work if we had a unique "key" we could use.
For example, if we could say that two items are the same when the date, description and price all match, and different otherwise.
The problem is, we need to be able to find the best match possible on more vague criteria.
For example, if we can't find a match on all three pieces of data then we could match on two.
Or if the date is within a week we may consider that a "match".
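Those criteria could perhaps be written as a score. This is only a sketch: the `Purchase` record, its field names, the case-insensitive description comparison and the equal weighting of the three criteria are all assumptions.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical item type; the field names are assumptions.
record Purchase(LocalDate date, String description, long priceInCents) {}

class MatchScore {
    // Returns how many of the three criteria agree (0 to 3).
    // The date counts as agreeing when the two dates are at most 7 days apart.
    static int score(Purchase a, Purchase b) {
        int score = 0;
        long daysApart = Math.abs(ChronoUnit.DAYS.between(a.date(), b.date()));
        if (daysApart <= 7) score++;
        if (a.description().equalsIgnoreCase(b.description())) score++;
        if (a.priceInCents() == b.priceInCents()) score++;
        return score;
    }

    public static void main(String[] args) {
        Purchase a = new Purchase(LocalDate.of(2005, 1, 1), "Widget", 1000L);
        Purchase b = new Purchase(LocalDate.of(2005, 1, 5), "Widget", 1200L);
        System.out.println(score(a, b));
    }
}
```

A score of 3 is a full match, and a score of 2 is the "match on two" case.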
I don't think maps can help us in this situation as we don't have a "key" that we can use here.
I think I'll need to start with something like the following
Then I'm not sure where I go after this - maybe something like:
You could have two items with the same description and same date, but different price. Then you are making the assumption that these are in fact a partial match and probably the same item. How can you be sure they are the same item?
How about putting them into sorted sets, using comparators for date; for date and price; or for date, price and description? Would that help, or not?
How about putting the lot into a List, sorting with a price comparator, then a description comparator, then by date? Because Collections.sort is guaranteed to be stable, sorting in that reverse order will give you a List ordered by date, then description within date, then price within description. Then you can create two such Lists and iterate through them looking for matches. You may have to go backwards and forwards to get matches.
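A sketch of that sorting idea, with a hypothetical `Sale` record standing in for the real data. The first method sorts in three passes, least significant key first. The second builds the same ordering with one chained `Comparator`, which is probably easier to read.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical item type; the field names are assumptions.
record Sale(LocalDate date, String description, long priceInCents) {}

class SortSketch {
    // Three stable sorts, least significant key first:
    // price, then description, then date.
    static List<Sale> sortInPasses(List<Sale> input) {
        List<Sale> list = new ArrayList<>(input);
        list.sort(Comparator.comparingLong(Sale::priceInCents));
        list.sort(Comparator.comparing(Sale::description));
        list.sort(Comparator.comparing(Sale::date));
        return list;
    }

    // The same ordering expressed as a single chained comparator:
    // date, then description within date, then price within description.
    static List<Sale> sortChained(List<Sale> input) {
        List<Sale> list = new ArrayList<>(input);
        list.sort(Comparator.comparing(Sale::date)
                .thenComparing(Sale::description)
                .thenComparingLong(Sale::priceInCents));
        return list;
    }

    public static void main(String[] args) {
        List<Sale> sales = List.of(
                new Sale(LocalDate.of(2005, 2, 1), "B", 200L),
                new Sale(LocalDate.of(2005, 1, 1), "B", 300L),
                new Sale(LocalDate.of(2005, 1, 1), "A", 500L));
        System.out.println(sortChained(sales));
    }
}
```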
Anybody else got any ideas, please? I feel I am scraping the bottom of the barrel for ideas, and often other people can see another solution.
Thanks for your help, Campbell.
Yes, it's possible we would end up matching data that is not actually the same item.
Unfortunately that is a known risk, but we don't have a unique key to match the items, so it is something we have to live with.
What I'm really looking for is some sort of algorithm to get the best fit match between two sets of data.
I was hoping there was something standard out there but I can't find anything.
I'm not sure how sorting would work as there are a number of different criteria that we are matching on.
For example, if we sort on date first, we may miss a record that matches on description and price but whose date is slightly further away than another record's.
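One way to avoid giving any single criterion priority might be to score every candidate pair and keep the highest-scoring one. The sketch below is a greedy pass, so it is not guaranteed to find the best overall pairing and the result can depend on the order of the source list. The `Row` and `Pair` records, the equal weights and the `minScore` threshold are all assumptions.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

// Hypothetical types; the names and fields are assumptions.
record Row(LocalDate date, String description, long priceInCents) {}
record Pair(Row source, Row target) {}

class BestFitSketch {
    // Counts agreeing criteria (0 to 3); dates agree when at most 7 days apart.
    static int score(Row a, Row b) {
        int score = 0;
        if (Math.abs(ChronoUnit.DAYS.between(a.date(), b.date())) <= 7) score++;
        if (a.description().equalsIgnoreCase(b.description())) score++;
        if (a.priceInCents() == b.priceInCents()) score++;
        return score;
    }

    // For each source row, picks the not-yet-used target with the highest score,
    // provided that score is at least minScore. Each target is used at most once.
    static List<Pair> match(List<Row> sources, List<Row> targets, int minScore) {
        List<Row> unused = new ArrayList<>(targets);
        List<Pair> matches = new ArrayList<>();
        for (Row source : sources) {
            Row best = null;
            int bestScore = minScore - 1;
            for (Row candidate : unused) {
                int s = score(source, candidate);
                if (s > bestScore) {
                    best = candidate;
                    bestScore = s;
                }
            }
            if (best != null) {
                matches.add(new Pair(source, best));
                unused.remove(best); // removes the first equal row
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Row> sources = List.of(new Row(LocalDate.of(2005, 1, 1), "Widget", 1000L));
        List<Row> targets = List.of(
                new Row(LocalDate.of(2005, 1, 2), "Gadget", 500L),
                new Row(LocalDate.of(2005, 2, 20), "Widget", 1000L));
        System.out.println(match(sources, targets, 2));
    }
}
```

Here a target that agrees on description and price beats one that agrees only on date, even though its date is further away.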
Maybe something like the following would work (basically a cleaner version of what I had in the original post).