Hi Bill,
It looks like you are facing a kind of join problem. What you implemented first is a so-called
nested loop join, the simplest and most basic join algorithm, but also the least efficient: it's O(n²).
The algorithm discussed in the thread is a so-called hash join, which is a little more complex and works for equi-joins only, but is also far more efficient: it's O(n).
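
To make the hash join idea a bit more concrete, here is a minimal sketch in Java. It is only an illustration under my own assumptions, not the code from the thread: the class name HashJoinSketch, the method leftOuterJoin and the row layout (a two-element array holding key and value) are all made up for the example. It indexes the right-hand side by key once and then probes that index with each left-hand line, so every line is touched only once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rows are assumed to be {key, value} pairs.
final class HashJoinSketch {

    // Left outer hash join on the first column of each row.
    // Returns rows of the form {key, leftValue, rightValue};
    // rightValue is null when no right-hand row matches.
    static List<String[]> leftOuterJoin(List<String[]> left, List<String[]> right) {
        // Build phase: index the right-hand rows by their join key.
        Map<String, List<String[]>> index = new HashMap<>();
        for (String[] row : right) {
            index.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row);
        }

        // Probe phase: look up each left-hand row in the index.
        List<String[]> result = new ArrayList<>();
        for (String[] l : left) {
            List<String[]> matches = index.get(l[0]);
            if (matches == null) {
                // No partner found: keep the left row, pad with null.
                result.add(new String[] { l[0], l[1], null });
            } else {
                for (String[] r : matches) {
                    result.add(new String[] { l[0], l[1], r[1] });
                }
            }
        }
        return result;
    }
}
```

Note that this sketch keeps the whole right-hand side in memory. If that is too much, you could instead build the index on the smaller side and stream the larger file past it, at the cost of some extra bookkeeping for the unmatched lines.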
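If your values are strings, which I assume for CSV data, and if your 3,000,000 lines are sorted, you might also give some thought to a so-called sort-merge join, which is O(n log n). The log factor comes from the sorting; if both inputs are already sorted on the join key, the merge pass itself is linear.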
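
The merge step of a sort-merge join could look roughly like the following Java sketch. Again, this is only an illustration: the class name MergeJoinSketch, the method leftOuterJoin and the {key, value} row layout are my own assumptions. It also assumes that both lists are already sorted by key in String.compareTo order; if your files are sorted differently, the comparison has to match that order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: rows are assumed to be {key, value} pairs,
// and both inputs must already be sorted by key (String.compareTo order).
final class MergeJoinSketch {

    // Left outer merge join on the first column of each row.
    // Returns rows of the form {key, leftValue, rightValue};
    // rightValue is null when no right-hand row matches.
    static List<String[]> leftOuterJoin(List<String[]> left, List<String[]> right) {
        List<String[]> result = new ArrayList<>();
        int j = 0; // start of the current run of candidate right-hand rows

        for (String[] l : left) {
            // Skip right-hand rows whose key is smaller than the left key.
            while (j < right.size() && right.get(j)[0].compareTo(l[0]) < 0) {
                j++;
            }

            // Emit all right-hand rows with an equal key. j itself stays put,
            // so a following left row with the same key sees the same run.
            boolean matched = false;
            for (int k = j; k < right.size() && right.get(k)[0].equals(l[0]); k++) {
                result.add(new String[] { l[0], l[1], right.get(k)[1] });
                matched = true;
            }

            if (!matched) {
                // No partner found: keep the left row, pad with null.
                result.add(new String[] { l[0], l[1], null });
            }
        }
        return result;
    }
}
```

Since both inputs are only walked front to back, this variant would also work when reading the two files line by line instead of loading them into lists, as long as runs of equal keys on the right-hand side are buffered.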
Literature on database systems will provide you with a lot of details and implementation hints for all three of the algorithms, just google them. If you face problems like the one discussed here more often, it'll be really worth the read.
Or insert your data into a database using JDBC and let the database engine do the join for you. It would be a left outer join, assuming that your 17,000 lines are the left-hand side.
Hope that helps,
Guido