...would it be better to store data in files that can then be read, rather than storing everything in arrays?
Is there a way to read the original file data into memory?
But most of the data is redundant, so why store it at all?
2. using the File object, get the length of the file and create a matching byte buffer
3. read the entire file into the byte buffer
It may be more efficient to scan with BufferedReader's readLine() - it handles line-terminator characters correctly, and yields a String we'll want to parse anyway.
4. scan for line ends, recording the matching line starts as an ArrayList of Integer objects
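The byte-buffer indexing in steps 2-4 can be sketched like this (a minimal sketch; the class, method, and file names are made up for illustration, and the real program would read the actual input file):

```java
import java.util.ArrayList;
import java.util.List;

public class LineIndexDemo {

    // Scan a byte buffer for '\n', recording the offset at which each line starts.
    public static List<Integer> lineStarts(byte[] data) {
        List<Integer> starts = new ArrayList<>();
        if (data.length > 0) {
            starts.add(0);                      // first line starts at offset 0
        }
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\n' && i + 1 < data.length) {
                starts.add(i + 1);              // next line starts just past the '\n'
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // In the real program you would first read the whole file into the buffer, e.g.
        // byte[] data = java.nio.file.Files.readAllBytes(java.nio.file.Path.of("snps.txt"));
        byte[] data = "header1\nheader2\n21876767 ...\n".getBytes();
        System.out.println(lineStarts(data));   // offsets of each line start
    }
}
```

With the line starts recorded as an ArrayList of Integer objects, any line can later be re-read directly from the buffer without rescanning the file.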
I don't know how the SNPLocation headings are derived: numbers like 21876767, 21880326 and 21884299 may not be in the appropriate order in the input file, and I'm not sure how you loop over them from 1 to 20000 without knowing them in advance (there is no common difference between those three, for instance). Unless there's a pattern or formula I'm missing that determines them, it seemed sensible to order on the values of the input SNPLocations rather than trust the file to already be in the correct order. If there is a pattern to exploit, HashMap is probably faster, though I would think TreeMap only has a performance overhead when adding the data (not retrieving it). Otherwise I think we have basically identical approaches!
I don't see a purpose in using TreeMap rather than HashMap, since the ordering we need in the output is achieved by the loops; HashMap ought to be faster. But I might be missing something.
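For what it's worth, the difference is easy to demonstrate: a TreeMap iterates its keys in sorted order regardless of insertion order, while a HashMap makes no ordering guarantee. A small sketch using the three locations from the thread (the class and value scheme here are just an illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class SnpOrderDemo {

    // Insert SNP locations in file order; TreeMap hands them back sorted.
    public static List<Integer> sortedLocations(int[] fileOrder) {
        TreeMap<Integer, Integer> byLocation = new TreeMap<>();
        for (int i = 0; i < fileOrder.length; i++) {
            byLocation.put(fileOrder[i], i);    // key = SNP location, value = line index
        }
        return new ArrayList<>(byLocation.keySet());
    }

    public static void main(String[] args) {
        // The three locations from the thread, deliberately out of order.
        int[] fileOrder = {21880326, 21876767, 21884299};
        System.out.println(sortedLocations(fileOrder));
        // TreeMap iteration is sorted: [21876767, 21880326, 21884299]
    }
}
```

So if the output loops already impose the ordering, the TreeMap's sorting is redundant and HashMap's O(1) lookups should win; the TreeMap only earns its keep when the file order can't be trusted.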
I was interpreting n as the size of the resulting matrix (i.e. an n x n matrix), while in fact, if N is the number of lines in the input file (ignoring the first two), then we should have:
Doesn't your map have only n entries? Same as mine?