Out of Memory Exception when using Levenshtein Algorithm
kaparapu madhu
Greenhorn
Joined: Jun 13, 2005
Posts: 4
posted
0
I am comparing two html files, using Levenshtein Algorithm. I am reading these two files in arraylist and giving as input to Levenshteing Algorithm. This algorithm uses Edit matrix for comparision with size m*n ( arrayList1.size() * arrayList2.size() ). It is working good and efficient for small files, but for large file it throws out of memory exception. To overcome this we used -Xmx700000000. But the problem is, it takes lot of time to generate diffReport nearly 15 minutes and above. I am searching for a possible solution to reduce the time
Geoffrey Falk
Ranch Hand
Joined: Aug 17, 2001
Posts: 171
posted
0
You do not need to allocate a 2x2 array. It is possible to implement the Levenshtein algorithm using only 1-dimensional arrays.
Calculate the matrix one row at a time. You only need the previous row to determine the values in the next row.
Geoffrey
Sun Certified Programmer for the Java 2 Platform
Geoffrey Falk
Ranch Hand
Joined: Aug 17, 2001
Posts: 171
posted
0
This uses JDK 1.5:
[ June 14, 2005: Message edited by: Geoffrey Falk ]