aspose file tools
The moose likes Java in General and the fly likes Out of Memory Exception when using Levenshtein Algorithm Big Moose Saloon
  Search | Java FAQ | Recent Topics
Register / Login
JavaRanch » Java Forums » Java » Java in General
Reply Bookmark "Out of Memory Exception when using Levenshtein Algorithm" Watch "Out of Memory Exception when using Levenshtein Algorithm" New topic
Author

Out of Memory Exception when using Levenshtein Algorithm

kaparapu madhu
Greenhorn

Joined: Jun 13, 2005
Posts: 4
I am comparing two html files, using Levenshtein Algorithm. I am reading these two files in arraylist and giving as input to Levenshteing Algorithm. This algorithm uses Edit matrix for comparision with size m*n ( arrayList1.size() * arrayList2.size() ).
It is working good and efficient for small files, but for large file it throws out of memory exception. To overcome this we used -Xmx700000000. But the problem is, it takes lot of time to generate diffReport nearly 15 minutes and above. I am searching for a possible solution to reduce the time
Geoffrey Falk
Ranch Hand

Joined: Aug 17, 2001
Posts: 171
You do not need to allocate a 2x2 array. It is possible to implement the Levenshtein algorithm using only 1-dimensional arrays.

Calculate the matrix one row at a time. You only need the previous row to determine the values in the next row.

Geoffrey


Sun Certified Programmer for the Java 2 Platform
Geoffrey Falk
Ranch Hand

Joined: Aug 17, 2001
Posts: 171
This uses JDK 1.5:




[ June 14, 2005: Message edited by: Geoffrey Falk ]
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Out of Memory Exception when using Levenshtein Algorithm
 
Similar Threads
Help me pronounce - 'Levenshtein '
BadPaddingException in AES
out of memory
Regarding the IText API
Interesting question - how would you do this in Java