This week's book giveaway is in the Jobs Discussion forum.
We're giving away four copies of Soft Skills: The software developer's life manual and have John Sonmez on-line!
See this thread for details.
Win a copy of Soft Skills: The software developer's life manual this week in the Jobs Discussion forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Out of Memory Exception when using Levenshtein Algorithm

 
kaparapu madhu
Greenhorn
Posts: 4
  • 0
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I am comparing two html files, using Levenshtein Algorithm. I am reading these two files in arraylist and giving as input to Levenshteing Algorithm. This algorithm uses Edit matrix for comparision with size m*n ( arrayList1.size() * arrayList2.size() ).
It is working good and efficient for small files, but for large file it throws out of memory exception. To overcome this we used -Xmx700000000. But the problem is, it takes lot of time to generate diffReport nearly 15 minutes and above. I am searching for a possible solution to reduce the time
 
Geoffrey Falk
Ranch Hand
Posts: 171
1
  • 0
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You do not need to allocate a 2x2 array. It is possible to implement the Levenshtein algorithm using only 1-dimensional arrays.

Calculate the matrix one row at a time. You only need the previous row to determine the values in the next row.

Geoffrey
 
Geoffrey Falk
Ranch Hand
Posts: 171
1
  • 0
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
This uses JDK 1.5:




[ June 14, 2005: Message edited by: Geoffrey Falk ]
 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic