To detect block moves

Shine Tom
Greenhorn

Joined: Sep 17, 2012
Posts: 19
Hello,

I need to implement a program to detect block moves between two files. If someone has code for this, please post it. Thank you in advance.

Regards,
shine
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 8016
    

Shine Tom wrote:I need to implement a program to detect block moves between two files. If someone has code for this, please post it.

First: You'll have to explain what you mean by "block moves".

Second: 'Fraid that's not how it works here. We are NotACodeMill and we frown on people simply posting ready-made code.

Third: The java.nio.file package, new in Java 7, has a WatchService interface, but only you can say whether it covers your needs. I suggest you read the tutorials. Otherwise, there are several Java "file watching" packages out there in Internet-land.

Winston


Isn't it funny how there's always time and money enough to do it WRONG?
Shine Tom
Greenhorn

Joined: Sep 17, 2012
Posts: 19
Hello,
I am sorry for posting like that. I want to write a program that compares two files. It should detect new insertions, deletions, and also block moves of content from one location to another (say, lines 15 to 20 moved to lines 25 to 30). Could you help me with how to start and which classes could be useful?
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12806
    
What do you know about these files - are they text or arbitrary binary?

How far have you gotten - can you detect that two files are not identical?

Bill
Shine Tom
Greenhorn

Joined: Sep 17, 2012
Posts: 19
Hi,
The intention is to compare two versions of the same file (C files) and then detect the changes (insertions, deletions and block moves). I am trying to implement "The String-to-String Correction Problem with Block Moves" by Walter F. Tichy. It would be a great help if you could give me some ideas on how to start.

Many Thanks,
Shine
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 8016
    

Shine Tom wrote:The intention is to compare two versions of the same file (C files) and then detect the changes (insertions, deletions and block moves). I am trying to implement "The String-to-String Correction Problem with Block Moves" by Walter F. Tichy. It would be a great help if you could give me some ideas on how to start.

Seems to me that this is two problems:
1. Detecting changes.
2. Detecting what changes were made.
The first can probably be done with a WatchService or a decent digest algorithm (e.g., MD5). The second needs some sort of 'diff' function, but you'll need a 'before' and an 'after' image if you're going to do that.

Winston
Shine Tom
Greenhorn

Joined: Sep 17, 2012
Posts: 19
Hello Everyone,

@Winston: the algorithm you pointed to seems a little complicated for me.
I started with a simpler approach. My aim is this: given two files, first a Source with
a
b
c
d
e
f
g
h

and Target with
a
b
c
d
l
m
e
f
a
b
c

Then I want the result to show that, in the target file, lines (0-4; a,b,c,d) and lines (6-8; e,f) were moved from the source file. The lines (8-11; a,b,c), though the same as in the source file, are not important, since I already have a larger block move (0-4; a,b,c,d). The program I have written so far reports every position of a block move. Can someone help me reach my target? Also, if you see any improvements the code needs, please suggest them.
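The requirement above can be sketched roughly as follows. This is only an illustration, not the program referred to in the post: it assumes lines are the unit of comparison, uses a greedy longest-match scan over the target (similar in spirit to the block-move idea in Tichy's paper), and then applies one possible reading of the "larger block wins" rule. The names BlockMoveSketch, Block, findBlocks and keepLargest are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class BlockMoveSketch {
    // One matched block; ranges are half-open, like the 0-4 / 6-8 notation above.
    static final class Block {
        final int sourceStart, targetStart, length;
        Block(int sourceStart, int targetStart, int length) {
            this.sourceStart = sourceStart;
            this.targetStart = targetStart;
            this.length = length;
        }
        @Override public String toString() {
            return "source " + sourceStart + "-" + (sourceStart + length)
                 + " -> target " + targetStart + "-" + (targetStart + length);
        }
    }

    // Greedy scan: at each target position, take the longest run found anywhere in source.
    static List<Block> findBlocks(List<String> source, List<String> target) {
        List<Block> blocks = new ArrayList<>();
        int t = 0;
        while (t < target.size()) {
            int bestStart = -1, bestLen = 0;
            for (int s = 0; s < source.size(); s++) {
                int len = 0;
                while (s + len < source.size() && t + len < target.size()
                        && source.get(s + len).equals(target.get(t + len))) {
                    len++;
                }
                if (len > bestLen) { bestLen = len; bestStart = s; }
            }
            if (bestLen == 0) {
                t++;                       // no match in source: an inserted line
            } else {
                blocks.add(new Block(bestStart, t, bestLen));
                t += bestLen;
            }
        }
        return blocks;
    }

    // Assumed rule: drop a block if its source range overlaps a longer block already kept.
    static List<Block> keepLargest(List<Block> blocks) {
        List<Block> byLength = new ArrayList<>(blocks);
        byLength.sort((x, y) -> Integer.compare(y.length, x.length));
        List<Block> kept = new ArrayList<>();
        for (Block b : byLength) {
            boolean overlaps = false;
            for (Block k : kept) {
                if (b.sourceStart < k.sourceStart + k.length
                        && k.sourceStart < b.sourceStart + b.length) {
                    overlaps = true;
                }
            }
            if (!overlaps) kept.add(b);
        }
        kept.sort((x, y) -> Integer.compare(x.targetStart, y.targetStart));
        return kept;
    }
}
```

For the Source and Target listed above, findBlocks reports three blocks (source 0-4 -> target 0-4, source 4-6 -> target 6-8, source 0-3 -> target 8-11), and keepLargest drops the last one because it overlaps the larger a,b,c,d block. The scan is quadratic in the worst case, which is fine for small files but would need an index of line positions for large ones.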

Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 8016
    

Shine Tom wrote:Then I want the result to show that, in the target file, lines (0-4; a,b,c,d) and lines (6-8; e,f) were moved from the source file. The lines (8-11; a,b,c), though the same as in the source file, are not important, since I already have a larger block move (0-4; a,b,c,d). The program I have written so far reports every position of a block move. Can someone help me reach my target? Also, if you see any improvements the code needs, please suggest them...

I have to say I haven't looked at it fully, but at first glance it looks too simple. As Einstein said: "Everything should be as simple as possible, but no simpler", and a proper diff is not as simple as you probably think.

The standard algorithm uses a "staircase" methodology to swap between the files in order to find maximal subsequences of matching lines - the purpose being to work out - especially in the case of changed lines - the minimal changes required to make one file look like the other.
If you already know which is your 'before' and which your 'after' file, you can probably reduce the complexity even more; but I'd still be looking to test it with sets of line changes rather than simply additions and removals.

The standard diff program also uses the same algorithm for changes within lines rather than just changes to whole lines, and so can detect whether a line has simply changed or has truly been added or removed. Whether you want to get that sophisticated is up to you, but I wouldn't be at all surprised if somebody has already written a version of it for Java.
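To make the "maximal subsequences of matching lines" idea concrete, here is a minimal sketch. It is not the staircase algorithm described above; it swaps in the simpler textbook dynamic-programming longest common subsequence (LCS), which gives the same kind of result on small inputs at a higher memory cost. The class name LcsDiffSketch and the "- " / "+ " output prefixes are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LcsDiffSketch {
    // Returns a line-by-line script: "  x" = unchanged, "- x" = removed, "+ x" = added.
    static List<String> diff(List<String> before, List<String> after) {
        int n = before.size(), m = after.size();

        // lcs[i][j] = length of the LCS of before[i..] and after[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = before.get(i).equals(after.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk the table from the top-left, emitting one operation per step.
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < n && j < m) {
            if (before.get(i).equals(after.get(j))) {
                out.add("  " + before.get(i));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("- " + before.get(i));   // dropping this 'before' line loses nothing
                i++;
            } else {
                out.add("+ " + after.get(j));
                j++;
            }
        }
        while (i < n) out.add("- " + before.get(i++));
        while (j < m) out.add("+ " + after.get(j++));
        return out;
    }
}
```

Calling diff on the lines a, b, c and a, x, c yields "  a", "- b", "+ x", "  c": the changed line shows up as a removal followed by an addition. The table needs (n+1) x (m+1) integers, so this form suits small files only.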

Winston

PS: I should also add that a decent checksum algorithm is probably simpler than anything you're likely to write yourself if you simply need to know whether any change has been made or not, particularly if you're happy to assume that matching checksums == no change. Java also already has classes for MD5.
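A minimal sketch of the checksum check, using the JDK's java.security.MessageDigest with the "MD5" algorithm name. The class DigestSketch and its method names are hypothetical, and reading whole files into memory is an assumption that only holds for reasonably small files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestSketch {
    // Computes the 16-byte MD5 digest of the given bytes.
    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // True when both inputs have the same digest; treated here as "no change".
    static boolean sameDigest(byte[] a, byte[] b) {
        return MessageDigest.isEqual(md5(a), md5(b));
    }

    // Convenience overload for files; reads each file fully into memory.
    static boolean sameDigest(Path a, Path b) {
        try {
            return sameDigest(Files.readAllBytes(a), Files.readAllBytes(b));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

This only answers "did anything change?"; it says nothing about what changed, which still needs a diff.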
Shine Tom
Greenhorn

Joined: Sep 17, 2012
Posts: 19
Hello Winston ,
Thank you for your update. I will do a little more research on this and update the thread. As you said, a diff algorithm is not as simple as what I have written. What I tried works only for single words in a file, not for comparing normal files.
Regards,
Shine
Shine Tom
Greenhorn

Joined: Sep 17, 2012
Posts: 19
Hello Everyone,

I found a diff algorithm on the internet and modified it to report 'moves' instead of 'delete + insert'. The program also reports plain inserts and deletes. Since my requirement is fulfilled, I am marking the thread as resolved.

Thank you all once again.
 