Detect duplicated programs

Sitrarasu Jayaraman
Greenhorn

Joined: Feb 06, 2011
Posts: 7
Hello,

I need to develop an application that takes two Java source files (e.g. file1.java, file2.java) as input and tells whether one was copied from the other.

This project is for my department; they want to find out who is copying when a programming assignment is given.

A copier should not escape detection just by adding extra comments, changing variable names, or moving code around (e.g. writing main() at the end instead of at the beginning).

I am looking for the best algorithm to solve this problem.

Please help me with your suggestions. Any help in this regard is greatly appreciated; please post whatever idea comes to mind.

Thanks for your help.
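
One idea that fits the "moving code around" requirement is to compare sets of short token runs instead of whole files. The sketch below is only an illustration, not a finished detector: the class name `ShingleSimilarity`, the crude tokenizer and the run length `k` are all assumptions made for the example. It cuts each file into overlapping runs of `k` tokens ("shingles") and reports the Jaccard similarity of the two sets, which barely changes when whole methods swap places.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShingleSimilarity {
    // Crude tokenizer (an assumption for this sketch): words, integers, or single
    // symbols. Comments and string literals are NOT treated specially here.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*|\\d+|\\S");

    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Every run of k consecutive tokens becomes one shingle.
    static Set<String> shingles(List<String> tokens, int k) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i + k <= tokens.size(); i++) {
            result.add(String.join(" ", tokens.subList(i, i + k)));
        }
        return result;
    }

    // Jaccard similarity: size of the intersection divided by size of the union.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // two empty inputs are treated as identical
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    static double similarity(String source1, String source2, int k) {
        return jaccard(shingles(tokenize(source1), k), shingles(tokenize(source2), k));
    }

    public static void main(String[] args) {
        String original = "void a() { x = 1; } void b() { y = 2; }";
        String reordered = "void b() { y = 2; } void a() { x = 1; }";
        // Only the shingles that straddle the boundary between the two methods differ.
        System.out.println(similarity(original, reordered, 3));
    }
}
```

A score close to 1.0 would flag a pair for a human to look at; where to put the threshold is a judgment call that has to be tuned on real assignments.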
Christophe Verré
Sheriff

Joined: Nov 24, 2005
Posts: 14688

A copier should not escape detection just by adding extra comments, changing variable names, or moving code around (e.g. writing main() at the end instead of at the beginning).

When is it going to fail, then?


All roads lead to JavaRanch
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39478
Such applications are already available; we use one here at Teesside. Would you like me to find out what it's called?
Sitrarasu Jayaraman
Greenhorn

Joined: Feb 06, 2011
Posts: 7
@Christophe Verré

The check will fail if the logic they used is different, or if the coding style is different.

It should report a match if someone replicates the logic exactly but changes the variable names and adds extra whitespace and comments to get past a quick inspection by a human (i.e. the staff members).
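
A rough sketch of how that kind of disguise could be neutralized: drop comments and whitespace, and replace every identifier with a placeholder numbered by first appearance, so that consistent renaming disappears. Everything here is an assumption made for illustration, including the class name `SourceNormalizer`, the regex tokenizer and the deliberately partial keyword list; a real tool would more likely use a proper Java lexer or parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SourceNormalizer {
    // Partial keyword list, enough for illustration; extend it for real use.
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "boolean", "break", "case", "catch", "char", "class", "continue", "do",
            "double", "else", "false", "final", "for", "if", "import", "int", "long",
            "new", "null", "private", "public", "return", "static", "switch", "this",
            "throw", "true", "try", "void", "while"));

    // Alternatives are tried in order, so a string literal containing "//"
    // is matched as a string before the comment rule can see it.
    private static final Pattern TOKEN = Pattern.compile(
            "\"(?:\\\\.|[^\"\\\\])*\""        // string literal
            + "|'(?:\\\\.|[^'\\\\])*'"        // char literal
            + "|//[^\\n]*"                    // line comment
            + "|/\\*[\\s\\S]*?\\*/"           // block comment
            + "|[A-Za-z_$][A-Za-z0-9_$]*"     // identifier or keyword
            + "|\\d+(?:\\.\\d+)?"             // number
            + "|\\S");                        // any other single symbol

    static List<String> normalize(String source) {
        List<String> tokens = new ArrayList<>();
        Map<String, String> renamed = new HashMap<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            String token = m.group();
            if (token.startsWith("//") || token.startsWith("/*")) {
                continue; // comments carry no logic, so drop them
            }
            if (Character.isJavaIdentifierStart(token.charAt(0))
                    && !KEYWORDS.contains(token)) {
                // First distinct identifier becomes ID0, the next ID1, and so on.
                String placeholder = renamed.get(token);
                if (placeholder == null) {
                    placeholder = "ID" + renamed.size();
                    renamed.put(token, placeholder);
                }
                token = placeholder;
            }
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String a = "int sum = 0; // total\nfor (int i = 0; i < n; i++) { sum += i; }";
        String b = "/* renamed */ int total=0;\nfor (int k = 0; k < limit; k++) {\n  total += k;\n}";
        System.out.println(normalize(a).equals(normalize(b)));
    }
}
```

Because the placeholders are numbered in order of first appearance, this exact comparison on its own breaks when methods are reordered; the normalized tokens would have to be fed into a comparison that tolerates reordering.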

And I believe that if someone can change the programming constructs altogether, they need not be considered a copier. The assignment is aimed at checking their programming ability, so if they can change the constructs used in the program altogether, that should be fine.
ex:


Note: it would be nice if our application could also detect similar flow like the following, i.e. where only the looping construct is changed and the rest remains the same.
ex:


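One hedged way to approach the "same flow, different loop" case is to compare only a control-flow skeleton of each file: keep the control keywords in order of appearance and treat `for` and `while` as the same thing. The class name `ControlSkeleton`, the keyword selection and the `LOOP` label are assumptions made for this sketch. It is a very coarse fingerprint, and `do ... while` is not handled specially (its `while` is still reported as a `LOOP`, but only after the body's keywords).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ControlSkeleton {
    // String literals, char literals and comments are blanked out first,
    // so that keywords appearing inside them are ignored.
    private static final Pattern NOISE = Pattern.compile(
            "\"(?:\\\\.|[^\"\\\\])*\""
            + "|'(?:\\\\.|[^'\\\\])*'"
            + "|//[^\\n]*"
            + "|/\\*[\\s\\S]*?\\*/");

    // The control keywords this sketch looks at (a deliberately small selection).
    private static final Pattern CONTROL = Pattern.compile(
            "\\b(for|while|if|else|switch|return|break|continue)\\b");

    static List<String> skeleton(String source) {
        String code = NOISE.matcher(source).replaceAll(" ");
        List<String> result = new ArrayList<>();
        Matcher m = CONTROL.matcher(code);
        while (m.find()) {
            String keyword = m.group(1);
            if (keyword.equals("for") || keyword.equals("while")) {
                result.add("LOOP"); // both loop forms collapse to one label
            } else {
                result.add(keyword.toUpperCase(Locale.ROOT));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String withFor = "int s = 0; for (int i = 0; i < n; i++) { if (i % 2 == 0) s += i; } return s;";
        String withWhile = "int t = 0; int j = 0; while (j < n) { if (j % 2 == 0) t += j; j++; } return t;";
        System.out.println(skeleton(withFor));
        System.out.println(skeleton(withWhile));
    }
}
```

Two files with equal skeletons are not necessarily copies, since short programs solving the same assignment will often share one, so this is better used as one signal among several than as a verdict.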
Sitrarasu Jayaraman
Greenhorn

Joined: Feb 06, 2011
Posts: 7
@Campbell Ritchie,

That would be helpful to me, if you could do that.

Looking forward to your help.

Thanks.
Leonardo Shikida
Greenhorn

Joined: Jan 02, 2003
Posts: 29
Try googling "java code plagiarism analysis".

There are several tools/techniques.

TIA

Leo K.
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39478
So sorry; I forgot to ask. I went round today, and haven't yet found anybody who knows. I shall try to remember to ask again next week.
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39478
It is called Turnitin.
 
 