File size comparison

Nakataa Kokuyo
Ranch Hand

Joined: Jul 24, 2011
Posts: 189
Good day,

I would like to create a component that detects whether a file has been modified before it is processed.

I am seeking advice on whether my idea makes sense: is it the right way to detect file modification based on the file size value?

Below is the flow:

1. Get the size of the file.
2. Hash the file size value with the MD5 algorithm; say this gives us the value "0123sdf".
3. To guard against a user modifying the file content, before the file is processed we take the file and compute the MD5 value again. If it returns "0123sdf", we are sure the file has not been modified.

My questions:
a. Is this the right approach to detect file modification?
b. Which library would you advise, or will java.security.DigestInputStream do?

Thanks for enlightening me!
K. Tsang
Bartender

Joined: Sep 13, 2007
Posts: 2615

IMHO, using file size as the only indicator of whether a file has changed is unreliable. The file size may stay the same even though the content has changed (e.g. a single letter replaced).
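
To illustrate the point, here is a minimal sketch that hashes the file content rather than the file size, using java.security.DigestInputStream as asked about in the original question. The class name ContentDigest and the method sha256Hex are made up for this example, and SHA-256 is an assumption swapped in for MD5 (passing "MD5" to MessageDigest.getInstance would work the same way).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only.
public class ContentDigest {

    /** Streams the whole file through a DigestInputStream and returns the digest as lowercase hex. */
    public static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256"); // assumption: SHA-256 instead of MD5
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // Nothing to do here: each read updates the digest as a side effect.
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Two files of identical size whose content differs by one character.
        Path original = Files.createTempFile("original", ".txt");
        Path tampered = Files.createTempFile("tampered", ".txt");
        Files.write(original, "amount=100".getBytes(StandardCharsets.UTF_8));
        Files.write(tampered, "amount=900".getBytes(StandardCharsets.UTF_8));

        System.out.println("same size:   " + (Files.size(original) == Files.size(tampered)));
        System.out.println("same digest: " + sha256Hex(original).equals(sha256Hex(tampered)));
    }
}
```

The two temp files have the same size but different digests, which is exactly the case a size-only check would miss. Note that a digest stored where the user can also rewrite it only detects accidental changes, so where the reference value is kept matters as much as how it is computed.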

I suggest using the last modified date/time.

You should also consider how to make sure no one tampers with the file while your program is working on it (in memory, perhaps). In other words, can the file be locked during processing, similar to database row or table locking?
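
As a rough sketch of the locking idea, java.nio offers FileChannel.lock(). The class LockedProcessing and the method readWithLock below are hypothetical names, and the sketch assumes the file fits in memory. Keep in mind that whether such a lock actually stops other programs from touching the file is system-dependent (on Unix it is typically only advisory), so treat this as one possible building block rather than a guarantee.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only.
public class LockedProcessing {

    /** Reads the whole file while holding an exclusive lock on it; both are released on return. */
    public static byte[] readWithLock(Path file) throws IOException {
        // An exclusive lock requires the channel to be open for writing as well.
        try (FileChannel channel = FileChannel.open(file,
                     StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) {
            // Assumption: the file is smaller than 2 GB and fits in memory.
            ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // Keep reading until the buffer is full or end of file is reached.
            }
            return buffer.array();
        }
    }
}
```

Any verification and processing would then happen on the bytes returned, so a later change to the file on disk cannot affect what was checked.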


K. Tsang JavaRanch SCJP5 SCJD/OCM-JD OCPJP7 OCPWCD5 OCPBCD5
Nakataa Kokuyo
Ranch Hand

Joined: Jul 24, 2011
Posts: 189
Thanks K. Tsang, using the modified date sounds like a better idea!

By the way, after the file is generated, it is dropped into a Unix folder and then moved into the DB. Users are not supposed to interfere with it, but management wants to ensure security.

Regarding the implementation, could you please shed some light on that too?

K. Tsang
Bartender

Joined: Sep 13, 2007
Posts: 2615

First, how is the file uploaded to the server? FTP, web UI upload, etc.? If FTP, do end users have direct access to the FTP folder through FTP client software or something similar?

Suppose it's a web UI upload. Say user A and user B each have a file with the same name, "Y2014budget.txt". Both users make changes independently of each other and upload their respective files through the web interface.

Normally, whoever uploads last will overwrite the previous upload... I doubt you want that!

In this situation, the uploaded file name would have to be changed, say by appending a unique key (e.g. the DB PK) to the end, giving something like Y2014budget.txt_1

The advantage of this approach is that uploaded files never get overwritten. The disadvantage is that it uses more disk space.
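
The renaming idea could look roughly like this. UploadStore and storeUpload are hypothetical names, a random UUID stands in for the database primary key mentioned above, and originalName is assumed to be a plain file name that has already been validated.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper for illustration only.
public class UploadStore {

    /** Saves the upload under "<originalName>_<unique key>" so earlier uploads are never overwritten. */
    public static Path storeUpload(Path uploadDir, String originalName, InputStream content)
            throws IOException {
        // Assumption: a random UUID plays the role of the unique key (a DB PK in the post above).
        String storedName = originalName + "_" + UUID.randomUUID();
        Path target = uploadDir.resolve(storedName);
        // Without REPLACE_EXISTING, Files.copy throws FileAlreadyExistsException instead of overwriting.
        Files.copy(content, target);
        return target;
    }
}
```

Two users uploading "Y2014budget.txt" would end up with two separate files, and the returned path can be recorded alongside the original name.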

It also depends on whether the files follow a defined spec with a header, footer, etc. If so, the renaming may need to be reflected in the "file name" entries inside the file (usually in the header and/or footer).
Tony Docherty
Bartender

Joined: Aug 07, 2007
Posts: 2398
Using the last modified date/time is not at all safe. It is very easy to modify a file and set its modified time back to the original file's time.
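
A short sketch of how little it takes, using only the standard java.nio.file API. The class TimestampSpoof and its method name are made up for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical demonstration class for illustration only.
public class TimestampSpoof {

    /** Overwrites the file's content, then restores the modification time it had before. */
    public static void overwriteKeepingTimestamp(Path file, byte[] newContent) throws IOException {
        FileTime before = Files.getLastModifiedTime(file); // remember the original timestamp
        Files.write(file, newContent);                     // change the content
        Files.setLastModifiedTime(file, before);           // put the old timestamp back
    }
}
```

After this call, a check based only on the last modified time would typically report the file as unchanged, although the exact timestamp precision depends on the file system.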
Nakataa Kokuyo
Ranch Hand

Joined: Jul 24, 2011
Posts: 189
Hi K. Tsang,

There are two ways a user can trigger the file generation: manual or auto.

Both are triggered from the web UI. In manual mode, the user can download the file and upload it to a different environment at any time.
In auto mode, the user triggers the file generation process from the web UI and the generated file is transferred via FTP to the other environment immediately.

The file size checksum is meant to ensure the content has not been modified by the user.