Version Control for binary files - any good system available?

 
Greenhorn
Posts: 6
CVS and SVN are not intended for binary files. Storing such files in a file system with an appropriate directory structure leads to other sorts of problems.

Could you suggest a version control system that could reasonably deal with thousands of binary files like DOC, ODT, XLS, PDF, JPG, PNG, ZIP, JAR?
 
Ranch Hand
Posts: 398
You can try Mercurial.

Get more info @ http://en.wikipedia.org/wiki/Mercurial_(software)
 
Greenhorn
Posts: 3
I think SVN is good enough. We are using SVN to store and share more than 5000 Word files and more than 10000 PDF files. It works perfectly. Just for your information.
 
Greenhorn
Posts: 8
I'll chime in with support for SVN in this respect. It works fine for us - we put all of our help files in it (and associated images) and it works quite well.
 
Saloon Keeper
Posts: 27752
SVN allegedly has the ability to efficiently difference binary files, reducing the amount of disk storage space required. Many VCS's allow custom differencers to be plugged in, however.

I've stored binaries in SVN and CVS for years without any problems. The only "gotcha" is that you do have to make sure that the binary is stored AS a binary, since VCS's these days are fond of altering the end-of-line characters in what they think are text files. In other words, the same sort of trouble you'd get when trying to download a ZIP file from a Windows FTP host if you forgot to tell the server that you wanted binary file transfer.
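
To make the end-of-line "gotcha" concrete, here is a minimal Python sketch of what a text-mode conversion does to binary data. The function `translate_eol_to_crlf` is a hypothetical stand-in for whatever conversion a client might apply to a file it believes is text; it is not the actual code of any VCS. The PNG signature is a handy test case because it deliberately contains both a CR LF pair and a lone LF.

```python
def translate_eol_to_crlf(data: bytes) -> bytes:
    """Hypothetical stand-in for a text-mode checkout on Windows:
    normalize existing CRLF to LF, then turn every LF into CRLF."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


# The 8-byte PNG file signature: it contains CR LF and a lone LF on purpose,
# so that exactly this kind of mangling is detectable.
png_signature = b"\x89PNG\r\n\x1a\n"

mangled = translate_eol_to_crlf(png_signature)

print(len(png_signature), len(mangled))  # the mangled copy is one byte longer
print(mangled == png_signature)          # False: the file is no longer a valid PNG
```

In practice the fix is to mark the file as binary when adding it, for example with `cvs add -kb file` in CVS or by setting the `svn:mime-type` property to a non-text type such as `application/octet-stream` in Subversion. Check your client's documentation for how it decides this automatically.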
 
Timo Patuelli
Greenhorn
Posts: 6
to Tim Holloway:
"I've stored binaries in SVN and CVS for years without any problems"

Maybe you have used small files. If you used 20 to 30 MB binaries, you would notice horrible problems. CVS loads all the versions of such a binary file as a single piece. We had about 20 versions a month (committed on a daily basis), each about 30 MB. A request from a single developer led to 600 MB of RAM being occupied at once. When several developers accessed the CVS repository, the server was incredibly slow! In one year the size of all versions of a single file could be as much as 4-5 GB. Disk was not a problem, but RAM...
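
The 600 MB figure follows directly from the numbers in this post. A small sketch of the arithmetic, assuming every version is held in full; the 12-month projection is only a raw upper bound and is not a figure from the post, which reports a lower 4-5 GB per year without explaining the difference:

```python
VERSIONS_PER_MONTH = 20  # from the post: about 20 versions a month
VERSION_SIZE_MB = 30     # from the post: each about 30 MB

# All versions of one month loaded at once for a single request
monthly_mb = VERSIONS_PER_MONTH * VERSION_SIZE_MB

# Assumption: raw upper bound if all 12 months were stored in full
yearly_upper_bound_mb = monthly_mb * 12

print(monthly_mb)             # 600
print(yearly_upper_bound_mb)  # 7200
```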

We couldn't work without versioning those files. Using the file system for version control was not an option. We decided to delete some versions from the repository on a regular basis, according to a rule connected with the build process. Yes, we really did delete them in the file system of the CVS server, regularly. Not a graceful solution, but the others were even worse.

SVN is a bit better, but also far from perfect.


Now, for a new and somewhat similar large project, I'm looking for a smart and fast version control system. It can be commercial, but it should meet our needs. So far I've found no suitable system.
 
Timo Patuelli
Greenhorn
Posts: 6

Mourouganandame Arunachalam wrote: You can try Mercurial.

A distributed system is not an option. The core "con" is that it leads to enormous network traffic, even though many developers don't need those files.

 
Tim Holloway
Saloon Keeper
Posts: 27752
Actually, I frequently archive production modules of that size. Maybe my network's just better tuned, but I can live with the time it takes.

However, I exclude files of any size from Version Control unless there's a reasonable expectation I'm going to retrieve them. Work files and directories are on the project's ignore list.
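
As a rough illustration of how an ignore list keeps work files out of a commit, here is a minimal Python sketch using glob-style patterns. The pattern list, the file names, and the `is_ignored` helper are all hypothetical; real tools (a `.cvsignore` file, the `svn:ignore` property) have their own matching rules, so treat this only as the general idea.

```python
from fnmatch import fnmatch

# Hypothetical ignore patterns: scratch files and build/work directories
IGNORE_PATTERNS = ["*.tmp", "*.bak", "build/*", "work/*"]


def is_ignored(path: str, patterns=IGNORE_PATTERNS) -> bool:
    """Return True if the path matches any glob-style ignore pattern."""
    return any(fnmatch(path, pattern) for pattern in patterns)


# Hypothetical candidate files in a working directory
candidates = ["src/Main.java", "build/app.jar", "work/scratch.txt", "notes.bak"]

# Only files that match no ignore pattern are considered for check-in
to_commit = [path for path in candidates if not is_ignored(path)]
print(to_commit)  # ['src/Main.java']
```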

It sounds like what you want is a client smart enough to difference binaries BEFORE they're sent to the server, and I don't know of any. If you're talking JAR/WAR/EAR files or compiled code, there's an extra challenge. Today's optimizing compilers are smart enough to make significant global changes to the resulting code based on fairly minor alterations to source code. Including moving whole blocks of code in and out of line, hoisting loop code and much more. In simpler times, you could use a tool like IBM's ZAP utility and modify a few bytes and the difference file would be minuscule. No longer. Not unless you're coding in assembler or have one really dumb compiler like those in common use back when zapping was standard procedure.
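
To show why an in-place patch differences well while shifted code does not, here is a minimal Python sketch of a naive fixed-block comparison. This is a deliberately simple illustration, not the delta algorithm of SVN, CVS, or any other tool; `BLOCK_SIZE`, `block_hashes`, and `changed_blocks` are hypothetical names chosen for the example.

```python
import hashlib

BLOCK_SIZE = 64  # hypothetical block size chosen for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Count blocks that differ position by position, plus any extra blocks."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    differing = sum(1 for a, b in zip(old_h, new_h) if a != b)
    return differing + abs(len(old_h) - len(new_h))


original = bytes(range(256)) * 16  # 4096 bytes, i.e. 64 blocks

# "Zap" style change: flip one byte in place, nothing moves
patched = bytearray(original)
patched[100] ^= 0xFF
patched = bytes(patched)

# Shifted content: one byte inserted at the front moves everything after it
shifted = b"\x00" + original

print(changed_blocks(original, patched))  # 1
print(changed_blocks(original, shifted))  # 65
```

Smarter delta algorithms can cope with shifts better than this naive comparison, but the point stands: when a compiler rearranges large parts of its output, there is simply less unchanged data for any differencer to reuse.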

I can think of 2 things that might alleviate your issues, assuming that you can't simply add the offending files to cvsignore or its equivalent:

1. Limit people to nightly checkins. Personally I'm not good at that, but allegedly some teams are.

2. Teach people to do partial checkins. This is a bit risky, since it's easy to do an incomplete checkin when a complete one is needed and vice versa. A variation on this is to target your builds so that the build results go into a separate directory and that directory is only checked in at end-of-day.

I actually did something like option #2 at a previous employer. All production modules had to be placed in CVS for the operations staff to retrieve. For their convenience (and ours) we had a special project that held the deployables. In order to avoid having everything in one large directory, we made subdirectories for the various products based on their top-level package qualifiers (for Java projects) or their equivalents (for non-Java projects).
 
Angus Edison
Greenhorn
Posts: 3
I forgot to mention one disadvantage of using Subversion to host binary files. You need a working copy (your files) plus the .svn folder, which contains a mirror of your working copy. The advantage is that you can do some operations locally (e.g. check status and revert a file). The disadvantage is that your local working copy takes up double the space.

Although hard disk prices are very low today, keeping and copying such a large folder is not easy. I vote for Subversion to offer an option with no mirror of the local copy.
 
Sheriff
Posts: 67746
"Timo Timo", please check your private messages for an important administrative matter.
 