Maven repository maintenance

Leroy J Brown
Ranch Hand

Joined: Dec 02, 2007
Posts: 71
I'm shortly taking over the development architecture of a large enterprise Java application and have been considering migrating it to Maven. One concern I keep hearing from my manager is that an ever-growing repository of our (rather large) build artifacts may be hard on our drive space. Does Maven (or any third party) provide functionality for cleaning up an internal remote repository? Manually deleting older files seems like it could become a tedious maintenance chore.

Thanks

-TJCR
stanislav bashkirtsev
Ranch Hand

Joined: Aug 17, 2009
Posts: 75
I don't know exactly, but I don't think it has any such functionality. Maven treats an old lib simply as "the lib of a certain version". No less, no more.
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 15960

That's kind of like saying "I want to delete old files from my version control system". The minute you do, you'll end up needing them.

Maven does have one advantage, however. If the deleted versions are part of the public Maven repositories, they'll be refetched on demand.

Compared to the storage (and maintenance!) overhead of keeping libraries all over the shop, a Maven repo with obsolete libraries is a small price to pay, and actually, I suspect that you'll probably not consume that much space. My local repository cache is under 1GB and I've been building it up for a long time and using it with a lot of complex projects.

When terabyte disk drives are under $100, a Maven repo that can fit with room to spare on a DVD seems like cheap insurance.

This question (or one like it) has been asked before a few months back, so you might want to search the forum and see what was said back then.
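
For anyone who wants to check the size of their own local repository cache, here is a minimal Python sketch. It assumes the default location, ~/.m2/repository; the function name tree_size_bytes is made up for this example and is not part of Maven.

```python
import os

def tree_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total

if __name__ == "__main__":
    # Assumption: the local repository lives in the default ~/.m2/repository.
    repo = os.path.expanduser(os.path.join("~", ".m2", "repository"))
    print(f"{repo}: {tree_size_bytes(repo) / 1024**3:.2f} GiB")
```

Pointing the same function at an internal remote repository's storage directory would give a comparable figure for the server side.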

Customer surveys are for companies who didn't pay proper attention to begin with.
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274

I'm not sure, but an artifact management tool such as Nexus might help.


Cheers, Martijn - Blog,
Twitter, PCGen, Ikasan, My The Well-Grounded Java Developer book!,
My start-up.
stanislav bashkirtsev
Ranch Hand

Joined: Aug 17, 2009
Posts: 75
Tim answered the question: just delete all the libs! Any that are still needed (and are available from the public repositories) will be downloaded again the next time Maven runs.
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

How do you do it now? What artifacts are you specifically concerned about? Do you really have that little drive space that you can't keep multiple WAR/EAR files lying around (assuming you install them via Maven)?!
Leroy J Brown
Ranch Hand

Joined: Dec 02, 2007
Posts: 71
At the moment all of our build artifacts are stored in a folder structure on the build server that CruiseControl runs on. Older artifacts have to be purged from time to time as drive space fills up. Some of our web app build artifacts, for example, can be quite large (they can include images etc.), and one that I can think of off the top of my head is on its 183rd build version since the beginning of this development effort. If that artifact were ~25 MB, then the history of that one module for this release alone would be about 4.5 GB.

Our CruiseControl install increments the build value each time it builds and archives the new artifact before deploying it automatically to our dev environment. This situation would worsen if we turned on automated building in CruiseControl instead of our current setup, which only builds on command at the click of a button. The only real difference here is that our current folder structure is much simpler than the repository structure, and I imagine it is easier to clean up than the repository would be.

Nexus may be a useful tool here from what I've seen, thank you for that suggestion.

I also haven't had the time to learn about Maven's "snapshot" functionality, and that might be relevant for this discussion.
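
As a rough check on the numbers in this post, the estimate is simply the build count times the artifact size. The sketch below uses the 183 builds and ~25 MB figures quoted above; the function name is hypothetical.

```python
def archive_size_mb(build_count, artifact_mb):
    """Space needed to keep every build of one module, in megabytes."""
    return build_count * artifact_mb

total_mb = archive_size_mb(183, 25)
# Convert to gigabytes using 1024 MB per GB for a rough figure.
print(f"{total_mb} MB is roughly {total_mb / 1024:.1f} GB")
```

The same function makes it easy to see how the figure would grow if automated builds multiplied the build count.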
Marco Ehrentreich
best scout
Bartender

Joined: Mar 07, 2007
Posts: 1280

Hi Tristan,

I'm no Maven expert, but I think your "space problem" is not really a problem with Maven. From what I've seen so far, Maven is mostly or only used to handle CODE artifacts. Of course it is possible to install every kind of JAR in your Maven repository, but then Maven may be the wrong tool. I'm sure there is a good reason to package image data and everything else together with your code for this project, but perhaps you should think about another strategy for managing all kinds of project files. I've seen projects myself where GBs of data accumulated over the years, ranging from database dumps to images and so on. Mostly this was because it was convenient to put everything into one directory at the beginning of a project, but over the years it simply became too big to handle. Even so, often nobody dared to remove unneeded things or restructure the whole project, because nobody knew what was still needed and what was not. But as I said, maybe you have good reasons and can't or don't want to change things anyway. Just my thoughts...

Marco
Leroy J Brown
Ranch Hand

Joined: Dec 02, 2007
Posts: 71
I'm not suggesting that this is a defect of Maven's, just that I'm not sure what the proper way to use this tool is. As for why we include images in our WARs: they would not be deployable artifacts if we did not. I'm also not considering installing any files into the repository that are not Maven build artifacts.

I'll look into Nexus.

Thanks!
Jaikiran Pai
Marshal

Joined: Jul 20, 2005
Posts: 9947

This may not be exactly what you are looking for, but the maven-dependency-plugin has a purge-local-repository goal that you could take a look at (http://maven.apache.org/plugins/maven-dependency-plugin/purge-local-repository-mojo.html). I haven't used it before, so I don't know exactly how it behaves.

[My Blog] [JavaRanch Journal]
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 15960

This sounds like a concern about how Maven would hold the results of Maven builds. That is to say, application releases.

You have an advantage there, since you would normally only need to prune from a limited subtree. But there are other considerations.

Maven isn't intended to manage the day-to-day builds. When you do a "mvn install" or "mvn site", you're supposed to be preparing a production release with a specific version ID. So that can cut down on the actual repository-resident artifacts considerably. If you really are doing a new production release every day, that's probably an indication that the development process itself has problems.

Actually, I didn't use Maven to hold the production deployables anyway at the last big shop I worked in. We used CVS for that. It was easier for the sysadmins to work with.
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

Tim Holloway wrote:
If you really are doing a new production release every day, that's probably an indication that the development process itself has problems.

It seems like with most agile processes that wouldn't really indicate a problem, though.
Leroy J Brown
Ranch Hand

Joined: Dec 02, 2007
Posts: 71
Our system is semi-agile in that sense. We don't really make distinctions between development builds and production builds while the build is occurring; we just build and test until we're happy with it and then deploy the builds that have been signed off on by their respective developers/testers to production. If our module happens to currently be of version label 4.0.113 (the label is incremented by CruiseControl on every build), then that's the name of the one that goes into production.

Additionally, this application's modules are deployed to many different server clusters in our production environment, and most modules have other modules as dependencies. Therefore, all of the build artifacts would need to be deployed to the repository so that other modules can use them as input to their own builds.

Our source code versioning system maintains revision labels that correspond to build artifacts, so we are able to recreate any (*)AR we need if it is lost. What I'm trying to get away from however is maintaining the jars of module A as dependencies of modules B, C and Z in this versioning system.

Am I missing something important in snapshots? I'm not sure how they fit into the Maven picture and if they would be appropriate to use in any new build strategy for this system.
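
Since every build gets its own label here (4.0.113 and so on), one possible cleanup policy is "keep the newest N builds plus anything that went to production". The Python sketch below only selects candidates and deletes nothing. It assumes purely numeric dotted labels, and the function names and the protected parameter are invented for this example; a repository manager such as Nexus, suggested earlier in the thread, may make a script like this unnecessary.

```python
def version_key(version):
    """Turn '4.0.113' into (4, 0, 113) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))

def versions_to_prune(versions, keep=10, protected=()):
    """Return the versions outside the newest `keep`, oldest first.

    Versions listed in `protected` (for example, builds that were
    deployed to production) are never returned.
    """
    protected = set(protected)
    ordered = sorted(versions, key=version_key)
    # ordered[:-0] would be empty, so keep=0 is handled separately.
    candidates = ordered[:-keep] if keep > 0 else ordered
    return [v for v in candidates if v not in protected]

builds = ["4.0.9", "4.0.113", "4.0.112", "4.0.10", "4.0.111"]
print(versions_to_prune(builds, keep=2, protected=["4.0.10"]))
```

Sorting on the numeric tuple rather than the raw string matters: as plain strings, "4.0.9" would sort after "4.0.113".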
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

Snapshots, in general, are considered interim builds. That doesn't mean they can't be used as a dependency by another project, however. Whether or not they *should* be used as a dependency is an internal decision you'll have to make. Since it's all internal it doesn't seem like it'd be as bad as, say, me using a snapshot of a third-party library--when it's a third-party I can't (necessarily) determine the quality of that build and whether or not it's safe for me to use.
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 15960

David Newton wrote:
If you really are doing a new production release every day, that's probably an indication that the development process itself has problems.

It seems like with most agile processes that wouldn't really indicate a problem, though.


Not by my understanding of Agile. That sounds more like ADHD. Agile was - so I was informed - more like a release every 2 weeks or so. Enough time to get significant work done, but not so much that it created a waterfall. More often and the progress wouldn't be apparent and people would see too much broken stuff that stayed broken over multiple releases. Plus, too much temptation for users to micro-manage the process.

Nightly alphas don't count in my definition of "production", though.
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 15960

Tristan Rouse wrote:Our system is semi-agile in that sense. We don't really make distinctions between development builds and production builds while the build is occurring; we just build and test until we're happy with it and then deploy the builds that have been signed off on by their respective developers/testers to production. If our module happens to currently be of version label 4.0.113 (the label is incremented by CruiseControl on every build), then that's the name of the one that goes into production.

Additionally, this application's modules are deployed to many different server clusters in our production environment, and most modules have other modules as dependencies. Therefore, all of the build artifacts would need to be deployed to the repository so that other modules can use them as input to their own builds.

Our source code versioning system maintains revision labels that correspond to build artifacts, so we are able to recreate any (*)AR we need if it is lost. What I'm trying to get away from however is maintaining the jars of module A as dependencies of modules B, C and Z in this versioning system.

Am I missing something important in snapshots? I'm not sure how they fit into the Maven picture and if they would be appropriate to use in any new build strategy for this system.


I worked more formally. We had development, Q/A and production, each with their own set of resources. Development was alpha, Q/A was beta. If the end users signed off on Beta, then we'd slate it for the weekly Change Control with all the concomitant paperwork.

Maven can be set up to do a SNAPSHOT/release cycle if you like "nightly" builds. I've got one app I do that on now. It automatically tags the interim build results, and in my case that includes creating a unique RPM for Linux installation/upgrade. When I'm ready to "go live", I do a "mvn release:prepare", which checks everything into version control, builds, runs the unit tests, and creates deployables. A Maven "release:perform" then ensures that the Maven repository and the project website get updated.

This means that when it comes time to clear deadwood, you can easily separate the milestones. They're the ones that don't have "-SNAPSHOT" in their names.

As a general rule, I don't like to pull snapshots in as dependencies. To me, it indicates that there's probably too much coupling between what should be discrete components. That level of coupling would be more appropriate for a multi-module project, and even then I prefer to minimize it.
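
To illustrate the "-SNAPSHOT" naming point, here is a minimal Python sketch that walks a repository directory and separates snapshot version directories from release ones. It assumes the usual group/artifact/version layout and treats any directory that directly contains a .pom file as a version directory; the function name is made up for this example, and nothing is deleted.

```python
import os

def classify_version_dirs(repo_root):
    """Split version directories under repo_root into (snapshots, releases).

    Assumption: a directory that directly contains a .pom file is a
    version directory, and its name is the version string.
    """
    snapshots, releases = [], []
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        if any(name.endswith(".pom") for name in filenames):
            version = os.path.basename(dirpath)
            target = snapshots if version.endswith("-SNAPSHOT") else releases
            target.append(dirpath)
    return sorted(snapshots), sorted(releases)
```

The snapshot list is then the natural set of candidates for clearing deadwood, while the release list holds the milestones to keep.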
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

Tim Holloway wrote:Nightly alphas don't count in my definition of "production", though.

I think that's a limited view of what's possible with a good agile process. For example, see here for an example of continuous deployment taken to an extreme. If the processes are in place there's nothing "alpha" about it.

IMO there's no technical reason to wait an arbitrarily prescribed amount of time between feature releases.
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 15960

David Newton wrote:
Tim Holloway wrote:Nightly alphas don't count in my definition of "production", though.

I think that's a limited view of what's possible with a good agile process. For example, see here for an example of continuous deployment taken to an extreme. If the processes are in place there's nothing "alpha" about it.

IMO there's no technical reason to wait an arbitrarily prescribed amount of time between feature releases.


We're at risk of drifting off topic. I most definitely do not recommend running on a "release clock", however. That just promotes the conceit that software is something that comes out of a meat grinder. I prefer to set small, mutually agreed-upon goals and attach them to milestones. The milestones may average a week, 2, or 3 between each other, but it's meeting the contract that counts. The art is in doing a good job of estimating how long that will take.

By setting and meeting small, attainable goals, the client gets a feeling of continuous progress and, just as importantly, gets to see what the consequences are, and whether/how to alter the future goals before too much is invested on the wrong track. Releasing with known bugs just to meet a timetable goes against my grain. I'd rather fence off the offending parts with an "under construction" sign so that users can find the unknown bugs instead and not be greeted with surprises that lessen their respect for the quality of the work in progress.

I note that the sample "extreme" case quoted still wasn't nightly frequency.

Alpha testing is, by definition, testing by the development staff. End users should not be getting alpha builds. With Maven, those would typically be snapshot releases.

Beta testing is user testing. I've already mentioned the level of quality I prefer to see for that.

A few shops have also had gamma tests. Typically at the gamma level, feature requests are not permitted, only bug reports, so I don't know how well that fits into most Agile projects. The vendors I've known who did that were prepping mass releases, and these days we usually use a different term for that stage, but senility has kicked in and I've forgotten what it is.
Leroy J Brown
Ranch Hand

Joined: Dec 02, 2007
Posts: 71
This is definitely off topic. I'm not interested in changing or in any position to change the project development/release model that my company uses. My questions have been regarding maintaining an artifact repository and what snapshots are used for in Maven. I think they've been answered as well as they will be.

Thanks for the help with that.

On a side note, I'd appreciate it if people would not use ADHD as a pejorative term. I find it offensive as a skilled professional with pronounced ADHD.
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

I'm also ADHD, OCD, and BPD. Just to provide balance, I'm not even remotely offended, and find it somewhat amusing to ascribe human characteristics to processes.
Leroy J Brown
Ranch Hand

Joined: Dec 02, 2007
Posts: 71
You're just cooler than me. What can I say?


ADHD + CSCI
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 15960

Actually, I meant it more as a description than an insult, having embarrassed myself more than once when I didn't allow a suitable time to meditate on what I'd done before I leapt off onto the next stage. I shouldn't think anyone could object much to an affliction that's reputedly as common to successful executives as autism is to successful software geeks. And I have a certain autistic bent myself. In the final analysis, it's not so much about what particular alleged disabilities one has, it's how one can bend the so-called disadvantages to turn them into advantages.

Sadly, my crack about senility was... Um, what was I talking about again?
 