Version Control for One-Man Project Using Eclipse

version control for one-man project using eclipse?

a/ It is necessary to have a VCS

b/ CVCS and DVCS are quite different

c/ Eclipse is currently moving all its project to Git (and is improving on EGit), so Git will be the VCS target on Eclipse.

Should I keep my project files under version control?

You do want to keep in version control any portable setting files,

meaning:

Any file which has no absolute path in it.

That includes:

  • .project,
  • .classpath (if no absolute path used, which is possible with the use of IDE variables, or user environment variables)
  • IDE settings (which is where i disagree strongly with the 'accepted' answer). Those settings often includes static code analysis rules which are vitally important to enforce consistently for any user loading this project into his/her workspace.
  • IDE specific settings recommandations must be written in a big README file (and versionned as well of course).

Rule of thumb for me:

You must be able to load a project into a workspace and have in it everything you need to properly set it up in your IDE and get going in minutes.

No additional documentation, wiki pages to read or what not.

Load it up, set it up, go.

Organizing Java projects

You seem to be almost exactly in the point where the place I worked at was when I started there 1,5 years ago, only difference being that you've started toying with branches which is actually something we still don't do at my work but more about that later on in this answer.

Anyway, you're listing a very good set of tools which can help a small company and those work really nicely as subtopics so without further ado,

Version control systems

Most commonly small companies currently use CVS or SVN and there's nothing bad in that, in fact I'd be really worried if no version control was really used at all. However you have to use version control right, just having one won't make your life easier. We currently use CVS and are looking into Mercurial, but we've found that the following works as a good set of conventions when working with CVS (and I'd suspect SVN too):

  • Have separate users for all commiters. It's beyond valuable to know who commited what.
  • Don't allow empty commit messages. In fact if possible, configure the repository to reject any commits without comments and/or default comment. Initial commit for FooBarizer is better than Empty log message
  • Use tags to mark milestones, prototypes, alphas, betas, release candidates and final versions. Don't use tags for experimental work or as footnotes/Post-It notes.
  • Don't use branches since they really don't work if you're continuing on developing the application. This is mainly because in CVS and SVN branching just doesn't work as expected and it becomes an exercise in futility to maintain any more than two living branches ( head and any secondary branch ) over time.

Always remember that for the software company the source code is your source of income and contains all your business value, so treat it that way. Also if you have extra 70 minutes, I really recommend that you watch through this talk Linus Thorvalds gave at Google about git and (d)VCS in general, it's really insightful.

Automated builds and Continuous Integration environments

These are about the same actually. Daily builds is a PR joke and has little no resemblance to the state of the actual software beyond some very rudimentary "Does it compile?" issues. You can compile a lot of awful code noise that doesn't do anything, keeping the software quality up has nothing to do with getting the code to compile.

On the other hand unit tests is a great way to maintain software quality and I can with a bit of personal pride say that rigorous unit testing helps even the worst of the programmers to improve a lot and catch stupid errors. In fact there has so far only been a total of three bugs that code I have written has reached production environments and I'd say that in 18 months that's a pretty damn good achievement. In our new production code we usually have a instruction code coverage of +80%, mostly +90% and in one special case reaching all the way to 98%. This part is very lively field and you're better of Googling for the following: TDD, BDD, unit tests, integration tests, acceptance tests, xUnit, mock objects.

That's a bit of a lengthy preface, I know. The actual meat for all the above is this: If you want to have automated builds, have them occur every time someone commits and make sure there's a constantly increasing and improving amount of unit tests for production code. Have the continuous integration system of your choice (we use Hudson CI) run all the unit tests related to project and only accept builds if all the tests pass. Do not make any compromises! If unit tests show that the software is broken, fix the software.

Additionally, Continuous Integration systems aren't just for compiling code but instead they should be used for tracking the state of the software project's metrics. For Hudson CI I can recommend all these plugins:

  • Checkstyle - Checks if the actual source code is written in a way you define. Big part of writing maintainable code is to use common conventions.
  • Cobertura - Code coverage metrics, very useful to see how the coverage develops over time. Also keeping in line with the "source is God" mentality, allows you to discard builds if coverage falls below a certain level.
  • Task Scanner - Simple but sweet: Scans for specific tags such as BUG, TODO, NOTE etc. in your code and creates a list from them for everyone to read. Simple way to track short notes or known bugs which needs fixing or whatever you can come up with.

Project structure and Dependency Management

This is a controversial one. Basically everyone agrees that having an unified structure is great but since there's several camps with different requirements, habits and views to issue they tend to disagree. For example Maven people really believe that there's only one way - the Maven way - to do things and that's it while Ivy supporters believe that the project structure shouldn't be hammered down your throat by external parties, only the dependencies need to be managed properly and in an unified manner. Just that it's not left unclear, our company simply loves Ivy.

So since we don't use project structure imposed by external parties, I'm going to tell you a bit about how we got into what we got into our current project structure.

In the beginning we used individual projects for actual software and related tests (usually named Product and Product_TEST). This is very close to what you have, one huge directory for everything with the dependencies as JARs directly included in the directory. What we did was that we checked out both projects from CVS and then linked the actual project to the test software project in Eclipse as runtime dependency. A bit clunky but it worked.

We soon came to realize that these extra steps are completely useless since by using Ant - by the way, you can invoke Ant tasks directly in Hudson - we could tell the JAR/WAR building step to ignore everything by either file name (say, everything that ends with Test or TestCase) or by source folder. Pretty soon we converted our software project to use a simple structure two root folders, src and test. We haven't looked back ever since. The only debate we currently have is if we should allow for a third folder called spikes to exist in our standard project structure and that's not a very heated debate at all.

This has worked tremendously well and doesn't require any additional support or plugins from any of IDEs out there which is a great plus - number two reason we didn't choose Maven was seeing how M2Eclipse basically took over Eclipse. And since you must be wondering, number one reason for rejecting Maven was the clunkiness of Maven itself, endless amount of lengthy XML declarations for configuration and the related learning curve was considered a too big cost as to what we would get from using it.

Rather interestingly later on commiting to Ivy instead of Maven has allowed us to a smooth shift to do some Grails development which uses folder and class names as conventions for just about everything when structuring the web application.

Also a final note about Maven, while it claims to promote convention over configuration, if you don't want to do things exactly the way the Maven's structure says you should do things, you're in a world of pain for the aforementioned reasons. Certainly that's an expected side effect of having conventions but no convention shouldn't be final, there always has to be at least some room for changes, bending the rules or choosing the appropriate from a certain set.

In short, my opinion is that Maven is a bazooka, you work in a house and you ultimate goal is to have it bug free. Each of these are good on it's own and work even if you pick any two of them, but the three together just doesn't work.

Final words

As long as you have less than 10 code-centric people, you have all the flexibility needed to do the important decisions. When you go beyond that, you have to live with whatever choices you've made, no matter how good or bad they are. Don't just believe things you hear on the Internet, sit down and test everything rigorously - heck, our senior tech guy even wrote his bachelor's thesis about Java web frameworks just to figure out which one we should use - and really figure out what you really need. Don't commit to anything just because you may need some of the functionality it provides in distant future, pick those things that has the lowest possible negative impact to the whole company. Being the 10th person hired to the company I work at I can undersign everything in this paragraph with my own blood, we currently have 16+ people working and changing certain conventions would actually be a bit scary at this point.

Eclipse version control - problems with project no longer showing in workspace

I think I've settled on the following approach. This seems to work well so far and avoids some of the headaches mentioned in my original question.

1) each developer creates an Eclipse workspace on their machine somewhere, outside of version control; only the project directory is checked into version control - the workspace is completely uncontrolled

2) developers checkout the project directory from version control (in a different directory structure than where the workspace was created) and then use File >> Import, but they leave the "copy into workspace" unchecked.

So with the above, you can checkout from version control and work with the files right where they were checked out. There's no need to move them out then import them back in. When you import with the copy option unchecked, the workspace (which itself is not controlled) is just referencing the files where they're at on disk.

The only minor downside is that any workspace stuff has to be setup individually. Other articles mention controlling the launch params, but so far this hasn't been an issue - pretty easy to pick that once the first time you launch.

So anyway, hopefully this helps someone else :) This seems to be a reasonably smooth way to do it and avoids the issues we ran into initially.

Best version control for a one man web app?

SVN, but you need to be able to easily deploy your webapp with SVN.

Since it is not always a simple task, so I just point out this article which may be of interest for your project.

General principle:

  • Configure Apache on your development server so that it picks up your checked out working copies as separate subdomains. Using this, you can simply make a checkout of your project and it will automagically be up and running. No need to touch the Apache configuration. You need a DNS wildcard entry so that all subdomains of dev.example.org go to your development server.

The only problem with using the above Apache configuration locally is the DNS wildcard. Unless your desktop is assigned a hostname by your network's DNS server and you can set the wildcard there, you will have to make do with your localhost address. You can install dnsmasq to act as a local caching DNS server and put the wildcard on your own machine

  • Use dnsmasq so you can achieve the same effect on your own development machine. That way you can develop your web applications locally and you won't need a central development server. In my examples I will be assuming you use subversion for your version control, but it works virtually the same with other version control packages, such as git or bazaar.

Note: (Humor)

This other question on Subversion allowed me to point out to this article about publishing its (source-controlled) data into production, with in it probably the ugliest diagram I ever saw on the topic ;-)

diagram

When working with Eclipse, should I add the workspace to the source control?

I would not add the complete workspace, but I would add the .classpath and .project files (as well as the source, of course) so that you can recreate the project if needbe.

Create versions in eclipse using local history

You're going through a lot of hassle just to avoid version control. Instead! Use a local version of subversion for all your version control needs.

You do not need to host a server. It will use the filesystem only! Use subclipse or subversive to integrate into eclipse.

A tutorial how to set it up(takes less than 5 minutes):

http://vincenthomedev.wordpress.com/2007/10/15/setup-svn-local-repository-step-by-step/

Revision Control System Recommendations

Try these searches

https://stackoverflow.com/search?q=free+svn+hosting

https://stackoverflow.com/search?q=free+mercurial+hosting

As for choosing which one - I tend to agree with the google review here:

  • Learning Curve. Git has a steeper learning curve than Mercurial
    due to a number of factors. Git has
    more commands and options, the volume
    of which can be intimidating to new
    users. Mercurial's documentation tends
    to be more complete and easier for
    novices to read. Mercurial's
    terminology and commands are also a
    closer to Subversion and CVS, making
    it familiar to people migrating from
    those systems.

  • Windows Support. Git has a strong Linux heritage, and the
    official way to run it under Windows
    is to use cygwin, which is far from
    ideal from the perspective of a
    Windows user. A MinGw based port of
    Git is gaining popularity, but Windows
    still remains a "second class citizen"
    in the world of Git. Based on limited
    testing, the MinGW port appeared to be
    completely functional, but a little
    sluggish. Operations that normally
    felt instantaneous on Linux or Mac OS
    X took several tenths of a second on
    Windows. Mercurial is Python based,
    and the official distribution runs
    cleanly under Windows (as well as
    Linux, Mac OS X, etc).

But his is the hands-down clincher:

Maintenance. Git requires periodic
maintenance of repositories (i.e.
git-gc), Mercurial does not require
such maintenance. Note, however, that
Mercurial is also a lot less
sophisticated with respect to managing
the clients disk space (see Client
Storage Management above).

I don't want to have to do "maintenance" on the git repos. That's just unacceptable.

Summary

In terms of implementation effort,
Mercurial has a clear advantage due to
its efficient HTTP transport protocol.

In terms of features, Git is more
powerful, but this tends to be offset
by it being more complicated to use.

I have not moved all my stuff to mercurial - SVN is just fine for most projects - especially single-person projects.



Related Topics



Leave a reply



Submit