About

This is the archive page for Head of the Kyu.

06 October 07 - 15:13

Principles and patterns of codebase structure?

Let's say you just had a nifty idea for a software product. Could be a Web site, an IM client, a new Ruby gem, anything. And let's say you've done enough up-front work to suit your purposes and your process preferences. So it's time for you to sit down and prepare a directory tree for the initial import into Subversion or CVS.

What are the rules for doing that? And what are the rules that govern ongoing elaboration of this directory tree? I'm not aware that these are written up anywhere (if you know of such a write-up, I'd appreciate a pointer).

Most competent programmers I see sketching a directory tree seem to be doing "the same thing". They'll start at the top level with, say, "src", "test", "doc", "lib" directories. But why? If their intent is something more complex than a simple app, they'll create additional divisions at the top level, or below, which make sense for that intent. These programmers don't seem to be following explicit guidelines, principles or known patterns, and sometimes I find codebases whose authors tried to imitate this "standard" structure in ways that just don't make sense. So making the principles and patterns explicit could be valuable. I'm thinking that one good way to do so would be to look at "ugly" codebases, identify the violations that make them "ugly", and reason backwards from those to the "proper" principles.

Based on experience with violations, I'm postulating some "principles of transparency", which suggest things that codebase structure should not reflect:

  • A principle of transparency with respect to team organization - the codebase should not reflect the structure of the team. I was quite shocked when one colleague reported to me that he had seen a team using Subversion where the top level of the codebase was a set of directories, one for each member of the team. This suggests that the team members "integrate" by manually copying code from their "private" directories to one that is authoritative.

  • A principle of transparency with respect to change over time - the codebase should not reflect version evolution. It's always unpleasant (but not quite in the "shocking" category) to come across a directory labeled "old" or "version_before_ajax" or some such. It's the job of version control to track those things.
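The two principles above can be turned into a rough automated check. What follows is only a minimal sketch, not an established tool: the function name `find_transparency_violations`, the `HISTORY_NAMES` set and the `version_` prefix are all assumptions made up for illustration, and the name matching is a heuristic that will miss or misflag some directories.

```python
from pathlib import PurePosixPath

# Hypothetical hints that a directory tracks history; tune for a real codebase.
HISTORY_NAMES = {"old", "backup", "bak"}
HISTORY_PREFIX = "version_"


def find_transparency_violations(paths, team_members):
    """Return sorted (directory name, violated principle) pairs.

    `paths` is an iterable of repository-relative file paths using "/" as
    the separator; `team_members` is an iterable of member names.
    """
    members = {name.lower() for name in team_members}
    violations = set()
    for path in paths:
        # Every component except the last one is a directory name.
        directories = PurePosixPath(path).parts[:-1]
        for depth, name in enumerate(directories):
            lowered = name.lower()
            # One top-level directory per team member mirrors team structure.
            if depth == 0 and lowered in members:
                violations.add((name, "team organization"))
            # Directories such as "old" duplicate the job of version control.
            if lowered in HISTORY_NAMES or lowered.startswith(HISTORY_PREFIX):
                violations.add((name, "change over time"))
    return sorted(violations)


example_paths = [
    "alice/app/main.py",
    "bob/app/main.py",
    "src/old/parser.py",
    "src/version_before_ajax/page.js",
    "src/app/main.py",
]
report = find_transparency_violations(example_paths, ["Alice", "Bob"])
for name, principle in report:
    print(f"{name}: violates transparency with respect to {principle}")
```

The function works on a plain list of paths rather than walking the disk, so it can be fed the output of whatever file listing your version control tool provides.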



Primarily I think of the contents of the repository as a "blueprint" for creating a running system. The repository doesn't have to be organized along the same lines as the running system will be organized. (I even debated adding that to the list of "transparency" principles above, but I believe it can actually make sense e.g. to have a top-level division into "client" and "server".) However, the repository does have to contain everything you need to know in order to produce a running system. The repository also has to contain everything you need to know in order to advance to a new version of the system - to fix a bug or add a feature. A memorable formulation could be "For maintenance you need the repository, all the repository and nothing but the repository".

The decomposition of a system into "products" (in the sense of that post by Reg Braithwaite) plays an important role, I suspect, in structuring the codebase. Independent "products" should live in separate repositories; however, some systems might depend on very closely related "products" (again the example of client and server comes to mind) sharing some common code.

For a single-product repository, the "description of how to produce the end result" theory makes good sense of why many projects have "src", "doc", "lib", "test" and a makefile at the top level. The top level is where we find what concerns us most directly: "building" or "deploying" or "installing" a running system from the repository. (Or sometimes a "shippable" rather than a "running" system; a level of indirection is introduced when the repository is a recipe for building an installation CD, and the running system only results from the end user running the installer.) The top-level groupings follow the function that the various elements play. The "doc" directory is where you go to understand the code, which you then find in "src", after which you can verify your changes by running the tests in "test". The "lib" directory holds components which are needed to build the system but which you don't need to know in detail, as they are unlikely to require any changes. For multi-product repositories, we can expect the top level to have a directory for each product, under which we find the same structure as for a single-product repository.
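As a rough illustration of the layout just described, here is a small sketch that computes (and optionally creates) the directory tree for a single-product or a multi-product repository. The names `planned_layout` and `create_layout` are hypothetical, the product names "client" and "server" are only the example used earlier, and the top-level makefile is left out because it is a file rather than a directory.

```python
from pathlib import Path

# The four functional groupings discussed in the text.
TOP_LEVEL = ("src", "test", "doc", "lib")


def planned_layout(products=()):
    """Return the relative directories for the initial import.

    With no products, the groupings sit directly at the top level.
    With several products, each product gets its own top-level
    directory containing the same groupings.
    """
    if not products:
        return list(TOP_LEVEL)
    return [f"{product}/{name}" for product in products for name in TOP_LEVEL]


def create_layout(root, products=()):
    """Create the planned directories under `root` (existing ones are kept)."""
    for relative in planned_layout(products):
        Path(root, relative).mkdir(parents=True, exist_ok=True)


single = planned_layout()
multi = planned_layout(["client", "server"])
print(single)
print(multi)
```

Calling `create_layout("myproject")` would create the four directories on disk; the planning step is kept separate so that the intended structure can be reviewed before anything is written.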

Do these observations make sense? Have you come across codebases whose structure you found perfectly clear, or perfectly bewildering, and for what reasons?
