Walking the Walk
Look at the size of that thing
By Julian Browne on May 5, 2009. Filed Under delivery, gadget, requirements
This short series started with Planning the Plan, an article that tried to put into context some of the roadmap and planning activities that take place before projects get approved and started. I suggested using the McFarlan Matrix as a way of categorising potential projects so that they might be more likely to deliver benefits in line with whatever the business strategy is for that year. There are plenty of good ways of doing this, but the McFarlan Matrix is simple and quick, and forces conversations about what good looks like for the coming 12 months of financial spend. I began the series because, in my experience, most companies adopt one of two extremes: they either don't plan at all, funding projects as they get thought up or by the rule of who-shouts-loudest, or, they get so caught up in planning and strategising for its own sake that they forget what the point of it all is: a portfolio of work that does what the business wants, can be clearly communicated to staff and motivates those who work on it by providing a transparent link from day-to-day technology choices to oil-tanker-steering business strategy.
The important final step in signing off a portfolio though is knowing how long each project is likely to take, which draws us into estimation territory, so I added a side article looking at Tactical vs. Strategic decision making, because many projects don't need estimation - they already have a 'needed-by' date due to some (often rash) business deadline imposed from outside. My point is that it's never worth getting hung up on a tactical-first, strategic-later design because there will never be time to do anything other than what you do.
The follow-up article to Planning the Plan was The Estimation Game, which itemised inherent contradictions in the estimation process that, like planning, can find us immersed in a bizarre world of involved processes that produce nothing but an illusory accuracy, the result of which is that we look incompetent and unprofessional in front of the business when we don't deliver. The one glimmer of hope in all this is that we do know one thing with 100% accuracy, and that is how long it took us to deliver projects in the past.
There is no silver bullet - all features of a project (skills, availability, scope, legacy foundation, priority) continually shift until the code is live and further change falls below the financial radar into that bucket we call business-as-usual - but I wanted to end the series with a look at how historical data modelling might provide a basis for reasonably reliable estimation, at least at the portfolio stage.
A few years ago, when this notion originally occurred to me, I went looking for research work on portfolio planning and came across an article entitled Software Cost Estimating Methods for Large Projects by Capers Jones of Software Productivity Research in "Crosstalk", the Journal of Defense Software Engineering. Usually I give topics like that a wide berth because whatever estimating "model" is being put forward often dehumanises the process, putting the needs and processes of the modellers above getting on and doing the work. Precisely the kind of end-in-itself I've mentioned before and precisely the sort of thing to be avoided when all you want is a bit of a roadmap and some confidence in what it says.
But Capers Jones wasn't pitching a specific model and the paper contains some very interesting facts. Jones and his colleagues examined 12,000 completed projects and noticed all sorts of things: how different project types affect the amount of documentation per function point, likelihood of defects, change rate of requirements, etc. But table 1 contains a percentage breakdown of the typical effort expended against various activities by project type. I've transcribed this data here.
Percentage of Work Effort by Software Development Activity for Different Project Types
Data is copyright 2005 Capers Jones. Software Productivity Research.
Manual vs. Automated Estimating
Additionally, he compares the accuracy of manual vs. automated estimates on 100 completed projects, each of which had a reasonable complexity (5000 function points, equivalent to around 600,000 lines of C code). Not surprisingly, the manual estimates were more optimistic than the automated ones. What was surprising was by how much: 92% of the manual estimates gave delivery dates that were unrealistically early, compared to only 2% of the automated estimates. Where the manual estimates fell apart was in their inability to provide enough time for "support" activity like project management, documentation and testing. But here's the thing: both were pretty spot-on at estimating code development time.
Perhaps none of that is surprising at all. Given a set of business problems I think most people who can code to a decent standard aren't too bad at guessing how long it will take them to deliver them. I also think, if we're honest, most of us underestimate all that other stuff that is necessary to make a project happen either because we don't like it, don't understand it, or simply don't put much of a value on it.
Anyway, I took this data and turned it into a very simple estimating tool, which I have found to be pretty accurate in the early stages of planning. I say pretty accurate because of course whatever comes out of the tool can be radically changed when any requirement bargaining starts.
My theory is this: if we're pretty good at estimating code time and we can see that, over 12,000 projects, project management accounts for 10-13% of project effort, and we're running a web project where that coding estimate represents 30% of the effort then it's not rocket science to work out how much effort the whole project will require and, ultimately, how long it will take. If my development time estimate is a bit sketchy (because I'm not in full possession of the facts) I can caveat it with a percentage confidence.
I have a requirement to build a web site. I think it will take three good developers three months just to write the code. At 20 working days a month (roughly) that's 3 x 3 x 20 = 180 mandays. If 180 mandays is just the development effort and that's 30% of the overall effort (from the table above) then this is a 600 manday piece of work (180 / 0.3). If I'm only 80% sure of this then I really need to be thinking of this (for planning purposes) as a 750 manday requirement (600 / 0.8). At an average rate of 500 GBP per day we're asking the board to cough up 375k in resource costs alone. Stick on costs for hardware, software, travel, provision for operational costs and so on and this starts to become a sizeable endeavour. Add in availability of people, task sequencing and delays like sickness and you can start to see where the end date might fall. 80% would, I suspect, be an unusually high confidence level at this stage.
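The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the tool itself: the function name `total_effort` and its parameters are made up for the purpose, and the 30% coding share, 80% confidence and 500 GBP day rate are simply the figures used in the example above.

```python
def total_effort(coding_days: float, coding_share: float, confidence: float = 1.0) -> float:
    """Scale a coding-only estimate up to whole-project effort.

    coding_share is the fraction of total effort that coding represents
    (0.30 for the web project example); confidence is how sure we are of
    the coding estimate (0.80 means "80% sure"), used to pad the total.
    """
    if not 0 < coding_share <= 1:
        raise ValueError("coding_share must be in (0, 1]")
    if not 0 < confidence <= 1:
        raise ValueError("confidence must be in (0, 1]")
    return coding_days / coding_share / confidence


# Figures from the worked example: 3 developers, 3 months, 20 days a month.
coding_days = 3 * 3 * 20                         # 180 mandays of coding
base = total_effort(coding_days, 0.30)           # roughly 600 mandays overall
padded = total_effort(coding_days, 0.30, 0.80)   # roughly 750 mandays for planning
cost = padded * 500                              # resource cost in GBP at 500 per day

print(f"coding {coding_days}, base {base:.0f}, padded {padded:.0f}, cost {cost:.0f} GBP")
```

Dividing by the confidence is a crude way of padding; it just mirrors the 600 to 750 step in the example and could be swapped for whatever contingency rule an organisation prefers.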
Fancy a try?
[Interactive calculator: choose a Project Type and enter the Effort to code in man-days. The form has a field for each of the activities below, plus a Project Total in man-days.]
Initial Design, Detailed Design
Reuse Acquisition, Package Purchase
Code Inspection, Independent Verification
Configuration Management, Formal Integration
User Documentation, Unit Testing
Functional Testing, Integration Testing
System Testing, Field Testing
Acceptance Testing, Independent Testing
Project Management, Project Total
So would I stake my life on this? No. But then I'm not saying this is an accurate estimating model for project managers. I'm saying that given 50 projects vying for attention in the portfolio you can quickly come up with some figures that help knock out the no-go options and target the maybes for elaboration. I can say that I've used this approach, after tailoring the percentages to the particular organisation, with consistently good results.
The table Capers Jones produced is based on a lot of raw data, which makes the data good for an average project and potentially way off for a niche project. The principle that people are more or less as good as automated tools at estimating code effort though is unlikely to change unless there are some special cases such as using untested libraries where massive workaround requirements may suddenly appear out of nowhere. But then you could spend thousands on a modelling tool which would still fail at that.
I have a more sophisticated version of this in a spreadsheet where I can assign different day rates to roles, add capital costs etc. All based on the premise that if you know enough about one activity the others generally follow a pattern. The model works for Agile too - if you know the iteration length and team size you already know your available total effort and you can work backward to the percentage that represents the coding effort available.
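Working backward for an Agile team might look something like the sketch below. The team size, iteration length and number of iterations are invented purely for illustration, the function names are hypothetical, and the 30% coding share is the web project figure quoted earlier; substitute whatever percentage fits your own project type and organisation.

```python
def available_effort(team_size: int, iteration_days: int, iterations: int) -> int:
    """Total man-days available across all iterations."""
    return team_size * iteration_days * iterations


def coding_effort_available(total_days: float, coding_share: float) -> float:
    """Portion of the total effort that can go on writing code."""
    if not 0 < coding_share <= 1:
        raise ValueError("coding_share must be in (0, 1]")
    return total_days * coding_share


# Hypothetical team: 6 people, 10-day iterations, 6 iterations planned.
total = available_effort(team_size=6, iteration_days=10, iterations=6)
coding = coding_effort_available(total, coding_share=0.30)

print(f"{total} man-days available, about {coding:.0f} of them for coding")
```

If the coding the team thinks it needs is well above that figure, the scope probably doesn't fit the iterations available.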