By M. David Green

What’s the Point of Agile Points?

One of the first questions I always get asked when I’m talking to someone who is new to scrum is this: Why use points to estimate the effort involved in completing a story, instead of estimating in hours or days?

Agile points seem arbitrary, and it’s hard to see how they can help a team with its real-world concerns, like meeting deadlines.

Hours and days, on the other hand, are easy to understand, measure and weigh against practical external requirements.

Additionally, one of the key rewards that gets teams interested in implementing an agile workflow is the promise that they can learn to accurately estimate the amount of work the team can do in a certain number of days or weeks. Using points seems like taking a step back from that goal.

So why use points?

There is a practical, measurable, real-world value to using a relative estimation tool like points.

To illustrate this, let’s set up two imaginary teams–one that uses hours to estimate effort and one that uses points.

To keep the playing field even, let’s assume that both teams are just beginning to implement an agile workflow. The stories the teams need to complete in their first sprint are clearly defined, and the teams have a very solid understanding of the codebase, so they have a relatively good idea of what’s required to accomplish their goals.

Further, let’s even assume that the teams have been together for several months before this attempt to adopt agile, and that they have completed similar tasks before.

Estimating with hours

The first team is using hours to estimate their effort.

When looking at a story, they all try to compare it to work that they’ve done in the past. So for a given story, they think back to the last project that feels similar to this one. They know that it took one of their engineers about two days to complete, so they estimate this story at 16 hours.

Using this approach, they go through the stack of stories until they’ve estimated enough stories to fill the days of the engineers on the team for the duration of the sprint, and they add these to their first sprint backlog.

Note that the team has already made a few assumptions.

First of all, can an engineer realistically work 8 hours a day for the entire length of the sprint? Will anyone need to review the code? Will there be any overhead for meetings or other interruptions?

The team has also arbitrarily decided that eight is the number of hours in a day, and that is already an abstract estimate.

Once the engineer assigned to the two-day story gets started, he realizes that the story is actually a little bit more complicated than the team anticipated, and it involves some extra research. This happens regardless of whether you’re using points or using hours.

By the time the story is completed, it turns out that the engineer spent about 30 hours over the course of four days on it–about twice as long as the team had anticipated.

And at the end of the sprint, the team makes note of how long it took to complete this story and the others in the backlog for reference in future estimates.

At the next sprint planning meeting, the product owner comes in with another story that’s very similar to the previous story.

The question is, how should the team estimate the number of hours it’s going to take to complete the story? Do they look back at history and say that stories like this generally take 16 hours? Do they assign the story 30 hours based on the most recent experience? Has their ability to complete stories like this actually changed since the first story, or are the stories themselves getting more complicated?

After only one sprint, there’s very little clarity about how to estimate the number of hours this second story will take.

The team decides to be conservative, and it estimates that this story is going to take 30 hours to complete. But when the engineer assigned to the story starts working on it, she realizes that this particular story is only going to take 10 hours, and she can complete it in under two days. Again, she makes note of the mis-estimation at the sprint retrospective.

At the next sprint planning, the team is presented with a third story, very similar to the previous two.

What scale should be used to estimate how long it’s actually going to take? If they say it will take 30 hours, they’re probably being overly conservative. If they say it’s going to take 10 hours, they may end up trying to accomplish too much in too little time. How many hours the story will actually take to complete is anybody’s guess.
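To make the first team's dilemma concrete, here is a minimal sketch that summarizes the three data points from the example above: roughly 16 hours for the earlier project, 30 hours in the first sprint, and 10 hours in the second. The variable names are illustrative only, and three samples are far too few for real statistics. The point is simply how wide the spread is relative to the average.

```python
from statistics import mean

# Hours spent on three similar stories, taken from the example above:
# the earlier project (about 16), the first sprint (30), the second sprint (10).
observed_hours = [16, 30, 10]

average = mean(observed_hours)
spread = max(observed_hours) - min(observed_hours)

print(f"average: {average:.1f} hours")  # average: 18.7 hours
print(f"spread: {spread} hours")        # spread: 20 hours
```

A spread larger than the average itself is one way to see why the team has no obvious number to pick for the third story.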

Estimating with points

Now let’s consider another team, presented with the exact same stories, and with the exact same work history. But in this case the team will be estimating using points.

The team is presented with the new story. Relative to the other stories the team has estimated, this one falls somewhere in the middle of their point scale, which runs from 1 to 21 following a Fibonacci sequence of 1, 2, 3, 5, 8, 13, and 21. So they estimate it will take about five points of effort to complete. For them, five points means about the amount of effort that has been involved in stories like this that they’ve encountered in the past.

After working about twice as hard on the story as he had on previous stories like it, the engineer who completed this story mentions at the retrospective that it took longer than he had expected. As a result, the team was unable to complete all the stories in the backlog for this sprint, lowering their velocity.

At the next sprint planning meeting, the team is presented with a second story, very similar to the first. Relative to the other stories being presented, and the stories they completed in the previous sprint, this story now feels like an eight. The points that they are estimating reflect the relative effort the team believes will be involved in completing a story similar to this one, compared to the other stories they need to work on.

An engineer starts to work on the story, and notices that it actually takes less effort than the previous similar story had. During the retrospective, she brings up the fact that the story didn’t seem as difficult as the points the team had assigned. But for the sake of tracking, it was still an eight-point story, and it is completed and recorded as eight points on the team’s burndown chart.

As a result of overestimating the points for the story, the team was able to take on more work from the icebox during the sprint. That also means that their average velocity, or the average number of points the team can complete in a single sprint, goes up.

Now when the next story like this comes in, the team has an easy decision about how to estimate it.

Relative to the other stories presented, and the team’s history, they estimate that the new story will take five points of effort. The number of points accepted into the sprint backlog reflects what they believe they can get done in a single sprint, based on the experience of two sprints. And this value is adjusted constantly to match the quality of the stories the team is receiving, and the team’s improving ability to estimate their capacity.
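The capacity planning described above can be sketched in a few lines. This is only an illustration under stated assumptions: the article gives no per-sprint point totals, so the completed totals, the story names, and the helper functions `average_velocity` and `plan_sprint` are all hypothetical. The sketch averages the points completed in past sprints and then accepts prioritized stories until that capacity is used up.

```python
from statistics import mean

# Hypothetical points completed in the first two sprints
# (the article does not give actual totals).
completed_points = [18, 26]


def average_velocity(history):
    """Average number of points completed per sprint."""
    return mean(history)


def plan_sprint(candidates, capacity):
    """Walk the stories in priority order, accepting each one that still
    fits within capacity and skipping any that would exceed it."""
    planned, total = [], 0
    for name, points in candidates:
        if total + points <= capacity:
            planned.append(name)
            total += points
    return planned, total


velocity = average_velocity(completed_points)

# Hypothetical prioritized stories as (name, points) on the 1 to 21 scale.
candidates = [
    ("login form", 5),
    ("report export", 8),
    ("search", 13),
    ("profile page", 3),
    ("audit log", 5),
]

planned, total = plan_sprint(candidates, velocity)
print(velocity)  # 22
print(planned)   # ['login form', 'report export', 'profile page', 'audit log']
print(total)     # 21
```

Each completed sprint adds another entry to the history, so the capacity figure keeps adjusting without anyone having to re-estimate in hours.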

Points make for more accurate estimates

Did you notice how the first team’s estimate of hours, and the second team’s estimate of points, actually had very little to do with how much work the team had to put in to get the stories done?

In fact, the first team was almost using hours as if they represented some abstract concept, rather than a measurable objective hour. Just calling their unit of effort “hours” made their attempts to estimate much more complicated, because an “hour” for them didn’t map directly to real-world hours.

Furthermore, without the relative scaling that comes from points, engineers may be tempted to inflate the number of hours they estimate a story might require to allow for practical overhead, or underestimate the hours to demonstrate their own heroic ability to get more done in less time than is sustainable or realistic.

After just two sprints, the second team already has an advantage over the first team: They know how to estimate new stories going forward, and how to incorporate them into their capacity planning. When they estimate a story is going to require eight points of effort, that means something specific to the team.

Points help the team members understand their own ability to take on a new story and accomplish it, relative to other stories they have already completed. Without the useful abstraction provided by points, a team has a much harder time predicting how much work they can actually accomplish in any given period of time.

  • Bruno Seixas

    I can see the logic behind a point measure system, you explain it very nicely :)
    So with more and more experiences/stories done, the measure accuracy, when using points, gets very precise and manageable.

    A question, do you think that with an “estimating with hours” system the “clear vision” is still accomplished, although it takes a lot longer and demands a lot more effort?

    • Oscar

      But couldn’t this also be done for hours? I mean, I can code a form that sends an email, and think it will take me 8 hrs., but after doing it 3 times over, it takes me 4, 5, and 6 hours each, I can then estimate that it will probably take me 5 hrs on the 4th attempt with about 90% accuracy. The example above appears to be the same demonstration of points.

  • Where I work we point the story and then task it, then do hour estimates on the tasks to determine capacity for the Sprint. So is that the best of both worlds or the worst?

  • Also curious as to how to measure capacity using points. Something can be both difficult and time-consuming, while something else could be relatively easy but still be extremely time-consuming. (Updating strings throughout an app, for example.)

  • gihrig

    Beyond the common illusion that we can accomplish more than we can in reality, I fail to see the value of points. I didn’t get from the story that the team using points actually got any more accomplished, or was any more accurate in predicting how long the work would take. Do they ever correlate points with hours or days?

    I tend to take the position that estimates are pretty meaningless and only serve to set unrealistic expectations. The only honest way I know to correlate cost vs benefit from a business perspective is to build the simplest piece that has measurable value, then based on that cost/value ratio, decide to move on or abandon the project. The project lives only as long as it’s producing value in excess of cost. Even this system has the flaw that accurately estimating the value of software is often somewhat arbitrary.

    I think it boils down to deciding: what is the value of an estimate versus what is the value of a product.

  • Matt

    Two points;

    1) The term ‘mis-estimate’ is used. There’s no such thing – an estimate by definition is not an accurate measure. Estimates can be defined as good or bad based on how much the initial value matched reality, but by and by, an estimate is not wrong per se.

    2) I’d challenge anyone to respond to a client with ‘about 8 points’ when the question asked is ‘how long will that take?’. If you can’t communicate it to a client then it’s a tricky proposition, also dangerous when you’ve got to translate between points and hours for no real benefit. Businesses still run on billable hours.

    If the staff aren’t clever enough to deal with this solid fact, an abstract concept such as points may be beyond them, let alone the notion of an agile project framework.

    • dex black

      Using the analogous equation for distance. D=V*T.
      Hours = Day Length [e.g. 8 hours] * Sprint Length [days/sprint] * Total Story Points / Velocity [points/sprint]
      If you can’t do the math, just leave it to those who can and get back to your gardening.

  • Cosmin

    “When looking at a story, they all try to compare it to work that they’ve done in the past.”

    You’re completely wrong. We don’t compare with work done in the past, but we divide work to be done in smaller components that we can estimate with certain precision, but not based on previous exploits.

    Example: implement registration form. Cannot take more than x hours, even if that one time it took 3 days.

  • Hans

    @Matt. In your second point you argue that you need to be able to tell the customer how many hours they will be billed for. Well… since this is Scrum they will be billed for 1 month times x members of the team (or however long you decided with them that the sprint should be). The whole idea is that customers are not meddling with how the development team organise their work. In other words, the client does not micro-manage. The team and the product owner negotiate how much functionality will be put into the program during the coming month at the sprint planning meeting, but not how many hours will be spent working on this and that.

  • John

    I prefer to do range-based estimating. Let’s add a third group that uses time ranges. Based on similar projects, they might feel it is 16hours +- 4hours, or about 12-20 hours. The 8 hour spread is a measure of complexity/doubt. When it takes 30 hours to implement, we’d evaluate how much work this kind of task would typically take and how much uncertainty. If we found there is more inherent work involved than we guessed, and we’d need to do that work every time we did that kind of job, then we’d adjust our center. If we ran into a lot of unusual problems, then we’d adjust the range. In this case we may feel we underestimated the center and also the range, and decide this type of job should really be 20hours +- 10 hours (so 10 to 30 hours). When we do it again and get it done in 10 hours, that range of numbers will be supported.

    I enjoy point-based sizing when doing projects for large corporations, because it’s often good enough to do relative measures and allow the velocity (and therefore correlation between points and dollars) to emerge over time. But for small clients or money-sensitive projects, the clients need to know how much money a project will cost. For those kinds of clients it’s important to get good at predicting dollar figures while recognizing that there is a great deal of uncertainty in developing software.

  • Robert Grant

    Just a couple of things:

    1) Points are not intrinsic to Agile. They’re used in Scrum, for example.
    2) The difference between changing an estimate from 30 hours to 15 instead of 13 points to 8 seems nonexistent.
    3) Points end up being time anyway – if you have a fixed-length sprint, and you know how many points are available per-sprint, then you end up with hours anyway. For this reason (and I’d love to be proved wrong), I think points are the weakest part of Scrum.

    • Gerrick Phillips

      I agree… it’s all time. I never assume straight hours. If it takes 16 hours, that doesn’t equal 2 days. It’s just total effort. I only estimate that a developer will work 5.5 hours of productive time a day, so 16 hours is more like 3 days.

  • Putcha V. Narasimham

    I have not studied what points are and what they measure.
    AA: From this discussion and top few comments, I do not see what “characteristics of the work being done or to be done” are related to the “points” being used to estimate the effort of doing the work. This apart, it is not clear if the quality of the work done is a factor of the work done.
    BB: The article and comments also pointed out that the time estimates do not take into account the waiting and wasted hours.
    CC: With such vital factors AA and BB not being addressed, I am not sure that estimating in points is any better than estimating hours, though points are more sophisticated.
    DD: Ultimately it is the correlation between estimates and actual time and effort taken that determines the credibility of a team. With close monitoring and mobilization of resources it may be possible for a team to perform and deliver what the team promises. Then the validity of measures and accuracy of their estimates become immaterial.
    PVN 21JUL16

  • Michael A Clark

    When a developer is asked to quote a project, it is the responsibility of the project manager to account for the actual hours the assigned resource can truly work. Daily tasks of the position, meetings, testing, and roll-out all play a factor, and are most commonly not thought of by the developer. When developers are given the “story”, they are to analyse the deliverable to determine the estimate, drawing on their own experience and that of their team.

    It seems to me that points, although they can be a form of measurement, do not translate to a customer’s (product owner’s) expectations, as pointed out by others.
