3.4. Bug Tracker
Bug tracking is a broad topic; various
aspects of it are discussed throughout this book. Here
I'll try to concentrate mainly on setup and
technical considerations, but to get to those, we have to start with
a policy question: exactly what kind of information should be kept in
a bug tracker?
The term "bug tracker" is misleading. Bug
tracking systems are also frequently used to track new feature
requests, one-time tasks, unsolicited patches: really, anything
that has distinct beginning and end states, with optional transition
states in between, and that accrues information over its lifetime.
For this reason, bug trackers are also called issue
trackers, defect trackers,
artifact trackers, request
trackers, trouble ticket systems,
etc. See Appendix B for a list of software.
In this book, I'll continue to use "bug
tracker" for the software that does the tracking, because
that's what most people call it, but will use
"issue" to refer to a single item in the bug
tracker's database. This allows us to distinguish
between the behavior or misbehavior that the user encountered (that
is, the bug itself), and the tracker's
record of the bug's discovery,
diagnosis, and eventual resolution. Keep in mind that although most
issues are about actual bugs, issues can be used to track other kinds
of tasks too.
The classic issue life cycle looks like this:
1. Someone files the issue. She provides a summary, an initial
   description (including a reproduction recipe, if applicable; see
   Section 8.1.5 in Chapter 8 for how
   to encourage good bug reports), and whatever other information the
   tracker asks for. The person who files the issue may be totally
   unknown to the project: bug reports and feature requests are as
   likely to come from the user community as from the developers.
   Once filed, the issue is in what's called an
   "open" state. Because no action has been taken
   yet, some trackers also label it as "unverified"
   and/or "unstarted". It is not assigned to
   anyone; or, in some systems, it is assigned to a fake user to
   represent the lack of real assignation. At this point, it is in a
   holding area: the issue has been recorded, but not yet integrated
   into the project's consciousness.
2. Others read the issue, add comments to it, and perhaps ask the
   original filer for clarification on some points.
3. The bug gets reproduced. This may be the most
   important moment in the life cycle. Although the bug is not actually
   fixed yet, the fact that someone besides the original filer was able
   to make it happen proves that it is genuine, and, no less
   importantly, confirms to the original filer that
   she's contributed to the project by reporting a real
   bug.
4. The bug gets diagnosed: its cause is
   identified, and if possible, the effort required to fix it is
   estimated. Make sure these things get recorded in the issue; if the
   person who diagnosed the bug suddenly has to step away from the
   project for a while (as can often happen with volunteer developers),
   someone else should be able to pick up where she left off.
   In this stage, or sometimes the previous one, a developer may
   "take ownership" of the issue and
   assign it to herself (Section 8.1.1.1 in Chapter 8 examines the
   assignment process in more detail). The issue's
   priority may also be set at this stage. For
   example, if it is so severe that it should delay the next release,
   that fact needs to be identified early, and the tracker should have
   some way of noting it.
5. The issue gets scheduled for resolution. Scheduling
   doesn't necessarily mean naming a date by which it
   will be fixed. Sometimes it just means deciding which future release
   (not necessarily the next one) the bug should be fixed by, or
   deciding that it need not block any particular release. Scheduling
   may also be dispensed with, if the bug is quick to fix.
6. The bug gets fixed (or the task completed, or the patch applied, or
   whatever). The change or set of changes that fixed it should be
   recorded in a comment in the issue, after which the issue is
   "closed" and/or marked as
   "resolved".
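As a rough illustration, the life cycle above can be modeled as a small state machine. The sketch below is a simplification under stated assumptions, not a description of how any particular tracker works: the state names, the Issue class, and the TRANSITIONS table are all hypothetical, and real trackers usually treat assignment, priority, and scheduling as separate fields rather than as states.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class State(Enum):
    OPEN = "open"              # filed, nobody has acted on it yet
    REPRODUCED = "reproduced"  # someone other than the filer confirmed it
    DIAGNOSED = "diagnosed"    # cause identified, effort estimated
    SCHEDULED = "scheduled"    # targeted at a release
    CLOSED = "closed"          # fixed, invalid, or duplicate


# Hypothetical transition table. Any state may go straight to CLOSED
# (invalid, duplicate, or a quick fix that skips scheduling), and a
# closed issue may be reopened.
TRANSITIONS = {
    State.OPEN: {State.REPRODUCED, State.CLOSED},
    State.REPRODUCED: {State.DIAGNOSED, State.CLOSED},
    State.DIAGNOSED: {State.SCHEDULED, State.CLOSED},
    State.SCHEDULED: {State.CLOSED},
    State.CLOSED: {State.OPEN},
}


@dataclass
class Issue:
    summary: str
    state: State = State.OPEN
    assignee: Optional[str] = None  # None stands for "not assigned to anyone"
    history: List[str] = field(default_factory=list)

    def move_to(self, new_state: State, comment: str) -> None:
        """Change state, keeping a record of why, or refuse an illegal move."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.history.append(
            f"{self.state.value} -> {new_state.value}: {comment}"
        )
        self.state = new_state
```

Requiring a comment on every transition mirrors the advice that the diagnosis and the fixing change be recorded in the issue, so that someone else can pick up where the previous person left off.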
There are some common variations on this life cycle. Sometimes an
issue is closed very soon after being filed, because it turns out not
to be a bug at all, but rather a misunderstanding on the part of the
user. As a project acquires more users, more and more such invalid
issues will come in, and developers will close them with increasingly
short-tempered responses. Try to guard against the latter tendency.
It does no one any good, as the individual user in each case is not
responsible for all the previous invalid issues; the statistical
trend is visible only from the developers' point of
view, not the user's. (In Section 3.4.2 later in this chapter,
we'll look at techniques for reducing the number of
invalid issues.) Also, if different users are experiencing the same
misunderstanding over and over, it might mean that aspect of the
software needs to be redesigned. This sort of pattern is easiest to
notice when there is an issue manager monitoring the bug database;
see Section 8.2.4 in Chapter 8.
Another common life cycle variation is for the issue to be closed as
a duplicate soon after Step 1. A duplicate is an issue that reports
something already known to the project. Duplicates are not confined to open issues:
it's possible for a bug to come back after having
been fixed (this is known as a regression), in
which case the preferred course is usually to reopen the original
issue and close any new reports as duplicates of the original one.
The bug tracking system should keep track of this relationship
bidirectionally, so that reproduction information in the duplicates
is available to the original issue, and vice versa.
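One way to picture bidirectional duplicate tracking is a pair of lookup tables kept in step with each other. This is only a sketch: the DuplicateIndex class and its method names are invented for illustration, and a real tracker would store these links in its database.

```python
from collections import defaultdict


class DuplicateIndex:
    """Records duplicate links so they can be followed in both directions."""

    def __init__(self):
        self.original_of = {}                  # duplicate id -> original id
        self.duplicates_of = defaultdict(set)  # original id -> duplicate ids

    def mark_duplicate(self, duplicate_id, original_id):
        # If the "original" is itself a known duplicate, link to its original.
        original_id = self.original_of.get(original_id, original_id)
        if duplicate_id == original_id:
            raise ValueError("an issue cannot be a duplicate of itself")
        # Re-point anything previously marked as a duplicate of this issue.
        for other in self.duplicates_of.pop(duplicate_id, set()):
            self.original_of[other] = original_id
            self.duplicates_of[original_id].add(other)
        self.original_of[duplicate_id] = original_id
        self.duplicates_of[original_id].add(duplicate_id)
```

With both tables in place, someone reading the original issue can find the reproduction details in every duplicate, and someone landing on a duplicate can jump to the original.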
A third variation is for the developers to close the issue, thinking
they have fixed it, only to have the original reporter reject the fix
and reopen it. This is usually because the developers simply
don't have access to the environment necessary to
reproduce the bug, or because they didn't test the
fix using the exact same reproduction recipe as the reporter.
Aside from these variations, there may be other small details of the
life cycle that vary depending on the tracking software. But the
basic shape is the same, and while the life cycle itself is not
specific to open source software, it has implications for how open
source projects use their bug trackers.
As Step 1 implies, the tracker is as much a public face of the
project as the mailing lists or web pages. Anyone may file an issue,
anyone may look at an issue, and anyone may browse the list of
currently open issues. It follows that you never know how many people
are waiting to see progress on a given issue. While the size and
skill of the development community constrains the rate at which
issues can be resolved, the project should at least try to
acknowledge each issue the moment it appears. Even if the issue
lingers for a while, a response encourages the reporter to stay
involved, because she feels that a human has registered what she has
done (remember that filing an issue usually involves more effort
than, say, posting an email). Furthermore, once an issue is seen by a
developer, it enters the project's consciousness, in
the sense that the developer can be on the lookout for other
instances of the issue, can talk about it with other developers, etc.
The need for timely reactions implies two things:
* The tracker must be connected to a mailing list, such that every
  change to an issue, including its initial filing, causes a mail to go
  out describing what happened. This mailing list is usually different
  from the regular development list, since not all developers may want
  to receive automated bug mails, but (just as with commit mails) the
  Reply-to header should be set to the development mailing list.
* The form for filing issues should capture the
  reporter's email address, so she can be contacted
  for more information. (However, it should not
  require the reporter's email
  address, as some people prefer to report issues anonymously. See
  Section 3.7.1.2 later in this chapter
  for more on the importance of anonymity.)
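To make the mail setup concrete, here is a minimal sketch of a notification message built with Python's standard email package. The addresses, the subject format, and the build_notification function are assumptions made for the example. Copying the reporter on the mail when she chose to give an address is also my own assumption about how "contacting" her might work.

```python
from email.message import EmailMessage

# Hypothetical addresses; substitute the project's real lists.
ISSUES_LIST = "issues@example.org"  # receives the automated bug mails
DEV_LIST = "dev@example.org"        # where human discussion should happen


def build_notification(issue_id, summary, change, reporter_email=None):
    """Build the mail describing one change to one issue."""
    msg = EmailMessage()
    msg["From"] = "tracker@example.org"
    msg["To"] = ISSUES_LIST
    # Replies go to the development list, not back to the automated list.
    msg["Reply-To"] = DEV_LIST
    msg["Subject"] = f"[issue {issue_id}] {summary}"
    # The address is optional, since some people report anonymously.
    if reporter_email:
        msg["Cc"] = reporter_email
    msg.set_content(change)
    return msg
```

Actually sending the message (for example with smtplib) is left out, since that depends on the project's mail infrastructure.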
3.4.1. Interaction with Mailing Lists
Make sure the bug tracker doesn't turn into a
discussion forum. Although it is important to maintain a human
presence in the bug tracker, it is not fundamentally suited to
real-time discussion. Think of it rather as an archiver, a way to
organize facts and references to other discussions, primarily those
that take place on mailing lists.
There are two reasons to make this distinction. First, the bug
tracker is more cumbersome to use than the mailing lists (or than
real-time chat forums, for that matter). This is not because bug
trackers have bad user interface design; it's just
that their interfaces were designed for capturing and presenting
discrete states, not free-flowing discussions. Second, not everyone
who should be involved in discussing a given issue is necessarily
watching the bug tracker. Part of good issue management (see
"Share Management Tasks as Well as Technical
Tasks" in Chapter 8) is to make sure each issue is
brought to the right people's attention, rather than
requiring every developer to monitor all issues. In Section 6.5 in Chapter 6,
we'll look at ways to make sure people
don't accidentally siphon discussions out of
appropriate forums and into the bug tracker.
Some bug trackers can monitor mailing lists and automatically log all
emails that are about a known issue. Typically they do this by
recognizing the issue's identifying number in the
subject line of the mail, as part of a special string; developers
learn to include these strings in their mails to attract the
tracker's notice. The bug tracker may either save
the entire email, or (even better) just record a link to the mail in
the regular mailing list archive. Either way, this is a very useful
feature; if your tracker has it, make sure both to turn it on and to
remind people to take advantage of it.
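The subject-line recognition described above might look something like the following sketch. The "[issue 1234]" string is an assumed convention (each tracker defines its own special string), and the function names and the dictionary used as a log are hypothetical.

```python
import re

# Assumed convention: mails refer to an issue as "[issue 1234]" or
# "[issue #1234]" somewhere in the subject line.
ISSUE_TAG = re.compile(r"\[issue\s+#?(\d+)\]", re.IGNORECASE)


def issue_ids_in_subject(subject):
    """Return the issue numbers mentioned in a subject line, in order."""
    return [int(number) for number in ISSUE_TAG.findall(subject)]


def log_mail(issue_log, subject, archive_url):
    """Record a link to the archived mail under every issue it mentions."""
    for issue_id in issue_ids_in_subject(subject):
        issue_log.setdefault(issue_id, []).append(archive_url)
```

This version stores only the archive link rather than the whole mail, following the preference expressed above for linking into the regular mailing list archive.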
3.4.2. Prefiltering the Bug Tracker
Most issue databases eventually suffer from the same problem: a
crushing load of duplicate or invalid issues filed by well-meaning
but inexperienced or ill-informed users. The first step in combatting
this trend is usually to put a prominent notice on the front page of
the bug tracker, explaining how to tell if a bug is really a bug, how
to search to see if it's already been filed, and
finally, how to effectively report it if one still thinks
it's a new bug.
This will reduce the noise level for a while, but as the number of
users increases, the problem will eventually come back. No individual
user can be blamed for it. Each one is just trying to contribute to
the project's well-being, and even if their first
bug report isn't helpful, you still want to
encourage them to stay involved and file better issues in the future.
In the meantime, though, the project needs to keep the issue database
as free of junk as possible.
The two things that will do the most to prevent this problem are:
making sure there are people watching the bug tracker who have enough
knowledge to close issues as invalid or duplicates the moment they
come in, and requiring (or strongly encouraging) users to confirm
their bugs with other people before filing them in the tracker.
The first technique seems to be used universally. Even projects with
huge issue databases (say, the Debian bug tracker at http://bugs.debian.org/, which contained
315,929 issues as of this writing) still arrange things so that
someone sees each issue that comes in. It may be
a different person depending on the category of the issue. For
example, the Debian project is a collection of software packages, so
Debian automatically routes each issue to the appropriate package
maintainers. Of course, users can sometimes misidentify an
issue's category, with the result that the issue is
sent to the wrong person initially, who may then have to reroute it.
However, the important thing is that the burden is still
shared: whether the user guesses right or wrong when filing,
issue watching is still distributed more or less evenly among the
developers, so each issue is able to receive a timely response.
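The routing idea can be sketched in a few lines. This is not Debian's actual implementation; the category names, addresses, fallback list, and function names are all invented for illustration.

```python
# Hypothetical routing table: issue category -> people who watch it.
MAINTAINERS = {
    "editor": ["alice@example.org"],
    "network": ["bob@example.org", "carol@example.org"],
}

# Assumed fallback so that an unknown category still reaches a human.
TRIAGE_FALLBACK = ["triage@example.org"]


def recipients_for(category):
    """Return who should see a new issue filed under this category."""
    return MAINTAINERS.get(category, TRIAGE_FALLBACK)


def reroute(issue, new_category):
    """Fix a misfiled issue's category and return its new recipients."""
    issue["category"] = new_category
    return recipients_for(new_category)
```

The fallback list is a design choice rather than something the text prescribes; its purpose is to preserve the property that someone sees every incoming issue even when the reporter's guess at a category matches nothing.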
The second technique is less widespread, probably because
it's harder to automate. The essential idea is that
every new issue gets "buddied" into
the database. When a user thinks he's found a
problem, he is asked to describe it on one of the mailing lists, or
in an IRC channel, and get confirmation from someone that it is
indeed a bug. Bringing in that second pair of eyes early can prevent
a lot of spurious reports. Sometimes the second party is able to
identify that the behavior is not a bug, or is fixed in recent
releases. Or she may be familiar with the symptoms from a previous
issue, and can prevent a duplicate filing by pointing the user to the
older issue. Often it's enough just to ask the user
"Did you search the bug tracker to see if
it's already been reported?" Many
people simply don't think of that, yet are happy to
do the search once they know someone's
expecting them to.
The buddy system can really keep the issue database clean, but it has
some disadvantages too. Many people will file solo anyway, either
through not seeing, or through disregarding, the instructions to find
a buddy for new issues. Thus it is still necessary for volunteers to
watch the issue database. Furthermore, because most new reporters
don't understand how difficult the task of
maintaining the issue database is, it's not fair to
chide them too harshly for ignoring the guidelines. Thus the
volunteers must be vigilant, and yet exercise restraint in how they
bounce unbuddied issues back to their reporters. The goal is to train
each reporter to use the buddying system in the future, so that there
is an ever-growing pool of people who understand the issue-filtering
system. On seeing an unbuddied issue, here are the ideal steps:
1. Immediately respond to the issue, politely thanking the user for
   filing, but pointing them to the buddying guidelines (which should,
   of course, be prominently posted on the web site).
2. If the issue is clearly valid and not a duplicate, approve it anyway,
   and start it down the normal life cycle. After all, the
   reporter's now been informed about buddying, so
   there's no point wasting the work done so far by
   closing a valid issue.
3. Otherwise, if the issue is not clearly valid, close it, but ask the
   reporter to reopen it if they get confirmation from a buddy. When
   they do, they should add to the issue a reference to the confirmation
   thread (e.g., a URL into the mailing list archives).
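For a project that wants to script part of this, the decision in the steps above could be expressed roughly as follows. The Action names, the guidelines URL, and the reply wording are placeholders, and closing known duplicates outright is my assumption based on the earlier discussion of duplicates. Judging whether an issue is "clearly valid" still takes a human.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    CLOSE_AS_DUPLICATE = "close as duplicate"
    CLOSE_UNTIL_CONFIRMED = "close until confirmed"


# Hypothetical location of the project's buddying guidelines.
GUIDELINES_URL = "https://example.org/buddy-guidelines"


def triage_unbuddied(clearly_valid, is_duplicate):
    """Decide what to do with an issue filed without a buddy.

    Returns the action to take and the polite reply to post on the issue.
    """
    # Every reply thanks the reporter and points to the guidelines.
    reply = (
        "Thanks for filing this! For future reports, please see our "
        f"buddying guidelines: {GUIDELINES_URL}"
    )
    if is_duplicate:
        return Action.CLOSE_AS_DUPLICATE, reply
    if clearly_valid:
        # No point wasting the work already done on a valid issue.
        return Action.ACCEPT, reply
    reply += (
        " I'm closing this for now; please reopen it with a link to the "
        "confirmation thread once someone has confirmed the problem."
    )
    return Action.CLOSE_UNTIL_CONFIRMED, reply
```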
Remember that although this system will improve the signal/noise
ratio in the issue database over time, it will never completely stop
the misfilings. The only way to prevent misfilings entirely is to
close off the bug tracker to everyone but developers, a cure
that is almost always worse than the disease. It's
better to accept that cleaning out invalid issues will always be part
of the project's routine maintenance, and to try to
get as many people as possible to help.
See also Section 8.2.4 in Chapter 8.