Wednesday, October 28, 2009

14.2 Preventing Duplicates from Occurring in a Table




I l@ve RuBoard










14.2 Preventing Duplicates from Occurring in a Table




14.2.1 Problem



You
want to prevent a table from ever containing duplicates, so that you
won't have to worry about eliminating them later.





14.2.2 Solution



Use a PRIMARY KEY or a
UNIQUE index.





14.2.3 Discussion



To make sure that records in a table are unique, some column or
combination of columns must be required to contain unique values in
each row. When this requirement is satisfied, you can refer to any
record in the table unambiguously using its unique identifier. To
make sure a table has this characteristic, include a
PRIMARY KEY or
UNIQUE index in the table structure when you
create the table. The following table contains no such index, so it
would allow duplicate records:



CREATE TABLE person
(
last_name CHAR(20),
first_name CHAR(20),
address CHAR(40)
);


To prevent multiple records with the same first and last name values
from being created in this table, add a
PRIMARY
KEY to its definition. When you do this,
it's also necessary to declare the indexed columns
to be NOT NULL, because a
PRIMARY KEY does not allow
NULL values:



CREATE TABLE person
(
last_name CHAR(20) NOT NULL,
first_name CHAR(20) NOT NULL,
address CHAR(40),
PRIMARY KEY (last_name, first_name)
);


The presence of a unique index in a table normally causes an error to
occur if you insert a record into the table that duplicates an
existing record in the column or columns that define the index. Recipe 14.3 discusses how to handle such errors or modify
MySQL's duplicate-handling behavior.



Another way to enforce uniqueness is to add a
UNIQUE index
rather than a PRIMARY KEY to a
table. The two types of indexes are identical, with the exception
that a UNIQUE index can be created on columns that
allow NULL values. For the
person table, it's likely that
you'd require both the first and last names to be
filled in. If so, you'd still declare the columns as
NOT NULL, and the following
table declaration would be effectively equivalent to the preceding
one:



CREATE TABLE person
(
last_name CHAR(20) NOT NULL,
first_name CHAR(20) NOT NULL,
address CHAR(40),
UNIQUE (last_name, first_name)
);


If a UNIQUE index
does happen to allow NULL values,
NULL is special because it is the one value that
can occur multiple times. The rationale for this is that it is not
possible to know whether one unknown value is the same as another, so
multiple unknown values are allowed.



It may of course be that you'd want the
person table to reflect the real world, in which
people do sometimes have the same name. In this case, you cannot set
up a unique index based on the name columns, because duplicate names
must be allowed. Instead, each person must be assigned some sort of
unique identifier, which becomes the value that distinguishes one
record from another. In MySQL, a common technique for this is the
AUTO_INCREMENT column:



CREATE TABLE person
(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
last_name CHAR(20),
first_name CHAR(20),
address CHAR(40),
PRIMARY KEY (id)
);


In this case, when you create a record with an id
value of NULL, MySQL assigns that column a unique
ID automatically. Another possibility is to assign identifiers
externally and use those IDs as unique keys. For example, citizens in
a given country might have unique taxpayer ID numbers. If so, those
numbers can serve as the basis for a unique index:



CREATE TABLE person
(
tax_id INT UNSIGNED NOT NULL,
last_name CHAR(20),
first_name CHAR(20),
address CHAR(40),
PRIMARY KEY (tax_id)
);




14.2.4 See Also



AUTO_INCREMENT columns are discussed further in
Chapter 11.










    I l@ve RuBoard



    No comments:

    Post a Comment