Thursday, October 22, 2009

6.5 Displaying One Set of Values While Sorting by Another




I l@ve RuBoard










6.5 Displaying One Set of Values While Sorting by Another




6.5.1 Problem



You want to sort a result set using
values that you're not selecting.





6.5.2 Solution



That's not a problem. You can use columns in the
ORDER BY clause that
don't appear in the column output list.





6.5.3 Discussion



ORDER BY is not limited to
sorting only those columns named in the column output list. It can
sort using values that are "hidden"
(that is, not displayed in the query output). This technique is
commonly used when you have values that can be represented different
ways and you want to display one type of value but sort by another.
For example, you may want to display mail message sizes not in terms
of bytes, but as strings such as 103K for 103
kilobytes. You can convert a byte count to that kind of value using
this expression:



CONCAT(FLOOR((size+1023)/1024),'K')


However, such values are strings, so they sort lexically, not
numerically. If you use them for sorting, a value such as
96K sorts after 2339K, even
though it represents a smaller number:



mysql> SELECT t, srcuser,
-> CONCAT(FLOOR((size+1023)/1024),'K') AS size_in_K
-> FROM mail WHERE size > 50000
-> ORDER BY size_in_K;
+---------------------+---------+-----------+
| t | srcuser | size_in_K |
+---------------------+---------+-----------+
| 2001-05-12 12:48:13 | tricia | 191K |
| 2001-05-14 17:03:01 | tricia | 2339K |
| 2001-05-11 10:15:08 | barb | 57K |
| 2001-05-14 14:42:21 | barb | 96K |
| 2001-05-15 10:25:52 | gene | 976K |
+---------------------+---------+-----------+


To achieve the desired output order, display the string, but use the
actual numeric size for sorting:



mysql> SELECT t, srcuser,
-> CONCAT(FLOOR((size+1023)/1024),'K') AS size_in_K
-> FROM mail WHERE size > 50000
-> ORDER BY size;
+---------------------+---------+-----------+
| t | srcuser | size_in_K |
+---------------------+---------+-----------+
| 2001-05-11 10:15:08 | barb | 57K |
| 2001-05-14 14:42:21 | barb | 96K |
| 2001-05-12 12:48:13 | tricia | 191K |
| 2001-05-15 10:25:52 | gene | 976K |
| 2001-05-14 17:03:01 | tricia | 2339K |
+---------------------+---------+-----------+


Displaying values as
strings but sorting them as numbers also can bail you out of some
otherwise difficult situations. Members of sports teams typically are
assigned a jersey number, which normally you might think should be
stored using a numeric column. Not so fast! Some players like to have
a jersey number of zero (0), and some like
double-zero (00). If a team happens to have
players with both numbers, you cannot represent them using a numeric
column, because both values will be treated as the same number. The
way out of the problem is to store jersey numbers as strings:



CREATE TABLE roster
(
name CHAR(30), # player name
jersey_num CHAR(3) # jersey number
);


Then the jersey numbers will display the same way you enter them, and
0 and 00 will be treated as
distinct values. Unfortunately, although representing numbers as
strings solves the problem of distinguishing 0 and
00, it introduces a different problem. Suppose a
team comprises the following players:



mysql> SELECT name, jersey_num FROM roster;
+-----------+------------+
| name | jersey_num |
+-----------+------------+
| Lynne | 29 |
| Ella | 0 |
| Elizabeth | 100 |
| Nancy | 00 |
| Jean | 8 |
| Sherry | 47 |
+-----------+------------+


The problem occurs when you try to sort
the team members by jersey number. If those numbers are stored as
strings, they'll sort lexically, and lexical order
often differs from numeric order. That's certainly
true for the team in question:



mysql> SELECT name, jersey_num FROM roster ORDER BY jersey_num;
+-----------+------------+
| name | jersey_num |
+-----------+------------+
| Ella | 0 |
| Nancy | 00 |
| Elizabeth | 100 |
| Lynne | 29 |
| Sherry | 47 |
| Jean | 8 |
+-----------+------------+


The values 100 and 8 are out of
place. But that's easily solved. Display the string
values, but use the numeric values for sorting. To accomplish this,
add zero to the jersey_num values to force a
string-to-number conversion:



mysql> SELECT name, jersey_num FROM roster ORDER BY jersey_num+0;
+-----------+------------+
| name | jersey_num |
+-----------+------------+
| Ella | 0 |
| Nancy | 00 |
| Jean | 8 |
| Lynne | 29 |
| Sherry | 47 |
| Elizabeth | 100 |
+-----------+------------+


The technique of displaying one value but sorting by another is also
useful when you want to display composite values that are formed from
multiple columns but that don't sort the way you
want. For example, the mail table lists message
senders using separate srcuser and
srchost values. If you want to display message
senders from the mail table as email addresses in
srcuser@srchost format with the username first,
you can construct those values using the following expression:



CONCAT(srcuser,'@',srchost)


However, those values are no good for sorting if you want to treat
the hostname as more significant than the username. Instead, sort the
results using the underlying column values rather than the displayed
composite values:



mysql> SELECT t, CONCAT(srcuser,'@',srchost) AS sender, size
-> FROM mail WHERE size > 50000
-> ORDER BY srchost, srcuser;
+---------------------+---------------+---------+
| t | sender | size |
+---------------------+---------------+---------+
| 2001-05-15 10:25:52 | gene@mars | 998532 |
| 2001-05-12 12:48:13 | tricia@mars | 194925 |
| 2001-05-11 10:15:08 | barb@saturn | 58274 |
| 2001-05-14 17:03:01 | tricia@saturn | 2394482 |
| 2001-05-14 14:42:21 | barb@venus | 98151 |
+---------------------+---------------+---------+


The same idea commonly is applied to sorting
people's names. Suppose you have a table
names that contains last and first names. To
display records sorted by last name first, the query is
straightforward when the columns are displayed separately:



mysql> SELECT last_name, first_name FROM name
-> ORDER BY last_name, first_name;
+-----------+------------+
| last_name | first_name |
+-----------+------------+
| Blue | Vida |
| Brown | Kevin |
| Gray | Pete |
| White | Devon |
| White | Rondell |
+-----------+------------+


If instead you want to display each name as a single string composed
of the first name, a space, and the last name, you can begin the
query like this:



SELECT CONCAT(first_name,' ',last_name) AS full_name FROM name ...


But then how do you sort the names so they come out in the last name
order? The answer is to display the composite names, but refer to the
constituent values in the ORDER
BY clause:



mysql> SELECT CONCAT(first_name,' ',last_name) AS full_name
-> FROM name
-> ORDER BY last_name, first_name;
+---------------+
| full_name |
+---------------+
| Vida Blue |
| Kevin Brown |
| Pete Gray |
| Devon White |
| Rondell White |
+---------------+


If you want to write queries that sort on non-displayed values,
you'll have problems if the sort columns are
expressions and you're using an older version of
MySQL. This is because expressions aren't allowed in
ORDER BY clauses until
MySQL 3.23.2 (as discussed in Recipe 6.4).



The solution is to "unhide" the
expression�add it as an extra output column, and then refer to
it by position or by using an alias. For example, to write a query
that lists names from the names table with the
longest names first, you might do this in MySQL 3.23.2 and up:



mysql> SELECT CONCAT(first_name,' ',last_name) AS name
-> FROM names
-> ORDER BY LENGTH(CONCAT(first_name,' ',last_name)) DESC;
+---------------+
| name |
+---------------+
| Rondell White |
| Kevin Brown |
| Devon White |
| Vida Blue |
| Pete Gray |
+---------------+


To rewrite this query for older versions of MySQL, put the expression
in the output column list and use an alias to sort it:



mysql> SELECT CONCAT(first_name,' ',last_name) AS name,
-> LENGTH(CONCAT(first_name,' ',last_name)) AS len
-> FROM names
-> ORDER BY len DESC;
+---------------+------+
| name | len |
+---------------+------+
| Rondell White | 13 |
| Kevin Brown | 11 |
| Devon White | 11 |
| Vida Blue | 9 |
| Pete Gray | 9 |
+---------------+------+


Or else refer to the additional output column by position:



mysql> SELECT CONCAT(first_name,' ',last_name) AS name,
-> LENGTH(CONCAT(first_name,' ',last_name)) AS len
-> FROM names
-> ORDER BY 2 DESC;
+---------------+------+
| name | len |
+---------------+------+
| Rondell White | 13 |
| Kevin Brown | 11 |
| Devon White | 11 |
| Vida Blue | 9 |
| Pete Gray | 9 |
+---------------+------+


Whichever workaround you use, the output will of course contain a
column that's there only for sorting purposes and
that you really aren't interested in displaying. If
you're running the query from the
mysql program, that's
unfortunate, but there's nothing you can do about
the additional output. In your own programs, the extra output column
is no problem. It'll be returned in the result set,
but you can ignore it. Here's a Python example that
demonstrates this. It runs the query, displays the names, and
discards the name lengths:



cursor = conn.cursor (MySQLdb.cursors.DictCursor)
cursor.execute ("""
SELECT CONCAT(first_name,' ',last_name) AS full_name,
LENGTH(CONCAT(first_name,' ',last_name)) AS len
FROM name
ORDER BY len DESC
""")
for row in cursor.fetchall ( ):
print row["full_name"] # print name, ignore length
cursor.close ( )









    I l@ve RuBoard



    No comments:

    Post a Comment