6.5 Displaying One Set of Values While Sorting by Another
6.5.1 Problem
You want to sort a result set using values that you're not selecting.
6.5.2 Solution
That's not a problem. You can use columns in the ORDER BY clause that don't appear in the column output list.
6.5.3 Discussion
ORDER BY is not limited to sorting only those columns named in the column output list. It can sort using values that are "hidden" (that is, not displayed in the query output). This technique is commonly used when you have values that can be represented different ways and you want to display one type of value but sort by another. For example, you may want to display mail message sizes not in terms of bytes, but as strings such as 103K for 103 kilobytes. You can convert a byte count to that kind of value using this expression:
CONCAT(FLOOR((size+1023)/1024),'K')
However, such values are strings, so they sort lexically, not numerically. If you use them for sorting, a value such as 96K sorts after 2339K, even though it represents a smaller number:
mysql> SELECT t, srcuser, -> CONCAT(FLOOR((size+1023)/1024),'K') AS size_in_K -> FROM mail WHERE size > 50000 -> ORDER BY size_in_K; +---------------------+---------+-----------+ | t | srcuser | size_in_K | +---------------------+---------+-----------+ | 2001-05-12 12:48:13 | tricia | 191K | | 2001-05-14 17:03:01 | tricia | 2339K | | 2001-05-11 10:15:08 | barb | 57K | | 2001-05-14 14:42:21 | barb | 96K | | 2001-05-15 10:25:52 | gene | 976K | +---------------------+---------+-----------+
To achieve the desired output order, display the string, but use the actual numeric size for sorting:
mysql> SELECT t, srcuser, -> CONCAT(FLOOR((size+1023)/1024),'K') AS size_in_K -> FROM mail WHERE size > 50000 -> ORDER BY size; +---------------------+---------+-----------+ | t | srcuser | size_in_K | +---------------------+---------+-----------+ | 2001-05-11 10:15:08 | barb | 57K | | 2001-05-14 14:42:21 | barb | 96K | | 2001-05-12 12:48:13 | tricia | 191K | | 2001-05-15 10:25:52 | gene | 976K | | 2001-05-14 17:03:01 | tricia | 2339K | +---------------------+---------+-----------+
Displaying values as strings but sorting them as numbers also can bail you out of some otherwise difficult situations. Members of sports teams typically are assigned a jersey number, which normally you might think should be stored using a numeric column. Not so fast! Some players like to have a jersey number of zero (0), and some like double-zero (00). If a team happens to have players with both numbers, you cannot represent them using a numeric column, because both values will be treated as the same number. The way out of the problem is to store jersey numbers as strings:
CREATE TABLE roster ( name CHAR(30), # player name jersey_num CHAR(3) # jersey number );
Then the jersey numbers will display the same way you enter them, and 0 and 00 will be treated as distinct values. Unfortunately, although representing numbers as strings solves the problem of distinguishing 0 and 00, it introduces a different problem. Suppose a team comprises the following players:
mysql> SELECT name, jersey_num FROM roster; +-----------+------------+ | name | jersey_num | +-----------+------------+ | Lynne | 29 | | Ella | 0 | | Elizabeth | 100 | | Nancy | 00 | | Jean | 8 | | Sherry | 47 | +-----------+------------+
The problem occurs when you try to sort the team members by jersey number. If those numbers are stored as strings, they'll sort lexically, and lexical order often differs from numeric order. That's certainly true for the team in question:
mysql> SELECT name, jersey_num FROM roster ORDER BY jersey_num; +-----------+------------+ | name | jersey_num | +-----------+------------+ | Ella | 0 | | Nancy | 00 | | Elizabeth | 100 | | Lynne | 29 | | Sherry | 47 | | Jean | 8 | +-----------+------------+
The values 100 and 8 are out of place. But that's easily solved. Display the string values, but use the numeric values for sorting. To accomplish this, add zero to the jersey_num values to force a string-to-number conversion:
mysql> SELECT name, jersey_num FROM roster ORDER BY jersey_num+0; +-----------+------------+ | name | jersey_num | +-----------+------------+ | Ella | 0 | | Nancy | 00 | | Jean | 8 | | Lynne | 29 | | Sherry | 47 | | Elizabeth | 100 | +-----------+------------+
The technique of displaying one value but sorting by another is also useful when you want to display composite values that are formed from multiple columns but that don't sort the way you want. For example, the mail table lists message senders using separate srcuser and srchost values. If you want to display message senders from the mail table as email addresses in srcuser@srchost format with the username first, you can construct those values using the following expression:
CONCAT(srcuser,'@',srchost)
However, those values are no good for sorting if you want to treat the hostname as more significant than the username. Instead, sort the results using the underlying column values rather than the displayed composite values:
mysql> SELECT t, CONCAT(srcuser,'@',srchost) AS sender, size -> FROM mail WHERE size > 50000 -> ORDER BY srchost, srcuser; +---------------------+---------------+---------+ | t | sender | size | +---------------------+---------------+---------+ | 2001-05-15 10:25:52 | gene@mars | 998532 | | 2001-05-12 12:48:13 | tricia@mars | 194925 | | 2001-05-11 10:15:08 | barb@saturn | 58274 | | 2001-05-14 17:03:01 | tricia@saturn | 2394482 | | 2001-05-14 14:42:21 | barb@venus | 98151 | +---------------------+---------------+---------+
The same idea commonly is applied to sorting people's names. Suppose you have a table names that contains last and first names. To display records sorted by last name first, the query is straightforward when the columns are displayed separately:
mysql> SELECT last_name, first_name FROM name -> ORDER BY last_name, first_name; +-----------+------------+ | last_name | first_name | +-----------+------------+ | Blue | Vida | | Brown | Kevin | | Gray | Pete | | White | Devon | | White | Rondell | +-----------+------------+
If instead you want to display each name as a single string composed of the first name, a space, and the last name, you can begin the query like this:
SELECT CONCAT(first_name,' ',last_name) AS full_name FROM name ...
But then how do you sort the names so they come out in the last name order? The answer is to display the composite names, but refer to the constituent values in the ORDER BY clause:
mysql> SELECT CONCAT(first_name,' ',last_name) AS full_name -> FROM name -> ORDER BY last_name, first_name; +---------------+ | full_name | +---------------+ | Vida Blue | | Kevin Brown | | Pete Gray | | Devon White | | Rondell White | +---------------+
If you want to write queries that sort on non-displayed values, you'll have problems if the sort columns are expressions and you're using an older version of MySQL. This is because expressions aren't allowed in ORDER BY clauses until MySQL 3.23.2 (as discussed in Recipe 6.4).
The solution is to "unhide" the expression�add it as an extra output column, and then refer to it by position or by using an alias. For example, to write a query that lists names from the names table with the longest names first, you might do this in MySQL 3.23.2 and up:
mysql> SELECT CONCAT(first_name,' ',last_name) AS name -> FROM names -> ORDER BY LENGTH(CONCAT(first_name,' ',last_name)) DESC; +---------------+ | name | +---------------+ | Rondell White | | Kevin Brown | | Devon White | | Vida Blue | | Pete Gray | +---------------+
To rewrite this query for older versions of MySQL, put the expression in the output column list and use an alias to sort it:
mysql> SELECT CONCAT(first_name,' ',last_name) AS name, -> LENGTH(CONCAT(first_name,' ',last_name)) AS len -> FROM names -> ORDER BY len DESC; +---------------+------+ | name | len | +---------------+------+ | Rondell White | 13 | | Kevin Brown | 11 | | Devon White | 11 | | Vida Blue | 9 | | Pete Gray | 9 | +---------------+------+
Or else refer to the additional output column by position:
mysql> SELECT CONCAT(first_name,' ',last_name) AS name, -> LENGTH(CONCAT(first_name,' ',last_name)) AS len -> FROM names -> ORDER BY 2 DESC; +---------------+------+ | name | len | +---------------+------+ | Rondell White | 13 | | Kevin Brown | 11 | | Devon White | 11 | | Vida Blue | 9 | | Pete Gray | 9 | +---------------+------+
Whichever workaround you use, the output will of course contain a column that's there only for sorting purposes and that you really aren't interested in displaying. If you're running the query from the mysql program, that's unfortunate, but there's nothing you can do about the additional output. In your own programs, the extra output column is no problem. It'll be returned in the result set, but you can ignore it. Here's a Python example that demonstrates this. It runs the query, displays the names, and discards the name lengths:
cursor = conn.cursor (MySQLdb.cursors.DictCursor) cursor.execute (""" SELECT CONCAT(first_name,' ',last_name) AS full_name, LENGTH(CONCAT(first_name,' ',last_name)) AS len FROM name ORDER BY len DESC """) for row in cursor.fetchall ( ): print row["full_name"] # print name, ignore length cursor.close ( )
|
No comments:
Post a Comment