12 Generating "Click to Sort" Table Headings


To retrieve the table and display its contents as an HTML table, you can use the techniques

discussed in Recipe 17.4. Here we'll use those same concepts but modify them to produce

"click to sort" table column headings.

A "plain" HTML table would include a row of column headers consisting only of the column names.
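To make the headings active links that reinvoke the script to produce a display sorted by a given column name, we need a header row in which each column name is wrapped in a hyperlink to the script's own URL, with the column name passed as the value of a sort parameter.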
To generate such headings, the script needs to know the names of the columns in the table,

as well as its own URL. Recipe 9.6 and Recipe 18.2 show how to obtain this information using

query metadata and information in the script's environment. For example, in PHP, a script can

generate the header row for the columns in a given query like this:

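$self_path = get_self_path ( );
print ("<tr>\n");
for ($i = 0; $i < mysql_num_fields ($result_id); $i++)
{
  $col_name = mysql_field_name ($result_id, $i);
  # each heading links back to this script with the column name as sort value
  printf ("<th><a href=\"%s?sort=%s\">%s</a></th>\n",
          $self_path,
          urlencode ($col_name),
          htmlspecialchars ($col_name));
}
print ("</tr>\n");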
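The same header-row logic can be sketched in Python. This is only an illustration, not code from the recipes distribution: the function name make_header_row, the script path, and the column names in the usage line are all hypothetical.

```python
from html import escape
from urllib.parse import quote_plus


def make_header_row(self_path, col_names):
    """Return an HTML header row whose cells link back to the script."""
    cells = []
    for name in col_names:
        # URL-encode the name for the query string, HTML-escape it for display
        href = "%s?sort=%s" % (self_path, quote_plus(name))
        cells.append('<th><a href="%s">%s</a></th>' % (escape(href), escape(name)))
    return "<tr>\n" + "\n".join(cells) + "\n</tr>"


# Hypothetical script path and column names, for illustration only
print(make_header_row("/mcb/clicksort.py", ["t", "srcuser", "size"]))
```

Note that the column name is encoded twice, in two different ways: once for use inside a URL and once for display as page text.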

The following script, clicksort.php, implements this kind of table display. It checks its environment for a sort parameter that indicates which column to use for sorting. The script then uses the parameter to construct a query of the following form:

SELECT * FROM $tbl_name ORDER BY $sort_col LIMIT 50

(If no sort parameter is present, the script uses ORDER BY 1 to produce a default of sorting by the first column.) The LIMIT clause is simply a precaution to prevent the script from dumping huge amounts of output if the table is large.
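As a rough sketch of that query construction, here is how the same steps might look in Python. The function name build_sort_query is hypothetical, and the table name is assumed to be a trusted value set by the script itself rather than taken from the request.

```python
import re


def build_sort_query(tbl_name, sort_col=None, limit=50):
    """Build the SELECT statement, validating the sort column first."""
    if sort_col is None:
        sort_col = "1"  # ORDER BY 1 sorts by the first column
    elif not re.fullmatch(r"[0-9a-zA-Z_]+", sort_col):
        # allow only alphanumeric and underscore characters in the name
        raise ValueError("Column name %r is invalid" % sort_col)
    return "SELECT * FROM %s ORDER BY %s LIMIT %d" % (tbl_name, sort_col, limit)


print(build_sort_query("mail"))
print(build_sort_query("mail", "size"))
```

re.fullmatch is used rather than a pattern ending in $ because $ in Python also matches just before a trailing newline.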

Here's what the script looks like:

<?php
# clicksort.php - display query result as HTML table with "click to sort"
# column headings

# Rows from the database table are displayed as an HTML table.
# Column headings are presented as hyperlinks that reinvoke the
# script to redisplay the table sorted by the corresponding column.
# The display is limited to 50 rows in case the table is large.

include "Cookbook.php";
include "Cookbook_Webutils.php";

$title = "Table Display with Click-To-Sort Column Headings";
?>
<html>
<head>
<title><?php print ($title); ?></title>
</head>
<body>
<?php
# ----------------------------------------------------------------------

$tbl_name = "mail";   # table to display; change as desired

$conn_id = cookbook_connect ( );

print ("<p>Table: " . htmlspecialchars ($tbl_name) . "</p>\n");
print ("<p>Click on a column name to sort the table by that column.</p>\n");

# Get the name of the column to sort by (optional). If missing, use
# column one. If present, perform simple validation on column name;
# it must consist only of alphanumeric or underscore characters.
# (preg_match() is used because ereg() was removed in PHP 7; the mysql_*
# calls below likewise require the old mysql extension.)

$sort_col = get_param_val ("sort");   # column name to sort by (optional)
if (!isset ($sort_col))
  $sort_col = "1";                    # just sort by first column
else if (!preg_match ('/^[0-9a-zA-Z_]+$/D', $sort_col))
  die (htmlspecialchars ("Column name $sort_col is invalid"));

# Construct query to select records from the named table, optionally sorting
# by a particular column. Limit output to 50 rows to avoid dumping entire
# contents of large tables.

$query = "SELECT * FROM $tbl_name";
$query .= " ORDER BY $sort_col";
$query .= " LIMIT 50";

$result_id = mysql_query ($query, $conn_id);
if (!$result_id)
  die (htmlspecialchars (mysql_error ($conn_id)));

# Display query results as HTML table. Use query metadata to get column
# names, and display names in first row of table as hyperlinks that cause
# the table to be redisplayed, sorted by the corresponding table column.

print ("<table border=\"1\">\n");
$self_path = get_self_path ( );
print ("<tr>\n");
for ($i = 0; $i < mysql_num_fields ($result_id); $i++)
{
  $col_name = mysql_field_name ($result_id, $i);
  printf ("<th><a href=\"%s?sort=%s\">%s</a></th>\n",
          $self_path,
          urlencode ($col_name),
          htmlspecialchars ($col_name));
}
print ("</tr>\n");
while ($row = mysql_fetch_row ($result_id))
{
  print ("<tr>\n");
  for ($i = 0; $i < mysql_num_fields ($result_id); $i++)
  {
    # encode values, using &nbsp; for empty cells
    $val = $row[$i];
    if (isset ($val) && $val != "")
      $val = htmlspecialchars ($val);
    else
      $val = "&nbsp;";
    printf ("<td>%s</td>\n", $val);
  }
  print ("</tr>\n");
}
mysql_free_result ($result_id);
print ("</table>\n");

mysql_close ($conn_id);
?>
</body>
</html>


In Recipe 18.8, I mentioned that placeholder techniques apply only to data values, not to identifiers such as column names. Our sort parameter is a column name, so it cannot be "sanitized" using placeholders or an encoding function. Instead, the script performs a rudimentary test to verify that the name contains only alphanumeric characters and underscores. This simple test works for the majority of column names, though it rejects columns with unusual names. The same kind of test applies also to database, table, index, and alias names.

Another approach to validating the column name is to run a SHOW COLUMNS query to find out which columns the table actually has. If the sort column is not one of them, it is invalid. The clicksort.php script shown here does not do that. However, the recipes distribution contains a Perl counterpart script, clicksort.pl, that does perform this kind of check. Have a look at it if you want more information.
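For readers who want a feel for the SHOW COLUMNS check without opening clicksort.pl, here is a minimal Python sketch. It is not a translation of that script. The function names are hypothetical, and the cursor is assumed to be a DB-API cursor on a MySQL connection.

```python
def get_column_names(cursor, tbl_name):
    """Return the table's column names using SHOW COLUMNS.

    tbl_name must be a trusted value, because it is interpolated
    directly into the statement text.
    """
    cursor.execute("SHOW COLUMNS FROM %s" % tbl_name)
    # The first field of each SHOW COLUMNS row is the column name
    return [row[0] for row in cursor.fetchall()]


def check_sort_col(sort_col, col_names):
    """Return sort_col if the table has such a column, else raise ValueError."""
    if sort_col not in col_names:
        raise ValueError("Column name %r is invalid" % sort_col)
    return sort_col
```

The comparison here is case sensitive, which is stricter than MySQL itself (column names are not case sensitive there), but it errs on the safe side.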

The cells in the rows following the header row contain the data values from the database table, displayed as static text. Empty cells are displayed using &nbsp; so that they display with the same border as nonempty cells (see Recipe 17.4).

18.13 Web Page Access Counting

18.13.1 Problem

You want to count the number of times a page has been accessed. This can be used to display

a hit counter in the page. The same technique can be used to record other types of

information as well, such as the number of times each of a set of banner ads has been served.

18.13.2 Solution

Implement a hit counter, keyed to the page you want to count.

18.13.3 Discussion

This section discusses access counting, using hit counters for the examples. Counters that

display the number of times a web page has been accessed are not such a big thing as they

used to be, presumably because page authors now realize that most visitors don't really care

how popular a page is. Still, the general concept has application in several contexts. For

example, if you're displaying banner ads in your pages (Recipe 17.8), you may be charging

vendors by the number of times you serve their ads. To do so, you need to count the number

of accesses for each one. You can adapt the technique shown in this section for purposes such

as these.

There are several methods for writing a page that displays a count of the number of times it

has been accessed. The most basic is to maintain the count in a file. When the page is

requested, you open the file, read the count, increment it, write the new count back to the

file, and display it in the page. This has the advantage of being easy to implement and the

disadvantage that it requires a counter file for each page that includes a hit count. It also

doesn't work properly if two clients access the page at the same time, unless you implement

some kind of locking protocol in the file access procedure. It's possible to reduce counter file

litter by keeping multiple counts in a single file, but that makes it more difficult to access

particular values within the file, and it doesn't solve the simultaneous-access problem. In fact,

it makes it worse, because a multiple-counter file has a higher likelihood of being accessed by

multiple clients simultaneously than does a single-counter file. So you end up implementing

storage and retrieval methods for processing the file contents, and locking protocols to keep

multiple processes from interfering with each other. Hmm . . . those sound suspiciously like

the problems that MySQL already takes care of! Keeping the counts in the database

centralizes them into a single table, SQL provides the storage and retrieval interface, and the

locking problem goes away because MySQL serializes access to the table so that clients can't

interfere with each other. Furthermore, depending on how you manage the counters, you may

be able to update the counter and retrieve the new sequence value using a single query.
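To see how much bookkeeping the file-based approach needs even for a single counter, consider this minimal Python sketch. It assumes a Unix-like system (fcntl.flock is not available on Windows), and the function name and one-number-per-file layout are my own choices for illustration.

```python
import fcntl
import os


def bump_file_counter(counter_file):
    """Increment the count stored in counter_file and return the new value."""
    # Open for reading and writing, creating the file if it doesn't exist
    fd = os.open(counter_file, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # block until we hold an exclusive lock
        text = f.read().strip()
        count = int(text) + 1 if text else 1
        f.seek(0)
        f.truncate()
        f.write("%d\n" % count)
        f.flush()
        # the lock is released when the file is closed at the end of the block
    return count
```

Without the flock call, two processes could both read the same old value and write back the same new one, losing a hit. That is exactly the problem the database takes off your hands.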

I'll assume that you want to log hits for more than one page. To do that, create a table that

has one row for each page to be counted. This means it's necessary to have a unique identifier

for each page, so that counters for different pages don't get mixed up. You could assign

identifiers somehow, but it's easier just to use the page's path within your web tree. Web

programming languages typically make this path easy to obtain; in fact, we've already

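discussed how to do so in Recipe 18.2. On that basis, you can create a hitcount table as follows:

CREATE TABLE hitcount
(
  path  VARCHAR(255) BINARY NOT NULL,
  hits  BIGINT UNSIGNED NOT NULL,
  PRIMARY KEY (path)
);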










This table definition involves some assumptions:

• The BINARY keyword in the path column definition makes the column values case sensitive. That's appropriate for a web platform where pathnames are case sensitive, such as most versions of Unix. For Windows or for HFS+ filesystems under Mac OS X, filenames are not case sensitive, so you'd omit BINARY from the definition.

• The path column has a maximum length of 255 characters, which limits you to page paths no longer than that. If you expect to require longer values, use a BLOB or TEXT type rather than VARCHAR. But in this case, you're still limited to indexing a maximum of the leftmost 255 characters of the column values, so you'd use a nonunique index rather than a PRIMARY KEY.

• The mechanism works for a single document tree, such as when your web server is used to serve pages for a single domain. If you institute a hit count mechanism on a host that serves multiple virtual domains, you may want to add a column for the domain name. This value is available in the SERVER_NAME value that Apache puts into your script's environment. In this case, the hitcount table index would include both the hostname and the page path.

The general logic involved in hit counter maintenance is to increment the hits column of the record for a page, then retrieve the updated counter value. One way to do that is by using the following two queries:

UPDATE hitcount SET hits = hits + 1 WHERE path = 'page path';
SELECT hits FROM hitcount WHERE path = 'page path';



Unfortunately, if you use that approach, you may often not get the correct value. If several clients request the same page simultaneously, several UPDATE statements may be issued in close temporal proximity. The following SELECT statements then wouldn't necessarily get the corresponding hits value. This can be avoided by using a transaction or by locking the hitcount table, but that slows down hit counting. MySQL provides a solution that allows each client to retrieve its own count, no matter how many updates happen at the same time:

UPDATE hitcount SET hits = LAST_INSERT_ID(hits+1) WHERE path = 'page path';
SELECT LAST_INSERT_ID( );



The basis for updating the count here is

LAST_INSERT_ID(expr), which was

discussed in Recipe 11.17. The UPDATE statement finds the relevant record and increments

its counter value. The use of LAST_INSERT_ID(hits+1) rather than just hits+1

tells MySQL to treat the value as though it were an AUTO_INCREMENT value. This allows

it to be retrieved in the second query using LAST_INSERT_ID( ). The

LAST_INSERT_ID( ) function returns a connection-specific value, so you always get

back the value corresponding to the UPDATE issued on the same connection. In addition,

the SELECT statement doesn't need to query a table, so it's very fast. A further efficiency

may be gained by eliminating the SELECT query altogether, which is possible if your API

provides a means for direct retrieval of the most recent sequence number. For example, in

Perl, you can update the count and get the new value with a single query like this:

$dbh->do (

"UPDATE hitcount SET hits = LAST_INSERT_ID(hits+1) WHERE path = ?",

undef, $page_path);

$hits = $dbh->{mysql_insertid};
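If the difference between the two approaches seems abstract, the following toy model in Python may help. It is not MySQL code, just a simulation of two connections whose statements interleave. The class names and the page path are invented for the example.

```python
class ToyServer:
    """Holds the shared hitcount table: path -> hits."""
    def __init__(self):
        self.hits = {}


class ToyConnection:
    """One client connection with its own LAST_INSERT_ID() value."""
    def __init__(self, server):
        self.server = server
        self.last_insert_id = 0

    def update_plain(self, path):
        # UPDATE hitcount SET hits = hits + 1 WHERE path = ...
        self.server.hits[path] += 1

    def select_hits(self, path):
        # SELECT hits FROM hitcount WHERE path = ...
        return self.server.hits[path]

    def update_with_last_insert_id(self, path):
        # UPDATE hitcount SET hits = LAST_INSERT_ID(hits+1) WHERE path = ...
        self.server.hits[path] += 1
        self.last_insert_id = self.server.hits[path]


server = ToyServer()
server.hits["/index.html"] = 0
a, b = ToyConnection(server), ToyConnection(server)

# Two-statement approach: both UPDATEs run before either SELECT
a.update_plain("/index.html")
b.update_plain("/index.html")
print(a.select_hits("/index.html"), b.select_hits("/index.html"))   # 2 2

# LAST_INSERT_ID(expr) approach with the same interleaving
a.update_with_last_insert_id("/index.html")
b.update_with_last_insert_id("/index.html")
print(a.last_insert_id, b.last_insert_id)   # 3 4
```

In the first case both clients report the same count even though they represent different hits. In the second, each connection remembers the value produced by its own UPDATE.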

However, there's still a problem here. What if the page isn't listed in the hitcount table? In that case, the UPDATE statement finds no record to modify and you get a counter value of zero. You could deal with this problem by requiring that any page that includes a hit counter must be registered in the hitcount table before the page goes online. A friendlier alternative approach is to create a counter record automatically for any page that is found not to have one. That way, page designers can put counters in pages with no advance preparation. To make the counter mechanism easier to use, put the code in a utility function that takes a page path as its argument, handles the missing-record logic internally, and returns the count. Conceptually, the function acts like this:

update the counter
if the update modifies a row
  retrieve the new counter value
else
  insert a record for the page with the count set to 1

The first time you request a count for a page, the update modifies no rows because the page

won't be listed in the table yet. The function creates a new counter and returns a value of one.

For each request thereafter, the update modifies the existing record for the page and the

function returns successive access counts.

In Perl, a hit-counting function might look like this, where the arguments are a database

handle and the page path:

sub get_hit_count
{
my ($dbh, $page_path) = @_;

  my $rows = $dbh->do (
        "UPDATE hitcount SET hits = LAST_INSERT_ID(hits+1) WHERE path = ?",
        undef, $page_path);
  return ($dbh->{mysql_insertid}) if $rows > 0;   # counter was incremented

  # If the page path wasn't listed in the table, register it and
  # initialize the count to one. Use IGNORE in case another client
  # tries same thing at the same time.
  $dbh->do ("INSERT IGNORE INTO hitcount (path,hits) VALUES(?,1)",
        undef, $page_path);
  return (1);
}


The CGI.pm script_name( ) function returns the local part of the URL, so you use get_hit_count( ) like this:

my $hits = get_hit_count ($dbh, script_name ( ));

print p ("This page has been accessed $hits times.");

The counting mechanism potentially involves multiple queries, and we haven't used a

transactional approach, so the algorithm still has a race condition that can occur for the first

access to a page. If multiple clients simultaneously request a page that is not yet listed in the

hitcount table, each of them may issue the UPDATE query, find the page missing, and

as a result issue the INSERT query to register the page and initialize the counter. The

algorithm uses INSERT IGNORE to suppress errors if simultaneous invocations of the

script attempt to initialize the counter for the same page, but the result is that they'll all get a

count of one. Is it worth trying to fix this problem by using transactions or table locking? For

hit counting, I'd say no. The slight loss of accuracy doesn't warrant the additional processing

overhead. For a different application, the priority may be accuracy over efficiency, in which

case you would opt for transactions to avoid losing a count.
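For comparison, here is roughly what the transactional variant looks like. To keep the sketch runnable anywhere, it uses Python's built-in sqlite3 module as a stand-in for MySQL, so the locking statement (BEGIN IMMEDIATE) is SQLite-specific. The function name and the simplified table definition are assumptions made for the example.

```python
import sqlite3


def get_hit_count_tx(conn, page_path):
    """Increment and return the counter for page_path inside one transaction."""
    cur = conn.cursor()
    # BEGIN IMMEDIATE takes SQLite's write lock up front, so no other writer
    # can slip in between the UPDATE, the INSERT, and the SELECT
    cur.execute("BEGIN IMMEDIATE")
    try:
        cur.execute("UPDATE hitcount SET hits = hits + 1 WHERE path = ?",
                    (page_path,))
        if cur.rowcount == 0:
            # no counter yet for this page: register it with a count of one
            cur.execute("INSERT INTO hitcount (path, hits) VALUES (?, 1)",
                        (page_path,))
        cur.execute("SELECT hits FROM hitcount WHERE path = ?", (page_path,))
        count = cur.fetchone()[0]
        cur.execute("COMMIT")
    except Exception:
        cur.execute("ROLLBACK")
        raise
    return count


# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/COMMIT statements above control the transaction
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE hitcount (path TEXT PRIMARY KEY, hits INTEGER NOT NULL)")
print(get_hit_count_tx(conn, "/index.html"))
print(get_hit_count_tx(conn, "/index.html"))
```

Every hit now holds a write lock for the duration of three statements, which is the extra cost weighed against accuracy above.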

A PHP version of the hit counter looks like this:

function get_hit_count ($conn_id, $page_path)
{
  $query = sprintf ("UPDATE hitcount SET hits = LAST_INSERT_ID(hits+1)
                    WHERE path = %s", sql_quote ($page_path));
  if (mysql_query ($query, $conn_id) && mysql_affected_rows ($conn_id) > 0)
    return (mysql_insert_id ($conn_id));

  # If the page path wasn't listed in the table, register it and
  # initialize the count to one. Use IGNORE in case another client
  # tries same thing at the same time.
  $query = sprintf ("INSERT IGNORE INTO hitcount (path,hits)
                    VALUES(%s,1)", sql_quote ($page_path));
  mysql_query ($query, $conn_id);
  return (1);
}


To use it, call the get_self_path( ) function that returns the script pathname (see Recipe 18.2):

$self_path = get_self_path ( );
$hits = get_hit_count ($conn_id, $self_path);
print ("<p>This page has been accessed $hits times.</p>\n");


In Python, the function looks like this:

def get_hit_count (conn, page_path):
  cursor = conn.cursor ( )
  cursor.execute ("""
        UPDATE hitcount SET hits = LAST_INSERT_ID(hits+1)
        WHERE path = %s
      """, (page_path,))
  if cursor.rowcount > 0:     # a counter was incremented
    # insert_id() is called on the connection here; older MySQLdb
    # releases also offered it as a cursor method
    count = conn.insert_id ( )
    cursor.close ( )
    return (count)

  # If the page path isn't listed in the table, register it and
  # initialize the count to one. Use IGNORE in case another client
  # tries same thing at the same time.
  cursor.execute ("""
        INSERT IGNORE INTO hitcount (path,hits) VALUES(%s,1)
      """, (page_path,))
  cursor.close ( )
  return (1)

It is used as follows:

import os

self_path = os.environ["SCRIPT_NAME"]
count = get_hit_count (conn, self_path)
print ("<p>This page has been accessed %d times.</p>" % count)


The recipes distribution includes demonstration hit counter scripts for Perl, PHP, and Python under the apache directory. A JSP version is under the tomcat directory. Install any of these in your web tree, invoke it a few times, and watch the count increase. (First you'll need to create the hitcount table, as well as the hitlog table described in Recipe 18.14. Both tables can be created from the hits.sql script provided in the tables directory.)

18.14 Web Page Access Logging

18.14.1 Problem

You want to know more about a page than just the number of times it's been accessed, such

as the time of access and the host from which the request originated.

18.14.2 Solution

Maintain a hit log rather than a simple counter.

18.14.3 Discussion


The hitcount table records only the count for each page registered in it. If you want to record other information about page access, use a different approach. Suppose you want to track the client host and time of access for each request. In this case, you need a log for each page rather than just a count. But you can still maintain the counts by using a multiple-column index that combines the page path and an AUTO_INCREMENT sequence column:

CREATE TABLE hitlog
(
  path  VARCHAR(255) BINARY NOT NULL,
  hits  BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  t     TIMESTAMP,
  host  VARCHAR(64),
  PRIMARY KEY (path,hits)
);


To insert new records, use this query:

INSERT INTO hitlog (path, host) VALUES(path_val,host_val);

For example, in a JSP page, hits can be logged like this:

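<%-- assumes a data source named conn was set up earlier in the page --%>
<c:set var="host"><%= request.getRemoteHost ( ) %></c:set>
<c:if test="${empty host}">
  <c:set var="host"><%= request.getRemoteAddr ( ) %></c:set>
</c:if>
<sql:update dataSource="${conn}">
  INSERT INTO hitlog (path, host) VALUES(?,?)
  <sql:param><%= request.getRequestURI ( ) %></sql:param>
  <sql:param value="${host}" />
</sql:update>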


The hitlog table has the following useful properties:

• Access times are recorded automatically in the TIMESTAMP column t when you insert new records.

• By linking the path column to an AUTO_INCREMENT column hits, the counter values for a given page path increment automatically whenever you insert a new record for that path. The counters are maintained separately for each distinct path value. (For more information on how multiple-column sequences work, see Recipe 11.15.)

• There's no need to check whether the counter for a page already exists, because you insert a new row each time you record a hit for a page, not just for the first hit.

• If you want to determine the current counters for each page, select the record for each distinct path value that has the largest hits value:

SELECT path, MAX(hits) FROM hitlog GROUP BY path;
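To make the per-path sequence behavior concrete, here is a small pure-Python model of it. This is not how MySQL implements multiple-column sequences. It only mimics the visible effect, and the class name, paths, and hostnames are invented for the example.

```python
from collections import defaultdict
import time


class ToyHitLog:
    """Models a table whose AUTO_INCREMENT column restarts for each path."""
    def __init__(self):
        self.rows = []                      # (path, hits, t, host) tuples
        self._max_hits = defaultdict(int)   # largest hits value per path

    def insert(self, path, host):
        # The next sequence value is one more than the current maximum
        # for this path, independent of all other paths
        self._max_hits[path] += 1
        self.rows.append((path, self._max_hits[path], time.time(), host))

    def current_counters(self):
        # Equivalent of: SELECT path, MAX(hits) FROM hitlog GROUP BY path
        return dict(self._max_hits)


log = ToyHitLog()
log.insert("/index.html", "host1.example.com")
log.insert("/about.html", "host2.example.com")
log.insert("/index.html", "host2.example.com")
print(log.current_counters())
```

Each row keeps its own timestamp and host, yet the largest hits value for a path still serves as that page's counter.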

18.15 Using MySQL for Apache Logging

18.15.1 Problem

You don't want to use MySQL to log accesses for just a few pages, as shown in Recipe 18.14.

You want to log all page accesses, and you don't want to have to put logging actions in each

page explicitly.

18.15.2 Solution

Tell Apache to log page accesses to MySQL.

18.15.3 Discussion

The uses for MySQL in a web context aren't limited just to page generation and processing.

You can use it to help you run the web server itself. For example, most Apache servers are set

up to log a record of web requests to a file. But it's also possible to send log records to a

program instead, from which you can write the records wherever you like—such as to a

database. With log records in a database rather than a flat file, the log becomes more highly

structured and you can apply SQL analysis techniques to it. Log file analysis tools may be

written to provide some flexibility, but often this is a matter of deciding which summaries to

display and which to suppress. It's more difficult to tell a tool to display information it wasn't

built to provide. With log entries in a table, you gain additional flexibility. Want to see a

particular report? Write the SQL statements that produce it. To display the report in a specific

format, issue the queries from within an API and take advantage of your language's output

production capabilities.

By handling log entry generation and storage using separate processes, you gain some

additional flexibility. Some of the possibilities are to send logs from multiple web servers to

the same MySQL server, or to send different logs generated by a given web server to different

MySQL servers.

This section shows how to set up web request logging from Apache into MySQL and

demonstrates some summary queries you may find useful.

18.15.4 Setting Up Database Logging

Apache logging is controlled by directives in the httpd.conf configuration file. For example, a typical logging setup uses LogFormat and CustomLog directives that look like this:

LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog /usr/local/apache/logs/access_log common

The LogFormat line defines a format for log records and gives it the nickname common. The CustomLog directive indicates that lines should be written in that format to the access_log file in Apache's logs directory. To set up logging to MySQL instead, use the following procedure:[4]

[4] If you're using logging directives such as TransferLog rather than LogFormat and CustomLog, you'll need to adapt the instructions in this section.

1. Decide what values you want to record and set up a table that contains the appropriate columns.

2. Write a program to read log lines from Apache and write them into the database.

3. Set up a LogFormat line that defines how to write log lines in the format the program expects, and a CustomLog directive that tells Apache to write to the program rather than to a file.
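The second step of the procedure, a program that reads log lines, mostly comes down to splitting each line into fields. As a hedged sketch of that part only, the following Python function parses a line in the common format shown earlier. The function name and the sample line are illustrative, and the pattern is deliberately simple: it does not handle escaped quotes inside the request string.

```python
import re

# One group per directive in: %h %l %u %t "%r" %>s %b
COMMON_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_common_line(line):
    """Return a dict of fields from one common-format line, or None."""
    m = COMMON_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # %b is written as "-" when no bytes were sent
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    fields["status"] = int(fields["status"])
    return fields


# Illustrative sample line; a logging program would read lines from its
# standard input and INSERT each parsed record instead of printing it
sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
print(parse_common_line(sample))
```

In practice it is easier to define a custom LogFormat with an unambiguous delimiter, which is what the third step of the procedure is for, so that the program doesn't need a pattern like this at all.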

Suppose you want to record the date and time of each request, the host that issued the

request, the request method and URL pathname, the status code, the number of bytes
