mod_log_sql 2.0 Documentation

Home » Projects » Apache » Mod_log_sql » Docs-2.0 » mod_log_sql 2.0 Documentation

Prev: Using the module in an ISP environment --- Up: Advanced logging scenarios --- Next: Using the same database for production and test

Logging many-to-one data in separate tables

A given HTTP request can have a one-to-many relationship with certain kinds of data. For example, a single HTTP request can have 4 cookies, 3 headers and 5 "mod_gzip" notes associated with it. mod_log_sql is capable of logging these relationships due to the elegance of SQL relational data.

You already have a single table containing access requests. One of the columns in that table is 'id' which is intended to contain the unique request ID supplied by the standard Apache module mod_unique_id -- all you need to do is compile in that module and employ the LogSQLTransferLogFormat character 'I'. Thereafter, each request gets a unique ID that can be thought of as a primary key within the database, useful for joining multiple tables. So let's envision several new tables: a notes table, a cookies table, and a table for inbound and outbound headers.

Table 1: <tblAcc>access_log
id remote_host request_uri time_stamp status bytes_sent
PPIDskBRH30AAGPtAsg /mod_log_sql/index.html 1022493617 200 2215
Table 2: <tblNotes>notes_log
id item val
PPIDskBRH30AAGPtAsg mod_gzip_result OK
PPIDskBRH30AAGPtAsg mod_gzip_compression_ratio 69
Table 3: <tblHdr>headers_log
id item val
PPIDskBRH30AAGPtAsg Content-Type text/html
PPIDskBRH30AAGPtAsg Accept-Encoding gzip, deflate
PPIDskBRH30AAGPtAsg Expires Tue, 28 May 2002 10:00:18 GMT
PPIDskBRH30AAGPtAsg Cache-Control max-age=86400

We have a certain request, and its unique ID is "PPIDskBRH30AAGPtAsg". Within each separate table will be multiple entries with that request ID: several cookie entries, several header entries, etc. As you can see in tables [tblAcc], [tblNotes] and [tblHdr], you have a one-to-many relationship for request PPIDskBRH30AAGPtAsg: that one access has two associated notes and four associated headers. You can extract this data easily using the power of SQL's "select" statement and table joins. To see the notes associated with a particular request:

SELECT a.remote_host, a.request_uri, n.item, n.val
FROM access_log a JOIN notes_log n ON
Table 4: access_log joined to notes_log
remote_host request_uri item val /mod_log_sql/index.html mod_gzip_result OK /mod_log_sql/index.html mod_gzip_compression_ratio 69

Naturally you can craft similar statements for the outboud headers, inbound headers and cookies, all of which can live in separate tables. Your statements are limited in power only by your skill with SQL.

In order to use this capability of mod_log_sql, you must do several things.

  • Compile mod_unique_id into Apache (statically or as a DSO). mod_log_sql employs the unique request ID that mod_unique_id provides in order to key between the separate tables. You can still log the data without mod_unqiue_id, but it will be completely uncorrelated and you will have no way to discern any meaning.

  • Create the appropriate tables. This will be done for you if you permit mod_log_sql to create its own tables using LogSQLCreateTables On, or if you use the enclosed "create_tables.sql" script.

  • Create a SQL index on the "id" column. Without this index, table joins will be deathly slow. I recommend you consult the MySQL documentation on the proper way to create a column index if you are not familiar with this operation.

  • Within each appropriate VirtualHost stanza, use the LogSQLWhich* and LogSQL*LogTable directives to tell the module what and where to log the data. In the following example, I have overridden the name for the notes table whereas I have left the other table names at their defaults. I have then specified the cookies, headers and notes that interest me. (And as you can see, these directives do not require me to add any characters to LogSQLTransferLogTable.)

     LogSQLNotesLogTable notestable
     LogSQLWhichCookies bluecookie redcookie greencookie
     LogSQLWhichNotes mod_gzip_result mod_gzip_compression_ratio
     LogSQLWhichHeadersOut Expires Content-Type Cache-Control
     LogSQLWhichHeadersIn User-Agent Accept-Encoding Host