mod_log_sql 2.0 Documentation

Home » Projects » Apache » Mod_log_sql » Docs-2.0 » mod_log_sql 2.0 Documentation

Up: TOC --- Next: Installation

Introduction

Summary

This Apache module will permit you to log to a SQL database; it can log each access request as well as data associated with each request: cookies, notes, and inbound/outbound headers. Unlike logging to a flat text file -- which is standard in Apache -- a SQL-based log exhibits tremendous flexibility and power of data extraction. (See FAQ entry for further discussion and examples of the advantages to SQL.)

This module can either replace or happily coexist with mod_log_config, Apache's text file logging facility. In addition to being more configurable than the standard module, mod_log_sql is much more flexible.

Approach

This project was formerly known as "mod_log_mysql." It was renamed "mod_log_sql" in order to reflect the project goal of database in-specificity. The module currently supports MySQL, but support for other database back-ends is underway.

In order to save speed and overhead, links are kept alive in between queries. This module uses one dedicated SQL link per httpd child, opened by each child process when it is born. Among other things, this means that this module supports logging into only one MySQL server, and for now, also, only one SQL database. But that's a small tradeoff compared to the blinding speed of this module. Error reporting is robust throughout the module and will inform the administrator of database issues in the Apache ErrorLog for the server/virtual server.

Virtual hosts are supported in the same manner they are in the regular logging modules. The administrator defines some basic 'global' directives in the main server config, then defines more specific 'local' directives inside each VirtualHost stanza.

A robust "preserve" capability has now been implemented. This permits the module to preserve any failed INSERT commands to a local file on its machine. In any situation that the database is unavailable -- e.g. the network fails or the database host is rebooted -- mod_log_sql will note this in the error log and begin appending its log entries to the preserve file (which is created with the user and group ID of the running Apache process, e.g. "nobody/nobody" on many Linux installations). When database availability returns, mod_log_sql seamlessly resumes logging to it. When convenient for the sysadmin, he/she can easily import the preserve file into the database because it is simply a series of SQL insert statements.

What gets logged by default?

All the data that would be contained in the "Combined Log Format" is logged by default, plus a little extra. Your best bet is to begin by accepting this default, then later customize the log configuration based on your needs. The documentation of the run-time directives includes a full explanation of what you can log, including examples -- see section Configuration Directive Reference .

Miscellaneous Notes

  • Note which directives go in the 'main server config' and which directives apply to the 'virtual host config'. This is made clear in the directive documentation.

  • The 'time_stamp' field is stored in an UNSIGNED INTEGER format, in the standard unix "seconds since the epoch" format. This is superior to storing the access time as a string due to size requirements: an UNSIGNED INT requires 4 bytes, whereas an Apache date string (e.g. "18/Nov/2001:13:59:52 -0800") requires 26 bytes: those extra 22 bytes become significant when multiplied by thousands of accesses on a busy server. Besides, an INT type is far more flexible for comparisons, etc.

    In MySQL 3.21 and above you can easily convert this to a human readable format using from_unixtime(), e.g.:

    SELECT remote_host,request_uri,from_unixtime(time_stamp)
    FROM access_log;

    The enclosed perl program "make_combined_log.pl" extracts your access log in a format that is completely compatible with the Combined Log Format. You can then feed this to your favorite web log analysis tool.

  • The table's string values can be CHAR or VARCHAR, at a length of your choice. VARCHAR is superior because it truncates long strings; CHAR types are fixed-length and will be padded with spaces, resulting in waste. Just like the time_stamp issue described above, that kind of space waste multiplies over thousands of records.

  • Be careful not to go overboard setting fields to NOT NULL. If a field is marked NOT NULL then it must contain data in the INSERT statement, or the INSERT will fail. These mysterious failures can be quite frustrating and difficult to debug.

  • When Apache logs a numeric field, it uses a '-' character to mean "not applicable," e.g. the number of bytes returned on a 304 (unchanged) request. Since '-' is an illegal character in an SQL numeric field, such fields are assigned the value 0 instead of '-' which, of course, makes perfect sense anyway.

Author / Maintainer

The actual logging code was taken from the already existing flat file text modules, so all that credit goes to the Apache Software Foundation.

The MySQL routines and directives were added by Zeev Suraski <bourbon@netvision.net.il>.

All changes from 1.06+ and the new documentation were added by Chris Powell . It seems that the module had fallen into the "un-maintained" category -- it had not been updated since 1998 -- so Chris adopted it as the new maintainer.

In December of 2003, Edward Rudd porting the module to Apache 2.0, cleaning up the code, converting the documentation to DocBook, optimizing the main logging loop, and added the much anticipated database abstraction layer.

As of February 2004, Chris Powell handed over maintenance of the module over to Edward Rudd. So you should contact Edward Rudd about the module from now on.

Mailing Lists

A general discussion and support mailing list is provided for mod_log_sq at lists.outoforder.cc. To subscribe to the mailing list send a blank e-mail to mod_log_sql-subscribe@lists.outoforder.cc. The list archives can be accessed via Gmane.org's mailng list gateway via any new reader news://news.gmane.org/gmane.comp.apache.mod-log-sql , or via a web browser at http://news.gmane.org/gmane.comp.apache.mod-log-sql .