To begin with, let's get it out of the way: logging to a database is not a panacea. But while there are complexities with this solution, the benefit can be substantial for certain classes of administrator or people with advanced requirements:
from acc_log_tbl where status=404 order by time_stamp;
acc_log_tbl where request_uri like '%mod_log_sql%' group by request_uri order
by howmany desc;
request_uri='/mod_log_sql/' group by referer order by num desc;
MySQL is a robust, free, and very powerful production-quality database engine. It is well supported and comes with detailed documentation. Many 3rd-party software pacakges (e.g. Slashcode, the engine that powers Slashdot) run exclusively with MySQL. In other words, you will belong to a very robust and well-supported community by choosing MySQL.
That being said, there are alternatives. PostgreSQL is probably MySQL's leading "competitor" in the free database world. There is also an excellent module available for Apache to permit logging to a PostgreSQL database, called pgLOGd (http://www.digitalstratum.com/pglogd/).
By all accounts it is. It is known to work without a problem on many-thousands-of-hits-per-day webservers. Does that mean it is 100% bug free? Well, no software is. But it is well-tested and believed to be fully compatible with production environments. (The usual disclaimers apply. This software is provided without warranty of any kind.)
Good question! It would be great to find out! If you are a production-level mod_log_sql user, please contact the maintainer, Chris Powell (email@example.com) so that you can be mentioned here.
There are circumstances when that would be quite unwise - for example, if Apache could not reach the MySQL server for some reason and needed to log that fact. Without a text-based error log you'd never know anything was wrong, because Apache would be trying to log a database connection error to the database... you get the point.
Error logs are usually not very high-traffic and are really best left as text files on a web server machine.
As of this writing, no. The Apache Group significantly altered the module API with the release of Apache 2.0. All modules written for 1.3, including mod_log_sql, will not work with 2.0.
mod_log_sql will eventually be ported to Apache 2.x, but not immediately. It is going to take some time, and there are other features that have higher priority. Please sign up for the announcements list (on the main website) or monitor the website for updates to learn when the port (and other releases) are available.
<OPINION>If you're a *NIX user, stick with Apache 1.3.x for now. Major modules like mod_ssl and PHP are not even ready for 2.0 yet, and the main benefits in 2.0 are for Win32 users anyway. Apache 1.3.x is rock-stable and performs equally well on *NIX as 2.0.</OPINION>
It depends! This is not determined by mod_log_sql. mod_log_sql relies on a connection command that is supplied in the MySQL API, and that command is somewhat intelligent. How it works:
Please contact the maintainer (firstname.lastname@example.org)! Your comments, suggestions, bugfixes, bug catches, and usage testimonials are always welcome. As free software, mod_log_sql is intended to be a community effort - any code contributions or other ideas will be fully and openly credited, of course.
This occurs if you compiled PHP with MySQL database support. PHP utilizes its internal, bundled MySQL libraries by default. These conflict with the ``real'' MySQL libraries linked by mod_log_sql, causing the segmentation fault.
The solution is to configure PHP to link against the real MySQL libraries and recompile mod_php. Apache will run properly once the modules are all using the same version of the MySQL libraries.
If you do not see any entries in the access_log, then something is preventing the inserts from happening. This could be caused by several things:
First examine the MySQL log that you established in step 6 of section 3.1. Ensure that the INSERT statements are not being rejected because of a malformed table name or other typographical error. By enabling that log, you instructed MySQL to log every connection and command it receives - if you see no INSERT attempts in the log, the module isn't successfully connecting to the database. If you see nothing at all in the log - not even a record of your administrative connection attempts, then you did not enable the log correctly. If you do see INSERT attempts but they are failing, the log should tell you why.
Second, confirm that your LOGSQL* directives are all correct.
Third, examine the Apache error logs for messages from mod_log_sql; the module will offer hints as to why it cannot connect, etc.
The next thing to do is recompile the module with debugging output activated. change the "#undef DEBUG" on line 8 of mod_log_sql.c to "#define DEBUG" and recompile/reinstall. The module will now output copious notes about what it is doing, and this will help you (and the maintainer) solve the problem. In order to see the debugging messages, ensure that you make them visible using the LOGLEVEL directive in the main server config as well as in each VIRTUALHOST config:
At a minimum, LOGSQLDATABASE and LOGSQLLOGININFO must be defined in order for the module to be able to establish a database link. If these are not defined or are incomplete you will receive this error message.
The rule of thumb: if you have n webservers each configured to support y MAXCLIENTS, then your database must be able to handle simultenous connections in the worst case. Certainly you must use common sense, consider reasonable traffic expectations and structure things accordingly.
Tweaking my.cnf to scale to high connection loads is imperative. But if hardware limitations prevent your MySQL server from gracefully handling the number of incoming connections, it would be beneficial to upgrade the memory or CPU on that server in order to handle the load.
Jeremy Zawodny, a highly respected MySQL user and contributor to Linux Magazine, has this very helpful and highly appropriate article on tuning MySQL: http://jeremy.zawodny.com/blog/archives/000173.html
Please remember that mod_log_sql's overriding principle is performance - that is what the target audience demands and expects. Other database logging solutions do not open and maintain many database connections, but their performance suffers drastically. For example, pgLOGd funnels all log connections through a separate daemon that connects to the database, but that bottlenecks the entire process. mod_log_sql achieves performance numbers an order of magnitude greater than the alternatives because it dispenses with the overhead associated with rapid connection cycling, and it doesn't attempt to shoehorn all the database traffic through a single extra daemon or proxy process.
This message may appear every now and then in your Apache error log, especially on very lightly loaded servers. This doesn't mean that anything is necessarily wrong. Within each httpd child process, mod_log_sql will open (and keep open) a connection to the MySQL server. MySQL, however, will close connections that haven't been used in a while; the default timeout is 8 hours. When this occurs, mod_log_sql will notice and re-open the connection. That event is what is being logged, and looks like this:
API said: error 2013, Lost connection to MySQL server during query
[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: reconnect successful
[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: second attempt successful
mod_log_sql scales to very high loads. Apache 1.3.22 + mod_log_sql was benchmarked using the "ab" (Apache Bench) program that comes with the Apache distribution; here are the results.
Ten total ab runs were conducted: five with MySQL logging enabled, and five with all MySQL directives commented out of httpd.conf. Then each five were averaged. The results:
If you run this benchmark yourself, take note of three things:
mysql> optimize table access_log;
Short answer: you shouldn't be worried.
Long answer: you might be evaluating at the output of ``ps -aufxw'' and becoming alarmed at all the 7MB httpd processes or 22MB mysqld children that you see. Don't be alarmed. It's true that mod_log_sql opens and holds open many MySQL connections: each httpd child maintains one open database connection (and holds it open for performance reasons). Four webservers, each running 20 Apache children, will hold open 80 MySQL connections, which means that your MySQL server needs to handle 80 simultaneous connections. In truth, your MySQL server needs to handle far more than that if traffic to your website spikes and the Apache webservers spawn off an additional 30 children each...
Fortunately the cost reported by 'ps -aufxw' is deceptive. This is due to an OS memory-management feature called ``copy-on-write.'' When you have a number of identical child processes (e.g. Apache, MySQL), it would appear in ``ps'' as though each one occupies a great deal of RAM - as much as 7MB per httpd child! In actuality each additional child only occupies a small bit of extra memory - most of the memory pages are common to each child and therefore shared in a ``read-only'' fashion. The OS can get away with this because the majority of memory pages for one child are identical across all children. Instead of thinking of each child as a rubber stamp of the others, think of each child as a basket of links to a common memory area.
A memory page is only duplicated when it needs to be written to, hence ``copy-on-write.'' The result is efficiency and decreased memory consumption. ``ps'' may report 7MB per child, but it might really only ``cost'' 900K of extra memory to add one more child. It is not correct to assume that 20 Apache children with a VSZ of 7MB each equals of memory consumption - the real answer is much, much lower. The same ``copy-on-write'' rules apply to all your MySQL children: 40 mysqld children @ 22MB each do not occupy 880MB of RAM.
The bottom line: although there is a cost to spawn extra httpd or mysqld children, that cost is not as great as ``ps'' would lead you to believe.
If you have exhausted all the tuning possibilities on your existing server, it is probably time you evaluated the benefits of clustering two or more webservers together in a load-balanced fashion. In fact, users of such a setup are mod_log_sql's target audience!
There are several.
If after understanding these problems you still wish to enable delayed inserts, section 3.5.4 discusses how.
Proper usage of the Apache runtime SERVERNAME directive and the directive USECANONICALNAME ON (or DNS) are necessary to prevent this problem. ``On'' is the default for USECANONICALNAME, and specifies that self-referential URLs are generated from the SERVERNAME part of your VirtualHost:
With UseCanonicalName on (and in all versions prior to 1.3) Apache will use the ServerName and Port directives to construct the canonical name for the server. With UseCanonicalName off Apache will form self-referential URLs using the hostname and port supplied by the client if any are supplied (otherwise it will use the canonical name, as defined above). [From the Apache documentation http://httpd.apache.org/docs/mod/core.html#usecanonicalname]The module inherits Apache's ``knowledge'' about the server name being accessed. As long as those two directives are properly configured, mod_log_sql will log to only one table per virtual host while using LOGSQLMASSVIRTUALHOSTING.
mod_log_sql would be virtually useless if there weren't a way for you to extract the data from your database in a somewhat meaningful fashion. To that end there's a Perl script enclosed with the distribution. That script (make_combined_log.pl) is designed to extract N-many days worth of access logs and provide them in a Combined Log Format output. You can use this very tool right in /etc/crontab to extract logs on a regular basis so that your favorite web analysis tool can read them. Or you can examine the Perl code to construct your own custom tool.
For example, let's say that you want your web statistics updated once per day in the wee hours of the morning. A good way to accomplish that could be the following entries in /etc/crontab:
05 04 * * * root make_combined_log.pl 1 www.grubbybaby.com > /var/log/temp01
# Run webalizer on httpd log
30 04 * * * root webalizer -c /etc/webalizer.conf; rm -f /var/log/temp01
/usr/local/sbin/make_combined_log.pl 1 www.yourdomain.com > /var/log/httpd/templog
/usr/local/bin/webalizer -q -c /etc/webalizer.conf
rm -f /var/log/httpd/templog
A number of people like to log mod_usertrack cookies in their Apache TransferLog to aid in understanding their visitors' clickstreams. This is accomplished, for example, with a statement as follows:
First make sure you have a column called "cookie" in the MySQL database to hold the cookies, which can be done as follows if you already have a working database:
Recap: the 'c' character activates cookie logging, and the LOGSQLWHICHCOOKIE directive chooses which cookie to log.
Perform some hits on your server and run a select:
As of version 1.17, you have a choice in how you want cookie logging handled.
If you are interested in logging only one cookie per request, follow the instructions in section 4.4.3 above. That cookie will be logged to a column in the regular access_log table, and the actual cookie you want to log is specified with LOGSQLWHICHCOOKIE. Don't forget to specify the 'c' character in LOGSQLTRANSFERLOGFORMAT.
If, however, you need to log multiple cookies per request, you must employ the LOGSQLWHICHCOOKIES (note the plural) directive. The cookies you specify will be logged to a separate table (as discussed in section 3.5.2), and entries in that table will be linked to the regular access_log entries via the unique ID that is supplied by mod_unique_id. Without mod_unique_id the information will still be logged but you will be unable to correlate which cookies go with which access-requests. Furthermore, with LOGSQLWHICHCOOKIES, you do not need to include the 'c' character in LOGSQLTRANSFERLOGFORMAT.
LOGSQLWHICHCOOKIE and LOGSQLWHICHCOOKIES can coexist without conflict because they operate on entireley different tables, but you're better off choosing the one you need.
Note: you do not need to compile SSL support into mod_log_sql in order to simply use it with a secure site. You only need to compile SSL support into mod_log_sql if you want to log SSL-specific data such as the cipher type used, or the keysize that was negotiated. If that information is unimportant to you, you can ignore this FAQ.
By adding certain characters to your LOGSQLTRANSFERLOGFORMAT string you can tell mod_log_sql to log the SSL cipher, the SSL keysize of the connection, and the maximum keysize that was available. This would let you tell, for example, which clients were using only export-grade security to access your secure software area.
You can compile mod_log_sql with SSL logging support if you have the right packages installed. If you already have an SSL-enabled Apache then you by definition have the correct packages already installed: OpenSSL and mod_ssl.
You need to ensure that your database is set up to log the SSL data. Issue the following commands to MySQL if your access table does not already have them:
alter table access_log add column ssl_keysize smallint unsigned;
alter table access_log add column ssl_maxkeysize smallint unsigned;
Restart Apache, then perform some hits on your server. Then run the following select statement:
from access_log where ssl_cipher is not null;