HOWTO: Log Client IP AND X-Forwarded-For IP in Apache

Introduction

When Apache web servers sit behind a load-balancing proxy such as BigIP or Pound, or behind a caching proxy such as Squid or a BlueCoat proxy, the address Apache sees as the client is the IP address of the load balancer or proxy rather than that of the browser.  The Squid development team introduced a custom HTTP request header, X-Forwarded-For, to deal with this, and it has since evolved into an industry standard.  Systems that support the header take the IP address of the connecting client, insert it into X-Forwarded-For, and pass it along in the HTTP request to the server behind them.  Apache and Tomcat can log this address in their access logs, but only for requests that have passed through the proxy.  If you send a request directly to your Apache server, for testing or monitoring purposes, the header is absent and no client IP address shows up in the logs.  If you still want to log the client IP address for systems that access your servers directly, this article describes a way to do that with the Apache web server.

Apache LogFormat and CustomLog Configuration Changes

Although Apache offers a large number of options for what gets logged, this article focuses on the combined log format, which typically logs the following items:

  • Remote Host (%h; a hostname instead of an IP address if Apache is configured to look them up)
  • Remote logname (%l; typically a dash but could contain the RFC 1413 identity of the remote user)
  • Remote User (%u; typically a dash unless Apache is doing some kind of authentication)
  • Timestamp of when the request was received (%t).  This is in the server's local time.
  • The first line of the request (%r; the method, the request URI, and the protocol)
  • The status code returned by the server (%>s; the final status after any internal redirection)
  • The size of the response in bytes, not counting the response headers (%b)
  • The referring page, if present (%{Referer}i)
  • The user-agent (%{User-Agent}i; the browser, robot, spider, etc. that made the request)

A default logging configuration in your httpd.conf looks like this:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined
CustomLog logs/access_log combined

To log the X-Forwarded-For client IP address when the header is present, and the real client IP address when it is not, make the following changes to the default configuration:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" proxy
SetEnvIf X-Forwarded-For "^.*\..*\..*\..*" forwarded
CustomLog "logs/access_log" combined env=!forwarded
CustomLog "logs/access_log" proxy env=forwarded

This configuration takes advantage of Apache's built-in support for conditional logging based on environment variables.  The first line is the standard combined log format string from the default configuration.  The second line replaces the %h (remote host) field with the value(s) pulled from the X-Forwarded-For header and names this log format "proxy".  Line 3 sets the environment variable "forwarded" whenever the X-Forwarded-For header matches a loose regular expression for an IP address (anything containing three dots).  A loose match is fine here, since all we really care about is whether the header carries a value at all.  Put another way, line 3 could be read as:  "If there is an X-Forwarded-For value, use it."  Lines 4 and 5 tell Apache which log format to use for each request:  if an X-Forwarded-For value exists, use the "proxy" format; otherwise use the "combined" format.  For readability, lines 4 and 5 do not use Apache's piped logging feature (rotatelogs), although I assume most everyone has it in place.
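
If you would rather not treat any string with three dots in it as an address, a tighter pattern can be swapped into line 3.  The lines below are only a sketch under a couple of assumptions of mine:  that your proxy puts a dotted-quad IPv4 address first in the header, and that you are fine with anything else (an IPv6 address, for instance) falling back to the "combined" format.  The pattern does not check that each octet is 255 or less.

# Assumption: the first entry in X-Forwarded-For is a dotted-quad IPv4 address
SetEnvIf X-Forwarded-For "^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" forwarded

The LogFormat and CustomLog lines stay exactly as they are; only the test that sets "forwarded" changes.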

These changes should result in an IP address being logged for every request.  I have tested them and they work as expected.  If you are comfortable with Groovy, the test client below can be used to validate your setup.  First, change the value in line 3 of the script from "TEST_URL_HERE" to the hostname and URI of your web server.  Next, tail your access log and run the script directly against your web server (bypassing your proxy and/or load balancer).  Then comment out line 5 and run the test again.  On the first run, the IP address "204.9.177.195" should show up in your logs.  On the second run, you should see your own machine's IP address.

#!/usr/bin/env groovy
// set up the connection
def url = new URL("http://TEST_URL_HERE/")
def connection = url.openConnection()
connection.setRequestProperty("X-Forwarded-For", "204.9.177.195")
connection.setRequestProperty("User-Agent", "benevolent-bot/blog.techstacks.com")

// a 200 response means the request reached the server and should be in the access log
if (connection.responseCode == 200) {
    println "Connection successful"
} else {
    println "Ouchy! Error!"
}

Creative Commons Attribution-ShareAlike 3.0 Unported