Apache HTTP Server

Apache Module mod_filter

Description: Context-sensitive smart filter configuration module
Status: Base
ModuleIdentifier: filter_module
SourceFile: mod_filter.c
Compatibility: Version 2.1 and later


This module enables smart, context-sensitive configuration of output content filters. For example, Apache can be configured to process different content-types through different filters, even when the content-type is not known in advance (e.g. in a proxy).

mod_filter works by introducing indirection into the filter chain. Instead of inserting filters in the chain, we insert a filter harness which in turn dispatches conditionally to a filter provider. Any content filter may be used as a provider to mod_filter; no change to existing filter modules is required (although it may be possible to simplify them).

Smart Filtering

In the traditional filtering model, filters are inserted unconditionally using AddOutputFilter and family. Each filter then needs to determine whether to run, and there is little flexibility available for server admins to allow the chain to be configured dynamically.

mod_filter, by contrast, gives server administrators a great deal of flexibility in configuring the filter chain. In fact, filters can be inserted based on complex boolean expressions. This generalises the limited flexibility offered by AddOutputFilterByType.
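
As a rough illustration of such an expression, the sketch below runs a compression filter only for textual responses sent to clients that advertise gzip support. It assumes mod_deflate is loaded (it provides the DEFLATE filter); the smart filter name COMPRESS is made up for this example.

```apache
# COMPRESS is a hypothetical smart filter name; DEFLATE comes from mod_deflate.
# The provider is selected only when both conditions hold.
FilterProvider COMPRESS DEFLATE "%{CONTENT_TYPE} =~ m|^text/| && %{req:Accept-Encoding} =~ /gzip/"
FilterChain COMPRESS
```

mod_deflate negotiates the encoding itself as well, so the second condition is there mainly to show how two criteria combine.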

Filter Declarations, Providers and Chains

[This image displays the traditional filter model]
Figure 1: The traditional filter model

In the traditional model, output filters are a simple chain from the content generator (handler) to the client. This works well provided the filter chain can be correctly configured, but presents problems when the filters need to be configured dynamically based on the outcome of the handler.

[This image shows the mod_filter model]
Figure 2: The mod_filter model

In the mod_filter model, each position in the chain holds a filter harness rather than a filter, and the harness dispatches conditionally to a filter provider, as described in the introduction. There can be multiple providers for one filter, but no more than one provider will run for any single request.

A filter chain comprises any number of instances of the filter harness, each of which may have any number of providers. A special case is that of a single provider with unconditional dispatch: this is equivalent to inserting the provider filter directly into the chain.
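
A minimal sketch of that special case, assuming mod_include is loaded and that an always-true expression is an acceptable way to express unconditional dispatch (the constant true is part of the ap_expr syntax; the smart filter name is hypothetical):

```apache
# "always-ssi" is a made-up smart filter name. With a single provider whose
# expression is always true, the harness should behave much like inserting
# the INCLUDES filter directly into the chain.
FilterProvider always-ssi INCLUDES "true"
FilterChain always-ssi
```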

Configuring the Chain

There are three stages to configuring a filter chain with mod_filter. For details of the directives, see below.

Declare Filters
The FilterDeclare directive declares a filter, assigning it a name and filter type. This step is required only if the filter is not of the default type AP_FTYPE_RESOURCE.
Register Providers
The FilterProvider directive registers a provider with a filter. The filter may have been declared with FilterDeclare; if not, FilterProvider will implicitly declare it with the default type AP_FTYPE_RESOURCE. The provider must have been registered with ap_register_output_filter by some module. The final argument to FilterProvider is an expression: the provider will be selected to run for a request if and only if the expression evaluates to true. The expression may evaluate HTTP request or response headers, environment variables, or the Handler used by this request. Unlike earlier versions, mod_filter now supports complex expressions involving multiple criteria with AND / OR logic (&& / ||) and brackets. The details of the expression syntax are described in the ap_expr documentation.
Configure the Chain
The above directives build components of a smart filter chain, but do not configure it to run. The FilterChain directive builds a filter chain from smart filters declared, offering the flexibility to insert filters at the beginning or end of the chain, remove a filter, or clear the chain.
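
The three stages can be sketched together as follows. This is an illustration only: it assumes mod_deflate is loaded, and the smart filter name "compress" is invented for the example.

```apache
# Stage 1: declare the smart filter. The declaration is needed here only
# because the type is CONTENT_SET rather than the default RESOURCE.
FilterDeclare compress CONTENT_SET

# Stage 2: register a provider. It is selected only when the expression
# evaluates to true for the request.
FilterProvider compress DEFLATE "%{CONTENT_TYPE} =~ m|^text/html|"

# Stage 3: add the smart filter to the chain so that it actually runs.
FilterChain compress
```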

Filtering and Response Status

mod_filter normally only runs filters on responses with HTTP status 200 (OK). If you want to filter documents with other response statuses, you can set the filter-errordocs environment variable, and it will work on all responses regardless of status. To refine this further, you can use expression conditions with FilterProvider.
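
One possible way to set the variable is sketched below. It assumes mod_env is loaded so that SetEnv is available, and reuses the SSI smart filter name from the examples in this document. The value given to the variable is arbitrary here, since the text only requires that it be set. Whether the variable is set early enough for every kind of error response is worth testing in your own configuration.

```apache
# Assumption: mod_env provides SetEnv. The value "1" is arbitrary.
SetEnv filter-errordocs 1

# The expression still restricts the provider to HTML responses,
# whatever their status.
FilterProvider SSI INCLUDES "%{CONTENT_TYPE} =~ m|^text/html|"
FilterChain SSI
```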

Upgrading from Apache HTTP Server 2.2 Configuration

The FilterProvider directive has changed from httpd 2.2: the match and dispatch arguments are replaced with a single but more versatile expression. In general, you can convert a match/dispatch pair to the two sides of an expression, using something like:

"dispatch = 'match'"

Request headers, response headers and environment variables are now referenced with the syntax %{req:foo}, %{resp:foo} and %{env:foo} respectively. The variables %{HANDLER} and %{CONTENT_TYPE} are also supported.

Note that matches no longer support substring matching. Substring matches can be replaced by regular expression matches.
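
The conversion might look like the sketch below. The 2.2 lines appear only as comments and are written from memory of the 2.2 syntax (a dispatch criterion followed by a match, with a leading $ marking a substring match), so check them against your existing configuration rather than treating them as authoritative.

```apache
# httpd 2.2 (substring match on the response Content-Type):
#   FilterProvider SSI INCLUDES resp=Content-Type $text/html
# httpd 2.4: the substring match becomes an unanchored regular expression.
FilterProvider SSI INCLUDES "%{resp:Content-Type} =~ m|text/html|"

# httpd 2.2 (negated substring match on a request header):
#   FilterProvider gzip INFLATE req=Accept-Encoding !$gzip
# httpd 2.4:
FilterProvider gzip INFLATE "%{req:Accept-Encoding} !~ /gzip/"
```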


Examples

Server side Includes (SSI)
A simple case of replacing AddOutputFilterByType.
FilterDeclare SSI
FilterProvider SSI INCLUDES "%{CONTENT_TYPE} =~ m|^text/html|"
FilterChain SSI

Server side Includes (SSI)
The same as the above, but dispatching on handler (classic SSI behaviour; .shtml files get processed).
FilterProvider SSI INCLUDES "%{HANDLER} = 'server-parsed'"
FilterChain SSI

Emulating mod_gzip with mod_deflate
Insert the INFLATE filter only if "gzip" is NOT in the Accept-Encoding header. This filter runs with ftype CONTENT_SET.
FilterDeclare gzip CONTENT_SET
FilterProvider gzip INFLATE "%{req:Accept-Encoding} !~ /gzip/"
FilterChain gzip

Image Downsampling
Suppose we want to downsample all web images, and have filters for GIF, JPEG and PNG.
FilterProvider unpack jpeg_unpack "%{CONTENT_TYPE} = 'image/jpeg'"
FilterProvider unpack gif_unpack "%{CONTENT_TYPE} = 'image/gif'"
FilterProvider unpack png_unpack "%{CONTENT_TYPE} = 'image/png'"

FilterProvider downsample downsample_filter "%{CONTENT_TYPE} =~ m#^image/(jpeg|gif|png)#"
FilterProtocol downsample "change=yes"

FilterProvider repack jpeg_pack "%{CONTENT_TYPE} = 'image/jpeg'"
FilterProvider repack gif_pack "%{CONTENT_TYPE} = 'image/gif'"
FilterProvider repack png_pack "%{CONTENT_TYPE} = 'image/png'"
<Location "/image-filter">
    FilterChain unpack downsample repack
</Location>
Protocol Handling

Historically, each filter is responsible for ensuring that whatever changes it makes are correctly represented in the HTTP response headers, and that it does not run when it would make an illegal change. This imposes a burden on filter authors to re-implement some common functionality in every filter:

  • Many filters will change the content, invalidating existing content tags, checksums, hashes, and lengths.
  • Filters that require an entire, unbroken response as input need to ensure they don't get byteranges from a backend.
  • Filters that transform output need to ensure they don't violate a Cache-Control: no-transform header from the backend.
  • Filters may make responses uncacheable.

mod_filter aims to offer generic handling of these details of filter implementation, reducing the complexity required of content filter modules. This is work in progress; the FilterProtocol directive implements some of this functionality for backward compatibility with Apache 2.0 modules. For httpd 2.1 and later, the ap_register_output_filter_protocol and ap_filter_protocol APIs enable filter modules to declare their own behaviour.

At the same time, mod_filter should not interfere with a filter that wants to handle all aspects of the protocol. By default (i.e. in the absence of any FilterProtocol directives), mod_filter will leave the headers untouched.

At the time of writing, this feature is largely untested, as modules in common use are designed to work with 2.0. Modules using it should test it carefully.

AddOutputFilterByType Directive

Description: assigns an output filter to a particular media-type
Syntax: AddOutputFilterByType filter[;filter...] media-type [media-type] ...
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_filter
Compatibility: Had severe limitations before being moved to mod_filter in version 2.3.7

This directive activates a particular output filter for a request depending on the response media-type.

The following example uses the DEFLATE filter, which is provided by mod_deflate. It will compress all output (either static or dynamic) which is labeled as text/html or text/plain before it is sent to the client.

AddOutputFilterByType DEFLATE text/html text/plain

If you want the content to be processed by more than one filter, their names have to be separated by semicolons. It's also possible to use one AddOutputFilterByType directive for each of these filters.

The configuration below causes all script output labeled as text/html to be processed first by the INCLUDES filter and then by the DEFLATE filter.

<Location "/cgi-bin/">
    Options Includes
    AddOutputFilterByType INCLUDES;DEFLATE text/html
</Location>
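
The alternative mentioned above, with one directive per filter, might be written as in the following sketch. It assumes mod_include and mod_deflate are loaded, and that the directives are listed in the order in which the filters should process the content.

```apache
<Location "/cgi-bin/">
    Options Includes
    # One directive per filter instead of a semicolon-separated list.
    AddOutputFilterByType INCLUDES text/html
    AddOutputFilterByType DEFLATE text/html
</Location>
```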

FilterChain Directive

Description: Configure the filter chain
Syntax: FilterChain [+=-@!]filter-name ...
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Base
Module: mod_filter

This configures an actual filter chain, from declared filters. FilterChain takes any number of arguments, each optionally preceded with a single-character control that determines what to do:

+filter-name
Add filter-name to the end of the filter chain
@filter-name
Insert filter-name at the start of the filter chain
-filter-name
Remove filter-name from the filter chain
=filter-name
Empty the filter chain and insert filter-name
!
Empty the filter chain
filter-name
Equivalent to +filter-name
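
The following sketch shows several of the controls within a single configuration context. It assumes that smart filters named unpack, downsample and repack have already been set up with FilterProvider, as in the image downsampling example. How chains combine across nested contexts is not covered here.

```apache
# Empty whatever chain exists so far and start again with unpack,
# then append downsample and repack in that order.
FilterChain =unpack +downsample +repack

# A bare name is equivalent to the + form, so this line would build the
# same three-filter sequence on an empty chain:
#   FilterChain unpack downsample repack

# Remove downsample again, leaving unpack followed by repack.
FilterChain -downsample
```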

FilterDeclare Directive

Description: Declare a smart filter
Syntax: FilterDeclare filter-name [type]
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Base
Module: mod_filter

This directive declares an output filter, giving it a name and, optionally, a filter type. The first argument is a filter-name for use in FilterProvider, FilterChain and FilterProtocol directives.

The final (optional) argument is the type of filter, and takes values of ap_filter_type - namely RESOURCE (the default), CONTENT_SET, PROTOCOL, TRANSCODE, CONNECTION or NETWORK.

FilterProtocol Directive

Description: Deal with correct HTTP protocol handling
Syntax: FilterProtocol filter-name [provider-name] proto-flags
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Base
Module: mod_filter

This directs mod_filter to deal with ensuring the filter doesn't run when it shouldn't, and that the HTTP response headers are correctly set taking into account the effects of the filter.

There are two forms of this directive. With three arguments, it applies specifically to a filter-name and a provider-name for that filter. With two arguments it applies to a filter-name whenever the filter runs any provider.

Flags specified with this directive are merged with the flags that underlying providers may have registered with mod_filter. For example, a filter may internally specify the equivalent of change=yes, but a particular configuration of the module can override this with change=no.

proto-flags is one or more of

change=yes|no
Specifies whether the filter changes the content, including possibly the content length. The "no" argument is supported in 2.4.7 and later.
change=1:1
The filter changes the content, but will not change the content length
byteranges=no
The filter cannot work on byteranges and requires complete input
proxy=no
The filter should not run in a proxy context
proxy=transform
The filter transforms the response in a manner incompatible with the HTTP Cache-Control: no-transform header.
cache=no
The filter renders the output uncacheable (e.g. by introducing randomised content changes)
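
The two forms of the directive might be used as in the sketch below. The filter and provider names are taken from the examples earlier in this document and are assumed to be configured already. Each directive carries a single flag, because the separator for multiple flags is not described here.

```apache
# Two-argument form: applies whenever the downsample smart filter runs,
# whichever provider it dispatches to.
FilterProtocol downsample "change=yes"

# Three-argument form: applies only when the gzip smart filter dispatches
# to the INFLATE provider.
FilterProtocol gzip INFLATE "byteranges=no"
```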

FilterProvider Directive

Description: Register a content filter
Syntax: FilterProvider filter-name provider-name expression
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Base
Module: mod_filter

This directive registers a provider for the smart filter. The provider will be called if and only if the expression declared evaluates to true when the harness is first called.

provider-name must have been registered by loading a module that registers the name with ap_register_output_filter.

expression is an ap_expr.

FilterTrace Directive

Description: Get debug/diagnostic information from mod_filter
Syntax: FilterTrace filter-name level
Context: server config, virtual host, directory
Status: Base
Module: mod_filter

This directive generates debug information from mod_filter. It is designed to help test and debug providers (filter modules), although it may also help with mod_filter itself.

The debug output depends on the level set:

0 (default)
No debug information is generated.
1
mod_filter will record buckets and brigades passing through the filter to the error log, before the provider has processed them. This is similar to the information generated by mod_diagnostics.
2 (not yet implemented)
Will dump the full data passing through to a tempfile before the provider. For single-user debug only; this will not support concurrent hits.
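
A minimal sketch of enabling tracing, assuming a smart filter named SSI has been configured as in the examples in this document. Level 2 is not shown, since it is described as not yet implemented.

```apache
# Record the buckets and brigades passing into the SSI smart filter
# in the error log, before the provider processes them.
FilterTrace SSI 1
```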

© 2016 The Apache Software Foundation
Licensed under the Apache License, Version 2.0.