The installation is very simple (example of an installation on a Linux system):
  % cd /usr/src
  % lwp-download http://www.apache.org/dist/apache_x.x.x.tar.gz
  % lwp-download http://perl.apache.org/dist/mod_perl-x.xx.tar.gz
  % tar zvxf apache_x.x.x.tar.gz
  % tar zvxf mod_perl-x.xx.tar.gz
  % cd mod_perl-x.xx
  % perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \
    DO_HTTPD=1 USE_APACI=1 PERL_MARK_WHERE=1 EVERYTHING=1
  % make && make test && make install
  % cd ../apache_x.x.x
  % make install
That's all!
Notes: Replace x.x.x with the real version numbers of mod_perl and apache.
GNU tar uncompresses as well (with the z flag).
First download the sources of both packages, e.g. with the lwp-download utility. lwp-download is part of the LWP (libwww-perl) package, which you will need installed anyway for mod_perl's make test to pass. Once you have installed this package (unless it is already installed), lwp-download will be available as well.
  % lwp-download http://www.apache.org/dist/apache_x.x.x.tar.gz
  % lwp-download http://perl.apache.org/dist/mod_perl-x.xx.tar.gz
Extract both sources. I usually unpack all sources in /usr/src/, but your mileage may vary. So move the tarballs to, and chdir into, the directory where you want to unpack them. GNU tar can uncompress as well with the z flag; a non-GNU tar utility cannot decompress, so you would do it in two steps: first uncompress the packages with gzip -d apache_x.x.x.tar.gz and gzip -d mod_perl-x.xx.tar.gz, then un-tar them with tar xvf apache_x.x.x.tar and tar xvf mod_perl-x.xx.tar.
  % cd /usr/src
  % tar zvxf apache_x.x.x.tar.gz
  % tar zvxf mod_perl-x.xx.tar.gz
chdir
to the mod_perl source directory:
% cd mod_perl-x.xx
Now build the Makefile. For basic use and a first-time installation, the parameters in the example below are the only ones you need. APACHE_SRC tells mod_perl where the apache src directory is. If you have followed my suggestion and extracted both sources under the same directory (/usr/src), do:
  % perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \
    DO_HTTPD=1 USE_APACI=1 PERL_MARK_WHERE=1 EVERYTHING=1
There are many additional parameters. You can find some of them in the section dedicated to configuration and in other sections. While running perl Makefile.PL ... the process will check for prerequisites and tell you if anything is missing. If some Perl packages or other software are missing, you will have to install them before you proceed.
Now we make the project (building the mod_perl extension and invoking make in the apache source directory to build the httpd binary), test it (by running various tests) and install the mod_perl modules.
% make && make test && make install
Note that if make fails, neither make test nor make install will be executed. If make test fails, make install will not be executed.
Now change to the apache source directory and run make install to install apache's headers and default configuration files, build the apache directory tree and put the httpd binary there.
% cd ../apache_x.x.x % make install
When you execute the above command, the apache installation process will tell you how to start the freshly built webserver (the path of apachectl, more about it later) and where the configuration files are. Remember both (or better, write them down), since you will need this information very soon. On my machine the two important paths are:
  /usr/local/apache/bin/apachectl
  /usr/local/apache/conf/httpd.conf
Now the build and installation processes are complete. Just configure httpd.conf and start the webserver.
A basic configuration is simple. First configure apache as you always do (set Port, User, Group, correct the ErrorLog and other file paths, etc.), start the server and make sure it works. One way to start and stop the server is to use the apachectl utility:
  % /usr/local/apache/bin/apachectl start
  % /usr/local/apache/bin/apachectl stop
Shut the server down, open the httpd.conf
in your favorite editor and scroll to the end of the file, where we will
add the mod_perl configuration directives (of course you can place them
anywhere in the file).
Add the following configuration directives:
Alias /perl/ /home/httpd/perl/
This assumes that you put all the scripts that should be executed by the mod_perl enabled server under the /home/httpd/perl/ directory.
  PerlModule Apache::Registry
  <Location /perl>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options ExecCGI
    PerlSendHeader On
    allow from all
  </Location>
Now put a test script into /home/httpd/perl/
directory:
  test.pl
  -------
  #!/usr/bin/perl -w
  use strict;
  print "Content-type: text/html\r\n\r\n";
  print "It worked!!!\n";
  -------
Make it executable and readable by the server. If your server is running as user nobody (hint: look for the User directive in the httpd.conf file), do the following:
  % chown nobody /home/httpd/perl/test.pl
  % chmod u+rx /home/httpd/perl/test.pl
Check that the script runs from the command line by executing it:
% /home/httpd/perl/test.pl
You should see:
  Content-type: text/html

  It worked!!!
Now it is time to test our mod_perl server. Assuming that your config file includes Port 80, go to your favorite browser and fetch the following URL (after you have started the server):
http://localhost/perl/test.pl
Make sure that you have a loop-back device configured; if not, use the real server name for this test, for example:
http://www.nowhere.com/perl/test.pl
You should see:
It worked!!!
If something went wrong, go through the installation process again and make sure you didn't make a mistake. If that doesn't help, read the INSTALL pod document (perldoc INSTALL) in the mod_perl distribution directory.
Now copy some of your perl/CGI scripts into the /home/httpd/perl/ directory and watch them run much faster from the newly configured base URL (/perl/). Some of your scripts will not work out of the box and will demand minor tweaking or a major rewrite to work properly with a mod_perl enabled server. Chances are that if you do not practice sloppy programming techniques, the scripts will work without any modifications at all.
The above setup is very basic. It will help you get a mod_perl enabled server running and give you the pleasure of watching your previously slow CGIs now flying. As with perl, you can start benefiting from mod_perl from the very first moment you try it. When you become more familiar with mod_perl you will want to start writing apache handlers and deploy more of mod_perl's power.
Since we are going to run two apache servers we will need two different sets of configuration, log and other files. We need a special directory layout. While some of the directories can be shared between the two servers (assuming that both are built from the same source distribution), others should be separated. From now on I will refer to these two servers as httpd_docs (vanilla Apache) and httpd_perl (Apache/mod_perl).
For this illustration, we will use /usr/local as our root directory. The Apache installation directories will be stored under this root (/usr/local/bin, /usr/local/etc, etc.).
First let's prepare the sources. We will assume that all the sources go into the /usr/src directory. It is better to use two separate copies of the apache sources, since you will probably want to tune each apache version separately and make modifications and recompilations as time goes by. Having two independent source trees will prove helpful, unless you use DSO, which is covered later in this section.
Make two subdirectories:
  % mkdir /usr/src/httpd_docs
  % mkdir /usr/src/httpd_perl
Put the Apache sources into a /usr/src/httpd_docs
directory:
  % cd /usr/src/httpd_docs
  % gzip -dc /tmp/apache_x.x.x.tar.gz | tar xvf -
If you have a gnu tar:
% tar xvzf /tmp/apache_x.x.x.tar.gz
Replace the /tmp directory with the path to the downloaded file and x.x.x with the version of the server you have.
  % cd /usr/src/httpd_docs
  % ls -l
  drwxr-xr-x  8 stas  stas  2048 Apr 29 17:38 apache_x.x.x/
Now we will prepare the httpd_perl
server sources:
  % cd /usr/src/httpd_perl
  % gzip -dc /tmp/apache_x.x.x.tar.gz | tar xvf -
  % gzip -dc /tmp/modperl-x.xx.tar.gz | tar xvf -
  % ls -l
  drwxr-xr-x  8 stas  stas  2048 Apr 29 17:38 apache_x.x.x/
  drwxr-xr-x  8 stas  stas  2048 Apr 29 17:38 modperl-x.xx/
Time to decide on the desired directory structure layout (where the apache files go):
ROOT = /usr/local
The two servers can share the following directories (so we will not duplicate data):
  /usr/local/bin/
  /usr/local/lib/
  /usr/local/include/
  /usr/local/man/
  /usr/local/share/
Important: we assume that both servers are built from the same Apache source version.
The servers store their specific files in either the httpd_docs or the httpd_perl sub-directories:
  /usr/local/etc/httpd_docs/
                 httpd_perl/
  /usr/local/sbin/httpd_docs/
                  httpd_perl/
  /usr/local/var/httpd_docs/logs/
                            proxy/
                            run/
                 httpd_perl/logs/
                            proxy/
                            run/
After the compilation and installation of both servers are complete, you will need to configure them. To make things clear before we go into details: you will configure /usr/local/etc/httpd_docs/httpd.conf as a plain apache server, with the Port directive set to 80 for example, and /usr/local/etc/httpd_perl/httpd.conf for the mod_perl server, whose Port must of course differ from the one the httpd_docs server listens to (e.g. 8080). The port number issue will be discussed later.
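For example, the two files might differ only in these lines (a sketch, with the port choices suggested above):

```apache
# /usr/local/etc/httpd_docs/httpd.conf  -- plain apache
Port 80

# /usr/local/etc/httpd_perl/httpd.conf  -- mod_perl enabled server
Port 8080
```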
The next step is to configure and compile the sources. Below are the procedures to compile both servers, taking into account the directory layout I have just suggested.
Let's proceed with installation. I will use x.x.x instead of real version numbers so this document will never become obsolete :).
  % cd /usr/src/httpd_docs/apache_x.x.x
  % make clean
  % env CC=gcc \
    ./configure --prefix=/usr/local \
    --sbindir=/usr/local/sbin/httpd_docs \
    --sysconfdir=/usr/local/etc/httpd_docs \
    --localstatedir=/usr/local/var/httpd_docs \
    --runtimedir=/usr/local/var/httpd_docs/run \
    --logfiledir=/usr/local/var/httpd_docs/logs \
    --proxycachedir=/usr/local/var/httpd_docs/proxy
If you need some other modules, like mod_rewrite and mod_include (SSI), add them here as well:
--enable-module=include --enable-module=rewrite
Note: on AIX, gcc produces an httpd more than 100K smaller than the one cc builds. Remove the env CC=gcc part if you want to use the default compiler. If you do want gcc and you are a (ba)?sh user, you do not need the env program; t?csh users have to keep it.
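The shell difference is easy to see with any command standing in for ./configure (echo is used here):

```shell
# (ba)sh accepts a VAR=value prefix directly, so env is redundant there:
CC=gcc sh -c 'echo "compiler=$CC"'

# the env(1) form works from any shell, which is why t?csh users keep it
# (csh has no VAR=value command-prefix syntax of its own):
env CC=gcc sh -c 'echo "compiler=$CC"'
```

Both lines print compiler=gcc.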
Note: add --layout
to see the resulting directories' layout without actually running the
configuration process.
  % make
  % make install
Rename httpd to httpd_docs:
  % mv /usr/local/sbin/httpd_docs/httpd \
       /usr/local/sbin/httpd_docs/httpd_docs
Now update the apachectl utility to point to the renamed httpd, either with your favorite text editor or by using perl:
  % perl -p -i -e 's|httpd_docs/httpd|httpd_docs/httpd_docs|' \
    /usr/local/sbin/httpd_docs/apachectl
Before you start to configure the mod_perl sources, you should be aware that a few Perl modules have to be installed before building mod_perl. You will be alerted about any missing required modules when you run the perl Makefile.PL command below. If you discover that some are missing, fetch them from your nearest CPAN repository (if you do not know what that is, visit http://www.perl.com/CPAN ) or run the CPAN interactive shell via the command line perl -MCPAN -e shell.
Make sure the sources are clean:
  % cd /usr/src/httpd_perl/apache_x.x.x
  % make clean
  % cd /usr/src/httpd_perl/mod_perl-x.xx
  % make clean
It is important to make clean, since some versions are not binary compatible (e.g. apache 1.3.3 vs 1.3.4), so any ``third-party'' C modules need to be recompiled against the latest header files.
Here I did not find a way to compile with gcc (my perl was compiled with cc, so we have to compile with the same compiler!).
% cd /usr/src/httpd_perl/mod_perl-x.xx
  % /usr/local/bin/perl Makefile.PL \
    APACHE_PREFIX=/usr/local/ \
    APACHE_SRC=../apache_x.x.x/src \
    DO_HTTPD=1 \
    USE_APACI=1 \
    PERL_MARK_WHERE=1 \
    PERL_STACKED_HANDLERS=1 \
    ALL_HOOKS=1 \
    APACI_ARGS=--sbindir=/usr/local/sbin/httpd_perl, \
      --sysconfdir=/usr/local/etc/httpd_perl, \
      --localstatedir=/usr/local/var/httpd_perl, \
      --runtimedir=/usr/local/var/httpd_perl/run, \
      --logfiledir=/usr/local/var/httpd_perl/logs, \
      --proxycachedir=/usr/local/var/httpd_perl/proxy
Notice that all the APACI_ARGS (above) must be passed as one long line if you work with t?csh!!! With (ba)?sh it works correctly the way it is shown above, with the long lines broken by '\'. With t?csh it does not work, since t?csh passes the APACI_ARGS arguments to ./configure with the newlines untouched, but with the original '\' stripped, thus breaking the configuration process.
As with httpd_docs
you might need other modules like
mod_rewrite
, so add them here:
--enable-module=rewrite
Note: PERL_STACKED_HANDLERS=1 is needed for Apache::DBI.
Now build, test and install httpd_perl:
% make && make test && make install
Note: apache puts a stripped version of httpd
at
/usr/local/sbin/httpd_perl/httpd
. The original version which includes debugging symbols (if you need to run
a debugger on this executable) is located at
/usr/src/httpd_perl/apache_x.x.x/src/httpd
.
Note: You may have noticed that we did not run make install in apache's source directory. When USE_APACI is enabled, APACHE_PREFIX specifies the --prefix option for apache's configure utility, i.e. the installation path for apache. When this option is used, mod_perl's make install will also run make install on the apache side, installing the httpd binary and support tools, along with the configuration, log and document trees.
If make test
fails, look into t/logs
and see what is in there. Also see make test fails.
While doing perl Makefile.PL ... mod_perl might warn you about a missing libgdbm. Users have reported that it is actually crucial: you must have it in order to complete the mod_perl build successfully.
Now rename the httpd
to httpd_perl
:
  % mv /usr/local/sbin/httpd_perl/httpd \
       /usr/local/sbin/httpd_perl/httpd_perl
Update the apachectl utility to point to the renamed httpd:
  % perl -p -i -e 's|httpd_perl/httpd|httpd_perl/httpd_perl|' \
    /usr/local/sbin/httpd_perl/apachectl
Now that we have completed the build process, the last stage before running the servers is to configure them.
Configuring the httpd_docs server is an easy task. Open /usr/local/etc/httpd_docs/httpd.conf in your favorite editor (starting from apache version 1.3.4 there is only one file to edit) and configure it as you always do. Make sure you configure the log files and other paths according to the directory layout we decided to use.
Start the server with:
/usr/local/sbin/httpd_docs/apachectl start
Here we will do a basic configuration of the httpd_perl server. We edit the /usr/local/etc/httpd_perl/httpd.conf file. As with the httpd_docs server configuration, make sure that ErrorLog and the other file location directives point to the right places, according to the chosen directory layout.
The first thing to do is to set the Port directive. It must be different from 80, since we cannot bind two servers to the same port number on the same machine. Here we will use 8080. Some developers use port 81, but you can bind to ports below 1024 only with root permissions. Also, if you are running on a multiuser machine, there is a chance someone already uses that port, or will start using it in the future, which as you understand might cause a collision. If you are the only user on your machine, you can basically pick any unused port number. Choosing a port number is a controversial topic, since many organizations use firewalls, which may block some ports or allow only known ones. From my experience the most widely used port numbers are 80, 81, 8000 and 8080. Personally, I prefer port 8080. Of course with the 2-server scenario you can hide the nonstandard port number from firewalls and users, by using either mod_proxy's ProxyPass or a proxy server like squid.
For more details see Publishing port numbers different from 80 , Running 1 webserver and squid in httpd accelerator mode, Running 2 webservers and squid in httpd accelerator mode and Using mod_proxy.
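For the mod_proxy variant, the idea can be sketched like this (hypothetical front-end httpd.conf lines; they assume mod_proxy is compiled into the front-end server and the mod_perl server listens on port 8080 of the same machine):

```apache
# front-end (httpd_docs) httpd.conf: relay /perl/ requests to the
# mod_perl back-end on port 8080, and rewrite redirects coming back
ProxyPass        /perl/ http://localhost:8080/perl/
ProxyPassReverse /perl/ http://localhost:8080/perl/
```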
Now we proceed to the mod_perl specific directives. It is a good idea to add them all at the end of httpd.conf, since you are going to fiddle with them a lot at the beginning.
First, you need to specify the location where all mod_perl scripts will be located.
Add the following configuration directive:
  # mod_perl scripts will be called from
  Alias /perl/ /usr/local/myproject/perl/
From now on, all requests starting with /perl
will be executed under mod_perl
and will be mapped to the files in
/usr/local/myproject/perl/
.
Now we should configure the /perl
location.
PerlModule Apache::Registry
  <Location /perl>
    #AllowOverride None
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options ExecCGI
    allow from all
    PerlSendHeader On
  </Location>
This configuration causes every script that is called with a /perl path prefix to be executed under the Apache::Registry module and as a CGI (hence the ExecCGI; if you omit this option the script will be sent to the user's browser as plain text, or will possibly trigger a 'Save-As' window). The Apache::Registry module lets you run almost unaltered CGI/perl scripts under mod_perl. The PerlModule directive is an equivalent of perl's require(): we load the Apache::Registry module before we use it in the PerlHandler of the Location configuration.
PerlSendHeader On
tells the server to send an HTTP header to the browser on every script
invocation. You will want to turn this off for nph (non-parsed-headers)
scripts.
This is only a very basic configuration. Server Configuration section covers the rest of the details.
Now start the server with:
/usr/local/sbin/httpd_perl/apachectl start
While I have covered the mod_perl server installation in detail, you are on your own with installing the squid server (see Getting Helped for more details). I run linux, so I downloaded the rpm package, installed it, configured /etc/squid/squid.conf, fired off the server and was all set. Basically, once you have squid installed, you just need to modify the default squid.conf as I explain below, and you are ready to run it.
First, let's understand what we have at hand and what we want from squid. We have the httpd_docs and httpd_perl servers listening on ports 81 and 8080 respectively (we have to move the httpd_docs server to port 81, since port 80 will be taken over by squid). Both reside on the same machine as squid. We want squid to listen on port 80, forward requests for static objects to the port the httpd_docs server listens to, and dynamic requests to httpd_perl's port. Both servers return the data to the proxy server (unless it is already cached by squid), so the user never sees the other ports and never knows that there might be more than one server running. The proxy server makes all the magic behind it transparent to the user. Do not confuse this with mod_rewrite, where a server redirects the request somewhere according to rules and then forgets about it. The functionality described here is known as httpd accelerator mode in proxy dialect.
You should understand that squid can also be used as a straightforward proxy server, generally deployed at companies and ISPs to cut down incoming traffic by caching the most popular requests. However, we want to run it in httpd accelerator mode. Two directives, httpd_accel_host and httpd_accel_port, enable this mode. We will see more details in a few seconds. If you are currently using squid in the regular proxy mode, you can extend its functionality by running both modes concurrently: either extend the existing squid configuration with the httpd accelerator mode related directives, or just create a new configuration from scratch.
As stated before, squid will now listen on port 80, so we have to move the httpd_docs server to listen, for example, on port 81 (your mileage may vary :). So you have to modify httpd.conf in the httpd_docs configuration directory and restart the httpd_docs server (but not before we get squid running, if you are working on a production server). And as you remember, httpd_perl listens on port 8080.
Let's go through the changes we should make to the default configuration file. Since this file (/etc/squid/squid.conf) is huge (60k+) and we would not use 95% of it, my suggestion is to write a new one including only the modified directives.
We want to enable the redirect feature, so that requests can be served by more than one server (in our case the httpd_docs and httpd_perl servers). So we set httpd_accel_host to virtual. This assumes that your server has multiple interfaces; squid will bind to all of them.
httpd_accel_host virtual
Then we define the default port - by default, if not redirected, httpd_docs will serve the pages. We assume that most requests will be of the static nature. We have our httpd_docs listening on port 81.
httpd_accel_port 81
And as described before, squid listens to port 80.
http_port 80
We do not use icp (icp is used for cache sharing between neighboring machines), which is more relevant in the regular proxy mode.
icp_port 0
hierarchy_stoplist defines a list of words which, if found in a URL, cause the object to be handled directly by this cache. In other words, use it to avoid querying neighbor caches for certain objects. Note that I have configured the /cgi-bin and /perl aliases for my dynamic documents; if you have named them differently, make sure to use the correct aliases here.
hierarchy_stoplist /cgi-bin /perl
Now we tell squid not to cache dynamic pages.
  acl QUERY urlpath_regex /cgi-bin /perl
  no_cache deny QUERY
Please note that the last two directives are controversial. If you want your scripts to comply better with the HTTP standards, their headers should carry caching directives according to the HTTP specs. You will find a complete tutorial about this topic in Tutorial on HTTP Headers for mod_perl users by Andreas J. Koenig (at http://perl.apache.org ). If you set the headers correctly, there is no need to tell the squid accelerator NOT to try to cache anything. The headers I am talking about are Last-Modified and Expires. What are they good for? Squid will not bother your mod_perl server a second time if a request is (a) cachable and (b) still in the cache. Many mod_perl applications produce identical results for identical requests, at least if not much time passes between the requests. So your squid might have a hit ratio of 50%, which means the mod_perl servers will have only half as much work to do as before. This is only possible by setting the headers correctly.
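For illustration, response headers of this kind (the dates here are made up; how to generate them properly is covered in the tutorial mentioned above) would allow squid to answer repeated requests from its cache for an hour:

```
Last-Modified: Sat, 01 May 1999 12:00:00 GMT
Expires: Sat, 01 May 1999 13:00:00 GMT
```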
Even if you insert a user-ID and date in your page, caching can save resources when you set the expiration time to 1 second. A user might double click where a single click would do, thus sending two requests in parallel; squid could then serve the second request from its cache.
But if you are lazy, or just have too many things to deal with, you can leave the above directives as I described. Just keep in mind that one day you will want to reread this snippet and Andreas' tutorial, and squeeze even more power from your servers without investing money in additional memory and better hardware.
While testing you might want to enable the debugging options and watch the log files in /var/log/squid/. But turn them off on your production server. I list the directive commented out (section 28 covers the access control routines):
  # debug_options ALL,1 28,9
We need to provide a way for squid to dispatch requests to the correct servers: static object requests should be redirected to httpd_docs (unless they are already cached), while dynamic ones should go to the httpd_perl server. The configuration below tells squid to fire off 10 redirect daemons at the specified path of the redirect daemon, and disables rewriting of any Host: headers in redirected requests (as suggested by squid's documentation). The redirection daemon script is listed below.
  redirect_program /usr/lib/squid/redirect.pl
  redirect_children 10
  redirect_rewrites_host_header off
Maximum allowed request size in kilobytes. This one is pretty obvious. If you use POST to upload files, set it to the largest file's size plus a few extra kbytes.
request_size 1000 KB
Then we have the access permissions, which I will not explain here. But you might want to read the documentation, so as to avoid any security flaws.
  acl all src 0.0.0.0/0.0.0.0
  acl manager proto cache_object
  acl localhost src 127.0.0.1/255.255.255.255
  acl myserver src 127.0.0.1/255.255.255.255
  acl SSL_ports port 443 563
  acl Safe_ports port 80 81 8080 443 563
  acl CONNECT method CONNECT
  http_access allow manager localhost
  http_access allow manager myserver
  http_access deny manager
  http_access deny !Safe_ports
  http_access deny CONNECT !SSL_ports
  # http_access allow all
Since squid should run as a non-root user, you need these if you invoke squid as root:
  cache_effective_user squid
  cache_effective_group squid
Now configure the memory size to be used for caching. The squid documentation warns that the actual size of the squid process can grow three times larger than the value you set.
cache_mem 20 MB
Keep pools of allocated (but unused) memory available for future use. Read more about it in the squid documents.
memory_pools on
Now tighten the runtime permissions of the cache manager CGI script (cachemgr.cgi, which comes bundled with squid) on your production server:
cachemgr_passwd disable shutdown #cachemgr_passwd none all
Now the redirection daemon script (you should put it at the location specified by the redirect_program parameter in the config file above, and of course make it executable by the webserver):
  #!/usr/local/bin/perl
  $|=1;
  while (<>) {
      # redirect to mod_perl server (httpd_perl)
      print($_), next if s|(:81)?/perl/|:8080/perl/|o;
      # send it unchanged to plain apache server (httpd_docs)
      print;
  }
In my scenario the proxy and the apache servers run on the same machine, which is why I just substitute the port. In the presented squid configuration, requests that pass through squid are converted to point to localhost (which is 127.0.0.1). The above redirector can of course be more complex, but you know perl, right?
A few notes regarding redirector script:
You must disable buffering; $|=1; does the job. If you do not disable buffering, STDOUT will be flushed only when the buffer becomes full, and its default size is about 4096 characters. So if you have an average URL of 70 chars, the buffer will be flushed only after 59 (4096/70) requests, and only then will those requests finally reach the target server. Your users will just have to wait until the buffer fills up.
If you think that this is a very inefficient way to redirect, let me try to prove the opposite. The redirector runs as a daemon: squid fires up N redirect daemons, so there is no perl interpreter loading cost per request. Exactly as with mod_perl, perl stays loaded all the time and the code is already compiled, so the redirect is very fast (hardly slower than a redirector written in C or the like). Squid keeps an open pipe to each redirect daemon, so there is no process creation overhead per request either.
Now it is time to restart the server. On linux I do it with:
/etc/rc.d/init.d/squid restart
Now the setup is complete ...
Almost... When you try the new setup, you will be surprised and upset to discover port 81 showing up in the URLs of static objects (like html files). Hey, we did not want users to see port 81 and use it instead of 80, since then they would bypass the squid server, and the hard work we went through would have been a waste of time.
The solution is to run both squid and httpd_docs on the same port. This can be accomplished by binding each one to a specific interface. Modify httpd.conf in the httpd_docs configuration directory:
  Port 80
  BindAddress 127.0.0.1
  Listen 127.0.0.1:80
Modify the squid.conf
:
  http_port 80
  tcp_incoming_address 123.123.123.3
  tcp_outgoing_address 127.0.0.1
  httpd_accel_host 127.0.0.1
  httpd_accel_port 80
Where 123.123.123.3 should be replaced with the IP of your main server. Now restart squid and httpd_docs, in either order, and voila, the port number is gone.
You must also have an entry in /etc/hosts (chances are it is already there):
127.0.0.1 localhost.localdomain localhost
Now, if your scripts were generating HTML with fully qualified self references using the 8080 or another port, you should fix them to generate links pointing to port 80 (which means not mentioning the port at all). If you do not, users will bypass squid, as if it were not there at all, by making direct requests to the mod_perl server's port.
The only question left is what to do about users who have bookmarked your services with port 8080 in the URL. Do not worry about it. The most important thing is for your scripts to return full URLs; so if a user comes in through a link with port 8080, let it be, just make sure that all subsequent calls to your server are rewritten correctly. Over time users will update their bookmarks. What you can do is send them an email, if you have their addresses, or leave a note on your pages asking users to update their bookmarks. You could have avoided this problem by not publishing the non-80 port in the first place. See Publishing port numbers different from 80.
<META> Need to write up a section about server logging with squid. One thing I sure would like to know is how requests are logged with this setup. I have, as most everyone I imagine, log rotation, analysis, archiving scripts and they all assume a single log. Does one have different logs that have to be merged (up to 3 for each server + squid) ? Even when squid responds to a request out of its cache I'd still want the thing to be logged. </META>
See Using mod_proxy for information about X-Forwarded-For
.
To save you some keystrokes, here is the whole modified squid.conf
:
  http_port 80
  tcp_incoming_address 123.123.123.3
  tcp_outgoing_address 127.0.0.1
  httpd_accel_host 127.0.0.1
  httpd_accel_port 80
  icp_port 0
  hierarchy_stoplist /cgi-bin /perl
  acl QUERY urlpath_regex /cgi-bin /perl
  no_cache deny QUERY
  # debug_options ALL,1 28,9
  redirect_program /usr/lib/squid/redirect.pl
  redirect_children 10
  redirect_rewrites_host_header off
  request_size 1000 KB
  acl all src 0.0.0.0/0.0.0.0
  acl manager proto cache_object
  acl localhost src 127.0.0.1/255.255.255.255
  acl myserver src 127.0.0.1/255.255.255.255
  acl SSL_ports port 443 563
  acl Safe_ports port 80 81 8080 443 563
  acl CONNECT method CONNECT
  http_access allow manager localhost
  http_access allow manager myserver
  http_access deny manager
  http_access deny !Safe_ports
  http_access deny CONNECT !SSL_ports
  # http_access allow all
  cache_effective_user squid
  cache_effective_group squid
  cache_mem 20 MB
  memory_pools on
  cachemgr_passwd disable shutdown
Note that all directives should start at the beginning of the line.
When I was first told about squid, I thought: ``Hey, now I can drop the
httpd_docs server and have only the squid and httpd_perl servers''. Since
all my static objects would be cached by squid, I would not need the light
httpd_docs server. But that was a wrong assumption. Why? Because there is
still the overhead of loading the objects into squid the first time they
are requested, and if your site has many of them, not all will stay cached
(unless you devote a huge chunk of memory to squid), so the heavy mod_perl
servers will still bear part of the load of serving static objects. How
would one measure this overhead? The difference between the two servers is
memory consumption; everything else (e.g. I/O) should be equal. So you
have to estimate the time needed for the first-time fetch of each static
object at a peak period, and from that the number of additional servers
you need for serving the static objects. That in turn lets you calculate
the additional memory requirements. I can imagine that this amount could
be significant in some installations.
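The estimate above can be sketched as a back-of-the-envelope calculation. All numbers below are made-up placeholders; substitute measurements from your own site:

```perl
#!/usr/bin/perl -w
use strict;

# Back-of-the-envelope estimate of the extra memory needed when the
# heavy mod_perl servers also serve uncached static objects at peak.
# Every number here is a made-up placeholder -- measure your own site.
my $process_size_kb   = 10 * 1024;  # size of one httpd_perl child (10MB)
my $uncached_reqs_sec = 50;         # first-time static hits/sec at peak
my $serve_time_sec    = 0.1;        # avg. time to serve one static object

# children kept busy just serving those objects:
my $extra_children = $uncached_reqs_sec * $serve_time_sec;

# additional memory those children consume:
my $extra_memory_mb = $extra_children * $process_size_kb / 1024;

printf "about %d extra children, %d MB of extra memory\n",
    $extra_children, $extra_memory_mb;
# prints: about 5 extra children, 50 MB of extra memory
```

With these placeholder numbers the overhead is modest, but with bigger processes or slower disks it grows quickly, which is why measuring at a real peak period matters.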
So I have decided to take on even more administration overhead and to stick with the squid, httpd_docs and httpd_perl scenario, where I can optimize and fine-tune everything. Of course your case may be different. If you feel that the scenario from the previous section is too complicated for you, make it simpler: have only one server, with mod_perl built in, and let squid do most of the job that the plain light apache used to do. As explained in the previous paragraph, you should pick this lighter setup only if you can make squid cache most of your static objects. If it cannot, your mod_perl server will end up doing work we do not want it to do.
If you are still with me, install apache with mod_perl, and squid. Then
use a configuration similar to the one in the previous section, but now
without httpd_docs. Also, we do not need the redirector anymore, and we
specify httpd_accel_host as the real name of the server rather than
virtual. There is no need to bind two servers to the same port, because we
do not redirect, and there are no Bind or Listen directives in httpd.conf
anymore.
The modified configuration (see the explanations in the previous section):
  httpd_accel_host put.your.hostname.here
  httpd_accel_port 8080
  http_port 80
  icp_port 0
  hierarchy_stoplist /cgi-bin /perl
  acl QUERY urlpath_regex /cgi-bin /perl
  no_cache deny QUERY
  # debug_options ALL,1 28,9
  # redirect_program /usr/lib/squid/redirect.pl
  # redirect_children 10
  # redirect_rewrites_host_header off
  request_size 1000 KB
  acl all src 0.0.0.0/0.0.0.0
  acl manager proto cache_object
  acl localhost src 127.0.0.1/255.255.255.255
  acl myserver src 127.0.0.1/255.255.255.255
  acl SSL_ports port 443 563
  acl Safe_ports port 80 81 8080 81 443 563
  acl CONNECT method CONNECT
  http_access allow manager localhost
  http_access allow manager myserver
  http_access deny manager
  http_access deny !Safe_ports
  http_access deny CONNECT !SSL_ports
  # http_access allow all
  cache_effective_user squid
  cache_effective_group squid
  cache_mem 20 MB
  memory_pools on
  cachemgr_passwd disable shutdown
Now we will talk about apache's mod_proxy and how it works. To build mod_proxy into apache, just add --enable-module=proxy during the apache configure stage.
The server on port 80 answers http requests directly and proxies the mod_perl enabled server in the following way:
  ProxyPass        /modperl/ http://localhost:81/modperl/
  ProxyPassReverse /modperl/ http://localhost:81/modperl/
ProxyPassReverse is the saving grace here, the feature that makes apache a
win over Squid: it rewrites the redirect on its way back to the client so
that it points to the original URI.
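As an illustration of that rewrite, here is a sketch in Perl of the mapping ProxyPassReverse performs on a Location header. The host name www.example.com is made up:

```perl
#!/usr/bin/perl -w
use strict;

# A sketch of the rewrite ProxyPassReverse performs: when the back-end
# issues a redirect to its own address, the front-end maps the Location
# header back onto the URL the client originally used.
# (www.example.com is a made-up front-end host name.)
my $backend  = 'http://localhost:81/modperl/';
my $frontend = 'http://www.example.com/modperl/';

sub rewrite_location {
    my $location = shift;
    # replace the back-end prefix with the front-end prefix, if present
    $location =~ s/^\Q$backend\E/$frontend/;
    return $location;
}

# The back-end redirects to itself, but the client sees the front-end:
print rewrite_location('http://localhost:81/modperl/login'), "\n";
# prints http://www.example.com/modperl/login
```

Without this rewrite the client's browser would be redirected straight to port 81, bypassing the proxy.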
You can control the buffering feature with the ProxyReceiveBufferSize
directive:
ProxyReceiveBufferSize 1048576
The above setting sets the buffer size to 1MB. If it is not set
explicitly, the default buffer size is used, which depends on the OS; for
Linux I suspect it is somewhere below 32k. So basically, to release the
mod_perl server immediately instead of leaving it waiting,
ProxyReceiveBufferSize should be set to a value greater than the biggest
response produced by any mod_perl script.
The ProxyReceiveBufferSize
directive specifies an explicit buffer size for outgoing HTTP and FTP connections. It has to be greater than 512 or set to 0 to
indicate that the system's default buffer size should be used.
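One rough way to pick a value is to find the largest response your mod_perl scripts have produced, e.g. by scanning the access log. A minimal sketch, assuming the Common Log Format where the last field is the response size in bytes; the sample lines below are made up, and in real use you would read your own access_log:

```perl
#!/usr/bin/perl -w
use strict;

# Find the largest response size recorded in Common Log Format lines,
# as a starting point for choosing ProxyReceiveBufferSize.
sub max_response_bytes {
    my $max = 0;
    for (@_) {
        # the last field of a CLF line is the response size in bytes
        my ($bytes) = /\s(\d+)\s*$/;
        $max = $bytes if defined $bytes and $bytes > $max;
    }
    return $max;
}

# made-up sample lines; in real use: open my $log, '<', 'access_log'
my @sample = (
    '127.0.0.1 - - [01/Jan/1999:00:00:00] "GET /perl/a HTTP/1.0" 200 45012',
    '127.0.0.1 - - [01/Jan/1999:00:00:01] "GET /perl/b HTTP/1.0" 200 917504',
    '127.0.0.1 - - [01/Jan/1999:00:00:02] "GET /perl/c HTTP/1.0" 200 1320',
);
print max_response_bytes(@sample), " bytes\n";
# prints 917504 bytes
```

You would then round the result up generously, since tomorrow's biggest response may be larger than yesterday's.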
As the name states, its buffering feature applies only to downstream data (coming from the origin server to the proxy) and not upstream (i.e. buffering the data being uploaded from the client browser to the proxy, thus freeing the httpd_perl origin server from being tied up during a large POST such as a file upload).
Apache does caching as well. It's relevant to mod_perl only if you produce proper headers, so your scripts' output can be cached. See apache documentation for more details on configuration of this capability.
Ask Bjoern Hansen has written a mod_proxy_add_forward module for apache
that sets the X-Forwarded-For field when doing a ProxyPass, similar to
what squid does. (Its location is specified in the help section.)
Basically, the module adds an extra HTTP header to the requests it
proxies. You can read that header in the mod_perl-enabled server and use
it to set the remote client's IP. You do not need to compile anything into
the back-end server; if you are using Apache::{Registry,PerlRun}, just put
something like the following into start-up.pl:
  sub My::ProxyRemoteAddr ($) {
      my $r = shift;

      # we'll only look at the X-Forwarded-For header if the request
      # comes from our proxy at localhost
      return OK unless ($r->connection->remote_ip eq "127.0.0.1");

      if (my ($ip) = $r->header_in('X-Forwarded-For') =~ /([^,\s]+)$/) {
          $r->connection->remote_ip($ip);
      }

      return OK;
  }
And in httpd.conf
:
PerlPostReadRequestHandler My::ProxyRemoteAddr
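A side note on the /([^,\s]+)$/ pattern used in the handler: each proxy in a chain appends the address it saw to X-Forwarded-For, so the header can arrive as a comma-separated list, and only the last entry was added by our own proxy. A small sketch, with made-up addresses:

```perl
#!/usr/bin/perl -w
use strict;

# Extract the last address from an X-Forwarded-For header.  Earlier
# entries were set by upstream proxies (or the client itself) and can
# be forged; only the last one was appended by our own proxy.
sub last_forwarded_ip {
    my $header = shift;
    my ($ip) = $header =~ /([^,\s]+)$/;
    return $ip;
}

print last_forwarded_ip('10.1.1.1, 172.16.0.9, 192.168.1.5'), "\n";
# prints 192.168.1.5
```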
Different sites have different needs. If you use the header to set the IP
address apache believes it is dealing with (for logging and so on), you
really don't want anyone but your own system to be able to set the header.
That's why the ``recommended code'' above checks where the request is
really coming from before changing remote_ip.
Generally you shouldn't trust the X-Forwarded-For header. You only want to
rely on X-Forwarded-For headers from proxies you control yourself. If you
know how to spoof a cookie, you've probably got the general idea of
forging HTTP headers and can spoof the X-Forwarded-For header as well. The
only address *you* can count on as a reliable value is the one from
r->connection->remote_ip.
From that point on, the remote IP address is correct. You should be able to
access REMOTE_ADDR
as usual.
You could do the same thing with other environment variables (though I think several of them are preserved; you will want to run some tests to see which ones).
To build mod_perl as a DSO, add USE_DSO=1 to the rest of the configuration
parameters (so that libperl.so is built instead of libperl.a), like:
perl Makefile.PL USE_DSO=1 ...
If you run ./configure from the apache source directory, do not forget to add:
--enable-shared=perl
Then just add the LoadModule
directive into your httpd.conf
.
You will find a complete explanation in the INSTALL.apaci
pod which can be found in the mod_perl distribution.
Some people have reported that DSO-compiled mod_perl does not run on specific OS/perl version combinations. A thread-enabled perl has also been reported sometimes to break mod_perl built as a DSO. But it may still work for you.
Assuming that you have a setup of one ``front-end'' server which proxies the ``back-end'' (mod_perl) server: if you need to perform authentication in the ``back-end'' server, it should handle all authentication itself. If apache proxies correctly, it passes through all the authentication information, making the ``front-end'' apache somewhat ``dumb'', as it does nothing but pass through the information.
The only possible caveat in the config file is that your Auth stuff needs
to be inside <Directory ...> ... </Directory> tags, because if you use
<Location /...> ... </Location> the proxying server takes the auth info
for its own authentication and does not pass it on.
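For example, a back-end auth section wrapped in <Directory> tags might look like this sketch; the paths, realm name and password file are made up:

```apache
# Auth configuration on the back-end (mod_perl) server.
# <Directory> is used rather than <Location> so the front-end proxy
# passes the credentials through instead of consuming them itself.
<Directory /home/httpd/perl/protected>
    AuthType Basic
    AuthName "Protected Area"
    AuthUserFile /home/httpd/conf/htpasswd
    require valid-user
</Directory>
```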
The same goes for mod_ssl: if it is plugged into the front-end server, all SSL requests will be encoded/decoded properly by it.
Written by Stas Bekman.
Last Modified at 09/25/1999
Use of the Camel for Perl is a trademark of O'Reilly & Associates, and is used by permission.