Choosing the Right Strategy



The Writing Apache Modules with Perl and C book can be purchased online from O'Reilly and Amazon.com.
Your corrections of either technical or grammatical errors are very welcome. You are encouraged to help me to improve this guide. If you have something to contribute please send it directly to me.



Do it like me?!

There is no such thing as a single RIGHT strategy in the web server business, though there are many wrong ones. Never believe a person who says: "Do it this way, this is the best!". As the old saying goes: "Trust but verify". There are too many technologies out there to choose from, and it would take an enormous investment of time and money to try to validate each one before deciding which is the best choice for your situation. Keeping this idea in mind, I will present some different combinations of mod_perl and other technologies, as well as standalone mod_perl. I'll describe how these things work together, and offer my opinions on the pros and cons of each, the relative degree of difficulty in installing and maintaining them, and some hints on approaches that should be used and things to avoid.

To be clear, I will not address all technologies and tools, but limit this discussion to those complementing mod_perl.

Please let me stress it again: DO NOT blindly copy someone's setup and hope for a good result. Choose what is best for your situation -- it might take some effort to find out what that is.



mod_perl Deployment Overview

There are several different ways to build, configure and deploy your mod_perl enabled server. Some of them are:

  1. Having one binary and one config file (one big binary for mod_perl).

  2. Having two binaries and two config files (one big binary for mod_perl and one small for static objects like images.)

  3. Having one DSO-style binary, a mod_perl loadable object and two config files. (Dynamic linking lets you compile once and have both a big and a small binary in memory, BUT you have to deal with a freshly made solution that has weak documentation, is still subject to change and is rather more complex.)

  4. Any of the above plus a reverse proxy server in http accelerator mode.

If you are a newbie, I would recommend that you start with the first option and work on getting your feet wet with apache and mod_perl. Later, you can decide whether to move to the second one which allows better tuning at the expense of more complicated administration, or to the third option -- the more state-of-the-art-yet-suspiciously-new DSO system, or to the fourth option which gives you even more power.

  1. The first option will kill your production site if you serve a lot of static data with ~2-12 MB webserver processes. On the other hand, while testing you will have no other server interaction to mask or add to your errors.

  2. The second option allows you to seriously tune the two servers for maximum performance. On the other hand, you have to deal with proxying or fancy site design to keep the two servers in synchronization. In this configuration, you also need to choose between running the two servers on multiple ports, multiple IPs, etc. This adds the burden of administering more than one server.

  3. The third option (DSO) -- as mentioned above -- means playing with the bleeding edge. In addition, mod_so (the DSO module) adds size and complexity to your binaries. With DSO, modules can be added and removed without recompiling the server, and modules can even be shared among multiple servers. Again, it is bleeding edge and still somewhat platform specific, so your mileage may vary. See mod_perl server as DSO.

  4. The fourth option (proxy in http accelerator mode), once correctly configured and tuned, improves the performance of any of the above three options by caching and buffering page results.
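For the third (DSO) option, the part that differs between the big and the small server can be as small as a couple of lines in the config file. The fragment below is only a sketch, assuming Apache 1.3 built with mod_so and mod_perl built as a loadable object; the libexec/libperl.so path is an assumption and depends on how your server was installed.

```apache
# Fragment for the heavy server's config file only (DSO build assumed).
# The light server simply omits these lines and stays small.

# Path is an assumption: adjust to where your build put the mod_perl object
LoadModule perl_module libexec/libperl.so

# Only needed if the config uses ClearModuleList and re-adds modules explicitly
AddModule mod_perl.c
```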

The rest of this chapter discusses the pros and the cons of each of these presented configurations. Real World Scenarios Implementation describes the implementation techniques of these schemas.



Standalone mod_perl Enabled Apache Server

The first approach is to implement a straightforward mod_perl server. Just take your plain apache server and add mod_perl, like you add any other apache module. You continue to run it on the port it was running on before. You probably want to try this before you proceed to more sophisticated and complex techniques.
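As a rough illustration of how little configuration this setup needs, here is a minimal sketch of a config fragment, assuming Apache 1.3 with mod_perl 1.x compiled in. The /perl/ URL prefix and the script directory are placeholders chosen for this example, not values taken from this guide.

```apache
# Single mod_perl-enabled server, running on the usual port
Port 80

# Hypothetical script location: adjust the directory to your installation
Alias /perl/ /usr/local/apache/perl/

# Run everything under /perl/ through mod_perl's Apache::Registry
<Location /perl>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options ExecCGI
    PerlSendHeader On
</Location>
```

Everything outside /perl/ is still served by the same large processes, which is exactly the cost of this option.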

The advantages:

The disadvantages:

If you are new to mod_perl, this is probably the best way to get yourself started.

And of course, if your site is serving only mod_perl scripts (close to zero static objects, like images), this might be the perfect choice for you!

For implementation notes see: One Plain and One mod_perl enabled Apache Servers



One Plain and One mod_perl-enabled Apache Servers

As I have mentioned before, when running scripts under mod_perl, you will notice that the httpd processes consume a huge amount of virtual memory, from 5 MB to 15 MB and even more. That is the price you pay for the enormous speed improvements under mod_perl. (Again -- shared memory keeps the real memory that is being used much smaller :)

Using these large processes to serve static objects like images and html documents is overkill. A better approach is to run two servers: a very light, plain apache server to serve static objects and a heavier mod_perl-enabled apache server to serve requests for dynamic (generated) objects (aka CGI).

From here on, I will refer to these two servers as httpd_docs (vanilla apache) and httpd_perl (mod_perl enabled apache).
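To make the split concrete, here is a sketch of how the two config files might differ, assuming Apache 1.3 style directives. The ports match the ones used later in this chapter; the MaxClients values are purely illustrative assumptions, meant only to show that the light server can afford many children while the heavy one cannot.

```apache
# --- httpd_docs config file (vanilla apache, static objects) ---
Port 80
# Small processes, so many children are affordable (illustrative value)
MaxClients 150

# --- httpd_perl config file (mod_perl enabled apache) ---
Port 8080
# Large processes, so keep the number of children low (illustrative value)
MaxClients 20

<Location /perl>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options ExecCGI
</Location>
```

The two fragments belong in two separate files, one per server; they are shown together here only for comparison.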

The advantages:

An important note: When a user browses static pages and the base URL in the Location window points to the static server, for example http://www.nowhere.com/index.html -- all relative URLs (e.g. <A HREF="/main/download.html">) are served by the light plain apache server. But this is not the case with dynamically generated pages. For example, when the base URL in the Location window points to the dynamic server (e.g. http://www.nowhere.com:8080/perl/index.pl), all relative URLs in the dynamically generated HTML will be served by the heavy mod_perl processes. You must use fully qualified URLs and not relative ones! http://www.nowhere.com/icons/arrow.gif is a full URL, while /icons/arrow.gif is a relative one. Using <BASE HREF="http://www.nowhere.com/"> in the generated HTML is another way to handle this problem. The httpd_perl server could also rewrite the requests back to httpd_docs, but this is much slower and still requires the attention of the heavy servers. None of this is an issue if you hide the internal port implementations, so that the client sees only one server running on port 80. (See Publishing port numbers different from 80)
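The fallback of letting httpd_perl send stray requests back to httpd_docs could look roughly like the sketch below. It assumes mod_rewrite is compiled into the httpd_perl server and that all dynamic content lives under /perl/, which is an assumption made for this example. Every such request still costs a round trip through a heavy process, so treat it as a safety net rather than a design.

```apache
# In the httpd_perl config file (the server on port 8080)
RewriteEngine On

# Anything that is not under /perl/ does not belong on this server ...
RewriteCond %{REQUEST_URI} !^/perl/
# ... so redirect the browser to the light server on the default port
RewriteRule ^/(.*)$ http://www.nowhere.com/$1 [R,L]
```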

The disadvantages:

Before you go on with this solution you really want to look at the Adding a Proxy Server in http Accelerator Mode section.

For implementation notes see: One Plain and One mod_perl enabled Apache Servers



One light non-Apache and One mod_perl enabled Apache Servers

If the only requirement from the light server is for it to serve static objects, then you can get away with non-apache servers having an even smaller memory footprint. thttpd has been reported to be about 5 times faster than apache (especially under a heavy load), since it is very simple, uses almost no memory (260k) and does not spawn child processes.

Meta: Hey, No personal experience here, only rumours. Please let me know if I have missed some pros/cons here. Thanks!

The Advantages:

The Disadvantages:



Adding a Proxy Server in http Accelerator Mode

In the beginning there were two servers: one a plain apache server, which was very light and configured to serve static objects, the other mod_perl enabled, which was very heavy and aimed to serve mod_perl scripts. We named them httpd_docs and httpd_perl respectively. The two servers coexisted on the same IP address (DNS name) by listening to different ports: 80 for httpd_docs (e.g. http://www.nowhere.com/images/test.gif ) and 8080 for httpd_perl (e.g. http://www.nowhere.com:8080/perl/test.pl ). Note that I did not write http://www.nowhere.com:80 for the first example, since port 80 is the default http port. (Later on, I will be moving the httpd_docs server to port 81.)

Now I am going to convince you that you want to use a proxy server (in the http accelerator mode). The advantages are:

The disadvantages are:

Have I succeeded in convincing you that you want the proxy server?

If you are on a local area network (LAN), then the big benefit of the proxy buffering the output and feeding a slow client is gone. You are probably better off sticking with a straight mod_perl server in this case.

As of this writing, two proxy implementations are known to be used together with mod_perl: the squid proxy server, and mod_proxy, which is a part of the apache server. Let's compare the two of them.



The Squid Server

The Advantages:

The Disadvantages:

The presented pros and cons suggest that you probably want squid mostly for its dynamic content buffering features, and only if your server serves mostly dynamic requests. So in this situation it is better to have a plain apache server serving static objects, and squid proxying the mod_perl enabled server only. At least when performance is the goal.

For implementation details see: Running 1 webserver and squid in httpd accelerator mode and Running 2 webservers and squid in httpd accelerator mode



Apache's mod_proxy

I do not think the difference in speed between apache's mod_proxy and squid is relevant for most sites, since the real value of what they do is buffering for slow client connections. However, squid runs as a single process and probably consumes fewer system resources. The trade-off is that mod_rewrite is easy to use if you want to spread parts of the site across different back-end servers, and mod_proxy knows how to fix up redirects containing the back-end server's idea of the location. With squid you can run a redirector process to proxy to more than one back end, but there is a problem in fixing redirects in a way that keeps the client's view of both server names and port numbers in all cases. The difficult case is where you have DNS aliases that map to the same IP address and you want the redirect to use port 80 (when the server is really on a different port), but you also want it to keep the specific name the browser sent, so that it does not change in the client's Location window.
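Here is a sketch of what this might look like in the front-end server's config file, assuming Apache 1.3 with mod_proxy and mod_rewrite available. The back end on port 8080 follows the httpd_perl example from this chapter; the /shop/ prefix and the second back end on port 8082 are hypothetical, added only to show the "different back-end servers" case.

```apache
# Front-end (light) server: hand dynamic requests to the mod_perl back end
ProxyPass        /perl/ http://localhost:8080/perl/
# Fix up Location headers in redirects issued by the back end,
# so the client never sees port 8080
ProxyPassReverse /perl/ http://localhost:8080/perl/

# Spreading another part of the site to a second back end with mod_rewrite.
# The /shop/ prefix and port 8082 are hypothetical.
RewriteEngine On
RewriteRule ^/shop/(.*)$ http://localhost:8082/shop/$1 [P]
ProxyPassReverse /shop/ http://localhost:8082/shop/
```

The [P] flag makes mod_rewrite pass the request to mod_proxy instead of redirecting the browser, so the client keeps talking to the front-end server only.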

The Advantages:

The Disadvantages:

For implementation see Using mod_proxy.




Written by Stas Bekman.
Last Modified at 08/17/1999
Use of the Camel for Perl is
a trademark of O'Reilly & Associates,
and is used by permission.