1. Introduction to Apache
The World Wide Web (WWW) is the Internet’s
most successful application, and its most prominent component is a web server.
The web server serves the user’s request by returning the requested web page to
the user. Two applications are required in order to process such requests: a
web server, and a web client. The two communicate using the Hypertext Transfer
Protocol (HTTP).
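The exchange between a web client and a web server can be sketched in Python without touching the network. This is only an illustration of the text format of HTTP messages: the function names, the host www.example.com, and the canned response are assumptions made for the example, not something taken from Apache.

```python
def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, protocol version
        f"Host: {host}",          # the Host header is mandatory in HTTP/1.1
        "Connection: close",
        "",                       # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(response):
    """Split the first line of an HTTP response into version, code, reason."""
    status_line = response.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


# Hypothetical request and a canned response a server might send back.
request = build_get_request("www.example.com", "/index.html")
response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"

print(request)
print(parse_status_line(response))
```

The client sends the request text, and the server answers with a status line, headers, and the requested page.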
According to Netcraft’s monthly secure server
surveys available at http://news.netcraft.com/,
the Apache web server currently has 68.01% of the market share as compared to
its competitors, Microsoft at 20.56%, and Sun Microsystems at 2.47%.
The Apache HTTP web server is a project of the Apache
Software Foundation, which supports other open source projects as well,
including Ant, SpamAssassin, Struts, and Tomcat. The current version of the
Apache web server, which is being used for the purposes of this tutorial, is
version 2.2.0. It can be downloaded from its official website at http://httpd.apache.org/download.cgi.
2. Installation
Apache is usually pre-installed in most Linux
distributions. On an rpm-based distribution, use the rpm -qa | grep httpd command
to confirm whether it is installed or not. If Apache has been installed from the
source code, the command mentioned above will not produce any result. In this
case, try locating the httpd, apache, or apache2 directories. If any of these
directories exists on your system, Apache has probably already been installed on it.
Apache can also be installed manually, from either the rpm or the source code.
This tutorial will demonstrate both methods.
2.1. Installing from the rpm
1. Download Apache’s latest version from http://httpd.apache.org/download.cgi.
# wget
\
2. If you have already installed a previous
version of Apache:
· From the rpm: Uninstall it, using the command:
# rpm -e httpd
· From the source installation: Install the new rpm on a path that is different from the path of the source installation.
Apache’s rpm can be installed by the following command:
# rpm -ivh httpd-2.2.0-1.i386.rpm
If you get any dependency errors regarding the Apache Portable Runtime (APR or apr)
packages, upgrade them to the version compatible with httpd-2.2.0-1, which is
apr-1.2.2-1. It can be downloaded from http://apr.apache.org/.
3. Verify the installation by running:
# rpm -q httpd
Then browse to the "/etc/httpd" path to check the installed files.
2.2. Installing from the source
A number of options can be used to configure
Apache. Customized installation will be discussed in the “References” section.
- Download Apache from http://httpd.apache.org/download.cgi
# wget http://apache.mirror99.com/httpd/httpd-2.2.0.tar.gz
- Unpack the distribution in "/usr/local" and change to the source directory. This path is optional, and is being used for the purposes of this tutorial only:
# tar zxvf httpd-2.2.0.tar.gz -C /usr/local
# cd /usr/local/httpd-2.2.0/
- Run configure with the following options:
# ./configure --enable-layout=Apache --prefix=/usr/local/apache2 \
    --enable-modules=most --enable-mods-shared=most
- Run make to compile the distribution:
# make
- Install Apache by running the following command:
# make install
3. Apache Configuration
If you are using the pre-installed Apache that
comes with the distribution, then it is probably installed in /etc/httpd.
If you have built it from the source, and followed the procedure mentioned in
the previous section, then the path is /usr/local/apache2. For the purposes of
this tutorial, $APACHE_HOME refers to this installation path ("/etc/httpd" or
"/usr/local/apache2").
Apache runs as a daemon in the background and handles requests continuously.
Port 80 is specified by default in the Apache configuration file, httpd.conf.
Binding to port 80 requires root privileges, so start Apache as root
via the following command:
# $APACHE_HOME/bin/apachectl start
[If a pre-installed version of Apache is
being used, then apachectl might not be under the $APACHE_HOME/bin directory]
Other useful commands include:
# $APACHE_HOME/bin/apachectl stop
# $APACHE_HOME/bin/apachectl restart
# $APACHE_HOME/bin/apachectl status
A start-up script, httpd, can also be
used to start, stop, or restart the Apache
web server:
# /etc/init.d/httpd start
Apache reads a special file at start-up, httpd.conf,
which contains configuration-specific information. This is the main
configuration file, and its location can be configured either at the time of
compilation, or it can be specified at start-up by passing the -f option,
e.g. apachectl -f /path/to/config/file.
This configuration file is divided into three
sections:
- Global Environment: This section defines configuration parameters for the Apache server process e.g. the path to the Apache configuration directory; the Apache pid file, and the path to other configuration files, etc.
- Main Server Configuration: Apache can be configured to host multiple websites on a single host, and each website can be handled by defining a virtual host entry. The main server configuration specifies the default settings for the Apache server which are not handled by virtual hosts.
- Virtual Host: This section defines settings for virtual hosts that are either IP-based, or name-based.
The configuration file consists of directives.
Most directives have a global scope that applies to the
entire server, but their scope can be limited by placing them inside
container sections, such as <Directory>, <DirectoryMatch>, <Files>,
and <Location>.
3.1. Running Apache
In order to test whether the web server
configuration file is syntactically correct or not, run the command:
# apachectl configtest
The output will display "Syntax OK"
if everything is correct.
The Apache configuration file, httpd.conf,
specifies the web server listening port; it is 80 by default. If it is not,
change the port to 80, restart the Apache web server, and browse to "http://localhost".
If the configuration is correct, the browser will display, "Test
Page".
Note: In Fedora Core 3, SELinux can create problems in Apache’s
configuration. Ensure that it is disabled before testing the configuration, and
then restart Apache.
4. Basics of Apache Configuration
Some common configuration tasks include
server-wide configuration, site-specific configuration, virtual hosting,
logging, access control and authentication.
4.1. Server-wide configuration
Basic server configuration specifies the
following:
Server Name: This specifies the
server name and the port which is used by the server to identify itself. This
is useful for the purposes of redirection e.g. when the machine’s name is
xyz.osrc.org.pk, but it has the DNS entry for www.osrc.org.pk, and
you want to identify the machine as the latter, then the "ServerName"
can be used as given below:
ServerName www.osrc.org.pk:80
Specify the server name in order to prevent
any problem at start-up. This directive can also be used in the virtual host
section.
Listening Port: This specifies the
port number or IP, and the port number on which the web server will listen for
incoming requests. If only the port is specified, then the server will listen
on the given port number on all IP interfaces, otherwise it will listen to the
specified IP and port number only:
Listen 80
[Listens on port 80 on all available interfaces]
Listen 12.34.56.78:80
[Listens on port 80 on the IP 12.34.56.78 only]
4.2. Site-specific configuration
Document Root: The default web
folder for Apache is /var/www/html where you can publish HTML documents. This can be changed
by using the DocumentRoot directive. This directive can also be used in
the virtual host section:
DocumentRoot /var/www/html
Directory Index: If the requested URL
specifies a directory, this option specifies the resources to look for, e.g. http://www.xyz.com/downloads/
where the trailing / indicates that "downloads" is a directory. The resources can
be, for instance, index.html, index.php, etc. It is important to note that the
order matters, and that the first available resource will always be returned:
DirectoryIndex index.html index.php index.txt
The above configuration tells Apache to look
for the index.html file in the "downloads" directory. If there
is no index.html, look for index.php, and then index.txt.
If none of these resources can be found, then the behavior depends upon whether
the Indexes option has been set with the Options directive.
This directive can also be used in the virtual host section.
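The lookup order described for DirectoryIndex can be modelled with a few lines of Python. This is a simplified sketch rather than Apache's code: pick_directory_index is a hypothetical helper, and the directory contents are passed in as a plain set of file names.

```python
def pick_directory_index(candidates, files_in_directory):
    """Return the first DirectoryIndex candidate that exists, or None."""
    for name in candidates:
        if name in files_in_directory:
            return name
    # Nothing matched: the outcome now depends on Options Indexes.
    return None


# Candidates in the order they would appear after the DirectoryIndex directive.
candidates = ["index.html", "index.php", "index.txt"]

# index.html is missing, so index.php wins over index.txt.
print(pick_directory_index(candidates, {"index.txt", "index.php"}))

# No candidate exists in this directory.
print(pick_directory_index(candidates, {"readme.md"}))
```

The first call prints index.php and the second prints None, which corresponds to the case where Apache falls back to the Indexes behavior.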
Options Indexes: If this option is set for a directory, and the requested
URL maps to a directory e.g. http://www.xyz.com/downloads/,
and no DirectoryIndex is set, or the resource specified in the DirectoryIndex
cannot be found, then this option will create a default formatted listing for
the requested directory:
<Directory "/var/www/html">
Options Indexes
</Directory>
This configuration will set the auto index
generation for the directory "html" and its sub-directories.
This directive can also be used in the virtual host section.
4.3. Virtual Hosts
Virtual hosting allows running more than one
website on a single machine. By default, a single Apache daemon serves only one
website. In order to run multiple websites, you can either use
multiple Apache daemons, with each daemon handling a specific website, or
configure Apache for virtual hosting. Running multiple daemons is
inefficient and should, therefore, be avoided. Virtual hosts can be:
4.3.1. IP-Based
This allows running multiple websites, each
with a different IP, on a single machine. This can be achieved by hosts that
have multiple network connections, or by virtual interfaces. A multi-homed
machine, for example, can have two network cards with IPs 192.168.2.58 and
10.10.10.100. You can configure a website http://www.xyz.com/accounts
on 192.168.2.58 and http://www.xyz.com/hr
on 10.10.10.100.
The following is a sample configuration of
IP-based virtual hosts. The hostnames will be resolved to their respective IP
addresses.
<VirtualHost www.example1.com>
DocumentRoot /var/www/html/example1
</VirtualHost>
<VirtualHost www.example2.com>
DocumentRoot /var/www/html/example2
</VirtualHost>
Ensure that the NameVirtualHost entry in the main section is commented out. With the above
configuration, when a client requests http://www.example1.com,
the hostname resolves to 192.168.2.58, and Apache returns
the contents of the directory specified by DocumentRoot for that virtual host.
A similar operation can be performed by
Apache for http://www.example2.com,
where the IP address is 10.10.10.100. These hostnames, and their corresponding
IP addresses, should be specified in the "/etc/hosts" file in the web
server machine, in addition to creating entries in the DNS server. Otherwise,
the client will need to specify http://www.example1.com/example1
instead of just www.example1.com.
The above-mentioned configuration requires
Apache to resolve the hostnames through DNS when it reads its configuration,
which slows down start-up and makes the virtual hosts dependent on DNS being
available. Please refer to http://httpd.apache.org/docs/2.2/dns-caveats.html for more
information. The recommended practice is to specify the IP address instead of the
hostname in the virtual host section.
<VirtualHost 192.168.2.58>
DocumentRoot /var/www/html/example1
ServerName www.example1.com
</VirtualHost>
<VirtualHost 10.10.10.100>
DocumentRoot /var/www/html/example2
ServerName www.example2.com
</VirtualHost>
You need an additional directive, ServerName,
so that the requests for example1 or example2 can be mapped. If
no ServerName is specified, then Apache will try the reverse DNS in order
to look up the hostname.
4.3.2. Name-Based
Name-based virtual hosts allow multiple
websites on a single IP address. This is in contrast to IP-based virtual hosts,
where you need an IP address for each website. IP-based virtual hosts rely
explicitly on IP addresses to determine the correct virtual host to serve.
Name-based virtual hosts rely on the client to specify the hostname in the HTTP
headers. Name-based virtual hosts are easy to configure, and do not require
multiple IP addresses, and can, therefore, work in situations in which you are
short of IPs. Prefer name-based virtual hosting over IP-based virtual hosting
unless you have very specific reasons for doing otherwise. The following is a
sample configuration for name-based virtual hosts:
NameVirtualHost 192.168.2.58:80
<VirtualHost 192.168.2.58:80>
DocumentRoot /var/www/html/example1
ServerName www.example1.com
</VirtualHost>
<VirtualHost 192.168.2.58:80>
DocumentRoot /var/www/html/example2
ServerName www.example2.com
</VirtualHost>
The NameVirtualHost directive specifies the IP address
(here 192.168.2.58) on which Apache accepts requests for name-based virtual
hosts. Normally, you can use * here, but in cases which require mixed types
of settings, i.e. a host that supports both IP-based and name-based virtual
hosts, you need to specify which IP address you want to configure for
name-based virtual hosting. If you are planning to use multiple ports, for
example an additional port for SSL, then specify the port here as well. The
argument given in NameVirtualHost must match the argument of each
<VirtualHost> section for name-based virtual hosts:
NameVirtualHost *
<VirtualHost *>
DocumentRoot /var/www/html/example1
ServerName www.example1.com
</VirtualHost>
<VirtualHost *>
DocumentRoot /var/www/html/example2
ServerName www.example2.com
</VirtualHost>
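The selection that name-based virtual hosting performs can be sketched in Python. This is only an illustrative model, not Apache's implementation: the VIRTUAL_HOSTS table and the select_document_root function are hypothetical names, ServerAlias is ignored, and the fallback to the first listed virtual host follows Apache's documented behavior for requests whose Host header matches no ServerName.

```python
# (ServerName, DocumentRoot) pairs in the order they appear in httpd.conf.
VIRTUAL_HOSTS = [
    ("www.example1.com", "/var/www/html/example1"),
    ("www.example2.com", "/var/www/html/example2"),
]


def select_document_root(host_header):
    """Pick a DocumentRoot from the Host header sent by the client."""
    # Drop an optional ":port" suffix and compare case-insensitively.
    hostname = host_header.split(":", 1)[0].strip().lower()
    for server_name, document_root in VIRTUAL_HOSTS:
        if server_name == hostname:
            return document_root
    # No ServerName matched: the first listed virtual host acts as the default.
    return VIRTUAL_HOSTS[0][1]


print(select_document_root("www.example2.com"))
print(select_document_root("WWW.EXAMPLE1.COM:80"))
print(select_document_root("unknown.example.org"))
```

Both websites share one IP address, and only the Host header decides which DocumentRoot is served.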
4.4. Authentication, Authorization and Access Control
Authentication refers to the verification of
the identity of the requesting host and/or user i.e. the user/host is actually
who/what they claim to be.
Authorization is the process of granting
someone access to the areas to which the user is allowed to go.
Access control is also authorization, but it
provides authorization at another layer i.e. based on an IP address, hostname
or the characteristic of the request.
Make sure that the requisite modules are
installed and loaded in Apache beforehand. Please refer to http://httpd.apache.org/docs/2.2/howto/auth.html
and http://httpd.apache.org/docs/2.2/howto/access.html
for the list.
In order to implement such security
mechanisms, you first need to understand the Apache directory’s structure, and
its configuration. Apache is normally configured using the main httpd.conf
file, where the configuration parameters are applicable to all the published
web folders. Sometimes you need to customize configuration based on specific
directories, URLs, files, hosts, or locations. You might, for example, want to
restrict a particular section of the website to a few users, in which case
Apache provides two options: either use <Directory> </Directory>
in the main configuration file httpd.conf, or use the .htaccess
special file by placing it in that directory. Conceptually, there is no
difference in either of the above-mentioned methods, as both have the same
syntax and applicability. The difference between a directory, a file, and
locations is as follows:
<Directory /var/www/html/test>
Order allow,deny
Deny from all
</Directory>
This means denying access to the directory test
and all its sub-directories. So, access to the URL http://www.test.com
pointing to the directory /var/www/html/test is denied. Access to the URL http://www.test.com/public
pointing to the directory /var/www/html/all is allowed.
<Files private.html>
Order allow,deny
Deny from all
</Files>
This means that access to the file private.html
located anywhere is denied.
<Location /private>
Order allow,deny
Deny from all
</Location>
This means that access to any URL whose path begins with /private
is denied. Access to http://www.test.com/private/public
is not allowed, whereas access to http://www.test.com/public
is allowed.
The .htaccess method is easy to
configure. Alternatively, the contents of the .htaccess file can be placed in a
<Directory> </Directory> section in the main configuration file.
The name of the .htaccess file can be
changed by using the AccessFileName directive in the main configuration
file. Configure Apache to allow such configuration files for directories. This
can be done by using AllowOverride AuthConfig in <Directory>
</Directory>. If you want a special directory, /var/www/html/public/restricted
to be restricted, for example, you must allow the use of the .htaccess file.
Place the following configuration in Apache’s main configuration file:
<Directory /var/www/html/public/restricted>
AllowOverride AuthConfig
</Directory>
Define the users who are granted access to
the restricted area. These users, and their passwords, will be defined in a
special file, which should be placed somewhere which is inaccessible to the
web. The file can be created with a special utility htpasswd that comes
with Apache:
# htpasswd -c /etc/httpd/conf/passwd user1
New password:
Re-type new password:
Adding password for user user1
Create the .htaccess file in /var/www/html/public/restricted
from where the Apache server will read the configuration about the password
file and users in order to allow them access to the restricted area:
.htaccess
----------------------------------------------
AuthType Basic
AuthName "Restricted Files"
# Optional line:
AuthBasicProvider file
AuthUserFile /etc/httpd/conf/passwd
Require user user1
----------------------------------------------
AuthType specifies the type of authentication. With Basic, the password is
sent from the client to the server unencrypted. AuthName specifies the realm,
which the browser displays in the password prompt and uses to decide which
saved credentials to send. AuthUserFile specifies the path
of the password file, and Require user specifies the user to whom
access must be granted. Sometimes access needs to be granted to more than one
user. This can be achieved by using Require valid-user, which will
allow access to the restricted area to anyone listed in the password file. Please
see the “References” section for more advanced techniques regarding configuring
authentication/authorization, using groups, and databases.
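To see why Basic authentication counts as unencrypted, the header that a browser sends can be reproduced in Python. This is a sketch under stated assumptions: the helper names are hypothetical and the password "secret" is made up for the example. The user name and password are only Base64-encoded, so anyone who captures the header can decode them.

```python
import base64


def basic_auth_header(user, password):
    """Build the Authorization header a client sends for Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic_auth(header):
    """Recover the user name and password from a captured header."""
    token = header.split(" ", 2)[2]
    user, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return user, password


# "secret" is a made-up password used only for this illustration.
header = basic_auth_header("user1", "secret")
print(header)
print(decode_basic_auth(header))
```

Because decoding needs no key, Basic authentication is usually combined with SSL when the credentials matter.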
Now consider restricting access based on
hostnames, IP addresses, or the characteristic of the request. Please refer to http://httpd.apache.org/docs/2.2/howto/access.html
for a list of modules that require installing and loading in this regard.
In order to customize access based on
hosts/IPs, use Allow and Deny directives. The Order
directive can also be used to specify the order in which the filters should be
applied. The syntax is:
Allow from HOST
Deny from HOST
Order Allow,Deny
Order Deny,Allow
Consider the examples given below:
- Allow from 192.168.2.100 [Allow from this host only]
- Allow from 192.168.2.0/24 [Allow from this network 192.168.2 only]
- Allow from 192.168.2.100 192.168.2.200 [Allow from these hosts only]
- Allow from my.host.com
Order specifies the order of the filters, which
can be:
Deny,Allow: First the Deny, and
then the Allow directives are evaluated. Access is allowed by default,
meaning that any client that matches neither the Deny nor the Allow
directive will be allowed to access the server.
Allow,Deny: First the
Allow, and then the Deny directives are evaluated. Access is denied
by default, meaning that any client that matches neither the Allow nor
the Deny directive will be denied access to the server.
Consider a real example, a directory /var/www/html/localusers.
You want only local users falling in the 192.168.2 network access to /var/www/html/localusers.
Use the following configuration:
<Directory /var/www/html/localusers>
Order Allow,Deny
Allow from 192.168.2.0/24
</Directory>
Consider the following configuration:
<Directory /var/www/html/localusers>
Order Allow,Deny
Allow from 192.168.2.0/24
Deny from 192.168.2.178
</Directory>
This will allow access to all hosts in the
network 192.168.2.0/24 except 192.168.2.178. All other requests will be denied
by default. Changing the order from Allow,Deny to Deny,Allow will
reverse this: 192.168.2.178 will be allowed, since Allow is evaluated last and
overrides the Deny, and hosts outside the network will also be allowed, since
access is allowed by default.
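The way Order combines the Allow and Deny directives can be checked with a small Python model. This is a sketch of the documented Apache 2.2 evaluation rules, not Apache's own code: is_allowed is a hypothetical helper, and it only understands IP addresses and networks, not hostnames.

```python
import ipaddress


def is_allowed(client_ip, order, allow, deny):
    """Model how Order combines the Allow and Deny lists for one client."""
    address = ipaddress.ip_address(client_ip)
    matches_allow = any(address in ipaddress.ip_network(net) for net in allow)
    matches_deny = any(address in ipaddress.ip_network(net) for net in deny)
    if order == "Allow,Deny":
        # Deny is evaluated last and wins; clients matching nothing are denied.
        return matches_allow and not matches_deny
    if order == "Deny,Allow":
        # Allow is evaluated last and wins; clients matching nothing are allowed.
        return matches_allow or not matches_deny
    raise ValueError(f"unsupported order: {order}")


allow = ["192.168.2.0/24"]
deny = ["192.168.2.178"]

for ip in ("192.168.2.100", "192.168.2.178", "10.0.0.1"):
    print(ip,
          is_allowed(ip, "Allow,Deny", allow, deny),
          is_allowed(ip, "Deny,Allow", allow, deny))
```

With Allow,Deny only 192.168.2.100 gets through, while with Deny,Allow all three clients are allowed.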
4.5. Logging
Apache logs provide comprehensive information
and customization for the purposes of security analysis and troubleshooting.
Apache logs are located, by default, under the /var/log/httpd directory.
There are two basic types of logs:
Error Log: This log records
diagnostic information about errors encountered while processing requests. The
location of this log can be controlled by the ErrorLog directive in the
main configuration file. Error logs cannot be customized.
Access Log: This log records
useful information, such as client IP, date/time, location accessed, client
platform information, and so on. An access log can be customized, and its
location and content can be controlled by the CustomLog directive.
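An access log entry can be split into its fields with a short Python script. This sketch assumes the log is written in the Common Log Format (host, identity, user, time, request line, status, size). The sample line and the parse_access_line helper are made up for illustration, and a log written with a different CustomLog format would need a different pattern.

```python
import re

# Pattern for one line of the Common Log Format.
COMMON_LOG = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_access_line(line):
    """Return the fields of one access log line as a dict, or None."""
    match = COMMON_LOG.match(line)
    return match.groupdict() if match else None


# A made-up log line in the assumed format.
sample = ('192.168.2.100 - user1 [10/Oct/2005:13:55:36 +0500] '
          '"GET /index.html HTTP/1.1" 200 2326')

entry = parse_access_line(sample)
print(entry["host"], entry["status"], entry["request"])
```

Each named group corresponds to one piece of information mentioned above, such as the client IP, the date and time, and the location accessed.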
5. An Example Set-up
Consider a real-world example to configure a
static website. The configuration is given below:
Routable Server IP 203.215.183.11
Non-routable IP 192.168.2.178
Domain name www.testmachine.org
host name osrc-test
FQDN osrc-test.testmachine.org
The machine’s name is osrc-test, but
the DNS alias for this configuration is www.testmachine.org.
Steps
- Open the Apache configuration file httpd.conf
- Locate DocumentRoot and ensure that it is set to /var/www/html
- Set ServerName to www.testmachine.org:80
- Put your web-publishing directory directly under /var/www/html. If you have all the data that is to be published under /home/user1/website, type:
$ mv /home/user1/website/* /var/www/html
- Save the Apache configuration file with new changes, exit, and restart the Apache service.
Ensure that valid DNS entries exist for www.testmachine.org
and that they point to the IP of your machine.
Test the website by pointing your browser to www.testmachine.org.