Apache

Versions
Terminology and how it all fits together
Commands
Config
Apache redirection
Useful Modules
Useful bits of code
Scripts
Apache Bench -Load test your system
Logs
Error codes

Versions

The default with CentOS/Red Hat 6 is 2.2.x (CentOS, like the Red Hat OS it derives from, is geared towards stability and so tends to lag behind the cutting edge).

2.4 is available and offers improved performance.
In 2.4 the "event" MPM is fully supported and is the default MPM on platforms that support it when building from source; it uses a lot less memory than the prefork MPM.

New features are listed here: http://httpd.apache.org/docs/2.4/new_features_2_4.html

If you want to upgrade, compiling from source seems the most popular way, although here is a link describing how to do it via yum:
http://developerblog.redhat.com/2013/10/24/apache-httpd-2-4-on-red-hat-enterprise-linux-6/

Terminology and how it all fits together

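The terms Apache "thread" and Apache "child process" are often used interchangeably. Under the prefork MPM that is close enough, since each child process handles one connection at a time; under the worker and event MPMs each child process runs several threads.

Apache has a root (parent) process, and under prefork each client connection (i.e. whenever any separate user goes to your website) is served by its own "child" process.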

Each of these processes will use a certain amount of memory (which is dictated by your application). This memory usage is not static, so can vary between each process. It can also vary depending on what the application is serving to each client.

Apache doesn't allow multiple versions of PHP using mod_php, so to run more than one you need to compile PHP with FastCGI support. This will allow you to configure multiple sites, each one with a different version of PHP.

If you are using name-based virtual hosting for SSL, a request for a domain that has no virtual host configured on that ip:port is sent to a default virtual host.
Apache treats the first virtual host it sees for an ip:port combination as the default. This can be confirmed with the command httpd -S

Commands

httpd -S # show the virtual host settings (as parsed from the config files)
httpd -M # list the loaded static and shared modules
httpd -l # list the modules compiled into the server; this will not list dynamically loaded modules
httpd -X # run in debug mode (a single worker that does not detach from the console)
httpd -k start -c "DocumentRoot /var/www/html_debug/" # start Apache using an alternative DocumentRoot
This is useful if you're trying out alternative versions of your web site, as it avoids editing the DocumentRoot option.
You can also use -c to set any other directive. Note that -c processes the directive after reading the config files (so it will override config file settings),
whereas -C processes the directive before the config files.
httpd -k start -e debug # start Apache with the LogLevel set to debug during server startup
While you are debugging a startup issue, option -e lets you raise the LogLevel temporarily without modifying the LogLevel directive in httpd.conf.
Possible values you can pass to option -e are: debug, info, notice, warn, error, crit, alert, emerg

Config

A good page that covers a lot of the fundamentals of the Apache config can be found at http://www.rackspace.com/knowledge_center/article/centos-apache-virtual-hosts

The main Apache config is /etc/httpd/conf/httpd.conf
A lot of the parameters in that file can be tweaked to improve performance.
Some of the more useful settings (with typically assigned values; note that 2.4 renames MaxClients to MaxRequestWorkers):

StartServers 8

MinSpareServers 5

MaxSpareServers 20

ServerLimit 256

MaxClients 50

The StartServers, MinSpareServers, MaxSpareServers, and MaxClients regulate how the parent process creates children to serve requests.
In general, Apache is very self-regulating, so most sites do not need to adjust these directives from their default values.

MaxSpareServers and MinSpareServers determine how many idle child processes to keep while waiting for requests. If MinSpareServers is too low and a burst of requests comes in, Apache has to spawn additional child processes to serve them. Creating child processes is relatively expensive, and while the server is busy creating them it can't serve client requests immediately. MaxSpareServers shouldn't be set too high either, since idle child processes still consume resources.
Tune MinSpareServers and MaxSpareServers so that Apache rarely needs to spawn more than 4 child processes per second (Apache can spawn a maximum of 32 child processes per second). When more than 4 children are spawned per second, a message is logged in the ErrorLog.

The StartServers directive sets the number of child server processes created on startup. Apache will continue creating child processes until the MinSpareServers setting is reached. It doesn't have much effect on performance if the server isn't restarted frequently. If there are a lot of requests and Apache is restarted frequently, set this to a relatively high value.

http://rudd-o.com/linux-and-free-software/tuning-an-apache-server-in-5-minutes has some good hints

To find out which MPM (prefork or worker) you need to tune, check what Apache was compiled with; normally it is the prefork MPM.

Run httpd -l to see what Apache was compiled with:
Compiled in modules:
core.c
prefork.c
http_core.c
mod_so.c

A good page on performance tuning: http://fuseinteractive.ca/blog/drupal-performance-tuning-0

Apache's full status (mod_status) is a great help in troubleshooting. The following pages cover basic setup and what to check when it doesn't work; with multiple vhosts it can get a bit messy:

http://articles.slicehost.com/2010/3/26/enabling-and-using-apache-s-mod_status-on-centos

http://www.mydigitallife.info/request-url-server-status-or-404-page-not-found-apache-error/

The following can be added to .htaccess or httpd.conf (Apache does not allow a comment on the same line as a directive, so keep any comment on its own line):
RedirectMatch 301 ^/bbc.co.uk http://www.bbc.co.uk

The following link shows how to use the RLimitCPU directive (part of the Apache core, not a separate module) to limit the CPU time taken up by processes launched by Apache children:
http://www.apacheref.com/ref/http_core/RLimitCPU.html

The following link shows how to use the RLimitMEM core directive to limit the memory taken up by processes launched by Apache children:
http://httpd.apache.org/docs/current/mod/core.html#rlimitmem

Apache redirection

Suppose every URL under http://a.com needs to be redirected to an https target (abc in the examples below):

RedirectMatch permanent ^/*$ abc

or

RedirectMatch permanent ^/.*$ abc

The dot matches any character except a line break, and the asterisk matches the preceding token zero or more times.
So ^/.*$ matches any path that starts with /, whereas without the dot ^/*$ matches only a run of slashes (/, //, and so on).
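The difference between the two patterns can be checked from the shell. This is a minimal sketch that uses grep -E as a stand-in for Apache's regex engine (an assumption that holds for patterns this simple); match_patterns is a hypothetical helper name, not an Apache tool.

```shell
#!/bin/bash
# Report which of the two RedirectMatch patterns match a request path.
# grep -E stands in for Apache's regex matching (assumed equivalent here).
match_patterns() {
  local path=$1 result=""
  if printf '%s\n' "$path" | grep -Eq '^/*$';  then result="$result slashes-only"; fi
  if printf '%s\n' "$path" | grep -Eq '^/.*$'; then result="$result any-path"; fi
  echo "$path ->${result:- none}"
}

# Illustrative paths only.
for p in / // /index.html /a/b; do
  match_patterns "$p"
done
```

Only / and // match both patterns; /index.html and /a/b match only the version with the dot.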

Useful Modules

Modules allow you to add features to the default service, here are some of the more useful:

mod_log_forensic provides forensic logging of client requests. Logging is done before and after processing a request, so the forensic log contains two log lines for each request. Documentation is at https://httpd.apache.org/docs/trunk/mod/mod_log_forensic.html and an example of how to use it is at http://northernmost.org/blog/mod_log_forensic-howto/

mod_limitipconn can be used to limit the number of simultaneous connections (e.g. downloads) from a single IP.

Useful bits of code

ps aux | grep '[h]ttpd' | awk '{sum+=$6} END {print "httpd mem (MB):", sum/1024}' # shows the amount of memory (RSS, in MB) being taken up by Apache

apachectl fullstatus | grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | grep -v "127.0.0.1" | sort -n | uniq -c | sort -nr | head -1 # shows the client IP with the most connections at present

grep -c '13/Jan/2014:14' /etc/httpd/logs/* | sort -t':' -nk 2 | grep -v ':0$' # good for seeing which log files were populated during a certain hour

grep -c "Invalid URI in request" /etc/httpd/logs/* - check all logs for text displays number of instances each was found

awk '{print $4}' access_log | cut -d: -f1 | uniq -c # shows the number of requests per day

grep "13/Jan" access_log | cut -d[ -f2 | cut -d] -f1 | awk -F: '{print $2":00"}' | sort -n | uniq -c # shows the number of connections per hour
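The per-hour count can also be tried against a few sample lines without touching a real log. This sketch does the counting in a single awk pass rather than the cut/sort/uniq pipeline above; requests_per_hour is a hypothetical function name, the log lines are made up, and the common/combined log format is assumed.

```shell
#!/bin/bash
# Count requests per hour for one day in a single awk pass.
# Reads access-log lines (common/combined format assumed) on stdin.
requests_per_hour() {
  local day=$1   # date as written in the log, e.g. 13/Jan/2014
  awk -v day="$day" '
    {
      split($4, t, ":")                  # $4 looks like [13/Jan/2014:14:05:01
      if (t[1] == ("[" day)) count[t[2]]++
    }
    END { for (h in count) printf "%s:00 %d\n", h, count[h] }' | sort
}

# Made-up sample data, for illustration only.
requests_per_hour 13/Jan/2014 <<'EOF'
10.0.0.1 - - [13/Jan/2014:14:05:01 +0000] "GET / HTTP/1.1" 200 512
10.0.0.2 - - [13/Jan/2014:14:45:10 +0000] "GET /about HTTP/1.1" 200 1024
10.0.0.1 - - [13/Jan/2014:15:00:03 +0000] "GET / HTTP/1.1" 200 512
10.0.0.3 - - [14/Jan/2014:09:12:44 +0000] "GET / HTTP/1.1" 200 512
EOF
```

Against a real log the same function would be fed with: requests_per_hour 13/Jan/2014 < access_log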

grep "13/Jan/2014:19" access_log | cut -d[ -f2 | cut -d] -f1 | awk -F: '{print $2":"$3}' | sort -nk1 -nk2 | uniq -c | awk '{ if ($1 > 10) print $0}' # number of requests per minute (only minutes with more than 10 requests are shown)

grep "Jan 16" error_log | grep "01:[0-9][0-9]:" | awk '{print $4}' | awk -F: '{print $1 ":" $2}' | uniq -c # shows the number of events per minute at 01:xx on 16 Jan

cut -d\" -f 2 /etc/httpd/logs/access_log | sort | uniq -c | sort -rnk 1 | head -n 30 # shows the most popular pages (the 30 most frequent request lines)

grep wp-login access.log # check what wp-login pages have been called

grep bot ./access_list > bot_report.txt # output any mentions of the word bot

grep -h -e "01/Jun/2014" -e "31/May/2014" access_log* | awk '{print $1}' | sort | uniq -c | sort -rn | head -25 # check the access logs for connections per IP on the dates specified, ordered by number of connections (-h stops grep prefixing each line with the file name)
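Several of the one-liners above share the same sort | uniq -c | sort -rn counting idiom. As a minimal sketch, it can be wrapped in a small function and tried on sample lines; top_clients is a hypothetical name and the log lines are made up.

```shell
#!/bin/bash
# Print the N busiest client IPs as "IP count", busiest first.
# Reads access-log lines on stdin; the client IP is assumed to be field 1.
top_clients() {
  local n=$1
  awk '{print $1}' | sort | uniq -c | sort -rn | head -n "$n" | awk '{print $2, $1}'
}

# Made-up sample data, for illustration only.
top_clients 2 <<'EOF'
10.0.0.1 - - [01/Jun/2014:10:00:00 +0000] "GET / HTTP/1.1" 200 512
10.0.0.2 - - [01/Jun/2014:10:00:01 +0000] "GET / HTTP/1.1" 200 512
10.0.0.1 - - [01/Jun/2014:10:00:02 +0000] "GET /a HTTP/1.1" 200 512
10.0.0.1 - - [01/Jun/2014:10:00:03 +0000] "GET /b HTTP/1.1" 200 512
10.0.0.3 - - [01/Jun/2014:10:00:04 +0000] "GET / HTTP/1.1" 200 512
10.0.0.2 - - [01/Jun/2014:10:00:05 +0000] "GET / HTTP/1.1" 200 512
EOF
```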

Scripts
To log when Apache hits a high number of users:

#!/bin/bash
log_path=/root/scripts/number_of_appache_users.log
# Assumption: $number is the count of running httpd processes;
# substitute your own measure of load if you prefer.
number=$(ps -C httpd --no-headers | wc -l)
if [ "$number" -gt 150 ]
then
  date >> "$log_path"
  echo "number of individual connections" >> "$log_path"
  /usr/sbin/apachectl fullstatus | awk '{print $11}' | grep -v "::1" | grep -v '^$' | sort | uniq -c | sort -n | tail >> "$log_path"
  echo "pages being viewed" >> "$log_path"
  /usr/sbin/apachectl fullstatus | awk '{print $11,$12,$13,$14}' | grep -v "::1" | grep -v '^$' | sort | uniq -c | sort -n >> "$log_path"
fi

To get a recommended value for Apache's MaxClients:

#!/bin/bash
# Settings
RESERVE=0.25 # How much RAM to reserve (in GB) for other processes
APACHE=httpd # What the apache process is called

# Calculate how much RAM this machine has in GB
TOTAL_RAM=`cat /proc/meminfo | head -n1 | awk '{print $2}'`
RAM_TO_USE=$(echo "scale=3;($TOTAL_RAM.0/1048576.0)-$RESERVE" | bc)

# Count the memory in use by apache processes
# Change 'rss' to 'size' if a lot of shared memory is in use
TOTAL=0; PROCESSES=0
while read value; do
  let TOTAL=TOTAL+$value
  let PROCESSES=PROCESSES+1
done < <(ps -eao "rss,cmd" | grep "[/]$APACHE" | awk '{print $1}')

if [ $PROCESSES -eq 0 ]; then
  echo "Apache does not appear to be running, or its binary is not called '$APACHE'."
  exit 1
fi

AVERAGE=$(echo "scale=0;($TOTAL/$PROCESSES)" | bc)
MAX_CLIENTS=$(echo "scale=0;($RAM_TO_USE*1048576/$AVERAGE)" | bc)

echo "Total is $TOTAL kB in $PROCESSES processes, an average of $AVERAGE kB per process."
echo "Assuming you are devoting $RAM_TO_USE GB to apache, this would suggest a MaxClients value of approx $MAX_CLIENTS."
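The arithmetic behind that estimate is simply (total RAM minus the reserve) divided by the average size of one Apache process. Here is a small worked sketch with made-up figures; it uses shell integer arithmetic in kB instead of bc, and estimate_max_clients is a hypothetical helper name.

```shell
#!/bin/bash
# Worked example of the MaxClients estimate.
# All figures are illustrative assumptions, not measurements.
estimate_max_clients() {
  local total_ram_kb=$1 reserve_kb=$2 avg_process_kb=$3
  # Integer division: RAM left for Apache / average RSS per process
  echo $(( (total_ram_kb - reserve_kb) / avg_process_kb ))
}

# 4 GB of RAM, 0.25 GB reserved, 30000 kB average per httpd process:
# (4194304 - 262144) / 30000 = 131 (rounded down)
estimate_max_clients 4194304 262144 30000
```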

Apache Bench

To load test your system use Apache Bench (ab), check the performance (memory and load), and then retest.

ab is normally installed along with Apache (on CentOS it is in the httpd-tools package).

Example:

ab -n 100 -c 10 http://url/ (100 requests, with a concurrency of 10)

where -n = number of requests and -c = concurrent connections, i.e. simultaneous clients. Make sure the URL is in the format shown, including the trailing slash (ab needs a path).

Some useful pages: http://www.petefreitag.com/item/689.cfm and http://www.devside.net/wamp-server/load-testing-apache-with-ab-apache-bench

Another useful test uses curl; see http://servermonitoringhq.com/blog/how_to_quickly_stress_test_a_web_server

Logs

A very good review and read on Apache logs: http://www.the-art-of-web.com/system/logs/

A great command to strip out the log lines between two times:

awk -F'[]]|[[]' \
'$0 ~ /^\[/ && $2 >= "2014-04-07 23:00" { p=1 }
$0 ~ /^\[/ && $2 >= "2014-04-08 02:00:01" { p=0 }
p { print $0 }' log.name
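The same on/off flag technique can be tried on a few sample lines. This is a sketch under assumptions: it reads the timestamp with substr rather than the bracket field separator used above, it assumes each entry starts with a 19-character "[YYYY-MM-DD HH:MM:SS]" stamp, and extract_window is a hypothetical function name. Lines without a timestamp (continuation lines) inherit the state of the entry before them.

```shell
#!/bin/bash
# Print the log lines between two timestamps (start inclusive, end exclusive).
# Reads log lines on stdin; entries are assumed to begin with
# "[YYYY-MM-DD HH:MM:SS]".
extract_window() {
  local from=$1 to=$2
  awk -v from="$from" -v to="$to" '
    /^\[/ {
      ts = substr($0, 2, 19)     # text between "[" and "]"
      if (ts >= from) p = 1      # switch printing on at the start time
      if (ts >= to)   p = 0      # and off again at the end time
    }
    p'
}

# Made-up sample data, for illustration only.
extract_window "2014-04-07 23:00" "2014-04-08 02:00:01" <<'EOF'
[2014-04-07 22:59:10] before the window
[2014-04-07 23:15:42] inside the window
continuation line without a timestamp
[2014-04-08 01:30:00] also inside the window
[2014-04-08 02:00:01] after the window
EOF
```

With the sample above, only the three middle lines are printed.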

If you want to insert your Apache logs into MySQL, look at http://www.hashbangcode.com/blog/apache-log-file-mysql-table and http://www.onlamp.com/pub/a/apache/2005/02/10/database_logs.html?page=2

A really good example of a deep investigation into Apache logs:
https://blog.sucuri.net/2015/11/distributed-vulnerability-search-told-via-access-logs.html

Error codes

Basic HTTP status code classes reported by Apache

1xx - informational
2xx - success
3xx - redirect
4xx - client errors
5xx - server errors
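When scanning logs it can help to turn a status code into its class. A minimal sketch follows; status_class is a hypothetical helper name and the codes in the loop are just examples.

```shell
#!/bin/bash
# Map a three-digit HTTP status code to its class.
status_class() {
  case $1 in
    1[0-9][0-9]) echo "informational" ;;
    2[0-9][0-9]) echo "success" ;;
    3[0-9][0-9]) echo "redirect" ;;
    4[0-9][0-9]) echo "client error" ;;
    5[0-9][0-9]) echo "server error" ;;
    *)           echo "unknown" ;;
  esac
}

# Example codes, for illustration only.
for code in 200 301 404 503; do
  echo "$code $(status_class "$code")"
done
```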