Optimizing Gentoo: Vacuum Portage Configurations


All systems collect dust, but with the configurability of Gentoo, we tend to collect a bit more of it than other distributions do. This will be a quick walkthrough of helpful one-liners that assist with keeping /etc/portage/package.* clean.

In all of the examples presented, replace /etc/portage/package.use with the file you are currently cleaning.

Check for Multiple Occurrences of an Atom

for atom in $(gawk '{ print $1 }' /etc/portage/package.use)
do
  [ "$(grep "^${atom}" /etc/portage/package.use | wc -l)" -gt "1" ] && echo ${atom}
done

Check for N Uses of a USE Flag

for flag in $(gawk '{ print $2 }' /etc/portage/package.use)
do
  [ "$(grep "${flag}" /etc/portage/package.use | wc -l)" -gt "2" ] && echo ${flag}
done

This can be used to find frequently used USE flags that might be better placed in /etc/make.conf.
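To rank such candidates, a frequency count over the flag columns works well. Here's a hedged sketch that builds its own small sample file (hypothetical atoms and flags) so the output shape is visible; point it at /etc/portage/package.use for real use:

```shell
# Create a small sample package.use (hypothetical entries) for demonstration.
cat > /tmp/package.use.sample <<'EOF'
app-editors/vim python ruby
dev-lang/python sqlite
media-video/mplayer X alsa python
www-client/firefox dbus python
EOF

# Print every USE flag column (2..NF), then count occurrences,
# most frequent first; frequent flags are /etc/make.conf candidates.
awk '{ for (i = 2; i <= NF; i++) print $i }' /tmp/package.use.sample \
    | sort | uniq -c | sort -rn
```

On this sample the top line is `3 python`, marking python as the flag most worth promoting.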

Check for Removed Atoms

for atom in $(gawk '{ print $1 }' /etc/portage/package.use)
do
  [ "$(portageq match / ${atom} | wc -l)" -lt "1" ] && echo ${atom}
done

Disable KScreenSaver Using DBUS


KDE 4 has some major improvements over older versions, but it also seems to have gone backwards in places. The new libraries probably contribute to this and are absolutely the way to go. One ability I've been looking for in powerdevil (the new power manager in KDE 4) is having the screensaver disabled when entering presentation mode. This is behavior I expected but, to my dismay and partway through a presentation, found was not available as the screensaver came to life.

After looking around for ways to solve this issue, I finally found some interesting information in the form of the DBUS interface provided by the screensaver in KDE. Using qdbusviewer I was able to find an API for the screensaver that can be invoked at any point and from anywhere (assuming the application is part of the session with the screensaver). Using this new ammunition for more Googling, I found that I could write a daemon in python that would keep the screensaver from displaying while it was turned on.
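The heart of that daemon can be sketched in a few lines of shell. This is a simplification under assumptions (qdbus installed, the screensaver exposed on the org.freedesktop.ScreenSaver bus name); the real stop_kscreensaver.py adds logging and PID-file handling on top of the same idea:

```shell
#!/bin/sh
# Keep poking the screensaver over DBUS so it never reaches its idle timeout.
# INTERVAL should be set a bit shorter than the screensaver's timeout.
INTERVAL=${INTERVAL:-50}

simulate_activity() {
    # The screensaver treats this exactly like keyboard or mouse activity.
    qdbus org.freedesktop.ScreenSaver /ScreenSaver SimulateUserActivity
}

run_daemon() {
    while true
    do
        simulate_activity
        sleep "${INTERVAL}"
    done
}
```

Calling run_daemon (backgrounded with &) holds the screensaver off until the process is killed.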

The result of this work can be found in my subversion repository as stop_kscreensaver.py. This script only has three parameters and is very easy to use. When starting the daemon you simply pass a time between activity simulations (this should be set just shorter than the timeout for the screensaver it is meant to control) and if desired a different logging level. To stop the daemon you simply pass the kill parameter which reads the PID from a standard file and makes sure the daemon dies.

The timing parameter for this script is fairly flexible in that you can pass the time in with various units and the conversion will be taken into account. For example, one could pass a time of 2h32.1m94.34s. Why anyone would is beyond me, but it's there if anyone finds it useful. If no units are passed, the script assumes that the number passed was in seconds. As always, if any bugs are found, please e-mail me, Alex Brandt, with a description of the problem. Patches for issues that you experience are always welcome.

Now the important part. How do we get this to work with powerdevil? That's the easiest part of all with powerdevil's "execute this script when switching to this profile" option. We simply save the script somewhere, make it executable (chmod 755), and then set the path (or browse to it) in the powerdevil configuration interface.

Once that is in place, you can switch to the profile you set the daemon up to start in, and the screensaver, although active, will not start up until you switch profiles again. This lets you watch that movie you wanted to, just like our favorite comic XKCD tells us about.

Teaching: Introductory C++ Programming


It's strange that I'm the only tutor for our Computer Science and Information Systems department, but I can live with that. What irks me, though, is how these students come to gain their knowledge about programming.

To be fair, I will not mention names or place blame. I truly believe this to be a simple matter of miscommunication and easily solved. I also feel that the scope of these classes needs to be monitored a little more closely and what follows is my proposal for what should be taught in an introductory programming course contrasted with my understanding of what is currently being taught in our introductory courses.

The courses in question are our first two introductory C++ courses (101 and 201). Together, these courses should leave the student comfortable with the basic constructs and built-ins of C++. They should also leave the student comfortable with classes and working in an object oriented programming (OOP) paradigm. The problem crops up in the transition between these two courses. It seems (and this I know thanks to my tutees) that their understanding of the basics gets muddled in the first course, and thus they cannot build upon this knowledge in the second course. I have heard many things about what may be happening wrong in this course to cause such a calamity, but let us refocus and start by going over what should be covered in the first class in order to succeed in the following course and program.

In an introductory programming course it should be important for students to become familiar with functions, control structures, and various keywords (not all of them mind you, just most of them). Thus they should be comfortable looking through a piece of code like the following and have an understanding of the majority of it:

#include <iostream>
#include <cstdlib>

using namespace std;

int factorial(int n)
{
    if (n == 0)
        return 1;
    return n * factorial(n - 1);
}

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        cerr << "You must pass a number to get the factorials!" << endl;
        return EXIT_FAILURE;
    }

    int n = atoi(argv[1]) + 1;

    while (--n, n > 0)
        cout << "The factorial of " << n << " is " << factorial(n) << "." << endl;

    return EXIT_SUCCESS;
}

The students should also be able to understand Makefiles (rudimentary Makefiles, not automake or anything of its ilk), header files versus source files, the stages of compiling and how to deal with different errors in each (i.e. a linking error versus a compilation error), and they should understand array mechanics (if not exactly what they are, then how to use them). They should be able to put all of this together and make a nice modular program that allows them to expand and reuse code in an intelligent way. Then, if there is time (although this gets covered in great detail in the second course), a quick coverage of structs should ensue. At this point the education they should have gained from the first class should cover enough of the basics that they understand the constructs of the language and can start reading simple programs like the one above.

In reality, this goal (mind you this is my personal goal and does not reflect the goals of the professors or the department) was not met, and based on experience with people I'm tutoring who are in the second course at this time, it is quite obvious where their shortcomings are. Those shortcomings are outlined below:

  • Lack of understanding as to how looping control structures function
  • Lack of knowledge as to how arrays function and how to use them
  • Lack of knowledge as to how the compiler actually takes the source code (possibly spread over many files) and makes an executable
  • Lack of understanding as to how to solve simple problems like manipulating all of the entries in an array
  • Lack of understanding as to how to do a simple search or utilize functions provided in an advanced programming interface (API)

Note

This list will be updated appropriately as I gain more knowledge on what the students actually know. Thus, this list may have inaccuracies and shortcomings and may not reflect reality.

The list goes on in much the same fashion hereafter, and this is obviously a product of the course not fulfilling its primary purpose of laying down a foundation that the students can build on. To better achieve this goal it may be necessary to recommend some supplemental texts (see the list at the end of this article for recommended books for the various programming courses).

Now, what I've heard reports of in this class is the following:

  1. The concepts outlined above are not getting covered in the detail that is required of them in the subsequent classes (which only confirms what I've observed in students)
  2. There has been a report that a new teaching tool, Alice, was used in the introduction course

The first point appears to just be a shortcoming of the class and could be ameliorated by spending more time going over more examples of the concepts in question.

Also, the students themselves need to realize that they can control their education and must participate for a class to truly be successful for them. When entering a class, students should think to themselves, "What do I want from this class and how can I have this class help me achieve that goal?" By asking questions and looking for more material on the specific area of study they are interested in, students can get a much more fulfilling experience and a better understanding of the topics involved, but they may hit a point where they just can't keep up with the course and need further assistance that isn't self-guided. Tutors are available from the tutoring department and should not be shied away from. If students really want to learn the material, they have to be willing to get help and work toward that goal. That's only part of the problem, though: students who do try are still having a hard time within the status quo.

The second point requires that I do not put forth my personal opinion, but does require that I state its purpose (from the Alice website):

Alice is an innovative 3-D programming environment that makes it easy to
create an animation for telling a story, playing an interactive game, or a
video to share on the web.  Alice is a freely available teaching tool
designed to be a student's first exposure to object-oriented programming.
It allows students to learn fundamental programming concepts in the
context of creating animated movies and simple video games.  In Alice, 3-D
objects (e.g., people, animals, and vehicles) populate a virtual world and
students create a program to animate the objects.

In Alice's interactive interface, students drag and drop graphic tiles to
create a program, where the instructions correspond to standard statements
in a production-oriented programming language, such as Java, C++, and C#.
Alice allows students to immediately see how their animation programs run,
enabling them to easily understand the relationship between the
programming statements and the behavior of objects in their animation.  By
manipulating the objects in their virtual world, students gain experience
with all the programming constructs typically taught in an introductory
programming course.

In conclusion, it is my perception that there is a missing communication link in the way these courses are handled, but it's not just between the professors of the two courses (they seem to have a smooth break between the courses); it's between the students and the professors that the communication has really broken down. The students must speak up for their education or they may see it going down a path that does not maximally further it. This is more widespread than just the simple course example I've given here. Almost everywhere one looks, it seems that students are becoming more lethargic, pushing to just get through the courses. There is a lack of genuine interest in the education being provided, and more of a view that college is now a necessity to continue in society. Fortunately, we can still fight for the freedom of our minds.

Supplemental Texts for CSIS Courses

CSIS 152:
CSIS 252:
CSIS 352:

Using AWStats on Gentoo


Requirements

Note

It's assumed for the purposes of this guide that you already have apache installed and running a website to monitor.

Start by installing awstats:

emerge -av awstats

The USE flags that can be tweaked are the following:

apache2: Add apache2 support (recommended)
geoip: Add geoip support for country and city lookup based on IPs
vhosts: Adds support for installing web-based applications into a virtual hosting environment

Configuration

  1. set up a configuration file for the web site so we can update the statistics:

    cp /etc/awstats/awstats.model.conf /etc/awstats/awstats.FQDN.conf
    

    Where FQDN is the fully qualified domain name of your website you'll be monitoring.

    After you've copied the default configuration, customize it for your particular needs.
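As a sketch of that customization (run against a scratch copy here; SiteDomain and LogFile are the two settings that most often need changing, with the log path matching the CustomLog used in the next step):

```shell
# Illustration against a scratch copy; in practice edit
# /etc/awstats/awstats.FQDN.conf directly.
cat > /tmp/awstats.FQDN.conf <<'EOF'
SiteDomain=""
LogFile="/var/log/apache2/access_log"
EOF

sed -i \
    -e 's|^SiteDomain=.*|SiteDomain="FQDN"|' \
    -e 's|^LogFile=.*|LogFile="/var/www/localhost/log/apache/production.log"|' \
    /tmp/awstats.FQDN.conf
```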

  2. enable awstats in your apache virtualhost configuration:

    CustomLog /var/www/localhost/log/apache/production.log combined
    
    Alias /awstats/classes "/usr/share/webapps/awstats/6.9-r1/htdocs/classes/"
    Alias /awstats/css "/usr/share/webapps/awstats/6.9-r1/htdocs/css/"
    Alias /awstats/icon "/usr/share/webapps/awstats/6.9-r1/htdocs/icon/"
    
    ScriptAlias /awstats/ "/usr/share/webapps/awstats/6.9-r1/hostroot/cgi-bin/"
    ScriptAlias /awstats "/usr/share/webapps/awstats/6.9-r1/hostroot/cgi-bin/awstats.pl"
    ScriptAlias /awstats.pl "/usr/share/webapps/awstats/6.9-r1/hostroot/cgi-bin/awstats.pl"
    
    <Directory "/usr/share/webapps/awstats/6.9-r1/hostroot/cgi-bin/">
        Options ExecCGI
        AllowOverride None

        Order allow,deny
        Allow from all
    </Directory>
    
  3. verify the logging output in /etc/apache2/modules.d/00_mod_log_config.conf:

    # The following directives define some format nicknames for use with
    # a CustomLog directive (see below).
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    LogFormat "%h %l %u %t \"%r\" %>s %b" common
    
    LogFormat "%{Referer}i -> %U" referer
    LogFormat "%{User-Agent}i" agent
    LogFormat "%v %h %l %u %t \"%r\" %>s %b %T" script
    LogFormat "%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" VLOG=%{VLOG}e" vhost
    # You need to enable mod_logio.c to use %I and %O
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
    LogFormat "%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" vhostio
    
    # The location and format of the access logfile (Common Logfile Format).
    # If you do not define any access logfiles within a <VirtualHost>
    # container, they will be logged here.  Contrariwise, if you *do*
    # define per-<VirtualHost> access logfiles, transactions will be
    # logged therein and *not* in this file.
    CustomLog /var/log/apache2/access_log common
    
    # If you would like to have agent and referer logfiles,
    # uncomment the following directives.
    #CustomLog /var/log/apache2/referer_log referer
    #CustomLog /var/log/apache2/agent_logs agent
    
    # If you prefer a logfile with access, agent, and referer information
    # (Combined Logfile Format) you can use the following directive.
    #CustomLog /var/log/apache2/access_log combined
    
  4. add a cron entry to update the statistics on a regular basis:

    # AWStats
    */15 * * * * perl /usr/share/webapps/awstats/6.9-r1/hostroot/cgi-bin/awstats.pl -config=FQDN -update > /dev/null
    

Conclusion

Barring the standard "your mileage may vary" warning, your awstats setup should be complete and functional. It will take a bit of time (~15 minutes) for the statistics to start accumulating.

Optimizing LAMP: Apache and MySQL


Introduction

Faster and faster … optimization always comes up when discussing web applications and serving techniques. We'll be covering optimization of MySQL as well as how to tune the Apache MPM (multi-processing module).

Quick Wins

There are a few applications, requiring little to no configuration, that can be installed or enabled to improve performance in a PHP environment; an opcode cache such as APC (covered later in this collection) is the usual first step.

Optimizing MySQL

Major Hayden (a.k.a. Racker Hacker) wrote an excellent utility for profiling MySQL called mysqltuner.pl. This utility summarizes the common analytics that MySQL captures and presents them with recommendations for optimizations. The recommendations should be treated with caution since they don't account for your specific workload, but the report of information (shown below) can be used as a quick guide for tuning MySQL. Running this script is extremely easy: wget -O - mysqltuner.pl | perl. If you do not have a ~/.my.cnf file with your credentials, the script will prompt you for them so it can collect the information it needs.

>>  MySQLTuner 1.2.0 - Major Hayden <major@mhtx.net>
>>  Bug reports, feature requests, and downloads at
>>  http://mysqltuner.com/
>>  Run with '--help' for additional options and output filtering

-------- General Statistics --------------------------------------------------
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.1.67-log
[OK] Operating on 32-bit architecture with less than 2GB RAM

-------- Storage Engine Statistics -------------------------------------------
[--] Status: +Archive -BDB -Federated +InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 5M (Tables: 25)
[--] Data in InnoDB tables: 23M (Tables: 74)
[!!] Total fragmented tables: 9

-------- Security Recommendations -------------------------------------------
[OK] All database users have passwords assigned

-------- Performance Metrics -------------------------------------------------
[--] Up for: 3045d 21h 30m 31s (14K q [0.000 qps], 24 conn, TX: 45M, RX: 1M)
[--] Reads / Writes: 50% / 50%
[--] Total buffers: 318.0M global + 5.4M per thread (50 max threads)
[OK] Maximum possible memory usage: 589.9M (29% of installed RAM)
[OK] Slow queries: 0% (2/14K)
[OK] Highest usage of available connections: 4% (2/50)
[OK] Key buffer size / total MyISAM indexes: 16.0M/2.9M
[!!] Key buffer hit rate: 73.3% (90 cached / 24 reads)
[OK] Query cache efficiency: 85.0% (8K cached / 10K selects)
[OK] Query cache prunes per day: 0
[OK] Sorts requiring temporary tables: 0% (0 temp sorts / 169 sorts)
[!!] Temporary tables created on disk: 34% (506 on disk / 1K total)
[OK] Thread cache hit rate: 91% (2 created / 24 connections)
[!!] Table cache hit rate: 2% (13 open / 486 opened)
[OK] Open file limit used: 0% (29/4K)
[OK] Table locks acquired immediately: 100% (3K immediate / 3K locks)
[OK] InnoDB data size / buffer pool: 23.4M/32.0M

-------- Recommendations -----------------------------------------------------
General recommendations:
    Run OPTIMIZE TABLE to defragment tables for better performance
    Temporary table size is already large - reduce result set size
    Reduce your SELECT DISTINCT queries without LIMIT clauses
    Increase table_cache gradually to avoid file descriptor limits
Variables to adjust:
    table_cache (> 2048)

Note

The longer MySQL is running the more data you have to work with when doing analysis like this. By only getting information from a one hour uptime you may be missing out on trends over a day or days.

Understanding how these parameters affect memory and runtime is going to be the best way to optimize MySQL for your particular workload. Blindly following the recommendations provided can make your MySQL instance run worse rather than better.

With that in mind, the next section is a small glossary of parameters that have the most noticeable impact on MySQL's performance. As always, your mileage may vary, and the MySQL manual is an excellent resource as well.

Glossary of mysqltuner Categories and Controlling Parameters

Maximum possible memory:
  Not actually a hard limit on the memory usage but the calculated maximum memory usage based on per thread allocations and number of threads allowed; a good check to ensure you're not overallocating memory with other MySQL parameters
Highest usage of available connections:
  The maximum number of concurrent connections seen since MySQL was started; may indicate a spike during the runtime of MySQL and not a sustained connection set
Query cache efficiency:
  Query cache hit rate; ideally all queries would come from the query cache but some workloads want this disabled completely (i.e. extremely dynamic queries where there are very few identical queries happening back to back)
Temporary tables created on disk:
  Indicates the number of temporary tables (from joins and other temporary table creating queries) that were created on the disk rather than in memory; this is the first place to trim if memory usage is too high but also a good place to allocate those gobs of extra memory on the system
Thread cache hit rate:
  Number of threads (connections) that were re-used rather than torn down and re-created
Table cache hit rate:
  Number of table file descriptors that were re-used rather than re-opened.
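As a worked example of the first entry, the "maximum possible memory" figure is just arithmetic over numbers from the report above (318.0M global buffers, 5.4M per thread, 50 max threads; the small gap from the reported 589.9M comes from rounding in the displayed values):

```shell
# global buffers + (per-thread buffers * max connections)
awk -v global=318.0 -v per_thread=5.4 -v max_conn=50 \
    'BEGIN { printf "maximum possible memory: %.1fM\n", global + per_thread * max_conn }'
# prints "maximum possible memory: 588.0M"
```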

Optimizing Apache

Common Apache tunables are in httpd.conf unless your distribution organizes its Apache configuration into multiple, easier to read files. Gentoo stores the tunables we'll be covering in /etc/apache2/modules.d/00_mpm.conf.

Apache allows you to change the multi-processing strategy through modules. The common MPMs are prefork (the default), worker, peruser, and event. Determining which MPM you are currently using is done by issuing /usr/sbin/apache2 -l. Most binary distributions don't even offer the last two as options for their builds of Apache.

# Server-Pool Management (MPM specific)

# PidFile: The file in which the server should record its process
# identification number when it starts.
#
# DO NOT CHANGE UNLESS YOU KNOW WHAT YOU ARE DOING
PidFile /var/run/apache2.pid

# The accept serialization lock file MUST BE STORED ON A LOCAL DISK.
#LockFile /var/run/apache2.lock

# Only one of the below sections will be relevant on your
# installed httpd.  Use "/usr/sbin/apache2 -l" to find out the
# active mpm.

# common MPM configuration
# These configuration directives apply to all MPMs
#
# StartServers: Number of child server processes created at startup
# MaxClients: Maximum number of child processes to serve requests
# MaxRequestsPerChild: Limit on the number of requests that an individual child
#                      server will handle during its life


# prefork MPM
# This is the default MPM if USE=-threads
#
# MinSpareServers: Minimum number of idle child server processes
# MaxSpareServers: Maximum number of idle child server processes
<IfModule mpm_prefork_module>
  StartServers            5
  MinSpareServers         5
  MaxSpareServers         10
  MaxClients              150
  MaxRequestsPerChild     10000
</IfModule>

# worker MPM
# This is the default MPM if USE=threads
#
# MinSpareThreads: Minimum number of idle threads available to handle request spikes
# MaxSpareThreads: Maximum number of idle threads
# ThreadsPerChild: Number of threads created by each child process
<IfModule mpm_worker_module>
  StartServers            3
  MinSpareThreads         10
  MaxSpareThreads         20
  ThreadsPerChild         10
  MaxClients              150
  MaxRequestsPerChild     5000
</IfModule>

# event MPM
#
# MinSpareThreads: Minimum number of idle threads available to handle request spikes
# MaxSpareThreads: Maximum number of idle threads
# ThreadsPerChild: Number of threads created by each child process
<IfModule mpm_event_module>
  StartServers        2
  MinSpareThreads     25
  MaxSpareThreads     75
  ThreadsPerChild     25
  MaxClients          150
  MaxRequestsPerChild 10000
</IfModule>

# peruser MPM
#
# MinSpareProcessors: Minimum number of idle child server processes
# MinProcessors: Minimum number of processors per virtual host
# MaxProcessors: Maximum number of processors per virtual host
# ExpireTimeout: Maximum idle time before a child is killed, 0 to disable
# Multiplexer: Specify a Multiplexer child configuration.
# Processor: Specify a user and group for a specific child process
<IfModule mpm_peruser_module>
  MinSpareProcessors  4
  MinProcessors       2
  MaxProcessors       80
  MaxClients          256
  MaxRequestsPerChild 4000
  ExpireTimeout       0

  #Multiplexer nobody nobody
  User        nobody
  Group       nobody
  Processor   apache apache
</IfModule>

# vim: ts=4 filetype=apache

Glossary of MPM Parameters

The important parameters to tweak when playing with Apache memory and performance are the following:

StartServers: Number of servers to start running and handling connections when Apache is started
MinSpareServers:
  Minimum number of servers to have running and not handling connections
MaxSpareServers:
  Maximum number of servers to have running and not handling connections
MaxClients: Maximum number of clients that can simultaneously connect to Apache
MaxRequestsPerChild:
  Maximum number of requests that a child will respond to before terminating
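A common back-of-the-envelope for MaxClients under prefork is the memory you can spare for Apache divided by the average resident size of one child; the numbers below are assumptions for illustration (measure your own child size with ps or top):

```shell
available_mb=1024   # RAM you're willing to dedicate to Apache (assumed)
child_mb=32         # average resident size of one child (measure on your system)

awk -v a="${available_mb}" -v c="${child_mb}" \
    'BEGIN { printf "suggested MaxClients: %d\n", a / c }'
# prints "suggested MaxClients: 32"
```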

Conclusion

Optimizing Apache and MySQL can be done in a multitude of ways with an even larger number of tunable parameters. As always, after making changes test to verify that they do improve performance for your workload. This should provide a start when optimizing a LAMP setup.

Using rdiff-backup: Backup Remote Clients With Ease


Introduction

Backups are awesome! Unless you don't have any. It's also hard to find space for them, and setting them up isn't always fun. Without backups, the day will come when we lose data, need to get it back, and can't. Whether it's an accidental deletion of the project you've been working hard on, a disk issue resulting in partial or complete data loss, or something completely different, data loss is only a matter of when, not if.

The Problem

Backing up clients behind NAT and other network obfuscation techniques adds another set of challenges to the equation. These can be solved by initiating them client side and having management set up around the idea of a data dump.

The Solution

rdiff-backup allows us to initiate backups from the client, use SSH as the communication protocol, keep incremental backups, &c.

Automating backups with rdiff-backup isn't overly challenging, but it isn't clearly outlined anywhere (with the required nuances). The magic lies in a little-known option, --remote-schema.

For example, the following BASH snippet is the cron entry (could be moved to a proper script) that backs up my laptop to a remote site (split across lines for readability):

/usr/bin/rdiff-backup \
--remote-schema 'ssh -i /home/alunduil/.ssh/backup_dsa %s rdiff-backup --server' \
--exclude-other-filesystems \
--print-statistics \
/home/alunduil \
daneel.alunduil.com::elijah-backup && \
/usr/bin/rdiff-backup \
--remote-schema 'ssh -i /home/alunduil/.ssh/backup_dsa %s rdiff-backup --server' \
--remove-older-than 7D \
--force \
daneel.alunduil.com::elijah-backup

Breaking this down, we have a few things that require explanation:

/usr/bin/rdiff-backup:
  The rdiff-backup script
--remote-schema:
  The magic! This specifies the way that SSH is called by rdiff-backup allowing particular control over the SSH tunnel used to communicate with the server
--exclude-other-filesystems:
  Don't cross filesystem boundaries when finding files to backup
--print-statistics:
  Print a nice report of the files uploaded, &c when finished
/home/alunduil: The local directory to backup
daneel.alunduil.com::elijah-backup:
  The remote host and directory to backup into
--remove-older-than:
  Remove any backups older than the time passed
--force: Force the action even if warnings might occur
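Restores use the same plumbing in reverse via rdiff-backup's --restore-as-of (-r) option. Here's a hedged sketch, built as a string and printed rather than executed since it needs the remote host; the file path is hypothetical:

```shell
# Pull a path back as it existed three days ago; mirrors the backup job above.
restore_cmd="/usr/bin/rdiff-backup \
--remote-schema 'ssh -i /home/alunduil/.ssh/backup_dsa %s rdiff-backup --server' \
--restore-as-of 3D \
daneel.alunduil.com::elijah-backup/some/file \
/home/alunduil/some/file"

echo "${restore_cmd}"
```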

Conclusion

Creating backup strategies for remote clients with rdiff-backup is quite easy. This solution is ideal for mobile and spottily connected clients like laptop machines that might not be able to phone home for a variety of reasons.

Using APC to Speed Up PHP


Introduction

Making PHP run faster is easily accomplished by updating the code or algorithm in use but what if we don't want to fix the code or even look at it? What other options do we have for speeding up our PHP applications?

We've already discussed optimizing LAMP by optimizing MySQL and Apache in Optimizing LAMP: Apache and MySQL.

Install APC

APC is a PHP opcode caching system. This means that when a PHP script is executed and converted to bytecode, the intermediary bytecode is cached in APC so that it can simply be looked up the next time the script runs rather than being compiled again.

APC not only provides opcode caching but can also be used as an object cache (similar to Memcached). This object caching requires modifications to the application that wants to utilize it whereas the opcode caching is automatically enabled for all PHP executed while the module is loaded.

Note

To use the object cache with WordPress, one should use the W3 Total Cache plugin.

To install APC simply install it with your package manager (the package is probably named pecl-apc) or with PECL.

Configuring APC

APC's configuration is in the normal location, /etc/php/apache2-php5/ext-active/apc.ini and usually looks like the following:

extension=apc.so
apc.enabled="1"
apc.shm_segments="4"
apc.shm_size="128"
apc.num_files_hint="1024"
apc.ttl="7200"
apc.user_ttl="7200"
apc.gc_ttl="3600"
apc.cache_by_default="1"
;apc.filters=""
;apc.mmap_file_mask="/tmp/apcphp5.XXXXXX"
apc.slam_defense="0"
apc.file_update_protection="2"
apc.enable_cli="0"
apc.max_file_size="1M"
apc.stat="1"
apc.write_lock="1"
apc.report_autofilter="0"
apc.include_once_override="0"
apc.rfc1867="0"
apc.rfc1867_prefix="upload_"
apc.rfc1867_name="APC_UPLOAD_PROGRESS"
apc.rfc1867_freq="0"
apc.localcache="0"
apc.localcache.size="512"
apc.coredump_unmap="0"

Note

The options for APC are documented in the APC Manual.

Most of these options can be left with their default values and provide the desired effect.

If you have plenty of available memory on the system utilizing APC, you can tweak the following to improve your experience:

apc.shm_segments:
  Number of chunks to use in /dev/shm
apc.shm_size: Size of the apc.shm_segments

These two parameters dictate the total memory usage for APC. In fact, if you simply multiply these values together you'll get the maximum amount of memory used by APC for the cache.
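With the sample apc.ini above, that multiplication works out to a half-gigabyte ceiling:

```shell
# apc.shm_segments (4) * apc.shm_size (128M) = total APC cache memory
echo "$((4 * 128))M"   # prints "512M"
```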

Caveats

There is a limitation in the kernel on the size of shm files that limits the apc.shm_size. Run the following command to determine this limit for your system:

cat /proc/sys/kernel/shmmax

Note

/proc/sys/kernel/shmmax prints out the limit in bytes, but APC expects its sizes in megabytes.
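A quick conversion sketch (the shmmax value is a sample; substitute what your system reports):

```shell
shmmax_bytes=536870912   # e.g. the output of: cat /proc/sys/kernel/shmmax
echo "$((shmmax_bytes / 1024 / 1024))"   # prints "512": apc.shm_size could go up to 512M here
```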

Conclusion

APC can help you get a bit more performance out of your PHP applications without modifying a line of application code or a lot of configuration. Play with the memory settings to determine what's right for your workload and start serving PHP just a hair faster.

Using Holland to Backup PostgreSQL


Introduction

Backups are a subject I return to semi-frequently with a passion to avoid "oh shit" scenarios. Last time I built my backup system, Bacula with a PostgreSQL database backend, I determined that I would move to a common database backup script for all of my databases. Holland fits the bill perfectly with support for PostgreSQL, SQLite, and MySQL. This allows one command to back up all of my databases on all of my servers and subsequently creates a much simpler bacula configuration (the database job is defined the same as the catalog job).

Configuring Holland for PostgreSQL

The problem I ran into with Holland backing up my PostgreSQL databases is the lack of an example configuration file. It wasn't hard to craft a working default PostgreSQL configuration, and the following is what I came up with (/etc/holland/backupsets/default.conf):

[holland:backup]

plugin = pgdump
backups-to-keep = 1
auto-purge-failures = yes
purge-policy = after-backup
estimated-size-factor = 1.0

[pgdump]

role = postgres

[pgauth]

username = postgres

Conclusion

Setting up Holland to back up databases is incredibly easy and flexible. By having a common backup solution for all databases, other configurations become easier and processes can be streamlined.

Optimizing Gentoo: CFLAGS


Introduction

Determining what CFLAGS to use on a Gentoo system can be slightly awkward if you're not sure about the processor you're working on and what instruction set architectures (ISAs) it supports.

We also don't want to make any breaking changes to our build system and will limit ourselves for this discussion to enabling the ISAs for our processor.

Note

It's assumed throughout this quick guide that you've read and are familiar with the existing Gentoo Optimization Guide.

Research

As we mentioned, we need to know our system and processor before we can start enabling CFLAGS.

Processor flags can be pulled from the CPU information in the proc filesystem:

grep flags /proc/cpuinfo | uniq
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy 3dnowprefetch

Knowing the processor flags is only half of the battle, we also need to know which ISAs gcc will enable automatically with -march=native. We can use the following to see what -m parameters are not enabled:

gcc -Q -c -v -march=native --help=target | grep disabled

If any flag (prefixed with -m in the gcc output) appears in both of these outputs, it can probably be safely added to your CFLAGS line in /etc/make.conf. For my CPU the result is: -msse3 -m3dnow; thus, my CFLAGS are: -march=native -O2 -pipe -msse3 -m3dnow.
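The comparison of the two lists can be automated. A hedged sketch follows (bash is assumed for the process substitution; note that cpuinfo names don't always match gcc's -m names one-to-one, e.g. pni in cpuinfo is SSE3, so treat the output as a starting point rather than gospel):

```shell
#!/bin/bash
# Intersect CPU flags from /proc/cpuinfo with the -m options gcc leaves
# disabled under -march=native; matches are CFLAGS candidates.
comm -12 \
    <(grep -m 1 '^flags' /proc/cpuinfo | tr ' ' '\n' | sort -u) \
    <(gcc -Q -c -v -march=native --help=target 2>/dev/null \
        | grep '\[disabled\]' \
        | awk '{ sub(/^-m/, "", $1); print $1 }' \
        | sort -u)
```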

Conclusion

Adding particular ISA support in your CFLAGS does ensure that you're using everything your processor has to offer and augments the aforementioned Gentoo Optimization Guide.

How To: Setup a Symfony Development Environment


Introduction

Symfony is an MVC (Model-View-Controller) framework written in PHP that takes the toil out of application writing.

By utilizing a framework and a lifecycle like Agile, elegant code is easier to produce and maintain. With proper design patterns already in use, one simply fills in the appropriate logic to create an application.

These benefits are wonderful, but we need an environment in which we can write and test this application. The easiest location is a local workspace, but how do we create a Symfony development environment?

The Players

The following packages should be installed before setting up the environment (required USE flags are in parentheses):

  • apache
  • mysql
  • php (cli ctype reflection spl simplexml xml pcre session apache2 mysql pdo xsl)
  • symfony

The Music

Now that all of the players are installed, how do we turn a directory full of various projects into a browseable web location such as http://localhost/PROJECT? The easy way is to set up symlinks to the real DocumentRoots in a dummy directory for apache's sake:

rm -rf /var/www/localhost/htdocs
ln -snf ${PROJECT_DIR} /var/www/localhost/htdocs

Once we've set the stage, we can change the virtual host declaration for the default virtual host in apache. All we need is a few extra lines:

AliasMatch ^/((?!cgi-bin|icons).+)/sf/(.*) /usr/share/php5/data/symfony/web/sf/$2
AliasMatch ^/((?!sf|cgi-bin|icons).+?)/(.*) /var/www/localhost/htdocs/$1/web/$2

<Directory /var/www/localhost/htdocs>
    Options Indexes FollowSymLinks

    Order allow,deny
    Allow from all
</Directory>

The last thing is to ensure that your symfony project's .htaccess file redirects / to /index.php.

Conclusion

That should get a working development environment for multiple symfony projects up and running, allowing easy access to each project as a subweb off of localhost.