Kernel Paging Bug at mm/filemap.c:128


In the last couple of months, I've been plagued by a problem with the system that hosts this website: a seemingly random stretch of uptime, then a kernel panic and a crash. This is a bit difficult to swallow when the server is nearly 1500 miles away, but a reboot and away it goes again.

The actual error is shown here:


    Feb 01 14:13:27 [kernel] kernel BUG at mm/filemap.c:128!
    Feb 01 14:13:27 [kernel] Modules linked in:
    Feb 01 14:13:27 [kernel] Pid: 13604, comm: apache2 Not tainted 2.6.36-hardened-r6 #1 0K8980/Dimension 3000
    Feb 01 14:13:27 [kernel] EIP: 0060:[<c105af4f>] EFLAGS: 00010046 CPU: 0
    Feb 01 14:13:27 [kernel] EIP is at __remove_from_page_cache+0x44/0x91
    Feb 01 14:13:27 [kernel] EAX: 00000000 EBX: c16acf40 ECX: 00000015 EDX: 00000015
    Feb 01 14:13:27 [kernel] ESI: de877d7c EDI: ffffffff EBP: 00000015 ESP: dc4abe28
    Feb 01 14:13:27 [kernel]  DS: 0068 ES: 0068 FS: 00d8 GS: 0033 SS: 0068
    Feb 01 14:13:27 [kernel]  de877d7c c16acf40 c105afbc c16acf40 de877d7c c1061079 c16acf40 00000000
    Feb 01 14:13:27 [kernel] <0> c1061123 de877d7c 00000015 00000006 ffffffff 00000000 0000000e 00000000
    Feb 01 14:13:27 [kernel] <0> c1641f40 c1641f60 c1641f80 c1641fa0 c173ecc0 c1695de0 c16acf40 c15d2f00
    Feb 01 14:13:27 [kernel]  [<c105afbc>] ? remove_from_page_cache+0x20/0x27
    Feb 01 14:13:27 [kernel]  [<c1061079>] ? truncate_inode_page+0x6c/0x7d
    Feb 01 14:13:27 [kernel]  [<c1061123>] ? truncate_inode_pages_range+0x99/0x23a
    Feb 01 14:13:27 [kernel]  [<c10612cd>] ? truncate_inode_pages+0x9/0xc
    Feb 01 14:13:27 [kernel]  [<c10fa452>] ? ext4_evict_inode+0x83/0x265
    Feb 01 14:13:27 [kernel]  [<c108dab2>] ? evict+0x17/0x7b
    Feb 01 14:13:27 [kernel]  [<c108e211>] ? iput+0x182/0x1df
    Feb 01 14:13:27 [kernel]  [<c108b58e>] ? d_kill+0x2a/0x43
    Feb 01 14:13:27 [kernel]  [<c108c292>] ? dput+0xf3/0xfb
    Feb 01 14:13:27 [kernel]  [<c107e289>] ? fput+0x191/0x1b3
    Feb 01 14:13:27 [kernel]  [<c106e14f>] ? remove_vma+0x34/0x52
    Feb 01 14:13:27 [kernel]  [<c106f26a>] ? __do_munmap+0x257/0x2a8
    Feb 01 14:13:27 [kernel]  [<c106f335>] ? sys_munmap+0x49/0x60
    Feb 01 14:13:27 [kernel]  [<c1378005>] ? syscall_call+0x7/0xb
    Feb 01 14:13:27 [kernel] ---[ end trace 4598df0f375c22c4 ]---

The uprecords output for this machine:

     #               Uptime | System                                     Boot up
----------------------------+---------------------------------------------------
     1    24 days, 20:31:52 | Linux 2.6.32-hardened-r2  Thu Nov 25 20:58:34 2010
     2    18 days, 09:10:28 | Linux 2.6.36-hardened-r6  Sat Jan  8 12:31:38 2011
     3    15 days, 12:41:25 | Linux 2.6.32-hardened-r9  Fri Oct  8 09:48:07 2010
     4    14 days, 03:43:24 | Linux 2.6.32-hardened-r2  Wed Nov  3 11:29:55 2010
     5     9 days, 19:44:59 | Linux 2.6.32-hardened-r2  Sun Oct 24 15:14:10 2010
     6     7 days, 21:49:22 | Linux 2.6.32-hardened-r2  Mon Dec 20 17:31:43 2010
     7     7 days, 17:34:24 | Linux 2.6.36-hardened-r6  Tue Dec 28 15:58:45 2010
     8     7 days, 08:53:26 | Linux 2.6.32-hardened-r2  Thu Nov 18 11:55:57 2010
     9     5 days, 16:58:46 | Linux 2.6.36-hardened-r6  Wed Jan 26 21:43:51 2011
    10     3 days, 00:09:43 | Linux 2.6.36-hardened-r6  Wed Jan  5 12:17:33 2011
----------------------------+---------------------------------------------------
->  13     0 days, 00:14:46 | Linux 2.6.36-hardened-r6  Tue Feb  1 15:02:26 2011
----------------------------+---------------------------------------------------
1up in     0 days, 00:40:31 | at                        Tue Feb  1 15:57:42 2011
t10 in     2 days, 23:54:58 | at                        Fri Feb  4 15:12:09 2011
no1 in    24 days, 20:17:07 | at                        Sat Feb 26 11:34:18 2011
    up   115 days, 09:00:18 | since                     Fri Oct  8 09:48:07 2010
  down     0 days, 21:28:47 | since                     Fri Oct  8 09:48:07 2010
   %up               99.230 | since                     Fri Oct  8 09:48:07 2010

I suspect this is a hardware problem, but if anyone has other ideas as to what might cause this type of failure, please let me know.

Best Practices: Bind Mounts and chroots


Introduction

A while back, I was asked to configure a chroot jail using bind mounts instead of copying over the involved binaries. The incident caused a really interesting system failure, served as a wake-up call for me, and spawned an ambitious project to create an autochroot command.

The Stage

Setting up a chroot jail requires that all binaries, and everything they reference, live inside the jail location. This makes for a surprisingly involved configuration even for something as simple as a chroot jail for bash itself: we need to create a clean directory structure that duplicates the relevant parts of our file system hierarchy, including bash and all of its dependencies. The dependencies of bash can be found with ldd:

ldd /bin/bash
        linux-vdso.so.1 (0x00007ffffbdd9000)
        libreadline.so.6 => /lib64/libreadline.so.6 (0x00007fdf9fa02000)
        libncurses.so.5 => /lib64/libncurses.so.5 (0x00007fdf9f7ae000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fdf9f5aa000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fdf9f202000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fdf9fc49000)

Once all of the dependencies (and their dependencies) are in place, the supporting pieces needed to do real work in this minimal shell environment also have to be installed. These may include device nodes or system environment files (e.g. /etc/passwd, /etc/shadow), which makes some utilities a pain to install in a chroot by hand (e.g. /usr/bin/ssh, /usr/bin/scp, &c).
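
To give a flavour of the manual approach, here's a minimal sketch that copies bash and the libraries ldd reports into a jail (assuming /chroot as the jail root; the ldd parsing is naive and may need adjusting for your system):

mkdir -p /chroot/bin
cp /bin/bash /chroot/bin/

# Copy each shared object ldd reports, preserving its path inside the jail.
for lib in $(ldd /bin/bash | grep -o '/[^ ]*'); do
    mkdir -p "/chroot$(dirname "$lib")"
    cp "$lib" "/chroot$lib"
done

chroot /chroot /bin/bash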

Creating a chroot

So, it's hard; that's not a big deal. There are ways to make creating chroots easier; like bind mounts </sarcasm>:

mount -o bind

This handy feature allows one to take any directory on the system and mount it in another location (like a mountpoint alias). Bind mounts become extremely useful when setting up a chroot environment for system recovery via a live environment, but they can be a ticking time bomb if used incorrectly.

Setting up a chroot with bind mounts is incredibly easy. We simply mount all of the required sections from the external filesystem inside the chroot location:

mount -o bind /dev /chroot/dev
mount -o bind /lib /chroot/lib
mount -o bind /usr/lib /chroot/usr/lib

That was extremely simple; far simpler than finding the dependencies and copying them into the chroot environment by hand.

The Problem

Our chroot environment has been set up and running for a while with the bind mounts, and everything has kept working thus far. What happens when we no longer need this chroot environment and the person removing it doesn't realise how it was set up? Most likely they'll simply rm -rf /chroot and leave it at that. Since we've used bind mounts, and a recursive rm works from the leaves up, we're in for a pleasant surprise when things in /dev, /lib, and /usr/lib start to go missing.
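
If you're the one doing the teardown, the binds need to come off before anything is deleted; a quick sketch (assuming the mounts from above):

# Check for anything still mounted beneath the jail, then unmount it all.
mount | grep /chroot

umount /chroot/dev /chroot/lib /chroot/usr/lib
rm -rf /chroot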

The Lesson

Don't use bind mounts as a quick solution for setting up chroot environments. Even when everything seems to go smoothly, routine actions can have consequences that were never intended.

One potential workaround that comes to mind is setting the bind mounts to read only with the ro mount option. This might not behave quite as expected; in fact the following is what happens when attempting this:

mkdir -p /mnt/bind
mount -o bind,ro /proc /mnt/bind
mount: warning: /mnt/bind seems to be mounted read-write.
mount
/proc on /mnt/bind type none (rw,bind)

Mounting a bind mount does not respect the read-only mount option we passed (at least in this particular case).
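
For completeness: if I recall correctly, kernels from roughly 2.6.26 onward do support read-only bind mounts, but only as a second remount step rather than in the initial mount call. A sketch (untested on older kernels):

mount -o bind /proc /mnt/bind
mount -o remount,ro,bind /mnt/bind

Even then, this only blunts the rm -rf problem described above; it doesn't eliminate it.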

Conclusion

In general, bind mounts should not be used for chroot environments that are intended to be persistent. It is more work to determine the dependencies and copy the chroot components by hand, but the result is far more resilient to the kinds of issues that crop up.

One solution that I've not seen, but would like to, is a script that programmatically resolves the dependencies and copies the items to be placed in a chroot environment. I've started a GitHub project, autochroot, but no work has been done on it yet.

Using Special Keys in Zsh


I recently made the plunge and began using Zsh in lieu of Bash. I've not regretted the decision in the slightest, but there have been minor annoyances to deal with. The simplest annoyance was the special keys (i.e. delete, home, &c). The solution is quite simple and elegant but not completely obvious.

The bindings are usually read from the inputrc file (located in /etc/) by Bash, but Zsh does not do this by default. There are probably more elegant solutions to this problem, but a quick brute-force solution is to generate a bindkeys file from inputrc:

gawk '$1 ~ /.*:/ { print "bindkey",$1,$2 }' /etc/inputrc | sed -e 's/://g' > ~/.zshrc-bindkeys

Once this file has been crafted, it's simply a matter of sourcing it from your .zshrc with source ~/.zshrc-bindkeys.
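
For reference, the generated file ends up containing lines along these lines (the exact escape sequences depend on your terminal and inputrc; these are typical examples, not a canonical list):

bindkey "\e[1~" beginning-of-line
bindkey "\e[3~" delete-char
bindkey "\e[4~" end-of-line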

Optimizing Gentoo: Mailing Portage Messages


Introduction

Portage is an amazingly simple and complex piece of technology. The simplicity of each piece doing one specific job comes together into a complex package management system that rivals all others. Automating updates is something admins everywhere do out of necessity; in fact, automating everything is what an admin does. Automating portage's updates is a bit more harrowing than with other package management systems, but it isn't impossible.

Problem

Automating package updates with portage is simple: add a cron job that calls the appropriate commands. Being notified of available updates, or of the helpful messages that accompany them, requires a little tweaking of the portage configuration itself.
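
The cron half might look something like the following sketch (illustrative only; the exact emerge options are a matter of taste and risk tolerance):

#!/bin/sh
# /etc/cron.daily/portage-update -- illustrative example
emerge --sync --quiet && emerge --update --deep --newuse world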

Solution

Turning up the verbosity of portage in the cron job doesn't quite do what we want. Sure, it adds the messages to the e-mail cron generates, but it also adds plenty more for us to wade through.

This also doesn't help when utilities like Puppet update packages.

Configuring portage to e-mail only the messages is quite simple and solves this issue much more satisfactorily.

The make.conf man page lets us know about the following parameters that affect how output is logged:

  • PORTAGE_ELOG_CLASSES
  • PORTAGE_ELOG_SYSTEM
  • PORTAGE_ELOG_COMMAND
  • PORTAGE_ELOG_MAILURI
  • PORTAGE_ELOG_MAILFROM
  • PORTAGE_ELOG_MAILSUBJECT

These parameters allow us to log various portage output to a number of destinations. If we simply want the messages mailed (not the full build output), we add the following directives to make.conf:

PORTAGE_ELOG_SYSTEM="save mail"
PORTAGE_ELOG_MAILFROM="portage@alunduil.com"

This keeps the default configuration, save, but adds the mail facility.
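
A slightly fuller sketch, also selecting which message classes get mailed and where (the values here are illustrative, not recommendations):

PORTAGE_ELOG_CLASSES="warn error log"
PORTAGE_ELOG_SYSTEM="save mail"
PORTAGE_ELOG_MAILURI="root@localhost"
PORTAGE_ELOG_MAILFROM="portage@alunduil.com"
PORTAGE_ELOG_MAILSUBJECT="[portage] \${PACKAGE} on \${HOST}"

With this in place, each emerge that produces elog output sends a per-package message instead of burying it in the cron mail.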

Conclusion

Automating maintenance tasks and making administration more event-driven frees the administrator's time for other, more interesting areas of the infrastructure, where it can keep improving quality.

Python Development: Dynamically Loaded Modules or Plugins


Introduction

Sometimes dynamically loaded modules (plugins or extensions) are a convenient way to provide extensible functionality in your applications. For example, suppose you need a command that exposes known data sources to subcommands, but you want the subcommands to be easy to write and add even after the application has been finalized. We could do this with a simple modular design, but it seems more natural to let the subcommands be defined elsewhere, behind a standard interface, allowing extensible behavior beyond the initial application development cycle.

Note

This discussion does not cover eggs and their entry points, but entry points would be a potential solution to this situation as well.

The Problem

How do we find, load, and then run code that we didn't necessarily write? The first step is fairly obvious: we ask (via a parameter, configuration option, &c) where the code to be loaded is located. Once we have the location, the other steps are much easier. In more detail, we need a location that contains code following our plugin API. We can then use the following code (where d is the directory with our plugins):

import itertools, os, sys

sys.path.append(d)  # make the plugin directory importable by module name

# Gather every file beneath the plugin directory.
files = itertools.chain(*[ [ os.path.join(x[0], fs) for fs in x[2] ] for x in os.walk(d) ])

# Keep only the python files, reduced to importable module names.
plugins = [ f.split('/')[-1].split('.')[0] for f in files if f.endswith('.py') ]
modules = [ __import__(p, globals(), locals(), [], -1) for p in plugins ]

for p, m in zip(plugins, modules):
    # Find the class whose name matches the file name, case-insensitively.
    matches = [ x for x in m.__dict__.keys() if x.lower() == p.lower() ]
    if len(matches) == 1: # and issubclass(m.__dict__[matches[0]], PluginBase):
        self._commands.append(m.__dict__[matches[0]]())

Break Down

  1. We add our directory to the python module path so we can load the plugins by name
  2. Then we get a list of the files in this directory
  3. Then we filter this down to the names of the python files, which tell us the class we need to instantiate
  4. We then import the modules as module objects we can manipulate
  5. We then loop through the correlated lists of plugin names and module objects
  6. We look for an object in the module dictionary whose name matches the file name (case-insensitively)
  7. If we find a match, we append an instantiated object of the class we found

Quite a bit is going on in this short snippet, but the important thing is that it takes a directory path and produces a list of instantiated plugin objects we can use just like any other object. Once we have the objects, it's simply a matter of calling methods on them: self._commands[n].method().
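
To make that concrete, here's what a hypothetical plugin file (say status.py, dropped into the plugin directory) could look like; the class name matching the file name is the only contract the loader above relies on:

# status.py -- a hypothetical plugin; the loader finds the class because
# its name matches the file name (case-insensitively).
class Status(object):
    def method(self):
        print "everything looks fine"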

Conclusion

Getting a modular design can be daunting, and making that modular design as dynamic as possible can be even more so, but modern languages (this technique, though not the syntax, works with ruby as well) make the process much easier than the compiled languages do. (More to come on that later, I hope.)

Python Development: Writing Small Interpreters


Introduction

Writing parsers, interpreters, or compilers isn't a common topic, but it's an extremely useful design to be familiar with. For example, sometimes we want to do more than parse options for input, or we might want a custom configuration file format. Another example is using another language to do custom processing. What if we have a strict data store but want a little flexibility in how we access it? We could write an elaborate model library that provides nice, easy access to the data store, or we could write a structured query language (SQL) parser that does the translation for us and is far easier to adapt to other needs.

Note

This article assumes some prior knowledge of context-free grammars.

Creating a Grammar

Creating a parser is far easier once the grammar has been created and thought through. This is true not only because it's proper planning, but also because most parser generators (including the one we're looking at) take a grammar as their input.

We'll be looking at the expression grammar from Tiny BASIC as an example. This is a simple grammar, safe for LR or LL parsing (which is important to note when we talk about other languages like SQL).

Tiny BASIC Expression Grammar

expression ::= (+|-|ε) term ((+|-) term)*
term ::= factor ((*|/) factor)*
factor ::= number | (expression)

The parser generator we'll utilize is pyparsing, an easy-to-use LL parser generator. The following pyparsing code implements the above grammar:

from pyparsing import Forward, Group, Literal, Optional, Suppress, Word, ZeroOrMore, nums

expr = Forward()
factor = Word(nums) | Group(Suppress('(') + expr + Suppress(')'))
term = Group(factor + ZeroOrMore((Literal('*') | Literal('/')) + factor))
expr << Group(Optional(Literal('-') | Literal('+')) + term + ZeroOrMore((Literal('-') | Literal('+')) + term))

This pyparsing snippet turns sentences like the following:


    5 + 5 * 6 / 3 - (47 + 56) * 34

Into easily consumable lists like the following:

[[['5'], '+', ['5', '*', '6', '/', '3'], '-', [[[['47'], '+', ['56']]], '*', '34']]]

We could improve this parser by having it auto-reduce expressions and by adding other expression-handling code, but it suffices for demonstration purposes.

We still need to invoke the parser in order to get the parsed set of tokens. The following snippet shows the relevant python line:

expr.parseString('5+5*6/3-(47+56)*34')

Parser Testing

Proper testing includes unit tests and other strategies, but sometimes the easiest way to see what's going on with a parser is to write a small interpreter that loops over expressions so you can watch the effects. Using python and readline we can create a convenient environment for live interaction of this sort:

import rlcompleter
import readline
import os

import pycolorize  # helper module for coloured terminal output

# Make sure a history file exists before readline tries to read it.
if not os.access(".history", os.F_OK):
    open(".history", "w").close()

readline.read_history_file(".history")

buffer = ""

while True:
    try:
        line = raw_input(pycolorize.light_blue("BASIC$ "))
    except EOFError:
        readline.write_history_file(".history")
        print
        break

    if line.lower() == "exit" or line.lower() == "quit":
        readline.write_history_file(".history")
        break

    buffer += line
    result = ACTION_ON_BUFFER  # placeholder: act on the buffered input here
    buffer = ""

Complete Reference Script

import rlcompleter
import readline
import os
import pprint
import pycolorize

from pyparsing import *

if not os.access(".history", os.F_OK):
    open(".history", "w").close()

readline.read_history_file(".history")

class ExpressionParser(object):
    def __init__(self):
        self._expr = Forward()
        factor = ( Word(nums) | Group(Suppress('(') + self._expr + Suppress(')')) )
        term = Group(factor + ZeroOrMore((Literal('*') | Literal('/')) + factor))
        self._expr << Group(Optional(Literal('-') | Literal('+')) + term + ZeroOrMore((Literal('-') | Literal('+')) + term))

    def _calculate(self, l):
        # Recursively evaluate nested lists, then the resulting flat expression.
        while any([ isinstance(x, list) for x in l ]):
            for n, i in enumerate(l):
                if isinstance(i, list):
                    l[n] = self._calculate(i)
        return str(eval(" ".join(l)))

    def __call__(self, string):
        return self._calculate(self._expr.parseString(string).asList())

print pycolorize.green("Enter your commands to tokenize:")
print pycolorize.green("Enter a blank line to exit.")

buffer = ""

while True:
    try:
        line = raw_input(pycolorize.light_blue("BASIC$ "))
    except EOFError:
        readline.write_history_file(".history")
        print
        break

    if line.lower() == "exit" or line.lower() == "quit":
        readline.write_history_file(".history")
        break

    buffer += line
    result = None

    try:
        result = ExpressionParser()(buffer)
    except ParseBaseException, e:
        buffer = ""

        pycolorize.error(e.line)
        pycolorize.error(" "*(e.col - 1) + "^")
        pycolorize.error(str(e))

        continue

    pycolorize.status("Result: %s", result)

    buffer = ""

Conclusion

Writing parsers with pyparsing is quite simple, but remember that any non-LL grammar will need its left recursion factored out. Python isn't the only language with available parser generators: C and C++ have bison and lemon, and other languages are sure to have their own as well.

New Project: pclean


I've often gotten frustrated with my /etc/portage/package.* files when they become massive and full of crud for packages I don't even have installed any longer. Because of this, I have crafted a simple little utility that cleans out packages that are no longer installed, and USE flags that are no longer valid, from these files. This should help trim the cruft from a Gentoo configuration.

pclean does all of this and has only one major problem (so far) standing between it and a 1.0 release. If you would like to try this little utility, it's available in my overlay; if you notice any odd behavior, please report it to my bugzilla.

Ping Checks Fail in Cacti with iputils-20100418


Introduction

It appears the output of ping changed in this release of iputils: the ICMP sequence numbers are now labelled icmp_req rather than icmp_seq. This obliterates the ping.pl script that Cacti uses to ping the servers it watches and results in NaN values.

The Fix

The fix is quite simple: change seq to req in the grep line of ping.pl. The following patch is probably more versatile, though (and will be sent upstream for review).

*** a/ping.pl 2010-07-09 17:33:46.000000000 -0500
--- b/ping.pl 2010-10-24 18:22:16.325881546 -0500
*************
*** 4,10 ****
  $host = $ARGV[0];
  $host =~ s/tcp:/$1/gis;

! open(PROCESS, "ping -c 1 $host | grep icmp_seq | grep time |");
  $ping = <PROCESS>;
  close(PROCESS);
  $ping =~ m/(.*time=)(.*) (ms|usec)/;
--- 4,10 ----
  $host = $ARGV[0];
  $host =~ s/tcp:/$1/gis;

! open(PROCESS, "ping -c 1 $host | grep -E 'icmp_(r|s)eq' | grep time |");
  $ping = <PROCESS>;
  close(PROCESS);
  $ping =~ m/(.*time=)(.*) (ms|usec)/;
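
To check which label a particular iputils build emits, before or after patching, a quick one-liner does the job (sketch):

ping -c 1 localhost | grep -oE 'icmp_[rs]eq' | head -n 1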

Conclusion

Sometimes things just break because of small changes. This is a simple example of that, along with the quick fix for the annoyance of Cacti not recording ping times.