Thursday, 1 July 2010

Perl: eval {...}, DESTROY and fork()

Okay, the point of this exercise is just to make a note of Perl garbage collection behaviour that can have elusive twist if you are not careful. In my case, I thought Perl was erroneously calling the destructor on my object multiple times, when in actual fact it was behaving correctly.

For those unaware of how a Perl destructor is implemented, here is a quick example:


package Hello;

use strict;

sub new {
    my $class = shift;
    my $this = {};

    $this->{pid} = $$;

    # Do something

    return bless($this, $class);
}

sub DESTROY {
    my $this = shift;

    if ($$ != $this->{pid}) {
        return;
    }
}

1;


To save endless amounts of repetitive modification of the same code, the caveat (obvious to those familiar with thread safe coding styles) is pointed out. As you can see, in the constructor, the PID number of the process that the instance is created in is recorded. This is later used to identify whether the process calling into the destructor, has the right to truly destroy itself.

In the scenario I experienced, I was creating a file in a method used to create a transaction. If the transaction is never committed and the object instance goes out of scope, the file should be removed as part of the destructor; this just ensures files aren't left lying around should the object instance be discarded.

The problem was in the call to fork() elsewhere in the same scope of the object instance, but outside the Hello module. During the fork operation, the child implicitly acquires a copy of everything in memory and access to any file handles. That means that if the child process terminates before the object instance goes out of scope in the parent process, the destructor gets called in the child. Since the child has a copy of the object instance held in exactly the same state as the parent at the time of the fork, the destructor will be called. In my case, the destructor removed the transaction file because the transaction remained uncommitted.

As already pointed out, the simple solution is to only perform destructive operations outside the scope of the parent process (or the process that instantiated the object), if that is the real intention. Otherwise, caution should not be thrown to the wind and an early return from the destructor if the PID cannot be identified.

So where does eval {...}; come in to all of this?

To make debugging this issue difficult, there were a few things happening:

  1. Warnings on STDERR within the destructor didn't get output to the console.
  2. The stack trace was lacking information.
  3. An eval {...}; block always seemed to be the last call in the return stack.
To solve the warning issue, I simply reopened STDERR to a temporary file:


use File::Temp qw( mktemp );
open(STDERR, ">", mktemp("/tmp/debug.XXXXXX"));


Then warnings just follow suit. I also used Carp to obtain the return stack for each:


use Carp qw( cluck );
cluck($this, ": DEBUG: I am in the destructor - PID: ", $$);


Eventually this led to the following output in the debug log files:


Hello=HASH(0x87885e0) DEBUG: I am in the destructor - PID: 26151 at /lib/perl/Hello.pm line 20
Hello::DESTROY('Hello=HASH(0x87885e0)') called at /lib/perl/SomeModule.pm line 0
eval {...} called at /lib/perl/SomeModule.pm line 0


As you can see, the eval block is the last in the return stack. This led me on a bit of a wild goose chase, thinking that eval was somehow creating copies of the object instances in the same way fork does. It was unclear to me that something else was forking in the same scope, since I didn't call fork directly. It was only evident when I realised a fork was actually occurring within the same scope and thus calling the destructor. Now looking at the line 0 attributes of the return stack, it's characteristic of a return stack generated from a forked child process; this is something worth retaining for future reference. Since I always forget, this will be my dumping ground.

No comments:

Post a Comment