Mud Slide

Friday, 12 November 2010

More Perl eval DESTROY woes!

Something that now seems obvious is the scoping issue associated with the $@ variable, eval and an object's destructor. Consider the scenario where, as in a C++ program, you want to use well-defined exceptions to control the flow of the program under erroneous circumstances, rather than arbitrarily passing parameters around or relying on return value checking like this:

my $success = 0;
if (open(my $fh, ">", "/dev/null")) {
    if (myFunction("some parameter")) {
        my $obj = My::Something->new();

        if ($obj->method1()) {
            if ($obj->method2()) {
                $success = 1;
            }
        }
    }
}
if (! $success) {
    warn("Oh dear, something went wrong");
    return 0;
}
return 1;

Levels of nesting can start to look ugly, and with lots of return value checking going on, the code can become hard to follow or maintain. So instead, I often look to simplify things like this:

eval {
    open(my $fh, ">", "/dev/null") or do {
        die(My::Exception->new($!));
    };

    myFunction("some parameter");

    my $obj = My::Something->new();
    $obj->method1();
    $obj->method2();
};

if ($@) {
    my $ref = ref($@);

    if ("My::Exception" eq $ref) {
        $@->warn();
    } else {
        warn("Oh dear, something went wrong: $@");
    }

    return 0;
}

return 1;

Ensuring that all packages and functions you create throw some exception object makes error reporting easy to localise and self-contained. It's also easy to disable, since you don't have warnings plastered throughout your code.
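The post never shows My::Exception itself, so the following is only a minimal sketch of what such a class might look like. The message field, the message() accessor and the attempt() helper are assumptions made for illustration; only new() and warn() are implied by the code above.

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed );

# Hypothetical minimal exception class; its fields and methods are
# assumptions, not the class actually used in the post.
package My::Exception;

sub new {
    my ($class, $message) = @_;
    return bless({ message => "$message" }, $class);
}

sub message { return $_[0]->{message}; }

# Report the exception on STDERR. CORE::warn is named explicitly so the
# built-in is called rather than this method.
sub warn {
    my $this = shift;
    CORE::warn(ref($this), ": ", $this->{message}, "\n");
}

package main;

# Hypothetical helper: run a code reference and turn any exception into
# a 0/1 return value, reporting it on the way out.
sub attempt {
    my $code = shift;

    eval { $code->(); 1 } or do {
        my $err = $@;
        if (blessed($err) && $err->isa("My::Exception")) {
            $err->warn();
        } else {
            warn("Oh dear, something went wrong: $err");
        }
        return 0;
    };
    return 1;
}

attempt(sub { die(My::Exception->new("disk full")) });
attempt(sub { die("plain string\n") });
```

Using blessed() and isa() rather than comparing ref($@) against a string means subclasses of the exception are handled as well.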

Nice as that is, in Perl there is one caveat that caught me out. Consider the scenario above, where the destructor of My::Something calls some method or function that contains an eval block. With that in mind, also consider what would happen if method2 were to throw an exception. Here is what happens:

# My::Something constructor is executed.
my $obj = My::Something->new();

# When method2 throws an exception, the eval
# block is exited and $@ is set to the appropriate
# exception object by 'die'.
$obj->method2();

# After setting $@ but before executing the next
# statement after the eval block, Perl executes
# the destructor on $obj. Within the destructor,
# some method calls 'eval', which resets $@ on
# entry and leaves it empty when its block
# completes without dying.
eval { 1; };

# Now when the destructor has finished, Perl executes
# the next statement where it evaluates whether the 'eval'
# block was successful or not.
if ($@) { ...

# Because the destructor's 'eval' reset $@, the
# code skips the error reporting and returns a
# successful return value.
return 1;

This is a complete disaster and will easily go unnoticed until something much further down the line reveals that something that should have happened hasn't, or vice versa. However, there is an extremely simple way to secure the destructor of an object against such an event: localise $@ within the destructor:

sub DESTROY {
    my $this = shift;
    local $@;

    eval {
        die("Now this error will truly be ignored");
    };
}
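To see the effect end to end, here is a small self-contained sketch. The class body, method2 and the run() helper are stand-ins invented for the demonstration. Note that Perl 5.14 and later assign $@ only after the stack has unwound, so the clobbering described above mainly bites on older perls; localising $@ is harmless either way.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the class discussed in the post.
package My::Something;

sub new { return bless({}, shift); }

# Always fails, to force the outer eval to unwind.
sub method2 { die("method2 failed\n"); }

sub DESTROY {
    my $this = shift;
    local $@;                           # keep the caller's $@ intact
    eval { die("cleanup failed\n"); };  # error is contained in here
}

package main;

# Runs the failing method inside eval and hands back whatever $@ holds.
sub run {
    eval {
        my $obj = My::Something->new();
        $obj->method2();
    };
    return $@;
}

my $err = run();
print "outer error: $err";    # prints "outer error: method2 failed"
```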

For such a simple solution, it's worth making a habit of always localising $@ within a destructor, unless you explicitly want to propagate a destructor exception up to some other handler. But since propagating it risks overwriting some other, more important exception that quite possibly caused the exception in the destructor in the first place, it's probably worth implementing some global variable for destructor exceptions:

package My::Something;

# Declared with 'our' so that it really is the package
# variable $My::Something::destruct_except used below.
our $destruct_except;

sub DESTROY {
    my $this = shift;
    local $@;

    $My::Something::destruct_except = undef;

    eval {
        die("Oh dear, that's not supposed to happen!");
    };

    if ($@) {
        $My::Something::destruct_except = $@;
    }
}

Obviously, if there are multiple instances of the same object type in a single eval block, it would be very difficult to track which destructors threw and which didn't. Then you would have to become more cunning, using some sort of hash or list to stack up the exceptions that occurred in each destructor. For the most part though, you are usually not interested in what fails within a destructor, since its primary purpose is to clean up. If what it wants to clean doesn't exist then, as far as you are concerned, its job is done and you don't need to know about what couldn't be cleaned, because the lack of existence implies it is clean.
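As a rough sketch of the list idea, each destructor could push its failure onto a package-level array together with something that identifies the instance. The @destruct_errors array and the name field are hypothetical, and the failure is simulated.

```perl
use strict;
use warnings;

package My::Something;

# Hypothetical package-level log of destructor failures.
our @destruct_errors;

sub new {
    my ($class, $name) = @_;
    return bless({ name => $name }, $class);
}

sub DESTROY {
    my $this = shift;
    local $@;

    eval {
        # Simulated failure: only the instance named "second" fails.
        die("cleanup failed\n") if $this->{name} eq "second";
        1;
    } or push(@destruct_errors, { name => $this->{name}, error => $@ });
}

package main;

{
    my $first  = My::Something->new("first");
    my $second = My::Something->new("second");
}   # both destructors run here

for my $entry (@My::Something::destruct_errors) {
    print "destructor for '$entry->{name}' failed: $entry->{error}";
}
```

The caller decides when to inspect and clear the list, so a destructor failure never competes with the exception that is already in flight.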

Monday, 9 August 2010

FOLLOW UP: Perl: eval {...}, DESTROY and fork()

Just following up on a previous entry. I have read something interesting on the destructors of Perl modules in a threaded environment. The mechanism below doesn't apply to forked processes, since the kernel rather than Perl duplicates a forked process, but it does provide a way to make classes whose objects are cloned into new threads thread-safe:

CLONE_SKIP
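A minimal sketch of how CLONE_SKIP might be used, assuming a perl built with ithreads. My::Handle is a hypothetical class. When a thread is created, Perl calls CLONE_SKIP as a class method on each package that defines it, and a true return value means instances of that class are not cloned into the new thread (the thread sees an unblessed placeholder instead).

```perl
use strict;
use warnings;
use Config;
use Scalar::Util qw( blessed );

# Hypothetical class that must never be duplicated into another thread.
package My::Handle;

sub new { return bless({ pid => $$ }, shift); }

# Called by Perl as a class method when a new thread is created.
# Returning true prevents instances from being cloned into that thread.
sub CLONE_SKIP { return 1; }

sub DESTROY {
    my $this = shift;
    # Clean-up that should only ever run once would go here.
}

package main;

my $obj = My::Handle->new();

# Only attempt the thread demonstration on a perl built with ithreads.
if ($Config{useithreads}) {
    require threads;
    my $thr  = threads->create(sub { blessed($obj) ? "object" : "placeholder" });
    my $seen = $thr->join();
    print "child thread sees: $seen\n";
}
```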

Friday, 30 July 2010

XSLT: Poor browser compilation reporting.

You have to love the lack of context with web browser XSLT processing:

Firefox - "Error during XSLT transformation: Evaluating an invalid expression."

All down to a double equals in an expression (XPath tests equality with a single =). It's a mistake I make regularly, but under normal circumstances it is easy to spot via xsltproc:

XPath error : Invalid expression
$leftspan == 2
^

compilation error: file xxx-xxxxx.xsl line 193 element if
xsl:if : could not compile test expression '$leftspan == 2'

Thursday, 1 July 2010

Perl: eval {...}, DESTROY and fork()

Okay, the point of this exercise is just to make a note of Perl garbage collection behaviour that can have an elusive twist if you are not careful. In my case, I thought Perl was erroneously calling the destructor on my object multiple times, when in actual fact it was behaving correctly.

For those unaware of how a Perl destructor is implemented, here is a quick example:

package Hello;

use strict;

sub new {
    my $class = shift;
    my $this = {};

    $this->{pid} = $$;

    # Do something

    return bless($this, $class);
}

sub DESTROY {
    my $this = shift;

    if ($$ != $this->{pid}) {
        return;
    }
}

1;

To avoid repeating the same code with small modifications, the example above already includes the caveat (obvious to those familiar with thread-safe coding styles). As you can see, the constructor records the PID of the process in which the instance is created. The destructor later uses this to decide whether the process calling it has the right to truly destroy the object.

In the scenario I experienced, I was creating a file in a method used to create a transaction. If the transaction is never committed and the object instance goes out of scope, the file should be removed as part of the destructor; this just ensures files aren't left lying around should the object instance be discarded.

The problem was in the call to fork() elsewhere in the same scope as the object instance, but outside the Hello module. During the fork operation, the child implicitly acquires a copy of everything in memory and access to any file handles. Since the child holds a copy of the object instance in exactly the same state as the parent at the time of the fork, the destructor is also called in the child when the child terminates, even if the instance is still in scope in the parent. In my case, the destructor removed the transaction file because the transaction remained uncommitted.

As already pointed out, the simple solution is to perform destructive operations in a process other than the one that instantiated the object only if that is the real intention. Otherwise, err on the side of caution and return early from the destructor when the PID does not match.
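The scenario can be reproduced with a short sketch. My::Transaction, the temporary file name and the $survived/$removed variables are all hypothetical. The child exits straight away, which runs the destructor on its copy of the object, and the PID guard keeps the file in place until the parent lets go of the instance.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical transaction class that owns a file on disk.
package My::Transaction;

sub new {
    my ($class, $path) = @_;
    open(my $fh, ">", $path) or die("Cannot create $path: $!\n");
    close($fh);
    return bless({ pid => $$, path => $path }, $class);
}

sub DESTROY {
    my $this = shift;
    return if $$ != $this->{pid};   # only the creating process cleans up
    unlink($this->{path});
}

package main;

my $path = File::Spec->catfile(File::Spec->tmpdir(), "txn-demo-$$.tmp");
my $survived;

{
    my $txn = My::Transaction->new($path);

    my $pid = fork();
    die("fork failed: $!\n") unless defined($pid);

    if ($pid == 0) {
        exit(0);    # child: DESTROY runs on the copy, the guard returns early
    }

    waitpid($pid, 0);
    $survived = -e $path ? 1 : 0;
}   # parent: $txn goes out of scope and the file is removed

my $removed = -e $path ? 0 : 1;

print "file survived child exit: $survived\n";    # prints 1
print "file removed by parent:   $removed\n";     # prints 1
```

Removing the guard line makes the first value 0, which is the behaviour that originally looked like Perl calling the destructor twice.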

So where does eval {...}; come in to all of this?

To make debugging this issue difficult, there were a few things happening:

Warnings on STDERR within the destructor didn't get output to the console.
The stack trace was lacking information.
An eval {...}; block always seemed to be the last call in the return stack.

To solve the warning issue, I simply reopened STDERR to a temporary file:

use File::Temp qw( mktemp );
open(STDERR, ">", mktemp("/tmp/debug.XXXXXX")) or die("Cannot reopen STDERR: $!");

Then warnings just follow suit. I also used Carp to obtain the return stack for each:

use Carp qw( cluck );
cluck($this, ": DEBUG: I am in the destructor - PID: ", $$);

Eventually this led to the following output in the debug log files:

Hello=HASH(0x87885e0) DEBUG: I am in the destructor - PID: 26151 at /lib/perl/Hello.pm line 20
Hello::DESTROY('Hello=HASH(0x87885e0)') called at /lib/perl/SomeModule.pm line 0
eval {...} called at /lib/perl/SomeModule.pm line 0

As you can see, the eval block is the last in the return stack. This led me on a bit of a wild goose chase, thinking that eval was somehow creating copies of the object instances in the same way fork does. It wasn't obvious to me that something else was forking in the same scope, since I didn't call fork directly; things only fell into place when I realised a fork was actually occurring within the same scope and thus triggering the destructor in the child. Looking back at the line 0 entries in the return stack, they are characteristic of a return stack generated from a forked child process; this is something worth retaining for future reference. Since I always forget, this will be my dumping ground.