NAME

Thread::Apartment - Apartment threading wrapper for Perl objects

SYNOPSIS

    package MyClass;

    use Thread::Apartment::Server;
    use base qw(Thread::Apartment::Server);

    sub new {
        # the usual constructor
    }

    # mark some methods as simplex
    sub get_simplex_methods {
        return { 'bar' => 1 };
    }

    # mark some methods as urgent
    sub get_urgent_methods {
        return { 'bingo' => 1 };
    }

    sub foo {
        # do something
    }

    sub bar {
        # do something else
    }

    sub bingo {
        print "BINGO!\n";
    }

    1;

    # create pool of 20 apartment threads
    Thread::Apartment->create_pool(AptPoolSize => 20);

    my $apt = Thread::Apartment->new(
        AptClass   => 'MyClass',    # class to install in apartment
        AptTimeout => 10,           # timeout secs for TQD responses
        AptRequire => {             # classes to require into the thread
            'This::Class' => '1.234',
            'That::Class' => '0.02'
        },
        AptParams  => \@params_for_MyClass) || die $@;

    my $result = $apt->foo(@params);
    die $@ unless $result;

    # bar is simplex
    $apt->bar(@params);

DESCRIPTION

Thread::Apartment provides an apartment threading wrapper for Perl classes. "Apartment threading" is a method for isolating an object (or object hierarchy) in its own thread, and providing external interfaces via lightweight client proxy objects. This approach is especially valuable in the Perl threads environment, which doesn't provide a direct means of passing complex, nested structure objects between threads, and for non-threadsafe legacy object architectures, e.g., Perl/Tk.

By using lightweight client proxy objects that implement the Thread::Queue::Queueable interface, with Thread::Queue::Duplex objects as the communication channel between client proxies and apartment threads (or between threads in general), a more thread-friendly OO environment is provided, ala Java, i.e., the ability to pass arbitrary objects between arbitrary threads.

Thread::Apartment is a fundamental component of the PSiCHE framework (<http://www.presicient.com/psiche>).

Glossary
METHODS

Refer to the included classdocs for summary and detailed method descriptions.

Application Notes and Restrictions

Closures must be passed from TAS objects

In the general case, closures cannot be passed between threads. Hence, a special mapping scheme is used to proxy a closure originating in one TAS but passed to another TAS. As such, closures originating outside a TAS cannot be passed to a T::A object. If your application needs to pass a closure to a T::A object from the main processing flow (e.g., to supply a closure to Tk::Threaded), you'll need to create a class and create or install an instance of it in an apartment thread.

Passing filehandles

Filehandles (aka GLOBs) cannot be passed between ithreads; hence, any objects which contain GLOBs (e.g., IO::Socket) also cannot readily be passed. The recommended method for passing filehandles between threads is to pass the fileno() and reconstruct the filehandle in the receiving thread (via either IO::Handle::fdopen() or open(FH, "&$fileno")).

Despite best efforts, no consistent solution for marshalling and especially unmarshalling filehandles - while preserving their access modes and PerlIO layers - could be found. As of Perl 5.8.6, there appear to be bugs in open() and binmode() regarding mixing fileno open()'s and layers, and Win32 doesn't appear to have any means of recovering access modes from an existing filehandle. Therefore, applications are responsible for providing their own mechanisms for marshalling filehandles between threads.

Cannot Provide Proxied Access to Members of Tied Objects

Since TAC is itself a threads::shared object, and threads::shared objects cannot be tied, it is not possible to proxy the tied "STORE", "FETCH", etc. methods. Note that this does not preclude using a tied object in a T::A, but the resulting TAC will not be able to access the tied elements of the proxied object.
It may be possible to create a non-threads::shared subclass of TAC to support the tied capability; refer to DBIx::Threaded for hints on how to support tied objects.

Proxied operator overloading not supported

Operator overloading is not currently supported. A future version of T::A may provide a means to proxy overloaded operators, ala proxied closures.

Proxied lvalue methods not supported

Due to the inability to capture the actual assignment event associated with lvalue subs, it is not possible for the proxy TAC to safely pass the assigned value back to the proxied object. However, clever subclasses of TAC - and associated TAS subclasses - may overcome this limitation by creating lvalue'd subs in the TAC, and permitting the TAS to directly reference the TAC's members (since the TAC is threads::shared, its members are available to the TAS thread). Implementors of such subclasses are urged to be mindful of the probable locking requirements, and the inability to determine the precise instant when an lvalue assignment occurs.

AUTOLOADing in Apartment Threaded Packages

In order to minimize proxy overhead, when an object is installed into an apartment thread, the object's @ISA hierarchy (as reported by Class::ISA::self_and_super_path), and the list of available public methods (as reported by Class::Inspector::methods()) are exported to the client proxy objects, so that "isa()" and "can()" will execute locally without the overhead of a request/response exchange over the TQD. As a result, the installed object should explicitly declare and/or implement all public methods. However, classes which need to rely on AUTOLOADing can specify that in a number of ways:
When an undeclared public (i.e., no leading underscore) method is invoked, if AUTOLOADing has been enabled by any of these methods, the object's TAC will pass the method call to the TAS, which can AUTOLOAD if needed. Note that undeclared methods are always executed as duplex, non-urgent methods, and that "can()" method calls will be passed to the TAS if an undeclared method is referenced.

Use "$self->isa()", not "ref $self"

Due to the exported @ISA described above, using the "ref" operator on the client stub objects will report a TAC or TACo object, rather than the proxied object. TAC overrides the "UNIVERSAL::isa()" method to test the exported @ISA hierarchy.

Subclassing Thread::Apartment::Server

TAS provides implementations of several abstract methods which may be overridden in subclasses. Refer to the Thread::Apartment::Server classdocs for method details.

I/O Bound Classes

Classes which detect/trap I/O (or other async) events should inherit from TAES and provide an implementation of its "poll()" method, in order to interleave the apartment thread's TQD "dequeue()" calls and the detection of internal events. E.g., a network socket monitor which calls "select()" to detect socket events would implement the "select()" call with some small timeout inside its "poll()" implementation.

"Thread::Apartment::run()" detects the installation of a TAES object, and will use TQD's "dequeue_nb()" method, rather than "dequeue()", to check for incoming method calls, and, if none are available, will call the object's "poll()" method to permit the object to field any events.

Classes with Control Loops

Classes which encapsulate their own control loops (e.g., Perl/Tk) should inherit from TAMS and frequently call "Thread::Apartment::MuxServer::handle_method_requests()" to check for and process any external proxied method calls. T::A detects the installation of a TAMS object, and cedes control to the TAMS's "run()" method when the internal T::A::_run() method is called.
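The dequeue_nb()/poll() interleaving described under I/O Bound Classes can be sketched as follows. This is only an illustration of the control flow, not T::A's actual run() code: it uses the core Thread::Queue module in place of a TQD, and the My::Poller class, its "done" flag, and the run_loop() function are hypothetical names invented for this example.

```perl
use strict;
use warnings;
use Thread::Queue;

# Hypothetical stand-in for a TAES-style object (not part of T::A).
package My::Poller;

sub new { return bless { polls => 0, done => 0 }, shift }

# poll() looks for internal events with a small timeout so the loop
# can get back to the request queue quickly.
sub poll {
    my $self = shift;
    $self->{polls}++;
    select(undef, undef, undef, 0.01);    # 10 ms sleep stands in for select() on sockets
    # Assumption for the demo: the third poll detects a "shutdown" event.
    $self->{done} = 1 if $self->{polls} >= 3;
    return;
}

package main;

# Interleave non-blocking dequeues with calls to the object's poll().
sub run_loop {
    my ($queue, $obj) = @_;
    my @handled;
    until ($obj->{done}) {
        my $request = $queue->dequeue_nb();    # returns undef if the queue is empty
        if (defined $request) {
            push @handled, $request;           # a real loop would dispatch the method call
        }
        else {
            $obj->poll();                      # no requests pending: field internal events
        }
    }
    return \@handled;
}

my $queue  = Thread::Queue->new();
my $poller = My::Poller->new();
$queue->enqueue('foo', 'bar');                 # pretend two proxied calls are waiting

my $handled = run_loop($queue, $poller);
print "handled: @$handled; polls: $poller->{polls}\n";
```

In this sketch the pending requests are always drained before poll() is called, which is why the timeout inside poll() should stay small: it bounds how long an incoming method call can wait.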
Subclassing Thread::Apartment::Client

When a TAS based class is used, the class may override the "create_client()" method to manufacture its own TAC. By subclassing TAC, the implementation can provide optimizations of TAC behavior, e.g., providing thread local accessor/mutator methods for static scalar values, or for threads::shared scalar, array, or hash refs, in order to avoid the overhead of making a proxied method call. Refer to the Thread::Apartment::Client classdocs for detailed descriptions of its methods.

Installing POPOs

In order to provide the greatest possible flexibility, T::A supports installing Plain Old Perl Objects aka POPOs. POPOs do not implement the TAS class, and thus can be nearly any existing class definition, with the following limitations:
Externally Created Threads Must Run "Thread::Apartment::run()"

In the event an application wants to supply threads to the Thread::Apartment constructor (e.g., from a pre-created thread pool), the threads should use the "Thread::Apartment::run()" method, e.g.,

    #
    # create our backchannel
    #
    my $cmdq = Thread::Queue::Duplex->new(ListenerRequired => 1);
    $cmdq->listen();
    #
    # ...some more code...
    #
    # create our thread pool:
    # start the threads first, then retrieve their
    # TQDs; this minimizes the context the started
    # threads inherit
    #
    my %tqds;
    my @my_threads;
    push @my_threads, threads->create(\&Thread::Apartment::run, $cmdq)
        foreach (1..$poolsize);
    #
    # now get their TQDs: the thread
    # posts them to the backchannel, along
    # with the thread ID from which it came
    #
    foreach (1..$poolsize) {
        my $resp = $cmdq->dequeue();
        $tqds{$resp->[0]} = $resp->[1] if $resp;
    }

Errors Returned as "undef", with Error Text in $@

T::A assumes that any non-simplex method that returns "undef" has an error, and the error message is available in $@ (as for "eval{}" operations). An application specific adapter class may be required to adapt existing classes to this error reporting behavior.

Object-returning Methods

When an object reference is returned from a T::A managed object, T::A checks if the object is a TQQ object (i.e., it implements "curse()" and "redeem()" methods). If it is, then the object is marshalled to the TQD as usual. Otherwise, T::A assumes the returned object is part of an object hierarchy to be executed within its current apartment thread (aka "Zone Threading"), and will
Note that, in the event the object has previously been mapped in the hierarchy, the existing TAC instance for the object will be reused.

T::A (via TAS::marshalResults()) does not do a deep inspection of returned values to detect instances of non-TAC objects. If an application returns objects within a returned data structure, it will need to provide an appropriate subclass to implement the needed marshal/unmarshal methods.

Cyclic Object/Method Dependencies

If TAS object A calls method1() on TAS object B, which in turn calls method2() on TAS A, then the associated apartment threads will deadlock (A is waiting for a response from B, while B is waiting on a response from A). Such problems may be avoided by any of
The re-entrancy approach causes the TAC for object B to use a special "Thread::Apartment::run_wait()" method which will field any incoming proxy method requests for object A at the same time as it waits for the results from the call to "method2". The class-method version of Thread::Queue::Duplex::wait() is used to wait on both the local proxied object's TQD, as well as waiting on the response to the specific method request sent to object B's TQD.

Note that using the "ta_reentrant_" method prefix has a transient effect, i.e., it only applies to the single method call; the other 2 approaches will persist the re-entrant behavior for all proxied method calls from all objects within the thread, including closure calls, until Thread::Apartment::set_reentrancy(0) is called to disable re-entrancy.

Finally, note that re-entrancy should be used with caution, as it could lead to inadvertently deep recursions; process-wide performance degradation (due to the lock signalling required); and unexpected object states compared to the non-threaded, sequentially executed equivalent.

Unexpected TAC AUTOLOADs for Indirect Object References

If an object is indirectly referenced via a hash, e.g., "$objmap->{'MyObject'}->someMethod();", TAC's AUTOLOAD may get a DESTROY method reference, rather than the expected 'someMethod' value (the reasons for this are not yet clear...). As a result, it may be necessary to dereference the object into a lexical variable to invoke the method, e.g.,

    my $temp = $objmap->{'MyObject'};
    $temp->someMethod();

Further investigation is needed to determine the reason for this behavior.

Passing/Invoking Closures Between T::A Objects

Managing closures in T::A relies on ithread's isolation of class variables between threads, i.e., assume SomeClass declares

    package SomeClass;
    our $variable;

Further assume that SomeClass is loaded into 2 different threads.
Then modifying $SomeClass::variable in thread A does not affect the current value of $SomeClass::variable in thread B. Hence, T::A declares the following non-threads::shared class variables:
When a new root object is installed into an apartment thread, the %closure_map and $next_closure_id variables are reset, and $closure_signature is set to the current timestamp.

Two methods for passing closures are supported: either directly as CODEREFs, or by creating Thread::Apartment::Closure aka TACl instances. The latter method permits the closure generator to specify the simplex and/or urgent properties of a closure. When specified as a simple CODEREF, the closure recipient will always assume the closure is duplex (i.e., will wait for a returned result) and non-urgent. The following methods are provided to support creating TACl's explicitly:
E.g.,

    #
    # regular CODEREF: $recvr will wait for the closure to complete
    #
    $recvr->someMethod(-command => sub { print "in a closure"; });
    #
    # TACl: $recvr will not wait for the closure to complete
    #
    $recvr->someMethod(-command => $self->new_simplex_tacl(sub { print "in a closure"; }));

Simplex closures may be useful in situations where 2 apartments may "ping-pong" closure calls on each other, in order to avoid deadlock. They may also be useful to expedite processing when no returned values are needed.

Default closure call behavior can be modified via either
The following acronyms are used in the following detailed discussion:
Processing of CODEREF Closures

When a CGTAS object passes a CODEREF to a CRTAC, the CRTAC's marshalling logic will detect the CODEREF (within the T::A::Common::marshal method). At that point in time, the CRTAC is executing within the CGTAS's thread, and hence, any assignment to the thread's Thread::Apartment class variables will be private to that thread.
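The thread-private behavior of class variables that the closure mapping relies on can be observed with plain ithreads. The following is a minimal sketch using only the core threads module; it does not use Thread::Apartment itself, and the SomeClass package and its variable are the illustrative names from the discussion above, not T::A's real class variables.

```perl
use strict;
use warnings;
use threads;

package SomeClass;
our $variable = 'initial';      # ordinary (non-threads::shared) package variable

package main;

# The new thread gets its own clone of $SomeClass::variable, so the
# assignment below is visible only inside that thread.
my $thr = threads->create(sub {
    $SomeClass::variable = 'changed in thread';
    return $SomeClass::variable;
});

my $seen_in_thread = $thr->join();
my $seen_in_parent = $SomeClass::variable;

print "thread saw: $seen_in_thread\n";    # prints "thread saw: changed in thread"
print "parent saw: $seen_in_parent\n";    # prints "parent saw: initial"
```

Because each thread holds its own copy, a per-thread closure map kept in such variables needs no locking, at the cost of requiring an explicit mapping step whenever a closure crosses a thread boundary.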
Processing of TACl Closures

When a CGTAS wishes to apply simplex or urgent properties to a closure, it must create a complete TACl object. The TACl object behaves much like the CRTAC for the CODEREF case: it calls class-level methods to allocate an ID, applies the simplex and/or urgent bits to the ID, then maps the closure into the thread's map, and installs the local TAC into the TACl. When the CRTAC detects the TACl while marshalling the method call, it simply invokes TACl's curse() method to marshal the signature, ID, and CGTAC. The remainder of the closure processing is identical to the CODEREF case.

Closures as Return Values

Thus far, closures have been discussed solely as method arguments; however, closures may also be return values. In such cases, the CRTAC will marshal the closure as for the CODEREF or TACl cases described above, and they'll be recovered in the CRTAS when the return values are unmarshalled. The invocation of the closures remains the same.

Limitations

In its current implementation, the marshalling/unmarshalling process does not detect closures deep within structures passed between threads. In such cases, the Storable package used to marshal/unmarshal complex non-threads::shared structures between threads will throw an error. If an application needs to pass such complex structures between threads, it will need to provide its own TAC subclasses, with appropriate marshalling logic to map the closures.

Asynchronous Method Calls

As an additional means of avoiding deadlock situations as described above, and to simplify execution of concurrent operations, T::A provides support for asynchronous method calls on duplex methods. Two async method call mechanisms are supported:
Best Practices

Allocate threads and TQDs early

Due to Perl's heavyweight thread model (i.e., cloning the entire parent thread context), threads that are spawned after many modules have been loaded, or lots of objects have been created, may consume significant unneeded resources. By creating threads as early as possible, and deferring module loading (i.e., not "use"'ing many/any modules, but rather "require"'ing when needed), the apartment threads will be created within the minimum required context.

Use T::A::set_single_threaded() for debugging

Debugging threaded applications can be very challenging (not only Perl, but any language). Perl's current debugger provides little or no support for differentiating between the execution context of different threads. Therefore, using the single threaded implementation for preliminary test and debug is highly recommended. While it may not surface issues related to concurrency, it will usually be sufficient for finding and fixing most application logic bugs.

Wrap Filehandles With Access Discipline Objects

Given Perl's inability to marshal filehandles between threads, wrapping handles with classes that provide the access disciplines can be used to provide a marshal-able solution, e.g., a Logger class that provides logging methods, as well as file open, truncation, and close methods, and is provided (as a TAC) to any other objects needing a logger.

PREREQUISITES
CHANGE HISTORY

Release 0.50
Release 0.10
TO DO

Filehandle marshalling support

Some means of transparently passing filehandles (as either GLOBs, or subclasses of IO::Handle) would be nice. However, it appears lots of things in Perl need to be fixed for that to be a realistic option.<sigh/>

Operator overloading

Some proxied method of supporting operator overloading is needed (though I personally dislike operator overloading). An early design would add

    use overload nomethod => \&operAutoLoad;

to TACs, so that operators applied to a TAC could get redirected to the associated TAS. However, a means of operator overload introspection is also required so the TAS can report its operator overloads to the TAC.

lvalue methods

Might be useful, but difficult to implement without some means to detect the actual assignment event. Can be supported by TACs if the target variable is threads::shared, but then requires some locking support as well.

Provide multiprocess/distributed implementation

The infrastructure contained within Thread::Apartment should be readily extendable to a multiprocess version, possibly using some lightweight IPC mechanism in place of TQDs, and Storable to marshal/unmarshal objects and method calls. Likewise, a fully distributed implementation should be feasible using sockets in place of TQDs.

Implement AptSimplex, AptUrgent method attributes

Rather than requiring a TAS subclass to provide get_simplex_methods() and get_urgent_methods(), method attributes could be provided; however, the status of attribute support in Perl 5 is a bit nebulous at this time.

Support CLONE_SKIP()

CLONE_SKIP has been added in Perl 5.8.8 to help avoid unneeded object cloning (and thus improve performance and reduce memory footprint). It would probably be useful to support this for the proxied objects.

Implement as C/XS or Inline::C

Given the "CORE-ish" nature of T::A's behavior, a better performing and more lightweight solution using C/XS would be desirable. But I suspect there be dragons there...
Implement T::A wrappers for commonly useful modules

E.g., HTTP::Daemon, HTTP::Daemon::SSL, Class::DBI, etc. DBI and Perl/Tk are already covered...

Support for Thread::Queue::Multiplex

I have vague notions of how a publish/subscribe architecture might exploit T::A, but need a reference application to get a better idea how to implement.

Better Support for DESTROY

At present, reference counting and proxied object destruction are not fully implemented. In addition, DESTROY events in the TAC occur frequently, and appear to be duplicates or accidental, e.g., in some instances, simply dereferencing a TAC from a hash causes a DESTROY, even though the TAC has not been removed from the hash. Hence, it's difficult to determine when the proxied object should be advised of a DESTROY event. At present, applications should assume that apartment threaded objects will be retained for the life of the application's execution.

Add a Thread::Apartment::Rendezvous class

The current implementation doesn't readily support initiation of concurrent method requests, and waiting for them all to complete. While asynchronous methods are a partial solution, they don't provide a coordinated wait mechanism. By adding a Thread::Apartment::Rendezvous object, the initiating application could wait for completion of all the started methods, and then proceed with processing.

An initial design would use an alternate method signature for async methods; instead of passing a closure to be called when the method completed, a T::A::Rendezvous aka TAR object would be provided. When the method call was initiated, the TAC would pass itself and the generated method request ID to the Rendezvous object. Once the application had initiated all the methods, it would invoke the TAR's "rendezvous()" method to wait for completion of all the methods. Upon completion, the application could retrieve results from individual TACs using the request identifiers returned by the previous async method call.
Example:

    my $rdvu = Thread::Apartment::Rendezvous->new();
    ...
    my @pending = ();
    push @pending, $tac1->ta_async_methodA($rdvu, @args);
    push @pending, $tac2->ta_async_methodB($rdvu, @args);
    push @pending, $tac3->ta_async_methodC($rdvu, @args);

    $rdvu->rendezvous();

    my $result1 = $tac1->ta_get_results(shift @pending);
    my $result2 = $tac2->ta_get_results(shift @pending);
    my $result3 = $tac3->ta_get_results(shift @pending);

However, some challenges remain:
SEE ALSO

Thread::Queue::Duplex, Thread::Queue::Queueable, DBIx::Threaded, Tk::Threaded, Thread::Resource::RWLock, threads::shared, perlthrtut, Class::ISA, Class::Inspector

Pots and Thread::Isolate provide somewhat similar functionality, but aren't quite as transparent as Thread::Apartment. They also do not appear to support passing closures between threaded objects.

AUTHOR & COPYRIGHT

Copyright(C) 2005, 2006, Dean Arnold, Presicient Corp., USA

Licensed under the Academic Free License version 2.1, as specified in the License.txt file included in this software package, or at OpenSource.org <http://www.opensource.org/licenses/afl-2.1.php>.