Thread-safe closures

From: Leslie P. Polzer
Subject: Thread-safe closures
Date: Fri, 18 Jul 2008 15:25:44 +0000
Message-ID: <ad8c0ee5-4436-4ae4-9d68-beb547315ed4@s21g2000prm.googlegroups.com>

Here's an implementation of thread-safe closures.

Please let me know of any gotchas you spot.


(defmacro with-gensyms ((&rest names) &body body)
  `(let ,(mapcar #'(lambda (name)
                     (list name `(make-symbol ,(symbol-name name))))
                 names)
     ,@body))

(defun %current-thread ()
  #+sbcl sb-thread:*current-thread*)

; single binding
(defmacro with-threadsafe-binding ((var default) &body body)
  (with-gensyms (thread-data-map data-for-current-thread)
    `(let ((,thread-data-map (make-hash-table)))
       (flet ((,data-for-current-thread ()
                (multiple-value-bind (value foundp)
                    (gethash (%current-thread) ,thread-data-map)
                  (if foundp
                    value
                    (setf (gethash (%current-thread) ,thread-data-map)
                          ,default))))
              ((setf ,data-for-current-thread) (value)
                (setf (gethash (%current-thread) ,thread-data-map)
                      value)))
         (symbol-macrolet ((,var (,data-for-current-thread)))
           ,@body)))))

(with-threadsafe-binding (foo 5) ; test fn
  (defun bar ()
    (incf foo)))

; multiple bindings
(defmacro thread-let ((&rest bindings) &body body)
  `(with-threadsafe-binding ,(car bindings)
    ,(if (cdr bindings)
       `(thread-let ,(cdr bindings)
          ,@body)
       `(let ()
          ,@body))))

(thread-let ((foo 0) (bar 0)) ; test fn
  (defun quux ()
    (list (incf foo)
          (decf bar))))


You can test multi-threaded behaviour with something like this:

(sb-thread:join-thread (sb-thread:make-thread (lambda () (quux))))

Thanks,

    Leslie

Re: Thread-safe closures Kaz Kylheku
Re: Thread-safe closures Peter Schuller
- Re: Thread-safe closures Leslie P. Polzer
  - Re: Thread-safe closures Rainer Joswig
  - Re: Thread-safe closures Alex Mizrahi
- Re: Thread-safe closures Kaz Kylheku
Re: Thread-safe closures Alex Mizrahi

From: Kaz Kylheku
Subject: Re: Thread-safe closures
Date: Fri, 18 Jul 2008 18:57:14 +0000
Message-ID: <20080718111559.30@gmail.com>

On 2008-07-18, Leslie P. Polzer <·············@gmx.net> wrote:
> Here's an implementation of thread-safe closures.

What's this supposed to do?

I would want all invocations of the same closure to see exactly the same
environment, regardless of whether those invocations are sequential, nested,
interleaved or concurrent.

A thread-specific variable in a lexical environment doesn't make a whole lot of
sense. It's not really captured. If thread A makes the closure in which there
is a thread-specific lexical variable X with value V, then thread B calls that
closure, B has no access to value V. It sees X as uninitialized, or
initialized to NIL.

In what situations would it be an advantage for a given variable to be newly
instantiated every time a closure is made over it, /and/ newly instantiated
every time it is accessed by a new thread?

It's usually useful for thread-local storage to be otherwise static.

This construct would be cooler if it ensured one static instance of the
variable for all threads:

  (static-let (x)
    ;; x is a ``static lexical'': a lexically scoped, singly-instantiated
    ;; variable. Under multiple threads, its location is thread-specific.
    )

This is similar to the ``static'' keyword in C, and the thread-local
behavior is similar to GCC's TLS extension, the ``__thread'' keyword.

You can simulate the above with special variables, except that each thread has
to be aware of them and do the rebinding.

This could be fixed by an extension to the language which would allow some
special variables to be attributed as thread local. These specials would be
automatically given a thread-specific binding in each thread that starts up.

E.g.

  (def-thread-var *x*)

> Please let me know of any gotchas you spot.

How about this one: who cleans out the hash table?  Your closure contains a
hash table for mapping lookups to thread-specific values.  As new threads call
into the closure and access the variable, the hash table grows.

But you have no hooks for removing entries from the hash table when threads
die.  As long as a reference to the closure exists, it cannot be garbage
collected, but its hash table can be chock full of variables from threads which
have terminated. This is a virtual resource leak.  

Here is another question: is %current-thread guaranteed to be permanently
unique? If a thread dies and then a new one is created, could the (gethash
%current-thread ...) fetch the entry belonging to the dead thread?

Then there is this: your implementation seems wasteful, because a single
per-thread hash table could handle all of the variables. You are instantiating
a new hash table for every single variable. The THREAD-LET interface for
multiple bindings merely recursively expands multiple nestings of
WITH-THREADSAFE-BINDING, each of which generates code that calls
MAKE-HASH-TABLE. Yuck!
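
A minimal sketch of one way the cleanup and concurrency concerns might be
addressed, assuming SBCL: its MAKE-HASH-TABLE accepts the extension
keywords :WEAKNESS and :SYNCHRONIZED. With weak keys, an entry whose thread
object is otherwise unreachable becomes eligible for removal by the garbage
collector. MAKE-THREAD-TABLE is a hypothetical helper name, not something
from the original post, and other implementations just get a plain table
here.

```lisp
;; Sketch only. MAKE-THREAD-TABLE is a hypothetical helper.
(defun make-thread-table ()
  "Return a hash table intended to be keyed by thread objects."
  ;; SBCL extensions: weak keys let entries for unreachable threads be
  ;; collected; :SYNCHRONIZED makes individual table operations safe to
  ;; call from several threads.
  #+sbcl (make-hash-table :test 'eq :weakness :key :synchronized t)
  ;; Elsewhere: a plain table, with neither property.
  #-sbcl (make-hash-table :test 'eq))

;; Usage with a stand-in key; real code would key on the thread object.
(defun thread-table-demo ()
  (let ((table (make-thread-table))
        (key (list :stand-in-for-a-thread)))
    (setf (gethash key table) 5)
    (incf (gethash key table))
    (gethash key table)))
```

This does not address the one-table-per-variable overhead; sharing a single
such table among all bindings of one THREAD-LET would be a separate change.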

From: Peter Schuller
Subject: Re: Thread-safe closures
Date: Fri, 18 Jul 2008 17:39:45 +0000
Message-ID: <868wvzks66.fsf@hyperion.scode.org>

"Leslie P. Polzer" <·············@gmx.net> writes:

> ; single binding
> (defmacro with-threadsafe-binding ((var default) &body body)

Would not with-thread-local-binding be more indicative of what it
actually does?

Thread-safe, without further qualification, implies to me that it
somehow provides safety for concurrent access to shared state, while
(if I am not mistaken), the intent rather seems to be to provide a
seamless method of creating temporary thread-local state.

I am also curious as to what the expected use cases are. As far as I
can tell:

If the closure has side-effects that affect subsequent invocations, it
either means it is not meant to be called more than once (in which
case the concurrent use case is not relevant since it already violates
that rule), or it means the side effects are *desired* (which means
thread-local temporary state is often not what you want).

If the state is intended to be local to the invocation of a closure on
the other hand, there is no need to make it *thread-local* since it
already is.

If the state is global/shared, you will want thread-safe access rather
than thread-local versions of the data.

So I guess the use case will tend to be closures that are meant to be
invoked repeatedly and concurrently, with desired side-effects across
invocations that are to be thread-local. It would be interesting to
hear what real-life situations call for this.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <··············@infidyne.com>'
Key retrieval: Send an E-Mail to ·········@scode.org
E-Mail: ··············@infidyne.com Web: http://www.scode.org

From: Leslie P. Polzer
Subject: Re: Thread-safe closures
Date: Fri, 18 Jul 2008 17:58:53 +0000
Message-ID: <4929e5e4-b3ab-41ca-ac08-208372a1f9bd@m3g2000hsc.googlegroups.com>

> Would not with-thread-local-binding be more indicative of what it
> actually does?

Yeah, I suppose that's the name I was looking for.


> I am also curious as to what the expected use cases are. As far as I
> can tell:
>
> If the closure has side-effects that affect subsequent invocations, it
> either means it is not meant to be called more than once (in which
> case the concurrent use case is not relevant since it already violates
> that rule), or it means the side effects are *desired* (which means
> thread-local temporary state is often not what you want).

I'm not sure I can follow your implications thoroughly.
Let me explain:

I have a main thread that spawns auxiliary threads that use
the closure. The closure is called only once somewhere at the
beginning and should set up the state; this state shall be
shared with another function that uses it at the end of
the thread lifetime to clean up.

Sketch:

(defvar *value* t) ; imagine *value* being globally accessible

(thread-let ((saved nil))
  (defun setup ()
    (setf saved *value*)
    (clobber! *value*))

  (defun teardown ()
    (assert saved)
    (setf *value* saved)))

I know there are other ways to achieve this (e.g. returning the
TEARDOWN closure from SETUP, etc.) but I like how this construction
states the intent clearly.

From: Rainer Joswig
Subject: Re: Thread-safe closures
Date: Fri, 18 Jul 2008 18:49:43 +0000
Message-ID: <joswig-5EAD86.20494218072008@news-europe.giganews.com>

In article 
<····································@m3g2000hsc.googlegroups.com>,
 "Leslie P. Polzer" <·············@gmx.net> wrote:

> > Would not with-thread-local-binding be more indicative of what it
> > actually does?
> 
> Yeah, I suppose that's the name I was looking for.
> 
> 
> > I am also curious as to what the expected use cases are. As far as I
> > can tell:
> >
> > If the closure has side-effects that affect subsequent invocations, it
> > either means it is not meant to be called more than once (in which
> > case the concurrent use case is not relevant since it already violates
> > that rule), or it means the side effects are *desired* (which means
> > thread-local temporary state is often not what you want).
> 
> I'm not sure I can follow your implications thoroughly.
> Let me explain:
> 
> I have a main thread that spawns auxiliary threads that use
> the closure. The closure is called only once somewhere at the
> beginning and should set up the state; this state shall be
> shared with another function that uses it at the end of
> the thread lifetime to clean up.
> 
> Sketch:
> 
> (defvar *value* t) ; imagine *value* being globally accessible
> 
> (thread-let (saved)
>   (defun setup ()
>     (setf saved *value*)
>     (clobber! *value*))
> 
>   (defun teardown ()
>     (assert saved)
>     (setf *value* saved))
> 
> I know there are other ways to achieve this (e.g. returning the
> TEARDOWN closure from SETUP, etc.) but I like how this construction
> states the intent clearly.

Thinking about the coding style...

I can't say I like that code. Global functions suddenly become
closures that can't be redefined individually, lexical variables
are used in an imperative way, ... I haven't thought
too much about it, but intuitively I would think that it makes
software maintenance more difficult. It also uses DEFUN, which is
best used to define top-level functions, inside another form.

To me it looks like Scheme code in Common Lisp.

(let ((value 0))
  (defun counter ()
     (incf value)))

vs.

(setf (symbol-function 'counter)
    (let ((value 0))
       (lambda () (incf value))))

vs.

(defun make-counter (&optional (value 0))
   (lambda () (incf value)))

vs.

(defclass counter () ((value :initform 0)))

(defmethod count-up ((counter counter))
   (incf (slot-value counter 'value)))

(count-up (make-instance 'counter))

Other opinions?

-- 
http://lispm.dyndns.org/

From: Alex Mizrahi
Subject: Re: Thread-safe closures
Date: Fri, 18 Jul 2008 19:33:14 +0000
Message-ID: <4880effb$0$90265$14726298@news.sunsite.dk>

 LPP> I have a main thread that spawns auxiliary threads that use
 LPP> the closure. The closure is called only once somewhere at the
 LPP> beginning and should set up the state; this state shall be
 LPP> shared with another function that uses it at the end of
 LPP> the thread lifetime to clean up.

 LPP> Sketch:

 LPP> (defvar *value* t) ; imagine *value* being globally accessible

 LPP> (thread-let (saved)
 LPP>   (defun setup ()
 LPP>     (setf saved *value*)
 LPP>     (clobber! *value*))

 LPP>   (defun teardown ()
 LPP>     (assert saved)
 LPP>     (setf *value* saved))

 LPP> I know there are other ways to achieve this (e.g. returning the
 LPP> TEARDOWN closure from SETUP, etc.) but I like how this construction
 LPP> states the intent clearly.

i'd make it like this:

(defmacro with-clobber! (&body body)
  `(call-with-clobber! (lambda () ,@body)))

(defun call-with-clobber! (thunk)
   (let ((*value* *value*)) ; rebind value
       (clobber! *value*)
       (funcall thunk)))

implementation is much simpler, and so is the interface -- you do not need
unwind-protect, just use with-clobber! macro, or call-with-clobber!
function directly.
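
A small self-contained sketch of why the rebinding approach needs no
unwind-protect: a LET binding of a special variable is undone on any exit
from the LET, including a non-local one. Here the SETF to :CLOBBERED is a
hypothetical stand-in for CLOBBER!, and CALL-WITH-CLOBBERED-VALUE and
REBINDING-DEMO are illustrative names only.

```lisp
(defvar *value* :original)          ; stand-in for the global *value*

(defun call-with-clobbered-value (thunk)
  "Call THUNK with *VALUE* rebound and clobbered; the outer binding is untouched."
  (let ((*value* *value*))          ; fresh dynamic binding
    (setf *value* :clobbered)       ; hypothetical stand-in for CLOBBER!
    (funcall thunk)))

(defun rebinding-demo ()
  (list
   ;; normal return: the thunk sees the clobbered value
   (call-with-clobbered-value (lambda () *value*))
   ;; non-local exit: THROW unwinds through the LET
   (catch 'bail
     (call-with-clobbered-value (lambda () (throw 'bail *value*))))
   ;; afterwards the outer value is visible again
   *value*))

;; (rebinding-demo) => (:CLOBBERED :CLOBBERED :ORIGINAL)
```

In implementations where such rebinding is per-thread (SBCL is one), each
thread that goes through the wrapper gets its own clobbered binding.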

From: Kaz Kylheku
Subject: Re: Thread-safe closures
Date: Fri, 18 Jul 2008 19:32:39 +0000
Message-ID: <20080718115729.65@gmail.com>

On 2008-07-18, Peter Schuller <··············@infidyne.com> wrote:
> "Leslie P. Polzer" <·············@gmx.net> writes:
>
>> ; single binding
>> (defmacro with-threadsafe-binding ((var default) &body body)
>
> Would not with-thread-local-binding be more indicative of what it
> actually does?
>
> Thread-safe, without further qualification, implies to me that it
> somehow provides safety for concurrent access to shared state, while
> (if I am not mistaken), the intent rather seems to be to provide a
> seamless method of creating temporary thread-local state.
>
> I am also curious as to what the expected use cases are. As far as I
> can tell:
>
> If the closure has side-effects that affect subsequent invocations, it
> either means it is not meant to be called more than once (in which
> case the concurrent use case is not relevant since it already violates
> that rule), or it means the side effects are *desired* (which means
> thread-local temporary state is often not what you want).

Exactly; if you're calling a closure from multiple threads, you want the
lexical environment to be shared. You can stick a mutex in there to avoid
races, etc.

I did exactly this, in a Lisp dialect that I developed at a previous job.

A closure is like a class object, and its environment is like instance
variables.

A thread-local /class/ slot is useful, without a doubt.

Would you ever want a thread-specific /instance/ slot in an object?

When the problem is looked at this way, the answer turns out to be yes.

Thread-specific instance data is for instance useful in some thread
synchronization objects.

Suppose you want to support writer-preference behavior in a read-write lock,
but recursive read locks are allowed and must not deadlock.

Under writer preference,  if one or more writers are waiting to acquire the
lock, new read lock attempts are queued behind these writers.

However, if a read lock request is made by a thread which already owns a read
lock, it cannot be suspended, because a deadlock ensues.

A solution is for the read lock to keep track of which threads have it
read-locked, and how many times.  This could be achieved by a thread-local
instance variable:

  (defun make-read-write-lock ()
    (let ((read-locks 0)        ; read locks held, over all threads
          (write-locked nil)    ; id of the writing thread, or NIL
          (waiting-writers 0)
          (mutex (make-mutex ...))
          (condition (make-condition-variable ...)))
      (thread-let ((my-own-read-locks 0)) ; read locks held by this thread
        (list

          ;; read lock operation
          ;; preference given to waiting writers
          ;; avoids deadlock by checking MY-OWN-READ-LOCKS
          (lambda ()
            (with-mutex-lock (mutex)
              (loop while (or write-locked
                              (and (zerop my-own-read-locks)
                                   (not (zerop waiting-writers))))
                    do
                      (condition-wait condition mutex)
                    finally
                      (incf read-locks)
                      (incf my-own-read-locks))))

          ;; write lock operation
          (lambda ()
            (with-mutex-lock (mutex)
              (incf waiting-writers)
              (loop while (or write-locked
                              (not (zerop read-locks)))
                    do
                      (condition-wait condition mutex)
                    finally
                      (decf waiting-writers)
                      (setf write-locked (current-thread-id)))))

          ;; unlock operation
          (lambda ()
            (with-mutex-lock (mutex)
              (cond
                (write-locked
                  (assert (eq write-locked (current-thread-id)))
                  (setf write-locked nil)
                  (cond-broadcast condition))
                (t
                  (decf my-own-read-locks)
                  (if (zerop (decf read-locks))
                    (cond-broadcast condition))))))))))

Of course, you don't need thread-let, or thread-specific instance slots to
achieve the same functionality.  But all ways of achieving that
functionality are just implementations of what this abstraction expresses.

> If the state is intended to be local to the invocation of a closure on
> the other hand, there is no need to make it *thread-local* since it
> already is.

Well, if state is intended to be local to the invocation of a closure, then it
won't be stored in a free variable, but in a bound variable. I.e. a variable
that isn't closed over.

From: Alex Mizrahi
Subject: Re: Thread-safe closures
Date: Fri, 18 Jul 2008 18:57:54 +0000
Message-ID: <4880e7b3$0$90276$14726298@news.sunsite.dk>

 LPP> Here's an implementation of thread-safe closures.

why do you call this "thread-safe closures"?

what you implement here is a new kind of variable which
has per-thread bindings (like special variables in most
implementations), but are defined more like lexical variables,
and have lexical variable properties (not sure about this, is
symbol-macrolet a correct way to implement lexical variables?).

thus, you could call them thread-local variables or something
like that. thread-safe is not a good term to use -- normal
special and lexical variables are quite thread-safe, as long
as you know how they work and implement your code
accordingly.

also, your variables are likely to introduce considerable
overhead because of hash-table access, and could be
non thread-safe in some implementation -- nobody guarantees
that concurrent reads/writes via gethash will work, and iirc there
were rumours that they will not work in future versions of SBCL.

i see these thread-local variables could be useful for global
variables with per-thread value, as special variables would
not be good at it because you need to bind them at least once
in each thread. such binding is not a problem if you
control creation of threads and code paths, e.g.:

(defvar *thread-local-connection-pool*)

(defun get-connection ()
  (get-connection-via-cp  *thread-local-connection-pool*))

(defun start-worker ()
  (make-thread
    (lambda ()
      (let ((*thread-local-connection-pool* (make-connection-pool)))
        (do-work)))))

if you spawn worker threads in this way, you will always have a separate
pool in each thread and it will work fine without hackery. however, if
you're writing library functions and you cannot control how threads are
created (or do not want to make interface cumbersome), these thread-local
variables might be good:

(thread-let ((*thread-local-connection-pool* (make-connection-pool)))
   (defun get-connection ()
     (get-connection-via-cp  *thread-local-connection-pool*)))

i don't see how this could be useful for local stuff, as you can have as many
lexical environments as you need.

so it might be better to implement such variables as global ones, like:

(def-thread-local *thread-connection-pool* (make-connection-pool))
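
A rough sketch of how such a DEF-THREAD-LOCAL might look, reusing the
hash-table idea from the original post but with one global table per
variable and a global symbol macro. Everything except the name
DEF-THREAD-LOCAL is hypothetical: CURRENT-THREAD-KEY, MAKE-THREAD-TABLE and
the generated %...-TABLE and %...-VALUE names are made up for illustration.
The SBCL branches assume its :WEAKNESS and :SYNCHRONIZED hash-table
extensions; other implementations fall back to a single-thread stand-in.

```lisp
(defun current-thread-key ()
  "Return an object identifying the calling thread."
  #+sbcl sb-thread:*current-thread*
  #-sbcl :only-thread)              ; fallback: treat the image as one thread

(defun make-thread-table ()
  "Return a table keyed by thread objects (weak and synchronized on SBCL)."
  #+sbcl (make-hash-table :test 'eq :weakness :key :synchronized t)
  #-sbcl (make-hash-table :test 'eq))

(defmacro def-thread-local (name init-form)
  "Define NAME as a global symbol macro with one value per thread.
INIT-FORM is evaluated lazily, on a thread's first access to NAME."
  (let ((table  (intern (format nil "%~A-TABLE" (symbol-name name))))
        (access (intern (format nil "%~A-VALUE" (symbol-name name)))))
    `(progn
       (defvar ,table (make-thread-table))
       (defun ,access ()
         (multiple-value-bind (value foundp)
             (gethash (current-thread-key) ,table)
           (if foundp
               value
               (setf (gethash (current-thread-key) ,table) ,init-form))))
       (defun (setf ,access) (new-value)
         (setf (gethash (current-thread-key) ,table) new-value))
       (define-symbol-macro ,name (,access))
       ',name)))

;; Usage: each thread that touches *HITS* starts from 0.
(def-thread-local *hits* 0)
(incf *hits*)
```

Because NAME is a symbol macro rather than a special variable, it cannot be
rebound with LET the way a DEFVAR'd variable can, and every access pays for
a hash-table lookup, as noted above.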