From: Pascal Costanza
Subject: Thread-safe caches
Date: 
Message-ID: <5pakvsFq7qdoU1@mid.individual.net>
Hi,

Currently, ContextL caches are implemented using property lists. 
However, there seems to be a problem with concurrent accesses to those 
caches when using multiple threads.

Does anyone know of a straightforward way to make property lists (or 
association lists or hashtables) thread-safe in an 
implementation-independent way, possibly using some kind of 
compatibility layer?

It may also be acceptable that the caches are not always updated: if new 
entries are occasionally not written to the caches, that's fine, as long 
as the caches remain consistent...


Thanks for any hint,
Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/

From: Don Geddis
Subject: Re: Thread-safe caches
Date: 
Message-ID: <87sl3jmg1o.fsf@geddis.org>
Pascal Costanza <··@p-cos.net> wrote on Tue, 06 Nov 2007:
> Does anyone know of a straightforward way to make property lists (or
> association lists or hashtables) thread-safe in an
> implementation-independent way, possibly using some kind of compatibility
> layer?

Surely any implementation which offers multiple threads (which is beyond
ANSI CL as it is) also offers semaphores (locks).  No?

It would seem that you already need a compatibility layer just for the
multiple threads, and the locks ought to be part of that.

Or is it that your code does NOT assume multiple threads, but your customers
are running in multiple threads themselves, and are requesting that you make
your code thread-safe...  I suppose that could be an issue.

I don't think there's a way of dealing with implementation-dependent features
(like threading) without writing implementation-dependent code.

Perhaps you're just hoping that somebody else has already taken responsibility
for this, and you can just import their compatibility layer.

        -- Don
_______________________________________________________________________________
Don Geddis                  http://don.geddis.org/               ···@geddis.org
Murphy's Law is recursive.  Washing your car to make it rain doesn't work.
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5pc1h3Fpsjv5U1@mid.individual.net>
Don Geddis wrote:
> Pascal Costanza <··@p-cos.net> wrote on Tue, 06 Nov 2007:
>> Does anyone know of a straightforward way to make property lists (or
>> association lists or hashtables) thread-safe in an
>> implementation-independent way, possibly using some kind of compatibility
>> layer?
> 
> Surely any implementation which offers multiple threads (which is beyond
> ANSI CL as it is) also offers semaphores (locks).  No?

That's correct.

> It would seem that you already need a compatibility layer just for the
> multiple threads, and the locks ought to be part of that.

Also correct. The portable-threads library from the GBBopen project would 
be a good choice here.

> Or is it that your code does NOT assume multiple threads, but your customers
> are running in multiple threads themselves, and are requesting that you make
> your code thread-safe...  I suppose that could be an issue.

That is a pretty accurate description of the situation. Adding GBBopen / 
portable-threads has two disadvantages:

- It adds another dependency (but I don't want to do the portability 
stuff myself either ;).

- More importantly, acquiring locks for updating and reading from caches 
probably slows things down, which I would prefer to avoid, if possible.

> I don't think there's a way of dealing with implementation-dependent features
> (like threading) without writing implementation-dependent code.
> 
> Perhaps you're just hoping that somebody else has already taken responsibility
> for this, and you can just import their compatibility layer.

Something like that.

Currently, I think it might be reasonable to assume that setf on simple 
variables is atomic.


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: D Herring
Subject: Re: Thread-safe caches
Date: 
Message-ID: <g4Gdnfh7Net72azanZ2dnUVZ_s2tnZ2d@comcast.com>
Pascal Costanza wrote:
> - More importantly, acquiring locks for updating and reading from caches 
> probably slows things down, which I would prefer to avoid, if possible.

I did a quick timing on synchronization in C/C++ at work; just a loop 
that repeatedly tried a synchronization technique.

The results were something like:
volatile access  -  0.2s (no locking; "control case")
compare-and-swap -  4s   (GCC's __sync_bool_compare_and_swap intrinsic)
pthread_mutex    - 11s
pthread_cond_*   -  8s

Yeah; in general acquiring locks is a bad thing...  In addition to the 
deadlock potential and priority issues, you can burn 3 CASs for each 
mutex.  But all this is *slow* compared to a normal op.

Where a hard setf isn't sufficient, CAS may provide enough for your 
code to detect that someone else is updating the same structure.  For 
example, a loop might repeatedly find the end of a list and retry 
until a CAS succeeds at appending the data -- this is simple, fast, 
and nonblocking.
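
In the Lisp setting of this thread, such a retry loop could look
roughly like the sketch below (pushing at the head of a shared list
rather than appending at the tail, for simplicity).  It assumes SBCL's
sb-ext:compare-and-swap, and names like *cache* are hypothetical;
treat it as an illustration, not a portable recipe:

  ;; Lock-free push onto a shared list via compare-and-swap.
  (defvar *cache* '())

  (defun atomic-push (entry)
    (loop for old = *cache*
          for new = (cons entry old)
          ;; COMPARE-AND-SWAP returns the value the place held before;
          ;; if that is still OLD, then NEW was installed and we're done.
          until (eq (sb-ext:compare-and-swap (symbol-value '*cache*)
                                             old new)
                    old))
    entry)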

- Daniel
From: Alex Mizrahi
Subject: Re: Thread-safe caches
Date: 
Message-ID: <47319141$0$90267$14726298@news.sunsite.dk>
 DH> Where a hard setf isn't sufficient, CAS may provide enough for your
 DH> code to detect that someone else is updating the same structure.  For
 DH> example, a loop might repeatedly find the end of a list and retry
 DH> until a CAS succeeds at appending the data -- this is simple, fast,
 DH> and nonblocking.

iirc CAS still does a memory fence, causing all cores to wait to synchronize 
caches.
if that is done frequently (how frequently? i don't know, it's 
speculation), this (together with the CAS overhead, 20x according to your 
measurements) can degrade the performance of an SMP system to the point 
where it's no better than a single CPU.
hence, it undermines the whole idea of multithreading if it's done for 
speed -- to fully utilize modern multicore CPUs. quite possibly a 
single-threaded implementation using no locks and no CAS would be faster 
than a multithreaded one on a multicore SMP system.

of course this doesn't apply if multithreading is used for IO-intensive 
tasks, or if operations are not that frequent..
From: Alexander Kjeldaas
Subject: Re: Thread-safe caches
Date: 
Message-ID: <1194614465.739948.81430@s15g2000prm.googlegroups.com>
On Nov 7, 11:19 am, "Alex Mizrahi" <········@users.sourceforge.net>
wrote:
>  DH> Where a hard setf isn't sufficient, CAS may provide enough for your
>  DH> code to detect that someone else is updating the same structure.  For
>  DH> example, a loop might repeatedly find the end of a list and retry
>  DH> until a CAS succeeds at appending the data -- this is simple, fast,
>  DH> and nonblocking.
>
> iirc CAS still does a memory fence, causing all cores to wait to synchronize
> caches.
> if that is done frequently (how frequently? i don't know, it's
> speculation), this (together with the CAS overhead, 20x according to your
> measurements) can degrade the performance of an SMP system to the point
> where it's no better than a single CPU.

A reader could misunderstand what you are writing.  A CAS is not slow
because it does a memory fence.  A memory fence is an operation that
simply means that the CPU will not reorder reads or writes across the
fence.  Thus a memory fence is the "cheapest" operation for multi-
threading, and no CPU will need to wait or synchronize because of a
fence operation.  A memory fence can typically be used as part of more
complex operations like CAS, but it can also be used as a light-weight
operation in data structures where synchronization is not needed.  A
cache can be such a data structure, where two threads don't
necessarily need to see the same view of the data.
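
To make the ordering issue concrete, here is the publish/consume
pattern whose correctness depends on write ordering (a sketch with
hypothetical variable names; no actual fence API is assumed, the
comments just mark where fences would go):

  (defvar *data* nil)
  (defvar *ready* nil)

  (defun publish (value)
    (setf *data* value)
    ;; Without a write fence here, another core may observe the
    ;; *READY* store below before the *DATA* store above.
    (setf *ready* t))

  (defun consume ()
    (when *ready*
      ;; With the fence in PUBLISH (and a matching read fence here on
      ;; some architectures), *DATA* is guaranteed to hold the
      ;; published value by the time *READY* is seen as true.
      *data*))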

Alexander
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Thread-safe caches
Date: 
Message-ID: <rem-2007nov14-003@yahoo.com>
> From:  Alexander Kjeldaas <··················@gmail.com>
> A memory fence is an operation that simply means that the CPU
> will not reorder reads or writes across the fence.

<cliche>You learn something new almost every day</cliche>
Ah, thanks for the new concept I never heard of before.
(I never had the "opportunity" to try to write machine code for any
 CPU that had the option of re-ordering memory operations. The
 newest CPU I ever wrote machine code for was the 68000, a Mac Plus,
 via Sesame C and Pocket Forth.)
Indeed that's a very simple/elegant/powerful/inexpensive feature.

Now a followup question: Suppose your CPU does *not* have any
atomic read+write operation (such as swap register<->memory),
guaranteed to complete in a single memory cycle. But you do have
memory fences. Suppose you have more than one CPU accessing the
same RAM. How do you implement a lock efficiently?

My first idea is to have a separate memory cell for each CPU that
might need to use the same lock, with pre-established priorities as
to who gets to use the resource if more than one CPU wants it at the
same time. A lower-priority CPU would set its own lock flag but then
check whether some higher-priority CPU had set its flag in the
meantime, and wait until the higher-priority CPU no longer held it
before actually using the resource. A higher-priority CPU would just
set its flag and not care whether anybody lower was reporting to hold
it, since the lower-priority CPU wouldn't actually use the resource
until *it* released it. I don't know if this would really work.
Any comments or alternatives?

Now assuming the CPU does have adequate built-in mechanisms:
More generally, there seem to be two classifications of resources:
-1- Short duration use, like just a few clock cycles per use
-2- Long duration use, long enough that queue is more efficient than retestLoop
-A- Hardly ever a simultaneous request for the same resource
-B- Most of the time the resource has multiple processes wanting it

As for what seems to me the optimal solution for each of the four composite cases:
-1A- It suffices to use atomic-swap instruction to request lock,
     and sleep-retry if it was busy already.
-1B- Difficult case, I have some composite ideas that might work.
-2A- It suffices to use a general lock mechanism, with a priori
     knowledge of how long the other process is likely to use it,
     with retry at random times approximately when the other
     process is expected to be done.
-2B- You really need a formal scheduler to mediate all requests for
     the resource. You put a request in the queue, with a callback for
     when the resource will become ready.

For that difficult case 1B, one idea is to use the atomic lock a
few times just to see if we happened to catch it when it wasn't too
busy, but when it starts to look like we have a serious problem
with contention for that resource then we submit a formal request
to the resource mediator. The mediator gets a lock on the resource
and keeps it so long as any process has been sub-delegated that
resource and/or there are any other processes in the queue for it.
Only when the queue is empty, then release the lock for more
efficient use while load is low.

Caveat: I've never programmed any of this. I'm just brainstorming.
(I placed top five in Putnam contest, so I'm licensed to brainstorm!!)
From: Maciej Katafiasz
Subject: Re: Thread-safe caches
Date: 
Message-ID: <fhfpc7$685$1@news.net.uni-c.dk>
On Wed, 14 Nov 2007 11:26:50 -0800, Robert Maas, see
http://tinyurl.com/uh3t, wrote:

> Now a followup question: Suppose your CPU does *not* have any atomic
> read+write operation (such as swap register<->memory), guaranteed to
> complete in a single memory cycle. But you do have memory fences.
> Suppose you have more than one CPU accessing the same RAM. How do you
> implement a lock efficiently?

You don't. Setting a fence is a per-CPU operation; it only means that the 
CPU will be notified that the order is significant and won't try 
optimising the operations by shuffling them.

Cheers,
Maciej
From: ··············@gmail.com
Subject: Re: Thread-safe caches
Date: 
Message-ID: <1194347908.382980.47940@22g2000hsm.googlegroups.com>
> Does anyone know of a straightforward way to make property lists (or
> association lists or hashtables) thread-safe in an
> implementation-independent way, possibly using some kind of
> compatibility layer?

if you don't care about multiple (wasted) calculations and a little
consing, then you can have a scheme where the cache structure is
read-only and always copied when a new entry is added. then the new
(copied and updated) cache structures are setf'ed in place of the old
ones.

the worst thing that can happen is that some cached entries are
deleted by another thread, but this may be ok depending on what the
cache is used for. consing can also be a problem depending on the
use-cases.
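
a minimal sketch of that scheme (hypothetical names; it assumes that
assigning a single variable is atomic, which is exactly what gets
questioned later in this thread):

  ;; the cache is a plist that is never mutated; adding an entry
  ;; builds a new head and assigns it over the old one.  a concurrent
  ;; writer may overwrite the assignment, silently dropping an entry,
  ;; which is acceptable for a cache.
  (defvar *cache* '())

  (defun cache-lookup (key)
    (getf *cache* key))

  (defun cache-add (key value)
    (setf *cache* (list* key value *cache*))
    value)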

unfortunately i have no time right now to think this through in the
context of contextl. i'm in a different layer right now... :)

hth,

- attila
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5pb2l2Fqab1bU2@mid.individual.net>
··············@gmail.com wrote:
>> Does anyone know of a straightforward way to make property lists (or
>> association lists or hashtables) thread-safe in an
>> implementation-independent way, possibly using some kind of
>> compatibility layer?
> 
> if you don't care about multiple (wasted) calculations and a little
> consing, then you can have a scheme where the cache structure is
> read-only and always copied when a new entry is added. then the new
> (copied and updated) cache structures are setf'ed in place of the old
> ones.
> 
> the worst thing that can happen is that some cached entries are
> deleted by another thread, but this may be ok depending on what the
> cache is used for. consing can also be a problem depending on the
> use-cases.

That would all be fine if setf were guaranteed to be an atomic 
operation. Unfortunately, it's not. :(

> unfortunately i have no time right now to think this through in the
> context of contextl. i'm in a different layer right now... :)

Right. :)


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Alex Mizrahi
Subject: Re: Thread-safe caches
Date: 
Message-ID: <47305628$0$90270$14726298@news.sunsite.dk>
 ??>> the worst thing that can happen is that some cached entries are
 ??>> deleted by another thread, but this may be ok depending on what the
 ??>> cache is used for. consing can also be a problem depending on the
 ??>> use-cases.

 PC> That would all be fine if setf were guaranteed to be an atomic
 PC> operation. Unfortunately, it's not. :(

do you have an example where setf'ing a single variable is not an atomic 
operation?

if it were not, implementations would be very easy to crash with the 
simplest operations. haven't heard anything about such cases 
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5pb6oeFq691uU1@mid.individual.net>
Alex Mizrahi wrote:
>  ??>> the worst thing that can happen is that some cached entries are
>  ??>> deleted by another thread, but this may be ok depending on what the
>  ??>> cache is used for. consing can also be a problem depending on the
>  ??>> use-cases.
> 
>  PC> That would all be fine if setf were guaranteed to be an atomic
>  PC> operation. Unfortunately, it's not. :(
> 
> do you have an example where setf'ing a single variable is not an atomic 
> operation?

No. But there doesn't seem to be good documentation in the existing CL 
implementations about which operations are atomic and which are not.

> if it were not, implementations would be very easy to crash with the 
> simplest operations. haven't heard anything about such cases 

Maybe, but I don't want to bet on that.

Java has exceptionally good documentation about its multi-threading 
capabilities, and it documents cases where one would guess that 
operations are atomic, but counter-intuitively they are in fact not. So 
I feel a bit uncomfortable here.


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5pba0tFq7jnhU1@mid.individual.net>
Alex Mizrahi wrote:
>  ??>> the worst thing that can happen is that some cached entries are
>  ??>> deleted by another thread, but this may be ok depending on what the
>  ??>> cache is used for. consing can also be a problem depending on the
>  ??>> use-cases.
> 
>  PC> That would all be fine if setf were guaranteed to be an atomic
>  PC> operation. Unfortunately, it's not. :(
> 
> do you have an example where setf'ing a single variable is not an atomic 
> operation?
> 
> if it were not, implementations would be very easy to crash with the 
> simplest operations. haven't heard anything about such cases 

Hmmm, your reasoning actually makes sense. Indeed, if setf weren't 
atomic, you would have to protect even the simplest code against 
concurrency.

But how to be sure?


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Edi Weitz
Subject: Re: Thread-safe caches
Date: 
Message-ID: <uk5ovmq2r.fsf@agharta.de>
On Tue, 06 Nov 2007 14:54:37 +0100, Pascal Costanza <··@p-cos.net> wrote:

> Indeed, if setf weren't atomic, you would have to protect even the
> simplest code against concurrency.

I would say that makes sense for SETQ, but not for SETF as you
yourself can implement arbitrarily complex SETF functions.
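
For example, with a user-defined SETF function (a hypothetical alist
accessor, purely for illustration), the assignment runs the entire
function body, including the search through the list, so there is
nothing atomic about it:

  (defun lookup (key alist)
    (cdr (assoc key alist)))

  ;; (setf (lookup :b *table*) 42) calls this function with the new
  ;; value as its first argument.
  (defun (setf lookup) (value key alist)
    (let ((pair (assoc key alist)))
      (if pair
          (setf (cdr pair) value)
          (error "Key ~S not found." key))
      value))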

> But how to be sure?

Ask the vendors?

Edi.

-- 

Lisp is not dead, it just smells funny.

Real email: (replace (subseq ·········@agharta.de" 5) "edi")
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5pbb0aFqc3srU1@mid.individual.net>
Edi Weitz wrote:
> On Tue, 06 Nov 2007 14:54:37 +0100, Pascal Costanza <··@p-cos.net> wrote:
> 
>> Indeed, if setf weren't atomic, you would have to protect even the
>> simplest code against concurrency.
> 
> I would say that makes sense for SETQ, but not for SETF as you
> yourself can implement arbitrarily complex SETF functions.

Since there is no real difference between setq and setf... - but I know 
what you mean. ;)

>> But how to be sure?
> 
> Ask the vendors?

*sigh*

OK. :)


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Duane Rettig
Subject: Re: Thread-safe caches
Date: 
Message-ID: <o04pfyyktb.fsf@gemini.franz.com>
[Testing, sorry - looking for a lost submission :-( ]

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5pc65iFqlnfpU1@mid.individual.net>
Alex Mizrahi wrote:
>  ??>> the worst thing that can happen is that some cached entries are
>  ??>> deleted by another thread, but this may be ok depending on what the
>  ??>> cache is used for. consing can also be a problem depending on the
>  ??>> use-cases.
> 
>  PC> That would all be fine if setf were guaranteed to be an atomic
>  PC> operation. Unfortunately, it's not. :(
> 
> do you have an example where setf'ing a single variable is not an atomic 
> operation?
> 
> if it were not, implementations would be very easy to crash with the 
> simplest operations. haven't heard anything about such cases 

Just for information: There is a documented case for the Java language. 
See http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.7


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Antony Sequeira
Subject: Re: Thread-safe caches
Date: 
Message-ID: <4OSdnQJ2wYI50azanZ2dnUVZ_ramnZ2d@comcast.com>
Pascal Costanza wrote:
> Alex Mizrahi wrote:
>>  ??>> the worst thing that can happen is that some cached entries are
>>  ??>> deleted by another thread, but this may be ok depending on what the
>>  ??>> cache is used for. consing can also be a problem depending on the
>>  ??>> use-cases.
>>
>>  PC> That would all be fine if setf were guaranteed to be an atomic
>>  PC> operation. Unfortunately, it's not. :(
>>
>> do you have an example where setf'ing a single variable is not an atomic 
>> operation?
>>
>> if it were not, implementations would be very easy to crash with the 
>> simplest operations. haven't heard anything about such cases 
> 
> Just for information: There is a documented case for the Java language. 
> See http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.7
> 
> 
> Pascal
>
Not pointed at any one poster.

Another point to remember is that being atomic at write is not enough in 
many practical situations (where you are not setting the value 
unconditionally).
For example, according to the link Pascal posted, we can assume a boolean 
is atomic in Java without requiring extra work from the programmer, yet 
there is
http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/AtomicBoolean.html

It may not be obvious why.

-Antony
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Thread-safe caches
Date: 
Message-ID: <rem-2007nov14-001@yahoo.com>
> From: Antony Sequeira <··················@gmail.com>
> Another point to remember is that being atomic at write is not
> enough in many practical situations (where you are not setting the
> value unconditionally).

I'm not quite sure what you're talking about there. Can you give a
simple example where there's only a single shared variable, yet a
conditional atomic write of that variable isn't thread-safe?

> For example according to the link Pascal posted, we can assume a
> boolean is atomic in Java without requiring extra work from the
> programmer, yet there is
> http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/AtomicBoolean.html
> It may not be obvious why.

<cliche>The devil is in the details</cliche> I.e. look not just at
the name of the class, but the set of methods provided.
The key thing is that atomic read and write is *not* the only
method provided there. Also provided are two others which interlock
two separate operations into a single atomic operation (a "critical
section"):

(1)
public final boolean compareAndSet(boolean expect,
                                   boolean update)
          Atomically set the value to the given updated value if the
          current value == the expected value.
        Returns:
                true if successful. False return indicates that the
                actual value was not equal to the expected value.
If all it did was set to the new value if it wasn't already that value,
simply setting it would do the trick. If all it did was set to the new
value if it already was, a NO-OP would suffice. Those are the only two
cases to consider as far as the side effect itself is concerned. But
there's also the return value, which must be the result of the test
*immediately* before (no interrupts between) the conditional setting.
This (with the constructor of course) is in fact sufficient to
implement a LOCK, as follows:
  To create a LOCK which nobody has yet:
    sharedLock = new AtomicBoolean(false);  /* = default no-arg constructor */
  To try to get a lock, and know whether you got it or not:
    gotFlag = sharedLock.compareAndSet(false, true);
    if (gotFlag) { Code for using the resource ... code (below) to release }
    else { Code for re-trying or giving up etc. }
  To release the lock, after you already know you got it:
    happyFlag = sharedLock.compareAndSet(true, false);
    if (!happyFlag) { Bug report, we thought we had the lock, but it went false }

Note that atomic-swap-local-with-shared-boolean is equivalent in
power, but is a single atomic machine instruction on some/most CPU
architectures. (Exercise for reader such as newcomer to software:
                Emulate each in terms of the other.)

(2)
public final boolean getAndSet(boolean newValue)
          Sets to the given value and returns the previous value.

Ah, that is more obviously equivalent to atomic-swap-local-with-shared-boolean.
For example: localBool = sharedLock.getAndSet(localBool);
That emulates swap in terms of getAndSet, and emulating the reverse
direction is nearly as trivial; no need for an exercise for the reader
unless the reader is very new to software.


On the other hand, what about this??

public boolean weakCompareAndSet(boolean expect,
                                 boolean update)
          Atomically set the value to the given updated value if the
          current value == the expected value. May fail spuriously.
        Returns:
                true if successful.

Now this really bothers me! First, assuming that the CPU can
implement it via atomic swap register with memory, how can it
possibly fail except if the programmer totally goofed and didn't
code that mechanism? Second, what does "successful" mean? Does it
mean "should have made the update because the expect matched" or
"didn't spuriously fail" or both of those together i.e. false if it
spuriously failed or if no update was needed?
Finally, what kinds of spurious failage are possible here?
Does it set the variable even when the expect didn't match?
Does it fail to set the variable when the expect *did* match?
Does it return an inappropriate value even though the operation itself worked?
Does it throw an unchecked exception?
Does it garbage-collect the AtomicBoolean object so that later the JVM crashes?
Does it go into an infinite loop with all interrupts disabled needing cold boot?
Does it delete/dismount/trash the root directory and/or boot sector?
IMO this JavaDoc needs some serious re-thinking!
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Thread-safe caches
Date: 
Message-ID: <rem-2007nov13-004@yahoo.com>
> > do you have an example where setf'ing a single variable is not an atomic
> > operation?
> > if it were not, implementations would be very easy to crash with the
> > simplest operations. haven't heard anything about such cases
> From: Pascal Costanza <····@p-cos.net>
> Just for information: There is a documented case for the Java language.
> See http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.7

IMO there should be a toplevel page which clearly outlines the
distinction between atomic and non-atomic operations. For example,
primitive assignments and fetches ought to be atomic with the
exception of specially noted cases such as 64-bit numbers.
Assignments of 64-bit addresses/pointers/handles are, by comparison,
*guaranteed* to be atomic!! Method calls are atomic only if
declared as synchronized. Is that correct for <ot>java</ot>?

Lisp should do likewise. It's too late to get such a statement in
the ANSI/ISO standard, but we could perhaps get agreement among all
major implementors to some add-on part of the standard, and then
include the agreement in the HyperSpec? We might handle this
vaguely the way Java handles "interfaces", whereby we define a
standard of performance, give it a name, and then anything that
conforms to that standard is allowed to declare that it
"implements" that interface. So for example CMUCL on Pentium-3
could claim to implement the "atomic-operations-2007" standard
("interface"). Perhaps a named <lispjargon>FEATURE</lispjargon>
would be appropriate, so that compilers and/or user code could
easily test for it.
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5puq1kFtbqt8U1@mid.individual.net>
Robert Maas, see http://tinyurl.com/uh3t wrote:
>>> do you have an example where setf'ing a single variable is not an atomic
>>> operation?
>>> if it were not, implementations would be very easy to crash with the
>>> simplest operations. haven't heard anything about such cases
>> From: Pascal Costanza <····@p-cos.net>
>> Just for information: There is a documented case for the Java language.
>> See http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.7
> 
> IMO there should be a toplevel page which clearly outlines the
> distinction between atomic and non-atomic operations. For example,
> primitive assignments and fetches ought to be atomic with the
> exception of specially noted cases such as 64-bit numbers.
> Assignments of 64-bit addresses/pointers/handles are, by comparison,
> *guaranteed* to be atomic!! Method calls are atomic only if
> declared as synchronized. Is that correct for <ot>java</ot>?
> 
> Lisp should do likewise. It's too late to get such a statement in
> the ANSI/ISO standard, but we could perhaps get agreement among all
> major implementors to some add-on part of the standard, and then
> include the agreement in the HyperSpec? We might handle this
> vaguely the way Java handles "interfaces", whereby we define a
> standard of performance, give it a name, and then anything that
> conforms to that standard is allowed to declare that it
> "implements" that interface. So for example CMUCL on Pentium-3
> could claim to implement the "atomic-operations-2007" standard
> ("interface"). Perhaps a named <lispjargon>FEATURE</lispjargon>
> would be appropriate, so that compilers and/or user code could
> easily test for it.

This could be handled via the Common Lisp Document Repository - see 
http://cdr.eurolisp.org/


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Alex Mizrahi
Subject: Re: Thread-safe caches
Date: 
Message-ID: <473065b5$0$90276$14726298@news.sunsite.dk>
 PC> It may also be acceptable that the caches are not always updated: if
 PC> new entries are occasionally not written to the caches, that's fine,
 PC> as long as the caches remain consistent...

i think plists are thread-safe when setq, rplaca and rplacd are atomic, 
which i believe is true.
as for the consistency question, it depends on the nature of this cache -- 
are concurrent incompatible updates possible, and what to do in this case?
e.g. is it possible that one thread deletes an entry while another updates it? 
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5pb8iqFosaalU1@mid.individual.net>
Alex Mizrahi wrote:
>  PC> It may also be acceptable that the caches are not always updated: if
>  PC> new entries are occasionally not written to the caches, that's fine,
>  PC> as long as the caches remain consistent...
> 
> i think plists are thread-safe when setq, rplaca and rplacd are atomic, 
> which i believe is true.

Hmm.

> as for the consistency question, it depends on the nature of this cache -- 
> are concurrent incompatible updates possible, and what to do in this case?
> e.g. is it possible that one thread deletes an entry while another updates it? 

Cache entries are never deleted. Moreover, if two threads attempt to add 
something to the cache, it will be the same cache value!


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Juho Snellman
Subject: Re: Thread-safe caches
Date: 
Message-ID: <87tznz7866.fsf@vasara.proghammer.com>
"Alex Mizrahi" <········@users.sourceforge.net> writes:

>  PC> It may also be acceptable that the caches are not always updated: if
>  PC> new entries are occasionally not written to the caches, that's fine,
>  PC> as long as the caches remain consistent...
> 
> i think plists are thread-safe when setq, rplaca and rplacd are atomic, 
> which i believe is true.

I think not. The plist implementation would also need to use the
atomic primitives in a way that ensures that the whole update is
atomic. Consider an implementation where (setf (getf place :b) 1)
expands to something like (ignoring gensymmage noise):

  (let ((head (loop for head on var by #'cddr
                    when (eql (car head) :b) return head)))
    (if head
        (setf (cdr head) 1)
        (progn 
          (push 1 var)
          (push :b var)))
    1)

(Which isn't completely unreasonable).

Now, having two concurrent writers could produce a plist (:b :b 1
1). Or just with a reader + a writer the reader could read VAR between
1 being pushed and :B being pushed, and see a plist with an odd number
of elements.

You could do (setq var (list* :b 1 var)) instead of the pushes, and
not suffer from these problems. But in this implementation it's still
possible for another thread to do an insert between the evaluation of
the arguments of the LIST* call and the evaluation of the SETQ, which
would result in an insert being dropped. So instead the branch for
inserting a new value would need to be:

   (loop for old-value = var
         for new-value = (list* :b 1 old-value)
         until (eql (compare-and-swap var old-value new-value) old-value))

But of course you can't write that yourself, because there's no
standard COMPARE-AND-SWAP, and thus it's probably something not
exported to the users in most implementations (that example uses the
SBCL C-A-S).

So you'd need to trust the implementors to have gone through extra
effort to make (SETF GETF) thread-safe, which would only happen if all
implementors considered thread-safety of non-primitive data structures
to be a feature. And that's not true.

-- 
Juho Snellman
From: Alex Mizrahi
Subject: Re: Thread-safe caches
Date: 
Message-ID: <4730b420$0$90263$14726298@news.sunsite.dk>
 ??>> i think plists are thread-safe when setq, rplaca and rplacd are
 ??>> atomic, which i believe is true.

 JS> I think not. The plist implementation would also need to use the
 JS> atomic primitives in a way that ensures that the whole update is
 JS> atomic.

i meant the plist as a data structure, and that it is possible to 
implement plists in a thread-safe way.

i don't claim that all implementations do; however, it's reasonable to 
expect that a CL advertised as having mature mp support at least doesn't 
produce invalid plists via (setf getf).

 JS> You could do (setq var (list* :b 1 var)) instead of the pushes, and
 JS> not suffer from these problems. But in this implementation it's still
 JS> possible for another thread to do an insert between the evaluation of
 JS> the arguments of the LIST* call and the evaluation of the SETQ, which
 JS> would result in an insert being dropped.

Pascal explicitly said that an insert being dropped is not a problem, so 
i meant it is thread-safe in that context: it always produces valid 
plists, but some inserts might be dropped.

even if it did not drop inserted values, some uses might need a guarantee 
that concurrent updates are serialized in some predictable way..
so i think what counts as thread-safe depends on the application..

but at least it should not damage the data structure, that's for sure..
From: Duane Rettig
Subject: Re: Thread-safe caches
Date: 
Message-ID: <o0wsstx6rf.fsf@gemini.franz.com>
[At this point, I am resigned to the likelihood that the original reply
I submitted yesterday morning was lost by my nntp service, and since I
didn't save the draft, I am rewriting it.  Maybe this one will be
shorter. :-)  Apologies in advance if you receive the original and
thus get duplicate submissions (yet not quite the same)]

"Alex Mizrahi" <········@users.sourceforge.net> writes:

>  PC> It may also be acceptable that the caches are not always updated: if
>  PC> new entries are occasionally not written to the caches, that's fine,
>  PC> as long as the caches remain consistent...
>
> i think plists are thread-safe when setq, rplaca and rplacd are atomic, 
> which i believe is true.

Not likely.  We tend to have an intuition about atomicity, but since
the CL spec doesn't define multiprocessing, it also doesn't define
atomic operations; atomic as an adjective is defined in a completely
different context from what we're discussing here.

In virtual threads implementations, atomicity can usually be
guaranteed by disallowing any other threads to run during the critical
sections.  In Allegro CL, this is done by wrapping
(without-interrupts ...) around the critical section.  It should not
be done often, since it is rather heavyweight, almost like a lock, and
like a lock, can result in deadlock situations.  We also have a lighter
weight method that is unexported, called (excl::atomically ...), which
is only effective in compiled code, and which does not deal with 100%
of atomic cases, though it is 100% safe when successful (the theory is
that excl::atomically will cause the compiler to fail if your form is
not inherently atomic, and so if your form compiled successfully while
wrapped in excl::atomically, then it was atomic).

But what level must we deal with atomic operations to provide safe
code?  You've implied here that if the write operations are atomic,
then the larger operations are safe.  But this is not true: most of
the critical operations that preserve the integrity of the lisp are of
the form read/modify/write.  You're dealing with the write portion,
which isn't the whole critical section.  What happens if, during the
search through a plist where an indicator wasn't found, another thread
adds (i.e. pushes) a new indicator onto the front of the plist?  And
once the search is done, a duplicate indicator is pushed, causing the
plist to have two of the same property.  This wouldn't seem to be a
problem, but remprop is defined to only remove the first of multiple
identical indicators, so a program that assumed that a remprop had
actually removed the (only) value of a property would be broken at
this point.

Let's look at some code.  As usual, macroexpand is our friend:

CL-USER(6): (pprint (macroexpand '(setf (getf x y) z)))

(LET* ((#:G26 X) (#:G28 Y) (#:G27 Z))
  (DO ((#:G29 #:G26 (CDDR #:G29)))
      ((NULL #:G29)
       (SETQ #:G26 (LIST* #:G28 #:G27 #:G26))
       (SETQ X #:G26)
       #:G27)
    (WHEN (EQ (CAR #:G29) #:G28) (RPLACA (CDR #:G29) #:G27) (RETURN #:G27))))
CL-USER(7): 

Note that the rplaca is a very small portion of this operation.  Also
note that if this code is not subsequently compiled, in a lisp which
has an interpreter, then there are a significant number of places
where another thread can get in and modify the plist without the
first thread noticing.  Of course, without-interrupts fixes this
situation, but the underlying mechanism is not itself thread-safe.

What about symbol-plist?  Let's look at this one:

CL-USER(7): (pprint (macroexpand '(setf (get x y) z)))

(LET* ((#:G31 X) (#:G32 Y) (#:G34 Z)) (EXCL::.INV-GET #:G31 #:G32 #:G34))
CL-USER(8): 

Here we have a much safer looking situation; there is a single
function call, and the gensyms will be optimized out anyway.  So the
question of atomicity boils down to whether excl::.inv-get is itself
atomic.  Well, the answer is: no, it is not atomic.  If you
disassemble this function, you can see an interrupt-check in the code,
which allows interrupts _and_ thread switches, and that makes it
thread unsafe. 

It is possible to remove that interrupt-check, which would make the
code thread-safe, but then it would become uninterruptible, in the
unlikely event that someone has blasted a plist to be circular.
Perhaps the assumption that no-one is likely to do this is a
reasonable one, but I would prefer not to lock out other threads
(which would not be smp-safe, since all threads potentially run
simultaneously at that point), and would prefer a solution using a
compare-and-swap implementation - something we are currently working
on - "cas" can be done directly, with cas (sparc) or locked cmpxchg (x86,
x86-64) instructions, or it can be done with combinations of
load-linked (powerpc) or load-locked (alpha) and store-conditional
instructions.

> as for the consistency question, it depends on the nature of this cache -- 
> are concurrent incompatible updates possible, and what to do in this case?
> e.g. is it possible that one thread deletes an entry while another updates it? 

I think the word you are looking for is determinism.  It may be
harsher than you are intending, but it actually makes it easier to
intuit why it is impossible.  When I was at a previous company eons
ago as a test engineer, the development engineering department had a
brilliant idea for testing our whole systems; they would place one
machine - a reference machine - next to a machine-under-test, and
they would drive them in lock-step to compare how they reacted,
cycle-by-cycle.  The critical issue was whether they could conclude,
when the machine under test reached a point where its state was
slightly different from the reference, that the m-u-t was wrong.
And the answer was that they couldn't say that, because they couldn't
specify what the right decision was, given operating tolerances, in
particular cases where timing was involved, or initial states couldn't
be controlled.  Ironically, back in test engineering, we were using
similar methods for testing our pc boards, though the main difference
was that we did analysis of the test patterns and masked off any
comparisons that couldn't be guaranteed.

The situation is similar here; there are things you want to happen in
a particular order, and other things that you can't guarantee.  For
example, if two threads store different values into the same global
location, what is the value of that location at some reasonable time
afterward?  The answer is "one of those two values".  You can't
guarantee which one, but you can and should guarantee the integrity of
the location that is being used.  When that location is just a place,
then that guarantee is easy.  When it is a structure, or when the
location changes over time, then it is harder to do, and portions of
the update that deal with the location itself must all be done
atomically.

The other side of this issue is to look at where atomicity is _not_
needed.  This occurs most when the resources we are using are not
global.  Protection is not necessary when the location is only being
used and modified by the one thread.  The trick, though, is figuring
out what portion of resources are local, and what part are global.

One example is hash-tables.  You might think of (setf gethash) to be
a relatively atomic operation.  However, if a new key is being
installed into a full hash-table, a rehash is required, which may take
a long time.  This kind of operation is so long that we don't really
want to put without-interrupts around the whole operation.  And you
might think that hash-tables aren't generally used as global resources
between multiple threads, but the simplest rebuttal is two different
threads calling INTERN at the same time on the user package.  For this
kind of situation, and for hash-tables in general, we have a mechanism
in Allegro CL that guarantees hash-tables to be thread-safe, even
though no locks are employed.  And that atomicity also carries over
into gethash, maphash and remhash, as well.

Well, I think I'm done for now.  I get a feeling my second attempt at
writing this response is more disjointed than my first try, so I'll
probably have to fill in the blanks later.  And, of course, my first
response was much shorter than this one...

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5peoepFqubujU1@mid.individual.net>
Duane Rettig wrote:

> But what level must we deal with atomic operations to provide safe
> code?  You've implied here that if the write operations are atomic,
> then the larger operations are safe.  But this is not true: most of
> the critical operations that preserve the integrity of the lisp are of
> the form read/modify/write.

But can we assume that the write operations are atomic? This by itself 
may not necessarily be a given, but that's all I need in my application...

Thanks,
Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Duane Rettig
Subject: Re: Thread-safe caches
Date: 
Message-ID: <o0ode5wkow.fsf@gemini.franz.com>
Pascal Costanza <··@p-cos.net> writes:

> Duane Rettig wrote:
>
>> But what level must we deal with atomic operations to provide safe
>> code?  You've implied here that if the write operations are atomic,
>> then the larger operations are safe.  But this is not true: most of
>> the critical operations that preserve the integrity of the lisp are of
>> the form read/modify/write.
>
> But can we assume that the write operations are atomic? This by itself
> may not necessarily be a given, but that's all I need in my
> application...

I don't understand what your theory is for this.  What is it about
your application that counts on atomicity but does not count on
thread-safety?  Perhaps if you could give us a simple example scenario
and tell us what assumptions and conclusions you are making about
thread-safety, we can analyze and confirm or contest those
assumptions and conclusions.  But so far, all of our replies have been
of the nature that although a write might be atomic, the combined
read/modify/write operation might not be thread-safe.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5pglmaFpt11gU1@mid.individual.net>
Duane Rettig wrote:
> Pascal Costanza <··@p-cos.net> writes:
> 
>> Duane Rettig wrote:
>>
>>> But what level must we deal with atomic operations to provide safe
>>> code?  You've implied here that if the write operations are atomic,
>>> then the larger operations are safe.  But this is not true: most of
>>> the critical operations that preserve the integrity of the lisp are of
>>> the form read/modify/write.
>> But can we assume that the write operations are atomic? This by itself
>> may not necessarily be a given, but that's all I need in my
>> application...
> 
> I don't understand what your theory is for this.  What is it about
> your application that counts on atomicity but does not count on
> thread-safety?  Perhaps if you could give us a simple example scenario
> and tell us what assumptions and conclusions you are making about
> thread-safety, we can analyze and confirm or contest those
> assumptions and conclusions.  But so far, all of our replies have been
> of the nature that although a write might be atomic, the combined
> read/modify/write operation might not be thread-safe.

The discussion continued on other mailing lists as well, and I have 
described what I wanted over there. I am going to post it here again, 
because the response is very interesting for the general audience. It 
turns out that things are even worse than people around here assumed.

The following is cited from the LispWorks mailing list.

There, I described what I want to do as follows:

> I am trying to implement thread-safe caches for ContextL without
> having to use locks or other multi-threading primitives, to make them
> more efficient.
> 
> I am currently using property lists for caches, but (setf getf) is
> not thread-safe as such.
> 
> However, if I do a (setf plist (list* new-key new-val plist)), things
> should be thread-safe. As far as I can tell, the worst that can
> happen in this case is that sometimes, cache entries are actually not
> recorded, because another thread writes to the same plist cache. But
> that's ok if it happens rarely.
> 
> However, thread-safety of assignments is an essential requirement for
> this to work...

My view was that if there are two threads writing to the same resource 
at more or less the same time, and one of them 'wins', that's ok. Caches 
are not crucial to the semantics of a program, at least not in the 
ContextL case; they are just there to speed things up.

However, Martin Simmons responded with the following (reposted here with 
his kind permission), which proved that my view was still too naive:

>> I am trying to implement thread-safe caches for ContextL without 
>> having to use locks or other multi-threading primitives, to make
>> them more efficient.
> 
> Yes, I was afraid of that...
> 
[...]

>> However, thread-safety of assignments is an essential requirement
>> for this to work...
>> 
>> If there is any mistake in my thoughts, I'd be very happy to know
>>  about this, of course. ;)
> 
> Yes, they are essential, but unfortunately not sufficient.
> 
> There is a small problem: in some architectures, the order of memory 
> operations is not guaranteed between different CPUs.
> 
> Consider a simplified version of your updating thread doing this:
> A1) Makes a new cons.
> A2) Stores new-key in the car of the cons.
> A3) Stores the cons in some slot.
> 
> Presumably there is also a reading thread that might be doing this: 
> B1) Load the cons from the slot.
> B2) Load the car of the cons.
> B3) Operate on the car of the cons in some way.
> 
> The reading thread will only work reliably if it sees the effect of
> A2 before executing B2.  Unfortunately, that is only guaranteed when
> the effects of A2 are seen by all CPUs before the effects of A3.  If
> the CPU running the reading thread sees the effect of A3 before A2,
> then B2 might read the "old" value of the car in the new cons, which
> will probably be nil or some junk.
> 
> The same issue can occur with the cdrs of the conses and the car of
> the cons containing new-val in the plist.
> 
> Having said that, your code will probably work correctly on current
> Intel chips with the current versions of LispWorks.  It might not be
> portable or future-proof however.

So I guess I will have to bite the sour apple and do proper locking, 
etc., in my code.
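
For reference, a minimal sketch of the locking fallback, using the
portable-threads API mentioned earlier in this thread -- I'm treating
the exact operator names (make-lock and with-lock-held) as an
assumption, and the variable names are hypothetical:

  (defvar *cache-lock* (portable-threads:make-lock :name "contextl cache"))
  (defvar *cache* '())

  (defun cache-add (key value)
    ;; All updates are serialized through one lock; given the memory
    ;; ordering caveats above, readers may need to take it as well.
    (portable-threads:with-lock-held (*cache-lock*)
      (setf *cache* (list* key value *cache*)))
    value)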


This was a very helpful discussion. Thanks to everyone who participated!


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Duane Rettig
Subject: Re: Thread-safe caches
Date: 
Message-ID: <o0ode4fvg6.fsf@gemini.franz.com>
Pascal Costanza <··@p-cos.net> writes:

> Duane Rettig wrote:
>> Pascal Costanza <··@p-cos.net> writes:
>>
> The discussion continued on other mailing lists as well, and I have
> described what I wanted over there. I am going to post it here again,
> because the response is very interesting for the general audience. It
> turns out that things are even worse than people around here assumed.

Hmm.  It's interesting that you came away from the conversation with
that outlook.  I think that it might be overly pessimistic.  I think
Martin has the facts exactly correct, but let me put what he said into
a summary set of points that might put a slightly different spin on
it:

 1. There is an issue where a write might not be atomic due to
multiple processor cache implementations.

 2. Under the current Lisp technologies, you are not likely to see the
problem.

 3. With current plans of Lisp vendors, assumptions of atomicity
might not be portable for future implementations.


The problem as stated in #1 is one of cache-coherency; how does one
cpu which wants to read a word from memory know that a second cpu has
just written to the same address, but that it is still lodged in its
own local cache and is not seen in main memory yet?  The typical
answer is "cache snooping" (not to be confused with DNS cache snooping)
or "bus snooping" or "bus sniffing".  You can google for any of these
and get a wealth of information on the concepts.  However, since this
tends to be a low-level, architecture-specific concept, it is handy to
know what triggers snoop cycles - at the assembler programmer level
one magic keyword is "synchronizing" - whenever an instruction is
labelled a synchronizing instruction, it means that it initiates
whatever it takes to ensure cache coherency.  Some architectures use
other terminology; for example, Sparc defines "memory barrier
semantics", and explicitly notes that their cas instruction does _not_
implement this (i.e. one must use the membar instruction in order to
get proper multi-cpu memory ordering).  The x86 and x86-64 family
require a lock prefix on cmpxchg instructions to ensure this
ordering.  And the wording is sometimes vague, to allow for higher
level specification of intent without throttling the range of
implementations that can be designed for the hardware.

We've been talking about this issue in various places on this thread
in even higher level terms; the "compare-and-swap" operation is often
a library function with compile-time optimizations, but it hopefully
does "the right thing" for the architecture and the number of
processors that are configured.  In Lisp implementations which compile
to machine language, these can be sets of instructions, and can also
be configurable or worst-case assumptions can be made.

For the second point, note that the compare-and-swap operation can be
very expensive.  Elsewhere on this thread, it was seen that in C++ a
compare-and-swap could take up to 20X as long as a raw store
instruction.  I've not experimented much with the timings of the raw
instructions, especially on multiple processors, but it is sufficient
to say that we must expect some slowdown when using these synchronizing
instructions, even though they will usually tend to be much faster
than actually grabbing locks.  But for now, with whatever Lisp
technologies are currently in effect, thread safety _and_ speed is
achieved by understanding when these cache coherency issues are not
likely to occur - e.g. if the lisp is forced to run only in one cpu,
then there will never be an issue with caches, and these locks and/or
synchronizing instructions might then not be necessary.  Why not add
them for safety?  Well, because they can be so much slower...

Finally, the third point is the most potentially troubling.  On the
surface, it seems like he is saying "all bets are off" when looking
into the future.  But I don't think that any Lisp vendor would
purposely remove safety from their products, without also adding
alternatives.  I would imagine that, for example, if a purely atomic
(setf getf) were to be implemented using locks or cas, and it caused
the operation to be much slower, that the vendor might instead leave
the current implementation alone, and provide you with a new
(setf getf-atomic) which used the safety measures, and then gave you a
choice to use it or not.  The original would then have the same
semantics and would tend to operate at the same speed, but it would
then not be as safe in an smp setting as in a non-smp setting.  This
would mean that the programming that you did when assuming safety
would no longer be correct, but there would be a new way of
programming that could take its place.  That is what I read into
Martin's statement about code possibly not being portable or
future-proof.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5phaeeFr7p5qU1@mid.individual.net>
Duane Rettig wrote:
> Pascal Costanza <··@p-cos.net> writes:
> 
>> Duane Rettig wrote:
>>> Pascal Costanza <··@p-cos.net> writes:
>>>
>> The discussion continued on other mailing lists as well, and I have
>> described what I wanted over there. I am going to post it here again,
>> because the response is very interesting for the general audience. It
>> turns out that things are even worse than people around here assumed.
> 
> Hmm.  It's interesting that you came away from the conversation with
> that outlook.  I think that it might be overly pessimistic.  I think
> Martin has the facts exactly correct, but let me put what he said into
> a summary set of points that might put a slightly different spin on
> it:
> 
>  1. There is an issue where a write might not be atomic due to
> multiple processor cache implementations.
> 
>  2. Under the current Lisp technologies, you are not likely to see the
> problem.
> 
>  3. With current plans of Lisp vendors, assumptions of atomicity
> might not be portable for future implementations.

Thanks a lot for the consolation ;) but it actually seems that OpenMCL 
and SBCL already potentially suffer from #1.


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Duane Rettig
Subject: Re: Thread-safe caches
Date: 
Message-ID: <o0ejf0fj9c.fsf@gemini.franz.com>
Pascal Costanza <··@p-cos.net> writes:

> Duane Rettig wrote:
>> Pascal Costanza <··@p-cos.net> writes:
>>
>>> Duane Rettig wrote:
>>>> Pascal Costanza <··@p-cos.net> writes:
>>>>
>>> The discussion continues on other mailing lists as well, and I have
>>> described what I wanted over there. I am going to post it here again,
>>> because the response is very interesting for the general audience. It
>>> turns out that things are even worse than people around here assumed.
>> Hmm.  It's interesting that you came away from the conversation with
>> that outlook.  I think that it might be overly pessimistic.  I think
>> Martin has the facts exactly correct, but let me put what he said into
>> a summary set of points that might put a slightly different spin on
>> it:
>>  1. There is an issue where a write might not be atomic due to
>> multiple processor cache implementations.
>>  2. Under the current Lisp technologies, you are not likely to see
>> the problem.
>>  3. With current plans of Lisp vendors, assumptions of atomicity
>> might not be portable for future implementations.
>
> Thanks a lot for the consolation ;) but it actually seems that OpenMCL
> and SBCL already potentially suffer from #1.

Looks like two bug reports are in order :-)

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5prvhdFsn4seU1@mid.individual.net>
Duane Rettig wrote:
> Pascal Costanza <··@p-cos.net> writes:
> 
>> Duane Rettig wrote:
>>> Pascal Costanza <··@p-cos.net> writes:
>>>
>>>> Duane Rettig wrote:
>>>>> Pascal Costanza <··@p-cos.net> writes:
>>>>>
>>>> The discussion continues on other mailing lists as well, and I have
>>>> described what I wanted over there. I am going to post it here again,
>>>> because the response is very interesting for the general audience. It
>>>> turns out that things are even worse than people around here assumed.
>>> Hmm.  It's interesting that you came away from the conversation with
>>> that outlook.  I think that it might be overly pessimistic.  I think
>>> Martin has the facts exactly correct, but let me put what he said into
>>> a summary set of points that might put a slightly different spin on
>>> it:
>>>  1. There is an issue where a write might not be atomic due to
>>> multiple processor cache implementations.
>>>  2. Under the current Lisp technologies, you are not likely to see
>>> the problem.
>>>  3. With current plans of Lisp vendors, assumptions of atomicity
>>> might not be portable for future implementations.
>> Thanks a lot for the consolation ;) but it actually seems that OpenMCL
>> and SBCL already potentially suffer from #1.
> 
> Looks like two bug reports are in order :-)

Maybe. :)

I think I have been able to solve the concrete problem I had in 
ContextL. The discussions on that topic were very insightful and helped 
a lot. So thanks to everyone who participated.


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Juho Snellman
Subject: Re: Thread-safe caches
Date: 
Message-ID: <87pryk5jxz.fsf@vasara.proghammer.com>
Pascal Costanza <··@p-cos.net> writes:
> > I think
> > Martin has the facts exactly correct, but let me put what he said into
> > a summary set of points that might put a slightly different spin on
> > it:
> >  1. There is an issue where a write might not be atomic due to
> > multiple processor cache implementations.
> >  2. Under the current Lisp technologies, you are not likely to see
> > the problem.
> >  3. With current plans of Lisp vendors, assumptions of atomicity
> > might not be portable for future implementations.
> 
> Thanks a lot for the consolation ;) but it actually seems that OpenMCL
> and SBCL already potentially suffer from #1.

The only CPU architectures that SBCL has thread support on are x86 and
x86-64. Both Intel and AMD have a specified memory ordering for their
x86oid CPUs, where Martin's example would not be a problem.

>>> A1) Makes a new cons.
>>> A2) Stores new-key in the car of the cons.
>>> A3) Stores the cons in some slot.
>>>
>>> Presumably there is also a reading thread that might be doing this:
>>> B1) Load the cons from the slot.
>>> B2) Load the car of the cons.
>>> B3) Operate on the car of the cons in some way.

It's guaranteed that if B1 sees the result of A3 then B2 will see the
write of A2.
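
In Lisp terms, the pattern under discussion is just this (a sketch;
*slot* stands in for whichever slot or variable the cache lives in):

(defvar *slot* nil)

;; Writer (A1-A3): build the cons completely, then publish it with a
;; single store.
(defun publish (new-key)
  (let ((cell (cons new-key nil)))  ; A1 + A2
    (setq *slot* cell)))            ; A3

;; Reader (B1-B3): since x86/x86-64 stores are not reordered with each
;; other, a reader that sees the new cons also sees its initialized car.
(defun consume ()
  (let ((cell *slot*))              ; B1
    (when cell
      (car cell))))                 ; B2 + B3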

-- 
Juho Snellman
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5prvi7Fsn4seU2@mid.individual.net>
Juho Snellman wrote:
> Pascal Costanza <··@p-cos.net> writes:
>>> I think
>>> Martin has the facts exactly correct, but let me put what he said into
>>> a summary set of points that might put a slightly different spin on
>>> it:
>>>  1. There is an issue where a write might not be atomic due to
>>> multiple processor cache implementations.
>>>  2. Under the current Lisp technologies, you are not likely to see
>>> the problem.
>>>  3. With current plans of Lisp vendors, assumptions of atomicity
>>> might not be portable for future implementations.
>> Thanks a lot for the consolation ;) but it actually seems that OpenMCL
>> and SBCL already potentially suffer from #1.
> 
> The only CPU architectures that SBCL has thread support on are x86 and
> x86-64. Both Intel and AMD have a specified memory ordering for their
> x86oid CPUs, where Martin's example would not be a problem.
> 
>>>> A1) Makes a new cons.
>>>> A2) Stores new-key in the car of the cons.
>>>> A3) Stores the cons in some slot.
>>>>
>>>> Presumably there is also a reading thread that might be doing this:
>>>> B1) Load the cons from the slot.
>>>> B2) Load the car of the cons.
>>>> B3) Operate on the car of the cons in some way.
> 
> It's guaranteed that if B1 sees the result of A3 then B2 will see the
> write of A2.

OK, thanks a lot for the correction!


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: David Lichteblau
Subject: Re: Thread-safe caches
Date: 
Message-ID: <slrnfjj7vn.bkh.usenet-2006@radon.home.lichteblau.com>
On 2007-11-08, Duane Rettig <·····@franz.com> wrote:
>  3. With current plans of Lisp vendors, assumptions of atomicity
> might not be portable for future implementations.

So may I ask what your plans for SMP support in Allegro are?


d.
From: Duane Rettig
Subject: Re: Thread-safe caches
Date: 
Message-ID: <o03avaf5oc.fsf@gemini.franz.com>
David Lichteblau <···········@lichteblau.com> writes:

> On 2007-11-08, Duane Rettig <·····@franz.com> wrote:
>>  3. With current plans of Lisp vendors, assumptions of atomicity
>> might not be portable for future implementations.

Note that this point was not my own, but it was the third point in my
interpretation of what Martin Simmons was saying.

> So may I ask what your plans for SMP support in Allegro are?

Yes, of course you may ask.

:-)

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: David Lichteblau
Subject: Re: Thread-safe caches
Date: 
Message-ID: <slrnfjjoj5.bog.usenet-2006@radon.home.lichteblau.com>
On 2007-11-13, Duane Rettig <·····@franz.com> wrote:
>> So may I ask what your plans for SMP support in Allegro are?
> Yes, of course you may ask.
>:-)

Okay, I'm asking! :-)  What are your plans for SMP support in Allegro?

I realize that you might not be able to answer if you cannot speak for
Franz or explain long-term plans for new features in Allegro.

But it would be interesting to hear about
 - technical innovations in Allegro that will allow SMP to be taken
   advantage of (if indeed there are such changes being worked on)
 - or alternatively what exactly the technical difficulties are that
   keep Allegro from supporting SMP
especially compared to SBCL which seems to have taken the approach of
simply doing SMP and worrying about thread safety one problem at a time.


Thanks,
d.
From: Duane Rettig
Subject: Re: Thread-safe caches
Date: 
Message-ID: <o0y7d1esje.fsf@gemini.franz.com>
David Lichteblau <···········@lichteblau.com> writes:

> On 2007-11-13, Duane Rettig <·····@franz.com> wrote:
>>> So may I ask what your plans for SMP support in Allegro are?
>> Yes, of course you may ask.
>>:-)
>
> Okay, I'm asking! :-)  What are your plans for SMP support in Allegro?

Uhh... 

> I realize that you might not be able to answer if you cannot speak for
> Franz or explain long-term plans for new features in Allegro.

Yes, you've called that one.

> But it would be interesting to hear about

These questions below are _much_ more answerable...

>  - technical innovations in Allegro that will allow SMP to be taken
>    advantage of (if indeed there are such changes being worked on)

We've been working for a long time to inch toward smp compatibility in
our implementation.  Most of what we've done in the past has been
targeted at bringing Allegro CL closer to being smp-ready
without any appreciable speed hit.  These changes are dropped into
Allegro CL incrementally, and the user is (hopefully) unaware of them.

One example of this is what we call "wide-binding", which is just
another style of implementing symbol-value on a per-thread basis.  It
was implemented as early as version 6.2 - for a description, see:
http://www.franz.com/support/documentation/8.1/doc/multiprocessing.htm#wide-binding-1
There is a slight speed hit with wide-binding, because an extra
indirection is taken.  But the benefits outweigh the hit; process
switching (both on os-threads and on virtual-threads implementations)
is much cleaner and faster, because unwinding the current bindstack
and rewinding the newly-scheduled thread's bindstack is no longer
necessary (of course, this would also be true for an implementation
which chooses to allocate the value cells as slots in a thread-local
storage area).

Another example is thread-local storage itself; Allegro CL was
originally a non-multiprocessing lisp, and when we added virtual
threads early on, some of the objects that should have been
thread-local remained in a global table rather than being actually
thread-local.  One example of this is the bindstack - its canonical
location is in the global table (i.e. an offset from the nil
register), and we virtualize this value as a thread-local value by
transferring it to its thread when the thread relinquishes control.
Well, if you have an 8.1 lisp that is x86 or x86-64 (others will be
converted in the future), then you will see :tlsvalues and :tlsbnp in
the *features* list; this means that the new canonical locations for
the bindstack and the current binding index (called the
bindstack-name-pointer, hence the "bnp" tla) are always reached via
the thread register, rather than through the global table.

And speaking of thread access: If you disassemble the following in an
8.0 lisp:

(compile
 (defun foo ()
   (declare (optimize speed (safety 0) (debug 0)))
   sys:*current-thread*))

you get (I'm doing x86-64 here):

;; disassembly of #<Function FOO>
;; formals: 

;; code start: #x10009b53b8:
   0: 49 8b 87 d7 f5 movq	rax,[r15-2601]        ; SYS:*CURRENT-THREAD*
      ff ff 
   7: 48 8b 78 ed    movq	rdi,[rax-19]
  11: 49 8b 47 af    movq	rax,[r15-81]  ; SYS::C_BINDSTACK_INDEX
  15: 48 3b 47 f6    cmpq	rax,[rdi-10]
  19: 7d 05          jnl	26
  21: 48 8b 7c 07 fe movq	rdi,[rdi+rax-2]
  26: 48 8b 7f fe    movq	rdi,[rdi-2]
  30: f8             clc
  31: 4c 8b 74 24 10 movq	r14,[rsp+16]
  36: c3             ret
  37: 90             nop
CL-USER(3): 

whereas if you disassemble the same on 8.1 on the same architecture,
you get:

;; disassembly of #<Function FOO>
;; formals: 

;; code start: #x1000934088:
   0: 48 8d 7b 52    leaq	rdi,[rbx+82]
   4: f8             clc
   5: 4c 8b 74 24 10 movq	r14,[rsp+16]
  10: c3             ret
  11: 90             nop
CL-USER(3): 

So thread access is _much_ quicker than before (a necessity for the
"tls" features described above).

>  - or alternatively what exactly the technical difficulties are that
>    keep Allegro from supporting SMP
> especially compared to SBCL which seems to have taken the approach of
> simply doing SMP and worrying about thread safety one problem at a time.

Actually, we had already tried that approach, at least 18 years ago.
The machine was a Sequent 386-based multiple-processor system, and we
used several tricks to make use of the multiple processors, like hash
tables for global-symbol lookups (I guess you could say that this was
the precursor to wide-binding) and also objects called "futures", for
communicating between threads.

The product was extremely slow and unsatisfactory - it was there that
we learned many of the lessons of multiple-processor programming
firsthand, and thus, being once-bitten/twice-shy, we've been much more
careful about what kinds of promises we make to customers and to the
public.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Alexander Kjeldaas
Subject: Re: Thread-safe caches
Date: 
Message-ID: <1194614948.503518.308600@i38g2000prf.googlegroups.com>
On Nov 7, 10:19 pm, Pascal Costanza <····@p-cos.net> wrote:
> Duane Rettig wrote:
> > But what level must we deal with atomic operations to provide safe
> > code?  You've implied here that if the write operations are atomic,
> > then the larger operations are safe.  But this is not true: most of
> > the critical operations that preserve the integrity of the lisp are of
> > the form read/modify/write.
>
> But can we assume that the write operations are atomic? This by itself
> may not necessarily be a given, but that's all I need in my application...
>

This depends on the platform.  If you use the currently documented
Intel platform, you are in luck.  Writes are not reordered.

For other platforms, you might never know.

The portable solution is to use a memory barrier, but the syntax for
a memory barrier depends on your implementation, so you're in a
catch-22.

Note that you *do* need a memory barrier when reading, if your reads
depend on the order of your writes.
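
A portability shim would have to dispatch on the implementation,
something like the sketch below - every operator name in it is made
up for illustration, which is exactly the catch-22: you have to look
each one up in your implementation's manual.

(defmacro memory-barrier ()
  ;; all of these operator names are hypothetical
  #+implementation-a '(impl-a:barrier)
  #+implementation-b '(impl-b:memory-fence)
  #-(or implementation-a implementation-b)
  '(error "no memory barrier available on this implementation"))

On the reading side, you would load the slot, issue the barrier, and
only then follow the car - mirroring the barrier issued after the
writes.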

Alexander
From: Alex Mizrahi
Subject: Re: Thread-safe caches
Date: 
Message-ID: <47322b39$0$90266$14726298@news.sunsite.dk>
DR> [At this point, I am resigned to the likelihood that the original reply
DR> I submitted yesterday morning was lost by my nntp service,

I've received that reply in personal mail (to my surprise), so probably
you pressed the wrong button in your mail agent..

 PC>>> It may also be acceptable that the caches are not always updated, so
 PC>>> if new entries in the caches are actually not written every now and
 PC>>> then, that's fine, as long as the caches remain consistent...
 ??>>
 ??>> i think plists are thread-safe when setq, rplaca and rplacd are
 ??>> atomic, which i believe is true.

 DR> Not likely.  We tend to have an intuition about atomicity, but since
 DR> the CL spec doesn't define multiprocessing, it also doesn't define
 DR> atomic operations; atomic as an adjective is defined in a completely
 DR> different context from what we're discussing here.

by an atomic write operation I mean that the contents of a location
cannot be seen in some "intermediate" state -- readers see either the
value before the write or the value that was written.

 DR> In virtual threads implementations, atomicity can usually be
 DR> guaranteed by disallowing any other threads to run during the critical
 DR> sections.

but primitive writes are typically just a single CPU instruction that
is atomic (on 32-bit x86 CPUs, writing a 32-bit value to an aligned
memory location is atomic, etc.), so it "just works".
I'll cite your "other reply":
---
1. The read and write are themselves atomic.  This is usually trivial
for a read operation, and sometimes easy for a write operation
---

 DR> But what level must we deal with atomic operations to provide safe
 DR> code?  You've implied here that if the write operations are atomic,
 DR> then the larger operations are safe.

no, I didn't mean this in the general case, but in the context of cache
implementations, which have the weakest requirements -- it appears a
cache only requires the data structure to remain valid, so that no
crashes occur and no junk appears.  It easily tolerates concurrent
updates, missing values, etc.

 DR>   But this is not true: most of the critical operations that preserve
 DR> the integrity of the lisp are of the form read/modify/write.

I would also say it's a matter of application logic what to consider
safe.  With the SQL serializable transaction isolation level, in some
cases the database totally discards an operation and asks the
application to fully re-evaluate the transaction -- because it's not
possible to resolve the conflict in any other way.  And it still
doesn't offer really full serialization..

 DR> indicator onto the front of the plist?  And once the search is done, a
 DR> duplicate indicator is pushed, causing the plist to have two of the
 DR> same property.  This wouldn't seem to be a problem, but remprop is
 DR> defined to only remove the first of multiple identical indicators, so a
 DR> program that assumed that a remprop had actually removed the (only)
 DR> value of a property would be broken at this point.

luckily Pascal says he doesn't need remprop..
but actually two implementations are possible here: either duplication
is possible but insertions are very unlikely to be skipped (or are
never skipped, if a larger block is made atomic); or duplications are
not possible, but it's more likely that insertions are skipped.
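
both variants can be sketched with a hypothetical COMPARE-AND-SWAP
primitive (*cache*, key and value are stand-ins):

;; variant 1: retry until the push succeeds -- an insertion is never
;; lost, but two threads adding the same key may leave duplicates.
(loop
  (let ((old *cache*))
    (when (compare-and-swap *cache* old (list* key value old))
      (return))))

;; variant 2: check-then-push, no retry -- duplicates are impossible,
;; but when the cas fails the insertion is silently skipped.
(let ((old *cache*))
  (unless (getf old key)
    (compare-and-swap *cache* old (list* key value old))))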

 DR> (which would not be smp-safe, since all threads potentially run
 DR> simultaneously at that point)

Is Allegro CL planned to have real SMP? Just curious..

 DR> , and would prefer a solution using a compare-and-swap implementation -
 DR> something we are currently working on - "cas" can be done directly,
 DR> with cas (sparc) or locked cmpxchg (x86, x86-64) instructions, or it
 DR> can be done with combinations of load-linked (powerpc) or load-locked
 DR> (alpha) and store-conditional instructions.

..and this CAS asks all eight cores of a shiny new Intel (or Sun) CPU
to synchronize their dirty caches each time someone calls getf or a
function like that.

 DR> in Allegro CL that guarantees hash-tables to be thread-safe, even
 DR> though no locks are employed.  And that atomicity also carries over
 DR> into gethash, maphash and remhash, as well.

that's cool, indeed..
I've heard that in some future release of SBCL, hash-tables can signal
an error when one thread writes while others read -- that can be quite
a surprise.
Probably this algorithm that guarantees safety without locks is quite
complex.

 DR> Well, I think I'm done for now.  I get a feeling my second attempt at
 DR> writing this response is more disjointed than my first try, so I'll
 DR> probably have to fill in the blanks later.  And, of course, my first
 DR> response was much shorter than this one...

this second one was more interesting :)

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"Hanging In The Balance Of Deceit And Blasphemy") 
From: Duane Rettig
Subject: Re: Thread-safe caches
Date: 
Message-ID: <o0k5otwiki.fsf@gemini.franz.com>
"Alex Mizrahi" <········@users.sourceforge.net> writes:

> DR> [At this point, I am resigned to the likelihood that the original reply
> DR> I submitted yesterday morning was lost by my nntp service,
>
> I've received that reply in personal mail (to my surprise), so probably
> you pressed the wrong button in your mail agent..

Ah - that must be it.

>  PC>>> It may also be acceptable that the caches are not always updated, so
>  PC>>> if new entries in the caches are actually not written every now and
>  PC>>> then, that's fine, as long as the caches remain consistent...
>  ??>>
>  ??>> i think plists are thread-safe when setq, rplaca and rplacd are
>  ??>> atomic, which i believe is true.
>
>  DR> Not likely.  We tend to have an intuition about atomicity, but since
>  DR> the CL spec doesn't define multiprocessing, it also doesn't define
>  DR> atomic operations; atomic as an adjective is defined in a completely
>  DR> different context from what we're discussing here.
>
> by an atomic write operation I mean that the contents of a location
> cannot be seen in some "intermediate" state -- readers see either the
> value before the write or the value that was written.

Correct.  But it's not always intuitive as to what constitutes an
atomic operation.  And, as you imply, atomicity is always seen by "an
observer", where the observer might be something at a lower-level than
the lisp level, such as gdb or similar at the instruction level, or it
might be at a higher level than the writer, say, at the level of
another thread.  Or, in an smp setting, the threads themselves may
have an instruction-level granularity, which means that the atomicity
is at a much finer-grained level.  The context of the observer is
always important to questions of atomicity and safety.  In virtual
threads, for example, the granularity is forced by the fact that only
one thread runs at a time.

>  DR> In virtual threads implementations, atomicity can usually be
>  DR> guaranteed by disallowing any other threads to run during the critical
>  DR> sections.
>
> but primitive writes are typically just a single CPU instruction that
> is atomic (on 32-bit x86 CPUs, writing a 32-bit value to an aligned
> memory location is atomic, etc.), so it "just works".
> I'll cite your "other reply":
> ---
> 1. The read and write are themselves atomic.  This is usually trivial
> for a read operation, and sometimes easy for a write operation

Yes, this is true, assuming that the operation is indeed only one
instruction.

>  DR> But what level must we deal with atomic operations to provide safe
>  DR> code?  You've implied here that if the write operations are atomic,
>  DR> then the larger operations are safe.
>
> no, I didn't mean this in the general case, but in the context of cache
> implementations, which have the weakest requirements -- it appears a
> cache only requires the data structure to remain valid, so that no
> crashes occur and no junk appears.  It easily tolerates concurrent
> updates, missing values, etc.

Right, but elsewhere on this thread there have been examples given
where the structure might indeed be temporarily invalid.  If the
observer threads are able to break into the current thread when the
state is inconsistent, then thread-safety is violated.

>  DR>   But this is not true: most of the critical operations that preserve
>  DR> the integrity of the lisp are of the form read/modify/write.
>
> I would also say it's a matter of application logic what to consider
> safe.  With the SQL serializable transaction isolation level, in some
> cases the database totally discards an operation and asks the
> application to fully re-evaluate the transaction -- because it's not
> possible to resolve the conflict in any other way.  And it still
> doesn't offer really full serialization..

Yes, of course.  I think what we are mostly discussing here is whether
the operations in CL that one would normally think of as preferably
thread-safe are indeed so.

>  DR> indicator onto the front of the plist?  And once the search is done, a
>  DR> duplicate indicator is pushed, causing the plist to have two of the
>  DR> same property.  This wouldn't seem to be a problem, but remprop is
>  DR> defined to only remove the first of multiple identical indicators, so a
>  DR> program that assumed that a remprop had actually removed the (only)
>  DR> value of a property would be broken at this point.
>
> luckily Pascal says he doesn't need remprop..
> but actually two implementations are possible here: either duplication
> is possible but insertions are very unlikely to be skipped (or are
> never skipped, if a larger block is made atomic); or duplications are
> not possible, but it's more likely that insertions are skipped.

I would still be concerned about plist implementations possibly not
being thread-safe.

>  DR> (which would not be smp-safe, since all threads potentially run
>  DR> simultaneously at that point)
>
> Is Allegro CL planned to have real SMP? Just curious..

Hmm...

>  DR> , and would prefer a solution using a compare-and-swap implementation -
>  DR> something we are currently working on - "cas" can be done directly,
>  DR> with cas (sparc) or locked cmpxchg (x86, x86-64) instructions, or it
>  DR> can be done with combinations of load-linked (powerpc) or load-locked
>  DR> (alpha) and store-conditional instructions.
>
> ..and this CAS asks all eight cores of a shiny new Intel (or Sun) CPU
> to synchronize their dirty caches each time someone calls getf or a
> function like that.

Yes, of course; there is a price to pay for such explicit atomicity -
that is why it has to be added as carefully as possible.  As often as
possible we secure things algorithmically rather than explicitly, and
we use locks as few times as possible.

>  DR> in Allegro CL that guarantees hash-tables to be thread-safe, even
>  DR> though no locks are employed.  And that atomicity also carries over
>  DR> into gethash, maphash and remhash, as well.
>
> that's cool, indeed..
> I've heard that in some future release of SBCL, hash-tables can signal
> an error when one thread writes while others read -- that can be quite
> a surprise.

We had considered this kind of reaction, but thought it would be too
harsh for situations that might not be easy to reproduce (e.g. two
threads calling intern at the same time on a hash table that is
full).  So we decided to bite the bullet and get more complex...

> Probably this algorithm that guarantees safety without locks is quite
> complex.

Indeed.

>  DR> Well, I think I'm done for now.  I get a feeling my second attempt at
>  DR> writing this response is more disjointed than my first try, so I'll
>  DR> probably have to fill in the blanks later.  And, of course, my first
>  DR> response was much shorter than this one...
>
> this second one was more interesting :)

Thanks.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Mariano Montone
Subject: Re: Thread-safe caches
Date: 
Message-ID: <1194373710.550439.23350@y42g2000hsy.googlegroups.com>
On Nov 6, 4:55 am, Pascal Costanza <····@p-cos.net> wrote:
> Hi,
>
> Currently, ContextL caches are implemented using property lists.
> However, there seem to be problem with concurrent accesses to those
> caches when using multiple threads.
>
> Does anyone know of a straightforward way to make property lists (or
> association lists or hashtables) thread-safe in an
> implementation-independent way, possibly using some kind of
> compatibility layer?

I would play with cl-stm. I'm not saying it will solve your problem
(I've not even tried it seriously myself, just played with the tests),
but it would certainly be a lot of fun! :)

Mariano
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Thread-safe caches
Date: 
Message-ID: <rem-2007nov13-003@yahoo.com>
> From: Pascal Costanza <····@p-cos.net>
> Currently, ContextL caches are implemented using property lists.
> However, there seem to be problem with concurrent accesses to those
> caches when using multiple threads.
> Does anyone know of a straightforward way to make property lists (or
> association lists or hashtables) thread-safe in an
> implementation-independent way, possibly using some kind of
> compatibility layer?

For the benefit of readers who are not familiar with ContextL, it
would be helpful if you would formally describe the sorta-use-cases
of the API you are attempting to make thread-safe.  That is, which
operations upon the cache need to be "atomic" with respect
to switching activity between different threads (or even worse,
running two threads *simultaneously* on dual/multiple processors).

For example, if process A modifies a cached item, then ten minutes
later process B tries to read the same item, do you expect B to see
the cached-modified version or is it permissible for B to still see
the original value after all that time? There are too many
questions like this for me to list a menu of yes/no requirements.
Better you just define precisely what behaviour you expect to be
totally reasonable in what way and what behaviour you would allow
to be somewhat non-reasonable in what way and what unreasonable
behaviour you would absolutely forbid.

My naive view is that you would most definitely need *some* type of
thread-safe compatibility layer, but whether the provisions of the
underlying CPU and/or CL-implementation would suffice already would
depend on your specific requirements, which have not been stated
clearly enough (even in your followups) for my satisfaction.  Most
notably, you haven't said clearly whether you are talking about
readonly caches (all the data is totally static to begin with,
                 specifically at initialization of empty cache and
                 consequent start of multi-threaded use of that
                 cache, and will never be modified during the
                 operation of this application, nevermind the
                 pre-loading of the data store which occurred
                 *beforehand* and is not under discussion here),
or read/write caches (at least one modification will occur during
                      the duration of this multi-threaded application).
You can't solve a problem until you state the problem clearly/precisely!
<meta>I hope you like my prettyprinted English!</meta>

I'll respond separately to the two followups mentioning how Java
deals with [non]atomic operations (64-bit values, and SafeBoolean)
and the one followup mentioning memory fences in modern CPUs.
From: Pascal Costanza
Subject: Re: Thread-safe caches
Date: 
Message-ID: <5puqejFtb7anU1@mid.individual.net>
Robert Maas, see http://tinyurl.com/uh3t wrote:
>> From: Pascal Costanza <····@p-cos.net>
>> Currently, ContextL caches are implemented using property lists.
>> However, there seem to be problem with concurrent accesses to those
>> caches when using multiple threads.
>> Does anyone know of a straightforward way to make property lists (or
>> association lists or hashtables) thread-safe in an
>> implementation-independent way, possibly using some kind of
>> compatibility layer?
> 
> For the benefit of readers who are not familiar with ContextL, it
> would be helpful if you would formally describe the sorta-use-cases
> of the API you are attempting to make thread-safe.  That is, which
> operations upon the cache need to be "atomic" with respect
> to switching activity between different threads (or even worse,
> running two threads *simultaneously* on dual/multiple processors).
> 
> For example, if process A modifies a cached item, then ten minutes
> later process B tries to read the same item, do you expect B to see
> the cached-modified version or is it permissible for B to still see
> the original value after all that time? There are too many
> questions like this for me to list a menu of yes/no requirements.
> Better you just define precisely what behaviour you expect to be
> totally reasonable in what way and what behaviour you would allow
> to be somewhat non-reasonable in what way and what unreasonable
> behaviour you would absolutely forbid.

That's not so straightforward. The implementation of ContextL is 
(roughly) described in http://p-cos.net/documents/context-switch.pdf - 
and I am talking about the caches that are mentioned in Section 3.1 of
that paper.

> My naive view is that you would most definitely need *some* type of
> thread-safe compatibility layer, but whether the provisions of the
> underlying CPU and/or CL-implementation would suffice already would
> depend on your specific requirements, which have not been stated
> clearly enough (even in your followups) for my satisfaction.

Maybe it becomes clearer by reading that paper.

Anyway, I have found a solution to my problem. I'm using the 
portable-threads library of the GBBopen project now, and I do proper 
locking of the caches. Efficiency shouldn't suffer, because the locking 
only occurs on the slow paths - the fast paths are not affected.
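
Roughly, the shape is this (a sketch with made-up cache details, using
portable-threads' make-lock and with-lock-held - not the actual
ContextL code):

(defvar *cache* '())
(defvar *cache-lock* (portable-threads:make-lock :name "contextl-cache"))

(defun lookup (key compute-fn)
  ;; fast path: a plain unlocked read - a stale miss is acceptable
  (or (getf *cache* key)
      ;; slow path: take the lock, re-check, then record the new entry
      (portable-threads:with-lock-held (*cache-lock*)
        (or (getf *cache* key)
            (let ((value (funcall compute-fn key)))
              (setf (getf *cache* key) value)
              value)))))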

Because I use locking now, cache entries can no longer fail to be
properly recorded.  Whether some data types are atomic or not doesn't
matter anymore either.

The discussions here in c.l.l helped a lot to get a better idea about 
this whole issue.

The changes to ContextL are not in the official release yet, but you can 
find them in the darcs repository.

Sorry if I can't be more specific...


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/