Live code patching in Lisp

From: Ulrich Hobelmann
Subject: Live code patching in Lisp
Date: Tue, 11 Oct 2005 15:40:52 +0000
Message-ID: <3r24o5Fhg4v6U1@individual.net>

I'm reading a bit about live code patching and I'm wondering how state 
is kept consistent.

Let's say I update two functions in a running Lisp (say, a web server), 
how can I be sure that one function isn't called after only the first 
one has been updated in the symbol table?

I guess in a single-threaded system this is easy, but in a truly 
multithreaded one you'd have to suspend all threads, or build locks 
around *every* symbol-lookup?

How do Lisps handle this, and what are the problems, if any?

-- 
State, the new religion from the friendly guys who brought you fascism.

From: Jock Cooper
Subject: Re: Live code patching in Lisp
Date: Tue, 11 Oct 2005 17:28:01 +0000
Message-ID: <m33bn8hske.fsf@jcooper02.sagepub.com>

Ulrich Hobelmann <···········@web.de> writes:

> I'm reading a bit about live code patching and I'm wondering how state
> is kept consistent.
> 
> Let's say I update two functions in a running Lisp (say, a web
> server), how can I be sure that one function isn't called after only
> the first one has been updated in the symbol table?
> 
> I guess in a single-threaded system this is easy, but in a truly
> multithreaded one you'd have to suspend all threads, or build locks
> around *every* symbol-lookup?
> 
> How do Lisps handle this, and what are the problems, if any?
> 

How about just calling LOAD inside whatever wrapper your MP
package uses for "don't interrupt this code" (e.g. ACL's WITHOUT-INTERRUPTS)?

From: Kaz Kylheku
Subject: Re: Live code patching in Lisp
Date: Tue, 11 Oct 2005 17:41:03 +0000
Message-ID: <1129052463.361488.156550@g44g2000cwa.googlegroups.com>

Ulrich Hobelmann wrote:
> I'm reading a bit about live code patching and I'm wondering how state
> is kept consistent.
>
> Let's say I update two functions in a running Lisp (say, a web server),
> how can I be sure that one function isn't called after only the first
> one has been updated in the symbol table?

I don't think you can do that, unless you implement your own mutex that
is acquired around calls to those functions, and when they are being
updated.

Actually, the real problem is that there may be threads executing the
old functions as you update them. Those calls may have happened long
before you requested the update.

The old functions will stay around while there are any threads
executing them.

> I guess in a single-threaded system this is easy, but in a truly
> multithreaded one you'd have to suspend all threads, or build locks
> around *every* symbol-lookup?
>
> How do Lisps handle this, and what are the problems, if any?

I think you have to just ensure that everything is compatible, so that
the new versions of functions can coexist with parallel calls to the
old functions, in any combination, ensuring that the right things are
forward and backward compatible.

Suppose we have two functions, A and B. A relies on some interface in B.
Suppose we want to upgrade them to A' and B'. A' relies on some new
features in the B' interface, so that it cannot use B.

What that means is that we have to replace B by B' first, so that the
new interface is in place. And only then replace A with A'.

After we replace B with B', calls from the old A may start arriving
into B'. So B' has to support the old interface expected by A. These
calls may continue even after A is replaced by A', because A will stay
around while there are any threads executing it.
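
A minimal sketch of that ordering in portable Common Lisp. The bodies of A and B are made up for illustration, and (SETF FDEFINITION) stands in for loading a patch file; the only point is that B' keeps accepting the call shape the old A uses.

```lisp
;; Hypothetical functions A and B; (setf fdefinition) stands in for
;; loading a patch file that redefines them.

;; Original versions: A calls B with one argument.
(setf (fdefinition 'b) (lambda (x) (* 2 x)))
(setf (fdefinition 'a) (lambda (x) (b x)))

(defparameter *before* (a 5))

;; Step 1: install B' first. The new :SCALE keyword defaults to the old
;; behaviour, so calls arriving from the old A remain valid.
(setf (fdefinition 'b) (lambda (x &key (scale 2)) (* scale x)))

(defparameter *old-a-into-new-b* (a 5))

;; Step 2: only now install A', which relies on the new interface.
(setf (fdefinition 'a) (lambda (x) (b x :scale 3)))

(defparameter *after* (a 5))
```

Between the two steps the system runs old A against B', which is exactly the combination B' has to tolerate.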

From: Ulrich Hobelmann
Subject: Re: Live code patching in Lisp
Date: Tue, 11 Oct 2005 21:22:38 +0000
Message-ID: <3r2oouFhof3nU1@individual.net>

Kaz Kylheku wrote:
> Actually, the real problem is that there may be threads executing the
> old functions as you update them. Those calls may have happened long
> before you requested the update.

I don't think that's bad, as those threads have been using the old 
functions all along ;)

The problem is that updating the function pointers might not be atomic.

> The old functions will stay around while there are any threads
> executing them.
> 
>> I guess in a single-threaded system this is easy, but in a truly
>> multithreaded one you'd have to suspend all threads, or build locks
>> around *every* symbol-lookup?
>>
>> How do Lisps handle this, and what are the problems, if any?
> 
> I think you have to just ensure that everything is compatible, so that
> the new versions of functions can coexist with parallel calls to the
> old functions, in any combination, ensuring that the right things are
> forward and backward compatible.

That sounds interesting...

> After we replace B with B', calls from the old A may start arriving
> into B'. So B' has to support the old interface expected by A. These
> calls may continue even after A is replaced by A', because A will stay
> around while there are any threads executing it.

Yes, but then you could just signal the thread to "restart". :)

-- 
State, the new religion from the friendly guys who brought you fascism.

From: Thomas A. Russ
Subject: Re: Live code patching in Lisp
Date: Tue, 11 Oct 2005 18:09:15 +0000
Message-ID: <ymihdbo2aes.fsf@sevak.isi.edu>

Ulrich Hobelmann <···········@web.de> writes:

> Let's say I update two functions in a running Lisp (say, a web server), 
> how can I be sure that one function isn't called after only the first 
> one has been updated in the symbol table?
> 
> I guess in a single-threaded system this is easy, but in a truly 
> multithreaded one you'd have to suspend all threads, or build locks 
> around *every* symbol-lookup?

Wouldn't it be sufficient to just execute the patch loading inside of
something like WITHOUT-INTERRUPTS ?

(without-interrupts
   (load-patches))

Of course, WITHOUT-INTERRUPTS is not part of standard CL.  (But then
multiprocessing isn't in CL either.)  A quick check shows that Allegro CL
and MCL both have it.  I assume it is reasonably common in
multiprocessing Lisp systems.

-- 
Thomas A. Russ,  USC/Information Sciences Institute

From: ··············@hotmail.com
Subject: Re: Live code patching in Lisp
Date: Tue, 11 Oct 2005 20:18:00 +0000
Message-ID: <1129061880.321070.35350@g44g2000cwa.googlegroups.com>

Thomas A. Russ wrote:
> Ulrich Hobelmann <···········@web.de> writes:
>
> > Let's say I update two functions in a running Lisp (say, a web server),
> > how can I be sure that one function isn't called after only the first
> > one has been updated in the symbol table?
> >
> > I guess in a single-threaded system this is easy, but in a truly
> > multithreaded one you'd have to suspend all threads, or build locks
> > around *every* symbol-lookup?
>
> Wouldn't it be sufficient to just execute the patch loading inside of
> something like WITHOUT-INTERRUPTS ?
>
> (without-interrupts
>    (load-patches))
>
> Of course, WITHOUT-INTERRUPTS is not part of standard CL.  (But then
> multiprocessing isn't in CL either.)  A quick check shows that Allegro CL
> and MCL both have it.  I assume it is reasonably common in
> multiprocessing Lisp systems.

I don't think this is sufficient; the idea is that some threads may be
between function calls. E.g.

(defun handle-web-request ()
  (setup-web-request)
  (process-web-request)
  (teardown-web-request))

If you wish to change setup-web-request and teardown-web-request
simultaneously (because they have to be mutually compatible, and you've
changed something about the shared functionality), there are (at least)
two issues:

1) how to ensure the two function changes occur in a single atomic
operation (which your approach deals with)

This avoids the problem where you update (setup-web-request), a request
comes in and quickly executes the NEW (setup-web-request), but gets
through the processing to execute the OLD (teardown-web-request) before
you can update that function definition.

2) how to ensure that no threads have executed (setup-web-request) (the
old version), and are currently in (process-web-request), and therefore
need to execute (teardown-web-request) ALSO in the old version.

Even if you change the functions atomically, it is too late for
currently executing threads to run the NEW version of
(setup-web-request) to avoid problems when they execute the NEW version
of (teardown-web-request).

Although I don't have practical experience in this kind of thing, it
seems to me that even with Lisp, you have to set up your server to go
into a "maintenance mode" where, for example, all transactions are
completed cleanly, and no transactions are allowed to begin until
"maintenance mode" is done. Which has to be designed in advance, and to
some extent customized to one's requirements.
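
One way to design this in advance, short of a full maintenance mode, is to keep the mutually dependent functions in a single object and have each request read that object exactly once. The sketch below is only an illustration: the PROTOCOL structure, *PROTOCOL*, INSTALL-V2 and the PROCESS argument are hypothetical, and it assumes that a single variable assignment is seen atomically by other threads, which the standard does not promise.

```lisp
;; Hypothetical names throughout; keywords stand in for real work.
(defstruct protocol setup teardown)

(defvar *protocol*
  (make-protocol :setup    (lambda () :v1-state)
                 :teardown (lambda (state) (list :v1-teardown state))))

(defun handle-web-request (process)
  ;; Read the pair exactly once, so this request keeps a matching
  ;; setup/teardown even if *PROTOCOL* is replaced while it runs.
  (let* ((protocol *protocol*)
         (state (funcall (protocol-setup protocol))))
    (funcall process)
    (funcall (protocol-teardown protocol) state)))

(defun install-v2 ()
  ;; One assignment replaces both functions together (issue 1).
  (setf *protocol*
        (make-protocol :setup    (lambda () :v2-state)
                       :teardown (lambda (state) (list :v2-teardown state)))))

;; Simulate a patch landing while a request is in its processing step
;; (issue 2): the request still finishes with the old teardown.
(defparameter *in-flight-result* (handle-web-request #'install-v2))

;; A request that starts after the patch sees only the new pair.
(defparameter *later-result* (handle-web-request (lambda () nil)))
```

The patch is simulated here by calling INSTALL-V2 from inside the request; in a real server it would arrive from another thread.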

From: Espen Vestre
Subject: Re: Live code patching in Lisp
Date: Wed, 12 Oct 2005 07:30:18 +0000
Message-ID: <kwfyr78a5x.fsf@merced.netfonds.no>

···············@hotmail.com" <············@gmail.com> writes:

> I don't think this is sufficient; the idea is that some threads may be
> between function calls. E.g.
>
> (defun handle-web-request ()
>   (setup-web-request)
>   (process-web-request)
>   (teardown-web-request))

As several posters have pointed out, the /possible/ problems of live
patching are many.  In practice, it doesn't look quite as bad, though.

I have been doing live patching all the time, for several years, in
servers with relatively high load and up to 100 simultaneous threads,
without experiencing any errors of this kind that come to mind right now.
But I usually only do relatively small incremental patches. E.g. this
week I released a redesigned version of my "Prime Trader" server, and
I didn't even think of trying to hot-patch those changes, especially
since most of the completely redesigned parts were in the startup
routines, and of course I wanted to see if these worked as well in
production as in tests (they did, with a few exceptions ;)). 

Usually, it's pretty easy to be reasonably sure that an incremental
change doesn't actually do something like your example above. And if
it does, it's usually easy to implement "teardown-web-request" in a way
that will work with both versions of "setup-web-request".

But I always have live patching in mind when I write this server
code, and a few times I've even changed the names of functions in
order to avoid problems.
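
A rough sketch of the renaming trick, with made-up function bodies and a hypothetical SWITCH-HANDLER standing in for loading the patch. The new pair gets new names, so the old pair is never touched, and only the caller is replaced, in one step.

```lisp
;; Old pair and the handler that uses it (hypothetical stand-ins).
(defun setup-web-request () :old-state)
(defun teardown-web-request (state) (list :old-teardown state))

(defun handle-web-request (process)
  (let ((state (setup-web-request)))
    (funcall process)
    (teardown-web-request state)))

;; The patch first adds the new pair under NEW names; nothing calls
;; them yet, so running requests are unaffected.
(defun setup-web-request-2 () :new-state)
(defun teardown-web-request-2 (state) (list :new-teardown state))

(defun switch-handler ()
  ;; Last step of the patch: replace only the caller.
  (setf (fdefinition 'handle-web-request)
        (lambda (process)
          (let ((state (setup-web-request-2)))
            (funcall process)
            (teardown-web-request-2 state)))))

;; A request already inside the old handler when the switch happens
;; still pairs the old setup with the old teardown.
(defparameter *request-during-patch* (handle-web-request #'switch-handler))

;; Requests that start afterwards use the new pair.
(defparameter *request-after-patch* (handle-web-request (lambda () nil)))
```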

In practice, it's also often a question of tradeoff. Which is more
important: to quickly solve a severe problem, or to reduce the risk of
unwanted consequences of the fix from low to negligible?
-- 
  (espen)

From: Pascal Costanza
Subject: Re: Live code patching in Lisp
Date: Tue, 11 Oct 2005 18:43:02 +0000
Message-ID: <3r2fdmFhfbb1U1@individual.net>

Ulrich Hobelmann wrote:
> I'm reading a bit about live code patching and I'm wondering how state 
> is kept consistent.
> 
> Let's say I update two functions in a running Lisp (say, a web server), 
> how can I be sure that one function isn't called after only the first 
> one has been updated in the symbol table?
> 
> I guess in a single-threaded system this is easy, but in a truly 
> multithreaded one you'd have to suspend all threads, or build locks 
> around *every* symbol-lookup?
> 
> How do Lisps handle this, and what are the problems, if any?

By default, Common Lisp doesn't provide dedicated means to deal with 
this. You have to think about setting up the infrastructure in a way to 
make this work.

This is actually one of the usage scenarios we have in mind for 
ContextL. With something like dynamically scoped functions or ContextL, 
it should be possible to activate new definitions only for newly spawned 
threads, without affecting already executing threads. In theory, you 
only have to ensure that the demons who are responsible for spawning the 
worker threads inquire about the layers / function definitions that 
should be applied.

We haven't tried this yet, so I can't tell whether this would actually 
work. If someone is interested in trying this out, I would be happy to 
hear about it...

Pascal

-- 
OOPSLA'05 tutorial on generic functions & the CLOS Metaobject Protocol
++++ see http://p-cos.net/oopsla05-tutorial.html for more details ++++

From: Ulrich Hobelmann
Subject: Re: Live code patching in Lisp
Date: Tue, 11 Oct 2005 21:26:32 +0000
Message-ID: <3r2p07Fhof3nU2@individual.net>

Pascal Costanza wrote:
> This is actually one of the usage scenarios we have in mind for 
> ContextL. With something like dynamically scoped functions or ContextL, 
> it should be possible to activate new definitions only for newly spawned 
> threads, without affecting already executing threads. In theory, you 
> only have to ensure that the demons who are responsible for spawning the 
>  worker threads inquire about the layers / function definitions that 
> should be applied.

Yes, that's a bit what I'm thinking.  New threads could be parameterized 
with their "package", so you could just give them a new package of 
definitions.  But I'm not sure if you can actually "drop" packages, or 
use packages as first-class values in Lisp (like SML structs in a 
functor), so ContextL layers sound interesting there.

> We haven't tried this yet, so I can't tell whether this would actually 
> work. If someone is interested in trying this out, I would be happy to 
> hear about it...

Mine was just a theoretical question so far, but I might give it a thought.

-- 
State, the new religion from the friendly guys who brought you fascism.

From: Pascal Costanza
Subject: Re: Live code patching in Lisp
Date: Tue, 11 Oct 2005 22:29:47 +0000
Message-ID: <3r2smrFhjbdnU1@individual.net>

Ulrich Hobelmann wrote:
> Pascal Costanza wrote:
> 
>> This is actually one of the usage scenarios we have in mind for 
>> ContextL. With something like dynamically scoped functions or 
>> ContextL, it should be possible to activate new definitions only for 
>> newly spawned threads, without affecting already executing threads. In 
>> theory, you only have to ensure that the demons who are responsible 
>> for spawning the  worker threads inquire about the layers / function 
>> definitions that should be applied.
> 
> Yes, that's a bit what I'm thinking.  New threads could be parameterized 
> with their "package", so you could just give them a new package of 
> definitions.  But I'm not sure if you can actually "drop" packages, or 
> use packages as first-class values in Lisp (like SML structs in a 
> functor), so ContextL layers sound interesting there.

Packages influence the way s-expressions are read; they are purely a 
way to manage different symbol spaces. The connection to functions, 
variables, classes, whatever, is only there because such definitions are 
typically ("accidentally") associated with symbols. But packages do 
_not_ structure such definitions, they are completely oblivious to how 
symbols are used. This is in stark contrast to module systems that don't 
structure namespaces, but instead manage definitions and their 
visibility across several modules. Among other things, this means that 
there is no way to restructure your programs at runtime by using 
packages (unless you use READ to read in new program code, but that's 
beside the point here).

ContextL layers are closer to what a module system typically provides 
(but that's not the primary goal). They are collections of partial class 
and method definitions, and there are ways to activate them with dynamic 
scope. (There are also ways to activate them globally, but this would 
lead to similar problems with already executing threads as in other 
approaches. Only dynamically scoped layer activation ensures that no 
other thread is affected.)
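
To illustrate the idea of dynamically scoped activation only, here is a sketch that uses plain special variables and is NOT ContextL's API. All names are hypothetical, closures stand in for worker threads, and it assumes (as is the case in the threaded implementations I know of) that a LET binding of a special variable is local to the thread that establishes it.

```lisp
;; Plain special variables, not ContextL: *DEFINITIONS* plays the role
;; of the active layer, and closures stand in for worker threads.
(defvar *latest-definitions* (list :greet (lambda () "hello from v1")))

(defvar *definitions* nil
  "Definitions active in the current dynamic scope.")

(defun greet ()
  (funcall (getf *definitions* :greet)))

(defun spawn-worker (body)
  "Return a thunk that runs BODY with the definitions current at spawn time."
  (let ((snapshot *latest-definitions*))
    (lambda ()
      (let ((*definitions* snapshot))   ; dynamically scoped activation
        (funcall body)))))

(defparameter *old-worker* (spawn-worker #'greet))

;; Publish new definitions; only workers spawned from now on see them.
(setf *latest-definitions* (list :greet (lambda () "hello from v2")))

(defparameter *new-worker* (spawn-worker #'greet))
```

Running *OLD-WORKER* after the update still uses the first definition of the greeting, while *NEW-WORKER* uses the second.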

What's additionally exciting is that ContextL doesn't seem to impose any 
serious runtime overhead. Classes and generic functions that are 
augmented by layers are not any slower than regular CLOS classes and 
functions. Hopefully, we can publish a paper soon about how we have 
achieved this (or you could download the source code and analyze it... ;)

>> We haven't tried this yet, so I can't tell whether this would actually 
>> work. If someone is interested in trying this out, I would be happy to 
>> hear about it...
> 
> Mine was just a theoretical question so far, but I might give it a thought.

Just let me know...

Pascal

-- 
OOPSLA'05 tutorial on generic functions & the CLOS Metaobject Protocol
++++ see http://p-cos.net/oopsla05-tutorial.html for more details ++++

From:  (typep 'nil '(satisfies identity)) => ?
Subject: Re: Live code patching in Lisp
Date: Wed, 12 Oct 2005 21:18:54 +0000
Message-ID: <1129151934.233303.116910@z14g2000cwz.googlegroups.com>

Hi!
The easiest way of patching is to put the different versions of a function
inside the function itself, selected by a cond or case clause.

E.g. write a new function that uses case:

(defun foo ()
  (case *version* ;; a global variable holding your current version
    (1.0 (your-code-for-version-1.0))
    (1.1 (your-code-for-version-1.1))))
Replace your old function (without the clause) with the new one, which
contains both versions, leaving *version* at 1.0. There won't be any
difference in behavior, either for functions already invoked or for
functions called from then on.

To switch, set *version* to the new value; all calls from then on
will use the new code.

When multiple functions depend on each other and all have to be
patched, you could define different variables and set them in the right
order. If you expect compatibility issues while changing different
levels of an interface, define waiting routines in the case clauses
that wait until all variables are set up properly.

I prefer this way to real patching because these operations
are well established in Common Lisp.
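
A small runnable sketch of the approach, under a few assumptions of mine: the version tags are keywords rather than floats (symbols are less fragile as CASE keys), ECASE is used so an unknown version signals an error instead of returning NIL, and the arithmetic is just a stand-in for the old and new code.

```lisp
(defvar *version* :v1.0
  "Hypothetical global naming the active version.")

(defun foo (x)
  (ecase *version*
    (:v1.0 (+ x 1))      ; stand-in for the old code
    (:v1.1 (+ x 2))))    ; stand-in for the new code

;; With *VERSION* untouched, the patched FOO behaves like the old one.
(defparameter *before-switch* (foo 10))

;; Switching is a single assignment; later calls take the new branch.
(setf *version* :v1.1)

(defparameter *after-switch* (foo 10))
```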