From: Malcolm McLean
Subject: Lisp in C.
Date: 
Message-ID: <krednUf6s9BQb73bnZ2dnUVZ8turnZ2d@bt.com>
If you are implementing a Lisp in  C, you can have a structure

typedef struct
{
  struct node *cdr;
  struct node *car;
  void *atom;
} NODE;

However this is a waste because if a node is an atom then cdr and car will 
be null. However how do you distinguish between a list and an atom?

Secondly, there is a nil object. Is it best to map this to memory location 
zero, so a node with cdr = NULL is the terminal one ina list, or is it best 
to create a nil node somewhere (with pointers to NULL or to itself?).

The question isn't really about C and would apply to a Lisp intepreter 
written in assembly. My plan however is to have an adventure program 
basically written in
C and compilable under C, but with the high-level logic written in Lisp.
-- 
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

From: ·······@googlemail.com
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176568764.226222.17220@y5g2000hsa.googlegroups.com>
I currently use something like this.

enum type {
        CONS,
        SYMBOL
};

typedef struct objct * obj;

struct object {
        enum type type;
        union {
               struct { obj car,cdr; } cons;
               char * symbol;
        } u;
};

obj nil = NULL;


#define Consp(x)    ((x)&&(x)->type==CONS)
#define Symbolp(x)  ((x)&&(x)->type==SYMBOL)



The test against nil is simply

if (!x) {
        // do something here
}




Regards,

Thomas
From: Ken Tilton
Subject: Re: Lisp in C.
Date: 
Message-ID: <m27Uh.653$S27.444@newsfe12.lga>
Malcolm McLean wrote:
> If you are implementing a Lisp in  C, you can have a structure
> 
> typedef struct
> {
>  struct node *cdr;
>  struct node *car;
>  void *atom;
> } NODE;
> 
> However this is a waste because if a node is an atom then cdr and car 
> will be null. However how do you distinguish between a list and an atom?

use some bitfields:
0 = atom
1 = cons with car containing any data that fits in 32-bits
2 = cons with car containing pointer to data

> 
> Secondly, there is a nil object. Is it best to map this to memory 
> location zero, so a node with cdr = NULL is the terminal one ina list, 
> or is it best to create a nil node somewhere (with pointers to NULL or 
> to itself?).

The former. zero = NIL = false (done deal in C)

> 
> The question isn't really about C and would apply to a Lisp intepreter 
> written in assembly. My plan however is to have an adventure program 
> basically written in
> C and compilable under C, but with the high-level logic written in Lisp.

Did you look at ECL?

kt

-- 
http://www.theoryyalgebra.com/

"Algebra is the metaphysics of arithmetic." - John Ray

"As long as algebra is taught in school,
there will be prayer in school." - Cokie Roberts

"Stand firm in your refusal to remain conscious during algebra."
    - Fran Lebowitz

"I'm an algebra liar. I figure two good lies make a positive."
    - Tim Allen
From: Ari Johnson
Subject: Re: Lisp in C.
Date: 
Message-ID: <m2irby4s2n.fsf@hermes.theari.com>
"Malcolm McLean" <·········@btinternet.com> writes:

> If you are implementing a Lisp in  C, you can have a structure
>
> typedef struct
> {
>  struct node *cdr;
>  struct node *car;
>  void *atom;
> } NODE;
>
> However this is a waste because if a node is an atom then cdr and car
> will be null. However how do you distinguish between a list and an
> atom?
>
> Secondly, there is a nil object. Is it best to map this to memory
> location zero, so a node with cdr = NULL is the terminal one ina list,
> or is it best to create a nil node somewhere (with pointers to NULL or
> to itself?).
>
> The question isn't really about C and would apply to a Lisp intepreter
> written in assembly. My plan however is to have an adventure program
> basically written in
> C and compilable under C, but with the high-level logic written in Lisp.

For a very small Lisp designed for a very similar purpose (having a
Lisp to allow players to control in-game object behavior without
exposing unwanted parts of the underlying system, with the underlying
system written in C), see my own ALisp (not to be confused with any
other Lisp of a similar name): http://theari.com/code/alisp.tar.gz
(updated February 25, 2007 to be amd64-friendly).

It's fairly portable.  It's a Lisp-1.  It has a bytecode compiler and
interpreter.  It does macros.  It's fun.  It's fairly well-commented
(particularly the type system, which is what you are asking about).
It's 8502 lines of C in total.  It includes sample code, including a
REPL driver and an OOP system.  It's BSD-licensed.  Have fun.
From: Ari Johnson
Subject: Re: Lisp in C.
Date: 
Message-ID: <m2abxa47nz.fsf@hermes.theari.com>
Ari Johnson <·········@gmail.com> writes:
> It's 8502 lines of C in total.  It includes sample code, including a
> REPL driver and an OOP system.  It's BSD-licensed.  Have fun.

Of course, I meant to say that it's MIT-licensed.  You can use my name
if you really think it will help.
From: John Thingstad
Subject: Re: Lisp in C.
Date: 
Message-ID: <op.tqsaucskpqzri1@pandora.upc.no>
On Sat, 14 Apr 2007 17:28:28 +0200, Malcolm McLean  
<·········@btinternet.com> wrote:

> If you are implementing a Lisp in  C, you can have a structure
>
> typedef struct
> {
>   struct node *cdr;
>   struct node *car;
>   void *atom;
> } NODE;
>

Not qute.

typedef struct cons
{
   WORD car_tag_inline_rep;
   union
   {
      void *car_atom_rep;
      struct cons *car_cons_rep;
    };
   WORD cdr_tag_inline_rep;
   union
   {
      void *cdr_atom_rep;
      struct cons *cdr_cons_rep;
    };
} CONS;

Is more like the way Lisp sees it.
Integers are stored as a car_inline_rep.
You need type tag assigned to all objects also inline reps.
Thus a fixnum has typically two tag bits which can not be used to store  
the number.
Instead of  void * it is better to use a union for the various types here.
(array and hash_table structs perhaps)

Anyhow I probably haven't handled the type tags very well here.
"Lisp in small pieces" is a good book for clues on implementing a Lisp.

> However this is a waste because if a node is an atom then cdr and car  
> will be null. However how do you distinguish between a list and an atom?
>
> Secondly, there is a nil object. Is it best to map this to memory  
> location zero, so a node with cdr = NULL is the terminal one ina list,  
> or is it best to create a nil node somewhere (with pointers to NULL or  
> to itself?).
>

Well nil is a subtype of null in Lisp.
All other types are subtypes of t.
The important thing is that returning nil is different from returning 0
so a 0 pointer is not the way to go.
(The tag value of the tag_inline_rep means you can always tell.)

> The question isn't really about C and would apply to a Lisp intepreter  
> written in assembly. My plan however is to have an adventure program  
> basically written in
> C and compilable under C, but with the high-level logic written in Lisp.

A adventure program doesn't sound very CPU intensive.
Why not write the whole thing in Lisp?
Marshaling between C and Lisp is costly.

There is a GNU library called GUILE which implents a Scheme like
Lisp. It is designed to be embedded in C programs.
Also for Common Lisp fans there is ECL.
So rather than reinvent the wheel perhaps one of these will do?
(The language Lua is popular in gaming comuneties)

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
From: Wolfram Fenske
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176588389.714849.310990@e65g2000hsc.googlegroups.com>
"John Thingstad" <··············@chello.no> writes:

> On Sat, 14 Apr 2007 17:28:28 +0200, Malcolm McLean
> <·········@btinternet.com> wrote:

[...]

> Anyhow I probably haven't handled the type tags very well here.
> "Lisp in small pieces" is a good book for clues on implementing a
> Lisp.

I agree that "Lisp in Small Pieces" is a very good book for people who
want to implement a Lisp, but it doesn't focus on machine-related
issues that much.  However, it does include a small Scheme to C
compiler that the OP might want to look at.  The data type definitions
are too long to post here, but you can download all the source code
for the book from <http://www-spi.lip6.fr/~queinnec/WWW/LiSP.html>.
Look for the link that says "programs of this book".  There's a file
src/c/scheme.h in archive that defines all the data types.  Take a
look at "SCM_object" and friends.

[...]

>> Secondly, there is a nil object. Is it best to map this to memory
>> location zero, so a node with cdr = NULL is the terminal one ina
>> list,  or is it best to create a nil node somewhere (with pointers
>> to NULL or  to itself?).
>
> Well nil is a subtype of null in Lisp.
> All other types are subtypes of t.
> The important thing is that returning nil is different from returning 0
> so a 0 pointer is not the way to go.

I agree.  Using NULL for NIL was a lot of trouble in my interpreter
because I needed to treat NULL specially in many places.  So I ended
up defining the symbol NIL as a global constant.  However, the code
has changed a lot since I made that decision.  Maybe I'd do it
differently now.

[...]

> A adventure program doesn't sound very CPU intensive.
> Why not write the whole thing in Lisp?
> Marshaling between C and Lisp is costly.

As I understood it, the OP wants to implement his own Lisp.  That
means he can make the interaction between C and Lisp as efficient as
he wants.  The operative word is, of course, "can".  Unless you invest
an enormous amount of time, your home-grown interpreter will probably
be much slower and buggier than anything that's currently out there.

> There is a GNU library called GUILE which implents a Scheme like
> Lisp. It is designed to be embedded in C programs.
> Also for Common Lisp fans there is ECL.
> So rather than reinvent the wheel perhaps one of these will do?
> (The language Lua is popular in gaming comuneties)

I agree.  If your goal is to write your own Lisp, by all means go
ahead.  But if your mainly interested in writing that game, you should
try really hard to find an existing Lisp or Scheme that fits your
needs.

--
Wolfram Fenske

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
From: ·······@googlemail.com
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176589049.109485.182540@w1g2000hsg.googlegroups.com>
> I agree.  Using NULL for NIL was a lot of trouble in my interpreter
> because I needed to treat NULL specially in many places.  So I ended
> up defining the symbol NIL as a global constant.  However, the code
> has changed a lot since I made that decision.  Maybe I'd do it
> differently now.

It depends on the host language.  A interpreter writen in c++ may
work fine this way.  Simply becouse it is easy to type something like

     foo->null()

But this is not always what you want.  I have come to the conclusion,
that it is the best way to write an interpreter straight from SICP
with
a minimum of fuss.

Regards,

Thomas
From: Wolfram Fenske
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176591296.913743.73600@e65g2000hsc.googlegroups.com>
--text follows this line--
·······@googlemail.com writes:

>> I agree.  Using NULL for NIL was a lot of trouble in my interpreter
>> because I needed to treat NULL specially in many places.  So I ended
>> up defining the symbol NIL as a global constant.  However, the code
>> has changed a lot since I made that decision.  Maybe I'd do it
>> differently now.
>
> It depends on the host language.  A interpreter writen in c++ may
> work fine this way.  Simply becouse it is easy to type something like
>
>      foo->null()

I wouldn't know, I wrote mine in C. :-)

IIRC, the problem with using NULL for NIL was that most of the time I
only want to know whether a given object is a symbol or not.  It's
easier if all symbols can be treated equally.  After all, NIL isn't
only used as boolean FALSE, but is also frequently used as a symbol,
e. g., it's often the name of a Common Lisp BLOCK.

When I actually do want to know the boolean value of an object, I
compare it to my global C variable "lisp_nil".

> But this is not always what you want.  I have come to the
> conclusion, that it is the best way to write an interpreter straight
> from SICP with a minimum of fuss.

I can't comment on SICP, haven't read it.  But as someone else already
said, the hard part isn't the interpreter per se, but the garbage
collector (and call/cc if you want to implement that, too ;-)).  You
also need a ton of library functions to make your language useful, but
for the most part that's just work.  It's a lot of work, though.

--
Wolfram Fenske

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
From: verec
Subject: Re: Lisp in C.
Date: 
Message-ID: <46215f70$0$517$5a6aecb4@news.aaisp.net.uk>
On 2007-04-14 23:54:56 +0100, "Wolfram Fenske" <·····@gmx.net> said:

> But as someone else already said, the hard part isn't the interpreter 
> per se, but the garbage collector

A simple mark and sweep is well known, close to trivial to implement,
and well documented everywhere :-)

Also, despite its other shortcomings (pauses and running time),
it uses memory efficiently (no "half space" to set aside) and
can be made precise if you are careful about how you mix lisp types
and "foreign" (ie "C") types.
--
JFB
From: Wolfram Fenske
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176603112.734038.78670@e65g2000hsc.googlegroups.com>
verec <·····@mac.com> writes:

> On 2007-04-14 23:54:56 +0100, "Wolfram Fenske" <·····@gmx.net> said:
>
>> But as someone else already said, the hard part isn't the
>> interpreter per se, but the garbage collector
>
> A simple mark and sweep is well known, close to trivial to implement,
> and well documented everywhere :-)

Yes, in theory it's easy.  In practice you forget to mark one field in
one of your data types, start getting random errors and segmentation
faults several days later and then spend a day or two chasing dangling
pointers.  Then you decide to make the GC generational because GC'ing
everything all the time is too slow and you forget to record one
pointer from an older to a younger generation ... :-) Those bugs are
probably the most evil kind there is.  That's why I find the GC hard.

Also, it took me some time to find out how to keep track of which Lisp
objects are one the C stack.  I finally found a neat idea in the OCaml
sources.  If anyone's interested I can try to explain how it works.

--
Wolfram Fenske

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
From: Edi Weitz
Subject: Re: Lisp in C.
Date: 
Message-ID: <uabx9q59m.fsf@agharta.de>
On 14 Apr 2007 19:11:52 -0700, "Wolfram Fenske" <·····@gmx.net> wrote:

> Also, it took me some time to find out how to keep track of which
> Lisp objects are one the C stack.  I finally found a neat idea in
> the OCaml sources.  If anyone's interested I can try to explain how
> it works.

I am.

-- 

Lisp is not dead, it just smells funny.

Real email: (replace (subseq ·········@agharta.de" 5) "edi")
From: ·······@googlemail.com
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176647651.855893.113160@p77g2000hsh.googlegroups.com>
On Apr 15, 12:52 pm, Edi Weitz <········@agharta.de> wrote:
> On 14 Apr 2007 19:11:52 -0700, "Wolfram Fenske" <····@gmx.net> wrote:
>
> > Also, it took me some time to find out how to keep track of which
> > Lisp objects are one the C stack.  I finally found a neat idea in
> > the OCaml sources.  If anyone's interested I can try to explain how
> > it works.
>
> I am.

Me too.  I did not find a nice way to find it.  I looked at the emacs
source (alloc.c), but their method look a bit ugly.  Search for
find_stack_direction.

Thomas
From: Wolfram Fenske
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176660656.787286.309680@b75g2000hsg.googlegroups.com>
Ari Johnson <·········@gmail.com> writes:

> ·······@googlemail.com writes:
>
>> On Apr 15, 12:52 pm, Edi Weitz <········@agharta.de> wrote:
>>> On 14 Apr 2007 19:11:52 -0700, "Wolfram Fenske" <····@gmx.net> wrote:
>>>
>>> > Also, it took me some time to find out how to keep track of which
>>> > Lisp objects are one the C stack.  I finally found a neat idea in
>>> > the OCaml sources.  If anyone's interested I can try to explain how
>>> > it works.
>>>
>>> I am.
>>
>> Me too.  I did not find a nice way to find it.  I looked at the emacs
>> source (alloc.c), but their method look a bit ugly.  Search for
>> find_stack_direction.
>
> I, too, am interested in any novel way of doing this. :)

The basic idea is to build a linked list of all the parameters and
local variables of a C function that are GC-able objects, and to save
the head of that linked list in a global variable that the GC knows
about.  The global variable, called "caml_local_roots", is initialized
with NULL.  Each C function that deals with GC-able objects has to
follow this protocol:

1) save the old value of "caml_local_roots",

2) build a struct that contains pointers to all its parameters and
   locals,

3) set the "next" pointer of that struct to the old value of
   "caml_local_roots", and finally

4) set "caml_local_roots" to the address of that struct.

When such a function returns, it simply resets "caml_local_roots" to
the value that was saved in step 1) and the performs a C "return".

The datatypes and macros that make this easy to write are defined in
ocaml-${version}/byterun/memory.h (Note: "value" is the C data type
for all OCaml objects.):

--8<---------------cut here---------------start------------->8---
struct caml__roots_block {
  struct caml__roots_block *next;
  intnat ntables;
  intnat nitems;
  value *tables [5];
};

CAMLextern struct caml__roots_block *caml_local_roots;  /* defined in
roots.c */

/* The following macros are used to declare C local variables and
   function parameters of type [value].

   The function body must start with one of the [CAMLparam] macros.
   If the function has no parameter of type [value], use [CAMLparam0].
   If the function has 1 to 5 [value] parameters, use the
corresponding
   [CAMLparam] with the parameters as arguments.
   If the function has more than 5 [value] parameters, use
[CAMLparam5]
   for the first 5 parameters, and one or more calls to the
[CAMLxparam]
   macros for the others.
   If the function takes an array of [value]s as argument, use
   [CAMLparamN] to declare it (or [CAMLxparamN] if you already have a
   call to [CAMLparam] for some other arguments).

   If you need local variables of type [value], declare them with one
   or more calls to the [CAMLlocal] macros at the beginning of the
   function. Use [CAMLlocalN] (at the beginning of the function) to
   declare an array of [value]s.

   Your function may raise an exception or return a [value] with the
   [CAMLreturn] macro.  Its argument is simply the [value] returned by
   your function.  Do NOT directly return a [value] with the [return]
   keyword.  If your function returns void, use [CAMLreturn0].

   All the identifiers beginning with "caml__" are reserved by Caml.
   Do not use them for anything (local or global variables, struct or
   union tags, macros, etc.)
*/

#define CAMLparam0() \
  struct caml__roots_block *caml__frame = caml_local_roots

[...]

#define CAMLreturn(result) do{ \
  caml_local_roots = caml__frame; \
  return (result); \
}while(0)

[...]

#define CAMLparam2(x, y) \
  CAMLparam0 (); \
  CAMLxparam2 (x, y)

#define CAMLparam3(x, y, z) \
  CAMLparam0 (); \
  CAMLxparam3 (x, y, z)

[...]

#define CAMLparamN(x, size) \
  CAMLparam0 (); \
  CAMLxparamN (x, (size))

[...]

#define CAMLxparam2(x, y) \
  struct caml__roots_block caml__roots_##x; \
  CAMLunused int caml__dummy_##x = ( \
    (caml__roots_##x.next = caml_local_roots), \
    (caml_local_roots = &caml__roots_##x), \
    (caml__roots_##x.nitems = 1), \
    (caml__roots_##x.ntables = 2), \
    (caml__roots_##x.tables [0] = &x), \
    (caml__roots_##x.tables [1] = &y), \
    0)

#define CAMLxparam3(x, y, z) \
  struct caml__roots_block caml__roots_##x; \
  CAMLunused int caml__dummy_##x = ( \
    (caml__roots_##x.next = caml_local_roots), \
    (caml_local_roots = &caml__roots_##x), \
    (caml__roots_##x.nitems = 1), \
    (caml__roots_##x.ntables = 3), \
    (caml__roots_##x.tables [0] = &x), \
    (caml__roots_##x.tables [1] = &y), \
    (caml__roots_##x.tables [2] = &z), \
    0)

[...]

#define CAMLxparamN(x, size) \
  struct caml__roots_block caml__roots_##x; \
  CAMLunused int caml__dummy_##x = ( \
    (caml__roots_##x.next = caml_local_roots), \
    (caml_local_roots = &caml__roots_##x), \
    (caml__roots_##x.nitems = (size)), \
    (caml__roots_##x.ntables = 1), \
    (caml__roots_##x.tables[0] = &(x[0])), \
    0)

[...]

#define CAMLlocal3(x, y, z) \
  value x = 0, y = 0, z = 0; \
  CAMLxparam3 (x, y, z)

[...]

#define CAMLlocalN(x, size) \
  value x [(size)] = { 0, /* 0, 0, ... */ }; \
  CAMLxparamN (x, (size))
--8<---------------cut here---------------end--------------->8---

Here's how it looks in action (from otherlibs/unix/getaddrinfo.c).
The typing overhead isn't too bad, IMO.

--8<---------------cut here---------------start------------->8---
static value convert_addrinfo(struct addrinfo * a)
{
  CAMLparam0();
  CAMLlocal3(vres,vaddr,vcanonname);

  union sock_addr_union sa;
  socklen_param_type len;

  len = a->ai_addrlen;
  if (len > sizeof(sa)) len = sizeof(sa);
  memcpy(&sa.s_gen, a->ai_addr, len);
  vaddr = alloc_sockaddr(&sa, len, -1);
  vcanonname = copy_string(a->ai_canonname == NULL ? "" : a-
>ai_canonname);
  vres = alloc_small(5, 0);
  Field(vres, 0) = cst_to_constr(a->ai_family, socket_domain_table, 3,
0);
  Field(vres, 1) = cst_to_constr(a->ai_socktype, socket_type_table, 4,
0);
  Field(vres, 2) = Val_int(a->ai_protocol);
  Field(vres, 3) = vaddr;
  Field(vres, 4) = vcanonname;

  CAMLreturn(vres);
}
--8<---------------cut here---------------end--------------->8---

PS: I meant to reply to Ari Johnson's posting but it looks like Google
groups didn't get it.  So I just reply to another one. :-)

--
Wolfram Fenske

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
From: ·······@googlemail.com
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176666200.665030.186570@b75g2000hsg.googlegroups.com>
> The basic idea is to build a linked list of all the parameters and
> local variables of a C function that are GC-able objects, and to save
> the head of that linked list in a global variable that the GC knows
> about.

This is like maintaining a own copy of the stack.  But maybe it is the
only portable way.  It seems to me that most code of the 'boehm
collector'
deals with getting the values from the C stack and registers.
It would be nice to have a lib for that.

So the two simple ways of getting to the root set in C are:
  - Maintaining a copy of the stack with macros or constructors (c++)
  - Do not use the C stack at all and use your own.

Thomas
From: Wolfram Fenske
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176738818.824676.236090@y80g2000hsf.googlegroups.com>
·······@googlemail.com writes:

>> The basic idea is to build a linked list of all the parameters and
>> local variables of a C function that are GC-able objects, and to save
>> the head of that linked list in a global variable that the GC knows
>> about.
>
> This is like maintaining a own copy of the stack.  But maybe it is
> the only portable way.  It seems to me that most code of the 'boehm
> collector' deals with getting the values from the C stack and
> registers.  It would be nice to have a lib for that.

Keep in mind that the Boehm CG is conservative, the OCaml method is
precise.

> So the two simple ways of getting to the root set in C are:
>   - Maintaining a copy of the stack with macros or constructors
>     (c++)

I wonder how much overhead using C++ constructors would incur.  Or
using C++ for an interpreter in general.  But then again the speed of
an interpreter depends on a lot of other things as well, e. g. whether
it's a pure interpreter or if it compiles to byte-code or machine
code.

>   - Do not use the C stack at all and use your own.

How would you do that?  Maybe allocate your own "stack" on the heap
and have all your local variables live there?  Something like this?

--8<---------------cut here---------------start------------->8---
value *stack = (value*) malloc(A_LOT*sizeof(value));
int fill=0;

[...]

value*
my_func()
{
  int old_fill=fill;
  value *ret_val = &(stack[fill++]);

  *ret_val = cons(INT2FIXNUM(1), INT2FIXNUM(2));
  // do something that might trigger the GC

  fill=old_fill;
  return ret_val;
}
--8<---------------cut here---------------end--------------->8---

--
Wolfram Fenske

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
From: ·······@googlemail.com
Subject: Re: Lisp in C.
Date: 
Message-ID: <1177193188.007804.252940@o5g2000hsb.googlegroups.com>
> >   - Do not use the C stack at all and use your own.
> How would you do that?  Maybe allocate your own "stack" on the heap
> and have all your local variables live there?  Something like this?

Compile to byte code and allocate the stack on the heap.

If you are implementing a pure interpreter in C, then it is a /real/
pain to keep track of the pointers.  c++ classes could keep track
of this automatic.

Speed is not an issue.  Constructors could be compiled inline and my
naive implementation of environments slow everthing more down...

Here is the (dump) handle class:

struct obj {
        struct object * p;
        struct object **stack_position;

        obj() : p(0) {
                stack_position = push(p);
        }
        obj(const obj& x) {
                p=x.p;
                stack_position = push(p);
        }
        obj(struct object * x) : p(x) {
                stack_position = push(p);
        }
        ~obj() {
                pop();
        }
        operator struct object*() const { return p; }
        struct object& operator*() const { return *p; }
        struct object * operator->() const { return p; }

        obj& operator=(const obj& rhs) {
                p=rhs.p;
                if (stack_position)
                        *stack_position=p;
                return *this;
        }
};


cheers,
thomas
From: Wolfram Fenske
Subject: Re: Lisp in C.
Date: 
Message-ID: <1177278964.102075.134060@y80g2000hsf.googlegroups.com>
·······@googlemail.com writes:

>> >   - Do not use the C stack at all and use your own.
>> How would you do that?  Maybe allocate your own "stack" on the heap
>> and have all your local variables live there?  Something like this?
>
> Compile to byte code and allocate the stack on the heap.

I see.  Actually, that's what I do in my byte-code interpreter. :-)
But for functions that are written in C which might trigger the GC
(e. g. APPEND, LIST, anything that uses FUNCALL), I use the scheme I
described earlier.

> If you are implementing a pure interpreter in C, then it is a /real/
> pain to keep track of the pointers.

I wrote a pure interperter in C because I wanted to be able to
bootstrap using just a C compiler.  I use the interpreter to run the
byte-code compiler, which is written in my dialect Lisp.  And so on.

> c++ classes could keep track of this automatic.
>
> Speed is not an issue.  Constructors could be compiled inline

Does using C++ for Lisp objects mean that you you have to call "new"
and "delete" all the time?  AFAIK a good GC should perform much better
than that.

> and my naive implementation of environments slow everthing more
> down...

I rewrote my environments implementation a couple of times too.  The
fastest implementation for dynamic environments should be function and
value cells directly in the symbols.  However, I want to make my Lisp
multi-threaded at some point, so directly modifying the symbols didn't
seem like a good idea.  That's why I'm using hash tables for the
dynamic environment now.  The keys are the symbols, the values are
resizable vectors.  The last element of one of these vectors is the
current value of the symbol.

--
Wolfram Fenske

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
From: ·······@googlemail.com
Subject: Re: Lisp in C.
Date: 
Message-ID: <1177347555.392734.152240@b75g2000hsg.googlegroups.com>
> Does using C++ for Lisp objects mean that you you have to call "new"
> and "delete" all the time?  AFAIK a good GC should perform much better
> than that.

I use c++ to implement smart-pointers to manage the sack.  So instead
of
writing

lispobject * foo(lispobject * a, lispobject * b)
{
         push(a);
         push(b);

        /* do something, allocate new object etc... */

       pop();
       pop();
       return ...;
}

I can use the smart pointer instead.

--
Thomas Dickreiter
From: Rob Warnock
Subject: Re: Lisp in C.
Date: 
Message-ID: <aJ6dndBxTfugv77bnZ2dnUVZ_v-tnZ2d@speakeasy.net>
Wolfram Fenske <·····@gmx.net> wrote:
+---------------
| The basic idea is to build a linked list of all the parameters and
| local variables of a C function that are GC-able objects, and to save
| the head of that linked list in a global variable that the GC knows
| about.  The global variable, called "caml_local_roots", is initialized
| with NULL.  Each C function that deals with GC-able objects has to
| follow this protocol... [snip]
+---------------

The Wheel of Reinvention rolls around once again!! That's *exactly* how
the precise GC for the "Elk" dialect of Scheme did it in the mid-1990's.
A typical piece of C code:

    /* 
     * (make-iota-vector 5) ==> #(0 1 2 3 4)
     */
    Object
    make_iota_vector(int argc, Object argv[])
    {
	unsigned i, n;
	Object retval;
	GC_Node;

	ASSERT(argc >= 1); /* XXX Need to call "error" */
	n = Get_Exact_Unsigned(argv[0]);

        retval = Make_Vector(n, False);
        /* Now we need to protect "retval" from allocations that might be
         * done by the following series of Make_Unsigned_Long()
         */
        GC_Link(retval);
        for (i = 0; i < n; ++i) {
          /* See p.22 of "Build. Ext. Apps. w/ Elk" for why need this temp */
          Object gc_seq_temp = Make_Unsigned(i);
          VECTOR(retval)->data[i] = gc_seq_temp;
        }
        GC_Unlink;
        return retval;
    }

There were also versions of the macros protecting more than one object
[all args to "GC_Link*" have to be variable names of type Object]:

    GC_Node2 & GC_Link2(var1,var2)
    GC_Node3 & GC_Link3(var1,var2,var3)
    GC_Node4 & GC_Link4(var1,var2,var3,var4)
    ...

The "GC_Unlink" worked regardless of the size of "GC_Node*" in use.


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Wolfram Fenske
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176734856.443950.47050@y80g2000hsf.googlegroups.com>
····@rpw3.org (Rob Warnock) writes:

> Wolfram Fenske <·····@gmx.net> wrote:
> +---------------
> | The basic idea is to build a linked list of all the parameters and
> | local variables of a C function that are GC-able objects, and to save
> | the head of that linked list in a global variable that the GC knows
> | about.  The global variable, called "caml_local_roots", is initialized
> | with NULL.  Each C function that deals with GC-able objects has to
> | follow this protocol... [snip]
> +---------------
>
> That's *exactly* how the precise GC for the "Elk" dialect of Scheme
> did it in the mid-1990's.

That's interesting.

> The Wheel of Reinvention rolls around once again!!

As I wrote earlier, I didn't reinvent.  I stole the code, like any
good programmer should do. ;-) In this case from OCaml.  Development
on OCaml started in 1990.  OCaml is an extension of CAML, which,
according to Wikipedia, dates back to 1984.  So it may well be that
the Elk developers were the ones to reinvent the wheel.  CAML was
probably not the first interpreter to use this technique, either,
though ...

--
Wolfram Fenske

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
From: Ari Johnson
Subject: Re: Lisp in C.
Date: 
Message-ID: <m2647x4q6r.fsf@hermes.theari.com>
·······@googlemail.com writes:

> On Apr 15, 12:52 pm, Edi Weitz <········@agharta.de> wrote:
>> On 14 Apr 2007 19:11:52 -0700, "Wolfram Fenske" <····@gmx.net> wrote:
>>
>> > Also, it took me some time to find out how to keep track of which
>> > Lisp objects are one the C stack.  I finally found a neat idea in
>> > the OCaml sources.  If anyone's interested I can try to explain how
>> > it works.
>>
>> I am.
>
> Me too.  I did not find a nice way to find it.  I looked at the emacs
> source (alloc.c), but their method look a bit ugly.  Search for
> find_stack_direction.

I, too, am interested in any novel way of doing this. :)
From: verec
Subject: Re: Lisp in C.
Date: 
Message-ID: <4622263c$0$512$5a6aecb4@news.aaisp.net.uk>
On 2007-04-15 03:11:52 +0100, "Wolfram Fenske" <·····@gmx.net> said:

> verec <·····@mac.com> writes:
> 
>> On 2007-04-14 23:54:56 +0100, "Wolfram Fenske" <·····@gmx.net> said:
>> 
>>> But as someone else already said, the hard part isn't the
>>> interpreter per se, but the garbage collector
>> 
>> A simple mark and sweep is well known, close to trivial to implement,
>> and well documented everywhere :-)
> 
> Yes, in theory it's easy.  In practice you forget to mark one field in
> one of your data types, start getting random errors and segmentation
> faults several days later and then spend a day or two chasing dangling
> pointers.

Yes. That's the one piece of code you must get exactly right. OTOH,
my implementation was under 300 lines so it's not like you can't
wrap your head around the problem :)

Yes, there is also "support" code on the allocator side, but that
one was easy to debug quickly...(It works, or doesn't!)
--
JFB
From: John Thingstad
Subject: Re: Lisp in C.
Date: 
Message-ID: <op.tqss4vx8pqzri1@pandora.upc.no>
On Sun, 15 Apr 2007 00:54:56 +0200, Wolfram Fenske <·····@gmx.net> wrote:

>
> I can't comment on SICP, haven't read it.  But as someone else already
> said, the hard part isn't the interpreter per se, but the garbage
> collector (and call/cc if you want to implement that, too ;-)).  You
> also need a ton of library functions to make your language useful, but
> for the most part that's just work.  It's a lot of work, though.
>

Garbage collection may be hard, but reference counting is relativly simple
and should work rather well for a simple system. Particularly if you make
the variables have dynamic scope. (good old LISP)
With all of C there already you don't really need a large library I would  
think.
After all it just orcestrates the work of C functions.


-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
From: Wolfram Fenske
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176605426.417937.92780@n76g2000hsh.googlegroups.com>
"John Thingstad" <··············@chello.no> writes:

> On Sun, 15 Apr 2007 00:54:56 +0200, Wolfram Fenske <·····@gmx.net> wrote:
>
>> I can't comment on SICP, haven't read it.  But as someone else already
>> said, the hard part isn't the interpreter per se, but the garbage
>> collector (and call/cc if you want to implement that, too ;-)).  You
>> also need a ton of library functions to make your language useful, but
>> for the most part that's just work.  It's a lot of work, though.
>
> Garbage collection may be hard, but reference counting is relativly
> simple and should work rather well for a simple system.

Yes, it's simple to implement but IMO it's a total pain to use,
because you have to increment and decrement reference counts all over
the place.  If you forget just one reference increment or decrement,
you lose.  I started out using reference counting but then switched to
mark and sweep and it was *so* much better.  The only place I had to
worry about memory references anymore became the code of the GC
itself.

[...]

> With all of C there already you don't really need a large library I
> would think.  After all it just orcestrates the work of C functions.

Among other things, I guess that depends on which data types you want
to provide.  E. g. I find hash maps insanely useful, but the C
libraray doesn't buy me a lot there.  The same goes for strings.
Getting from an interpreter that evaluates

  (let ((x 1)
        (y 2))
    (cons x y))

correctly to something that is actually useful is a lot of work.

--
Wolfram Fenske

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
From: Malcolm McLean
Subject: Re: Lisp in C.
Date: 
Message-ID: <AbudnV3QHI47trzbRVnyigA@bt.com>
"John Thingstad" <··············@chello.no> wrote in message 
······················@pandora.upc.no...
On Sat, 14 Apr 2007 17:28:28 +0200, Malcolm McLean
<·········@btinternet.com> wrote:

>A adventure program doesn't sound very CPU intensive.
>Why not write the whole thing in Lisp?
> There is a GNU library called GUILE which implents a Scheme like
> Lisp. It is designed to be embedded in C programs.
> Also for Common Lisp fans there is ECL.
>So rather than reinvent the wheel perhaps one of these will do?
>(The language Lua is popular in gaming comuneties)
>
I want it to compile under a Microsoft Visual C++ environment without any 
third-party non-source add-ons. It's got graphics and a windowing system 
which is better handled under C. The problem was that the high-level logic 
got unmanageable. I was storing everything in linked lists, and realised 
that I was effectively implementing a buggy and homebrewed version of Lisp.

It seems to me that a Lisp interpreter shouldn't be too difficult to write. 
You don't even need to do the eval function in C. However using someone 
else's system might be a better idea.

-- 
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
From: ·······@googlemail.com
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176579586.382297.53390@y80g2000hsf.googlegroups.com>
> You don't even need to do the eval function in C.

How do you implement the interpreter without eval in C?

Regards,

Thomas
From: Malcolm McLean
Subject: Re: Lisp in C.
Date: 
Message-ID: <xpadnXXTh8Nr1LzbnZ2dnUVZ8s6gnZ2d@bt.com>
<·······@googlemail.com> wrote in message 
····························@y80g2000hsf.googlegroups.com...
>> You don't even need to do the eval function in C.
>
> How do you implement the interpreter without eval in C?
>
That's another thing I'm not quite sure about. Surely defun is a function 
that takes a name, a parameter list, and a set of operations as arguments, 
and is bascially just a wrapper to setq? Then eval is just a function that 
pulls out the names passed to defun, and operates on them. But you've got to 
start execution somewhere. eval can evaluate itself, but how to light the 
match?

-- 
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
From: ·······@googlemail.com
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176587767.012524.196380@d57g2000hsg.googlegroups.com>
In my opinion you _have_ to implement eval in the host language.
But this is not the hard part.  Just look it up in SICP.

The burdon is the GC.  Simple rule when you implement in C.
Push all arguments on the stack.  And pop it on return.  Ugly
but simple.

Currently I am implementing a simple interpreter in C for web
development.  If you like I will send you my stuff.


Regargs,

Olaf
From: John Thingstad
Subject: Re: Lisp in C.
Date: 
Message-ID: <op.tqslbyn1pqzri1@pandora.upc.no>
On Sat, 14 Apr 2007 21:30:59 +0200, Malcolm McLean  
<·········@btinternet.com> wrote:

>
> "John Thingstad" <··············@chello.no> wrote in message  
> ······················@pandora.upc.no...
> On Sat, 14 Apr 2007 17:28:28 +0200, Malcolm McLean
> <·········@btinternet.com> wrote:
>
>> A adventure program doesn't sound very CPU intensive.
>> Why not write the whole thing in Lisp?
>> There is a GNU library called GUILE which implents a Scheme like
>> Lisp. It is designed to be embedded in C programs.
>> Also for Common Lisp fans there is ECL.
>> So rather than reinvent the wheel perhaps one of these will do?
>> (The language Lua is popular in gaming comuneties)
>>
> I want it to compile under a Microsoft Visual C++ environment without  
> any third-party non-source add-ons. It's got graphics and a windowing  
> system which is better handled under C. The problem was that the  
> high-level logic got unmanageable. I was storing everything in linked  
> lists, and realised that I was effectively implementing a buggy and  
> homebrewed version of Lisp.
>
> It seems to me that a Lisp interpreter shouldn't be too difficult to  
> write. You don't even need to do the eval function in C. However using  
> someone else's system might be a better idea.
>

Apart from Fourth subsets LISP is probaly the simplest language to
write provided you are happy to interpret and you arn't to ambitious
about funtionality. This is because of the simple syntax and command rules.

; The Lisp defined in McCarthy's 1960 paper, translated into CL.
; Assumes only quote, atom, eq, cons, car, cdr, cond.
; Bug reports to ········@paulgraham.com.

(defun null. (x)
   (eq x '()))

(defun and. (x y)
   (cond (x (cond (y 't) ('t '())))
         ('t '())))

(defun not. (x)
   (cond (x '())
         ('t 't)))

(defun append. (x y)
   (cond ((null. x) y)
         ('t (cons (car x) (append. (cdr x) y)))))

(defun list. (x y)
   (cons x (cons y '())))

(defun pair. (x y)
   (cond ((and. (null. x) (null. y)) '())
         ((and. (not. (atom x)) (not. (atom y)))
          (cons (list. (car x) (car y))
                (pair. (cdr x) (cdr y))))))

(defun assoc. (x y)
   (cond ((eq (caar y) x) (cadar y))
         ('t (assoc. x (cdr y)))))

(defun eval. (e a)
   (cond
     ((atom e) (assoc. e a))
     ((atom (car e))
      (cond
        ((eq (car e) 'quote) (cadr e))
        ((eq (car e) 'atom)  (atom   (eval. (cadr e) a)))
        ((eq (car e) 'eq)    (eq     (eval. (cadr e) a)
                                     (eval. (caddr e) a)))
        ((eq (car e) 'car)   (car    (eval. (cadr e) a)))
        ((eq (car e) 'cdr)   (cdr    (eval. (cadr e) a)))
        ((eq (car e) 'cons)  (cons   (eval. (cadr e) a)
                                     (eval. (caddr e) a)))
        ((eq (car e) 'cond)  (evcon. (cdr e) a))
        ('t (eval. (cons (assoc. (car e) a)
                         (cdr e))
                   a))))
     ((eq (caar e) 'label)
      (eval. (cons (caddar e) (cdr e))
             (cons (list. (cadar e) (car e)) a)))
     ((eq (caar e) 'lambda)
      (eval. (caddar e)
             (append. (pair. (cadar e) (evlis. (cdr e) a))
                      a)))))

(defun evcon. (c a)
   (cond ((eval. (caar c) a)
          (eval. (cadar c) a))
         ('t (evcon. (cdr c) a))))

(defun evlis. (m a)
   (cond ((null. m) '())
         ('t (cons (eval.  (car m) a)
                   (evlis. (cdr m) a)))))

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
From: ············@gmail.com
Subject: Re: Lisp in C.
Date: 
Message-ID: <1176745002.491482.23790@p77g2000hsh.googlegroups.com>
On Apr 14, 12:30 pm, "Malcolm McLean" <·········@btinternet.com>
wrote:
> I want it to compile under a Microsoft Visual C++ environment without any
> third-party non-source add-ons.

ECL can be built with VC++ and this configuration has been tested.
Seriously, use ECL, don't waste your time unless you really really
enjoy learning how to write an interpreter ;P  I use ECL to implement
embedded CL in my C applications without any troubles.  ECL works on
pretty much any platform that supports the GNU build tools and has a C
compiler.

mfh
From: Malcolm McLean
Subject: Re: Lisp in C.
Date: 
Message-ID: <2q6dndGWrO3287fbnZ2dnUVZ8sqjnZ2d@bt.com>
"Malcolm McLean" <·········@btinternet.com> wrote in message 
·····································@bt.com...
> If you are implementing a Lisp in  C, you can have a structure
>
> typedef struct
> {
>  struct node *cdr;
>  struct node *car;
>  void *atom;
> } NODE;
>
> However this is a waste because if a node is an atom then cdr and car will 
> be null. However how do you distinguish between a list and an atom?
>
> Secondly, there is a nil object. Is it best to map this to memory location 
> zero, so a node with cdr = NULL is the terminal one ina list, or is it 
> best to create a nil node somewhere (with pointers to NULL or to itself?).
>
> The question isn't really about C and would apply to a Lisp intepreter 
> written in assembly. My plan however is to have an adventure program 
> basically written in
> C and compilable under C, but with the high-level logic written in Lisp.
>
Thanks to everyone who contributed to this thread.
It is very telling that when you ask about running Lisp on an underlying C 
platform, you get three or four offers of interpreters. That just wouldn't 
happen with most languages. 
From: Rob Warnock
Subject: Re: Lisp in C.
Date: 
Message-ID: <KZadnU5fg6d6LbfbnZ2dnUVZ_sDinZ2d@speakeasy.net>
Malcolm McLean <·········@btinternet.com> wrote:
+---------------
| It is very telling that when you ask about running Lisp on
| an underlying C platform, you get three or four offers of
| interpreters.
+---------------

Not just interpreters, but compilers as well!!

And *many* more than "three or four", if your definition
of "Lisp" is not restricted to ANSI Common Lisp...


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Ken Tilton
Subject: Re: Lisp in C.
Date: 
Message-ID: <DOAWh.1228$hR2.971@newsfe12.lga>
Rob Warnock wrote:
> Malcolm McLean <·········@btinternet.com> wrote:
> +---------------
> | It is very telling that when you ask about running Lisp on
> | an underlying C platform, you get three or four offers of
> | interpreters.
> +---------------
> 
> Not just interpreters, but compilers as well!!
> 
> And *many* more than "three or four", if your definition
> of "Lisp" is not restricted to ANSI Common Lisp...

Damn. I thought we had pushed them all into the sea.

:(

kenny

-- 
http://www.theoryyalgebra.com/

"Algebra is the metaphysics of arithmetic." - John Ray

"As long as algebra is taught in school,
there will be prayer in school." - Cokie Roberts

"Stand firm in your refusal to remain conscious during algebra."
    - Fran Lebowitz

"I'm an algebra liar. I figure two good lies make a positive."
    - Tim Allen