About the relation between S-expressions and trees

From: Pat
Subject: About the relation between S-expressions and trees
Date: Mon, 09 Oct 2006 15:38:48 +0000
Message-ID: <1160408328.655088.229620@k70g2000cwa.googlegroups.com>

Hi,

I'm doing my PhD on some applications of genetic programming and I've a
question that concerns some theoretical aspect of LISP (I suppose). I
hope you can help, I've been searching a solution to this problem since
last week.

Let's imagine that A and B are binary operations and C is a binary
operation. After A we have a jump and the result is presented directly
to C. It is an S-expression, but now it's useful to show it like a
tree:

                 ------>----------
                |                 |
       In ->--|A|--->--|B|---->---|C|-->--Out


Anyway, the structure can be simplified as follows, avoiding the jumps
and replying the structure before the jump:

       In ->--|A|---->---------------
                                        |
       In ->--|A|---->-|B|--->----|C|->---Out


It is quite clear that both structures are equivalent, and it is quite
also quite clear that this way of proceed is general and can be used
always for forward trees. Anyway, although the situation seems to be
trivial, I've not found a theoretical demonstration of my statement.

Can anybody please help me? It's for my PhD thesis...

Re: About the relation between S-expressions and trees Kaz Kylheku
- Re: About the relation between S-expressions and trees Pat
  - Re: About the relation between S-expressions and trees Rob Thorpe
    - Re: About the relation between S-expressions and trees Pat
      - Re: About the relation between S-expressions and trees Rob Thorpe

From: Kaz Kylheku
Subject: Re: About the relation between S-expressions and trees
Date: Mon, 09 Oct 2006 17:21:46 +0000
Message-ID: <1160414506.645525.32450@m7g2000cwm.googlegroups.com>

On Oct 9, 8:38 am, "Pat" <···········@gmail.com> wrote:
> Hi,
>
> I'm doing my PhD on some applications of genetic programming and I've a
> question that concerns some theoretical aspect of LISP (I suppose). I

By the way, In a new Ph. D. thesis, you might want to distinguish
yourself by using the modern spelling "Lisp", and also be specifically
clear about what type of Lisp you are referring to. ANSI Common Lisp,
Scheme, etc.

> hope you can help, I've been searching a solution to this problem since
> last week.
>
> Let's imagine that A and B are binary operations and C is a binary
> operation.

But let's not restrict ourselves to binary relations in our sentence
structure, and simply say that ``A, B and C are binary operations.''

:)

I think you may have made a mistake here, as I reiterate below.

> After A we have a jump and the result is presented directly
> to C. It is an S-expression, but now it's useful to show it like a
> tree:

But where is your the S-expression?

>                  ------>----------
>                 |                 |
>        In ->--|A|--->--|B|---->---|C|-->--Out

This diagram is confusing. If A is a binary operation, it should have
two inputs. Do "in" and "out" represent control flow or data flow, or
both?

I can't imagine which S-expression this corresponds to.

Did you perhaps intend to write, "A and B are /unary/ operations, and C
is a binary operation?"

The arrows then become straightforward data dependencies.

The S-expression for the computation is then

(C (B (A IN)) (A IN))

The (A IN) part is known, in compiler terminology, as a "common
subexpression".

In Lisp, S-expressions are considered source code. You could take
advantage of a special notation in Common Lisp to factor the expression
so that the resulting object only has one copy of (A IN).

 (C (B #1=(A IN)) #1#)

The #1= causes the following object to be remembered by the reader,
associated with the integer 1. Lexically subsequent occurences of the
#1# notation is replaced by the object remembered under 1.

In Lisp source, this doesn't buy you anything, because if your compiler
recognizes and eliminates common subexpressions, it certainly does not
do so based on identity equivalence of pieces of the source code,
(except in the case of simple symbols which are necessarily equated by
identity).

If the function A has a side effect, then the two occurences of (A IN)
are not common subexpressions that can be merged into one, even if the
same object instance represents both. Re-using the same object for a
piece of source code has no special bearing on the semantics; it's sort
of a notational shorthand only, a way of saving memory.

From: Pat
Subject: Re: About the relation between S-expressions and trees
Date: Thu, 12 Oct 2006 12:57:15 +0000
Message-ID: <1160657835.292346.219120@i42g2000cwa.googlegroups.com>

Hi,

Thanks for answering. Actually, I made a couple of mistake in my
previous post, and other important information miss. Let me explain
what I mean:

1) First of all, I'm not working with Lisp, but with genetic
programming, a technique of artificial intelligence that "borrows" some
notation from Lisp (like S-expressions). So, please, forgive me if my
language about Lisp is not adequate of precise enough.

2) You've seen the point I want to know: you described my (bad drawn)
tree by this S:expression: (C (B (A IN)) (A IN)), saying that (A IN) is
a common subexpression (actually, A and B were unary operators ...
lapsus calami).

3) The question is: given a tree where subexpression are "hidden" (in
this case you have "unrolled" the tree identifying hte subexpression),
is there any technique to identify all the subexpression in the tree?

I hope that now the situation is clear and maybe (Kaz) or someone else
can halp me. Any comment is welcome

John

Kaz Kylheku ha scritto:

> On Oct 9, 8:38 am, "Pat" <···········@gmail.com> wrote:
> > Hi,
> >
> > I'm doing my PhD on some applications of genetic programming and I've a
> > question that concerns some theoretical aspect of LISP (I suppose). I
>
> By the way, In a new Ph. D. thesis, you might want to distinguish
> yourself by using the modern spelling "Lisp", and also be specifically
> clear about what type of Lisp you are referring to. ANSI Common Lisp,
> Scheme, etc.
>
> > hope you can help, I've been searching a solution to this problem since
> > last week.
> >
> > Let's imagine that A and B are binary operations and C is a binary
> > operation.
>
> But let's not restrict ourselves to binary relations in our sentence
> structure, and simply say that ``A, B and C are binary operations.''
>
> :)
>
> I think you may have made a mistake here, as I reiterate below.
>
> > After A we have a jump and the result is presented directly
> > to C. It is an S-expression, but now it's useful to show it like a
> > tree:
>
> But where is your the S-expression?
>
> >                  ------>----------
> >                 |                 |
> >        In ->--|A|--->--|B|---->---|C|-->--Out
>
> This diagram is confusing. If A is a binary operation, it should have
> two inputs. Do "in" and "out" represent control flow or data flow, or
> both?
>
> I can't imagine which S-expression this corresponds to.
>
> Did you perhaps intend to write, "A and B are /unary/ operations, and C
> is a binary operation?"
>
> The arrows then become straightforward data dependencies.
>
> The S-expression for the computation is then
>
> (C (B (A IN)) (A IN))
>
> The (A IN) part is known, in compiler terminology, as a "common
> subexpression".
>
> In Lisp, S-expressions are considered source code. You could take
> advantage of a special notation in Common Lisp to factor the expression
> so that the resulting object only has one copy of (A IN).
>
>  (C (B #1=(A IN)) #1#)
>
> The #1= causes the following object to be remembered by the reader,
> associated with the integer 1. Lexically subsequent occurences of the
> #1# notation is replaced by the object remembered under 1.
>
> In Lisp source, this doesn't buy you anything, because if your compiler
> recognizes and eliminates common subexpressions, it certainly does not
> do so based on identity equivalence of pieces of the source code,
> (except in the case of simple symbols which are necessarily equated by
> identity).
>
> If the function A has a side effect, then the two occurences of (A IN)
> are not common subexpressions that can be merged into one, even if the
> same object instance represents both. Re-using the same object for a
> piece of source code has no special bearing on the semantics; it's sort
> of a notational shorthand only, a way of saving memory.

From: Rob Thorpe
Subject: Re: About the relation between S-expressions and trees
Date: Thu, 12 Oct 2006 13:09:33 +0000
Message-ID: <1160658573.055670.267120@i42g2000cwa.googlegroups.com>

Pat wrote:
> Hi,
>
> Thanks for answering. Actually, I made a couple of mistake in my
> previous post, and other important information miss. Let me explain
> what I mean:
>
> 1) First of all, I'm not working with Lisp, but with genetic
> programming, a technique of artificial intelligence that "borrows" some
> notation from Lisp (like S-expressions). So, please, forgive me if my
> language about Lisp is not adequate of precise enough.
>
> 2) You've seen the point I want to know: you described my (bad drawn)
> tree by this S:expression: (C (B (A IN)) (A IN)), saying that (A IN) is
> a common subexpression (actually, A and B were unary operators ...
> lapsus calami).
>
> 3) The question is: given a tree where subexpression are "hidden" (in
> this case you have "unrolled" the tree identifying hte subexpression),
> is there any technique to identify all the subexpression in the tree?
>
> I hope that now the situation is clear and maybe (Kaz) or someone else
> can halp me. Any comment is welcome

You can use an algorithm similar to the one used to turn a graph into a
tree I mentioned to you on comp.programming.

* start at the leaves of the tree.
* Pass over each leaf entering each expression "(instruction arg1
arg2)" into a hash table.
* Now every place in the hash-table where you have more than one entry
represents redundancy
* Pass over the tree replacing each of the redundant expressions with a
reference to the first calculation.  That is you leave the first
expression in place and replace every subsequent one with a reference
to the first.

Then at the end possibly run the whole algorithm again, since there may
be common-subexpression remaining.

To make the above practical you need to take account of expression in
the parse tree that are conditional and may not be executed.  A book on
compilers will describe all this stuff.

From: Pat
Subject: Re: About the relation between S-expressions and trees
Date: Thu, 12 Oct 2006 16:09:05 +0000
Message-ID: <1160669344.977064.140390@m73g2000cwd.googlegroups.com>

Great, you're idea war absolutely right! It was so easy, my tree was no
more a tree but a graph, a directed acyclic graph (DAG) and it can be
always converted to a tree.

Thank you so much for your help

Rob Thorpe ha scritto:

> Pat wrote:
> > Hi,
> >
> > Thanks for answering. Actually, I made a couple of mistake in my
> > previous post, and other important information miss. Let me explain
> > what I mean:
> >
> > 1) First of all, I'm not working with Lisp, but with genetic
> > programming, a technique of artificial intelligence that "borrows" some
> > notation from Lisp (like S-expressions). So, please, forgive me if my
> > language about Lisp is not adequate of precise enough.
> >
> > 2) You've seen the point I want to know: you described my (bad drawn)
> > tree by this S:expression: (C (B (A IN)) (A IN)), saying that (A IN) is
> > a common subexpression (actually, A and B were unary operators ...
> > lapsus calami).
> >
> > 3) The question is: given a tree where subexpression are "hidden" (in
> > this case you have "unrolled" the tree identifying hte subexpression),
> > is there any technique to identify all the subexpression in the tree?
> >
> > I hope that now the situation is clear and maybe (Kaz) or someone else
> > can halp me. Any comment is welcome
>
> You can use an algorithm similar to the one used to turn a graph into a
> tree I mentioned to you on comp.programming.
>
> * start at the leaves of the tree.
> * Pass over each leaf entering each expression "(instruction arg1
> arg2)" into a hash table.
> * Now every place in the hash-table where you have more than one entry
> represents redundancy
> * Pass over the tree replacing each of the redundant expressions with a
> reference to the first calculation.  That is you leave the first
> expression in place and replace every subsequent one with a reference
> to the first.
>
> Then at the end possibly run the whole algorithm again, since there may
> be common-subexpression remaining.
>
> To make the above practical you need to take account of expression in
> the parse tree that are conditional and may not be executed.  A book on
> compilers will describe all this stuff.

From: Rob Thorpe
Subject: Re: About the relation between S-expressions and trees
Date: Thu, 12 Oct 2006 17:36:13 +0000
Message-ID: <1160674573.260365.4870@e3g2000cwe.googlegroups.com>

Pat wrote:
> Great, you're idea war absolutely right! It was so easy, my tree was no
> more a tree but a graph, a directed acyclic graph (DAG) and it can be
> always converted to a tree.

Yes, it's a graph once the algorithm has finished.

There is a bug in my suggestion though.  You have to take into account
when a value is assigned, if "foo" is assigned you have to remember
that after that happens then the common-subexpressions involving "foo"
aren't valid anymore because it has changed.

You can find quite a bit about this stuff in books on compiler design,
and lecture notes on compiler design online.