From: ······@hushmail.com
Subject: Bad performance for PAIP example
Date: 
Message-ID: <0jih9uk22k35v0o3uedm4smf2fvqbs34lm@4ax.com>
I'm reading PAIP and in Chapter 2 it defines a context-free grammar for a
subset of English (see code below)

I wrote a slightly different version of the author's "generate" function,
which explicitly differentiates between terminal and non-terminal symbols of
the grammar. It's called "generate2" (see code below).

Although both functions look very similar, the performance is very different:

------------------------------------------------------------------------------------------------
CL-USER 5 > (time (dotimes (x 100000 'done)
                    (generate 'sentence)))
Timing the evaluation of (DOTIMES (X 100000 'DONE) (GENERATE 'SENTENCE))

user time    =     13.970
system time  =      0.000
Elapsed time =   0:00:14
Allocation   = 6320 bytes standard / 13218326 bytes fixlen
0 Page faults
DONE

CL-USER 6 > (time (dotimes (x 100000 'done)
                    (generate2 'sentence)))
Timing the evaluation of (DOTIMES (X 100000 'DONE) (GENERATE2 'SENTENCE))

user time    =      0.711
system time  =      0.010
Elapsed time =   0:00:01
Allocation   = 2840 bytes standard / 2202420 bytes fixlen
0 Page faults
DONE
--------------------------------------------------------------------------------------

Why is my function much faster, and where is "generate" creating all this
extra garbage?

Thanks

-- code starts here -------------------------------------------
;
; Grammars
;

(defparameter *simple-grammar*
  '((sentence -> (noun-phrase verb-phrase))
    (noun-phrase -> (Article Noun))
    (verb-phrase -> (Verb noun-phrase))
    (Article -> the a)
    (Noun -> man ball woman table)
    (Verb -> hit took saw liked))
  "A grammar for a trivial subset of English")

(defparameter *bigger-grammar*
  '((sentence -> (noun-phrase verb-phrase))
    (noun-phrase -> (Article Adj* Noun PP*) (Name) (Pronoun))
    (verb-phrase -> (Verb noun-phrase PP*))
    (PP* -> () (PP PP*))
    (Adj* -> () (Adj Adj*))
    (PP -> (Prep noun-phrase))
    (Prep -> to in by with on)
    (Adj -> big little blue green adiabatic)
    (Article -> the a)
    (Name -> Fernando Pat Kim Lee Terry Robin)
    (Noun -> man ball woman table)
    (Verb -> hit took saw liked)
    (Pronoun -> he she it these those that)))


(defvar *grammar* *simple-grammar*
  "The grammar used by generate. Initially this is *simple-grammar* but we can
switch to other grammars")

(defun set-grammar (grammar)
  "Sets the current grammar"
  (progn
    (setf *grammar* grammar)
    'done))

;
; Rule accessor functions
;
(defun rule-lhs (rule)
  "Left ahnd side of the rule"
  (first rule))

(defun rule-rhs (rule)
  "Right hand side of the rule"
  (rest (rest rule)))

(defun rewrites (category)
  "Return a list of possible rewrites for this category"
  (rule-rhs (assoc category *grammar*)))

(defun random-elt (choices)
  "Choose an element from a list at random"
  (elt choices (random (length choices))))

;
; Generators
;
(defun generate (phrase)
  "Generate a random sentence or phrase"
  (let ((choices (rewrites phrase)))
    (cond
     ((listp phrase) 
      (mapcan #'generate phrase))
     ((consp choices) 
      (generate (random-elt choices)))
     (t (list phrase)))))

(defun generate2 (phrase)
  "Explicitly differentiates between terminal symbols and non-terminal
symbols"
  (cond
   ((terminal? phrase)
    (list phrase))
   ((atom phrase)
    (mapcan #'generate2 (rewrites phrase)))
   (t (generate2 (random-elt (rewrites phrase))))))

(defun terminal? (token)
  "Determines ia given token is a terminal symbol (no rewrites) or not"
  (null (rewrites token)))
-------------------------------------------------------------------------------------



----
Fernando Rodr�guez
frr at wanadoo dot es
-------

From: Nils Goesche
Subject: Re: Bad performance for PAIP example
Date: 
Message-ID: <a7akm9$jc2o1$1@ID-125440.news.dfncis.de>
In article <··································@4ax.com>, ······@hushmail.com wrote:
> 
> I'm reading PAIP and in Chapter 2 it defines a context-free grammar for a
> subset of English (see code below)
> 
> I wrote a slightly different version of the author's "generate" function,
> which explicitly differentiates between terminal and non-terminal symbols of
> the grammar. It's called "generate2" (see code below).
> 
> Although both functions look very similar, the performance is very different:

[snip]

Maybe you should test your code before timing it:

GRAMMAR 12 > (generate 'sentence)
(THE MAN SAW A TABLE)

GRAMMAR 13 > (generate2 'sentence)
((NOUN-PHRASE VERB-PHRASE))

So, they don't seem to do the same thing :-)  Or am I doing something
wrong?

Regards,
-- 
Nils Goesche
"Don't ask for whom the <CTRL-G> tolls."

PGP key ID 0x42B32FC9
From: Joe Marshall
Subject: Re: Bad performance for PAIP example
Date: 
Message-ID: <iu7m8.33879$44.10308989@typhoon.ne.ipsvc.net>
<······@hushmail.com> wrote in message
·······································@4ax.com...
>
> Although both functions look very similar, the performance is very
different:
>
> ;
> ; Generators
> ;
> (defun generate (phrase)
>   "Generate a random sentence or phrase"
>   (let ((choices (rewrites phrase)))
>     (cond
>      ((listp phrase)
>       (mapcan #'generate phrase))
>      ((consp choices)
>       (generate (random-elt choices)))
>      (t (list phrase)))))
>
> (defun generate2 (phrase)
>   "Explicitly differentiates between terminal symbols and non-terminal
> symbols"
>   (cond
>    ((terminal? phrase)
>     (list phrase))
>    ((atom phrase)
>     (mapcan #'generate2 (rewrites phrase)))
>    (t (generate2 (random-elt (rewrites phrase))))))

Generate executes the mapcan clause when the phrase
is a list.  Generate2 executes the mapcan clause when
the phrase is *NOT* a list.
From: Thomas A. Russ
Subject: Re: Bad performance for PAIP example
Date: 
Message-ID: <ymi7ko4e03p.fsf@sevak.isi.edu>
······@hushmail.com writes:

> I'm reading PAIP and in Chapter 2 it defines a context-free grammar for a
> subset of English (see code below)
> 
> I wrote a slightly different version of the author's "generate" function,
> which explicitly differentiates between terminal and non-terminal symbols of
> the grammar. It's called "generate2" (see code below).
> 
> Although both functions look very similar, the performance is very different:

But a simple test of your code indicates that they produce rather
different results:

  (generate 'sentence)  => (A MAN SAW THE BALL)
  (generate2 'sentence) => ((NOUN-PHRASE VERB-PHRASE))

That would be a good explanation of why you get different runtimes.  A
quick comparison of your functions indicates that the COND tests in
GENERATE sometimes use PHRASE and sometimes CHOICES to decide what
branch to take.  GENERATE2 also doesn't get applied to all elements of a
list (like the LISTP option).  That is why the rewrites don't go all the
way to the bottom, but rather stop at the first level.

-- 
Thomas A. Russ,  USC/Information Sciences Institute          ···@isi.edu