From: ·······@gmail.com
Subject: need fast table-based control flow
Date: 
Message-ID: <m3tzdh4msj.fsf@maple.ramandgita.com>
I am trying to implement a text file reader. For my application I need
the reader to be fast. I believe that a hand coded lexer might give me
the speed I need.

In C I would implement something like this:

  switch (ch)
  case 'A': ...
  case 'B': ...
  default: ...

I tried doing this is lisp with tagbody and an array of
labels:

  (tagbody
    (go (elt ch tbl))
    A
      ...
    B
      ...
    default
      ...)

But it appears that go does not eval its arguments.

I know about the "case" macro in lisp, but the hyperspec
says that "Each of the normal-clauses is then considered in turn."

Does this mean that the lisp case macro is required to check
each destination key sequentially? Am I going in the wrong direction?

-Ram

From: K Livingston
Subject: Re: need fast table-based control flow
Date: 
Message-ID: <670465e4-2838-4c00-96a9-dad910e4e3df@p25g2000hsf.googlegroups.com>
On Aug 19, 12:05 am, ·······@gmail.com wrote:
> I am trying to implement a text file reader. For my application I need
> the reader to be fast. I believe that a hand coded lexer might give me
> the speed I need.


Why not just use READ ?

Let it do the scanning/lexing for you.  If you need different syntax
you can usually adjust the readtable** so that it works well.  (I have
one mucked up so much it returns tokens for java like syntax!)

Anyway, you can try to roll your own, but for the time savings in
programming and speed, I bet you'll have a hard time beating READ.
(If it were me, I'd probably try that first.)  For example:


(cl:defpackage #:parser
    (:export #:parse))

(in-package #:parser)

(defun a ()
  (pprint "saw a"))

(defun b ()
  (pprint "saw b"))

(defun foobar ()
  (pprint "saw foobar"))

(defun parse (stream)
  (let ((*package* (find-package '#:parser)))
    (do ((sym (read stream nil stream)
              (read stream nil stream)))
        ((eq sym stream))
      (funcall (symbol-function sym)))))


CL-USER> (parser::parse (make-string-input-stream "b a foobar a"))

"saw b"
"saw a"
"saw foobar"
"saw a"
NIL


(yes under the hood there is some kind of hashing from string name to
the symbol table - but read in most implementations is pretty
efficient)


** Let's say it's comma delimited symbols you need to parse (a problem
I just had), you can trivially change the readtable to turn them into
spaces:

(defvar *comma-to-space-readtable*
  (let ((*readtable* (copy-readtable nil)))
    (set-syntax-from-char #\, #\Space)
    *readtable*))

Now you can just bind over *readtable* in parse to change it to the
*comma-to-space-readtable* and change how READ behaves.



Kevin
From: Rainer Joswig
Subject: Re: need fast table-based control flow
Date: 
Message-ID: <joswig-9E7B0F.08574519082008@news-europe.giganews.com>
In article <··············@maple.ramandgita.com>, ·······@gmail.com 
wrote:

> I am trying to implement a text file reader. For my application I need
> the reader to be fast. I believe that a hand coded lexer might give me
> the speed I need.
> 
> In C I would implement something like this:
> 
>   switch (ch)
>   case 'A': ...
>   case 'B': ...
>   default: ...
> 
> I tried doing this is lisp with tagbody and an array of
> labels:
> 
>   (tagbody
>     (go (elt ch tbl))
>     A
>       ...
>     B
>       ...
>     default
>       ...)
> 
> But it appears that go does not eval its arguments.
> 
> I know about the "case" macro in lisp, but the hyperspec
> says that "Each of the normal-clauses is then considered in turn."
> 
> Does this mean that the lisp case macro is required to check
> each destination key sequentially? Am I going in the wrong direction?

CASE is a macro. You can expand it and see what the implementation
does. Usually I would expect that it expands into
some form that uses COND.

 It could be that some clever implementation would
optimize it for some cases, but I have no idea which would do it.
You would need to ask the implementor(s) or macroexpand (or
even disassemble) the code.

the typical Lisp way would do something like this:

(funcall (aref table (char-code (read-char))) ...)

Where the table is an array of functions.
Isn't that fast enough?


> 
> -Ram

-- 
http://lispm.dyndns.org/
From: Pascal J. Bourguignon
Subject: Re: need fast table-based control flow
Date: 
Message-ID: <7cy72te554.fsf@pbourguignon.anevia.com>
·······@gmail.com writes:

> I am trying to implement a text file reader. For my application I need
> the reader to be fast. I believe that a hand coded lexer might give me
> the speed I need.
>
> In C I would implement something like this:
>
>   switch (ch)
>   case 'A': ...
>   case 'B': ...
>   default: ...

Notice that C compilers can and do generate successive tests,
depending on the set of constants, and the optimization level.

So if you don't care what code is generated in C, why do you care in
Lisp?  Just use CASE and trust the compiler writers.



> I tried doing this is lisp with tagbody and an array of
> labels:
>
>   (tagbody
>     (go (elt ch tbl))
>     A
>       ...
>     B
>       ...
>     default
>       ...)
>
> But it appears that go does not eval its arguments.
>
> I know about the "case" macro in lisp, but the hyperspec
> says that "Each of the normal-clauses is then considered in turn."
>
> Does this mean that the lisp case macro is required to check
> each destination key sequentially? Am I going in the wrong direction?

This looks good enough a table for me:

C/USER[154]> (disassemble (compile nil (lambda (x) (declare (type character x)) (case x ((#\0 #\1 #\2 #\3) 'digit) ((#\a #\b #\c #\d) 'letter) (otherwise 'unknown)))))
Disassembly of function NIL
(CONST 0) = #S(HASH-TABLE :TEST EXT:STABLEHASH-EQ (#\d . 4) (#\c . 4) (#\b . 4) (#\a . 4) (#\3 . 1) (#\2 . 1) (#\1 . 1) (#\0 . 1))
(CONST 1) = DIGIT
(CONST 2) = LETTER
(CONST 3) = UNKNOWN
1 required argument
0 optional arguments
No rest parameter
No keyword parameters
11 byte-code instructions:
0     (LOAD 1)
1     (JMPHASH 0 L10 #1=L4 #1# #1# #1# #2=L7 #2# #2# #2#)
4     L4
4     (CONST 1)                           ; DIGIT
5     (SKIP&RET 2)
7     L7
7     (CONST 2)                           ; LETTER
8     (SKIP&RET 2)
10    L10
10    (CONST 3)                           ; UNKNOWN
11    (SKIP&RET 2)
NIL


Otherwise, if you happen to use an implementation that doesn't do
that, well, why don't you just do what you say you want to do?


C/USER[159]> (disassemble 
               (compile nil
                 (lambda (x)
                   (declare (fixnum x))
                   (if (<= 0 x 3)
                      (funcall (aref #.(vector (lambda () 'zero) (lambda () 'one)  (lambda () 'two)) x))
                      (funcall (lambda () 'out-of-bounds))))))

Disassembly of function NIL
(CONST 0) = 0
(CONST 1) = 3
(CONST 2) = #(#<FUNCTION :LAMBDA NIL 'ZERO> #<FUNCTION :LAMBDA NIL 'ONE> #<FUNCTION :LAMBDA NIL 'TWO>)
(CONST 3) = OUT-OF-BOUNDS
1 required argument
0 optional arguments
No rest parameter
No keyword parameters
12 byte-code instructions:
0     (CONST&PUSH 0)                      ; 0
1     (LOAD&PUSH 2)
2     (CONST&PUSH 1)                      ; 3
3     (CALLSR&JMPIF 2 49 L10)             ; <=
7     (CONST 3)                           ; OUT-OF-BOUNDS
8     (SKIP&RET 2)
10    L10
10    (CONST&PUSH 2)                      ; #(# # #)
11    (LOAD&PUSH 2)
12    (CALLSR&PUSH 1 1)                   ; AREF
15    (FUNCALL 0)
17    (SKIP&RET 2)
NIL
C/USER[160]> 



-- 
__Pascal Bourguignon__
From: ····@sonic.net
Subject: Re: need fast table-based control flow
Date: 
Message-ID: <m3pro540gy.fsf@maple.ramandgita.com>
···@informatimago.com (Pascal J. Bourguignon) writes:

>
> Otherwise, if you happen to use an implementation that doesn't do
> that, well, why don't you just do what you say you want to do?
>
>
> C/USER[159]> (disassemble 
>                (compile nil
>                  (lambda (x)
>                    (declare (fixnum x))
>                    (if (<= 0 x 3)
>                       (funcall (aref #.(vector (lambda () 'zero) (lambda () 'one)  (lambda () 'two)) x))
>                       (funcall (lambda () 'out-of-bounds))))))
>

Thank you! It never occurred to me to have an array of functions! This will
work just fine for me.

-Ram
From: Barry Margolin
Subject: Re: need fast table-based control flow
Date: 
Message-ID: <barmar-6BFC70.00501020082008@newsgroups.comcast.net>
In article <··············@maple.ramandgita.com>, ····@sonic.net wrote:

> Thank you! It never occurred to me to have an array of functions! This will
> work just fine for me.

Interesting.  When I saw your subject line, I assumed you were asking 
about an array of functions.  That's what "table-based control flow" 
sounds like to me.

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
From: Duane Rettig
Subject: Re: need fast table-based control flow
Date: 
Message-ID: <o063pvpupw.fsf@gemini.franz.com>
Barry Margolin <······@alum.mit.edu> writes:

> In article <··············@maple.ramandgita.com>, ····@sonic.net wrote:
>
>> Thank you! It never occurred to me to have an array of functions! This will
>> work just fine for me.
>
> Interesting.  When I saw your subject line, I assumed you were asking 
> about an array of functions.  That's what "table-based control flow" 
> sounds like to me.

But 'tain't necessarily so:

CL-USER(1): (compile
              (defun foo (x)
                (declare (optimize speed (safety 0)))
                (case x
                  ((1 3 5 7 9) 'hi)
                  ((2 4 6 8 10) 'bye))))
FOO
NIL
NIL
CL-USER(2): (disassemble *)
;; disassembly of #<Function FOO>
;; formals: X
;; constant vector:
0: HI
1: BYE

;; code start: #x10009291b8:
   0: 48 c1 cf 03    ror	rdi,$3
   4: 48 83 ff 0a    cmp	rdi,$10
   8: 77 42          jnbe	76
  10: 48 8d 15 0b 00 leaq	rdx,[rip+11]        ; 28
      00 00 
  17: 48 63 04 ba    movslq	rax,[rdx+rdi*4+0])
  21: 48 03 c2       addq	rax,rdx
  24: ff e0          jmp	*eax
  26: 66 90          .align	4
  28: 30 00 00 00    .long	$48		; 0: 76
  32: 3a 00 00 00    .long	$58		; 1: 86
  36: 41 00 00 00    .long	$65		; 2: 93
  40: 3a 00 00 00    .long	$58		; 3: 86
  44: 41 00 00 00    .long	$65		; 4: 93
  48: 3a 00 00 00    .long	$58		; 5: 86
  52: 41 00 00 00    .long	$65		; 6: 93
  56: 3a 00 00 00    .long	$58		; 7: 86
  60: 41 00 00 00    .long	$65		; 8: 93
  64: 3a 00 00 00    .long	$58		; 9: 86
  68: 41 00 00 00    .long	$65		; 10: 93
  72: 00 00 00 00    .long	$0
  76: 49 8b ff       movq	rdi,r15
  79: f8             clc
  80: 4c 8b 74 24 10 movq	r14,[rsp+16]
  85: c3             ret
  86: 49 8b 7e 36    movq	rdi,[r14+54]        ; HI
  90: f8             clc
  91: eb f3          jmp	80
  93: 49 8b 7e 3e    movq	rdi,[r14+62]        ; BYE
  97: f8             clc
  98: eb ec          jmp	80
CL-USER(3): 

No function calls there...


-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182