From: Geoffrey Summerhayes
Subject: Writing a parser
Date: 
Message-ID: <o%BG9.1084$3J2.265382@news20.bellglobal.com>
I'm working on a program that takes instructions in the form
of strings (A-Z,0-9,=). A couple of examples of the kind of
instructions that need to be parsed are A234B15C10, A45B=FOO.
There are about 30 different instruction types. The instruction
type is determined by the letters and the position of the numbers.
The numbers themselves and everything after an = are the parameters
of the instruction.
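
To make the mapping concrete, here is a minimal sketch of how an
instruction string collapses into its pattern. It is separate from the
parser further down, and INSTRUCTION-PATTERN is a hypothetical helper
name used only for this illustration.

```lisp
;; Hypothetical helper, for illustration only: collapse each run of
;; digits to a single #\# and stop the pattern after the first #\=.
(defun instruction-pattern (input)
  (with-output-to-string (out)
    (loop with in-number = nil
          for ch across input
          do (cond ((digit-char-p ch)
                    ;; emit one #\# per run of digits
                    (unless in-number (write-char #\# out))
                    (setf in-number t))
                   (t
                    (setf in-number nil)
                    (write-char ch out)
                    ;; everything after = is a parameter, not pattern
                    (when (char= ch #\=) (loop-finish)))))))

(instruction-pattern "A234B15C10") ; => "A#B#C#"  (parameters: 234 15 10)
(instruction-pattern "A45B=FOO")   ; => "A#B="    (parameters: 45 "FOO")
```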

I borrowed the original algorithm for parsing the line from
a C version I wrote many moons ago, when I was first learning C,
and made a couple of modifications.

Here's what I wrote for proof-of-concept:

;;;; parser.lisp (load instruction.lisp first; it defines LEGAL-INSTRUCTIONS)

(defpackage "INS-PARSER" (:use "CL" "LEGAL-INSTRUCTIONS"))

(in-package "INS-PARSER")

(export 'execute-instruction)

(defun prep-instruction (input)
  "Separates a user instruction into an instruction
pattern and an argument list. Returns two values: the
list of arguments and the pattern as a string."
  (let ((arguments '())
        (pattern '()))
    (do ((pos 0))
        ((= pos (length input))
         (values (nreverse arguments)
                 (coerce (nreverse pattern) 'string)))
      (cond ((digit-char-p (elt input pos))
             (multiple-value-bind (number new-pos)
                 (parse-integer input :start pos :junk-allowed t)
               (setf pos new-pos)
               (push number arguments)
               (push #\# pattern)))
            ((char-equal (elt input pos) #\=)
             (push #\= pattern)
             (push (subseq input (1+ pos)) arguments)
             (setf pos (length input)))
            (t (push (elt input pos) pattern)
               (incf pos))))))

(defun execute-instruction (instruction)
  "Parse and execute one instruction."
  (multiple-value-bind (list name)
      (prep-instruction instruction)
    (multiple-value-bind (symbol status)
        (find-symbol name "LEGAL-INSTRUCTIONS")
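      ;; NAME is always a string here, so only the symbol is checked.
      ;; Calling SYMBOL-FUNCTION on a symbol with no function definition
      ;; is an error, so test with FBOUNDP first.
      (if (and (eq :external status)
               (fboundp symbol))
          (funcall (symbol-function symbol) list)
        ;; this will be replaced by some form of error handling
        (format t "~&~A pattern not found.~%" instruction)))))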

;;;; instruction.lisp

(defpackage "LEGAL-INSTRUCTIONS" (:use "CL"))

(in-package "LEGAL-INSTRUCTIONS")

(export 'A\#B\#)
(export 'A\#B\#B\#)

(defun A\#B\# (list)
  (format t "~&Single B:~A~%" list))

(defun A\#B\#B\# (list)
  (format t "~&Double B:~A~%" list))

The idea is that I can add a new instruction simply by defining a
function named after its pattern and exporting it from the
LEGAL-INSTRUCTIONS package, as long as I stick to the same format.
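
For comparison, here is a minimal sketch of a more explicit alternative:
a hash table keyed by the pattern string instead of a package lookup.
The names *INSTRUCTIONS*, DEFINSTRUCTION and DISPATCH-INSTRUCTION are
hypothetical and not part of the code above, and the handlers return
strings rather than printing so the results are easy to inspect.

```lisp
;; Hypothetical alternative, not part of the code above: keep handlers
;; in an EQUAL hash table keyed by the pattern string.
(defvar *instructions* (make-hash-table :test #'equal))

(defmacro definstruction (pattern (args) &body body)
  "Register a handler for PATTERN, e.g. \"A#B#\"."
  `(setf (gethash ,pattern *instructions*)
         (lambda (,args) ,@body)))

(definstruction "A#B#" (list)
  (format nil "Single B:~A" list))

(definstruction "A#B#B#" (list)
  (format nil "Double B:~A" list))

(defun dispatch-instruction (pattern arguments)
  "Call the handler registered for PATTERN, or return NIL if none."
  (let ((handler (gethash pattern *instructions*)))
    (when handler
      (funcall handler arguments))))

(dispatch-instruction "A#B#" '(234 15)) ; => "Single B:(234 15)"
(dispatch-instruction "Z#" '(1))        ; => NIL
```

The trade-off is an explicit registration step in place of EXPORT. In
return, the lookup is a plain string comparison and does not depend on
how the reader interned the symbol names.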

So... what I'm looking for is advice: is there a better way to parse an
instruction, are there improvements to make to the code, have I coded any
undefined behaviour, and does EXECUTE-INSTRUCTION fall into the "just
because you can, doesn't mean you should" category?

--
Geoff