From: Szymon
Subject: smart compiler vs luser, part II.
Date: 
Message-ID: <87r7mxxpfh.fsf@eva.rplacd.net>
Hi.

I need help.

I just finished a simple ''parser''. It can split csv strings:

(let ((example "a,b,c,\"d,e,f,\"g,h"))
  (map nil
       (lambda (c) (princ c) (terpri))
       (split-csv-string example)))

output:

a
b
c
"d,e,f,"g
h
NIL

Unfortunately I got a warning (cmucl 19a):

; In: DEFUN SPLIT-CSV-STRING

;   (COND (# #) (# #) (# :NORMAL) (# :QUOTED))
; --> IF COND IF COND IF COND IF 
; ==>
;   (COND)
; Warning: A possible binding of STATE is not a (VALUES &OPTIONAL KEYWORD &REST T):
;   NIL
; 
; Compiling Top-Level Form: 

; Compilation unit finished.
;   1 warning

I cannot get rid of this (of course I can change the declaration... but I think
it's too radical).

code:

(let ((temp (make-array 10 :element-type 'character :fill-pointer t :adjustable t)))
  (declare (type string temp))
  (defun split-csv-string (csv-string
                           &aux (csv-length (length csv-string))
                                (simple? (if (simple-string-p csv-string) t)))
    (declare (optimize (speed 0)
                       (space 0)
                       (compilation-speed 0)
                       (safety 3)
                       (debug 3))
             (type string csv-string)
             (type (integer 0 #.array-dimension-limit) csv-length)
             (type boolean simple?))
    (unless (zerop csv-length)
      (setf (fill-pointer temp) 0)
      (labels ((--> (result state index)
                    (declare (type list result)
                             (type keyword state)
                             (type (integer 0 #.array-dimension-limit) index))
                    (if (eql index csv-length)
                        (nreverse (cons (copy-seq temp) result))
                      (let ((char (if simple?
                                      (schar csv-string index)
                                    (char csv-string index))))
                        (declare (type character char))
                        (--> (if (and (eql char #\,)
                                      (not (memq state '(:quoted :escaped :quoted&escaped))))
                                 (prog1 (cons (copy-seq temp) result)
                                   (setf (fill-pointer temp) 0))
                               (prog1 result
                                 (vector-push-extend char temp)))
                             (cond ((eq state :normal)
                                    (case char (#\" :quoted) (#\\ :escaped) (t state)))
                                   ((eq state :quoted)
                                    (case char (#\" :normal) (#\\ :quoted&escaped) (t state)))
                                   ((eq state :escaped) :normal)
                                   ((eq state :quoted&escaped) :quoted))
                             (1+ index))))))
        (--> nil :normal 0)))))

TIA, Szymon.

ps. any improvements, criticisms and comments are appreciated.

From: Barry Margolin
Subject: Re: smart compiler vs luser, part II.
Date: 
Message-ID: <barmar-C7D9DF.01363314112004@comcast.dca.giganews.com>
In article <··············@eva.rplacd.net>, Szymon <············@o2.pl> 
wrote:

> Hi.
> 
> I need help.
> 
> I just finished a simple ''parser''. It can split csv strings:
> 
> (let ((example "a,b,c,\"d,e,f,\"g,h"))
>   (map nil
>        (lambda (c) (princ c) (terpri))
>        (split-csv-string example)))
> 
> output:
> 
> a
> b
> c
> "d,e,f,"g
> h
> NIL
> 
> Unfortunately I got a warning (cmucl 19a):
> 
> ; In: DEFUN SPLIT-CSV-STRING
> 
> ;   (COND (# #) (# #) (# :NORMAL) (# :QUOTED))
> ; --> IF COND IF COND IF COND IF 
> ; ==>
> ;   (COND)
> ; Warning: A possible binding of STATE is not a (VALUES &OPTIONAL KEYWORD &REST T):
> ;   NIL
> ; 
> ; Compiling Top-Level Form: 
> 
> ; Compilation unit finished.
> ;   1 warning
> 
> I cannot get rid of this (of course I can change the declaration... but I think
> it's too radical).
> 
> code:
> 
> (let ((temp (make-array 10 :element-type 'character :fill-pointer t :adjustable t)))
>   (declare (type string temp))
>   (defun split-csv-string (csv-string
>                            &aux (csv-length (length csv-string))
>                                 (simple? (if (simple-string-p csv-string) t)))
>     (declare (optimize (speed 0)
>                        (space 0)
>                        (compilation-speed 0)
>                        (safety 3)
>                        (debug 3))
>              (type string csv-string)
>              (type (integer 0 #.array-dimension-limit) csv-length)
>              (type boolean simple?))
>     (unless (zerop csv-length)
>       (setf (fill-pointer temp) 0)
>       (labels ((--> (result state index)
>                     (declare (type list result)
>                              (type keyword state)
>                              (type (integer 0 #.array-dimension-limit) index))
>                     (if (eql index csv-length)
>                         (nreverse (cons (copy-seq temp) result))
>                       (let ((char (if simple?
>                                       (schar csv-string index)
>                                     (char csv-string index))))
>                         (declare (type character char))
>                         (--> (if (and (eql char #\,)
>                                       (not (memq state '(:quoted :escaped :quoted&escaped))))
>                                  (prog1 (cons (copy-seq temp) result)
>                                    (setf (fill-pointer temp) 0))
>                                (prog1 result
>                                  (vector-push-extend char temp)))
>                              (cond ((eq state :normal)
>                                     (case char (#\" :quoted) (#\\ :escaped) (t state)))
>                                    ((eq state :quoted)
>                                     (case char (#\" :normal) (#\\ :quoted&escaped) (t state)))
>                                    ((eq state :escaped) :normal)
>                                    ((eq state :quoted&escaped) :quoted))

I think it's complaining about the above COND expression.  If STATE 
doesn't have one of those four values, COND will return NIL, and this 
will be passed as the STATE argument to the recursive --> call.  The 
compiler apparently isn't smart enough to tell that all possible paths 
through the code will result in one of these values, so that the 
implicit default clause will never occur.
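
To make the fall-through concrete, here is a minimal sketch that pulls the same COND out into a standalone function. The name COND-NEXT-STATE and the :BOGUS state are made up for illustration and do not appear in the posted code.

```lisp
;; Hypothetical helper (not from the original post): the same COND,
;; isolated so the implicit default is easy to see.
(defun cond-next-state (state char)
  (cond ((eq state :normal)
         (case char (#\" :quoted) (#\\ :escaped) (t state)))
        ((eq state :quoted)
         (case char (#\" :normal) (#\\ :quoted&escaped) (t state)))
        ((eq state :escaped) :normal)
        ((eq state :quoted&escaped) :quoted)))

(cond-next-state :normal #\")   ; => :QUOTED
;; No clause matches, so COND returns NIL, and NIL is not a KEYWORD.
(cond-next-state :bogus #\a)    ; => NIL
(keywordp nil)                  ; => NIL
```

The compiler has to allow for that last path unless it can prove no such state ever reaches the COND.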

Anyway, I suggest you change it to an ECASE expression:

(ecase state
  (:normal ...)
  (:quoted ...)
  (:escaped ...)
  (:quoted&escaped ...))

By using ECASE, you tell the compiler that it can never fall through to 
a default case -- if STATE doesn't match any of those values an error 
will be signalled.
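
One possible way to fill in the ECASE skeleton, reusing the transitions from the posted COND. This is only a sketch: the function name ECASE-NEXT-STATE is invented here, and writing the default branches as :NORMAL and :QUOTED instead of STATE is a stylistic choice, not something the original code does.

```lisp
;; Hypothetical helper (name invented for this sketch).
(defun ecase-next-state (state char)
  (ecase state
    (:normal         (case char (#\" :quoted) (#\\ :escaped) (t :normal)))
    (:quoted         (case char (#\" :normal) (#\\ :quoted&escaped) (t :quoted)))
    (:escaped        :normal)
    (:quoted&escaped :quoted)))

(ecase-next-state :quoted #\\)  ; => :QUOTED&ESCAPED
;; An unknown state signals a TYPE-ERROR instead of returning NIL.
(handler-case (ecase-next-state :bogus #\a)
  (type-error () :signalled))   ; => :SIGNALLED
```

Every path through the form now either returns one of the four keywords or does not return at all.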

>                              (1+ index))))))
>         (--> nil :normal 0)))))
> 
> TIA, Szymon.
> 
> ps. any improvements, criticisms and comments are appreciated.

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
From: Kalle Olavi Niemitalo
Subject: Re: smart compiler vs luser, part II.
Date: 
Message-ID: <87lld4hoxp.fsf@Astalo.kon.iki.fi>
Szymon <············@o2.pl> writes:

> I cannot get rid of this (of course I can change the declaration...
> but I think it's too radical).

I don't think declaring (type keyword state) helps the compiler
any.  The value must still be represented as a pointer so it
takes the usual amount of memory, and all you're doing with it
is to compare it with EQ against a set of keywords, which can
be compiled the same way regardless of the type of the value.

The warning occurs because STATE is only known to be a KEYWORD on
entry to -->.  If the keyword is none of those in the COND, then
COND returns NIL, which will be the value of STATE on the next
iteration.  I think Stalin could infer that the value will always
be one of the keywords in the set, but apparently Python (CMUCL's
compiler) cannot.

If you want to keep declaring the type of STATE as something
stricter than SYMBOL, I think there are two ways:

(a) Use ECASE or CCASE rather than COND.  Then there will be no
    NIL default.

(b) Declare (type (member :normal :quoted :escaped :quoted&escaped)
    state).  Then Python should understand that some condition will
    match.

I'm still using CMUCL 18e, which doesn't warn in the first place,
so I don't know whether these changes would avoid the warning.
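
A minimal sketch of option (b), keeping the COND and narrowing the declared type instead. The type name CSV-STATE and the function name MEMBER-NEXT-STATE are assumptions made for this example. The sketch does not verify whether a given CMUCL release stops warning with this declaration.

```lisp
;; Hypothetical type name: exactly the four states the parser uses.
(deftype csv-state ()
  '(member :normal :quoted :escaped :quoted&escaped))

;; Hypothetical helper: same COND as in the posted code, but STATE is
;; declared as CSV-STATE rather than KEYWORD.
(defun member-next-state (state char)
  (declare (type csv-state state)
           (type character char))
  (cond ((eq state :normal)
         (case char (#\" :quoted) (#\\ :escaped) (t state)))
        ((eq state :quoted)
         (case char (#\" :normal) (#\\ :quoted&escaped) (t state)))
        ((eq state :escaped) :normal)
        ((eq state :quoted&escaped) :quoted)))

(typep :quoted 'csv-state)  ; => T
(typep nil 'csv-state)      ; => NIL
```

With the MEMBER type, the four COND tests cover every value the declaration allows, which is the fact a compiler would need in order to rule out the NIL default.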
From: Szymon
Subject: Re: smart compiler vs luser, part II.
Date: 
Message-ID: <877joomy27.fsf@eva.rplacd.net>
Thanks for the replies. It works :)

Regards, Szymon.
From: Szymon
Subject: csv splitter finished (was: smart compiler vs luser, part II).
Date: 
Message-ID: <87r7mwmlzu.fsf@eva.rplacd.net>
Szymon <············@o2.pl> writes:

> ... (if (and (eql char #\,)
>              (not (memq state '(:quoted :escaped :quoted&escaped)))))...

It was late...

(if (and (eq state :normal) (eql char #\,)) .....)

LOL...
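
A quick check that the two tests agree for the four states the parser can be in. The helper names are invented for this sketch, and it uses the standard MEMBER where the post uses MEMQ, since MEMQ is not part of ANSI Common Lisp.

```lisp
;; Hypothetical helpers, named only for this comparison.
(defun old-test (state)
  (not (member state '(:quoted :escaped :quoted&escaped))))

(defun new-test (state)
  (eq state :normal))

(defun tests-agree-p ()
  ;; (and x t) normalises any true value to T before comparing.
  (every (lambda (state)
           (eq (and (old-test state) t)
               (and (new-test state) t)))
         '(:normal :quoted :escaped :quoted&escaped)))

(tests-agree-p)  ; => T
```

The two tests differ only for values outside the four states, which the parser never produces.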

Btw, I wrote a better [perfect :P] version of the splitter:

(defun split-csv-s-string (csv-string &aux (last 0) (csv-length (length csv-string)))
  (declare (type simple-string csv-string)
           (type (integer 0 #.array-dimension-limit) last csv-length))
  (unless (zerop csv-length)
    (labels
        ((--> (result state index)
           (declare (type list result)
                    (type keyword state)
                    (type (integer 0 #.array-dimension-limit) index))
           (if (eql index csv-length)
               (nreverse (cons (subseq csv-string last) result))
             (let ((char (schar csv-string index)))
               (declare (type character char))
               (--> (if (and (eq state :normal) (eql char #\,))
                        (prog1 (cons (subseq csv-string last index) result)
                          (setq last (1+ index)))
                      result)
                    (block nil
                      (if (eq state :normal)
                          (progn
                            (if (eql char #\") (return :quoted))
                            (if (eql char #\\) (return :escaped))
                            (return :normal)))
                      (if (eq state :quoted)
                          (progn
                            (if (eql char #\") (return :normal))
                            (if (eql char #\\) (return :quoted&escaped))
                            (return :quoted)))
                      (if (eq state :escaped)
                          (return :normal))
                      :quoted)                ; :quoted&escaped
                    (1+ index))))))
      (--> nil :normal 0))))

Regards, Szymon.
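
For comparison, a minimal iterative sketch of the same state machine, combining the SUBSEQ approach above with the ECASE suggested earlier in the thread. The name SPLIT-CSV-LOOP is invented here and this is not the author's code. It accepts any string, not only simple strings, and it uses a loop because ANSI Common Lisp does not require tail-call elimination.

```lisp
;; Hypothetical iterative variant (name and structure are assumptions).
(defun split-csv-loop (csv-string)
  (declare (type string csv-string))
  (unless (zerop (length csv-string))
    (let ((state :normal)
          (start 0)       ; index where the current field begins
          (fields '()))
      (dotimes (index (length csv-string))
        (let ((char (char csv-string index)))
          ;; A comma ends a field only outside quotes and escapes.
          (when (and (eq state :normal) (eql char #\,))
            (push (subseq csv-string start index) fields)
            (setf start (1+ index)))
          ;; ECASE leaves no NIL default for the next state.
          (setf state
                (ecase state
                  (:normal (case char
                             (#\" :quoted) (#\\ :escaped) (t :normal)))
                  (:quoted (case char
                             (#\" :normal) (#\\ :quoted&escaped) (t :quoted)))
                  (:escaped :normal)
                  (:quoted&escaped :quoted)))))
      ;; Whatever follows the last separator is the final field.
      (push (subseq csv-string start) fields)
      (nreverse fields))))

(split-csv-loop "a,b,c,\"d,e,f,\"g,h")
;; => ("a" "b" "c" "\"d,e,f,\"g" "h")
```

Like the versions above, it returns NIL for an empty string and keeps quote and escape characters in the fields.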