Symbolic Expression Counting

From: ········@my-dejanews.com
Subject: Symbolic Expression Counting
Date: Fri, 16 Oct 1998 00:00:00 +0000
Message-ID: <707rpr$u0a$1@nnrp1.dejanews.com>

This is the interloper goofing off at work again :-)

I just ordered "On Lisp" and "Paradigms of Artificial Intelligence
Programming" (Norvig) from amazon dot com.  With all this money I'm throwing
at books, I better just resign myself to using Lisp from now on :-)

Anyway, to the subject.  I would like to measure the size of Lisp programs in
general and Lisp programs I write in particular.  Lines of code is a horrible
metric for reasons I won't bother to go into because I assume you all agree
with me.  As you can guess from the subject, I wish to use the s-exp as my
metric. So I want to write a lisp function that can count all the s-exps in a
program (or environment).

I don't know if this is a good exercise for one beginning lisp or not.  Also,
I would kind of expect such a utility to already exist.  If such a utility
exists, I would love it if someone mails me the lisp code for it.

I've got my lisp environment mostly set up.  I am using XEmacs in ilisp mode
and cmucl.  I haven't got clx loaded yet.  I think it is just a matter of rtfm
to figure that out.  RPM wouldn't let me install clm because of a dependency
issue.  If I don't need it for Garnet, I don't care.  So really, I am ready to
start going through Graham's "ANSI Common Lisp".

Today is Friday.  I am off to the races.



From: Dobes Vandermeer
Subject: Re: Symbolic Expression Counting
Date: Fri, 16 Oct 1998 00:00:00 +0000
Message-ID: <3627A224.EC28FE9E@mindless.com>

········@my-dejanews.com wrote:
> 
> This is the interloper goofing off at work again :-)
> 
> I just ordered "On Lisp" and "Paradigms of Artificial Intelligence
> Programming" (Norvig) from amazon dot com.  With all this money I'm throwing
> at books, I better just resign myself to using Lisp from now on :-)
> 
> Anyway, to the subject.  I would like to measure the size of Lisp programs in
> general and Lisp programs I write in particular.  Lines of code is a horrible
> metric for reasons I won't bother to go into because I assume you all agree
> with me.  As you can guess from the subject, I wish to use the s-exp as my
> metric. So I want to write a lisp function that can count all the s-exps in a
> program (or environment).
> 
> I don't know if this is a good exercise for one beginning lisp or not.  Also,
> I would kind of expect such a utility to already exist.  If such a utility
> exists, I would love it if someone mails me the lisp code for it.
> 
> I've got my lisp environment mostly set up.  I am using XEmacs in ilisp mode
> and cmucl.  I haven't got clx loaded yet.  I think it is just a matter of rtfm
> to figure that out.  RPM wouldn't let me install clm because of a dependency
> issue.  If I don't need it for Garnet, I don't care.  So really, I am ready to
> start going through Graham's "ANSI Common Lisp".

Count the number of "(" characters that are not preceded by a "'";
that might come close.
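
A rough sketch of that heuristic, for what it's worth.  The function
name COUNT-OPEN-PARENS is made up for illustration, and the scan knows
nothing about strings, comments, or character literals such as #\(, so
the result is only an approximation:

```lisp
;; Hypothetical helper: count "(" characters in TEXT that are not
;; immediately preceded by a quote character.
(defun count-open-parens (text)
  (loop for i from 0 below (length text)
        count (and (char= (char text i) #\()
                   (or (zerop i)
                       (char/= (char text (1- i)) #\')))))

;; The quoted list '(a b) is skipped; the other three parens count.
(count-open-parens "(defun f (x) '(a b) (g x))")  ; => 3
```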

From: Michael Tuchman
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Fri, 16 Oct 1998 00:00:00 +0000
Message-ID: <wk4st4l23i.fsf@orsi.com>

········@my-dejanews.com writes:


> I just ordered "On Lisp" and "Paradigms of Artificial Intelligence
> Programming" (Norvig) from amazon dot com.  With all this money I'm throwing
> at books, I better just resign myself to using Lisp from now on :-)

Uhhmm. don't you mean "look forward to weeks and weeks of unending
intellectual bliss?" :-)

> 
> [intro snipped] As you can guess from the subject, I wish to use the
> s-exp as my metric. So I want to write a lisp function that can
> count all the s-exps in a program (or environment).
> 
> I don't know if this is a good exercise for one beginning lisp or
> not. Also,


I think it is.  The only thing you have to be careful of is how you
signal your end of file.  This is how I did it - would like some
criticism.

-------------------------
---- spoiler below ------
-- if you don't want ----
---- to know, then ------
-------quit now ---------
-------------------------
(down 15 lines)















(defun count-forms (file)
   "count the number of (top level) forms in a file"
   ;; the gensym is helpful so that if NIL appears in a file, 
   ;; that will count it as a form instead of treating it as an eof
   ;; condition
  (WHEN (probe-file file)
   (with-open-file (instream file :direction :input)
     (do*  ((eof-value (gensym))
           (count 0 (unless (eq form eof-value) (+ 1 count)))
           (form (read instream nil eof-value) (read instream nil eof-value)))
         ((eq form eof-value) count)))))


Of course, you could narrow this to count only the defuns, defvars,
defclasses, etc.  
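
One way that narrowing might look, as a sketch only.  COUNT-DEFINITIONS
and its default operator list are assumptions for illustration, and it
takes a stream rather than a file name to keep the example short:

```lisp
;; Hypothetical variant: count only the top-level forms whose operator
;; is one of OPERATORS.  The operators are compared as symbols, so this
;; assumes the forms are read in a package that uses COMMON-LISP.
(defun count-definitions (stream &optional
                                   (operators '(defun defvar defclass)))
  (let ((eof (gensym))
        (*read-eval* nil))             ; do not run #. code while reading
    (loop for form = (read stream nil eof)
          until (eq form eof)
          count (and (consp form)
                     (member (car form) operators)))))

;; Two of the four top-level forms are definitions.
(with-input-from-string (s "(defvar *x* 1) (defun f () *x*) (print 3) nil")
  (count-definitions s))               ; => 2
```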

I'm sure you'll see examples involving LOOP.

From: David Steuber "The Interloper"
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Sat, 17 Oct 1998 00:00:00 +0000
Message-ID: <3627f526.341191948@news.newsguy.com>

On 16 Oct 1998 15:48:49 -0400, Michael Tuchman <·····@orsi.com>
claimed or asked:

% > I just ordered "On Lisp" and "Paradigms of Artificial Intelligence
% > Programming" (Norvig) from amazon dot com.  With all this money I'm throwing
% > at books, I better just resign myself to using Lisp from now on :-)
% 
% Uhhmm. don't you mean "look forward to weeks and weeks of unending
% intellectual bliss?" :-)

Uh, yeah, whatever :-)

% I think it is.  The only thing you have to be careful of is how you
% signal your end of file.  This is how I did it - would like some
% criticism.

Your domain is an ors of course.  Oh!  You mean the code you wrote!
:-^)

% (defun count-forms (file)
%    "count the number of (top level) forms in a file"
%    ;; the gensym is helpful so that if NIL appears in a file, 
%    ;; that will count it as a form instead of treating it as an eof
%    ;; condition
%   (WHEN (probe-file file)
%    (with-open-file (instream file :direction :input)
%      (do*  ((eof-value (gensym))
%            (count 0 (unless (eq form eof-value) (+ 1 count)))
%            (form (read instream nil eof-value) (read instream nil eof-value)))
%          ((eq form eof-value) count)))))

Well, I haven't a clue as to why it would work.  The suggestion that I
count unquoted '(' characters seemed reasonable to someone who knows
just enough Perl to threaten national security.  Let me see how much
of this code I can follow.

The function you defined would be called like (count-forms
"foo.lisp").

You then test that the file actually exists.  You open it for input
and assign the file handle to instream.  Then you go into a do loop.
I don't understand what the conditional is doing to see that it has
reached the end of the file.  I am used to while (!in.eof()){}.

I guess you are initializing a variable count to 0.  The function form
must be a predefined lisp function.  Why it takes two of the same
list, I don't know.

The function seems to return the number of s-exps you counted in the
file.

That is the extent of my lisp understanding so far.  I don't know why
it is do* and not just do.  I also don't know how much read is doing.
Does the nil prevent evaluation?

Except for the fact that I don't understand all the bits of the
function, it looks pretty good to me.  Even shorter than I expected.
I'll let the more experienced programmers do the real criticism :-)

Is it possible to do something similar with the current environment?
Or is the fact that everything is in internal format going to prevent
knowing how many 'forms' are contained?

You know, in C++, the function would be much larger.

--
David Steuber (ver 1.31.2a)
http://www.david-steuber.com
To reply by e-mail, replace trashcan with david.

So I have this chicken, see?  And it hatched from this egg, see?  But
the egg wasn't laid by a chicken.  It was cross-laid by a turkey.

From: David B. Lamkins
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Sat, 17 Oct 1998 00:00:00 +0000
Message-ID: <m_VV1.3216$es1.1794108@news.teleport.com>

In article <··················@news.newsguy.com> ,
········@david-steuber.com (David Steuber "The Interloper") wrote:

[snip]

>% (defun count-forms (file)
>%    "count the number of (top level) forms in a file"
>%    ;; the gensym is helpful so that if NIL appears in a file, 
>%    ;; that will count it as a form instead of treating it as an eof
>%    ;; condition
>%   (WHEN (probe-file file)
>%    (with-open-file (instream file :direction :input)
>%      (do*  ((eof-value (gensym))
>%            (count 0 (unless (eq form eof-value) (+ 1 count)))
>%            (form (read instream nil eof-value) (read instream nil eof-value)))
>%          ((eq form eof-value) count)))))
>
>Well, I haven't a clue as to why it would work.  The suggestion that I
>count unquoted '(' characters seemed reasonable to someone who knows
>just enough Perl to threaten national security.  Let me see how much
>of this code I can follow.
>
>The function you defined would be called like (count-forms
>"foo.lisp").
>

Yes.

>You then test that the file actually exists.  You open it for input
>and assign the file handle to instream.  Then you go into a do loop.
>I don't understand what the conditional is doing to see that it has
>reached the end of the file.  I am used to while (!in.eof()){}.
>
>I guess you are initializing a variable count to 0.  The function form
>must be a predefined lisp function.  Why it takes two of the same
>list, I don't know.
>
>The function seems to return the number of s-exps you counted in the
>file.
>
>That is the extent of my lisp understanding so far.  I don't know why
>it is do* and not just do.  I also don't know how much read is doing.
>Does the nil prevent evaluation?
>

The indentation is a little bit off, so it's less obvious at a glance, but
the first three lines of that do* form contain (name initializer step)
clauses.  The fourth line contains a termination condition and a result
value.  There is no body in this example -- all the work is done in the
(name init step) clauses.  It's do* because the (name init step) clauses need
to be evaluated in sequence: the 2nd and 3rd depend upon the value of
eof-value established in the 1st.  If it were do, the init forms would all be
evaluated before any of the variables is bound, and the step forms would
update the values all at the same time.
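
A small illustration of the difference may help.  The function names
DO*-DEMO and DO-DEMO are hypothetical, and the outer binding of A exists
only so that the DO version has something to refer to:

```lisp
;; DO* binds and steps sequentially: B always sees the new value of A.
(defun do*-demo ()
  (do* ((a 1 (+ a 1))
        (b (* a 10) (* a 10)))
       ((> a 2) (list a b))))

;; DO binds and steps in parallel: B's init form sees the OUTER A,
;; and B's step form sees the value A had before this step.
(defun do-demo ()
  (let ((a 0))
    (do ((a 1 (+ a 1))
         (b (* a 10) (* a 10)))
        ((> a 2) (list a b)))))

(do*-demo)  ; => (3 30)
(do-demo)   ; => (3 20)
```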

You'll get more out of this if you stop guessing and look it up. 
Fortunately, you can, at <http://www.harlequin.com/books/HyperSpec/>.

>Except for the fact that I don't understand all the bits of the
>function, it looks pretty good to me.  Even shorter than I expected.
>I'll let the more experienced programmers do the real criticism :-)
>
>Is it possible to do something similar with the current environment?
>Or is the fact that everything is in internal format going to prevent
>knowing how many 'forms' are contained?
>

Actually, you can get close using function-lambda-expression and
do-all-symbols.  Close, because an implementation is not mandated to return
a useful value from function-lambda-expression.
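
As a sketch of that idea, under some loud assumptions: COUNT-CONSES and
COUNT-PACKAGE-CONSES are made-up names, only symbols whose home package
is the given package are considered, and in many implementations
function-lambda-expression returns NIL for compiled functions, in which
case they contribute nothing to the total:

```lisp
(defun count-conses (tree)
  "Count the cons cells in TREE (assumed to be non-circular)."
  (if (consp tree)
      (+ 1 (count-conses (car tree)) (count-conses (cdr tree)))
      0))

(defun count-package-conses (package-designator)
  "Sum COUNT-CONSES over whatever source the implementation retained."
  (let ((pkg (find-package package-designator))
        (seen (make-hash-table :test #'eq))  ; a symbol may be visited twice
        (total 0))
    (do-all-symbols (sym total)
      (when (and (eq (symbol-package sym) pkg)
                 (not (gethash sym seen))
                 (fboundp sym)
                 (not (macro-function sym))
                 (not (special-operator-p sym)))
        (setf (gethash sym seen) t)
        (let ((source (function-lambda-expression (symbol-function sym))))
          (when source
            (incf total (count-conses source))))))))

(count-conses '(a (b c) d))  ; => 5
```

Calling (count-package-conses :cl-user) then gives a number whose
usefulness depends entirely on how much source the implementation keeps.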

>You know, in C++, the function would be much larger.

Yup.

---
David B. Lamkins <http://www.teleport.com/~dlamkins/>

From: ···@dur.mindspring.com
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Sun, 18 Oct 1998 00:00:00 +0000
Message-ID: <wk1zo5x140.fsf@dur.mindspring.com>

"David B. Lamkins" <········@teleport.com> writes:

> In article <··················@news.newsguy.com> ,
> ········@david-steuber.com (David Steuber "The Interloper") wrote:
> 

> You'll get more out of this if you stop guessing and look it up. 
> Fortunately, you can, at <http://www.harlequin.com/books/HyperSpec/>.
> 

Of course, I could have made it easier on him by making my code more
self documenting.  LISP certainly provides the tools to do so easily.

Especially since I'm the only LISP programmer where I work and C folks
have to read my code, the easier I can make it to read, the better.

I get so used to DO and assume everybody is fluent in its
workings. Naughty me.

Thank you to all who provided constructive criticism!

From: Kelly Murray
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Mon, 19 Oct 1998 00:00:00 +0000
Message-ID: <362BCBB4.7BC31BE0@IntelliMarket.Com>

Here's a perhaps overdone object-oriented solution
(if Common Lisp had FILE and STREAM classes...)

(defmethod count-forms ((f file))
 (with-open-file (stream f :direction :input
                           :if-does-not-exist nil)
   (count-forms stream)
   ))
   
(defmethod count-forms ((f stream))
  (loop with count = 0 
        for form = (read f nil nil)
        while form
        finally (return count)
     do	  
     (incf count (count-form form))
     ))

(defmethod count-form (f) 
 1)

Actually, I took interest in this because I REALLY need
something that is CVS-like, and will compare two lists
of forms, and indicate what the differences between the
two lists are, such as whether stuff is added, deleted, or changed.
If someone wants to tackle it, consider it a challenge.

-Kelly Murray  ···@intellimarket.com

From: Howard R. Stearns
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Tue, 20 Oct 1998 00:00:00 +0000
Message-ID: <362CE73A.62DBE731@elwood.com>

Kelly Murray wrote:
> 
> Here's a perhaps overdone object-oriented solution
> (if Common Lisp had FILE and STREAM classes...)

I guess I'm confused. CL DOES have string, pathname and stream classes. 
While it might be nice to have programmer-created, or at least
system-defined subclasses of these, the example here doesn't seem to
require it.

By the way, Kelly's elegant solution was kind of subtle in not calling
attention to the following:

+ :if-does-not-exist nil makes it unnecessary to use probe-file.

+ It provides the opportunity to count leaves, which is probably a
better measure of code size (not nec. complexity) than just counting
top-level forms.  Hint: put a recursive method on CONS.  
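
A minimal sketch of what that hint might turn into.  COUNT-FORM is
Kelly's name; COUNT-LEAVES and the decision to count a NIL element as
one leaf are assumptions made for this example:

```lisp
(defgeneric count-form (form)
  (:documentation "Return the number of leaves (atoms) in FORM."))

;; Any atom, including NIL when it appears as an element, is one leaf.
(defmethod count-form ((form t))
  (declare (ignore form))
  1)

;; The recursive method on CONS: count the car, then the rest of the
;; list.  The NIL that terminates a proper list is not counted.
(defmethod count-form ((form cons))
  (+ (count-form (car form))
     (if (null (cdr form))
         0
         (count-form (cdr form)))))

(defun count-leaves (stream)
  "Sum COUNT-FORM over every top-level form read from STREAM."
  (let ((eof (gensym))
        (*read-eval* nil))
    (loop for form = (read stream nil eof)
          until (eq form eof)
          sum (count-form form))))

(count-form '(defun foo () 1))  ; => 4
```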

I'm not saying anything about how to compare leaf counts to
lines-of-C-code.  However, by changing the read table, one could easily
get the leaf-count approach to work on C or other code so that solutions
can be compared between languages.  This is essentially 'wc', but might
be more accurate for certain tokens.  Besides, it also gives one the
opportunity to put a heavier weight on some kinds of forms, ...

> 
> (defmethod count-forms ((f file))
>  (with-open-file (stream f :direction :input
>                            :if-does-not-exist nil)
>    (count-forms stream)
>    ))
> 
> (defmethod count-forms ((f stream))
>   (loop with count = 0
>         for form = (read f nil nil)
>         while form
>         finally (return count)
>      do
>      (incf count (count-form form))
>      ))
> 
> (defmethod count-form (f)
>  1)
> 
> Actually, I took interest in this because I REALLY need
> something that is CVS-like, and will compare two lists
> of forms, and indicate what the differences between the
> two lists are, such as whether stuff is added, deleted, or changed.
> If someone wants to tackle it, consider it a challenge.
> 
> -Kelly Murray  ···@intellimarket.com

From: Matthias Hölzl
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Sun, 18 Oct 1998 00:00:00 +0000
Message-ID: <86r9w6k9qq.fsf@gauss.muc.de>

Michael Tuchman <·····@orsi.com> writes:

> ········@my-dejanews.com writes:
>
> > [intro snipped] As you can guess from the subject, I wish to use the
> > s-exp as my metric. So I want to write a lisp function that can
> > count all the s-exps in a program (or environment).
> > 
> > I don't know if this is a good exercise for one beginning lisp or
> > not. Also,

I think it is a good exercise for a Lisp beginner (but not too useful
as a metric for code complexity/size).

> I think it is.  The only thing you have to be careful of is how you
> signal your end of file.  This is how I did it - would like some
> criticism.

Please don't take the following comments about the style of the
function as personal criticism.  I would have written COUNT-FORMS a
little bit differently, and since you ask for comments about your code
I'll try to justify the choices that I would have made.

> (defun count-forms (file)
>    "count the number of (top level) forms in a file"
>    ;; the gensym is helpful so that if NIL appears in a file, 
>    ;; that will count it as a form instead of treating it as an eof
>    ;; condition
>   (WHEN (probe-file file)
>    (with-open-file (instream file :direction :input)
>      (do*  ((eof-value (gensym))
>            (count 0 (unless (eq form eof-value) (+ 1 count)))
>            (form (read instream nil eof-value) (read instream nil eof-value)))
>          ((eq form eof-value) count)))))

Wrapping the update form in an UNLESS form does not seem quite right.
Firstly, the exit condition of the loop is identical to the condition
in the UNLESS form, and the clause for COUNT precedes the one for FORM,
so the test is never true when the update form runs and the body of
the UNLESS is always executed.  Secondly, the UNLESS form returns nil
if the test evaluates to true, so you would not get the desired result
in that case.  (This test is a left-over from a version where the order
of the FORM and COUNT clauses was exchanged, isn't it?)

Personally I wouldn't use a DO* in this case but rather a DO enclosed
in a LET form, e.g.,

(defun count-forms (file)
   "count the number of (top level) forms in a file"
   ;; the gensym is helpful so that if NIL appears in a file, 
   ;; that will count it as a form instead of treating it as an eof
   ;; condition
  (when (probe-file file)
   (with-open-file (instream file :direction :input)
     (let ((eof-value (gensym)))
       (do ((count 0 (+ count 1))
            (form (read instream nil eof-value)
                  (read instream nil eof-value)))
           ((eq form eof-value) count))))))

This makes the code easier to read IMO since 

a] you don't have the reference to FORM in COUNT's update form before
   the binding for FORM is introduced syntactically (this forward
   reference is of course legal, I just find it somehow distracting
   to read),
b] you don't have to take the order dependencies between different
   init-forms into account and
c] you have made it clearer that EOF-VALUE is always the same.

Generally speaking, I try to bind only variables in a DO or DO* form
that are possibly updated in the body or by an update form, and I
prefer DO over DO* if possible.

I probably wouldn't return a nil value if the file is missing, I'd
rather add a keyword argument that specifies the return value for this
case or, since this is somewhat overkill for COUNT-FORMS, I'd let
WITH-OPEN-FILE raise an error.  Doing the error checking outside the
function has the benefit that it might need to be done less frequently
in cases where we have code that performs several operations on a
file, like

(let ((nr-of-forms (count-forms file))
      (complexity-of-forms (determine-complexities file))
      (other-measure (calculate-something file)))
 ...)

or we may safely omit the PROBE-FILE, e.g., for

(mapcar #'count-forms
        (directory #p"cmucl:src/compiler/*.lisp"))

More importantly, checking whether the file exists and returning nil
if it is missing does not simplify things for the client of our
function: Probably we want to add the number of forms for several
files, or use the result to print a message, or do something else with
the result, and then we have to check for a nil return value and either
substitute 0 or raise an error.  If the check is moved out of
COUNT-FORMS, then instead of

(or (count-forms file) 0)

we have to write

(if (probe-file file) (count-forms file) 0)

which is actually clearer in my opinion.  However I am fully aware
that many (or even most) programmers would prefer the error check in
COUNT-FORMS.

Another nit one could pick with the above code is that it does not work
in some weird cases, e.g., it fails for a file containing the following:

(defvar my-var 123)
(defun foo () (+ #.my-var 1))

Binding *read-eval* to nil would be one solution that generates a more
meaningful error message, another possibility would be to define

(defun count-forms (file)
   "Count the number of top level forms in a file."
   (with-open-file (instream file :direction :input)
     (let ((*readtable* (copy-readtable))
           (eof-value (gensym)))
       ;; Ignore the sharp-dot reader macro.
       (set-dispatch-macro-character #\# #\.
          (constantly nil))
       (do ((count 0 (+ count 1))
            (form (read instream nil eof-value)
                  (read instream nil eof-value)))
           ((eq form eof-value) count)))))

which ignores the sharp-dot reader macro.  To avoid mentioning the
`(read instream nil eof-value)' form twice one might consider using
the loop macro:

(defun count-forms (file)
   "Count the number of top level forms in a file."
   (with-open-file (instream file :direction :input)
     (let ((*readtable* (copy-readtable))
           (eof-value (gensym)))
       ;; Ignore the sharp-dot reader macro.
       (set-dispatch-macro-character #\# #\.
          (constantly nil))
       (loop
         for form = (read instream nil eof-value)
         until (eq form eof-value)
         ;; COUNT T rather than COUNT FORM, so that a top-level NIL
         ;; is counted as well.
         count t))))
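
For comparison, the *read-eval* alternative mentioned above might look
roughly like this.  The name COUNT-FORMS-SAFELY is hypothetical, and the
function takes a stream instead of a file name to keep the sketch short:

```lisp
(defun count-forms-safely (stream)
  "Count the top-level forms on STREAM with the #. reader macro disabled."
  (let ((*read-eval* nil)          ; #. now signals a READER-ERROR
        (eof-value (gensym)))
    (loop for form = (read stream nil eof-value)
          until (eq form eof-value)
          count t)))

;; A top-level NIL is counted like any other form.
(with-input-from-string (s "(defvar my-var 123) nil (foo)")
  (count-forms-safely s))          ; => 3
```

With this version a file containing #.my-var is rejected with a
reader-error instead of being counted.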

Regards,

  Matthias

From: David Steuber "The Interloper"
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Sun, 18 Oct 1998 00:00:00 +0000
Message-ID: <36295835.4687109@news.newsguy.com>

On 18 Oct 1998 02:13:33 +0200, ··@gauss.muc.de (Matthias Hölzl)
claimed or asked:

% I think it is a good exercise for a Lisp beginner (but not too useful
% as a metric for code complexity/size).

Why would the number of s-expressions in a program not be a good
measure of code complexity?

Line counting is very common, but simple formatting differences can
throw off such a metric.  One could count atoms, but I am thinking that
maybe that is too fine?

Another metric I am familiar with is token counting.

--
David Steuber (ver 1.31.2a)
http://www.david-steuber.com
To reply by e-mail, replace trashcan with david.

So I have this chicken, see?  And it hatched from this egg, see?  But
the egg wasn't laid by a chicken.  It was cross-laid by a turkey.

From: rusty craine
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Sat, 17 Oct 1998 00:00:00 +0000
Message-ID: <70bscl$poj$1@excalibur.flash.net>

David Steuber "The Interloper" wrote in message
<················@news.newsguy.com>...
>
>Line counting is very common, but simple formatting differences can
>throw off such a metric.  One could count atoms, but I am thinking that
>maybe that is too fine?
>
I'm not sure what complexity means, but one of the respondents to a
request for help from me shared the following:
(defun permutations-list (ls)
  (if (cdr ls)
      (mapcan
       (lambda (pp)
         (cons (cons (car ls) pp)
               (maplist (lambda (ta)
                          (nconc (ldiff pp (cdr ta))
                                 (list (car ls))
                                 (copy-list (cdr ta))))
                        pp)))
       (permutations-list (cdr ls)))
      (list ls)))

;;;btw i had to quote the lambda functions before it would run in muLisp

Took me (maybe i should admit this)  4 hours of "flow charting" before i
could really explain to someone what was happening (though it just took 15
minutes to get it running) .  Having not seen the function before "I" think
it is complex.  Not many lines,  not many atoms.....but flow chart it
out....pretty complex in my opinion if you have never seen it before.  If
there was a whole application of mapping lists, cons, cdrs with recursion, I
would never be able to understand it.  I guess this is to say some atoms
have got to be weighted to really represent complexity....at least for me.
(not to mention that recursion can add 100's of layers from one function)
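
A weighted count along those lines could be sketched like this.  The
name WEIGHTED-SIZE and every number in *OPERATOR-WEIGHTS* are invented
for illustration; nothing here is a validated complexity measure:

```lisp
(defparameter *operator-weights*
  '((mapcan . 3) (maplist . 3) (nconc . 2) (lambda . 2))
  "Hypothetical weights; any atom not listed counts as 1.")

(defun weighted-size (form)
  "Sum the weights of all atoms in FORM."
  (cond ((consp form)
         (+ (weighted-size (car form))
            (if (null (cdr form))
                0                  ; the end of a list adds nothing
                (weighted-size (cdr form)))))
        ((symbolp form)
         (or (cdr (assoc form *operator-weights*)) 1))
        (t 1)))

(weighted-size '(mapcan f xs))        ; => 5
(weighted-size '(nconc a (list b)))   ; => 5
```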

rusty

From: David Steuber "The Interloper"
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Sun, 18 Oct 1998 00:00:00 +0000
Message-ID: <362a3b1d.1525263@news.newsguy.com>

On Sat, 17 Oct 1998 23:57:39 -0500, "rusty craine"
<········@flash.net> claimed or asked:

% Took me (maybe i should admit this)  4 hours of "flow charting" before i
% could really explain to someone what was happening (though it just took 15
% minutes to get it running) .  Having not seen the function before "I" think
% it is complex.  Not many lines,  not many atoms.....but flow chart it
% out....pretty complex in my opinion if you have never seen it before.  If
% there was a whole application of mapping lists, cons, cdrs with recursion, I
% would never be able to understand it.  I guess this is to say some atoms
% have got to be weighted to really represent complexity....at least for me.
% (not to mention that recursion can add 100's of layers from one function)

Maybe complexity needs to be defined.  But the code wasn't so complex
as expressing something complex, concisely.  Lisp seems to be very
expressive.  Or is it concise?  Or terse?

I am trying to apply (in my mind) the theory that the more
expressions, tokens, whatever a program has, the more likely it is to
contain a programmer error.  This may be a baseless theory.  However,
when someone asks me why I want to write a complex program in Lisp
instead of some other language (like C++ or Java), I can hopefully
reply with, "because Lisp is more expressive and therefore more
appropriate for my needs."  Instead of, "sod off you stupid git!"

The definition of the following hypothetical function may be
exceedingly complex, and it performs a complex operation.  But the use
of the function based on the call looks pretty simple.  Here it is:

(re-sequence-dna-from-to marketing-person frog) ;; inspired by Dilbert

Error: requires increase in genetic information

--
David Steuber (ver 1.31.2a)
http://www.david-steuber.com
To reply by e-mail, replace trashcan with david.

So I have this chicken, see?  And it hatched from this egg, see?  But
the egg wasn't laid by a chicken.  It was cross-laid by a turkey.

From: David B. Lamkins
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Sun, 18 Oct 1998 00:00:00 +0000
Message-ID: <ADtW1.4229$es1.2611463@news.teleport.com>

In article <················@news.newsguy.com> , ········@david-steuber.com
(David Steuber "The Interloper") wrote:

[snip]

>I am trying to apply (in my mind) the theory that the more
>expressions, tokens, whatever a program has, the more likely it is to
>contain a programmer error.  This may be a baseless theory.  However,
>when someone asks me why I want to write a complex program in Lisp
>instead of some other language (like C++ or Java), I can hopefully
>reply with, "because Lisp is more expressive and therefore more
>appropriate for my needs."  Instead of, "sod off you stupid git!"
>

There are different kinds of complexity.  For example:

  - the inherent complexity of the problem statement
  - the complexity introduced by "inessential"
    additions to the problem statement, e.g. boundary
    conditions, constraints, generalizations
  - the complexity of the programmer's solution
  - the complexity introduced by the language, e.g.
    due to a mismatch between the domain expressible
    by the language and the domain required for the
    problem solution

Counting metrics (lines of code, tokens, expressions, control transfers,
etc.) may not produce a useful correlation to either complexity or
likelihood of errors.  The only correlation I've noticed is that shorter
programs tend to be more readable than longer ones, given equivalent
functionality and the assumption that the shorter program is shorter because
of improved abstraction rather than cleverer obfuscation.  You rarely get to
make this kind of comparison because programs tend to suffer from increased
functionality over their lifetime, and because most maintenance programmers
do not place a high priority on improving program structure (which would
tend to improve abstraction and reduce most counting metrics.)

>The definition of the following hypothetical function may be
>exceedingly complex, and it performs a complex operation.  But the use
>of the function based on the call looks pretty simple.  Here it is:
>
>(re-sequence-dna-from-to marketing-person frog) ;; inspired by Dilbert
>
>Error: requires increase in genetic information
>

I see that our world views coincide in at least this aspect <g>.

---
David B. Lamkins <http://www.teleport.com/~dlamkins/>

From: Kent M Pitman
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Mon, 19 Oct 1998 00:00:00 +0000
Message-ID: <sfwr9w58jah.fsf@world.std.com>

"David B. Lamkins" <········@teleport.com> writes:

> There are different kinds of complexity.  For example:

Thank you for making this point.

>   - the inherent complexity of the problem statement
>   - the complexity introduced by "inessential"
>     additions to the problem statement, e.g. boundary
>     conditions, constraints, generalizations
>   - the complexity of the programmer's solution

I'd subdivide this into "expressive complexity" and "algorithmic
complexity".  For example, a programmer may choose to use a
complicated procedure where a built-in exists to do the same thing.
There may be no loss of speed but there is still loss of time in both
writing and reading the code.  That's expressive complexity.
Then there are things like doing things the hard way vs doing
things the easy way.  Consider the competing ways of doing fib
(tail recursively vs truly recursively) as an example.
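
The two ways of doing fib mentioned above might look like the
following minimal sketch.  The function names are invented for
illustration, and note that Common Lisp does not require tail-call
elimination, so whether the second version runs in constant stack
space depends on the implementation and its settings.

```lisp
(defun fib-tree (n)
  "Truly (tree) recursive: two recursive calls per step, exponential time."
  (if (< n 2)
      n
      (+ (fib-tree (- n 1)) (fib-tree (- n 2)))))

(defun fib-tail (n)
  "Tail recursive: carries two running values, one call per step."
  (labels ((walk (k current next)
             (if (zerop k)
                 current
                 (walk (- k 1) next (+ current next)))))
    (walk n 0 1)))
```

Both return the same values; only the algorithmic complexity differs,
while the expressive complexity is roughly comparable.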

>   - the complexity introduced by the language, e.g.
>     due to a mismatch between the domain expressible
>     by the language and the domain required for the
>     problem solution

> Counting metrics (lines of code, tokens, expressions, control transfers,
> etc.) may not produce a useful correlation to either complexity or
> likelihood of errors.  The only correlation I've noticed is that shorter
> programs tend to be more readable than longer ones, given equivalent
> functionality and the assumption that the shorter program is shorter because
> of improved abstraction rather than cleverer obfuscation. 

I somewhat agree with your conclusion but hasten to point out (not for
your benefit but for the benefit of others who might not see it) that
this is only a rough rule of thumb.  Consider that sometimes the
introduction of temporary names makes programs more readable but
longer.  I often put in LET bindings and variables for quantities to
be used only once if the quantities themselves have names.

Also, a SETQ may be more compact than a LET binding, but can greatly
complicate things.  Ditto a GO statement in a TAGBODY may be more
compact than an equivalent LABELS, but there is a community of people
who think that LABELS is still more perspicuous as it imposes a bit
more structure.

Then again, maybe a hybrid criterion would work, for example:

 count the number of expressions,
 multiply by ceiling(n/3) where n is the average length of the 
  expressions (long expressions are as bad as deep ones),
 multiply by w/k where w is the number of symbols (words) whose name is
  longer than 1 letter and k is the number of symbols whose name is both
  longer than 1 letter and contains no vowels.

(In case I didn't say it, I think unreadable variable names contribute
 to code complexity, too.  ;-)
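
Taken literally, the hybrid criterion above could be sketched as
follows.  This is only an illustration under stated assumptions: all
function names are made up, Y is not treated as a vowel, NIL is not
counted as a symbol, and k is clamped to at least 1 because the
formula as written is undefined when every symbol contains a vowel.

```lisp
(defun list-length-loosely (list)
  "Number of elements in LIST, tolerating a dotted tail."
  (loop for tail = list then (cdr tail)
        while (consp tail)
        count t))

(defun collect-lists (form)
  "Return every list nested in FORM, including FORM itself."
  (if (atom form)
      '()
      (cons form
            (loop for tail = form then (cdr tail)
                  while (consp tail)
                  append (collect-lists (car tail))))))

(defun collect-symbols (form)
  "Return every non-NIL symbol occurring in FORM."
  (cond ((null form) '())
        ((symbolp form) (list form))
        ((atom form) '())
        (t (append (collect-symbols (car form))
                   (collect-symbols (cdr form))))))

(defun vowel-less-p (name)
  "True if the string NAME contains none of A, E, I, O, U."
  (notany (lambda (ch) (find ch "AEIOUaeiou")) name))

(defun hybrid-score (form)
  "expressions * ceiling(n/3) * w/k, with k clamped to 1 (an assumption)."
  (let* ((lists (collect-lists form))
         (expression-count (length lists))
         (average-length (if lists
                             (/ (reduce #'+ lists :key #'list-length-loosely)
                                expression-count)
                             0))
         (long-names (remove-if-not (lambda (name) (> (length name) 1))
                                    (mapcar #'symbol-name
                                            (collect-symbols form))))
         (w (length long-names))
         (k (count-if #'vowel-less-p long-names)))
    (* expression-count
       (ceiling average-length 3)
       (/ w (max k 1)))))

(hybrid-score '(setq cnt (length items))) ; => 8
```

In the sample call there are 2 expressions of average length 5/2, so
ceiling(n/3) is 1; w is 4 (SETQ, CNT, LENGTH, ITEMS) and k is 1 (CNT).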

From: Barry Margolin
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Mon, 19 Oct 1998 00:00:00 +0000
Message-ID: <KpyW1.14$2L4.899189@burlma1-snr1.gtei.net>

In article <···············@world.std.com>,
Kent M Pitman  <······@world.std.com> wrote:
>"David B. Lamkins" <········@teleport.com> writes:
>
>> There are different kinds of complexity.  For example:
>
>Thank you for making this point.

Folks, is this debate really appropriate for this group?  Computer
scientists have been doing research for decades on complexity metrics for
programs, and I suspect that comp.software-eng would be a better place to
discuss the merits of complexity analysis.

The issue that's appropriate for this group is how to implement a
particular metric with Lisp programs.

-- 
Barry Margolin, ······@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.

From: David B. Lamkins
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Mon, 19 Oct 1998 00:00:00 +0000
Message-ID: <ZuzW1.4480$es1.2788210@news.teleport.com>

In article <···················@burlma1-snr1.gtei.net> , Barry Margolin
<······@bbnplanet.com>  wrote:

>In article <···············@world.std.com>,
>Kent M Pitman  <······@world.std.com> wrote:
>>"David B. Lamkins" <········@teleport.com> writes:
>>
>>> There are different kinds of complexity.  For example:
>>
>>Thank you for making this point.
>
>Folks, is this debate really appropriate for this group?  Computer
>scientists have been doing research for decades on complexity metrics for
>programs, and I suspect that comp.software-eng would be a better place to
>discuss the merits of complexity analysis.
>
>The issue that's appropriate for this group is how to implement a
>particular metric with Lisp programs.

OK.  Perhaps I was being too circumspect in trying to explain why I believe
that counting methods are useless, each in its own unique way <g>.

To get back to Lisp, here's the core of the tool I use to "measure" my own
code.  I use this mostly to monitor code growth (or reduction) during
development, and as a quick check for an adequate balance of comments to
code.  (So far, I have not imposed this upon anyone, myself included, as a
metric.)

There's a full version, with a UI for MCL, on my web site (see my
.signature).

(defun stream-count-source-lines (stream)
  "For STREAM, a Lisp source text, return 
non-comment source line count, comment line count,
comment block count, and empty line count.

Assumes that block comments #| |# begin and end a line."
  (let ((ncsl-count 0)
        (comment-line-count 0)
        (comment-block-count 0)
        (empty-line-count 0)
        (in-comment nil)
        (in-block-comment nil))
    (flet ((get-line (stream)
             (let ((line (read-line stream nil nil)))
               (when line (string-trim '(#\Space #\Tab) line))))
           (empty-line-p (line)
             (zerop (length line)))
           (comment-line-p (line)
             (eql #\; (char line 0)))
           (block-comment-begin-p (line)
             (search "#|" line))
           (block-comment-end-p (line)
             (search "|#" line)))
      (do ((line (get-line stream) (get-line stream)))
          ((null line) (values
                        ncsl-count
                        comment-line-count
                        comment-block-count 
                        empty-line-count))
        (cond ((empty-line-p line)
               (incf empty-line-count))
              ((comment-line-p line)
               (incf comment-line-count)
               (unless in-comment
                 (setq in-comment t)
                 (incf comment-block-count)))
              ((block-comment-end-p line)
               (incf comment-line-count)
               (setq in-block-comment nil))
              ((block-comment-begin-p line)
               (setq in-block-comment t)
               (incf comment-line-count)
               (incf comment-block-count))
              (in-block-comment
               (incf comment-line-count))
              (t
               (setq in-comment nil)
               (incf ncsl-count)))))))

To clarify the meaning of the measured items:

A non-comment source line is a non-blank line containing more than just a
comment, although it may contain both code and a trailing comment.

A comment line is a line consisting entirely of comment text.  It could be
the #| or |# of a block comment, a line between the #| and |# pair, or a
line which has a ; as its first nonblank character.

A comment block is a consecutive run of comment lines.  The degenerate case,
a single line, is also a comment block.

An empty line has no nonblank characters.

---
David B. Lamkins <http://www.teleport.com/~dlamkins/>

From: Kent M Pitman
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Mon, 19 Oct 1998 00:00:00 +0000
Message-ID: <sfw1zo587z4.fsf@world.std.com>

"David B. Lamkins" <········@teleport.com> writes:

> (defun stream-count-source-lines (stream)
>   "For STREAM, a Lisp source text, return 
> non-comment source line count, comment line count,
> comment block count, and empty line count.
> 
> Assumes that block comments #| |# begin and end a line."
>   (let ((ncsl-count 0)
>         (comment-line-count 0)
>         (comment-block-count 0)
>         (empty-line-count 0)
>         (in-comment nil)
>         (in-block-comment nil))
>     (flet ((get-line (stream)
>              (let ((line (read-line stream nil nil)))
>                (when line (string-trim '(#\Space #\Tab) line))))
>            (empty-line-p (line)
>              (zerop (length line)))
>            (comment-line-p (line)
>              (eql #\; (char line 0)))
>            (block-comment-begin-p (line)
>              (search "#|" line))
>            (block-comment-end-p (line)
>              (search "|#" line)))

I don't suppose I could convince you to use

 (search "#\|" ...) and (search "|\#" ...)

so that your code doesn't mistakenly think part of
itself is a comment.  It doesn't affect the correctness
of the code--just the look.

From: David B. Lamkins
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Mon, 19 Oct 1998 00:00:00 +0000
Message-ID: <BFJW1.4777$es1.3053003@news.teleport.com>

In article <···············@world.std.com> , Kent M Pitman
<······@world.std.com>  wrote:

>"David B. Lamkins" <········@teleport.com> writes:
>
>> (defun stream-count-source-lines (stream)
>>   "For STREAM, a Lisp source text, return 
>> non-comment source line count, comment line count,
>> comment block count, and empty line count.
>> 
>> Assumes that block comments #| |# begin and end a line."
>>   (let ((ncsl-count 0)
>>         (comment-line-count 0)
>>         (comment-block-count 0)
>>         (empty-line-count 0)
>>         (in-comment nil)
>>         (in-block-comment nil))
>>     (flet ((get-line (stream)
>>              (let ((line (read-line stream nil nil)))
>>                (when line (string-trim '(#\Space #\Tab) line))))
>>            (empty-line-p (line)
>>              (zerop (length line)))
>>            (comment-line-p (line)
>>              (eql #\; (char line 0)))
>>            (block-comment-begin-p (line)
>>              (search "#|" line))
>>            (block-comment-end-p (line)
>>              (search "|#" line)))
>
>I don't suppose I could convince you to use
>
> (search "#\|" ...) and (search "|\#" ...)
>
>so that your code doesn't mistakenly think part of
>itself is a comment.  It doesn't affect the correctness
>of the code--just the look.
>

Darn, darn, darn!  Why can't we have a compiler that pays attention to
comments?

Yes, yours is a good suggestion.  However, my intent (see the last line of
the doc string) was to only recognize block comments that begin and end on
their own line.  (There were some _unstated_ assumptions, as well, that
became clear to me as I read the code again; more on this later.)

According to my intentions for the counter, this is a block comment:

#|
This is a block comment.
|#

but this is not:

(defun foo ()
  (let ( #| This is just some commented-out code.
            It gets counted as source lines.
        (bar 17) |#
        (baz 43))
    ...))

To honor this distinction, I should change the following tests.  (Note that
blanks are trimmed from the string earlier on.)

            (block-comment-begin-p (line)
              (eql 0 (search "#\|" line)))  ; changed
            (block-comment-end-p (line)
              (search "|\#" line))) ; changed

But this gives different results in (what should be) a similar situation.

(defun bar ()
  #| This is also commented-out code,
     but it is counted as a comment block.
  (baz 23)
  (qux 97) |# (foo 77) ; and this form is uncounted
  (raz 88))

To handle this consistently (i.e. equivalent to the embedded block comment
in the (defun foo () ...) form, above) would mean keeping track of
additional context.

I rarely use block comments to disable code, preferring #+:ignore
(form-to-be-ignored).  But if you happen to use #| |# to disable code, you
should be suspicious of the counts provided by my function.
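
As a small illustration of the two ways of disabling code mentioned
above, the following sketch shows that the Lisp reader skips both
kinds of disabled form, even though a line-based counter sees them
differently.  The helper name is made up, and the sketch assumes
:ignore is kept out of *features* (it removes it locally to be sure).

```lisp
(defun read-with-ignore-disabled (string)
  "Read one form from STRING with :IGNORE guaranteed absent from *FEATURES*."
  (let ((*features* (remove :ignore *features*)))
    (values (read-from-string string))))

;; The form guarded by #+:ignore is skipped by the reader ...
(read-with-ignore-disabled "(list 1 #+:ignore (disabled-form) 2)")
;; ... and so is the form inside the block comment.
(read-with-ignore-disabled "(list 1 #| (disabled-form) |# 2)")
```

Both calls return (LIST 1 2).  A line counter, on the other hand, will
count a multi-line #+:ignore form as ordinary source lines.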

---
David B. Lamkins <http://www.teleport.com/~dlamkins/>

From: Pierre Mai
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Mon, 19 Oct 1998 00:00:00 +0000
Message-ID: <87iuhgxn6r.fsf@dent.isdn.cs.tu-berlin.de>

Sam Steingold <···@goems.com> writes:

> Please excuse my ignorance, but why not use something like
> 
> (defun count-sexps (form)
>   (if (or (atom form) (eq 'quote (car form))) 0
>       (reduce #'+ form :key #'count-sexps :initial-value 1)))
> 
> (defun code-complexity (file)
>   (with-open-file (str file :direction :input)
>     (do ((form (read str nil +eof+) (read str nil +eof+)) (cc 0))
>         ((eq form +eof+) cc)
>       (incf cc (count-sexps form)))))
> 
> (count-sexps '(+ 1 2 (* 3 4))) ==> 2
> 
> 

Amongst other things, forms are not always proper lists:

(count-sexps '(loop for (demo . rest) in demolist collect rest))

==> Error somewhere along the line ;)

Also you will get problems if the file you are counting defines and
uses (via eval-when) an extended read-table.  Generally, counting
sexps in a file is not as easy a proposition as it first sounds...
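
The dotted-list problem, at least, can be avoided by walking the
conses by hand instead of using REDUCE.  A minimal sketch, keeping the
counting rules of the quoted count-sexps; the name count-sexps-safe is
made up, and circular structure is still not handled:

```lisp
(defun count-sexps-safe (form)
  "Like the quoted COUNT-SEXPS, but tolerates dotted lists."
  (cond ((atom form) 0)
        ((eq 'quote (car form)) 0)
        (t (+ 1
              ;; Walk the spine cons by cons and stop at any non-cons
              ;; tail, so (demo . rest) does not signal an error.
              (loop for tail = form then (cdr tail)
                    while (consp tail)
                    sum (count-sexps-safe (car tail)))))))

(count-sexps-safe '(+ 1 2 (* 3 4)))                                  ; => 2
(count-sexps-safe '(loop for (demo . rest) in demolist collect rest)) ; => 2
```

This does nothing about the read-table issue, which arises before any
counting function ever sees the form.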

WRT the discussion on metrics:  Of course metrics are not absolute
measures like meters, and never will be.  But if used in a controlled
(personal) environment, with a sound statistical calibration, they
can become a very reliable guide for planning, testing, and
estimation.  For a gentle introduction to metrics, processes, etc. in
a personal setting, see "A Discipline for Software Engineering" by
Watts S. Humphrey.

Regs, Pierre.

-- 
Pierre Mai <····@cs.tu-berlin.de>	http://home.pages.de/~trillian/
  "Such is life." -- Fiona in "Four Weddings and a Funeral" (UK/1994)

From: Barry Margolin
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Tue, 20 Oct 1998 00:00:00 +0000
Message-ID: <j3RW1.5$Sg6.6819@burlma1-snr1.gtei.net>

In article <··············@dent.isdn.cs.tu-berlin.de>,
Pierre Mai  <····@acm.org> wrote:
>Also you will get problems if the file you are counting defines and
>uses (via eval-when) an extended read-table.  Generally, counting
>sexps in a file is not as easy a proposition as it first sounds...

I was going to point out how his code counts things that aren't really
sexps, for instance in a cond it will count each clause as an sexp, as well
as counting the conditional and each form in the consequence.

But given that any complexity metric like this is just an approximation of
real complexity, it's not really necessary to get all the details precisely
right.  It's like giving a measurement as "10.23456 +/- 0.1" -- what's the
point of all those extra decimal places of precision if they're beyond the
accuracy of the measurement?

As long as the metric is consistent, it can be used to compare programs.
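
To make the over-counting concrete, here is the count-sexps function
quoted earlier in the thread applied to a small COND form.  The sample
form is made up for illustration.

```lisp
(defun count-sexps (form)
  (if (or (atom form) (eq 'quote (car form))) 0
      (reduce #'+ form :key #'count-sexps :initial-value 1)))

;; Evaluable list forms here: the COND itself, (null x) and (f x),
;; i.e. 3.  The two clauses are lists too, so they add 2 more.
(count-sexps '(cond ((null x) 0)
                    (t (f x))))   ; => 5
```

The count is inflated by the same rule for every COND, which is why it
can still serve for comparing programs.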

-- 
Barry Margolin, ······@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.

From: Reini Urban
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Tue, 20 Oct 1998 00:00:00 +0000
Message-ID: <362c9bde.24263849@judy>

Sam Steingold <···@goems.com> wrote:
>Please excuse my ignorance, but why not use something like
>(defun count-sexps (form)
>  (if (or (atom form) (eq 'quote (car form))) 0
>      (reduce #'+ form :key #'count-sexps :initial-value 1)))
>
>(defun code-complexity (file)
>  (with-open-file (str file :direction :input)
>    (do ((form (read str nil +eof+) (read str nil +eof+)) (cc 0))
>        ((eq form +eof+) cc)
>      (incf cc (count-sexps form)))))
>
>(count-sexps '(+ 1 2 (* 3 4))) ==> 2

this is my favourite so far. why did everybody else try to reinvent
READ?

my second favourite (and actually in use) is 
grep -c (    
along with 
wc -l

note:
why not "[^']("?
data complexity is in my eyes comparable to functional complexity, esp.
with lisp. needing fewer functions in lisp brings the need for more
data, which is often used more intelligently than in "static" languages.
that's why '( should count too. 
you may decide differently, code complexity is relative anyway. 
grep -c ( runs on every machine and is therefore more easily comparable
than any used lisp function. even by project managers :)
simplicity rules

why not weeding out comments?
i also have a short perl which counts functions with weeding out
comments. this is for metrics and function coverage, not for complexity.
for complexity i would count commented versions as well. i have to pay
attention to comments on reading and i may decide to remove the
commented versions of code, otherwise i would delete them or move them
to the old directory.

---
Reini Urban
http://xarch.tu-graz.ac.at/autocad/news/faq/autolisp.html

From: Barry Margolin
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Tue, 20 Oct 1998 00:00:00 +0000
Message-ID: <JU3X1.52$Sg6.298191@burlma1-snr1.gtei.net>

In article <··············@eho.eaglets.com>,
Sam Steingold  <···@goems.com> wrote:
>why use perl to process lisp code?
>presumably you are at least somewhat well versed in Lisp - in which case
>it should be much easier to do the same in lisp, not perl.

We've had this discussion before.  Perl was designed to make file scanning
and pattern matching easy.  The programs we've been designing in this
thread are precisely the kind of application that Perl is intended for.
I'm well-enough versed in Lisp to know that the equivalent of:

while (<>) { # Loop over input lines
  $counter++ if /^\s*\(/; # if first non-white character is open-paren, count it
}

would be much more verbose without being significantly more expressive.

-- 
Barry Margolin, ······@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.

From: Pierre Mai
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Wed, 21 Oct 1998 00:00:00 +0000
Message-ID: <87iuheu6wo.fsf@dent.isdn.cs.tu-berlin.de>

Barry Margolin <······@bbnplanet.com> writes:

> In article <··············@eho.eaglets.com>,
> Sam Steingold  <···@goems.com> wrote:
> >why use perl to process lisp code?
> >presumably you are at least somewhat well versed in Lisp - in which case
> >it should be much easier to do the same in lisp, not perl.
> 
> We've had this discussion before.  Perl was designed to make file scanning
> and pattern matching easy.  The programs we've been designing in this
> thread are precisely the kind of application that Perl is intended for.
> I'm well-enough versed in Lisp to know that the equivalent of:
> 
> while (<>) { # Loop over input lines
>   $counter++ if /^\s*\(/; # if first non-white character is open-paren, count it
> }
> 
> would be much more verbose without being significantly more expressive.
                                           -------------

If it weren't for the regexp, which needs a comment of 10 words to
explain what it does, and which is easy to get wrong (either comment,
or regexp that is), I could believe that statement, but so I have to
humbly disagree ;)

Anyways, this is Common Lisp:

(loop for line = (read-line *standard-input* nil nil)
      while line
      count (starts-with (left-trim-whitespace line) "("))

This uses two trivial string functions which are probably part of
every working CL user's toolkit[1].  With an extensible LOOP facility,
this could
even be clarified further...

All in all the above snippet takes 3 lines and 130 characters and gets 
by with no comments, while the Perl snippet takes 3 lines and 120
characters, but needs extensive commenting...

Of course I know that both snippets aren't exactly equivalent in
functionality (Perl makes using named, multiple input files easy, CL's 
r-e-p loop gives you the count directly, in Perl you need an extra
print), but then again, this all boils down to just a couple of extra
procedures and macrology.
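
As a rough idea of what those extra procedures could look like, here
is a minimal sketch that does the counting on a stream and adds a
wrapper for several named input files.  Both function names are made
up, and the sketch inlines the trimming rather than relying on the
helper functions from the footnote.

```lisp
(defun count-leading-paren-lines (stream)
  "Count the lines of STREAM whose first non-blank character is #\(."
  (loop for line = (read-line stream nil nil)
        while line
        count (let ((trimmed (string-left-trim '(#\Space #\Tab) line)))
                (and (plusp (length trimmed))
                     (char= #\( (char trimmed 0))))))

(defun count-leading-paren-lines-in-files (pathnames)
  "Sum COUNT-LEADING-PAREN-LINES over all files named in PATHNAMES."
  (loop for pathname in pathnames
        sum (with-open-file (stream pathname)
              (count-leading-paren-lines stream))))
```

Working on a stream keeps the counter usable with
WITH-INPUT-FROM-STRING at the listener as well as with files.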

We should probably just sit down and throw together a reasonable "Perl"
package for Lisp, to end this problem once and for all.  Anyone
interested in collaborating/contributing?

Regs, Pierre.

Footnotes: 
[1]  Here are some very simple, inefficient sample implementations:
(defun starts-with (string substring)
  "Detect whether the `string' starts with `substring'."
  (eql 0 (search substring string)))

(defun left-trim-whitespace (string &optional (ws-bag '(#\Space #\Tab)))
  "Trims any whitespace characters (i.e. characters in `ws-bag') from
the left side of `string'."
  (string-left-trim ws-bag string))

-- 
Pierre Mai <····@acm.org>               http://home.pages.de/~trillian/
  "Such is life." -- Fiona in "Four Weddings and a Funeral" (UK/1994)

From: Pierre Mai
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Tue, 20 Oct 1998 00:00:00 +0000
Message-ID: <87ogr7t6da.fsf@dent.isdn.cs.tu-berlin.de>

Reini Urban <······@sbox.tu-graz.ac.at> writes:

> Sam Steingold <···@goems.com> wrote:
> >Please excuse my ignorance, but why not use something like
> >(defun count-sexps (form)
> >  (if (or (atom form) (eq 'quote (car form))) 0
> >      (reduce #'+ form :key #'count-sexps :initial-value 1)))
> >
> >(defun code-complexity (file)
> >  (with-open-file (str file :direction :input)
> >    (do ((form (read str nil +eof+) (read str nil +eof+)) (cc 0))
> >        ((eq form +eof+) cc)
> >      (incf cc (count-sexps form)))))
> >
> >(count-sexps '(+ 1 2 (* 3 4))) ==> 2
> 
> this is my favourite so far. why did everybody else try to reinvent
> READ?

Because if you want to use READ you will have to compile/load all the
files prior to measuring their complexity, to ensure that any packages 
are created, and any changes to the read-table are in effect before
encountering special read-syntax in the files, and to ensure that
#. works as  well.  This seems a bit overkill just for counting #\( in 
the files[1].  This is exactly what Tim Bradshaw has argued against, as
being a very lax attitude towards Software Engineering issues, and
rightly so, IMNSHO.
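
For what it is worth, two of these hazards (evaluation via #. and
interning into a real package) can be reduced by binding *READ-EVAL*
and *PACKAGE* around READ.  The sketch below is only an illustration:
the function name is made up, and it does nothing about missing
packages referenced with an explicit prefix or about custom
read-tables, which are the harder problems named above.

```lisp
(defun read-all-forms-safely (stream)
  "Read every form from STREAM with #. disabled, interning new symbols
in a throwaway package that is deleted afterwards."
  (let ((scratch (make-package (symbol-name (gensym "SCRATCH-")) :use '())))
    (unwind-protect
         (let ((*package* scratch)
               (*read-eval* nil)        ; #. now signals a READER-ERROR
               (eof (list nil)))        ; unique end-of-file marker
           (loop for form = (read stream nil eof)
                 until (eq form eof)
                 collect form))
      ;; The symbols become homeless, which is fine for mere counting.
      (delete-package scratch))))
```

Since the scratch package uses no other package, even names like DEFUN
are read as fresh symbols, so the result is suitable for counting but
not for evaluation.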

If simplicity of implementation is any measure, then this solution
should be quite simple enough, while avoiding at least the most
serious problems of read.

(defun count-open-parens (name)
  (with-open-file (stream name)
    (loop for line = (read-line stream nil nil)
          while line
          sum (count #\( line))))

This is probably just as accurate as using read, since the number of
non-code #\( in comments is most likely a very small fraction of the
number of total #\(.

> 
> my second favourite (and actually in use) is 
> grep -c (    
> along with 
> wc -l

This can be achieved by the following function: ;)

(defun count-open-parens-and-lines (name)
  (with-open-file (stream name)
    (loop for line = (read-line stream nil nil)
          while line
          sum (count #\( line) into open-parens
          count 1 into total-lines
          finally (return (values open-parens total-lines)))))

> grep -c ( runs on every machine and is therefore more easily comparable
> than any used lisp function. even by project managers :)
> simplicity rules

What?  grep -c ( only runs on Unix natively, and may work on some
other platforms, when one installs additional software.  But the
defun given above runs in any ANSI-compliant CL, which should be
available on any platform where the question of counting #\(
arises[2]...  As a project manager for a CL-based project, I would
most definitely prefer a function like the above, especially since
this meshes very well with the various defsystem facilities used for
building, testing, packaging, etc.[3]

Regs, Pierre.

Footnotes: 
[1]  One of the other problems is that all the symbols in the files 
will be interned by read, which is a frightful waste of resources,
just for getting a very quick-and-dirty complexity measure.

[2]  Of course you may want to count #\( in Scheme or other dialects,
but there are similar definitions of the above defun for those
dialects, too.

[3]  In fact, the only thing that I'm missing sorely is a lisp-based
version-control facility.  Using RCS, PRCS, CVS & Co. with a
defsystem-based build system doesn't mesh that well.

-- 
Pierre Mai <····@cs.tu-berlin.de>	http://home.pages.de/~trillian/
  "Such is life." -- Fiona in "Four Weddings and a Funeral" (UK/1994)

From: Howard R. Stearns
Subject: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Wed, 21 Oct 1998 00:00:00 +0000
Message-ID: <362E1BCA.87C3EB7B@elwood.com>

Pierre Mai wrote:
> ...
> [3]  In fact, the only thing that I'm missing sorely is a lisp-based
> version-control facility.  Using RCS, PRCS, CVS & Co. with a
> defsystem-based build system doesn't mesh that well.
> ...

Somewhere in the lower third of my "to-do list" is to come up with a
documented MOP for pathnames which allows users to define their own
file-system access for version control, and a MOP for defsystem that
makes use of it.

For example, many Lisp implementations (including Eclipse) support the
"Gray CLOS-Streams" protocol.  This protocol stops just short of
defining how OPEN can be extended to use new stream classes. 
Internally, Eclipse defines a CLOS-Pathnames MOP which:
 1. ties in with OPEN
 2. provides similar tie-ins for PROBE-FILE, DIRECTORY, etc.
 3. allows creation of new pathname classes with customized behavior for
    the CL pathname utilities, including how versions are handled.

Alas, the internals are totally undocumented pending more experience,
but you can see how the public behavior appears to the user at
http://www.elwood.com/eclipse/path.htm.  In particular, versions are
handled on Unix/Windows in a way that is compatible with non-versioning
software.

Anyone who has comments or would seriously like to collaborate on the
development of a portable pathname/stream/file-system/defsystem MOP
should get in touch with me.

Howard Stearns
······@elwood.com
1-414-764-7500 (U.S. central timezone)

From: Pierre Mai
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Sun, 25 Oct 1998 00:00:00 +0000
Message-ID: <87sogc9s68.fsf@dent.isdn.cs.tu-berlin.de>

"Howard R. Stearns" <······@elwood.com> writes:

> Pierre Mai wrote:
> > ...
> > [3]  In fact, the only thing that I'm missing sorely is a lisp-based
> > version-control facility.  Using RCS, PRCS, CVS & Co. with a
> > defsystem-based build system doesn't mesh that well.
> > ...

This small footnote of mine seems to have generated a lot of interest
(also in private email), so it seems I've struck a nerve here
somewhere... ;)

> Somewhere in the lower third of my "to-do list" is to come up with a
> documented MOP for pathnames which allows users to define their own
> file-system access for version control, and a MOP for defsystem that
> makes use of it.

This is very interesting, and I'd be interested in something like
this, especially since this makes integrating build system and VCS
easier.

> Anyone who has comments or would seriously like to collaborate on the
> development of a portable pathname/stream/file-system/defsystem MOP
> should get in touch with me.

I'd be interested in collaborating on something like this, especially
since this would probably fill a need felt in the CL community for some
time IMHO, that is a more portable, more integrated support for system
construction.  Most of this functionality is reinvented in every major
project, i.e. build-support (defsystem), regression testing, documentation
and metric generation, version-control, bug-tracking, etc.

Given a flexible substrate like you are suggesting, most of this
functionality could be built portably and reusably on top of this,
obviating the need to create everything from scratch, and assisting in 
the integration of sources from disparate projects...

To give a bit of background of what motivates my interest in this kind 
of feature, let me insert here a short summary of the things I'd like
to see in a VCS for Lisp, which I wrote in a private email to someone
else interested in my footnote:

--------------------------------------------------------------------------

- The version-control facility should have a clue about defined
  systems, components, etc in the CL's defsystem.
  i.e. it should be possible to use  defsystem systems in conjunction
  with version numbers to access a particular version of a particular
  system, e.g. (operate-on-system '("framework" :version 1.5) 'compile).

  This should be integrated into the defsystem substrate, so that
  extensions can transparently use the versioning.  I have a nice
  metrics extension, for example, which lets me call
  (print-system-metrics "framework" t) to get a nice report on source
  metrics for the system "framework" and all the systems it depends on 
  (2nd parameter is t).  It would be nice to be able to do

  (with-open-file (stream "progress-report.txt" :direction :output)
    (print-system-metrics '("framework" :version 1.3) t stream)
    (print-system-metrics '("framework" :version 1.7) t stream))

  to get a nice overview of changing metrics (more advanced functions, 
  like differential analysis are of course thinkable)...

- There should be direct access to versioning information from Lisp,
  e.g. querying of available revisions, etc.

- Versioning should probably be integrated into the CL pathname
  system, since it already provides for versioning support.

- The actions of the version control system should be extensible from
  CL, probably in a CLOS based protocol:

(defmethod rcs:check-in-file-version :before ((repo repository) file ...)
  ;; Do some check-in validation or whatever, throw exception to
  ;; prevent check-in...
  )

(defmethod rcs:check-out-file-version :after ...)

etc.
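
A self-contained sketch of how such a CLOS-based protocol might behave
is given below.  Everything in it is hypothetical: the class, the
generic function, the condition and the validation rule (only files
ending in ".lisp" may be checked in) are invented to show how a
:before method can veto an operation by signalling an error.

```lisp
(defclass repository ()
  ((name :initarg :name :reader repository-name)))

(define-condition check-in-rejected (error)
  ((file :initarg :file :reader rejected-file))
  (:report (lambda (condition stream)
             (format stream "Check-in of ~A rejected."
                     (rejected-file condition)))))

(defgeneric check-in-file-version (repository file)
  (:documentation "Check FILE (a string) in to REPOSITORY."))

;; Primary method: stands in for the real check-in work.
(defmethod check-in-file-version ((repo repository) file)
  (format nil "checked in ~A to ~A" file (repository-name repo)))

;; Validation hook: runs first; signalling an error here means the
;; primary method is never reached.
(defmethod check-in-file-version :before ((repo repository) file)
  (let* ((suffix ".lisp")
         (start (- (length file) (length suffix))))
    (unless (and (>= start 0)
                 (string= suffix file :start2 start))
      (error 'check-in-rejected :file file))))
```

With this in place, checking in "framework.lisp" goes through, while
checking in "notes.txt" signals CHECK-IN-REJECTED before any work is
done.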

Many of these features could of course be provided by a very good
interface to an external VCS, except for ensuring that CL extensions
are always run.

Basically, you get many of the same advantages that you get when you
integrate a make-based build system and a VCS, like in the
ShapeToolkit[1], except more so, because the build-system is more
easily extended (and more powerful, IMHO), and the whole is dropped in 
a language substrate that makes extending the whole even easier and
more powerful...

Come to think of it, this would be an ideal candidate for some sort
of thesis, maybe coupled with a report on productivity gains in SCM
settings...  OTOH it might be difficult to get enough real-life
samples ;-(  And anyway, (Common) Lisp is an uncool topic currently in 
academia, Java being the star of the moment (after functional
languages, after logic programming, after graph-based languages, after 
Lisp and Scheme, after ...). :-(

--------------------------------------------------------------------------

Regs, Pierre.

Footnotes: 
[1]  See http://swt.cs.tu-berlin.de/~shape/index.html

-- 
Pierre Mai <····@acm.org>               http://home.pages.de/~trillian/
  "Such is life." -- Fiona in "Four Weddings and a Funeral" (UK/1994)

From: Rainer Joswig
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Mon, 26 Oct 1998 00:00:00 +0000
Message-ID: <joswig-2610981226250001@pbg3.lavielle.com>

In article <··············@dent.isdn.cs.tu-berlin.de>, Pierre Mai
<····@acm.org> wrote:

> I'd be interested in collaborating on something like this, especially
> since this would probably fill a need felt in the CL community for some
> time IMHO, that is a more portable, more integrated support for system
> construction.  Most of this functionality is reinvented in every major
> project, i.e. build-support (defsystem), regression testing, documentation
> and metric generation, version-control, bug-tracking, etc.

Hmm, I guess one only needs to reimplement the "System Construction Tool"
of Symbolics Genera. It is described in the manual "Genera 8.3,
Program Development Utilities". A feature of SCT is that
it maintains versions of systems and their patches. The Zmacs editor has these
nice commands "Add Patch", "Finish Patch", ... SCT knows about
reviewers for patches, sites that are maintaining systems,
generating bug reports, ...

-- 
http://www.lavielle.com/~joswig

From: Pierre Mai
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Tue, 27 Oct 1998 00:00:00 +0000
Message-ID: <873e8avmsa.fsf@dent.isdn.cs.tu-berlin.de>

······@lavielle.com (Rainer Joswig) writes:

> Hmm, I guess one only needs to reimplement the "System Construction Tool"
> of Symbolics Genera. It is described in the manual "Genera 8.3,
> Program Development Utilities". A feature of SCT is that
> it maintains versions of systems and their patches. The Zmacs editor has these
> nice commands "Add Patch", "Finish Patch", ... SCT knows about
> reviewers for patches, sites that are maintaining systems,
> generating bug reports, ...

I guess I'll have to try to locate a copy of the Genera manuals.  Any
hints, anyone? ;-)

-- 
Pierre Mai <····@acm.org>               http://home.pages.de/~trillian/
  "Such is life." -- Fiona in "Four Weddings and a Funeral" (UK/1994)

From: ········@totient.demon.co.uk
Subject: Re: VERSION-CONTROL && Ilog Talk
Date: Tue, 27 Oct 1998 00:00:00 +0000
Message-ID: <7149fu$1cu$1@nnrp1.dejanews.com>

ILOG Talk (a Lisp-like dialect) contained an interface
to version control systems. It's described in the manuals.
Maybe the Talk code could be ported over to CL?

ftp://ftp.ilog.fr/pub/Products



Jon


From: Harley Davis
Subject: Re: VERSION-CONTROL && Ilog Talk
Date: Wed, 28 Oct 1998 00:00:00 +0000
Message-ID: <718i76$3ot$1@news1-alterdial.uu.net>

········@totient.demon.co.uk wrote in message
<············@nnrp1.dejanews.com>...
>ILOG Talk (a Lisp-like dialect) contained an interface
>to version control systems. It's described in the manuals.
>Maybe the Talk code could be ported over to CL?


I've put the source code of this Ilog Talk module (tar'ed and gzip'ed) at:

    ftp://ftp.ilog.com/pub/share/iltver.tar.gz

For what it's worth, I'm responsible for writing this code originally but
it's been about 4 or 5 years since I've touched it.  I doubt I would have
written it the same way today as it seems a little overengineered for its
purpose.

This code is entirely unsupported by Ilog or me now, but as it's part of the
publicly available Linux version of Talk anyway, feel free to check it
out.

If someone is trying to port it to CL or some other language, I can help
with any Talk-isms in it.

-- Harley

From: Tim Bradshaw
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Wed, 04 Nov 1998 00:00:00 +0000
Message-ID: <ey3ww5bu3ct.fsf@todday.aiai.ed.ac.uk>

* Pierre Mai wrote:

> I guess I'll have to try to locate a copy of the Genera manuals.  Any
> hints, anyone? ;-)

Ask on the slug mailing list.  But SCT is *not* enough because it
completely fails to deal with things like branches.  It also relies on
a versioned filesystem, and it can actually be a serious pain to
rebuild using SCT on a different architecture (for instance if someone
sends you a system for ivory and you need to build on 3600).

SCT is a start, but it's not enough.

--tim

From: Tim Bradshaw
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Mon, 26 Oct 1998 00:00:00 +0000
Message-ID: <ey3hfwrlb9p.fsf@todday.aiai.ed.ac.uk>

* Pierre Mai wrote:
> "Howard R. Stearns" <······@elwood.com> writes:

>> > [3]  In fact, the only thing that I'm missing sorely is a lisp-based
>> > version-control facility.  Using RCS, PRCS, CVS & Co. with a
>> > defsystem-based build system doesn't mesh that well.
>> > ...

> This small footnote of mine seems to have generated a lot of interest
> (also in private email), so it seems I've struck a nerve here
> somewhere... ;)

Can I make a small plea *not* to write a version control system from
nothing?  This is a very traditional `isolationist' lisp way to go,
and it makes things like integrating with other systems much harder
than it needs to be.  It also puts you in a position of trying to keep
up with systems having orders of magnitude more money pumped into
them, which is a problem (look at lisp machine performance vs risc
say).

Here are a couple of positive suggestions rather than just whining:

CVS has a client-server interface.  A Lisp system could have a CVS
client which could hide behind a suitable pathname system.  You'd need
socket code to do this, or if you could live with being slightly
OS-specific you could emit cvs command lines. A proper cvs client
would be cooler and, given a socket interface, should work for a lisp
system running on almost any platform.  I don't think the CVS client
interface is that incomprehensible.
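
The simpler "emit cvs command lines" route could start from something
like the sketch below.  CVS-COMMAND-LINE is a hypothetical helper; it
only builds the string (using the real `-d` global option and the `-r`
option of subcommands such as update and checkout), does no shell
quoting, and leaves actually running the command to whatever
process-spawning interface your implementation provides:

```lisp
(defun cvs-command-line (cvsroot subcommand &key revision files)
  "Return a cvs command line as a string (hypothetical helper).
CVSROOT goes after the global -d option; REVISION, if given, is passed
with -r; FILES is a list of file or module names.  No shell quoting."
  ;; ~@[ ... ~] emits the -r clause only when REVISION is non-NIL;
  ;; ~{ ~A~} appends each element of FILES preceded by a space.
  (format nil "cvs -d ~A ~A~@[ -r ~A~]~{ ~A~}"
          cvsroot subcommand revision files))

;; Example call, with a made-up repository location:
(cvs-command-line ":pserver:anon@cvs.example.org:/cvsroot" "update"
                  :revision "1.5"
                  :files '("framework.lisp"))
```

A proper client speaking the CVS client-server protocol over a socket
would replace the string building, but the Lisp-side function signature
could stay much the same.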

I hate Lisp defsystems because they're isolationist, but I had a
scheme for a `defsystem' which would use make to track file
dependencies, but instead of cranking up Lisp each time, it would spit
out a file full of suitable compile commands, which could then be fed
to a Lisp.  The `make depend' bit could be done by having some Lisp
program which did who-calls or equivalent, and generated a makefile
from the results.  If a purely-lisp interface is needed one could
write defsystems which would then be translated into makefiles
mechanically (this should be pretty easy).  I know make is much
nastier than anything one might do in Lisp, but it's there, it gets
used for everything else (on Unix), it runs most places (people who do
multiplatform Unix/NT stuff seem to use make on NT with success), and
it gives people the interface they're used to, and access to all the
(admittedly crufty) tools that exist already.
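
A rough sketch of the mechanical translation, assuming a toy system
description rather than any real defsystem: each entry is a file name
plus the files it depends on.  The "fasl" object type, the
compile-commands.lisp file name and WRITE-MAKEFILE-RULES itself are
all assumptions made for illustration:

```lisp
(defparameter *example-system*
  '(("package" . ())
    ("utils"   . ("package"))
    ("main"    . ("package" "utils")))
  "Hypothetical system description: (FILE . FILES-IT-DEPENDS-ON).")

(defun write-makefile-rules (system &key (stream *standard-output*)
                                         (object-type "fasl"))
  "Write one make rule per entry of SYSTEM to STREAM.
Each recipe does not run Lisp; it appends a COMPILE-FILE form to
compile-commands.lisp, which can be fed to a Lisp afterwards."
  (dolist (entry system)
    (destructuring-bind (file . deps) entry
      ;; Target line: the object file depends on its own source and on
      ;; the object files of everything it depends on.
      (format stream "~A.~A: ~A.lisp~{ ~A~}~%"
              file object-type file
              (mapcar (lambda (d) (format nil "~A.~A" d object-type)) deps))
      ;; Recipe line: must start with a tab character.
      (format stream
              "~Cecho '(compile-file \"~A.lisp\")' >> compile-commands.lisp~%~%"
              #\Tab file))))

;; Print the rules for the example system to standard output.
(write-makefile-rules *example-system*)
```

As written the recipes never create the object files themselves, so a
real version would still need a final rule that feeds
compile-commands.lisp to a Lisp and brings the targets up to date.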

--tim

From: Marco Antoniotti
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Mon, 26 Oct 1998 00:00:00 +0000
Message-ID: <lwd87f9v18.fsf@galvani.parades.rm.cnr.it>

I tend to have a somewhat-in-between attitude on the subject.
Tim Bradshaw <···@aiai.ed.ac.uk> writes:
> 
> Can I make a small plea *not* to write a version control system from
> nothing?  This is a very traditional `isolationist' lisp way to go,
> and it makes things like integrating with other systems much harder
> than it needs to be.  It also puts you in a position of trying to keep
> up with systems having orders of magnitude more money pumped into
> them, which is a problem (look at lisp machine performance vs risc
> say).

I agree with this. CVS (and PRCS - a favourite of mine) do exactly
what you need.  It is better to interface to them (PRCS also has a
Lispish syntax: too bad the implementors did not listen to my plea
when it first came out :{ )

> 
> Here are a couple of positive suggestions rather than just whining:
> 
> CVS has a client-server interface.  A Lisp system could have a CVS
> client which could hide behind a suitable pathname system.  You'd need
> socket code to do this, or if you could live with being slightly
> OS-specific you could emit cvs command lines. A proper cvs client
> would be cooler and, given a socket interface, should work for a lisp
> system running on almost any platform.  I don't think the CVS client
> interface is that incomprehensible.

Yep.

> 
> I hate Lisp defsystems because they're isolationist, but I had a
> scheme for a `defsystem' which would use make to track file
> dependencies, but instead of cranking up Lisp each time, it would spit
> out a file full of suitable compile commands, which could then be fed
> to a Lisp.  The `make depend' bit could be done by having some Lisp
> program which did who-calls or equivalent, and generated a makefile
> from the results.  If a purely-lisp interface is needed one could
> write defsystems which would then be translated into makefiles
> mechanically (this should be pretty easy).  I know make is much
> nastier than anything one might do in Lisp, but it's there, it gets
> used for everything else (on Unix), it runs most places (people who do
> multiplatform Unix/NT stuff seem to use make on NT with success), and
> it gives people the interface they're used to, and access to all the
> (admittedly crufty) tools that exist already.

In this case I don't agree that much.  MK:DEFSYSTEM (apart from
copyright issues) does a very good job at very Lispy things and it
runs under any CL with minor modifications.  The same argument is
applicable.  It's there, it can be used, so why not use it? (Please,
do not raise the issue of "lack of a defsystem GUI under MCL": it is
not very portable anyway).

A script (in CL of course) which parses a Makefile and produces a nice
defsystem definition or a (progn #|compile and/or load|#) into a file
would be nice as well, but I see a lot of problems and subtleties to
be overcome.  After all I strongly believe that there are only a
handful of people on the planet who actually know and use all the
esoteric features of "make" and who know all the nuances of using
'make', 'gmake', 'MS make', 'Some other make' etc.

-- 
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - (0)6 - 68 10 03 16, fax. +39 - (0)6 - 68 80 79 26
http://www.parades.rm.cnr.it

From: Tim Bradshaw
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Mon, 26 Oct 1998 00:00:00 +0000
Message-ID: <ey3emrvku7a.fsf@todday.aiai.ed.ac.uk>

* Marco Antoniotti wrote:

> A script (in CL of course) which parses a Makefile and produces a nice
> defsystem definition or a (progn #|compile and/or load|#) into a file
> would be nice as well, but I see a lot of problems and subtleties to
> be overcome.  

In case there is confusion, this isn't what I was intending -- parsing
a makefile (with typically large embedded shell scripts) is a hideous
job.  I was intending to have a lisp defsystem which would write
makefiles (and in turn a makefile which would write lisp files with
lots of compiles in, not anything that would actually track the
dependencies).

> After all I strongly believe that there are only a
> handful of people on the planet who actually know and use all the
> esoteric features of "make" and who know all the nuances of using
> 'make', 'gmake', 'MS make', 'Some other make' etc.

I agree, partly.  I think that a good few people know a surprising
amount about make and use an awfully large amount of its surface
area. *But* those people tend to choose a make implementation (often
gnu make because it's available in source) and port it to any platform
they need, including NT.  In fact most people seem to try and build a
little unix build environment under NT, including shells, awk, sed,
perl & stuff.  Certainly I'd do that!

--tim

From: Francis Leboutte
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Tue, 03 Nov 1998 00:00:00 +0000
Message-ID: <363f2491.116912311@192.168.1.250>

Tim Bradshaw <···@aiai.ed.ac.uk> wrote:

>In case there is confusion, this isn't what I was intending -- parsing
>a makefile (with typically large embedded shell scripts) is a hideous
>job.  I was intending to have a lisp defsystem which would write
>makefiles (and in turn a makefile which would write lisp files with
>lots of compiles in, not anything that would actually track the
>dependencies).
>

Could you explain the interest of this? To compile, the defsystem tools
seem to be good enough for me, even for quite large applications. Maybe
I am missing something :-)

Francis

--
Francis Leboutte
··········@skynet.be ········@acm.org  http://users.skynet.be/algo

From: Tim Bradshaw
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Wed, 04 Nov 1998 00:00:00 +0000
Message-ID: <ey3yapru3gk.fsf@todday.aiai.ed.ac.uk>

* Francis Leboutte wrote:

> Could you explain the interest of this? To compile, the defsystem tools
> seem to be good enough for me, even for quite large applications. Maybe
> I am missing something :-)

I want to be able to look after everything with make, basically.  I
*know* this is cruddy, but if I'm trying to integrate Lisp into some
other build process that's what people will expect.  And I want things
like the source file names to be visible in the makefiles (rather than
just `run lisp and load sysdef.lisp') so other make targets (make
commit, make release) can work well.

--tim

From: Bob Bane
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Mon, 26 Oct 1998 00:00:00 +0000
Message-ID: <36349A72.762EB73@gst.com>

Tim Bradshaw wrote:
> I hate Lisp defsystems because they're isolationist....

But they don't have to be!  I develop a planning and scheduling module in
Lisp as part of a system that captures, grinds, indexes, stores, and
retrieves satellite data.  The rest of the system is in C, C++, and tcl/tk,
so it builds with makefiles.  My stuff builds with makefiles, too, to keep
our CM people happy.  I have a simple piece of Lisp code that walks the
MKANT defsystem description of my module and emits a sub-makefile that
contains the dependencies extracted from it.  My top-level makefile looks
like this:

all: Makefile.defsystem
    make -f Makefile.defsystem rac.image

Makefile.defsystem: rac-defsystem.lisp make-makefile.lisp base.image
    echo '(load "make-makefile.lisp")' | base.image -qq -batch

My defsystem walking code is not a general solution to the problem, but it
works for the simple defsystem dependencies I tend to use.  I can post the
code if anyone's interested.

From: Marco Antoniotti
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Mon, 26 Oct 1998 00:00:00 +0000
Message-ID: <lwbtmz9usq.fsf@galvani.parades.rm.cnr.it>

Bob Bane <····@gst.com> writes:

> Tim Bradshaw wrote:
> > I hate Lisp defsystems because they're isolationist....
> 
> 
> My defsystem walking code is not a general solution to the problem, but it
> works for the simple defsystem dependencies I tend to use.  I can post the
> code if anyone's interested.

Very interested!  Please do.

-- 
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - (0)6 - 68 10 03 16, fax. +39 - (0)6 - 68 80 79 26
http://www.parades.rm.cnr.it

From: Bob Bane
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Tue, 27 Oct 1998 00:00:00 +0000
Message-ID: <36360E51.E9DA943D@gst.com>

OK, here it is, slightly cleaned up and with most of the home-grown CM drivel
removed from it.  Loading this file into a Lisp containing MK-DEFSYSTEM and a
defsystem definition for a component named "RAC" will cause a file named
Makefile.defsystem to be written out.  All this makefile says is that rac.image
depends on all the sources from the RAC component; it builds rac.image by
loading init-rac.lisp, a file that loads MK-DEFSYSTEM and the RAC defsystem,
and then uses the normal defsystem stuff to recompile and dump the image.

I wouldn't call defsystem "isolationist"; on the contrary, because defsystem
structures are exposed and programmatically accessible, defsystems can reach
out and cooperate with other CM systems.  We're not "isolationist", we're an
oppressed minority.  ;-)

    - Bob Bane (····@gst.com)

;;; -*- Mode: LISP; Package: MAKE; Syntax: Common-Lisp -*-

(in-package :MAKE)

;;
;; This is really simple and sleazy; it defines a defsystem operator for
;; collecting the components of a defsystem whose sources are rooted in a
;; single directory (the source-root-dir of the top component).  From this
;; we can generate a barely intelligible makefile that can determine
;; when lisp needs to be started to rebuild the defsystem in question.
;; We use the built-in defsystem walker by defining a component-operation
;; (yay!) but we have to communicate with it via special variables (boo!).
;;

(defvar *make-src-dir* nil
  "Absolute pathname for current component source")

(defun mung-source-name (file)
  (make-pathname :name (component-source-pathname file)
                 :type (component-source-extension file)
                 :directory (relative-subdirectory
                             *make-src-dir*
                             (pathname (component-source-root-dir file)))))

(defun relative-subdirectory (root branch)
  "Assumes root and branch are absolute pathnames and branch is a subdirectory
of root, so all we have to do is pop off the common head of their directory
lists"
  (cons :relative
        (nthcdr (length (pathname-directory root))
                (pathname-directory branch))))

(defvar *makefile-source-pathnames* nil
  "A special variable where the source pathnames are collected")

(defun gather-sources-operation (component force)
  "The DEFSYSTEM component-operation mechanism will walk this across all the
source components for us; it just gathers up the pathnames of the sources,
relative to the current directory."
  (declare (ignore force))
  (push (mung-source-name component) *makefile-source-pathnames*))

(component-operation :gather-sources 'gather-sources-operation)
(component-operation 'gather-sources 'gather-sources-operation)

(defun sources-for-component (name)
  (let* (*makefile-source-pathnames*
         (component (find-system name))
         (*make-src-dir* (pathname (component-source-root-dir component))))
    (operate-on-system name :gather-sources)
    *makefile-source-pathnames*))

;;
;; And here is the code that actually builds Makefile.defsystem.
;; If the current Lisp doesn't have MK-DEFSYSTEM and the RAC defsystem
;; loaded, load them here.
;;

(with-open-file (*standard-output* "Makefile.defsystem"
   :direction :output
   :if-exists :supersede)
  (format t
"## Makefile.defsystem generated from make-makefile.lisp
## Don't edit this directly; make changes to the text in make-makefile.lisp,
## then make Makefile.defsystem, OK?  Be sure to put back-slashes
## (\\) in front of any double-quotes you use in that file.

rac.image: init-rac.lisp")
  (format t "~{ ~a~}" (sources-for-component "RAC"))

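  ;; The "~" that ends the first line of the recipe makes FORMAT skip the
  ;; line break, so the whole command stays on one tab-prefixed line.
  (format t
"~%~Cecho '(progn (load \"init-rac.lisp\") (in-package :PLANNER) (excl:gc t) ~
(excl:dumplisp :name \"rac.image\"))' | ./base.image -qq -batch

# This comment guards the precious whitespace above and below
# You gotta love makefiles... NOT.

"
          #\Tab))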
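
To see what RELATIVE-SUBDIRECTORY in the code above does, here is a
small worked example that needs no defsystem at all.  The function is
copied from the post; the two directories are made up, and they are
built with MAKE-PATHNAME so that no implementation-specific namestring
parsing is involved:

```lisp
;; Copied from the posted code so this example stands alone.
(defun relative-subdirectory (root branch)
  "Assumes ROOT and BRANCH are absolute and BRANCH lies below ROOT;
drops the common head of their directory lists."
  (cons :relative
        (nthcdr (length (pathname-directory root))
                (pathname-directory branch))))

;; Hypothetical directories, for illustration only.
(defparameter *root*
  (make-pathname :directory '(:absolute "home" "rac" "src")))
(defparameter *branch*
  (make-pathname :directory '(:absolute "home" "rac" "src" "planner" "core")))

;; ROOT's directory list has four elements (:ABSOLUTE plus three names),
;; so NTHCDR drops four elements from BRANCH's list.
(relative-subdirectory *root* *branch*)
;; => (:RELATIVE "planner" "core")

;; The result plugs straight into MAKE-PATHNAME, as MUNG-SOURCE-NAME does.
(make-pathname :name "schedule" :type "lisp"
               :directory (relative-subdirectory *root* *branch*))
```

Note that the function never checks that BRANCH really is below ROOT;
it just drops as many directory components as ROOT has.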

From: Pierre Mai
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Tue, 27 Oct 1998 00:00:00 +0000
Message-ID: <8767d6vmw4.fsf@dent.isdn.cs.tu-berlin.de>

Tim Bradshaw <···@aiai.ed.ac.uk> writes:

> Can I make a small plea *not* to write a version control system from
> nothing?  This is a very traditional `isolationist' lisp way to go,
> and it makes things like integrating with other systems much harder
> than it needs to be.  It also puts you in a position of trying to
> keep

Sorry, but I don't quite understand why implementing a version control 
system in Lisp is in any way isolationist.  If the same VCS is written 
in C, like e.g. RCS,  no one cries that RCS is isolationist.  Why?
Implementing something in CL makes it isolationist?  And in C it's
totally open, integratable?

If this is indeed so, then I should immediately stop my current
project, because the _last_ thing that my client can afford is being
isolationist[1].  So back to using C++ and tcl, I guess...

> up with systems having orders of magnitude more money pumped into
> them, which is a problem (look at lisp machine performance vs risc
> say).

Sorry, but this kind of argument will actually force me to abandon
Lisp in toto, since it is clear that (currently) much, much more
money is pumped into C++ than into CL, and thus -- so your reasoning
-- CL will have a very hard time keeping up with C++, so we should all 
switch to using C++.

I find it amazing that the failure of Lisp Machines is always taken as 
a kind of signal, that nothing implemented purely in Lisp can
succeed.  Sorry, but I don't believe in this kind of short-circuit
reasoning.  One has to analyze the failure of the Lisp Machine
Companies in much more detail, if one wants to make any valid
conclusions for the present.  I also believe that the determining
factors for success are quite different for hardware vs. software, and 
for `commercial' vs. `free' software, and for software vs.
specifications.

> CVS has a client-server interface.  A Lisp system could have a CVS
> client which could hide behind a suitable pathname system.  You'd need
> socket code to do this, or if you could live with being slightly
> OS-specific you could emit cvs command lines. A proper cvs client
> would be cooler and, given a socket interface, should work for a lisp
> system running on almost any platform.  I don't think the CVS client
> interface is that incomprehensible.

Since the first goal would be to design a portable interface to
version control, nothing prevents anyone from implementing the backend 
via CVS, PRCS, RCS or even, dare I say, in Common Lisp.  You will of
course not be able to do some of the things that I sketched out in my
previous posting.  OTOH you will "gain" the ability to evade Common
Lisp as much as possible by working from the shell, and only ever
seeing a Common Lisp prompt when something went wrong.  This is not
something that I value, but then again I'm not calling CL-based
software development tools isolationist ...

There are those of us that value CL's advanced features _not only_ in
language, but also in its advanced programming _environment_, with
incremental compilation, the listener (R-E-P loop), the debugger,
profilers, tracing, inspect, describe, documentation, etc.

Many of these advantages cannot be realized in full, if we insist on
restricting ourselves to the lowest common denominator of all languages
out there, i.e. the file-based edit-compile-run-debug cycle as
supported by make, rcs, gdb & co.

Of course, not all of us value those things, and some have other
requirements, and so for them file-compiler based solutions (there are 
CL implementations that support this way of doing business quite well) 
and make are the solutions of choice.  Nothing wrong with that.  But
then again I don't understand your plea:  If you want to use make, cvs 
& co., what is stopping you?  Those tools are already out there, and
no one is going to take them away from you[2].

It's just that I want something else, something more.  The ideas I
sketched out in my previous posting were borne out of real needs I
have.  And they are there because my way of working does not try to
evade the CL listener, to the contrary, I've always got a CL running
while developing.  And even in combination with "only" the ILISP
interface of (X)Emacs, this provides a much more fruitful interface to 
my developing environment, than any combination of shell, make and
emacs ever has.

> I hate Lisp defsystems because they're isolationist, but I had a

Sorry, but the defsystems that I know are not isolationist at all,
except if you call everything isolationist that runs in a CL
implementation.  Let's take MK-DEFSYSTEM for example, which is
freely[3] available, and quite portable:

- There is built-in support for Scheme and C besides LISP, and further 
  languages can be easily added via mk:define-language[4].

- There is support for delegating the build-process of whole
  subsystems to arbitrary lisp forms, which through your
  implementation's OS interface may indeed call out to make, shape or
  any other build-system:

<quote source="DEFSYSTEM Documentation" author="Mark Kantrowitz">
 4.3.1.5. Including Foreign Systems

 Systems defined using some other system definition tool may be
 included by providing separate compile and load forms for them
 (using the :compile-form and :load-form keywords). These forms
 will be run if and only if they are included in a module with no
 components. This is useful if it isn't possible to convert these
 systems to the defsystem format all at once.
</quote>

So what makes this defsystem isolationist?  The fact that I "have" to
type

(operate-on-system "simulator" 'compile)
or shorter
(oos "simulator" 'compile)
instead of
make simulator

?

Or what is so great about make, that your proposal of first writing
defsystem definitions, and then translating them to Makefiles, and
building via make -- except that you then have to write a new *.lisp
file so you don't have to start/stop your lisp for every *.lisp file
that is built -- is necessary?

Don't misunderstand me:  I'm all for interoperability.  But I
seriously doubt that the only way to reach interoperability is working 
in C, or basing oneself on CVS, or working from the unix shell.

> mechanically (this should be pretty easy).  I know make is much
> nastier than anything one might do in Lisp, but it's there, it gets
> used for everything else (on Unix), it runs most places (people who do
> multiplatform Unix/NT stuff seem to use make on NT with success), and
> it gives people the interface they're used to, and access to all the

Sorry, but after having written Makefiles for quite some time during
my life, I'm still not used to writing Makefiles, i.e. I have to have
another window open with the corresponding info file to get all the
variable substitutions, pattern-matching and stuff right.

> (admittedly crufty) tools that exist already.

What make-based tools are there, that can't coexist with a
mk:defsystem that calls out to make for a subsystem?

Allow me a final word:

A short time ago, Eric Marsden complained here, that:

| [To come back to Lisp:] For me, one of the primary weaknesses of Common
| Lisp is the lack of libraries. Other languages such as Python and Perl
| have a huge user-contributed library of functions for networking,
| database access, text processing, graphics, which make many non-trivial
| tasks easy. In CL you have to write these from scratch (heck, there isn't
| even a portable socket or regexp interface!).

Well, there seems to be good reason that there is a lack of libraries
in Common Lisp, in comparison to perl or python:  No one has pleaded
that those who implemented those libraries for perl or python should
have instead crufted something on top of external utilities, when this 
would have _clearly visible_ drawbacks.

There is a web-browser written in python, although there were already
Netscape&Co. out there, which clearly had the money on their side.
Still the python-based browser is very helpful for many users of
python for easy integration and customization in their applications.

Or take Perl's Makefile.pm approach[5], although there was already
autoconf, automake & co. out there.  This approach is much more
isolationist than the defsystems I know of, yet this hasn't hurt Perl, 
has it?

Unless CL users start to recover from the lisp market downsizing
shock, and start developing for CL again, there _*WILL BE NO LIBRARIES
AND TOOLS FOR COMMON LISP*_.  _*THERE IS NOTHING WRONG WITH DEVELOPING 
LIBRARIES AND TOOLS FOR COMMON LISP IN COMMON LISP*_.

Regs, Pierre.

Footnotes: 
[1]  A simulation system that is isolated from the rest of the
business world, which consists of such ugly beasts as SAP R/[2-3],
diverse relational databases, PPSs, Windows "solutions", etc, is less
useful than a drawing block.

[2]  BTW:  Even in this file-based paradigm, there are much more
advanced tools out there than "good" *old* make and rcs/cvs.  If I
were into file-based SCM right now, I would want _at least_ the
functionality provided by the Shape Toolkit developed locally at the
TU Berlin some time ago (http://swt.cs.tu-berlin.de/~shape/index.html) 

[3]  Yes, all of you GNU-zealots, I know that Mark Kantrowitz's
defsystem is in the non-free part of the debian distribution for
various valid reasons, but then again I don't restrict the normal
connotations of the word `free' to those that would please
GNU-zealots.

[4]  This capability could of course be further expanded on, but the
basics are there.

[5]  Or whatever this was called... It's been too long since I installed
a package off CPAN... :-)

-- 
Pierre Mai <····@acm.org>               http://home.pages.de/~trillian/
  "Such is life." -- Fiona in "Four Weddings and a Funeral" (UK/1994)

From: Tim Bradshaw
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Tue, 27 Oct 1998 00:00:00 +0000
Message-ID: <ey3d87eavng.fsf@todday.aiai.ed.ac.uk>

* Pierre Mai wrote:

> Sorry, but I don't quite understand why implementing a version control 
> system in Lisp is in any way isolationist.  If the same VCS is written 
> in C, like e.g. RCS,  no one cries that RCS is isolationist.  Why?
> Implementing something in CL makes it isolationist?  And in C it's
> totally open, integratable?

I didn't say anything about C.  What I meant was that there are an
awful lot of people using CVS (for example), which has a more-or-less
language-neutral API (via the CVS client-server stuff).  And CVS
works, actually, and has been stress-tested by a large number of very
large projects, and it's free, and there is commercial support if you
need that (www.cyclic.com, of whom I am not a customer!), and it runs
most places you'd want it to, and lots of CM people understand it and
use it and so forth.  So writing a CL API to CVS seems like a pretty
good way of building on all this work (and if you did it right, you'd
need to do a portable socket API for CL too, so you get something else
we all need).  Writing a whole new version control system that almost
certainly won't have the functionality that CVS has seems pretty
isolationist to me.

> Sorry, but this kind of argument will actually force me to abandon
> Lisp in toto, since it is clear that (currently) much, much more
> money is pumped into C++ than into CL, and thus -- so your reasoning
> -- CL will have a very hard time keeping up with C++, so we should all 
> switch to using C++.

Again, I didn't say that.  Pumping huge amounts of money into
compilers gets you, perhaps, a factor of 2 speedup, perhaps smaller
code and not much else -- in particular it doesn't get the extra
functionality that even a very slow Lisp system gives you.  So really
this isn't an issue for Lisp vs C++.  Something like a VC/CM system is
a rather different case, particularly as you can quite easily use at
least two existing systems (RCS, CVS) with Lisp.

> I find it amazing that the failure of Lisp Machines is always taken as 
> a kind of signal, that nothing implemented purely in Lisp can
> succeed. 

I didn't say that.  I said `look at lisp machine performance vs risc
say'.  RISC had a lot of money pumped into it, and those machines beat
the Lispms, which had less money pumped into them, hands down in terms
of performance.  I could equally have said `look at the x86 compared
to RISC, say', which is another case where massive injections of money
(into x86) have produced a remarkably competitive result, despite
a crippling basic architecture.

> Of course, not all of us value those things, and some have other
> requirements, and so for them file-compiler based solutions (there are 
> CL implementations that support this way of doing business quite well) 
> and make are the solutions of choice.  Nothing wrong with that.  But
> then again I don't understand your plea:  If you want to use make, cvs 
> & co., what is stopping you?  Those tools are already out there, and
> no one is going to take them away from you[2].

Nothing stops me, I do use them.  A cooler interface to Lisp would be
very nice (and something I'm interested in writing).  This interface
would of course support the incremental stuff that Lisp people (like
me) like *as well as* the make stuff that I'm afraid I believe we
need.  What I'm really seriously uninterested in doing is reinventing
something like CVS, except worse, and with more bugs which at some
point will eat all my changes and cost me weeks of work to recover
from.  I have a pretty good idea how long it took to get these
systems up to the point they are now, and I just don't think this
wheel is worth reinventing, especially since it's reasonably clear
that these tools can support a decent Lisp-like way of working if
people come down from some `pure lisp' hobby horse and pick up all
this cool stuff and integrate with it rather than reinvent it.

I don't really have time to respond to the rest of your message,
especially as it seems you've really not understood what I'm getting
at (being interpreted as Pro-C is particularly amusing for me: my home
computer is a Symbolics).

Except this:

> So what makes this defsystem isolationist?  The fact that I "have" to
> type

> (operate-on-system "simulator" 'compile)
> or shorter
> (oos "simulator" 'compile)
> instead of
> make simulator

> ?

No, that doesn't make it isolationist.  But there are hundreds of
thousands of lines of code in various languages that can do smart
stuff with CVS/make/&co, like making web pages, tracking bugs,
checking commits, automating builds, regression testing, managing
releases &c &c &c, and spending time reinventing versions of this
stuff certainly does seem pretty isolationist to me.

--tim

From: Kelly Murray
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Tue, 27 Oct 1998 00:00:00 +0000
Message-ID: <363678AF.67D0C32A@IntelliMarket.Com>

My interest in version control has little to do with FILES,
in which case Make, diff, configure, RCS, and CVS are useless.
In my view, half the problem is dealing with all these patched
together tools.  It's amazing they work most of the time.
It's a nightmare when they don't.
Almost impossible to debug problems, at least by mere mortals.

To add yet another layer on top seems the wrong approach.
 Starting from a clean-slate and doing something which is
 specifically integrated for Lisp is great.  And in fact,
 as has been pointed out, if it is done well, it could actually
 be used to manage non-lisp applications.

-Kelly Murray  ···@intellimarket.com

From: Tim Bradshaw
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Wed, 28 Oct 1998 00:00:00 +0000
Message-ID: <ey390i0bzvf.fsf@todday.aiai.ed.ac.uk>

* Kelly Murray wrote:

> To add yet another layer on top seems the wrong approach.
>  Starting from a clean-slate and doing something which is
>  specifically integrated for Lisp is great.  And in fact,
>  as has been pointed out, if it is done well, it could actually
>  be used to manage non-lisp applications.

Yes of course that would be great.  But for most people it's a choice
between doing something like that (which is really reinventing the
wheel to some extent, even if your new wheel is better than any old
ones), and getting on and writing the system that will make you rich,
you hope.  Of course, you might hope that the superior wheel will *be*
the system that makes you rich -- that might even be true!  But, at
least for me, I just don't think I'd gain enough productivity from a
new improved Lisp wheel to make it worth my while doing it instead of
what I'm actually interested in.

--tim

From: Francis Leboutte
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Wed, 04 Nov 1998 00:00:00 +0000
Message-ID: <364308b8.175319796@192.168.1.250>

Tim Bradshaw <···@aiai.ed.ac.uk> wrote:

>* Kelly Murray wrote:
>
>> To add yet another layer on top seems the wrong approach.
>>  Starting from a clean-slate and doing something which is
>>  specifically integrated for Lisp is great.  And in fact,
>>  as has been pointed out, if it is done well, it could actually
>>  be used to manage non-lisp applications.
>
>Yes of course that would be great.  But for most people it's a choice
>between doing something like that (which is really reinventing the
>wheel to some extent, even if your new wheel is better than any old
>ones), and getting on and writing the system that will make you rich,
>you hope.  Of course, you might hope that the superior wheel will *be*
>the system that makes you rich -- that might even be true!  But, at
>least for me, I just don't think I'd gain enough productivity from a
>new improved Lisp wheel to make it worth my while doing it instead of
>what I'm actually interested in.
>
>--tim

C'est la vie. It is exactly what we are going to do: add some
functionality and a graphical interface on top of CVS, to be used in the IDE of
ACL5. This interface should display not only the files but also the
systems. It should also include functions to edit, compile and search.
Hopefully we will do that in a few days (Hum).

Francis

--
Francis Leboutte
··········@skynet.be ········@acm.org  http://users.skynet.be/algo

From: Christopher Stacy
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Tue, 27 Oct 1998 00:00:00 +0000
Message-ID: <u90i2371m.fsf@pilgrim.com>

I think Kent Pitman wrote a paper years ago about things like the
difference in purpose and functionality between compilation and version
control tools (examples included RCS, Make, DEFSYSTEM, and other things).
People might want to read that.  I think it was an MIT AI Lab memo.

From: Howard R. Stearns
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Tue, 27 Oct 1998 00:00:00 +0000
Message-ID: <363613B1.D615E37B@elwood.com>

What I think I hear people saying is that they would like to be able to
have (unspecified) Lisp programs make use of version information and
version control tools such as RCS, SCCS, etc., in such a way that the
same information could also be made use of by other (non-Lisp) tools. 
One of the Lisp utilities which should be able to make use of such
information is DEFSYSTEM.  Specifically, DEFSYSTEM should be able to
keep track of the mapping between file and system versions, just as the
old Symbolics System-Construction-Tools did.

ANSI CL already defines how version information should be used in
pathnames.  If all Lisp implementations did this, it would be relatively
straightforward to extend defsystem to keep track of the mapping.  The
point is to keep the pathname-version hair in the implementation of
pathnames, not in the implementation of defsystem.  Defsystem only needs
to map pathname-versions to system-versions, not provide the
implementation of pathname versions. Symbolics pathnames and SCT did
this, and Sean Tierney at ICAD did a portable and extendible source
control system this way several years ago, and it worked very well.  
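
As a rough illustration of the mapping described above, here is a minimal
sketch in portable Common Lisp.  It is not SCT or the ICAD system: the
variable *SYSTEM-VERSIONS* and both function names are hypothetical, file
names are plain strings, and file versions are assumed to be integers.

```lisp
;; Hypothetical sketch: none of these names come from a real DEFSYSTEM.
(defvar *system-versions* (make-hash-table :test #'equal)
  "Maps (system-name . system-version) to an alist of (file . file-version).")

(defun record-system-version (system version file-versions)
  "Remember which file versions made up VERSION of SYSTEM."
  (setf (gethash (cons system version) *system-versions*) file-versions))

(defun file-version-in-system (system version file)
  "Return the version of FILE that belongs to VERSION of SYSTEM, or NIL."
  (cdr (assoc file
              (gethash (cons system version) *system-versions*)
              :test #'string=)))

;; Two versions of a system that differ in one file.
(record-system-version "simulator" 1
                       '(("assembly.lisp" . 3) ("driveline.lisp" . 7)))
(record-system-version "simulator" 2
                       '(("assembly.lisp" . 4) ("driveline.lisp" . 7)))
```

The table only records which file versions belong to which system version;
resolving a file version to actual contents would stay in the pathname
layer, as the paragraph above suggests.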

Note that defsystem is not the only thing in Lisp which needs to properly
handle versions.  The idea is to make pathnames fully support versions
in general, and CL already defines how this should work with respect to
OPEN, DIRECTORY, etc.  The only additional thing we need is to work out
a way for the pathname utilities to be implemented in such a way that
anyone can come along and hook in their favorite external mechanism for
storing version information and contents.  This is what I was alluding
to in my previous post.  In response to further queries, I'm appending
an outline of this protocol, below.

I have not defined a similar MOP for defsystem, but the idea is the
same: provide a general user interface based on generic functions
specialized on different kinds of modules, and provide a means for users
to create new operations and new classes of modules.

With respect to 'make':  I generally have a problem with "specialized
scripting languages" which limit their expressive power by providing less
than general functionality, and 'make' is no exception.  I have also
found portability to be a problem.  Nonetheless, given the above setup
of pathnames and defsystem, there is no reason that someone couldn't
define a defsystem operation which printed out a makefile for some
particular flavor of 'make'.
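
A minimal sketch of what such a makefile-printing operation might look like,
assuming each module is given as a list (target dependencies command).  The
function name PRINT-MAKEFILE and the "lisp-compile" command are made up for
illustration; a real operation would walk the defsystem's own module
objects and emit whatever compile command the chosen Lisp provides.

```lisp
;; Hypothetical sketch: PRINT-MAKEFILE is not part of any real defsystem.
(defun print-makefile (modules &optional (stream *standard-output*))
  "Print one make rule per module.  MODULES is a list of
(target dependencies command) lists."
  (dolist (module modules)
    (destructuring-bind (target dependencies command) module
      ;; A make recipe line must start with a tab character.
      (format stream "~A:~{ ~A~}~%~C~A~%"
              target dependencies #\Tab command))))

;; Example use; "lisp-compile" is a placeholder command name.
(print-makefile
 '(("assembly.fasl" ("assembly.lisp") "lisp-compile assembly.lisp")
   ("driveline.fasl" ("driveline.lisp" "assembly.fasl")
    "lisp-compile driveline.lisp")))
```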

-------------------
Warning: This assumes familiarity with MOP protocols such as AMOP and
the CLOS-Streams protocol.  Also, note that in addition to supporting
version control, this is intended to support different networked and
cross-platform file systems, and extended-char namestrings.

Pathname class hierarchy, each subclassable by user:

 PATHNAME
  LOGICAL-PATHNAME
    URN-PATHNAME ???
      URL-PATHNAME ???
  PHYSICAL-PATHNAME
    UNIX-PATHNAME
      RCS-PATHNAME ???
      SCCS-PATHNAME ???
      GNU-EMACS-PATHNAME ???
    WINDOWS-PATHNAME
    ...

Pathname generic-functions used to implement public ANSI CL pathname
functions. Users may customize by specializing on user-defined
subclasses of the above.  This is analogous to the stream-xxx methods
in CLOS-Streams:

 PATHNAME thing
 PATHNAME-COMPONENT pathname component-name
   (a form of slot-value that returns a canonicalized, common-case
    value)
 PATHNAME-LOCAL-COMPONENT pathname component-name
   (a localized version of slot-value)
 MAKE-LOAD-FORM pathname &optional environment
 PATHNAME-DEFAULT-DEVICE pathname
 PATHNAME-NAMESTRING pathname &optional errorp
 PATHNAME-ENOUGH-NAMESTRING pathname default &optional errorp
 PATHNAME-HOST-NAMESTRING pathname &optional errorp
 PATHNAME-DIRECTORY-NAMESTRING pathname &optional errorp
 PATHNAME-FILE-NAMESTRING pathname &optional errorp
 PATHNAME-PARSE-NAMESTRING pathname namestring start end junk-allowed
 PATHNAME-PARSE-HOST pathname namestring start end junk-allowed
 PATHNAME-PARSE-DEVICE pathname namestring start end junk-allowed
 PATHNAME-PARSE-DIRECTORY pathname namestring start end junk-allowed
 PATHNAME-PARSE-NAME pathname namestring start end junk-allowed
 PATHNAME-PARSE-TYPE pathname namestring start end junk-allowed
 PATHNAME-PARSE-VERSION pathname namestring start end junk-allowed
 PATHNAME-MATCH pathname1 pathname2 &optional versionp
 EQUAL pathname1 pathname2
 PATHNAME-WILD-P pathname component-name
 PATHNAME-TRANSLATE source from-wildname to-wildname
 PATHNAME-MERGE pathname default default-version
 PATHNAME-UNMERGE pathname default

FILE-STREAM generic functions:

 FILE-STREAMs should take init-args of :pathname, :translated-pathname
   and :truename (which has version and type resolved, see below). The
   MAKE-INSTANCE method is responsible for doing whatever is necessary
   in the file system.  This might simply call operating-system
   open()/fopen(), or first check-out a file under
   version-control. The implementation may use other init-args
   supplied by OPEN in order to support such :if-exists options as
   :rename, :rename-and-delete, etc.  For example, an implementation
   might accept :renamed and :cleanup init-args that are used by the
   CLOSE method.  The CLOSE method might need to check-in the file.
 FILE-STREAM-CLASS pathname truename direction external-format element-type
   (Called by OPEN to return a (possibly user-defined) file-stream
    meta-class object of which a newly opened stream should be an
    instance.)
 STREAM-ELEMENT-SIZE stream
 STREAM-LENGTH file-stream
 STREAM-OBJECT-LENGTH file-stream object &key start end
 (SETF) STREAM-POSITION file-stream

File-system generic-functions used to implement public ANSI CL
file-system functions:

 PATHNAME-RESOLVE-TYPE pathname &key follow-links &allow-other-keys
   (Canonicalize pathname to a truename using file system
    information, or return nil if it does not exist.)
 PATHNAME-RESOLVE-VERSION pathname
   (resolves mnemonic versions to numeric, using file system)
 PATHNAME-MAKE-DIRECTORY pathname
 PATHNAME-EXPUNGE-FILE pathname
 PATHNAME-WRITE-DATE pathname
 PATHNAME-AUTHOR pathname
 PATHNAME-DIRECTORY pathname &key &allow-other-keys
   subprotocol: (it might be that normal open, close, etc. are
      sufficient for directory-streams, too)
   OPEN-DIRECTORY pathname (returns a directory-stream)
   DIRECTORY-ENOUGH-NAMESTRING directory-stream file-pathname directory-pathname
     (returns a string suitable for use by READ-DIRECTORY)
   READ-DIRECTORY directory-stream namestring-to-match
   CLOSE-DIRECTORY 

All file-stream implementations need to do buffering, but the
CLOS-Streams protocol doesn't define any standard mechanism for doing
this -- presumably because there isn't a clearly correct way to do
this.  Without a buffering protocol, users can still use CLOS-Streams
effectively, but they probably reinvent things that the implementation
already uses internally.  Similarly, the pathname protocol I've
outlined here does not define protocols for generating or parsing
namestrings in the pathname protocol, or forming operating-system-specific
namestrings in the file-system protocol.  As a result, users may
reinvent the wheel, but they'll still be able to do things.  Of
course, a particular implementation could also expose these lower
level protocols as well.
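
To make the shape of the outline above concrete, here is a small sketch of
the "generic function specialized on user-defined pathname subclasses" idea.
It assumes stand-in classes rather than subclasses of the built-in PATHNAME
(which most implementations do not allow), so SKETCH-PATHNAME,
SKETCH-RCS-PATHNAME and RESOLVE-VERSION are all hypothetical names, and the
revision list is supplied by hand instead of being read from an RCS file.

```lisp
;; Hypothetical stand-ins for PHYSICAL-PATHNAME and RCS-PATHNAME.
(defclass sketch-pathname ()
  ((name    :initarg :name    :reader sketch-name)
   (version :initarg :version :reader sketch-version :initform :newest)))

(defclass sketch-rcs-pathname (sketch-pathname)
  ;; Assumed to be ordered oldest first; a real method would ask RCS.
  ((revisions :initarg :revisions :reader sketch-revisions :initform '())))

(defgeneric resolve-version (path)
  (:documentation
   "Turn a mnemonic version such as :NEWEST into a concrete one."))

(defmethod resolve-version ((path sketch-pathname))
  ;; Plain file system: nothing to resolve, pass the value through.
  (sketch-version path))

(defmethod resolve-version ((path sketch-rcs-pathname))
  (let ((version (sketch-version path)))
    (if (eq version :newest)
        (first (last (sketch-revisions path)))
        version)))

;; Example: the newest revision of a version-controlled file.
(resolve-version (make-instance 'sketch-rcs-pathname
                                :name "assembly.lisp"
                                :revisions '("1.1" "1.2")))
```

Someone hooking in a different external mechanism would add another
subclass and another method, leaving callers of RESOLVE-VERSION unchanged.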

From: Tim Bradshaw
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Wed, 28 Oct 1998 00:00:00 +0000
Message-ID: <ey3af2gc0wr.fsf@todday.aiai.ed.ac.uk>

* Howard R Stearns wrote:
> What I think I hear people saying is that they would like to be able to
> have (unspecified) Lisp programs make use of version information and
> version control tools such as RCS, SCCS, etc., in such a way that the
> same information could also be made use of by other (non-Lisp) tools. 
> One of the Lisp utilities which should be able to make use of such
> information is DEFSYSTEM.  Specifically, DEFSYSTEM should be able to
> keep track of the mapping between file and system versions, just as the
> old Symbolics System-Construction-Tools did.

I think that you need to do quite a lot more than SCT, specifically
you need to be able to deal with branches.  I don't think Lisp
pathnames are up to that as they stand.  There does seem to be the
beginnings of support for branches in Genera 8.3 although there's no
documentation for it and I don't know if it ever worked.  It would be
interesting if some old symbolics person knew what was intended.

--tim

From: Howard R. Stearns
Subject: Re: VERSION-CONTROL (was Re: Symbolic Expression Counting CONTAINS SPOILER.)
Date: Wed, 28 Oct 1998 00:00:00 +0000
Message-ID: <3637661E.6BE62DB7@elwood.com>

Tim Bradshaw wrote:
> 
> * Howard R Stearns wrote:
> > What I think I hear people saying is that they would like to be able to
> > have (unspecified) Lisp programs make use of version information and
> > version control tools such as RCS, SCCS, etc., in such a way that the
> > same information could also be made use of by other (non-Lisp) tools.
> > One of the Lisp utilities which should be able to make use of such
> > information is DEFSYSTEM.  Specifically, DEFSYSTEM should be able to
> > keep track of the mapping between file and system versions, just as the
> > old Symbolics System-Construction-Tools did.
> 
> I think that you need to do quite a lot more than SCT, specifically
> you need to be able to deal with branches.  I don't think Lisp
> pathnames are up to that as they stand.  There does seem to be the
> beginnings of support for branches in Genera 8.3 although there's no
> documentation for it and I don't know if it ever worked.  It would be
> interesting if some old symbolics person knew what was intended.
> 
> --tim

Good point.  I believe that the ICAD pathnames I mentioned in my
previous post DID handle tree structured versions.

The CL spec does not address branches in versions or hosts, but it DOES
address tree structured directories.  

The pathname MOP I sketched out in my previous post does not specifically
address tree-structured directories, but is sufficient to encompass them
(by existence proof).  That is, the implementers of certain methods such
as (PATHNAME-MERGE pathname default default-version) must decide if and
how tree-structured directories must be handled.  Given that, the overall
structure of the pathname MOP does correctly handle everything that
needs to be handled with respect to tree-structured directories. 
Furthermore, it even provides sufficient hooks such that binary pathname
operations such as merge-pathnames and pathname-match-p can handle
pathnames of different classes, that have different behavior for
tree-structured directories.

I believe (but have not demonstrated) that the pathname MOP is
similarly sufficient to support other tree-structured components in
versions, hosts, types(!), etc.  Note that the ability to define behavior
for working with pairs of pathnames of different classes is crucial.

My previous post specifically mentioned that the CLOS-stream MOP doesn't
define an implementation of buffering, and the pathname MOP doesn't
define an implementation of parsing and generating namestrings, yet both
are sufficient to do real things -- even if at the cost of sometimes
reinventing the wheel.  I would add tree-structured components as a
similar item in this category.

From: David J Cooper Jr
Subject: Re: VERSION-CONTROL and ICAD's Source Control facility
Date: Thu, 29 Oct 1998 00:00:00 +0000
Message-ID: <3638CD89.E3DCACFB@genworks.com>

Howard R. Stearns wrote:

> Good point.  I believe that the ICAD pathnames I mentioned in my
> previous post DID handle tree structured versions.

I still work with these ICAD OSI (``Operating System Independent'') pathnames
every day.

Here is an example of what they look like:

    gm-kbe:light-duty-truck;chassis;driveline;source;assembly.lisp.1.1

    The gm-kbe: refers to a ``host'' (mapped to a physical filesystem
directory, which can be specified as :simple or :rcs-unix), the words between
the semicolons refer to directory path components, and the assembly.lisp is a
version-controlled filename.

    The ``1.1'' on the end refers to an RCS version number. As you know,
RCS supports branching in version of an individual file (1.1.1.1, 1.1.2.1,
1.1.2.2, etc, etc.). However, I do not believe the ISC (ICAD Source Control)
system does very much sophisticated with allowing branching of systems
corresponding to branching of file versions.
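
For readers who want to see the structure of such a namestring spelled out,
here is a rough parsing sketch.  It is not ICAD code: the function names are
invented, and it assumes a well-formed namestring with exactly one host
colon, a file name without dots, and everything after the type being the
version.

```lisp
;; Hypothetical helpers, not part of ICAD or ISC.
(defun split-on (char string)
  "Split STRING at each CHAR, returning a list of substrings."
  (loop with start = 0
        for pos = (position char string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

(defun parse-osi-namestring (namestring)
  "Return a property list of the components of an OSI-style NAMESTRING."
  (let* ((colon  (position #\: namestring))
         (host   (subseq namestring 0 colon))
         (parts  (split-on #\; (subseq namestring (1+ colon))))
         (file   (first (last parts)))
         (fields (split-on #\. file)))
    (list :host host
          :directory (butlast parts)
          :name (first fields)
          :type (second fields)
          ;; Rejoin the remaining fields so "1.1.2.1" stays in one piece.
          :version (format nil "~{~A~^.~}" (cddr fields)))))

(parse-osi-namestring
 "gm-kbe:light-duty-truck;chassis;driveline;source;assembly.lisp.1.1")
```

Keeping the version as a dotted string, rather than a number, leaves room
for RCS branch revisions such as 1.1.2.1.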

    I could see the use for such a capability when going back to production
versions of a system and applying ``bug fix'' patches, while maintaining a
current ``development'' version of the system.   Currently the patching
mechanism works by just submitting your patch from a temporary file, which
gets copied into a special ``patch'' subdirectory for the appropriate version
of the system or component-system in question, compiled, and loaded. It's a
workable solution, but you must always make sure to apply the same fix to the
latest ``development'' version of the file (in practice, nine times out of 10
this is the file you end up making the patch from anyway, so there's not
usually a problem).

    The next time I talk to the KTI development staff (KTI is Knowledge
Technologies International, the vendor of the ICAD System (a.k.a. the
Knowledge Base Organizer)), I will find out what possibility there would be
of submitting the ICAD Source Control system to the Lisp community. Everyone
might stand some chance of benefiting if the Lisp community could take it and
extend it.

    We in the ICAD community have done this to some extent; Genworks provides
a package called ISC-Control which is basically a set of ``power tools'' for
building and maintaining ISC systems. But there are always ways it could be
extended.

    Of course, there would be issues trying to port something like ISC into a
non-ICAD Lisp---but if interest is there I believe I have solutions for most
if not all of these issues.

    Anyone who would like to follow up on this idea please respond to me or to
the newsgroup.  My main goal in all this, of course, is to benefit the Lisp
community at large, without pissing off KTI, since we (Genworks) are in a
strong partnership with them and want nothing but success for them. ``What is
good for Lisp in general is good for KTI.''  I hope they realize that as much
as I do...


 -djc


--
   "The Lisp ignorants will come up with an amazing array of bull****
    to defend their ignorance. They are like racists that way"

--Erik Naggum

From: Tim Bradshaw
Subject: Re: VERSION-CONTROL and ICAD's Source Control facility
Date: Fri, 30 Oct 1998 00:00:00 +0000
Message-ID: <ey3u30m9qyz.fsf@todday.aiai.ed.ac.uk>

* David J Cooper wrote:

>     The ``1.1'' on the end refers to an RCS version number. As you know,
> RCS supports branching in version of an individual file (1.1.1.1, 1.1.2.1,
> 1.1.2.2, etc, etc.). However, I do not believe the ISC (ICAD Source Control)
> system does very much sophisticated with allowing branching of systems
> corresponding to branching of file versions.

This is slightly at a tangent, but I think I'm branching back to
something close to the original thread (:).

It's important to realise that if you want to do a real VC system you
need to do more than branches, you also need to be able to do branch
*merges* of a reasonably sophisticated kind.  The sort of thing that
you want to be able to express is:

	for this module, merge all the changes on branch A between
	time a and b into the current version of branch B, and tell me
	about the conflicts, and let me resolve them somehow.

This is the kind of thing that CVS lets you do, and it's a very
important thing to have.  Of course CVS does it in a very suboptimal
way (after all, it's Unix), but there are emacs-based tools that make
it a lot less painful.  To do it really *right* you need knowledge of
the logical structure of the data, particularly you need to know that
some bits of code may depend on other remote bits in such a way that a
simple textual merge can lose badly.  To my knowledge, SCT (which, OK,
is quite old now) really has no story about either branches, or
merging of this sophistication (you can merge two files rather nicely
in zmacs, but that's not enough).
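
To show what "merge and tell me about the conflicts" means at its very
simplest, here is a toy three-way merge.  It is a deliberately naive sketch,
not how CVS works: MERGE-LINES is a made-up name, the three versions are
assumed to be lists of lines of equal length (no insertions or deletions),
and the comparison is purely textual, which is exactly the kind of merge
that can lose badly when remote bits of code depend on each other.

```lisp
;; Hypothetical toy: line-by-line three-way merge of equal-length lists.
(defun merge-lines (base ours theirs)
  "Merge OURS and THEIRS relative to their common ancestor BASE.
Returns two values: the merged lines and the indices of conflicts."
  (let ((merged '())
        (conflicts '()))
    (loop for b in base
          for o in ours
          for th in theirs
          for index from 0
          do (cond ((or (equal o th) (equal th b))
                    ;; Both agree, or only our side changed the line.
                    (push o merged))
                   ((equal o b)
                    ;; Only their side changed the line.
                    (push th merged))
                   (t
                    ;; Both sides changed it differently: flag it.
                    (push (list :conflict o th) merged)
                    (push index conflicts))))
    (values (nreverse merged) (nreverse conflicts))))

(merge-lines '("a" "b" "c" "d")
             '("a" "B" "c" "D1")
             '("a" "b" "C" "D2"))
```

In this example the second and third lines merge cleanly because only one
side touched each, while the fourth is reported as a conflict for the user
to resolve.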

--tim

From: Howard R. Stearns
Subject: Re: VERSION-CONTROL and ICAD's Source Control facility
Date: Fri, 30 Oct 1998 00:00:00 +0000
Message-ID: <363A07C4.6DAA3648@elwood.com>

Tim Bradshaw wrote:
> 
> * David J Cooper wrote:
> 
> >     The ``1.1'' on the end refers to an RCS version number. As you know,
> > RCS supports branching in version of an individual file (1.1.1.1, 1.1.2.1,
> > 1.1.2.2, etc, etc.). However, I do not believe the ISC (ICAD Source Control)
> > system does very much sophisticated with allowing branching of systems
> > corresponding to branching of file versions.
> 
> This is slightly at a tangent, but I think I'm branching back to
> something close to the original thread (:).
> 
> It's important to realise that if you want to do a real VC system you
> need to do more than branches, you also need to be able to do branch
> *merges* of a reasonably sophisticated kind.  The sort of thing that
> you want to be able to express is:
> 
>         for this module, merge all the changes on branch A between
>         time a and b into the current version of branch B, and tell me
>         about the conflicts, and let me resolve them somehow.
> 
> This is the kind of thing that CVS lets you do, and it's a very
> important thing to have.  Of course CVS does it in a very suboptimal
> way (after all, it's Unix), but there are emacs-based tools that make
> it a lot less painful.  To do it really *right* you need knowledge of
> the logical structure of the data, particularly you need to know that
> some bits of code may depend on other remote bits in such a way that a
> simple textual merge can lose badly.  To my knowledge, SCT (which, OK,
> is quite old now) really has no story about either branches, or
> merging of this sophistication (you can merge two files rather nicely
> in zmacs, but that's not enough).
> 
> --tim

I agree that this is a nice thing to be able to do, but it is not clear
to me how this affects the design of either versioned pathnames or
versioned systems.  

My point in these posts has been that an extensible pathname MOP could
be designed to let Lisp pathnames retain and manipulate version
identifiers, and that given this, a defsystem MOP could be designed to
make use of these versioned pathnames.  (FYI, I believe ICAD's
operating-system-independent pathnames predate ANSI logical pathnames,
but are obviously based on CLtL2/Symbolics logical pathnames.)  

Note that a pathname is simply an identifier for a file -- it does not
need any knowledge about the CONTENTS or STRUCTURE of that file.  The
pathname MOP needs to provide sufficient hooks such that whatever
file-system utilities external to Lisp are used, an appropriate stream
can be created which supports the appropriate Common Lisp things to do
with a stream.  CL does not define any merging operations which these
pathnames would need to support.  Of course, there is no reason why such
utilities could not be written, making use of whatever CVS or other
facilities are desired.  But the starting point for merge facilities,
versioned defsystem MOPs, file-stream MOPs, source-code databases, or
what-have-you, is, to my mind, a pathname MOP.  

My question for the Lisp community is whether anyone can think of
anything required for these other facilities which impacts the design of
the pathname MOP.  Noting that tree-structured versions are desirable is
an example.  Noting that merging or auto-patching facilities are
desirable may impact the defsystem design, but I think they are more
likely to impact an interactive user-interface (i.e., editor) for
defsystem, as opposed to the core defsystem design itself.

From: Tim Bradshaw
Subject: Re: VERSION-CONTROL and ICAD's Source Control facility
Date: Fri, 30 Oct 1998 00:00:00 +0000
Message-ID: <ey3pvb9ak2i.fsf@todday.aiai.ed.ac.uk>

* Howard R Stearns wrote:

[Merging of versions]

> I agree that this is a nice thing to be able to do, but it is not clear
> to me how this affects the design of either versioned pathnames or
> versioned systems.  

I agree. I was just trying to remind people that there is more to
being able to do VC than just knowing about versions (and that `more'
is probably much harder).

--tim

From: Christopher Stacy
Subject: Re: VERSION-CONTROL and ICAD's Source Control facility
Date: Sat, 31 Oct 1998 00:00:00 +0000
Message-ID: <usog4o7cv.fsf@pilgrim.com>

I would like to suggest the following paper:
 ``The Description of Large Systems'', MIT AI Memo 801,
   September, 1984, by Kent Pitman.

From: David Steuber "The Interloper
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Tue, 20 Oct 1998 00:00:00 +0000
Message-ID: <362dd88c.5005357@news.newsguy.com>

On Mon, 19 Oct 1998 03:32:58 GMT, Barry Margolin
<······@bbnplanet.com> claimed or asked:

% Folks, is this debate really appropriate for this group?  Computer
% scientists have been doing research for decades on complexity metrics for
% programs, and I suspect that comp.software-eng would be a better place to
% discuss the merits of complexity analysis.
% 
% The issue that's appropriate for this group is how to implement a
% particular metric with Lisp programs.

I am interested in the metric for Lisp, but David Lamkins and Kent
Pitman have both pointed out important issues that affect the meaning
of the metric.  I could simply run wc on a file.  Would that give me a
meaningful indication of complexity?  I don't think so.

I thought perhaps counting s-exps would be sufficient.  Maybe it is,
but clearly there are some important caveats.

I don't know how far off topic this thread may have drifted.  But I
feel that I have benefited from the issues raised by others.  For that
reason alone, I am willing to go out on a limb and say that this
debate (although there is really no pro or con here) is appropriate.

There is another point as well.  Problem complexity, as brought up by
David Lamkins, seems to be a case where the method of bottom up design
via use of macros (I haven't got my copy of On Lisp yet) can be used
to simplify the expression complexity.  This is assuming that existing
language features are taken advantage of rather than reinventing some
wheel needlessly.

When I tackle problems in C++, I tend to take a top down approach.  I
chunk the problem down until I have small enough chunks that I can
implement them.  I am not used to going in the other direction.  I
have done it once in Perl.  I had to go bottom up, because I couldn't
see any way to break down the problem from the top.

But I digress.  Perhaps it would be helpful to the discussion if I
added a reason that I want some sort of metric.  Actually, I think I
already did, so I am reiterating the point (I wonder if that is
tail-recursion?).  I believe, rightly or wrongly, that the number of
bugs in a program is related to the program's size.  In fact, I
think the relationship is probably non-linear in favor of more bugs as
the program size grows.  So I wish to employ methods for measuring
program complexity for the purposes of finding ways to reduce the
complexity.  The first step is simple s-exp counting.  I don't know
what the second step is.  I guess you could say I'm working from the
bottom up :-).
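
For what that first step might look like, here is a minimal sketch of one
possible counting rule: every atom counts as one s-expression, and every
list counts as one plus the count of its elements.  The function names are
made up for this example, and other rules (for instance, ignoring atoms)
are just as defensible.  Note that the reader discards comments and expands
reader macros, so 'x is counted as (quote x).

```lisp
;; Hypothetical names; one possible counting rule among several.
(defun count-sexps (form)
  "Count FORM itself plus every s-expression nested inside it."
  (if (atom form)
      1
      (do ((tail form (cdr tail))
           (total 1 (+ total (count-sexps (car tail)))))
          ;; A non-NIL atom here is the tail of a dotted list.
          ((atom tail) (if tail (1+ total) total)))))

(defun count-sexps-in-stream (stream)
  "Read every top-level form from STREAM and sum their counts."
  (let ((eof (gensym "EOF"))
        (*read-eval* nil))            ; do not execute #. forms while reading
    (loop for form = (read stream nil eof)
          until (eq form eof)
          sum (count-sexps form))))

(count-sexps '(defun f (x) (* x x)))  ; counts the list, its atoms, sublists
```

To count a file, open it with WITH-OPEN-FILE and pass the stream to
COUNT-SEXPS-IN-STREAM.  Symbols are interned in the current package while
reading, so a file that uses its own packages needs them defined first.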

--
David Steuber (ver 1.31.2a)
http://www.david-steuber.com
To reply by e-mail, replace trashcan with david.

So I have this chicken, see?  And it hatched from this egg, see?  But
the egg wasn't laid by a chicken.  It was cross-laid by a turkey.

From: rusty craine
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Sun, 18 Oct 1998 00:00:00 +0000
Message-ID: <70e486$3ad$1@excalibur.flash.net>

David Steuber "The Interloper" wrote in message
<················@news.newsguy.com>...

>Maybe complexity needs to be defined.  But the code wasn't so complex
>as expressing something complex, concisely.  Lisp seems to be very
>expressive.  Or is it concise?  Or terse?
>
Ah....Well put, I suspect that I'm guilty of never really separating the
functionality from the function (code).  My focus seems to be always on the
functionality; I think t'is clearly demonstrated  when I compare my code
with examples from this group.  Since I use lisp to do pharmacokinetic
calculations at our hospital, I have avoided the more esoteric (and maybe the
most powerful) aspects of lisp for a "what you see is what you get"
programming style.  The drugs we model have narrow therapeutic indices
(narrow dose range between help and hurt) so the correct answer is more
important than an elegant style (at least to the patient and my liability
insurance company).  Critical-safe  programming in floating point is not for
the faint of heart....abstracting the functions that deal with these
calculations would give me apoplexy.

A dose of lisp
rusty

From: Gareth McCaughan
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Fri, 23 Oct 1998 00:00:00 +0000
Message-ID: <86iuhbtb78.fsf@g.pet.cam.ac.uk>

Rusty Craine wrote:

> I'm not sure what complexity means, but  one of the respondents regarding a
> request for help from me shared the following
> (defun  permutations-list (ls)
> (if (cdr ls)
>   (mapcan
>     (lambda (pp)
>       (cons (cons (car ls) pp)
>          (maplist (lambda (ta)
>                       (nconc (ldiff pp (cdr ta)) (list (car ls))
>                               (copy-list (cdr ta))))
>                        pp)))
>         (permutations-list (cdr ls)))
>         (list ls)))
...
> Took me (maybe i should admit this)  4 hours of "flow charting" before i
> could really explain to someone what was happening (though it just took 15
> minutes to get it running) .

I claim this is because it was rather impolitely presented.
What would have happened to those 4 hours if the code had been
replaced by the following (which differs only in using nicer
names for things, being properly indented and including a few
brief comments)?

    (defun permutations-list (items)
      "Return a list containing all permutations of ITEMS."
      (if (rest items)
        ;; More than 1 element.
        (mapcan
          #'(lambda (permutation)
              ;; This function returns the list of all permutations
              ;; of ITEMS that produce PERMUTATION when FIRST-ITEM
              ;; is omitted.
              (cons
                (cons (first items) permutation)
                ;; Now insert FIRST-ITEM after each item in PERMUTATION.
                (maplist #'(lambda (tail)
                             (nconc (ldiff permutation (rest tail))
                                    (list (first items))
                                    (copy-list (rest tail))))
                         permutation)))
          (permutations-list (rest items)))
        ;; Only 1 element: so only 1 permutation.
        (list items)))

Almost any code can be confusing if it's misleadingly indented,
uses unhelpful names for things, and is totally undocumented.

-- 
Gareth McCaughan       Dept. of Pure Mathematics & Mathematical Statistics,
·····@dpmms.cam.ac.uk  Cambridge University, England.

From: Barry Margolin
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Sun, 18 Oct 1998 00:00:00 +0000
Message-ID: <ynfW1.8$2L4.519081@burlma1-snr1.gtei.net>

In article <················@news.newsguy.com>,
David Steuber "The Interloper" <········@david-steuber.com> wrote:
>Why would the number of s-expressions in a program not be a good
>measure of code complexity?

If I understand your function correctly, it's just counting the number of
top-level forms in the file.  Most top-level forms in a program are just
variable and function definitions.  So your function would consider a
program with ten two-line functions (20 lines in all) to be more complex
than one with five 100-line functions (500 lines total).  There's something
obviously wrong with considering all top-level forms to be equivalent.
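
To make the distinction concrete, here is a minimal sketch (not the
function under discussion) of a non-destructive counter that descends
into nested lists instead of stopping at top-level forms.  The name
COUNT-SUBFORMS, and the choice to count NIL and a dotted tail as atoms,
are assumptions made for illustration; circular structure is not handled.

```lisp
(defun count-subforms (form)
  "Return two values: the number of lists and the number of atoms in FORM.
FORM is not modified.  NIL and dotted tails are counted as atoms."
  (let ((lists 0)
        (atoms 0))
    (labels ((walk (x)
               (cond ((consp x)
                      (incf lists)
                      ;; Visit each element, then any non-NIL dotted tail.
                      (loop for tail = x then (cdr tail)
                            while (consp tail)
                            do (walk (car tail))
                            finally (when tail (walk tail))))
                     (t (incf atoms)))))
      (walk form))
    (values lists atoms)))
```

For example, (count-subforms '(defun f (x) (+ x 1))) returns 3 and 6:
one top-level form, but three lists and six atoms in all.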

-- 
Barry Margolin, ······@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.

From: David Steuber "The Interloper"
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Sun, 18 Oct 1998 00:00:00 +0000
Message-ID: <362b3dec.2244467@news.newsguy.com>

On Sun, 18 Oct 1998 05:53:34 GMT, Barry Margolin
<······@bbnplanet.com> claimed or asked:

% If I understand your function correctly, it's just counting the number of
% top-level forms in the file.  Most top-level forms in a program are just
% variable and function definitions.  So your function would consider a
% program with ten two-line functions (20 lines in all) to be more complex
% than one with five 100-line functions (500 lines total).  There's something
% obviously wrong with considering all top-level forms to be equivalent.

Obviously.  My intent (the function was written by someone else) is to
measure the number of s-exps, not the "top-level forms".  Someone
responded with the suggestion of counting unquoted occurrences of '('.
That sounds like it would be close.  With a large program, a small
statistical error is not significant.
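
The paren-counting suggestion can be sketched as a small character
scanner.  This is only an approximation under stated assumptions: the
name COUNT-OPEN-PARENS is made up here, and the scanner knows about
string literals, semicolon comments and backslash escapes (so #\( is
skipped) but not about #| ... |# block comments or |...| symbols.

```lisp
(defun count-open-parens (text)
  "Count the #\( characters in TEXT that lie outside strings and ; comments."
  (let ((count 0)
        (state :code)
        (i 0)
        (n (length text)))
    (loop while (< i n)
          do (let ((ch (char text i)))
               (ecase state
                 (:code
                  (case ch
                    (#\( (incf count))
                    (#\" (setf state :string))
                    (#\; (setf state :comment))
                    ;; A backslash escapes the next character, as in #\(.
                    (#\\ (incf i))))
                 (:string
                  (case ch
                    (#\\ (incf i))
                    (#\" (setf state :code))))
                 (:comment
                  (when (char= ch #\Newline)
                    (setf state :code)))))
             (incf i))
    count))
```

Such a scanner never calls READ, so it cannot be tripped up by unknown
packages or reader macros, at the price of only approximating the
reader's rules.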

From: Matthias Hölzl
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Mon, 19 Oct 1998 00:00:00 +0000
Message-ID: <86lnmdjuqy.fsf@gauss.muc.de>

········@david-steuber.com (David Steuber "The Interloper") writes:

> On 18 Oct 1998 02:13:33 +0200, ··@gauss.muc.de (Matthias Hölzl)
> claimed or asked:
> 
> % I think it is a good exercise for a Lisp beginner (but not too useful
> % as a metric for code complexity/size).
>
> Why would the number of s-expressions in a program not be a good
> measure of code complexity?

Obviously we have to agree on what code complexity (CC) means before
deciding whether some number is a good complexity measure.  I am not
really sure how to define CC in a useful fashion or whether some
"objective" measure for code complexity exists, but what I would
expect of any sensible CC measure is some indication of how long it
will take to understand a certain piece of code and how difficult it
will be for some programmer to maintain or extend the program.  This
notion of complexity is of course not only dependent on the number of
forms in the code but also on the style the code is written in, on the
quality of the abstractions used, on the algorithms employed, and so
on, so it probably cannot be expressed as a single number.

In Common Lisp we have a much greater potential to create abstractions
than in most other languages. If done correctly this can greatly
decrease both the amount of code and the complexity, e.g., in an
implementation of a PostScript interpreter you could define a macro
DEFINE-PS-OPERATOR to define PostScript operators.  This macro would
allow you to write

(define-ps-operator + ((x ps-number) (y ps-number))
  (+ x y))

This would expand into a function that pops x and y off the stack,
takes care of error handling and type checking, performs the addition
and pushes the result back on the stack.  Additionally this macro
could emit the forms to add the operator to the system dictionary of
the PostScript interpreter and to do any additional initialization.
Even though this macro would have a rather large and complicated
expansion it minimizes code size and complexity (at least the
complexity the programmer has to face in order to understand the
code).
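
A minimal sketch of what such a macro might look like follows.  Everything
except the name DEFINE-PS-OPERATOR and its call syntax is an assumption
invented for illustration: the variables *PS-STACK* and *PS-SYSTEM-DICT*,
the helper POP-OPERAND, the PS-NUMBER type, and the use of CHECK-TYPE for
type checking.  A real interpreter would need PostScript's own error
handling rather than plain Lisp errors.

```lisp
;;; Hypothetical interpreter state, invented for this sketch.
(defvar *ps-stack* '()
  "Operand stack; the first element is the top of the stack.")
(defvar *ps-system-dict* (make-hash-table :test #'equal)
  "Maps operator names (strings) to functions of no arguments.")

(deftype ps-number () 'number)

(defun pop-operand ()
  "Pop and return the top operand, signalling an error on an empty stack."
  (when (null *ps-stack*)
    (error "PostScript stack underflow"))
  (pop *ps-stack*))

(defmacro define-ps-operator (name (&rest params) &body body)
  "Define the PostScript operator NAME.  Each element of PARAMS is (VAR TYPE)."
  (let ((vars (mapcar #'first params)))
    `(progn
       (setf (gethash ,(string-downcase name) *ps-system-dict*)
             (lambda ()
               ;; The last parameter is on top of the stack: pop in reverse.
               (let* ,(loop for var in (reverse vars)
                            collect `(,var (pop-operand)))
                 ;; Type checking.
                 ,@(loop for (var type) in params
                         collect `(check-type ,var ,type))
                 ;; Run the body and push its result back on the stack.
                 (push (progn ,@body) *ps-stack*))))
       ',name)))

;;; Usage: define the operator, then apply it to a stack holding 3 and 4.
(define-ps-operator + ((x ps-number) (y ps-number))
  (+ x y))

(setf *ps-stack* (list 4 3))                 ; 4 is on top
(funcall (gethash "+" *ps-system-dict*))     ; *ps-stack* is now (7)
```

The two-line operator definition is all a reader of the interpreter has
to digest; the stack handling and registration live in one place.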

On the other hand it is possible to use macros to compress the code in
ways which almost force the programmer to perform macroexpansion
before being able to figure out what the code really does (for an
example of a rather mild form of this phenomenon you may want to have
a look at the file defcombin.lisp in the PCL distribution).  In this
case the code size decreases, but the complexity of the code actually
increases.  This makes the issue of size more complicated than in
other languages.  For C programs I presume that you could run the code
through a pretty-printer to account for formatting differences and
then count the lines of code and have a fairly accurate measure of the
amount of program text the programmer has to digest.  In Lisp, on the
other hand, I would say that in the case of DEFINE-PS-OPERATOR the
relevant measure for code size is the actual number of lines in the
source code, whereas the size of the macro expansion is more useful as
a size measure if the programmer has to look at this expansion to
understand the program.  So IMO you need at least the number of forms
of the original and the macroexpanded version of the code and you have
to perform a best/worst case analysis.
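
As a rough sketch of the two measurements, the code below sizes a form
before and after one step of macroexpansion.  FORM-SIZE,
SIZE-BEFORE-AND-AFTER and the toy macro SQUARE are names invented for
this example.  MACROEXPAND-1 only expands the outermost form; expanding
nested macro calls as well would need a code walker, which Common Lisp
does not provide portably.

```lisp
(defun form-size (form)
  "Count every atom and every list in FORM (dotted tails are ignored)."
  (if (consp form)
      (1+ (loop for tail = form then (cdr tail)
                while (consp tail)
                sum (form-size (car tail))))
      1))

(defun size-before-and-after (form)
  "Return the size of FORM as written and after one MACROEXPAND-1 step."
  (values (form-size form)
          (form-size (macroexpand-1 form))))

;; A toy macro whose expansion is larger than its call.
(defmacro square (x)
  `(* ,x ,x))
```

With these definitions (size-before-and-after '(square (+ a b))) returns
6 and 10: the call is smaller than what it expands into.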

The same is true of many features of CL, you can use them to write
wonderfully clear and concise code, you can also abuse them to write
code that makes the winners of the obfuscated C contest envious.  It
seems difficult to build an analysis tool that can differentiate these
two cases and the shades of gray in between.

Regards,

  Matthias

From: David Steuber "The Interloper"
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Tue, 20 Oct 1998 00:00:00 +0000
Message-ID: <362ee0f6.7159865@news.newsguy.com>

On 19 Oct 1998 01:49:41 +0200, ··@gauss.muc.de (Matthias Hölzl)
claimed or asked:

% The same is true of many features of CL, you can use them to write
% wonderfully clear and concise code, you can also abuse them to write
% code that makes the winners of the obfuscated C contest envious.  It
% seems difficult to build an analysis tool that can differentiate these
% two cases and the shades of gray in between.

I didn't think of all that.

Unless there really is a sensible way to define complexity, there
really is no sensible way to measure it.

I noticed in the FAQ (I finally found it today.  Perhaps the location
of the FAQ should be in the FAQ.  Oh, wait, it is.) that there are
style guidelines and a listing of books with more comprehensive style
guidelines to aid in writing maintainable code.

With my memory, it doesn't take me more than a few days (minutes) to
forget what I was thinking when I wrote something.  Thank goodness
Lisp has a documentation facility (besides comments) that I will
probably neglect to use.

From: Steve Gonedes
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Mon, 19 Oct 1998 00:00:00 +0000
Message-ID: <m21zo5ju1g.fsf@KludgeUnix.com>

········@david-steuber.com (David Steuber "The Interloper") writes:


< % I think it is a good exercise for a Lisp beginner (but not too useful
< % as a metric for code complexity/size).
< 
< Why would the number of s-expressions in a program not be a good
< measure of code complexity?
< 
< Line counting is very common, but simple formatting differences can
< throw off such a metric.  One could count atoms, but I am thinking that
< maybe that is too fine?
< 
< Another metric I am familiar with is token counting.

I usually look at the description for the code and use that to
determine how complex it will probably be. Anyway, I'm super bored so
I decided to write a lisp-code-counter-thing. It's more difficult than
I thought. I got burnt once because it was counting semicolons inside
of complex format strings. I'm sure there's a million other glitches -
I only tried a couple of times and it seemed to work ok though.
Another thing I forgot about was almost every sexp ends in nil, but
not all of them. Anyway, the function is destructive and will total
any live code you feed it so keep this in mind before you try counting
built in functions and such...


;;;; -*-Mode: common-lisp; Syntax: common-lisp; Package: user; -*-
;;;; October 18, 1998 Sunday  5:23 PM
;;;; Count the sexps in a file

(in-package :user)

(defun ncount-subforms (sexp)
  (let ((atoms 0) (sexps 0) (local-nil (gensym "NIL")))
    (labels ((count-subforms-aux (sexp)
        (cond ((null sexp))
              ((atom sexp) (incf atoms) nil)
              (t (incf sexps)
                 (let ((ptr (last sexp)))
                   (cond ((eq (cdr ptr) local-nil)
                          (rplacd ptr nil)
                          (mapc #'count-subforms-aux sexp))
                         (t (decf atoms) (decf sexps)
                            (count-subforms-aux (car sexp))
                            (count-subforms-aux (cdr sexp)))))))))
      (count-subforms-aux (nsubst local-nil nil sexp))
      (values sexps atoms))))

(defun count-sexps (file)
  (let ((*readtable* (copy-readtable nil))
        (eof (gensym "EOF"))
        (semi-colon-count 0)
        (comment-count 0) (block-comment-count 0)
        (toplevel-form-count 0) (sexp-count 0) (atom-count 0))
    (set-dispatch-macro-character #\# #\. #'read)
    (let ((block-comment-reader
           (get-dispatch-macro-character
            #\# #\| *readtable*)))
      (set-dispatch-macro-character #\# #\|
        #'(lambda (&rest args) (incf block-comment-count)
                  (apply block-comment-reader args))))
    (multiple-value-bind (comment-reader terminating-p)
        (get-macro-character #\; *readtable*)
      (set-macro-character #\;
        #'(lambda (stream &rest args)
            (incf comment-count) (incf semi-colon-count)
            (loop while (char= (peek-char nil stream) #\;)
                doing (progn (read-char stream) (incf semi-colon-count)))
            (apply comment-reader stream args))
        terminating-p *readtable*)
      (multiple-value-bind (string-quote-reader terminating2-p)
          (get-macro-character #\" *readtable*)
        (set-macro-character #\"
          #'(lambda (&rest args)
              (let ((fn (get-macro-character #\; *readtable*)))
                (unwind-protect
                    (progn (set-macro-character #\; 
                        comment-reader terminating-p *readtable*)
                      (apply string-quote-reader args))
                  (set-macro-character #\; fn terminating-p *readtable*))))
          terminating2-p *readtable*)))

    (with-open-file (input file :direction :input)
      (loop for sexp = (read input nil eof)
          until (eq sexp eof) do
            (multiple-value-bind (sexps atoms)
                (ncount-subforms sexp)
              (incf toplevel-form-count)
              (incf sexp-count sexps) (incf atom-count atoms))))
    (values sexp-count
            atom-count
            toplevel-form-count
            comment-count
            semi-colon-count
            block-comment-count)))

(defun summerize-file-counts (file)
  (multiple-value-bind (sexps atoms topforms
                        comments semi-colons block-comments)
      (count-sexps file)
    (let ((total-comments (+ comments block-comments)))
      (format *standard-output*
              "~&;; Summery for file ~s~%~
                 ;;   Total toplevel forms:      ~d~%~
                 ;;   Total sexps:               ~d~%~
                 ;;   Total atoms:               ~d~%~
                 ;;   Total comments:            ~d~%~
         ~:*~[~:;~
               ~[~*~:;~:*~
                 ;;     Semi-colon Comments:     ~d~%~
                 ;;       Semi-colons:           ~d~%~]~
                 ;;     Block Comments:          ~d~%~]"
              (enough-namestring file) topforms sexps atoms
              total-comments comments semi-colons block-comments)))
  (values))



USER(-1): ; just kidding
  (summerize-file-counts "sexp-count.lisp")


;; Summery for file "sexp-count.lisp"
;;   Total toplevel forms:      4
;;   Total sexps:               124
;;   Total atoms:               245
;;   Total comments:            3
;;     Semi-colon Comments:     3
;;       Semi-colons:           12
;;     Block Comments:          0


I think I need a new hobby, anyway hope you get a kick out of it.

From: David Steuber "The Interloper"
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Tue, 20 Oct 1998 00:00:00 +0000
Message-ID: <362fe3ab.7852531@news.newsguy.com>

On 19 Oct 1998 00:13:02 GMT, Steve Gonedes <········@worldnet.att.net>
claimed or asked:

% I usually look at the description for the code and use that to
% determine how complex it will probably be. Anyway, I'm super bored so
% I decided to write a lisp-code-counter-thing. It's more difficult than
% I thought. I got burnt once because it was counting semicolons inside
% of complex format strings. I'm sure there's a million other glitches -
% I only tried a couple of times and it seemed to work ok though.
% Another thing I forgot about was almost every sexp ends in nil, but
% not all of them. Anyway, the function is destructive and will total
% any live code you feed it so keep this in mind before you try counting
% built in functions and such...

Wow!  I didn't realize the problem was that complicated.  I hope you
didn't eat the file while you were counting it :-)

From: John Wiseman
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Mon, 19 Oct 1998 00:00:00 +0000
Message-ID: <arxww5w41mu.fsf@gargoyle.cs.uchicago.edu>

I think it's important to note that outside of a coding exercise READ
is probably the wrong way to do input for this task.  It will choke on
the first symbol with an explicit package prefix that refers to a
non-existent package.

Real code is likely to use package prefixes.  Therefore, solutions
using READ will break on real code unless the code has already been
loaded.  Pre-loading the code is probably not a desirable requirement
for this sort of tool.
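
The failure is easy to reproduce.  In this small sketch TRY-READ is a
made-up helper name, and the package NO-SUCH-PACKAGE-FOR-THIS-SKETCH is
assumed not to exist in the running image.  The handler catches the
broad class ERROR because the exact condition type signalled for an
unknown package prefix varies between implementations.

```lisp
(defun try-read (text)
  "Return the first object read from TEXT, or :READ-FAILED if READ signals."
  (handler-case (values (read-from-string text))
    (error () :read-failed)))

;; Reads fine: no package prefix is involved.
(try-read "(car x)")

;; Fails unless the named package has been created beforehand.
(try-read "(no-such-package-for-this-sketch:frob 1)")
```

Wrapping READ this way lets a counting tool skip or report such forms
instead of stopping, though the form that failed is still not counted.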


This particular behavior of READ is one of the few that cannot be
changed, as far as I know.  Is that a sign that READ is being misused
or that Common Lisp needs a fix in this area?


John Wiseman

From: Michael Tuchman
Subject: counting forms without respect to package markers (was: Symbolic Expression Counting)
Date: Tue, 20 Oct 1998 00:00:00 +0000
Message-ID: <wkd87ns2th.fsf_-_@orsi.com>

·······@cs.uchicago.edu (John Wiseman) writes:

> I think it's important to note that outside of a coding exercise READ
> is probably the wrong way to do input for this task.  It will choke on
> the first symbol with an explicit package prefix that refers to a
> non-existent package.

Yes and no.  Both the Harlequin and Franz implementations of Lisp simply
ask me to create any packages they don't know about while reading.  So
after doing this on several files, I may have a few "extra" packages defined
in my system, but so what?

In fact, I have a header file for my projects with all package defs
already loaded, so I don't have to preload the entire code to get
satisfactory results.

While I agree with your opinion, in practice this difficulty hasn't
been problematic.

> 
> Real code is likely to use package prefixes.  Therefore, solutions
> using READ will break on real code unless the code has already been
> loaded.  Pre-loading the code is probably not a desirable requirement
> for this sort of tool.
> 

Some aspects of READ are hard wired, but I don't understand why you
say that the behavior of READ can't be changed.  You can also
temporarily rebind the syntax type of : to constituent to read package
markers without worrying about whether the package is defined in the
current environment.

--------------------From Hyperspec-------------------- Notes:

The constituent traits of a character are ``hard wired'' into the
parser for extended tokens. For example, if the definition of S is
copied to *, then * will become a constituent that is alphabetic[2]
but that cannot be used as a short float exponent marker. For further
information, see Section 2.1.4.2 (Constituent Traits).

--------------------End excerpt--------------------
--------------------Begin Transcript---------------
;;;; this session is from Allegro 3.0.2

(setf *j* (copy-readtable))
#<READTABLE #xDE8114>
> (set-syntax-from-char #\: #\a *j*)
CONSTITUENT
> (read)foo:::bar
;; Error: Package FOO does not exist.
[condition type: PACKAGE-ERROR]
;; Returning to Top Level
> (let ((*readtable* *j*))
     (read))foo:::bar
FOO\:\:\:BAR
> 
--------------------End Transcript--------------------
Just goes to show you how many concerns are behind a "simple" problem
and how important it is to seek a variety of opinions before coding.

From: John Wiseman
Subject: Re: counting forms without respect to package markers (was: Symbolic Expression Counting)
Date: Wed, 21 Oct 1998 00:00:00 +0000
Message-ID: <arxk91tk791.fsf@gargoyle.cs.uchicago.edu>

Michael Tuchman <·····@orsi.com> writes:
>
> > I think it's important to note that outside of a coding exercise READ
> > is probably the wrong way to do input for this task.  It will choke on
> > the first symbol with an explicit package prefix that refers to a
> > non-existent package.
> 
> Yes and no.  Both the Harlequin and Franz implementations of Lisp simply
> ask me to create any packages they don't know about while reading.  So
> after doing this on several files, I may have a few "extra" packages defined
> in my system, but so what?
> 
> In fact, I have a header file for my projects with all package defs
> already loaded, so I don't have to preload the entire code to get
> satisfactory results.
> 
> While I agree with your opinion, in practice this difficulty hasn't
> been problematic.

It's a minor, but annoying, wart.  It makes it hard to automate the
measurement process.  Mostly I just wanted to point it out.


> Some aspects of READ are hard wired, but I don't understand why you
> say that the behavior of READ can't be changed.  You can also
> temporarily rebind the syntax type of : to constituent to read package
> markers without worrying about whether the package is defined in the
> current environment.

I was thinking vaguely of the restriction that constituent traits of
characters are hardwired, and I thought that this would preclude a
solution.  Apparently it just means that there isn't an easy and good
solution, though you and Matthias Holzl surprised me with what you
were able to accomplish.


> --------------------Begin Transcript---------------
> ;;;; this session is from Allegro 3.0.2
> 
> (setf *j* (copy-readtable))
> #<READTABLE #xDE8114>
> > (set-syntax-from-char #\: #\a *j*)
> CONSTITUENT
> > (read)foo:::bar
> ;; Error: Package FOO does not exist.
> [condition type: PACKAGE-ERROR]
> ;; Returning to Top Level
> > (let ((*readtable* *j*))
>      (read))foo:::bar
> FOO\:\:\:BAR
> > 
> --------------------End Transcript--------------------

This doesn't work in ACL 4.3 (unix) or Lispworks 4.1 (windows,
personal edition).  READ is subtle, indeed.


John

From: Michael Tuchman
Subject: Re: counting forms without respect to package markers (was: Symbolic Expression Counting)
Date: Thu, 22 Oct 1998 00:00:00 +0000
Message-ID: <wk67dctzcw.fsf@orsi.com>

·······@cs.uchicago.edu (John Wiseman) writes:

> 
> This doesn't work in ACL 4.3 (unix) or Lispworks 4.1 (windows,
> personal edition).  READ is subtle, indeed.
> 
> 
> John

I'm stumped.  I wonder where I erred in the world of
nonportability.

From: Matthias Hölzl
Subject: Re: counting forms without respect to package markers (was: Symbolic Expression Counting)
Date: Thu, 22 Oct 1998 00:00:00 +0000
Message-ID: <86soggz8rp.fsf@gauss.muc.de>

Michael Tuchman <·····@orsi.com> writes:

> ·······@cs.uchicago.edu (John Wiseman) writes:

[Essentially:

(setf *j* (copy-readtable))
(set-syntax-from-char #\: #\a *j*)
(let ((*readtable* *j*))
  (read))foo:::bar
 FOO\:\:\:BAR]

> > 
> > This doesn't work in ACL 4.3 (unix) or Lispworks 4.1 (windows,
> > personal edition).  READ is subtle, indeed.
> > 
> > 
> > John
>  I'm stumped.  I wonder where I erred in the world of
> nonportability.

The problem with this solution is the following: The HyperSpec
specifies that SET-SYNTAX-FROM-CHAR copies the syntax type (and the
macro function, etc.) of a character but not its constituent traits.
Since the syntax type of #\: is already `constituent' according to
figure 2.7, the call to SET-SYNTAX-FROM-CHAR doesn't really change
anything and ACL 4.3 and Lispworks 4.1 give the correct result
according to the HyperSpec, whereas Allegro 3.0.2 does not.

From a short glance at CLtL2 the concept of constituent traits seems
to be a rather recent introduction to CL.  My opinion is that it
would be convenient if the user was allowed to change the constituent
trait of a character, but maybe there are some arguments against it
that I am overlooking.

Regards,

  Matthias

From: Matthias Hölzl
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Tue, 20 Oct 1998 00:00:00 +0000
Message-ID: <86u30zyqm2.fsf@gauss.muc.de>

·······@cs.uchicago.edu (John Wiseman) writes:

> I think it's important to note that outside of a coding exercise READ
> is probably the wrong way to do input for this task.  It will choke on
> the first symbol with an explicit package prefix that refers to a
> non-existent package.
>
> Real code is likely to use package prefixes.  Therefore, solutions
> using READ will break on real code unless the code has already been
> loaded.  Pre-loading the code is probably not a desirable requirement
> for this sort of tool.
> 
> This particular behavior of READ is one of the few that cannot be
> changed, as far as I know.  Is that a sign that READ is being misused
> or that Common Lisp needs a fix in this area?

Personally I don't think that loading the code before determining some
metrics is unreasonable since I would surely want to perform some
measurements after macroexpansion.  But if you want to build a purely
static analysis tool then you may have to write a specialized reader.
However if you are only interested in the number of symbols and not in
their names, then there is still a possibility to use the normal
Common Lisp reader:

CL offers no portable way to change the constituent trait of a
character, which would be the easiest solution, however it is possible
to kludge up a solution that "ignores" the package prefix (or rather
returns a symbol whose print name is a strange concatenation of the
package prefix and a variant of the original print name) by changing
the syntax type of #\: to single escape.  This works because the
constituent trait is only taken into account by the reader for
characters of syntax type `constituent' and non-terminating macro
characters.  One complication when doing this is that we have to reset
the syntax type of #\: to `constituent' before reading a string, since
otherwise we get errors for strings ending in a colon.  E.g.

(defun count-forms (file)
  "Count the number of top level forms in a file."
  (with-open-file (instream file :direction :input)
    (let ((*readtable* (copy-readtable))
          (eof-value (gensym)))
      ;; Ignore the sharp-dot reader macro.
      (set-dispatch-macro-character #\# #\.
        (lambda (stream subchar arg)
          (declare (ignore subchar arg))
          (read stream nil eof-value t)))
      ;; Ignore the package prefix.
      (set-syntax-from-char #\: #\\)
      ;; Make #\: constituent before reading a string.
      (let ((double-quote-reader (get-macro-character #\")))
        (set-macro-character #\"
          (lambda (stream char)
            (set-syntax-from-char #\: #\a)
            (let ((res (funcall double-quote-reader stream char)))
              (set-syntax-from-char #\: #\\)
              res)))
        (loop
          for form = (read instream nil eof-value)
          until (eq form eof-value)
          ;; COUNT T rather than COUNT FORM, so a top-level NIL is counted too.
          count t)))))

[A solution that seems more elegant at first sight would be to turn
#\: into a terminating macro char; unfortunately this either fails for
forms like (1 . foo:bar) or for forms like (1 . :foo), depending on
the reader macro function.]  Even though the "single escape char"
solution is not very elegant it seems still preferable to writing a
completely new reader if the metrics you want to compute don't depend
on the actual names or packages of symbols.  

Of course if you need this kind of reader more than once it would be
good to encapsulate it with a macro:

(defmacro with-phoney-reader ((&optional eof-value) &body body)
  (let ((eof-val (gensym)))
    `(let ((,eof-val ,eof-value)
           (*readtable* (copy-readtable)))
       ;; Ignore the sharp-dot reader macro.
       (set-dispatch-macro-character #\# #\.
         (lambda (stream subchar arg)
           (declare (ignore subchar arg))
           (read stream nil ,eof-val t)))
       ;; Ignore the package prefix.
       (set-syntax-from-char #\: #\\)
       (let ((double-quote-reader (get-macro-character #\")))
         (set-macro-character #\"
           (lambda (stream char)
             (set-syntax-from-char #\: #\a)
             (let ((res (funcall double-quote-reader stream char)))
               (set-syntax-from-char #\: #\\)
               res)))
         ,@body))))

(defun count-forms (file)
  (with-open-file (instream file :direction :input)
    (let ((eof-value (gensym)))
      (with-phoney-reader (eof-value)
        (loop
          for form = (read instream nil eof-value)
          until (eq form eof-value)
          ;; COUNT T rather than COUNT FORM, so a top-level NIL is counted too.
          count t)))))

Regards,

  Matthias

From: Steve Gonedes
Subject: Re: Symbolic Expression Counting CONTAINS SPOILER.
Date: Tue, 20 Oct 1998 00:00:00 +0000
Message-ID: <m2yaqaolsg.fsf@KludgeUnix.com>

··@gauss.muc.de (Matthias Hölzl) writes:

 
< CL offers no portable way to change the constituent trait of a
< character, which would be the easiest solution, however it is possible
< to kludge up a solution that "ignores" the package prefix (or rather
< returns a symbol whose print name is a strange concatenation of the
< package prefix and a variant of the original print name) by changing
< the syntax type of #\: to single escape.  This works because the
< constituent trait is only taken into account by the reader for
< characters of syntax type `constituent' and non-terminating macro
< characters.  One complication when doing this is that we have to reset
< the syntax type of #\: to `constituent' before reading a string, since
< otherwise we get errors for strings ending in a colon.  E.g.

Really, using read to parse text files was intended to be somewhat
humorous and wasn't meant to be a serious attempt at anything but
killing some time. I say this because it's starting to sound more like
an exercise from the TeX book with each post - which is usually a sign
that something has gone very bad.