Hi,
If I understand the hyperspec correctly, then there is no way to extend
coerce to accept any other result-type than the ones explicitly listed.
Nevertheless, I would like to transform from and to arbitrary types. It
seems almost trivial to have a unified interface in the form of
(convert value from-type to-type)
and
(defconverter (from-type to-type var &rest args)
...)
Still, I didn't find a library for this. Is it because it's so
simple that everybody writes his own type converter, or is it just a bad
idea to do such a thing?
Albert
Originally Lisp was designed to have just one type: the list, to
express anything (text, arrays, structures, etc.). That would have
obviated converters.
A quite elegant idea, but practicality made apparent the convenience of
more types, which were then welcomed as progress.
Conceptually -however- it remains debatable which approach is better.
In article
<····································@i7g2000prf.googlegroups.com>,
·······@eurogaran.com wrote:
> Originally Lisp was designed to have just one type: the list, to
> express anything (text, arrays, structures, etc.). That would have
> obviated converters.
But those ancient dialects of Lisp didn't have the COERCE function, so
they're hardly relevant to a modern discussion.
--
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE don't copy me on replies, I'll read them in the group ***
·······@eurogaran.com writes:
> Originally Lisp was designed to have just one type: the list, to
> express anything (text, arrays, structures, etc.). That would have
> obviated converters.
> A quite elegant idea, but practicality made apparent the convenience of
> more types, which were then welcomed as progress.
> Conceptually -however- it remains debatable which approach is better.
You're wrong.
The notion of type has nothing to do with what the language provides.
(shadow '(vector string make-string character coerce))
(defstruct (vector :named (:type list)) elements)
(defstruct (character :named (:type list)) code)
(defstruct (string :named (:type list) (:include vector)))
(make-vector :elements '(65 66 67 68 69))
--> (VECTOR (65 66 67 68 69)) ; note, only CONSes.
(make-character :code 65)
--> (CHARACTER 65) ; still only CONSes.
(make-string :elements (vector-elements
                        (make-vector :elements (mapcar (lambda (code) (make-character :code code))
                                                       '(65 66 67 68 69)))))
--> (VECTOR ((CHARACTER 65) (CHARACTER 66) (CHARACTER 67) (CHARACTER 68) (CHARACTER 69)) STRING)
Well, I would like to call that COERCE:
(defun coerce (object type)
  (cond
    ((and (vector-p object) (eq 'string type)
          (every (function character-p) (vector-elements object)))
     (make-string :elements (vector-elements object)))
    (t (error "No coercion from ~S to ~A" object type))))
(coerce (make-vector :elements (mapcar (lambda (code) (make-character :code code)) '(65 66 67 68))) 'string)
--> (VECTOR ((CHARACTER 65) (CHARACTER 66) (CHARACTER 67) (CHARACTER 68)) STRING)
(mapcar
 (lambda (s)
   (mapcar
    (lambda (f) (funcall f s)) '(character-p vector-p string-p)))
 (list (make-vector :elements (mapcar (lambda (code) (make-character :code code)) '(65 66 67 68)))
       (coerce (make-vector :elements (mapcar (lambda (code) (make-character :code code)) '(65 66 67 68))) 'string)))
--> ((NIL T NIL) ; a vector is not a string
(NIL T T)) ; a string is a vector
Now of course when you define STRING=, it cannot work on VECTORs, that
is, on lists of the form: (VECTOR (65 66 67 68 69)).
STRING= can only work on lists of the form:
(VECTOR ((CHARACTER 65) (CHARACTER 66) (CHARACTER 67) (CHARACTER 68)) STRING)
This is what types are all about. The bits (or conses) don't matter.
--
__Pascal Bourguignon__ http://www.informatimago.com/
READ THIS BEFORE OPENING PACKAGE: According to certain suggested
versions of the Grand Unified Theory, the primary particles
constituting this product may decay to nothingness within the next
four hundred million years.
On Apr 4, 8:20 am, Albert Krewinkel <·······@inb.uni-luebeck.de>
wrote:
> Hi,
>
> If I understand the hyperspec correctly, then there is no way to extend
> coerce to accept any other result-type than the ones explicitly listed.
> Nevertheless, I would like to transform from and to arbitrary types. It
> seems almost trivial to have a unified interface in the form of
>
> (convert value from-type to-type)
>
> and
>
> (defconverter (from-type to-type var &rest args)
> ...)
>
> Still, I didn't find a library for this. Is it because it's so
> simple that everybody writes his own type converter, or is it just a bad
> idea to do such a thing?
I think it's a good idea. In fact, my FSet library has a generic
function `convert' that uses eql-specializers to dispatch on the
target type:
(defmethod convert ((to-type (eql 'set)) (b bag) &key)
  ...)
I put the destination type first in the argument order because I think
it makes the calls more readable: (convert 'foo <long expression>) is
easier to read than (convert <long expression> 'foo). Also, a few of
these methods take additional keyword parameters to specify some
aspect of the conversion to be performed.
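As a purely illustrative sketch of that shape of interface (the types and the keyword below are stand-ins, not FSet's actual methods):

;; Illustrative sketch only -- not FSet's real CONVERT, just the same shape.
(defgeneric convert (to-type collection &key)
  (:documentation "Convert COLLECTION to the type named by TO-TYPE."))

;; Dispatch on the target type with an EQL specializer, target type first.
(defmethod convert ((to-type (eql 'list)) (collection vector) &key)
  (coerce collection 'list))

;; A method may accept extra keywords describing the conversion.
(defmethod convert ((to-type (eql 'vector)) (collection list) &key (element-type t))
  (make-array (length collection)
              :element-type element-type
              :initial-contents collection))

;; (convert 'vector '(1 2 3) :element-type 'fixnum) => #(1 2 3)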
Perhaps the generic function should be defined in its own library so
the definition can be more easily shared.
-- Scott
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: coerce for arbitrary types
Message-ID: <rem-2008apr04-004@yahoo.com>
> From: Scott Burson <········@gmail.com>
> If I understand the hyperspec correctly, then there is no way to extend
> coerce to accept any other result-type than the ones explicitly listed.
IMO that's a good idea, because coerce is already a crock from the start.
It is not possible to change something from the type it was to the
type you want it to be (although in the case of CLOS objects, you can
change the class to which an object belongs, but still you can't
change it to no longer be a CLOS object, and you can't change
anything else to become a CLOS object). So coerce must do something
sorta random, such as perform a shallow copy from one type of
container to a newly-created copy of another type, or perform a
"similar-meaning" conversion such as from an integer to a float.
> Nevertheless, I would like to transform from and to arbitrary types.
That would only make the confusion worse. Better to stop using
coerce entirely. If you want a shallow copy from one type of
sequence to another, call MAP with the IDENTITY function.
(map 'list #'identity "foo")
If you want a numeric conversion, use FLOAT or ROUND or FLOOR etc.
> Perhaps the generic function should be defined in its own library so
> the definition can be more easily shared.
Why do you even need a generic function, instead of calling the
specific clearly-defined function for the shallow copy or numeric
near-value or whatever?
If you want to convert the first character of a string to a character:
(elt "X" 0)
If you want to convert a character to a single-character string:
(format nil "~A" #\X)
If you want to convert a string to its integer value:
(parse-integer "42")
If you want to convert an integer to its string representation:
(format nil "~D" 42)
etc. etc.
Say exactly what you want done instead of trying to design and use
some catch-all convert-type function where the reader of your code
has to guess what your function-call really does.
·······@yahoo.com (Robert Maas, see http://tinyurl.com/uh3t) writes:
> > From: Scott Burson <········@gmail.com>
> > If I understand the hyperspec correctly, then there is no way to extend
> > coerce to accept any other result-type than the ones explicitly listed.
Why do you need to do this? This is often a sign of a program being
structured wrong--trying to get too much information out of too few bits.
http://www.nhplace.com/kent/PS/EQUAL.html#two-bit_rule
In fact, the entire article
http://www.nhplace.com/kent/PS/EQUAL.html
in which that rule resides builds up to the relevant point, which is
that in dynamic languages, additional care is required in order to
infer "intent" because the fact of a representation's use is not the
same as the intent of a representation's use, and you can only
dynamically query about the fact of its use, not the intent. This is
not a bug in dynamic languages because it buys additional, much needed
flexibility in other situations. But in the case here, you have to be
careful to either pass more information or not to infer so much information
from what is passed. Sometimes more information means "another argument"
and sometimes it means "a different representation of the data than the
one you're trying to coerce". So, for example:
> IMO [not allowing coerce to be extended is] a good idea, because coerce
> is already a crock from the start. It is not possible to change something
> from the type it was to the type you want it to be
This isn't the reason it's a "crock". That's just the reason it has a
bad name. The definition has issues that make it not easily
extendable independent of its name, and those issues derive from the
nature of the language. In brief, COERCE is too short a name and
falsely gives the sense that it is a canonically defined operation
with only one meaning rather than an operation that has several
(perhaps myriad) friends that are all vying for the cool short-name.
If these were domain names, not function names, the names
coerce.org, coerce.net, coerce.cc, coerce.tv, etc. would all be
bought up and people would be assigning them subtly different
meanings. But when there's no trailing gTLD to remind us, it's easy
to forget that there was a competition in play.
The real problem is that something you can customize needs to be
defined in such a way that it's clear what customizing it would mean.
For historical reasons, COERCE is already messed up because of the NIL
vs () issue [note well: NOT the issue commonly fussed over, which is
the false vs. empty list issue]. The simple case of:
(coerce nil 'string) => ""
(string nil) => "NIL"
illustrates the problem.
The reason this happens by the way is that STRING plainly does not work
on arbitrary sequences, so there's no confusion that STRING is not asking
to operate on sequences. e.g.,
(string #(#\N #\I #\L))
Error: Cannot coerce #(#\N #\I #\L) to type STRING.
So it was deemed necessary that STRING should lean toward seeing NIL as
its name, which was a string. But COERCE does take sequences, and empty
sequences were common and useful. The progression
(coerce '(a b) 'vector)
=> #(A B)
(coerce '(a) 'vector)
=> #(A)
(coerce '() 'vector)
=> #(#\N #\I #\L)
would have been devastating to getting any work done. So COERCE was defined
to treat NIL as () rather than as a symbol.
In effect, you are meant to make the choice between STRING and
(COERCE x 'STRING) somewhat on the basis of understanding how you want this
particular tie to be broken.
But if we were to make it open season on extending this, in the presence of
a dynamic language with multiple inheritance, this kind of problem would
happen all the time, and the result would be ragged... absent the intent
information I alluded to above.
Thanks for all the answers. They are all very helpful!
Kent M Pitman <······@nhplace.com> writes:
> ·······@yahoo.com (Robert Maas, see http://tinyurl.com/uh3t) writes:
>
>> > From: Scott Burson <········@gmail.com>
>> > If I understand the hyperspec correctly, then there is no way to extend
>> > coerce to accept any other result-type than the ones explicitly listed.
>
> Why do you need to do this? This is often a sign of a program being
> structured wrong--trying to get too much information out of too few
> bits.
I encountered a couple of situations where I find myself writing a lot of
code to convert from one thing to another. Especially in my semi-toy
computational biology project (http://sf.net/projects/clcb for those
interested), a vast number of functions turned out to be in the form of
from-this->to-that: triplet to amino-acid, rna to protein, amino-acid
coordinates to nucleotide coordinates, relative to absolute...
So type conversion is a very common pattern in my code, and Lisp has
taught me one thing: It's a Good Thing (tm) to abstract away common
patterns.
> http://www.nhplace.com/kent/PS/EQUAL.html#two-bit_rule
>
> In fact, the entire article
> http://www.nhplace.com/kent/PS/EQUAL.html
> in which that rule resides builds up to the relevant point, which is
> that in dynamic languages, additional care is required in order to
> infer "intent" because the fact of a representation's use is not the
> same as the intent of a representation's use, and you can only
> dynamically query about the fact of its use, not the intent.
Very enlightening article! Interestingly enough, you are proposing the
same function signature for 'CONVERT' that I intended to define (see
original post). The problems you describe under 'Input and Output'
occurred to me too: Reading objects from a text-file table would be
easiest with what you named `ACCEPT-FROM-STRING'. Clearly, this is a
special case of converting between types.
> This is
> not a bug in dynamic languages because it buys additional, much needed
> flexibility in other situations. But in the case here, you have to be
> careful to either pass more information or not to infer so much information
> from what is passed. Sometimes more information means "another argument"
> and sometimes it means "a different representation of the data than the
> one you're trying to coerce". So, for example:
>
>> IMO [not allowing coerce to be extended is] a good idea, because coerce
>> is already a crock from the start. It is not possible to change something
>> from the type it was to the type you want to be
>
[...]
>
> But if we were to make it open season on extending this, in the presence of
> a dynamic language with multiple inheritance, this kind of problem would
> happen all the time, and the result would be ragged... absent the intent
> information I alluded to above.
Integrating over all that was said, I take it that writing a
comprehensive conversion library which relies on intentional-types isn't
all that bad. I'll be knee deep in my newbie code for the next couple
of hours.
Albert Krewinkel <·······@inb.uni-luebeck.de> writes:
> > But if we were to make it open season on extending this, in the presence of
> > a dynamic language with multiple inheritance, this kind of problem would
> > happen all the time, and the result would be ragged... absent the intent
> > information I alluded to above.
>
> Integrating over all that was said, I take it that writing a
> comprehensive conversion library which relies on intentional-types isn't
> all that bad. I'll be knee deep in my newbie code for the next couple
> of hours.
Well, I'm not big on assessing Good and Bad. Lisp is not a religion.
I will note that a bunch of general type conversion stuff isn't
typically how people handle things. A lot of the time they just make
their own arbitrary function that doesn't require intentional
information because the intention is built into their personal
package.
In other words, as soon as you write
(defun my-coerce (x) ...)
you are pretty much free to make it mean whatever you think you can make
internally consistent for your own use. You don't have to pass extra info
because you can assume whatever you like. In fact, using packages, you can
even SHADOW the COERCE symbol and make your own COERCE symbol that does
what you like and is distinct from CL:COERCE.
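A minimal sketch of that shadowing approach (the package and method choices here are made up for illustration):

;; Illustrative only: a package whose COERCE is its own, distinct from CL:COERCE.
(defpackage #:my-conversions
  (:use #:common-lisp)
  (:shadow #:coerce))
(in-package #:my-conversions)

(defgeneric coerce (object result-type)
  (:documentation "Package-local COERCE; free to mean whatever we decide."))

;; Fall back on the standard COERCE where it already does what we want.
(defmethod coerce (object result-type)
  (cl:coerce object result-type))

;; Add our own conventions without disturbing CL:COERCE,
;; e.g. here a symbol (including NIL) converts to its name.
(defmethod coerce ((object symbol) (result-type (eql 'string)))
  (symbol-name object))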
What makes one need to pass the extra info is trying to write
something for general consumption, because not all people will make
the same assumptions...
Kent M Pitman <······@nhplace.com> writes:
> Albert Krewinkel <·······@inb.uni-luebeck.de> writes:
>
>> > But if we were to make it open season on extending this, in the presence of
>> > a dynamic language with multiple inheritance, this kind of problem would
>> > happen all the time, and the result would be ragged... absent the intent
>> > information I alluded to above.
>>
>> Integrating over all that was said, I take it that writing a
>> comprehensive conversion library which relies on intentional-types isn't
>> all that bad. I'll be knee deep in my newbie code for the next couple
>> of hours.
>
> Well, I'm not big on assessing Good and Bad. Lisp is not a religion.
It's part of emacs, so yes, it is ;-)
> I will note that a bunch of general type conversion stuff isn't
> typically how people handle things. A lot of the time they just make
> their own arbitrary function that doesn't require intentional
> information because the intention is built into their personal
> package.
>
[...]
>
> What makes one need to pass the extra info is trying to write
> something for general consumption, because not all people will make
> the same assumptions...
So is this a general thing about libraries!? If I understand you
correctly, then you argue that almost no library ever does exactly what
people want to do, but it should be general enough to make it easier for
the programmer to do what he intends. In the special case of a
converter library, this would mean some customization mechanisms to
specify assumptions (e.g. by defining another function which allows the
from type to be given implicitly).
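One such customization hook might be a wrapper that infers the from-type from the value itself -- a hypothetical helper, assuming a three-argument CONVERT generic function like the one sketched further down in this thread:

;; Hypothetical convenience wrapper: let the from-type default to the
;; value's own type.  Note that TYPE-OF may return an implementation-specific
;; or overly precise type, which is exactly the kind of assumption such a
;; shortcut bakes in.
(defun convert-to (value to-type &rest keys)
  (apply #'convert value (type-of value) to-type keys))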
That matches my experience that most Lispers consider their new
code to be nothing but 'exploratory programming', which will evolve into
something different unless it already does what it's supposed to.
From: Albert Krewinkel
Subject: Method specializers for types and subtypes (was: coerce for arbitrary types)
Message-ID: <87lk3shs17.fsf_-_@black-coffee.lan>
Uh oh, looks like I was a little bit too optimistic when I claimed that
a conversion library would be trivial. In my first pass coding attempt
(see below) I was stopped by some thoughts about types and specializers:
Suppose I have a method `CONVERT' which takes a value, the type from
which to convert, and the type to which it should be converted.
(defgeneric convert (value from-type to-type))
Now if there is a method that converts reals to floats, it would look
like this:
(defmethod convert ((value real) (from (eql 'real)) (to (eql ('float))))
  (coerce value to))
But now I have to remember that this conversion is for reals, so
calling
(convert 9 'integer 'float)
would not be caught by the above method. I could, of course, define
this method for all subtypes of `REAL', e.g. by using closer-mop and a
macro. But then, this doesn't help at all when I'd issue a call in the
form of
(convert some-number '(or fixnum float) 'float)
So I guess what I'd like to have is some way to specialize methods,
_optionally_ using `subtypep' or similar. Kind of doesn't feel right.
Where did I go wrong?
;;;; Some very simple type-converter code:
(defpackage type-converter
  (:use :common-lisp))
(in-package #:type-converter)
(declaim (ftype (function (t t t &key &allow-other-keys) t) convert))
(defgeneric convert (value from-intentional-type to-intentional-type &key)
  (:documentation "Converts between arbitrary types."))
(defmacro defconverter ((var from-type to-type &rest keys) &body body)
  `(defmethod convert ((,var ,from-type) (from-type (eql ',from-type))
                       (to-type (eql ',to-type))
                       &key ,@keys &allow-other-keys)
     ,@body))
;;; Use coerce where it is acceptable.
(defmacro defconverter-using-coerce (from-value to-value)
  (let ((var (gensym "VARIABLE")))
    `(defconverter (,var ,from-value ,to-value)
       (coerce ,var ',to-value))))
(macrolet ((convert-all-using-coerce (duplets)
             `(progn
                ,@(loop
                    for (from-value to-value) in duplets
                    collect
                    `(defconverter-using-coerce ,from-value ,to-value)))))
  (convert-all-using-coerce
   ((list string)
    (real float)
    (real short-float)
    (real single-float)
    (real double-float)
    (real long-float)
    (real complex))))
;; Get the name of a symbol
(defconverter (sym symbol string)
  (string sym))
;; Code to char and back again.  (CHAR-CODE is a function, not a type,
;; so INTEGER is used here as the type of character codes.)
(defconverter (code integer character)
  (code-char code))
(defconverter (char character integer)
  (char-code char))
;; and so on...
From: Kent M Pitman
Subject: Re: Method specializers for types and subtypes (was: coerce for arbitrary types)
Message-ID: <utzif7j7i.fsf@nhplace.com>
Albert Krewinkel <·······@inb.uni-luebeck.de> writes:
> Uh oh, looks like I was a little bit too optimistic when I claimed that
> a conversion library would be trivial.
Well, first, as I noted before, the article I wrote was really making a
theoretical point--not saying that people would necessarily prefer to
program that way.
Nothing wrong with this as something to cut your teeth on as long as you
keep in mind that it's possible to write lots of things that don't turn
out to be useful. Getting a concrete understanding of how things fit
together in practice is quite useful and instructive.
> In my first pass coding attempt
> (see below) I was stopped by some thoughts about types and specializers:
> Given I have a method `CONVERT' which takes a value and the type from
> which to convert and to what it should be converted.
>
> (defgeneric convert (value from-type to-type))
>
> Now if there is a method that converts reals to floats, it would look
> like this:
>
> (defmethod convert ((value real) (from (eql 'real)) (to (eql ('float))))
> (coerce value to))
Some extra parens in the last one. Not ('float) but 'float. But I assume
that's just a typo.
There are several ways to do this. One is to not write all those detailed
methods and just do:
(defmethod convert ((value real) (from t) (to (eql 'float)))
  (coerce value 'float))
This says that if there is not a more specific method, use COERCE.
It still allows you to define methods on types that don't want to be
done this way, but saves you from writing repetitive methods on the
things that are the same.
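For instance, with such a catch-all method in place (a hypothetical session, assuming a three-argument CONVERT generic function as above):

;; All of these fall through to the (from t) method and call COERCE:
(convert 9 'integer 'float)           ; => 9.0
(convert 1/3 'rational 'float)        ; => a single-float close to 1/3
(convert 9 '(integer 0 100) 'float)   ; => 9.0, the from-type is not inspected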
Kent M Pitman <······@nhplace.com> writes:
> Albert Krewinkel <·······@inb.uni-luebeck.de> writes:
>
>> Uh oh, looks like I was a little bit too optimistic when I claimed that
>> a conversion library would be trivial.
>
> Well, first, as I noted before, the article I wrote was really making a
> theoretical point--not saying that people would necessarily prefer to
> program that way.
>
> Nothing wrong with this as something to cut your teeth on as long as you
> keep in mind that it's possible to write lots of things that don't turn
> out to be useful. Getting a concrete understanding of how things fit
> together in practice is quite useful and instructive.
Half of my programs turn out to be useless except for what I learned by
writing them. The hard part is to make a decision up front: Is the
wasted time worth the mere experience of hacking some code, and how much
will I learn on the way? I find this even harder as preferences have to
shift as personal knowledge increases.
In this case it's definitely worth it, as I seem to be missing some
fundamental understanding of types. Is there any recommended reading
that could help me with this?
>> In my first pass coding attempt
>> (see below) I was stopped by some thoughts about types and specializers:
>
>> Given I have a method `CONVERT' which takes a value and the type from
>> which to convert and to what it should be converted.
>>
>> (defgeneric convert (value from-type to-type))
>>
>> Now if there is a method that converts reals to floats, it would look
>> like this:
>>
>> (defmethod convert ((value real) (from (eql 'real)) (to (eql ('float))))
>> (coerce value to))
>
> Some extra parens in the last one. Not ('float) but 'float. But I assume
> that's just a typo.
Right, that's a usenet-posting bug.
> There are several ways to do this. One is to not write all those detailed
> methods and just do:
>
> (defmethod convert ((value real) (from t) (to (eql 'float)))
> (coerce value 'float))
>
> This says that if there is not a more specific method, use COERCE.
> It still allows you to define methods on types that don't want to be
> done this way, but saves you from writing repetitive methods on the
> things that are the same.
That solves only part of my problem. To rephrase it more precisely: Let
there be a function to read a value of type (or null float) from a string.
The converter should behave in the following way:
(convert "9.0" 'string '(or null float)) => 9.0
(convert "NIL" 'string '(or null float)) => nil
(convert "" 'string '(or null float)) => nil
(let's forget for a second that one might be better off using conditions
here.) How would I fit that into my naïve convert framework, as using
EQL specializers won't help here?
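A possible way out (a sketch only, building on the DEFGENERIC from the earlier post): keep EQL specializers for simple type names, and add a less specific method that inspects compound type specifiers with TYPEP and SUBTYPEP:

;; Sketch: a fallback method for compound target types such as (OR NULL FLOAT).
;; It is deliberately least specific (TO is only specialized on CONS) and uses
;; TYPEP/SUBTYPEP at run time instead of EQL dispatch.
(defmethod convert ((value string) (from (eql 'string)) (to cons) &key)
  (cond ((and (typep nil to)                  ; target type admits NIL
              (or (string= value "")
                  (string-equal value "NIL")))
         nil)
        ((subtypep 'float to)                 ; target type admits floats
         (float (read-from-string value)))    ; toy parser, for the sketch only
        (t (error "Don't know how to convert ~S to ~S" value to))))

;; (convert "9.0" 'string '(or null float)) => 9.0
;; (convert "NIL" 'string '(or null float)) => NIL
;; (convert ""    'string '(or null float)) => NIL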
Albert Krewinkel <·········@gmx.net> writes:
> Half of my programs turn out to be useless except for what I learned by
> writing them. The hard part is to make a decision up front: Is the
> wasted time worth the mere experience of hacking some code, and how much
> will I learn on the way? I find this even harder as preferences have to
> shift as personal knowledge increases.
I forgot to mention the fun, which is the main reason for hacking in
Lisp :-)
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: coerce for arbitrary types
Message-ID: <rem-2008apr09-002@yahoo.com>
> From: Albert Krewinkel <·······@inb.uni-luebeck.de>
> I encountered a couple of situations where I find myself writing a
> lot of code to convert from one thing to another. Especially in my
> semi-toy computational biology project (http://sf.net/projects/clcb
> for those interested), a vast number of functions turned out to be
> in the form of from-this->to-that: triplet to amino-acid, rna
> to protein, amino-acid coordinates to nucleotide coordinates,
> relative to absolute...
Let me guess: You have a function called something like:
triplet-to-aa
which takes a keyword argument code which defaults to the standard code:
<http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c#SG1>
but which allows use of other codes such as:
<http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c#SG2>
etc. Then you have a function called something like:
rna-to-protein
which takes the same keyword argument, and which maps down the rna
sequence looking for a start sequence per that code then passing
triples to triplet-to-aa until it sees a stop sequence, and returns
two values, the sequence of amino acids, and the position where
translation stopped?
> So type conversion is a very common pattern in my code, and Lisp
> has taught me one thing: It's a Good Thing (tm) to
> abstract away common patterns.
But you do this for a specific problem domain, and don't expect
your code to work outside that domain, and don't expect anybody
else's code (including convenience functions such as COERCE within
the standard language) to do the right thing for your specific
problem domain, I hope.
> > http://www.nhplace.com/kent/PS/EQUAL.html
> Very enlightening article!
Ah, you are seeing that article for the first time perhaps? I saw
it years ago and still like it (although I've been wanting to write
something of my own that covers a wider variety of intention, such
as atomic data types, such as unsigned integer with intention of
bit mask or boolean), and I cite it from time to time. I guess
you're a newbie who hasn't seen it before. As the cliche says,
there's always a first time.
> Integrating over all that was said, I take it that writing a
> comprehensive conversion library which relies on intentional-types
> isn't all that bad.
Well, it's easier said than done to decide upon a good system for
classifying intentional types and naming them so as to satisfy a
majority of intentions various people might have. Kent's essay
emphasized containers built out of CONS cells (single CONS cells
with two parts, CDR-chained lists, assoc lists, trees with only
atoms at leaves, etc.), and for that specific class of objects it
might be feasible to clearly define the difference between a
specific kind of container and the elements within it, such that
it's possible to define what a shallow copy of the container would
entail (copy *all* the structure of the container per se, but do
*not* copy any of the structure of the various elements, the
elements of old and new container must be EQ, except for numbers
and characters which might be only EQL). But note that there are at
least two essentially different *kinds* of containers: One kind
merely contains elements. The other kind defines a mapping between
keys and values. You really can't coerce one kind to be the other
kind meaningfully, except in the sense that a map can emulate a set
simply by setting all the values to some default value such as NIL,
so that only the keys are important. So you can inter-convert
between a CDR-linked list and a vector (one-dimensional array), and
for lists of length 2 you can also convert between a CONS cell and
a list. Then you can inter-convert between an association list and
a property list and a hash table. Then you can extract just the key
set or the values set from a map to produce a list, or build a map
from just a list, setting values to NIL, or you can take two
different lists of the same length and collate them to produce
corresponding keys and values to build a map. Then you have the
problem of whether a set or map is ordered or not. Converting a
list into a hashtable tends to scramble the keys. So even the easy
case of defining
(shallow-copy <newContainerType> <oldContainerType> <container>)
won't be a trivial task, especially if you allow user-defined
container types, such as various kinds of self-balancing binary
search trees such as AVL tree and red/black tree and MBBT etc.
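To make the "merely contains elements" vs. "maps keys to values" distinction concrete, here is one small illustration (a sketch, not from the post): converting between an association list and a hash table is easy in each direction, but ordering is lost on the way back:

;; Sketch: shallow conversions between an alist and a hash table.
;; Keys and values are shared; only the container changes.
(defun alist->hash-table (alist &key (test 'eql))
  (let ((table (make-hash-table :test test)))
    (loop for (key . value) in alist
          do (setf (gethash key table) value))
    table))

(defun hash-table->alist (table)
  ;; Hash tables impose no ordering, so round-tripping an alist through a
  ;; hash table may permute (and deduplicate) its entries.
  (loop for key being the hash-keys of table using (hash-value value)
        collect (cons key value)))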
Then if you ever complete all that huge task, including a formal
language for application programmers to define the structure of new
kinds of sets/lists and maps to be handled by your shallow-copy and
other utilities you devise, next comes the intional type of numbers
(signed or unsigned or bitmask or block-of-smaller-bytes, metric or
English units, etc.) and textual sequences (US-ASCII or Latin-1 or
Big5 or UTF-8 or other UniCode representation) and file formats
(images: GIF, JPEG, BMAP, TIFF, etc. / compressed: comp, zip, gzip,
etc. / who knows what else somebody might want converted from one
format/type to another ...)
Advice: Choose some domain small enough that you can do the job
right but large enough to be non-trivial. Don't burn yourself on
the too-hot porridge, but don't get complacent with the too-cold
porridge.
> I'll be knee deep in my newbie code for the next couple of hours.
To do any of that *right* will take more than a couple hours.
Good luck.
Oh, when I started this followup, I wrote myself this note as to
what I wanted to say: "Be sure you don't define a function for two
intersecting domains where they conflict anywhere in the
intersection." I.e. don't make the mistake that the designers of
COERCE made, of having strings and sequences intersect at NIL which
is simultaneously a way to name the word "NIL" which has three
characters, and a way to indicate an empty sequence of zero
anythings, such that people disagreed what the "correct" result
should be, three characters or none.
·······@yahoo.com (Robert Maas, see http://tinyurl.com/uh3t) writes:
>> From: Albert Krewinkel <·······@inb.uni-luebeck.de>
>> I encountered a couple of situations where I find myself writing a
>> lot of code to convert from one thing to another. Especially in my
>> semi-toy computational biology project (http://sf.net/projects/clcb
>> for those interested), a vast number of functions turned out to be
>> in the form of from-this->to-that: triplet to amino-acid, rna
>> to protein, amino-acid coordinates to nucleotide coordinates,
>> relative to absolute...
>
> Let me guess: You have a function called something like:
> triplet-to-aa
> which takes a keyword argument code which defaults to the standard code:
> <http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c#SG1>
> but which allows use of other codes such as:
> <http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c#SG2>
> etc. Then you have a function called something like:
> rna-to-protein
> which takes the same keyword argument, and which maps down the rna
> sequence looking for a start sequence per that code then passing
> triples to triplet-to-aa until it sees a stop sequence, and returns
> two values, the sequence of amino acids, and the position where
> translation stopped?
Yes, essentially that's it, but with some slight modifications. The
translation table for codons is currently specified using a special
variable: I can't imagine any use case in which it is necessary to
change the translation table often. But allowing for a keyword argument
is something I did consider -- and since it seems natural to you (and
probably other people, too), I'll reconsider my design decision.
My main intent in writing the position stuff was to allow mapping between
proteins and {D,R}NA: For an interesting feature in the protein, we want
to know where it is coded in the DNA. This is non-trivial (at least for
me) as splicing of RNA complicates the matter, so a continuous protein
sequence might be scattered over multiple DNA sequences. Plus, the
mapping has to work on any (possibly non-contiguous) DNA slice. My
approach here was to regard those DNA/protein sequences as intervals
in a discrete sequence space. That way, most of the problems described
can be solved using coordinate conversion (another use case for a
`convert' method) and subsequent set theoretic operations on the
resulting intervals. But I digress.
>> So type conversion is a very common pattern in my code, and Lisp
>> has taught me one thing: It's a Good Thing (tm) to
>> abstract away common patterns.
>
> But you do this for a specific problem domain, and don't expect
> your code to work outside that domain, and don't expect anybody
> else's code (including convenience functions such as COERCE within
> the standard language) to do the right thing for your specific
> problem domain, I hope.
I see. I guess the origin of my question was not so much the need for
an existing converter *function* but rather for a unified *interface*
for conversion operations.
>> > http://www.nhplace.com/kent/PS/EQUAL.html
>
>> Very enlightening article!
>
> Ah, you are seeing that article for the first time perhaps? I saw
> it years ago and still like it (although I've been wanting to write
> something of my own that covers a wider variety of intention, such
> as atomic data types, such as unsigned integer with intention of
> bit mask or boolean), and I cite it from time to time. I guess
> you're a newbie who hasn't seen it before.
Yep, exactly. I'd never have guessed I'd still be in raw newbie status
after two years of programming mostly in CL. But then, CL isn't like
any other language, so I should have known.
>> Integrating over all that was said, I take it that writing a
>> comprehensive conversion library which relies on intentional-types
>> isn't all that bad.
>
> Well, it's easier said than done to decide upon a good system for
> classifying intentional types and naming them so as to satisfy a
> majority of intentions various people might have. [...] But note that
> there are at least two essentially different *kinds* of containers:
> One kind merely contains elements. The other kind defines a mapping
> between keys and values. You really can't coerce one kind to be the
> other kind meaningfully, except in the sense that a map can emulate a
> set simply by setting all the values to some default value such as
> NIL, so that only the keys are important.
> [...]
> Advice: Choose some domain small enough that you can do the job
> right but large enough to be non-trivial. Don't burn yourself on
> the too-hot porridge, but don't get complacent with the too-cold
> porridge.
Those are some ugly dilemmas, indeed. There still is my comp-bio
playground which should suffice for me to learn more about all that
stuff.
>> I'll be knee deep in my newbie code for the next couple of hours.
>
> To do any of that *right* will take more than a couple hours.
> Good luck.
I (unwillingly) figured that out already. But then, it's a nice problem
which I hope can give me some deeper insights. So I won't just file it
away.
Thanks for your help!
Albert Krewinkel <·······@inb.uni-luebeck.de> wrote:
+---------------
| The translation table for codons is currently specified using a special
| variable: I can't imagine any use case in which it is necessary to
| change the translation table often. But allowing for a keyword argument
| is something I did consider -- and since it seems natural to you (and
| probably other people, too), I'll reconsider my design decision.
+---------------
Well, remember that you can always make the default value of a
keyword argument be a special variable [or some function of a
special variable, see below]. I use that a lot with output-formatting
functions, e.g.:
(defun format-foo (the-foo &key (stream *standard-output*) ...)
  ...
  (format stream ...stuff...)
  ...)
Or for HTML-formatting stuff the default might be
(HTTP-REQUEST-STREAM *HTTP-REQUEST*) instead of
*STANDARD-OUTPUT*, whatever.
-Rob
-----
Rob Warnock <····@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: coerce for arbitrary types
Message-ID: <rem-2008apr11-001@yahoo.com>
> > Let me guess: ...
> From: Albert Krewinkel <·······@inb.uni-luebeck.de>
> Yes, essentially that's it, but some slight modifications. The
> translation table for codons is currently specified using a special
> variable: I can't imagine any use case in which it is necessary to
> change the translation table often. But allowing for a keyword argument
> is something I did consider -- and since it seems natural to you (and
> probably other people, too), I'll reconsider my design decision.
It's a trade-off. With a keyword argument, you have to pass it through
every level of function call, which might be a burden. With a fluid
(special) variable, the effect is felt everywhere in the dynamic context,
but debugging may be difficult later when you forget where that binding
is coming from a year after you previously looked at the code.
> I guess the origin of my question was not so much the need for an
> existing converter *function* but rather for a unified *interface*
> for conversion operations.
AFAIK Common Lisp doesn't have formal interfaces the way that Java
does. Still I like that jargon, and am in favor of interfaces in CL
to be defined as conventions, via documentation and practice, and
eventually widely accepted by consensus. Whether any such agreement
will ever occur is unknown. My own convention not just for
conversions but for a lot of data-processing tasks is to list the
types of input, then "to", then the type of output. But the convention
is gradually developing, not yet firm. For example:
alphasorhists-to-diffvec -- Takes two histograms, each alphabetically sorted,
and collates them to produce the difference vector.
hist-euclidean-diagonal -- Takes a histogram, computes length of diagonal
from origin, i.e. sqrt of sum of squares.
(defun alphasorhists-to-cartdist (sorhist1 sorhist2)
(hist-euclidean-diagonal (alphasorhists-to-diffvec sorhist1 sorhist2)))
-- Takes two histograms, computes Cartesian distance between them.
hist-to-ht -- Takes a histogram, builds a hash table with all its data.
pr+tot+ht+tot-to-freqratio -- Takes a pair (i.e. string+count entry in a histogram),
and the total for that histogram, and a hash table (from another histogram),
and the total for that other histogram, and computes the ratio between
the frequencies in the histograms.
hist+ht+tot-to-freqrathist -- Takes a full histogram, and the hashtable+total
for another histogram, and converts the first histogram to a histogram
of frequency ratios.
> >> > http://www.nhplace.com/kent/PS/EQUAL.html
> >> Very enlightening article!
> > Ah, you are seeing that article for the first time perhaps? I saw
> > it years ago and still like it (although I've been wanting to write
> > something of my own that covers a wider variety of intention, such
> > as atomic data types, such as unsigned integer with intention of
> > bit mask or boolean), and I cite it from time to time. I guess
> > you're a newbie who hasn't seen it before.
> Yep, exactly. I'd never have guessed I'd still be in raw newbie status
> after two years of programming mostly in CL.
Oh no, I didn't mean you're a *raw* newbie, just that you haven't
happened to have encountered that wonderful essay yet so in some
sense you're a newbie in that realm.
I've programmed in Lisp at least 16 years (Stanford Lisp 1.6, UCI
Lisp, MacLisp, Standard Lisp, Portable Standard Lisp, and various
versions of Common Lisp). But there are major features of Common
Lisp that I haven't yet used because the stuff I already knew
sufficed for all my needs. For example, I have only recently
thought of what might be my very first serious use of CLOS. Also I
don't have access to any machine where CLIM can run, so I haven't
used that yet either. So I'm a newbie in CLOS (played with it just
slightly) and a non-user of CLIM. I haven't yet found a use for any
of those downloadable packages such as all that asdf stuff yet either.
(The problem with those is lack of any good information system
whereby I could find what I might have a use for and get enough
user-evaluation to know whether it'll be worth my effort or not.)
·················@SpamGourmet.Com (Robert Maas, http://tinyurl.com/uh3t) writes:
> AFAIK Common Lisp doesn't have formal interfaces the way that Java
> does. Still I like that jargon, and am in favor of interfaces in CL
> to be defined as conventions, via documentation and practice, and
> eventually widely accepted by consensus. [...]
Or, if you need some more formal interfaces, you can implement them
easily, thanks to Lisp.
For example, have a look at the slime/swank sources: swank-backend.lisp
contains a DEFINTERFACE macro used to define the interface between
slime and swank, and if an implementation (one of the other
swank-${implem}.lisp files) doesn't implement the whole interface, this
is detected and a warning or an error is signaled.
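A stripped-down illustration of the pattern (this is not swank's actual DEFINTERFACE, just a sketch of the idea; it assumes a lambda list of required parameters only):

;; Sketch of the pattern only -- the real swank-backend.lisp is more elaborate.
(defvar *interface-functions* '()
  "Names of the functions every backend is expected to implement.")

(defmacro definterface (name lambda-list &optional documentation)
  "Declare NAME as part of the interface, with a default definition
that signals an error until a backend redefines it."
  `(progn
     (pushnew ',name *interface-functions*)
     (defun ,name ,lambda-list
       ,@(when documentation (list documentation))
       (declare (ignorable ,@lambda-list))
       (error "~S is not implemented by this backend." ',name))))

;; A backend supplies an implementation simply by redefining the function;
;; missing implementations could be reported by walking *INTERFACE-FUNCTIONS*.
(definterface frob-widget (widget)
  "Frob WIDGET in a backend-specific way.")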
--
__Pascal Bourguignon__
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: coerce for arbitrary types
Message-ID: <rem-2008apr09-001@yahoo.com>
> > > From: Scott Burson <········@gmail.com>
> > > If I understand the hyperspec correctly, then there is no way to extend
> > > coerce to accept any other result-type than the ones explicitly listed.
> From: Kent M Pitman <······@nhplace.com>
> Why do you need to do this? This is often a sign of a program being
> structured wrong--trying to get too much information out of too few bits.
> http://www.nhplace.com/kent/PS/EQUAL.html#two-bit_rule
Well maybe that's sorta relevant, but the main essay (below) is
more to the point, IMO.
> In fact, the entire article
> http://www.nhplace.com/kent/PS/EQUAL.html
> in which that rule resides builds up to the relevant point, which
> is that in dynamic languages, additional care is required in order
> to infer "intent" because the fact of a representation's use is not
> the same as the intent of a representation's use, and you can only
> dynamically query about the fact of its use, not the intent.
IMO that's primarily because we pass raw objects, rather than
intention-tagged objects, as parameters to functions. We do this
for efficiency, but there's a tradeoff that the runtime system
doesn't have sufficient information to know which of many possible
intentional types is mapped to the given runtime-implementational
type. But this problem isn't limited to dynamic languages, because
even in static languages people use the same declared datatype to
represent several different kinds of data-object, because the
programming language doesn't have enough declaration datatypes to
cover all the possible intentions one-to-one. Consider for example
that most languages have a single datatype to cover:
- Unsigned integer (of some particular length)
- Bitmask (of some particular number of bits)
- Boolean (using some convention to map two integer/bitmask values
to the two values True/False, and somehow gracefully dealing with
other values that may occur in lieu of the "correct" values, often
*deliberately* allowing all those other values to represent one of
the boolean values to allow some shortcuts in coding).
Then there's the most extreme case of void* in C.
Now in Common Lisp, all we'd need to do to demarcate the boundary
between container and element in structures based on the CONS cell
would be to wrap the overall container with an extra layer of
intent; then the generic function can use that info to decide what
operation we *really* want to do upon some subset of the CONS cells
dangling from the toplevel container pointer. But this increases
the overhead hence slowness of execution, and prevents sharing
common structure between multiple containers of the same type.
Likewise an extra layer of intent around a primitive type such as
integer can tell us whether it's to be treated as a number or a
bitmask. Here the overhead of the extra wrapper is perhaps more
extreme.
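A minimal sketch of that wrapping idea (all names invented for illustration): tag the raw cons structure with its intended interpretation, and dispatch on the tag rather than on the conses:

;; Sketch: tag a cons-based container with its intended interpretation,
;; then dispatch on the tag instead of guessing from the conses themselves.
(defstruct (tagged (:constructor tag (intent data)))
  intent  ; e.g. :ALIST or :PLIST
  data)   ; the underlying cons structure, shared rather than copied

(defgeneric lookup (container key)
  (:documentation "Look up KEY in CONTAINER according to its declared intent."))

(defmethod lookup ((container tagged) key)
  (ecase (tagged-intent container)
    (:alist (cdr (assoc key (tagged-data container))))
    (:plist (getf (tagged-data container) key))))

;; (lookup (tag :alist '((a . 1) (b . 2))) 'a) => 1
;; (lookup (tag :plist '(a 1 b 2)) 'a)         => 1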
And in static languages, we might use a typedef to add the
intentional information, which a polymorphic compiler could use to
choose the correct function to apply to the data. (I'm pretty sure
K&R C doesn't support this, whereas C++ does. I'm not sure whether
Objective Think C, which I used briefly in 1992, does or doesn't.)
> This is not a bug in dynamic languages because it buys
> additional, much needed flexibility in other situations.
Agreed, but not just in dynamic languages, such as integer being
used as bitmask because there is no built-in bitmask datatype, or
because the compiler can't handle two functions by the same name
that take different-typed arguments (i.e. compiler isn't
polymorphic).
> But in the case here, you have to be careful to either pass more
> information or not to infer so much information from what is
> passed.
Agreed. I think the main lesson is for newbies to realize that pair
and list and alist and property-list and tree all look exactly the
same to the runtime system, a pointer to a single CONS cell that
has more or less "massive quantities of" stuff hanging off it,
perhaps exactly the *same* stuff, perhaps decisively different
stuff but it would require too much CPU time to explore a structure
deeply to decide which stuff hangs off it every time a toplevel
operation is to be performed on "it" that is supposed to do
different things depending on the intention of the programmer as
expressed by "all that stuff" hanging off it.
So the newbie must learn to call different functions to deal with
each of the various kinds of structures all hanging off the same
exact kind of CONS cell, or:
> Sometimes more information means "another argument" and sometimes
> it means "a different representation of the data than the one
> you're trying to coerce".
As for "another argument" as a solution: In general this requires
the programmer write his/her own utility function which takes that
extra argument (parameter) and dispatches to the appropriate
built-in or programmer-defined single-case utility.
Your (Pitman's) essay on intention of data structure deals mostly
only with CONS-based structures. But the OP and similar threads
lately have dealt with floating point values converted to
rationals. Here the ambiguity isn't so much the intentional *type*
of the data, but rather the precision and accuracy. After all, a
floating point value is **supposed** to be only an approximation,
not an exact value, whereas a rational is supposed to be an exact
value. So when the runtime system sees only a floating-point value,
how is the system supposed to know how [in]accurate the value is?
Any good programmer (cough cough) performs numerical analysis to
learn the tolerance for every floating-point approximation within
his/her program at every point during execution, right? So at the
point where a floating-point approximation is to be printed out, or
converted to a rational, the programmer can include code to
explicitly take that known error-margin into consideration, right?
Common Lisp provides only two built-in conversions:
- (rational <float>) assumes the tolerance is zero, the value is *exact*.
- (rationalize <float>) assumes the data value was entered by
parsing from a string, and no further processing was done after
parsing, and the string before parsing was exactly the correct
decimal-fraction, and the floating-point precision was sufficient
to distinguish that decimal fraction from any other fraction
whatsoever of equal or smaller denominator.
Obviously neither assumption is correct in most cases where a
floating-point value is converted to a rational. Even when the data
value was obtained immediately from parsing string representation,
we have to beware of the limit on precision available in various
floating-point internal representations:
(rationalize 1.237) => 1237/1000 (good)
(rationalize 1.2373) => 7039/5689 (whoops, not enough bits in single-float)
(rationalize 1.2373d0) => 12373/10000 (good)
...
(rationalize 1.2372583d0) => 12372583/10000000 (good)
(rationalize 1.23725683d0) => 101747482/82236347 (whoops, n.e.b.i.double-f.)
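A sketch of the kind of error-margin-aware conversion alluded to above (a hypothetical helper, not a standard function): instead of trusting RATIONAL or RATIONALIZE, ask explicitly for the smallest-denominator fraction within a stated tolerance:

;; Hypothetical helper: brute-force search for the rational with the smallest
;; denominator within +/- TOLERANCE of X.  It terminates because for
;; denominator D the nearest fraction is within 1/(2*D) of X, which
;; eventually drops below any positive TOLERANCE.
(defun rational-within (x tolerance)
  (check-type tolerance (real (0)))
  (loop for denominator from 1
        for numerator = (round (* x denominator))
        when (<= (abs (- x (/ numerator denominator))) tolerance)
          return (/ numerator denominator)))

;; (rational-within 0.333333 1/100) => 1/3
;; (rational-within pi 1/100)       => 22/7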
> So, for example:
> > IMO [not allowing coerce to be extended is] a good idea, because coerce
> > is already a crock from the start. It is not possible to change something
> > from the type it was to the type you want it to be
> This isn't the reason it's a "crock". That's just the reason it has a
> bad name. The definition has issues that make it not easily
> extendable independent of its name, and those issues derive from the
> nature of the language. In brief, COERCE is too short a name and
> falsely gives the sense that it is a canonically defined operation
> with only one meaning rather than an operation that has several
> (perhaps myriad) friends that are all vying for the cool short-name.
Ah, you are more diplomatic than I am. Indeed there simply
shouldn't be a function by that name, or it should handle fewer
cases than it currently does.
> If these were domain names, not function names, the names
> coerce.org, coerce.net, coerce.cc, coerce.tv, etc. would all be
> bought up and people would be assigning them subtly different
> meanings. But when there's no trailing gTLD to remind us, it's easy
> to forget that there was a competition in play.
Hmm, nice way of looking at the issue.
> The real problem is that something you can customize needs to be
> defined in such a way that it's clear what customizing it would mean.
I.e. "coerce" is too vague to clearly say what it's supposed to do
in general when there are multiple reasonable things a function
might do? (Be glad you actually need to call a function to convert
a value from one type to a reasonable value of another type.
Compare with languages such as C where simply assigning a value to
a different type of variable automatically performs some kind of
conversion, or even C++ where this issue was clarified in some
cases but left as automatic conversion in other cases. At least in
Lisp we can argue about what a named function should do, rather
than what an "assignment" operator should also do in addition to
its primary task of assignment.)
> For historical reasons, COERCE is already messed up because of
> the NIL vs () issue [note well: NOT the issue commonly fussed
> over, which is the false vs. empty list issue].
Hmm, on that non-issue, I can imagine that COERCE applied to a
sequence, with target type not any sequence type, might map down
the list like MAP, converting each element into the requested type,
putting the elements into a new sequence of the same type as the
old sequence as much as possible. For example, if (coerce
<int> 'character) performed (code-char <int>), then (coerce
<sequence-of-ints> 'character) might perform (map 'vector #'code-char
<sequence-of-ints>).
Back to the main issue: It's indeed unfortunate that early versions
of Lisp tried to get by on very few types of objects (fixnums, CONS
cells, symbols, nothing else that I can recall), so of course the
idea of having yet another data type reserved solely for a single
object that indicated an empty list, which read and printed as (),
would have seemed extravagant. So somebody needed to decide whether
the fixnum 0 or the symbol NIL would denote the empty list (hence
the last CDR of a non-empty list), and NIL won. I suppose the idea
was that it would be useful to distinguish between any number
whatsoever, including zero, and an empty list, and hence it would
be a pain if some particular number, especially a commonly-used
value such as zero were to be the indicator of empty list, whereas
NIL was such a strangely named symbol that it wouldn't turn up in
normal use, so it could take on this role without causing lots of
trouble? Anybody who really needed the symbol NIL for any other
purpose would just have to deal with the consequences, while the
majority wouldn't ever suffer a problem? And after 25 years of
using NIL to denote the empty list were assumed by a huge amount of
working code, the idea of changing that fact of life when merging
MacLisp and LispMachineLisp etc. into Common Lisp seemed
politically infeasible, right?
> The simple case of:
> (coerce nil 'string) => ""
> (string nil) => "NIL"
> illustrates the problem.
IMO since STRING needs to interpret the argument as some sort of
sequence of objects that each can be coerced into a character, it
needs to decide whether NIL is to be interpreted as an empty list
of objects which *would* have been convertible to characters if
there had been any in the list, or as a symbol whose print-name
should be used. If STRING worked like this:
(string '(#\F #\O #\O)) => "FOO"
then it would have been reasonable for this to also work:
(string '()) => ""
i.e.
(string NIL) => ""
But that kind of conversion isn't defined at all, whereas this *is* defined:
(string 'FOO) => "FOO"
So the only reasonable option is:
(string 'NIL) => "NIL"
and since NIL evaluates to itself, the quoting isn't necessary:
(string NIL) => "NIL"
Now if conversion from list of characters to string was in fact
implemented, then we'd have an ambiguity as to what (string NIL)
should return, and I'd prefer it to signal an error, suggesting
alternate code to make clear the intent of the programmer:
(map 'string 'identity '(#\F #\O #\O)) => "FOO"
(map 'string 'identity '()) => ""
(map 'string 'identity nil) => ""
(symbol-name 'FOO) => "FOO"
(symbol-name 'NIL) => "NIL"
> The reason this happens by the way is that STRING plainly does
> not work on arbitrary sequences, so there's no confusion that
> STRING is not asking to operate on sequences.
Agreed.
> But COERCE does take sequences, and empty sequences were common
> and useful. The progression
> (coerce '(a b) 'vector)
> => #(A B)
> (coerce '(a) 'vector)
> => #(A)
> (coerce '() 'vector)
> => #(#\N #\I #\L)
> would have been devastating to getting any work done.
Although coerce may be "nice" as a shortcut for lazy programmers,
really IMO the programmer should have written:
(map 'vector #'identity '(a b))
(map 'vector #'identity '(a))
(map 'vector #'identity '())
Now consider this progression:
(coerce '(#\F #\O #\O) 'vector)
=> #(#\F #\O #\O)
(coerce * 'string)
=> "FOO"
(coerce '(#\F #\O #\O) 'string)
Looks like it should work, by transitivity, right? If the list is
considered "the same" as the vector, except for implementation
type, such that there's a default conversion from one to the other,
and if the vector is considered "the same" as the string, likewise
default conversion, then why aren't the list and string considered
"the same" too??
(coerce "FOO" 'vector)
=> "FOO" ;Because string is a sub-type of vector, so no conversion is needed.
(coerce #(#\F #\O #\O) 'list)
=> (#\F #\O #\O)
(coerce "FOO" 'list)
=> (#\F #\O #\O)
Wow, it works in the opposite direction all the way from string to list!
(coerce "" 'list)
=> NIL
Yes, Virginia, there really is a reverse coercion available!
Putting it all together:
(string (coerce "" 'list)) => "NIL"
That one little snippet of code is terse+deep enough to be a newbie koan!
> In effect, you are meant to make the choice between STRING and
> (COERCE x 'STRING) somewhat on the basis of understanding how you
> want this particular tie to be broken.
IMO that is a perverse bit of trivia for Lisp programmers to need
to memorize and not screw up. IMO it's better that neither STRING
nor COERCE be used at all, ever, in production code. Now since most
of the string-handling functions implicitly apply STRING to their
supposedly-string arguments, we must understand the three cases
where STRING works (string to itself, character to single-character
string, or symbol to print name). But still, directly calling
STRING from application code seems unnecessary. Because the string
functions are somewhat generic, working equally well with actual
strings as well as with characters or symbols, and because they don't
know the intention of the programmer, this kind of "magic" coercion
to the proper type may be necessary.
working with characters as if strings, or with symbols as if their
print-names, I see no reason the programmer can't make his/her
intention explicit by coding what is really intended, either
(format nil "~A" <char>) or (symbol-name <sym>) instead of asking
the magic genie STRING to do "what's right" which as we see above
isn't always what's right.
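For concreteness (these are the results I'd expect from any
conforming implementation):
(string "FOO")  => "FOO"   ;string to itself
(string #\F)    => "F"     ;character to single-character string
(string 'foo)   => "FOO"   ;symbol to its print name
and the explicit spellings I'd rather see in application code:
(format nil "~A" #\F) => "F"
(symbol-name 'foo)    => "FOO"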
> But if we were to make it open season on extending this, in the
> presence of a dynamic language with multiple inheritance, this kind
> of problem would happen all the time, and the result would be
> ragged... absent the intent information I alluded to above.
Indeed, if the previous discussion wasn't enough to convince the
diehard genie-user to stop using the magic genie and clearly say
exactly what data conversion is desired, this nuclear weapon of
clue-by-four should suffice.
Now I was almost going to make a general recommendation to avoid
using shortcuts that disguise intent. For example, you want to do
one thing if the value of x is nil and something else otherwise,
you should say:
(if (null x) <FirstThing> <SecondThing>)
or else
(if (not (null x)) <SecondThing> <FirstThing>)
rather than
(if x <SecondThing> <FirstThing>)
But then I realized I take a similar shortcut frequently when
parsing strings by scanning for successive occurrences of
particular delimiting characters or white-space. IX0 is the
starting index, i.e. the index where I left off the previous time
around the loop, like this:
(unless (setq ix1 (position #\! str :start ix0))
  <FinalCode>)
(setq ix2 (or (position #\Space str :start ix0)
              (length str)))
(push (subseq str (+ 1 ix1) ix2) revres)
The shortcut is in the first argument to OR: POSITION returns NIL
when the character isn't found before the end of the string.
I don't see any cleaner way to write that kind of code, so maybe I
can think of guidelines for when to use the shortcut that non-NIL
is treated as true and when to make the intent more explicit,
rather than saying to try to always make intent explicit?
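To make the shortcut concrete, here's a minimal sketch of that kind
of scanning loop (the function name SPLIT-ON and its details are
made up for illustration, not the actual code I was describing):
(defun split-on (char str)
  "Collect the substrings of STR delimited by CHAR, scanning with POSITION."
  (let ((ix0 0)
        (revres '()))
    (loop
      (let ((ix1 (position char str :start ix0)))
        ;; Shortcut: non-NIL IX1 is treated as true; NIL means "not found".
        (push (subseq str ix0 (or ix1 (length str))) revres)
        (unless ix1 (return (nreverse revres)))
        (setq ix0 (1+ ix1))))))
(split-on #\! "foo!bar!baz") => ("foo" "bar" "baz")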
Now back to what you said about class inheritance etc. It seems to
me that when somebody is developing a domain-specific bunch of
code, it's reasonable to define some default coercions that make
sense for that specific domain, but *not* expect the underlying
Lisp system to already provide such custom intent built-into
functions such as COERCE. Let the domain-specific programmer write
a domain-specific coercion function him/herself, and make such a
function consistent for just that one domain, and not over-reach.
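As a sketch of what I mean (the name CONVERT-VALUE and the example
methods are invented purely for illustration), the domain programmer
could hang methods on an ordinary generic function and keep COERCE
out of it entirely:
(defgeneric convert-value (value to-type)
  (:documentation "Domain-specific conversion, deliberately separate from COERCE."))

(defmethod convert-value ((value string) (to-type (eql 'keyword)))
  (intern (string-upcase value) :keyword))

(defmethod convert-value ((value symbol) (to-type (eql 'string)))
  (symbol-name value))

(defmethod convert-value (value to-type)
  (error "No conversion defined from ~S to ~S in this domain." value to-type))

(convert-value "foo" 'keyword) => :FOO
(convert-value 'bar 'string)   => "BAR"
Each domain adds only the methods that make sense there, so nothing
over-reaches.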
Regarding that proposed koan earlier: Does anybody have a
collection of short code snippets which likewise illustrate a whole
mess of deep understanding, where just trying to understand why
this particular snippet of code produces the result it does will
enlighten a newbie?
Robert Maas, see http://tinyurl.com/uh3t wrote:
> Regarding that proposed koan earlier: Does anybody have a
> collection of short code snippets which likewise illustrate a whole
> mess of deep understanding, where just trying to understand why
> this particular snippet of code produces the result it does will
> enlighten a newbie?
As we just witnessed (if you've been reading this list recently),
a 10-line recursive BFS is too much for a
newbie to figure out by himself if he's never seen something
like that done in CL before. I don't have a snippet collection that
you speak of, but if there were such a collection, I would expect
something like recursive BFS to be included since it demonstrates a
powerful problem-solving method so well.
On the other hand, after reviewing my notes, I guess I do have a
few suggestions.
Pascal-B illustrates recursion in a pedagogical and also useful
manner:
(defun enumerate (start end)
  (when (< start end)
    (cons start (enumerate (1+ start) end))))
A recursive macro written by Mr. Crowe has a certain beauty to it.
Probably not a good idea for the newbie to think they have to fully
understand it, but to understand that this is the type of code that
can be written, nay, should be written. Short, yet flexible and reusable.
CL-USER> (defmacro base-bind (unit-var amount (&rest var-and-radix) &body code)
           (if (endp var-and-radix)
               `(let ((,unit-var ,amount)) ,@code)
               (let ((transfer (gensym)))
                 `(multiple-value-bind (,transfer ,unit-var)
                      (floor ,amount ,(cadar var-and-radix))
                    (base-bind ,(caar var-and-radix) ,transfer ,(cdr var-and-radix)
                      ,@code)))))
CL-USER> (base-bind pence 1000 ((shillings 12)(pounds 20))
           (list pounds shillings pence))
=> (4 3 4)
CL-USER> (base-bind seconds 100000 ((minutes 60)(hours 60)(days 24))
           (values days hours minutes seconds))
1
3
46
40
The random shuffle algorithm is accessible and shows that
there are lots of ways of accomplishing something seemingly
simple. (Take that Python.)
http://www.pvk.ca/Blog/Lisp/malmesure_quand_tu_nous_tiens.html
I also think Ron's iteration article is well written and
provides insight. He extracts an important pattern.
(Is there any high-level pattern more important than iteration,
other than "IF"?)
http://rondam.blogspot.com/2008/02/joy-of-iterators.html
A nice example of parameterization, first-class functions,
and functional programming:
(defun best (sequence comparison &key (key #'identity))
  (reduce (lambda (x y)
            (if (funcall comparison (funcall key x) (funcall key y))
                x y))
          sequence
          :initial-value (elt sequence 0)))
[11]> (best '(5 6 42 8 666 9 69 -42) #'> )
666
Why recursion is not always the best solution:
[16]> (defun fact(n) (if (zerop n) 1 (* n (fact (1- n)))))
FACT
[17]> (fact -1)
*** - Lisp stack overflow. RESET
[18]> (fact 10000)
*** - Lisp stack overflow. RESET
Example of clear, concise Lispy style:
[From "Tutorial on Good Lisp Programming Style"]
(defun count-all-numbers (exp)
  (typecase exp
    (cons
     (+ (count-all-numbers (first exp))
        (count-all-numbers (rest exp))))
    (number 1)
    (t 0)))
Learn to use accumulators:
[More from "Tutorial on Good Lisp Programming Style"]
(defun product (numbers &optional (key #'identity)
                                  (accum 1))
  "Like (reduce #'* numbers), but bails out early
   when a zero is found."
  (if (null numbers)
      accum
      (let ((term (funcall key (first numbers))))
        (if (= term 0)
            0
            (product (rest numbers) key (* accum term))))))
SORT has to be handled with care: even though it is destructive,
if you want to sort a list you have to use the result of SORT.
This won't give a newbie the expected result:
(let ((mylist (list "c" "d" "a")))
  (sort mylist #'string<)
  mylist)
Better:
(let ((mylist (list "c" "d" "a")))
  (setq mylist (sort mylist #'string<))
  mylist)
More from Alan Crowe, this time showing how easy it is to
accomplish difficult tasks if you can properly determine
the right lever. [I've made a few spelling corrections.]
The cartesian product of two sets is easy enough, which
suggests using a fold to lift the operation from binary to
n-ary. There is a snag.
{a, b} x {1, 2} = {(a, 1), (a, 2), (b, 1), (b, 2)}
so strictly speaking
{a, b} x {1, 2} x {p, q}
is
( {a, b} x {1, 2} ) x {p, q} =
{((a, 1), p), ((a, 2), p), ((b, 1), p), ((b, 2), p),
((a, 1), q), ((a, 2), q), ((b, 1), q), ((b, 2), q)}
a technicality which mathematicians think of as /hair/ and
try to avoid or shave off.
The exact detail of the pair construction operator matters. I've
written my code to foreground this detail
#| Start from a generalization of the cartesian product of two sets, so that
f, {a,b}, {x,y} => {f(a,x), f(a,y), f(b,x), f(b,y)} |#
(defun product-set (f)
  (lambda (u v)
    (let (w)
      (dolist (x u w)
        (dolist (y v)
          (push (funcall f x y) w))))))
#| I think this function has a clear theme
(funcall (product-set #'*) '(1 2 3) '(5 7))
=> (21 15 14 10 7 5)
|#
;;; Minor utility function, turns an item into a set
;;; with two elements, the singleton set and the empty set
;;; (in-and-out 3) => ((3) NIL)
(defun in-and-out (x)
  "x => ({x} {})"
  (list (list x) nil))
;;; Now we can define power-set and cartesian-product as folds
;;; that is, we use REDUCE to raise a suitably chosen product-set function
;;; from binary to n-ary
;;; (power-set '(1 2 3)) => ((2 1) (2 1 3) (1) (1 3) (2) (2 3) NIL (3))
(defun power-set (set)
  (reduce (product-set #'union)
          (mapcar #'in-and-out set)))
;;; (cartesian-product '(0) '(1 2) '(3 4 5))
;;; => ((0 1 5) (0 1 4) (0 1 3) (0 2 5) (0 2 4) (0 2 3))
(defun cartesian-product (&rest sets)
  (reduce (product-set #'cons)
          sets
          :from-end t
          :initial-value (list nil)))
And last, I would stay away from mentioning
the eval operator. There's a reason why its Hamming distance
indicates adjacency to 'evil.' Not because eval is evil,
but asking for it is usually an indicator that the
OP is way off the trail, and sending a bunch of good
people thrashing around after him is a giant waste
of everybody's time.
But it doesn't matter anyway because newbies don't read,
they think it's faster to just start coding.
vanekl <·····@acd.net> writes:
> [...]
> Why recursion is not always the best solution:
>
> [16]> (defun fact(n) (if (zerop n) 1 (* n (fact (1- n)))))
> FACT
>
> [17]> (fact -1)
> *** - Lisp stack overflow. RESET
Yes, this is a bug:
(defun fact(n) (if (minusp n) 1 (* n (fact (1- n)))))
> [18]> (fact 10000)
> *** - Lisp stack overflow. RESET
Indeed, CL doesn't mandate TCO. But most implementations do implement
it, including clisp, with the right optimization level. Try in clisp:
(compile 'fact)
(fact 10000)
> [...]
> The cartesian product of two sets is easy enough, which
> suggests using a fold to lift the operation from binary to
> n-ary. There is a snag.
>
> {a, b} x {1, 2} = {(a, 1), (a, 2), (b, 1), (b, 2)}
>
> so strictly speaking
>
> {a, b} x {1, 2} x {p, q}
>
> is
>
> ( {a, b} x {1, 2} ) x {p, q} =
>
> {((a, 1), p), ((a, 2), p), ((b, 1), p), ((b, 2), p),
> ((a, 1), q), ((a, 2), q), ((b, 1), q), ((b, 2), q)}
>
> a technicality which mathematicians think of as /hair/ and
> try to avoid or shave off.
Lispers too.
(cons 'p (cons '1 (cons 'a nil))) --> (p 1 a) == (p . (1 a)) ;-)
--
__Pascal Bourguignon__ http://www.informatimago.com/
Until real software engineering is developed, the next best practice
is to develop with a dynamic system that has extreme late binding in
all aspects. The first system to really do this in an important way
is Lisp. -- Alan Kay
Pascal Bourguignon <···@informatimago.com> writes:
> vanekl <·····@acd.net> writes:
> > [...]
> > Why recursion is not always the best solution:
> >
> > [16]> (defun fact(n) (if (zerop n) 1 (* n (fact (1- n)))))
> > FACT
> >
> > [17]> (fact -1)
> > *** - Lisp stack overflow. RESET
>
> Yes, this is a bug:
>
> (defun fact(n) (if (minusp n) 1 (* n (fact (1- n)))))
The bug in your version is a lot more serious than the bug in the original.
(Hint: What will your version return for 0?).
> > [18]> (fact 10000)
> > *** - Lisp stack overflow. RESET
>
> Indeed, CL doesn't mandate TCO. But most implementations do implement
> it, including clisp, with the right optimization level. Try in clisp:
I'd bet that clisp will not do TCO on either of the functions
presented here, for the simple reason that neither one of them is tail
recursive. Or does clisp do automatic transformation of non tail
recursive algorithms to tail recursive ones these days.
--
Juho Snellman
Juho Snellman <······@iki.fi> writes:
> Pascal Bourguignon <···@informatimago.com> writes:
>> vanekl <·····@acd.net> writes:
>> > [...]
>> > Why recursion is not always the best solution:
>> >
>> > [16]> (defun fact(n) (if (zerop n) 1 (* n (fact (1- n)))))
>> > FACT
>> >
>> > [17]> (fact -1)
>> > *** - Lisp stack overflow. RESET
>>
>> Yes, this is a bug:
>>
>> (defun fact(n) (if (minusp n) 1 (* n (fact (1- n)))))
>
> The bug in your version is a lot more serious than the bug in the original.
> (Hint: What will your version return for 0?).
Right. I usually write it as (defun fact(n) (if (< n 1) 1 (* n (fact (1- n)))))
but I let vanekl's zerop influence me :-(
>> > [18]> (fact 10000)
>> > *** - Lisp stack overflow. RESET
>>
>> Indeed, CL doesn't mandate TCO. But most implementations do implement
>> it, including clisp, with the right optimization level. Try in clisp:
>
> I'd bet that clisp will not do TCO on either of the functions
> presented here, for the simple reason that neither one of them is tail
> recursive. Or does clisp do automatic transformation of non tail
> recursive algorithms to tail recursive ones these days.
Correct, this formulation of fact is not TCOptimizable.
However, when compiled the stack usage is greatly reduced, so (fact
10000) can be computed. But this has nothing to do with TCO indeed.
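For what it's worth, a version written with an accumulator puts the
recursive call in tail position, so an implementation that performs
TCO could run it in constant stack; a sketch only, not a claim about
what clisp does with this particular definition:
(defun fact-tail (n &optional (accum 1))
  (if (< n 1)
      accum
      (fact-tail (1- n) (* n accum))))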
--
__Pascal Bourguignon__
Pascal J. Bourguignon wrote:
> Juho Snellman <······@iki.fi> writes:
>
>> Pascal Bourguignon <···@informatimago.com> writes:
>>> vanekl <·····@acd.net> writes:
>>>> [...]
>>>> Why recursion is not always the best solution:
>>>>
>>>> [16]> (defun fact(n) (if (zerop n) 1 (* n (fact (1- n)))))
>>>> FACT
>>>>
>>>> [17]> (fact -1)
>>>> *** - Lisp stack overflow. RESET
>>> Yes, this is a bug:
>>>
>>> (defun fact(n) (if (minusp n) 1 (* n (fact (1- n)))))
>> The bug in your version is a lot more serious than the bug in the original.
>> (Hint: What will your version return for 0?).
>
> Right. I usually write it as (defun fact(n) (if (< n 1) 1 (* n (fact (1- n)))))
> but I let vanekl's zerop influence me :-(
>
>>>> [18]> (fact 10000)
>>>> *** - Lisp stack overflow. RESET
>>> Indeed, CL doesn't mandate TCO. But most implementations do implement
>>> it, including clisp, with the right optimization level. Try in clisp:
>> I'd bet that clisp will not do TCO on either of the functions
>> presented here, for the simple reason that neither one of them is tail
>> recursive. Or does clisp do automatic transformation of non tail
>> recursive algorithms to tail recursive ones these days.
>
> Correct, this formulation of fact is not TCOptimizable.
>
> However, when compiled the stack usage is greatly reduced, so (fact
> 10000) can be computed. But this has nothing to do with TCO indeed.
>
Pascal,
Then am I correct in inducing that when you use Clisp and are able to
compile your code that recursion has never stood in your way
(aside from the possible speed penalty)?
vanekl <·····@acd.net> writes:
> Pascal J. Bourguignon wrote:
>> Juho Snellman <······@iki.fi> writes:
>>
>>> Pascal Bourguignon <···@informatimago.com> writes:
>>>> vanekl <·····@acd.net> writes:
>>>>> [...]
>>>>> Why recursion is not always the best solution:
>>>>>
>>>>> [16]> (defun fact(n) (if (zerop n) 1 (* n (fact (1- n)))))
>>>>> FACT
>>>>>
>>>>> [17]> (fact -1)
>>>>> *** - Lisp stack overflow. RESET
>>>> Yes, this is a bug:
>>>>
>>>> (defun fact(n) (if (minusp n) 1 (* n (fact (1- n)))))
>>> The bug in your version is a lot more serious than the bug in the original.
>>> (Hint: What will your version return for 0?).
>> Right. I usually write it as (defun fact(n) (if (< n 1) 1 (* n
>> (fact (1- n)))))
>> but I let vanekl's zerop influence me :-(
>>
>>>>> [18]> (fact 10000)
>>>>> *** - Lisp stack overflow. RESET
>>>> Indeed, CL doesn't mandate TCO. But most implementations do implement
>>>> it, including clisp, with the right optimization level. Try in clisp:
>>> I'd bet that clisp will not do TCO on either of the functions
>>> presented here, for the simple reason that neither one of them is tail
>>> recursive. Or does clisp do automatic transformation of non tail
>>> recursive algorithms to tail recursive ones these days.
>> Correct, this formulation of fact is not TCOptimizable.
>> However, when compiled the stack usage is greatly reduced, so (fact
>> 10000) can be computed. But this has nothing to do with TCO indeed.
>
> Pascal,
>
> Then am I correct in inducing that when you use Clisp and are able to
> compile your code that recursion has never stood in your way
> (aside from the possible speed penalty)?
Yes, indeed.
When debugging (ie. when using the interpreter), we don't use big data
sets and the stack doesn't go very deep usually.
And when deploying, it's compiled, and then stack usage is reasonable
(for non-tail calls) or null (for tail-calls thanks to the TCO), so we
can use normal data sets with deeper stacks (but with thinner frames).
--
__Pascal Bourguignon__
Pascal J. Bourguignon wrote:
> vanekl <·····@acd.net> writes:
>
>> Pascal J. Bourguignon wrote:
>>> Juho Snellman <······@iki.fi> writes:
>>>
>>>> Pascal Bourguignon <···@informatimago.com> writes:
>>>>> vanekl <·····@acd.net> writes:
>>>>>> [...]
>>>>>> Why recursion is not always the best solution:
>>>>>>
>>>>>> [16]> (defun fact(n) (if (zerop n) 1 (* n (fact (1- n)))))
>>>>>> FACT
>>>>>>
>>>>>> [17]> (fact -1)
>>>>>> *** - Lisp stack overflow. RESET
>>>>> Yes, this is a bug:
>>>>>
>>>>> (defun fact(n) (if (minusp n) 1 (* n (fact (1- n)))))
>>>> The bug in your version is a lot more serious than the bug in the original.
>>>> (Hint: What will your version return for 0?).
>>> Right. I usually write it as (defun fact(n) (if (< n 1) 1 (* n
>>> (fact (1- n)))))
>>> but I let vanekl's zerop influence me :-(
>>>
>>>>>> [18]> (fact 10000)
>>>>>> *** - Lisp stack overflow. RESET
>>>>> Indeed, CL doesn't mandate TCO. But most implementations do implement
>>>>> it, including clisp, with the right optimization level. Try in clisp:
>>>> I'd bet that clisp will not do TCO on either of the functions
>>>> presented here, for the simple reason that neither one of them is tail
>>>> recursive. Or does clisp do automatic transformation of non tail
>>>> recursive algorithms to tail recursive ones these days.
>>> Correct, this formulation of fact is not TCOptimizable.
>>> However, when compiled the stack usage is greatly reduced, so (fact
>>> 10000) can be computed. But this has nothing to do with TCO indeed.
>> Pascal,
>>
>> Then am I correct in inducing that when you use Clisp and are able to
>> compile your code that recursion has never stood in your way
>> (aside from the possible speed penalty)?
>
> Yes, indeed.
>
> When debugging (ie. when using the interpreter), we don't use big data
> sets and the stack doesn't go very deep usually.
>
> And when deploying, it's compiled, and then stack usage is reasonable
> (for non-tail calls) or null (for tail-calls thanks to the TCO), so we
> can use normal data sets with deeper stacks (but with thinner frames).
>
Thanks for the reply. I've noticed a few times in the past where I've
run out of stack but that's probably me just forgetting to compile my
code or inputting a nonsense parameter. I'll be more vigilant in the
future now that I know it's probably pilot error.
vanekl <·····@acd.net> writes:
> Pascal J. Bourguignon wrote:
>> vanekl <·····@acd.net> writes:
>>
>>> Pascal J. Bourguignon wrote:
>>>> Juho Snellman <······@iki.fi> writes:
>>>>
>>>>> Pascal Bourguignon <···@informatimago.com> writes:
>>>>>> vanekl <·····@acd.net> writes:
>>>>>>> [...]
>>>>>>> Why recursion is not always the best solution:
>>>>>>>
>>>>>>> [16]> (defun fact(n) (if (zerop n) 1 (* n (fact (1- n)))))
>>>>>>> FACT
>>>>>>>
>>>>>>> [17]> (fact -1)
>>>>>>> *** - Lisp stack overflow. RESET
>>>>>> Yes, this is a bug:
>>>>>>
>>>>>> (defun fact(n) (if (minusp n) 1 (* n (fact (1- n)))))
>>>>> The bug in your version is a lot more serious than the bug in the original.
>>>>> (Hint: What will your version return for 0?).
>>>> Right. I usually write it as (defun fact(n) (if (< n 1) 1 (* n
>>>> (fact (1- n)))))
>>>> but I let vanekl's zerop influence me :-(
>>>>
>>>>>>> [18]> (fact 10000)
>>>>>>> *** - Lisp stack overflow. RESET
>>>>>> Indeed, CL doesn't mandate TCO. But most implementations do implement
>>>>>> it, including clisp, with the right optimization level. Try in clisp:
>>>>> I'd bet that clisp will not do TCO on either of the functions
>>>>> presented here, for the simple reason that neither one of them is tail
>>>>> recursive. Or does clisp do automatic transformation of non tail
>>>>> recursive algorithms to tail recursive ones these days.
>>>> Correct, this formulation of fact is not TCOptimizable.
>>>> However, when compiled the stack usage is greatly reduced, so (fact
>>>> 10000) can be computed. But this has nothing to do with TCO indeed.
>>> Pascal,
>>>
>>> Then am I correct in inducing that when you use Clisp and are able to
>>> compile your code that recursion has never stood in your way
>>> (aside from the possible speed penalty)?
>> Yes, indeed.
>> When debugging (ie. when using the interpreter), we don't use big
>> data
>> sets and the stack doesn't go very deep usually.
>> And when deploying, it's compiled, and then stack usage is
>> reasonable
>> (for non-tail calls) or null (for tail-calls thanks to the TCO), so we
>> can use normal data sets with deeper stacks (but with thinner frames).
>>
>
> Thanks for the reply. I've noticed a few times in the past where I've
> run out of stack but that's probably me just forgetting to compile my
> code or inputting a nonsense parameter. I'll be more vigilant in the
> future now that I know it's probably pilot error.
And if you are heavy on recursive style, or you've got deep data, you
can also increase the stack size with ulimit:
bash% ( ulimit -s 32768 ; clisp -a ) # 32 MB of stack space...
http://clisp.cons.org/impnotes/faq.html#faq-stack
--
__Pascal Bourguignon__
From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: coerce for arbitrary types
Date:
Message-ID: <rem-2008apr11-003@yahoo.com>
> > Regarding that proposed koan earlier: Does anybody have a
> > collection of short code snippets which likewise illustrate a whole
> > mess of deep understanding, where just trying to understand why
> > this particular snippet of code produces the result it does will
> > enlighten a newbie?
> From: vanekl <·····@acd.net>
> As we just witnessed (if you've been reading this list recently),
(It's not a list, it's a group.)
> a 10-line recursive BFS
I assume you mean breadth-first search (in some sort of tree,
possibly binary)? Why would anybody want to write that recursively?
While a depth-first search is nicely expressed recursively, since
the algorithm itself is essentially last-in first-out, hence
stack-like, hence recursive-like, a breadth-first search is
inherently first-in first-out, like a queue, not at all stack-like
or recursive-like. It seems that an iterative loop of appending
newly-discovered branches at the end of the queue and popping the
next task-to-do off the front of the queue is the best way to
express the algorithm. Why even bother trying to emulate a queue
via some sort of recursion? (Alternately you can use two stacks,
one for popping off tasks at the current level, and one for
postponing all new tasks at the next level. Again this is best
expressed as either nested while loops or co-routines, not as
recursion.) Pseudocode for the two-stack algorithm:
NowStack <= new empty stack
WaitStack <= new stack containing topnode
While WaitStack not empty
  Swap the two stacks
  While NowStack not empty
    Pop one item from NowStack and process it,
      pushing each next-level node onto WaitStack.
What could be simpler, except for the single-queue algorithm of
course? Note that the two-stack algorithm could be instrumented to
announce each new level being explored, which would be more
difficult with the single-queue algorithm.
Queue <= new queue containing topnode
While Queue not empty
  Pop one item from front of Queue and process it,
    appending each next-level node onto end of Queue.
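In Lisp the single-queue version is only a few lines. A sketch, with
CHILDREN standing for whatever function yields a node's next-level
nodes (a made-up protocol, just for illustration):
(defun bfs (topnode children visit)
  "Visit nodes breadth-first starting from TOPNODE.
CHILDREN maps a node to a list of next-level nodes; VISIT is called on each node."
  (let ((queue (list topnode)))
    (loop while queue
          do (let ((node (pop queue)))
               (funcall visit node)
               (setf queue (append queue (funcall children node)))))))
For example, with a tree written as nested lists of the form (label child...):
(bfs '(1 (2 (4) (5)) (3 (6)))
     #'rest
     (lambda (node) (print (first node))))
prints 1 2 3 4 5 6. (APPENDing to the end of the queue is the lazy
O(n) way; a real queue structure avoids that.)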
> is too much for a newbie to figure out by himself if he's never
> seen something like that done in CL before.
Let me see if I can find the 10-line recursive BFS algorithm ...
according to Google Groups search engine, the only mention of BFS
and recursive in the same article in this newsgroup was your
article I'm replying to. Let me try other keywords ... no articles
whatsoever with breadth first search recursive, let me try more
general Google search ...
<http://209.85.173.104/search?q=cache:uwIaeFMa0xcJ:www.cs.jcu.edu.au/ftp/pub/techreports/99-1.ps.gz+recursive+BFS&hl=en&ct=clnk&cd=3&gl=us&ie=UTF-8>
All known algorithms of BFS use iteration. ...
The standard denition [sic] of BFS involves an iterative loop
over a queue of vertices. ...
Exploration of a graph component is achieved by dequeuing a vertex to
discover unexplored vertices which are subsequently enqueued. This
kind of manipulation produces graph exploration which can be viewed as
progressive levels of discovered vertices
(nicely explained IMO)
This section describes and analyzes recursive and iterative algorithms
that implement the clos recurrence to produce BFS. The recursive algorithm
is described rst [sic], since it is a natural extension of the clos
recurrence. ...
Algorithm 3.1
A recursive BFS algorithm.
(completely illegible, Google didn't do a good job of converting
PostScript to regular text)
(But it notes it's tail-recursive, i.e. actually iterative.)
The same traversal sequence is generated by the iterative algorithm
and the
recursive algorithm, except that the iterative algorithm is more
ecient [sic].
<http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/231503>
Ah, this looks better, a recursive algorithm based on iterators.
Unlike the usual queue-based BFS, the space used is only O(h).
Is this the kind of algorithm you were referring to?
If so, I'll take a closer look at it later. But:
I believe ... that this code does not perform a valid
breadth-first search. The problem is that the 'last' variable only
prevents backtracking to the last visited node, not to any
previously visited node. ...
<http://www.mec.ac.in/resources/notes/notes/ds/gratra.htm>
... when you hear breadth you should think queue, and when
you hear depth you should think stack.
(Somebody agrees with me!)
<http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V0C-4MT5K49-4&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=9030bc77b63e54a6798de49d6c28c64d>
recursive best-first search (RBFS).
Is *that* what you meant??
Maybe I should have just asked you what you mean by BFS and where
to find the 10-line algorithm you are referring to, instead of
spending an hour researching what you might have meant.
> Pascal-B illustrates recursion in a pedagogical and also useful manner:
> (defun enumerate (start end)
> (when (< start end)
> (cons start (enumerate (1+ start) end))))
IMO this is a *bad* lesson, a lesson in how *not* to write an algorithm.
TCONC (or equivalent) is IMO the proper way to do this sort of
thing, incrementally modifying a new data structure in an iterative
loop which uses O(1) stack space and O(1) memory beyond the result itself.
The problem with this sort of recursion is that whereas heap space
is allocated in any random order, so that various structures can
share the heap providing they don't all try to use it at the same
time, a stack (as normally implemented) requires a pre-allocation
of dedicated space, which is almost all *wasted* almost all the
time. So you either allocate a huge block of consecutive storage
for a huge stack for the largest graph you'd ever want to traverse,
or you re-allocate stack from time to time when you realize the old
stack wasn't big enough, or your algorithm dies from stack
overflow.
Try something like this instead, to implement a fifo queue:
(defun make-empty-tconc () (cons nil nil))
(defun is-empty-tconc (tc) (null (car tc)))
(defun append-tconc (tc el)
  (let ((newcell (cons el nil)))
    (cond ((is-empty-tconc tc)
           (setf (car tc) newcell)
           (setf (cdr tc) newcell))
          (t
           (setf (cdr (cdr tc)) newcell)
           (setf (cdr tc) newcell)))))
(defun pop-tconc (tc)            ;To get elements out one by one
  (let ((popped (car (car tc))))
    (setf (car tc) (cdr (car tc)))
    (if (is-empty-tconc tc) (setf (cdr tc) nil))
    popped))
(defun see-all-tconc (tc)        ;To get list of all remaining elements at once
  (car tc))
The TCONC object as defined above can be used both to visit nodes
in fifo sequence per breadth-first search (using append-tconc and
pop-tconc repeatedly), and to build up the final list of nodes to
return from the top level (using append-tconc repeatedly and
see-all-tconc just once at the return point).
(BBN/UCI Lisp had a slightly different naming convention for
essentially the same functionality.)
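For example, the ENUMERATE shown earlier, rebuilt on top of this
TCONC so it runs as an iterative loop with constant stack (a sketch;
same result list):
(defun enumerate (start end)
  (let ((tc (make-empty-tconc)))
    (do ((i start (1+ i)))
        ((>= i end) (see-all-tconc tc))
      (append-tconc tc i))))
(enumerate 3 8) => (3 4 5 6 7)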
> The random shuffle algorithm is accessible and shows that
> there are lots of ways of accomplishing something seemingly
> simple. (Take that Python.)
> http://www.pvk.ca/Blog/Lisp/malmesure_quand_tu_nous_tiens.html
That article has a lot of discussion which slightly distracts me
from your point. Note that for me, there are some obvious ways to
randomize a sequence efficiently:
- If a vector, run the selection sort algorithm except that a
random number is used to directly select the element to exchange
with the next-to-fix element.
- If a list:
- Copy to vector and apply above algorithm then copy back to list.
- Tag elements with pseudo-random numbers, run merge-sort, then
collate adjacent elements to find the very few blocks that
got identical random numbers and locally shuffle each such block.
Of course if you have a list and want vector result, or vice versa,
you use the vector-selection-sort algorithm, copying from list at
start or to list at end as required.
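The vector case of that selection-with-a-random-exchange idea
(usually called the Fisher-Yates or Knuth shuffle) is only a few
lines; a sketch:
(defun shuffle-vector (vec)
  "Destructively shuffle VEC in place; each permutation is equally likely."
  (loop for i from (1- (length vec)) downto 1
        do (rotatef (aref vec i) (aref vec (random (1+ i)))))
  vec)
(shuffle-vector (vector 1 2 3 4 5)) => #(3 1 5 2 4) for example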
Now if you have a large dataset in consecutive memory locations
(i.e. in an array, or in a file then loaded or mapped into an
array) in a large virtual address space but have only a relatively
small amount of RAM, you need to keep your working set small to
avoid thrashing, and the problem gets more interesting. An
array-based implementation of merge-sort, using just one temporary
array smaller than the original array, might be the best algorithm.
> I also think Ron's iteration article is well written and
> provides insight. He extracts an important pattern.
> (Is there any high-level pattern more important than iteration,
> other than "IF"?)
> http://rondam.blogspot.com/2008/02/joy-of-iterators.html
Actually the lesson I see is that if you can't predict the new
requirements that your boss may throw at you *after* you've already
implemented all the use cases in the previous set of requirements,
you may need to do some refactoring of the old code to make it look
decent with the new requirements. In *some* such cases, writing an
iterator of various sorts may be useful. But you can't know ahead
of time whether it'll be useful or not, unless you can predict what
your boss will throw at you tomorrow. So you should just get the
current use-cases to work with reasonable code, and cross
tomorrow's bridge when your boss throws it at you tomorrow. It
helps to use a fast-prototype-enabling language such as Lisp so
that you can get all the current use cases working *before* the
boss throws the new stuff at you, so that at the end of each day
you have something actually working you can demo. Ideally you set
up a Web demo of each day's accomplishments, and when your boss
tells you he has new requirements, you interrupt him to tell him
that yesterday's requirements are already up and running at
tinyurl.com/<6chars> if he wants to see what you've done so-far
based on yesterday's requirements.
But I agree that producer-->consumer model is useful, whether it's
implemented by an iterator or continuation or stream or co-routine
or multi-tasking or whatever the jargon is today.
> A nice example of parameterization, first-class functions,
> and functional programming:
> (defun best (sequence comparison &key (key #'identity))
> (reduce (lambda (x y)
> (if (funcall comparison (funcall key x) (funcall key y))
> x y))
> sequence
> :initial-value (elt sequence 0)))
I actually disagree! I agree that REDUCE is a wonderful function
that everyone should be familiar with, and that REDUCE would
accomplish the desired purpose here, but *not* that using REDUCE
here is really appropriate. The problem with using REDUCE is that
really picking the best from a large set could be parallelized,
which REDUCE doesn't do. For this pick-the-best task, a more
enlightening algorithm would quickly traverse the list, activating
side tasks for comparison, and running further traversal of the
list in parallel with the side tasks, rather than waiting for
completion of each comparison task before traversing further down
the list, as REDUCE is defined to do. Drawing a dataflow tree,
REDUCE must work like this (* represents a comparison):
el1 el2 el3 el4
  \  /  /   /
   *   /   /
    \ /   /
     *   /
      \ /
       *
whereas a parallelized algorithm could do these comparisons:
el1 el2 el3 el4
  \  /   \  /
   *      *
    \_   _/
      \ /
       *
taking log(n) time instead of n-1 time.
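Written recursively, that log(n)-depth tournament looks roughly like
this (sequential here, but the two halves are independent, so they
are the pieces that could be compared in parallel; BEST-TOURNAMENT
is just my name for the sketch):
(defun best-tournament (vec comparison &key (key #'identity)
                                            (start 0) (end (length vec)))
  "Pick the best element of VEC between START and END by pairing off halves."
  (if (= (- end start) 1)
      (aref vec start)
      (let* ((mid   (floor (+ start end) 2))
             (left  (best-tournament vec comparison :key key :start start :end mid))
             (right (best-tournament vec comparison :key key :start mid :end end)))
        (if (funcall comparison (funcall key left) (funcall key right))
            left
            right))))
(best-tournament (vector 5 6 42 8 666 9 69 -42) #'>) => 666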
> Why recursion is not always the best solution:
> [16]> (defun fact(n) (if (zerop n) 1 (* n (fact (1- n)))))
(Bug if n less than 0, which somebody else pointed out.)
Indeed although this is often touted as the best first-example of
recursion, it's mis-use of recursion, we agree.
Have you seen my split-in-half recursive algorithm for factorial?
It uses stack depth log(n) instead of n.
<http://groups.google.com/group/comp.lang.lisp/msg/39a686b3e4bffa8a?dmode=source>
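The idea, sketched (not necessarily the exact code at that link):
compute 1*2*...*n as the product of the two halves of the range,
recursively, so the stack only goes log(n) deep:
(defun range-product (lo hi)
  "Product of the integers from LO to HI inclusive."
  (cond ((> lo hi) 1)
        ((= lo hi) lo)
        (t (let ((mid (floor (+ lo hi) 2)))
             (* (range-product lo mid)
                (range-product (1+ mid) hi))))))
(defun fact (n) (range-product 1 n))
(fact 5) => 120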
> Example of clear, concise Lispy style:
> [From "Tutorial on Good Lisp Programming Style"]
> (defun count-all-numbers (exp)
> (typecase exp
> (cons
> (+ (count-all-numbers (first exp))
> (count-all-numbers (rest exp))))
> (number 1)
> (t 0)))
Minor quibble: The use of "first" and "rest" implies the
*intention* is nested lists, *not* arbitrary CONS trees. Yet the
code works with arbitrary CONS trees. CAR and CDR should be used
instead of FIRST and REST if the intention is to work with
arbitrary CONS trees.
If exp is (5 . 9), should it count one number or two numbers??
As nested lists, there's only one number in a proper position.
But as CONS trees, there are two numbers.
> SORT has to be handled with care, even if it is destructive.
> If you want to sort a list, you have to use the result of SORT.
> Wont give a newbie the expected result:
> (let ((mylist (list "c" "d" "a")))
> (sort mylist #'string<)
> mylist)
Yeah, this is a specific lesson about a specific
possibly-destructive function whose *result* rather than
side-effect is the *correct* result.
> Better:
> (let ((mylist (list "c" "d" "a")))
> (setq mylist (sort mylist #'string<))
> mylist)
Yes, but slightly inefficient compared to:
(let ((mylist (list "c" "d" "a")))
(setq mylist (sort mylist #'string<)))
because after all the return value from SETQ is the value that was assigned,
unlike defconstant whose return value is the symbol being assigned.
But really, since mylist is inaccessible after return, why not just:
(let ((mylist (list "c" "d" "a")))
(sort mylist #'string<))
Who cares that mylist has a randomly munged value if it's not accessible?
> And last, I would stay away from mentioning the eval operator.
Actually, when developing code line-at-a-time, which is the way I
strongly prefer in most cases, the REP is your biggest tool, and
EVAL really needs to be mentionned. Maybe what you mean is that
*you* explictly embedding an extra call to EVAL in the code you are
already EVALing is most often a design mistake. Better to get
familiar with APPLY and FUNCALL rather than "consing up an
expression to pass to EVAL". Perhaps lots of examples of where
APPLY or FUNCALL is useful, would be a good tutorial, so the idea
of needing to explicitly call EVAL never arises in the first place.
Or maybe after the appropriate use of APPLY or FUNCALL is shown,
the same example can be demonstrated with EVAL to show how much
extra work it is by comparison. (Never mind that lexical variables
aren't captured by EVAL whereas they *are* captured by FUNCALL and
APPLY. If you are never tempted to explicitly call EVAL in the
first place, you'll never be bitten by that problem.)
Good: (setq x <someComplicatedCalculation>)
      (setq y <anotherComplicatedCalculation>)
      (setq f <someCalculationToProduceAnAppropriateFunctionOBJECT>)
      (funcall f x y)
Bad:  (setq x <someComplicatedCalculation>)
      (setq y <anotherComplicatedCalculation>)
      (setq f <someCalculationToProduceAnAppropriateFunctionNAME>)
      (setf (symbol-function 'tmpf) f)
      (setq exp '(tmpf x y))
      (eval exp)
> But it doesn't matter anyway because newbies don't read,
> they think it's faster to just start coding.
There's a balance between:
- Reading everything before trying to do anything
- Reading nothing before trying to do everything
The trick is to read what's appropriate before trying to do
something appropriate as the first thing to do after reading that
first stuff. Don't read too much before you write your first line
of code, because you'll forget most of it because it's either not
relevant or not comprehensible before you write your first line of
code. But do read *something*, but what? There are several
pedagogical approaches:
- Sequence of lessons which you must follow in sequence.
- Top-down breakdown from end goal back towards prerequisites which
must be mastered before the main task can be attempted.
- Learn a few basic principles, then work from the reference manual
when actually trying to code.
Robert Maas, http://tinyurl.com/uh3t wrote:
>>> Regarding that proposed koan earlier: Does anybody have a
>>> collection of short code snippets which likewise illustrate a whole
>>> mess of deep understanding, where just trying to understand why
>>> this particular snippet of code produces the result it does will
>>> enlighten a newbie?
>> From: vanekl <·····@acd.net>
>> As we just witnessed (if you've been reading this list recently),
>
> (It's not a list, it's a group.)
>
>> a 10-line recursive BFS
>
> I assume you mean breadth-first search (in some sort of tree,
> possibly binary)?
Nope, I mean DFS. I misspoke. I was thinking about the coloring
algorithm that Cormen, Leiserson, and Rivest use in their DFS.
As stated, this was in context to the person who recently
posted his homework assignment. A DFS was a solution
to his homework. If you care to go back and read the problem,
you'd probably agree.
The solution would probably be simplest using a quaternary
tree (North, South, East, West branches).
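Roughly what I had in mind, as a sketch (NEIGHBORS stands for
whatever yields the North/South/East/West cells; the name is made
up here), with a hash table as a simplified stand-in for CLR's
white/gray/black coloring:
(defun dfs (node neighbors visit &optional (seen (make-hash-table :test #'equal)))
  "Depth-first traversal starting at NODE, visiting each node once."
  (unless (gethash node seen)
    (setf (gethash node seen) t)        ;paint the node so it is never revisited
    (funcall visit node)
    (dolist (next (funcall neighbors node))
      (dfs next neighbors visit seen))))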
> Why would anybody want to write that recursively?
See above and below.
> While a depth-first search is nicely expressed recursively, since
> the algorithm itself is essentially last-in first-out, hence
> stack-like, hence recursive-like, a breadth-first search is
> inherently first-in first-out, like a queue, not at all stack-like
> or recursive-like.
> It seems that a iterative loop of appending
> newly-discovered branches at the end of the queue and popping the
> next task-to-do off the front of the queue is the best way to
> expresss the algorithm. Why even bother trying to emulate a queue
> via some sort of recursion?
Because it's simple, can be easily understood, and works.
Those seem like pretty good qualities to me, especially when you are
discussing code that you would like newbies to understand.
You seem to think queues are easier to understand. From my
experience, once you get over the hurdle of recursion,
recursive code is quicker to understand, shorter, and less
bug prone. Why even bother with possible efficiency issues
at this point in the newbie's process of discovery?
> (Alternately you can use two stacks,
> one for popping off tasks at the current level, and one for
> postponing all new tasks at the next level. Again this is best
> expressed as either nested while loops or co-routines, not as
> recursion.) Pseudocode for the two-stack algorithm:
> NowStack <= new empty stack
> WaitStack <= new stack containing topnode
> While WaitStack not empty
> Swap the two stacks
> While NowStack not empty
> Pop one item from NowStack and process it,
> pushing each next-level node onto WaitStack.
> What could be simpler, except for the single-queue algorithm of
> course?
A recursive algorithm would be simpler. You could "paint" visited
nodes if they've been visited. (See Cormen, Leiserson, and Rivest.)
No queue required. While I agree that stacks/queues and iteration
is the preferred method if efficiency is your top concern, I cannot
figure out why you think your solution would be easier to understand and
a better demonstration.
> Note that the two-stack algorithm could be instrumented to
> announce each new level being explored, which would be more
> difficult with the single-queue algorithm.
> Queue <= new queue containing topnode
> While Queue not empty
> Pop one item from front of Queue and process it,
> appending each next-level node onto end of Queue.
>
>> is too much for a newbie to figure out by himself if he's never
>> seen something like that done in CL before.
>
> Let me see if I can find the 10-line recursive BFS algorithm ...
> according to Google Groups search engine, the only mention of BFS
> and recursive in the same article in this newsgroup was your
> article I'm replying to. Let me try other keywords ... no articles
> whatsoever with breadth first search recursive, let me try more
> general Google search ...
>
> <http://209.85.173.104/search?q=cache:uwIaeFMa0xcJ:www.cs.jcu.edu.au/
>ftp/pub/techreports/99-1.ps.gz+recursive+BFS&hl=en&ct=clnk&cd=3&gl=us&ie=UTF-8>
> All known algorithms of BFS use iteration. ...
> The standard denition [sic] of BFS involves an iterative loop
> over a queue of vertices. ...
> Exploration of a graph component is achieved by dequeuing a vertex to
> discover unexplored vertices which are subsequently enqueued. This
> kind of manipulation produces graph exploration which can be viewed as
> progressive levels of discovered vertices
> (nicely explained IMO)
> This section describes and analyzes recursive and iterative algorithms
> that implement the clos recurrence to produce BFS. The recursive algorithm
> is described rst [sic], since it is a natural extension of the clos
> recurrence. ...
> Algorithm 3.1
> A recursive BFS algorithm.
> (completely illegible, Google didn't do a good job of converting
> PostScript to regular text)
> (But it notes it's tail-recursive, i.e. actually iterative.)
> The same traversal sequence is generated by the iterative algorithm
> and the
> recursive algorithm, except that the iterative algorithm is more
> ecient [sic].
>
> <http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/231503>
> Ah, this looks better, a recursive algorithm based on iterators.
> Unlike the usual queue-based BFS, the space used is only O(h).
> Is this the kind of algorithm you were referring to?
> If so, I'll take a closer look at it later. But:
> I believe ... that this code does not perform a valid
> breadth-first search. The problem is that the 'last' variable only
> prevents backtracking to the last visited node, not to any
> previously visited node. ...
>
> <http://www.mec.ac.in/resources/notes/notes/ds/gratra.htm>
> ... when hear you breadth you should think queue, and when
> you hear depth you should think stack.
> (Somebody agrees with me!)
For a newbie, recursion handles the stack just fine, and the code is
simpler. Stack depth is not an issue considering the problem the newbie
described, and the type of problems newbies are typically assigned.
> <http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V0C-4MT5K49-4
>&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221
>&_version=1&_urlVersion=0&_userid=10&md5=9030bc77b63e54a6798de49d6c28c64d>
> recursive best-first search (RBFS).
> Is *that* what you meant??
>
> Maybe I should have just asked you what you mean by BFS and where
> to find the 10-line algorithm you are referring to, instead of
> spending an hour researching what you might have meant.
>> Pascal-B illustrates recursion in a pedagogical and also useful manner:
>> (defun enumerate (start end)
>> (when (< start end)
>> (cons start (enumerate (1+ start) end))))
>
> IMO this is a *bad* lesson, a lesson how *not* to write an algorithm.
> TCONC (or equivalent) is IMO the proper way to do this sort of
> thing, incrementally modifying a new data structure in an iterative
> loop which uses o(1) stack space and o(1) additional memory.
Except TCONC isn't a part of CL, nor do I think this is newbie "territory."
You appear to differ.
So you're saying that CONS should be deprecated for newbies.
Amazing.
Your education and Lisp practice are so disjoint from mine that
there's probably no reconciling the two.
While I agree your earnest attempt at optimizing algorithms is proper
and even notable in some contexts, I believe, in the context of the premise of
this discussion, you are beyond the pale.
> The problem with this sort of recursion is that whereas heap space
> is allocated in any random order, so that various structures can
> share the heap providing they don't all try to use it at the same
> time, a stack (as normally implemented) requires a pre-allocation
> of dedicated space, which is almost all *wasted* almost all the
> time. So you either allocate a huge block of consecutive storage
> for a huge stack for the largest graph you'd ever want to traverse,
> or you re-allocate stack from time to time when you realize the old
> stack wasn't big enough, or your algorithm dies from stack
> overflow.
Right.
Those are the only considerations and the only options.
Uh huh.
You have an interesting way of framing problems to suit your
particular biases, ignoring the fact that this technique
is commonly and successfully used, and simple to teach and understand.
>
> Try something like this instead, to implement a fifo queue:
> (defun make-empty-tconc () (cons nil nil))
> (defun is-empty-tconc (tc) (null (car tc)))
> (defun append-tconc (tc el)
> (let ((newcell (cons el nil)))
> (cond ((is-empty-tconc tc) (setf (car tc) newcell) (setf (cdr tc) newcell))
> (t (setf (cdr (cdr tc)) newcell) (setf (cdr tc) newcell)))))
> (defun pop-tconc (tc) ;To get elements out one by one
> (let ((popped (car (car tc))))
> (setf (car tc) (cdr (car tc)))
> (if (is-empty-tconc tc) (setf (cdr tc) nil))
> popped))
> (defun see-all-tconc (tc) ;To get list of all remaining elements at once
> (car tc))
> The TCONC object as defined above can be used both to visit nodes
> in fifo sequence per breadth-first search (using append-tconc and
> pop-tconc repeatedly), and to build up the final list of nodes to
> return from the top level (using append-tconc repeatedly and
> see-all-tconc just once at the return point).
> (BBN/UCI Lisp had a slightly different naming convention for
> essentially the same functionality.)
Interesting, but I submit it is not something important enough that
a newbie should be first exposed to.
>> The random shuffle algorithm is accessible and shows that
>> there are lots of ways of accomplishing something seemingly
>> simple. (Take that Python.)
>> http://www.pvk.ca/Blog/Lisp/malmesure_quand_tu_nous_tiens.html
>
> That article has a lot of discussion which slightly distracts me
> from your point. Note that for me, there are some obvious ways to
> randomize a sequence efficiently:
> - If a vector, run the selection sort algorithm except that a
> random number is used to directly select the element to exchange
> with the next-to-fix element.
> - If a list:
> - Copy to vector and apply above algorithm then copy back to list.
> - Tag elements with pseudo-random numbers, run merge-sort, then
> collate adjacent elements to find the very few blocks that
> got identical random numbers and locally shuffle each such block.
> Of course if you have a list and want vector result, or vice versa,
> you use the vector-selection-sort algorithm, copying from list at
> start or to list at end as required.
>
> Now if you have a large dataset in consecutive memory locations
> (i.e. in an array, or in a file then loaded or mapped into an
> array) in a large virtual address space but have only a relatively
> small amount of RAM, you need to keep your working set small to
> avoid thrashing, and the problem gets more interesting. An
> array-based implementation of merge-sort, using just one temporary
> array smaller than the original array, might be the best algorithm.
>
>> I also think Ron's iteration article is well written and
>> provides insight. He extracts an important pattern.
>> (Is there any high-level pattern more important than iteration,
>> other than "IF"?)
>> http://rondam.blogspot.com/2008/02/joy-of-iterators.html
>
> Actually the lesson I see is that if you can't predict the new
> requirements that your boss may throw at you *after* you've already
> implemented all the use cases in the previous set of requirements,
> you may need to do some refactoring of the old code to make it look
> decent with the new requirements. In *some* such cases, writing an
> iterator of various sorts may be useful. But you can't know ahead
> of time whether it'll be useful or not, unless you can predict what
> your boss will throw at you tomorrow. So you should just get the
> current use-cases to work with reasonable code, and cross
> tomorrow's bridge when your boss throws it at you tomorrow. It
> helps to use a fast-prototype-enabling language such as Lisp so
> that you can get all the current use cases working *before* the
> boss throws the new stuff at you, so that at the end of each day
> you have something actually working you can demo. Ideally you set
> up a Web demo of each day's accomplishments, and when your boss
> tells you he has new requirements, you interrupt him to tell him
> that yesterday's requirements are already up and running at
> tinyurl.com/<6chars> if he wants to see what you've done so-far
> based on yesterday's requirements.
>
> But I agree that producer-->consumer model is useful, whether it's
> implemented by an iterator or continuation or stream or co-routine
> or multi-tasking or whatever the jargon is today.
>
>> A nice example of parameterization, first-class functions,
>> and functional programming:
>> (defun best (sequence comparison &key (key #'identity))
>> (reduce (lambda (x y)
>> (if (funcall comparison (funcall key x) (funcall key y))
>> x y))
>> sequence
>> :initial-value (elt sequence 0)))
>
> I actually disagree!
What a surprise :)
> I agree that REDUCE is a wonderful function
> that everyone should be familiar with, and that REDUCE would
> accomplish the desired purpose here, but *not* that using REDUCE
> here is really appropriate. The problem with using REDUCE is that
> really picking the best from a large set could be parallelized,
> which REDUCE doesn't do. For this pick-the-best task, a more
> enlightening algorithm would quickly traverse the list, activating
> side tasks for comparison, and running further traversal of the
> list in parallel with the side tasks, rather than waiting for
> completion of each comparison task before traversing further down
> the list, as REDUCE is defined to do. Drawing a dataflow tree,
> REDUCE must work like this (* represents a comparison):
> el1 el2 el3 el4
> \ / / /
> * / /
> \ / /
> * /
> \ /
> *
> whereas a parallized algorithm could do these comparisons:
> el1 el2 el3 el4
> \ / \ /
> * *
> \_ _/
> \ /
> *
> taking log(n) time instead of n-1 time.
Wow. Now you're disagreeing not because the algorithm is unsound, but
because you would rather have one that is more easily parallelizable.
You continue to amaze me. I thought the entire premise for this
collection of koans was for newbie education.
(Your ascii art is striking, though, and well illustrates
your point.)
>> Why recursion is not always the best solution:
>> [16]> (defun fact(n) (if (zerop n) 1 (* n (fact (1- n)))))
>
> (Bug if n less than 0, which somebody else pointed out.)
> Indeed although this is often touted as the best first-example of
> recursion, it's mis-use of recursion, we agree.
> Have you seen my split-in-half recursive algorithm for factorial?
> It uses stack depth log(n) instead of n.
> <http://groups.google.com/group/comp.lang.lisp/msg/39a686b3e4bffa8a?dmode=source>
No, I haven't seen this until today, but I recently read about this
same algorithm implemented in this same fashion on another web site (I can't
place it right now). The web site did time tests and explained why this type of
algorithm is faster. It was consing less and using far
fewer BigNum calculations, according to the article.
Pretty interesting stuff, but, again, largely irrelevant for newbies IMO.
>> Example of clear, concise Lispy style:
>> [From "Tutorial on Good Lisp Programming Style"]
>> (defun count-all-numbers (exp)
>> (typecase exp
>> (cons
>> (+ (count-all-numbers (first exp))
>> (count-all-numbers (rest exp))))
>> (number 1)
>> (t 0)))
>
> Minor quibble: The use of "first" and "rest" implies the
> *intention* is nested lists, *not* arbitrary CONS trees.
Yeah, and this distinction is important for newbs. </sarcasm>
As if CAR and CDR are easier for newbies to grok.
They're /synonyms/, FCS.
The *intention* is not conveyed by the language, per se,
but how the language (e.g., the operators) is commonly used
depending on context. Again, not important for newbs IMO.
In fact, some newbies that I've talked to specifically
mention that they find the CAR and CDR operators distasteful.
Even after I mention that they never have to be used, they
still are turned off by them. I don't understand their
argument, but they are bright guys, so there must be something
to it. I'm guessing that Norvig, et al, took that into account
when this example was written.
> Yet the
> code works with arbitrary CONS trees. CAR and CDR should be used
> instead of FIRST and REST if the intention is to work with
> arbitrary CONS trees.
> If exp is (5 . 9), should it count one number or two numbers??
Yes, the doc string is not explicit. Got me there :)
> As nested lists, there's only one number in a proper position.
> But as CONS trees, there are two numbers.
>
>> SORT has to be handled with care, even if it is destructive.
>> If you want to sort a list, you have to use the result of SORT.
>> Wont give a newbie the expected result:
>> (let ((mylist (list "c" "d" "a")))
>> (sort mylist #'string<)
>> mylist)
>
> Yeah, this is a specific lesson about a specific
> possibly-destructive function whose *result* rather than
> side-effect is the *correct* result.
Considering how often sort is used, even though it is a minor
lesson, it is still important, especially since it almost
certainly will eventually bite the newb on the @ss.
>> Better:
>> (let ((mylist (list "c" "d" "a")))
>> (setq mylist (sort mylist #'string<))
>> mylist)
>
> Yes, but slightly inefficient compared to:
> (let ((mylist (list "c" "d" "a")))
> (setq mylist (sort mylist #'string<)))
> because after all the return value from SETQ is the value that was assigned,
> unlike defconstant whose return value is the symbol being assigned.
Efficiency was not my prime motivation. Pedagogical value was. Did you
notice the symmetry between the two examples? They were like that for a
reason: it's easier to draw conclusions this way. I don't understand your
urge to confuse easy to understand concepts with your (minor) efficiency concerns.
Apparently you find efficiency to be so important that even newbies
must be immediately indoctrinated in the finer points of Lisp.
And a good compiler could peephole optimize that last "mylist" out
anyway. I don't know whether any Lisp compiler does this, but I've
worked with a C compiler that was able to do peephole optimization.
It's not impossible, is what I'm saying, and may completely invalidate
your argument.
> But really, since mylist is inaccessible after return, why not just:
> (let ((mylist (list "c" "d" "a")))
> (sort mylist #'string<))
> Who cares that mylist has a randomly munged value if it's not accessible?
>
Apparently you are snipping things out of the discussion.
Since you do not mention it at all, it's hard to say unless I go
back to my original post and try to discover what you took
out.
>> And last, I would stay away from mentioning the eval operator.
>
> Actually, when developing code line-at-a-time, which is the way I
> strongly prefer in most cases, the REP is your biggest tool, and
> EVAL really needs to be mentionned.
Evaluation could be mentioned as a part of the REPL. The EVAL /operator/
needn't be. Two separate, but related concepts. You have mistakenly
conflated the two.
> Maybe what you mean is that
> *you* explictly embedding an extra call to EVAL in the code you are
> already EVALing is most often a design mistake.
Uh, yes, that's what I mean when I specifically say "eval operator."
Again, from context, this should have been evident. Reading each line
completely separate from the rest of the paragraph just makes for
poor reading comprehension.
> Better to get
> familiar with APPLY and FUNCALL rather than "consing up an
> expression to pass to EVAL". Perhaps lots of examples of where
> APPLY or FUNCALL is useful, would be a good tutorial, so the idea
> of needing to explicitly call EVAL never arises in the first place.
Yes, I showed an example of this already. See above (/\cfuncall/).
> Or maybe after the appropriate use of APPLY or FUNCALL is shown,
> the same example can be demonstrated with EVAL to show how much
> extra work it is by comparision. (Nevermind that lexical variables
> aren't captured by EVAL whereas they *are* captured by FUNCALL and
> APPLY. If you are never tempted to explicitly call EVAL in the
> first place, you'll never be bitten by that problem.)
>
> Good: (setq x <someComplicatedCalculation>)
> (setq y <anotherComplicatedCalculation>)
> (setq f <someCalculationToProduceAnAppropriateFunctionOBJECT>)
> (funcall f x y)
>
> Bad: (setq x <someComplicatedCalculation>)
> (setq y <anotherComplicatedCalculation>)
> (setq f <someCalculationToProduceAnAppropriateFunctionNAME>)
> (setf (symbol-function 'tmpf) f)
> (setq exp '(tmpf x y))
> (eval exp)
I haven't a clue as to why you think newbies would need to know this.
They tend to look for the biggest hammer first, and showing them EVAL,
IMO, is not helpful. Apparently you think otherwise.
>> But it doesn't matter anyway because newbies don't read,
>> they think it's faster to just start coding.
>
> There's a balance between:
> - Reading everything before trying to do anything
> - Reading nothing before trying to do everything
> The trick is to read what's appropriate before trying to do
> something appropriate as the first thing to do after reading that
> first stuff. Don't read too much before you write your first line
> of code, because you'll forget most of it because it's either not
> relevant or not comprehendable before you write your first line of
> code. But do read *something*, but what? There are several
> pedagogical approaches:
> - Sequence of lessons which you must follow in sequence.
> - Top-down breakdown from end goal back towards prerequisites which
> must be mastered before the main task can be attempted.
> - Learn a few basic principles, then work from the reference manual
> when actually trying to code.
Sorry, that last statement was (a little) tongue in cheek, and you
seem to have taken it literally. I was (poorly) referring to the
newb who was just here asking for the best lisp operators to use,
as if knowing the most common operators is going to help you when you
are unable to articulate your algorithm. Well, I guess if you were
doing some Genetic Programming, it would help, but I don't think
that was the context.
My preferred method of learning a new language: find a good book,
read a chapter, put it in my lap as I do the most interesting
chapter problems, and continue on to the next chapter. When I've
learned enough to code without the book, put the book down.
My preferred book happened to be Norvig's PAIP. Even with the
advent of the internet, it's still my preferred method, because
published books are generally of higher quality than the language
studies I find on the internet. YMMV.
From: Robert Maas, http://tinyurl.com/uh3t
Subject: How best to teach newbie at algorithms (was: coerce for arbitrary types)
Date:
Message-ID: <rem-2008apr13-002@yahoo.com>
> >> a 10-line recursive BFS
> > I assume you mean breadth-first search (in some sort of tree,
> > possibly binary)?
> From: vanekl <·····@acd.net>
> Nope, I mean DFS. I misspoke.
Oh..... nevermind.
Depth-first search is obviously recursive, assuming you don't
suffer stack overflow in which case you might emulate a stack using
a linked list.
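For instance, a bare-bones recursive DFS might look something like this
(my own sketch, not code from elsewhere in the thread; TEST is a
predicate on nodes, CHILDREN maps a node to a list of its child nodes):
(defun dfs (test children node)
  ;; Return the first node, in depth-first order, satisfying TEST, else NIL.
  (if (funcall test node)
      node
      (some (lambda (child) (dfs test children child))
            (funcall children node))))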
> > While a depth-first search is nicely expressed recursively, since
> > the algorithm itself is essentially last-in first-out, hence
> > stack-like, hence recursive-like, a breadth-first search is
> > inherently first-in first-out, like a queue, not at all stack-like
> > or recursive-like.
> > It seems that a iterative loop of appending
> > newly-discovered branches at the end of the queue and popping the
> > next task-to-do off the front of the queue is the best way to
> express the algorithm. Why even bother trying to emulate a queue
> > via some sort of recursion?
> Because it's simple, can be easily understood, and works.
To my mind, a queue is simple, easily understood, and works, and is
"obviously just right" for any fifo task such as breadth-first
search (the topic we accidentally got onto when you misspoke).
> Seem like pretty good ideas to me, especially when you are
> discussing code that you would like newbies to understand.
I disagree. We should show newbies what kinds of tasks just
naturally fit into iterative paradigms, and which fit just
naturally into recursive paradigms, and only later show how either
can emulate the other via tricky or nifty coding.
> You seem to think queues are easier to understand.
No, I think using queues for fifo tasks (such as BFS and
queue-at-bank), and using stacks for lifo tasks (such as DFS and
EVAL), are easier to understand than vice versa.
> From my experience, once you get over the hurdle of recursion,
> recursive code is quicker to understand, shorter, and less bug
> prone.
I disagree with such a general claim.
> Why even bother with possible efficiency issues at this point in
> the newbie's process of discovery?
That's a strawman. The issues are (1) fitting the most obviously
appropriate algorithm to the task at hand, and (2) avoiding really
dumb things that require immense amounts of pre-allocated
contiguous stack space that can't be recycled for any other use
until the application quits.
> ftp/pub/techreports/99-1.ps.gz+recursive+BFS&hl=en&ct=clnk&cd=3&gl=us&ie=UTF-8
Alert: Unable to access document.
> TCONC isn't a part of CL, nor do I think this is newbie "territory."
> You appear to differ.
Nothing beyond the ANSI spec is "part of CL". The second thing the
newbie needs to learn is how to write useful tools that go beyond
what's in the ANSI spec, and IMO implementing a fifo queue via
TCONC is a useful exercise for the newbie and a useful tool for
later use. I almost wish I had gone the TCONC way in my code.
Instead I push results onto a local variable called revres then
nreverse that at the return point. Maybe I'll copy the TCONC code I
posted into a utility file and use it from now on.
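For what it's worth, here is a sketch of the two collection styles side
by side (my own illustration; the TCONC functions are the ones quoted
just below):
;; Style 1: push onto REVRES, then NREVERSE once at the return point.
(defun evens-below-revres (n)
  (let ((revres '()))
    (dotimes (i n (nreverse revres))
      (when (evenp i) (push i revres)))))
;; Style 2: append onto a TCONC queue, then SEE-ALL-TCONC once at the end.
(defun evens-below-tconc (n)
  (let ((tc (make-empty-tconc)))
    (dotimes (i n (see-all-tconc tc))
      (when (evenp i) (append-tconc tc i)))))
;; Both return (0 2 4 ...) in increasing order.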
> So you're saying that CONS should be deprecated for newbies.
Absolutely not!! (Although perhaps a renaming of it should occur,
such as MAKE-PAIR or NEW-PAIR. After all, CONS is really a
constructor for an ordered pair data-object, and MAKE-whatever is
the usual convention in Common Lisp, such as MAKE-ARRAY
MAKE-HASH-TABLE MAKE-LIST MAKE-STRING MAKE-STRING-OUTPUT-STREAM
etc. It might be easier for a newbie to remember that MAKE-PAIR is
the way to make a new pair-object, instead of the special-case name
CONS that is a carry-over from when such pairs were the *only*
structures ever built in Lisp 1.5 or whatever so "construct what?"
wasn't a reasonable question.)
> While I agree your earnest attempt at optimizing algorithms is
> proper and even notable in some contexts, I believe, in the
> context of the premise of this discussion, you are beyond the pale.
There's a difference between what I advocate, namely choosing an
algorithm that naturally fits the task to be accomplished, which
isn't horribly inefficient, vs. micro-optimizing too early, which
is *not* what I advocate.
> > Try something like this instead, to implement a fifo queue:
> > (defun make-empty-tconc () (cons nil nil))
> > (defun is-empty-tconc (tc) (null (car tc)))
> > (defun append-tconc (tc el)
> > (let ((newcell (cons el nil)))
> > (cond ((is-empty-tconc tc) (setf (car tc) newcell) (setf (cdr tc) newcell))
> > (t (setf (cdr (cdr tc)) newcell) (setf (cdr tc) newcell)))))
> > (defun pop-tconc (tc) ;To get elements out one by one
> > (let ((popped (car (car tc))))
> > (setf (car tc) (cdr (car tc)))
> > (if (is-empty-tconc tc) (setf (cdr tc) nil))
> > popped))
> > (defun see-all-tconc (tc) ;To get list of all remaining elements at once
> > (car tc))
> > The TCONC object as defined above can be used both to visit nodes
> > in fifo sequence per breadth-first search (using append-tconc and
> > pop-tconc repeatedly), and to build up the final list of nodes to
> > return from the top level (using append-tconc repeatedly and
> > see-all-tconc just once at the return point).
> > (BBN/UCI Lisp had a slightly different naming convention for
> > essentially the same functionality.)
> Interesting, but I submit it is not something important enough that
> a newbie should be first exposed to.
In nearly any course in computer programming, one of the chapters
is API and specific commonly-useful examples: Stack, Queue, Binary
Search Tree (usually AVL), etc. I agree with the designers of those
courses that these examples are useful for teaching both the basic
idea of an API and the general idea of how to write decent
algorithms for commonly-useful kinds of tasks. Since Lisp has a
stack (lifo) API available already via PUSH and POP, or trivially
implementable via CONS and CAR/CDR, but doesn't have any queue
(fifo) API neither directly available nor trivially implementable, I
suggest that implementing a fifo queue in Lisp would be a useful
exercise to show how to build a slightly non-trivial API. In that
context, TCONC is one reasonable design for such an API. Two stacks
is another reasonable design. Maybe the student should be taught
both APIs, and any other fifo-queue API you would suggest, after
perhaps an in-class brainstorming exercise to see if any of the
students can invent a fifo-queue API all on their own.
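For concreteness, here is one possible sketch of the two-stack design
(the names and details are my own, not an established API): new
elements are pushed onto an "in" stack, and when the "out" stack runs
dry the "in" stack is reversed onto it, so elements leave in fifo order.
(defun make-2stack-queue () (cons nil nil))   ;car = in-stack, cdr = out-stack
(defun queue-empty-p (q) (and (null (car q)) (null (cdr q))))
(defun enqueue (q el) (push el (car q)) q)
(defun dequeue (q)
  (when (null (cdr q))                        ;out-stack empty: refill it
    (setf (cdr q) (nreverse (car q))
          (car q) nil))
  (pop (cdr q)))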
So do you disagree that a course on algorithms should include the
queue API, or do you disagree that TCONC is a reasonable example of
how such an API might be implemented, to show that there's more
than one simple way to implement it, to give the students the clear
idea that there isn't just one way, that there are radically
different ways?
(snipped example of divide-in-half recursive algorithm for
computing maximum of a set)
> Now your disagreeing not because the algorithm is sound, but
> because you would rather have one that is more easily
> parallelizable. You continue to amaze me. I thought the entire
> premise for this collection of koans was for newbie education.
I think there are several issues, in addition to how to get the job
done on a single CPU Von Neumann machine with unlimited memory. One
issue is dealing with limited address space, especially causing
limited stack. Another issue is locality of reference, if there's
large address space but limited actual memory. Another issue is
dealing with multi-processing with shared structures that get
independently updated where the updates *are* or are *not* supposed
to be communicated between processes. A related issue is dealing
with multiple CPUs, especially a large number of CPUs all available
to help share the workload to speed up processing a large amount of
data. I think some discussion of such topics is relevant to what a
newbie should learn about at some point.
> >> Example of clear, concise Lispy style:
> >> [From "Tutorial on Good Lisp Programming Style"]
> >> (defun count-all-numbers (exp)
> >> (typecase exp
> >> (cons
> >> (+ (count-all-numbers (first exp))
> >> (count-all-numbers (rest exp))))
> >> (number 1)
> >> (t 0)))
> > Minor quibble: The use of "first" and "rest" implies the
> > *intention* is nested lists, *not* arbitrary CONS trees.
> Yeah, and this distinction is important for newbs. </sarcasm>
> As if CAR and CDR are easier for newbies to grok.
IMO it's important for naming conventions to closely match the
intent of the API. Sometimes you really want to define a function
or macro that does nothing except change the name of some obscure
thing to more directly fit the context where you're using it. I do
not claim that the specific names CAR and CDR are optimal, merely
that they clearly indicate the intention of working directly with
arbitrary trees of CONS cells rather than restricting oneself to
nested proper lists as "first" and "rest" imply.
> They're /synonyms/, FCS.
Yeah, and CONS is a synonym for MAKE-HIST-CELL,
and CAR is a synonym for HIST-CELL-TOKEN,
and CDR is a synonym for HIST-CELL-COUNT.
But when you're writing lots of code to deal with cells in
histograms, if you use the histogram-cell-API names instead of the
CONS/CAR/CDR names you can mostly forget the internal
representation of histogram cells and just use the more descriptive
names instead.
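Spelled out (the names are the ones used just above; the trivial
definitions are my guess at what was meant):
(defun make-hist-cell (token count) (cons token count))
(defun hist-cell-token (cell) (car cell))
(defun hist-cell-count (cell) (cdr cell))
;; e.g. (hist-cell-token (make-hist-cell 'the 42)) => THE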
> The *intention* is not conveyed by the language, per se,
> but how the language (e.g., the operators) is commonly used
> depending on context.
That point is debatable. When you write an API for some specific
data structure that has certain operations publicized as "proper"
for that structure, you give names that make sense for that API,
even if some of them are nothing more than synonyms for built-in
functions such as CADDR. Then when you write good code in the sense
of making sense to anyone reading it, you use these API names for
all the "proper" operations, rather than occasionally sneak in the
CADDR etc. low-level equivalents. Every time you use one of the API
functions, you are clearly showing anyone reading your code that
the *intent* is to perform a "proper" operation on an object of
this type as defined by the API.
I think a useful exercise for newbies would be to implement a
totally trivial API. For example, build&use records of the form
(givenName:string familyName:string yearBorn:integer)
and simply define a constructor and three accessors and three
setters, each with reasonable names per this specific structure of
data, first as functions, then as macros. Maybe even do it as
lists, as vectors, and as structs, to show how the same API can be
implemented multiple ways. By having the API be totally trivial,
the student can concentrate on the API itself without being
distracted by the difficulty of getting a complicated algorithm
working.
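As a sketch of the list-based version (the names are mine, and only one
setter is shown to keep it short):
(defun make-person (given-name family-name year-born)
  (list given-name family-name year-born))
(defun person-given-name (p) (first p))
(defun person-family-name (p) (second p))
(defun person-year-born (p) (third p))
(defun set-person-year-born (p year) (setf (third p) year) p)
;; (person-family-name (make-person "Ada" "Lovelace" 1815)) => "Lovelace"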
> Again, not important for newbs IMO. In fact, some newbies that
> I've talked to specifically mention that they find the CAR and
> CDR operators distasteful.
Show them how easy it is to define their own operators for both
the tree API
(something like MAKE-PAIR PAIR-LEFT PAIR-RIGHT)
and the nested-list API
(something like LIST-PREPEND-ELEMENT LIST-FIRST-ELEMENT REST-OF-LIST)
then let them make up any names they will be comfortable with for these
two different *intentions* of using CONS to build structures looked
at in two different ways (as symmetric left/right, or as asymmetric
firstElement/restOfListAfterFirstElement).
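Something along these lines, say (the operator names are the ones
suggested above; the one-line definitions are mine):
;; The symmetric-pair intention:
(defun make-pair (left right) (cons left right))
(defun pair-left (pair) (car pair))
(defun pair-right (pair) (cdr pair))
;; The nested-list intention:
(defun list-prepend-element (element list) (cons element list))
(defun list-first-element (list) (first list))
(defun rest-of-list (list) (rest list))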
> Even after I mention that they never have to be used, they still
> are turned off by them.
You could always provide them with an init.lisp file which defines
aliases for both the symmetric-pair and element/rest intentions,
and teach the two APIs totally separately, as if they didn't really
use the same internal form. Whenever they want to create a literal
value per the symmetric-pair intention, they must strictly use
dotted notation, no list-shorthands allowed. Whenever they want to
create a literal list or nested lists, they must strictly use
non-dotted list notation, and they're forbidden from using any dots
at all, no dotted lists allowed. You could even provide toString
functions for each intention, which would either use strictly
dotted notation or strictly proper-list notation. So your students
wouldn't need to see CAR or CDR or CONS even once (unless they peek
at how you implemented your two APIs in your init.lisp file).
> > Yet the
> > code works with arbitrary CONS trees. CAR and CDR should be used
> > instead of FIRST and REST if the intention is to work with
> > arbitrary CONS trees.
> > If exp is (5 . 9), should it count one number or two numbers??
> Yes, the doc string is not explicit. Got me there :)
So, you gonna fix the doc string before you post that code again?
> >> SORT has to be handled with care, even if it is destructive.
> >> If you want to sort a list, you have to use the result of SORT.
>> Won't give a newbie the expected result:
> >> (let ((mylist (list "c" "d" "a")))
> >> (sort mylist #'string<)
> >> mylist)
> > Yeah, this is a specific lesson about a specific
> > possibly-destructive function whose *result* rather than
> > side-effect is the *correct* result.
> Considering how often sort is used, even though it is a minor
> lesson, it is still important, especially since it will almost
> certainly bite the newb on the @ss eventually.
I agree that at the point where SORT is presented, this point
should be made clear. The manuals that I use make this point
clear. Perhaps in new online manuals there also should be an
appendix that lists *all* the built-in functions that are
destructive in such a way that their return value is the only
result you can rely on after the function is called. Each
definition of a particular function could have a footnote linking
to the appendix.
Let me check if the HyperSpec is like that ...
<http://www.lispworks.com/documentation/HyperSpec/Body/f_sort_.htm>
No, it's not. It has a link to a section about destructive
operations, but that section merely tells about avoiding modifying
a literal, and unpredictable behaviour if there's a transfer of
control out from the middle of a destructive operation. There's no
mention of the structure hanging from the original pointer *not*
being identical to the returned value, or indeed *not* being
anything reasonable whatsoever, in the case of sorting a list.
Maybe Kent can fix that to do what I recommend?
> >> Better:
> >> (let ((mylist (list "c" "d" "a")))
> >> (setq mylist (sort mylist #'string<))
> >> mylist)
> > Yes, but slightly inefficient compared to:
> > (let ((mylist (list "c" "d" "a")))
> > (setq mylist (sort mylist #'string<)))
> > because after all the return value from SETQ is the value that was assigned,
> > unlike defconstant whose return value is the symbol being assigned.
> Efficiency was not my prime motivation. Pedagogical value was.
Yeah. I suggest you make your newbie point about how you need to
return the sorted result rather than the original given
argument. I.e., do *not* do this:
(let* ((origlist (list "c" "d" "a"))
       (sortedlist (sort origlist #'string<)))
  origlist)
but do this instead:
(let* ((origlist (list "c" "d" "a"))
       (sortedlist (sort origlist #'string<)))
  sortedlist)
and don't even try to return both original and sorted list like this:
(let* ((origlist (list "c" "d" "a"))
       (sortedlist (sort origlist #'string<)))
  (values sortedlist origlist))
I like that way of having parallel examples.
Then after that main point is given, note how origlist is used in
only one place, so it needn't be assigned to a named variable; it can
just be nested inside the call to sort:
(let* ((sortedlist (sort (list "c" "d" "a") #'string<)))
sortedlist)
Then note that sortedlist is used in only one place, so likewise
it needn't be assigned to a named variable:
(sort (list "c" "d" "a") #'string<)
P.S., if you've perhaps noticed, I like to debug in a PROG where
each line of code assigns a new variable, then later refactor the
syntax to get rid of some of the local variables by nesting
expressions if I feel like doing that, and convert the whole PROG
to LET* or some other nifty form if it's easy. What I like about
PROG is how it allows a sequence of statements to be listed all at
the same level of structure, and it allows an intermix
of SETQ and MULTIPLE-VALUE-SETQ, which LET* doesn't currently
support.
A few days ago somebody suggested an extension of the LET* syntax
to support multiple values, which I rather like, and if implemented
would allow me to get rid of almost *all* my PROGs:
(LET** ((q r (truncate x y)))
(format t "~D divided by ~D gives ~D with remainder ~D~%" x y q r))
(I introduced the second asterisk myself just now.)
Maybe someday I'll implement that as a macro and submit it to some
public repository of freely-usable code?
(Or maybe somebody with more free time will do it before I get around to it?)
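If somebody does get around to it, a rough sketch might look like this
(my own untested attempt, not code from the thread; each clause is some
variable names followed by one form, and nested clauses expand into
MULTIPLE-VALUE-BIND):
(defmacro let** (bindings &body body)
  (if (null bindings)
      `(progn ,@body)
      (let* ((clause (first bindings))
             (form (first (last clause)))   ;last element of the clause is the form
             (vars (butlast clause)))       ;everything before it names the values
        `(multiple-value-bind ,vars ,form
           (let** ,(rest bindings) ,@body)))))
;; (let** ((q r (truncate 17 5))) (list q r)) => (3 2)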
> Did you notice the symmetry between the two examples? They were
> like that for a reason: it's easier to draw conclusions this way.
OK. See my alternate parallel structure using LET* which might
illustrate all the points slightly better?
> a good compiler could peephole optimize that last "mylist" out
> anyway.
Agreed. I think newbies should be shown how to write the same
algorithm with a named variable at every step (for ease in
debugging) and with all single-use values passed by nested
expressions to avoid the need for named variables to temporarily
hold the value (for compactness), and be allowed to use whatever
combination between the two extremes they feel like using on any
given day for any given task, and be encouraged to use cut&paste to
refactor the syntax of such algorithms any time they feel like it.
Example:
Easy-to-debug one-line-at-a-time version:
(defun phvec-euclidean-diagonal (phvec)
(prog (squares sumsq diag)
(setq squares (mapcar #'(lambda (num) (* num num)) phvec))
(setq sumsq (reduce #'+ squares))
(setq diag (sqrt sumsq))
(return diag)))
Compactified version (beginners hate all these parens):
(defun phvec-euclidean-diagonal (phvec)
(sqrt (reduce #'+ (mapcar #'(lambda (num) (* num num)) phvec)))))
Completely rewritten version using LOOP with SUM also possible.
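E.g. something like (a sketch):
(defun phvec-euclidean-diagonal (phvec)
  (sqrt (loop for num in phvec sum (* num num))))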
> >> And last, I would stay away from mentioning the eval operator.
> > Actually, when developing code line-at-a-time, which is the way I
> > strongly prefer in most cases, the REP is your biggest tool, and
> > EVAL really needs to be mentioned.
> Evaluation could be mentioned as part of the REPL. The EVAL
> /operator/ needn't be. They are two separate, but related, concepts.
OK, for absolute newbies we are in complete agreement.
(Eventually EVAL as an explicit operator in code turns out to be
occasionally useful, but only after uses of FUNCALL and APPLY are
exhausted. I presume we're in agreement there too.)
> > Better to get
> > familiar with APPLY and FUNCALL rather than "consing up an
> > expression to pass to EVAL". Perhaps lots of examples of where
> > APPLY or FUNCALL is useful, would be a good tutorial, so the idea
> > of needing to explicitly call EVAL never arises in the first place.
> Yes, I showed an example of this already. See above (/\cfuncall/).
OK, we're in agreement. (I didn't happen to notice the cfuncall section.)
> ... newbies ... tend to look for the biggest hammer first, and
> showing them EVAL, IMO, is not helpful.
I think I see your point. Although mentioning EVAL in the context
of the read-eval-print loop might be useful early on, the EVAL
operator mentioned explicitly as something anyone might put into
actual code is poison to the newbie's mind. Maybe instead of
Read-Eval-Print, the REP should be described as
Parse-Compute-Print. If they never hear about any function called
"EVAL", they won't be tempted to play with it in their code, and
they won't get the misguided idea that the big hammer EVAL is
better than the delicate tools FUNCALL and APPLY.
But I have an even better idea I thought of a year or two ago which
I haven't implemented yet but might "any time now", in conjunction
with my CookBook examples: I could set up a Web service that taught
how to compose D/P algorithms without any reference to any
particular programming language whatsoever. The main window would
allow browsing of data types, and within each data type browsing
the various useful functions commonly provided in programming
languages, with *descriptions* of functionality rather than names
of functions. The student would choose a desired functionality,
assign his/her personal name to it (which the CAI system would
remember), and query as to what parameters to pass to it, either
literal values (numbers or strings only) or previously-assigned
variables, and ask for a variable name to assign to the result (the
default is a GENSYMmed name). The student could build up sequences
of actions, and eventually collect them into a function, with the
student choosing the name of the function. There'd be a fast-browse
menu of all the functionalities the student has already used, as
well as a fast-browse menu of all the variable-value bindings the
student has previously set up. At any time the student could see
what a single line of code, or a completed function definition,
would look like in any of six or seven programming languages (PHP,
Perl, Lisp, C, C++, Java maybe also FlamingThunder), but just for
curiosity, not as anything needed for further work in the course.
For functions that return multiple values, the student would pick
more than one variable name, or get GENSYMed names for all the
ones not explicitly named. Perhaps the GENSYMed names would follow
a variation of the MacSyma convention.
Single value: D1
Single value: D2
Multiple values: D3, D3a
Single value: D4
Multiple values: D5, D5a, D5b
This would almost completely get rid of the need to teach a
*programming*language* right alongside teaching
*how*to*design*algorithms*, hence the need to **choose** which
programming language to *teach*. Instead only
*how*to*design*algorithms* need be taught to newbies.
So the common question in comp.programming
"What is the best first programming language to learn?"
would become moot!
> I (poorly) was referring to the newb who was just here asking for
> the best lisp operators to use, as if knowing the most common
> operators is going to help you when you are unable to articulate
> your algorithm.
Hmm, I think my idea of a Web-based CAI for how to program without
explicitly using *any* programming language would pretty much
eliminate those kinds of newbie questions. Instead of saying "Learn
these built-in Lisp functions", and in the C group being told
"Learn these C operators and library functions", and in the Java
group being told "Learn these Java classes, and the following
methods within them", and in comp.programming getting all that
contradictory advice plus admonitions to instead use C# or VB6 or
*.NET or FlamingThunder or Pascal or machine language or Fortran 77
or Scheme or Python etc. etc., instead all newbies asking such
questions can be sent to the non-language CAI service where all
such questions can be postponed for months or years.
> My preferred method of learning a new language: find a good book,
> read a chapter, put it in my lap as I do the most interesting
> chapter problems, and continue on to the next chapter.
That presents a dilemma. Write all answers to chapter problems
using pseudo code, whereby you have no way to check your algorithms
to see if they really work, or pick a particular programming
language and suffer whatever syntax it presents.
Suppose my proposed non-language online CAI-algorithm-design
service existed. Would you be willing to use it, instead of an
actual programming language, to work out the answers to the chapter
problems, to verify that your proposed algorithms really work to
solve the particular task?
> When I've learned enough to code without the book, put the book down.
That sounds reasonable. If my proposed non-language CAI-algorithm-design
service existed, you could initially:
- See problem in book, and suggestions of functions to call, browse
my index of functionality to find equivalent functionality, and
name it anything you want, like what the book says, or something
you prefer instead.
Then when you're at the put-book-down point, you switch to:
- Browse my index of functionality for anything you feel like
playing with, with no requirement to do what any book says,
instead invent/specify your own tasks that you feel you can
accomplish with just a little bit more effort in finding
existing functionality plus adding a layer of your own algorithms.
> My preferred book happened to be Norvig's PAIP.
So are you doing the examples in the book using some particular
programming language, or perhaps doing them in parallel using two
different languages, or doing them with just pseudocode?
> Even with the advent of the internet, it's still my preferred
> method, because published books are generally of higher quality
> than the language studies I find on the internet.
I would like to see that changed someday soon, to where
high-quality tutorials about algorithm design (for newbies) are
available over the Web.
Have you seen my proposed course outline for teaching computer
programming (using Lisp for now, using my proposed non-language Web
service eventually) starting from the absolute beginning?
<http://www.rawbw.com/~rem/HelloPlus/hellos.html#s4outl>
Critique please.
·················@SpamGourmet.Com (Robert Maas, http://tinyurl.com/uh3t) writes:
> > > Regarding that proposed koan earlier: Does anybody have a
> > > collection of short code snippets which likewise illustrate a whole
> > > mess of deep understanding, where just trying to understand why
> > > this particular snippet of code produces the result it does will
> > > enlighten a newbie?
> > From: vanekl <·····@acd.net>
> > a 10-line recursive BFS
>
> I assume you mean breadth-first search (in some sort of tree,
> possibly binary)? Why would anybody want to write that recursively?
For fun :-)
> Let me see if I can find the 10-line recursive BFS algorithm ...
Will four lines do?
(defun bfs6 (test children pending)
(and pending
(or (some test pending)
(bfs6 test children (mapcan children pending)))))
I've forgotten which Pascal showed me this, sorry.
It fails to return the path.
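(EQUAL-TO isn't shown in the post; presumably it is a small currying
helper, something like this guess:)
(defun equal-to (x)
  ;; Return a one-argument predicate that tests EQLity with X.
  (lambda (y) (eql x y)))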
CL-USER> (loop for i from 10 to 20
do (print (bfs6 (equal-to i)
(lambda(x)(remove 1000
(list (* x 2)
(* x 3)
(* x 5))
:test #'< ))
(list 1))))
T
NIL
T
NIL
NIL
T
T
NIL
T
NIL
T
I cannot remember the right way to fix this, but ruthless
cheating is fun too:
CL-USER> (defun test-path (test)
(lambda(path)
(if (funcall test (car path))
path
nil)))
CL-USER> (defun child-path (children)
(lambda(path)
(loop for child in (funcall children (car path))
collect (cons child path))))
CL-USER> (loop for i from 10 to 20
do (print (bfs6 (test-path (equal-to i))
(child-path (lambda(x)(remove 1000
(list (* x 2)
(* x 3)
(* x 5))
:test #'< )))
(list (list 1)))))
(10 2 1)
NIL
(12 4 2 1)
NIL
NIL
(15 3 1)
(16 8 4 2 1)
NIL
(18 6 2 1)
NIL
(20 4 2 1)
Alan Crowe
Edinburgh
Scotland
Alan Crowe <····@cawtech.freeserve.co.uk> writes:
> Will four lines do?
>
> (defun bfs6 (test children pending)
> (and pending
> (or (some test pending)
> (bfs6 test children (mapcan children pending)))))
Uh, this assumes the children function returns freshly consed (or, at
least, disposable) list, right? Surely there will be situations where
that's a risky assumption. Not that it's not trivially corrected. I'm
just too lazy. It's a pity there's no non-destructive MAPCAN.
From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: coerce for arbitrary types
Date:
Message-ID: <rem-2008apr25-006@yahoo.com>
> From: Kent M Pitman <······@nhplace.com>
> It's a pity there's no non-destructive MAPCAN.
You only need the one-list-of-args version.
How long does it take to write one? Ten minutes?
(defun mapappend1 (funct args1)
(if (consp args1) (append (funcall funct (car args1))
(mapappend1 funct (cdr args1)))))
(mapappend1 #'(lambda (a) (make-list a :initial-element a))
'(1 7 4 9))
=> (1 7 7 7 7 7 7 7 4 4 4 4 9 9 9 9 9 9 9 9 9)
Once you have that (or an iterative version) debugged,
with copy&paste it's trivial to write versions for any other
specific number of arguments, for example:
(defun mapappend2 (funct args1 args2)
(if (and (consp args1) (consp args2))
(append (funcall funct (car args1) (car args2))
(mapappend2 funct (cdr args1) (cdr args2)))))
(mapappend2 #'(lambda (a b) (make-list a :initial-element b))
'(1 7 4 9)
'(8 3 2 5))
=> (8 3 3 3 3 3 3 3 2 2 2 2 5 5 5 5 5 5 5 5 5)
You never *really* need any more than two arglists, because you can
always nest calls in a pinch.
Actually you never need more than one arglist, because you can use
MAPCAR with #'LIST to collate two or more lists into a list of
argument tuples:
(mapcar #'LIST args1 args2 args3 args4 args5)
then use MAPAPPEND1 with NTH or APPLY to process the single
already-merged list of argument tuples.
(mapappend1 #'(lambda (tuple)
(myfunct5 (nth 0 tuple) (nth 1 tuple) (nth 2 tuple)
(nth 3 tuple) (nth 4 tuple)))
(mapcar #'LIST args1 args2 args3 args4 args5))
(mapappend1 #'(lambda (tuple) (apply #'myfunct5 tuple))
(mapcar #'LIST args1 args2 args3 args4 args5))
;Untested code there. If I made a mistake, please post followup.
·················@SpamGourmet.Com (Robert Maas,
http://tinyurl.com/uh3t) writes:
>> From: Kent M Pitman <······@nhplace.com>
>> It's a pity there's no non-destructive MAPCAN.
>
> You only need the one-list-of-args version.
> How long does it take to write one? Ten minutes?
> (defun mapappend1 (funct args1)
> (if (consp args1) (append (funcall funct (car args1))
> (mapappend1 funct (cdr args1)))))
> (mapappend1 #'(lambda (a) (make-list a :initial-element a))
> '(1 7 4 9))
> => (1 7 7 7 7 7 7 7 4 4 4 4 9 9 9 9 9 9 9 9 9)
>
> Once you have that (or an iterative version) debugged,
> with copy&paste it's trivial to write versions for any other
> specific number of arguments, for example:
> (defun mapappend2 (funct args1 args2)
> (if (and (consp args1) (consp args2))
> (append (funcall funct (car args1) (car args2))
> (mapappend2 funct (cdr args1) (cdr args2)))))
> (mapappend2 #'(lambda (a b) (make-list a :initial-element b))
> '(1 7 4 9)
> '(8 3 2 5))
> => (8 3 3 3 3 3 3 3 2 2 2 2 5 5 5 5 5 5 5 5 5)
>
> You never *really* need any more than two arglists, because you can
> always nest calls in a pinch.
>
> Actually you never need more than one arglist, because you can use
> MAPCAR with #'LIST to collate two or more lists into a list of
> argument tuples:
> (mapcar #'LIST args1 args2 args3 args4 args5)
> then use MAPAPPEND1 with NTH or APPLY to process the single
> already-merged list of argument tuples.
> (mapappend1 #'(lambda (tuple)
> (myfunct5 (nth 0 tuple) (nth 1 tuple) (nth 2 tuple)
> (nth 3 tuple) (nth 4 tuple)))
> (mapcar #'LIST args1 args2 args3 args4 args5))
> (mapappend1 #'(lambda (tuple) (apply #'myfunct5 tuple))
> (mapcar #'LIST args1 args2 args3 args4 args5))
> ;Untested code there. If I made a mistake, please post followup.
You can write mapappend like this:
(defun mapappend (function &rest lists)
"Like mapcan, but is non-destructive, ie. uses append to combine the
results together rather than nconc."
(loop for result in (apply #'mapcar function lists)
append result))
Björn Lindberg
From: Robert Maas, http://tinyurl.com/uh3t
Subject: Short koan-like code snippets (was: coerce for arbitrary types)
Date:
Message-ID: <rem-2008apr25-005@yahoo.com>
> > > > Regarding that proposed koan earlier: Does anybody have a
> > > > collection of short code snippets which likewise illustrate a whole
> > > > mess of deep understanding, where just trying to understand why
> > > > this particular snippet of code produces the result it does will
> > > > enlighten a newbie?
> > > From: vanekl <·····@acd.net>
> > > a 10-line recursive BFS
> > I assume you mean breadth-first search (in some sort of tree,
> > possibly binary)? Why would anybody want to write that recursively?
> From: Alan Crowe <····@cawtech.freeserve.co.uk>
> For fun :-)
> > Let me see if I can find the 10-line recursive BFS algorithm ...
> Will four lines do?
> (defun bfs6 (test children pending)
> (and pending
> (or (some test pending)
> (bfs6 test children (mapcan children pending)))))
Um, that is a bit inscrutable. It needs some documentation as to
what are natural values and what are continuations or other unusual
types of data. I studied that code for quite some time, and it
seems reasonable that perhaps:
- test is a predicate for filtering the nodes as to which satisfy
the toplevel goal and which don't.
- children is a function which, given a node, returns a list of all
children to that node, but furthermore always makes a copy of
that list before returning, so that it's safe to use mapcan on
the result.
- pending is a list of nodes at the current level, each of which
has been discovered at the previous level but not yet processed.
The algorithm seems to be a tail-recursive expression of what
is essentially an iterative algorithm:
(defun bfs6 (test children pending) ;At start, pending = (topOfTree)
(loop ;One cycle of loop per level of tree being traversed
(unless pending (return NIL)) ;If we've bottomed out already
(let ((founds (some test pending)))
(when founds (return founds))) ;If we've found our target at this level
(setq pending (mapcan children pending)) ;Step down to next level
))
Note that test and children are *constants* throughout the entire
tail-recursive, or iterative, algorithm, and it unnecessarily
complicates the expression of the algorithm to pass each explicitly
through each level of recursion emulating iteration.
With the iterative expression I proposed, there's only one setq, on
the *variable* pending, so it's obvious that test and children are
constants, not requiring careful study to deduce that fact.
I don't like using tail recursion to emulate iteration, for the
simple reason that Common Lisp does not specify that tail recursion
will be optimized to *not* push deeper and deeper on the stack, and
also for the simple reason that *any* iterative algorithm can be
contorted to be emulated via tail-recursion, and *one* example for
the student, any example, the most trivial example the best, is
sufficient to get the point across. It's not necessary to use an
unnecessarily complicated example like this just to show how tail
recursion can emulate iteration.
In either case, there's an obvious re-write I'd prefer. Instead of
requiring the children function to always make a copy of the list
of children before return, I'd define an auxilary function
copy-children which did that, and have a functional variable
cchildren so that the name made it clear to pass the auxilary
function instead of the basic data-field accessor function.
Or I'd write a function called mapappend which worked just like
mapcan except it used APPEND instead of NCONC. Then I could
call the basic data-field accessor function directly. That would be
my preferred solution, to avoid horrible bugs when the application
programmer mistakenly passes the wrong function.
So now let me say how I'd teach the tail-recurse-to-emulate-iteration
lesson if I were teaching a class: First I'd demonstrate a totally
frivolous use of iteration, to build a copy of an integer by
counting that integer down to zero while another integer is counted
up from zero in parallel:
(defun copy-n-i (n verbose &aux (ncopy 0))
(loop
(when verbose (format t "n=~D ncopy=~D~%" n ncopy))
(cond ((< 0 n) (decf n) (incf ncopy))
(t (return ncopy)))))
(copy-n-i 5 t)
n=5 ncopy=0
n=4 ncopy=1
n=3 ncopy=2
n=2 ncopy=3
n=1 ncopy=4
n=0 ncopy=5
5
(copy-n-i 947 nil)
947
Next I'd show how to do exactly the same algorithm via tail-recursion:
(defun copy-n-r (n &optional (ncopy 0))
(if (< 0 n) (copy-n-r (- n 1) (+ ncopy 1))
ncopy))
(trace copy-n-r)
(copy-n-r 5)
0: (COPY-N-R 5)
1: (COPY-N-R 4 1)
2: (COPY-N-R 3 2)
3: (COPY-N-R 2 3)
4: (COPY-N-R 1 4)
5: (COPY-N-R 0 5)
5: COPY-N-R returned 5
4: COPY-N-R returned 5
3: COPY-N-R returned 5
2: COPY-N-R returned 5
1: COPY-N-R returned 5
0: COPY-N-R returned 5
5
Next I'd assign a task of building a list of length n, every
element NIL, both iteratively and recursively, using only CONS to
do the building, but using (decf n) in the iterative algorithm and
(- n 1) in the recursive algorithm just as in the above example.
The functions should be named make-nil-list-i and make-nil-list-r
respectively.
I'd provide two test benches like this:
(make-nil-list-i 5 t)
(trace make-nil-list-r)
(make-nil-list-r 5)
(untrace make-nil-list-r)
(setq *gc-verbose* nil)
(loop for exp from 0 upto 500 do
(format t "~&** exp=~D " exp) (finish-output)
(let ((pwr (expt 2 exp)))
(format t "pwr=~D " pwr) (finish-output)
(prog (time0 ires rres len time9)
(format t "~& iterative...") (finish-output)
(setq time0 (get-universal-time))
(setq ires (make-nil-list-i pwr nil))
(setq time9 (get-universal-time))
(format t "~D seconds, " (- time9 time0)) (finish-output)
(setq len (length ires))
(if (= pwr len) (format t "correct.~%")
(error "Wrong!! Length should be ~D but is actually ~D" pwr len))
(format t "~& recursive...") (finish-output)
(setq time0 (get-universal-time))
(setq rres (make-nil-list-r pwr nil))
(setq time9 (get-universal-time))
(format t "~D seconds, " (- time9 time0)) (finish-output)
(setq len (length rres))
(if (= pwr len) (format t "correct.~%")
(error "Wrong!! Length should be ~D but is actually ~D" pwr len))
)
)
(sleep 0.3))
I'd expect the student to write code somewhat like this:
(defun make-nil-list-i (n verbose &aux (list nil))
(loop
(when verbose (format t "n=~D list=~S~%" n list))
(cond ((< 0 n) (decf n) (setq list (cons nil list)))
(t (return list)))))
Or maybe NCONC the new element at the end (shows as really really slow
on second benchmark test, but maybe that's the best way to learn).
Or maybe use APPEND instead of NCONC.
Or maybe CONS at start of the list and also NREVERSE before return
to be flexible code anticipating that I might next want the
elements to be the index rather than always NIL.
(defun make-nil-list-r (n &optional (list nil))
(if (< 0 n) (make-nil-list-r (- n 1) (cons nil list))
list))
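That CONS-then-NREVERSE variant of the iterative version might look
like this (my sketch, with an invented name), where the elements are
the indices so the NREVERSE actually matters:
(defun make-index-list-i (n verbose &aux (i 0) (revres nil))
  (loop
    (when verbose (format t "i=~D revres=~S~%" i revres))
    (cond ((< i n) (push i revres) (incf i))
          (t (return (nreverse revres))))))
;; (make-index-list-i 5 nil) => (0 1 2 3 4)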
By the way, running the second benchmark test with the sample code
I showed in CMUCL on my ISP's unix shell account gives in part:
** exp=17 pwr=131072
iterative...3 seconds, correct.
recursive...5 seconds, correct.
** exp=18 pwr=262144
iterative...5 seconds, correct.
recursive...6 seconds, correct.
** exp=19 pwr=524288
iterative...9 seconds, correct.
recursive...18 seconds, correct.
** exp=20 pwr=1048576
iterative...23 seconds, correct.
recursive...27 seconds, correct.
** exp=21 pwr=2097152
iterative...51 seconds, correct.
recursive...106 seconds, correct.
** exp=22 pwr=4194304
iterative...103 seconds, correct.
recursive...151 seconds, correct.
** exp=23 pwr=8388608
iterative...177 seconds, correct.
recursive...330 seconds, correct.
In XLISP it doesn't work at all because DECF isn't defined.
So I defined it using code that came in source form with the download.
Also *gc-verbose* isn't defined, so I skipped that step.
But then:
error: unbound variable - exp
which presumably means they don't have the new LOOP macro, or it's
broken. So I'll change the LOOP a little bit to make it run with
the old CL1 version ... but now I get:
error: unbound function - finish-output
so I commented out all those calls, but now I get:
error: unbound function - get-universal-time
And neither of the common-lisp files have a definition for it.
I give up! XLISP is useless!!
I wish there were a halfway decent free Common Lisp for Mac system 7.5.5
Scott Burson wrote:
> (defmethod convert ((to-type (eql 'set)) (b bag) &key)
> ...)
>
> I put the destination type first in the argument order because I think
> it makes the calls more readable: (convert 'foo <long expression>) is
> easier to read than (convert <long expression> 'foo). Also, a few of
> these methods take additional keyword parameters to specify some
> aspect of the conversion to be performed.
What, exactly, makes one of these more or less readable than the other?
Damien Kick <·····@earthlink.net> writes:
> Scott Burson wrote:
>
> > (defmethod convert ((to-type (eql 'set)) (b bag) &key)
> > ...)
> > I put the destination type first in the argument order because I
> > think
> > it makes the calls more readable: (convert 'foo <long expression>) is
> > easier to read than (convert <long expression> 'foo). Also, a few of
> > these methods take additional keyword parameters to specify some
> > aspect of the conversion to be performed.
>
> What, exactly, makes one of these more or less readable than the other?
There is a visual analog here to the question of "tail call optimization".
If you expect two arguments, your eye has to first parse the first arg
and all the while remember that when it's done it will parse the second.
If it can dismiss the task of remembering while doing a usually trivial
task (the probably-short name), then when you're parsing the long expression,
you have more "short term memory" (to use the Newell&Simon term) or more
"stack" (to use the computational term) available while you're doing the
second task, which is now in a tail position and no longer has to remember
about the need to parse the other arg. That's the approximate explanation
anyway.
On Apr 6, 5:11 pm, Kent M Pitman <······@nhplace.com> wrote:
> Damien Kick <·····@earthlink.net> writes:
> > Scott Burson wrote:
>
> > > (defmethod convert ((to-type (eql 'set)) (b bag) &key)
> > > ...)
> > > I put the destination type first in the argument order because I
> > > think
> > > it makes the calls more readable: (convert 'foo <long expression>) is
> > > easier to read than (convert <long expression> 'foo). Also, a few of
> > > these methods take additional keyword parameters to specify some
> > > aspect of the conversion to be performed.
>
> > What, exactly, makes one of these more or less readable than the other?
>
> There is a visual analog here to the question of "tail call optimization".
Cute analogy! I agree.
Also, I find I've used `concatenate' a lot more often than `coerce',
and `concatenate' takes the result type as its first argument (it has
to, so it can take a variable number of sequences). So in a strange
way it seems to me more consistent with precedent to put the type
first.
-- Scott