I'm trying to write a function that takes the name of a dictionary file
containing one word per line and returns the contents of that file as a
list of strings. The latest of several versions is this one (I know it's
not very elegant):
(defun get-words (filename)
(let ((*words* '()))
(defun read-loop (stream)
(setf *words* (cons (read-line stream nil 0) *words*))
(if (numberp (first *words*))
(rest *words*)
(read-loop stream)))
(with-open-file (stream filename :direction :input)
(read-loop stream))))
The function works with files containing less than around 10,000 words,
but no matter what I do, whenever I call this function with a file
containing more words, I get a stack overflow or a core dump. How can I
do this with larger files? I feel like there must be a better way.
Thank you for your time.
Joe
Joseph Dale <·····@uclink4.berkeley.edu> writes:
> I'm trying to write a function that takes the name of a dictionary file
> containing one word per line and returns the contents of that file as a
> list of strings. The latest of several versions is this one (I know it's
> not very elegant):
>
> (defun get-words (filename)
> (let ((*words* '()))
> (defun read-loop (stream)
> (setf *words* (cons (read-line stream nil 0) *words*))
> (if (numberp (first *words*))
> (rest *words*)
> (read-loop stream)))
> (with-open-file (stream filename :direction :input)
> (read-loop stream))))
>
> The function works with files containing less than around 10,000 words,
> but no matter what I do, whenever I call this function with a file
> containing more words, I get a stack overflow or a core dump. How can I
> do this with larger files? I feel like there must be a better way.
>
> Thank you for your time.
It looks like your CL system cannot do teil recursion optimization.
Anyway, a non-recursive version is here
(defun get-words (filename)
(with-open-file (f filename :direction :input)
(loop for line = (read-line f nil '$$get-word-eof$$)
when (eq line '$$get-word-eof$$)
do (return words)
collect line into words)))
Cheers
--
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
http://www.parades.rm.cnr.it/~marcoxa
Marco Antoniotti <·······@parades.rm.cnr.it> writes:
> Joseph Dale <·····@uclink4.berkeley.edu> writes:
>
> It looks like your CL system cannot do teil recursion optimization.
>
> Anyway, a non-recursive version is here
>
> (defun get-words (filename)
> (with-open-file (f filename :direction :input)
> (loop for line = (read-line f nil '$$get-word-eof$$)
> when (eq line '$$get-word-eof$$)
> do (return words)
> collect line into words)))
>
Since we are at it....
(defun get-words (filename)
(with-open-file (f filename :direction :input)
(labels ((read-loop (f lines)
(let ((line (read-line f nil '$$get-word-eof$$)))
(if (eq line '$$get-word-eof$$)
lines
(read-loop f (cons line lines))))))
(nreverse (read-loop f ())))))
Of course, if your CL compiler does not do tail recursion elimination,
this second function will stack overflow on large files.
Cheers
--
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
http://www.parades.rm.cnr.it/~marcoxa
Marco Antoniotti wrote:
>
> Since we are at it....
>
> (defun get-words (filename)
> (with-open-file (f filename :direction :input)
> (labels ((read-loop (f lines)
> (let ((line (read-line f nil '$$get-word-eof$$)))
> (if (eq line '$$get-word-eof$$)
> lines
> (read-loop f (cons line lines))))))
> (nreverse (read-loop f ())))))
>
> Of course, if your CL compiler does not do tail recursion elimination,
> this second function will stack overflow on large files.
Interestingly, even after (proclaim '(optimize (debug 0))), this one
does even worse than my original function.
>
> Cheers
>
> --
> Marco Antoniotti ===========================================
> PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
> tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
> http://www.parades.rm.cnr.it/~marcoxa
Marco Antoniotti wrote:
>
> Joseph Dale <·····@uclink4.berkeley.edu> writes:
>
> > I'm trying to write a function that takes the name of a dictionary file
> > containing one word per line and returns the contents of that file as a
> > list of strings. The latest of several versions is this one (I know it's
> > not very elegant):
> >
> > (defun get-words (filename)
> > (let ((*words* '()))
> > (defun read-loop (stream)
> > (setf *words* (cons (read-line stream nil 0) *words*))
> > (if (numberp (first *words*))
> > (rest *words*)
> > (read-loop stream)))
> > (with-open-file (stream filename :direction :input)
> > (read-loop stream))))
> >
> > The function works with files containing less than around 10,000 words,
> > but no matter what I do, whenever I call this function with a file
> > containing more words, I get a stack overflow or a core dump. How can I
> > do this with larger files? I feel like there must be a better way.
> >
> > Thank you for your time.
>
> It looks like your CL system cannot do teil recursion optimization.
This is why I'm am in such shock, because neither this, nor any other
tail-recursive version of my function, seems to work in either CLISP or
ACL (on Red Hat 6.1). I was under the impression that these systems
would optimize tail-recursion. The non-recursive version is nice, but I
was really hoping to do it recursively.
>
> Anyway, a non-recursive version is here
>
> (defun get-words (filename)
> (with-open-file (f filename :direction :input)
> (loop for line = (read-line f nil '$$get-word-eof$$)
> when (eq line '$$get-word-eof$$)
> do (return words)
> collect line into words)))
>
> Cheers
>
> --
> Marco Antoniotti ===========================================
> PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
> tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
> http://www.parades.rm.cnr.it/~marcoxa
On Thu, 10 Feb 2000 14:13:41 GMT, Joseph Dale <·····@uclink4.berkeley.edu>
wrote:
> This is why I'm am in such shock, because neither this, nor any other
> tail-recursive version of my function, seems to work in either CLISP or
> ACL (on Red Hat 6.1). I was under the impression that these systems
> would optimize tail-recursion. The non-recursive version is nice, but I
> was really hoping to do it recursively.
Try
(PROCLAIM (OPTIMIZE (DEBUG 0)))
as Common Lisp systems sometimes have a 'factory setting' that turns off TRO to
assist debugging.
__Jason
Jason Trenouth wrote:
>
> On Thu, 10 Feb 2000 14:13:41 GMT, Joseph Dale <·····@uclink4.berkeley.edu>
> wrote:
>
> > This is why I'm am in such shock, because neither this, nor any other
> > tail-recursive version of my function, seems to work in either CLISP or
> > ACL (on Red Hat 6.1). I was under the impression that these systems
> > would optimize tail-recursion. The non-recursive version is nice, but I
> > was really hoping to do it recursively.
>
> Try
>
> (PROCLAIM (OPTIMIZE (DEBUG 0)))
>
> as Common Lisp systems sometimes have a 'factory setting' that turns off TRO to
> assist debugging.
>
> __Jason
??????
In ACL (trial, by the way):
USER(1): (PROCLAIM (OPTIMIZE (DEBUG 0)))
Error: attempt to call `OPTIMIZE' which is an undefined function.
[condition type: UNDEFINED-FUNCTION]
Restart actions (select using :continue):
0: Try calling OPTIMIZE again.
1: Return a value instead of calling OPTIMIZE.
2: Try calling a function other than OPTIMIZE.
3: Setf the symbol-function of OPTIMIZE and call it again.
4: Return to Top Level (an "abort" restart)
5: Abort #<PROCESS Initial Lisp Listener>
In CLISP:
[1]> (PROCLAIM (OPTIMIZE (DEBUG 0)))
*** - EVAL: the function OPTIMIZE is undefined
Joseph Dale wrote:
>
> Jason Trenouth wrote:
> >
> > On Thu, 10 Feb 2000 14:13:41 GMT, Joseph Dale <·····@uclink4.berkeley.edu>
> > wrote:
> >
> > > This is why I'm am in such shock, because neither this, nor any other
> > > tail-recursive version of my function, seems to work in either CLISP or
> > > ACL (on Red Hat 6.1). I was under the impression that these systems
> > > would optimize tail-recursion. The non-recursive version is nice, but I
> > > was really hoping to do it recursively.
> >
> > Try
> >
> > (PROCLAIM (OPTIMIZE (DEBUG 0)))
> >
> > as Common Lisp systems sometimes have a 'factory setting' that turns off TRO to
> > assist debugging.
> >
> > __Jason
>
> ??????
>
> In ACL (trial, by the way):
>
> USER(1): (PROCLAIM (OPTIMIZE (DEBUG 0)))
> Error: attempt to call `OPTIMIZE' which is an undefined function.
> [condition type: UNDEFINED-FUNCTION]
>
> Restart actions (select using :continue):
> 0: Try calling OPTIMIZE again.
> 1: Return a value instead of calling OPTIMIZE.
> 2: Try calling a function other than OPTIMIZE.
> 3: Setf the symbol-function of OPTIMIZE and call it again.
> 4: Return to Top Level (an "abort" restart)
> 5: Abort #<PROCESS Initial Lisp Listener>
>
> In CLISP:
>
> [1]> (PROCLAIM (OPTIMIZE (DEBUG 0)))
>
> *** - EVAL: the function OPTIMIZE is undefined
(proclaim '(optimize (debug 0))) is accepted, but the function still
doesn't work.
According to ACL documentation:
The `tail-call-self-merge-switch' is true if SPEED is greater than 0.
The `tail-call-non-self-merge-switch' is true if SPEED is greater
than 1 and DEBUG is less than 0.
There was a bug where the wrong switch is examined for LABELS
functions, though.
You can also set the switches directly rather than rely on the SPEED
and DEBUG settings:
(setq compiler:tail-call-self-merge-switch t)
(setq compiler:tail-call-non-self-merge-switch t)
Joseph Dale <·····@uclink4.berkeley.edu> wrote in message
······················@uclink4.berkeley.edu...
> Joseph Dale wrote:
> >
> > Jason Trenouth wrote:
> > >
> > > On Thu, 10 Feb 2000 14:13:41 GMT, Joseph Dale
<·····@uclink4.berkeley.edu>
> > > wrote:
> > >
> > > > This is why I'm am in such shock, because neither this, nor any
other
> > > > tail-recursive version of my function, seems to work in either CLISP
or
> > > > ACL (on Red Hat 6.1). I was under the impression that these systems
> > > > would optimize tail-recursion. The non-recursive version is nice,
but I
> > > > was really hoping to do it recursively.
> > >
> > > Try
> > >
> > > (PROCLAIM (OPTIMIZE (DEBUG 0)))
> > >
> > > as Common Lisp systems sometimes have a 'factory setting' that turns
off TRO to
> > > assist debugging.
> > >
> > > __Jason
> >
> > ??????
> >
> > In ACL (trial, by the way):
> >
> > USER(1): (PROCLAIM (OPTIMIZE (DEBUG 0)))
> > Error: attempt to call `OPTIMIZE' which is an undefined function.
> > [condition type: UNDEFINED-FUNCTION]
> >
> > Restart actions (select using :continue):
> > 0: Try calling OPTIMIZE again.
> > 1: Return a value instead of calling OPTIMIZE.
> > 2: Try calling a function other than OPTIMIZE.
> > 3: Setf the symbol-function of OPTIMIZE and call it again.
> > 4: Return to Top Level (an "abort" restart)
> > 5: Abort #<PROCESS Initial Lisp Listener>
> >
> > In CLISP:
> >
> > [1]> (PROCLAIM (OPTIMIZE (DEBUG 0)))
> >
> > *** - EVAL: the function OPTIMIZE is undefined
>
>
> (proclaim '(optimize (debug 0))) is accepted, but the function still
> doesn't work.
Joseph Dale <·····@uclink4.berkeley.edu> writes:
> Jason Trenouth wrote:
> >
> > On Thu, 10 Feb 2000 14:13:41 GMT, Joseph Dale <·····@uclink4.berkeley.edu>
> > wrote:
> >
> > > This is why I'm am in such shock, because neither this, nor any other
> > > tail-recursive version of my function, seems to work in either CLISP or
> > > ACL (on Red Hat 6.1). I was under the impression that these systems
> > > would optimize tail-recursion. The non-recursive version is nice, but I
> > > was really hoping to do it recursively.
> >
> > Try
> >
> > (PROCLAIM (OPTIMIZE (DEBUG 0)))
> >
> > as Common Lisp systems sometimes have a 'factory setting' that turns off TRO to
> > assist debugging.
> >
> > __Jason
>
>
> ??????
>
> In ACL (trial, by the way):
>
> USER(1): (PROCLAIM (OPTIMIZE (DEBUG 0)))
> Error: attempt to call `OPTIMIZE' which is an undefined function.
> [condition type: UNDEFINED-FUNCTION]
>
> Restart actions (select using :continue):
> 0: Try calling OPTIMIZE again.
> 1: Return a value instead of calling OPTIMIZE.
> 2: Try calling a function other than OPTIMIZE.
> 3: Setf the symbol-function of OPTIMIZE and call it again.
> 4: Return to Top Level (an "abort" restart)
> 5: Abort #<PROCESS Initial Lisp Listener>
>
>
> In CLISP:
>
> [1]> (PROCLAIM (OPTIMIZE (DEBUG 0)))
>
> *** - EVAL: the function OPTIMIZE is undefined
Minor mistake
(proclaim '(optimize (debug 0))) or
(declaim (optimize (debug 0)))
should work.
--
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
http://www.parades.rm.cnr.it/~marcoxa
Marco Antoniotti wrote:
>
>
> Minor mistake
>
> (proclaim '(optimize (debug 0))) or
> (declaim (optimize (debug 0)))
>
> should work.
>
Neither does. More stack overflows.
> --
> Marco Antoniotti ===========================================
> PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
> tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
> http://www.parades.rm.cnr.it/~marcoxa
From: Michael Kappert
Subject: Re: beginner question on input from large files
Date:
Message-ID: <38A2CE0A.510D0114@iitb.fhg.de>
Joseph Dale wrote:
>
> Marco Antoniotti wrote:
> >
> > Joseph Dale <·····@uclink4.berkeley.edu> writes:
> >
> > > I'm trying to write a function that takes the name of a dictionary file
> > > containing one word per line and returns the contents of that file as a
> > > list of strings. The latest of several versions is this one (I know it's
> > > not very elegant):
> > >
> > > (defun get-words (filename)
> > > (let ((*words* '()))
> > > (defun read-loop (stream)
As a side note: Although this looks like a local function definition, it isn't.
read-loop will only be defined when get-words ist first _executed_.
After that, read-loop will be _globally_ defined.
You may want to take a look at FLET and LABELS.
> > > (setf *words* (cons (read-line stream nil 0) *words*))
> > > (if (numberp (first *words*))
> > > (rest *words*)
> > > (read-loop stream)))
> > > (with-open-file (stream filename :direction :input)
> > > (read-loop stream))))
> > >
> > > The function works with files containing less than around 10,000 words,
> > > but no matter what I do, whenever I call this function with a file
> > > containing more words, I get a stack overflow or a core dump. How can I
> > > do this with larger files? I feel like there must be a better way.
> > >
> > > Thank you for your time.
> >
> > It looks like your CL system cannot do teil recursion optimization.
>
> This is why I'm am in such shock, because neither this, nor any other
> tail-recursive version of my function, seems to work in either CLISP or
> ACL (on Red Hat 6.1). I was under the impression that these systems
> would optimize tail-recursion. The non-recursive version is nice, but I
> was really hoping to do it recursively.
Hmmm... quite sure that ACL does optimize tail recursion.
Maybe read-loop was not compiled?
Regards,
Michael.
--
Michael Kappert
Fraunhofer IITB
Fraunhoferstr. 1 Phone: +49(0)721/6091-477
D-76131 Karlsruhe, Germany EMail: ···@iitb.fhg.de
Joseph Dale <·····@uclink4.berkeley.edu> writes:
> This is why I'm am in such shock, because neither this, nor any other
> tail-recursive version of my function, seems to work in either CLISP or
> ACL (on Red Hat 6.1). I was under the impression that these systems
> would optimize tail-recursion. The non-recursive version is nice, but I
> was really hoping to do it recursively.
Why recursively? Why lists? These seem like very bad choices for
large file sizes...
If indeed you need a list, what's wrong with
(defun get-words-from-file (filename)
(with-open-file (stream filename)
(loop for line = (read-line stream nil nil)
while line
collect line)))
or for an array
(defun get-words-from-line (filename)
(with-open-file (stream filename)
(let ((line-array (make-array 1000 :adjustable t :fill-pointer 0)))
(do ((line (read-line stream nil nil) (read-line stream nil nil)))
((null line) line-array)
(vector-push-extend line line-array)))))
or any other iterative variant. Or if it indeed has to be recursive
(again why? Reading a list/array of lines from a file is not an
intuitively recursive operation, so why make it one?), at least use
labels instead of such ugly Scheme-isms like nested defuns...
> Marco Antoniotti wrote:
>
> > Anyway, a non-recursive version is here
> >
> > (defun get-words (filename)
> > (with-open-file (f filename :direction :input)
> > (loop for line = (read-line f nil '$$get-word-eof$$)
> > when (eq line '$$get-word-eof$$)
> > do (return words)
> > collect line into words)))
See above, the normal read eof-value hack is not needed for read-line,
since read-line will always return strings, which are disjunct from
any symbols (like nil). Also why the collect into with a when/do
return? Use unless/while, and let the loop return the collected list
automatically. If you've got to use LOOP, use it wisely, some might
say ;)
BTW: With an extensible LOOP, you can do things like this (code on
request):
(with-open-file (stream filename)
(loop for line being the lines of stream
collect line))
Or indeed
(with-open-file (stream filename)
(loop for (name age salary . info) being the lines of stream split-by #\Tab
collect (list name info (/ salary age))))
Regs, Pierre.
--
Pierre Mai <····@acm.org> PGP and GPG keys at your nearest Keyserver
"One smaller motivation which, in part, stems from altruism is Microsoft-
bashing." [Microsoft memo, see http://www.opensource.org/halloween1.html]
····@acm.org (Pierre R. Mai) writes:
> > Marco Antoniotti wrote:
> >
> > > Anyway, a non-recursive version is here
> > >
> > > (defun get-words (filename)
> > > (with-open-file (f filename :direction :input)
> > > (loop for line = (read-line f nil '$$get-word-eof$$)
> > > when (eq line '$$get-word-eof$$)
> > > do (return words)
> > > collect line into words)))
>
> See above, the normal read eof-value hack is not needed for read-line,
> since read-line will always return strings, which are disjunct from
> any symbols (like nil). Also why the collect into with a when/do
> return? Use unless/while, and let the loop return the collected list
> automatically. If you've got to use LOOP, use it wisely, some might
> say ;)
Yep, I stand corrected. I was on automatic... :{
Cheers
--
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
http://www.parades.rm.cnr.it/~marcoxa
From: Robert Monfera
Subject: Re: beginner question on input from large files
Date:
Message-ID: <38A2D971.684E5A85@fisec.com>
Are you sure you have compiled your function? I don't think
tail-recursion is optimized in the interpreter. It's easy _not_ to
compile with the trial version, because loading the file won't do it. I
guess you do file loading, because you used PROCLAIM instead of
DECLARE. You could use EXCL:LOAD-COMPILED on the file or COMPILE the
function, and you can check your function with COMPILED-FUNCTION-P (?)
if it is actually compiled.
Others wrote about declarations already.
Robert
Joseph Dale wrote:
> The function works with files containing less than around 10,000 words,
> but no matter what I do, whenever I call this function with a file
> containing more words, I get a stack overflow or a core dump.
Robert Monfera wrote:
>
> Are you sure you have compiled your function? I don't think
> tail-recursion is optimized in the interpreter. It's easy _not_ to
> compile with the trial version, because loading the file won't do it. I
> guess you do file loading, because you used PROCLAIM instead of
> DECLARE. You could use EXCL:LOAD-COMPILED on the file or COMPILE the
> function, and you can check your function with COMPILED-FUNCTION-P (?)
> if it is actually compiled.
>
Thanks!! That did it.
Joe
From: Erik Naggum
Subject: Re: beginner question on input from large files
Date:
Message-ID: <3159210673593055@naggum.no>
* Joseph Dale <·····@uclink4.berkeley.edu>
| I'm trying to write a function that takes the name of a dictionary file
| containing one word per line and returns the contents of that file as a
| list of strings.
first off, it's a reasonable thing to do.
| The latest of several versions is this one (I know it's not very elegant):
au contraire, it's _too_ elegant.
| (defun get-words (filename)
| (let ((*words* '()))
| (defun read-loop (stream)
| (setf *words* (cons (read-line stream nil 0) *words*))
| (if (numberp (first *words*))
| (rest *words*)
| (read-loop stream)))
| (with-open-file (stream filename :direction :input)
| (read-loop stream))))
whether this is compiled or not, this is needlessly complex and "elegant"
only in the Scheme sense, which is not elegant at all in Common Lisp.
(defun stream-lines (input-stream)
(do ((lines nil)
(line #1=(read-line input-stream nil :eof) #1#))
((eq line :eof) (nreverse lines))
(push line lines)))
if you're not as silly as to have an allergic reaction to LOOP, this is
even better:
(defun stream-lines (input-stream)
(loop for line = (read-line input-stream nil :eof)
until (eq line :eof)
collect line))
using a number for magic purposes and then testing for any number is
_really_ bad style. where did you pick this up? a Perl class?
| The function works with files containing less than around 10,000 words,
| but no matter what I do, whenever I call this function with a file
| containing more words, I get a stack overflow or a core dump. How can I
| do this with larger files? I feel like there must be a better way.
recursion has its uses. using recursion for iteration is not one of
them. Scheme educators got that one _seriously_ confused, and your code
is simply such Scheme code in Common Lisp. this is _not_ a reasonable
thing to do, not even in Scheme.
my suggestions:
1 get rid of the internal define.
2 use iteration forms when that's what you do.
3 use recursion for inherently recursive tasks, only.
4 learn Common Lisp if you want to think in Common Lisp.
5 use Scheme if you think in Scheme.
writing Scheme code in Common Lisp is also really bad for your karma.
#:Erik