From: Jacek Generowicz
Subject: Speed: a column of floats from large file
Date: 
Message-ID: <m27jftjzjv.fsf@genera.local>
I have some ASCII files containing columns of data. I am interested in
the 11th column which contains floating point numbers between 0 and
1. I need to histogram the numbers found in that column. The files can
be up to a few Gb in size.

Here's one of my shots at it (I'm relying on the knowledge that the
11th column of data always starts in the 55th column of text):

(defun doit ()
  (let* ((number-of-bins 100)
         (table (make-array number-of-bins :initial-element 0))
         (binsize (coerce (/ 1 number-of-bins) 'float)))
    ;; Read data from file and stuff into bins
    (with-open-file (data *data-file*)
      (handler-case
          (loop
            (with-input-from-string (the-float (read-line data) :start 55)
              ;; MIN guards against a value of exactly 1.0, which would
              ;; otherwise index one past the end of the table.
              (incf (aref table
                          (min (1- number-of-bins)
                               (floor (read the-float) binsize))))))
        (end-of-file ())))
    ;; Write histogram out to file
    (with-open-file (out *histogram-file*
                         :direction :output
                         :if-exists :supersede)
      (loop for x from 0.0 by binsize
            for frequency across table
            do (format out "~&~,4f ~d" x frequency)))))

On a 100000 line file, the above takes under 2s in LW, about 8 in SBCL
and about 20 in OpenMCL (1GHz G4, Mac OS X, Tiger).

I'd appreciate hints on how I can improve on this.

In LW, about half of the time is spent in READ. Is it possible to
speed this up by specifying somehow that a float is being read, rather
than letting READ figure it out for itself each time?

In SBCL, about 2/3 of the time is spent in READ-LINE (which only takes
about 20% of the time on LW).

(I didn't profile OpenMCL.)

Should I be looking at some approach based on READ-SEQUENCE?

From: Nikhil Ketkar
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <1121352492.457580.257510@g47g2000cwa.googlegroups.com>
What would happen if you preprocess the input file into list format
first and then read it? Use awk to extract the 11th column, put an
opening parenthesis at the beginning and a closing one at the end so
that the entire column becomes a single list, and read it using
something like this:

(defun read-column (column-file)
  (with-open-file (in column-file :direction :input)
    ;; The whole parenthesized column comes back as one list.
    (read in)))
From: Edi Weitz
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <u3bqh34ha.fsf@agharta.de>
On Thu, 14 Jul 2005 15:19:00 +0200, Jacek Generowicz <················@cern.ch> wrote:

> In LW, about half of the time is spent in READ. Is it possible to
> speed this up by specifying somehow that a float is being read,
> rather than letting READ figure it out for itself each time?

Maybe using a specialized function/library can help:

  <http://www-cgi.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/math/atof/atof.cl>
  <http://www.cliki.net/PARSE-NUMBER>

Cheers,
Edi.

-- 

Lisp is not dead, it just smells funny.

Real email: (replace (subseq ·········@agharta.de" 5) "edi")
From: Jacek Generowicz
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <m21x61jxi4.fsf@genera.local>
Edi Weitz <········@agharta.de> writes:

> On Thu, 14 Jul 2005 15:19:00 +0200, Jacek Generowicz <················@cern.ch> wrote:
>
>> In LW, about half of the time is spent in READ. Is it possible to
>> speed this up by specifying somehow that a float is being read,
>> rather than letting READ figure it out for itself each time?
>
> Maybe using a specialized function/library can help:
>
>   <http://www-cgi.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/math/atof/atof.cl>

This one improves OpenMCL from 20s to about 4s. But SBCL (8s -> 11s)
and LW (2s -> 4s) get worse.

>   <http://www.cliki.net/PARSE-NUMBER>

Haven't tried this one yet.
From: Jacek Generowicz
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <m2wtntii9r.fsf@genera.local>
Jacek Generowicz <················@cern.ch> writes:

>>   <http://www-cgi.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/math/atof/atof.cl>
>
> This one improves OpenMCL from 20s to about 4s. But SBCL (8s -> 11s)
> and LW (2s -> 4s) get worse.

Sorry, SBCL doesn't change significantly.
From: Christophe Rhodes
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <sqbr55xxnm.fsf@cam.ac.uk>
Jacek Generowicz <················@cern.ch> writes:

> Jacek Generowicz <················@cern.ch> writes:
>
>>>   <http://www-cgi.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/math/atof/atof.cl>
>>
>> This one improves OpenMCL from 20s to about 4s. But SBCL (8s -> 11s)
>> and LW (2s -> 4s) get worse.
>
> Sorry, SBCL doesn't change significantly.

Well, no, because as you said the bottleneck for SBCL is in READ-LINE,
not in parsing the float.

Now, in SBCL these days there is quite aggressively pessimized support
for external formats on character streams: the characters you get back
when you read from a character stream depend on the encoding you have
asked for (whether explicitly through an :external-format argument, or
implicitly through your locale setting).

So the first question is likely to be what encoding you're getting on
your stream.  If you're not getting iso-8859-1, then there are likely
to be per-character checks or conversion functions while converting
octets to characters: if your external-format is :ascii (or a
synonym), sbcl checks that the octets you're getting are smaller than
128; if it's :utf-8, then sbcl must likewise check for smaller than
128ness to see if the octet represents an entire character.  So one
way of mitigating READ-LINE (or other character input) slowness might
be, if you're not already, to explicitly ask for iso-8859-1
external-format.
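As a minimal sketch of that suggestion (*data-file* is the original
poster's variable; the line-counting body is just for illustration):

```lisp
;; Ask for a single-byte external format explicitly, so SBCL can skip
;; the per-octet UTF-8 checks when converting octets to characters.
(with-open-file (data *data-file* :external-format :iso-8859-1)
  (loop for line = (read-line data nil)
        while line
        count line))
```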

Once you do that, the bottleneck is likely to be in parsing the float.
SBCL includes a reasonably fast float parser, but there are faster
ones around: there are some classic papers on reading and printing
floats quickly and accurately.  (You could start from the paper used
to implement SBCL's float printer, which contains useful references:
Burger and Dybvig, 1996 -- there might also be shortcuts possible if
you know that a float was printed using a particular printing routine
on a particular implementation of IEEE 754).

Christophe
From: Jacek Generowicz
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <m28y091kkv.fsf@genera.local>
Christophe Rhodes <·····@cam.ac.uk> writes:

> Now, in SBCL these days there is quite aggressively pessimized support
> for external formats on character streams: the characters you get back
> when you read from a character stream depend on the encoding you have
> asked for (whether explicitly through an :external-format argument, or
> implicitly through your locale setting).
>
> So the first question is likely to be what encoding you're getting on
> your stream.

utf-8

> If you're not getting iso-8859-1, then there are likely to be
> per-character checks or conversion functions while converting octets
> to characters: if your external-format is :ascii (or a synonym),
> sbcl checks that the octets you're getting are smaller than 128; if
> it's :utf-8, then sbcl must likewise check for smaller than 128ness
> to see if the octet represents an entire character.  So one way of
> mitigating READ-LINE (or other character input) slowness might be,
> if you're not already, to explicitly ask for iso-8859-1
> external-format.

Yes, this helps.

Before:

  seconds  |   consed   |  calls  |  sec/call  |  name  
----------------------------------------------------------
     5.457 | 36,815,224 | 100,001 |   0.000055 | READ-LINE
     1.343 | 16,541,856 | 200,210 |   0.000007 | COERCE
     1.042 | 16,801,016 | 100,010 |   0.000010 | READ

After:

  seconds  |   consed   |  calls  |  sec/call  |  name  
----------------------------------------------------------
     3.302 | 36,815,160 | 100,001 |   0.000033 | READ-LINE
     1.268 | 16,542,064 | 200,212 |   0.000006 | COERCE
     1.196 | 16,801,624 | 100,011 |   0.000012 | READ

> Once you do that, the bottleneck is likely to be in parsing the
> float.

Not yet: READ-LINE still seems to be the biggest hog.

Thanks for your explanation. This is exactly the sort of thing I was
hoping to learn.
From: Christophe Rhodes
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <sq7jftxukg.fsf@cam.ac.uk>
Jacek Generowicz <················@cern.ch> writes:

> Christophe Rhodes <·····@cam.ac.uk> writes:
>
>> Once you do that, the bottleneck is likely to be in parsing the
>> float.
>
> Not yet: READ-LINE still seems to be the biggest hog.

Darn.  Now it gets trickier.  :-/

This is /probably/ because, to be safe, READ-LINE must allocate a
fresh string of the most permissive storage type: so that, after
reading, arbitrary characters can be stored in the string.  (Actually,
I can't see a guarantee of that in the CLHS on a quick skim, but I
think people would be surprised if the return value from READ-LINE
wasn't a fresh string, or was specialized on a narrow character type.)
In full-Unicode-enabled SBCLs, this means allocating four bytes per
character.  If you can live without full Unicode support, building
SBCL without the :sb-unicode feature might give you some speed back,
as string-returning functions should cons less by a factor of four.
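A quick way to check which kind of build you are running is to look
for the feature on *FEATURES*:

```lisp
;; Non-NIL if this SBCL was built with :sb-unicode, i.e. with
;; four-bytes-per-character strings.
(member :sb-unicode *features*)
```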

One potential waste of time lies in the function to read lines if your
lines have more than 80 characters: the buffer for read-line starts
out being 80 characters long; if the line is longer, then it allocates
another, larger buffer and copies the current contents.  If all your
lines are 81 characters long, you might try changing the
ansi-stream-read-line function (but note that READ-LINE would also
have to be recompiled, because ansi-stream-read-line is declared
inline).

If you're on an x86 or x86-64, you could try using the statistical
profiler (<http://www.sbcl.org/manual/Statistical-Profiler.html>) to
attempt to identify the bottleneck within read-line itself (you can
also try on non-x86-based platforms, but it appears that there's a
difficult-to-isolate bug in the signal handling which interacts with
the garbage collection on those platforms that sometimes causes a
deadlock).  This might provide further insight into where exactly the
performance problem lies.

Christophe
From: Pete Kazmier
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <87oe95jfq7.fsf@coco.kazmier.com>
Christophe Rhodes <·····@cam.ac.uk> writes:

> If you're on an x86 or x86-64, you could try using the statistical
> profiler (<http://www.sbcl.org/manual/Statistical-Profiler.html>) to
> attempt to identify the bottleneck within read-line itself

I posted the results of this profiling when I had the same question
regarding the speed of READ-LINE to the SBCL list back in January of
this year.  I never made any headway even with an SBCL compiled
without unicode support.  It was just too slow and it was beyond my
capabilities to identify the bottleneck (I'm still a CL newbie and not
familiar with the internals of any CL implementation).  Here is my
original email with the profiling included:

http://sourceforge.net/mailarchive/message.php?msg_id=10625029

I'd love to be able to figure out how to speed up the performance of
READ-LINE so I can start using SBCL at work.  I'm just not sure how to
interpret the results of my profiling.  In the end, I switched to
CMUCL and used the WITH-FAST-STRING-READER macro by Bulent Murtezaoglu
which really improved things:

http://groups-beta.google.com/group/comp.lang.lisp/browse_frm/thread/ecfa7e65c62cfd1a/f22b458b22598666?q=bulent+%22with-fast-string-reader%22&rnum=1&hl=en#f22b458b22598666

So now I use SBCL at home and CMUCL at work because it was faster in
this case (parsing lots of log files) and gave results that were at
least comparable to perl/python scripts.

Pete
From: Christophe Rhodes
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <sqeka1ull1.fsf@cam.ac.uk>
Pete Kazmier <·············@kazmier.com> writes:

> Christophe Rhodes <·····@cam.ac.uk> writes:
>
>> If you're on an x86 or x86-64, you could try using the statistical
>> profiler (<http://www.sbcl.org/manual/Statistical-Profiler.html>) to
>> attempt to identify the bottleneck within read-line itself
>
> I posted the results of this profiling when I had the same question
> regarding the speed of READ-LINE to the SBCL list back in January of
> this year.  I never made any headway even with an SBCL compiled
> without unicode support.  It was just too slow and it was beyond my
> capabilities to identify the bottleneck (I'm still a CL newbie and not
> familiar with the internals of any CL implementation).  Here is my
> original email with the profiling included:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=10625029
>
> I'd love to be able to figure out how to speed up the performance of
> READ-LINE so I can start using SBCL at work.  I'm just not sure how to
> interpret the results of my profiling.

Thanks for bringing this (back :-) to my attention.

            Self        Cumul        Total
   Nr  Count     %  Count     %  Count     % Function
 ------------------------------------------------------------------------
    1    400  27.8    460  31.9    400  27.8 SB-KERNEL:HAIRY-DATA-VECTOR-SET
    2    289  20.1   1084  75.2    689  47.8 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII
    3    241  16.7    324  22.5    930  64.5 ARRAY-DIMENSION

This is telling us that 75% of the time is spent in
FD-STREAM-READ-N-CHARACTERS/ASCII and callees (and so that you're
likely using an ascii external format for this profile run), and that
about half of that time (32%) is likely spent calling
HAIRY-DATA-VECTOR-SET.  HAIRY-DATA-VECTOR-SET is called for (SETF
AREF) when the compiler didn't have any information about the
simpleness and specialized array element type of the destination
array, and tends to be slow because it's one huge ugly typecase (which
SBCL currently does not optimize to a constant-time jump-table).

<http://paste.lisp.org/display/9894> contains a candidate patch which
might improve the situation; it would be good if interested parties
could test it against their workloads and report any effect.  (For
extra kicks, try disassembling one of the high-count functions after
doing the statistical profile :-)

Christophe
From: Pete Kazmier
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <878y08jokw.fsf@coco.kazmier.com>
Christophe Rhodes <·····@cam.ac.uk> writes:

> Thanks for bringing this (back :-) to my attention.
>
>             Self        Cumul        Total
>    Nr  Count     %  Count     %  Count     % Function
>  ------------------------------------------------------------------------
>     1    400  27.8    460  31.9    400  27.8 SB-KERNEL:HAIRY-DATA-VECTOR-SET
>     2    289  20.1   1084  75.2    689  47.8 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII
>     3    241  16.7    324  22.5    930  64.5 ARRAY-DIMENSION
>
> This is telling us that 75% of the time is spent in
> FD-STREAM-READ-N-CHARACTERS/ASCII and callees (and so that you're
> likely using an ascii external format for this profile run), and that
> about half of that time (32%) is likely spent calling
> HAIRY-DATA-VECTOR-SET.  HAIRY-DATA-VECTOR-SET is called for (SETF
> AREF) when the compiler didn't have any information about the
> simpleness and specialized array element type of the destination
> array, and tends to be slow because it's one huge ugly typecase (which
> SBCL currently does not optimize to a constant-time jump-table).

Thank you for explaining the details.

> <http://paste.lisp.org/display/9894> contains a candidate patch which
> might improve the situation; it would be good if interested parties
> could test it against their workloads and report any effect.

I built a new SBCL with the patch and the results are definitely
better. 

Without the patch:

CL-USER[3]: (time (dotimes (i 10) (parse-file-2 "/tmp/data")))
Evaluation took:
  23.542 seconds of real time
  19.549028 seconds of user run time
  1.352794 seconds of system run time
  0 page faults and
  254,086,856 bytes consed.

           Self        Cumul        Total
  Nr  Count     %  Count     %  Count     % Function
------------------------------------------------------------------------
   1     47  24.9     60  31.7     47  24.9 (SB-C::TL-XEP SB-KERNEL:HAIRY-DATA-VECTOR-SET)
   2     41  21.7    139  73.5     88  46.6 (SB-C::TL-XEP SB-IMPL::FD-STREAM-READ-N-CHARACTERS/LATIN-1)
   3     40  21.2    181  95.8    128  67.7 (SB-C::HAIRY-ARG-PROCESSOR READ-LINE)
   4     32  16.9     38  20.1    160  84.7 (SB-C::TL-XEP ARRAY-DIMENSION)
   5     19  10.1     19  10.1    179  94.7 (SB-C::TL-XEP ARRAY-RANK)

And with the patch:

CL-USER[3]: (time (dotimes (i 10) (parse-file-2 "/tmp/data")))
Evaluation took:
  7.783 seconds of real time
  6.419024 seconds of user run time
  1.346795 seconds of system run time
  0 page faults and
  254,084,160 bytes consed.

           Self        Cumul        Total
  Nr  Count     %  Count     %  Count     % Function
------------------------------------------------------------------------
   1     29  44.6     56  86.2     29  44.6 (SB-C::HAIRY-ARG-PROCESSOR READ-LINE)
   2     17  26.2     25  38.5     46  70.8 (SB-C::TL-XEP SB-IMPL::FD-STREAM-READ-N-CHARACTERS/LATIN-1)
   3      8  12.3      8  12.3     54  83.1 "foreign function __read"
   4      5   7.7      5   7.7     59  90.8 "no debug information for frame"
   5      2   3.1      2   3.1     61  93.8 "foreign function pthread_sigmask"

PARSE-FILE-2 is defined as:

(defun parse-file-2 (file)
  (let ((count 0))
    (with-open-file (in file :external-format :iso-8859-1)
      (do ((line (read-line in nil :eof) (read-line in nil :eof)))
          ((eql line :eof) t)
        (incf count)))
    count))

It was reading a log file with about 75k lines that were less than 80
chars in length.  Is there any way to improve it further with another
2 line patch :)

Thanks,
Pete
From: Christophe Rhodes
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <sqpstkp9sw.fsf@cam.ac.uk>
Pete Kazmier <··················@kazmier.com> writes:

> Christophe Rhodes <·····@cam.ac.uk> writes:
>> <http://paste.lisp.org/display/9894> contains a candidate patch which
>> might improve the situation; it would be good if interested parties
>> could test it against their workloads and report any effect.
>
> I built a new SBCL with the patch and the results are definitely
> better. 
>
> Without the patch:
>
> CL-USER[3]: (time (dotimes (i 10) (parse-file-2 "/tmp/data")))
> Evaluation took:
>   23.542 seconds of real time
>
> And with the patch:
>
> CL-USER[3]: (time (dotimes (i 10) (parse-file-2 "/tmp/data")))
> Evaluation took:
>   7.783 seconds of real time
>
> It was reading a log file with about 75k lines that were less than 80
> chars in length.  Is there any way to improve it further with another
> 2 line patch :)

Almost certainly! :-)

A two-line patch which might well improve things is to add
  (declare (optimize speed (safety 0)))
to the definitions of ,in-function in src/code/fd-stream.lisp.  I'm
not sure that I would be entirely comfortable with that, but if speed
is your need, that might help, since you're still spending 26% of the
time in the ,in-function.

I'm afraid the rest of the time is slightly trickier to deal with;
there's not a lot we can do about __read(), while
ANSI-STREAM-READ-LINE (which is what READ-LINE ends up doing, and is
expanded inline) already uses the fast-read-char mechanism.  It's
possible that an index[*] declaration for the index variable in
ANSI-STREAM-READ-LINE could make things a little better: it might save
a few instructions in the inner loop.

Christophe

[*] sb-int:index is a type corresponding to valid array indices, and
in this case has the useful property that adding one to an object of
this type is guaranteed to return a fixnum: this means that a type
check for (incf index) is easy to implement inline.
From: Pete Kazmier
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <873bqgjiba.fsf@coco.kazmier.com>
Christophe Rhodes <·····@cam.ac.uk> writes:

> Pete Kazmier <··················@kazmier.com> writes:
>
>> It was reading a log file with about 75k lines that were less than 80
>> chars in length.  Is there any way to improve it further with another
>> 2 line patch :)
>
> Almost certainly! :-)
>
> A two-line patch which might well improve things is to add
>   (declare (optimize speed (safety 0)))
> to the definitions of ,in-function in src/code/fd-stream.lisp.  I'm
> not sure that I would be entirely comfortable with that, but if speed
> is your need, that might help, since you're still spending 26% of the
> time in the ,in-function.

Ok, I've just started a build with those additional declarations.
Could you explain to the uninformed why you would not be comfortable
making those changes?  I mean if you, one of the developers, aren't
comfortable, I get the feeling that I probably should not be doing it
either, especially for production code at work.  What types of bad
things can happen with safety disabled in that part of the code?

> while ANSI-STREAM-READ-LINE (which is what READ-LINE ends up doing,
> and is expanded inline) already uses the fast-read-char mechanism.

I was hoping that the "HAIRY" part of HAIRY-ARG-PROCESSOR READ-LINE
meant that it was just some ambiguity that could be fixed with some
additional declaration somewhere (like the HAIRY-DATA-VECTOR-SET).

> It's possible that an index[*] declaration for the index variable in
> ANSI-STREAM-READ-LINE could make things a little better: it might
> save a few instructions in the inner loop.

I'll also give this a try as well to see if this provides any
significant improvement.

Thanks again,
Pete
From: Pete Kazmier
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <87hdewhx6t.fsf@coco.kazmier.com>
Pete Kazmier <··················@kazmier.com> writes:

> Christophe Rhodes <·····@cam.ac.uk> writes:
>
>> A two-line patch which might well improve things is to add
>>   (declare (optimize speed (safety 0)))
>> to the definitions of ,in-function in src/code/fd-stream.lisp.  I'm
>> not sure that I would be entirely comfortable with that, but if speed
>> is your need, that might help, since you're still spending 26% of the
>> time in the ,in-function.
>
> Ok, I've just started a build with those additional declarations.
> Could you explain to the uninformed why you would not be comfortable
> making those changes?  I mean if you, one of the developers, aren't
> comfortable, I get the feeling that I probably should not be doing it
> either, especially for production code at work.  What types of bad
> things can happen with safety disabled in that part of the code?

The above declarations improved speed by a little more than half a
second:

CL-USER[3]: (time (dotimes (i 10) (parse-file-2 "/tmp/data")))
Evaluation took:
  7.184 seconds of real time
  5.710132 seconds of user run time
  1.375791 seconds of system run time
  0 page faults and
  254,091,472 bytes consed.

>> It's possible that an index[*] declaration for the index variable in
>> ANSI-STREAM-READ-LINE could make things a little better: it might
>> save a few instructions in the inner loop.
>
> I'll also give this a try as well to see if this provides any
> significant improvement.

Declaring index as an index also improved the situation by about a
half second (this sbcl build did not include the first set of safety
declarations):

CL-USER[3]: (time (dotimes (i 10) (parse-file-2 "/tmp/data")))
Evaluation took:
  7.34 seconds of real time
  5.902103 seconds of user run time
  1.422784 seconds of system run time
  0 page faults and
  254,090,376 bytes consed.

So I suppose that I might be able to trim off a full second with both
optimizations.
From: Christophe Rhodes
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <sq7jfs11ho.fsf@cam.ac.uk>
Pete Kazmier <··················@kazmier.com> writes:

> Christophe Rhodes <·····@cam.ac.uk> writes:
>
>> Pete Kazmier <··················@kazmier.com> writes:
>>
>>> It was reading a log file with about 75k lines that were less than 80
>>> chars in length.  Is there any way to improve it further with another
>>> 2 line patch :)
>>
>> Almost certainly! :-)
>>
>> A two-line patch which might well improve things is to add
>>   (declare (optimize speed (safety 0)))
>> to the definitions of ,in-function in src/code/fd-stream.lisp.  I'm
>> not sure that I would be entirely comfortable with that, but if speed
>> is your need, that might help, since you're still spending 26% of the
>> time in the ,in-function.
>
> Ok, I've just started a build with those additional declarations.
> Could you explain to the uninformed why you would not be comfortable
> making those changes?  I mean if you, one of the developers, aren't
> comfortable, I get the feeling that I probably should not be doing it
> either, especially for production code at work.  What types of bad
> things can happen with safety disabled in that part of the code?

I, as a developer, rely on being able to interpret bug reports easily;
if it so happens that one of the type declarations is off-by-one, say,
a bug report that says "I got a TYPE-ERROR in FD-STREAMS-READ-FOO when
I did BAR" is a lot easier to decipher than one which says "I got
random memory corruption sometime after doing QUUX, BAR, BAZ and
FROB".  In an ideal world, sbcl developers never make mistakes with
their type declarations, but I'm not sure that we live in the best of
all possible worlds, so builds for general use accept a small speed
penalty for the extra safety net.

Christophe
From: Jacek Generowicz
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <m24qax0yng.fsf@genera.local>
Christophe Rhodes <·····@cam.ac.uk> writes:

> Jacek Generowicz <················@cern.ch> writes:
>
> > READ-LINE still seems to be the biggest hog.
>
> Darn.  Now it gets trickier.  :-/
>
> This is /probably/ because, to be safe, READ-LINE must allocate a
> fresh string of the most permissive storage type: so that, after
> reading, arbitrary characters can be stored in the string.

In which case using READ-SEQUENCE could well be profitable (as long as
all the lines are the same length ... hmm, most are 69, but a tiny
fraction has 70 characters).

> One potential waste of time lies in the function to read lines if your
> lines have more than 80 characters: the buffer for read-line starts
> out being 80 characters long; if the line is longer, then it allocates
> another, larger buffer and copies the current contents.  If all your
> lines are 81 characters long, you might try changing the
> ansi-stream-read-line function (but note that READ-LINE would also
> have to be recompiled, because ansi-stream-read-line is declared
> inline).

My lines are around 70 chars, so that's probably not worth it.

> If you're on an x86 or x86-64,

Alas, no.

> you could try using the statistical profiler [...]
> on non-x86-based platforms, but it appears that there's a
> difficult-to-isolate bug in the signal handling which interacts with
> the garbage collection on those platforms that sometimes causes a
> deadlock).

Been there, done that, got the T-shirt. Deadlocks pretty much every
time. Though I'm not sure I've tried it since moving to 0.9.0.
From: JP Massar
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <nened1d2e5krhplr5lg79absbberqvofb9@4ax.com>
On Thu, 14 Jul 2005 15:19:00 +0200, Jacek Generowicz
<················@cern.ch> wrote:
 
>
>Should I be looking at some approach based on READ-SEQUENCE?

You could try reading millions of characters in at a time using
read-sequence.  (You'd have to deal with split lines unless
your lines are all the same length.)
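
A rough sketch of that idea (the function name, buffer size and the
split-line handling are illustrative, not tuned):

```lisp
(defun map-lines-via-read-sequence (function file
                                    &optional (buffer-size (* 1024 1024)))
  "Call FUNCTION on each line of FILE, reading in large chunks with
READ-SEQUENCE.  A partial line left at the end of one chunk is carried
over and prepended to the next chunk."
  (let ((buffer (make-string buffer-size))
        (carry ""))
    (with-open-file (in file)
      (loop for end = (read-sequence buffer in)
            until (zerop end)
            do (let ((chunk (concatenate 'string carry (subseq buffer 0 end)))
                     (start 0))
                 ;; Hand over every complete line in this chunk.
                 (loop for nl = (position #\Newline chunk :start start)
                       while nl
                       do (funcall function (subseq chunk start nl))
                          (setf start (1+ nl)))
                 ;; Whatever is left had no newline yet: carry it over.
                 (setf carry (subseq chunk start)))))
    ;; A final line without a trailing newline.
    (unless (string= carry "")
      (funcall function carry))))
```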

Also, you say your float numbers are between 0 and 1.  So if they
aren't stored using exponential notation, you could read them not as
floats but as integers, ignoring the initial decimal point, and then
divide by the appropriate divisor based on the length to get your
float.  (Actually, you can multiply by the precomputed reciprocal
instead of dividing, which might be faster.)
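
A sketch of that trick, assuming every number is printed as "0.dddd"
with exactly four digits after the point (the function name and digit
count are illustrative):

```lisp
(let ((reciprocal (/ 1.0 (expt 10 4))))  ; precomputed once, for 4 digits
  (defun parse-unit-float (line &optional (start 55))
    "Parse a float printed as 0.dddd at position START without READ:
accumulate the digits after the decimal point as an integer, then
scale by the precomputed reciprocal instead of dividing."
    (let ((int 0))
      (loop for i from (+ start 2) below (+ start 6)  ; skip the \"0.\"
            do (setf int (+ (* 10 int) (digit-char-p (char line i)))))
      (* int reciprocal))))
```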
 
From: Debian User
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <42d77ad1$0$82411$dbd43001@news.wanadoo.nl>
It might be interesting to have a look at K. Anderson's paper:
Performing Lisp: What about I/O?

This paper discusses similar problems and performance questions.
From: Jacek Generowicz
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <m2zmsnzsdl.fsf@genera.local>
Debian User <···@ddd.dd> writes:

> IT might be interesting to have a look at K. Anderson's paper:
> Performing Lisp: What about I/O?

Do you have a reference? I'm not having much luck finding this.
From: Pascal Bourguignon
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <87y887d5lx.fsf@thalassa.informatimago.com>
Jacek Generowicz <················@cern.ch> writes:

> Debian User <···@ddd.dd> writes:
>
>> IT might be interesting to have a look at K. Anderson's paper:
>> Performing Lisp: What about I/O?
>
> Do you have a reference? I'm not having much luck finding this.

http://justfuckinggoogleit.com/search?q=Performing+Lisp+What+about+I%2FO

Or perhaps you mean that all the google servers were down just when you tried it?

-- 
__Pascal_Bourguignon__               _  Software patents are endangering
()  ASCII ribbon against html email (o_ the computer industry all around
/\  1962:DO20I=1.100                //\ the world http://lpf.ai.mit.edu/
    2001:my($f)=`fortune`;          V_/   http://petition.eurolinux.org/
From: Jacek Generowicz
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <m2r7dwz1a4.fsf@genera.local>
Pascal Bourguignon <···@informatimago.com> writes:

> Jacek Generowicz <················@cern.ch> writes:
>
>> Debian User <···@ddd.dd> writes:
>>
>>> IT might be interesting to have a look at K. Anderson's paper:
>>> Performing Lisp: What about I/O?
>>
>> Do you have a reference? I'm not having much luck finding this.
>
> http://justfuckinggoogleit.com/search?q=Performing+Lisp+What+about+I%2FO
>
> Or perhaps you mean that all the google servers were down just when
> you tried it?

No, the one I tried was up, but the searches I performed didn't come
up with anything stunningly relevant. Maybe I was too enthusiastic on
grouping things in double quotes.
From: Debian User
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <42d8becf$0$82405$dbd43001@news.wanadoo.nl>
On Fri, 15 Jul 2005 23:11:50 +0200, Jacek Generowicz
<················@cern.ch> wrote:

> Debian User <···@ddd.dd> writes:
> 
>> IT might be interesting to have a look at K. Anderson's paper:
>> Performing Lisp: What about I/O?
> 
> Do you have a reference? I'm not having much luck finding this.

This is the paper:
http://openmap.bbn.com/~kanderso/lisp/performing-lisp/pl-io.ps


Also have a look in the papers section under:
openmap.bbn.com/~kanderso/performance/
From: Jacek Generowicz
Subject: Re: Speed: a column of floats from large file
Date: 
Message-ID: <m2slycz1ah.fsf@genera.local>
Debian User <···@ddd.dd> writes:

> Thisa is the paper:
> http://openmap.bbn.com/~kanderso/lisp/performing-lisp/pl-io.ps

Thanks.

> Have a also look in the papers section under:
> openmap.bbn.com/~kanderso/performance/

Yup, I knew about those already.

Cheers.