From: Bob Felts
Subject: Lisp and very large arrays
Date: 
Message-ID: <1ifuhu4.av99ve1t4nyg0N%wrf3@stablecross.com>
I have a friend who took a Perl script and recoded it in C for a huge
gain in performance.  I've been telling him how much I like Lisp and, in
an attempt to do a little bit of Lisp evangelism, I offered to recode
his C into Lisp.

And I've hit a brick wall.  The program needs four large arrays: two
arrays of (* 128 1024 1024) elements of unsigned bytes, and two arrays
of (* 128 1024 1024) elements of type bit.

LispWorks Personal 5.0.2 on Mac OS X won't even create the arrays;
they're too large.

sbcl 1.0.15.13 on Mac OS X (10.5.2, Intel) gives inconsistent behavior.

If I do:

   (defun main ( )
      (let* ((size (* 128 1024 1024)))
               (array1 (make-array size
                                    :element-type 'unsigned-byte
                                    :initial-element 0))
               (array2 (make-array size
                                    :element-type 'bit
                                    :initial-element 0))
         (array3 ...
               (array4 ...))
       ...))

then the code essentially hangs (it ran for 40 minutes before I killed
it.)

Ok, so maybe there's a problem creating large arrays inside a function.
Experimentation from the sbcl prompt showed that:

    (defparameter *array1* (make-array size ...))

worked.  All four arrays could be made.

So I moved the arrays outside of main like this:

    (defparameter *array1* nil)
    ...
    (defun main ...
       (setf *array1* (make-array size ...)))

[The arrays are really dynamically sized, based on a parameter to main.]

This runs, once.

If I invoke main again, it hangs.  Ok, maybe it's collecting garbage.
Let's see if we can help it out:

     (defun main ...
         (setf *array1* nil)  ; for all 4 arrays
         (setf *array1* (make-array size ...))
         ...)

Still no luck.  Inserting a call to (gc) doesn't help.

At this point, the only way I can develop this code is to incrementally
develop the next part, run it, quit Aquamacs/slime, possibly kill sbcl
from the terminal, restart Aquamacs/slime, lather, rinse, repeat.

This isn't helping me to sing the praises of Lisp.  Anyone been through
this before and have any helpful suggestions?


        

From: Brian
Subject: Re: Lisp and very large arrays
Date: 
Message-ID: <945d33dd-302f-47a2-8ded-719af160f430@m36g2000hse.googlegroups.com>
For me on SBCL 1.0.6 on Ubuntu, SBCL doesn't have enough heap to
allocate a 512MiB array by default.   I needed to use the --dynamic-
space-size parameter.

You are going to be needing around 1.03GiB of memory for the arrays
alone; perhaps you should start SBCL with a larger dynamic space size?
From: Richard M Kreuter
Subject: Re: Lisp and very large arrays
Date: 
Message-ID: <877ieomxoz.fsf@progn.net>
····@stablecross.com (Bob Felts) writes:

> I have a friend who took a Perl script and recoded it in C for a huge
> gain in performance.  I've been telling him how much I like Lisp and, in
> an attempt to do a little bit of Lisp evangelism, I offered to recode
> his C into Lisp.
>
> And I've hit a brick wall.  The program needs four large arrays: two
> arrays of (* 128 1024 1024) elements of unsigned bytes, and two arrays
> of (* 128 1024 1024) elements of type bit.
>
> LispWorks Personal 5.0.2 on Mac OS X won't even create the arrays;
> they're too large.
>
> sbcl 1.0.15.13 on Mac OS X (10.5.2, Intel) gives inconsistent behavior.
>
> If I do:
>
>    (defun main ( )
>       (let* ((size (* 128 1024 1024)))
>                (array1 (make-array size
>                                     :element-type 'unsigned-byte
>                                     :initial-element 0))
>                (array2 (make-array size
>                                     :element-type 'bit
>                                     :initial-element 0))
>          (array3 ...
>                (array4 ...))
>        ...))

It's not clear what MAIN does, but if it tries to print any of these
arrays (possibly by returning them), then Lisp may have to construct
string representations of each array (and for an array with 128
Mibiobjects, the printed representation will be no less than 128
Mibicharacters; if the implementation represents a character with 4
octets, that's a 512 MB string); additionally, when working under
SLIME, Lisp also has to ship that data over to Emacs, which can also
be a problematic bottleneck.  Have you tried setting *PRINT-ARRAY* to
NIL (or *PRINT-LENGTH* to some small nonnegative integer)?

--
RmK
From: Bob Felts
Subject: Re: Lisp and very large arrays
Date: 
Message-ID: <1ifuomu.1sjnyoda8sboN%wrf3@stablecross.com>
Richard M Kreuter <·······@progn.net> wrote:

> ····@stablecross.com (Bob Felts) writes:
> 
> > I have a friend who took a Perl script and recoded it in C for a huge
> > gain in performance.  I've been telling him how much I like Lisp and, in
> > an attempt to do a little bit of Lisp evangelism, I offered to recode
> > his C into Lisp.
> >
> > And I've hit a brick wall.  The program needs four large arrays: two
> > arrays of (* 128 1024 1024) elements of unsigned bytes, and two arrays
> > of (* 128 1024 1024) elements of type bit.
> >
> > LispWorks Personal 5.0.2 on Mac OS X won't even create the arrays;
> > they're too large.
> >
> > sbcl 1.0.15.13 on Mac OS X (10.5.2, Intel) gives inconsistent behavior.
> >
> > If I do:
> >
> >    (defun main ( )
> >       (let* ((size (* 128 1024 1024)))
> >                (array1 (make-array size
> >                                     :element-type 'unsigned-byte
> >                                     :initial-element 0))
> >                (array2 (make-array size
> >                                     :element-type 'bit
> >                                     :initial-element 0))
> >          (array3 ...
> >                (array4 ...))
> >        ...))
> 
> It's not clear what MAIN does, but if it tries to print any of these
> arrays (possibly by returning them), then Lisp may have to construct
> string representations of each array (and for an array with 128
> Mibiobjects, the printed representation will be no less than 128
> Mibicharacters; if the implementation represents a character with 4
> octets, that's a 512 MB string); additionally, when working under
> SLIME, Lisp also has to ship that data over to Emacs, which can also
> be a problematic bottleneck.  Have you tried setting *PRINT-ARRAY* to
> NIL (or *PRINT-LENGTH* to some small nonnegative integer)?

Sorry to be less than precise about what main does.  It's constructed so
that the arrays aren't printed.  At least, that's the intent.  I'll see
what effect setting *PRINT-ARRAY* to NIL does.
From: Ari Johnson
Subject: Re: Lisp and very large arrays
Date: 
Message-ID: <m2prsguqy7.fsf@hermes.theari.com>
····@stablecross.com (Bob Felts) writes:

> I have a friend who took a Perl script and recoded it in C for a huge
> gain in performance.  I've been telling him how much I like Lisp and, in
> an attempt to do a little bit of Lisp evangelism, I offered to recode
> his C into Lisp.
>
> And I've hit a brick wall.  The program needs four large arrays: two
> arrays of (* 128 1024 1024) elements of unsigned bytes, and two arrays
> of (* 128 1024 1024) elements of type bit.
>
> LispWorks Personal 5.0.2 on Mac OS X won't even create the arrays;
> they're too large.
>
> sbcl 1.0.15.13 on Mac OS X (10.5.2, Intel) gives inconsistent behavior.
>
> If I do:
>
>    (defun main ( )
>       (let* ((size (* 128 1024 1024)))
>                (array1 (make-array size
>                                     :element-type 'unsigned-byte
>                                     :initial-element 0))
>                (array2 (make-array size
>                                     :element-type 'bit
>                                     :initial-element 0))
>          (array3 ...
>                (array4 ...))
>        ...))
>
> then the code essentially hangs (it ran for 40 minutes before I killed
> it.)
>
> Ok, so maybe there's a problem creating large arrays inside a function.
> Experimentation from the sbcl prompt showed that:
>
>     (defparameter *array1* (make-array size ...))
>
> worked.  All four arrays could be made.
>
> So I moved the arrays outside of main like this:
>
>     (defparameter *array1* nil)
>     ...
>     (defun main ...
>        (setf *array1* (make-array size ...)))
>
> [The arrays are really dynamically sized, based on a parameter to main.]
>
> This runs, once.
>
> If I invoke main again, it hangs.  Ok, maybe it's collecting garbage.
> Let's see if we can help it out:
>
>      (defun main ...
>          (setf *array1* nil)  ; for all 4 arrays
>          (setf *array1* (make-array size ...))
>          ...)
>
> Still no luck.  Inserting a call to (gc) doesn't help.
>
> At this point, the only way I can develop this code is to incrementally
> develop the next part, run it, quit Aquamacs/slime, possibly kill sbcl
> from the terminal, restart Aquamacs/slime, lather, rinse, repeat.
>
> This isn't helping me to sing the praises of Lisp.  Anyone been through
> this before and have any helpful suggestions?

Others have mentioned (UNSIGNED-BYTE 8) and array size limits.  For
what it's worth, from SBCL genesis.lisp:

;;; a big vector, implemented as a vector of SMALLVECs
;;;
;;; KLUDGE: This implementation seems portable enough for our
;;; purposes, since realistically every modern implementation is
;;; likely to support vectors of at least 2^16 elements. But if you're
;;; masochistic enough to read this far into the contortions imposed
;;; on us by ANSI and the Lisp community, for daring to use the
;;; abstraction of a large linearly addressable memory space, which is
;;; after all only directly supported by the underlying hardware of at
;;; least 99% of the general-purpose computers in use today, then you
;;; may be titillated to hear that in fact this code isn't really
;;; portable, because as of sbcl-0.7.4 we need somewhat more than
;;; 16Mbytes to represent a core, and ANSI only guarantees that
;;; ARRAY-DIMENSION-LIMIT is not less than 1024. -- WHN 2002-06-13

In other words, to be portable at least in practice, SBCL's
genesis-core-building code doesn't trust the underlying Common Lisp
implementation to provide arrays large enough.  Note that SMALLVEC is
just `(SIMPLE-ARRAY (UNSIGNED-BYTE 8) (,(expt 2 16))).
From: Rainer Joswig
Subject: Re: Lisp and very large arrays
Date: 
Message-ID: <joswig-770466.17510423042008@news-europe.giganews.com>
In article <···························@stablecross.com>,
 ····@stablecross.com (Bob Felts) wrote:

> I have a friend who took a Perl script and recoded it in C for a huge
> gain in performance.  I've been telling him how much I like Lisp and, in
> an attempt to do a little bit of Lisp evangelism, I offered to recode
> his C into Lisp.
> 
> And I've hit a brick wall.  The program needs four large arrays: two
> arrays of (* 128 1024 1024) elements of unsigned bytes, and two arrays
> of (* 128 1024 1024) elements of type bit.
> 
> LispWorks Personal 5.0.2 on Mac OS X won't even create the arrays;
> they're too large.

You can always use a 64bit Common Lisp. Some 32bit Lisps have
small array sizes (due to smaller fixnums than 32bit).
You can check for the max number of elements - there is
a variable that gives that number.

Also:

* make sure that the Lisp starts with enough memory (or is able
to extend its memory at runtime)

* don't try to print those large arrays in the repl. There
 variables to control the printer, don't call functions
 in the REPL that return those arrays, etc.




> 
> sbcl 1.0.15.13 on Mac OS X (10.5.2, Intel) gives inconsistent behavior.
> 
> If I do:
> 
>    (defun main ( )
>       (let* ((size (* 128 1024 1024)))
>                (array1 (make-array size
>                                     :element-type 'unsigned-byte
>                                     :initial-element 0))
>                (array2 (make-array size
>                                     :element-type 'bit
>                                     :initial-element 0))
>          (array3 ...
>                (array4 ...))
>        ...))
> 
> then the code essentially hangs (it ran for 40 minutes before I killed
> it.)
> 
> Ok, so maybe there's a problem creating large arrays inside a function.
> Experimentation from the sbcl prompt showed that:
> 
>     (defparameter *array1* (make-array size ...))
> 
> worked.  All four arrays could be made.
> 
> So I moved the arrays outside of main like this:
> 
>     (defparameter *array1* nil)
>     ...
>     (defun main ...
>        (setf *array1* (make-array size ...)))
> 
> [The arrays are really dynamically sized, based on a parameter to main.]
> 
> This runs, once.
> 
> If I invoke main again, it hangs.  Ok, maybe it's collecting garbage.
> Let's see if we can help it out:
> 
>      (defun main ...
>          (setf *array1* nil)  ; for all 4 arrays
>          (setf *array1* (make-array size ...))
>          ...)
> 
> Still no luck.  Inserting a call to (gc) doesn't help.
> 
> At this point, the only way I can develop this code is to incrementally
> develop the next part, run it, quit Aquamacs/slime, possibly kill sbcl
> from the terminal, restart Aquamacs/slime, lather, rinse, repeat.
> 
> This isn't helping me to sing the praises of Lisp.  Anyone been through
> this before and have any helpful suggestions?
> 
> 
>

-- 
http://lispm.dyndns.org/
From: Juho Snellman
Subject: Re: Lisp and very large arrays
Date: 
Message-ID: <874p9s8szn.fsf@vasara.proghammer.com>
····@stablecross.com (Bob Felts) writes:
>    (defun main ( )
>       (let* ((size (* 128 1024 1024)))
>                (array1 (make-array size
>                                     :element-type 'unsigned-byte
>                                     :initial-element 0))

Do you really mean to use UNSIGNED-BYTE here? That upgrades to T on
SBCL, and probably on all other CLs too. If you want an array of
octets, use (UNSIGNED-BYTE 8) as the element type.

-- 
Juho Snellman
From: Bob Felts
Subject: Re: Lisp and very large arrays
Date: 
Message-ID: <1ifuoth.16svnps1xp0cgN%wrf3@stablecross.com>
Juho Snellman <······@iki.fi> wrote:

> ····@stablecross.com (Bob Felts) writes:
> >    (defun main ( )
> >       (let* ((size (* 128 1024 1024)))
> >                (array1 (make-array size
> >                                     :element-type 'unsigned-byte
> >                                     :initial-element 0))
> 
> Do you really mean to use UNSIGNED-BYTE here? That upgrades to T on
> SBCL, and probably on all other CLs too. If you want an array of
> octets, use (UNSIGNED-BYTE 8) as the element type.

[Note:  Christophe Rhodes sent the same advice via e-mail and I want to
make sure everyone gets credit and thanks].

(unsigned-byte 8) performs much, much better.  Now, maybe I can dazzle
my Lisp-less friend!
From: John Thingstad
Subject: Re: Lisp and very large arrays
Date: 
Message-ID: <op.t92pljlqut4oq5@pandora.alfanett.no>
P� Wed, 23 Apr 2008 16:39:23 +0200, skrev Bob Felts <····@stablecross.com>:

> I have a friend who took a Perl script and recoded it in C for a huge
> gain in performance.  I've been telling him how much I like Lisp and, in
> an attempt to do a little bit of Lisp evangelism, I offered to recode
> his C into Lisp.
>
> And I've hit a brick wall.  The program needs four large arrays: two
> arrays of (* 128 1024 1024) elements of unsigned bytes, and two arrays
> of (* 128 1024 1024) elements of type bit.
>
> LispWorks Personal 5.0.2 on Mac OS X won't even create the arrays;
> they're too large.
>

about array-total-size-limit (hyperspec)

The actual limit on the array total size imposed by the implementation  
might vary according the element type of the array; in this case, the  
value of array-total-size-limit will be the smallest of these possible  
limits.

In LispWorks 5.0 32 bit Windows Xp

CL-USER 1 > array-total-size-limit
536870911

CL-USER 2 > (* 128 1024 1024)
134217728

As Juhu noted (unsigned byte 8) is the thing to use or you will make a  
array 4 x larger..
When CL was defined it ran on machines with many different byte sizes  
which is why it looks like this.
(Yes it comfused me too a while back.)

And also don't return the array to the repl. Ending a function declaration  
with (values) causes the function to return nothing. (But it will be  
interpreted as nil if given to a function.)


--------------
John Thingstad
From: Johan Ur Riise
Subject: Re: Lisp and very large arrays
Date: 
Message-ID: <874p9sqvof.fsf@SODD.riise-data.net>
using '(unsigned-byte 8)
This shows that allocation of a 13.4 GB string of octets in SBCL on
a slow machine with 2GB memory and some swap.

Traversing the array is slow, but at least it works.
Random access is ok.

·····@perle:~$ uname -a
Linux perle 2.6.22-14-generic #1 SMP Tue Feb 12 02:46:46 UTC 2008 x86_64 GNU/Linux


·····@perle:~$ rlwrap sbcl
This is SBCL 1.0.16.3, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.
* (time (defparameter *array1* (make-array (* 128 1024 1024) :element-type '(unsigned-byte 8))))

Evaluation took:
  0.003 seconds of real time
  0.004 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  134,217,744 bytes consed.
*ARRAY1*
* (time (defparameter *array1* (make-array (* 10 128 1024 1024) :element-type '(unsigned-byte 8))))

Evaluation took:
  0.02 seconds of real time
  0.020001 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  1,342,177,296 bytes consed.
*ARRAY1*
* (time (defparameter *array1* (make-array (* 100 128 1024 1024) :element-type '(unsigned-byte 8))))
Heap exhausted during allocation: 7074676736 bytes available, 13421772816 requested.

debugger invoked on a SB-KERNEL::HEAP-EXHAUSTED-ERROR in thread #<THREAD "initial thread" {1002445B11}>:
  Heap exhausted: 7074676736 bytes available, 13421772816 requested. PROCEED WITH CAUTION!

Type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit debugger, returning to top level.

(SB-KERNEL::HEAP-EXHAUSTED-ERROR 7074676736 13421772816)
0] 0;

* (quit)
·····@perle:~$ free
             total       used       free     shared    buffers     cached
Mem:       2063688     760564    1303124          0       4356      74700
-/+ buffers/cache:     681508    1382180
Swap:      2931768     987060    1944708
·····@perle:~$ df -h /home
Filesystem            Size  Used Avail Use% Mounted on
/dev/md2               46G   11G   34G  24% /home
·····@perle:~$ dd if=/dev/zero of=/home/swapfile1 bs=1M count=10000
dd: opening `/home/swapfile1': Permission denied
·····@perle:~$ su -
Password: 
····@perle:~# dd if=/dev/zero of=/home/swapfile1 bs=1M count=15000
15000+0 records in
15000+0 records out
15728640000 bytes (16 GB) copied, 443,189 seconds, 35,5 MB/s
····@perle:~# mkswap /home/swapfile1
Setting up swapspace version 1, size = 15728635 kB
no label, UUID=9f545a05-d981-433c-a278-99f957803cef
····@perle:~# swapon /home/swapfile1 
····@perle:~# free
             total       used       free     shared    buffers     cached
Mem:       2063688    2032240      31448          0      15508    1296064
-/+ buffers/cache:     720668    1343020
Swap:     18291760     987072   17304688
····@perle:~# exit
logout
·····@perle:~$ rlwrap sbcl --dynamic-space-size 15000
This is SBCL 1.0.16.3, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.
* (time (defparameter *array1* (make-array (* 100 128 1024 1024) :element-type '(unsigned-byte 8))))

Evaluation took:
  0.195 seconds of real time
  0.196013 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  536,870,928 bytes consed.
*ARRAY1*
* (time (setf (aref *array1* 10000000001) 42))

Evaluation took:
  0.0 seconds of real time
  0.0 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
42
* (time (setf (aref *array1* 0) 42))

Evaluation took:
  0.0 seconds of real time
  0.0 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
42
* (time (setf (aref *array1* 1) 42))

Evaluation took:
  0.0 seconds of real time
  0.0 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
42
* (time (aref *array1* 10000000001))

Evaluation took:
  0.0 seconds of real time
  0.0 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
42
* (time (aref *array1* 10000000002))

Evaluation took:
  0.0 seconds of real time
  0.0 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
0
* (time (aref *array1* 1))

Evaluation took:
  0.0 seconds of real time
  0.0 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
42
* (time (loop for x across *array1* count x))

Evaluation took:
  306.97 seconds of real time
  304.43503 seconds of user run time
  2.184137 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
13421772800
;; well, at least it completed.

;; trying to print the string...
* *array1*
;; prints lots of numbers, I have to <control-c>

when counting the string...

top - 20:03:01 up 23 days, 4 min,  5 users,  load average: 0.49, 0.23, 0.53
Tasks: 117 total,   3 running, 114 sleeping,   0 stopped,   0 zombie
Cpu(s): 99.7%us,  0.3%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2063688k total,   273112k used,  1790576k free,     3372k buffers
Swap: 18291760k total,  1510084k used, 16781676k free,    54976k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND       
 5090 johan     25   0 14.7g 1.7g 1.6g R 99.4 84.8   0:39.98 sbcl          
    1 root      18   0  5116  100   96 S  0.0  0.0   0:01.12 init          
    2 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 kthreadd      
    3 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/0   
    4 root      34  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/0   
    5 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0    
    6 root      10  -5     0    0    0 S  0.0  0.0   0:00.71 events/0      
    7 root      10  -5     0    0    0 S  0.0  0.0   0:00.01 khelper       
   25 root      10  -5     0    0    0 S  0.0  0.0   0:00.52 kblockd/0