From: Thomas A. Russ
Subject: Re: Optimize a function for speed
Date: 
Message-ID: <ymir61mrqzw.fsf@blackcat.isi.edu>
Francogrex <······@grex.org> writes:

> Hello, I have this run function but it's going really slow and I would
> like it to go much faster. I thought of hash-tables but how can we fit
> array structures into hash tables? It would be a learning experience
> for me if someone can give me a few hints on how to optimize this for
> speed. I am using SBCL. Thanks

Well, even before that, it would be good to translate this algorithm
into Lisp.  It looks suspiciously like C code.

Among the more general problems is the use of undeclared global
variables, and not following the standard naming convention for special
variables.  We'll fix those along the way.

But the fundamental problem was identified by Dimiter.  You are using
sequential access to get list elements.  Lists in Common Lisp are
sequential access data structures.  NTH is absolutely the wrong thing to
use to get to the elements.  Especially you seem to be trying to access
the list elements sequentially anyway.

> ;; The while macro
> (defmacro while (condition &rest body)
>   (let ((var (gensym)))
>     `(do ((,var nil (progn ,@body)))
> 	 ((null ,condition) ,var))))

I prefer the simpler

  (loop while ...
        do ...)

construct.  In fact, a lot of this looping can be captured better by
using LOOP.  But that's another entire conversation.


> (defvar mlist (loop repeat 4000 collect (random 1.0))) ;;I need to use
> even 10000 repeats or more

Should be named *mlist* to conform to convention.

If you really need random access to the elements, then it should also be
a vector.  In that case:

  (defvar *mlist* (loop with result = (make-array '(4000))
                        for i from 0 below 4000
                        do (setf (aref result i) (random 1.0))
                        finally (return result)))

Or perhaps even more useful to you, since it easily allows you to try
this with different sizes and also to reset the input values:

  (defun generate-random-vector (size &optional (limit 1.0))
     (let ((result (make-array (list size))))
        (loop for i from 0 below size
              do (setf (aref result i) (random limit)))
        result))

  (defvar *mlist* (generate-random-vector 4000))
  

> ;;The function to optimize
> (defun run (mlist)

Introduce local variables with LET!

  (let ((n ...)
        (mtable ...)
        etc.)
    ....
   )

>   (setf n (1- (length mlist)))
>   (setf mlist (delete (nth 0 mlist) mlist))

You realize that by calling DELETE here you may be removing more than
one element from your list, right?  It may be unlikely, but you really
are removing all elements that match the first element.

Even if there are no other matches, DELETE must go through the entire
list, thus wasting time.

If instead, as I presume, you mean to use everything except the first
element, then it would be more efficient to just use (pop mlist) or
(setf mlist (rest mlist)).  But this will, of course, have to be
different if you go to a vector solution.

>   (setf mtable (make-array (list (/ (* n (1- n)) 2) 3) :initial-
> element 0))
>   (setf ind 0 ind2 0 Pm 0)
>   (setf i 0)
>   (while (<= i (1- n))

A simple optimization here is to avoid repeatedly subtracting from your
end value by using the test  (< i n)

Alternately, using the list data structure, you could do something like
looping over the successive CDRs of the list.  That seems to be what you
are really doing in here with this and the inner loop.  So something
like the following structure is likely to be more efficient than your
current code:

   (loop with index = 0
         for remainder on mlist by #'cdr
         do (loop for item in remainder
                  as sP = item then (+ sP item)
                  do ....))

  This does lose the values of the I and J indices, so maybe you can't
  really accept that.  But perhaps if you would have written a comment
  about what this function calculates, there may be a simpler way to
  achieve what you need.

>     (setf sP 0)
>     (setf j (1+ i))
>     (while (< j n)
>       (setf sP (+ sP (nth (1- j) mlist)))
>       (setf (aref mtable ind 0) i)
>       (setf (aref mtable ind 1) j)
>       (setf (aref mtable ind 2) (if (< sP 1) (/ 1 (* (float (- j i))
> (float (+ (- n j) (1+ i))) (float sP))) 1))
>       (when (> (aref mtable ind 2) Pm) (setf Pm (aref mtable ind 2)
> ind2 ind))
>       (incf ind)
>       (incf j))
>     (incf i))
>   (setf mtable2 (loop for i from 0 to 2 collect (aref mtable ind2 i)))
>   )

Don't forget this step before you do any timing:

(compile 'run)

> (time (run mlist))
> 
> Evaluation took:
>   121.360 seconds of real time
>   121.265625 seconds of total run time (97.109375 user, 24.156250
> system)
>   [ Run times consist of 1.658 seconds GC time, and 119.608 seconds
> non-GC time. ]
>   99.92% CPU
>   242,671,142,450 processor cycles
>   167,449,016 bytes consed

-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: TomSW
Subject: Re: Optimize a function for speed
Date: 
Message-ID: <330e5423-8c3e-401c-9b7e-77174fc1b9f6@f11g2000vbf.googlegroups.com>
On Feb 26, 8:27 am, Francogrex <······@grex.org> wrote:
> On Feb 26, 1:17 am, ····@sevak.isi.edu (Thomas A. Russ) wrote:
>
> > Well, even before that, it would be good to translate this algorithm
> > into Lisp.  It looks suspiciously like C code.
>
> Indeed it is originally coming from C :), it was sent to me by a
> collegue. I almost word by word translated it to CL but as I can see
> it's not the best approach. Thanks for suggestions below, I will read
> them carefully.

A few more things:

- the beginning of the function pops the first element off the start
of mlist, for some reason... That should be the job of some other
function, which passes the list of random numbers to your function.
- do you really need mtable or mtable2? Can't you just maintain three
variables, Pmax and its associated i and j values, and return their
values at the end?
- if sP is a float, I don't think you need to cast the values you
multiply it by to float as well (is it a compiler hint?)
- as far as I can understand it you're using the series 1, 1/(n-1), 1/
(2n-2), 1/(3n-3) etc. in which case you can pre-calculate it to n
places

(defun get-pmax (rands)
  "TODO: describe me"
  (let ((Pmax          0)
        (Pseries-index 0)
        ;; series is the indexed series: (0 1), (1 (1/n-1)), (2
(1/2n-2)), (3 (1/3n-3)) ...
        (series        (reciprocal-of-nx-minus-n (length rands))))
    (maplist
     ;; Map across sublists of rands
     #'(lambda (sub-rands)
         (let ((j  (1+ i))
               (sP 0))
           (mapcar
            ;; Map across the sub-list
            #'(lambda (rand series-value)
                (destructuring-bind (n val) series-value
                  (incf sP rand)
                  (let ((P (if (< sP 1)
                               series-value
                               1)))
                    ;; store the max if appropriate
                    (when (> P Pmax) (setf Pmax P Pseries-index n)))))
            sub-rands
            series))))
    ;; return the max values
    (values Pmax
            Pseries-index)))