Why so much Space and GC? How to interpret `(time ...)'?

From: ·········@math.ufl.edu
Subject: Why so much Space and GC?  How to interpret `(time ...)'?
Date: Fri, 02 Nov 2007 15:22:57 +0000
Message-ID: <1194016977.698624.53070@o3g2000hsb.googlegroups.com>

I'm teaching an elem. Number Theory course, implementing some NT
algorithms.  Mostly the numbers are small and running time ("RT"),
doesn't matter, but with the Baby-Step Giant-Step alg., below, it
does.
I'm trying to understand the O/P of CL's `time' and CLISP's
`ext:times'.

    [Probably the details of the problem are unimportant, but here
    they are:  The program is trying to find the exponent E so that

        45^E  == 54237450,

    where "==" means "congruent mod 95525767".  Both `discrete-log'
and
    `slow-discrete-log' are tiny programs, code listed below, and both
    get the correct answer, which is E = 95525700.]

Here are the two runs, in CLISP.  Both functions are compiled.

(ext:times (slow-discrete-log 54237450))
                                    Permanent            Temporary
    Class                      instances   bytes    instances   bytes
    -----                      --------- ---------  ---------
---------
    BIGNUM                             1        12  252649860
3223013800
    -----                      --------- ---------  ---------
---------
    Total                              1        12  252649860
3223013800
    Real time: 155.42569 sec.
    Run time: 146.37845 sec.
    Space: 3228009700 Bytes
    GC: 2813, GC time: 26.074505 sec.


 (ext:times (discrete-log 54237450))
                                    Permanent            Temporary
    Class                      instances   bytes    instances   bytes
    -----                      --------- ---------  ---------
---------
    BIGNUM                             1        12      19896
277848
    -----                      --------- ---------  ---------
---------
    Total                              1        12      19896
277848
    Real time: 0.006965 sec.
    Run time: 0.006932 sec.
    Space: 277860 Bytes

I'm puzzled why the "Space:" line of `slow-discrete-log' is so high
and why so many GCs.  I think of `slow-discrete-log' as updating a
single bignum "register" a whole lotta times, but --an I interpreting
the "instances" correctly?-- that the storage is being discarded and
re-allocated?

I expected the ratio of RT of slow-discrete-log to discrete-log to be
about the square-root of the modulus,

               (sqrt 95525767.0)      =>   9773.729
What I see is
               (/ 146.37845 0.006932) =>  21116.336

--I guess this isn't so far off.  [In interpreted code, the RT-ratio
was
within a few percent of the sqrt.]

Here is the code.

    (defun-e slow-discrete-log (target &optional (b *bsgs* ))
      (loop
        with mdlus = (BSGS-mdlus b)
        with gen   = (BSGS-generator b)
        with tar   = (mod target mdlus)
        for expo  from 0 below mdlus ;!EXIT when exponent runs off
range.
        for gen^expo = 1 then (mod (*  gen  gen^expo) mdlus)
        when (= gen^expo tar) return expo
        FINALLY (return false)
        ) )

Here, the `b' is a struct that has the modulus and the generator [the
number 45].  So `slow-discrete-log' simply tests all possible
exponents.

The `discrete-log', below, uses a pre-computed hashtable that is
passed-in via `b'.  The loop controller is `numpatches'; it is
approximately the sqrt of the modulus.

    (defun-e discrete-log (target &optional (b *bsgs* ))
      "ASSUME: `target' is coprime to the modulus from struct `b'.
    Returns nil if `target' is not in the cyclic subgroup generated by
the
    generator and modulus stored in `b'."
      (loop
        with mdlus  = (BSGS-mdlus b)
        with m      = (BSGS-patchmult b) ;The multiplier.
        with hash   = (BSGS-powers    b)
        for k   from  0 below (BSGS-numpatches b) ;!EXIT
        for tar-m^k = (mod target mdlus)  then (mod (*  tar-m^k  m)
mdlus)
        for j?      = (gethash  tar-m^k  hash)
        when j? return (+  j? (* k (BSGS-patchlen b)))
        FINALLY (return false)
        )  )

Is it the line "(mod (*  gen  gen^expo) mdlus)" in
`slow-discrete-log' that is producing all the GCing, and the big
"Space"?

Do I need some kind bignum-declaration somewhere?

How do I tell at what point CLISP switches to bignums?

FWIWorth, here is the RT of the code that produces the hashtable:

(ext:times (init-bsgs 95525767        45))

                                    Permanent            Temporary
    Class                      instances   bytes    instances   bytes
    -----                      --------- ---------  ---------
---------
    BIGNUM                            -1       -12      17807
233188
    SIMPLE-VECTOR                      0         0         50
196352
    CONS                               0         0        528
4224
    FUNCTION                           0         0         36
2376
    SIMPLE-8BIT-VECTOR                 0         0         18
216
    HASH-TABLE                         0         0          1
60
    CIPHER::BSGS                       0         0          1
36
    -----                      --------- ---------  ---------
---------
    Total                             -1       -12      18441
436452
    Real time: 0.008209 sec.
    Run time: 0.008105 sec.
    Space: 436440 Bytes


Prof. Jonathan LF King   Mathematics dept, Univ. of Florida

Re: Why so much Space and GC? How to interpret `(time ...)'? ·········@math.ufl.edu
- Re: Why so much Space and GC? How to interpret `(time ...)'? Thomas A. Russ
  - Re: Why so much Space and GC? How to interpret `(time ...)'? ·········@math.ufl.edu
    - Re: Why so much Space and GC? How to interpret `(time ...)'? Thomas A. Russ
      - Re: Why so much Space and GC? How to interpret `(time ...)'? ·········@math.ufl.edu

From: ·········@math.ufl.edu
Subject: Re: Why so much Space and GC? How to interpret `(time ...)'?
Date: Fri, 02 Nov 2007 15:27:19 +0000
Message-ID: <1194017239.075970.31550@19g2000hsx.googlegroups.com>

Google  Groups shows I bad linebreak, so I have made the lines shorter
and reposted.

I'm teaching an elem. Number Theory course, implementing some NT
algorithms.  Mostly the numbers are small and running time ("RT"),
doesn't matter, but with the Baby-Step Giant-Step alg., below, it
does.
I'm trying to understand the O/P of CL's `time' and CLISP's
`ext:times'.

    [Probably the details of the problem are unimportant, but here
    they are:  The program is trying to find the exponent E so that

        45^E  == 54237450,

    where "==" means "congruent mod 95525767".  Both `discrete-log'
and
    `slow-discrete-log' are tiny programs, code listed below, and both
    get the correct answer, which is E = 95525700.]

Here are the two runs, in CLISP.  Both functions are compiled.

(ext:times (slow-discrete-log 54237450))
                          Permanent            Temporary
    Class            instances   bytes    instances   bytes
    -----            --------- ---------  --------- ---------
    BIGNUM                   1        12  252649860 3223013800
    -----            --------- ---------  --------- ---------
    Total                    1        12  252649860 3223013800
    Real time: 155.42569 sec.
    Run time: 146.37845 sec.
    Space: 3228009700 Bytes
    GC: 2813, GC time: 26.074505 sec.


 (ext:times (discrete-log 54237450))
                          Permanent            Temporary
    Class            instances   bytes    instances   bytes
    -----            --------- ---------  --------- ---------
    BIGNUM                   1        12      19896    277848
    -----            --------- ---------  --------- ---------
    Total                    1        12      19896    277848
    Real time: 0.006965 sec.
    Run time: 0.006932 sec.
    Space: 277860 Bytes

I'm puzzled why the "Space:" line of `slow-discrete-log' is so high
and why so many GCs.  I think of `slow-discrete-log' as updating a
single bignum "register" a whole lotta times, but --an I interpreting
the "instances" correctly?-- that the storage is being discarded and
re-allocated?

I expected the ratio of RT of slow-discrete-log to discrete-log to be
about the square-root of the modulus,

               (sqrt 95525767.0)      =>   9773.729
What I see is
               (/ 146.37845 0.006932) =>  21116.336

--I guess this isn't so far off.  [In interpreted code, the RT-ratio
was
within a few percent of the sqrt.]

Here is the code.

    (defun-e slow-discrete-log (target &optional (b *bsgs* ))
      (loop
        with mdlus = (BSGS-mdlus b)
        with gen   = (BSGS-generator b)
        with tar   = (mod target mdlus)
        for expo  from 0 below mdlus ;!EXIT when exponent runs off
range.
        for gen^expo = 1 then (mod (*  gen  gen^expo) mdlus)
        when (= gen^expo tar) return expo
        FINALLY (return false)
        ) )

Here, the `b' is a struct that has the modulus and the generator [the
number 45].  So `slow-discrete-log' simply tests all possible
exponents.

The `discrete-log', below, uses a pre-computed hashtable that is
passed-in via `b'.  The loop controller is `numpatches'; it is
approximately the sqrt of the modulus.

    (defun-e discrete-log (target &optional (b *bsgs* ))
      "ASSUME: `target' is coprime to the modulus from struct `b'.
    Returns nil if `target' is not in the cyclic subgroup generated by
the
    generator and modulus stored in `b'."
      (loop
        with mdlus  = (BSGS-mdlus b)
        with m      = (BSGS-patchmult b) ;The multiplier.
        with hash   = (BSGS-powers    b)
        for k   from  0 below (BSGS-numpatches b) ;!EXIT
        for tar-m^k = (mod target mdlus)  then (mod (*  tar-m^k  m)
mdlus)
        for j?      = (gethash  tar-m^k  hash)
        when j? return (+  j? (* k (BSGS-patchlen b)))
        FINALLY (return false)
        )  )

Is it the line "(mod (*  gen  gen^expo) mdlus)" in
`slow-discrete-log' that is producing all the GCing, and the big
"Space"?

Do I need some kind bignum-declaration somewhere?

How do I tell at what point CLISP switches to bignums?

FWIWorth, here is the RT of the code that produces the hashtable:

(ext:times (init-bsgs 95525767        45))

                          Permanent            Temporary
    Class            instances   bytes    instances   bytes
    -----            --------- ---------  --------- ---------
    BIGNUM                  -1       -12      17807    233188
    SIMPLE-VECTOR            0         0         50    196352
    CONS                     0         0        528      4224
    FUNCTION                 0         0         36      2376
    SIMPLE-8BIT-VECT         0         0         18       216
    HASH-TABLE               0         0          1        60
    CIPHER::BSGS             0         0          1        36
    -----            --------- ---------  --------- ---------
    Total                   -1       -12      18441    436452
    Real time: 0.008209 sec.
    Run time: 0.008105 sec.
    Space: 436440 Bytes


Prof. Jonathan LF King   Mathematics dept, Univ. of Florida

From: Thomas A. Russ
Subject: Re: Why so much Space and GC? How to interpret `(time ...)'?
Date: Mon, 05 Nov 2007 17:41:36 +0000
Message-ID: <ymiir4g37zj.fsf@blackcat.isi.edu>

·········@math.ufl.edu writes:

[snip]
> I'm puzzled why the "Space:" line of `slow-discrete-log' is so high
> and why so many GCs.  I think of `slow-discrete-log' as updating a
> single bignum "register" a whole lotta times, but --an I interpreting
> the "instances" correctly?-- that the storage is being discarded and
> re-allocated?

BIGNUMs in Lisp are immutable objects.  So each time a BIGNUM is
created, there is memory allocated, and when not needed anymore,
discarded and garbage collected.

What you are thinking of as a "register" is, in fact, merely a pointer
to the BIGNUM object.

> Here is the code.
> 
>     (defun-e slow-discrete-log (target &optional (b *bsgs* ))
>       (loop
>         with mdlus = (BSGS-mdlus b)
>         with gen   = (BSGS-generator b)
>         with tar   = (mod target mdlus)
>         for expo  from 0 below mdlus ;!EXIT when exponent runs off
> range.
>         for gen^expo = 1 then (mod (*  gen  gen^expo) mdlus)
>         when (= gen^expo tar) return expo
>         FINALLY (return false)
>         ) )

-- 
Thomas A. Russ,  USC/Information Sciences Institute

From: ·········@math.ufl.edu
Subject: Re: Why so much Space and GC? How to interpret `(time ...)'?
Date: Mon, 05 Nov 2007 22:38:56 +0000
Message-ID: <1194302336.974521.87880@o3g2000hsb.googlegroups.com>

Thank you for your reply.

On Nov 5, 12:41 pm, ····@sevak.isi.edu (Thomas A. Russ) wrote:
> ·········@math.ufl.edu writes: ...
> > I'm puzzled why the "Space:" line of `slow-discrete-log' is so high
> > and why so many GCs.  I think of `slow-discrete-log' as updating a
> > single bignum "register" a whole lotta times, but --an I interpreting
> > the "instances" correctly?-- that the storage is being discarded and
> > re-allocated?
>
> BIGNUMs in Lisp are immutable objects.  So each time a BIGNUM is
> created, there is memory allocated, and when not needed anymore,
> discarded and garbage collected.
>
> What you are thinking of as a "register" is, in fact, merely a pointer
> to the BIGNUM object.

Thank you.  Is there a way, via a declaration, to make an
integer mutable, if bounds on its size are known at runtime?
Mutable, and have its storage reused as an accumulator, is
what I'm asking.

For example, from the code below, the variable `gen^expo' is
a non-negative integer that can never get larger than
mdlus*mdlus; this latter is a constant, once the loop is
entered.

> >     (defun-e slow-discrete-log (target &optional (b *bsgs* ))
> >       (loop
> >         with mdlus = (BSGS-mdlus b)
> >         with gen   = (BSGS-generator b)
> >         with tar   = (mod target mdlus)
> >         for expo  from 0 below mdlus
> >         for gen^expo = 1 then (mod (*  gen  gen^expo) mdlus)
> >         when (= gen^expo tar) return expo
> >         FINALLY (return false)
> >         ) )

--
Jonathan LF King	 Mathematics dept., Univ. of Florida

From: Thomas A. Russ
Subject: Re: Why so much Space and GC? How to interpret `(time ...)'?
Date: Tue, 06 Nov 2007 00:02:25 +0000
Message-ID: <ymiejf42qcu.fsf@blackcat.isi.edu>

·········@math.ufl.edu writes:

> Thank you.  Is there a way, via a declaration, to make an
> integer mutable, if bounds on its size are known at runtime?
> Mutable, and have its storage reused as an accumulator, is
> what I'm asking.

I doubt it.

> For example, from the code below, the variable `gen^expo' is
> a non-negative integer that can never get larger than
> mdlus*mdlus; this latter is a constant, once the loop is
> entered.

In particular, in the code below, there are also intermediate results
that will involve the creation of bignums (* gen gen^expot), for
example.  You generally can't make these things happen at the lisp
level, since they are really buried in the guts of the internal
representation.

If this is really critical, I'm afraid you will probably need to resort
to a language like C with the gnu mp package that will allow you to do
this.  You would then need to use the foreign function interface to use
it from lisp.  I don't even think you can do anything in CLisp, which
IIRC uses that library.

> > >     (defun-e slow-discrete-log (target &optional (b *bsgs* ))
> > >       (loop
> > >         with mdlus = (BSGS-mdlus b)
> > >         with gen   = (BSGS-generator b)
> > >         with tar   = (mod target mdlus)
> > >         for expo  from 0 below mdlus
> > >         for gen^expo = 1 then (mod (*  gen  gen^expo) mdlus)
> > >         when (= gen^expo tar) return expo
> > >         FINALLY (return false)
> > >         ) )
> 
> --
> Jonathan LF King	 Mathematics dept., Univ. of Florida
> 

-- 
Thomas A. Russ,  USC/Information Sciences Institute

From: ·········@math.ufl.edu
Subject: Re: Why so much Space and GC? How to interpret `(time ...)'?
Date: Mon, 12 Nov 2007 07:25:28 +0000
Message-ID: <1194852328.515369.212510@v2g2000hsf.googlegroups.com>

On Nov 5, 7:02 pm, ····@sevak.isi.edu (Thomas A. Russ) wrote:
> ·········@math.ufl.edu writes:
> > Thank you.  Is there a way, via a declaration, to make an
> > integer mutable, if bounds on its size are known at runtime?
> > Mutable, and have its storage reused as an accumulator, is
> > what I'm asking.
>
> I doubt it.
...
> If this is really critical, I'm afraid you will probably need to resort
> to a language like C with the gnu mp package that will allow you to do
> this.

It is not critical, and I will stick to Lisp.
Thank you for the information.

--
Prof. Jonathan LF King	 Mathematics dept, Univ. of Florida