In ACL 5.0 the RANDOM function doesn't work with bignum arguments
above a certain size.
USER(8): (random (fac 200))
Error: This integer is too large to be converted to a double-float:
788657867364790503552363213932185062295135977687173263294742
5332443594499634033429203042840119846239041772121389196388302576427902426371050
6192662495282993111346285727076331723739698894392244562145166424025403329186413
1227428294853277524242407573903240321257405579568660226031904170324062351700858
796178922222789623703897374720000000000000000000000000000000000000000000000000
[condition type: SIMPLE-ERROR]
Is this a bug or is this allowed by the standard as an implementation
limitation?
--
Lieven Marchand <···@wyrd.be>
Glaðr ok reifr skyli gumna hverr, unz sinn bíðr bana.
Lieven Marchand <···@wyrd.be> writes:
> In ACL 5.0 the RANDOM function doesn't work with bignum arguments
> above a certain size.
>
> USER(8): (random (fac 200))
>
> Error: This integer is too large to be converted to a double-float:
> 788657867364790503552363213932185062295135977687173263294742
> 5332443594499634033429203042840119846239041772121389196388302576427902426371050
> 6192662495282993111346285727076331723739698894392244562145166424025403329186413
> 1227428294853277524242407573903240321257405579568660226031904170324062351700858
> 796178922222789623703897374720000000000000000000000000000000000000000000000000
> [condition type: SIMPLE-ERROR]
>
> Is this a bug or is this allowed by the standard as an implementation
> limitation?
Well, I consider it to be not nice. However, you _are_ using a fairly
old version...
CL-USER(1): (defun fac (n) (if (zerop n) 1 (* n (fac (1- n)))))
FAC
CL-USER(2): (random (fac 200))
385659726072417266815163346569130849728919695862094262266274273263923094
436104919585302931505846861875583284492310271183012782027360432948601794
795242668701951621728428087664436310376914151798502369576365955053956795
794966417095205415398617584931778601463269451936770033379222514292662207
749671157526124763860538179207685875748568921235828711357784357467516035
207239114678582
CL-USER(3):
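[For comparison, the same call is unproblematic in any implementation whose RANDOM works on the integer directly instead of routing through a float. A quick sketch in Python, whose `random` module happens to be MT19937-based as well, with `randrange` as a rough analogue of RANDOM:]

```python
import math
import random

# CPython's random.randrange draws uniformly below an arbitrary-precision
# bound, so a bignum like 200! poses no problem -- unlike the ACL 5.0
# behavior above, which converted the limit to a double-float first.
n = math.factorial(200)
r = random.randrange(n)
assert 0 <= r < n
```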
--
Duane Rettig Franz Inc. http://www.franz.com/ (www)
1995 University Ave Suite 275 Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253 ·····@Franz.COM (internet)
Lieven Marchand <···@wyrd.be> writes:
> In ACL 5.0 the RANDOM function doesn't work with bignum arguments
> above a certain size.
>
> USER(8): (random (fac 200))
>
> Error: This integer is too large to be converted to a double-float:
[snip big int]
> [condition type: SIMPLE-ERROR]
>
> Is this a bug or is this allowed by the standard as an implementation
> limitation?
Even if this was just an implementation limitation I think that you
should look into getting your randomness from a different source.
Pseudo-random number generators like random are notoriously bad when
used to generate more than a handful of bits; even the 32 they claim
to do in one shot is really tenuous. In general, if you start asking
for bigger chunks, like the amount you would need for (random (fac
200)), you need a real source of randomness. Check out
www.lavarand.com for a system out of SGI for doing this.
-Eric
>
> --
> Lieven Marchand <···@wyrd.be>
> Glaðr ok reifr skyli gumna hverr, unz sinn bíðr bana.
eric dahlman <·······@cs.colostate.edu> writes:
> Lieven Marchand <···@wyrd.be> writes:
>
> > In ACL 5.0 the RANDOM function doesn't work with bignum arguments
> > above a certain size.
> >
> > USER(8): (random (fac 200))
> >
> > Error: This integer is too large to be converted to a double-float:
> [snip big int]
> > [condition type: SIMPLE-ERROR]
> >
> > Is this a bug or is this allowed by the standard as an implementation
> > limitation?
>
> Even if this was just an implementation limitation I think that you
> should look into getting your randomness from a different source.
> Pseudo-random number generators like random are notoriously bad when
> used to generate more than a handful of bits; even the 32 they claim
> to do in one shot is really tenuous.
This is not true. Random numbers larger than 32-bits are easy to
generate. And with patches, Allegro CL 6.0 does a good job of this.
We use the Mersenne Twister Algorithm, which has a large cycle.
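[As an aside, CPython's `random` module is also built on MT19937, so it makes a convenient sketch of "wide draws in one call"; 1246 bits is the width of (fac 200) from the original post:]

```python
import random

# With a Mersenne Twister behind it, pulling far more than 32 bits in a
# single call is routine; the generator's large state supports it.
x = random.getrandbits(1246)
assert 0 <= x < 2 ** 1246
```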
> In general if you start asking
> for bigger chunks like the amount you would need for (random (fac
> 200)) you need a real source of randomness. Check out
> www.lavarand.com for a system out of SGI for doing this.
True randomness is not always desirable. One of the problems with
true randomness is that at a local level, coincidence can make the
generator look downright repetitive. And when you have to test
software or hardware, sometimes it is not randomness you want but
relatively even distribution, which gives you a higher confidence
of coverage in the testing.
Pseudo-randomness also gives you repeatability; given a particular
seed, you can generate the same sequence over and over again. Imagine
the frustration of someone who is trying to debug a subtle timing error
that can't be reproduced because the circumstances can't be set up
that caused the error in the first place...
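[The repeatability point can be sketched in a few lines of Python; `random.Random` here is just my illustration of the idea, not anyone's posted code:]

```python
import random

# Two generators started from the same seed produce identical sequences,
# which is exactly what you want when replaying a failing test run.
rng1 = random.Random(42)
rng2 = random.Random(42)
seq1 = [rng1.random() for _ in range(5)]
seq2 = [rng2.random() for _ in range(5)]
assert seq1 == seq2
```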
--
Duane Rettig Franz Inc. http://www.franz.com/ (www)
1995 University Ave Suite 275 Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253 ·····@Franz.COM (internet)
Duane Rettig <·····@franz.com> writes:
> eric dahlman <·······@cs.colostate.edu> writes:
>
> > Lieven Marchand <···@wyrd.be> writes:
> >
> > > In ACL 5.0 the RANDOM function doesn't work with bignum arguments
> > > above a certain size.
> > >
> > > USER(8): (random (fac 200))
> > >
> > > Error: This integer is too large to be converted to a double-float:
> > [snip big int]
> > > [condition type: SIMPLE-ERROR]
> > >
> > > Is this a bug or is this allowed by the standard as an implementation
> > > limitation?
> >
> > Even if this was just an implementation limitation I think that you
> > should look into getting your randomness from a different source.
> > Pseudo-random number generators like random are notoriously bad when
> > used to generate more than a handful of bits; even the 32 they claim
> > to do in one shot is really tenuous.
>
> This is not true. Random numbers larger than 32-bits are easy to
> generate. And with patches, Allegro CL 6.0 does a good job of this.
> We use the Mersenne Twister Algorithm, which has a large cycle.
I would disagree. There are lots of ways that random number
generators break down; fortunately, the authors of the Mersenne Twister
algorithm have done much of the analysis work for us. To quote:
However, it is widely believed that the spectral test is one of the
strongest tests to select a good generator. MT is designed to pass a
similar test, called the k-distribution test. For example, if we
look at the output with 32-bit accuracy, then the 623-tuples from the
output in a whole period are equidistributed in the 623-dimensional
unit cube. For 16-bit accuracy, 1246-tuples are equidistributed in
1246 dimensions, and for 2-bit accuracy, in 9968 dimensions. These
values are at least 15 times larger than the previous records.
I stole that from their web site at
http://www.math.keio.ac.jp/~matumoto/emt.html
This is actually really good, and Franz is right to be touting the
fact that they use this algorithm in their implementation. Just for
comparison, the old Fortran rand48 generator would have died at
3-tuples, and C's rand function could maybe get to 5. It has been a
few years since I looked at them, so I could be off a little, but the
point is that they are bad. On a side note, CMUCL also uses this
generator, and the Lisp code for their implementation is available on
the above site if anyone wants to look at how it works.[1]
The C and Fortran generators really cannot be any better than they are,
since this is really a function of the number of bits used to
represent the random state. If your random state is 32 bits, then you
cannot deterministically use that to generate a pseudo-random object
needing 128 bits; there are not enough bits. Depending on how your
generator cycles, you will either constantly regenerate the same subset
as on your first iteration, or you will generate a set of cosets, one
for each cycle. The error from the original poster, which I cut out,
indicated that the previous implementation was using what I presume
were 64-bit floats for the pseudo-random generator. This means that
generating any objects which require more than 64 bits is unsound.
Actually, you should not use more than half the number of bits in the
random state, but that is a different kettle of fish.
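[The state-width argument is easy to see with a deliberately tiny generator. This Python toy uses an 8-bit LCG (parameters chosen only for full period, not quality) to "generate" 16-bit values, and can never produce more than 256 distinct ones no matter how long it runs:]

```python
def lcg8(state):
    """A toy full-period LCG with an 8-bit state (a % 4 == 1, c odd)."""
    return (5 * state + 1) % 256

def draw16(state):
    """Build a 16-bit value by concatenating two successive outputs."""
    s1 = lcg8(state)
    s2 = lcg8(s1)
    return (s1 << 8) | s2, s2

outputs = set()
state = 0
for _ in range(1 << 16):       # far more draws than the state can support
    value, state = draw16(value := 0) if False else draw16(state)
    outputs.add(value)

# An 8-bit state determines everything that follows, so at most 2**8 of
# the 2**16 possible 16-bit values can ever appear.
assert len(outputs) <= 256
```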
Just for reference, the random state in the new generator uses 19936
bits. You will have the same problems with this as with the older versions
as you try to generate objects which require around 19936 bits to represent.
This is better than a limit of 32 bits, but it is still just as real.
And if you do not think that you would be generating things that large,
it happens implicitly all the time. For instance, if you are randomly
traversing a graph, you may only use a couple of bits for each local
decision; however, the entire path represents a larger random object.
That was one of the driving forces behind the MT generator, since other
generators would fail for Monte Carlo methods.
Finally, I would submit one more reason that even the MT algorithm
would break down for large objects. If we look at the claims of the
authors, which I have no reason to doubt, they say in so many words
that their generator will generate any 623-tuple of
32-bit words. There are, not surprisingly, 2^19936 such objects, which
happens to be the cycle length of the generator. This means that once
you have generated one 19936-bit object, you will never generate it
again until the generator has looped. All that you have in this case
is a permutation of all such objects. This is certainly not random,
since you are guaranteed that there will not be any chance of a repeat
occurrence of an event. This degradation of randomness is guaranteed
to happen simply because your random number generator is ideally an
automorphism on the set of random states. So as you approach the size
of your random state, things get less random. You are guaranteed to
lose; the math is against you.
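[The permutation argument can be demonstrated with the same kind of toy full-period LCG: over one period, every state appears exactly once, so a full-state-sized "random object" can never repeat until the cycle wraps:]

```python
# A full-period LCG mod 256 (multiplier % 4 == 1, increment odd) is a
# permutation of its state space: one pass through the period visits
# every state exactly once -- no repeats, hence not truly random.
a, c, m = 5, 1, 256
state = 0
seen = []
for _ in range(m):
    state = (a * state + c) % m
    seen.append(state)

assert len(set(seen)) == m     # all 256 states, each exactly once
```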
This has gotten much too long as it is, so I will stop. I am still
very opposed to Duane's statement that "Random numbers larger than
32-bits are easy to generate." While the use of the MT generator
certainly raises the bar for how many bits you can generate, it does not
remove the inherent mathematical implications of the way that
pseudo-random numbers are generated.
-Eric
Footnotes:
[1] Looking at the code, I saw one problem with the CMUCL code, and that
was that they use a fixnum as their seed. This is then used to
calculate the larger random state used by the generator. This means
that there are only 2^32 points from which random numbers can be
generated. This is significantly smaller than the 2^19936 which
exist, so this seeding process can significantly bias the set of numbers
which will be generated. Ideally, the whole random state would be the
seed; such a chunk of randomness could come from /dev/random under
Linux.
eric dahlman <·······@cs.colostate.edu> writes:
[...]
> The error from the original poster, which I cut out,
> indicated that the previous implementation was using what I presume
> were 64bit floats for the pseudo-random generator.
That is correct.
> This means that
> generating any objects which require more than 64 bits is unsound.
> Actually, you should not use more than half the number of bits in the
> random state but that is a different kettle of fish.
Actually, since a 64-bit float has only 52 real bits of mantissa, and
since we actually truncated them to 48 bits, the real usable bit width
is more like 24.
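[The double-float ceiling is easy to observe directly. In CPython, random() is built from 53 random bits scaled by 2**-53, so every result lands on that grid; this is an observation about CPython's implementation, offered only as an illustration of the mantissa limit:]

```python
import random

# Every value from random() is an exact multiple of 2**-53: a double's
# significand simply cannot carry finer granularity than that.
for _ in range(1000):
    r = random.random()
    assert (r * 2 ** 53).is_integer()
```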
> Just for reference the random state in the new generator uses 19936
> bits. You will have the same problems with this as the older versions
> as you try to generate objects which around 19936 bits to represent.
> This is better than a limit of 32 bits but it is still just as real.
[...]
> This has gotten much too long as it is so I will stop. I still am
> very opposed to Duane's statement that "Random numbers larger than
> 32-bits are easy to generate." Certainly the use of the MT generator
> ups the bar for how many bits you can generate it certainly does not
> remove the inherent mathematical implications of the way that
> pseudo-random numbers are generated.
It seems you have 32-bits-on-the-brain. In Lisp, 32-bits is a meaningless
width, except for performance considerations. Perhaps you should also
revise your statements for 64-bit architectures. As for me, I willingly
revise my statement to be more general:
"Random numbers larger than N bits are easy to generate; all you need is
a random-number generator that has at least N*2 bits of state."
This works for the original question: (integer-length (fac 200)) => 1246,
which is well under the 19936 bit MT cycle.
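[That integer-length figure checks out; the same arithmetic in Python:]

```python
import math

# (integer-length (fac 200)) in Lisp is bit_length() here: 200! needs
# 1246 bits, comfortably under the MT state size discussed above.
assert math.factorial(200).bit_length() == 1246
```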
--
Duane Rettig Franz Inc. http://www.franz.com/ (www)
1995 University Ave Suite 275 Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253 ·····@Franz.COM (internet)
Duane Rettig <·····@franz.com> writes:
> eric dahlman <·······@cs.colostate.edu> writes:
>
> [...]
> > The error from the original poster, which I cut out,
> > indicated that the previous implementation was using what I presume
> > were 64bit floats for the pseudo-random generator.
>
> That is correct.
>
> > This means that
> > generating any objects which require more than 64 bits is unsound.
> > Actually, you should not use more than half the number of bits in the
> > random state but that is a different kettle of fish.
>
> Actually, since a 64-bit float has only 52 real bits of mantissa, and
> since we actually truncated them to 48 bits, the real usable bit width
> is more like 24.
And just for reference, these are the middle-order (is that a word?) bits.
>
> > Just for reference the random state in the new generator uses 19936
> > bits. You will have the same problems with this as the older versions
> > as you try to generate objects which around 19936 bits to represent.
> > This is better than a limit of 32 bits but it is still just as real.
> [...]
>
> > This has gotten much too long as it is so I will stop. I still am
> > very opposed to Duane's statement that "Random numbers larger than
> > 32-bits are easy to generate." Certainly the use of the MT generator
> > ups the bar for how many bits you can generate it certainly does not
> > remove the inherent mathematical implications of the way that
> > pseudo-random numbers are generated.
>
> It seems you have 32-bits-on-the-brain. In Lisp, 32-bits is a meaningless
> width, except for performance considerations. Perhaps you should also
> revise your statements for 64-bit architectures. As for me, I willingly
> revise my statement to be more general:
I couldn't care less about how many bits there are; that was just
important in the context of the other generators described.
Pseudo-randomness was just delivered in 32-bit packages. It is also
important in the context of MT, since the algorithm is defined on an
array of 32-bit integers. Me personally, I couldn't care less how
many bits there are; I would be really happy with a 1-bit true random
number generator ;-)
>
> "Random numbers larger than N bits are easy to generate; all you need is
> a random-number generator that has at least N*2 bits of state."
>
> This works for the original question: (integer-length (fac 200)) => 1246,
> which is well under the 19936 bit MT cycle.
I would agree with that. The real problem with this bit limit comes
when you are doing something like a Monte Carlo simulation of these
things, because all of the bits used in the simulation are part of the
same conceptual object. So here we could only have 16 steps in a
simulation before the bad mojo starts. That is assuming that all the
bits are good, as I am inclined to believe after reading their web page.
That said, the bad mojo may not be that bad, depending on what you are
doing. One just needs to be aware of what is going on.
-Eric
>
> --
> Duane Rettig Franz Inc. http://www.franz.com/ (www)
> 1995 University Ave Suite 275 Berkeley, CA 94704
> Phone: (510) 548-3600; FAX: (510) 548-8253 ·····@Franz.COM (internet)
>>>>> "eric" == eric dahlman <·······@cs.colostate.edu> writes:
[nice discussion of MT generator snipped]
eric> Footnotes:
eric> [1] Looking at the code I saw one problem with the CMUCL code and that
eric> was that they use a fixnum as their seed. This is then used to
eric> calculate the larger random state used by the generator. This means
eric> that they are only 2^32 points from which random numbers can be
eric> generated. This is significantly smaller than the 2^19936 which
eric> exist, this seeding process can significantly bias the set of numbers
eric> which will be generated. Ideally, the whole random state would be the
eric> seed, such a chunk of randomness could come from /dev/random under
eric> Linux.
It's done this way because that's how the reference C code for the MT
generator seeded the generator. A seed (the time) is used to seed a
32-bit linear congruential generator. The output is used to seed the
MT generator.
Using /dev/random would be a good option for the x86 port.
Ray
Raymond Toy <···@rtp.ericsson.se> writes:
> It's done this way because that's how the referece C code for the MT
> generator seeded the generator. A seed (the time) is used to seed a
> 32-bit linear congruential generator. The output is used to seed the
> MT generator.
I hadn't looked at the C code but suspected as much. I went back, and
the authors acknowledge this as a problem and describe how to load
the whole state as a fix.
>
> Using /dev/random would be a good option for the x86 port.
If I can add things to the wish list, an interface to convert from
(simple-array (unsigned-byte 32) (627)) to a random state would be
nice in general, since then I could get my seeds from a true random
source somewhere and then load them on any platform. It is not hard to
hack such a thing, but it is also nice for these things to be
standardized.
-Eric
Duane Rettig <·····@franz.com> writes:
> eric dahlman <·······@cs.colostate.edu> writes:
[snip: answered in other thread]
> > In general if you start asking
> > for bigger chunks like the amount you would need for (random (fac
> > 200)) you need a real source of randomness. Check out
> > www.lavarand.com for a system out of SGI for doing this.
>
> True randomness is not always desirable. One of the problems with
> true randomness is that at a local level, coincidence can make the
> generator look downright repetitive. And when you have to test
> software or hardware, sometimes it is not randomness you want but
> relatively even distribution, which gives you a higher confidence of
> coverage in the testing.
If you want coverage testing, then do just that; don't use random test
cases or, worse yet, biased random test cases like you suggest. Random
testing is a very useful and statistically justified method for
evaluating software, but it is not about throwing a bunch of garbage at
the thing and seeing what will stick. It is no silver bullet; you
still need to understand the domain and implementation you are testing
in order to identify meaningful random tests. You also do not need to
allow for repetition in your tests; just select unique tests. If you
cannot generate a sufficient number of distinct tests, that should be a
hint that this is not the time for randomized testing.
I disagree that using stuff-that-people-think-looks-random-but-ain't
provides a higher confidence in testing. It sets off clanging bells
with bright blinking lights in my mind. You have introduced a bias
into the testing with no justification or correlation to either the
implementation or the domain. That said, using a biased random
generation scheme may be very well justified, but it has to be
justified. For instance:
* Our domain contains 23 different classes of input we need to
sample from each.
* We know all of the boundary conditions occur in this region
of the space, bias the tests to fall in this region.
* We have created Markov models of the input under these
different conditions. Stress test the implementation under
each.
Random testing is no substitute for coverage testing or partition
testing. I personally would be much happier with a statistically
verifiable statement about the probability that an error will be
encountered and that the program will correctly process some given
input. You can make these statements based on the way you sample the
input domain if you have a good random test generator; when you just
fiddle with things, you cannot say anything beyond "Well, Zeke tried it
a jillion times and it didn't break." [1]
> Pseudo-randomness also gives you repeatability; given a particular
> seed, you can generate the same sequence over and over again.
> Imagine the frustration of someone who is trying to debug a subtle
> timing error that can't be reproduced because the circumstances
> can't be set up that caused the error in the first place...
Then use the pseudo-random generator for debugging, or write your
random numbers to storage so that you can "play them back" when
debugging. There are many randomized algorithms, like simulated
annealing, genetic algorithms, and randomized SAT solvers, which are
mathematically guaranteed to converge to a solution or to have a
certain well-defined distribution of approximations, but which do not
converge in practice. This is because of unwanted interaction with
the pseudo-random number generator; change or replace the generator
and your results change. Hook it up to a real entropy source and
magically it works the way the math predicts. This is an even worse
bug, since everything is correct, it just doesn't work.
My point is that many things depend on randomness to operate, and when
you hook a complex system up to a pseudo-random number generator it is
quite common for dysfunctional behavior to emerge.
-Eric
>
> --
> Duane Rettig Franz Inc. http://www.franz.com/ (www)
> 1995 University Ave Suite 275 Berkeley, CA 94704
> Phone: (510) 548-3600; FAX: (510) 548-8253 ·····@Franz.COM (internet)
Footnotes:
[1] I have to confess when hacking on things I have been known to pipe
lots of junk through the code to see where things break. But it is
nothing more than the fact that I cannot be bothered to figure out
what I really should test. This catches lots of bugs but I would
never try to sell that to a customer as a valid testing strategy.
From: Reini Urban
Subject: Re: Test Methodology (was Re: RANDOM with bignum argument)
Message-ID: <3ad2b535.68067776@judy>
eric dahlman wrote:
....
>-Eric
isn't that a trademark?
--
Reini Urban
http://xarch.tu-graz.ac.at/autocad/news/faq/autolisp.html
······@sbox.tu-graz.ac.at (Reini Urban) writes:
> eric dahlman wrote:
> ....
> >-Eric
>
> isn't that a trademark?
I think that only the reader macro versions are trademarked.
;-)
-Eric
eric dahlman <·······@cs.colostate.edu> writes:
> Lieven Marchand <···@wyrd.be> writes:
>
> > In ACL 5.0 the RANDOM function doesn't work with bignum arguments
> > above a certain size.
> >
> > USER(8): (random (fac 200))
> >
> > Error: This integer is too large to be converted to a double-float:
> [snip big int]
> > [condition type: SIMPLE-ERROR]
> >
> > Is this a bug or is this allowed by the standard as an implementation
> > limitation?
>
> Even if this was just an implementation limitation I think that you
> should look into getting your randomness from a different source.
> Pseudo-random number generators like random are notoriously bad when
> used to generate more than a handful of bits; even the 32 they claim
> to do in one shot is really tenuous. In general, if you start asking
> for bigger chunks like the amount you would need for (random (fac
> 200)) you need a real source of randomness. Check out
> www.lavarand.com for a system out of SGI for doing this.
The randomness isn't very important for the application. I'm
generating a fixed number of bases for Fermat tests in a probabilistic
primality test. So an easy workaround for the problem was generating
random bases below the limit ACL has problems with instead of the
total range of [2, possible-prime-number].
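[That workaround is straightforward to sketch. The Python below is my own illustration, not Lieven's code; `base_cap` stands in for "the limit ACL has problems with":]

```python
import random

def fermat_probably_prime(n, trials=20, base_cap=None):
    """Fermat test with bases drawn from [2, min(n-2, base_cap)].

    Capping the bases loses nothing essential: any base coprime to a
    prime n satisfies a^(n-1) == 1 (mod n), wherever it lies.
    """
    if n < 4:
        return n in (2, 3)
    hi = n - 2 if base_cap is None else min(n - 2, base_cap)
    for _ in range(trials):
        a = random.randint(2, hi)
        if pow(a, n - 1, n) != 1:
            return False       # a is a Fermat witness: n is composite
    return True

assert fermat_probably_prime(97)
assert not fermat_probably_prime(100)
assert fermat_probably_prime(2 ** 89 - 1, base_cap=2 ** 48)  # capped bases
```

Note that plain Fermat testing is still fooled by Carmichael numbers whatever the bases, which is why the follow-up suggests Miller-Rabin.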
--
Lieven Marchand <···@wyrd.be>
Glaðr ok reifr skyli gumna hverr, unz sinn bíðr bana.
Jochen Schmidt <···@dataheaven.de> writes:
> Lieven Marchand wrote:
>
> > The randomness isn't very important for the application. I'm
> > generating a fixed number of bases for Fermat tests in a probabilistic
> > primality test. So an easy work around for the problem was generating
> > random bases below the limit ACL has problems with instead of the
> > total range of [2, possible-prime-number].
>
> Why not a Miller-Rabin Test? Its safer than Fermat and faster than
> Solovay-Strassen...
Thanks for the code. I needed a primality test for a little run-once
program to do with a puzzle in a magazine, and I could remember Fermat
without my books, which I didn't have at hand, but not the more advanced
methods.
I know Solovay-Strassen as a multiplication algorithm. What's their
primality test?
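[For readers without Jochen's code at hand, here is a standard Miller-Rabin sketch in Python (my own, not the code posted in the thread); unlike plain Fermat, it catches Carmichael numbers like 561:]

```python
import random

def miller_rabin(n, trials=20):
    """Miller-Rabin probabilistic primality test."""
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False       # strong witness found: n is composite
    return True

assert miller_rabin(2 ** 89 - 1)   # a Mersenne prime
assert not miller_rabin(561)       # Carmichael number; Fermat-fooled, not MR
```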
--
Lieven Marchand <···@wyrd.be>
Glaðr ok reifr skyli gumna hverr, unz sinn bíðr bana.
Lieven Marchand <···@wyrd.be> writes:
>
> In ACL 5.0 the RANDOM function doesn't work with bignum arguments
> above a certain size.
>
> USER(8): (random (fac 200))
>
> Error: This integer is too large to be converted to a double-float:
> 788657867364790503552363213932185062295135977687173263294742
> 5332443594499634033429203042840119846239041772121389196388302576427902426371050
> 6192662495282993111346285727076331723739698894392244562145166424025403329186413
> 1227428294853277524242407573903240321257405579568660226031904170324062351700858
> 796178922222789623703897374720000000000000000000000000000000000000000000000000
> [condition type: SIMPLE-ERROR]
>
> Is this a bug or is this allowed by the standard as an implementation
> limitation?
Maybe it's because I never played with implementing random numbers and I
have no first-hand knowledge of how computationally expensive this
would be, but I don't really see a problem with an implementation
having a limitation of this kind. I don't know whether the standard
explicitly permits it, but the standard does permit implementations to
have capacity restrictions.
HOWEVER, I do think it should not signal ERROR but rather
SERIOUS-CONDITION (similar to the way STORAGE-CONDITION works), since
it's not documented to be an error.
Then again, this is just my personal opinion and not an official judgment.
There is no agency that can judge implementational right-or-wrong; we leave
it to the marketplace to vote with their feet.