From: Kenneth Tilton
Subject: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a771de$0$5896$607ed4bc@cv.net>
[This one comes without a solution in Lisp, just a description of what I 
think has to be a worser way.]

So I want N values distributed normally around 100. Standard deviation 
your choice, cuz I don't know what those are.

All I can think of is the sine function. Walk value v from 2-3sd* below 
to 2-3 sd above, treating that as pi to 0 and actually populating a 
collection with (* (/ N 2) (sin v)) copies of v, shuffling and then 
popping to assign them to N places.

I just keep thinking building the population is unnecessary and that a 
straight calculation is possible, without any foundation for said thoughts.

kt

* Oh, hell, from 75 to 125.

From: Pillsy
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <2aec69e2-ebbd-472e-879f-0423a2d65c66@41g2000yqf.googlegroups.com>
On Feb 26, 11:54 pm, Kenneth Tilton <·········@gmail.com> wrote:

> So I want N values distributed normally around 100. Standard deviation
> your choice, cuz I don't know what those are.

The easy way is to approximate it by adding up a bunch of uniformly
distributed values and take the average.

(defun quick-n-dirty-normal (&optional (mean 0d0) (dev 1d0))
  (+ mean (* dev (- (loop :repeat 12 :summing (random 1d0)) 6))))

It's not perfect, and it's not that efficient, but it sure is easy.
You sum 12 times because the math works out so that the standard
deviation will come out to 1 by default.

Cheers, Pillsy
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a78fbb$0$5893$607ed4bc@cv.net>
Pillsy wrote:
> On Feb 26, 11:54 pm, Kenneth Tilton <·········@gmail.com> wrote:
> 
>> So I want N values distributed normally around 100. Standard deviation
>> your choice, cuz I don't know what those are.
> 
> The easy way is to approximate it by adding up a bunch of uniformly
> distributed values and take the average.
> 
> (defun quick-n-dirty-normal (&optional (mean 0d0) (dev 1d0))
>   (+ mean (* dev (- (loop :repeat 12 :summing (random 1d0)) 6))))
> 
> It's not perfect, and it's not that efficient, but it sure is easy.
> You sum 12 times because the math works out so that the standard
> deviation will come out to 1 by default.

Could have a winner here. But...

...that works?! Some average, subtracting 6. Works great if we roll 12 
ones...oh, ok. You musta left that out.

Meanwhile. I do not see anything inefficient at all.

so far, so winner.

kt
From: Thomas A. Russ
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <ymi63ivqypl.fsf@blackcat.isi.edu>
Kenneth Tilton <·········@gmail.com> writes:

> Pillsy wrote:
> > On Feb 26, 11:54 pm, Kenneth Tilton <·········@gmail.com> wrote:
> >
> >> So I want N values distributed normally around 100. Standard deviation
> >> your choice, cuz I don't know what those are.
> > The easy way is to approximate it by adding up a bunch of uniformly
> > distributed values and take the average.
> > (defun quick-n-dirty-normal (&optional (mean 0d0) (dev 1d0))
> >   (+ mean (* dev (- (loop :repeat 12 :summing (random 1d0)) 6))))
> > It's not perfect, and it's not that efficient, but it sure is easy.
> > You sum 12 times because the math works out so that the standard
> > deviation will come out to 1 by default.
> 
> Could have a winner here. But...
> 
> ...that works?! Some average, subtracting 6. Works great if we roll 12
> ones...oh, ok. You musta left that out.

Um.  The loop is designed to produce a sample from a normal distribution
centered (mean) at 0.0 with a standard deviation of 1.0 so that you can
very nicely scale things appropriately.

Since you are summing 12 random numbers in the range [0..1.0) you would
expect the mean to be 6, so if you therefore subtract 6 you get a
zero-centered distribution.

-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a88359$0$20283$607ed4bc@cv.net>
Thomas A. Russ wrote:
> Kenneth Tilton <·········@gmail.com> writes:
> 
>> Pillsy wrote:
>>> On Feb 26, 11:54 pm, Kenneth Tilton <·········@gmail.com> wrote:
>>>
>>>> So I want N values distributed normally around 100. Standard deviation
>>>> your choice, cuz I don't know what those are.
>>> The easy way is to approximate it by adding up a bunch of uniformly
>>> distributed values and take the average.
>>> (defun quick-n-dirty-normal (&optional (mean 0d0) (dev 1d0))
>>>   (+ mean (* dev (- (loop :repeat 12 :summing (random 1d0)) 6))))
>>> It's not perfect, and it's not that efficient, but it sure is easy.
>>> You sum 12 times because the math works out so that the standard
>>> deviation will come out to 1 by default.
>> Could have a winner here. But...
>>
>> ...that works?! Some average, subtracting 6. Works great if we roll 12
>> ones...oh, ok. You musta left that out.
> 
> Um.  The loop is designed to produce a sample from a normal distribution
> centered (mean) at 0.0 with a standard deviation of 1.0 so that you can
> very nicely scale things appropriately.
> 
> Since you are summing 12 random numbers in the range [0..1.0) you would
> expect the mean to be 6, so if you therefore subtract 6 you get a
> zero-centered distribution.
> 

Oh, I get it. From -6 to 6. And that is a magic number range that Does 
The Right Thing(tm) given something else I do not follow. Definitely 
works well.

Cool, now we have three solutions!

Thx, kt
From: Tamas K Papp
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <70rgltFhji77U1@mid.individual.net>
On Fri, 27 Feb 2009 19:20:41 -0500, Kenneth Tilton wrote:

> Thomas A. Russ wrote:
>> Kenneth Tilton <·········@gmail.com> writes:
>> 
>>> Pillsy wrote:
>>>> On Feb 26, 11:54 pm, Kenneth Tilton <·········@gmail.com> wrote:
>>>>
>>>>> So I want N values distributed normally around 100. Standard
>>>>> deviation your choice, cuz I don't know what those are.
>>>> The easy way is to approximate it by adding up a bunch of uniformly
>>>> distributed values and take the average.
>>>> (defun quick-n-dirty-normal (&optional (mean 0d0) (dev 1d0))
>>>>   (+ mean (* dev (- (loop :repeat 12 :summing (random 1d0)) 6))))
>>>> It's not perfect, and it's not that efficient, but it sure is easy.
>>>> You sum 12 times because the math works out so that the standard
>>>> deviation will come out to 1 by default.
>>> Could have a winner here. But...
>>>
>>> ...that works?! Some average, subtracting 6. Works great if we roll 12
>>> ones...oh, ok. You musta left that out.
>> 
>> Um.  The loop is designed to produce a sample from a normal
>> distribution centered (mean) at 0.0 with a standard deviation of 1.0 so
>> that you can very nicely scale things appropriately.
>> 
>> Since you are summing 12 random numbers in the range [0..1.0) you would
>> expect the mean to be 6, so if you therefore subtract 6 you get a
>> zero-centered distribution.
>> 
>> 
> Oh, I get it. From -6 to 6. And that is a magic number range that Does
> The Right Thing(tm) given something else I do not follow. Definitely
> works well.

Hint: the expected value of (random 1d0) is 0.5, and 6=0.5*12.  That's
where the "magic number" 6 is coming from: you just center the sum by
subtracting its expected value.

> Cool, now we have three solutions!

As remarked above, this is only an approximation, which gets better as
the number of summed terms (12 here) goes to infinity.  OTOH the B-M
solution from Scott gives you what you want right away, very
efficiently; you should use that if you do want normal variates.

Tamas
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a8a6cf$0$20301$607ed4bc@cv.net>
Tamas K Papp wrote:
> On Fri, 27 Feb 2009 19:20:41 -0500, Kenneth Tilton wrote:
> 
>> Thomas A. Russ wrote:
>>> Kenneth Tilton <·········@gmail.com> writes:
>>>
>>>> Pillsy wrote:
>>>>> On Feb 26, 11:54 pm, Kenneth Tilton <·········@gmail.com> wrote:
>>>>>
>>>>>> So I want N values distributed normally around 100. Standard
>>>>>> deviation your choice, cuz I don't know what those are.
>>>>> The easy way is to approximate it by adding up a bunch of uniformly
>>>>> distributed values and take the average.
>>>>> (defun quick-n-dirty-normal (&optional (mean 0d0) (dev 1d0))
>>>>>   (+ mean (* dev (- (loop :repeat 12 :summing (random 1d0)) 6))))
>>>>> It's not perfect, and it's not that efficient, but it sure is easy.
>>>>> You sum 12 times because the math works out so that the standard
>>>>> deviation will come out to 1 by default.
>>>> Could have a winner here. But...
>>>>
>>>> ...that works?! Some average, subtracting 6. Works great if we roll 12
>>>> ones...oh, ok. You musta left that out.
>>> Um.  The loop is designed to produce a sample from a normal
>>> distribution centered (mean) at 0.0 with a standard deviation of 1.0 so
>>> that you can very nicely scale things appropriately.
>>>
>>> Since you are summing 12 random numbers in the range [0..1.0) you would
>>> expect the mean to be 6, so if you therefore subtract 6 you get a
>>> zero-centered distribution.
>>>
>>>
>> Oh, I get it. From -6 to 6. And that is a magic number range that Does
>> The Right Thing(tm) given something else I do not follow. Definitely
>> works well.
> 
> Hint: the expected value of (random 1d0) is 0.5, and 6=0.5*12.  That's
> where the "magic number" 6 is coming from: you just center the sum by
> subtracting its expected value.

But reviewing:

(defun quick-n-dirty-normal (&optional (mean 0d0) (dev 1d0))
   (+ mean (* dev (- (loop :repeat 12 :summing (random 1d0)) 6))))

We see the -6<->6 value multiplied by the s.dev to get an actual 
deflection. If I had looped 50 times and subtracted 25 I would have a 
range of -25<->25 and still be multiplying by the s.dev.

I eyeballed twenty values from qnd and they looked OK (I did 100 w sdev 
15 since that was offered under the name IQ) and I pretty much 
recognized the crowd at my sports bar so I thought 6 was magic.

kt
From: Rob Warnock
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <-P2dnRUVMMHJLzXUnZ2dnUVZ_uSdnZ2d@speakeasy.net>
Kenneth Tilton  <·········@gmail.com> wrote:
+---------------
| But reviewing:
| (defun quick-n-dirty-normal (&optional (mean 0d0) (dev 1d0))
|    (+ mean (* dev (- (loop :repeat 12 :summing (random 1d0)) 6))))
| 
| We see the -6<->6 value multiplied by the s.dev to get an actual 
| deflection. If I had looped 50 times and subtracted 25 I would have a 
| range of -25<->25 and still be multiplying by the s.dev.
| 
| I eyeballed twenty values from qnd and they looked OK (I did 100 w sdev 
| 15 since that was offered under the name IQ) and I pretty much 
| recognized the crowd at my sports bar so I thought 6 was magic.
+---------------

Since no-one else has mentioned it yet, I feel obligated to present
a reminder that nothing that truncates the range of the distribution
(such as the above Q&D summation) can actually be a "normal" or "Gaussian"
distribution, since the tails of the latter extend to infinity on both sides.
Always. [Albeit with however low a probability...]  But since you're
only looking for a QUICK-N-DIRTY-NORMAL, that's probably not an issue
for you. But it just needed saying, 'kay?

On the other hand, Scott's 3-liner:

    (defun random-IQ-score ()
      (+ 100 (* 15 (sqrt (* -2 (log (random 1.0))))
		   (cos (* 2 pi (random 1.0))))))

*does* have unlimited tails [good], but can [rarely, *very* rarely!]
blow up with an arithmetic exception in the case when (RANDOM 1.0)
returns exactly 0.0 [oops].

This should be safe from blowups [though still not from negative IQs],
though it biases the results a tiny bit [a really, *really* tiny bit!]:

    (defun random-IQ-score ()
      (+ 100 (* 15 (sqrt (* -2 (log (+ (random 1d0)
				       least-positive-double-float))))
		   (cos (* 2 pi (random 1d0))))))

With IEEE doubles, the range of this is now approximately -479 to +679.
But in 10 million trials, fewer than a dozen times was the result less
than 25 or more than 175...
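
The quoted range can be reproduced with a quick calculation. This is
only a sketch: IQ-EXTREMES is a made-up name, and it assumes that
LEAST-POSITIVE-DOUBLE-FLOAT is the IEEE denormal 2^-1074, which holds
only on implementations that support denormals.

```lisp
;; Hypothetical helper, not part of the code posted in this thread.
;; When (random 1d0) returns 0, the LOG argument is exactly
;; least-positive-double-float, so the radius term is at its largest.
;; The COS factor lies in [-1, 1], which gives the two extremes.
(defun iq-extremes (&optional (mean 100d0) (dev 15d0))
  (let ((r (sqrt (* -2 (log least-positive-double-float)))))
    (values (- mean (* dev r))
            (+ mean (* dev r)))))
```

Under that assumption the two values come out just inside -479 and
+679, which matches the figures above.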


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a8e307$0$20299$607ed4bc@cv.net>
Rob Warnock wrote:
> Kenneth Tilton  <·········@gmail.com> wrote:
> +---------------
> | But reviewing:
> | (defun quick-n-dirty-normal (&optional (mean 0d0) (dev 1d0))
> |    (+ mean (* dev (- (loop :repeat 12 :summing (random 1d0)) 6))))
> | 
> | We see the -6<->6 value multiplied by the s.dev to get an actual 
> | deflection. If I had looped 50 times and subtracted 25 I would have a 
> | range of -25<->25 and still be multiplying by the s.dev.
> | 
> | I eyeballed twenty values from qnd and they looked OK (I did 100 w sdev 
> | 15 since that was offered under the name IQ) and I pretty much 
> | recognized the crowd at my sports bar so I thought 6 was magic.
> +---------------
> 
> Since no-one else has mentioned it yet, I feel obligated to present
> a reminder that nothing that truncates the range of the distribution
> (such as the above Q&D summation) can actually be a "normal" or "Gaussian"
> distribution, since the tails of the latter extend to infinity on both sides.
> Always. [Albeit with however low a probability...]  But since you're
> only looking for a QUICK-N-DIRTY-NORMAL, that's probably not an issue
> for you. But it just needed saying, 'kay?

Could you hedge on this non-asserting assertion a little more?

> 
> On the other hand, Scott's 3-liner:
> 
>     (defun random-IQ-score ()
>       (+ 100 (* 15 (sqrt (* -2 (log (random 1.0))))
> 		   (cos (* 2 pi (random 1.0))))))
> 
> *does* have unlimited tails [good], but can [rarely, *very* rarely!]
> blow up with an arithmetic exception in the case when (RANDOM 1.0)
> returns exactly 0.0 [oops].

It is not clear that any value for rare is acceptable on this NG.

> 
> This should be safe from blowups [though still not from negative IQs],
> though it biases the results a tiny bit [a really, *really* tiny bit!]:

A really, *really* tiny bit is how Bush <spit> won the Presidency, 'mkay?

> 
>     (defun random-IQ-score ()
>       (+ 100 (* 15 (sqrt (* -2 (log (+ (random 1d0)
> 				       least-positive-double-float))))
> 		   (cos (* 2 pi (random 1d0))))))
> 
> With IEEE doubles, the range of this is now approximately -479 to +679.
> But in 10 million trials, fewer than a dozen times was the result less
> than 25 or more than 175...

sweet. the +175 must have been the tall blonde with green eyes kicking 
ass and taking numbers on the shuffleboard.

hth,kt
From: Thomas A. Russ
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <ymiab83n6rh.fsf@blackcat.isi.edu>
····@rpw3.org (Rob Warnock) writes:
> This should be safe from blowups [though still not from negative IQs],
> though it biases the results a tiny bit [a really, *really* tiny bit!]:
> 
>     (defun random-IQ-score ()
>       (+ 100 (* 15 (sqrt (* -2 (log (+ (random 1d0)
> 				       least-positive-double-float))))
> 		   (cos (* 2 pi (random 1d0))))))

Does it really bias it any more than before?
I ask because the range of (random 1.0d0) is actually [0.0, 1.0)
and not [0.0, 1.0], and this is presumably being shifted slightly
to (0.0, 1.0), since the least-positive-double-float won't be large
enough to change the value of 1.0d0 when added.

It does shift all of the low-valued results slightly, up until you get a
slight discontinuity at the point where adding the
least-positive-double-float doesn't change the value.

I'm not sure, but would that be the value at

 (/ least-positive-double-float double-float-epsilon)

?



-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: Rob Warnock
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <6padnTADmb3kaDHUnZ2dnUVZ_gKWnZ2d@speakeasy.net>
Thomas A. Russ <···@sevak.isi.edu> wrote:
+---------------
| ····@rpw3.org (Rob Warnock) writes:
| > This should be safe from blowups [though still not from negative IQs],
| > though it biases the results a tiny bit [a really, *really* tiny bit!]:
| > 
| >     (defun random-IQ-score ()
| >       (+ 100 (* 15 (sqrt (* -2 (log (+ (random 1d0)
| > 				       least-positive-double-float))))
| > 		   (cos (* 2 pi (random 1d0))))))
| 
| Does it really bias it any more than before?
| I ask because the range of (random 1.0d0) is actually [0.0, 1.0)
| and not [0.0, 1.0], and this is presumably being shifted slightly
| to (0.0, 1.0), since the least-positive-double-float won't be large
| enough to change the value of 1.0d0 when added.  ...
+---------------

Actually, as Tamas Papp pointed out, the correct domain for the
Box-Muller algorithm is (0.0, 1.0] anyway, so that "the right thing"
[after Scott corrected Tamas's typo] is to use (- 1d0 (random 1d0))
instead, and so now we're back to three lines:

    (defun random-IQ-score ()
      (+ 100 (* 15 (sqrt (* -2 (log (- 1d0 (random 1d0)))))
		   (cos (* 2 pi (random 1d0))))))


-Rob

p.s. Scott also suggested using the SIN result rather than COS,
but you're going to lose one point from the range in any case.
I don't see that it matters from which end.

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Pillsy
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <69d9af44-eb2b-416a-afa5-967883b92d04@w34g2000yqm.googlegroups.com>
On Feb 27, 9:51 pm, Kenneth Tilton <·········@gmail.com> wrote:
[...]
> I eyeballed twenty values from qnd and they looked OK (I did 100 w sdev
> 15 since that was offered under the name IQ) and I pretty much
> recognized the crowd at my sports bar so I thought 6 was magic.

The "magic" is that when you add up 12 of them, you get a standard
deviation of 1; the standard deviation of a single random variable
uniformly distributed between 0 and 1 (or -1/2 and 1/2) is
(/ (sqrt 12)), and if you sum up N independent, identically
distributed random variables, you multiply the standard deviation
by (sqrt N).
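
To spell that out: a sum of N uniforms on [0, 1) has mean N/2 and
standard deviation (sqrt (/ N 12)). The sketch below generalizes the
12-term version accordingly. SUM-OF-UNIFORMS-NORMAL and its N parameter
are names invented for illustration, and N is assumed to be at least 1.

```lisp
;; Hypothetical generalization of QUICK-N-DIRTY-NORMAL to N uniforms.
(defun sum-of-uniforms-normal (&optional (mean 0d0) (dev 1d0) (n 12))
  (let ((sum (loop :repeat n :summing (random 1d0)))
        ;; standard deviation of the raw sum: sqrt(n/12)
        (sum-dev (sqrt (/ n 12d0))))
    ;; center on 0, rescale to unit deviation, then apply MEAN and DEV
    (+ mean (* dev (/ (- sum (/ n 2d0)) sum-dev)))))
```

With N = 12 the rescaling factor is exactly 1, which is why the
original needs no division. Looping 50 times and subtracting 25 would
also need a division by (sqrt (/ 50 12d0)) before multiplying by DEV.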

Cheers,
Pillsy
From: Scott
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <7ae7c95b-e80f-48cd-8160-e643458e1a5f@t3g2000yqa.googlegroups.com>
On Feb 26, 9:54 pm, Kenneth Tilton <·········@gmail.com> wrote:
> [This one comes without a solution in Lisp, just a description of what I
> think has to be a worser way.]
>
> So I want N values distributed normally around 100. Standard deviation
> your choice, cuz I don't know what those are.
>

I think you want this:

    http://en.wikipedia.org/wiki/Box-Muller_transform

That will give you a Normal distribution with mean 0 and stdev 1.
Multiply these numbers times the standard deviation you want.  Then
add 100 to change the mean.


Cheers,
    -Scott
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a78c57$0$5928$607ed4bc@cv.net>
Scott wrote:
> On Feb 26, 9:54 pm, Kenneth Tilton <·········@gmail.com> wrote:
>> [This one comes without a solution in Lisp, just a description of what I
>> think has to be a worser way.]
>>
>> So I want N values distributed normally around 100. Standard deviation
>> your choice, cuz I don't know what those are.
>>
> 
> I think you want this:
> 
>     http://en.wikipedia.org/wiki/Box-Muller_transform
> 
> That will give you a Normal distribution with mean 0 and stdev 1.
> Multiply these numbers times the standard deviation you want.  Then
> add 100 to change the mean.

My understanding is that you can STFU until you can tell me for what I 
am shilling or apologize for the libel. Don't call me names just because 
you do not enjoy technology as much as I do.

kt
From: Scott
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <281a001b-67e0-46eb-a7d3-c9c63e5a61de@z9g2000yqi.googlegroups.com>
On Feb 26, 11:47 pm, Kenneth Tilton <·········@gmail.com> wrote:
>
> My understanding is that you can STFU until you can tell me for what I
> am shilling or apologize for the libel. Don't call me names just because
> you do not enjoy technology as much as I do.
>

Please accept my apology.

(defun box-muller ()
  (let ((u (random 1.0))
        (v (random 1.0)))
    (let ((r (sqrt (* -2 (log u))))
          (a (* 2 pi v)))
      (list (* r (cos a)) (* r (sin a))))))

(defun random-IQ-score ()
  (+ 100 (* 15 (car (box-muller)))))
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a79acc$0$20301$607ed4bc@cv.net>
Scott wrote:
> On Feb 26, 11:47 pm, Kenneth Tilton <·········@gmail.com> wrote:
>> My understanding is that you can STFU until you can tell me for what I
>> am shilling or apologize for the libel. Don't call me names just because
>> you do not enjoy technology as much as I do.
>>
> 
> Please accept my apology.

Sure!

> 
> (defun box-muller ()
>   (let ((u (random 1.0))
>         (v (random 1.0)))
>     (let ((r (sqrt (* -2 (log u))))
>           (a (* 2 pi v)))
>       (list (* r (cos a)) (* r (sin a))))))
> 

Ah, i didn't have that, I had something gaussian.

Now things get ugly: how could one compare solutions of something like 
this?! Is there a measure of the normalcy of normal distributions?!!

jes curious. ran some numbers, they look good enough for government work.

thx, kt
From: Scott
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <ccb6068d-31b9-4e3c-bacb-7fa662ee45d6@v15g2000yqn.googlegroups.com>
On Feb 27, 12:48 am, Kenneth Tilton <·········@gmail.com> wrote:

>
> Now things get ugly: how could one compare solutions of something like
> this?! Is there a measure of the normalcy of normal distributions?!!
>

There are several ways, one of which I like and have used.  I assume
you want Common Lisp code, and not the name of the algorithm or link
to the article...  The trick is that you sort the output results and
integrate them.  Then you can compare that to the cumulative
distribution function you were expecting.
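
One reading of that trick, as a rough sketch: sort the sample, treat
rank/N as the empirical cumulative distribution, and record the largest
gap from the normal CDF (essentially the Kolmogorov-Smirnov statistic).
NORMAL-CDF and MAX-CDF-GAP are made-up names. Common Lisp has no
standard ERF, so the CDF here assumes the Abramowitz and Stegun 7.1.26
approximation, which is good to roughly 1.5e-7.

```lisp
;; Standard normal CDF via the Abramowitz & Stegun 7.1.26 erf approximation.
(defun normal-cdf (x)
  (let* ((z (/ (abs x) (sqrt 2d0)))
         (tt (/ 1d0 (+ 1d0 (* 0.3275911d0 z))))
         (poly (* tt (+ 0.254829592d0
                        (* tt (+ -0.284496736d0
                                 (* tt (+ 1.421413741d0
                                          (* tt (+ -1.453152027d0
                                                   (* tt 1.061405429d0))))))))))
         (erf (- 1d0 (* poly (exp (- (* z z)))))))
    ;; erf is odd, so negative X mirrors the positive case
    (if (minusp x)
        (* 0.5d0 (- 1d0 erf))
        (* 0.5d0 (+ 1d0 erf)))))

;; Largest gap between the empirical CDF of SAMPLES and the normal CDF
;; with the given MEAN and DEV.  Smaller means a closer fit.
(defun max-cdf-gap (samples &optional (mean 0d0) (dev 1d0))
  (let* ((sorted (sort (copy-seq samples) #'<))
         (n (length sorted)))
    (loop :for x :in (coerce sorted 'list)
          :for i :from 1
          :for f = (normal-cdf (/ (- x mean) dev))
          ;; the empirical CDF jumps from (i-1)/n to i/n at x
          :maximize (max (abs (- (/ i n) f))
                         (abs (- (/ (1- i) n) f))))))
```

Feeding it a few thousand values from each generator and comparing the
two gaps gives a single number per generator, though the number is
itself random and varies from run to run.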
From: Tamas K Papp
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <70qp87Fi2so9U1@mid.individual.net>
On Fri, 27 Feb 2009 02:48:43 -0500, Kenneth Tilton wrote:

> Now things get ugly: how could one compare solutions of something like
> this?! Is there a measure of the normalcy of normal distributions?!!

The aforementioned algorithm gives you normal random variates by
construction, so I don't see what you are trying to do.  Are you
comparing some other draws from an unknown distribution to a normal
distribution?

Tamas
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a86a26$0$13281$607ed4bc@cv.net>
Tamas K Papp wrote:
> On Fri, 27 Feb 2009 02:48:43 -0500, Kenneth Tilton wrote:
> 
>> Now things get ugly: how could one compare solutions of something like
>> this?! Is there a measure of the normalcy of normal distributions?!!
> 
> The aforementioned algorithm gives you normal random variates by
> construction, so I don't see what you are trying to do.  Are you
> comparing some other draws from an unknown distribution to a normal
> distribution?

No. I think you missed that I ended up with two fine solutions and was 
just wondering if one of you smart people knew if there was even such 
a thing as measuring which was "better".

I guess first we look to see if the s.d. of the population matches the 
target, and would we also want the mean to be close to the designated 
mean? Not sure we want to reward someone more for being a little closer. 
Then is there some measure that picks up too tight a clustering or 
something?

jes curious.

kxo
From: D Herring
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a8a131$0$3339$6e1ede2f@read.cnntp.org>
Kenneth Tilton wrote:
> Tamas K Papp wrote:
>> On Fri, 27 Feb 2009 02:48:43 -0500, Kenneth Tilton wrote:
>>
>>> Now things get ugly: how could one compare solutions of something like
>>> this?! Is there a measure of the normalcy of normal distributions?!!
>>
>> The aforementioned algorithm gives you normal random variates by
>> construction, so I don't see what you are trying to do.  Are you
>> comparing some other draws from an unknown distribution to a normal
>> distribution?
> 
> No. I think you missed that I ended up with two fine solutions and was 
> just wondering if one of you smart people knew if there was even such a 
> thing as measuring which was "better".
> 
> I guess first we look to see if the s.d. of the population matches the 
> target, and would we also want the mean to be close to the designated 
> mean? Not sure we want to reward someone more for being a little closer. 
> Then is there some measure that picks up too tight a clustering or 
> something?

If you have an unbiased number generator, the "quality" of fit will 
improve as you increase the sample size (i.e. generate more numbers).

There are numerous metrics for detecting the "normalcy" of a 
distribution.  The most common are the variance, skew, and kurtosis. 
The mean just tells you if the distributions are centered at the same 
spot.

http://en.wikipedia.org/wiki/Moment_(mathematics)
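
A minimal sketch of those metrics, assuming population (divide by N)
moments and a list of at least two distinct numbers. SAMPLE-MOMENTS is
a name made up for this example.

```lisp
;; Hypothetical helper: returns mean, variance, skewness and excess
;; kurtosis of the list XS as four values.
(defun sample-moments (xs)
  (let* ((n (length xs))
         (mean (/ (reduce #'+ xs) n)))
    (flet ((central (k)
             ;; k-th central moment: average of (x - mean)^k
             (/ (loop :for x :in xs :summing (expt (- x mean) k)) n)))
      (let ((m2 (central 2)))
        (values mean
                m2
                (/ (central 3) (expt m2 3/2))
                (- (/ (central 4) (* m2 m2)) 3))))))
```

For a true normal, skewness and excess kurtosis are both 0. A centered
sum of 12 uniforms also has skewness 0, but its theoretical excess
kurtosis is -0.1, so kurtosis is the moment that could separate the
two generators given a large enough sample.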

- Daniel
From: GP lisper
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <slrngqoeir.6b6.spambait@phoenix.clouddancer.com>
On Fri, 27 Feb 2009 21:28:03 -0500, <········@at.tentpost.dot.com> wrote:
>
> If you have an unbiased number generator, the "quality" of fit will 
> improve as you increase the sample size (i.e. generate more numbers).
>
> There are numerous metrics for detecting the "normalcy" of a 
> distribution.  The most common are the variance, skew, and kurtosis. 

None of which detect the most common problem, failing two-point
correlation.

-- 
Being a Slime user means living in a house inhabited by a family of
crazed carpenters. When you wake up, the house is different. Maybe
there is a new turret, or some walls have moved, or perhaps someone
has removed the floor under your bed.
From: Tamas K Papp
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <70rgcvFi3n6fU1@mid.individual.net>
On Fri, 27 Feb 2009 17:33:10 -0500, Kenneth Tilton wrote:

> Tamas K Papp wrote:
>> On Fri, 27 Feb 2009 02:48:43 -0500, Kenneth Tilton wrote:
>> 
>>> Now things get ugly: how could one compare solutions of something like
>>> this?! Is there a measure of the normalcy of normal distributions?!!
>> 
>> The aforementioned algorithm gives you normal random variates by
>> construction, so I don't see what you are trying to do.  Are you
>> comparing some other draws from an unknown distribution to a normal
>> distribution?
> 
> No. I think you missed that I ended up with two fine solutions and was
> just wondering if one of you smart people knew if there was even such a
> thing as measuring which was "better".

Nope, there isn't, the question doesn't make sense.  Those variates
will be normal, which you can *prove* with pencil & paper.  No need to 
measure anything (except for a unit test to uncover bugs).

> I guess first we look to see if the s.d. of the population matches the
> target, and would we also want the mean to be close to the designated
> mean? Not sure we want to reward someone more for being a little closer.
> Then is there some measure that picks up too tight a clustering or
> something?

Since you are generating normal variates, any statistic (including the
sd) will just be a random variable.  Unless you make a programming
error in the implementation, any such "measurement" will make  either 
algorithm "better" 50% of the time.

Tamas
From: ···············@gmail.com
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <64f8e378-c907-41e9-ba45-84bccbc1e081@o11g2000yql.googlegroups.com>
On Feb 27, 5:33 pm, Kenneth Tilton <·········@gmail.com> wrote:
> Tamas K Papp wrote:
> > On Fri, 27 Feb 2009 02:48:43 -0500, Kenneth Tilton wrote:
>
> >> Now things get ugly: how could one compare solutions of something like
> >> this?! Is there a measure of the normalcy of normal distributions?!!
>
> > The aforementioned algorithm gives you normal random variates by
> > construction, so I don't see what you are trying to do.  Are you
> > comparing some other draws from an unknown distribution to a normal
> > distribution?
>
> No. I think you missed that I ended up with two fine solutions and was
> just wondering if one of you smart people knew if there was even such
> a thing as measuring which was "better".
>
> I guess first we look to see if the s.d. of the population matches the
> target, and would we also want the mean to be close to the designated
> mean? Not sure we want to reward someone more for being a little closer.
> Then is there some measure that picks up too tight a clustering or
> something?
>
> jes curious.
>
> kxo

There are two separate questions here.

Tamas made the point that the formula generating normal deviates from
uniform deviates is precisely derived: the output will be "normal"
insofar as the *underlying implementation's RANDOM is uniform*. That
changes the question to the quality of your implementation's RANDOM.

The second question, as to how to judge the quality of a pseudo-random
number generator, is a much broader one. You can estimate the shape of
the distribution by calculating means, standard deviations, and higher
"moments" (represented by things like "skew" and "kurtosis"), which
have formulas similar to standard deviation but with higher exponents
(e.g., cubes instead of squares). PRNG's have other qualities beyond
the distribution, such as the correlation between variates: one could
generate numbers with a Gaussian distribution, but the signs might
predictably alternate, for instance; it might be easy to predict the
next number given a history of previously generated numbers. Some of
these qualities might be application-specific (e.g., a PRNG generating
poker hands might show particular patterns that end up skewing the
occurrence of straight flushes vs. full houses).
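
As one concrete check for the alternating-sign example, a lag-1
autocorrelation can be sketched as below. LAG-1-AUTOCORRELATION is a
made-up name, and this measures only one narrow kind of dependence
between consecutive variates, not PRNG quality in general.

```lisp
;; Hypothetical helper: lag-1 autocorrelation of a list of numbers.
;; Near 0 for independent draws; near -1 if signs alternate predictably.
(defun lag-1-autocorrelation (xs)
  (let* ((n (length xs))
         (mean (/ (reduce #'+ xs) n))
         (ds (mapcar (lambda (x) (- x mean)) xs)))
    ;; sum of products of neighbouring deviations, over the sum of squares
    (/ (loop :for (a b) :on ds
             :while b
             :summing (* a b))
       (loop :for d :in ds :summing (* d d)))))
```

A strictly alternating sequence such as (1 -1 1 -1) scores -3/4, and
longer alternating runs approach -1, while a healthy generator should
hover near 0.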

The answer to these questions could be based either on careful
analysis of the algorithm or on experiment. To do either, you need to
be more specific and precise about what concerns you have about your
random number generators.
From: Tamas K Papp
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <70tbm6Fi75qaU1@mid.individual.net>
On Sat, 28 Feb 2009 09:27:58 -0800, ···············@gmail.com wrote:

> On Feb 27, 5:33 pm, Kenneth Tilton <·········@gmail.com> wrote:
>
>> just wondering if one of you smart people knew if there was even such a
>> thing as measuring which was "better".
>>
>> I guess first we look to see if the s.d. of the population matches the
>> target, and would we also want the mean to be close to the designated
>> mean? Not sure we want to reward someone more for being a little
>> closer. Then is there some measure that picks up too tight a clustering
>> or something?
>>
>> jes curious.
>>
>> kxo
> 
> There are two separate questions here.
> 
> Tamas made the point that the formula generating normal deviates from
> uniform deviates is precisely derived: the output will be "normal"
> insofar as the *underlying implementation's RANDOM is uniform*. That
> changes the question to the quality of your implementation's RANDOM.

Most implementations are pretty decent these days, and use a Mersenne
Twister or something as good as that, so I would not worry about this.
 
Tamas
From: Scott
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <e9c42698-325e-468e-a058-3e9beb94e3b2@j35g2000yqh.googlegroups.com>
On Feb 28, 10:47 am, Tamas K Papp <······@gmail.com> wrote:
> On Sat, 28 Feb 2009 09:27:58 -0800, ···············@gmail.com wrote:
>
> > Tamas made the point that the formula generating normal deviates from
> > uniform deviates is precisely derived: the output will be as "normal"
> > insofar as the *underlying implementation's RANDOM is uniform.*. That
> > changes the question to the quality of your implementation's RANDOM.
>
> Most implementations are pretty decent these days, and use a Mersenne
> Twister or something as good as that, so I would not worry about this.
>

Even using the Mersenne Twister, there are issues that might be a
problem in this case.  For instance, the 64 bit Mersenne Twister spits
out very good 64 bit integers, but if (random 1d0) ends up being
equivalent to taking a 64 bit integer and dividing it by 2^64, you're
going to get a zero 1 in 2^64 times.  That's much more often than the
Box-Muller algorithm is expecting to get a zero.  We're pretending a
discrete probability distribution is a continuous one.

(I'm ignoring other details such as how many of those 2^64 integers
will collapse into fewer than 2^64 double precision floating point
numbers.)

Really, the probability of getting a zero from a continuous uniform
distribution should be zero.  Rob Warnock's solution hides the zero,
but I think if you knew the details of your PRNG like this, you'd be
better off adding .5/(2^64).

If they're using a 32 bit random number generator, Kenny might get
tall blondes with infinite IQs once every 4 billion runs.
From: Tamas K Papp
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <70tg6gFi8afgU1@mid.individual.net>
On Sat, 28 Feb 2009 10:35:38 -0800, Scott wrote:

> On Feb 28, 10:47 am, Tamas K Papp <······@gmail.com> wrote:
>> On Sat, 28 Feb 2009 09:27:58 -0800, ···············@gmail.com wrote:
>>
>> > Tamas made the point that the formula generating normal deviates from
>> > uniform deviates is precisely derived: the output will be as "normal"
>> > insofar as the *underlying implementation's RANDOM is uniform.*. That
>> > changes the question to the quality of your implementation's RANDOM.
>>
>> Most implementations are pretty decent these days, and use a Mersenne
>> Twister or something as good as that, so I would not worry about this.
>>
>>
> Even using the Mersenne Twister, there are issues that might be a
> problem in this case.  For instance, the 64 bit Mersenne Twister spits
> out very good 64 bit integers, but if (random 1d0) ends up being
> equivalent to taking a 64 bit integer and dividing it by 2^64, you're
> going to get a zero 1 in 2^64 times.  That's much more often than the
> Box-Muller algorithm is expecting to get a zero.  We're pretending a
> discrete probability distribution is a continuous one.

That should be OK for most things though.

> (I'm ignoring other details such as how many of those 2^64 integers will
> collapse into fewer than 2^64 double precision floating point numbers.)
> 
> Really, the probability of getting a zero from a continuous uniform
> distribution should be zero.  Rob Warnock's solution hides the zero, but
> I think if you knew the details of your PRNG like this, you'd be better
> off adding .5/(2^64).

I would not do that --- the distortion is probably minor, but it seems
inelegant.  AFAIK the B-M algorithm requires uniform numbers from
(0,1], so I would just sample u and v from (1- (random 1d0)), this way
we do the RIGHT THING :-)

> If they're using a 32 bit random number generator, Kenny might get tall
> blondes with infinite IQs once every 4 billion runs.

Not with the solution I proposed.  Sorry Kenny :-)

Tamas
From: Scott
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <c1a3c25e-6f40-41ef-9b71-39d7ebf9b74a@p20g2000yqi.googlegroups.com>
On Feb 28, 12:04 pm, Tamas K Papp <······@gmail.com> wrote:
>
> > Really, the probability of getting a zero from a continuous uniform
> > distribution should be zero.  Rob Warnock's solution hides the zero, but
> > I think if you knew the details of your PRNG like this, you'd be better
> > off adding .5/(2^64).
>
> I would not do that --- the distortion is probably minor, but it seems
> inelegant.  AFAIK the B-M algorithm requires uniform numbers from
> (0,1], so I would just sample u and v from (1- (random 1d0)), this way
> we do the RIGHT THING :-)
>

I think you meant (- 1 (random 1d0)), but I agree this is a good way
to go.  This is what I wish I had put in the original version.  (you
don't have to invert the interval for v)

   (defun random-IQ-score ()
       (+ 100 (* 15 (sqrt (* -2 (log (- 1 (random 1d0)))))
                 (sin (* 2 pi (random 1d0))))))


However, I think the .5/(2^N) version adds the least distortion to the
variance, and Rob's least-positive-double-float version uses more of
the representable numbers between 0 and 1.  For similar reasons, we
probably should keep the sin result instead of the cos result too.
Those are really persnickety details though.
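
For comparison, here is a rough sketch of the three ways of keeping
zero away from LOG that were mentioned above. All three function names
are invented for illustration, N stands for the assumed bit width of
the underlying generator, and the exact form of Rob's version is not
shown here, so u-tiny-offset is only a guess at its spirit.

```lisp
;; Each function maps (random 1d0), which lies in [0,1), to a strictly
;; positive value, so that LOG of the result is finite.

(defun u-flipped ()
  ;; The (- 1 (random 1d0)) form: result lies in (0,1].
  (- 1 (random 1d0)))

(defun u-half-step (n)
  ;; Add .5/(2^N); N is an assumption about the generator's bit width.
  (+ (random 1d0) (scale-float 0.5d0 (- n))))

(defun u-tiny-offset ()
  ;; Add the smallest positive double the implementation can represent.
  (+ (random 1d0) least-positive-double-float))
```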


Cheers,
    -Scott
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a99fb2$0$5917$607ed4bc@cv.net>
Scott wrote:
> On Feb 28, 12:04 pm, Tamas K Papp <······@gmail.com> wrote:
>>> Really, the probability of getting a zero from a continuous uniform
>>> distribution should be zero.  Rob Warnock's solution hides the zero, but
>>> I think if you knew the details of your PRNG like this, you'd be better
>>> off adding .5/(2^64).
>> I would not do that --- the distortion is probably minor, but it seems
>> inelegant.  AFAIK the B-M algorithm requires uniform numbers from
>> (0,1], so I would just sample u and v from (1- (random 1d0)), this way
>> we do the RIGHT THING :-)
>>
> 
> I think you meant (- 1 (random 1d0)), but I agree this is a good way
> to go.  This is what I wish I had put in the original version.  (you
> don't have to invert the interval for v)

Oh, is that why some PhD I showed it* to whinged that it "consumed two 
gaussians"? Can we run out of gaussians? Mind you, the guy is from 
Australia and for my money suggests by example there is yet another 
country from whom the US is separated by a common language.

> 
>    (defun random-IQ-score ()
>        (+ 100 (* 15 (sqrt (* -2 (log (- 1 (random 1d0)))))
>                  (sin (* 2 pi (random 1d0))))))
> 
> 
> However, I think the .5/(2^N) version adds the least distortion to the
> variance, and Rob's least-positive-double-float version uses more of
> the representable numbers between 0 and 1.  For similar reasons, we
> probably should keep the sin result instead of the cos result too.
> Those are really persnickety details though.
> 
> 

Good thing I never went for a PhD.

kt

* (defun box-muller ()
   (let ((u (random 1.0))
         (v (random 1.0)))
     (let ((r (sqrt (* -2 (log u))))
           (a (* 2 pi v)))
       (list (* r (cos a)) (* r (sin a))))))

(defun random-IQ-score ()
   (+ 100 (* 15 (car (box-muller)))))
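
The box-muller above computes two variates per call, and
random-IQ-score keeps only the car. Whether that is what the complaint
was about is not stated. A hedged sketch of a variant that hands out
both values, with the made-up name random-normal and the (0,1] fix for
u folded in, might look like this:

```lisp
;; Hypothetical variant, not from the thread: keep the second value of
;; each Box-Muller pair and return it on the next call.
(let ((spare nil))
  (defun random-normal ()
    (if spare
        (prog1 spare (setf spare nil))
        (let* ((u (- 1 (random 1d0)))      ; (0,1], so LOG never sees zero
               (v (random 1d0))
               (r (sqrt (* -2 (log u))))
               (a (* 2 pi v)))
          (setf spare (* r (sin a)))       ; saved for the next call
          (* r (cos a))))))
```

An IQ score would then be (+ 100 (* 15 (random-normal))).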
From: Scott
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <707f1f78-8d31-4641-b9f8-1619bd69283d@p11g2000yqe.googlegroups.com>
On Feb 28, 1:33 pm, Kenneth Tilton <·········@gmail.com> wrote:
>
> Oh, is that why some PhD I showed it* to whinged that it "consumed two
> gaussians"? Can we run out of gaussians? Mind you, the guy is from
> Australia and for my money suggests by example there is yet another
> country from whom the US is separated by a common language.
>

If I had to guess, I'd say he whinged because he didn't write it.  I
believe that's why most PhDs whinge at things.  If he had written it,
he might defend the flaws by pointing out that they aren't
significant, don't really matter, and are unlikely to happen in
practice.  Besides, if the flaws do cause problems, you probably won't
be able to repeat them without a lot of effort.  :-)
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a9bf2f$0$5931$607ed4bc@cv.net>
Scott wrote:
> On Feb 28, 1:33 pm, Kenneth Tilton <·········@gmail.com> wrote:
>> Oh, is that why some PhD I showed it* to whinged that it "consumed two
>> gaussians"? Can we run out of gaussians? Mind you, the guy is from
>> Australia and for my money suggests by example there is yet another
>> country from whom the US is separated by a common language.
>>
> 
> If I had to guess, I'd say he whinged because he didn't write it.  I
> believe that's why most PhDs whinge at things.  If he had written it,
> he might defend the flaws by pointing out that they aren't
> significant, don't really matter, and are unlikely to happen in
> practice.  Besides, if the flaws do cause problems, you probably won't
> be able to repeat them without a lot of effort.  :-)
> 

No, this guy can code. I don't say that very often, either.

kt
From: Bakul Shah
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49A9CD24.7000907@bitblocks.com>
Kenneth Tilton wrote:
> Scott wrote:
>> On Feb 28, 12:04 pm, Tamas K Papp <······@gmail.com> wrote:
>>>> Really, the probability of getting a zero from a continuous uniform
>>>> distribution should be zero.  Rob Warnock's solution hides the zero, 
>>>> but
>>>> I think if you knew the details of your PRNG like this, you'd be better
>>>> off adding .5/(2^64).
>>> I would not do that --- the distortion is probably minor, but it seems
>>> inelegant.  AFAIK the B-M algorithm requires uniform numbers from
>>> (0,1], so I would just sample u and v from (1- (random 1d0)), this way
>>> we do the RIGHT THING :-)
>>>
>> I think you meant (- 1 (random 1d0)), but I agree this is a good way
>> to go.  This is what I wish I had put in the original version.  (you
>> don't have to invert the interval for v)
> 
> Oh, is that why some PhD I showed it* to whinged that it "consumed two 
> gaussians"? Can we run out of gaussians? Mind you, the guy is from 
> Australia and for my money suggests by example there is yet another 
> country from whom the US is separated by a common language.

Maybe he was thinking of the "ziggurat algorithm"?
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a99de5$0$5904$607ed4bc@cv.net>
Tamas K Papp wrote:
> On Sat, 28 Feb 2009 10:35:38 -0800, Scott wrote:
> 
>> On Feb 28, 10:47 am, Tamas K Papp <······@gmail.com> wrote:
>>> On Sat, 28 Feb 2009 09:27:58 -0800, ···············@gmail.com wrote:
>>>
>>>> Tamas made the point that the formula generating normal deviates from
>>>> uniform deviates is precisely derived: the output will be as "normal"
>>>> insofar as the *underlying implementation's RANDOM is uniform.*. That
>>>> changes the question to the quality of your implementation's RANDOM.
>>> Most implementations are pretty decent these days, and use a Mersenne
>>> Twister or something as good as that, so I would not worry about this.
>>>
>>>
>> Even using the Mersenne Twister, there are issues that might be a
>> problem in this case.  For instance, the 64 bit Mersenne Twister spits
>> out very good 64 bit integers, but if (random 1d0) ends up being
>> equivalent to taking a 64 bit integer and dividing it by 2^64, you're
>> going to get a zero 1 in 2^64 times.  That's much more often than the
>> Box-Muller algorithm is expecting to get a zero.  We're pretending a
>> discrete probability distribution is a continuous one.
> 
> That should be OK for most things though.
> 
>> (I'm ignoring other details such as how many of those 2^64 integers will
>> collapse into fewer than 2^64 double precision floating point numbers.)
>>
>> Really, the probability of getting a zero from a continuous uniform
>> distribution should be zero.  Rob Warnock's solution hides the zero, but
>> I think if you knew the details of your PRNG like this, you'd be better
>> off adding .5/(2^64).
> 
> I would not do that --- the distortion is probably minor, but it seems
> inelegant.  AFAIK the B-M algorithm requires uniform numbers from
> (0,1], so I would just sample u and v from (1- (random 1d0)), this way
> we do the RIGHT THING :-)
> 
>> If they're using a 32 bit random number generator, Kenny might get tall
>> blondes with infinite IQs once every 4 billion runs.
> 
> Not with the solution I proposed.  Sorry Kenny :-)

That's OK, I'd probably just screw up and ask her how tall she was.

Good discussion, all. Thx.

kt
From: Jason Riedy
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <87d4czrx6l.fsf@sparse.dyndns.org>
And Scott writes:
> That's much more often than the Box-Muller algorithm is
> expecting to get a zero.  We're pretending a discrete
> probability distribution is a continuous one.

To which Tamas K. Papp responds:
> That should be OK for most things though.

In LAPACK, the random complex number generator had a slight
bug and would return (0.0, 0.0) every few million runs.  This
in turn led to an invalid operation and an infinite loop in
the test suite.  So OK for "most things" isn't so good for
general use.

And Scott writes:
> [...] I think if you knew the details of your PRNG like this,
> you'd be better off adding .5/(2^64).

To which Tamas K. Papp responds:
> I would not do that --- the distortion is probably minor, but
> it seems inelegant.  AFAIK the B-M algorithm requires uniform
> numbers from (0,1], so I would just sample u and v from (1-
> (random 1d0)), this way we do the RIGHT THING :-)

Test for the end points of the interval and run the generator
again if the number hits them.  That helps protect against
overzealous compiler optimizations as well, particularly on a
platform that can induce double-rounding bugs (x87).  Yeah, that
happened, but not with a Lisp compiler.
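
A minimal sketch of that test-and-retry idea, with the invented name
random-open-unit, assuming (random 1d0) is the underlying generator:

```lisp
;; Hypothetical helper: draw again until the value is strictly inside
;; the open interval (0,1).
(defun random-open-unit ()
  (loop for u = (random 1d0)
        when (< 0d0 u 1d0)
          return u))
```

The retry should almost never fire, so the cost is one comparison per
draw.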

Jason
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49acd5cd$0$16810$607ed4bc@cv.net>
Jason Riedy wrote:
> And Scott writes:
>> That's much more often than the Box-Muller algorithm is
>> expecting to get a zero.  We're pretending a discrete
>> probability distribution is a continuous one.
> 
> To which Tamas K. Papp responds:
>> That should be OK for most things though.
> 
> In LAPACK, the random complex number generator had a slight
> bug...

Slight? Isn't this one of those horseshoes/hand grenade deals?

kt
From: Tamas K Papp
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <714p4sFj3s2tU1@mid.individual.net>
On Tue, 03 Mar 2009 00:29:38 -0500, Jason Riedy wrote:

> And Scott writes:
>> That's much more often than the Box-Muller algorithm is expecting to
>> get a zero.  We're pretending a discrete probability distribution is a
>> continuous one.
> 
> To which Tamas K. Papp responds:
>> That should be OK for most things though.
> 
> In LAPACK, the random complex number generator had a slight bug and
> would return (0.0, 0.0) every few million runs.  This in turn led to an
> invalid operation and an infinite loop in the test suite.  So OK for
> "most things" isn't so good for general use.

If the underlying library has a bug, and something nasty (like
division by 0) occurs, then it will just signal a condition and then
we can go and debug it.  For me, that is good enough for general use.
Of course if I were designing nuclear power plants, I would be
programming more defensively, but I don't see the need for that
generally.

> And Scott writes:
>> [...] I think if you knew the details of your PRNG like this, you'd be
>> better off adding .5/(2^64).
> 
> To which Tamas K. Papp responds:
>> I would not do that --- the distortion is probably minor, but it seems
>> inelegant.  AFAIK the B-M algorithm requires uniform numbers from
>> (0,1], so I would just sample u and v from (1- (random 1d0)), this way
>> we do the RIGHT THING :-)
> 
> Test for the end points of the interval and run the generator again if
> the number hits them.  That helps protect against overzealous compiler
> optimizations as well, particularly on a platform that can induce
> double-rounding bugs (x87).  Yeah, that happened, but not with a Lisp
> compiler.

Again, that will just signal a condition.  And then I can debug it,
decide whether I want to work around the compiler or report it to its
authors.  In either case, I would feel silly if I tried to build up
defenses against compiler bugs etc that _may_ happen, instead of just
spending time on getting closer to the solution of my problem.

Tamas
From: Vend
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <1f08591c-b30c-4de2-af98-0c5a2741aab1@e15g2000vbe.googlegroups.com>
On 28 Feb, 19:35, Scott <·······@gmail.com> wrote:
> On Feb 28, 10:47 am, Tamas K Papp <······@gmail.com> wrote:
>
> > On Sat, 28 Feb 2009 09:27:58 -0800, ···············@gmail.com wrote:
>
> > > Tamas made the point that the formula generating normal deviates from
> > > uniform deviates is precisely derived: the output will be as "normal"
> > > insofar as the *underlying implementation's RANDOM is uniform.*. That
> > > changes the question to the quality of your implementation's RANDOM.
>
> > Most implementations are pretty decent these days, and use a Mersenne
> > Twister or something as good as that, so I would not worry about this.
>
> Even using the Mersenne Twister, there are issues that might be a
> problem in this case.  For instance, the 64 bit Mersenne Twister spits
> out very good 64 bit integers, but if (random 1d0) ends up being
> equivalent to taking a 64 bit integer and dividing it by 2^64, you're
> going to get a zero 1 in 2^64 times.  That's much more often than the
> Box-Muller algorithm is expecting to get a zero.  We're pretending a
> discrete probability distribution is a continuous one.
>
> (I'm ignoring other details such as how many of those 2^64 integers
> will collapse into fewer than 2^64 double precision floating point
> numbers.)
>
> Really, the probability of getting a zero from a continuous uniform
> distribution should be zero.  Rob Warnock's solution hides the zero,
> but I think if you knew the details of your PRNG like this, you'd be
> better off adding .5/(2^64).

From the reference implementation:

/* generates a random number on [0,1]-real-interval */
double genrand64_real1(void)
{
    return (genrand64_int64() >> 11) * (1.0/9007199254740991.0);
}

/* generates a random number on [0,1)-real-interval */
double genrand64_real2(void)
{
    return (genrand64_int64() >> 11) * (1.0/9007199254740992.0);
}

/* generates a random number on (0,1)-real-interval */
double genrand64_real3(void)
{
    return ((genrand64_int64() >> 12) + 0.5) * (1.0/4503599627370496.0);
}

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c

> If they're using a 32 bit random number generator, Kenny might get
> tall blondes with infinite IQs once every 4 billion runs.

You can generate the 53 bits you need from two successive 32 bit
integers.
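
In Lisp that could be sketched as follows. The function name is made
up, and A and B are assumed to be two successive 32-bit outputs of
whatever generator is in use. I believe this is the same bit split the
32-bit reference code uses for its 53-bit resolution routine, but
treat that as my recollection.

```lisp
;; Hypothetical helper: keep the top 27 bits of A and the top 26 bits
;; of B, glue them into one 53-bit integer, and scale it into [0,1).
(defun double-from-two-32 (a b)
  (let ((hi (ash a -5))    ; top 27 bits of A
        (lo (ash b -6)))   ; top 26 bits of B
    (/ (+ (* hi (expt 2 26)) lo)
       (float (expt 2 53) 1d0))))
```

Since every 53-bit integer is exactly representable as a double, no
two inputs collapse onto the same float.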
From: Raymond Toy
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <sxdtz6a8xil.fsf@rtp.ericsson.se>
>>>>> "Scott" == Scott  <·······@gmail.com> writes:

    Scott> On Feb 28, 10:47 am, Tamas K Papp <······@gmail.com> wrote:
    >> On Sat, 28 Feb 2009 09:27:58 -0800, ···············@gmail.com wrote:
    >> 
    >> > Tamas made the point that the formula generating normal deviates from
    >> > uniform deviates is precisely derived: the output will be as "normal"
    >> > insofar as the *underlying implementation's RANDOM is uniform.*. That
    >> > changes the question to the quality of your implementation's RANDOM.
    >> 
    >> Most implementations are pretty decent these days, and use a Mersenne
    >> Twister or something as good as that, so I would not worry about this.
    >> 

    Scott> Even using the Mersenne Twister, there are issues that might be a
    Scott> problem in this case.  For instance, the 64 bit Mersenne Twister spits
    Scott> out very good 64 bit integers, but if (random 1d0) ends up being
    Scott> equivalent to taking a 64 bit integer and dividing it by 2^64, you're
    Scott> going to get a zero 1 in 2^64 times.  That's much more often than the
    Scott> Box-Muller algorithm is expecting to get a zero.  We're pretending a
    Scott> discrete probability distribution is a continuous one.

Why the distinction for zero and not some other number X that would be
produced?  If we're being pedantic, isn't X also going to occur 1 in
2^64 (or so) times, which is too often?

Well, except for the fact that log might not like being handed a zero,
but that's a somewhat different issue.

Ray
From: Tamas K Papp
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <714vlbFj0469U1@mid.individual.net>
On Tue, 03 Mar 2009 09:57:22 -0500, Raymond Toy wrote:

>     Scott> Even using the Mersenne Twister, there are issues that might be a
>     Scott> problem in this case.  For instance, the 64 bit Mersenne Twister spits
>     Scott> out very good 64 bit integers, but if (random 1d0) ends up being
>     Scott> equivalent to taking a 64 bit integer and dividing it by 2^64, you're
>     Scott> going to get a zero 1 in 2^64 times.  That's much more often than the
>     Scott> Box-Muller algorithm is expecting to get a zero.  We're pretending a
>     Scott> discrete probability distribution is a continuous one.
> 
> Why the distinction for zero and not some other number X that would be
> produced?  If we're being pedantic, isn't X also going to occur 1 in
> 2^64 (or so) times, which is too often?

Given the fact that we are using finite-precision arithmetic, it would
not make any difference if we had access to a "perfect" RNG, as the
results would have to be converted to finite precision at some point.

> Well, except for the fact that log might not like being handed a zero,
> but that's a somewhat different issue.

Nope, that's the only issue, and once you handle that, the solution is
as good as needed for practical purposes: it draws normal random
variates, represented as floats, and does not throw a condition even
with a small chance.  That's the spec, end of the story.

Kenny asked a simple question, he got some answers, and then the
thread degenerated into a pedantic discussion of trivialities.  Sad.

Tamas
From: Robert Dodier
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <da9829d4-1c31-46a0-8336-3ab870c1b3f8@r15g2000prh.googlegroups.com>
On Feb 27, 12:16 am, Scott <·······@gmail.com> wrote:

> On Feb 26, 11:47 pm, Kenneth Tilton <·········@gmail.com> wrote:

> > My understanding is that you can STFU

Well, aren't you a piece of work: loud, demanding, unpleasant, and
dull.

> Please accept my apology.

Why are you helping that idiot?

Robert Dodier
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a84f7d$0$5930$607ed4bc@cv.net>
Robert Dodier wrote:
> On Feb 27, 12:16 am, Scott <·······@gmail.com> wrote:
> 
>> On Feb 26, 11:47 pm, Kenneth Tilton <·········@gmail.com> wrote:
> 
>>> My understanding is that you can STFU
> 
> Well, aren't you a piece of work: loud, demanding, unpleasant, and
> dull.

God I love it when someone takes me seriously. My mother never did. <sob>

> 
>> Please accept my apology.
> 
> Why are you helping that idiot?

Interesting. Me 'n scott work things out and you want to keep the fight 
going. Tsk tsk.


kxo
From: William James
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <gocqia0dps@enews5.newsguy.com>
Kenneth Tilton wrote:

> Scott wrote:
> > On Feb 26, 9:54 pm, Kenneth Tilton <·········@gmail.com> wrote:
> >> [This one comes without a solution in Lisp, just a description of what I
> >> think has to be a worser way.]
> > > 
> >> So I want N values distributed normally around 100. Standard deviation
> >> your choice, cuz I don't know what those are.

"cuz" instead of "because".  He thinks he's very cute.

> > > 
> > 
> > I think you want this:
> > 
> >     http://en.wikipedia.org/wiki/Box-Muller_transform
> > 
> > That will give you a Normal distribution with mean 0 and stdev 1.
> > Multiply these numbers times the standard deviation you want.  Then
> > add 100 to change the mean.
> 
> My understanding is that you can STFU until you can tell me for what I 
> am shilling or apologize for the libel. Don't call me names just because 
> you do not enjoy technology as much as I do.
> 

People are not willing to help students with their homework.
Why are they willing to help this arrogant, insolent kenny
with his work?  He's always begging for help while posturing
as a Lisp guru.

How many "From the Trenches" threads have we seen from this creep?

Remember when he was always starting threads for no other purpose
than to lure readers to his website so that he could make money
from google?
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a9fbe6$0$5934$607ed4bc@cv.net>
William James wrote:
> Kenneth Tilton wrote:
> 
>> Scott wrote:
>>> On Feb 26, 9:54 pm, Kenneth Tilton <·········@gmail.com> wrote:
>>>> [This one comes without a solution in Lisp, just a description of what I
>>>> think has to be a worser way.]
>>>>
>>>> So I want N values distributed normally around 100. Standard deviation
>>>> your choice, cuz I don't know what those are.
> 
> "cuz" instead of "because".  He thinks he's very cute.

No, just following Tilton's Law of Abbreviations: Abbreviate nothing 
less than seven characters long*, nothing used rarely, and only if one 
can find an abbreviation less than half as long as the original.

> 
>>> I think you want this:
>>>
>>>     http://en.wikipedia.org/wiki/Box-Muller_transform
>>>
>>> That will give you a Normal distribution with mean 0 and stdev 1.
>>> Multiply these numbers times the standard deviation you want.  Then
>>> add 100 to change the mean.
>> My understanding is that you can STFU until you can tell me for what I 
>> am shilling or apologize for the libel. Don't call me names just because 
>> you do not enjoy technology as much as I do.
>>
> 
> People are not willing to help students with their homework.
> Why are they willing to help this arrogant, insolent kenny
> with his work?  He's always begging for help while posturing
> as a Lisp guru.
> 
> How many "From the Trenches" threads have we seen from this creep?

(1- the number of solutions he has posted with his challenges, which 
unlike everything some bores post the group seems to enjoy)

> 
> Remember when he was always starting threads for no other purpose
> than to lure readers to his website so that he could make money
> from google?

Thx, you just reminded me I forgot to post the context for my tall 
blonde crack:

http://smuglispweeny.blogspot.com/2008/02/what-hell-is-fortune-cookie-file-anyway.html

Thx!

knnth

* but it is ok to use no for number
From: Rob Warnock
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <XLGdnbn6fJJACDfUnZ2dnUVZ_uSdnZ2d@speakeasy.net>
Kenneth Tilton  <·········@gmail.com> wrote:
+---------------
| http://smuglispweeny.blogspot.com/2008/02/what-hell-is-fortune-cookie-file-anyway.html
+---------------

The one that starts with "Lisp nearing the age of fifty"
could use an update...


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Bob Felts
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <1ivvcck.1ooabxiz67a20N%wrf3@stablecross.com>
William James <·········@yahoo.com> wrote:

> Kenneth Tilton wrote:
> 
> People are not willing to help students with their homework.

That's not quite true.  I'm teaching my daughter Lisp and I help her
with her homework.  I've seen people in this group help others.  What we
won't do (at least those of us with any integrity) is do their work for
them.

> Why are they willing to help this arrogant, insolent kenny
> with his work?  He's always begging for help while posturing
> as a Lisp guru.
> 

Kenny wasn't asking for help with Lisp.  He was asking for help with a
certain mathematical technique.  Furthermore, it isn't homework.
Hopefully you can distinguish between two things that aren't equal.

As for his arrogance and insolence, that's part of his charm.  I'd
invite him out for a beer, or other beverage of his choice, any day of
the week.

> How many "From the Trenches" threads have we seen from this creep?
> 
> Remember when he was always starting threads for no other purpose
> than to lure readers to his website so that he could make money
> from google?

Oh, good idea.  I haven't checked in a few days.  Time to see if he's
posted anything new.  And there's nothing wrong with a little
capitalism, especially when I'm entertained as a result.  It's no
different from buying episodes of BSG from iTunes.
From: GP lisper
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <slrngqofc7.6b6.spambait@phoenix.clouddancer.com>
On Sat, 28 Feb 2009 22:19:05 -0500, <····@stablecross.com> wrote:
>
> As for his arrogance and insolence, that's part of his charm.  I'd
> invite him out for a beer, or other beverage of his choice, any day of
> the week.

Me too.  c.l.l. is not life.

After all the Ruby and Clodjur people might actually have a clue in
The Big Blue Room.


-- 
Being a Slime user means living in a house inhabited by a family of
crazed carpenters. When you wake up, the house is different. Maybe
there is a new turret, or some walls have moved, or perhaps someone
has removed the floor under your bed.
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a78e1e$0$5895$607ed4bc@cv.net>
ps.

> 
>     http://en.wikipedia.org/wiki/Box-Muller_transform

I was looking for a three-liner, I have B-M already in the form of a 
Lisp port from a C recipes book, 20 lines. bzzt! Well, it doesn't cons 
so it qualifies as better, but this is like the proof of Fermat's Last: 
sufficient but unsatisfying.

kt
From: Scott
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <b3b07006-de1c-4fd5-8b69-e95f3243e82c@l39g2000yqn.googlegroups.com>
On Feb 26, 11:54 pm, Kenneth Tilton <·········@gmail.com> wrote:
>
> I was looking for a three-liner, [...]

(defun random-IQ-score ()
      (+ 100 (* 15 (sqrt (* -2 (log (random 1.0))))
                (cos (* 2 pi (random 1.0))))))
From: Thomas Munro
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <15b992b0-3073-49e3-86b9-ab4a3d65d94f@a39g2000yqc.googlegroups.com>
On Feb 27, 4:54 am, Kenneth Tilton <·········@gmail.com> wrote:
> So I want N values distributed normally around 100. Standard deviation
> your choice, cuz I don't know what those are.

Unfortunately, you can find several algorithms for this in any
textbook on financial maths, since one of the most common ways of
pricing options seems to be to stick this inside a loop and pretend
it's a Monte Carlo simulation of reality.

There's a cumulative normal distribution (common Taylor series
approximation) given in Lisp here:

http://www.espenhaug.com/black_scholes.html
From: Mirko
Subject: Normally distributed random number generation (was:  From the Trenches There Has To be A Better Way #23a)
Date: 
Message-ID: <7004dd74-c645-403e-b91a-22b98972da1d@j35g2000yqh.googlegroups.com>
On Feb 26, 11:54 pm, Kenneth Tilton <·········@gmail.com> wrote:
> [This one comes without a solution in Lisp, just a description of what I
> think has to be a worser way.]
>
> So I want N values distributed normally around 100. Standard deviation
> your choice, cuz I don't know what those are.
>
> All I can think of is the sine function. Walk value v from 2-3sd* below
> to 2-3 sd above, treating that as pi to 0 and actually populating a
> collection with (* (/ N 2) (sin v)) copies of v, shuffling and then
> popping to assign them to N places.
>
> I just keep thinking building the population is unnecessary and that a
> straight calculation is possible, without any foundation for said thoughts.
>
> kt
>
> * Oh, hell, from 75 to 125.

Depending on your platform, you may find the gsll Lisp library
(http://common-lisp.net/project/gsll/) useful; it links to the GNU
Scientific Library (http://www.gnu.org/software/gsl/).  The latter
seems to have an impressive array of random number generators for many
kinds of distributions, and also some


I use gsll on sbcl/linux very successfully and on clisp/windows with
some occasional hiccups.  The gsll library author is very responsive
and helpful.  The library is under active development, and the
interface changes now and then (which is inconvenient), but always for
the better (which makes it convenient eventually).

Mirko

PS - I changed the subject to attract more informed folks.
From: Tamas K Papp
Subject: Re: Normally distributed random number generation (was: From the Trenches There Has To be A Better Way #23a)
Date: 
Message-ID: <70qpa8Fi2so9U2@mid.individual.net>
On Fri, 27 Feb 2009 04:51:33 -0800, Mirko wrote:

> Depending on your platform, you may find the gsll Lisp library
> (http://common-lisp.net/project/gsll/) useful; it links to the GNU Scientific
> Library (http://www.gnu.org/software/gsl/).  The latter seems to have an
> impressive array of random number generators for many kinds of
> distributions, and also some
> 
> 
> I use gsll on sbcl/linux very successfully and on clisp/windows with some
> occasional hiccups.  The gsll library author is very responsive and
> helpful.  The library is under active development, and the interface
> changes now and then (which is inconvenient), but always for the better
> (which makes it convenient eventually).

I second that, GSLL is excellent, and the author (Liam Healy) has
recently reorganized the library so now it integrates into Lisp even
better.  I use it routinely.

Tamas
From: Vilho Räisänen
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <87vdqwxx1r.fsf@iki.fi>
Kenneth Tilton <·········@gmail.com> writes:

Quick googling for "normal distribution lisp" would seem to produce some
code.

The standard method for sampling from a desired distribution is to
invert its cumulative distribution function (the integral of the
density), but that is probably not available for the normal
distribution in Lisp...
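
A rough sketch of that inversion idea, with the inverse found numerically by bisection instead of in closed form. The names sample-by-inversion and logistic-cdf are made up for illustration, and the logistic distribution is used only because its cumulative function has a simple formula; any increasing cumulative function, for instance an approximation to the normal one, could be passed instead.

```lisp
;; Inverse-transform sampling: draw u uniformly from [0, 1) and find the
;; x with CDF(x) = u, here by bisection on the bracket [LO, HI].
;; Values of u outside the CDF's range on the bracket end up at an endpoint.
(defun sample-by-inversion (cdf lo hi &optional (iterations 60))
  (let ((u (random 1d0))
        (lo (float lo 1d0))
        (hi (float hi 1d0)))
    (loop :repeat iterations
          :do (let ((mid (/ (+ lo hi) 2)))
                (if (< (funcall cdf mid) u)
                    (setf lo mid)
                    (setf hi mid))))
    (/ (+ lo hi) 2)))

;; Example CDF with a closed form: the logistic distribution
;; centered on MEAN with scale SCALE.
(defun logistic-cdf (mean scale)
  (lambda (x) (/ 1d0 (+ 1d0 (exp (- (/ (- x mean) scale)))))))
```

A call such as (sample-by-inversion (logistic-cdf 100 5) 0 200) returns one bell-shaped (though not exactly normal) value centered on 100.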

   BR,
     Vilho

> [This one comes without a solution in Lisp, just a description of what
> I think has to be a worser way.]
>
> So I want N values distributed normally around 100. Standard deviation
> your choice, cuz I don't know what those are.
>
> All I can think of is the sine function. Walk value v from 2-3sd*
> below to 2-3 sd above, treating that as pi to 0 and actually
> populating a collection with (* (/ N 2) (sin v)) copies of v,
> shuffling and then popping to assign them to N places.
>
> I just keep thinking building the population is unnecessary and that a
> straight calculation is possible, without any foundation for said
> thoughts.
>
> kt
>
> * Oh, hell, from 75 to 125.

-- 
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49a78b75$0$5937$607ed4bc@cv.net>
Vilho Räisänen wrote:
> Kenneth Tilton <·········@gmail.com> writes:
> 
> Quick googling for "normal distribution lisp" would seem to produce some
> code.

Maybe I would seem to like quick interacting with fucking humans, OK?

Thanks for so little help, moron. Oh, gosh, Google, did you think of 
that yourself?*

kt

* I did try YouTube. Came up empty. Next stop the Weather Channel. k
From: Terry Sullivan
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <ad8a2$49acaee0$4b4cb16d$4130@KNOLOGY.NET>
Kenneth Tilton wrote:
> [This one comes without a solution in Lisp, just a description of what I 
> think has to be a worser way.]
> 
> So I want N values distributed normally around 100. Standard deviation 
> your choice, cuz I don't know what those are.
> 
> All I can think of is the sine function. Walk value v from 2-3sd* below 
> to 2-3 sd above, treating that as pi to 0 and actually populating a 
> collection with (* (/ N 2) (sin v)) copies of v, shuffling and then 
> popping to assign them to N places.
> 
> I just keep thinking building the population is unnecessary and that a 
> straight calculation is possible, without any foundation for said thoughts.
> 
> kt
> 
> * Oh, hell, from 75 to 125.

See if this works for you.

(defun random-in-range ()
   (let ((range-low 75)
	(range-high 125))
     (+ range-low
        (* (- range-high range-low)
	  (random 1.0)))))
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49acd704$0$16837$607ed4bc@cv.net>
Terry Sullivan wrote:
> Kenneth Tilton wrote:
>> [This one comes without a solution in Lisp, just a description of what 
>> I think has to be a worser way.]
>>
>> So I want N values distributed normally around 100. Standard deviation 
>> your choice, cuz I don't know what those are.
>>
>> All I can think of is the sine function. Walk value v from 2-3sd* 
>> below to 2-3 sd above, treating that as pi to 0 and actually 
>> populating a collection with (* (/ N 2) (sin v)) copies of v, 
>> shuffling and then popping to assign them to N places.
>>
>> I just keep thinking building the population is unnecessary and that a 
>> straight calculation is possible, without any foundation for said 
>> thoughts.
>>
>> kt
>>
>> * Oh, hell, from 75 to 125.
> 
> See if this works for you.
> 
> (defun random-in-range ()
>   (let ((range-low 75)
>     (range-high 125))
>     (+ range-low
>        (* (- range-high range-low)
>       (random 1.0)))))
> 

Congratulations on the massively accurate name of your function. The bad 
news is that had you wanted to respond to what I wrote the root "normal" 
would have been in the name somewhere.

kt
From: Terry Sullivan
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <35e0f$49ad4342$4b4cb16d$12700@KNOLOGY.NET>
Kenneth Tilton wrote:
> Terry Sullivan wrote:
>> Kenneth Tilton wrote:
>>> [This one comes without a solution in Lisp, just a description of 
>>> what I think has to be a worser way.]
>>>
>>> So I want N values distributed normally around 100. Standard 
>>> deviation your choice, cuz I don't know what those are.
>>>
>>> All I can think of is the sine function. Walk value v from 2-3sd* 
>>> below to 2-3 sd above, treating that as pi to 0 and actually 
>>> populating a collection with (* (/ N 2) (sin v)) copies of v, 
>>> shuffling and then popping to assign them to N places.
>>>
>>> I just keep thinking building the population is unnecessary and that 
>>> a straight calculation is possible, without any foundation for said 
>>> thoughts.
>>>
>>> kt
>>>
>>> * Oh, hell, from 75 to 125.
>>
>> See if this works for you.
>>
>> (defun random-in-range ()
>>   (let ((range-low 75)
>>     (range-high 125))
>>     (+ range-low
>>        (* (- range-high range-low)
>>       (random 1.0)))))
>>
> 
> Congratulations on the massively accurate name of your function. The bad 
> news is that had you wanted to respond to what I wrote the root "normal" 
> would have been in the name somewhere.
> 
> kt
> 

My bad.

(defun normal-distribution-random-in-range ()
   (let ((range-low 75)
     (range-high 125))
     (+ range-low
        (* (- range-high range-low)
       (random 1.0)))))


My assumption is that random is a Gaussian distribution aka normal 
distribution implementation. If that assumption is incorrect, then just 
substitute a normal distribution random function with a range of 0-1 in 
its place.

Ts
From: Kenneth Tilton
Subject: Re: From the Trenches There Has To be A Better Way #23a
Date: 
Message-ID: <49ad49b6$0$5938$607ed4bc@cv.net>
Terry Sullivan wrote:
> Kenneth Tilton wrote:
>> Terry Sullivan wrote:
>>> Kenneth Tilton wrote:
>>>> [This one comes without a solution in Lisp, just a description of 
>>>> what I think has to be a worser way.]
>>>>
>>>> So I want N values distributed normally around 100. Standard 
>>>> deviation your choice, cuz I don't know what those are.
>>>>
>>>> All I can think of is the sine function. Walk value v from 2-3sd* 
>>>> below to 2-3 sd above, treating that as pi to 0 and actually 
>>>> populating a collection with (* (/ N 2) (sin v)) copies of v, 
>>>> shuffling and then popping to assign them to N places.
>>>>
>>>> I just keep thinking building the population is unnecessary and that 
>>>> a straight calculation is possible, without any foundation for said 
>>>> thoughts.
>>>>
>>>> kt
>>>>
>>>> * Oh, hell, from 75 to 125.
>>>
>>> See if this works for you.
>>>
>>> (defun random-in-range ()
>>>   (let ((range-low 75)
>>>     (range-high 125))
>>>     (+ range-low
>>>        (* (- range-high range-low)
>>>       (random 1.0)))))
>>>
>>
>> Congratulations on the massively accurate name of your function. The 
>> bad news is that had you wanted to respond to what I wrote the root 
>> "normal" would have been in the name somewhere.
>>
>> kt
>>
> 
> My bad.
> 
> (defun normal-distribution-random-in-range ()
>   (let ((range-low 75)
>     (range-high 125))
>     (+ range-low
>        (* (- range-high range-low)
>       (random 1.0)))))
> 
> 
> My assumption is that random is a Gaussian distribution aka normal 
> distribution implementation. If that assumption is incorrect, 

What part of "12.2.41...An approximately uniform choice distribution is 
used." do you find ambiguous?

> then just 
> substitute a normal distribution random function with a range of 0-1 in 
> its place.

You mean what everyone else in this thread is actually addressing?

<sigh>

kt
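
For the 75 to 125 range in the original post, one possible way to combine a normal generator with hard limits is to redraw until the value lands inside them. This is only a sketch under stated assumptions: normal-sample and normal-in-range are made-up names, the mean is taken to be the midpoint of the range, and the limits are assumed to sit 3 standard deviations out (the original post says "2-3 sd").

```lisp
;; Box-Muller draw with the given mean and standard deviation.
(defun normal-sample (mean std-dev)
  (+ mean (* std-dev
             (sqrt (* -2 (log (- 1 (random 1d0)))))
             (cos (* 2 pi (random 1d0))))))

;; Redraw until the value falls inside [LOW, HIGH]. The mean is the
;; midpoint and the limits sit SDS standard deviations from it.
(defun normal-in-range (low high &optional (sds 3))
  (let ((mean (/ (+ low high) 2d0))
        (std-dev (/ (- high low) (* 2d0 sds))))
    (loop :for x = (normal-sample mean std-dev)
          :when (<= low x high) :return x)))
```

(normal-in-range 75 125) then yields values centered on 100 that never leave the range. With the limits at 3 standard deviations only a small fraction of draws is rejected, so the loop rarely runs more than once.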