Possible bug in inverse Trig funcs in ACL

From: Matt Kressel
Subject: Possible bug in inverse Trig funcs in ACL
Date: Wed, 09 Sep 1998 00:00:00 +0000
Message-ID: <35F67D7B.FB034E2@atdc.northgrum.com>

Hello,
	I have discovered a possible bug in the implementation
of the inverse trig functions in ACL 3.0.2 for windows.

Specifically:

on ACL:
(acos 0.999999993558585) = 1.13502554902016E-4

on Microsoft Visual C++, GNU C/C++ Linux, Microsoft FORTRAN, and
Macintosh Common LISP 3.0
(acos 0.999999993558585) = 1.135025547192...E-4

Although the difference seems insignificant, if one
is doing mathematically intensive calculations, the error
can become cumulative and skew sensitive data.

Most likely they used the equation in Guy L. Steele's book, 
so I will quote him here:
"These formula are mathematically correct assuming completely
accurate computation.  They may be terrible methods for floating
point computation...."

Can someone at Franz, or out on the net point me to a solution?

Thanks,
Matt

-- 
Matthew O. Kressel | INTERNET: ···············@atdc.northgrum.com
+---------  Northrop Grumman Corporation, Bethpage, NY ---------+
+---------  TEL: (516) 346-9101 FAX: (516) 346-9740 ------------+

Re: Possible bug in inverse Trig funcs in ACL Gareth McCaughan
- Re: Possible bug in inverse Trig funcs in ACL Matt Kressel
  - Re: Possible bug in inverse Trig funcs in ACL Raymond Toy
  - Re: Possible bug in inverse Trig funcs in ACL Alan Shutko
    - Re: Possible bug in inverse Trig funcs in ACL Gareth McCaughan
      - Re: Possible bug in inverse Trig funcs in ACL Gareth McCaughan
    - Re: Possible bug in inverse Trig funcs in ACL Fujio Kako
  - Re: Possible bug in inverse Trig funcs in ACL Erik Naggum
  - Re: Possible bug in inverse Trig funcs in ACL Gareth McCaughan
Re: Possible bug in inverse Trig funcs in ACL rusty craine
Re: Possible bug in inverse Trig funcs in ACL Stig Hemmer

From: Gareth McCaughan
Subject: Re: Possible bug in inverse Trig funcs in ACL
Date: Wed, 09 Sep 1998 00:00:00 +0000
Message-ID: <86ww7dpefi.fsf@g.pet.cam.ac.uk>

Matt Kressel wrote:

[ACL says]
> (acos 0.999999993558585) = 1.13502554902016E-4
[some other things say]
> (acos 0.999999993558585) = 1.135025547192...E-4

Well, actually,
  (acos 0.999999993558585) = 1.13502555098247E-4

to that many decimal places, so they're both wrong :-).
Furthermore, the value given by ACL is closer to the truth
than the value you're getting from gcc etc. That's life
with floating-point arithmetic...

-- 
Gareth McCaughan       Dept. of Pure Mathematics & Mathematical Statistics,
·····@dpmms.cam.ac.uk  Cambridge University, England.

From: Matt Kressel
Subject: Re: Possible bug in inverse Trig funcs in ACL
Date: Wed, 09 Sep 1998 00:00:00 +0000
Message-ID: <35F690FA.E6DBC476@atdc.northgrum.com>

Gareth McCaughan wrote:
> 
> Matt Kressel wrote:
> 
> [ACL says]
> > (acos 0.999999993558585) = 1.13502554902016E-4
> [some other things say]
> > (acos 0.999999993558585) = 1.135025547192...E-4
> 
> Well, actually,
>   (acos 0.999999993558585) = 1.13502555098247E-4

What makes you sure that the above answer is the correct one?
I see from your sig that you are in the pure mathematics department,
so help me out here.  What method do/did you use to compute this?

> 
> to that many decimal places, so they're both wrong :-).
> Furthermore, the value given by ACL is closer to the truth
> than the value you're getting from gcc etc. That's life
> with floating-point arithmetic...

I realize that floating point arithmetic can lead to problems such
as rounding and representation, but if we are using double
precision reals and the significand is 52 bits, shouldn't these 
errors appear much further to the right of the decimal?

Secondly, why should four or five implementations return the same number,
precise to twelve or more places, and another differ on the 9th digit?
I would understand if different architectures had this problem, but I get
different numbers on the *same* machine using a different compiler.  The
algorithms, I am assuming, are therefore different.  My question is, if you
can calculate a number precisely on four different compilers, why does the
fifth
suddenly differ?  It would seem to me that the makers of mathematical
languages
such as FORTRAN and C would have used the correct implementation, which led me
to believe that the ACLISP was wrong.  I don't mean to ramble, but I am
thouroughly
confused.

Thanks for any help,
Matt Kressel

-- 
Matthew O. Kressel | INTERNET: ···············@atdc.northgrum.com
+---------  Northrop Grumman Corporation, Bethpage, NY ---------+
+---------  TEL: (516) 346-9101 FAX: (516) 346-9740 ------------+

From: Raymond Toy
Subject: Re: Possible bug in inverse Trig funcs in ACL
Date: Wed, 09 Sep 1998 00:00:00 +0000
Message-ID: <4nu32h5n0r.fsf@rtp.ericsson.se>

>>>>> "Matt" == Matt Kressel <···············@atdc.northgrum.com> writes:

    Matt> Gareth McCaughan wrote:
    >> 
    >> Matt Kressel wrote:
    >> 
    >> [ACL says]
    >> > (acos 0.999999993558585) = 1.13502554902016E-4
    >> [some other things say]
    >> > (acos 0.999999993558585) = 1.135025547192...E-4
    >> 
    >> Well, actually,
    >> (acos 0.999999993558585) = 1.13502555098247E-4

[snip]
    >> 
    >> to that many decimal places, so they're both wrong :-).
    >> Furthermore, the value given by ACL is closer to the truth
    >> than the value you're getting from gcc etc. That's life
    >> with floating-point arithmetic...

    Matt> I realize that floating point arithmetic can lead to problems such
    Matt> as rounding and representation, but if we are using double
    Matt> precision reals and the significand is 52 bits, shouldn't these 
    Matt> errors appear much further to the right of the decimal?

No.  acos(x) has an infinite derivative at 1, so any perturbation in x 
produces a magnified perturbation in acos(x).  You can lose half the
significance near x = 1.

    Matt> Secondly, why should four or five implementations return the same number,
    Matt> precise to twelve or more places, and another differ on the 9th digit?

I suspect most of the differences come from how the implementation
converts that decimal string of digits into the internal binary
format.

For example, using CLISP, I get:

> (acos 0.999999993558585d0)
1.1350255490194527d-4

But if we look at what CLISP converted that number into, and then
compute acos for a single-bit error, we get:

> (integer-decode-float  0.999999993558585d0)
9007199196721884
-53
1
* (setf (long-float-digits) 128)  ; 128-bit long-floats
128
> (acos (scale-float (float 9007199196721884 1L0) -53))
1.135025547192367484359818612137463454237L-4
> (acos (scale-float (float 9007199196721883 1L0) -53))
1.13502555697384795386441451197268726337L-4
> (acos (scale-float (float 9007199196721885 1L0) -53))
1.135025537410886930559886078128397962705L-4

So there's quite a bit of variation in the answer, even if I perturb
the result by a single bit.  This assumes, of course, that the acos
function is correctly in CLISP.  I believe it is.

From this, I would say that others convert the decimal string into the 
same value as CLISP, but ACL does something differently.  Perhaps ACL
is running on a PC and is using the FPU long-floats for greater
precision.

Ray

From: Alan Shutko
Subject: Re: Possible bug in inverse Trig funcs in ACL
Date: Wed, 09 Sep 1998 00:00:00 +0000
Message-ID: <m31zplyzf0.fsf@hubert.wuh.wustl.edu>

>>>>> "M" == Matt Kressel <···············@atdc.northgrum.com> writes:

M> Gareth McCaughan wrote:
>>  Matt Kressel wrote:
>> 
>> [ACL says] > (acos 0.999999993558585) = 1.13502554902016E-4 [some
>> other things say] > (acos 0.999999993558585) = 1.135025547192...E-4
>> 
>> Well, actually, (acos 0.999999993558585) = 1.13502555098247E-4

M> What makes you sure that the above answer is the correct one? 

Good question... Mathematica agrees with you, unless I'm running it
wrong (a distinct possibility)

In[15]:= N[ArcCos[0.999999993558585],20]

Out[15]= 0.0001135025547192367

-- 
Alan Shutko <···@acm.org> - By consent of the corrupted
It is easier to change the specification to fit the program than vice versa.

From: Gareth McCaughan
Subject: Re: Possible bug in inverse Trig funcs in ACL
Date: Thu, 10 Sep 1998 00:00:00 +0000
Message-ID: <86lnnrvoxv.fsf@g.pet.cam.ac.uk>

Alan Shutko wrote:

> Good question... Mathematica agrees with you, unless I'm running it
> wrong (a distinct possibility)
> 
> In[15]:= N[ArcCos[0.999999993558585],20]
> 
> Out[15]= 0.0001135025547192367

Then Mathematica is wrong (or, at best, profoundly misleading).
It has rounded that number 0.999etc to the nearest multiple of
2^-52, and then done the calculation accurately. Unfortunately,
"to the nearest multiple of 2^-52" isn't enough to give 20 places
of accuracy in reading the number. And, worse still (though
Mathematica can hardly be expected to work this out), to get
20 places of accuracy in the *answer* you need rather more than
20 places of accuracy in the *input*. So ideally it ought to
have represented that number 0.999etc *exaectly* initially,
then looked at the computation it was being asked to do,
checked the derivative of the function involved, realised
that it required about 24 decimal digits of accuracy in its
input, then converted 0.999etc to a 24-decimal-digit floating
pointer number, and then done the calculation. (Perhaps it
should have taken a couple of extra places for safety.)

However, Mathematica is not as intelligent as the people who
sell it would like you to believe. It won't replace mathematicians
for a while yet, even for doing simple calculations like this
one.

(The number whose inverse cosine it computed was, to 20 places,
.99999999355858504301. The number it was asked to do was, to 20 places,
.99999999355858500000.)

-- 
Gareth McCaughan       Dept. of Pure Mathematics & Mathematical Statistics,
·····@dpmms.cam.ac.uk  Cambridge University, England.

From: Gareth McCaughan
Subject: Re: Possible bug in inverse Trig funcs in ACL
Date: Thu, 10 Sep 1998 00:00:00 +0000
Message-ID: <8667evvmfw.fsf@g.pet.cam.ac.uk>

I wrote:

> (The number whose inverse cosine it computed was, to 20 places,
> .99999999355858504301. The number it was asked to do was, to 20 places,
> .99999999355858500000.)

A pedantic correction: the `1' in the last place of that first
number should be a `2'.

Of course, my claim about 2^-52 is based on conjecture. What
I can say is that if the answer it gave is the result of
rounding acos(x) to 20 places, then x lies between
.999999993558585043018638 and .999999993558585043018650.
Which the number whose inverse cosine was actually asked for,
alas, does not.

-- 
Gareth McCaughan       Dept. of Pure Mathematics & Mathematical Statistics,
·····@dpmms.cam.ac.uk  Cambridge University, England.

From: Fujio Kako
Subject: Re: Possible bug in inverse Trig funcs in ACL
Date: Wed, 16 Sep 1998 00:00:00 +0000
Message-ID: <3600210C.1F62@ics.nara-wu.ac.jp>

Alan Shutko wrote:
> 
> >>>>> "M" == Matt Kressel <···············@atdc.northgrum.com> writes:
> 
> M> Gareth McCaughan wrote:
> >>  Matt Kressel wrote:
> >>
> >> [ACL says] > (acos 0.999999993558585) = 1.13502554902016E-4 [some
> >> other things say] > (acos 0.999999993558585) = 1.135025547192...E-4
> >>
> >> Well, actually, (acos 0.999999993558585) = 1.13502555098247E-4
> 
> M> What makes you sure that the above answer is the correct one?
> 
> Good question... Mathematica agrees with you, unless I'm running it
> wrong (a distinct possibility)
> 
> In[15]:= N[ArcCos[0.999999993558585],20]
> 
> Out[15]= 0.0001135025547192367
> 

by using REDUCE, I got

1: on rounded,numval;
     
2: precision 100;
     
     12
     
3: acos(0.999999993558585);
     
     0.000113502555098247057440745866065922007430764955868704707992
     
     7633852808952520192285164437383971546635247
     
4: acos(0.999999993558585043018638316);
     
     0.000113502554719236748439730630143643712855713534227591167970
     
     0170119045701006655026706340283183889010584
     

where 0.999999993558585043186... is a double precision representation of
0.999999993558585.

From: Erik Naggum
Subject: Re: Possible bug in inverse Trig funcs in ACL
Date: Wed, 09 Sep 1998 00:00:00 +0000
Message-ID: <3114373983670895@naggum.no>

* Matt Kressel <···············@atdc.northgrum.com>
| Secondly, why should four or five implementations return the same number,
| precise to twelve or more places, and another differ on the 9th digit?

  one easy answer is that those four or five use the same implementation of
  `acos' and/or the floating-point parser, such as the system math library.
  
#:Erik
-- 
  http://www.naggum.no/spam.html is about my spam protection scheme and how
  to guarantee that you reach me.  in brief: if you reply to a news article
  of mine, be sure to include an In-Reply-To or References header with the
  message-ID of that message in it.  otherwise, you need to read that page.

From: Gareth McCaughan
Subject: Re: Possible bug in inverse Trig funcs in ACL
Date: Thu, 10 Sep 1998 00:00:00 +0000
Message-ID: <86ogsnvpdl.fsf@g.pet.cam.ac.uk>

Matt Kressel wrote:

[I said:]
>> Well, actually,
>>   (acos 0.999999993558585) = 1.13502555098247E-4
> 
> What makes you sure that the above answer is the correct one?
> I see from your sig that you are in the pure mathematics department,
> so help me out here.  What method do/did you use to compute this?

I stuffed it into a symbolic algebra package, asked for
lots of decimal places of accuracy, and looked to see
what it gave me. :-) Perhaps that was unwise; so I've
just tried again with a different package. Same answer.
(The two packages were PARI/GP and Maple. I asked Maple
to use 1000 digits for its floating-point calculations;
I don't remember exactly what I asked PARI for.)

> I realize that floating point arithmetic can lead to problems such
> as rounding and representation, but if we are using double
> precision reals and the significand is 52 bits, shouldn't these 
> errors appear much further to the right of the decimal?

I suspect the problem is with the input rather than the output,
as it were. You've got, in effect, an error of about 2^-52
in the input. At the input value we're looking at, the
derivative of the acos function is about -2^13, so that
error gets magnified into an error of about 2^-39, which
means something like 2*10^-12. Let's see...
           111
  123456789012
0.000113502555098  right answer
0.000113502554719  one wrong answer
0.000113502554902  another wrong answer

So they've done pretty well, actually.

> Secondly, why should four or five implementations return the same number,
> precise to twelve or more places, and another differ on the 9th digit?

I would guess that the four or five use your system's standard
mathematics library, and ACL has its own.

> It would seem to me that the makers of mathematical languages
> such as FORTRAN and C would have used the correct implementation,
> which led me to believe that the ACLISP was wrong.

There isn't a single `correct' implementation; there are lots of
ways of calculating arctangents. And it may be that actually
the systems differ in how they've read in the number.

However, here is some evidence that suggests that ACL *may*
be doing something less "correct" than the other systems.
If you take your number 0.999[etc], round it to the nearest
multiple of 2^-52, and calculate the inverse cosine of *that*,
you get 0.0001135025547192[etc] -- which is exactly what the
other systems are saying. So, *either* ACL is doing something
more sophisticated, *or* it's doing something less correct.
I have no idea which; I think I would guess at the latter,
because it's hard to see what more sophisticated thing it
could plausibly be doing.

> I don't mean to ramble, but I am thoroughly confused.

I hope you're now confused on a higher level. :-)

-- 
Gareth McCaughan       Dept. of Pure Mathematics & Mathematical Statistics,
·····@dpmms.cam.ac.uk  Cambridge University, England.

From: rusty craine
Subject: Re: Possible bug in inverse Trig funcs in ACL
Date: Wed, 09 Sep 1998 00:00:00 +0000
Message-ID: <35F6A709.C289644C@flash.net>

Matt Kressel wrote:
> 
> Hello,
>         I have discovered a possible bug in the implementation
> of the inverse trig functions in ACL 3.0.2 for windows.
> 
> Specifically:
> 
> on ACL:
> (acos 0.999999993558585) = 1.13502554902016E-4
> 
> on Microsoft Visual C++, GNU C/C++ Linux, Microsoft FORTRAN, and
> Macintosh Common LISP 3.0
> (acos 0.999999993558585) = 1.135025547192...E-4
> 
> Although the difference seems insignificant, if one
> is doing mathematically intensive calculations, the error
> can become cumulative and skew sensitive data.
> 

Floating point computations in any language are not for the faint of
heart.  There are PhD's offered in nurmerical analysis.   The narrower
the critical range of the result (if close is not good enough) the
smarter your program has to be and the smarter your programmer has to
be.  Some compliers/interpreter let noisy bits creep in worse than
other.  Two things ya should do.  Check the IEEEE criteria for floating
point computations and see if their specifitations meet your critical
range and then see if your compiler complies with IEEE specs. If both of
those things are true and not good enough...that is not good alot!! 
_Volume 2 Numerical Analysis_ by (oh geeez forget his name with going to
look) Kunce(?), Knoch(?), Knuch(?).  Oh well some of the math gurus will
know it.  Even though I have a masters in math we had to hire a fully
feathered numerical analysis guru.  the air was much to rare for we
fledglings.  Not much help I know, sorry.  i will be very interested in
the advice from some of the really bright folks.
rusty
PS Don't email me form help.....we paid over $12K to get the problems
solved :)

From: Stig Hemmer
Subject: Re: Possible bug in inverse Trig funcs in ACL
Date: Wed, 09 Sep 1998 00:00:00 +0000
Message-ID: <ekvk93d40h5.fsf@bigblue.pvv.ntnu.no>

Matt Kressel <···············@atdc.northgrum.com> writes:
> (acos 0.999999993558585) = 1.13502554902016E-4
or
> (acos 0.999999993558585) = 1.135025547192...E-4

Raymond Toy said something similar, but I thought I should make it
more explict:

The main problem is probably in the input.  Decimal numbers aren't
exactly representable in a binary computer and there is round-off
errors in converting a number from the printed form to internal form.
These round-off errors can vary from architecture to architecture.

Using (acos) near 1.0 magnifies these errors to the point where they
become a serious problem.

> Although the difference seems insignificant, if one
> is doing mathematically intensive calculations, the error
> can become cumulative and skew sensitive data.

I think your mathematical model needs some work.  Calling (acos) near
1.0 is _not_ healthy, regardless of the precision of the calculations.

For example, assume a program uses the expression
(/ 1.0 (- (+ x (/ 1.0 y)) x))
where x and y are both large numbers.

This will give large round-off errors, which can be avoided by
recognizing that the expression can in fact be simplified to y.

I'm not saying that your problem is as simple as that, but you should
try to rearrage you math so that you stay away from near-singular
areas like (acos 1.0).

In general, seeing a long line of 9's or 0's in a decimal number is a
warning bell that you are probably losing precision. (0's like in
1.0000xyz, not like in 0.0000xyz)

Stig Hemmer, 
Jack of a Few Trades.