Rocksolid Light

devel / comp.arch / Re: 80x87 and IEEE 754

Subject  (Author)
* Aprupt underflow mode  (Thomas Koenig)
+- Re: Abrupt underflow mode  (John Dallman)
+- Re: Aprupt underflow mode  (Michael S)
`* Re: Aprupt underflow mode  (Anton Ertl)
 `* Re: Aprupt underflow mode  (Michael S)
  `* Re: Aprupt underflow mode  (Terje Mathisen)
   +* Re: Aprupt underflow mode  (BGB)
   |`* Re: Aprupt underflow mode  (Michael S)
   | `* Re: Aprupt underflow mode  (Quadibloc)
   |  +* Re: Aprupt underflow mode  (BGB)
   |  |+* Re: Aprupt underflow mode  (MitchAlsup)
   |  ||`- Re: Aprupt underflow mode  (BGB)
   |  |`* Re: Aprupt underflow mode  (EricP)
   |  | `* Re: Aprupt underflow mode  (BGB)
   |  |  +* Re: Aprupt underflow mode  (MitchAlsup)
   |  |  |+- Re: Aprupt underflow mode  (robf...@gmail.com)
   |  |  |`* Re: Aprupt underflow mode  (BGB)
   |  |  | +- Re: Aprupt underflow mode  (BGB)
   |  |  | `* Re: Aprupt underflow mode  (robf...@gmail.com)
   |  |  |  `* Re: Aprupt underflow mode  (BGB)
   |  |  |   `- Re: Aprupt underflow mode  (MitchAlsup)
   |  |  `* Re: Aprupt underflow mode  (EricP)
   |  |   `- Re: Aprupt underflow mode  (EricP)
   |  `* Re: Aprupt underflow mode  (Michael S)
   |   `* Re: Aprupt underflow mode  (Terje Mathisen)
   |    `* Re: Aprupt underflow mode  (Michael S)
   |     `* Re: Aprupt underflow mode  (Terje Mathisen)
   |      +* Re: Aprupt underflow mode  (Michael S)
   |      |`* Re: Aprupt underflow mode  (Terje Mathisen)
   |      | +* Re: Aprupt underflow mode  (Thomas Koenig)
   |      | |`* Re: Aprupt underflow mode  (Michael S)
   |      | | `- Re: Aprupt underflow mode  (Michael S)
   |      | `* 80x87 and IEEE 754 (was: Aprupt underflow mode)  (Anton Ertl)
   |      |  +* Re: 80x87 and IEEE 754 (was: Aprupt underflow mode)  (MitchAlsup)
   |      |  |+- Re: 80x87 and IEEE 754 (was: Aprupt underflow mode)  (Michael S)
   |      |  |`* Re: 80x87 and IEEE 754 (was: Aprupt underflow mode)  (BGB)
   |      |  | `- Re: 80x87 and IEEE 754 (was: Aprupt underflow mode)  (BGB)
   |      |  `* Re: 80x87 and IEEE 754  (Terje Mathisen)
   |      |   +* Re: 80x87 and IEEE 754  (Michael S)
   |      |   |`* Re: 80x87 and IEEE 754  (Terje Mathisen)
   |      |   | `- Re: 80x87 and IEEE 754  (Michael S)
   |      |   +* Re: 80x87 and IEEE 754  (MitchAlsup)
   |      |   |`* Re: 80x87 and IEEE 754  (Terje Mathisen)
   |      |   | +- Re: 80x87 and IEEE 754  (Thomas Koenig)
   |      |   | `- Re: 80x87 and IEEE 754  (MitchAlsup)
   |      |   `* Re: 80x87 and IEEE 754  (EricP)
   |      |    +* Re: 80x87 and IEEE 754  (EricP)
   |      |    |`- Re: 80x87 and IEEE 754  (MitchAlsup)
   |      |    `* Re: 80x87 and IEEE 754  (EricP)
   |      |     `* Re: 80x87 and IEEE 754  (MitchAlsup)
   |      |      `* Re: 80x87 and IEEE 754  (EricP)
   |      |       `* Re: 80x87 and IEEE 754  (MitchAlsup)
   |      |        `* Re: 80x87 and IEEE 754  (EricP)
   |      |         `* Re: 80x87 and IEEE 754  (Quadibloc)
   |      |          `- Re: 80x87 and IEEE 754  (MitchAlsup)
   |      `- Re: Aprupt underflow mode  (MitchAlsup)
   `- Re: Aprupt underflow mode  (Michael S)

Re: 80x87 and IEEE 754

<efe4736d-5fe8-437b-bb6d-527d972ed307n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=32991&group=comp.arch#32991
Newsgroups: comp.arch
 by: MitchAlsup - Sat, 1 Jul 2023 19:36 UTC

On Saturday, July 1, 2023 at 1:04:40 PM UTC-5, EricP wrote:
> EricP wrote:
> > Terje Mathisen wrote:
> >> Anton Ertl wrote:
> >
> > It took a bit of rummaging about but the 8087 was designed by
> > John F. Palmer, Bruce W. Ravenel, Rafi Nave.
> >
> > Palmer wrote a 1980 article which describes some of its design rationale.
> >
> > The INTEL 8087 Numeric Data Processor, Palmer, 1980
> > https://search.iczhiku.com/paper/DyXtJLK9sG1DZUjG.pdf
<
> After poking about some more, just guessing but it looks to me like the
> reason that Double Rounding (DR) happened was: nobody thought of it.
<
a) this correlates with my history, nobody told us (88K) about it until about
1988.
b) it happens very infrequently:: about 1 in 2^28 for double precision to
single precision.
c) double rounding has a slight statistical bias in favor of higher accuracy
in real FP codes, and a slight statistical bias against numerical checking
codes.
>
> From looking at the dates on papers it doesn't look like anyone
> even noticed DR was happening until 1995. Then they started
> backtracking the problem to its source.
<
I think the Stanford Paranoia FP test suite found it.
>
> Palmer introduces the FP80 format for the reasons he gives and
> that unknowingly creates the potential for DR on FP80 to FP64.
> The 8087 launches in 1980.
> But it still takes 15 years for anyone to detect that DR is an
> issue and only if you spill the FP80 to FP64 with a store.
<
Closer to 8 years for those of us in the trenches, more like
double that for the average FP programmer.
>
> When is Double Rounding Innocuous, Figueroa, 1995
> https://dl.acm.org/doi/pdf/10.1145/221332.221334
<
This paper indicates that DR occurs about 1 in 2^28 RMS
conversions of DP->SP. Which for any FP calculation where
the original dataset contains noise is too far under the noise
floor to be visible. However the Stanford Paranoia IEEE 754
test suite has data sets that contain no noise whatsoever.
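The kind of case a deliberately noise-free suite can catch is easy to show with decimal arithmetic: rounding to a wider precision first can create an exact tie that the second rounding then resolves differently than a single direct rounding would. A minimal Python sketch (a decimal stand-in for the effect, not the x87 mechanics):

```python
from decimal import Decimal, ROUND_HALF_EVEN

x = Decimal("2.5451")

# Round directly to one decimal place: 0.0451 is below the halfway
# point, so the result rounds down.
direct = x.quantize(Decimal("0.1"), rounding=ROUND_HALF_EVEN)     # 2.5

# Round in two steps: first to two places (0.0051 above the halfway
# point, rounds up to the exact tie 2.55), then to one place, where
# round-to-nearest-even resolves the tie upward to 2.6.
step1 = x.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)     # 2.55
twice = step1.quantize(Decimal("0.1"), rounding=ROUND_HALF_EVEN)  # 2.6
```

The two results differ in the last place, which is exactly the discrepancy a tolerance-free test suite flags.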
>
> If p is the smaller fraction #bits then DR differences only happen
> when the larger fraction bits is < 2p or 2p+1 or 2p+2 bits,
> depending on the operation + - * / or sqrt.
> This could not have happened on PDP11 or VAX FP formats.
>
Or IBM hexadecimal format or CDC 6600 SP and DP formats.
<
Or ANY FP format where the number of exponent bits is the
same in SP and DP. It is only when the exponent field grows
as the precision increases that it can rear its ugly head
{above the noise floor}.
<
It will be interesting to see if posits display DR or whether
their extra statistical precision makes DR even harder to see.
>
> It could not happen on 8087 for FP80 to FP32, just FP80 to FP64.
<
Oh what a tangled web we weave............
>
> IIRC the Microsoft 16-bit compilers for Win3.1 supported FP80 as a
> native data type. However the 32-bit MS compiler for WinNT in 1992 onwards
> did not, just FP32 and FP64, for both results and spilled intermediates.
> And 32-bit code in WinNT 3.1 and 3.5 kinda languishes until Win95 lands.
>
> Then Win95 launches and the frequency of spilling FP80 to FP64 really
> goes up, and that's when people notice it and go "oh... crap".

Re: 80x87 and IEEE 754 (was: Aprupt underflow mode)

<u7pvi3$30pvq$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=32992&group=comp.arch#32992
Newsgroups: comp.arch
 by: BGB - Sat, 1 Jul 2023 19:41 UTC

On 6/29/2023 5:59 PM, BGB wrote:
> On 6/29/2023 1:30 PM, MitchAlsup wrote:
>> On Thursday, June 29, 2023 at 12:13:59 PM UTC-5, Anton Ertl wrote:
>>> Terje Mathisen <terje.m...@tmsw.no> writes:
>>>> The only problem with the precision control word setting single or
>>>> double precision is that the exponent range isn't modified: If you want
>>>> to do that as well, in order to force under or overflow, then you do
>>>> have to store to a memory format.
>>> That does not give the same results in all cases; it's a
>>> double-rounding problem: First round for (say) 53-bit mantissa with
>>> 16-bit exponent, then round for 53-bit mantissa with 11-bit exponent.
>>>
>>> My guess is that Intel did not add a proper binary64 and binary32
>>> mode, because the cases where it makes a difference are rare.
>> <
>> One can even argue that the surprise factor is lower with the larger
>> exponent..............fewer programmers are surprised by spurious
>> results when overflows don't happen, than when they do.
>> <
>
> In most contexts, it shouldn't matter.
>
> Granted, in some of my NN experiments, I did end up needing to emulate
> Binary16 on my PC as otherwise my training code would sometimes lead to
> intermediate results going outside the range of what could be represented.
>
> Well, and ended up with a "partial back-propagation" hack in that if an
> intermediate value went outside of an allowed range, it would go back
> and reduce all the weights.
>
> Though, arguably, using BF16 could partly avoid this issue, provided at
> least one doesn't overflow the Binary32 exponent range (or limit the
> weights to a small enough range that an overflow is "basically
> impossible").
>
> But, say, clamping Binary32 or BF16 to not go over 2^120, is not asking
> quite as much as clamping Binary16 to around 2^8 to be sure that no
> intermediate values are able to exceed the 2^15 limit...
>
>
>
> Though, for the most part, in my experiment I was "mostly" using a
> genetic-algorithm approach, where for the top performing nets in each
> generation, it would breed new nets by randomly interpolating the
> weights (1).
>
> But, yeah, with emulation logic, and code to detect and reduce all the
> weights if at any point the accumulator went over 2^14.
>
> OTOH, arguably if one used S.E4.F3 weights, it would become basically
> impossible to overflow the Binary16 range for any "reasonable sized" net...
>
>
> 1, With an interpolation value of, say:
>   0.5+(ssqrt(rand()-16384)/181.0)
> Where ssqrt is, say:
>   (x>=0)?sqrt(x):(-sqrt(-x))
>
> Where, it will primarily interpolate, but may also extrapolate.
>
> Where, say, an approximation of ssqrt is also used as one of the main
> activation functions in this case, ...
>

Note that this can only be used if there is sufficient difference
between the two values, otherwise one might use other strategies:
Randomly adjusting in a positive or negative direction;
Having a probability of randomly flipping bits;
...
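The interpolation factor quoted above can be sketched directly; this assumes a 15-bit rand() returning 0..32767, which is what the -16384 offset and the /181.0 scale (181 ~= sqrt(32768)) suggest:

```python
import math
import random

def ssqrt(x):
    # Signed square root: sqrt of the magnitude, keeping the sign.
    return math.sqrt(x) if x >= 0 else -math.sqrt(-x)

def interp_factor():
    # rand() assumed to be 15-bit (0..32767); centering on 16384 and
    # dividing by 181 ~= sqrt(32768) keeps the factor roughly within
    # [-0.21, 1.21]: mostly interpolation, occasionally extrapolation.
    r = random.randint(0, 32767)
    return 0.5 + ssqrt(r - 16384) / 181.0
```

A factor in [0, 1] blends the two parent weights; the tails slightly outside that range give the extrapolation mentioned above.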

>
> Though, one could maybe also argue for using back-propagation, or some
> hybrid strategy (back-propagate for the error in each test-run before
> selecting and breeding the nets?...).
>
> Well, also maybe one can argue for more than 3 hidden layers, but in my
> testing, effectiveness seemed to start to drop (relative to generation
> count) when going beyond 3 layers (whereas 3 seemed to be more effective
> than 1 or 2 layers). Though, this is likely to be task dependent to some
> extent...
>

Started looking into it: back-propagation kinda looks like a pain to
implement in a multi-layer net, and it also seems like it will be
computationally expensive.

So, for now I continue along with a potentially crappy genetic algorithm
based net training...

>
>
> Though, I guess it might be interesting to try some standardized
> benchmarks here, like digit recognition (say, so I can have some idea
> how it compares to "other stuff").
>
> Finds and goes to download the training data for the MNIST benchmark...
> ... Why exactly does it need to be multiple GB?...
> ... Why does it need to be an epic crapton of small files?...
>
> This is a total abuse of the NTFS filesystem, but ironically turning on
> "file and folder compression" makes the unzipping process faster...
>
> OK, I guess there isn't just digits here, but also a bunch of pictures
> of pieces of clothing for some reason. So, I guess the idea is that one
> can also train a net to hopefully tell the difference between
> pants/shirts/shoes/... ?...
>
> Yeah, unzipping this file is taking a painfully long time in any case...
>

I tried using a small subset for "useful" testing (noted that there were
images with grids of all of the same digit).

Even with the small subset used, it is still likely that the net would
only see a repeat of a given digit once every 200-300 generations or so.

Earlier tests with a net like:
128+64+32+12 neurons (first variant)
128+64+64+12 neurons (second, helped some)

After a number of generations, it was getting barely above random chance
(~ 12-16%, where random chance is 10% +/- ~ 3%).

But, around 40% if the task was matching printed digits instead (though,
the test set is considerably smaller in this case).

Some tweaking later:
256+128+128+64+12 neurons

The first layer is fed the image "fairly directly", though I did end up
down-sampling to 16x16 mostly to "make this stuff a little faster".

This configuration gives a net that is around 243K of floating-point
weights.

This seems to converge towards around a 60 .. 70% accuracy with printed
digits, ~ 30% with a mix of printed and hand-written, and ~ 20% with
purely hand-written...

Running initially with printed digits seems to improve accuracy once it
switches over to the hand-written digits.

Was getting best results (for "picking the right answer") when ranking
nets based primarily on the number of correct guesses.

This seems like it kinda sucks vs some of the other nets.
Though, it looks like most of the other nets are "Deep" or
"Convolutional" nets.
Doesn't really look like anyone is using perceptron style nets...

Not sure how much of the lackluster accuracy is due to net size,
implementation details, or training algorithm (intuitively, it doesn't
seem like digit recognition should need such a large net).

Accuracy rate seems moderately sensitive to net size and precision.
Or, at least, accuracy drops if I use truncated 10-bit values for sake
of lookup tables. Also accuracy drops if I try to use a smaller net
(trying to map input to output with 1 or 2 layers being entirely
ineffective in this case).

Moving an existing net from 16 to 10 bit, the net becomes temporarily
ineffective (it seems that it is somehow using a different strategy with
very low precision).

Moving the net from 10 to 16, there is a temporary loss of accuracy but
it soon recovers.

Say:
  16-bit, direct emulation: Not terribly fast, but more accurate.
  12-bit, lookup table: Somehow slower than direct emulation...
    Granted, this is ~ 70 MB of lookup tables in this case;
    an A+B table alone is 32MB, and the table fetch is dead slow.
  10-bit, lookup table: Faster but less accurate.
    Say, an A+B table is ~ 2MB.
  9-bit, lookup table: No obvious speed gain over 10-bit.
    Say, an A+B table is ~ 256K.
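Those sizes line up with a square A+B table indexed by two truncated operands, assuming 16-bit entries (a guess on my part; the ~256K figure for 9-bit would instead imply 1-byte or otherwise packed entries):

```python
def ab_table_bytes(bits, entry_bytes=2):
    # A square A+B lookup table: one entry for every pair of
    # bits-wide truncated operands, entry_bytes bytes per entry.
    return (1 << bits) * (1 << bits) * entry_bytes
```

For example, 12-bit operands give 2^24 entries, i.e. 32 MiB at 2 bytes each, matching the figure quoted above.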

For the direct emulation, there are a few strategies:
Bit twiddle into float, do operation, bit twiddle back;
Just do the op entirely with integer math.
In this case, the latter seems to be the faster option.
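The first strategy (bit-twiddle into a native float, do the op, twiddle back) can be sketched in Python, which can pack and unpack IEEE binary16 bit patterns with struct's 'e' format (3.6+). The function name is illustrative:

```python
import struct

def f16_add(a_bits, b_bits):
    # Unpack the two binary16 bit patterns to native doubles, add,
    # and repack; struct.pack rounds the sum back to nearest-even
    # binary16. The intermediate double rounding is innocuous here
    # by Figueroa's criterion, since 53 >= 2*11 + 2.
    a = struct.unpack('<e', a_bits.to_bytes(2, 'little'))[0]
    b = struct.unpack('<e', b_bits.to_bytes(2, 'little'))[0]
    return int.from_bytes(struct.pack('<e', a + b), 'little')
```

A real emulator would also have to handle overflow to infinity explicitly (struct raises on out-of-range values), which is where the pure integer-math path gets its appeal.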

Ended up implementing a direct MAC operator in the emulation code, as it
was faster in this case to emulate a MAC operator than to use separate
MUL and ADD.

Granted, one could argue that on a PC, there is really no reason to
bother with Binary16, but alas...

The code seems to primarily end up preferring activation functions in
the order:
1st place: Ssqrt
2nd place: Sqrt (negative becomes 0)
3rd place: ReLU (negative clamped to 0)
4th place: Linear

....

Re: 80x87 and IEEE 754

<HzjoM.4054$Sc61.1967@fx39.iad>

https://news.novabbs.org/devel/article-flat.php?id=32995&group=comp.arch#32995
Newsgroups: comp.arch
 by: EricP - Sun, 2 Jul 2023 18:43 UTC

MitchAlsup wrote:
> On Saturday, July 1, 2023 at 1:04:40 PM UTC-5, EricP wrote:
>> EricP wrote:
>>> Terje Mathisen wrote:
>>>> Anton Ertl wrote:
>>> It took a bit of rummaging about but the 8087 was designed by
>>> John F. Palmer, Bruce W. Ravenel, Rafi Nave.
>>>
>>> Palmer wrote a 1980 article which describes some of its design rationale.
>>>
>>> The INTEL 8087 Numeric Data Processor, Palmer, 1980
>>> https://search.iczhiku.com/paper/DyXtJLK9sG1DZUjG.pdf
> <
>> After poking about some more, just guessing but it looks to me like the
>> reason that Double Rounding (DR) happened was: nobody thought of it.
> <
> a) this correlates with my history, nobody told us (88K) about it until about
> 1988.
> b) it happens very infrequently:: about 1 in 2^28 for double precision to
> single precision.
> c) double rounding has a slight statistical bias in favor of higher accuracy
> in real FP codes, and a slight statistical bias against numerical checking
> codes.
>> From looking at the dates on papers it doesn't look like anyone
>> even noticed DR was happening until 1995. Then they started
>> backtracking the problem to its source.
> <
> I think the Stanford Paranoia FP test suite found it.
>> Palmer introduces the FP80 format for the reasons he gives and
>> that unknowingly creates the potential for DR on FP80 to FP64.
>> The 8087 launches in 1980.
>> But it still takes 15 years for anyone to detect that DR is an
>> issue and only if you spill the FP80 to FP64 with a store.
> <
> Closer to 8 years for those of us in the trenches, more like
> double that for the average FP programmer.
>> When is Double Rounding Innocuous, Figueroa, 1995
>> https://dl.acm.org/doi/pdf/10.1145/221332.221334
> <
> This paper indicates that DR occurs about 1 in 2^28 RMS
> conversions of DP->SP. Which for any FP calculation where
> the original dataset contains noise is too far under the noise
> floor to be visible. However the Stanford Paranoia IEEE 754
> test suite has data sets that contain no noise whatsoever.
>> If p is the smaller fraction #bits then DR differences only happen
>> when the larger fraction bits is < 2p or 2p+1 or 2p+2 bits,
>> depending on the operation + - * / or sqrt.
>> This could not have happened on PDP11 or VAX FP formats.
>>
> Or IBM hexadecimal format or CDC 6600 SP and DP formats.
> <
> Or ANY FP format where the number of exponent bits is the
> same SP and DP. It is only when the exponent field grows
> as the precision increases where it can rear its ugly head
> {above the noise floor}.
> <
> It will be interesting to see if posits display DR or whether
> their extra statistical precision makes DR even harder to see.
>> It could not happen on 8087 for FP80 to FP32, just FP80 to FP64.
> <
> Oh what a tangled web we weave............

The paper doesn't mention exponent, just that it is based on the
significand precision.

I just found a reference to a slightly different issue called
"double-rounding on underflow" where IIUC a larger FP format result
produces a denorm and gets rounded. Then a down convert to smaller
FP format and the denorm gets rounded again to a different value
than if the first denorm had gone directly to the smaller format.

Is this what you are referring to as DP to SP conversion DR?

Re: 80x87 and IEEE 754

<593026f2-8ad9-47df-9b0b-e2069fc50082n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=32996&group=comp.arch#32996
Newsgroups: comp.arch
 by: MitchAlsup - Sun, 2 Jul 2023 20:43 UTC

On Sunday, July 2, 2023 at 1:43:56 PM UTC-5, EricP wrote:
> MitchAlsup wrote:
> > On Saturday, July 1, 2023 at 1:04:40 PM UTC-5, EricP wrote:
> >> EricP wrote:
> >>> Terje Mathisen wrote:
> >>>> Anton Ertl wrote:
> >>> It took a bit of rummaging about but the 8087 was designed by
> >>> John F. Palmer, Bruce W. Ravenel, Rafi Nave.
> >>>
> >>> Palmer wrote a 1980 article which describes some of its design rationale.
> >>>
> >>> The INTEL 8087 Numeric Data Processor, Palmer, 1980
> >>> https://search.iczhiku.com/paper/DyXtJLK9sG1DZUjG.pdf
> > <
> >> After poking about some more, just guessing but it looks to me like the
> >> reason that Double Rounding (DR) happened was: nobody thought of it.
> > <
> > a) this correlates with my history, nobody told us (88K) about it until about
> > 1988.
> > b) it happens very infrequently:: about 1 in 2^28 for double precision to
> > single precision.
> > c) double rounding has a slight statistical bias in favor of higher accuracy
> > in real FP codes, and a slight statistical bias against numerical checking
> > codes.
> >> From looking at the dates on papers it doesn't look like anyone
> >> even noticed DR was happening until 1995. Then they started
> >> backtracking the problem to its source.
> > <
> > I think the Stanford Paranoia FP test suite found it.
> >> Palmer introduces the FP80 format for the reasons he gives and
> >> that unknowingly creates the potential for DR on FP80 to FP64.
> >> The 8087 launches in 1980.
> >> But it still takes 15 years for anyone to detect that DR is an
> >> issue and only if you spill the FP80 to FP64 with a store.
> > <
> > Closer to 8 years for those of us in the trenches, more like
> > double that for the average FP programmer.
> >> When is Double Rounding Innocuous, Figueroa, 1995
> >> https://dl.acm.org/doi/pdf/10.1145/221332.221334
> > <
> > This paper indicates that DR occurs about 1 in 2^28 RMS
> > conversions of DP->SP. Which for any FP calculation where
> > the original dataset contains noise is too far under the noise
> > floor to be visible. However the Stanford Paranoia IEEE 754
> > test suite has data sets that contain no noise whatsoever.
> >> If p is the smaller fraction #bits then DR differences only happen
> >> when the larger fraction bits is < 2p or 2p+1 or 2p+2 bits,
> >> depending on the operation + - * / or sqrt.
> >> This could not have happened on PDP11 or VAX FP formats.
> >>
> > Or IBM hexadecimal format or CDC 6600 SP and DP formats.
> > <
> > Or ANY FP format where the number of exponent bits is the
> > same SP and DP. It is only when the exponent field grows
> > as the precision increases where it can rear its ugly head
> > {above the noise floor}.
> > <
> > It will be interesting to see if posits display DR or whether
> > their extra statistical precision makes DR even harder to see.
> >> It could not happen on 8087 for FP80 to FP32, just FP80 to FP64.
> > <
> > Oh what a tangled web we weave............
<
> The paper doesn't mention exponent, just that it is based on the
> significand precision.
<
The paper was referring to IEEE 754 double rounding--thus taking
on all the benefits and detriments of that axiom.
<
When you double the width of the container and do not increase
the size of the exponent you always end up in a situation where
the equations in the paper are obeyed; thus no double rounding.
<
Only when the exponent is increased with the width doubling
does double rounding become possible.
<
80-bit implementations suffer DR on 80->64 because 64 < 2*52+2,
but not on 80->32 because 64 > 2*23+2.
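That inequality is easy to sanity-check in code, using the fraction-bit counts as written above (FP80 keeps 64 explicit significand bits; FP64 and FP32 have 52 and 23 fraction bits):

```python
def mul_double_rounding_possible(wide_frac, narrow_frac):
    # Figueroa's multiplication criterion, in the fraction-bit form
    # used above: double rounding a product is innocuous when the
    # wide format carries at least 2*p + 2 fraction bits, where p is
    # the narrow format's fraction width; otherwise it can differ
    # from a single direct rounding.
    return wide_frac < 2 * narrow_frac + 2

# FP80 -> FP64: 64 < 2*52 + 2 = 106, so double rounding can occur.
# FP80 -> FP32: 64 > 2*23 + 2 = 48, so it cannot.
```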
>
> I just found a reference to a slightly different issue called
> "double-rounding on underflow" where IIUC a larger FP format result
> produces a denorm and gets rounded. Then a down convert to smaller
> FP format and the denorm gets rounded again to a different value
> than if the first denorm had gone directly to the smaller format.
<
A 64-bit denorm converts to a 32-bit zero.
A 128-bit denorm converts to a 64-bit zero.
<
Can you cite paper/reference ??
>
> Is this what you are referring to as DP to SP conversion DR?

Re: 80x87 and IEEE 754

<EqmoM.292$cFK.123@fx34.iad>

https://news.novabbs.org/devel/article-flat.php?id=32997&group=comp.arch#32997
Newsgroups: comp.arch
 by: EricP - Sun, 2 Jul 2023 21:59 UTC

MitchAlsup wrote:
> On Sunday, July 2, 2023 at 1:43:56 PM UTC-5, EricP wrote:
>> I just found a reference to a slightly different issue called
>> "double-rounding on underflow" where IIUC a larger FP format result
>> produces a denorm and gets rounded. Then a down convert to smaller
>> FP format and the denorm gets rounded again to a different value
>> than if the first denorm had gone directly to the smaller format.
> <
> A 64-bit denorm converts to a 32-bit zero.
> A 128-bit denorm converts to a 64-bit zero.
> <
> Can you cite paper/reference ??

That was from this paper, starting at the bottom of page 13:

"Double rounding can also cause some subtle differences for very small
numbers that are rounded into subnormal double-precision values if
computed in IEEE-754 double precision: if one uses the “double-precision”
mode of the x87 FPU, these numbers will be rounded into normalised values
inside the FPU register, because of a wider range of negative exponents;
then they will be rounded again into double-precision subnormals when
written to memory. This is known as double-rounding on underflow."

The Pitfalls of Verifying Floating-point Computations, Monniaux, 2007
https://arxiv.org/abs/cs/0701192
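Monniaux's subnormal case can be reproduced with exact rational arithmetic: round once to 53 significant bits with the wide extended exponent range (the x87 "double-precision" mode keeps the value normalised in the register), then round again to the binary64 subnormal quantum 2^-1074 on the store, and compare with a single direct rounding. A minimal sketch in Python; the test value is hand-picked to sit just above a halfway point, and is not taken from the paper:

```python
from fractions import Fraction

def rne(x, ulp):
    """Round the exact rational x to the nearest multiple of ulp, ties to even."""
    q = x // ulp               # integer quotient (floor)
    r = x - q * ulp            # remainder, 0 <= r < ulp
    if r > ulp / 2 or (r == ulp / 2 and q % 2 == 1):
        q += 1
    return q * ulp

# Exact input: (2 + 1/2 + 2^-60) * 2^-1074, deep in the binary64 subnormal range.
v = (2 + Fraction(1, 2) + Fraction(1, 2**60)) * Fraction(1, 2**1074)

# One direct rounding to the subnormal quantum 2^-1074: 2.5 + 2^-60 rounds up to 3.
direct = rne(v, Fraction(1, 2**1074))

# x87 "double-precision" mode: the wide extended exponent keeps v normalised,
# so it is first rounded to 53 significant bits (ulp = 2^-1125 at this magnitude,
# which drops the tiny 2^-60 term), then rounded again to the subnormal quantum
# when written to memory -- and the tie now goes the other way.
step1 = rne(v, Fraction(1, 2**1125))
double_rounded = rne(step1, Fraction(1, 2**1074))

print(direct * 2**1074)          # 3 -> correctly rounded: 3 * 2^-1074
print(double_rounded * 2**1074)  # 2 -> double rounding on underflow: 2 * 2^-1074
```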

Re: 80x87 and IEEE 754

<9ac6859d-73fc-46e5-bde8-1baa4d2e1ca7n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33008&group=comp.arch#33008

Newsgroups: comp.arch
Subject: Re: 80x87 and IEEE 754
From: jsavard@ecn.ab.ca (Quadibloc)
 by: Quadibloc - Wed, 5 Jul 2023 05:43 UTC

On Sunday, July 2, 2023 at 3:59:04 PM UTC-6, EricP wrote:

> That was this starting at the bottom of page 13
>
> "Double rounding can also cause some subtle differences for very small
> numbers that are rounded into subnormal double-precision values if
> computed in IEEE-754 double precision: if one uses the “double-precision”
> mode of the x87 FPU, these numbers will be rounded into normalised values
> inside the FPU register, because of a wider range of negative exponents;
> then they will be rounded again into double-precision subnormals when
> written to memory. This is known as double-rounding on underflow."

And I would understand this as referring to rounding when doing
a computation with 80-bit Temporary Real values, followed by
rounding again when converting the result to a 64-bit double precision
value.

Since the guard and sticky bits of the temporary real value are
lost after rounding _to_ temporary real, there is a real possibility
that subsequent rounding to double precision may not produce
the ideal result that IEEE 754's perfect rounding would have
produced if the computation had been performed directly on
double precision values.
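The same exact-rational check works for the normal-range case described here: round once to a 64-bit significand (temporary real), then to 53 bits, versus a single rounding straight to 53 bits. A sketch in Python; the value near 1.0 is hand-picked for illustration, with a sticky bit (2^-78) that survives direct rounding but is lost after the first rounding to temporary real:

```python
from fractions import Fraction

def rne(x, ulp):
    """Round the exact rational x to the nearest multiple of ulp, ties to even."""
    q = x // ulp               # integer quotient (floor)
    r = x - q * ulp            # remainder, 0 <= r < ulp
    if r > ulp / 2 or (r == ulp / 2 and q % 2 == 1):
        q += 1
    return q * ulp

# Just above the halfway point between 1 and the next binary64 value 1 + 2^-52.
v = 1 + Fraction(1, 2**53) + Fraction(1, 2**78)

# Single rounding to 53 bits: the 2^-78 term breaks the tie, so it rounds up.
direct = rne(v, Fraction(1, 2**52))

# Temporary real first (64-bit significand, ulp = 2^-63 near 1.0): the 2^-78
# term is discarded, leaving an exact tie 1 + 2^-53, which then rounds to
# even (down to 1) in the second rounding.
via_ext = rne(rne(v, Fraction(1, 2**63)), Fraction(1, 2**52))

print(direct == 1 + Fraction(1, 2**52))   # True: correctly rounded up
print(via_ext == 1)                       # True: double rounding lost the sticky bit
```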

For *normal* floating-point computations, things like this are
basically a non-issue. Floating-point computations had always
been expected to have a noise floor - and with IEEE 754, they
*still* do, it's just a little bit lower. The theoretical perfection of
the results, except for the fact that they _minimize_ the noise as
far as possible, doesn't actually *buy* you anything that is
important.

John Savard

Re: 80x87 and IEEE 754

<02e0b698-3ccb-4739-b64c-20b2abff27bcn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33011&group=comp.arch#33011

Newsgroups: comp.arch
Subject: Re: 80x87 and IEEE 754
From: MitchAlsup@aol.com (MitchAlsup)
 by: MitchAlsup - Wed, 5 Jul 2023 15:40 UTC

On Wednesday, July 5, 2023 at 12:43:34 AM UTC-5, Quadibloc wrote:
> On Sunday, July 2, 2023 at 3:59:04 PM UTC-6, EricP wrote:
>
> > That was this starting at the bottom of page 13
> >
> > "Double rounding can also cause some subtle differences for very small
> > numbers that are rounded into subnormal double-precision values if
> > computed in IEEE-754 double precision: if one uses the “double-precision”
> > mode of the x87 FPU, these numbers will be rounded into normalised values
> > inside the FPU register, because of a wider range of negative exponents;
> > then they will be rounded again into double-precision subnormals when
> > written to memory. This is known as double-rounding on underflow."
> And I would understand this as referring to rounding when doing
> a computation with 80-bit Temporary Real values, followed by
> rounding again when converting the result to a 64-bit double precision
> value.
>
> Since the guard and sticky bits of the temporary real value are
> lost after rounding _to_ temporary real, there is a real possibility
> that subsequent rounding to double precision may not produce
> the ideal result that IEEE 754's perfect rounding would have
> produced if the computation had been performed directly on
> double precision values.
>
> For *normal* floating-point computations, things like this are
> basically a non-issue. Floating-point computations had always
> been expected to have a noise floor - and with IEEE 754, they
> *still* do, it's just a little bit lower. The theoretical perfection of
> the results, except for the fact that they _minimize_ the noise as
> far as possible, doesn't actually *buy* you anything that is
> important.
<
The only people who notice double roundings are people with
zero (0) noise in their operands--that is, calculations done to verify
correctness, not actual use.
>
> John Savard
