devel / comp.arch / Two New 128-bit Floating-Point Formats

Subject -- Author
* Two New 128-bit Floating-Point Formats -- Quadibloc
+* Re: Two New 128-bit Floating-Point Formats -- MitchAlsup
|`* Re: Two New 128-bit Floating-Point Formats -- Thomas Koenig
| +* Re: Two New 128-bit Floating-Point Formats -- Quadibloc
| |`* Re: Two New 128-bit Floating-Point Formats -- MitchAlsup
| | `* Re: Two New 128-bit Floating-Point Formats -- Stephen Fuld
| |  +- Re: Two New 128-bit Floating-Point Formats -- MitchAlsup
| |  `- Re: Two New 128-bit Floating-Point Formats -- Thomas Koenig
| `* Re: Two New 128-bit Floating-Point Formats -- BGB
|  +* Re: Two New 128-bit Floating-Point Formats -- Quadibloc
|  |`- Re: Two New 128-bit Floating-Point Formats -- BGB
|  `- Re: Two New 128-bit Floating-Point Formats -- Quadibloc
`* Re: Two New 128-bit Floating-Point Formats -- JimBrakefield
 +* Re: Two New 128-bit Floating-Point Formats -- MitchAlsup
 |+- Re: Two New 128-bit Floating-Point Formats -- JimBrakefield
 |`* Re: Two New 128-bit Floating-Point Formats -- Terje Mathisen
 | `* Re: Two New 128-bit Floating-Point Formats -- MitchAlsup
 |  `* Re: Two New 128-bit Floating-Point Formats -- Terje Mathisen
 |   `* Re: Two New 128-bit Floating-Point Formats -- MitchAlsup
 |    `* Re: Two New 128-bit Floating-Point Formats -- Quadibloc
 |     +- Re: Two New 128-bit Floating-Point Formats -- Terje Mathisen
 |     `- Re: Two New 128-bit Floating-Point Formats -- MitchAlsup
 `* Re: Two New 128-bit Floating-Point Formats -- JimBrakefield
  `* Re: Two New 128-bit Floating-Point Formats -- Quadibloc
   `* Re: Two New 128-bit Floating-Point Formats -- Scott Lurndal
    `- Re: Two New 128-bit Floating-Point Formats -- MitchAlsup

Two New 128-bit Floating-Point Formats

<439bb4ce-e70d-4e81-a6f3-2bb9e6e654b3n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33563&group=comp.arch#33563

 by: Quadibloc - Tue, 8 Aug 2023 20:28 UTC

Only the _first_ of which is _intentionally_ silly.

I have a section on my web site which discusses the history of the computer,
at
http://www.quadibloc.com/comp/histint.htm

On that page, one of the many computer systems I discuss is the
HP 9845, from 1978. This computer had amazing capabilities
for its day; some have termed it the "first workstation".
Unlike anything by Sun or Apollo, though, the processor for this
computer, designed by HP, had an architecture based on the
HP 211x minicomputer, but did calculations in decimal floating
point.
Hey, wait a moment. Isn't that a description of the processor chip
used in HP pocket calculators, and the earlier HP 9830? How on
Earth can something that does floating-point calculations at the
speed of a pocket calculator, even a good one, be called a
"workstation"?
Well, further study allowed me to resolve this doubt. The CPU
module of the 9845 included a chip called EMC, which did its
floating-point arithmetic. It did it within a 16-bit ALU, and the
floating-point format had a *binary* exponent with a range from
-511 to +511. It *did* do its arithmetic at speeds considerably
greater than those of pocket calculators.

Well, the HP 85 may have been the world's cutest computer, but
the HP 9845C seemed to me to have taken the crown for the most
quintessentially geeky computer ever to warm the heart of a
retrocomputing enthusiast.

Inspired by this computer, and by another favorite of mine, the
famous RECOMP II computer, the one that's capable of handling
numbers that can go 2 1/2 times around the world, I came up with
this floating-point format...

the intended goal of which is to be included, along with more
conventional floating-point formats, in the processor for a
computer that boots up as a calculator, but can then be
switched over to full computer operation when desired.

Here it is:

(76 bits) Mantissa: 19 BCD digits
(1 bit) Sign
(51 bits) Excess-1,125,899,906,842,624 decimal exponent

Initially, I had conceptualized the format as being closer to that
of the RECOMP II, with one word of mantissa, and the sign and
exponent in the second word.
But then I thought of making the first 64-bit word into one BCD
digit, and six groups of three digits encoded by Chen-Ho encoding.
That would allow nineteen-digit precision.
Then I decided that a 63-bit exponent was so large that it would
be preferable to sacrifice some exponent bits, and have the same
increase of precision without going to the extra gate delays
required for Chen-Ho encoding.
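To make the bit accounting explicit (my arithmetic, just spelling out
the two alternatives described above):

Chen-Ho variant: 4 bits (one plain BCD digit) + 6×10 bits (six digit
triples) = 64 bits for 19 digits, leaving a sign and a 63-bit exponent
in the second word. Each triple fits 3 digits in 10 bits because
10^3 = 1000 <= 2^10 = 1024.

Final format: 19×4 bits (plain BCD) + 1 bit (sign) + 51 bits
(exponent) = 76 + 1 + 51 = 128 bits.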

The ideas I played with in that chain of thought then turned my
attention to how they might be used for a more serious
purpose.
Remember John Gustafson, and his quest, first with Unums, and
then with Posits, to devise a better floating-point format that
would help combat the dangerous numerical errors that abound
in conventional floating-point arithmetic?
Perhaps I could come up with something more conventional
that would go partway, at least, towards providing the facilities
that his inventions provide.

And here is where that chain of thought went:

(1 bit) Sign
(31 bits) Excess-1,073,741,824 binary exponent
(96 bits) Significand
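As a concrete sketch of that layout in C (the field placement below is
my assumption; only the 1/31/96 split and the excess-2^30 bias come
from the description):

#include <stdint.h>

/* Hypothetical packing: sign and exponent at the top of the first
   64-bit word, the 96-bit significand split across both words. */
typedef struct {
    uint64_t hi; /* [63] sign, [62:32] exponent, [31:0] significand 95..64 */
    uint64_t lo; /* significand bits 63..0 */
} fp128b;

static int fp128b_sign(fp128b x)
{
    return (int)(x.hi >> 63);
}

/* Unbiased exponent: the stored 31-bit field minus the excess of
   2^30 = 1,073,741,824. */
static int64_t fp128b_exponent(fp128b x)
{
    return (int64_t)((x.hi >> 32) & 0x7FFFFFFF) - 1073741824;
}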

Providing a wide exponent range (like Posits and Unums) and a high
precision (like Unums) but both within the bounds of reason, and
without any unconventional steps, like decreasing precision for
large exponents, or having the length of the number variable.
But there's something *else* that I also came up with to do when
implementing this floating-point format in order to help it achieve
its ends.

Seymour Cray was the designer of the Control Data 6600 computer.
It had a 60-bit word. When he designed the Cray I computer, although
he surrendered to the 8-bit byte, and gave it a 64-bit word, apparently
he still felt that the 60-bit floats of the 6600 provided all the precision
that anyone needed.
So the floating-point format of the Cray I had an exponent field that
was 15 bits long. But the defined range of possible exponents in that
format would fit in a *14-bit* exponent field.
I guess this would make it easier to detect and, even more importantly,
to recover from floating-point overflows and underflows.

At first, I thought that simply copying this idea would be useful.
Then, inspired by the inexact bit of the IEEE-754 standard, I
decided on an even better way to softly warn the user, while allowing
the computation to proceed to completion without being halted by
an error, that it had used more of the available exponent range than
would be reasonable for a program which was correctly written
with consciousness of the requirements of sound numerical
analysis.

Even though the exponent, being an excess-1,073,741,824 binary
exponent, has a range from -1,073,741,824 to +1,073,741,823,
just like a two's complement number of the same length, there
would also be a latching Range status bit associated with the
use of this floating-point format that would be set if the exponent
during a computation ever strays out of the range -65,536 to
+65,535, which ought to be enough for anyone!
So a calculation that is blowing up somewhere into excessively
high exponents can be detected without the overhead of adding
a lot of debugging code testing for out-of-range values.
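In software, the latching behavior could look like this sketch (my
illustration: double stands in for the 128-bit type here, so the
-65,536..+65,535 window is scaled down to -512..+512 purely so the
demo can trigger):

#include <math.h>

static int range_latch = 0; /* the sticky Range status bit */

static double range_checked(double r) /* wrap each FP result */
{
    int e;
    frexp(r, &e); /* extract the binary exponent of r */
    if (r != 0.0 && (e < -512 || e > 512))
        range_latch = 1; /* latch; do not trap */
    return r; /* the computation proceeds */
}

A run then inspects range_latch once at the end, instead of
sprinkling range tests through the code.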

John Savard

Re: Two New 128-bit Floating-Point Formats

<8f04a04d-7037-46b8-9859-ab0f8571df15n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33564&group=comp.arch#33564

 by: MitchAlsup - Tue, 8 Aug 2023 21:12 UTC

On Tuesday, August 8, 2023 at 3:28:23 PM UTC-5, Quadibloc wrote:
>
> Seymour Cray was the designer of the Control Data 6600 computer.
> It had a 60-bit word. When he designed the Cray I computer, although
> he surrendered to the 8-bit byte, and gave it a 64-bit word, apparently
> he still felt that the 60-bit floats of the 6600 provided all the precision
> that anyone needed.
<
The CDC 8600 (only 4 built, none sold) had also changed from 60 bits
to 64 bits. Also, the CDC STAR-100 was a 64-bit design.
<
> So the floating-point format of the Cray I had an exponent field that
> was 15 bits long. But the defined range of possible exponents in that
> format would fit in a *14-bit* exponent field.
> I guess this would make it easier to detect and, even more importantly,
> to recover from floating-point overflows and underflows.
>
> At first, I thought that simply copying this idea would be useful.
> Then, inspired by the inexact bit of the IEEE-754 standard, I
> decided on an even better way to softly warn the user, while allowing
> the computation to proceed to completion without being halted by
> an error, that it had used more of the available exponent range than
> would be reasonable for a program which was correctly written
> with consciousness of the requirements of sound numerical
> analysis.
<
The Berkeley BOOM RISC-V processor uses a 65-bit FP format in registers
to ease support of denormals.
>
> Even though the exponent, being an excess-1,073,741,824 binary
> exponent, has a range from -1,073,741,824 to +1,073,741,823,
> just like a two's complement number of the same length, there
> would also be a latching Range status bit associated with the
> use of this floating-point format that would be set if the exponent
> during a computation ever strays out of the range -65,536 to
> +65,535, which ought to be enough for anyone!
> So a calculation that is blowing up somewhere into excessively
> high exponents can be detected without the overhead of adding
> a lot of debugging code testing for out-of-range values.
<
Still, overall, posit gives you more of what you want--precision
when you don't need exponent bits, and lack of precision loss
when you do.
>
> John Savard

Re: Two New 128-bit Floating-Point Formats

<uauc26$215i9$1@newsreader4.netcologne.de>

https://news.novabbs.org/devel/article-flat.php?id=33565&group=comp.arch#33565

 by: Thomas Koenig - Tue, 8 Aug 2023 21:28 UTC

MitchAlsup <MitchAlsup@aol.com> wrote:

> Still, overall, posit gives you more of what you want--precision
> when you don't need exponent bits, and lack of precision loss
> when you do.

I'm thoroughly unconvinced on posits.

Losing precision when the scale is off seems like a bad idea to me,
especially if there is no indication of lost intermediate precision.

And if people argue that the extra range is not needed - well, if
one wants to deviate from IEEE 754 (few people do) it is possible
to design a floating point format with fewer exponent and more
mantissa bits.

Besides, posits do not conform to Fortran's model numbers :-)

Re: Two New 128-bit Floating-Point Formats

<76debbf3-c94c-4085-a034-4f6e2271daabn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33566&group=comp.arch#33566

 by: Quadibloc - Tue, 8 Aug 2023 21:53 UTC

On Tuesday, August 8, 2023 at 3:28:42 PM UTC-6, Thomas Koenig wrote:
> MitchAlsup <Mitch...@aol.com> wrote:

> > Still, overall, posit gives you more of what you want--precision
> > when you don't need exponent bits, and lack of precision loss
> > when you do.

> I'm thoroughly unconvinced on posits.
>
> Losing precision when the scale is off seems like a bad idea to me,
> especially if there is no indication of lost intermediate precision.

That's a valid point. However, if you enlarge the exponent field by
one bit, in a floating-point number that must fit in a 64-bit word,
then you have to shrink the significand by one bit; in a word
with only so many bits, there are only so many bits.

So posits, at the overhead cost of one bit, let you choose the
floating-point format you want to use, in effect. But, yes, since
there's no indication of what got chosen... you don't know what
your precision really is, which is dangerous.
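The accounting behind that trade-off can be made concrete. A rough
sketch (mine, assuming es = 2 as in the 2022 posit standard):

/* Fraction bits left in an n-bit posit at binary scale 'scale':
   scale = r*2^es + e; the regime r is encoded in unary, eating
   r+2 bits when r >= 0 and 1-r bits when r < 0. */
static int posit_frac_bits(int n, int es, int scale)
{
    int r = scale >> es; /* floor division via arithmetic shift */
    int run = (r >= 0) ? r + 2 : 1 - r;
    int bits = n - 1 - run - es; /* minus sign, regime, exponent */
    return bits > 0 ? bits : 0;
}

/* posit_frac_bits(64, 2, 0)   == 59: more than double's 52 near 1.0
   posit_frac_bits(64, 2, 240) == 0:  the precision is silently gone */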

John Savard

Re: Two New 128-bit Floating-Point Formats

<6ca54868-70be-4f01-b382-57dcb430da23n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33567&group=comp.arch#33567

 by: MitchAlsup - Tue, 8 Aug 2023 22:08 UTC

On Tuesday, August 8, 2023 at 4:53:39 PM UTC-5, Quadibloc wrote:
> On Tuesday, August 8, 2023 at 3:28:42 PM UTC-6, Thomas Koenig wrote:
> > MitchAlsup <Mitch...@aol.com> wrote:
>
> > > Still, overall, posit gives you more of what you want--precision
> > > when you don't need exponent bits, and lack of precision loss
> > > when you do.
>
> > I'm thoroughly unconvinced on posits.
> >
> > Losing precision when the scale is off seems like a bad idea to me,
> > especially if there is no indication of lost intermediate precision.
<
There are (now) many numerical algorithms which have been tested in
posit form and found to have significantly better accuracy with posits
than with IEEE 754--so much so that some of them can decrease the
size of their average data (64->32 or 32->16) and still give meaningful
results--mostly better than IEEE at 2× the size.
<
However, I too remain unconvinced--because the failed results are not
equally marketed.
<
> That's a valid point. However, if you enlarge the exponent field by
> one bit, in a floating-point number that must fit in a 64-bit word,
> then you have to shrink the significand bit by one bit; in a word
> with only so many bits, there are only so many bits.
<
then you have lost all sense of IEEE 754.......
>
> So posits, at the overhead cost of one bit, let you choose the
> floating-point format you want to use, in effect. But, yes, since
> there's no indication of what got chosen... you don't know what
> your precision really is, which is dangerous.
<
In the range where most FP numbers are, posits give you several
more bits of accuracy, plus you can square-add-sqrt without suffering
overflow--and total loss of precision.
<
Still not advocating, .....
>
> John Savard
<
----------------------------------------------------------------------------------------------------
<
But let me return to the intro::
<
> > Losing precision when the scale is off seems like a bad idea to me,
> > especially if there is no indication of lost intermediate precision.
<
a) You gain precision when your numbers are in the normal range
b.1) you gain range where IEEE 754 would overflow
b.2) you gain memory footprint by avoiding 2× data-width
b.3) you gain processor cost by avoiding 2× data-width calculations
b.4) you gain cache performance from lower memory footprint
c) how many FP numbers near IEEE overflow (or **2 overflow) have
"that many digits of precision" in real world applications ????
d) you don't have to deal with overflows
e) you don't have to deal with NaNs
......
<
still not advocating
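The square-add-sqrt point is easy to demonstrate in IEEE double (my
example, not MitchAlsup's code):

#include <math.h>
#include <stdio.h>

int main(void)
{
    double x = 1e200, y = 1e200;
    double naive = sqrt(x * x + y * y); /* x*x overflows to +inf */
    double safe = hypot(x, y); /* the library scales internally */
    printf("naive = %g, hypot = %g\n", naive, safe);
    /* prints: naive = inf, hypot = 1.41421e+200 */
    return 0;
}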

Re: Two New 128-bit Floating-Point Formats

<af45f172-2766-4711-a1de-d0650a0f011dn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33568&group=comp.arch#33568

 by: JimBrakefield - Tue, 8 Aug 2023 22:41 UTC

On Tuesday, August 8, 2023 at 3:28:23 PM UTC-5, Quadibloc wrote:
> [...]
> At first, I thought that simply copying this idea would be useful.
> Then, inspired by the inexact bit of the IEEE-754 standard, I

I would think that in most calculations, most calculated values are
inexact. Previously I considered taking one mantissa bit to indicate
inexact, which is a painful loss of accuracy.

So how many values are exact? Can those values be encoded into
the NaN bits? If so, why not let inexact be the default, thereby allowing
one to use round-to-odd, eliminating double rounding issues?
(One would still follow the rule of rounding to the nearest representable value.)
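Round-to-odd itself is cheap on a raw significand; a sketch (mine, with
an illustrative `drop` count of discarded bits):

#include <stdint.h>

static uint64_t round_to_odd(uint64_t sig, int drop)
{
    uint64_t kept = sig >> drop;
    uint64_t discard = sig & ((1ULL << drop) - 1);
    if (discard != 0)
        kept |= 1; /* inexact: jam the low bit to 1, never to even */
    return kept;
}

Because an inexact intermediate never lands on an even boundary
pattern, a later round-to-nearest narrowing cannot resolve the same
value a second way; given a couple of extra intermediate bits, that is
what eliminates double rounding.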

> [...]

Re: Two New 128-bit Floating-Point Formats

<f0673c9a-a6a4-46b7-8ec8-ec37d8bc769dn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33569&group=comp.arch#33569

 by: MitchAlsup - Tue, 8 Aug 2023 23:46 UTC

On Tuesday, August 8, 2023 at 5:41:47 PM UTC-5, JimBrakefield wrote:
> On Tuesday, August 8, 2023 at 3:28:23 PM UTC-5, Quadibloc wrote:
> > [...]
<
> I would think that in most calculations, most calculated values are
> inexact. Previously I considered taking one mantissa bit to indicate
> inexact, which is a painful loss of accuracy.
<
The most precise thing we can routinely measure is 22-bits.
The most precise thing we can measure in 1ns is 8-bits.
The most precise thing we have ever measured is 44-bits.
{and this took 25+ years to decrease the noise to this}
<
However: there are lots of calculations that expand (ln, sqrt) and
compress (^2, exp, erf) the number of bits needed to retain precision
"down the line". Just computing pow() to IEEE accuracy requires
something like 72 bits of fraction in the intermediate x*log2(y)
product to achieve IEEE 754 accuracy in the final result of y^x over
the entire range of x and y.
<
This is the problem, not the number of bits in the fraction at one
instant. It is a problem well understood by numerical analysts
...........And something casual programmers remain unaware of
for decades of experience.........
>
> So how many values are exact? Can those values be encoded into
> the NaN bits? If so, why not let inexact be the default, thereby allowing
> one to use round-to-odd, eliminating double rounding issues?
> (One would still follow the rule of rounding to the nearest representable value.)
<
Nobody doing real FP math gives a crap about exactness--the 99%.
Only people testing FP arithmetic units do--the way-less-than 0.1%.
<
Consider: COS( 6381956970095103×2^797) = -4.68716592425462761112E-19
<
Conceptually, this requires calculating over 800 bits of the
intermediate INT(2/pi×x) !!! to get the proper reduced argument,
which will result in the above properly rounded result.
<
To get those 800 bits one uses Payne-Hanek argument reduction, which
takes somewhat longer than 100 cycles--compared to computing the
COS(reduced) polynomial, which takes slightly less than 100 cycles.
<
I have a patented method that can perform reduction in 5 cycles, and a
designed function unit that can perform the above COS(actual)
in 19 cycles.
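The flavor of the problem is easy to show from the outside (my
demonstration, not the patented method; it assumes a libm whose cos()
performs full-width reduction, as glibc's does):

#include <math.h>
#include <stdio.h>

int main(void)
{
    double x = 6381956970095103.0 * 0x1p797; /* the argument above */
    double naive = cos(fmod(x, 2.0 * M_PI)); /* 53-bit reduction */
    double good = cos(x); /* library does full-width reduction */
    printf("naive = %.17g\ngood  = %.17g\n", naive, good);
    return 0;
}

The 53-bit double 2*M_PI differs from the true 2×pi in roughly the
last bit, so after reducing an argument of size 2^849 the quadrant
information in the naive version is long gone.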

Re: Two New 128-bit Floating-Point Formats

<ce3a0704-5a59-45f6-b8ea-31c640269a13n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33570&group=comp.arch#33570

 by: JimBrakefield - Wed, 9 Aug 2023 00:47 UTC

On Tuesday, August 8, 2023 at 6:46:58 PM UTC-5, MitchAlsup wrote:
> On Tuesday, August 8, 2023 at 5:41:47 PM UTC-5, JimBrakefield wrote:
> > On Tuesday, August 8, 2023 at 3:28:23 PM UTC-5, Quadibloc wrote:
> > > [...]
> <
> > I would think that in most calculations, most calculated values are
> > inexact. Previously I considered taking one mantissa bit to indicate
> > inexact, which is a painful loss of accuracy.
> <
> The most precise thing we can routinely measure is 22-bits.
> The most precise thing we can measure in 1ns is 8-bits.
> The most precise thing we have ever measured is 44-bits.
> {and this took 25+ years to decrease the noise to this}
> <
> However: there are lots of calculations that expand (ln, sqrt) and
> compress (^2, exp, erf) the number of bits needed to retain precision
> "down the line".

Then there is the choice of the wrong approach: computing the area of
a triangle when the sum of two side lengths is only slightly greater
than the third. No surveyor would establish a triangle based on side
lengths alone; instead, coordinates of the corners serve for all
practical purposes.

> [...]


Re: Two New 128-bit Floating-Point Formats

<uausnb$3o9dh$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=33571&group=comp.arch#33571

 by: BGB - Wed, 9 Aug 2023 02:11 UTC

On 8/8/2023 4:28 PM, Thomas Koenig wrote:
> MitchAlsup <MitchAlsup@aol.com> wrote:
>
>> Still, overall, posit gives you more of what you want--precision
>> when you don't need exponent bits, and lack of precision loss
>> when you do.
>
> I'm thoroughly unconvinced on posits.
>
> Losing precision when the scale is off seems like a bad idea to me,
> especially if there is no indication of lost intermediate precision.
>
> And if people argue that the extra range is not needed - well, if
> one wants to deviate from IEEE 754 (few people do) it is possible
> to design a floating point format with fewer exponent and more
> mantissa bits.
>
> Besides, posits do not conform to Fortran's model numbers :-)

From what little I have looked into posits, they seemed needlessly
over-engineered compared with IEEE floats.

Then again, I am in the camp of suspecting that even the normal
floating-point rules are overkill for many use-cases (and that more
corner-cutting could be justified).

Say, if one could have a world of hard-wired "truncate towards zero"
rounding and where denormals did not exist, ...

My only "special" requirements here being:
Doing operations on integers expressed via floating point should give an
integer result (for ADD/SUB/MUL);
Converting from a narrower format to a wider format and then back to the
narrower format should yield an exact match.
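That round-trip property already holds between IEEE float and double,
since the wider format is a strict superset; the point is to keep it
while corner-cutting. A quick check (my example):

#include <assert.h>

int main(void)
{
    float f = 0.1f; /* not exactly representable, and that's fine */
    double d = (double)f; /* widening preserves the value exactly */
    assert((float)d == f); /* narrowing back recovers it bit-for-bit */
    return 0;
}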

Though, IRL, granted, from a "numerical goodness" perspective, no reason
to choose this over the existing IEEE rules, so more something for cases
where cost is a major issue (but, still "better than nothing").

Well, also say one could define a "faster but less accurate" version of
sin/cos, say:
Scale value by 0.5/M_PI;
Feed high bits of mantissa through a lookup table (to extract a group of
2 or 4 values, inverting the bits if input is negative);
Use a linear or cubic interpolation based on the lower-order bits.

The interpolation stage could be done using either floating-point or
fixed-point math.

While "less accurate", it can be significantly faster than other
options, and still accurate enough for many uses (even with a fairly
modest lookup table if using cubic interpolation).
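A minimal version of that scheme (my sketch, not BGB's actual code):
a 256-entry full-turn table with linear interpolation.

#include <math.h>

#define TBITS 8
static float sin_tab[(1 << TBITS) + 1]; /* one turn, plus a wrap entry */

void sin_fast_init(void) /* call once before use */
{
    for (int i = 0; i <= (1 << TBITS); i++)
        sin_tab[i] = (float)sin(2.0 * M_PI * i / (1 << TBITS));
}

float sin_fast(float x)
{
    float t = x * (float)(0.5 / M_PI); /* scale to turns, as above */
    t -= floorf(t); /* wrap to [0,1), handles negative inputs */
    float s = t * (1 << TBITS);
    int i = (int)s; /* high bits of the fraction: table index */
    float f = s - (float)i; /* low-order bits: blend factor */
    return sin_tab[i] + f * (sin_tab[i + 1] - sin_tab[i]);
}

With 256 entries the worst-case error of the linear blend is about
(2*pi/256)^2 / 8, roughly 7.5e-5; cubic interpolation or a larger
table tightens that considerably.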

Granted, as-is this can be done as a C library extension, say,
"_cos_fast()" or similar. Though, in practice, most programs which need
fast sin/cos already do it themselves using lookup tables anyways
(typically without interpolation, but this is "going a little too far"
for general-case usage).

Though, thinking about it, in some contexts it could make sense,
rather than using multiple table lookups and a cubic interpolation,
to instead store a reference point, a slope (delta to the next point),
and a pair of derivatives (then one can calculate along the slope and
derivatives and use this to calculate a target value relative to
the reference value). This could possibly use fewer arithmetic
operations than a cubic interpolation (by "precooking" a few of
the numbers).

Then again, one could instead argue for going further, and turning it
into polynomial form, eg:
Y=B*(X^2)+C*X+D
Or:
Y=A*(X^3)+B*(X^2)+C*X+D
Turning the sin/cos functions into a big piecewise function (with the
coefficients pulled from the lookup table). Though, this would imply
doing all the intermediate math as floating point (this seems less
liable to work effectively with fixed point).
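For reference, the cubic case costs three multiply-adds per sample
once the coefficients are fetched, evaluated in Horner form:
Y = ((A*X + B)*X + C)*X + D.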

....

Re: Two New 128-bit Floating-Point Formats

<7f7e5960-f83a-4968-89ba-b1e05170a91bn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33572&group=comp.arch#33572

 by: Quadibloc - Wed, 9 Aug 2023 08:51 UTC

On Tuesday, August 8, 2023 at 8:13:04 PM UTC-6, BGB wrote:

> Say, if one could have a world of hard-wired "truncate towards zero"
> rounding and where denormals did not exist, ...

Why would truncate towards zero be superior to rounding, even
if the rounding were not as complicated as the exact rounding
of IEEE 754?

But it certainly is true that old-fashioned floating-point was
entirely adequate for the normal use case of floating point,
physical calculations in the sciences. However, except for
division (at least by the fastest algorithms) the cost of
guard-round-sticky exact rounding is negligible; I don't
think we should roll back the quality of floating-point arithmetic
any further than we have to.

John Savard

Re: Two New 128-bit Floating-Point Formats

<12378101-e165-4789-9a32-9328cea7a90dn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33573&group=comp.arch#33573

 by: Quadibloc - Wed, 9 Aug 2023 09:18 UTC

On Tuesday, August 8, 2023 at 8:13:04 PM UTC-6, BGB wrote:

> What little I had looked into posits, they seemed needlessly
> over-engineered if compared with IEEE floats.

For general use in numerical calculation, sure. Although they're
not actually all _that_ bad, and are reasonably practical to
implement.

The problem is that a lot of people write code that calculates
using floating-point numbers as if they're *real numbers*.

It's impossible, on finite computers, to solve this problem just
by making a new floating-point format, since it isn't possible to
make a type that behaves like a real number which fits in a
finite space.
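
For instance, two classic surprises, as a minimal C illustration:

#include <stdio.h>

int main(void)
{
    /* None of 0.1, 0.2, 0.3 is exactly representable in binary. */
    printf("%d\n", 0.1 + 0.2 == 0.3);            /* prints 0 */

    /* Addition is not associative: 1.0 vanishes next to 1e16. */
    double big = 1e16;
    printf("%d\n", (big + 1.0) - big == 1.0);    /* prints 0 */
    return 0;
}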

That being said, though, some kind of "training wheels"
floating-point format, one that makes it easier to detect
problem behavior in programs so that it can be corrected
before erroneous results are accepted, is, I think, a
good idea.

So the way I look at it, we need to make sure that we're trying to
solve the right problem. Some of the right problems to address
are:

- Correct any characteristics of floating-point representations that
make it unnecessarily difficult to write numerically-sound code.

A lot of IEEE 754 was in this category, which is legitimate, even if
later versions of the standard are getting too ambitious.

- Provide facilities that allow code to be checked for numerical
issues in an efficient fashion.

Bad coding, not the inevitable fact that floats aren't reals, is the
real problem we can address, and this is one way to help
address it.

- Provide a more forgiving numerical environment for beginning
programmers, at a modest overhead cost.

While this sounds like a way to help programmers learn bad
habits, I think it's a legitimate response to allow work to be
done in the "real world", where not everyone doing programming
is likely to receive instruction in numerical analysis.

Here, the problem is to *realize what you're doing*, and not
view this environment as one that leads to programs that can
actually be _trusted_.

But it can be a tool that addresses the _preceding_ issue, as
code that isn't correctly written might sometimes work in
this environment while failing in a normal one, thus allowing
problems in algorithms to be detected.

John Savard

Re: Two New 128-bit Floating-Point Formats

<ub05dl$3v9rs$1@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=33574&group=comp.arch#33574

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED.3.80-202-33.nextgentel.com!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Two New 128-bit Floating-Point Formats
Date: Wed, 9 Aug 2023 15:47:32 +0200
Organization: A noiseless patient Spider
Message-ID: <ub05dl$3v9rs$1@dont-email.me>
References: <439bb4ce-e70d-4e81-a6f3-2bb9e6e654b3n@googlegroups.com>
<af45f172-2766-4711-a1de-d0650a0f011dn@googlegroups.com>
<f0673c9a-a6a4-46b7-8ec8-ec37d8bc769dn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 9 Aug 2023 13:47:33 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="3.80-202-33.nextgentel.com:80.202.33.3";
logging-data="4171644"; mail-complaints-to="abuse@eternal-september.org"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.17
In-Reply-To: <f0673c9a-a6a4-46b7-8ec8-ec37d8bc769dn@googlegroups.com>
 by: Terje Mathisen - Wed, 9 Aug 2023 13:47 UTC

MitchAlsup wrote:
> On Tuesday, August 8, 2023 at 5:41:47 PM UTC-5, JimBrakefield wrote:
>> On Tuesday, August 8, 2023 at 3:28:23 PM UTC-5, Quadibloc wrote:
>>> [snipped: full text of the original post, quoted upthread]
> <
>> Would think that in most calculations, most calculated values are inexact.
>> Previously considered taking one mantissa bit to indicate inexact.
>> Which is a painful loss of accuracy.
> <
> The most precise thing we can routinely measure is 22-bits.
> The most precise thing we can measure in 1ns is 8-bits.
> The most precise thing we have ever measured is 44-bits.
> {and this took 25+ years to decrease the noise to this}
> <
> However: there are lots of calculations that expand (ln, sqrt) and
> compress (^2, exp, erf) the number of bits needed to retain precision
> "down the line". Just computing ln2() to IEEE accuracy requires
> something-like 72-bit of fraction in the intermediate x*ln2(y) part
> to achieve IEEE 754 accuracy in the final result of Ln2() over the
> entire range of x and y.
> <
> This is the problem, not the number of bits in the fraction at one
> instant. It is a problem well understood by numerical analysts
> ..........And something casual programmers remain unaware of
> for decades of experience.........
>>
>> So how many values are exact? Can those values be encoded into
>> the NAN bits? If so, why not let inexact be the default, thereby allowing
>> one to use round-to-odd thereby eliminating double rounding issues?
>> (one would still follow the rule of rounding the nearest representable value)
> <
> Nobody doing real FP math gives a crap about exactness. The 99%
> Only people testing FP arithmetic units do. The way-less-than 0.1%
> <
> Consider: COS( 6381956970095103×2^797) = -4.68716592425462761112E-19
> <
> Conceptually, this requires calculating over 800-bits of intermediate
> INT(2/pi×x) !!! to get the proper reduced argument which will result in
> the above properly rounded result.
> <
> To get that 800-bits one uses Payne-Hanek argument reduction which
> takes somewhat longer than 100 cycles--compared to computing the

100 cycles seems to be way too large, but maybe this is what current
implementations need?

> COS(reduced) polynomial taking slightly less than 100 cycles.
> <
> I have a patented method that can perform reduction in 5 cycles: and a
> designed function unit that can perform the above COS(actual)
> in 19 cycles.

This is of course really wonderful, to the point where I want _all_ FPUs
to license and/or emulate your algorithms.

However, the actual range reduction should be significantly faster than
100 cycles, even in pure SW:

Load and inspect the exponent, use it to determine a byte-granularity
starting point in the reciprocal pi expansion, then multiply the
mantissa by a 96-128 bit slice (two 64x64->128 MUL operations).

We discard the whole circles and normalize, then we use the top N (3-5)
bits to select the reduced range poly to use.

All this should be eminently doable in less than 20 cycles, right?
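
A minimal C sketch of that byte-offset slicing, for illustration only
(the table is fdlibm's published 2/pi expansion, truncated here, so
this version only handles exponents up to a few hundred; it keeps
roughly 60 fraction bits, and it omits the renormalization of leading
zeros that a production version needs; unsigned __int128 stands in
for the two 64x64->128 MULs):

#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <math.h>

/* Leading bytes of 2/pi (fdlibm's two_over_pi table, repacked). */
static const uint8_t two_over_pi[48] = {
    0xA2,0xF9,0x83,0x6E,0x4E,0x44,0x15,0x29,0xFC,0x27,0x57,0xD1,
    0xF5,0x34,0xDD,0xC0,0xDB,0x62,0x95,0x99,0x3C,0x43,0x90,0x41,
    0xFE,0x51,0x63,0xAB,0xDE,0xBB,0xC5,0x61,0xB7,0x24,0x6E,0x3A,
    0x42,0x4D,0xD2,0xE0,0x06,0x49,0x2E,0xEA,0x09,0xD1,0x92,0x1C,
};

/* Fetch 64 bits of the 192-bit product p2:p1:p0 starting at bit pos. */
static uint64_t bits_at(uint64_t p2, uint64_t p1, uint64_t p0, int pos)
{
    uint64_t limb[4] = { p0, p1, p2, 0 };
    int i = pos >> 6, s = pos & 63;
    uint64_t r = limb[i] >> s;
    if (s) r |= limb[i + 1] << (64 - s);
    return r;
}

/* Reduce positive finite x mod pi/2: returns r in about [-pi/4, pi/4]
   and the quadrant in *quad. The short table limits x to below 2^318. */
static double reduce_pio2(double x, int *quad)
{
    if (x < 0.7853981633974483) { *quad = 0; return x; }

    uint64_t bits; memcpy(&bits, &x, 8);
    int e = (int)((bits >> 52) & 0x7FF) - 1023;
    uint64_t m = (bits & 0xFFFFFFFFFFFFFull) | (1ull << 52);

    /* 2/pi bits that can only contribute whole multiples of 4
       quadrants are skipped: byte-granularity start from the exponent. */
    int skip = e > 62 ? (e - 62) >> 3 : 0;

    /* Gather a 128-bit slice of 2/pi from that starting byte. */
    uint64_t s_hi = 0, s_lo = 0;
    for (int i = 0; i < 8; i++) {
        s_hi = (s_hi << 8) | two_over_pi[skip + i];
        s_lo = (s_lo << 8) | two_over_pi[skip + 8 + i];
    }

    /* The two 64x64->128 multiplies: mantissa times slice. */
    unsigned __int128 lo = (unsigned __int128)m * s_lo;
    unsigned __int128 hi = (unsigned __int128)m * s_hi + (uint64_t)(lo >> 64);
    uint64_t p0 = (uint64_t)lo, p1 = (uint64_t)hi, p2 = (uint64_t)(hi >> 64);

    /* Weight 2^0 of x*(2/pi) lands at product bit F: the quadrant sits
       just above it, the fraction of a quadrant just below it. */
    int F = 8 * skip + 180 - e;
    int q = (int)(bits_at(p2, p1, p0, F) & 3);
    double f = (double)bits_at(p2, p1, p0, F - 64) * 0x1p-64;
    if (f >= 0.5) { f -= 1.0; q = (q + 1) & 3; }  /* nearest quadrant */
    *quad = q;
    return f * 1.5707963267948966;  /* scale back by pi/2 */
}

int main(void)
{
    int q;
    double r = reduce_pio2(1e30, &q);
    double c[4] = { cos(r), -sin(r), -cos(r), sin(r) };
    printf("cos(1e30) = %.17g vs %.17g (q=%d)\n", cos(1e30), c[q], q);
    return 0;
}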

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Two New 128-bit Floating-Point Formats

<14c4d8ca-9ded-4562-a6be-125b8435c643n@googlegroups.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=33576&group=comp.arch#33576

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:6214:a67:b0:63c:f25e:fac1 with SMTP id ef7-20020a0562140a6700b0063cf25efac1mr56336qvb.3.1691596713811;
Wed, 09 Aug 2023 08:58:33 -0700 (PDT)
X-Received: by 2002:a17:903:2344:b0:1bc:6a89:86bd with SMTP id
c4-20020a170903234400b001bc6a8986bdmr369297plh.10.1691596713437; Wed, 09 Aug
2023 08:58:33 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!newsfeed.hasname.com!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 9 Aug 2023 08:58:32 -0700 (PDT)
In-Reply-To: <ub05dl$3v9rs$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:c5e:ef41:fe8e:dda5;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:c5e:ef41:fe8e:dda5
References: <439bb4ce-e70d-4e81-a6f3-2bb9e6e654b3n@googlegroups.com>
<af45f172-2766-4711-a1de-d0650a0f011dn@googlegroups.com> <f0673c9a-a6a4-46b7-8ec8-ec37d8bc769dn@googlegroups.com>
<ub05dl$3v9rs$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <14c4d8ca-9ded-4562-a6be-125b8435c643n@googlegroups.com>
Subject: Re: Two New 128-bit Floating-Point Formats
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Wed, 09 Aug 2023 15:58:33 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 11818
 by: MitchAlsup - Wed, 9 Aug 2023 15:58 UTC

On Wednesday, August 9, 2023 at 8:47:36 AM UTC-5, Terje Mathisen wrote:
> MitchAlsup wrote:
> > On Tuesday, August 8, 2023 at 5:41:47 PM UTC-5, JimBrakefield wrote:
> >> On Tuesday, August 8, 2023 at 3:28:23 PM UTC-5, Quadibloc wrote:
> >>> [snipped: full text of the original post, quoted upthread]
> > <
> >> Would think that in most calculations, most calculated values are inexact.
> >> Previously considered taking one mantissa bit to indicate inexact.
> >> Which is a painful loss of accuracy.
> > <
> > The most precise thing we can routinely measure is 22-bits.
> > The most precise thing we can measure in 1ns is 8-bits.
> > The most precise thing we have ever measured is 44-bits.
> > {and this took 25+ years to decrease the noise to this}
> > <
> > However: there are lots of calculations that expand (ln, sqrt) and
> > compress (^2, exp, erf) the number of bits needed to retain precision
> > "down the line". Just computing ln2() to IEEE accuracy requires
> > something-like 72-bit of fraction in the intermediate x*ln2(y) part
> > to achieve IEEE 754 accuracy in the final result of Ln2() over the
> > entire range of x and y.
> > <
> > This is the problem, not the number of bits in the fraction at one
> > instant. It is a problem well understood by numerical analysts
> > ..........And something casual programmers remain unaware of
> > for decades of experience.........
> >>
> >> So how many values are exact? Can those values be encoded into
> >> the NAN bits? If so, why not let inexact be the default, thereby allowing
> >> one to use round-to-odd thereby eliminating double rounding issues?
> >> (one would still follow the rule of rounding the nearest representable value)
> > <
> > Nobody doing real FP math gives a crap about exactness. The 99%
> > Only people testing FP arithmetic units do. The way-less-than 0.1%
> > <
> > Consider: COS( 6381956970095103×2^797) = -4.68716592425462761112E-19
> > <
> > Conceptually, this requires calculating over 800-bits of intermediate
> > INT(2/pi×x) !!! to get the proper reduced argument which will result in
> > the above properly rounded result.
> > <
> > To get that 800-bits one uses Payne-Hanek argument reduction which
> > takes somewhat longer than 100 cycles--compared to computing the
> 100 cycles seems to be way too large, but maybe this is what current
> implementations need?
> > COS(reduced) polynomial taking slightly less than 100 cycles.
> > <
> > I have a patented method that can perform reduction in 5 cycles: and a
> > designed function unit that can perform the above COS(actual)
> > in 19 cycles.
> This is of course really wonderful, to the point where I want _all_ FPUs
> to license and/or emulate your algorithms.
>
> However, the actual range reduction should be significantly faster than
> 100 cycles, even in pure SW:
>
> Load and inspect the exponent, use it to determine a byte-granularity
> starting point in the reciprocal pi expansion, then multiply the
> mantissa by a 96-128 bit slice (two 64x64->128 MUL operations).
<
static void ReduceFull(double *xp, int *a, double x)
{
Double X = { x };
int ec = X.s.exponent - (1023+33);
int k = (ec + 26) * (607*4) >> 16;
int m = 27*k - ec;
int offset = m >> 3;
x *= 0x1p-400;
double xDekker = x * (0x1p27 + 1);
double x0 = xDekker - (xDekker - x);
double x1 = x - x0;
const double *p0 = &TwoOverPiWithOffset[offset][k]; // 180 DP FP numbers
const double fp0 = p0[0];
const double fp1 = p0[1];
const double fp2 = p0[2];
const double fp3 = p0[3];
const double f0 = x1 * fp0 + fp1 * x0;
double f = x1 * fp1 + fp2 * x0;
const double fi = f0 + f;
static const double IntegerBias = 0x1.8p52;
Double Fi = { fi + IntegerBias };
*a = Fi.s.significand2;
double fint = Fi.d - IntegerBias;
const double fp4 = p0[4];
const double fp5 = p0[5];
const double fp6 = p0[6];
f = f0 - fint + f;
f += x1 * fp2 + fp3 * x0;
f += x1 * fp3 + fp4 * x0;
f += x1 * fp4 + fp5 * x0;
f += x1 * fp5 + fp6 * x0;
*xp = f * 0x3.243F6A8885A3p-1;
}
along with a large array of FP numbers representing 2/pi. From an
old SUN library. Double (with the capital first letter) is a union overlay
on a double.
>
> We discard the whole circles and normalize, then we use the top N (3-5)
> bits to select the reduced range poly to use.
<
After you create this result, the top 2 bits are the quadrant, but up to
61 bits between there and the hidden bit of the reduced argument
can be 0 and have to be normalized away.
>
> All this should be eminently doable in less than 20 cycles, right?
<
See function.

Re: Two New 128-bit Floating-Point Formats

<7096a3ad-f401-4cf1-83d7-8fa1781155b0n@googlegroups.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=33577&group=comp.arch#33577

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a37:2c81:0:b0:762:19b6:2900 with SMTP id s123-20020a372c81000000b0076219b62900mr68750qkh.5.1691596998904;
Wed, 09 Aug 2023 09:03:18 -0700 (PDT)
X-Received: by 2002:a17:902:d511:b0:1bb:1ffd:5cc8 with SMTP id
b17-20020a170902d51100b001bb1ffd5cc8mr365687plg.11.1691596997956; Wed, 09 Aug
2023 09:03:17 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 9 Aug 2023 09:03:17 -0700 (PDT)
In-Reply-To: <af45f172-2766-4711-a1de-d0650a0f011dn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=136.50.14.162; posting-account=AoizIQoAAADa7kQDpB0DAj2jwddxXUgl
NNTP-Posting-Host: 136.50.14.162
References: <439bb4ce-e70d-4e81-a6f3-2bb9e6e654b3n@googlegroups.com> <af45f172-2766-4711-a1de-d0650a0f011dn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <7096a3ad-f401-4cf1-83d7-8fa1781155b0n@googlegroups.com>
Subject: Re: Two New 128-bit Floating-Point Formats
From: jim.brakefield@ieee.org (JimBrakefield)
Injection-Date: Wed, 09 Aug 2023 16:03:18 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 8730
 by: JimBrakefield - Wed, 9 Aug 2023 16:03 UTC

On Tuesday, August 8, 2023 at 5:41:47 PM UTC-5, JimBrakefield wrote:
> On Tuesday, August 8, 2023 at 3:28:23 PM UTC-5, Quadibloc wrote:
> > [snipped: full text of the original post, quoted upthread]
> Would think that in most calculations, most calculated values are inexact.
> Previously considered taking one mantissa bit to indicate inexact.
> Which is a painful loss of accuracy.
>
> So how many values are exact? Can those values be encoded into
> the NAN bits? If so, why not let inexact be the default, thereby allowing
> one to use round-to-odd thereby eliminating double rounding issues?
> (one would still follow the rule of rounding the nearest representable value)
> > [snipped: remainder of the original post, quoted upthread]

RE
|> allowing one to use round-to-odd thereby eliminating double rounding issues?

On further consideration, one must acknowledge that lawyers and bankers are not
in favor of round-to-odd, unbiased or otherwise.
At one time round-to-odd was said to take an additional mantissa bit;
however, as that LSB is always 1, it can be implied.
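
For illustration, round-to-odd on a bare significand is tiny (the
sketch and its names are mine):

#include <stdint.h>
#include <stdio.h>

/* Round-to-odd: truncate, then jam a 1 into the LSB if anything
   nonzero was discarded. Exact values pass through unchanged; every
   inexact result comes out odd, so a later, wider-to-narrower
   round-to-nearest cannot double-round. */
static uint64_t round_to_odd(uint64_t sig, unsigned drop)
{
    uint64_t kept = sig >> drop;
    uint64_t lost = sig & (((uint64_t)1 << drop) - 1);
    return lost ? (kept | 1) : kept;
}

int main(void)
{
    printf("%llx\n", (unsigned long long)round_to_odd(0x40, 4)); /* 4: exact */
    printf("%llx\n", (unsigned long long)round_to_odd(0x41, 4)); /* 5: inexact, jammed odd */
    return 0;
}

This is also where "the LSB is always 1, it can be implied" comes from:
whenever the sticky OR fires, the stored LSB carries no information.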

The best explanations of rounding:

https://en.wikipedia.org/wiki/Rounding
note alternating open and closed intervals in the graph

https://www.clivemaxfield.com/diycalculator/popup-m-round.shtml
see paragraph on banker's rounding and paragraph on round-half-odd

Re: Two New 128-bit Floating-Point Formats

<ub0fic$jtk$1@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=33579&group=comp.arch#33579

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED.ip72-209-250-219.tu.ok.cox.net!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Two New 128-bit Floating-Point Formats
Date: Wed, 9 Aug 2023 11:39:03 -0500
Organization: A noiseless patient Spider
Message-ID: <ub0fic$jtk$1@dont-email.me>
References: <439bb4ce-e70d-4e81-a6f3-2bb9e6e654b3n@googlegroups.com>
<8f04a04d-7037-46b8-9859-ab0f8571df15n@googlegroups.com>
<uauc26$215i9$1@newsreader4.netcologne.de> <uausnb$3o9dh$1@dont-email.me>
<7f7e5960-f83a-4968-89ba-b1e05170a91bn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 9 Aug 2023 16:40:44 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="ip72-209-250-219.tu.ok.cox.net:72.209.250.219";
logging-data="20404"; mail-complaints-to="abuse@eternal-september.org"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.14.0
Content-Language: en-US
In-Reply-To: <7f7e5960-f83a-4968-89ba-b1e05170a91bn@googlegroups.com>
 by: BGB - Wed, 9 Aug 2023 16:39 UTC

On 8/9/2023 3:51 AM, Quadibloc wrote:
> On Tuesday, August 8, 2023 at 8:13:04 PM UTC-6, BGB wrote:
>
>> Say, if one could have a world of hard-wired "truncate towards zero"
>> rounding and where denormals did not exist, ...
>
> Why would truncate towards zero be superior to rounding, even
> if the rounding were not as complicated as the exact rounding
> of IEEE 754?

Truncate-towards-zero saves a carry propagation on the output step.
Full carry propagation has a moderately high latency;
partial carry propagation (what I had used) avoids this, but is wonky
(values will round or truncate depending on how far the carry would
have propagated).

Truncate can give consistent results while also being the cheapest option.

However, if combined with the expectation that calculations with
integers yield integer results, it does require that the logic within
FADD/FSUB internally follow two's complement behavior (which is,
annoyingly, slightly more expensive than ones' complement).

One can argue that ones' complement plus full-width rounding would also
pull this off, making this more of a tradeoff.

However, if one designs the unit such that it follows a "C=A+(B>>d)"
scheme, where 'A' is always the side with the larger exponent (or the
larger mantissa if the exponents are equal), then with normalized
inputs C will always be positive (C can be negative if the exponents
are equal and the mantissas are not compared; comparing the mantissas
being "also expensive").

Though, another option is to have two adders in parallel which produce
both a positive and a negative result, and then select whichever output
was positive. One pays for an extra adder, but this doesn't cost much
more than being able to do "if(C<0) { C=-C; Sgn=!Sgn; }", and has a lower
latency than the other options.

Sadly, the "cheapest possible option" here (naive ones' complement with
truncate rounding) would lead to things like:
2.0-4.0 => -1.999999
Which, kinda sucks...
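
A toy reproduction of that failure mode (my own much-simplified model,
not the actual datapath):

#include <stdio.h>

/* Naive ones' complement + truncate: subtract by adding the bitwise
   complement but never pay for the end-around carry, then truncate. */
int main(void)
{
    long long mA = 1LL << 52;          /* 4.0 as 1.0 * 2^2 */
    long long mB = (1LL << 52) >> 1;   /* 2.0, pre-shifted to align */

    long long diff = mA + ~mB;         /* = mA - mB - 1: the missing +1 */
    diff <<= 1;                        /* renormalize, exponent drops by 1 */

    double r = -((double)diff / (1LL << 52)) * 2.0;
    printf("%.17g\n", r);              /* -1.9999999999999991, not -2.0 */
    return 0;
}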

I consider this to be below the level of what is "generally usable" (well,
and also, if one tries to run Quake on an FPU with this property, its
"progs.dat" VM loses its crap, ...).

This ability for integer ops to produce integer results is basically
also required for JavaScript and similar.

>
> But it certainly is true that old-fashioned floating-point was
> entirely adequate for the normal use case of floating point,
> physical calculations in the sciences. However, except for
> division (at least by the fastest algorithms) the cost of
> guard-round-sticky exact rounding is negligible; I don't
> think we should roll back the quality of floating-point arithmetic
> any further than we have to.
>

For "general use", granted.

As can be noted, for "cheap FPUs" (on lower end hardware), it may make
sense to omit FDIV/FRCP/FSQRT/... and leave all this up to software.

So, one has an FPU that does 3 fundamental operators:
FADD, FSUB, FMUL
Where, say:
FCMPxx, can be routed through the ALU

Format converters could use truncate rounding, this being cheaper.
Where, say, FADD/FSUB/FMUL, FCMPEQ/FCMPGT, FCNVxx, is basically the
entire set of FPU ops.

For a "budget FPU", one might also argue that FP<->Int conversion only
does Int32 and by mapping the Int32 range to, say, 4G..8G, so then one
does the conversion partly by manually adding/subtracting a bias value
of roughly 6G (this saving the cost of the FADD/FSUB unit needing to be
able to deal with format conversion).
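
The same trick in portable C, using the familiar 2^52+2^51 bias so the
integer lands in the low 32 bits (a 6G bias against the 4G..8G binade
works the same way; this sketch assumes the default round-to-nearest
mode and in-range values):

#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define BIAS 6755399441055744.0   /* 0x1.8p52 = 2^52 + 2^51 */

/* int32 -> double: plant the integer in the low bits of a double in
   the [2^52, 2^53) binade, then let an ordinary FSUB strip the bias. */
static double int32_to_f64(int32_t v)
{
    uint64_t u = 0x4338000000000000ull + (uint64_t)(int64_t)v;
    double d;
    memcpy(&d, &u, 8);
    return d - BIAS;
}

/* double -> int32: an ordinary FADD aligns the value against the
   bias, leaving the rounded integer in the low 32 bits. */
static int32_t f64_to_int32(double d)
{
    uint64_t u;
    d += BIAS;
    memcpy(&u, &d, 8);
    return (int32_t)(uint32_t)u;
}

int main(void)
{
    printf("%g %d\n", int32_to_f64(-12345), f64_to_int32(-12345.25));
    return 0;
}

Only the bias add/subtract needs to go through the FADD unit; planting
and extracting the low bits is, presumably, the free wiring being
described above.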

Dealing directly with the full Int64 range (with Binary64) isn't really
possible with this approach (would either require multiple stages, or
paying the cost of being able to route it through the FADD unit or similar).

In my case, I had gone the route of having a "more proper" Int64
converter for the scalar FPU though.

As-is for BJX2, the main scalar converters use rounding, while the
SIMD converter ops use truncate.

Similarly, only the main scalar FPU requires "mostly accurate"
semantics, with the "low precision" SIMD unit being subject to more
aggressive cost-cutting (eg, "truncate only", etc).

There are optional scalar FDIV/FSQRT operators, but, as-is, they are
"kinda boat anchors" (with me having noted that FDIV could be routed
through a 64-bit integer MUL/DIV unit, which in this case works via a "1
bit at a time, shift and add" strategy).

This unit mostly provides:
FDIV, accurate but not fast;
64-bit integer MUL/DIV, though MUL is slower than SW;
32-bit integer DIV, moderately faster than software.

Though, in the latter case (32-bit DIV), I was able to add a special case
where the IMUL unit will detect a certain subset of inputs (as something
I had called the "FAZ divider"), and then turn them into a "multiply by
reciprocal and right-shift" operation (which just so happens to cover the
bulk of common integer divides, making the "DIVS.L" instruction
"actually useful" if compared with "just do it in software").

Though, "FAZ" is still a bit of a hack, and only covers a limited range
of quotients and divisors (so, it means that the op will take either 3
or 38 cycles depending on the input values, which also "kinda sucks"...).

There are no "proper" DIV/SQRT operators for SIMD, but there are
"approximate" versions, where as seen as a bit pattern:
C=A/B
C=1/B
Roughly maps to:
C.bits = A.bits - B.bits + MAGIC_BIAS_DIV
C.bits = MAGIC_BIAS_RCP - B.bits

With some extra special-case handling (for the sign and exponent).
Magic bias in this case mostly being a bit-twiddle of the high-order
bits (rather than a full-width adder).

And:
C=FSQRT(A)
Maps to, say:
C.bits=(A.bits>>1)+MAGIC_BIAS_SQRT

These can also be used as an input-stage followed by a few N-R steps for
a "more proper" FDIV or FSQRT (sadly, without the N-R, the results are
not often sufficient even for low-precision tasks).
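
As a software illustration of these bit-pattern tricks (the magic
constants below are generic stand-ins, twice and half the bit pattern
of 1.0 respectively, not the actual MAGIC_BIAS_* values):

#include <stdint.h>
#include <string.h>
#include <stdio.h>

static double u2d(uint64_t u) { double d; memcpy(&d, &u, 8); return d; }
static uint64_t d2u(double d) { uint64_t u; memcpy(&u, &d, 8); return u; }

/* C = 1/B, approximately: subtracting the bit pattern from a constant
   negates the exponent and crudely inverts the mantissa. With this
   constant the relative error is at most ~12.5% for positive normal
   inputs. */
static double frcp_approx(double b)
{
    return u2d(0x7FE0000000000000ull - d2u(b));
}

/* Refined with Newton-Raphson steps, each roughly doubling the number
   of good bits: 0.125 -> 0.016 -> 2e-4 -> 6e-8 -> 4e-15 -> ulp-ish. */
static double frcp(double b)
{
    double x = frcp_approx(b);
    for (int i = 0; i < 5; i++)
        x = x * (2.0 - b * x);
    return x;
}

/* C = FSQRT(A), approximately: the right shift halves the exponent. */
static double fsqrt_approx(double a)
{
    return u2d((d2u(a) >> 1) + 0x1FF8000000000000ull);
}

int main(void)
{
    printf("1/3     ~ %.17g\n", frcp(3.0));
    printf("sqrt(2) ~ %.17g (coarse)\n", fsqrt_approx(2.0));
    return 0;
}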

For FP-SIMD <-> Packed Integer conversions, the ops are a bit wonky as I
implemented them in a way that didn't (directly) require feeding them
through the FADD logic.

Say, if one maps them instead to the 2.0-4.0 range, the converter op
becomes a lot cheaper (but still "kinda crap" in some ways).

....

Re: Two New 128-bit Floating-Point Formats

<ub0sgr$2cu8$1@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=33580&group=comp.arch#33580

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED.3.80-202-33.nextgentel.com!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Two New 128-bit Floating-Point Formats
Date: Wed, 9 Aug 2023 22:21:47 +0200
Organization: A noiseless patient Spider
Message-ID: <ub0sgr$2cu8$1@dont-email.me>
References: <439bb4ce-e70d-4e81-a6f3-2bb9e6e654b3n@googlegroups.com>
<af45f172-2766-4711-a1de-d0650a0f011dn@googlegroups.com>
<f0673c9a-a6a4-46b7-8ec8-ec37d8bc769dn@googlegroups.com>
<ub05dl$3v9rs$1@dont-email.me>
<14c4d8ca-9ded-4562-a6be-125b8435c643n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 9 Aug 2023 20:21:47 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="3.80-202-33.nextgentel.com:80.202.33.3";
logging-data="78792"; mail-complaints-to="abuse@eternal-september.org"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.17
In-Reply-To: <14c4d8ca-9ded-4562-a6be-125b8435c643n@googlegroups.com>
 by: Terje Mathisen - Wed, 9 Aug 2023 20:21 UTC

MitchAlsup wrote:
> On Wednesday, August 9, 2023 at 8:47:36 AM UTC-5, Terje Mathisen wrote:
>> MitchAlsup wrote:
>>> On Tuesday, August 8, 2023 at 5:41:47 PM UTC-5, JimBrakefield wrote:
>>>> On Tuesday, August 8, 2023 at 3:28:23 PM UTC-5, Quadibloc wrote:
>>>>> [snipped: full text of the original post, quoted upthread]
>>> <
>>>> Would think that in most calculations, most calculated values are inexact.
>>>> Previously considered taking one mantissa bit to indicate inexact.
>>>> Which is a painful loss of accuracy.
>>> <
>>> The most precise thing we can routinely measure is 22-bits.
>>> The most precise thing we can measure in 1ns is 8-bits.
>>> The most precise thing we have ever measured is 44-bits.
>>> {and this took 25+ years to decrease the noise to this}
>>> <
>>> However: there are lots of calculations that expand (ln, sqrt) and
>>> compress (^2, exp, erf) the number of bits needed to retain precision
>>> "down the line". Just computing ln2() to IEEE accuracy requires
>>> something-like 72-bit of fraction in the intermediate x*ln2(y) part
>>> to achieve IEEE 754 accuracy in the final result of Ln2() over the
>>> entire range of x and y.
>>> <
>>> This is the problem, not the number of bits in the fraction at one
>>> instant. It is a problem well understood by numerical analysts
>>> ..........And something casual programmers remain unaware of
>>> for decades of experience.........
>>>>
>>>> So how many values are exact? Can those values be encoded into
>>>> the NAN bits? If so, why not let inexact be the default, thereby allowing
>>>> one to use round-to-odd thereby eliminating double rounding issues?
>>>> (one would still follow the rule of rounding the nearest representable value)
>>> <
>>> Nobody doing real FP math gives a crap about exactness. The 99%
>>> Only people testing FP arithmetic units do. The way-less-than 0.1%
>>> <
>>> Consider: COS( 6381956970095103×2^797) = -4.68716592425462761112E-19
>>> <
>>> Conceptually, this requires calculating over 800-bits of intermediate
>>> INT(2/pi×x) !!! to get the proper reduced argument which will result in
>>> the above properly rounded result.
>>> <
>>> To get that 800-bits one uses Payne-Hanek argument reduction which
>>> takes somewhat longer than 100 cycles--compared to computing the
>> 100 cycles seems to be way too large, but maybe this is what current
>> implementations need?
>>> COS(reduced) polynomial taking slightly less than 100 cycles.
>>> <
>>> I have a patented method that can perform reduction in 5 cycles: and a
>>> designed function unit that can perform the above COS(actual)
>>> in 19 cycles.
>> This is of course really wonderful, to the point where I want _all_ FPUs
>> to license and/or emulate your algorithms.
>>
>> However, the actual range reduction should be significantly faster than
>> 100 cycles, even in pure SW:
>>
>> Load and inspect the exponent, use it to determine a byte-granularity
>> starting point in the reciprocal pi expansion, then multiply the
>> mantissa by a 96-128 bit slice (two 64x64->128 MUL operations).
> <
> static void ReduceFull(double *xp, int *a, double x)
> {
> Double X = { x };
> int ec = X.s.exponent - (1023+33);
> int k = (ec + 26) * (607*4) >> 16;
> int m = 27*k - ec;
> int offset = m >> 3;
> x *= 0x1p-400;
> double xDekker = x * (0x1p27 + 1);
> double x0 = xDekker - (xDekker - x);
> double x1 = x - x0;
> const double *p0 = &TwoOverPiWithOffset[offset][k]; // 180 DP FP numbers
> const double fp0 = p0[0];
> const double fp1 = p0[1];
> const double fp2 = p0[2];
> const double fp3 = p0[3];
> const double f0 = x1 * fp0 + fp1 * x0;
> double f = x1 * fp1 + fp2 * x0;
> const double fi = f0 + f;
> static const double IntegerBias = 0x1.8p52;
> Double Fi = { fi + IntegerBias };
> *a = Fi.s.significand2;
> double fint = Fi.d - IntegerBias;
> const double fp4 = p0[4];
> const double fp5 = p0[5];
> const double fp6 = p0[6];
> f = f0 - fint + f;
> f += x1 * fp2 + fp3 * x0;
> f += x1 * fp3 + fp4 * x0;
> f += x1 * fp4 + fp5 * x0;
> f += x1 * fp5 + fp6 * x0;
> *xp = f * 0x3.243F6A8885A3p-1;
> }
> along with a large array of FP numbers representing 2/pi. From an
> old SUN library. Double (with the capital first letter) is a union overlay
> on a double.
>>
>> We discard the whole circles and normalize, then we use the top N (3-5)
>> bits to select the reduced range poly to use.
> <
> After you create this result, the top 2 bits are the quadrant, but up to
> 61 bits between there and the hidden bit of the reduced argument
> can be 0 and have to be normalized away.
>>
>> All this should be eminently doable in less than 20 cycles, right?
> <
> See function.


Thanks for posting!

I see immediately that in order to be portable, they are using a lot of
redundant storage and doing a lot more FMULs than what I would need when
using integer 64x64->128 MULs, on top of misaligned loads, as the
building block.

I would of course also do the actual poly evaluation with integer
operations, but that happens later.

Terje
Re: Two New 128-bit Floating-Point Formats

<0cd5cd9a-4373-483e-b04c-bdb62b9f43e7n@googlegroups.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=33581&group=comp.arch#33581

  copy link   Newsgroups: comp.arch
X-Received: by 2002:ad4:4f49:0:b0:635:d9b4:ba20 with SMTP id eu9-20020ad44f49000000b00635d9b4ba20mr7383qvb.11.1691620675567;
Wed, 09 Aug 2023 15:37:55 -0700 (PDT)
X-Received: by 2002:a17:903:190:b0:1bb:c7c6:3472 with SMTP id
z16-20020a170903019000b001bbc7c63472mr124195plg.13.1691620674951; Wed, 09 Aug
2023 15:37:54 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 9 Aug 2023 15:37:54 -0700 (PDT)
In-Reply-To: <ub0sgr$2cu8$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:c5e:ef41:fe8e:dda5;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:c5e:ef41:fe8e:dda5
References: <439bb4ce-e70d-4e81-a6f3-2bb9e6e654b3n@googlegroups.com>
<af45f172-2766-4711-a1de-d0650a0f011dn@googlegroups.com> <f0673c9a-a6a4-46b7-8ec8-ec37d8bc769dn@googlegroups.com>
<ub05dl$3v9rs$1@dont-email.me> <14c4d8ca-9ded-4562-a6be-125b8435c643n@googlegroups.com>
<ub0sgr$2cu8$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <0cd5cd9a-4373-483e-b04c-bdb62b9f43e7n@googlegroups.com>
Subject: Re: Two New 128-bit Floating-Point Formats
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Wed, 09 Aug 2023 22:37:55 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 13984
 by: MitchAlsup - Wed, 9 Aug 2023 22:37 UTC

On Wednesday, August 9, 2023 at 3:21:51 PM UTC-5, Terje Mathisen wrote:
> MitchAlsup wrote:
> > On Wednesday, August 9, 2023 at 8:47:36 AM UTC-5, Terje Mathisen wrote:
> >> MitchAlsup wrote:
> >>> On Tuesday, August 8, 2023 at 5:41:47 PM UTC-5, JimBrakefield wrote:
> >>>> On Tuesday, August 8, 2023 at 3:28:23 PM UTC-5, Quadibloc wrote:
> >>>>> [snipped: full text of the original post, quoted upthread]
> >>> <
> >>>> Would think that in most calculations, most calculated values are inexact.
> >>>> Previously considered taking one mantissa bit to indicate inexact.
> >>>> Which is a painful loss of accuracy.
> >>> <
> >>> The most precise thing we can routinely measure is 22-bits.
> >>> The most precise thing we can measure in 1ns is 8-bits.
> >>> The most precise thing we have ever measured is 44-bits.
> >>> {and this took 25+ years to decrease the noise to this}
> >>> <
> >>> However: there are lots of calculations that expand (ln, sqrt) and
> >>> compress (^2, exp, erf) the number of bits needed to retain precision
> >>> "down the line". Just computing ln2() to IEEE accuracy requires
> >>> something-like 72-bit of fraction in the intermediate x*ln2(y) part
> >>> to achieve IEEE 754 accuracy in the final result of Ln2() over the
> >>> entire range of x and y.
> >>> <
> >>> This is the problem, not the number of bits in the fraction at one
> >>> instant. It is a problem well understood by numerical analysts
> >>> ..........And something casual programmers remain unaware of
> >>> for decades of experience.........
> >>>>
> >>>> So how many values are exact? Can those values be encoded into
> >>>> the NAN bits? If so, why not let inexact be the default, allowing
> >>>> one to use round-to-odd and thereby eliminating double-rounding issues?
> >>>> (one would still follow the rule of rounding to the nearest representable value)
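A small, self-contained illustration (not from the posts) of the
double-rounding hazard that round-to-odd cures. The value 1 + 2^-24 + 2^-64
rounds correctly to float in one step, but rounding it to double first eats
the sticky 2^-64 bit, and the second rounding then ties the wrong way:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *s = "0x1.0000010000000001p0";   /* 1 + 2^-24 + 2^-64 */

    float  once  = strtof(s, NULL);   /* single rounding: 1 + 2^-23   */
    double d     = strtod(s, NULL);   /* first rounding:  1 + 2^-24   */
    float  twice = (float)d;          /* second rounding: ties to 1.0 */

    printf("once  = %a\n", once);     /* 0x1.000002p+0 */
    printf("twice = %a\n", twice);    /* 0x1p+0        */
    return 0;
}

Had the first step rounded to odd, d would have kept a set low bit
(1 + 2^-24 + 2^-52), and the final rounding to float would come out right.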
> >>> <
> >>> Nobody doing real FP math gives a crap about exactness (the 99%).
> >>> Only people testing FP arithmetic units do (the way-less-than-0.1%).
> >>> <
> >>> Consider: COS( 6381956970095103×2^797) = -4.68716592425462761112E-19
> >>> <
> >>> Conceptually, this requires calculating over 800-bits of intermediate
> >>> INT(2/pi×x) !!! to get the proper reduced argument which will result in
> >>> the above properly rounded result.
> >>> <
> >>> To get that 800-bits one uses Payne-Hanek argument reduction which
> >>> takes somewhat longer than 100 cycles--compared to computing the
> >> 100 cycles seems to be way too large, but maybe this is what current
> >> implementations need?
> >>> COS(reduced) polynomial taking slightly less than 100 cycles.
> >>> <
> >>> I have a patented method that can perform reduction in 5 cycles: and a
> >>> designed function unit that can perform the above COS(actual)
> >>> in 19 cycles.
> >> This is of course really wonderful, to the point where I want _all_ FPUs
> >> to license and/or emulate your algorithms.
> >>
> >> However, the actual range reduction should be significantly faster than
> >> 100 cycles, even in pure SW:
> >>
> >> Load and inspect the exponent, use it to determine a byte-granularity
> >> starting point in the reciprocal pi expansion, then multiply the
> >> mantissa by a 96-128 bit slice (two 64x64->128 MUL operations).
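A rough sketch of that inner multiply, assuming GCC/Clang's unsigned
__int128; the table holds the leading bytes of the binary fraction of 2/pi
(as in the well-known fdlibm table), while the function name and the
byte-slicing details are invented for illustration:

#include <stddef.h>
#include <stdint.h>

static const uint8_t TWO_OVER_PI[] = {   /* 2/pi = 0x0.A2F9836E4E44... */
    0xA2,0xF9,0x83,0x6E,0x4E,0x44,0x15,0x29,0xFC,0x27,0x57,0xD1,0xF5,0x34,
    0xDD,0xC0,0xDB,0x62,0x95,0x99,0x3C,0x43,0x90,0x41,0xFE,0x51,0x63,0xAB,
};

/* Multiply a <=53-bit mantissa by the 128-bit slice of 2/pi starting
   'offset' bytes into the expansion (offset chosen from the exponent);
   return the top 128 bits of the product: two 64x64->128 MULs plus an add. */
static unsigned __int128 mant_times_slice(uint64_t mant, size_t offset)
{
    uint64_t hi = 0, lo = 0;
    for (int i = 0; i < 8;  i++) hi = (hi << 8) | TWO_OVER_PI[offset + i];
    for (int i = 8; i < 16; i++) lo = (lo << 8) | TWO_OVER_PI[offset + i];

    unsigned __int128 p_hi = (unsigned __int128)mant * hi;   /* MUL #1 */
    unsigned __int128 p_lo = (unsigned __int128)mant * lo;   /* MUL #2 */
    return p_hi + (p_lo >> 64);
}

The real routine still has to peel off the two quadrant bits and normalize
away any leading zeros, as the follow-ups below discuss.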
> > <
> > static void ReduceFull(double *xp, int *a, double x)
> > {
> >   Double X = { x };                    // union overlay, to read the exponent field
> >   int ec = X.s.exponent - (1023+33);
> >   int k = (ec + 26) * (607*4) >> 16;   // ~ (ec+26)/27 without a divide
> >   int m = 27*k - ec;
> >   int offset = m >> 3;
> >   x *= 0x1p-400;                       // rescale so later products stay in range
> >   double xDekker = x * (0x1p27 + 1);   // Dekker split: x == x0 + x1 exactly
> >   double x0 = xDekker - (xDekker - x);
> >   double x1 = x - x0;
> >   const double *p0 = &TwoOverPiWithOffset[offset][k]; // 180 DP FP numbers
> >   const double fp0 = p0[0];
> >   const double fp1 = p0[1];
> >   const double fp2 = p0[2];
> >   const double fp3 = p0[3];
> >   const double f0 = x1 * fp0 + fp1 * x0;
> >   double f = x1 * fp1 + fp2 * x0;      // was "Double f": a plain double is intended
> >   const double fi = f0 + f;
> >   static const double IntegerBias = 0x1.8p52;
> >   Double Fi = { fi + IntegerBias };    // was "double Fi": the overlay is needed below
> >   *a = Fi.s.significand2;              // integer part, i.e. the quadrant bits
> >   double fint = Fi.d - IntegerBias;
> >   const double fp4 = p0[4];
> >   const double fp5 = p0[5];
> >   const double fp6 = p0[6];
> >   f = f0 - fint + f;                   // remove the integer part exactly
> >   f += x1 * fp2 + fp3 * x0;
> >   f += x1 * fp3 + fp4 * x0;
> >   f += x1 * fp4 + fp5 * x0;
> >   f += x1 * fp5 + fp6 * x0;
> >   *xp = f * 0x3.243F6A8885A3p-1;       // * pi/2: convert quadrants back to radians
> > }
> > along with a large array of FP numbers representing 2/pi. From an
> > old SUN library. Double (with the capital first letter) is a union overlay
> > on a double.
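For readers following the code: a plausible shape for that overlay,
reconstructed from the three fields the function touches (the SUN
original's exact bitfield layout may differ; little-endian assumed):

#include <stdint.h>

typedef union {
    double d;
    struct {                           /* low bits first */
        uint64_t significand2 : 32;    /* low 32 bits of the fraction  */
        uint64_t significand  : 20;    /* high 20 bits of the fraction */
        uint64_t exponent     : 11;    /* biased exponent              */
        uint64_t sign         : 1;
    } s;
} Double;   /* so Double X = { x } initializes d, the first member */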
> >>
> >> We discard the whole circles and normalize, then we use the top N (3-5)
> >> bits to select the reduced range poly to use.
> > <
> > After you create this result, the top 2 bits are the quadrant, but up to
> > 61 bits between there and the hidden bit of the reduced argument
> > can be 0 and have to be normalized away.
> >>
> >> All this should be eminently doable in less than 20 cycles, right?
> > <
> > See function.
> Thanks for posting!
>
> I see immediately that in order to be portable, they are using a lot of
> redundant storage and doing a lot more FMULs than what I would need when
> using integer 64x64->128 MULs, on top of misaligned loads, as the
> building block.
>
> I would of course also do the actual poly evaluation with integer
> operations, but that happens later.
>
> Accessing a byte array means that I get maximum 7 extra leading bits, so
> after the 53 x 128 mul, I have at least 121 useful bits, with the bottom
> part capable of swallowing any carries from the truncation error at the
> end of the 128-bit reciprocal, right?
<
Pretty close:
<
After stripping off the 2 quadrant bits from the top, you have as many as
60.9 bits of zero (J.-M. Muller) and you still need 53 bits of fraction for the
reduced argument. I count that as 116 bits. But by the way they calculate
k and m, there is jitter as to which bits become the quadrant bits before
normalization. Given this jitter, the number of bits you have to calculate
is somewhere in the 120-125 range.
<
This also converts back to radians {*xp = f * 0x3.243F6A8885A3p-1; }
whereas I would "decorate" the coefficients such that that is not necessary
and calculate directly in reduced 2/pi range {0..2/pi} instead of {0..1}.
This saves ½ bit of rounding error.
<
But you do see the expense of performing good argument reduction.
<
> Terje
>
>
> --
> - <Terje.Mathisen at tmsw.no>
> "almost all programming can be viewed as an exercise in caching"


Re: Two New 128-bit Floating-Point Formats

<847fb874-222a-4ea2-b615-20c655499be4n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33582&group=comp.arch#33582

 by: Quadibloc - Wed, 9 Aug 2023 23:55 UTC

On Wednesday, August 9, 2023 at 10:03:20 AM UTC-6, JimBrakefield wrote:

> On further consideration, one must acknowledge that lawyers and bankers are not
> in favor of round to odd, unbiased or otherwise.

But what does that have to do with floating-point? Of course _integer_ arithmetic
truncates.

John Savard

Re: Two New 128-bit Floating-Point Formats

<dfc4a06f-38bc-4bb7-bb3d-4870e5ebd59dn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33583&group=comp.arch#33583

 by: Quadibloc - Wed, 9 Aug 2023 23:58 UTC

On Wednesday, August 9, 2023 at 4:37:57 PM UTC-6, MitchAlsup wrote:

> But you do see the expense of performing good argument reduction.

It certainly is expensive. And it does not serve a practical purpose, so
I am mystified that current floating-point standards call for it.

John Savard

Re: Two New 128-bit Floating-Point Formats

<ub20tf$9tmg$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=33584&group=comp.arch#33584

 by: Terje Mathisen - Thu, 10 Aug 2023 06:42 UTC

Quadibloc wrote:
> On Wednesday, August 9, 2023 at 4:37:57 PM UTC-6, MitchAlsup wrote:
>
>> But you do see the expense of performing good argument reduction.
>
> It certainly is expensive. And it does not serve a practical purpose, so
> I am mystified that current floating-point standards call for it.

IEEE 754 does NOT "call for it".

What we're asking for in the standard is much more in the "make a best
effort" camp than an absolute requirement.

Exact argument reduction for huge inputs is one of those things that I
think you should do simply because you can without introducing any
really bad slowdowns. (Or in the case of Mitch's hw/algorithm co-design,
pretty close to zero overhead.)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Two New 128-bit Floating-Point Formats

<xg6BM.359400$xMqa.24856@fx12.iad>

https://news.novabbs.org/devel/article-flat.php?id=33585&group=comp.arch#33585

 by: Scott Lurndal - Thu, 10 Aug 2023 14:13 UTC

Quadibloc <jsavard@ecn.ab.ca> writes:
>On Wednesday, August 9, 2023 at 10:03:20 AM UTC-6, JimBrakefield wrote:
>
>> On further consideration, one must acknowledge that lawyers and bankers are not
>> in favor of round to odd, unbiased or otherwise.
>
>But what does that have to do with floating-point? Of course _integer_ arithmetic
>truncates.

Does it? Consider integer arithmetic denominated in mils (1/10 cent). There
is a need for rounding when presenting values denominated in whole cents.
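A toy version of that presentation step, with ordinary round-half-up (the
rounding convention itself being the business and legal decision):

#include <stdio.h>

/* Amounts kept exactly in mils (1/10 cent); rounding happens only
   when presenting whole cents. */
static long mils_to_cents(long mils)
{
    return (mils + 5) / 10;   /* round half up, non-negative amounts */
}

int main(void)
{
    printf("%ld\n", mils_to_cents(12344));   /* 1234 */
    printf("%ld\n", mils_to_cents(12345));   /* 1235 */
    return 0;
}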

Re: Two New 128-bit Floating-Point Formats

<ub31a7$duua$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=33587&group=comp.arch#33587

 by: Stephen Fuld - Thu, 10 Aug 2023 15:55 UTC

On 8/8/2023 3:08 PM, MitchAlsup wrote:
> On Tuesday, August 8, 2023 at 4:53:39 PM UTC-5, Quadibloc wrote:
>> On Tuesday, August 8, 2023 at 3:28:42 PM UTC-6, Thomas Koenig wrote:
>>> MitchAlsup <Mitch...@aol.com> schrieb:
>>
>>>> Still, overall, posit gives you more of what you want--precision
>>>> when you don't need exponent bits, and lack of precision loss
>>>> when you do.
>>
>>> I'm thoroughly unconvinced on posits.
>>>
>>> Losing precision when the scale is off seems like a bad idea to me,
>>> especially if there is no indication of lost intermediate precision.
> <
> There are (now) many numerical algorithms which have been tested in
> posit form and found to have significantly better accuracy with posits
> instead of IEEE 754--so much so that some of them can decrease the
> > size of their average data (64->32 or 32->16) and still give meaningful
> results--mostly better than IEEE of 2× the size
> <
> However, I too remain unconvinced--because the failed results are not
> equally marketed..

Agreed. Though I did find this

https://arxiv.org/pdf/2109.08225.pdf

Which gives at least some real world examples where Posits are worse.

And like you, not advocating - though intrigued. :-)

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Two New 128-bit Floating-Point Formats

<787fc60d-a7a8-40e6-94e0-48a4ece390afn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33590&group=comp.arch#33590

 by: MitchAlsup - Thu, 10 Aug 2023 17:23 UTC

On Wednesday, August 9, 2023 at 6:58:47 PM UTC-5, Quadibloc wrote:
> On Wednesday, August 9, 2023 at 4:37:57 PM UTC-6, MitchAlsup wrote:
>
> > But you do see the expense of performing good argument reduction.
> It certainly is expensive. And it does not serve a practical purpose, so
> I am mystified that current floating-point standards call for it.
<
It is part of a major tenet of IEEE 754--
<
Compute as-if to infinite precision, then round once, properly.
>
> John Savard

Re: Two New 128-bit Floating-Point Formats

<6f4e180d-f331-402e-8d81-d0a00590352bn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33591&group=comp.arch#33591

 by: MitchAlsup - Thu, 10 Aug 2023 17:26 UTC

On Thursday, August 10, 2023 at 9:13:54 AM UTC-5, Scott Lurndal wrote:
> Quadibloc <jsa...@ecn.ab.ca> writes:
> >On Wednesday, August 9, 2023 at 10:03:20 AM UTC-6, JimBrakefield wrote:
> >
> >> On further consideration, one must acknowledge that lawyers and bankers are not
> >> in favor of round to odd, unbiased or otherwise.
> >
> >But what does that have to do with floating-point? Of course _integer_ arithmetic
> >truncates.
>
> Does it? Consider integer arithmetic denominated in mils (1/10 cent). There
> is a need for rounding when presenting values denominated in whole cents.
<
In any event::
<
Bankers rounding is defined by "state" code:: 0..7 cents rounds to 0 tax, 8..15
cents rounds to 1 cent tax, ... and the state gets to choose the lower and upper
bounds of each range, and may not space them evenly.
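A sketch of how such state-defined brackets might be table-driven; the first
two bounds follow the example above, the rest are invented, and each state
would supply its own (possibly uneven) table:

#include <stddef.h>

static const int bracket_top[] = { 7, 15, 24, 32, 41 };   /* inclusive tops */

/* Tax in whole cents: count how many brackets the amount has passed. */
static int bracket_tax(int amount)
{
    int cents = 0;
    for (size_t i = 0; i < sizeof bracket_top / sizeof bracket_top[0]; i++)
        if (amount > bracket_top[i])
            cents++;
    return cents;
}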

Re: Two New 128-bit Floating-Point Formats

<89a02722-9841-4db8-935c-c103dbf1c818n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33592&group=comp.arch#33592

 by: MitchAlsup - Thu, 10 Aug 2023 17:34 UTC

On Thursday, August 10, 2023 at 10:55:55 AM UTC-5, Stephen Fuld wrote:
> On 8/8/2023 3:08 PM, MitchAlsup wrote:
> > On Tuesday, August 8, 2023 at 4:53:39 PM UTC-5, Quadibloc wrote:
> >> On Tuesday, August 8, 2023 at 3:28:42 PM UTC-6, Thomas Koenig wrote:
> >>> MitchAlsup <Mitch...@aol.com> schrieb:
> >>
> >>>> Still, overall, posit gives you more of what you want--precision
> >>>> when you don't need exponent bits, and lack of precision loss
> >>>> when you do.
> >>
> >>> I'm thoroughly unconvinced on posits.
> >>>
> >>> Losing precision when the scale is off seems like a bad idea to me,
> >>> especially if there is no indication of lost intermediate precision.
> > <
> > There are (now) many numerical algorithms which have been tested in
> > posit form and found to have significantly better accuracy with posits
> > instead of IEEE 754--so much so that some of them can decrease the
> > size of their average data (64->32 or 32->16) and still give meaningful
> > results--mostly better than IEEE of 2× the size
> > <
> > However, I too remain unconvinced--because the failed results are not
> > equally marketed..
> Agreed. Though I did find this
>
> https://arxiv.org/pdf/2109.08225.pdf
>
> Which gives at least some real world examples where Posits are worse.
<
Thanks for this: although somewhat "tainted" due to using posit(32,3)
instead of posit(32,2), on which most work has been done.
>
> And like you, not advocating - though intrigued. :-)
>
>
>
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)
