
Re: Solving the Floating-Point Conundrum

<ue7nkh$ne0$1@gal.iecc.com>

https://news.novabbs.org/devel/article-flat.php?id=34106&group=comp.arch#34106
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: johnl@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: Solving the Floating-Point Conundrum
Date: Sun, 17 Sep 2023 20:30:09 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <ue7nkh$ne0$1@gal.iecc.com>
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com> <8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com> <f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com> <ue788u$4u5l$1@newsreader4.netcologne.de>
Injection-Date: Sun, 17 Sep 2023 20:30:09 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="24000"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com> <8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com> <f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com> <ue788u$4u5l$1@newsreader4.netcologne.de>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Sun, 17 Sep 2023 20:30 UTC

According to Thomas Koenig <tkoenig@netcologne.de>:
>> That's not a power-of-two length, so how do I keep using these numbers both
>> efficient and simple?
>
>Make the architecture byte-addressable, with another width for the
>bytes; possible choices are 6 and 9.

I'm pretty sure the world has spoken and we are going to use 8-bit
bytes forever. I liked the PDP-8 and PDP-10 but they are, you know, dead.

>Then make your architecture capable of misaligned loads and stores
>and an extra floating point format, maybe 45 bits, with 9 bits
>exponent and 36 bits of significand.

If you're worried about performance, use your 45 bit format and store
it in a 64 bit word.
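
A minimal sketch of what "store it in a 64 bit word" could look like in C. The field layout is an assumption, not something stated in the thread: one sign bit, the 9-bit exponent, and 35 stored fraction bits (a hidden leading bit would give the 36-bit significand). The names `f45_word`, `f45_pack` and the accessors are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout inside the low 45 bits of an aligned 64-bit word:
   bit 44      sign
   bits 43..35 9-bit exponent
   bits 34..0  35 stored fraction bits
   The upper 19 bits of the word are simply left unused. */
#define F45_FRAC_BITS 35
#define F45_EXP_BITS  9

typedef uint64_t f45_word;

static inline f45_word f45_pack(unsigned sign, unsigned exp, uint64_t frac)
{
    assert(sign <= 1);
    assert(exp < (1u << F45_EXP_BITS));
    assert(frac < (UINT64_C(1) << F45_FRAC_BITS));
    return ((uint64_t)sign << 44) | ((uint64_t)exp << F45_FRAC_BITS) | frac;
}

static inline unsigned f45_sign(f45_word w) { return (unsigned)((w >> 44) & 1u); }
static inline unsigned f45_exp(f45_word w)  { return (unsigned)((w >> F45_FRAC_BITS) & 0x1FFu); }
static inline uint64_t f45_frac(f45_word w) { return w & ((UINT64_C(1) << F45_FRAC_BITS) - 1); }
```

The trade is 19 wasted bits per value in exchange for naturally aligned loads and stores with ordinary 64-bit hardware.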

The IBM 360/44 was an odd machine intended for real-time applications.
It had a hard wired subset of the 360's instruction set, including all
the floating point. There was a knob on the front panel you could turn
to set the number of bytes for double precision operands, with shorter
being faster. They were still stored in 64 bit doublewords, ignoring
the low bytes. I've never seen anything saying whether it was useful
in practice or people just left the knob at the default 56 bits.
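
The effect of the knob can be modelled, very roughly, as masking off the low fraction bytes of a System/360 long floating-point doubleword (sign in bit 63, 7-bit exponent below it, 56-bit fraction in the low seven bytes). This is only a sketch of "ignoring the low bytes"; how the 360/44 hardware actually handled the dropped digits is not described here, and `truncate_fraction` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the sign/exponent byte and the top `frac_bytes` bytes of the
   56-bit fraction; clear the remaining low-order fraction bytes.
   frac_bytes == 7 leaves the doubleword unchanged (full 56-bit fraction). */
static inline uint64_t truncate_fraction(uint64_t dword, unsigned frac_bytes)
{
    assert(frac_bytes >= 1 && frac_bytes <= 7);
    unsigned dropped_bits = 8u * (7u - frac_bytes);
    uint64_t mask = ~((UINT64_C(1) << dropped_bits) - 1);
    return dword & mask;
}
```

The storage format stays a 64-bit doubleword in every setting; only the number of fraction bytes the arithmetic looks at changes.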
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: Solving the Floating-Point Conundrum

<9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34107&group=comp.arch#34107
X-Received: by 2002:ac8:7c4f:0:b0:417:974f:5631 with SMTP id o15-20020ac87c4f000000b00417974f5631mr195196qtv.2.1694983358906;
Sun, 17 Sep 2023 13:42:38 -0700 (PDT)
X-Received: by 2002:a9d:6b90:0:b0:6af:9f8b:c606 with SMTP id
b16-20020a9d6b90000000b006af9f8bc606mr2319666otq.0.1694983358689; Sun, 17 Sep
2023 13:42:38 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 17 Sep 2023 13:42:38 -0700 (PDT)
In-Reply-To: <ue7nkh$ne0$1@gal.iecc.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:bc30:160b:97b3:1ffb;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:bc30:160b:97b3:1ffb
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com> <f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com>
<ue788u$4u5l$1@newsreader4.netcologne.de> <ue7nkh$ne0$1@gal.iecc.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Sun, 17 Sep 2023 20:42:38 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3572
 by: MitchAlsup - Sun, 17 Sep 2023 20:42 UTC

On Sunday, September 17, 2023 at 3:30:19 PM UTC-5, John Levine wrote:
> According to Thomas Koenig <tko...@netcologne.de>:
> >> That's not a power-of-two length, so how do I keep using these numbers both
> >> efficient and simple?
> >
> >Make the architecture byte-addressable, with another width for the
> >bytes; possible choices are 6 and 9.
> I'm pretty sure the world has spoken and we are going to use 8-bit
> bytes forever. I liked the PDP-8 and PDP-10 but they are, you know, dead.
<
In addition, the world has spoken and little endian also won.
<
> >Then make your architecture capable of misaligned loads and stores
> >and an extra floating point format, maybe 45 bits, with 9 bits
> >exponent and 36 bits of significand.
<
> If you're worried about performance, use your 45 bit format and store
> it in a 64 bit word.
<
In 1985 one could get a decent 32-bit pipelined RISC architecture in 1cm^2.
Today this design fits in < 0.1mm^2, or you can make a GBOoO version in < 2mm^2.
<
And you really need 5mm^2 to get enough pins on the part to feed what you
can put inside; 7mm^2 makes even more sense on pins versus perf.
<
So, why are you catering to ANY bit counts less than 64 ??
Intel has versions with 512-bit data paths; GPUs generally use 1024 bits in
and 1024 bits out per cycle continuously per shader core.
<
It is no longer 1990; adjust your thinking to the modern realities of our time !
>
> The IBM 360/44 was an odd machine intended for real-time applications.
> It had a hard wired subset of the 360's instruction set, including all
> the floating point. There was a knob on the front panel you could turn
> to set the number of bytes for double precision operands, with shorter
> being faster. They were still stored in 64 bit doublewords, ignoring
> the low bytes. I've never seen anything saying whether it was useful
> in practice or people just left the knob at the default 56 bits.
> --
> Regards,
> John Levine, jo...@taugh.com, Primary Perpetrator of "The Internet for Dummies",
> Please consider the environment before reading this e-mail. https://jl.ly

Re: Solving the Floating-Point Conundrum

<e15a4faa-ac4c-4297-bb36-fb0cfcb8e631n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34108&group=comp.arch#34108
X-Received: by 2002:a05:620a:2605:b0:76d:86b1:ece8 with SMTP id z5-20020a05620a260500b0076d86b1ece8mr158315qko.12.1694992023547;
Sun, 17 Sep 2023 16:07:03 -0700 (PDT)
X-Received: by 2002:a05:6830:1d3:b0:6b9:a422:9f with SMTP id
r19-20020a05683001d300b006b9a422009fmr2448995ota.1.1694992023131; Sun, 17 Sep
2023 16:07:03 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 17 Sep 2023 16:07:02 -0700 (PDT)
In-Reply-To: <9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=136.50.14.162; posting-account=AoizIQoAAADa7kQDpB0DAj2jwddxXUgl
NNTP-Posting-Host: 136.50.14.162
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com> <f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com>
<ue788u$4u5l$1@newsreader4.netcologne.de> <ue7nkh$ne0$1@gal.iecc.com> <9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <e15a4faa-ac4c-4297-bb36-fb0cfcb8e631n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: jim.brakefield@ieee.org (JimBrakefield)
Injection-Date: Sun, 17 Sep 2023 23:07:03 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 4688
 by: JimBrakefield - Sun, 17 Sep 2023 23:07 UTC

On Sunday, September 17, 2023 at 3:42:40 PM UTC-5, MitchAlsup wrote:
> On Sunday, September 17, 2023 at 3:30:19 PM UTC-5, John Levine wrote:
> > According to Thomas Koenig <tko...@netcologne.de>:
> > >> That's not a power-of-two length, so how do I keep using these numbers both
> > >> efficient and simple?
> > >
> > >Make the architecture byte-addressable, with another width for the
> > >bytes; possible choices are 6 and 9.
> > I'm pretty sure the world has spoken and we are going to use 8-bit
> > bytes forever. I liked the PDP-8 and PDP-10 but they are, you know, dead.
> <
> In addition, the world has spoken and little endian also won.
> <
> > >Then make your architecture capable of misaligned loads and stores
> > >and an extra floating point format, maybe 45 bits, with 9 bits
> > >exponent and 36 bits of significand.
> <
> > If you're worried about performance, use your 45 bit format and store
> > it in a 64 bit word.
> <
> In 1985 one could get a decent 32-bit pipelined RISC architecture in 1cm^2.
> Today this design fits in < 0.1mm^2, or you can make a GBOoO version in < 2mm^2.
> <
> And you really need 5mm^2 to get enough pins on the part to feed what you
> can put inside; 7mm^2 makes even more sense on pins versus perf.
> <
> So, why are you catering to ANY bit counts less than 64 ??
> Intel has versions with 512-bit data paths; GPUs generally use 1024 bits in
> and 1024 bits out per cycle continuously per shader core.
> <
> It is no longer 1990; adjust your thinking to the modern realities of our time !
> >
> > The IBM 360/44 was an odd machine intended for real-time applications.
> > It had a hard wired subset of the 360's instruction set, including all
> > the floating point. There was a knob on the front panel you could turn
> > to set the number of bytes for double precision operands, with shorter
> > being faster. They were still stored in 64 bit doublewords, ignoring
> > the low bytes. I've never seen anything saying whether it was useful
> > in practice or people just left the knob at the default 56 bits.
> > --
> > Regards,
> > John Levine, jo...@taugh.com, Primary Perpetrator of "The Internet for Dummies",
> > Please consider the environment before reading this e-mail. https://jl.ly

Ugh
|> It is no longer 1990; adjust your thinking to the modern realities of our time

So why do we still design register files of only 64 bit registers?

For an example ISA, assume sixteen 1024-bit registers
with 32-bit instructions specifying register sub-fields of 1024, 512, 256, 128, 64 and even 32 bits
(four-bit register selectors and zero- to five-bit sub-field selectors for two-operand, one-result RISC instructions).
The bigger the subfield, the greater the effective IPC!
Load and store instructions are burdened with alignment, mask setup and other duties.

All in all, a very basic RISC with large registers and potential for good performance,
to the extent that the problem software can make effective use of the large registers,
natural alignment of data and wide ALUs.
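
A rough bit budget for the encoding described above, as a C sketch. It assumes each of the three operands (two sources, one result) carries its own 4-bit register selector and its own sub-field selector; the post does not say whether selectors are per operand or per instruction, so that is an assumption, and the function names are hypothetical.

```c
#include <assert.h>

/* Selector width needed to pick one sub-field of `size_bits` out of a
   1024-bit register: log2(1024 / size_bits), i.e. 0 bits for 1024 and
   5 bits for 32. */
static unsigned subfield_selector_bits(unsigned size_bits)
{
    assert(size_bits >= 32 && size_bits <= 1024);
    assert((size_bits & (size_bits - 1)) == 0);   /* power of two */
    unsigned bits = 0;
    for (unsigned s = size_bits; s < 1024; s <<= 1)
        bits++;
    return bits;
}

/* Bits consumed by three operand specifiers under the per-operand assumption. */
static unsigned operand_field_bits(unsigned size_bits)
{
    return 3u * (4u + subfield_selector_bits(size_bits));
}

/* Bits left in a 32-bit instruction for opcode, operand size, and anything else. */
static unsigned remaining_bits(unsigned size_bits)
{
    return 32u - operand_field_bits(size_bits);
}
```

Under that assumption, 32-bit sub-fields use 27 of the 32 instruction bits for operands and leave 5 for everything else, while full 1024-bit operands leave 20.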

Re: Solving the Floating-Point Conundrum

<ue84ep$jmun$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=34109&group=comp.arch#34109
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: sfuld@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Solving the Floating-Point Conundrum
Date: Sun, 17 Sep 2023 17:08:57 -0700
Organization: A noiseless patient Spider
Lines: 29
Message-ID: <ue84ep$jmun$1@dont-email.me>
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com>
<f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com>
<ue788u$4u5l$1@newsreader4.netcologne.de> <ue7nkh$ne0$1@gal.iecc.com>
<9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com>
<e15a4faa-ac4c-4297-bb36-fb0cfcb8e631n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 18 Sep 2023 00:08:57 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="8f6f76c340acaa31dca1af5e06d1d21e";
logging-data="646103"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/XdZsN0d4ni+v0oLVWzuZ6M8vAgXAnaSo="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:09VXRT9+VcYUXtCaxiybQa821xE=
Content-Language: en-US
In-Reply-To: <e15a4faa-ac4c-4297-bb36-fb0cfcb8e631n@googlegroups.com>
 by: Stephen Fuld - Mon, 18 Sep 2023 00:08 UTC

On 9/17/2023 4:07 PM, JimBrakefield wrote:

snip

> So why do we still design register files of only 64 bit registers?
>
> For an example ISA, assume sixteen 1024 bit registers
> with 32-bit instructions specifying register sub-fields of 1024, 512, 256, 128, 64 and even 32-bits.
> (four bit register selectors and zero to five bit sub-field selectors for a two operand, one result RISC instructions)

I am not sure what you want here. Are the sub field selectors per
operand or per instruction? If you specify say a 64 bit subfield, does
that mean use only the low order 64 bits of the register(s), or is it
like a SIMD thing where you treat the registers as 16 fields, each 64
bits, or something else?
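
The two readings asked about here can be written out in C to make the difference concrete. This is only an illustrative model with hypothetical names (`reg1024`, `add64_field`, `add64_simd`); it is not a description of the proposed ISA.

```c
#include <assert.h>
#include <stdint.h>

/* A 1024-bit register viewed as sixteen 64-bit fields; lane[0] is the
   low-order 64 bits. */
typedef struct { uint64_t lane[16]; } reg1024;

/* Reading 1: the instruction touches a single 64-bit field per operand.
   With all field indices 0 this is "use only the low-order 64 bits";
   non-zero indices model a per-operand sub-field selector. */
static void add64_field(reg1024 *rd, unsigned fd,
                        const reg1024 *ra, unsigned fa,
                        const reg1024 *rb, unsigned fb)
{
    assert(fd < 16 && fa < 16 && fb < 16);
    rd->lane[fd] = ra->lane[fa] + rb->lane[fb];
}

/* Reading 2: SIMD style, the same 64-bit add applied to all 16 fields. */
static void add64_simd(reg1024 *rd, const reg1024 *ra, const reg1024 *rb)
{
    for (int i = 0; i < 16; i++)
        rd->lane[i] = ra->lane[i] + rb->lane[i];
}
```

The first reading does one add per instruction and leaves the other fifteen fields of the destination alone; the second does sixteen adds per instruction but needs the data laid out lane by lane.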

> The bigger the subfield, the greater the effective IPC!
> Load and store instructions burdened with alignment, mask setup and other duties.
>
> All in all, a very basic RISC with large registers and potential for good performance,
> to the extent that the problem software can make effective use of the large registers,
> natural alignment of data and wide ALUs.
>

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: memory speeds, Solving the Floating-Point Conundrum

<049219b5-8319-4374-9a66-91276c99c634n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34110&group=comp.arch#34110
X-Received: by 2002:a05:622a:2593:b0:417:b53c:5d4c with SMTP id cj19-20020a05622a259300b00417b53c5d4cmr44122qtb.1.1694996031594;
Sun, 17 Sep 2023 17:13:51 -0700 (PDT)
X-Received: by 2002:a05:6808:21aa:b0:3a8:43ed:ce9c with SMTP id
be42-20020a05680821aa00b003a843edce9cmr3409934oib.1.1694996031313; Sun, 17
Sep 2023 17:13:51 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 17 Sep 2023 17:13:51 -0700 (PDT)
In-Reply-To: <DQ0NM.794$AfZe.536@fx45.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:bc30:160b:97b3:1ffb;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:bc30:160b:97b3:1ffb
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<udspsq$27b0q$1@dont-email.me> <qrmMM.7$5jrd.6@fx06.iad> <udu7us$3v2e2$1@newsreader4.netcologne.de>
<ue0esp$ps2$1@gal.iecc.com> <TG_MM.3194$H0Ge.3155@fx05.iad>
<2332d098-8ff4-496c-85cb-502ddf501054n@googlegroups.com> <DQ0NM.794$AfZe.536@fx45.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <049219b5-8319-4374-9a66-91276c99c634n@googlegroups.com>
Subject: Re: memory speeds, Solving the Floating-Point Conundrum
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Mon, 18 Sep 2023 00:13:51 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2185
 by: MitchAlsup - Mon, 18 Sep 2023 00:13 UTC

On Friday, September 15, 2023 at 12:51:36 PM UTC-5, EricP wrote:
>
> Considerations in the design of a computer with high
> logic-to-memory speed ratio 1962
> https://archive.computerhistory.org/resources/access/text/2020/10/102714096-05-01-acc.pdf
<
Thank you for this reference. James Thornton really had his finger on the pulse
of computer design--almost every problem he talks about in this reference still
is an actual problem on our most modern (and fastest) designs.
<
His analysis of wire delay and clock skew and having a library of fixed logic
building blocks mirrors advanced VLSI of today (except for the tooling between
concept and layout, i.e. Verilog synthesis).

Re: Solving the Floating-Point Conundrum

<eba1a354-a52d-46b0-a59b-b9f2e406f9cfn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34112&group=comp.arch#34112
X-Received: by 2002:a05:620a:24d:b0:76d:c9c7:dd6b with SMTP id q13-20020a05620a024d00b0076dc9c7dd6bmr169129qkn.3.1694996481307;
Sun, 17 Sep 2023 17:21:21 -0700 (PDT)
X-Received: by 2002:a05:6870:c989:b0:1b0:4e46:7f13 with SMTP id
hi9-20020a056870c98900b001b04e467f13mr3865168oab.2.1694996481036; Sun, 17 Sep
2023 17:21:21 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 17 Sep 2023 17:21:20 -0700 (PDT)
In-Reply-To: <e15a4faa-ac4c-4297-bb36-fb0cfcb8e631n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:bc30:160b:97b3:1ffb;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:bc30:160b:97b3:1ffb
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com> <f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com>
<ue788u$4u5l$1@newsreader4.netcologne.de> <ue7nkh$ne0$1@gal.iecc.com>
<9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com> <e15a4faa-ac4c-4297-bb36-fb0cfcb8e631n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <eba1a354-a52d-46b0-a59b-b9f2e406f9cfn@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Mon, 18 Sep 2023 00:21:21 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3317
 by: MitchAlsup - Mon, 18 Sep 2023 00:21 UTC

On Sunday, September 17, 2023 at 6:07:05 PM UTC-5, JimBrakefield wrote:
> On Sunday, September 17, 2023 at 3:42:40 PM UTC-5, MitchAlsup wrote:
> > <
> > It is no longer 1990; adjust your thinking to the modern realities of our time !
>
> |> It is no longer 1990; adjust your thinking to the modern realities of our time
<
> So why do we still design register files of only 64 bit registers?
>
> For an example ISA, assume sixteen 1024 bit registers
<
Basically there is a paradigm whereby 1 register contains 1 value.
Do you want 1024-bit integers, 1024-bit pointers, and 1024-bit FP values ??
If yes, make the RF be 1024 bits in size; else don't.
<
> with 32-bit instructions specifying register sub-fields of 1024, 512, 256, 128, 64 and even 32-bits.
<
Where do the encoding bits come from ?? The beauty of a 64-bit ISA is that encoding
the operand sizes takes 0 bits (except in LD/ST and possibly FP).
<
> (four bit register selectors and zero to five bit sub-field selectors for a two operand, one result RISC instructions)
<
How do you specify 16-bits and 8-bits ??
<
> The bigger the subfield, the greater the effective IPC!
<
Does this work for LISP programs ??
<
> Load and store instructions burdened with alignment, mask setup and other duties.
<
So, you have to set yourself up to read 1024-bits × sets + addressSize×2 and then
use only 8 to 64 bits most of the time. This is a great waste of power.
>
> All in all, a very basic RISC with large registers and potential for good performance,
> to the extent that the problem software can make effective use of the large registers,
> natural alignment of data and wide ALUs.

Re: Solving the Floating-Point Conundrum

<df353284-3c0f-4e79-84dd-79d8246bd95bn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34113&group=comp.arch#34113
X-Received: by 2002:a05:622a:104f:b0:412:2f98:2b96 with SMTP id f15-20020a05622a104f00b004122f982b96mr192174qte.8.1694996798254;
Sun, 17 Sep 2023 17:26:38 -0700 (PDT)
X-Received: by 2002:a05:6830:44aa:b0:6bd:9ca2:51ba with SMTP id
r42-20020a05683044aa00b006bd9ca251bamr4119220otv.2.1694996797982; Sun, 17 Sep
2023 17:26:37 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 17 Sep 2023 17:26:37 -0700 (PDT)
In-Reply-To: <ue84ep$jmun$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:bc30:160b:97b3:1ffb;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:bc30:160b:97b3:1ffb
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com> <f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com>
<ue788u$4u5l$1@newsreader4.netcologne.de> <ue7nkh$ne0$1@gal.iecc.com>
<9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com> <e15a4faa-ac4c-4297-bb36-fb0cfcb8e631n@googlegroups.com>
<ue84ep$jmun$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <df353284-3c0f-4e79-84dd-79d8246bd95bn@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Mon, 18 Sep 2023 00:26:38 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3170
 by: MitchAlsup - Mon, 18 Sep 2023 00:26 UTC

On Sunday, September 17, 2023 at 7:09:01 PM UTC-5, Stephen Fuld wrote:
> On 9/17/2023 4:07 PM, JimBrakefield wrote:
>
> snip
> > So why do we still design register files of only 64 bit registers?
> >
> > For an example ISA, assume sixteen 1024 bit registers
> > with 32-bit instructions specifying register sub-fields of 1024, 512, 256, 128, 64 and even 32-bits.
> > (four bit register selectors and zero to five bit sub-field selectors for a two operand, one result RISC instructions)
<
> I am not sure what you want here. Are the sub field selectors per
> operand or per instruction? If you specify say a 64 bit subfield, does
> that mean use only the low order 64 bits of the register(s), or is it
> like a SIMD thing where you treat the registers as 16 fields, each 64
> bits, or something else?
<
How would you encode::
<
uint8_t c[];
uint16_t h[];
uint32_t w[];
for( int j = 0; j < wMAX; j++ )
{
    for( int i = 0; i < hMAX; i++ )
        h[i] = c[i] * hMULT;
    w[j] = h[j+hMAX/2]*c[j+hMAX/2+7];
}
<
?????
<
> > The bigger the subfield, the greater the effective IPC!
> > Load and store instructions burdened with alignment, mask setup and other duties.
> >
> > All in all, a very basic RISC with large registers and potential for good performance,
> > to the extent that the problem software can make effective use of the large registers,
> > natural alignment of data and wide ALUs.
> >
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: Solving the Floating-Point Conundrum

<069cbff0-1e4b-424a-b43d-439ffa0814b0n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34114&group=comp.arch#34114

Newsgroups: comp.arch
Date: Sun, 17 Sep 2023 17:30:46 -0700 (PDT)
In-Reply-To: <ue84ep$jmun$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=136.50.14.162; posting-account=AoizIQoAAADa7kQDpB0DAj2jwddxXUgl
NNTP-Posting-Host: 136.50.14.162
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com> <f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com>
<ue788u$4u5l$1@newsreader4.netcologne.de> <ue7nkh$ne0$1@gal.iecc.com>
<9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com> <e15a4faa-ac4c-4297-bb36-fb0cfcb8e631n@googlegroups.com>
<ue84ep$jmun$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <069cbff0-1e4b-424a-b43d-439ffa0814b0n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: jim.brakefield@ieee.org (JimBrakefield)
Injection-Date: Mon, 18 Sep 2023 00:30:47 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3579
 by: JimBrakefield - Mon, 18 Sep 2023 00:30 UTC

On Sunday, September 17, 2023 at 7:09:01 PM UTC-5, Stephen Fuld wrote:
> On 9/17/2023 4:07 PM, JimBrakefield wrote:
>
> snip
> > So why do we still design register files of only 64 bit registers?
> >
> > For an example ISA, assume sixteen 1024 bit registers
> > with 32-bit instructions specifying register sub-fields of 1024, 512, 256, 128, 64 and even 32-bits.
> > (four bit register selectors and zero to five bit sub-field selectors for a two operand, one result RISC instructions)
> I am not sure what you want here. Are the sub field selectors per
> operand or per instruction? If you specify say a 64 bit subfield, does
> that mean use only the low order 64 bits of the register(s), or is it
> like a SIMD thing where you treat the registers as 16 fields, each 64
> bits, or something else?
> > The bigger the subfield, the greater the effective IPC!
> > Load and store instructions burdened with alignment, mask setup and other duties.
> >
> > All in all, a very basic RISC with large registers and potential for good performance,
> > to the extent that the problem software can make effective use of the large registers,
> > natural alignment of data and wide ALUs.
> >
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Ugh,
|> Are the sub field selectors per
|> operand or per instruction? If you specify say a 64 bit subfield, does
|> that mean use only the low order 64 bits of the register(s), or is it
|> like a SIMD thing where you treat the registers as 16 fields, each 64
|> bits, or something else?

Haven't done an actual ISA encoding, so the details are fluid.
The idea is that there are instructions with power-of-two operand sizes from 64 to 1024 bits.
So with 64-bit operands one can do a single 64-bit operation on any of the 256 possible 64-bit sub-fields.
Likewise, one can do a 1024-bit SIMD operation on any of the 16 "registers".
Thus one can still do 64-bit operations or, if the algorithm allows, wider operations at greater IPC.
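
A rough C model of the register file described above may make the two cases concrete. This is only a sketch under assumptions: the names RegFile, add64 and add1024 are invented here, the 1024-bit SIMD operation is assumed to be sixteen independent 64-bit adds, and nothing is implied about the actual instruction encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: 16 registers x 1024 bits, viewed as 16 lanes of 64 bits. */
enum { NUM_REGS = 16, LANES = 16 };   /* 16 lanes * 64 bits = 1024 bits */

typedef struct {
    uint64_t lane[NUM_REGS][LANES];
} RegFile;

/* Scalar form: one 64-bit add on any of the 16 * 16 = 256 sub-fields.
   Each operand is named by a (register, lane) pair. */
void add64(RegFile *rf, int rd, int ld, int ra, int la, int rb, int lb)
{
    assert(rd >= 0 && rd < NUM_REGS && ld >= 0 && ld < LANES);
    assert(ra >= 0 && ra < NUM_REGS && la >= 0 && la < LANES);
    assert(rb >= 0 && rb < NUM_REGS && lb >= 0 && lb < LANES);
    rf->lane[rd][ld] = rf->lane[ra][la] + rf->lane[rb][lb];
}

/* SIMD form (assumed semantics): the same add across all 16 lanes of
   whole 1024-bit registers. */
void add1024(RegFile *rf, int rd, int ra, int rb)
{
    assert(rd >= 0 && rd < NUM_REGS);
    assert(ra >= 0 && ra < NUM_REGS);
    assert(rb >= 0 && rb < NUM_REGS);
    for (int i = 0; i < LANES; i++)
        rf->lane[rd][i] = rf->lane[ra][i] + rf->lane[rb][i];
}
```

In this model one add1024 does the work of sixteen add64 calls, which is where the "greater IPC" for wider sub-fields would come from.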

Re: Solving the Floating-Point Conundrum

<857c5bf9-5281-4f85-b889-274326ce9507n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34115&group=comp.arch#34115

Newsgroups: comp.arch
Date: Sun, 17 Sep 2023 17:42:44 -0700 (PDT)
In-Reply-To: <df353284-3c0f-4e79-84dd-79d8246bd95bn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=136.50.14.162; posting-account=AoizIQoAAADa7kQDpB0DAj2jwddxXUgl
NNTP-Posting-Host: 136.50.14.162
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com> <f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com>
<ue788u$4u5l$1@newsreader4.netcologne.de> <ue7nkh$ne0$1@gal.iecc.com>
<9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com> <e15a4faa-ac4c-4297-bb36-fb0cfcb8e631n@googlegroups.com>
<ue84ep$jmun$1@dont-email.me> <df353284-3c0f-4e79-84dd-79d8246bd95bn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <857c5bf9-5281-4f85-b889-274326ce9507n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: jim.brakefield@ieee.org (JimBrakefield)
Injection-Date: Mon, 18 Sep 2023 00:42:44 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 4185
 by: JimBrakefield - Mon, 18 Sep 2023 00:42 UTC

On Sunday, September 17, 2023 at 7:26:40 PM UTC-5, MitchAlsup wrote:
> On Sunday, September 17, 2023 at 7:09:01 PM UTC-5, Stephen Fuld wrote:
> > On 9/17/2023 4:07 PM, JimBrakefield wrote:
> >
> > snip
> > > So why do we still design register files of only 64 bit registers?
> > >
> > > For an example ISA, assume sixteen 1024 bit registers
> > > with 32-bit instructions specifying register sub-fields of 1024, 512, 256, 128, 64 and even 32-bits.
> > > (four bit register selectors and zero to five bit sub-field selectors for a two operand, one result RISC instructions)
> <
> > I am not sure what you want here. Are the sub field selectors per
> > operand or per instruction? If you specify say a 64 bit subfield, does
> > that mean use only the low order 64 bits of the register(s), or is it
> > like a SIMD thing where you treat the registers as 16 fields, each 64
> > bits, or something else?
> <
> How would you encode::
> <
> uint8_t c[];
> uint16_t h[];
> uint32_t w[];
> for( int j = 0; j < wMAX; j++ )
> {
> for( int i = 0; i < hMAX; i++ )
> h[i] = c[i] * hMULT;
> w[j] = h[j+hMAX/2]*c[j+hMAX/2+7];
> }
> <
> ?????
> <
> > > The bigger the subfield, the greater the effective IPC!
> > > Load and store instructions burdened with alignment, mask setup and other duties.
> > >
> > > All in all, a very basic RISC with large registers and potential for good performance,
> > > to the extent that the problem software can make effective use of the large registers,
> > > natural alignment of data and wide ALUs.
> > >
> > --
> > - Stephen Fuld
> > (e-mail address disguised to prevent spam)

Ugh
> How would you encode::
> uint8_t c[];
> uint16_t h[];
> uint32_t w[];
> for( int j = 0; j < wMAX; j++ )
> {
> for( int i = 0; i < hMAX; i++ )
> h[i] = c[i] * hMULT;
> w[j] = h[j+hMAX/2]*c[j+hMAX/2+7];
> }
The load and store instructions would specify the memory data sizes, the data register sizes and a mask.
The mask implies the number of uint8, uint16 and uint32 in their respective load instructions.
The load instruction would probably also specify the subfield size which would need to be at least as large as the number of bits set in the mask.
Frequently the mask would be an immediate value following the instruction.
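
As a rough illustration of the masked load described above, the uint8-to-uint16 case might behave like the C below. It is a sketch, not an encoding: the names load_u8_to_u16, mul_u16_scalar and H_LANES are made up here, and it is assumed that lanes not selected by the mask keep their old contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 1024-bit register viewed as 64 lanes of 16 bits (assumed layout). */
enum { REG_BITS = 1024, H_LANES = REG_BITS / 16 };

/* Hypothetical widening load: memory elements are 8 bits, register lanes
   are 16 bits. Bit i of mask selects whether lane i is loaded from mem[i];
   unselected lanes are left unchanged and mem[i] is not read for them.
   Returns the number of lanes loaded. */
int load_u8_to_u16(uint16_t lanes[H_LANES], const uint8_t *mem, uint64_t mask)
{
    assert(lanes != NULL && mem != NULL);
    int loaded = 0;
    for (int i = 0; i < H_LANES; i++) {
        if ((mask >> i) & 1u) {
            lanes[i] = mem[i];          /* zero-extend 8 -> 16 bits */
            loaded++;
        }
    }
    return loaded;
}

/* Lane-wise multiply by a scalar under the same mask, as in
   h[i] = c[i] * hMULT from the quoted loop. The product wraps to 16 bits. */
void mul_u16_scalar(uint16_t lanes[H_LANES], uint16_t k, uint64_t mask)
{
    assert(lanes != NULL);
    for (int i = 0; i < H_LANES; i++)
        if ((mask >> i) & 1u)
            lanes[i] = (uint16_t)((uint32_t)lanes[i] * k);
}
```

With a 64-bit mask one such load covers up to 64 elements of c[], so the inner loop of the example would become one load, one multiply and one store per 64 elements, plus a shorter mask for the tail.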

Am posting a "fresh" idea for discussion. Any student could probably do a rough implementation in short order.

Re: Solving the Floating-Point Conundrum

<ff7eb2f8-38f6-43bf-a9a2-a9149a5ea678n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34116&group=comp.arch#34116

Newsgroups: comp.arch
Date: Sun, 17 Sep 2023 18:38:22 -0700 (PDT)
In-Reply-To: <9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa34:c000:dd6:f67:e3ae:cf11;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa34:c000:dd6:f67:e3ae:cf11
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com> <f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com>
<ue788u$4u5l$1@newsreader4.netcologne.de> <ue7nkh$ne0$1@gal.iecc.com> <9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ff7eb2f8-38f6-43bf-a9a2-a9149a5ea678n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: jsavard@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 18 Sep 2023 01:38:23 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: Quadibloc - Mon, 18 Sep 2023 01:38 UTC

On Sunday, September 17, 2023 at 2:42:40 PM UTC-6, MitchAlsup wrote:

> In addition, the world has spoken and little endian also won.

Little-endian has won, but I really don't think it's because anyone
really asked for it. The PDP-11, the 6502, and the 8080 just happened
to be little-endian, but were very popular for other reasons entirely.

The PDP-11 was a minicomputer which was powerful and cheap because
of a more modern architecture than other minis.

The 6502 was inexpensive compared to the 6800 and the 8080.

The 8080 was first.

John Savard

Re: Solving the Floating-Point Conundrum

<1e77a6dc-8515-4603-a24c-21d60113e6bcn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34117&group=comp.arch#34117

Newsgroups: comp.arch
Date: Sun, 17 Sep 2023 18:43:25 -0700 (PDT)
In-Reply-To: <memo.20230917185814.16292G@jgd.cix.co.uk>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa34:c000:dd6:f67:e3ae:cf11;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa34:c000:dd6:f67:e3ae:cf11
References: <ue788u$4u5l$1@newsreader4.netcologne.de> <memo.20230917185814.16292G@jgd.cix.co.uk>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1e77a6dc-8515-4603-a24c-21d60113e6bcn@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: jsavard@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 18 Sep 2023 01:43:26 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2241
 by: Quadibloc - Mon, 18 Sep 2023 01:43 UTC

On Sunday, September 17, 2023 at 11:58:18 AM UTC-6, John Dallman wrote:
> No
> architecture can do everything efficiently.

Although that is doubtless true, I have been trying to think of
how to make *stride* more efficient, the way that the Burroughs
Scientific Processor did. But trying to make a version of the BSP
that works for different widths of data seems to be a route to
excessive complexity.

One thought I had was to have 17 memory banks for all widths,
and just use the memory in each bank for either 72-bit floats or
twice as many 36-bit floats. That would waste memory bus
bandwidth in the shorter case.

Another thought was to get rid of the complexity of the BSP,
and just use 16-channel memory. Programmers can simply
remember to put their 2048 x 2048 matrices inside 2048 x 2049
(or 2049 x 2048 for some dialects of FORTRAN) arrays. That
could easily be adapted to work with 36-bit floats as well just by
having twice as many address buses.
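
To see why the padding helps, here is a small C sketch of the bank arithmetic. It assumes word-interleaved banks (bank = word address mod number of banks) and a strided walk through the matrix starting at address 0; the function name banks_touched is invented for the illustration.

```c
#include <assert.h>

/* Count how many distinct banks are touched by `count` accesses that start
   at word address 0 and advance by `stride` words, with word-interleaved
   banks (bank = address % nbanks). nbanks must be in 1..64. */
int banks_touched(unsigned long stride, int nbanks, int count)
{
    assert(nbanks >= 1 && nbanks <= 64);
    unsigned char seen[64] = {0};
    int distinct = 0;
    unsigned long addr = 0;
    for (int i = 0; i < count; i++, addr += stride) {
        int bank = (int)(addr % (unsigned long)nbanks);
        if (!seen[bank]) {
            seen[bank] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

With 16 banks, a stride of 2048 words sends all 16 accesses to the same bank, while a stride of 2049 spreads them over all 16. With 17 banks, as in the BSP, the unpadded stride of 2048 already touches all 17, because 17 is prime and does not divide 2048.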

John Savard

Re: Solving the Floating-Point Conundrum

<5fe1a043-120a-4974-a25b-73215c975394n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34118&group=comp.arch#34118

Newsgroups: comp.arch
Date: Sun, 17 Sep 2023 18:46:41 -0700 (PDT)
In-Reply-To: <43901a10-4859-43d7-b500-70030047c8b2n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa34:c000:dd6:f67:e3ae:cf11;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa34:c000:dd6:f67:e3ae:cf11
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<a0dd4fb4-d708-48ae-9764-3ce5e24aec0cn@googlegroups.com> <5fa92a78-d27c-4dff-a3dc-35ee7b43cbfan@googlegroups.com>
<c9131381-2e9b-4008-bc43-d4df4d4d8ab4n@googlegroups.com> <edb0d2c4-1689-44b4-ae81-5ab1ef234f8en@googlegroups.com>
<43901a10-4859-43d7-b500-70030047c8b2n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <5fe1a043-120a-4974-a25b-73215c975394n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: jsavard@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 18 Sep 2023 01:46:41 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2732
 by: Quadibloc - Mon, 18 Sep 2023 01:46 UTC

On Thursday, September 14, 2023 at 11:06:36 PM UTC-6, Quadibloc wrote:

> One can have a vector register composed of a number of 72-bit registers,
> and one can have it handle twice as many 36-bit values by having one in
> each half of the register. That would make for rapid transfers to and from
> memory that are simple conceptually.
>
> If one does it that way, one could handle intermediate precision by doubling
> the width of the individual portions of the vector register - so that it's made
> up of 144-bit registers that can hold two double-precision numbers or four
> single-precision numbers. That could also hold three intermediate precision
> numbers... that are 48 bits in length.

Here, I very definitely am drawing heavily on the design of one specific
classic computer.

It was the Integrated Scientific Processor that was offered for the Univac 1100/90
which offered a Cray-like architecture with vector registers, but which, in addition,
not only allowed the vector registers to be used for 36-bit floats as well as 72-bit
floats, but also let twice as many values be packed into the vectors in that case.

John Savard

Re: Solving the Floating-Point Conundrum

<ue97ko$1n934$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=34119&group=comp.arch#34119

From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Solving the Floating-Point Conundrum
Date: Mon, 18 Sep 2023 12:09:28 +0200
Organization: A noiseless patient Spider
Lines: 79
Message-ID: <ue97ko$1n934$1@dont-email.me>
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com>
<f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com>
<ue788u$4u5l$1@newsreader4.netcologne.de> <ue7nkh$ne0$1@gal.iecc.com>
<9f5be6c2-afb2-452b-bd54-314fa5bed589n@googlegroups.com>
<e15a4faa-ac4c-4297-bb36-fb0cfcb8e631n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 18 Sep 2023 10:09:28 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="9d83be33995c896a9ef76c9addc0391e";
logging-data="1811556"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/YZec+Fy2NnztpV6O5aDhO5crD/jOAhSqixVxU2PS7DQ=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.17
Cancel-Lock: sha1:JCgxSaChJ617eGvQI1jmctX9e/g=
In-Reply-To: <e15a4faa-ac4c-4297-bb36-fb0cfcb8e631n@googlegroups.com>
 by: Terje Mathisen - Mon, 18 Sep 2023 10:09 UTC

JimBrakefield wrote:
> On Sunday, September 17, 2023 at 3:42:40 PM UTC-5, MitchAlsup wrote:
>> On Sunday, September 17, 2023 at 3:30:19 PM UTC-5, John Levine wrote:
>>> According to Thomas Koenig <tko...@netcologne.de>:
>>>>> That's not a power-of-two length, so how do I keep using these numbers both
>>>>> efficient and simple?
>>>>
>>>> Make the architecture byte-addressable, with another width for the
>>>> bytes; possible choices are 6 and 9.
>>> I'm pretty sure the world has spoken and we are going to use 8-bit
>>> bytes forever. I liked the PDP-8 and PDP-10 but they are, you know, dead.
>> <
>> In addition, the world has spoken and little endian also won.
>> <
>>>> Then make your architecture capable of misaligned loads and stores
>>>> and an extra floating point format, maybe 45 bits, with 9 bits
>>>> exponent and 36 bits of significand.
>> <
>>> If you're worried about performance, use your 45 bit format and store
>>> it in a 64 bit word.
>> <
>> In 1985 one could get a decent 32-bit pipelined RISC architecture in 1cm^2.
>> Today this design is < 0.1mm^2 or you can make a GBOoO version < 2mm^2.
>> <
>> And you really need 5mm^2 to get enough pins on the part to feed what you
>> can put inside; 7mm^2 makes even more sense on pins versus perf.
>> <
>> So, why are you catering to ANY bit counts less than 64 ??
>> Intel has versions with 512-bit data paths, GPUs generally use 1024-bits in
>> and 1024 bits out per cycle continuously per shader core.
>> <
>> It is no longer 1990, adjust your thinking to the modern realities of our time!
>>>
>>> The IBM 360/44 was an odd machine intended for real-time applications.
>>> It had a hard wired subset of the 360's instruction set, including all
>>> the floating point. There was a knob on the front panel you could turn
>>> to set the number of bytes for double precision operands, with shorter
>>> being faster. They were still stored in 64 bit doublewords, ignoring
>>> the low bytes. I've never seen anything saying whether it was useful
>>> in practice or people just left the knob at the default 56 bits.
>>> --
>>> Regards,
>>> John Levine, jo...@taugh.com, Primary Perpetrator of "The Internet for Dummies",
>>> Please consider the environment before reading this e-mail. https://jl.ly
>
> Ugh
> |> It is no longer 1990, adjust your thinking to the modern realities of our time
>
> So why do we still design register files of only 64 bit registers?
>
> For an example ISA, assume sixteen 1024 bit registers
> with 32-bit instructions specifying register sub-fields of 1024, 512, 256, 128, 64 and even 32-bits.
> (four bit register selectors and zero to five bit sub-field selectors for a two operand, one result RISC instructions)
> The bigger the subfield, the greater the effective IPC!
> Load and store instructions burdened with alignment, mask setup and other duties.
>
> All in all, a very basic RISC with large registers and potential for good performance,
> to the extent that the problem software can make effective use of the large registers,
> natural alignment of data and wide ALUs.
>
This is more or less a description of the thinking behind Larrabee, with
the basic unit of computation being one cache line/512 bits.

History showed us quite clearly that it fell short by a factor of somewhere
between 1.5 and 3 in performance compared to the same transistor budget used
on a dedicated GPU core.

For TPC, which is what Intel have been trying to move LRB towards since
then (and calling it AVX-512) it is more believable, but here it suffers
from exploding instruction set size. Mill with register (i.e. belt slot)
metadata allowing a single instruction to work across scalar/vector and
byte/short/word/dword/qword item sizes avoids this particular problem,
as does Mitch's My 66000 VVM.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Solving the Floating-Point Conundrum

<ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34120&group=comp.arch#34120

Newsgroups: comp.arch
Date: Mon, 18 Sep 2023 07:40:59 -0700 (PDT)
In-Reply-To: <memo.20230917185814.16292G@jgd.cix.co.uk>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa34:c000:70f5:8e63:8b98:bc59;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa34:c000:70f5:8e63:8b98:bc59
References: <ue788u$4u5l$1@newsreader4.netcologne.de> <memo.20230917185814.16292G@jgd.cix.co.uk>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: jsavard@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 18 Sep 2023 14:40:59 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 32
 by: Quadibloc - Mon, 18 Sep 2023 14:40 UTC

On Sunday, September 17, 2023 at 11:58:18 AM UTC-6, John Dallman wrote:

> Quadibloc <jsa...@ecn.ab.ca> wrote:
> > By "intermediate precision" I mean 48-bit or 54-bit floating-point
> > on the computer which is built around a 36-bit word in order to
> > provide 36-bit and 72-bit floats.

> I don't understand why you feel this is valuable. Are there any
> architectures that have been used since about 1970 (when the Atlas
> machines were shut down) that provide it? Is there software suffering
> because it doesn't have it and needs to make do with double precision? No
> architecture can do everything efficiently.

Can today's modern computers perform a 64-bit FP division with only
one clock cycle of latency? I don't think so; I don't think that they can even
do 64-bit FP multiplication that quickly.

That's why I think it is valuable to have other floating-point formats
available that are smaller than 64 bits, but still large enough to be
useful for scientific computing, which 32 bits _isn't_.

Of course, though, this is less important now than in the days of the
IBM System/360 model 44. That machine's implementation of
floating-point arithmetic was such that the time of floating-point
operations was proportional to the length of the mantissa; with
today's advanced ALU designs, it's closer to being proportional to
the *logarithm* of that length, so the gains from a shorter FP format
aren't as dramatic.
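
A toy cost model in C shows the difference. It is a simplification with assumed names (serial_steps, tree_levels): the "linear" machine is taken to retire one significand bit per step, and the "logarithmic" one to sum the partial products in a binary adder tree. Real multipliers use Booth recoding and 3:2 compressors, so the absolute numbers mean little; only the trend matters.

```c
#include <assert.h>

/* Steps for an iterative multiplier that handles one significand bit
   per step: time grows linearly with the significand length. */
int serial_steps(int bits)
{
    assert(bits > 0 && bits <= 1024);
    return bits;
}

/* Levels of a binary adder tree that sums `bits` partial products:
   ceil(log2(bits)), computed by repeated doubling. */
int tree_levels(int bits)
{
    assert(bits > 0 && bits <= 1024);
    int levels = 0;
    for (int n = 1; n < bits; n *= 2)
        levels++;
    return levels;
}
```

Under this model, going from a 56-bit to a 48-bit significand saves 8 of 56 serial steps but no tree levels at all (both need 6), and even a 24-bit significand still needs 5.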

John Savard

Re: Solving the Floating-Point Conundrum

<d31df65d-a464-41d3-b3e3-232616b06789n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34121&group=comp.arch#34121

Newsgroups: comp.arch
Date: Mon, 18 Sep 2023 07:48:52 -0700 (PDT)
In-Reply-To: <ue7nkh$ne0$1@gal.iecc.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa34:c000:70f5:8e63:8b98:bc59;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa34:c000:70f5:8e63:8b98:bc59
References: <57c5e077-ac71-486c-8afa-edd6802cf6b1n@googlegroups.com>
<8a5563da-3be8-40f7-bfb9-39eb5e889c8an@googlegroups.com> <f097448b-e691-424b-b121-eab931c61d87n@googlegroups.com>
<ue788u$4u5l$1@newsreader4.netcologne.de> <ue7nkh$ne0$1@gal.iecc.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <d31df65d-a464-41d3-b3e3-232616b06789n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: jsavard@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 18 Sep 2023 14:48:52 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2316
 by: Quadibloc - Mon, 18 Sep 2023 14:48 UTC

On Sunday, September 17, 2023 at 2:30:19 PM UTC-6, John Levine wrote:

> The IBM 360/44 was an odd machine intended for real-time applications.
> It had a hard wired subset of the 360's instruction set, including all
> the floating point. There was a knob on the front panel you could turn
> to set the number of bytes for double precision operands, with shorter
> being faster. They were still stored in 64 bit doublewords, ignoring
> the low bytes. I've never seen anything saying whether it was useful
> in practice or people just left the knob at the default 56 bits.

Given that the 360/44 wasn't a 360/91 or one of its cousins, the time
required for floating-point operations was proportional to mantissa length
rather than the logarithm of mantissa length. Thus, there was at least the
possibility that it would indeed have been useful under some circumstances.

John Savard

Re: Solving the Floating-Point Conundrum

<1d55526a-afc1-4fdb-8f6a-324675d1d888n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34122&group=comp.arch#34122

Newsgroups: comp.arch
Date: Mon, 18 Sep 2023 09:08:56 -0700 (PDT)
In-Reply-To: <ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <ue788u$4u5l$1@newsreader4.netcologne.de> <memo.20230917185814.16292G@jgd.cix.co.uk>
<ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1d55526a-afc1-4fdb-8f6a-324675d1d888n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: already5chosen@yahoo.com (Michael S)
Injection-Date: Mon, 18 Sep 2023 16:08:56 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3738
 by: Michael S - Mon, 18 Sep 2023 16:08 UTC

On Monday, September 18, 2023 at 5:41:01 PM UTC+3, Quadibloc wrote:
> On Sunday, September 17, 2023 at 11:58:18 AM UTC-6, John Dallman wrote:
>
> > Quadibloc <jsa...@ecn.ab.ca> wrote:
> > > By "intermediate precision" I mean 48-bit or 54-bit floating-point
> > > on the computer which is built around a 36-bit word in order to
> > > provide 36-bit and 72-bit floats.
>
> > I don't understand why you feel this is valuable. Are there any
> > architectures that have been used since about 1970 (when the Atlas
> > machines were shut down) that provide it? Is there software suffering
> > because it doesn't have it and needs to make do with double precision? No
> > architecture can do everything efficiently.
> Can today's modern computers perform a 64-bit FP division with only
> one clock cycle of latency? I don't think so; I don't think that they can even
> do 64-bit FP multiplication that quickly.
>
> That's why I think it is valuable to have other floating-point formats
> available that are smaller than 64 bits, but still large enough to be
> useful for scientific computing, which 32 bits _isn't_.

Your logic is based on the assumption that intermediate-precision math can be
significantly faster, latency-wise, than full IEEE binary64 math. But that's
not true. At least not according to my definition of the word 'significantly'.
If IEEE binary64 fmul latency = 4 then it's unlikely that you can shrink it
to 3 for your mid-precision format. 2 is completely out of the question.
And for just about everything at full application level even the difference
between 4 and 2 is not significant, much less so the difference between 4 and 3.
IEEE binary32 is valuable not due to lower latency, but due to higher throughput
on the SIMD side.
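
The throughput point can be put in numbers with a small sketch. The register widths below are assumed examples (they are not from the post), and `lanes` is a hypothetical helper name.

```python
# Lanes per vector register = register width / element width.
# The widths 128, 256 and 512 are assumed example register sizes.
def lanes(vector_bits: int, element_bits: int) -> int:
    return vector_bits // element_bits

for width in (128, 256, 512):
    print(width, "bits:", lanes(width, 32), "binary32 lanes vs",
          lanes(width, 64), "binary64 lanes")
```

At any fixed register width, binary32 packs twice as many elements per operation as binary64, which is a throughput gain even when the per-operation latency is unchanged.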

>
> Of course, though, this is less important now than in the days of the
> IBM System/360 model 44. That machine's implementation of
> floating-point arithmetic was such that the time of floating-point
> operations was proportional to the length of the mantissa; with
> today's advanced ALU designs, it's closer to being proportional to
> the *logarithm* of that length, so the gains from a shorter FP format
> aren't as dramatic.
>
> John Savard

Exactly.
So, you appear to know that your claimed motivation is wrong.
Now tell us what is your true motivation?

Re: Solving the Floating-Point Conundrum

<271be491-094d-4b3f-9f98-938a2b90dc0dn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34123&group=comp.arch#34123

X-Received: by 2002:a05:622a:1ba7:b0:412:3f1:aafa with SMTP id bp39-20020a05622a1ba700b0041203f1aafamr202597qtb.5.1695056010146;
Mon, 18 Sep 2023 09:53:30 -0700 (PDT)
X-Received: by 2002:a05:6808:bd3:b0:3a7:9a19:332b with SMTP id
o19-20020a0568080bd300b003a79a19332bmr4090313oik.7.1695056009952; Mon, 18 Sep
2023 09:53:29 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!border-2.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 18 Sep 2023 09:53:29 -0700 (PDT)
In-Reply-To: <ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:785d:19a9:9e91:54fe;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:785d:19a9:9e91:54fe
References: <ue788u$4u5l$1@newsreader4.netcologne.de> <memo.20230917185814.16292G@jgd.cix.co.uk>
<ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <271be491-094d-4b3f-9f98-938a2b90dc0dn@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Mon, 18 Sep 2023 16:53:30 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 40
 by: MitchAlsup - Mon, 18 Sep 2023 16:53 UTC

On Monday, September 18, 2023 at 9:41:01 AM UTC-5, Quadibloc wrote:
> On Sunday, September 17, 2023 at 11:58:18 AM UTC-6, John Dallman wrote:
>
> > Quadibloc <jsa...@ecn.ab.ca> wrote:
> > > By "intermediate precision" I mean 48-bit or 54-bit floating-point
> > > on the computer which is built around a 36-bit word in order to
> > > provide 36-bit and 72-bit floats.
>
> > I don't understand why you feel this is valuable. Are there any
> > architectures that have been used since about 1970 (when the Atlas
> > machines were shut down) that provide it? Is there software suffering
> > because it doesn't have it and needs to make do with double precision? No
> > architecture can do everything efficiently.
<
> Can today's modern computers perform a 64-bit FP division with only
> one clock cycle of latency? I don't think so; I don't think that they can even
> do 64-bit FP multiplication that quickly.
<
Reciprocation can be pipelined to about 20 cycles of latency and 1 beat
throughput. It is BIG but doable. The problem is reciprocation followed
by multiplication is 0.5 bits less accurate than Division and IEEE 754 does
not allow this.
>
> That's why I think it is valuable to have other floating-point formats
> smaller than 64 bits, but still large enough to be useful for scientific
> computing, which 32 bits _isn't_, available.
>
> Of course, though, this is less important now than in the days of the
> IBM System/360 model 44. That machine's implementation of
> floating-point arithmetic was such that the time of floating-point
> operations was proportional to the length of the mantissa; with
> today's advanced ALU designs, it's closer to being proportional to
> the *logarithm* of that length, so the gains from a shorter FP format
> aren't as dramatic.
>
> John Savard

Re: Solving the Floating-Point Conundrum

<da80672a-edfb-4695-b260-a969cef4bd45n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34124&group=comp.arch#34124

X-Received: by 2002:a05:622a:19a9:b0:417:611e:98f4 with SMTP id u41-20020a05622a19a900b00417611e98f4mr187491qtc.8.1695056369389;
Mon, 18 Sep 2023 09:59:29 -0700 (PDT)
X-Received: by 2002:a05:6808:1b2c:b0:3ab:8958:65dc with SMTP id
bx44-20020a0568081b2c00b003ab895865dcmr4455108oib.9.1695056368868; Mon, 18
Sep 2023 09:59:28 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 18 Sep 2023 09:59:28 -0700 (PDT)
In-Reply-To: <1d55526a-afc1-4fdb-8f6a-324675d1d888n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:785d:19a9:9e91:54fe;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:785d:19a9:9e91:54fe
References: <ue788u$4u5l$1@newsreader4.netcologne.de> <memo.20230917185814.16292G@jgd.cix.co.uk>
<ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com> <1d55526a-afc1-4fdb-8f6a-324675d1d888n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <da80672a-edfb-4695-b260-a969cef4bd45n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Mon, 18 Sep 2023 16:59:29 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 4518
 by: MitchAlsup - Mon, 18 Sep 2023 16:59 UTC

On Monday, September 18, 2023 at 11:08:58 AM UTC-5, Michael S wrote:
> On Monday, September 18, 2023 at 5:41:01 PM UTC+3, Quadibloc wrote:
> > On Sunday, September 17, 2023 at 11:58:18 AM UTC-6, John Dallman wrote:
> >
> > > Quadibloc <jsa...@ecn.ab.ca> wrote:
> > > > By "intermediate precision" I mean 48-bit or 54-bit floating-point
> > > > on the computer which is built around a 36-bit word in order to
> > > > provide 36-bit and 72-bit floats.
> >
> > > I don't understand why you feel this is valuable. Are there any
> > > architectures that have been used since about 1970 (when the Atlas
> > > machines were shut down) that provide it? Is there software suffering
> > > because it doesn't have it and needs to make do with double precision? No
> > > architecture can do everything efficiently.
> > Can today's modern computers perform a 64-bit FP division with only
> > one clock cycle of latency? I don't think so; I don't think that they can even
> > do 64-bit FP multiplication that quickly.
> >
> > That's why I think it is valuable to have other floating-point formats
> > smaller than 64 bits, but still large enough to be useful for scientific
> > computing, which 32 bits _isn't_, available.
<
> Your logic is based on assumption that intermediate precision math can be
> significantly faster, latency wise, than full IEEE binary64 math. But that's
> not true. At least not according to my definition of the word 'significantly'.
> If IEEE binary64 fmul latency = 4 then it's unlikely that you can shrink it
> to 3 for your mid-precision format. 2 is completely out of question.
<
64-bit FMUL is about 60 gates of delay
32-bit FMUL is about 56 gates of delay
64-bit FADD is about 50 gates of delay
32-bit FADD is about 45 gates of delay
<
All intermediate precision is squeezed between these.
<
> And for just about everything at full application level even the difference
> between 4 and 2 is not significant, much less so difference between 4 and 3.
> IEEE binary32 is valuable not due to lower latency, but due to higher throughput
> on the SIMD side.
<
The difference between 2 cycle LDs and 3 cycle LDs is about 10% on
a 1-wide in-order machine, and less on an OoO machine. Some would
call 10% significant, others not.
<
> >
> > Of course, though, this is less important now than in the days of the
> > IBM System/360 model 44. That machine's implementation of
> > floating-point arithmetic was such that the time of floating-point
> > operations was proportional to the length of the mantissa; with
> > today's advanced ALU designs, it's closer to being proportional to
> > the *logarithm* of that length, so the gains from a shorter FP format
> > aren't as dramatic.
<
It is closer to ½×logarithmic and ½×linear.
<
> >
> > John Savard
> Exactly.
> So, you appear to know that your claimed motivation is wrong.
> Now tell us what is your true motivation?

Re: Solving the Floating-Point Conundrum

<4aa14507-b3f5-4fdd-a7f0-2a19e69ab6d5n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34125&group=comp.arch#34125

X-Received: by 2002:a05:620a:27ce:b0:76d:8cc1:559d with SMTP id i14-20020a05620a27ce00b0076d8cc1559dmr197351qkp.15.1695059055281;
Mon, 18 Sep 2023 10:44:15 -0700 (PDT)
X-Received: by 2002:a05:6830:270c:b0:6bc:6658:2d3f with SMTP id
j12-20020a056830270c00b006bc66582d3fmr107262otu.1.1695059055096; Mon, 18 Sep
2023 10:44:15 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!feeder1.feed.usenet.farm!feed.usenet.farm!peer02.ams4!peer.am4.highwinds-media.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 18 Sep 2023 10:44:14 -0700 (PDT)
In-Reply-To: <1d55526a-afc1-4fdb-8f6a-324675d1d888n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa34:c000:65ce:cb3e:15bb:a914;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa34:c000:65ce:cb3e:15bb:a914
References: <ue788u$4u5l$1@newsreader4.netcologne.de> <memo.20230917185814.16292G@jgd.cix.co.uk>
<ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com> <1d55526a-afc1-4fdb-8f6a-324675d1d888n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <4aa14507-b3f5-4fdd-a7f0-2a19e69ab6d5n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: jsavard@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 18 Sep 2023 17:44:15 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 1883
 by: Quadibloc - Mon, 18 Sep 2023 17:44 UTC

On Monday, September 18, 2023 at 10:08:58 AM UTC-6, Michael S wrote:

> So, you appear to know that your claimed motivation is wrong.
> Now tell us what is your true motivation?

No, my claimed motivation is not wrong, since even shaving one cycle
off of an FP multiply can still be valuable or even imperative, when
the goal is to perform a computation which will take months of computer
time.

John Savard

Re: Solving the Floating-Point Conundrum

<2d854293-1f4f-411b-8986-64d7e530ba34n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34126&group=comp.arch#34126

X-Received: by 2002:a05:622a:1ba7:b0:412:3f1:aafa with SMTP id bp39-20020a05622a1ba700b0041203f1aafamr205558qtb.5.1695059194119;
Mon, 18 Sep 2023 10:46:34 -0700 (PDT)
X-Received: by 2002:a05:6808:1886:b0:3a7:4878:233d with SMTP id
bi6-20020a056808188600b003a74878233dmr4425805oib.0.1695059193857; Mon, 18 Sep
2023 10:46:33 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 18 Sep 2023 10:46:33 -0700 (PDT)
In-Reply-To: <da80672a-edfb-4695-b260-a969cef4bd45n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa34:c000:65ce:cb3e:15bb:a914;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa34:c000:65ce:cb3e:15bb:a914
References: <ue788u$4u5l$1@newsreader4.netcologne.de> <memo.20230917185814.16292G@jgd.cix.co.uk>
<ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com> <1d55526a-afc1-4fdb-8f6a-324675d1d888n@googlegroups.com>
<da80672a-edfb-4695-b260-a969cef4bd45n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <2d854293-1f4f-411b-8986-64d7e530ba34n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: jsavard@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 18 Sep 2023 17:46:34 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 1980
 by: Quadibloc - Mon, 18 Sep 2023 17:46 UTC

On Monday, September 18, 2023 at 10:59:31 AM UTC-6, MitchAlsup wrote:

> 64-bit FMUL is about 60 gates of delay
> 32-bit FMUL is about 56 gates of delay
> 64-bit FADD is about 50 gates of delay
> 32-bit FADD is about 45 gates of delay
> <
> All intermediate precision is squeezed between these.

Oh. In *that* case, since 12-16 gates per cycle
is reasonable, my "claimed motivation" _is_
wrong.

I knew the gains from shorter precision would be
modest, but I didn't realize they would be _that_
modest.
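
A rough sketch of that arithmetic, using the gate-delay figures quoted above and the 12-16 gates per cycle range. It assumes the whole gate budget is simply divided by the gates available per cycle, with no latch or clock-skew overhead; `cycles` is a hypothetical helper.

```python
import math

# Gate-delay estimates as quoted from MitchAlsup's post.
GATE_DELAYS = {
    "64-bit FMUL": 60,
    "32-bit FMUL": 56,
    "64-bit FADD": 50,
    "32-bit FADD": 45,
}

def cycles(gates: int, gates_per_cycle: int) -> int:
    """Pipeline stages needed, ignoring latch and skew overhead (an assumption)."""
    return math.ceil(gates / gates_per_cycle)

for gates_per_cycle in (12, 16):
    for name, gates in GATE_DELAYS.items():
        print(f"{gates_per_cycle} gates/cycle: {name} -> "
              f"{cycles(gates, gates_per_cycle)} cycles")
```

Under these assumptions 64-bit and 32-bit FMUL land on the same cycle count at both ends of the range, so a format squeezed between them cannot save a cycle on multiply.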

John Savard

Re: Solving the Floating-Point Conundrum

<83f3cf1c-96b2-4bef-94ef-e3f2ac78f8e3n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34127&group=comp.arch#34127

X-Received: by 2002:a05:620a:3d03:b0:773:a4a3:3d5c with SMTP id tq3-20020a05620a3d0300b00773a4a33d5cmr176975qkn.14.1695059307392;
Mon, 18 Sep 2023 10:48:27 -0700 (PDT)
X-Received: by 2002:a05:6808:17a3:b0:3a1:d419:9c64 with SMTP id
bg35-20020a05680817a300b003a1d4199c64mr4053122oib.5.1695059307090; Mon, 18
Sep 2023 10:48:27 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 18 Sep 2023 10:48:26 -0700 (PDT)
In-Reply-To: <271be491-094d-4b3f-9f98-938a2b90dc0dn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa34:c000:65ce:cb3e:15bb:a914;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa34:c000:65ce:cb3e:15bb:a914
References: <ue788u$4u5l$1@newsreader4.netcologne.de> <memo.20230917185814.16292G@jgd.cix.co.uk>
<ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com> <271be491-094d-4b3f-9f98-938a2b90dc0dn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <83f3cf1c-96b2-4bef-94ef-e3f2ac78f8e3n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: jsavard@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 18 Sep 2023 17:48:27 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 1916
 by: Quadibloc - Mon, 18 Sep 2023 17:48 UTC

On Monday, September 18, 2023 at 10:53:31 AM UTC-6, MitchAlsup wrote:

> Reciprocation can be pipelined to about 20 cycles of latency and 1 beat
> throughput. It is BIG but doable. The problem is reciprocation followed
> by multiplication is 0.5 bits less accurate than Division and IEEE 754 does
> not allow this.

I am willing to give up on being fully compliant with IEEE 754. Presumably
as a user-selectable option, though, so that compliance is given up only when
maximum speed is required.
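
What is being given up can be seen with ordinary binary64 arithmetic. The sketch below counts small integer operand pairs for which multiplying by a rounded reciprocal differs from the correctly rounded quotient; the operand range and the name `count_mismatches` are arbitrary choices for illustration, not anything from the thread.

```python
# Compare a / b (one rounding) with a * (1 / b) (two roundings) in binary64.
def count_mismatches(limit: int) -> int:
    mismatches = 0
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            quotient = a / b                 # correctly rounded division
            via_reciprocal = a * (1.0 / b)   # reciprocal is rounded, then the product
            if quotient != via_reciprocal:
                mismatches += 1
    return mismatches

bad = count_mismatches(100)
print(f"{bad} of {100 * 100} operand pairs differ")
```

A familiar instance is 3 * (1.0 / 10), which is not equal to 3 / 10 in binary64. The differences are confined to the last bit, but that is enough to break IEEE 754 conformance for division.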

John Savard

Re: Solving the Floating-Point Conundrum

<uea4ou$1sudt$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=34128&group=comp.arch#34128

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Solving the Floating-Point Conundrum
Date: Mon, 18 Sep 2023 20:26:38 +0200
Organization: A noiseless patient Spider
Lines: 52
Message-ID: <uea4ou$1sudt$1@dont-email.me>
References: <ue788u$4u5l$1@newsreader4.netcologne.de>
<memo.20230917185814.16292G@jgd.cix.co.uk>
<ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 18 Sep 2023 18:26:38 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="9d83be33995c896a9ef76c9addc0391e";
logging-data="1997245"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19MQiI7o1492YEUOKmdlhsJFEp+wqTEnTkPkbjcVDmg4w=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.17
Cancel-Lock: sha1:l9EsT9TPYy+zNqJ/f1mDkUC1cn4=
In-Reply-To: <ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com>
 by: Terje Mathisen - Mon, 18 Sep 2023 18:26 UTC

Quadibloc wrote:
> On Sunday, September 17, 2023 at 11:58:18 AM UTC-6, John Dallman wrote:
>
>> Quadibloc <jsa...@ecn.ab.ca> wrote:
>>> By "intermediate precision" I mean 48-bit or 54-bit floating-point
>>> on the computer which is built around a 36-bit word in order to
>>> provide 36-bit and 72-bit floats.
>
>> I don't understand why you feel this is valuable. Are there any
>> architectures that have been used since about 1970 (when the Atlas
>> machines were shut down) that provide it? Is there software suffering
>> because it doesn't have it and needs to make do with double precision? No
>> architecture can do everything efficiently.
>
> Can today's modern computers perform a 64-bit FP division with only
> one clock cycle of latency? I don't think so; I don't think that they can even
> do 64-bit FP multiplication that quickly.
>
> That's why I think it is valuable to have other floating-point formats
> smaller than 64 bits, but still large enough to be useful for scientific
> computing, which 32 bits _isn't_, available.

John, you seem to think that the actual multiplication is the critical
path here, while in reality, all the other stuff related to FP, i.e.
unpacking, adjusting exponent/inserting hidden bit, special-casing all
non-normal inputs, normalizing & rounding the result and packing it back
into the external format will take a significant part of the total cycle
budget, almost independent of the precision selected.
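
For readers who have not looked inside a float, here is a minimal software sketch of the unpacking and special-casing steps for binary64. It only illustrates the work involved; hardware does this with wiring and a few gates, and `unpack_binary64` is a hypothetical name.

```python
import struct

def unpack_binary64(x: float):
    """Split a binary64 value into sign, biased exponent, significand, and class."""
    bits = int.from_bytes(struct.pack(">d", x), "big")
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if biased_exponent == 0x7FF:            # all-ones exponent: infinity or NaN
        kind = "inf" if fraction == 0 else "nan"
        significand = fraction
    elif biased_exponent == 0:              # zero exponent: zero or subnormal
        kind = "zero" if fraction == 0 else "subnormal"
        significand = fraction              # no hidden bit here
    else:
        kind = "normal"
        significand = fraction | (1 << 52)  # insert the hidden bit
    return sign, biased_exponent, significand, kind

print(unpack_binary64(1.5))
print(unpack_binary64(5e-324))
print(unpack_binary64(float("inf")))
```

None of these steps gets meaningfully shorter when the significand shrinks from 53 bits to, say, 40, which is the point being made above.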

Mitch obviously has the exact numbers but my wild guess would be that
float is typically just one cycle faster than double due to a shorter
carry chain.

You have to go all the way down to 8-bit or so in order to make all ops
single-cycle. (i.e. table lookups!)
>
> Of course, though, this is less important now than in the days of the
> IBM System/360 model 44. That machine's implementation of
> floating-point arithmetic was such that the time of floating-point
> operations was proportional to the length of the mantissa; with
> today's advanced ALU designs, it's closer to being proportional to
> the *logarithm* of that length, so the gains from a shorter FP format
> aren't as dramatic.

Exactly right, so please forget about such ideas.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Solving the Floating-Point Conundrum

<de99236c-9bd2-4c5b-98b4-c9e2985eb1b0n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34129&group=comp.arch#34129

X-Received: by 2002:ad4:59d2:0:b0:656:328f:7273 with SMTP id el18-20020ad459d2000000b00656328f7273mr207074qvb.6.1695063060778; Mon, 18 Sep 2023 11:51:00 -0700 (PDT)
X-Received: by 2002:a05:6870:5aa6:b0:1c0:eac2:979c with SMTP id dt38-20020a0568705aa600b001c0eac2979cmr3856896oab.3.1695063060423; Mon, 18 Sep 2023 11:51:00 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!feeder.usenetexpress.com!tr3.iad1.usenetexpress.com!69.80.99.14.MISMATCH!border-1.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 18 Sep 2023 11:51:00 -0700 (PDT)
In-Reply-To: <uea4ou$1sudt$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:785d:19a9:9e91:54fe; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:785d:19a9:9e91:54fe
References: <ue788u$4u5l$1@newsreader4.netcologne.de> <memo.20230917185814.16292G@jgd.cix.co.uk> <ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com> <uea4ou$1sudt$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <de99236c-9bd2-4c5b-98b4-c9e2985eb1b0n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Mon, 18 Sep 2023 18:51:00 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 95
 by: MitchAlsup - Mon, 18 Sep 2023 18:51 UTC

On Monday, September 18, 2023 at 1:26:43 PM UTC-5, Terje Mathisen wrote:
> Quadibloc wrote:
> > On Sunday, September 17, 2023 at 11:58:18 AM UTC-6, John Dallman wrote:
> >
> >> Quadibloc <jsa...@ecn.ab.ca> wrote:
> >>> By "intermediate precision" I mean 48-bit or 54-bit floating-point
> >>> on the computer which is built around a 36-bit word in order to
> >>> provide 36-bit and 72-bit floats.
> >
> >> I don't understand why you feel this is valuable. Are there any
> >> architectures that have been used since about 1970 (when the Atlas
> >> machines were shut down) that provide it? Is there software suffering
> >> because it doesn't have it and needs to make do with double precision? No
> >> architecture can do everything efficiently.
> >
> > Can today's modern computers perform a 64-bit FP division with only
> > one clock cycle of latency? I don't think so; I don't think that they can even
> > do 64-bit FP multiplication that quickly.
> >
> > That's why I think it is valuable to have other floating-point formats
> > smaller than 64 bits, but still large enough to be useful for scientific
> > computing, which 32 bits _isn't_, available.
<
> John, you seem to think that the actual multiplication is the critical
> path here, while in reality, all the other stuff related to FP, i.e.
> unpacking, adjusting exponent/inserting hidden bit, special-casing all
> non-normal inputs, normalizing & rounding the result and packing it back
> into the external format will take a significant part of the total cycle
> budget, almost independent of the precision selected.
<
53×53 multiplier tree is::
3 gates for Booth recoding
10 gates of multiply
14 gates of 2:1 carry propagate adder*
<
(*) the actual adder is 168-bits followed by a 52-bit incrementor with the
property that no carry propagation manipulates more than 112-bits of
result.
<
Normalization::
5 gates to unary select
5 gates of output select in parallel with 4 gates of unary-to-binary encode
2 gates of 11-bit plus 7-bit addition (left in carry save format)
<
Rounding::
2-gates sticky resolve
5-gates [4-bit]×[3-bit] RM select ROM gives {-1, +1, 0} rounding correction
11-gates of rounding plus exponent finalization
<
And 3 gates for various multiplexing along the way.
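
Adding up the figures listed above, as a quick check. The sketch assumes the stages are strictly serial except for the one pair stated to run in parallel, where the longer path is taken; the grouping into variables is only for illustration.

```python
# Gate delays as listed in the post.
multiplier = 3 + 10 + 14            # Booth recoding, multiply, carry-propagate add
normalization = 5 + max(5, 4) + 2   # unary select, output select || encode, exponent add
rounding = 2 + 5 + 11               # sticky, rounding-mode ROM, round + exponent finalize
multiplexing = 3

total = multiplier + normalization + rounding + multiplexing
print(multiplier, normalization, rounding, multiplexing, total)
```

The total agrees with the "about 60 gates" figure given for 64-bit FMUL earlier in the thread, and only the 27 gates of the multiplier section depend much on the significand width.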
<
>
> Mitch obviously have the exact numbers but my wild guess would be that
> float is typically just one cycle faster than double due to a shorter
> carry chain.
<
Shaving ½ the length off the carry chain saves (theoretically) 1 gate of
delay (it's logarithmic) but in practice it is more like 2 gates due to fan
out.
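
The theoretical one-gate saving follows from the logarithm. A minimal sketch, assuming an idealized parallel-prefix carry network whose depth is ceil(log2(width)) and ignoring fan-out and wire delay; `prefix_levels` is a hypothetical name.

```python
import math

def prefix_levels(width_bits: int) -> int:
    """Logic levels in an idealized parallel-prefix carry network (assumed model)."""
    return math.ceil(math.log2(width_bits))

# 168 bits is the adder width mentioned above; 84 is that width halved.
for width in (168, 84):
    print(width, "bits ->", prefix_levels(width), "levels")
```

Halving the width removes exactly one level in this model; the extra gate seen in practice comes from fan-out, which the model leaves out.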
>
> You have to go all the way down to 8-bit or so in order to make all ops
> single-cycle. (i.e. table lookups!)
<
8× multiply is still not 1 cycle:: at least in a machine you are cycling at
8-bit adder cycle time.
<
> >
> > Of course, though, this is less important now than in the days of the
> > IBM System/360 model 44. That machine's implementation of
> > floating-point arithmetic was such that the time of floating-point
> > operations was proportional to the length of the mantissa; with
> > today's advanced ALU designs, it's closer to being proportional to
> > the *logarithm* of that length, so the gains from a shorter FP format
> > aren't as dramatic.
> Exactly right, so please forget about such ideas.
> Terje
>
>
> --
> - <Terje.Mathisen at tmsw.no>
> "almost all programming can be viewed as an exercise in caching"

Re: Solving the Floating-Point Conundrum

<7df1b835-02e5-4988-96ef-2af73b8587d7n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34132&group=comp.arch#34132

X-Received: by 2002:a05:622a:1ba7:b0:412:3f1:aafa with SMTP id bp39-20020a05622a1ba700b0041203f1aafamr213210qtb.5.1695070961064;
Mon, 18 Sep 2023 14:02:41 -0700 (PDT)
X-Received: by 2002:a05:6808:13c6:b0:3a4:1e93:8988 with SMTP id
d6-20020a05680813c600b003a41e938988mr4300091oiw.10.1695070960863; Mon, 18 Sep
2023 14:02:40 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 18 Sep 2023 14:02:40 -0700 (PDT)
In-Reply-To: <de99236c-9bd2-4c5b-98b4-c9e2985eb1b0n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=99.251.79.92; posting-account=QId4bgoAAABV4s50talpu-qMcPp519Eb
NNTP-Posting-Host: 99.251.79.92
References: <ue788u$4u5l$1@newsreader4.netcologne.de> <memo.20230917185814.16292G@jgd.cix.co.uk>
<ba7d6a5d-2373-4f55-a640-69b1ab3e00bbn@googlegroups.com> <uea4ou$1sudt$1@dont-email.me>
<de99236c-9bd2-4c5b-98b4-c9e2985eb1b0n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <7df1b835-02e5-4988-96ef-2af73b8587d7n@googlegroups.com>
Subject: Re: Solving the Floating-Point Conundrum
From: robfi680@gmail.com (robf...@gmail.com)
Injection-Date: Mon, 18 Sep 2023 21:02:41 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1541
 by: robf...@gmail.com - Mon, 18 Sep 2023 21:02 UTC

What kind of precision is needed for space-time co-ordinates? I tried looking this
up and get the impression that 64-bit floats may not be enough, but 128-bit floats
are overkill. The reference was a space game based in a galaxy.
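
A back-of-envelope sketch of the binary64 case. It assumes positions stored in metres from one edge of a galaxy roughly 100,000 light-years across; both the unit and the size are assumptions for illustration, and `binary64_spacing` is a hypothetical helper.

```python
import math

LIGHT_YEAR_M = 9.4607e15            # metres per light-year
GALAXY_LY = 100_000                 # assumed galaxy diameter in light-years

def binary64_spacing(x: float) -> float:
    """Gap between adjacent binary64 values near a normal, finite x."""
    _, exponent = math.frexp(x)     # x = m * 2**exponent with 0.5 <= |m| < 1
    return math.ldexp(1.0, exponent - 53)

extent_m = GALAXY_LY * LIGHT_YEAR_M
print(f"extent ~ {extent_m:.3e} m, spacing ~ {binary64_spacing(extent_m):.0f} m")
```

Under those assumptions adjacent representable positions at the far edge are about 131 km apart, which fits the impression that 64-bit floats may not be enough for such a game; how many more bits are needed depends on the resolution wanted.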

