Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<7LKcnVa7T63NbJj4nZ2dnZfqn_qdnZ2d@earthlink.com>

https://news.novabbs.org/devel/article-flat.php?id=34081&group=comp.arch#34081

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!1.us.feeder.erje.net!feeder.erje.net!border-1.nntp.ord.giganews.com!nntp.giganews.com!Xl.tags.giganews.com!local-2.nntp.ord.giganews.com!nntp.earthlink.com!news.earthlink.com.POSTED!not-for-mail
NNTP-Posting-Date: Sat, 16 Sep 2023 18:17:52 +0000
Date: Sat, 16 Sep 2023 13:17:52 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.13.0
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
registers
Newsgroups: comp.arch
References: <i50LM.1449504$GMN3.428754@fx16.iad>
<memo.20230910132120.13508R@jgd.cix.co.uk> <udkmcl$k5hu$1@dont-email.me>
<aa8002f4-73d2-4221-b85d-811bcdb2e6aan@googlegroups.com>
<ue4bl4$31ji$1@newsreader4.netcologne.de>
Content-Language: en-US
From: david.schultz@earthlink.net (David Schultz)
In-Reply-To: <ue4bl4$31ji$1@newsreader4.netcologne.de>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID: <7LKcnVa7T63NbJj4nZ2dnZfqn_qdnZ2d@earthlink.com>
Lines: 32
X-Usenet-Provider: http://www.giganews.com
NNTP-Posting-Host: 108.194.108.200
X-Trace: sv3-Us7EoFP+92OwUL33qgQH0aLrino+krPVIzbrExK9zYkZwUW6OZ2n0XJU6FabfyjTAIlZRFi4Ua/36oE!P+ed4glZXyw+qJHHT091yFtGG9LAtOk5Ayu7+v5/xmW3zswxShMEyWA7oAHEmmQ0dWRJ5C0GOrVO!BI5GkOig1jGgFfx50+wlaC+Emhwwkuz31e6E3d+3fH8=
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.40
 by: David Schultz - Sat, 16 Sep 2023 18:17 UTC

On 9/16/23 8:47 AM, Thomas Koenig wrote:
> luke.l...@gmail.com <luke.leighton@gmail.com> schrieb:
>
>> i am always slightly dismayed by the continuous assumption that
>> an ISA's front-end *must* have a direct one-to-one relationship
>> with the back-end implementation.
>
> Many architectures have violated that assumption in the past. A few
> examples off the top of my head:
>
> The /360 series, where a 32-bit architecture was implemented in,
> for example, 8 bits on the model 30.
>
> The original Nova, which implemented its 16-bit architecture with
> a 4-bit ALU. Later models went to full 16 bits.
>
> The Z80, which hides its 4-bit ALU behind an 8-bit external
> appearance and sequencing. (How that saves chip area I'm
> not sure).
>
> The PDP 8/S had a single-bit width ALU with a 12-bit architecture.
>
> The 68000 was actually advertised as a 16-bit processor, but
> had an ISA geared towards 32 bit right from the start.

Add the CDP1802 1 bit ALU to your list.

--
http://davesrocketworks.com
David Schultz

Re: Introducing ForwardCom: An open ISA with variable-length vector

<memo.20230916204926.16432A@jgd.cix.co.uk>

https://news.novabbs.org/devel/article-flat.php?id=34082&group=comp.arch#34082

Path: i2pn2.org!i2pn.org!news.nntp4.net!news.gegeweb.eu!gegeweb.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jgd@cix.co.uk (John Dallman)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
Date: Sat, 16 Sep 2023 20:49 +0100 (BST)
Organization: A noiseless patient Spider
Lines: 19
Message-ID: <memo.20230916204926.16432A@jgd.cix.co.uk>
References: <udickr$5rc3$1@dont-email.me>
Reply-To: jgd@cix.co.uk
Injection-Info: dont-email.me; posting-host="082a4f784230c21a0dd39b4ecf6d583d";
logging-data="4175284"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19j4Yf05HoDSWGsk2tGA+OV+keLY43+Gm0="
Cancel-Lock: sha1:wr7MQ7BmwOf/JyqFY4gTA5S4jFk=
 by: John Dallman - Sat, 16 Sep 2023 19:49 UTC

In article <udickr$5rc3$1@dont-email.me>, cr88192@gmail.com (BGB) wrote:

> I have noted that the ability to turn the "social behaviors" on and
> off, or "do a full 180", in terms of emotional sentiment, without
> any visible delay, is not something that neurotypicals seem able to
> do. Like, even when given permission to do so, and where the most
> sensible option would be to turn off the social conventions and
> speak/act using a direct/impersonal style, they will not (and are
> seemingly mostly incapable of doing so).

I'm not sure I'm entirely neurotypical, but I learned to do this
relatively easily when it became obvious that it was the best way to run
collaboration meetings between groups of software developers.

I think of it as "Turning my ego off" and gain considerable satisfaction
from getting people to do sensible things when their egos tell them to do
otherwise.

John

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<ue50sb$3vdej$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=34083&group=comp.arch#34083

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
registers
Date: Sat, 16 Sep 2023 14:49:29 -0500
Organization: A noiseless patient Spider
Lines: 94
Message-ID: <ue50sb$3vdej$1@dont-email.me>
References: <12155892-ec89-4bee-85bb-d1491a5dd20dn@googlegroups.com>
<32790430-cdc8-45ad-8b20-8c8d6452b280n@googlegroups.com>
<490db16b-bd76-431a-b483-1985fa289f3cn@googlegroups.com>
<ec4fd863-d316-4a23-acee-52fb6889735dn@googlegroups.com>
<udfa24$dfj$1@reader2.panix.com>
<60b99a68-2c30-44f0-be05-246cae312a4dn@googlegroups.com>
<udfod1$3jbu0$1@dont-email.me>
<6a39aa55-f4cd-4bd8-b48d-abedc5d21272n@googlegroups.com>
<udickr$5rc3$1@dont-email.me>
<8568809d-e7cd-4567-b6c2-a0b51f154f51n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 16 Sep 2023 19:49:31 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="6e33bb6d27e808f3799ea4bdab996949";
logging-data="4175315"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+wBBaXgunTiIr8Nog5royj"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.15.1
Cancel-Lock: sha1:8Qsw1MJb0M+ZLLh+uzGs5l2QirM=
In-Reply-To: <8568809d-e7cd-4567-b6c2-a0b51f154f51n@googlegroups.com>
Content-Language: en-US
 by: BGB - Sat, 16 Sep 2023 19:49 UTC

On 9/16/2023 6:55 AM, luke.l...@gmail.com wrote:
> On Saturday, September 9, 2023 at 7:13:51 PM UTC+1, BGB wrote:
>
>> I have noted that the ability to turn the "social behaviors" on and off,
>> or "do a full 180", in terms of emotional sentiment, without any visible
>> delay, is not something that neurotypicals seem able to do. Like, even
>> when given permission to do so, and where the most sensible option would
>> be to turn off the social conventions and speak/act using a
>> direct/impersonal style, they will not (and are seemingly mostly
>> incapable of doing so).
>
> this is down to training. if you find an easy way to make
> this happen congratulations you just made a fortune in
> "Goal-orientated Team Building" for businesses... :)
>

Both myself and the female I was talking to seem capable of doing this.
But, if one does this around other people, many neurotypicals seem to
freak out.

A case came up where she did this in a public setting, going somewhat
outside the scope of neurotypical social behavior, and causing
shock/surprise to bystanders. As I saw it, what she said was justified.
Still might have been better though (partly for her sake) if in the
moment she had "kept the mask on" a little better.

Elsewhere, probably would have been fine (and most ASD behaviors are
passable), but when NTs are around, at least in my experience, better to
try to avoid challenging their expectations for how emotions and
emotional responses work.

In her case though, she does also seem to come off as likely autistic,
though some other things (like how she relates to emotions) are less
obvious.

Admittedly, the one female I had once dated (very long ago) also tended
to do this (in a sometimes more overt form), but was also a lot more
consistent about never "unmasking" in this way whenever anyone else was
around (actually, she would visibly change her expression and some
aspects of her persona depending on whoever was around).

But, her abilities in this area were a fair bit more dynamic than my
own. But, she really didn't like being talked about, or having any
attention brought to this (if she would flip personas and I was around,
I was just sort of supposed to go along like this is how she always
was). This seemed to be one of these areas where, if one went against
her (even subtly), she would get particularly angry about it (not
necessarily at the moment; as she would keep the persona, but once no
one else was around; one would be hearing about it, ...).

The changes were unlike the level of variation typically seen in
neurotypicals, where the variation is more subtle and usually has a
noticeable time delay; and, they pretty much never switch to a fully
detached/impersonal style. Well, and also tend not to openly talk about
other people in terms of their usefulness, or what they want to get from
the other person, or other similar metrics.

Well, and in this case, seemingly the whole relationship was based on me
"remaining useful" (with her general mindset being that it wasn't going
to go any further than this, with a set of ending criteria that were
mentioned well in advance; effectively a preset expiration date, ...).
Once this time rolled around, she ended things.

Some of this was, admittedly, difficult for me to deal with at the time
in an emotional sense. I had the fault of caring for her and wishing
that things could continue or maybe go further, but this was one of
those things, that if one brought up, she would respond with anger.

....

The person I was talking to recently doesn't seem quite the same.
Seemingly less dynamic (more like myself).
Seems to have more visible signs of social anxiety.
My attempts to probe for "utility oriented thinking" in relation to
interactions with others, appear to have mostly come up negative (her
responses seem to be indifferent on this matter).
Seems to have vocal inflections more typical for people with ASD (or,
more generally, associated with "nerdy" people in general);
....

> you may find this interesting
> https://www.youtube.com/watch?v=Grrbekq-6kw
>

I don't really think a lot of this is a dietary issue.

> l.

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<ue5o8m$7476$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=34090&group=comp.arch#34090

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
registers
Date: Sat, 16 Sep 2023 21:28:37 -0500
Organization: A noiseless patient Spider
Lines: 84
Message-ID: <ue5o8m$7476$1@dont-email.me>
References: <i50LM.1449504$GMN3.428754@fx16.iad>
<memo.20230910132120.13508R@jgd.cix.co.uk> <udkmcl$k5hu$1@dont-email.me>
<aa8002f4-73d2-4221-b85d-811bcdb2e6aan@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 17 Sep 2023 02:28:38 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="a99610e57d0a66097b0f4d49c74577ae";
logging-data="233702"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+bxHK1gc8mZiO2RGNE7T6p"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.15.1
Cancel-Lock: sha1:uLnYs37A08VUsP8NlB0v2CMeNKg=
In-Reply-To: <aa8002f4-73d2-4221-b85d-811bcdb2e6aan@googlegroups.com>
Content-Language: en-US
 by: BGB - Sun, 17 Sep 2023 02:28 UTC

On 9/16/2023 7:06 AM, luke.l...@gmail.com wrote:
> On Sunday, September 10, 2023 at 4:12:25 PM UTC+1, BGB wrote:
>
>> Sadly, these don't really do anything for matrix-multiply, there isn't
>> really any good way to deal with making MatMult not suck (in these
>> contexts, one usually dealing with 4x4 Binary32 matrices).
>
> 3 instructions in SVP64, any arbitrary matrix dimensions up
> to a total of 127 MACs/FMACs/GFMACs/any-scalar-op
>
> * first to set up matrix multiply REMAP
> * second to say which registers the REMAP applies to
> * third is the something-and-accumulate scalar operation
>
> real simple.
>
> it becomes the hardware's problem to then sort out the resultant
> massive-inrush of operations caused by the REMAP loop, hiding
> the end-developer from the insanity normally imposed on them
> by SIMD (and even True-Scalable) Vector ISAs.
>
> i am always slightly dismayed by the continuous assumption that
> an ISA's front-end *must* have a direct one-to-one relationship
> with the back-end implementation. "oh we have a SIMD ISA
> therefore the back-end hardware MUST have SIMD ALUs of
> exactly that width" errr no - look at the Broadcom Videocore IV
> "Virtual Vectors", the user *thinks* they are doing 16 FMACs
> in parallel whereas in reality the hardware breaks them down
> into 4 batches of 4 pipelined FMACs... *without* telling the
> user that that's what it's doing.
>

In my BJX2 core, in the original implementation of the floating-point
SIMD ops, they were implemented by multiplexing a 1-wide floating-point
unit over the 4 elements (with around a 6-cycle latency for the main FPU).

The low-precision FPU is 1:1 (one unit per element), but this was mostly
to reduce the SIMD op latency from 10 cycles to 3 cycles of latency with
1-cycle throughput (3L/1T).

Or, say:
Scalar Op:
  A - - - - A'
SIMD op (4-element A/B/C/D):
  A - - - - A'
  - B - - - - B'
  - - C - - - - C'
  - - - D - - - - D'
  - - - - - - - - - {A/B/C/D}'
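
As a rough check of the numbers above, the 10-cycle figure can be reproduced with a small model. This is only a sketch under stated assumptions: one element enters the shared FPU per cycle, the FPU is fully pipelined with the 6-cycle latency mentioned in the post, and one extra cycle is spent merging the four results. The function name and the merge-cycle assumption are illustrative, not taken from the BJX2 design.

```python
def multiplexed_simd_latency(elements: int, fpu_latency: int) -> int:
    """Cycles until a SIMD result is ready when one pipelined FPU is shared.

    Assumption: elements are issued back to back, one per cycle, and one
    additional cycle is needed to assemble the element results.
    """
    last_issue = elements - 1              # issue cycle of the last element (0-based)
    last_done = last_issue + fpu_latency   # cycles elapsed when it leaves the FPU
    return last_done + 1                   # plus one cycle to merge the vector


print(multiplexed_simd_latency(elements=4, fpu_latency=6))  # 10
```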

A drawback of MatMult though is that the elements in one matrix are
rotated by 90 degrees from the other matrix, so one effectively needs to
rotate one of the matrices by 90 degrees before being able to pipe it
through the other SIMD ops.
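
To make the layout problem concrete, here is a minimal sketch in plain Python. It assumes both matrices are stored row-major and that the "SIMD op" is an element-wise multiply followed by a sum across one row vector; the helper names are made up for illustration. The second operand has to be transposed first so that its columns become contiguous rows.

```python
def transpose(m):
    """Turn the columns of m into rows (the "rotation" step)."""
    return [list(col) for col in zip(*m)]


def dot(u, v):
    """Element-wise multiply and sum, standing in for a SIMD multiply plus reduce."""
    return sum(x * y for x, y in zip(u, v))


def matmul_rowwise(a, b):
    """C = A x B using only row-vector operations on A and on transposed B."""
    bt = transpose(b)
    return [[dot(row, col) for col in bt] for row in a]


print(matmul_rowwise([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```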

I have instructions to build new registers by pulling elements from the
other registers, but it could be better if this step were not necessary.
MOVLLD Rm, Ro, Rn // Rn = { Rm[31: 0], Ro[31: 0] }
MOVLHD Rm, Ro, Rn // Rn = { Rm[31: 0], Ro[63:32] }
MOVHLD Rm, Ro, Rn // Rn = { Rm[63:32], Ro[31: 0] }
MOVHHD Rm, Ro, Rn // Rn = { Rm[63:32], Ro[63:32] }
These ops are valid in all 3 lanes and may be co-issued in any
combination.
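
For readers who want to experiment with these four ops, the semantics in the comments can be modelled on 64-bit integers. This sketch assumes the braces follow Verilog concatenation order, so the first field lands in the high 32 bits of Rn; that ordering is an assumption, and the Python function names are simply lower-cased mnemonics.

```python
MASK32 = 0xFFFF_FFFF


def lo(r):
    """Bits [31:0] of a 64-bit register value."""
    return r & MASK32


def hi(r):
    """Bits [63:32] of a 64-bit register value."""
    return (r >> 32) & MASK32


def pack(high, low):
    """Assumed Verilog-style {high, low} concatenation."""
    return (high << 32) | low


def movlld(rm, ro):
    return pack(lo(rm), lo(ro))


def movlhd(rm, ro):
    return pack(lo(rm), hi(ro))


def movhld(rm, ro):
    return pack(hi(rm), lo(ro))


def movhhd(rm, ro):
    return pack(hi(rm), hi(ro))


rm, ro = 0x1111_1111_2222_2222, 0x3333_3333_4444_4444
print(f"{movlld(rm, ro):016x}")  # 2222222244444444
print(f"{movhhd(rm, ro):016x}")  # 1111111133333333
```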

Though, I guess, if one had 512-bit vectors, and a 512-bit shuffle, it
could be possible to handle all of this as a single instruction.

As-is though, I don't have enough register ports to get anywhere near
pulling this off.

> AMD's AVX512 implementation uses Cray-style "Vector Chaining"
> which amazingly is the first time i'd heard of it being admitted
> to be used by a mainstream CPU manufacturer (there are likely
> others that *don't* admit they copied Vector Chaining in their
> SIMD ALU implementations)
>
> Agner, you also made this assumption and I am waiting to hear back
> from you to continue the conversation on that, if you are interested?
>
> l.

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<99ca9672-d6a9-40d9-8863-d0771fa53afan@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34094&group=comp.arch#34094

X-Received: by 2002:a05:622a:1988:b0:40d:b839:b5bb with SMTP id u8-20020a05622a198800b0040db839b5bbmr148020qtc.2.1694936303095;
Sun, 17 Sep 2023 00:38:23 -0700 (PDT)
X-Received: by 2002:a05:6870:a8a5:b0:1d6:4b44:a3d0 with SMTP id
eb37-20020a056870a8a500b001d64b44a3d0mr2256150oab.6.1694936302844; Sun, 17
Sep 2023 00:38:22 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!border-2.nntp.ord.giganews.com!border-1.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 17 Sep 2023 00:38:22 -0700 (PDT)
In-Reply-To: <ue5o8m$7476$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=82.132.233.213; posting-account=soFpvwoAAADIBXOYOBcm_mixNPAaxW9p
NNTP-Posting-Host: 82.132.233.213
References: <i50LM.1449504$GMN3.428754@fx16.iad> <memo.20230910132120.13508R@jgd.cix.co.uk>
<udkmcl$k5hu$1@dont-email.me> <aa8002f4-73d2-4221-b85d-811bcdb2e6aan@googlegroups.com>
<ue5o8m$7476$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <99ca9672-d6a9-40d9-8863-d0771fa53afan@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector registers
From: luke.leighton@gmail.com (luke.l...@gmail.com)
Injection-Date: Sun, 17 Sep 2023 07:38:23 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 67
 by: luke.l...@gmail.com - Sun, 17 Sep 2023 07:38 UTC

On Sunday, September 17, 2023 at 3:28:42 AM UTC+1, BGB wrote:
> A drawback of MatMult though is that the elements in one matrix are
> rotated by 90 degrees from the other matrix, so one effectively needs to
> rotate one of the matrices by 90 degrees before being able to pipe it
> through the other SIMD ops.

indeed. and there are two ways to do it:
1. hard-wire the ISA to the back-end and have the developer
write instructions that perform the transform
2. have the hardware be instructed to sort out the transform itself.

you chose 1, i chose 2.

> I have instructions to build new registers by pulling elements from the
> other registers, but it could be better if this step were not necessary.

indeed.

> MOVLLD Rm, Ro, Rn // Rn = { Rm[31: 0], Ro[31: 0] }
> MOVLHD Rm, Ro, Rn // Rn = { Rm[31: 0], Ro[63:32] }
> MOVHLD Rm, Ro, Rn // Rn = { Rm[63:32], Ro[31: 0] }
> MOVHHD Rm, Ro, Rn // Rn = { Rm[63:32], Ro[63:32] }

and when you want a FP16 or INT16 MM this pattern
is no longer valid, the programmer is forced to rewrite
a new pattern of transforms.

> These ops are valid in all 3 lanes and may be co-issued in any
> combination.

if you have the hardware implemented why not separate the ISA
from that back-end hardware?

the trick of SVP64 is that it is not a Vector system at all,
but a looping system between Decode and Issue, spamming
back-end Multi-Issue with patterns that in a non-looping
system would require explicit loop-unrolled instructions.

thus if doing 4x4 32-bit matmul on a 32-bit regfile it is
*literally* as if you had 64 FMACs loop-unrolled, how hard is that?
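
The "64 FMACs loop-unrolled" view can be sketched as a generator of scalar operations. This is not the actual SVP64 REMAP schedule: the register base numbers (0, 16, 32), the row-major layout and the i/j/k ordering are all assumptions chosen for illustration. It only shows that a 4x4 matmul is the same as 4*4*4 independent-looking scalar multiply-accumulates over a flat register file.

```python
def unrolled_matmul_fmacs(n=4, a_base=0, b_base=16, c_base=32):
    """List the scalar FMACs (rd, ra, rb), meaning rd += ra * rb, for C = A x B."""
    ops = []
    for i in range(n):          # row of A and C
        for j in range(n):      # column of B and C
            for k in range(n):  # position along the inner product
                ops.append((c_base + i * n + j,
                            a_base + i * n + k,
                            b_base + k * n + j))
    return ops


def execute(ops, regs):
    """Run the FMAC list against a flat register file (a Python list)."""
    for rd, ra, rb in ops:
        regs[rd] += regs[ra] * regs[rb]
    return regs


ops = unrolled_matmul_fmacs()
print(len(ops))  # 64
```

Feeding `execute` a register list with A in entries 0 to 15, B in 16 to 31 and zeros in 32 to 47 leaves the product in entries 32 to 47.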

and if you have 4-wide Multi-Issue, and the pipelines are 4 clocks
then as long as you have done an Inner Product you get 100%
throughput.

the clever bit - the silly-simple bit - is when doing 4x4 32-bit on
*64* bit regs.

well... um... all you do is, you treat the 64-bit regs as if they were a
pair of 32-bit regs. all 64-bit scalar ops are now *pair* reg ops,
but when you want to do 4x4 32-bit matmult you issue down to
the individual 32-bit regs.
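
A toy model of that "pair of 32-bit regs" view, written as a sketch only: the class name and the choice that word 2n is the low half of 64-bit register n are assumptions made here, not something specified in the thread.

```python
MASK32 = 0xFFFF_FFFF


class PairedRegFile:
    """Register file stored as 32-bit words, viewable as 64-bit pairs."""

    def __init__(self, num_regs64):
        self.words = [0] * (2 * num_regs64)

    def write64(self, n, value):
        # Assumption: word 2n holds the low half, word 2n+1 the high half.
        self.words[2 * n] = value & MASK32
        self.words[2 * n + 1] = (value >> 32) & MASK32

    def read64(self, n):
        return self.words[2 * n] | (self.words[2 * n + 1] << 32)

    def write32(self, w, value):
        self.words[w] = value & MASK32

    def read32(self, w):
        return self.words[w]


rf = PairedRegFile(4)
rf.write64(1, 0xAAAA_AAAA_BBBB_BBBB)
print(hex(rf.read32(2)), hex(rf.read32(3)))  # 0xbbbbbbbb 0xaaaaaaaa
rf.write32(3, 0x1234_5678)
print(hex(rf.read64(1)))  # 0x12345678bbbbbbbb
```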

it's not difficult.

> Though, I guess, if one had 512-bit vectors, and a 512-bit shuffle, it
> could be possible to handle all of this as a single instruction.

and when the matrix is large enough to not fit?

> As-is though, I don't have enough register ports to get anywhere near
> pulling this off.

that's ok, you can multiplex (a PriorityPicker on the front)
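
A sketch of what multiplexing behind a priority picker could look like, assuming a lowest-index-wins policy and one grant per cycle. This is a generic priority encoder written for illustration; it is not the implementation of any particular PriorityPicker component, and the function names are hypothetical.

```python
def priority_pick(requests: int) -> int:
    """Return a one-hot grant for the lowest-numbered active request bit."""
    return requests & -requests


def drain(requests: int):
    """Serve all pending requests over successive cycles on one shared port."""
    order = []
    while requests:
        grant = priority_pick(requests)
        order.append(grant.bit_length() - 1)  # index of the granted requester
        requests &= ~grant                    # that requester is now served
    return order


print(drain(0b1011))  # [0, 1, 3]
```

With fewer ports than requesters the work still completes; it simply takes as many cycles as there are pending requests.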

l.

Re: Introducing ForwardCom: An open ISA with variable-length vector

<ue6rle$ceh2$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=34096&group=comp.arch#34096

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
Date: Sun, 17 Sep 2023 14:32:45 +0200
Organization: A noiseless patient Spider
Lines: 39
Message-ID: <ue6rle$ceh2$1@dont-email.me>
References: <udickr$5rc3$1@dont-email.me>
<memo.20230916204926.16432A@jgd.cix.co.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 17 Sep 2023 12:32:46 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="58078478c3917e2b277789236d194109";
logging-data="408098"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/LSAs6Zyw+/xdHmND+XSq4RlICpcwlu6dD6HsZGMr/Rg=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.17
Cancel-Lock: sha1:suH4Jt2jlxwA4yXm53y27Jww9uo=
In-Reply-To: <memo.20230916204926.16432A@jgd.cix.co.uk>
 by: Terje Mathisen - Sun, 17 Sep 2023 12:32 UTC

John Dallman wrote:
> In article <udickr$5rc3$1@dont-email.me>, cr88192@gmail.com (BGB) wrote:
>
>> I have noted that the ability to turn the "social behaviors" on and
>> off, or "do a full 180", in terms of emotional sentiment, without
>> any visible delay, is not something that neurotypicals seem able to
>> do. Like, even when given permission to do so, and where the most
>> sensible option would be to turn off the social conventions and
>> speak/act using a direct/impersonal style, they will not (and are
>> seemingly mostly incapable of doing so).
>
> I'm not sure I'm entirely neurotypical, but I learned to do this
> relatively easily when it became obvious that it was the best way to run
> collaboration meetings between groups of software developers.
>
> I think of it as "Turning my ego off" and gain considerable satisfaction
> from getting people to do sensible things when their egos tell them to do
> otherwise.

The extreme version of this is something my father was a master of: He
would come into a big meeting with everyone from the shop floor unions
to division directors and know pretty well what the best solution would
be, but never, ever mention it directly. Instead he would try to look
at the alternatives, on all sides, and discuss some failure points for
each. The best outcome of this was for someone supposedly "on the other
side of the table", like a union leader, to discover/suggest the very
idea he had in mind! At that point he would do an "Oh, is that possible?
Sounds interesting..." and go on to "discover" possible
drawbacks/problems and how to handle them, ending up with praising the
union leader for finding the best solution to the current issue.

"There is no limit to what you can achieve if you are always willing to
let someone else take credit for it."

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<65HNM.14022$BMnd.11958@fx04.iad>

https://news.novabbs.org/devel/article-flat.php?id=34103&group=comp.arch#34103

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx04.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
registers
References: <i50LM.1449504$GMN3.428754@fx16.iad> <memo.20230910132120.13508R@jgd.cix.co.uk> <udkmcl$k5hu$1@dont-email.me> <aa8002f4-73d2-4221-b85d-811bcdb2e6aan@googlegroups.com> <ue4bl4$31ji$1@newsreader4.netcologne.de> <7LKcnVa7T63NbJj4nZ2dnZfqn_qdnZ2d@earthlink.com>
In-Reply-To: <7LKcnVa7T63NbJj4nZ2dnZfqn_qdnZ2d@earthlink.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 53
Message-ID: <65HNM.14022$BMnd.11958@fx04.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Sun, 17 Sep 2023 17:56:18 UTC
Date: Sun, 17 Sep 2023 13:56:05 -0400
X-Received-Bytes: 3172
 by: EricP - Sun, 17 Sep 2023 17:56 UTC

David Schultz wrote:
> On 9/16/23 8:47 AM, Thomas Koenig wrote:
>> luke.l...@gmail.com <luke.leighton@gmail.com> schrieb:
>>
>>> i am always slightly dismayed by the continuous assumption that
>>> an ISA's front-end *must* have a direct one-to-one relationship
>>> with the back-end implementation.
>>
>> Many architectures have violated that assumption in the past. A few
>> examples off the top of my head:
>>
>> The /360 series, where a 32-bit architecture was implemented in,
>> for example, 8 bits on the model 30.
>>
>> The original Nova, which implemented its 16-bit architecture with
>> a 4-bit ALU. Later models went to full 16 bits.
>>
>> The Z80, which hides its 4-bit ALU behind an 8-bit external
>> appearance and sequencing. (How that saves chip area I'm
>> not sure).
>>
>> The PDP 8/S had a single-bit width ALU with a 12-bit architecture.
>>
>> The 68000 was actually advertised as a 16-bit processor, but
>> had an ISA geared towards 32 bit right from the start.
>
> Add the CDP1802 1 bit ALU to your list.

Was that you who did the reverse engineering on the 1802's control
and data path logic?

I'd forgotten about that 1-bit ALU.
That was one weirdness amongst all its other weirdness.
Like no instructions for calling subroutines, and that whole approach
of the P register for choosing a program counter register.
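
For anyone who has not met it, the P register scheme can be sketched with a very reduced model: sixteen 16-bit registers, a 4-bit P that selects which one currently acts as program counter, and SEP (set P) as the way to switch. Memory and instruction decoding are left out, and the class and method names are made up for this example.

```python
class Cosmac:
    """Very reduced model of the 1802 register file and P register."""

    def __init__(self):
        self.r = [0] * 16   # R0..RF, each 16 bits wide
        self.p = 0          # which register is the program counter

    @property
    def pc(self):
        return self.r[self.p]

    def fetch(self):
        """Stand-in for an instruction fetch: advance the current PC register."""
        self.r[self.p] = (self.r[self.p] + 1) & 0xFFFF

    def sep(self, n):
        """SEP Rn: make register n the program counter."""
        self.p = n & 0xF


cpu = Cosmac()
cpu.r[3], cpu.r[4], cpu.p = 0x0100, 0x0200, 3   # main in R3, subroutine in R4
cpu.fetch()            # fetching the SEP itself advances R3
cpu.sep(4)             # "call": execution continues at R4
print(hex(cpu.pc))     # 0x200
cpu.fetch()
cpu.sep(3)             # "return": R3 still points past the call
print(hex(cpu.pc))     # 0x101
```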

It has an 8-bit parallel ripple carry incrementer/decrementer
but then switches to bit serial for the ALU.
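
The bit-serial idea itself is easy to sketch: a single full adder is reused once per bit, with the carry held between steps. This is a generic illustration of the technique, not a reconstruction of the 1802's actual circuit; the function name and the default width are assumptions.

```python
def bit_serial_add(a, b, width=8, carry_in=0):
    """Add two integers one bit per step, as a 1-bit ALU would.

    Returns (sum modulo 2**width, carry_out).
    """
    result = 0
    carry = carry_in
    for i in range(width):              # one "clock" per bit, LSB first
        x = (a >> i) & 1
        y = (b >> i) & 1
        s = x ^ y ^ carry               # full-adder sum bit
        carry = (x & y) | (carry & (x ^ y))
        result |= s << i
    return result, carry


print(bit_serial_add(0x7F, 0x01))  # (128, 0)
print(bit_serial_add(0xFF, 0x01))  # (0, 1)
```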

Perhaps some of this came about because the original COSMAC was a
2 chip set so they had to divvy up the register set and control logic.
Then when they merged the two chips to one maybe some of the original
design decisions were carried over to the new chip.

Sometimes I play around with different 8-bit ISA designs that might have
been possible with the same 5500 CMOS transistor budget as the 1802,
and using static control logic and Johnson counter sequencer,
but with nicer programming characteristics like a Branch And Link
instruction. I would reduce the register set from sixteen to eight 16-bit
registers and use the freed-up gates for an 8-bit ALU and some added
addressing modes.

Re: Introducing ForwardCom: An open ISA with variable-length vector

<ue7ekk$fq2n$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=34104&group=comp.arch#34104

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
Date: Sun, 17 Sep 2023 12:56:32 -0500
Organization: A noiseless patient Spider
Lines: 121
Message-ID: <ue7ekk$fq2n$1@dont-email.me>
References: <udickr$5rc3$1@dont-email.me>
<memo.20230916204926.16432A@jgd.cix.co.uk> <ue6rle$ceh2$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 17 Sep 2023 17:56:36 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="a99610e57d0a66097b0f4d49c74577ae";
logging-data="518231"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ktK7veemLlXATFpqNG+2q"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.15.1
Cancel-Lock: sha1:JqtfQcG4FQh9M6byqjw306QWBdo=
In-Reply-To: <ue6rle$ceh2$1@dont-email.me>
Content-Language: en-US
 by: BGB - Sun, 17 Sep 2023 17:56 UTC

On 9/17/2023 7:32 AM, Terje Mathisen wrote:
> John Dallman wrote:
>> In article <udickr$5rc3$1@dont-email.me>, cr88192@gmail.com (BGB) wrote:
>>
>>> I have noted that the ability to turn the "social behaviors" on and
>>> off, or "do a full 180", in terms of emotional sentiment, without
>>> any visible delay, is not something that neurotypicals seem able to
>>> do. Like, even when given permission to do so, and where the most
>>> sensible option would be turn off the social conventions and
>>> speak/act using a direct/impersonal style, they will not (and are
>>> seemingly mostly incapable of doing so).
>>
>> I'm not sure I'm entirely neurotypical, but I learned to do this
>> relatively easily when it became obvious that it was the best way to run
>> collaboration meetings between groups of software developers.
>>
>> I think of it as "Turning my ego off" and gain considerable satisfaction
>> from getting people to do sensible things when their egos tell them to do
>> otherwise.
>
> The extreme version of this is something my father was a master of: He
> would come into a big meeting with everyone from the shop floor unions
> to division directors and know pretty well what the best solution would
> be, but never, ever  mention it directly. Instead he would try to look
> at the alternatives, on all sides, and discuss some failure points for
> each. The best outcome of this was for someone supposedly "on the other
> side of the table", like a union leader, to discover/suggest the very
> idea he had in mind! At that point he would do an "Oh, is that possible?
> Sounds interesting..." and go on to "discover" possible
> drawbacks/problems and how to handle them, ending up with praising the
> union leader for finding the best solution to the current issue.
>
> "There is no limit to what you can achieve if you are always willing to
> let someone else take credit for it."
>

Interesting, though doesn't really seem like the same thing I was
thinking about here. I don't really feel inclined to go into the
specifics of the context at the moment.

More sort of like, the case where one can switch to talking about
one's interactions with the other person as if one were a
disinterested 3rd party observer. Stating the situation in a "matter of
fact" style (without relying on emotional sentiment or other things).
Say, each person trying to make a case for what is most likely in the
best interest of each person involved in the interaction.

But, say, applied to scenarios like whether or not a relationship would
be sensible between two people, etc. Each person taking a role almost
as-if they were debating the pros/cons of a character-ship in a TV show.
(except, the two people just so happen to be the characters they are
debating about).

Note that one does not necessarily take a 3rd person perspective in this
case (if it were TV writing, this is usually where people would start
describing themselves in 3rd person terms; but that is not how it works
here, as both parties still understand how pronouns work...).

Now, say, this gets dropped in the middle of an otherwise casual
interaction in a public setting.

But, yeah, from what I can see, this sort of interaction style falls
outside the scope of normal neurotypical social interactions.

Also, I suspect a lot of people trying to write TV scripts don't
understand it either: they don't use it in cases where it seems like
it would be sensible to do so. Or, in the rare case they make an
attempt, they screw it up (by having the person describe
themselves in the 3rd person, by unnecessary levels of emotional sentiment,
or by each person arguing only from their own perspective or wants, ...).

Then again, the way TV would normally deal with these scenarios would be
to have the characters having dramatic displays of emotion and/or
yelling or similar.

Then again, the person I was interacting with shows some weaknesses in
this area (flip-flops between this and more NT-like patterns, not making
particularly effective use of this, nor making particularly effective
use of social context in the choice of interaction strategy; often
doesn't describe stuff in sufficient detail; doesn't really seem to
assume a neutral position; etc).

I suspect she has an issue with anxiety, and is possibly letting this
bleed over into her thinking (rather than, say, evaluating her position
in terms of best-interest or cost/benefit). I will not fault her
for this, but it doesn't necessarily lead to an optimal outcome. Though,
in this case, it has a relatively low probability of changing the answer.

Then again, can't ask for too much here.

I suspect she may be a little closer to normal than I am in these areas.

Then again, on the other side of things, someone going around and
basically using social patterns as tools to leverage what they want out
of people (*), is not ideal either, and would itself be cause for concern.

*: Say, a person that can be calm and neutral one moment, walk up to
someone on the street and put on a sad story about how something bad has
happened and they need some money to deal with it, then the person in
question gives them some money, and as soon as they leave, the person
switches back to a calm and neutral disposition (then using the money to
buy snacks at a convenience store or similar). (Well, as another person
in the mix, they need to keep quiet and play along, since if the first
person's cover is blown, no money, no snacks, and then both people will
be angry, ...).

Where, say, one can note that this is not ideal due to being dishonest,
and it was not in the best interest of the random person on the street
to have had their money taken to buy snacks (well, along with it being
an indirect violation of the 7th commandment, etc).

....

Re: Introducing ForwardCom: An open ISA with variable-length vector

<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34248&group=comp.arch#34248

X-Received: by 2002:ad4:4e42:0:b0:655:d807:bd13 with SMTP id eb2-20020ad44e42000000b00655d807bd13mr65484qvb.8.1695285908451;
Thu, 21 Sep 2023 01:45:08 -0700 (PDT)
X-Received: by 2002:a05:6871:6aa8:b0:1d6:d2c7:5eb with SMTP id
zf40-20020a0568716aa800b001d6d2c705ebmr1945075oab.7.1695285908224; Thu, 21
Sep 2023 01:45:08 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!border-2.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 21 Sep 2023 01:45:07 -0700 (PDT)
In-Reply-To: <ue7ekk$fq2n$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=82.132.233.199; posting-account=soFpvwoAAADIBXOYOBcm_mixNPAaxW9p
NNTP-Posting-Host: 82.132.233.199
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
From: luke.leighton@gmail.com (luke.l...@gmail.com)
Injection-Date: Thu, 21 Sep 2023 08:45:08 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 67
 by: luke.l...@gmail.com - Thu, 21 Sep 2023 08:45 UTC

somewhere in this thread a couple weeks back i mentioned
about the "overloading" of opcodes being disastrous for
forwards-compatibility (i mention "forwards" specifically because
ForwardCom is that), but that also there is the hobson's
choice of adding literal-duplicates of opcodes resulting in
the exact same concept behind "SIMD considered harmful"
also doing the exact same damage not just to a SIMD ISA
but to a True-Scalable Vector ISA (such as ForwardCom,
MRISC32, RVV) and it turns out *even to a Scalar ISA*.

for a Scalar ISA it is "not so bad", you may only have to add
say 20-50 duplicate integer arithmetic operations (duplicating
the 32-bit ISA, adding 64-bit equivalents), duplicating say
70 or so FP32 operations and making FP64 variants, *and*
remembering also to add converters between the same.

Scalar Power ISA made the mistake of overloading the 32-bit
integer opcodes with 64-bit meanings, in a rushed-upgrade
about 20 years ago that they weren't really ready for. they *did*
however make a fantastic decision to support FP32 format
*in the exact same form as FP64* such that you *do not* have
to do "conversion" between double and single operations.
all unused bits when storing FP32 in FP64 form are zero:
the only "conversion" needed is on non-normalised operations
when you load the FP32, and luckily that's possible to do
without needing to raise any Exceptions.
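
a minimal sketch of the FP32-in-FP64 point, using plain IEEE 754 widening in Python rather than anything Power-specific: every FP32 value is exactly representable as FP64, the low 29 mantissa bits (52 - 23) come out zero, and an FP32 denormal becomes an ordinary normalised FP64 number. the helper name `fp32_bits_as_fp64` is hypothetical.

```python
import struct

LOW_29 = (1 << 29) - 1   # FP64 has 52 mantissa bits, FP32 has 23: 29 left over

def fp32_bits_as_fp64(value):
    """Round value to FP32, widen it to FP64, and return the 64-bit pattern."""
    single = struct.unpack('>f', struct.pack('>f', value))[0]   # round to FP32
    return struct.unpack('>Q', struct.pack('>d', single))[0]    # FP64 bit pattern

def exponent_field(bits):
    """Extract the 11-bit biased exponent from an FP64 bit pattern."""
    return (bits >> 52) & 0x7FF

# 0.1 is not exact in FP32, yet its widened form still has zero low bits.
tenth = fp32_bits_as_fp64(0.1)

# The smallest FP32 denormal, 2**-149, is a normal FP64 number after widening.
tiny = fp32_bits_as_fp64(2.0 ** -149)
```

here `tenth & LOW_29` is zero, and `exponent_field(tiny)` is 1023 - 149 = 874 with an all-zero mantissa, which is the renormalisation step mentioned above.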

Mitch's 66000 ISA helps reduce some of the conversion
proliferation (transfer between GPR and FPR) by having
just the one 64-entry regfile. RISC-V and other RISC ISAs
have a 2-bit dedicated field for specifying the bit-width
of the FP operation (FP16 FP32 FP64 FP80/128)

but i digress, despite this being important context outlining
that the problem of opcode proliferation is not limited to
Vector ISAs.

how do you reduce that opcode proliferation? how do you
not end up dedicating 2 critical bits to the width of operations
(like RISC-V) where you are already under pressure (say with
16-bit Compressed), especially when (say) you want source
*and* destination width-specifiers?

there are a few solutions i have encountered (i would be
interested to hear of more)

* "tagged" registers. ForwardCom actually has tags already
(the vector length of each vector register is a "tag" in the
register itself. why not also include the element width?)
* "Control Status Registers". Power ISA has "bit-accuracy"
CSR bits, already, where if you want all FP operations to
be reduced accuracy (faster speed) you are permitted to
do so. there is nothing to stop you using a couple of bits
of CSR to specify "all integer operations normally marked
as 64-bit are now actually 32-bit" (and vice-versa)
* "Prefixing". a RISC-uniform Prefix instruction (similar to
ARM SVE MOVPRFX) may specify "the following instruction's
source (and destination) registers are NN-bit wide"
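
as a rough illustration of the "Prefixing" option, here is a toy decoder in which a prefix entry overrides the operand width of the next instruction only. the `('prefix', n)` encoding, the 64-bit default and the tuple format are all assumptions made up for this sketch; they are not taken from any real ISA.

```python
DEFAULT_WIDTH = 64   # assumed default operand width for the toy ISA

def decode(stream):
    """Yield (opcode, width) pairs from a stream of (opcode, argument) tuples.

    A ('prefix', n) entry emits nothing itself; it sets the width of the
    single instruction that follows it.
    """
    pending = None
    for op, arg in stream:
        if op == 'prefix':
            pending = arg            # remember the override for the next instruction
            continue
        yield op, (pending if pending is not None else DEFAULT_WIDTH)
        pending = None               # the override applies once, then expires

program = [('add', None), ('prefix', 32), ('add', None), ('mul', None)]
decoded = list(decode(program))
```

the same `add` opcode decodes as 64-bit or 32-bit depending on what precedes it, which also shows where the decoder cost comes from: `pending` is state that has to be carried from one instruction to the next.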

there's a few more out there: they all have advantages and
disadvantages, but at least they do not poison the core of
the ISA with massive opcode proliferation.

fascinatingly, though, they *do* still impact the Decoder Phase
to some extent, i'd be interested to hear peoples' thoughts on
how to overcome some of those problems.

l.

Re: Introducing ForwardCom: An open ISA with variable-length vector

<uehonq$3jgbf$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=34250&group=comp.arch#34250

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
Date: Thu, 21 Sep 2023 10:50:15 -0500
Organization: A noiseless patient Spider
Lines: 115
Message-ID: <uehonq$3jgbf$1@dont-email.me>
References: <udickr$5rc3$1@dont-email.me>
<memo.20230916204926.16432A@jgd.cix.co.uk> <ue6rle$ceh2$1@dont-email.me>
<ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 21 Sep 2023 15:50:18 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="e2fc4c590642e6fbf0dcc38b70c97aae";
logging-data="3785071"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+rV1nrVRXkg5bucmrdr8Fc"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.15.1
Cancel-Lock: sha1:WqNqRm0x72VTeFTqlzhB/Y3Gh9o=
In-Reply-To: <287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com>
Content-Language: en-US
 by: BGB - Thu, 21 Sep 2023 15:50 UTC

On 9/21/2023 3:45 AM, luke.l...@gmail.com wrote:
> somewhere in this thread a couple weeks back i mentioned
> about the "overloading" of opcodes being disastrous for
> forwards-compatibility (i mention "forwards" specifically because
> ForwardCom is that), but that also there is the hobson's
> choice of adding literal-duplicates of opcodes resulting in
> the exact same concept behind "SIMD considered harmful"
> also doing the exact same damage not just to a SIMD ISA
> but to a True-Scalable Vector ISA (such as ForwardCom,
> MRISC32, RVV) and it turns out *even to a Scalar ISA*.
>
> for a Scalar ISA it is "not so bad", you may only have to add
> say 20-50 duplicate integer arithmetic operations (duplicating
> the 32-bit ISA, adding 64-bit equivalents), duplicating say
> 70 or so FP32 operations and making FP64 variants, *and*
> remembering also to add converters between the same.
>
> Scalar Power ISA made the mistake of overloading the 32-bit
> integer opcodes with 64-bit meanings, in a rushed-upgrade
> about 20 years ago that they weren't really ready for. they *did*
> however make a fantastic decision to support FP32 format
> *in the exact same form as FP64* such that you *do not* have
> to do "conversion" between double and single operations.
> all unused bits when storing FP32 in FP64 form are zero:
> the only "conversion" needed is on non-normalised operations
> when you load the FP32, and luckily that's possible to do
> without needing to raise any Exceptions.
>
> Mitch's 66000 ISA helps reduce some of the conversion
> proliferation (transfer between GPR and FPR) by having
> just the one 64-entry regfile. RISC-V and other RISC ISAs
> have a 2-bit dedicated field for specifying the bit-width
> of the FP operation (FP16 FP32 FP64 FP80/128)
>
> but i digress, despite this being important context outlining
> that the problem of opcode proliferation is not limited to
> Vector ISAs.
>
> how do you reduce that opcode proliferation? how do you
> not end up dedicating 2 critical bits to the width of operations
> (like RISC-V) where you are already under pressure (say with
> 16-bit Compressed), especially when (say) you want source
> *and* destination width-specifiers?
>
> there are a few solutions i have encountered (i would be
> interested to hear of more)
>
> * "tagged" registers. ForwardCom actually has tags already
> (the vector length of each vector register is a "tag" in the
> register itself. why not also include the element width?)
> * "Control Status Registers". Power ISA has "bit-accuracy"
> CSR bits, already, where if you want all FP operations to
> be reduced accuracy (faster speed) you are permitted to
> do so. there is nothing to stop you using a couple of bits
> of CSR to specify "all integer operations normally marked
> as 64-bit are now actually 32-bit" (and vice-versa)
> * "Prefixing". a RISC-uniform Prefix instruction (similar to
> ARM SVE MOVPRFX) may specify "the following instruction's
> source (and destination) registers are NN-bit wide"
>
> there's a few more out there: they all have advantages and
> disadvantages, but at least they do not poison the core of
> the ISA with massive opcode proliferation.
>
> fascinatingly, though, they *do* still impact the Decoder Phase
> to some extent, i'd be interested to hear peoples' thoughts on
> how to overcome some of those problems.
>

One can always go a route similar to the FPU (and "SIMD") ops in SuperH
and use some flags in a control register somewhere to effectively change
out which instructions are being encoded.

Or, going further, one could "bank switch" parts of the ISA.

Conceivably, all one really needs is:
Some dedicated control bits somewhere;
A mechanism to change these bits;
(Presumably) a way to trigger a pipeline flush so that everything is
decoded as expected.

AFAICT, in SH, the idea was that modifying one of the offending control
registers would implicitly invoke a pipeline flush (say, by implicitly
causing the CPU to do a full branch to the following instruction).
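
A minimal sketch of that idea, loosely modelled on the control-register approach described above: one control bit selects which operation a given opcode decodes to, and changing the bit counts as a pipeline flush. The class, the bit name `pr` and the mnemonic strings are hypothetical and only illustrate the mechanism.

```python
class Core:
    """Toy core where one control bit changes what an opcode means."""

    def __init__(self):
        self.pr = 0        # hypothetical precision-select control bit
        self.flushes = 0   # number of pipeline flushes triggered so far

    def set_pr(self, value):
        """Change the control bit; anything already decoded is now stale."""
        if value != self.pr:
            self.pr = value
            self.flushes += 1   # model the implicit branch-to-next-instruction

    def decode(self, opcode):
        """Same opcode bits, different operation depending on the control bit."""
        table = {('fadd', 0): 'fadd.s', ('fadd', 1): 'fadd.d'}
        return table[(opcode, self.pr)]
```

For example, `Core().decode('fadd')` gives `'fadd.s'` until `set_pr(1)` is called, after which the same opcode decodes as `'fadd.d'`.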

Well, this was along with some variations of the ISA using fixed-length
16-bit instructions, and some other variants using a 16/32 encoding, and
some variants having different sets of instructions in the same
locations (so, effectively, SH2/SH3/SH4 variants would necessarily
remain binary incompatible with each other, and later iterations had
diverged down different paths).

In my original form of BJX1, I had replaced a few of these cases with
explicit instructions to Set/Clear/Invert some of these bits (in the
original SH ISA, one would typically need to reload the register state
via an in-memory literal). This was along with awkwardly gluing some
parts of later SH2 derivatives onto an otherwise SH4 based design, ...
(and whole thing sort of turning into an ugly mess).

Originally, pretty much all of this was eliminated for BJX2, eliminating
all use of contextually encoded instructions.
Well, except that now XG2 can be considered a variation of this idea,
but more switches out the general encoding scheme rather than individual
groups of instructions; and is currently applied at the scale of an
entire program executable (theoretically, it is possible to jump back
and forth between them, but this is not currently done in BGBCC).

But, if one runs out of encoding space, one will find a way...

> l.

Re: Introducing ForwardCom: An open ISA with variable-length vector

<uehoos$3jgc2$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=34251&group=comp.arch#34251

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: sfuld@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
Date: Thu, 21 Sep 2023 08:50:52 -0700
Organization: A noiseless patient Spider
Lines: 88
Message-ID: <uehoos$3jgc2$1@dont-email.me>
References: <udickr$5rc3$1@dont-email.me>
<memo.20230916204926.16432A@jgd.cix.co.uk> <ue6rle$ceh2$1@dont-email.me>
<ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 21 Sep 2023 15:50:52 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="626fabe0c5544ad31d7d93442abd8941";
logging-data="3785090"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19fOcv1c4wxDNrNtQf1UdNS0Ist1DhsIwc="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:vrWlBtBfCbvF0Avfh3e9xDuD73I=
In-Reply-To: <287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com>
Content-Language: en-US
 by: Stephen Fuld - Thu, 21 Sep 2023 15:50 UTC

On 9/21/2023 1:45 AM, luke.l...@gmail.com wrote:
> somewhere in this thread a couple weeks back i mentioned
> about the "overloading" of opcodes being disastrous for
> forwards-compatibility (i mention "forwards" specifically because
> ForwardCom is that), but that also there is the hobson's
> choice of adding literal-duplicates of opcodes resulting in
> the exact same concept behind "SIMD considered harmful"
> also doing the exact same damage not just to a SIMD ISA
> but to a True-Scalable Vector ISA (such as ForwardCom,
> MRISC32, RVV) and it turns out *even to a Scalar ISA*.
>
> for a Scalar ISA it is "not so bad", you may only have to add
> say 20-50 duplicate integer arithmetic operations (duplicating
> the 32-bit ISA, adding 64-bit equivalents), duplicating say
> 70 or so FP32 operations and making FP64 variants, *and*
> remembering also to add converters between the same.
>
> Scalar Power ISA made the mistake of overloading the 32-bit
> integer opcodes with 64-bit meanings, in a rushed-upgrade
> about 20 years ago that they weren't really ready for. they *did*
> however make a fantastic decision to support FP32 format
> *in the exact same form as FP64* such that you *do not* have
> to do "conversion" between double and single operations.
> all unused bits when storing FP32 in FP64 form are zero:
> the only "conversion" needed is on non-normalised operations
> when you load the FP32, and luckily that's possible to do
> without needing to raise any Exceptions.
>
> Mitch's 66000 ISA helps reduce some of the conversion
> proliferation (transfer between GPR and FPR) by having
> just the one 64-entry regfile. RISC-V and other RISC ISAs
> have a 2-bit dedicated field for specifying the bit-width
> of the FP operation (FP16 FP32 FP64 FP80/128)
>
> but i digress, despite this being important context outlining
> that the problem of opcode proliferation is not limited to
> Vector ISAs.
>
> how do you reduce that opcode proliferation? how do you
> not end up dedicating 2 critical bits to the width of operations
> (like RISC-V) where you are already under pressure (say with
> 16-bit Compressed), especially when (say) you want source
> *and* destination width-specifiers?
>
> there are a few solutions i have encountered (i would be
> interested to hear of more)
>
> * "tagged" registers. ForwardCom actually has tags already
> (the vector length of each vector register is a "tag" in the
> register itself. why not also include the element width?)

There are several "variants" of tagging. It is a question of where to
put the bits indicating operand data type/width. For reference, as you
indicate, most ISAs put them in the op-codes. The Burroughs large scale
system put them in each memory location (SIMD wasn't a thing back then).
The Mill is in-between Burroughs and most current ISAs in that it puts
them in the load instructions, then keeps them along with the
"registers" (Mill's belt slots). Each solution has plusses and minuses.
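
To make the in-between variant concrete, here is a small sketch in which only the load names a width and the value then carries that width along as a tag, so a single width-agnostic `add` suffices. This is an assumption-laden toy, not the Mill's actual semantics: the tuple representation, the little-endian load and the rule that mismatched widths are an error are all choices made for the example.

```python
def load(memory, addr, width_bytes):
    """Load width_bytes from memory and tag the value with its width in bits."""
    raw = int.from_bytes(memory[addr:addr + width_bytes], 'little')
    return (raw, width_bytes * 8)

def add(a, b):
    """Width-agnostic add: the operand tags, not the opcode, select the width."""
    (va, wa), (vb, wb) = a, b
    if wa != wb:
        raise ValueError("operand widths differ")   # assumed policy for this toy
    return ((va + vb) & ((1 << wa) - 1), wa)         # wrap to the tagged width

mem = bytes([0xFF, 0x01, 0x00, 0x00])
byte_sum = add(load(mem, 0, 1), load(mem, 1, 1))   # 8-bit add, wraps around
half_sum = add(load(mem, 0, 2), load(mem, 2, 2))   # same add, now 16-bit
```

The same `add` wraps at 8 bits in the first call and at 16 bits in the second, with the width information supplied only by the loads.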

> * "Control Status Registers". Power ISA has "bit-accuracy"
> CSR bits, already, where if you want all FP operations to
> be reduced accuracy (faster speed) you are permitted to
> do so. there is nothing to stop you using a couple of bits
> of CSR to specify "all integer operations normally marked
> as 64-bit are now actually 32-bit" (and vice-versa)
> * "Prefixing". a RISC-uniform Prefix instruction (similar to
> ARM SVE MOVPRFX) may specify "the following instruction's
> source (and destination) registers are NN-bit wide"
>
> there's a few more out there: they all have advantages and
> disadvantages, but at least they do not poison the core of
> the ISA with massive opcode proliferation.
>
> fascinatingly, though, they *do* still impact the Decoder Phase
> to some extent, i'd be interested to hear peoples' thoughts on
> how to overcome some of those problems.

Interesting topic. It is directly related to how to "future proof" an
ISA and prevent, or at least minimize, the mess that most long-lived ISAs
have become.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Introducing ForwardCom: An open ISA with variable-length vector

<ebe63809-1001-4076-b293-12e0d3b560bbn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34254&group=comp.arch#34254

X-Received: by 2002:a05:620a:458a:b0:770:f19d:d6ac with SMTP id bp10-20020a05620a458a00b00770f19dd6acmr89065qkb.0.1695320402535;
Thu, 21 Sep 2023 11:20:02 -0700 (PDT)
X-Received: by 2002:a05:6808:3992:b0:3ad:f3e6:6706 with SMTP id
gq18-20020a056808399200b003adf3e66706mr2379518oib.8.1695320402323; Thu, 21
Sep 2023 11:20:02 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!border-2.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 21 Sep 2023 11:20:02 -0700 (PDT)
In-Reply-To: <287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:2978:ff9b:2d22:339c;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:2978:ff9b:2d22:339c
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me> <287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ebe63809-1001-4076-b293-12e0d3b560bbn@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Thu, 21 Sep 2023 18:20:02 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 80
 by: MitchAlsup - Thu, 21 Sep 2023 18:20 UTC

On Thursday, September 21, 2023 at 3:45:10 AM UTC-5, luke.l...@gmail.com wrote:
> somewhere in this thread a couple weeks back i mentioned
> about the "overloading" of opcodes being disastrous for
> forwards-compatibility (i mention "forwards" specifically because
> ForwardCom is that), but that also there is the hobson's
> choice of adding literal-duplicates of opcodes resulting in
> the exact same concept behind "SIMD considered harmful"
> also doing the exact same damage not just to a SIMD ISA
> but to a True-Scalable Vector ISA (such as ForwardCom,
> MRISC32, RVV) and it turns out *even to a Scalar ISA*.
>
> for a Scalar ISA it is "not so bad", you may only have to add
> say 20-50 duplicate integer arithmetic operations (duplicating
> the 32-bit ISA, adding 64-bit equivalents), duplicating say
> 70 or so FP32 operations and making FP64 variants, *and*
> remembering also to add converters between the same.
>
> Scalar Power ISA made the mistake of overloading the 32-bit
> integer opcodes with 64-bit meanings, in a rushed-upgrade
> about 20 years ago that they weren't really ready for. they *did*
> however make a fantastic decision to support FP32 format
> *in the exact same form as FP64* such that you *do not* have
> to do "conversion" between double and single operations.
> all unused bits when storing FP32 in FP64 form are zero:
> the only "conversion" needed is on non-normalised operations
> when you load the FP32, and luckily that's possible to do
> without needing to raise any Exceptions.
>
> Mitch's 66000 ISA helps reduce some of the conversion
> proliferation (transfer between GPR and FPR) by having
> just the one 64-entry regfile. RISC-V and other RISC ISAs
> have a 2-bit dedicated field for specifying the bit-width
> of the FP operation (FP16 FP32 FP64 FP80/128)
>
> but i digress, despite this being important context outlining
> that the problem of opcode proliferation is not limited to
> Vector ISAs.
>
> how do you reduce that opcode proliferation? how do you
> not end up dedicating 2 critical bits to the width of operations
> (like RISC-V) where you are already under pressure (say with
> 16-bit Compressed), especially when (say) you want source
> *and* destination width-specifiers?
>
> there are a few solutions i have encountered (i would be
> interested to hear of more)
>
> * "tagged" registers. ForwardCom actually has tags already
> (the vector length of each vector register is a "tag" in the
> register itself. why not also include the element width?)
> * "Control Status Registers". Power ISA has "bit-accuracy"
> CSR bits, already, where if you want all FP operations to
> be reduced accuracy (faster speed) you are permitted to
> do so. there is nothing to stop you using a couple of bits
> of CSR to specify "all integer operations normally marked
> as 64-bit are now actually 32-bit" (and vice-versa)
> * "Prefixing". a RISC-uniform Prefix instruction (similar to
> ARM SVE MOVPRFX) may specify "the following instruction's
> source (and destination) registers are NN-bit wide"
<
* "Bookends". A pair of RISC instructions, one at the beginning
of a Loop and the other at the end of the Loop. The instructions
"in" the loop can be performed in SIMD-fashion, the LDs give
the widths of the inbound containers, the STs give the width
of the outbound containers. The VEC instruction contains the
registers that are live-out of the Loop, and the LOOP instruction
performs the Loop repetition.
<
You get all of the SIMDification and Vectorization from 2
total instructions instead of 1300, or 400.......
>
> there's a few more out there: they all have advantages and
> disadvantages, but at least they do not poison the core of
> the ISA with massive opcode proliferation.
>
> fascinatingly, though, they *do* still impact the Decoder Phase
> to some extent, i'd be interested to hear peoples' thoughts on
> how to overcome some of those problems.
>
> l.

Re: Introducing ForwardCom: An open ISA with variable-length vector

<jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>

https://news.novabbs.org/devel/article-flat.php?id=34302&group=comp.arch#34302

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: monnier@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
Date: Mon, 25 Sep 2023 12:23:58 -0400
Organization: A noiseless patient Spider
Lines: 48
Message-ID: <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
References: <udickr$5rc3$1@dont-email.me>
<memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Info: dont-email.me; posting-host="13b9fbd8a7289dbef00e8dc30f1373d7";
logging-data="2118613"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/XtSGyReMOpny2ioOTutHFnJWzaWy6vGE="
User-Agent: Gnus/5.13 (Gnus v5.13)
Cancel-Lock: sha1:P4Mj0tFRR1cIDc6MiolBhNTV97k=
sha1:CTgIehE8hpm8MK3/EQu63wcLqJo=
 by: Stefan Monnier - Mon, 25 Sep 2023 16:23 UTC

> there are a few solutions i have encountered (i would be
> interested to hear of more)
> * "tagged" registers. ForwardCom actually has tags already
[...]
> * "Control Status Registers". Power ISA has "bit-accuracy"
[...]
> * "Prefixing". a RISC-uniform Prefix instruction (similar to
[...]
> fascinatingly, though, they *do* still impact the Decoder Phase
> to some extent, i'd be interested to hear peoples' thoughts on
> how to overcome some of those problems.

I think you can't get a good answer before clarifying what it is that
you consider as the problem in "opcode proliferation" (after all,
"prefixing" is just another name for variable-length instructions, and
what is considered as "opcode" vs "immediate operand" within an
instruction is largely philosophical).

As you point out, part of the complexity doesn't come from
instruction encoding, really, but from the actual desired semantics.

So in terms of instruction encoding, the main issues would be:

A. Code size.
B. Not preventing the core from working at full speed.
C. The cost of decoding.

(C) can only bite when backward compatibility forces inconvenient
encodings, but even the amd64 architecture seems to do fine, so it
doesn't seem to be a serious issue.

I think (B) is never an unsolvable problem. E.g. CSR-based solutions
sometimes require pipeline drainage, but that's usually solvable
without changing the encoding by passing the CSR through the pipeline.
But maybe tagged registers could be problematic because you end up
getting this info too late, in the execution rather than in the decode?

So I'd guess it's really mostly a code-size issue (beside the semantic
issues when you want to make it so old code can work with new sizes, of
course, but IIUC you were talking about the scalar case where this seems
to be too problematic to even contemplate).

My gut tells me that in terms of code size, tagged registers should be
the better choice. But I don't know of any actual efforts to try and
confirm it experimentally.

Stefan

Re: Introducing ForwardCom: An open ISA with variable-length vector

<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34344&group=comp.arch#34344

X-Received: by 2002:ac8:7317:0:b0:405:4ef2:b3b1 with SMTP id x23-20020ac87317000000b004054ef2b3b1mr100181qto.0.1696032714368;
Fri, 29 Sep 2023 17:11:54 -0700 (PDT)
X-Received: by 2002:a05:6870:9566:b0:1bb:5fac:524e with SMTP id
v38-20020a056870956600b001bb5fac524emr2230265oal.5.1696032714083; Fri, 29 Sep
2023 17:11:54 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 29 Sep 2023 17:11:53 -0700 (PDT)
In-Reply-To: <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:7d44:3c21:e8cc:7b9b;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:7d44:3c21:e8cc:7b9b
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Sat, 30 Sep 2023 00:11:54 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Sat, 30 Sep 2023 00:11 UTC

On Monday, September 25, 2023 at 11:25:30 AM UTC-5, Stefan Monnier wrote:
> > there are a few solutions i have encountered (i would be
> > interested to hear of more)
> > * "tagged" registers. ForwardCom actually has tags already
> [...]
> > * "Control Status Registers". Power ISA has "bit-accuracy"
> [...]
> > * "Prefixing". a RISC-uniform Prefix instruction (similar to
> [...]
> > fascinatingly, though, they *do* still impact the Decoder Phase
> > to some extent, i'd be interested to hear peoples' thoughts on
> > how to overcome some of those problems.
<
> I think you can't get a good answer before clarifying what it is that
> you consider as the problem in "opcode proliferation" (after all,
<
The problem is that the Cartesian-product associated with SIMD
causes thousands of microscopic instructions to be needed (for
example ARM has at least 1,300 SIMD instructions, others worse.)
<
It is my contention that sooner or later one needs to respect the R
as the first letter in RISC and make it stand for REDUCED !!
<
It is also my contention that nobody with more than <wave hands>
200 instructions can be called RISC.
<
> "prefixing" is just another name for variable-length instructions, and
> what is considered as "opcode" vs "immediate operand" within an
> instruction is largely philosophical).
>
> As you point out, part of the complexity doesn't come from
> instruction encoding, really, but from the actual desired semantics.
>
> So in terms of instruction encoding, the main issues would be:
>

Re: Introducing ForwardCom: An open ISA with variable-length vector

<95KRM.207899$Hih7.53988@fx11.iad>

https://news.novabbs.org/devel/article-flat.php?id=34345&group=comp.arch#34345

Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx11.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
Newsgroups: comp.arch
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk> <ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me> <287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org> <901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com>
Lines: 25
Message-ID: <95KRM.207899$Hih7.53988@fx11.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Sat, 30 Sep 2023 00:37:25 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Sat, 30 Sep 2023 00:37:25 GMT
X-Received-Bytes: 2090
 by: Scott Lurndal - Sat, 30 Sep 2023 00:37 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>On Monday, September 25, 2023 at 11:25:30 AM UTC-5, Stefan Monnier wrote:
>> > there are a few solutions i have encountered (i would be
>> > interested to hear of more)
>> > * "tagged" registers. ForwardCom actually has tags already
>> [...]
>> > * "Control Status Registers". Power ISA has "bit-accuracy"
>> [...]
>> > * "Prefixing". a RISC-uniform Prefix instruction (similar to
>> [...]
>> > fascinatingly, though, they *do* still impact the Decoder Phase
>> > to some extent, i'd be interested to hear peoples' thoughts on
>> > how to overcome some of those problems.
><
>> I think you can't get a good answer before clarifying what it is that
>> you consider as the problem in "opcode proliferation" (after all,
><
>The problem is that the Cartesian-product associated with SIMD
>causes thousands of microscopic instructions to be needed (for
>example ARM has at least 1,300 SIMD instructions, others worse.)

That's a bit misleading, as you're counting individual instruction
words (opcode + all variations of source and destination register(s)).

Re: Introducing ForwardCom: An open ISA with variable-length vector

<72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34346&group=comp.arch#34346

X-Received: by 2002:a05:620a:1d06:b0:770:7460:c3f6 with SMTP id dl6-20020a05620a1d0600b007707460c3f6mr70100qkb.6.1696035676120;
Fri, 29 Sep 2023 18:01:16 -0700 (PDT)
X-Received: by 2002:a05:6870:5b12:b0:1d6:689b:fe59 with SMTP id
ds18-20020a0568705b1200b001d6689bfe59mr2026987oab.10.1696035675899; Fri, 29
Sep 2023 18:01:15 -0700 (PDT)
Path: i2pn2.org!i2pn.org!news.niel.me!glou.org!news.glou.org!usenet-fr.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 29 Sep 2023 18:01:15 -0700 (PDT)
In-Reply-To: <95KRM.207899$Hih7.53988@fx11.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:7d44:3c21:e8cc:7b9b;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:7d44:3c21:e8cc:7b9b
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <95KRM.207899$Hih7.53988@fx11.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Sat, 30 Sep 2023 01:01:16 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Sat, 30 Sep 2023 01:01 UTC

On Friday, September 29, 2023 at 7:37:30 PM UTC-5, Scott Lurndal wrote:
> MitchAlsup <Mitch...@aol.com> writes:
> >On Monday, September 25, 2023 at 11:25:30 AM UTC-5, Stefan Monnier wrote:
> >> > there are a few solutions i have encountered (i would be
> >> > interested to hear of more)
> >> > * "tagged" registers. ForwardCom actually has tags already
> >> [...]
> >> > * "Control Status Registers". Power ISA has "bit-accuracy"
> >> [...]
> >> > * "Prefixing". a RISC-uniform Prefix instruction (similar to
> >> [...]
> >> > fascinatingly, though, they *do* still impact the Decoder Phase
> >> > to some extent, i'd be interested to hear peoples' thoughts on
> >> > how to overcome some of those problems.
> ><
> >> I think you can't get a good answer before clarifying what it is that
> >> you consider as the problem in "opcode proliferation" (after all,
> ><
> >The problem is that the Cartesian-product associated with SIMD
> >causes thousands of microscopic instructions to be needed (for
> >example ARM has at least 1,300 SIMD instructions, others worse.)
<
> That's a bit misleading, as you're counting individual instruction
> words (opcode + all variations of source and destination register(s)).
<
Not source and destination--but operand size and register size.
<
Register sizes {64, 128, 256, 512}
Operand size {8, 16, 32, 64}
Calculation {int{+, -, ×, /, &, |, ^,}, FP{+, -, ×, /,}, swizzles, gather, scatter, ...}
......
Sooner or later it all adds up to shiploads of instructions. And the instructions
are not backwards compatible,...
<
And in comparison:: I got almost all of that capability with 2 instructions
that guarantee forward and backward compatibility, and scale with
machine resources.
<
2 versus 1300 :: Which one is really RISC ??
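
A rough sketch of how the Cartesian product in the post above grows. The register and operand sizes are the ones listed there; the operation list is cut off in the post ("..."), so the lists below are an illustrative subset, and the function name `simd_variants` is made up for this example. It also pairs every operation with every element width, which a real ISA would not necessarily do.

```python
from itertools import product

# Sets copied from the post. The operation list there ends in "...",
# so this is only a partial, illustrative tally, not a real ISA count.
REGISTER_BITS = [64, 128, 256, 512]
OPERAND_BITS = [8, 16, 32, 64]
INT_OPS = ["+", "-", "*", "/", "&", "|", "^"]
FP_OPS = ["+", "-", "*", "/"]


def simd_variants():
    """Enumerate every (register width, element width, kind, op) combination."""
    ops = [("int", op) for op in INT_OPS] + [("fp", op) for op in FP_OPS]
    return [
        (reg, elem, kind, op)
        for reg, elem, (kind, op) in product(REGISTER_BITS, OPERAND_BITS, ops)
    ]


variants = simd_variants()
print(len(variants))  # 4 register sizes * 4 operand sizes * 11 ops = 176
```

Adding one more register size adds another 4 * 11 = 44 combinations in this toy model, before swizzles, gathers and scatters are counted.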

Re: Introducing ForwardCom: An open ISA with variable-length vector

<041e3c2e-3243-41d9-8cd0-14554e5414e5n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34347&group=comp.arch#34347

X-Received: by 2002:ac8:7eee:0:b0:418:fe8:f3c5 with SMTP id r14-20020ac87eee000000b004180fe8f3c5mr96394qtc.4.1696040142316;
Fri, 29 Sep 2023 19:15:42 -0700 (PDT)
X-Received: by 2002:a05:6870:1a89:b0:1dd:1837:c70b with SMTP id
ef9-20020a0568701a8900b001dd1837c70bmr2150060oab.4.1696040141996; Fri, 29 Sep
2023 19:15:41 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!border-2.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 29 Sep 2023 19:15:41 -0700 (PDT)
In-Reply-To: <72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=99.251.79.92; posting-account=QId4bgoAAABV4s50talpu-qMcPp519Eb
NNTP-Posting-Host: 99.251.79.92
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <95KRM.207899$Hih7.53988@fx11.iad>
<72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <041e3c2e-3243-41d9-8cd0-14554e5414e5n@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
From: robfi680@gmail.com (robf...@gmail.com)
Injection-Date: Sat, 30 Sep 2023 02:15:42 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 83
 by: robf...@gmail.com - Sat, 30 Sep 2023 02:15 UTC

On Friday, September 29, 2023 at 9:01:18 PM UTC-4, MitchAlsup wrote:
> On Friday, September 29, 2023 at 7:37:30 PM UTC-5, Scott Lurndal wrote:
> > MitchAlsup <Mitch...@aol.com> writes:
> > >On Monday, September 25, 2023 at 11:25:30 AM UTC-5, Stefan Monnier wrote:
> > >> > there are a few solutions i have encountered (i would be
> > >> > interested to hear of more)
> > >> > * "tagged" registers. ForwardCom actually has tags already
> > >> [...]
> > >> > * "Control Status Registers". Power ISA has "bit-accuracy"
> > >> [...]
> > >> > * "Prefixing". a RISC-uniform Prefix instruction (similar to
> > >> [...]
> > >> > fascinatingly, though, they *do* still impact the Decoder Phase
> > >> > to some extent, i'd be interested to hear peoples' thoughts on
> > >> > how to overcome some of those problems.
> > ><
> > >> I think you can't get a good answer before clarifying what it is that
> > >> you consider as the problem in "opcode proliferation" (after all,
> > ><
> > >The problem is that the Cartesian-product associated with SIMD
> > >causes thousands of microscopic instructions to be needed (for
> > >example ARM has at least 1,300 SIMD instructions, others worse.)
> <
> > That's a bit misleading, as you're counting individual instruction
> > words (opcode + all variations of source and destination register(s)).
> <
> Not source and destination--but operand size and register size.
> <
> Register sizes {64, 128, 256, 512}
> Operand size {8, 16, 32, 64}
> Calculation {int{+, -, ×, /, &, |, ^,}, FP{+, -, ×, /,}, swizzles, gather, scatter, ...}
> .....
> Sooner or later it all adds up to shiploads of instructions. And the instructions
> are not backwards compatible,...
> <
> And in comparison:: I got almost all of that capability with 2 instructions
> that guarantee forward and backward compatibility, and scale with
> machine resources.
> <
> 2 versus 1300 :: Which one is really RISC ??

Those shiploads of instructions also require more bits to encode. Accepting that
there will be shiploads of instructions leads to wide instructions (40 bits).
Otherwise, one ends up with piles of instruction prefixes. Operands could be
128 bits too, and register size might not stop at 512 bits, so there are potentially even
more instructions on the way. Representing all of them requires a good nomenclature.
Is there a nomenclature project somewhere?

I keep musing over the idea of using a register file tagged with the data type to try
and reduce the instructions, but have discarded it a couple of times. I get bogged
down implementing the rules for the target data type. For example, if an address is
added to an integer the result should be an address. Subtract two addresses and
the result should be an integer. Compare-and-branch causes an exception if the two
data types compared are not the same. Using tagged registers has its own set of
complexities and I am not sure the extra logic would be worth it. I think it may be
easier and less logic (faster) to have the compiler manage all the instructions.
These days most things are done in HLL so how valuable is having a compact
assembler language instruction set?
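
A minimal software model of the tag rules described above, to show how quickly the rule table grows. Only three rules come from the post (address + integer gives an address, address - address gives an integer, a compare of mismatched types raises an exception). Everything else, including the names `Tag`, `TagMismatch`, `add_tag`, `sub_tag` and `check_compare`, and the handling of address + address and integer - address, is an assumption made for the sketch.

```python
from enum import Enum


class Tag(Enum):
    INT = "int"
    ADDR = "addr"


class TagMismatch(Exception):
    """Stands in for the hardware exception described in the post."""


def add_tag(a: Tag, b: Tag) -> Tag:
    """Result tag of a + b."""
    if a is Tag.ADDR and b is Tag.ADDR:
        # Assumption: the post does not say what address + address yields.
        raise TagMismatch("address + address")
    # From the post: an address added to an integer is an address.
    return Tag.ADDR if Tag.ADDR in (a, b) else Tag.INT


def sub_tag(a: Tag, b: Tag) -> Tag:
    """Result tag of a - b."""
    if a is b:
        # From the post: address - address is an integer (int - int too).
        return Tag.INT
    if a is Tag.ADDR:
        return Tag.ADDR  # Assumption: address - integer stays an address.
    raise TagMismatch("integer - address")  # Assumption: treated as an error.


def check_compare(a: Tag, b: Tag) -> None:
    """From the post: compare-and-branch traps when the tags differ."""
    if a is not b:
        raise TagMismatch(f"cannot compare {a.value} with {b.value}")
```

Even with two tags and two arithmetic operations there are already cases the post leaves open, which is one way to read the "bogged down implementing the rules" remark.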

RISCs and virtual vectors (was: Introducing ForwardCom)

<2023Sep30.092454@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=34353&group=comp.arch#34353

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: RISCs and virtual vectors (was: Introducing ForwardCom)
Date: Sat, 30 Sep 2023 07:24:54 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 23
Message-ID: <2023Sep30.092454@mips.complang.tuwien.ac.at>
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk> <ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me> <287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org> <901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <95KRM.207899$Hih7.53988@fx11.iad> <72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com>
Injection-Info: dont-email.me; posting-host="600a6c76e14b8d8276fffe88f4e8507e";
logging-data="806578"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19UqL3Q6ZQqHEEEbRLt3Ksd"
Cancel-Lock: sha1:Ajolq3+0yFc5lv+3grgriSqyJKE=
X-newsreader: xrn 10.11
 by: Anton Ertl - Sat, 30 Sep 2023 07:24 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>And in comparison:: I got almost all of that capability with 2 instructions
>that guarantee forward and backward compatibility, and scale with
>machine resources.
><
>2 versus 1300 :: Which one is really RISC ??

There has been the argument that RISC is not about reduced numbers of
instructions, but about reduced complexity. The Cartesian product
does not make the instructions more complex, only more.

The complexity of your two additional instructions is, from an
architectural POV, the same as two nops, correct? That's good!

OTOH, it leads to the question why we need these instructions at all.
Can your virtual vectors not be implemented as a pure
microarchitectural mechanism without any additional instructions?
What do the additional instructions buy?

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: RISCs and virtual vectors (was: Introducing ForwardCom)

<ae46c988-7f46-4d66-9bae-8e5f03c86682n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34360&group=comp.arch#34360

X-Received: by 2002:a05:620a:2625:b0:774:2ad1:b816 with SMTP id z37-20020a05620a262500b007742ad1b816mr96992qko.4.1696089554492;
Sat, 30 Sep 2023 08:59:14 -0700 (PDT)
X-Received: by 2002:a05:6808:1789:b0:3ae:24b3:8f7d with SMTP id
bg9-20020a056808178900b003ae24b38f7dmr3302212oib.11.1696089554293; Sat, 30
Sep 2023 08:59:14 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 30 Sep 2023 08:59:13 -0700 (PDT)
In-Reply-To: <2023Sep30.092454@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:d819:6090:1710:645;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:d819:6090:1710:645
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <95KRM.207899$Hih7.53988@fx11.iad>
<72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com> <2023Sep30.092454@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ae46c988-7f46-4d66-9bae-8e5f03c86682n@googlegroups.com>
Subject: Re: RISCs and virtual vectors (was: Introducing ForwardCom)
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Sat, 30 Sep 2023 15:59:14 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3243
 by: MitchAlsup - Sat, 30 Sep 2023 15:59 UTC

On Saturday, September 30, 2023 at 2:34:08 AM UTC-5, Anton Ertl wrote:
> MitchAlsup <Mitch...@aol.com> writes:
> >And in comparison:: I got almost all of that capability with 2 instructions
> >that guarantee forward and backward compatibility, and scale with
> >machine resources.
> ><
> >2 versus 1300 :: Which one is really RISC ??
> There has been the argument that RISC is not about reduced numbers of
> instructions, but about reduced complexity. The Cartesian product
> does not make the instructions more complex, only more.
>
> The complexity of your two additional instructions is, from an
> architectural POV, the same as two nops, correct? That's good!
<
Not quite NoOps:: the leading one provides a bit vector of registers that
are live-out from the loop; the trailing one does the ADD-CMP-BC part
of the loop.
>
> OTOH, it leads to the question why we need these instructions at all.
> Can your virtual vectors not be implemented as a pure
> microarchitectural mechanism without any additional instructions?
<
It might be possible to recognize a loop as something special that can
be performed with non-standard HW mechanisms--it is just easier when
the loop self-identifies.
<
> What do the additional instructions buy?
<
Sequencing semantics--mainly in what does NOT need to be performed
(the live-outs for example minimizes the work of exiting the loop).
>
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
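
A small sketch of what "a bit vector of registers that are live-out from the loop" could look like as data. It does not reproduce the actual instruction encoding. The 32-register file, the bit ordering (bit r stands for register r) and the function names are all assumptions for illustration.

```python
def live_out_mask(registers, num_regs=32):
    """Pack a collection of register numbers into one bit vector.

    Assumption: bit r of the result stands for register r.
    """
    mask = 0
    for r in registers:
        if not 0 <= r < num_regs:
            raise ValueError(f"register {r} out of range")
        mask |= 1 << r
    return mask


def is_live_out(mask, r):
    """True if register r has to hold its final value when the loop exits."""
    return (mask >> r) & 1 == 1


# Hypothetical loop whose results are left in r1 and r3 only.
mask = live_out_mask([1, 3])
print(bin(mask))  # 0b1010
```

Any register whose bit is clear need not be written back at loop exit, which matches the "what does NOT need to be performed" point in the reply above.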

Re: RISCs and virtual vectors (was: Introducing ForwardCom)

<ufa4lp$rnqn$1@newsreader4.netcologne.de>

https://news.novabbs.org/devel/article-flat.php?id=34369&group=comp.arch#34369

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-5334-0-2480-d01c-f803-c824.ipv6dyn.netcologne.de!not-for-mail
From: tkoenig@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: RISCs and virtual vectors (was: Introducing ForwardCom)
Date: Sat, 30 Sep 2023 21:41:13 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <ufa4lp$rnqn$1@newsreader4.netcologne.de>
References: <udickr$5rc3$1@dont-email.me>
<memo.20230916204926.16432A@jgd.cix.co.uk> <ue6rle$ceh2$1@dont-email.me>
<ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com>
<jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com>
<95KRM.207899$Hih7.53988@fx11.iad>
<72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com>
<2023Sep30.092454@mips.complang.tuwien.ac.at>
<128abea4-a935-4ef6-b7fe-71f0f1f35be0n@googlegroups.com>
Injection-Date: Sat, 30 Sep 2023 21:41:13 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-5334-0-2480-d01c-f803-c824.ipv6dyn.netcologne.de:2001:4dd7:5334:0:2480:d01c:f803:c824";
logging-data="909143"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sat, 30 Sep 2023 21:41 UTC

luke.l...@gmail.com <luke.leighton@gmail.com> schrieb:

> as Mitch notes in a later post (today), the "Cartesian Product" turns out
> in say Audio DSPs doing SWAR (SIMD Within A Register) such as AndesStar
> to be an O(N^6) clusterpoop. these DSPs, they are under enormous cost
> and power budget pressure (think "C-Media sub-$1 USB Audio PHYs with
> built-in volume control", they potentially only run at 8-12 MHz and are likely
> in 130 or even 180 nm! USB1.1 and/or have a PLL that slaves to the host
> USB bus, the STM32F072 does this: really neat trick)
>
> take even just an ADD, you would think there
> would only ever be one add instruction in a RISC ISA? ehhhhm no
>
> * 8/16 bit source selection doubles that
> * 8/16 bit destination selection doubles it again

Assuming that these processors are indeed RISC (so, a load-
store architecture), then this could be handled by having

- 8-bit sign-extending load
- 8-bit zero-extending load
- 16 bit load
- 8-bit store
- 16-bit store

and always doing 16-bit arithmetic.

Hm, that's five loads and stores so far, still some room :-)

(Or is it a "RISC" in the sense of the MSP430, which is about as
much a RISC as the PDP-11? Then the above does not apply).

> * hi/lo half on source 1 doubles it again
> * hi/lo half on source 2 doubles it again

I'm not quite sure what the functionality is. Is it possible
to effectively treat each half of a register as a sub-register in
the ISA? Then, you need the bits for it.

> * signed and unsigned saturation triples the number of ADD operations
> (clipping in audio is important rather than getting wrapping distortion)

Yes, then you also need separate eight-bit additions.

> * "average-add" (x+y+1)>>1 (for Audio this is crucial) doubles again
> (if you only have 16 bit audio and a low-speed DSP you cannot afford
> to do that kind of calculation in 3 instructions, and you need 32-bit regs)

So, another addition...

> we are up to 6 dimensions and a whopping NINETY SIX *commercially necessary*
> variants on what is supposed to be one simple ADD operation!

Re: RISCs and virtual vectors (was: Introducing ForwardCom)

<c0a376dc-2c9b-440a-91df-0beea9c7d3c7n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34370&group=comp.arch#34370

X-Received: by 2002:ac8:7541:0:b0:412:7ea:37c9 with SMTP id b1-20020ac87541000000b0041207ea37c9mr132563qtr.5.1696111194881;
Sat, 30 Sep 2023 14:59:54 -0700 (PDT)
X-Received: by 2002:a05:6808:2219:b0:3ad:aeed:7eeb with SMTP id
bd25-20020a056808221900b003adaeed7eebmr3932478oib.6.1696111194553; Sat, 30
Sep 2023 14:59:54 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 30 Sep 2023 14:59:54 -0700 (PDT)
In-Reply-To: <ufa4lp$rnqn$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:d819:6090:1710:645;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:d819:6090:1710:645
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <95KRM.207899$Hih7.53988@fx11.iad>
<72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com> <2023Sep30.092454@mips.complang.tuwien.ac.at>
<128abea4-a935-4ef6-b7fe-71f0f1f35be0n@googlegroups.com> <ufa4lp$rnqn$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <c0a376dc-2c9b-440a-91df-0beea9c7d3c7n@googlegroups.com>
Subject: Re: RISCs and virtual vectors (was: Introducing ForwardCom)
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Sat, 30 Sep 2023 21:59:54 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 5520
 by: MitchAlsup - Sat, 30 Sep 2023 21:59 UTC

On Saturday, September 30, 2023 at 4:41:16 PM UTC-5, Thomas Koenig wrote:
> luke.l...@gmail.com <luke.l...@gmail.com> schrieb:
> > as Mitch notes in a later post (today), the "Cartesian Product" turns out
> > in say Audio DSPs doing SWAR (SIMD Within A Register) such as AndesStar
> > to be an O(N^6) clusterpoop. these DSPs, they are under enormous cost
> > and power budget pressure (think "C-Media sub-$1 USB Audio PHYs with
> > built-in volume control", they potentially only run at 8-12 MHz and are likely
> > in 130 or even 180 nm! USB1.1 and/or have a PLL that slaves to the host
> > USB bus, the STM32F072 does this: really neat trick)
> >
> > take even just an ADD, you would think there
> > would only ever be one add instruction in a RISC ISA? ehhhhm no
> >
> > * 8/16 bit source selection doubles that
> > * 8/16 bit destination selection doubles it again
> Assuming that these processors are indeed RISC (so, a load-
> store architecture), then this could be handled by having
>
> - 8-bit sign-extending load
> - 8-bit zero-extending load
> - 16 bit load
> - 8-bit store
> - 16-bit store
<
7 LDs, 4 STs:: the LDs need sign/zero extension (except for the largest).
<
But we started off with SIMD, and there registers go up to 512 bits (so far).
At 512 bits, one needs {8, 16, 32, 64, 128, 256}×{s, u}+{512} for loading data.
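
The load/store counts above can be checked with a few lines. This is only a counting sketch: the function names are made up, and it assumes power-of-two widths starting at 8 bits, with sign/zero extension needed for every width below the register width.

```python
def load_variants(max_bits):
    """Loads needed for a register of max_bits.

    Every narrower width comes in signed and unsigned form; the full
    width needs no extension, so it is counted once.
    """
    narrower = []
    width = 8
    while width < max_bits:
        narrower.append(width)
        width *= 2
    return [(w, ext) for w in narrower for ext in ("s", "u")] + [(max_bits, None)]


def store_variants(max_bits):
    """Stores need no extension, so there is one per width."""
    widths = []
    width = 8
    while width <= max_bits:
        widths.append(width)
        width *= 2
    return widths


print(len(load_variants(64)), len(store_variants(64)))  # 7 4
print(len(load_variants(512)))                          # 13
```

With 64-bit registers this gives the 7 LDs and 4 STs quoted above; at 512 bits the load count grows to 13.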
>
> and always doing 16-bit arithmetic.
>
> Hm, that's five loads and stores so far, still some room :-)
<
Yes, exactly; isolating data width to LD and ST instructions saves instruction
count big time--UNTIL--you need multiple calculations within a single register
width. So, a real RISC machine has 1 (or 2) ADDs, whereas a SIMD machine
has {widths}×{signs}×{formats} (and more as EricP illustrated above.)
>
> (Or is it a "RISC" in the sense of the MSP430, which is about as
> much a RISC as the PDP-11? Then the above does not apply).
<
RISC means::
LD/ST architecture (no LD-OPs or LD-OP-STs)
Lots of GPRs (you might get away with 16 but 32 is general min)
Hardwired sequencing
Instruction pipelining
Driven primarily by compiler (with just a dabbling of ASM)
<
> > * hi/lo half on source 1 doubles it again
> > * hi/lo half on source 2 doubles it again
> I'm not quite sure what the functionality is. Is it possible
> to effectively treat each half of a register as a sub-register in
> the ISA? Then, you need the bits for it.
<
EricP is talking about a 512-bit register having {64,32,16,8,{4,2,1}}
values in that 1 register. And, yes, you need bits to specify each
nuance.
<
> > * signed and unsigned saturation triples the number of ADD operations
> > (clipping in audio is important rather than getting wrapping distortion)
> Yes, then you also need separate eight-bit additions.
> > * "average-add" (x+y+1)>>1 (for Audio this is crucial) doubles again
> > (if you only have 16 bit audio and a low-speed DSP you cannot afford
> > to do that kind of calculation in 3 instructions, and you need 32-bit regs)
> So, another addition...
> > we are up to 6 dimensions and a whopping NINETY SIX *commercially necessary*
> > variants on what is supposed to be one simple ADD operation!
<
Now take a look at what happens when the very vast majority of code
does not need any of "that" and you specify variants with prefixes (or
postfixes), you end up back in RISC land being able to specify each of
those variants but you have a sum of prefixing bits not a Cartesian product
of instructions. Here, when you double the width of a SIMD register, you
add 1 prefix (instead of 60+ each) and you have access to all its functionality.
<
......
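
One way to read the product-versus-sum argument above, using the six ADD dimensions from the quoted list. Each option count comes from that list; the dictionary keys and function names are made up, and the assumption that each dimension gets its own small binary-encoded modifier field is mine, not a description of any particular ISA.

```python
from math import prod

# Option counts per dimension, taken from the quoted ADD example.
DIMENSIONS = {
    "source width (8/16)": 2,
    "destination width (8/16)": 2,
    "source 1 half (hi/lo)": 2,
    "source 2 half (hi/lo)": 2,
    "saturation (none/signed/unsigned)": 3,
    "average-add (off/on)": 2,
}


def distinct_opcodes(dims):
    """One opcode per combination: the Cartesian product."""
    return prod(dims.values())


def modifier_bits(dims):
    """One binary field per dimension: field widths add up instead."""
    return sum((n - 1).bit_length() for n in dims.values())


print(distinct_opcodes(DIMENSIONS))  # 96
print(modifier_bits(DIMENSIONS))     # 7
```

Under these assumptions the 96 separate ADD opcodes collapse to one ADD plus 7 modifier bits, and a new two-way option costs one more bit rather than doubling the opcode count.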

Re: RISCs and virtual vectors (was: Introducing ForwardCom)

<a1ffb79c-ce7e-459d-a254-effed976b6aan@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34371&group=comp.arch#34371

X-Received: by 2002:ad4:4b2f:0:b0:640:116e:d176 with SMTP id s15-20020ad44b2f000000b00640116ed176mr110809qvw.3.1696115513217; Sat, 30 Sep 2023 16:11:53 -0700 (PDT)
X-Received: by 2002:a05:6808:181d:b0:3a7:2eb4:cdf0 with SMTP id bh29-20020a056808181d00b003a72eb4cdf0mr3584722oib.2.1696115512976; Sat, 30 Sep 2023 16:11:52 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!feeder.usenetexpress.com!tr3.iad1.usenetexpress.com!69.80.99.11.MISMATCH!border-1.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 30 Sep 2023 16:11:52 -0700 (PDT)
In-Reply-To: <ufa4lp$rnqn$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=82.132.234.46; posting-account=soFpvwoAAADIBXOYOBcm_mixNPAaxW9p
NNTP-Posting-Host: 82.132.234.46
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk> <ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me> <287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org> <901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <95KRM.207899$Hih7.53988@fx11.iad> <72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com> <2023Sep30.092454@mips.complang.tuwien.ac.at> <128abea4-a935-4ef6-b7fe-71f0f1f35be0n@googlegroups.com> <ufa4lp$rnqn$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <a1ffb79c-ce7e-459d-a254-effed976b6aan@googlegroups.com>
Subject: Re: RISCs and virtual vectors (was: Introducing ForwardCom)
From: luke.leighton@gmail.com (luke.l...@gmail.com)
Injection-Date: Sat, 30 Sep 2023 23:11:53 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 74
 by: luke.l...@gmail.com - Sat, 30 Sep 2023 23:11 UTC

On Saturday, September 30, 2023 at 10:41:16 PM UTC+1, Thomas Koenig wrote:
> luke.l...@gmail.com <luke.l...@gmail.com> schrieb:

> > * 8/16 bit source selection doubles that
> > * 8/16 bit destination selection doubles it again
> Assuming that these processors are indeed RISC (so, a load-
> store architecture), then this could be handled by having
>
> - 8-bit sign-extending load
> - 8-bit zero-extending load
> - 16 bit load
> - 8-bit store
> - 16-bit store
>
> and always doing 16-bit arithmetic.

at which point the processor clock rate must be doubled
(making it commercially unviable due to power consumption,
e.g. exceeding the USB2 bus current budget) or the amount of
silicon doubled (same result).

the example I gave (AndesStar Audio DSP) is at the 8051 level:
16 bit registers, 2 channel audio, $0.25 or less, 8 MHz clock.
without packed 8-bit operations the clock has to double to 16 MHz
in 180 nm, because the opportunity is lost to use the other 8 bits
for doing left-right audio in the same clock rather than needing
2x the time. this is the whole reason why the seduction of SIMD
even exists.

> > * hi/lo half on source 1 doubles it again
> > * hi/lo half on source 2 doubles it again
> I'm not quite sure what the functionality is.

Left channel audio is in the lower half, right channel in the upper.
You want to balance the left channel against the right
and yet detect clipping. But then you want to do it the
opposite way round as well.
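
A rough sketch of the packed case, under assumptions: this is not the AndesStar instruction set, just a model of two signed 8-bit samples in one 16-bit register (left in the low half, right in the high half) added lane by lane with saturation so that clipping can be detected. The names `pack`, `unpack` and `sat_add_packed` are made up for illustration.

```python
def to_s8(x):
    """Interpret the low 8 bits of x as a signed 8-bit value."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def pack(left, right):
    """Pack two signed 8-bit samples: left in bits 0-7, right in bits 8-15."""
    return ((right & 0xFF) << 8) | (left & 0xFF)


def unpack(reg):
    """Return (left, right) as signed 8-bit values."""
    return to_s8(reg), to_s8(reg >> 8)


def sat8(x):
    """Clamp to the signed 8-bit range and report whether clipping occurred."""
    if x > 127:
        return 127, True
    if x < -128:
        return -128, True
    return x, False


def sat_add_packed(a, b):
    """Lane-wise saturating add of two packed 16-bit registers.

    Returns (packed result, left clipped?, right clipped?).
    """
    a_left, a_right = unpack(a)
    b_left, b_right = unpack(b)
    left, left_clip = sat8(a_left + b_left)
    right, right_clip = sat8(a_right + b_right)
    return pack(left, right), left_clip, right_clip
```

Adding `pack(100, -20)` to `pack(50, 10)` this way gives left = 127 with the clip flag set and right = -10 with no clipping, all in what would be a single 16-bit operation.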

> So, another addition...

to which you then still need to apply all the other options
(hi/lo, lo/hi, 8/16, saturate etc.), so you might as well just
consider it to be another dimension of the Cartesian product of
options on "add".

On Saturday, September 30, 2023 at 10:59:56 PM UTC+1, MitchAlsup wrote:

> Now take a look at what happens when the very vast majority of code
> does not need any of "that" and you specify variants with prefixes (or
> postfixes), you end up back in RISC land being able to specify each of
> those variants but you have a sum of prefixing bits not a Cartesian product
> of instructions. Here, when you double the width of a SIMD register, you
> add 1 prefix (instead of 60+ each) and you have access to all its functionality.

and if you do not need a particular augmentation you do not
add the prefix to request it. thus the ISA stream remains both
RISC and compact for the majority of programs.
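
To put rough numbers on "sum of prefixing bits" versus "Cartesian product", here is a small sketch. The option list is only the handful of binary choices mentioned in this thread, not the option set of any real ISA, and the function names are hypothetical.

```python
from itertools import product

# Options mentioned in this thread; illustrative only, not a real ISA's list.
OPTIONS = {
    "src_width": ("8", "16"),
    "dst_width": ("8", "16"),
    "src1_half": ("lo", "hi"),
    "src2_half": ("lo", "hi"),
    "saturate": ("no", "yes"),
}


def variants_as_opcodes(options):
    """Every combination gets its own opcode: a Cartesian product."""
    return len(list(product(*options.values())))


def prefix_bits(options):
    """Bits needed if each option is encoded independently: a sum.

    (n - 1).bit_length() is ceil(log2(n)), i.e. 1 bit for a binary choice.
    """
    return sum((len(choices) - 1).bit_length() for choices in options.values())
```

For the five binary options above that is 32 distinct variants of each base instruction against 5 prefix bits, and an instruction that needs none of them carries no prefix at all.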

One scheme I came up with 4 years ago was a *third* L1 cache,
containing highly compact "augmentations" rather than polluting
the main L1 I-Cache with "prefixing". Where the main ISA might
be 32-bit, the entries in the "augmentation" L1 cache might only
be 16-bit. Whereas if the prefixes had to be embedded in the
32-bit ISA itself they would take up valuable 32-bit RISC opcode
space and you are back to square one.

It is a trade-off basically.

L.

Re: RISCs and virtual vectors (was: Introducing ForwardCom)

<b2899a8d-5125-4b53-a461-137b6b04d4ecn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34372&group=comp.arch#34372

Newsgroups: comp.arch
Date: Sat, 30 Sep 2023 16:13:14 -0700 (PDT)
From: luke.leighton@gmail.com (luke.l...@gmail.com)
 by: luke.l...@gmail.com - Sat, 30 Sep 2023 23:13 UTC

On Saturday, September 30, 2023, 'MitchAlsup' via comp.arch <comp.arch@googlegroups.com> wrote:
> On Saturday, September 30, 2023 at 2:34:08 AM UTC-5, Anton Ertl wrote:

>> What do the additional instructions buy?

> Sequencing semantics--mainly in what does NOT need to be performed
> (the live-outs for example minimizes the work of exiting the loop).

and i remember mmm 4 years back you said that it is less
hardware to have *one* looped LD/ST than to have hardware
make the attempt to spot that say 12 scalar LD/STs are
sequential/contiguous.

i designed some hardware to do that and for 12 LD/ST Reservation
Stations it has something mad like 10,000 wires incoming/outgoing
but only 4,000 actual gates.

l.

Re: RISCs and virtual vectors (was: Introducing ForwardCom)

<4a3099a8-454d-4140-b3ce-6346a8288e31n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34373&group=comp.arch#34373

Newsgroups: comp.arch
Date: Sat, 30 Sep 2023 16:39:31 -0700 (PDT)
From: MitchAlsup@aol.com (MitchAlsup)
 by: MitchAlsup - Sat, 30 Sep 2023 23:39 UTC

On Saturday, September 30, 2023 at 6:13:16 PM UTC-5, luke.l...@gmail.com wrote:
> On Saturday, September 30, 2023, 'MitchAlsup' via comp.arch <comp...@googlegroups.com> wrote:
> > On Saturday, September 30, 2023 at 2:34:08 AM UTC-5, Anton Ertl wrote:
>
> >> What do the additional instructions buy?
>
> > Sequencing semantics--mainly in what does NOT need to be performed
> > (the live-outs for example minimizes the work of exiting the loop).
<
> and i remember mmm 4 years back you said that it is less
> hardware to have *one* looped LD/ST than to have hardware
> make the attempt to spot that say 12 scalar LD/STs are
> sequential/contiguous.
<
Given one can see a LD or ST instruction and knowing that it is between
VEC and LOOP instructions, one can determine if the memory sequence
is "dense" or not. Dense memory instructions can be processed at cache
line width, and this allows the calculation instructions to be processed
SIMD-style, otherwise one can always resort to GBOoO execution styles.
<
Were it not for the VEC and LOOP instructions, the problem is way harder.
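
A minimal sketch of the density test, under an assumed reading: "dense" is taken here to mean that the address advances by exactly the access size on every loop iteration. That definition, the 64-byte default line size and the function names are assumptions for illustration, not taken from the post.

```python
def is_dense(stride_bytes, access_bytes):
    """Assumed definition: consecutive iterations touch adjacent elements."""
    return stride_bytes == access_bytes


def iterations_per_line(stride_bytes, access_bytes, line_bytes=64):
    """How many loop iterations one cache-line-wide access can cover.

    Falls back to 1 (one element at a time) when the sequence is not dense.
    """
    if not is_dense(stride_bytes, access_bytes):
        return 1
    return line_bytes // access_bytes
```

Under these assumptions a loop stepping through 8-byte elements with an 8-byte stride could be serviced 8 iterations at a time, while a 16-byte stride over the same elements drops back to one element per access.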
>
> i designed some hardware to do that and for 12 LD/ST Reservation
> Stations it has something mad like 10,000 wires incoming/outgoing
> but only 4,000 actual gates.
<
That is the classical trade off between Scoreboards and Reservation
Stations. {Imagine a 6-wide machine} The number of wires in the RS is
proportional to the number of busses and the namespace of the values
traversing that bus. Let us postulate that there are 6 FUs and a 128 register
value namespace. So, every cycle 6 FUs send a 7-bit tag {42 wires move
per cycle}. Each of those 7-bit items must be compared to the desired value
{CAM} and CAMs eat power in order to make register data flow decisions.
<
Consider the SB alternative, 6 FUs and 128-wires each {768 total}. But here,
only 6 wires move per cycle. Each of these 768 wires is compared to a
single value (AND gate) in order to make register data flow decisions.
>
> l.
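
The wire counts in the 6-FU, 128-register example can be reproduced with a few lines. This is only the arithmetic from the post; the function names are invented for the sketch.

```python
def rs_wires(num_fus, num_regs):
    """Reservation Stations: each FU broadcasts a register tag every cycle."""
    tag_bits = (num_regs - 1).bit_length()  # 7 bits for a 128-entry namespace
    return num_fus * tag_bits


def sb_wires(num_fus, num_regs):
    """Scoreboard: one wire per (FU, register) pair.

    Returns (total wires, wires that move per cycle).
    """
    total = num_fus * num_regs
    moving_per_cycle = num_fus  # one wire per FU changes state
    return total, moving_per_cycle
```

`rs_wires(6, 128)` gives the 42 tag wires, all of which toggle and feed CAM comparisons each cycle; `sb_wires(6, 128)` gives 768 wires of which only 6 move.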

Re: RISCs and virtual vectors (was: Introducing ForwardCom)

<00324988-ff7c-4adc-ac16-6a6744bb0f88n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34375&group=comp.arch#34375

Newsgroups: comp.arch
Date: Sun, 1 Oct 2023 02:52:55 -0700 (PDT)
From: luke.leighton@gmail.com (luke.l...@gmail.com)
 by: luke.l...@gmail.com - Sun, 1 Oct 2023 09:52 UTC

On Sunday, October 1, 2023 at 12:39:34 AM UTC+1, MitchAlsup wrote:
> On Saturday, September 30, 2023 at 6:13:16 PM UTC-5, luke.l...@gmail.com wrote:
> > i designed some hardware to do that and for 12 LD/ST Reservation
> > Stations it has something mad like 10,000 wires incoming/outgoing
> > but only 4,000 actual gates.
> <
> That is the classical trade off between Scoreboards and Reservation
> Stations.

ah no sorry i wasn't clear: this was based on the EA-matching that is in
your Scoreboard Mechanics book chapters (or perhaps the 88100 ones),
where you take 10-12 bits and perform a triangular match to find conflicts.
it's O(N^2), so by the time you get up to (say) 12 LD/STs, with a full
crossbar on address/data/tag where data is 128-bit by that point (because
64-bit @ up to 8 bytes shifted), it gets pretty hairy.

l.
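
A software sketch of the triangular match, to show where the O(N^2) comes from. It assumes the compared bits are the low-order 12 bits of each effective address, which makes the match conservative (different addresses that share those bits are flagged too). The function names are hypothetical and this is not the actual hardware design.

```python
def triangular_conflicts(addresses, match_bits=12):
    """Compare each LD/ST only against the ones ahead of it in program order.

    Returns (older, younger) index pairs whose partial addresses match.
    """
    mask = (1 << match_bits) - 1
    low = [a & mask for a in addresses]
    conflicts = []
    for i in range(len(low)):
        for j in range(i):  # lower triangle only: j is older than i
            if low[i] == low[j]:
                conflicts.append((j, i))
    return conflicts


def comparator_count(n):
    """Number of pairwise comparators in the triangle: n(n-1)/2."""
    return n * (n - 1) // 2
```

For 12 LD/STs that is 66 comparators of 10-12 bits each, before counting the crossbar that routes address, data and tag between them.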

