Re: RISCs and virtual vectors

<a2gSM.220496$_Lv6.84138@fx12.iad>

https://news.novabbs.org/devel/article-flat.php?id=34376&group=comp.arch#34376

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder1.feed.usenet.farm!feed.usenet.farm!peer02.ams4!peer.am4.highwinds-media.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx12.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: RISCs and virtual vectors
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk> <ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me> <287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org> <901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <95KRM.207899$Hih7.53988@fx11.iad> <72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com> <2023Sep30.092454@mips.complang.tuwien.ac.at> <128abea4-a935-4ef6-b7fe-71f0f1f35be0n@googlegroups.com> <ufa4lp$rnqn$1@newsreader4.netcologne.de> <c0a376dc-2c9b-440a-91df-0beea9c7d3c7n@googlegroups.com>
In-Reply-To: <c0a376dc-2c9b-440a-91df-0beea9c7d3c7n@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 71
Message-ID: <a2gSM.220496$_Lv6.84138@fx12.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Sun, 01 Oct 2023 15:15:18 UTC
Date: Sun, 01 Oct 2023 11:14:48 -0400
X-Received-Bytes: 4531

by: EricP - Sun, 1 Oct 2023 15:14 UTC

MitchAlsup wrote:
> On Saturday, September 30, 2023 at 4:41:16 PM UTC-5, Thomas Koenig wrote:
>> luke.l...@gmail.com <luke.l...@gmail.com> schrieb:
>>> as Mitch notes in a later post (today), the "Cartesian Product" turns out
>>> in say Audio DSPs doing SWAR (SIMD Within A Register) such as AndesStar
>>> to be an O(N^6) clusterpoop. these DSPs, they are under enormous cost
>>> and power budget pressure (think "C-Media sub-$1 USB Audio PHYs with
>>> built-in volume control", they potentially only run at 8-12 mhz and are likely
>>> in 130 or even 180 nm! USB1.1 and/or have a PLL that slaves to the host
>>> USB bus, the STM32F072 does this: really neat trick)
>>>
>>> take even just an ADD, you would think there
>>> would only ever be one add instruction in a RISC ISA? ehhhhm no
>>>
>>> * 8/16 bit source selection doubles that
>>> * 8/16 bit destination selection doubles it again
>> Assuming that these processors are indeed RISC (so, a load-
>> store architecture), then this could be handled by having
>>
>> - 8-bit sign-extending load
>> - 8-bit zero-extending load
>> - 16 bit load
>> - 8-bit store
>> - 16-bit store
> <
> 7 LDs, 4 STs:: the LDs need sign/zero extension (except for the largest).
> <
> But we started off with SIMD and there registers go up to 512-bits (so far)..
> At 512-bits, one needs {8, 16, 32, 64, 128, 256}×{s, u}+{512} for Loading data
>> and always doing 16-bit arithmetic.
>>
>> Hm, that's five loads and stores so far, still some room :-)
> <
> Yes, exactly; isolating data width to LD and ST instructions saves instruction
> count big time--UNTIL--you need multiple calculations within a single register
> width. So, a real RISC machine has 1 (or 2) ADDs, whereas a SIMD machine
> has {widths}×{signs}×{formats} (and more as EricP illustrated above.)
>> (Or is it a "RISC" in the sense of the MSP430, which is about as
>> much a RISC as the PDP-11? Then the above does not apply).
> <
> RISC means::
> LD/ST architecture (no LD-OPs or LD-OP-STs)
> Lots of GPRs (you might get away with 16 but 32 is general min)
> Hardwired sequencing
> Instruction pipelining
> Driven primarily by compiler (with just a dabbling of ASM)
> <
>>> * hi/lo half on source 1 doubles it again
>>> * hi/lo half on source 2 doubles it again
>> I'm not quite sure what the functionality is. Is it possible
>> to effectively treat each half of a register as a sub-register in
>> the ISA? Then, you need the bits for it.
> <
> EricP is talking about a 512-bit register having {64,32,16,8,{4,2,1}}
> values in that 1 register. And, yes, you need bits to specify each
> nuance.

I think you mean Thomas not me

I've never written a single simd instruction in my life.
For me it's just easier to let the normal load-store units
take care of all the aliasing and packing and unpacking.

The only simd instructions I might have use for are wide load and store with
a register or immediate byte count from 1..64, zero fill unused bytes.
Then use the simd registers as source/dest fields,
essentially an L0 cache with manual coherence.
But I have trouble justifying even that for other than byte-blob moves
as it sounds more trouble than it's worth.
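
[Editor's sketch] The wide load/store described above can be modelled in a few lines of Python. This is only an illustration under assumptions: the names `wide_load` and `wide_store`, the flat `bytearray` memory, and the error handling are hypothetical and not taken from any real ISA.

```python
WIDTH = 64  # bytes in one 512-bit register (assumed register size)

def wide_load(mem: bytearray, addr: int, count: int) -> bytes:
    """Load `count` bytes (1..64) from mem[addr:], zero-filling unused bytes."""
    if not 1 <= count <= WIDTH:
        raise ValueError("byte count must be in 1..64")
    if addr < 0 or addr + count > len(mem):
        raise IndexError("access outside memory")
    return bytes(mem[addr:addr + count]) + bytes(WIDTH - count)

def wide_store(mem: bytearray, addr: int, reg: bytes, count: int) -> None:
    """Store the first `count` bytes (1..64) of the register image to mem[addr:]."""
    if not 1 <= count <= WIDTH:
        raise ValueError("byte count must be in 1..64")
    if addr < 0 or addr + count > len(mem):
        raise IndexError("access outside memory")
    mem[addr:addr + count] = reg[:count]
```

Because the count travels with the instruction, a single load/store pair covers every blob size up to the register width, which is the point of the proposal.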

Re: RISCs and virtual vectors

<fbf3c03d-0db4-4558-84c6-e48c871dbb0cn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34377&group=comp.arch#34377

X-Received: by 2002:a05:6214:947:b0:656:328e:7dd5 with SMTP id dn7-20020a056214094700b00656328e7dd5mr128067qvb.13.1696173624357;
Sun, 01 Oct 2023 08:20:24 -0700 (PDT)
X-Received: by 2002:a05:6830:11cf:b0:6bc:ce86:20bd with SMTP id
v15-20020a05683011cf00b006bcce8620bdmr2764668otq.7.1696173624164; Sun, 01 Oct
2023 08:20:24 -0700 (PDT)
Path: i2pn2.org!i2pn.org!news.1d4.us!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 1 Oct 2023 08:20:23 -0700 (PDT)
In-Reply-To: <a2gSM.220496$_Lv6.84138@fx12.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:fc49:5e34:ea83:1d4c;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:fc49:5e34:ea83:1d4c
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <95KRM.207899$Hih7.53988@fx11.iad>
<72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com> <2023Sep30.092454@mips.complang.tuwien.ac.at>
<128abea4-a935-4ef6-b7fe-71f0f1f35be0n@googlegroups.com> <ufa4lp$rnqn$1@newsreader4.netcologne.de>
<c0a376dc-2c9b-440a-91df-0beea9c7d3c7n@googlegroups.com> <a2gSM.220496$_Lv6.84138@fx12.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <fbf3c03d-0db4-4558-84c6-e48c871dbb0cn@googlegroups.com>
Subject: Re: RISCs and virtual vectors
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Sun, 01 Oct 2023 15:20:24 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 5561

by: MitchAlsup - Sun, 1 Oct 2023 15:20 UTC

On Sunday, October 1, 2023 at 10:15:23 AM UTC-5, EricP wrote:
> MitchAlsup wrote:
> > On Saturday, September 30, 2023 at 4:41:16 PM UTC-5, Thomas Koenig wrote:
> >> luke.l...@gmail.com <luke.l...@gmail.com> schrieb:
> >>> as Mitch notes in a later post (today), the "Cartesian Product" turns out
> >>> in say Audio DSPs doing SWAR (SIMD Within A Register) such as AndesStar
> >>> to be an O(N^6) clusterpoop. these DSPs, they are under enormous cost
> >>> and power budget pressure (think "C-Media sub-$1 USB Audio PHYs with
> >>> built-in volume control", they potentially only run at 8-12 mhz and are likely
> >>> in 130 or even 180 nm! USB1.1 and/or have a PLL that slaves to the host
> >>> USB bus, the STM32F072 does this: really neat trick)
> >>>
> >>> take even just an ADD, you would think there
> >>> would only ever be one add instruction in a RISC ISA? ehhhhm no
> >>>
> >>> * 8/16 bit source selection doubles that
> >>> * 8/16 bit destination selection doubles it again
> >> Assuming that these processors are indeed RISC (so, a load-
> >> store architecture), then this could be handled by having
> >>
> >> - 8-bit sign-extending load
> >> - 8-bit zero-extending load
> >> - 16 bit load
> >> - 8-bit store
> >> - 16-bit store
> > <
> > 7 LDs, 4 STs:: the LDs need sign/zero extension (except for the largest).
> > <
> > But we started off with SIMD and there registers go up to 512-bits (so far)..
> > At 512-bits, one needs {8, 16, 32, 64, 128, 256}×{s, u}+{512} for Loading data
> >> and always doing 16-bit arithmetic.
> >>
> >> Hm, that's five loads and stores so far, still some room :-)
> > <
> > Yes, exactly; isolating data width to LD and ST instructions saves instruction
> > count big time--UNTIL--you need multiple calculations within a single register
> > width. So, a real RISC machine has 1 (or 2) ADDs, whereas a SIMD machine
> > has {widths}×{signs}×{formats} (and more as EricP illustrated above.)
> >> (Or is it a "RISC" in the sense of the MSP430, which is about as
> >> much a RISC as the PDP-11? Then the above does not apply).
> > <
> > RISC means::
> > LD/ST architecture (no LD-OPs or LD-OP-STs)
> > Lots of GPRs (you might get away with 16 but 32 is general min)
> > Hardwired sequencing
> > Instruction pipelining
> > Driven primarily by compiler (with just a dabbling of ASM)
> > <
> >>> * hi/lo half on source 1 doubles it again
> >>> * hi/lo half on source 2 doubles it again
> >> I'm not quite sure what the functionality is. Is it possible
> >> to effectively treat each half of a register as a sub-register in
> >> the ISA? Then, you need the bits for it.
> > <
> > EricP is talking about a 512-bit register having {64,32,16,8,{4,2,1}}
> > values in that 1 register. And, yes, you need bits to specify each
> > nuance.
> I think you mean Thomas not me
<
Sorry, my mistake.
>
> I've never written a single simd instruction in my life.
> For me it's just easier to let the normal load-store units
> take care of all the aliasing and packing and unpacking.
>
> The only simd instructions I might have use for are wide load and store with
> a register or immediate byte count from 1..64, zero fill unused bytes.
> Then use the simd registers as source/dest fields,
> essentially an L0 cache with manual coherence.
<
> But I have trouble justifying even that for other than byte-blob moves
> as it sounds more trouble than it's worth.
<
I have MM (move multiple) for byte-blobs without going through the RF.

Re: RISCs and virtual vectors (was: Introducing ForwardCom)

<46064729-787a-4185-b463-869615d409cen@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34378&group=comp.arch#34378

X-Received: by 2002:a05:620a:8809:b0:773:f2a0:fda5 with SMTP id qj9-20020a05620a880900b00773f2a0fda5mr113263qkn.4.1696174348010;
Sun, 01 Oct 2023 08:32:28 -0700 (PDT)
X-Received: by 2002:a05:6870:6289:b0:1dd:39ce:e25c with SMTP id
s9-20020a056870628900b001dd39cee25cmr3791096oan.3.1696174347720; Sun, 01 Oct
2023 08:32:27 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 1 Oct 2023 08:32:27 -0700 (PDT)
In-Reply-To: <00324988-ff7c-4adc-ac16-6a6744bb0f88n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:fc49:5e34:ea83:1d4c;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:fc49:5e34:ea83:1d4c
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <95KRM.207899$Hih7.53988@fx11.iad>
<72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com> <2023Sep30.092454@mips.complang.tuwien.ac.at>
<ae46c988-7f46-4d66-9bae-8e5f03c86682n@googlegroups.com> <b2899a8d-5125-4b53-a461-137b6b04d4ecn@googlegroups.com>
<4a3099a8-454d-4140-b3ce-6346a8288e31n@googlegroups.com> <00324988-ff7c-4adc-ac16-6a6744bb0f88n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <46064729-787a-4185-b463-869615d409cen@googlegroups.com>
Subject: Re: RISCs and virtual vectors (was: Introducing ForwardCom)
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Sun, 01 Oct 2023 15:32:28 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3841

by: MitchAlsup - Sun, 1 Oct 2023 15:32 UTC

On Sunday, October 1, 2023 at 4:52:58 AM UTC-5, luke.l...@gmail.com wrote:
> On Sunday, October 1, 2023 at 12:39:34 AM UTC+1, MitchAlsup wrote:
> > On Saturday, September 30, 2023 at 6:13:16 PM UTC-5, luke.l...@gmail.com wrote:
> > > i designed some hardware to do that and for 12 LD/ST Reservation
> > > Stations it has something mad like 10,000 wires incoming/outgoing
> > > but only 4,000 actual gates.
> > <
> > That is the classical trade off between Scoreboards and Reservation
> > Stations.
<
> ah no sorry i wasn't clear: this was based on the EA-matching that is in
> your Scoreboard Mechanics book chapters (or perhaps the 88100 ones),
> where you take 10-12 bits and perform a triangular match to find conflicts.
> it's O(N^2) so by the time you get up to (say) 12 LD/STs a full crossbar
> on address/data/tag where data is 128-bit by that point (because 64-bit @
> up to 8 bytes shifted) it gets pretty hairy.
<
The 88100 was a simple register interlock not a scoreboard.
<
You should be able to get the triangular match into the issue part (so an
instruction can be issued together with an earlier instruction which delivers
an operand to the later instruction) and then it is ScoreBoardy from there
on out.
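
[Editor's sketch] The triangular match mentioned above compares each LD/ST entry only against the older ones, which is where the O(N^2) comparator count comes from. A minimal model, assuming a partial-address compare on the low 12 bits; the function names are hypothetical, and a real design would also account for access size:

```python
def triangular_conflicts(addrs, bits=12):
    """Return (older, younger) index pairs whose low `bits` address bits match."""
    mask = (1 << bits) - 1
    pairs = []
    for younger in range(len(addrs)):
        for older in range(younger):  # lower triangle only: compare with earlier entries
            if (addrs[older] & mask) == (addrs[younger] & mask):
                pairs.append((older, younger))
    return pairs

def comparator_count(n):
    """Number of pairwise comparators needed for n entries."""
    return n * (n - 1) // 2
```

Since only partial addresses are compared, a reported pair is a possible conflict rather than a certain one. For 12 entries, `comparator_count(12)` gives 66 comparators.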
<
The thing I forgot to mention is that after each FU broadcasts its tag (128
wires), all of the wires with the same index can be combined (OR) so the
global wiring is only 128 bits. To do this, the waiting instruction has to know
what FU is broadcasting which operand.
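
[Editor's sketch] One way to read the OR-combining above: every FU drives a 128-bit vector, same-index wires are ORed together, and a waiting instruction watches the wire it cares about. The one-hot encoding and all names below are assumptions made for illustration, not a description of the actual design.

```python
from functools import reduce
from operator import or_

TAG_WIRES = 128

def one_hot(tag):
    """128-bit vector with only wire `tag` asserted (assumed encoding)."""
    if not 0 <= tag < TAG_WIRES:
        raise ValueError("tag out of range")
    return 1 << tag

def combined_bus(fu_broadcasts):
    """OR the per-FU vectors wire by wire; an idle FU drives 0."""
    return reduce(or_, fu_broadcasts, 0)

def is_woken(bus, awaited_tag):
    """True if the wire a waiting instruction watches is asserted."""
    return bool(bus & one_hot(awaited_tag))
```

The combined bus stays 128 wires wide no matter how many FUs broadcast in the same cycle.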
<
It also occurred to me that this is the perfect way to deal with My 66000
predication (where one has x instructions in the then clause and y instructions
in the else clause (x and y) <= 8). You use the PRED instruction to broadcast
then or else which releases the <likely completed> instructions from the then
or else clause and cancels the other branch.
>
> l.
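
[Editor's sketch] The PRED idea above, reduced to its bookkeeping: both clauses wait, and one broadcast releases one clause and cancels the other. The function name and the list representation are hypothetical. The "<= 8" bound is left unchecked here because the post does not say whether it applies per clause or in total.

```python
def resolve_pred(then_insts, else_insts, condition):
    """Return (released, cancelled) instruction lists for one PRED outcome."""
    if condition:
        return list(then_insts), list(else_insts)
    return list(else_insts), list(then_insts)
```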

Re: RISCs and virtual vectors

<m%gSM.259306$vMO8.130899@fx16.iad>

https://news.novabbs.org/devel/article-flat.php?id=34381&group=comp.arch#34381

Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx16.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: RISCs and virtual vectors
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk> <ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me> <287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org> <901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <95KRM.207899$Hih7.53988@fx11.iad> <72b6255a-e9b5-4549-b243-49ab8c49b064n@googlegroups.com> <2023Sep30.092454@mips.complang.tuwien.ac.at> <128abea4-a935-4ef6-b7fe-71f0f1f35be0n@googlegroups.com> <ufa4lp$rnqn$1@newsreader4.netcologne.de> <c0a376dc-2c9b-440a-91df-0beea9c7d3c7n@googlegroups.com> <a2gSM.220496$_Lv6.84138@fx12.iad> <fbf3c03d-0db4-4558-84c6-e48c871dbb0cn@googlegroups.com>
In-Reply-To: <fbf3c03d-0db4-4558-84c6-e48c871dbb0cn@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 29
Message-ID: <m%gSM.259306$vMO8.130899@fx16.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Sun, 01 Oct 2023 16:20:34 UTC
Date: Sun, 01 Oct 2023 12:20:16 -0400
X-Received-Bytes: 2603

by: EricP - Sun, 1 Oct 2023 16:20 UTC

MitchAlsup wrote:
> On Sunday, October 1, 2023 at 10:15:23 AM UTC-5, EricP wrote:
>>
>> The only simd instructions I might have use for are wide load and store with
>> a register or immediate byte count from 1..64, zero fill unused bytes.
>> Then use the simd registers as source/dest fields,
>> essentially an L0 cache with manual coherence.
> <
>> But I have trouble justifying even that for other than byte-blob moves
>> as it sounds more trouble than it's worth.
> <
> I have MM (move multiple) for byte-blobs without going through the RF.

I too would have mem-blob operate instructions specifically designed to do
mem-blobby things. (This whole business of having the run-time libraries
decide on the fly which instruction sets to use for an optimum move is nuts.)

For that simd variable wide LD/ST I'm thinking of operating on structs
which have a byte size and alignment. For example, balancing an AVL tree,
the node struct has 3 pointers, left, right, parent, and a depth count,
plus other data for that node. It could load 4 structs into wide registers,
do the balance rotations by operating on byte fields in the simd registers,
and store the 4 changes.
Since I know these are all separate objects, I know no aliasing
is possible so operating on them in bulk is straightforward.
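
[Editor's sketch] A rough model of the AVL idea above: load the four nodes a rotation touches into 64-byte register images, edit pointer fields there, then store all four back. The node layout (three 8-byte pointers plus a 4-byte depth), the byte-offset "addresses", and every name are assumptions. Depth recomputation is omitted, and the sketch requires all four nodes to exist.

```python
import struct

NODE = struct.Struct("<QQQi")  # left, right, parent, depth (hypothetical layout)
REG_BYTES = 64                 # one 512-bit register image
LEFT, RIGHT, PARENT, DEPTH = range(4)

def load_node(mem, addr):
    """Wide load: copy one node into a zero-filled 64-byte register image."""
    reg = bytearray(REG_BYTES)
    reg[:NODE.size] = mem[addr:addr + NODE.size]
    return reg

def store_node(mem, addr, reg):
    """Wide store: write the node bytes of a register image back to memory."""
    mem[addr:addr + NODE.size] = reg[:NODE.size]

def get(reg, field):
    return NODE.unpack_from(reg)[field]

def put(reg, field, value):
    fields = list(NODE.unpack_from(reg))
    fields[field] = value
    NODE.pack_into(reg, 0, *fields)

def rotate_left(mem, x_addr):
    """Left-rotate at x. Requires x.right (y), y.left (t) and x.parent (p) to exist."""
    x = load_node(mem, x_addr)
    y_addr, p_addr = get(x, RIGHT), get(x, PARENT)
    y = load_node(mem, y_addr)
    t_addr = get(y, LEFT)
    t = load_node(mem, t_addr)
    p = load_node(mem, p_addr)
    # All pointer updates happen in the register images, not in memory.
    put(x, RIGHT, t_addr)
    put(t, PARENT, x_addr)
    put(y, LEFT, x_addr)
    put(x, PARENT, y_addr)
    put(y, PARENT, p_addr)
    put(p, LEFT if get(p, LEFT) == x_addr else RIGHT, y_addr)
    # Store the four changed nodes back.
    for addr, reg in ((x_addr, x), (y_addr, y), (t_addr, t), (p_addr, p)):
        store_node(mem, addr, reg)
```

The no-aliasing argument shows up here as four independent register images: no update to one image can silently change another.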

Re: Introducing ForwardCom: An open ISA with variable-length vector

<jwvy1glys5x.fsf-monnier+comp.arch@gnu.org>

https://news.novabbs.org/devel/article-flat.php?id=34392&group=comp.arch#34392

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: monnier@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
Date: Mon, 02 Oct 2023 11:35:51 -0400
Organization: A noiseless patient Spider
Lines: 41
Message-ID: <jwvy1glys5x.fsf-monnier+comp.arch@gnu.org>
References: <udickr$5rc3$1@dont-email.me>
<memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com>
<jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Info: dont-email.me; posting-host="c15070295f9a2dee1b6caf06325ae9c0";
logging-data="3168518"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX188uOLiIzUsjTJjStLTVlnR0ldVHs+vOqM="
User-Agent: Gnus/5.13 (Gnus v5.13)
Cancel-Lock: sha1:UQ/vTqs+tVHeDLWp+ePnYIg0F/E=
sha1:iuXyJFRcD9xZLtkn5S+WoTPrrf8=

by: Stefan Monnier - Mon, 2 Oct 2023 15:35 UTC

>> I think you can't get a good answer before clarifying what it is that
>> you consider as the problem in "opcode proliferation" (after all,
> The problem is that the Cartesian-product associated with SIMD
> causes thousands of microscopic instructions to be needed (for
> example ARM has at least 1,300 SIMD instructions, others worse.)

I don't see why that *in itself* is a problem.
I don't like this either, but that doesn't mean it's a problem.
We should be more clear about the actual concrete problems (which are
probably somewhere in the consequences of having so many opcodes).

> It is my contention that sooner or later one needs to respect the R
> as the first letter in RISC and make it stand for REDUCED !!

But the *R* was not so much about the number of distinct instructions
but about the complexity of each instruction, AFAIK.
[ And even that has proved to be not so terribly important with current
BigOoO machines. ]

> It is also my contention that nobody with more than <wave hands>
> 200 instructions can be called RISC.

The "RISC" name in itself doesn't sell nor does it help find answers
faster or more cheaply.

> Sooner or later it all adds up to shiploads of instructions. And the
> instructions are not backwards compatible,...

Ah, here's a concrete problem, thanks: indeed it requires recompiling
your code to take advantage of wider implementations.
The "scalable vectors" guys claim they don't suffer from that, but
according to Luke it doesn't work out quite so nicely either :-(
[ The cynic in me suggests this is a feature-not-a-bug. ]

Luke mentioned another case where the proliferation may put pressure on the
cost of the cheapest implementation, thus locking that ISA out of
certain markets. I don't have a clear enough idea of the relative costs
of the various alternatives to judge how serious this is, tho.

Stefan

Re: Introducing ForwardCom: An open ISA with variable-length vector

<8c7b53fe-7204-40fc-b9a0-5e0d5581ca21n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34447&group=comp.arch#34447

X-Received: by 2002:ac8:5b0d:0:b0:417:adaa:be87 with SMTP id m13-20020ac85b0d000000b00417adaabe87mr46105qtw.11.1696447270912;
Wed, 04 Oct 2023 12:21:10 -0700 (PDT)
X-Received: by 2002:a05:6808:3028:b0:3ae:1e08:41e9 with SMTP id
ay40-20020a056808302800b003ae1e0841e9mr1753264oib.3.1696447270759; Wed, 04
Oct 2023 12:21:10 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 4 Oct 2023 12:21:10 -0700 (PDT)
In-Reply-To: <jwvy1glys5x.fsf-monnier+comp.arch@gnu.org>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:ad63:160c:8519:a509;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:ad63:160c:8519:a509
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <jwvy1glys5x.fsf-monnier+comp.arch@gnu.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <8c7b53fe-7204-40fc-b9a0-5e0d5581ca21n@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Wed, 04 Oct 2023 19:21:10 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1485

by: MitchAlsup - Wed, 4 Oct 2023 19:21 UTC

Re: Introducing ForwardCom: An open ISA with variable-length vector

<3331a75f-a200-4eab-b1f0-8a2fd3b0ce19n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34545&group=comp.arch#34545

X-Received: by 2002:ac8:7f0a:0:b0:403:27b2:85b5 with SMTP id f10-20020ac87f0a000000b0040327b285b5mr596878qtk.12.1697407550144;
Sun, 15 Oct 2023 15:05:50 -0700 (PDT)
X-Received: by 2002:a05:6808:18a3:b0:3a4:3c6c:27a1 with SMTP id
bi35-20020a05680818a300b003a43c6c27a1mr15728699oib.5.1697407549983; Sun, 15
Oct 2023 15:05:49 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 15 Oct 2023 15:05:49 -0700 (PDT)
In-Reply-To: <jwvy1glys5x.fsf-monnier+comp.arch@gnu.org>
Injection-Info: google-groups.googlegroups.com; posting-host=82.132.232.154; posting-account=soFpvwoAAADIBXOYOBcm_mixNPAaxW9p
NNTP-Posting-Host: 82.132.232.154
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <jwvy1glys5x.fsf-monnier+comp.arch@gnu.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <3331a75f-a200-4eab-b1f0-8a2fd3b0ce19n@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
From: luke.leighton@gmail.com (luke.l...@gmail.com)
Injection-Date: Sun, 15 Oct 2023 22:05:50 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 5197

by: luke.l...@gmail.com - Sun, 15 Oct 2023 22:05 UTC

On Monday, October 2, 2023 at 4:37:34 PM UTC+1, Stefan Monnier wrote:

> But the *R* was not so much about the number of distinct instructions
> but about the complexity of each instruction, AFAIK.

yes in the case of the Power ISA i can confirm that. there are two
stories:

1. if you look at the VHDL for microwatt you find *one* add operation.
it takes A/~A, carry-in equal to 0/1/CA, it takes B as RB or zero,
it outputs RT/~RT and it produces CA/CA32.

in other words it can synthesise literally any operation, neg, sub,
add, add-carry, sub-carry, the total permutations of the above comes
to a whopping 2x3x2x2x2 = 48 different arithmetic operations!

therefore despite there being *one adder* we should not be
surprised to find *over forty* different Decoder add/sub arithmetic
operations in Power ISA.

2. the LD/ST, shift-and-mask, and and/or logical operations have some
extremely weird register names, RA RB RC RS and RT.

it turns out that (read the original POWER1 paper from 1991) there
wasn't enough space to put a separate shifter into 130 nm silicon
as well as one in the LD/ST unit, so they *shared* it by hiding it
behind RISC-based Micro-Coding.

the names RA RB RC RT and RS were the names of "Operand
Forwarding Broadcast Buses" that were used to communicate
*partial results* between three micro-operations, which, back-to-back
performed the full LD operation, where the "raw" LD was of course
word-aligned, and had to be extracted with a shift-and-mask.

of course now "silicon is cheap" so the trick is no longer deployed,
but it gives us some insight into the RISC concept, for when resources
matter.
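
[Editor's sketch] The "raw word-aligned load, then shift-and-mask" step in story 2 can be illustrated as follows. The 4-byte word, big-endian byte order, the function names, and the rule that a field may not cross a word boundary are all assumptions for this sketch; the post does not say how the three micro-operations were split.

```python
def aligned_word(mem, addr):
    """Raw load: fetch the 4-byte word containing `addr`, big-endian."""
    base = addr & ~3
    return int.from_bytes(mem[base:base + 4], "big")

def load_field(mem, addr, size):
    """Extract a 1-, 2- or 4-byte field from its containing word by shift and mask."""
    offset = addr & 3
    if size not in (1, 2, 4) or offset + size > 4:
        raise ValueError("field must fit inside one aligned word")
    raw = aligned_word(mem, addr)
    shift = (4 - offset - size) * 8   # distance of the field from the low end
    mask = (1 << (8 * size)) - 1
    return (raw >> shift) & mask
```

The shift in `load_field` is the piece of hardware the post says was shared with the shift-and-mask instructions.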

> The "scalable vectors" guys claim they don't suffer from that, but
> according to Luke it doesn't work out quite so nicely either :-(
> [ The cynic in me suggests this is a feature-not-a-bug. ]

they attempt to *sell* the dog's dinner mess as a feature.
ARM *may* succeed at that, but Silicon partners are fighting
back because they don't like the binary incompatibility.
RISC-V i doubt will succeed technically and why would you
use something that has bipartisan US senator support for
cutting China off at the knees due to over 75% of Board
Members of RISC-V being from China?

> Luke mentioned another case where the proliferation may put pressure on the
> cost of the cheapest implementation, thus locking that ISA out of
> certain markets. I don't have a clear enough idea of the relative costs
> of the various alternatives to judge how serious this is, tho.

mass-volume pricing is brutal. you got a solution that allows
product to be manufactured for $0.75 instead of $0.90 you
lose all your customer orders - hundreds of millions of business -
within about 3 months.

Allwinner came out with the very first tablet/IPTV ARM Cortex A8
back in 2010 and caused a major recession in Guangdong due
to everyone else holding stock of ARM11-based tablet offerings.

the BOM for the Allwinner A10 tablets was $18, allowing tablets
to be sold for the first time for $35. everyone else had a $40 BOM
and was selling for appx $70 at the time.

anyone holding stock of those ARM11 tablets *or who had components*
went out of business within 3 months.

nobody outside of China noticed but it was really serious, the
Chinese Govt had to step in. there were a lot of
very pissed off China investors who jumped on the Allwinner
bandwagon and carved out their own "niches" which became
the 5-or-so (independent) business units within Allwinner,
beyond even its CEO's control.

Re: Introducing ForwardCom: An open ISA with variable-length vector

<67bbc53e-ca5a-424a-9d88-826b51e0efc1n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=34546&group=comp.arch#34546

X-Received: by 2002:a05:622a:6:b0:415:15d2:b2e5 with SMTP id x6-20020a05622a000600b0041515d2b2e5mr617720qtw.4.1697408989563;
Sun, 15 Oct 2023 15:29:49 -0700 (PDT)
X-Received: by 2002:a05:6870:b796:b0:1d5:8e96:7d85 with SMTP id
ed22-20020a056870b79600b001d58e967d85mr11643153oab.1.1697408989346; Sun, 15
Oct 2023 15:29:49 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 15 Oct 2023 15:29:49 -0700 (PDT)
In-Reply-To: <3331a75f-a200-4eab-b1f0-8a2fd3b0ce19n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:e4fe:2185:26df:ed38;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:e4fe:2185:26df:ed38
References: <udickr$5rc3$1@dont-email.me> <memo.20230916204926.16432A@jgd.cix.co.uk>
<ue6rle$ceh2$1@dont-email.me> <ue7ekk$fq2n$1@dont-email.me>
<287e35a5-e78f-4ae2-bbb1-606f7bbdfe98n@googlegroups.com> <jwv5y3ydyqr.fsf-monnier+comp.arch@gnu.org>
<901e43e0-902d-4f5d-8ae4-22c570a94191n@googlegroups.com> <jwvy1glys5x.fsf-monnier+comp.arch@gnu.org>
<3331a75f-a200-4eab-b1f0-8a2fd3b0ce19n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <67bbc53e-ca5a-424a-9d88-826b51e0efc1n@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Sun, 15 Oct 2023 22:29:49 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 6422

by: MitchAlsup - Sun, 15 Oct 2023 22:29 UTC

On Sunday, October 15, 2023 at 5:05:51 PM UTC-5, luke.l...@gmail.com wrote:
> On Monday, October 2, 2023 at 4:37:34 PM UTC+1, Stefan Monnier wrote:
>
> > But the *R* was not so much about the number of distinct instructions
> > but about the complexity of each instruction, AFAIK.
> yes in the case of the Power ISA i can confirm that. there are two
> stories:
>
> 1. if you look at the VHDL for microwatt you find *one* add operation.
> it takes A/~A, carry-in equal to 0/1/CA, it takes B as RB or zero,
> it outputs RT/~RT and it produces CA/CA32.
>
> in other words it can synthesise literally any operation, neg, sub,
> add, add-carry, sub-carry, the total permutations of the above comes
> to a whopping 2x3x2x2x2 = 48 different arithmetic operations!
<
Spot on. {except for the carry-in thing can be done in 2 states plus 1 gate
so I only count this as 2 states not 3}.
>
> therefore despite there being *one adder* we should not be
> surprised to find *over forty* different Decoder add/sub arithmetic
> operations in Power ISA.
<
At some point you will want to express each of these useful forms with
assembler spellings representing each feature {alone or aggregated}.
The question becomes how many spellings do you have to express all
those nuances:: so would you rather have::
<
ADD Rd,[-]Rs1,[-]Rs2
OR
ADD Rd,Rs1,Rs2
SUB Rd,Rs1,Rs2
RSUB Rd,Rs1,Rs2
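
[Editor's sketch] To make the trade-off concrete, here is one adder with the input and output conditioning listed in the quoted text, with the separate spellings expressed as thin wrappers. The 64-bit width, the function names, and the reading of RSUB as "Rs2 - Rs1" are assumptions. The option counts as listed (2, 3, 2, 2, 2) multiply out to 48.

```python
from itertools import product

MASK = (1 << 64) - 1  # assumed 64-bit datapath

def adder(a, b, invert_a=False, carry_in=0, b_zero=False, invert_out=False):
    """Single adder with selectable conditioning; returns (result, carry_out)."""
    a &= MASK
    b = 0 if b_zero else b & MASK
    if invert_a:
        a = ~a & MASK
    total = a + b + (carry_in & 1)
    result = total & MASK
    carry_out = total >> 64
    if invert_out:
        result = ~result & MASK
    return result, carry_out

# Separate assembler spellings, all mapped onto the same adder.
def add(rs1, rs2):
    return adder(rs1, rs2)[0]

def sub(rs1, rs2):   # rs1 - rs2 == ~rs2 + rs1 + 1
    return adder(rs2, rs1, invert_a=True, carry_in=1)[0]

def rsub(rs1, rs2):  # rs2 - rs1 (assumed meaning of RSUB)
    return adder(rs1, rs2, invert_a=True, carry_in=1)[0]

def neg(rs1):        # -rs1 == ~rs1 + 0 + 1
    return adder(rs1, 0, invert_a=True, carry_in=1, b_zero=True)[0]

# Count the combinations of the options as listed in the post.
variants = len(list(product(("A", "~A"), ("0", "1", "CA"), ("RB", "0"),
                            ("RT", "~RT"), ("CA", "CA32"))))
```

Either spelling scheme decodes to the same hardware; the difference is only in how many mnemonics the assembler exposes.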
>
> 2. the LD/ST, shift-and-mask, and and/or logical operations have some
> extremely weird register names, RA RB RC RS and RT.
>
> it turns out that (read the original POWER1 paper from 1991) there
> wasn't enough space to put a separate shifter into 130 nm silicon
> as well as one in the LD/ST unit, so they *shared* it by hiding it
> behind RISC-based Micro-Coding.
<
1991 should have been closer to 0.5µ
>
> the names RA RB RC RT and RS were the names of "Operand
> Forwarding Broadcast Buses" that were used to communicate
> *partial results* between three micro-operations, which, back-to-back
> performed the full LD operation, where the "raw" LD was of course
> word-aligned, and had to be extracted with a shift-and-mask.
<
These busses might have only had data persist for 3/4 of a clock cycle.
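
A rough Python sketch of the "raw aligned load, then shift-and-mask" split
described in the quoted text. It assumes 32-bit big-endian words and an
access that does not cross a word boundary; the function names are
hypothetical and this is not meant to model the actual POWER1 micro-ops.

```python
def raw_load_word(mem, addr):
    """Raw load: always fetches the aligned 32-bit word containing addr."""
    base = addr & ~3
    return int.from_bytes(mem[base:base + 4], "big")

def load(mem, addr, size):
    """Load size bytes (1, 2 or 4) at addr via raw load + shift-and-mask."""
    offset = addr & 3
    if offset + size > 4:
        raise ValueError("access crosses a word boundary (not handled here)")
    word = raw_load_word(mem, addr)        # step 1: word-aligned raw load
    shift = 8 * (4 - offset - size)        # step 2: move the field to bit 0
    mask = (1 << (8 * size)) - 1           # step 3: keep only the wanted bytes
    return (word >> shift) & mask

mem = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])
byte_at_1 = load(mem, 1, 1)
half_at_5 = load(mem, 5, 2)
```

The shifter in step 2 is the resource that, per the quoted text, was shared
between the LD/ST path and the ordinary shift instructions.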
>
> of course now "silicon is cheap" so the trick is no longer deployed,
> but it gives us some insight into the RISC concept, for when resources
> matter.
> > The "scalable vectors" guys claim they don't suffer from that, but
> > according to Luke it doesn't work out quite so nicely either :-(
> > [ The cynic in me suggests this is a feature-not-a-bug. ]
<
> they attempt to *sell* the dog's dinner mess as a feature.
<
LoL
<
> ARM *may* succeed at that, but Silicon partners are fighting
> back because they don't like the binary incompatibility.
> RISC-V i doubt will succeed technically and why would you
> use something that has bipartisan US senator support for
> cutting China off at the knees due to over 75% of Board
> Members of RISC-V being from China?
<
> > Luke mentioned another case where the proliferation may put pressure on the
> > cost of the cheapest implementation, thus locking that ISA out of
> > certain markets. I don't have a clear enough idea of the relative costs
> > of the various alternatives to judge how serious this is, tho.
<
> mass-volume pricing is brutal. you got a solution that allows
> product to be manufactured for $0.75 instead of $0.90 you
> lose all your customer orders - hundreds of millions of business -
> within about 3 months.
<
I think your logic is backwards:: If you can't find a cost competitive
solution, you can't make any profit on the product at the beginning and
won't have any cash flow 3 months down the road.
>
> Allwinner came out with the very first tablet/IPTV ARM Cortex A8
> back in 2010 and caused a major recession in Guangdong due
> to everyone else holding stock of ARM11-based tablet offerings.
>
> the BOM for the Allwinner A10 tablets was $18 allowing tablets
> to be sold for the first time for $35. everyone else had a $40 BOM
> and was selling for appx $70 at the time.
>
> anyone holding stock of those ARM11 tablets *or who had components*
> went out of business within 3 months.
>
> nobody outside of China noticed but it was really serious, the
> Chinese Govt had to step in. there were a lot of
> very pissed off China investors who jumped on the Allwinner
> bandwagon and carved out their own "niches" which became
> the 5-or-so (independent) business units within Allwinner,
> beyond even its CEO's control.
>
> l.
