Re: The synergy of type tags on register file registers

<3a286521-4ae3-4e8f-8b7b-36bb6a52588fn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=32794&group=comp.arch#32794

Newsgroups: comp.arch
From: MitchAlsup@aol.com (MitchAlsup)
Date: Mon, 12 Jun 2023 10:54:32 -0700 (PDT)
In-Reply-To: <2023Jun12.085948@mips.complang.tuwien.ac.at>
 by: MitchAlsup - Mon, 12 Jun 2023 17:54 UTC

On Monday, June 12, 2023 at 2:32:54 AM UTC-5, Anton Ertl wrote:
> MitchAlsup <Mitch...@aol.com> writes:
> >On Sunday, June 11, 2023 at 10:57:39 AM UTC-5, Anton Ertl wrote:
> >> MitchAlsup <Mitch...@aol.com> writes:
> >> >In 64-bit architectures both are single registers.
> ><
> >> Yes, that's what makes the simple and efficient JVM implementation
> >> impossible. In the JVM int and float take one stack slot, while long
> >> and double take two (it was originally designed for being implemented
> >> with a 32-bit stack slot). There are JVM instructions working on the
> >> stack that do not include the types of the stack operand. So either
> >> you make the implementation complicated by doing compile-time type
> >> tracking, or you make it inefficient by keeping longs and doubles in
> >> two registers, and having instructions that work on them combine these
> >> two registers into one operand (once per operand), and split the
> >> result.
> ><
> >I don't get your point:: On the stack the int and float containers are
> >word size, after being loaded into a register they are still 32-bit
> >things but occupy a 64-bit container.
> Java int and float are not the problem, long and double are. You want
> them in one register on a 64-bit architecture, but a simple
> implementation will split them into two.
> >If you fetch+decode+issue+retire 4 inst per clock and they are all
> >integer you need 4 sets of integer ports, a few cycles later you FDIR
> >4 floating point inst in a cycle you need 4 sets of FP ports. Now, if
> >you FDIR 2 ints and 2 FPs you have 8 total ports and ½ are going
> >unused.
>
> Maybe this made-up core has a front end that is too narrow for the
> execution engine. Let's look at the Cortex-X4:
>
> It has a dispatch width of 10 instructions, 8 ALUs (2 of which are
> also integer MACs), 3 branch units, 4 FP/SIMD units, and 4 memory
> units (2 LD, 1 ST, 1 LD/ST).
>
> I don't know how many read and write ports the register files on the
> Cortex-X4 have, but I am sure that if you combined the register files,
> and kept the ports the same as for the register file with the larger
> number of ports, you would see slowdowns from running out of ports.
>
> >> Have you also tried compiling code for the VAX? How does that compare
> >> to My66000 and RISC-V? The point: Why should the instruction count be
> >> particularly relevant?
> ><
> >I have around 12% more instructions than VAX. RISC-V has close to 150%
> >the instruction count of VAX.
> ><
> >So if VAX needs 100 instructions, My 66000 needs 112 instructions, and
> >RISC-V needs 150 instructions.
<
> So My 66000 has 112% the instruction count of VAX, while RISC-V has
> 50% more instructions than VAX:-).
<
At handwaving accuracy: Yes; the ratio between My 66000 and RISC-V
is solid; that of VAX much less so.
<
In 1983-ish, Mark Horowitz remarked that <Stanford> MIPS executed 50%
more instructions than VAX at 6× its frequency for a 4× performance
advantage.
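
The arithmetic behind that remark can be sketched as follows. This is a minimal illustration, not something from the post: the function name `relative_performance` is hypothetical, and it assumes that both machines retire the same number of instructions per clock (equal CPI), which is the simplest reading under which 6× the frequency and 1.5× the instructions give 4×. The instruction counts are the handwaving figures quoted above, normalised to VAX = 100.

```python
def relative_performance(freq_ratio, instr_ratio, cpi_ratio=1.0):
    """Speedup of machine B over machine A.

    freq_ratio  = f_B / f_A      (clock frequency)
    instr_ratio = N_B / N_A      (dynamic instruction count)
    cpi_ratio   = CPI_B / CPI_A  (assumed 1.0 here; an assumption of this sketch)
    """
    return freq_ratio / (instr_ratio * cpi_ratio)


# Figures quoted in the post: Stanford MIPS relative to VAX.
mips_vs_vax = relative_performance(freq_ratio=6.0, instr_ratio=1.5)

# Handwaving instruction counts from the thread, normalised to VAX = 100.
counts = {"VAX": 100, "My 66000": 112, "RISC-V": 150}
riscv_vs_my66000 = counts["RISC-V"] / counts["My 66000"]

print(f"MIPS vs VAX speedup: {mips_vs_vax:.1f}x")
print(f"RISC-V / My 66000 instruction ratio: {riscv_vs_my66000:.2f}")
```

With these inputs the RISC-V to My 66000 ratio comes out at roughly a third more instructions, which is the relationship the post describes as solid.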
<
I have lots of good data on My 66000 versus RISC-V (using LLVM compilers)
and know this relationship with low noise. Too bad this NG does not
support *.jpg or I could show it.
<
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: The synergy of type tags on register file registers

<u67oes$36jnv$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=32801&group=comp.arch#32801

Newsgroups: comp.arch
From: sfuld@alumni.cmu.edu.invalid (Stephen Fuld)
Date: Mon, 12 Jun 2023 11:34:02 -0700
In-Reply-To: <memo.20230610042014.5208Y@jgd.cix.co.uk>
 by: Stephen Fuld - Mon, 12 Jun 2023 18:34 UTC

On 6/9/2023 8:19 PM, John Dallman wrote:
> In article <u5vrjb$1vorb$1@dont-email.me>, sfuld@alumni.cmu.edu.invalid
> (Stephen Fuld) wrote:
>
>> Have you looked to see if VVM would help you?
>
> AFAIK there's no prospect of hardware that actually supports VVM. Am I
> wrong about that?

In thinking about how to respond, I realized that I had completely
missed the point of your previous post, and for that I apologize.

The question is not which hardware architecture does better on your
code, but whether a compiler recognizes your structs as "eligible" for
whatever vectorization the target hardware provides. I gather that, so
far, none do.

That being said, I have been thinking about the benefits of VVM,
assuming the compiler would recognize it. I assume a common task is,
given two points, each described by a three element structure (one 64
bit floating point value element per dimension), compute the Euclidean
distance between the points. The question then is would VVM give better
performance than a GBOoO machine with equivalent number of FUs.

I think the answer is that VVM would provide some benefit, derived from
its smaller code footprint and streaming loads, but it would be modest
due to only looping three times.
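
For concreteness, the task described above might look like the sketch below. The names `Point3` and `distance` are hypothetical, and Python stands in for whatever language the real code uses; the point is only that the loop body runs three times per pair of points, so there is little for any vectorising scheme to amortise.

```python
import math
from dataclasses import dataclass


@dataclass
class Point3:
    """Three 64-bit floating-point coordinates, one per dimension."""
    x: float
    y: float
    z: float


def distance(p, q):
    """Euclidean distance between two points, written as an explicit loop."""
    acc = 0.0
    # Trip count is fixed at 3: one iteration per dimension.
    for pc, qc in zip((p.x, p.y, p.z), (q.x, q.y, q.z)):
        diff = pc - qc
        acc += diff * diff
    return math.sqrt(acc)


origin = Point3(0.0, 0.0, 0.0)
corner = Point3(1.0, 2.0, 2.0)
print(distance(origin, corner))
```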

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: The synergy of type tags on register file registers

<f5568d20-df02-4c56-834b-344d5158850cn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=32803&group=comp.arch#32803

Newsgroups: comp.arch
From: MitchAlsup@aol.com (MitchAlsup)
Date: Mon, 12 Jun 2023 13:32:13 -0700 (PDT)
In-Reply-To: <u67oes$36jnv$1@dont-email.me>
 by: MitchAlsup - Mon, 12 Jun 2023 20:32 UTC

On Monday, June 12, 2023 at 1:34:08 PM UTC-5, Stephen Fuld wrote:
> On 6/9/2023 8:19 PM, John Dallman wrote:
> > In article <u5vrjb$1vorb$1...@dont-email.me>, sf...@alumni.cmu.edu.invalid
> > (Stephen Fuld) wrote:
> >
> >> Have you looked to see if VVM would help you?
> >
> > AFAIK there's no prospect of hardware that actually supports VVM. Am I
> > wrong about that?
> In thinking about how to respond, I realized that I had completely
> missed the point of your previous post, and for that I apologize.
>
> The question is not which hardware architecture does better on your
> code, but whether a compiler recognizes your structs as "eligible" for
> whatever vectorization the target hardware provides. I gather that, so
> far, none do.
>
> That being said, I have been thinking about the benefits of VVM,
> assuming the compiler would recognize it. I assume a common task is,
> given two points, each described by a three element structure (one 64
> bit floating point value element per dimension), compute the Euclidean
> distance between the points. The question then is would VVM give better
> performance than a GBOoO machine with equivalent number of FUs.
>
> I think the answer is that VVM would provide some benefit, derived from
> its smaller code footprint and streaming loads, but it would be modest
> due to only looping three times.
<
Think about it like this::
<
VVM allows a lowly 1-wide implementation to perform at vector rates:
1 instruction per clock per function unit.
<
At the GBOoO side of things, VVM allows you to seed the loop in the
instruction queues once, and fire them all once per cycle without paying
the fetch-decode-issue power overheads. You can also recirculate the
register renaming from iteration to iteration by concatenating a modulo
loop index to the register renames.
<
At both small end and big end, VVM also allows multiple iterations of
the loop to begin each cycle (SIMD style).
<
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: The synergy of type tags on register file registers

<u69hkn$301le$1@newsreader4.netcologne.de>

https://news.novabbs.org/devel/article-flat.php?id=32820&group=comp.arch#32820

Newsgroups: comp.arch
From: tkoenig@netcologne.de (Thomas Koenig)
Date: Tue, 13 Jun 2023 10:49:59 -0000 (UTC)
 by: Thomas Koenig - Tue, 13 Jun 2023 10:49 UTC

Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
> MitchAlsup <MitchAlsup@aol.com> writes:
>>On Sunday, June 11, 2023 at 10:57:39 AM UTC-5, Anton Ertl wrote:
>>> MitchAlsup <Mitch...@aol.com> writes:
>>> >In 64-bit architectures both are single registers.
>><
>>> Yes, that's what makes the simple and efficient JVM implementation
>>> impossible. In the JVM int and float take one stack slot, while long
>>> and double take two (it was originally designed for being implemented
>>> with a 32-bit stack slot). There are JVM instructions working on the
>>> stack that do not include the types of the stack operand. So either
>>> you make the implementation complicated by doing compile-time type
>>> tracking, or you make it inefficient by keeping longs and doubles in
>>> two registers, and having instructions that work on them combine these
>>> two registers into one operand (once per operand), and split the
>>> result.
>><
>>I don't get your point:: On the stack the int and float containers are
>>word size, after being loaded into a register they are still 32-bit
>>things but occupy a 64-bit container.
>
> Java int and float are not the problem, long and double are. You want
> them in one register on a 64-bit architecture, but a simple
> implementation will split them into two.

Hm... so a solution could be to write a sophisticated implementation
instead?

Re: The synergy of type tags on register file registers

<2023Jun13.192332@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=32825&group=comp.arch#32825

Newsgroups: comp.arch
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Tue, 13 Jun 2023 17:23:32 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
 by: Anton Ertl - Tue, 13 Jun 2023 17:23 UTC

Thomas Koenig <tkoenig@netcologne.de> writes:
>Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
>> Java int and float are not the problem, long and double are. You want
>> them in one register on a 64-bit architecture, but a simple
>> implementation will split them into two.
>
>Hm... so a solution could be to write a sophisticated implementation
>instead?

Yes.

Now let's return to our original question of finding some scenario
where a combined integer/FP register file has an advantage: The
sophisticated JVM implementation works fine with separated integer and
FP registers.

So the advantage of a combined register file in allowing simple,
efficient JVM implementations is there only for 32-bit
architectures (i.e., the 88000).
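
The slot behaviour discussed in this subthread can be sketched as a toy model. This is only an illustration under stated assumptions: the helper names are hypothetical, values are kept as unsigned bit patterns with sign handling omitted, and storing the high half before the low half is an arbitrary choice of this sketch. It shows why an untyped instruction such as `dup2` is trivial when every slot is 32 bits wide, and why a typed instruction such as `ladd` then has to recombine each operand and split the result.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def push_int(stack, value):
    """An int occupies one 32-bit slot."""
    stack.append(value & MASK32)


def push_long(stack, value):
    """A long occupies two slots (high half first: a choice of this sketch)."""
    value &= MASK64
    stack.append(value >> 32)
    stack.append(value & MASK32)


def pop_long(stack):
    """Recombine two slots into one 64-bit value."""
    low = stack.pop()
    high = stack.pop()
    return (high << 32) | low


def dup2(stack):
    """Untyped: copies the top two slots, whatever they hold."""
    stack.extend(stack[-2:])


def ladd(stack):
    """Typed: recombines both operands, adds, and splits the result again."""
    b = pop_long(stack)
    a = pop_long(stack)
    push_long(stack, a + b)


# dup2 on one long (two slots), then a 64-bit add.
longs = []
push_long(longs, 0x1_0000_0002)
dup2(longs)
ladd(longs)
doubled = pop_long(longs)

# The same dup2 on two ints (one slot each).
ints = []
push_int(ints, 7)
push_int(ints, 9)
dup2(ints)
```

A 64-bit implementation that keeps a long in a single register can no longer treat `dup2` as "copy two slots" without knowing which case applies, which is where the compile-time type tracking mentioned earlier in the thread comes in.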

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: The synergy of type tags on register file registers

<3d8f16ee-19aa-4e0d-bad9-07d65fcf8010n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=32831&group=comp.arch#32831

Newsgroups: comp.arch
From: MitchAlsup@aol.com (MitchAlsup)
Date: Tue, 13 Jun 2023 11:47:15 -0700 (PDT)
In-Reply-To: <2023Jun13.192332@mips.complang.tuwien.ac.at>
 by: MitchAlsup - Tue, 13 Jun 2023 18:47 UTC

On Tuesday, June 13, 2023 at 12:32:19 PM UTC-5, Anton Ertl wrote:
> Thomas Koenig <tko...@netcologne.de> writes:
> >Anton Ertl <an...@mips.complang.tuwien.ac.at> schrieb:
> >> Java int and float are not the problem, long and double are. You want
> >> them in one register on a 64-bit architecture, but a simple
> >> implementation will split them into two.
> >
> >Hm... so a solution could be to write a sophisticated implementation
> >instead?
> Yes.
>
> Now let's return to our original question of finding some scenario
> where a combined integer/FP register file has an advantage: The
> sophisticated JVM implementation works fine with separated integer and
> FP registers.
<
Advantage: Area, utility, code size.
>
> So the advantage of a combined register file in allowing simple,
> efficient JVM implementations is there only for 32-bit
> architectures (i.e., the 88000).
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: The synergy of type tags on register file registers

<u6aet6$30lua$1@newsreader4.netcologne.de>

https://news.novabbs.org/devel/article-flat.php?id=32832&group=comp.arch#32832

Newsgroups: comp.arch
From: tkoenig@netcologne.de (Thomas Koenig)
Date: Tue, 13 Jun 2023 19:09:26 -0000 (UTC)
 by: Thomas Koenig - Tue, 13 Jun 2023 19:09 UTC

Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
> Thomas Koenig <tkoenig@netcologne.de> writes:
>>Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
>>> Java int and float are not the problem, long and double are. You want
>>> them in one register on a 64-bit architecture, but a simple
>>> implementation will split them into two.
>>
>>Hm... so a solution could be to write a sophisticated implementation
>>instead?
>
> Yes.
>
> Now let's return to our original question of finding some scenario
> where a combined integer/FP register file has an advantage:

[...]

Complex division with scaling.

(a,b)/(c,d) = ((a*c+b*d)/(c**2 + d**2), (b*c-a*d)/(c**2 + d**2))

This is easiest to do without loss of precision if you scale all
real and imaginary parts of the numbers exactly by a power of two,
so that no artificial overflow occurs.
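
A minimal sketch of the scaling idea, for illustration only (it is not
from the original post). It assumes finite inputs and a non-zero divisor;
the function name and variable names are hypothetical. The exponent
extraction and adjustment (frexp/ldexp) is integer work on floating-point
values, which is presumably where a combined integer/FP register file
would help.

```python
import math

def scaled_complex_div(a, b, c, d):
    """Compute (a + b*i) / (c + d*i) and return (real, imag).

    Assumes finite inputs; a zero divisor raises ZeroDivisionError.
    """
    # Binary exponents of the larger-magnitude component of each operand.
    _, en = math.frexp(max(abs(a), abs(b)))
    _, ed = math.frexp(max(abs(c), abs(d)))
    # Scaling by a power of two only changes the exponent field, so it is
    # exact unless the smaller component underflows.
    a_s, b_s = math.ldexp(a, -en), math.ldexp(b, -en)
    c_s, d_s = math.ldexp(c, -ed), math.ldexp(d, -ed)
    # The larger divisor component now lies in [0.5, 1), so denom is in [0.25, 2).
    denom = c_s * c_s + d_s * d_s
    re = (a_s * c_s + b_s * d_s) / denom
    im = (b_s * c_s - a_s * d_s) / denom
    # Undo both scalings with one exponent adjustment.
    return math.ldexp(re, en - ed), math.ldexp(im, en - ed)
```

With c = d = 1e200 the unscaled c**2 + d**2 would overflow, while the
scaled version stays in range.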

Re: The synergy of type tags on register file registers

<memo.20230613215233.5208i@jgd.cix.co.uk>

https://news.novabbs.org/devel/article-flat.php?id=32839&group=comp.arch#32839

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jgd@cix.co.uk (John Dallman)
Newsgroups: comp.arch
Subject: Re: The synergy of type tags on register file registers
Date: Tue, 13 Jun 2023 21:52 +0100 (BST)
Organization: A noiseless patient Spider
Lines: 32
Message-ID: <memo.20230613215233.5208i@jgd.cix.co.uk>
References: <u67oes$36jnv$1@dont-email.me>
Reply-To: jgd@cix.co.uk
Injection-Info: dont-email.me; posting-host="b78c05f7d66da91361b69e2cdd2e95a2";
logging-data="3906816"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18xgRsiaLicSsMfFLqX+YTmBxZYjLGvteQ="
Cancel-Lock: sha1:TKUbMQSFDRr1Gc1Yolv0VfgPH48=
 by: John Dallman - Tue, 13 Jun 2023 20:52 UTC

In article <u67oes$36jnv$1@dont-email.me>, sfuld@alumni.cmu.edu.invalid
(Stephen Fuld) wrote:

> In thinking about how to respond, I realized that I had completely
> missed the point of your previous post, and for that I apologize.

Thank you.

> The question is not which hardware architecture does better on your
> code, but whether a compiler recognizes your structs as "eligible"
> for whatever vectorization the target hardware provides. I gather
> that, so far, none do.

Not last time I tried.

> That being said, I have been thinking about the benefits of VVM,
> assuming the compiler would recognize it. I assume a common task
> is, given two points, each described by a three element structure
> (one 64 bit floating point value element per dimension), compute
> the Euclidean distance between the points. The question then is
> would VVM give better performance than a GBOoO machine with
> equivalent number of FUs.
>
> I think the answer is that VVM would provide some benefit, derived
> from its smaller code footprint and streaming loads, but it would
> be modest due to only looping three times.

The small sizes of the structs are a problem for any kind of
vectorisation.

John

Re: The synergy of type tags on register file registers

<14159ddc-e516-414b-be70-d96ab898ca8dn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=32840&group=comp.arch#32840

Newsgroups: comp.arch
X-Received: by 2002:a5d:594f:0:b0:309:486f:98e2 with SMTP id e15-20020a5d594f000000b00309486f98e2mr1243164wri.7.1686691731956;
Tue, 13 Jun 2023 14:28:51 -0700 (PDT)
X-Received: by 2002:aca:be46:0:b0:39a:3c1d:7da2 with SMTP id
o67-20020acabe46000000b0039a3c1d7da2mr3072586oif.2.1686691708477; Tue, 13 Jun
2023 14:28:28 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!proxad.net!feeder1-2.proxad.net!209.85.128.87.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 13 Jun 2023 14:28:28 -0700 (PDT)
In-Reply-To: <memo.20230613215233.5208i@jgd.cix.co.uk>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:991b:d366:7e92:932f;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:991b:d366:7e92:932f
References: <u67oes$36jnv$1@dont-email.me> <memo.20230613215233.5208i@jgd.cix.co.uk>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <14159ddc-e516-414b-be70-d96ab898ca8dn@googlegroups.com>
Subject: Re: The synergy of type tags on register file registers
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Tue, 13 Jun 2023 21:28:51 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Tue, 13 Jun 2023 21:28 UTC

On Tuesday, June 13, 2023 at 3:52:37 PM UTC-5, John Dallman wrote:
> In article <u67oes$36jnv$1...@dont-email.me>, sf...@alumni.cmu.edu.invalid
> (Stephen Fuld) wrote:
>
> > In thinking about how to respond, I realized that I had completely
> > missed the point of your previous post, and for that I apologize.
> Thank you.
> > The question is not which hardware architecture does better on your
> > code, but whether a compiler recognizes your structs as "eligible"
> > for whatever vectorization the target hardware provides. I gather
> > that, so far, none do.
> Not last time I tried.
> > That being said, I have been thinking about the benefits of VVM,
> > assuming the compiler would recognize it. I assume a common task
> > is, given two points, each described by a three element structure
> > (one 64 bit floating point value element per dimension), compute
> > the Euclidean distance between the points. The question then is
> > would VVM give better performance than a GBOoO machine with
> > equivalent number of FUs.
> >
> > I think the answer is that VVM would provide some benefit, derived
> > from its smaller code footprint and streaming loads, but it would
> > be modest due to only looping three times.
<
> The small sizes of the structs are a problem for any kind of
> vectorisation.
<
Can you show a short code snippet that illustrates that claim ??
>
>
> John

Re: The synergy of type tags on register file registers

<memo.20230614151201.16808A@jgd.cix.co.uk>

https://news.novabbs.org/devel/article-flat.php?id=32849&group=comp.arch#32849

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jgd@cix.co.uk (John Dallman)
Newsgroups: comp.arch
Subject: Re: The synergy of type tags on register file registers
Date: Wed, 14 Jun 2023 15:12 +0100 (BST)
Organization: A noiseless patient Spider
Lines: 11
Message-ID: <memo.20230614151201.16808A@jgd.cix.co.uk>
References: <14159ddc-e516-414b-be70-d96ab898ca8dn@googlegroups.com>
Reply-To: jgd@cix.co.uk
Injection-Info: dont-email.me; posting-host="33096466a288baa97984f5e31c4b1dcb";
logging-data="56726"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19pXXfxr4xjPV+doWDpO2dJ/TA1FYfRIF4="
Cancel-Lock: sha1:tQRh9soi5hYTHcK69slvmKhW53o=
 by: John Dallman - Wed, 14 Jun 2023 14:12 UTC

In article <14159ddc-e516-414b-be70-d96ab898ca8dn@googlegroups.com>,
MitchAlsup@aol.com (MitchAlsup) wrote:
> On Tuesday, June 13, 2023 at 3:52:37 PM UTC-5, John Dallman wrote:
> > The small sizes of the structs are a problem for any kind of
> > vectorisation.
> Can you show a short code snippet that illustrates that claim ??

Sorry, spoke carelessly. The short vectors limit the gains from
vectorisation, and put a premium on minimal setup costs.
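
For illustration only (not code from the thread), a sketch of the
three-element case Stephen Fuld described: the distance between two
points with one 64-bit float per dimension. The names Point3 and
distance are hypothetical. There are only three subtract/multiply
lanes of work, so any fixed vector setup cost is amortised over very
few elements.

```python
import math
from dataclasses import dataclass

@dataclass
class Point3:
    # One 64-bit float per dimension (Python floats are IEEE doubles).
    x: float
    y: float
    z: float

def distance(p, q):
    """Euclidean distance between two 3-element points."""
    dx, dy, dz = p.x - q.x, p.y - q.y, p.z - q.z
    # Three multiplies and two adds: the whole "vector" is three lanes long.
    return math.sqrt(dx * dx + dy * dy + dz * dz)
```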

John

Re: The synergy of type tags on register file registers

<01c767e1-70c7-49b0-832f-38d45c957c11n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=32850&group=comp.arch#32850

Newsgroups: comp.arch
X-Received: by 2002:a05:622a:199c:b0:3f9:a53b:d875 with SMTP id u28-20020a05622a199c00b003f9a53bd875mr748171qtc.9.1686754448592;
Wed, 14 Jun 2023 07:54:08 -0700 (PDT)
X-Received: by 2002:a05:6830:20cf:b0:6af:7dae:90dc with SMTP id
z15-20020a05683020cf00b006af7dae90dcmr3811036otq.4.1686754448325; Wed, 14 Jun
2023 07:54:08 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 14 Jun 2023 07:54:08 -0700 (PDT)
In-Reply-To: <memo.20230614151201.16808A@jgd.cix.co.uk>
Injection-Info: google-groups.googlegroups.com; posting-host=92.19.80.230; posting-account=soFpvwoAAADIBXOYOBcm_mixNPAaxW9p
NNTP-Posting-Host: 92.19.80.230
References: <14159ddc-e516-414b-be70-d96ab898ca8dn@googlegroups.com> <memo.20230614151201.16808A@jgd.cix.co.uk>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <01c767e1-70c7-49b0-832f-38d45c957c11n@googlegroups.com>
Subject: Re: The synergy of type tags on register file registers
From: luke.leighton@gmail.com (luke.l...@gmail.com)
Injection-Date: Wed, 14 Jun 2023 14:54:08 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2518
 by: luke.l...@gmail.com - Wed, 14 Jun 2023 14:54 UTC

On Wednesday, June 14, 2023 at 3:12:04 PM UTC+1, John Dallman wrote:
> In article <14159ddc-e516-414b...@googlegroups.com>,
> Mitch...@aol.com (MitchAlsup) wrote:
> > On Tuesday, June 13, 2023 at 3:52:37 PM UTC-5, John Dallman wrote:
> > > The small sizes of the structs are a problem for any kind of
> > > vectorisation.
> > Can you show a short code snippet that illustrates that claim ??
> Sorry, spoke carelessly. The short vectors limit the gains from
> vectorisation, and put a premium on minimal setup costs.

in GPU-augmented front-end compilers (from the Khronos Group)
vec2/3/4 is a first-order construct, as is Swizzle in order to
cross-over RRRA or YXWW.

SVP64 has double-nested for-loops
....for i = 0 to VL-1
.......for j = 0 to SUBVL-1

as a hardware-level construct that is supported up to the ISA.
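
a rough software sketch of that loop nest, for illustration only. it
models just the shape of the iteration, not actual SVP64 semantics; the
function name and the assumption that sub-elements are stored
contiguously (element i, sub-element j at index i*SUBVL + j) are
hypothetical.

```python
def subvector_add(dst, src1, src2, vl, subvl):
    """Element-wise add over vl elements, each a subvl-wide sub-vector."""
    for i in range(vl):            # outer loop: one vec2/3/4 per step
        for j in range(subvl):     # inner loop: the x, y, z, w components
            k = i * subvl + j      # assumed contiguous layout
            dst[k] = src1[k] + src2[k]
    return dst
```

with subvl = 3 this walks an array of vec3 values one whole vector at a
time, which is the case the short structs in this thread correspond to.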

it can be done: it just needs someone to pay attention, and
unfortunately John you're using a General-Purpose ISA with
General-Purpose compilers, neither of which was ever intended
for or designed with 3D GPU workloads in mind.

the last ISA i heard of that was capable of such "Unified"
programming was the one from ICubeCorp. they had an
ex-SGI Compiler expert on board (Simon?) who knew VLIW
inside-out.

l.

Re: The synergy of type tags on register file registers

<u6egf6$bnoh$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=32852&group=comp.arch#32852

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: The synergy of type tags on register file registers
Date: Thu, 15 Jun 2023 10:00:38 +0200
Organization: A noiseless patient Spider
Lines: 41
Message-ID: <u6egf6$bnoh$1@dont-email.me>
References: <14159ddc-e516-414b-be70-d96ab898ca8dn@googlegroups.com>
<memo.20230614151201.16808A@jgd.cix.co.uk>
<01c767e1-70c7-49b0-832f-38d45c957c11n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 15 Jun 2023 08:00:39 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="d102096149b2376ce1cba6b31603f1d6";
logging-data="384785"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18pacgR2P0AWhawci6AyMYtCk1EaUBLvBXTs55uA9DU/g=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.16
Cancel-Lock: sha1:y5hUHuNOfl67Qwtd5raN24x5eTM=
In-Reply-To: <01c767e1-70c7-49b0-832f-38d45c957c11n@googlegroups.com>
 by: Terje Mathisen - Thu, 15 Jun 2023 08:00 UTC

luke.l...@gmail.com wrote:
> On Wednesday, June 14, 2023 at 3:12:04 PM UTC+1, John Dallman wrote:
>> In article <14159ddc-e516-414b...@googlegroups.com>,
>> Mitch...@aol.com (MitchAlsup) wrote:
>>> On Tuesday, June 13, 2023 at 3:52:37 PM UTC-5, John Dallman wrote:
>>>> The small sizes of the structs are a problem for any kind of
>>>> vectorisation.
>>> Can you show a short code snippet that illustrates that claim ??
>> Sorry, spoke carelessly. The short vectors limit the gains from
>> vectorisation, and put a premium on minimal setup costs.
>
> in GPU-augmented front-end compilers (from the Khronos Group)
> vec2/3/4 is a first-order construct, as is Swizzle in order to
> cross-over RRRA or YXWW.
>
> SVP64 has double-nested for-loops
> ...for i = 0 to VL-1
> ......for j = 0 to SUBVL-1
>
> as a hardware-level construct that is supported up to the ISA.
>
> it can be done: it just needs someone to pay attention, and
> unfortunately John you're using a General-Purpose ISA with
> General-Purpose compilers, neither of which was ever intended
> for or designed with 3D GPU workloads in mind.
>
> the last ISA i heard of that was capable of such "Unified"
> programming was the one from ICubeCorp. they had an
> ex-SGI Compiler expert on board (Simon?) who knew VLIW
> inside-out.

What about Larrabee? This has since morphed more or less into AVX-512,
but the original idea was to have a CPU which could also do GPU style
work, with cache lines as the unit of work.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: The synergy of type tags on register file registers

<u6smn3$2gaa7$4@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=32881&group=comp.arch#32881

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: paaronclayton@gmail.com (Paul A. Clayton)
Newsgroups: comp.arch
Subject: Re: The synergy of type tags on register file registers
Date: Tue, 20 Jun 2023 13:13:07 -0400
Organization: A noiseless patient Spider
Lines: 26
Message-ID: <u6smn3$2gaa7$4@dont-email.me>
References: <9speM.529570$0dpc.321120@fx33.iad> <cH%fM.4$uh74.2@fx36.iad>
<acc7141f-e876-47f2-873e-086f3738d9b5n@googlegroups.com>
<Eg1gM.398$SaD4.108@fx39.iad> <P73gM.630$tol1.552@fx09.iad>
<55662838-773a-4f18-95f7-c74e99e71d50n@googlegroups.com>
<Jc7gM.9$xwH8.8@fx08.iad> <2023Jun8.183046@mips.complang.tuwien.ac.at>
<3040ce7f-d247-4027-972c-f333199a6012n@googlegroups.com>
<u5ub4d$1qlhf$1@dont-email.me> <dJFgM.14268$d1y5.12227@fx17.iad>
<u5vm2n$1v5lf$1@dont-email.me> <KxJgM.8285$MDLb.3139@fx10.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 20 Jun 2023 17:13:07 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f3d0bcd62b374bce57dd1eafa9d08e5f";
logging-data="2632007"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19r33KdRWY3T+37qXb8Y7EYz6Py1dHVqyU="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.0
Cancel-Lock: sha1:ziVbZCyR6XoRZjIxnW9CFlBwJ7w=
X-Mozilla-News-Host: news://news.eternal-september.org
In-Reply-To: <KxJgM.8285$MDLb.3139@fx10.iad>
 by: Paul A. Clayton - Tue, 20 Jun 2023 17:13 UTC

On 6/9/23 1:44 PM, Scott Lurndal wrote:
[snip]
> Say, like Infiniband (which we used at 3leaf as the coherency transport between
> nodes). The most interesting characteristic of IB was the low switching
> latency (less than 100ns cut-through) when compared with 10Gigabit ethernet
> running at similar speeds. This was due to the routing mechanisms used
> by infiniband (where the entire route was encapsulated in the packet,
> with each hop discarding its destination in the packet
> header until the final destination was reached). RDMA was a bonus.

I wonder why the big datacenter organizations have not developed
such simpler routing for mostly-Ethernet-compatibility switches
and "cards". Perhaps something like DHCP could assign MAC
addresses to facilitate routing. (A DHCP-like mechanism would
obviously have overhead for reconnecting, but that seems likely to
be uncommon.) Such organizations seem to have the scale to support
semi-custom chips.

This would not just reduce latency but also (presumably) reduce
power as look-up tables would not be needed.

Since this seems obvious, presumably it is either already done or
the cost of implementation is higher than suspected by me (or the
benefits even lower than suspected). (Sometimes I wish I knew more
about more things — which is one reason why I appreciate others
casually sharing knowledge.)

Re: The synergy of type tags on register file registers

<memo.20230620213634.16808R@jgd.cix.co.uk>

https://news.novabbs.org/devel/article-flat.php?id=32891&group=comp.arch#32891

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jgd@cix.co.uk (John Dallman)
Newsgroups: comp.arch
Subject: Re: The synergy of type tags on register file registers
Date: Tue, 20 Jun 2023 21:36 +0100 (BST)
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <memo.20230620213634.16808R@jgd.cix.co.uk>
References: <01c767e1-70c7-49b0-832f-38d45c957c11n@googlegroups.com>
Reply-To: jgd@cix.co.uk
Injection-Info: dont-email.me; posting-host="28e451979489a96f17f17e6e93d0546f";
logging-data="2674211"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18Q+cQfs73WpuUrtFr9WYBv1OgaoRUl/ak="
Cancel-Lock: sha1:UvdWHBqnREVz82SXQScMlPr/JwQ=
 by: John Dallman - Tue, 20 Jun 2023 20:36 UTC

In article <01c767e1-70c7-49b0-832f-38d45c957c11n@googlegroups.com>,
luke.leighton@gmail.com () wrote:

> it can be done: it just needs someone to pay attention, and
> unfortunately John you're using a General-Purpose ISA with
> General-Purpose compilers, neither of which was ever intended
> for or designed with 3D GPU workloads in mind.

Unfortunately, the 3D vectors work is only a part of what this modeller
does. It mixes that with pretty complex integer logic and data management.
Putting it all on a GPU-style architecture would be far worse than the
current situation.

> the last ISA i heard of that was capable of such "Unified"
> programming was the one from ICubeCorp. they had an
> ex-SGI Compiler expert on board (Simon?) who knew VLIW
> inside-out.

http://icubecorp.com/

The English-language website was last updated in 2014, although the
domain registration is up-to-date. As far as I can tell with Google
Translate, the Chinese site stopped updating at about the same time.

John

Re: The synergy of type tags on register file registers

<knyuyabkh9loyi.fsf@amorsen.dk>

https://news.novabbs.org/devel/article-flat.php?id=32898&group=comp.arch#32898

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!dotsrc.org!filter.dotsrc.org!news.dotsrc.org!not-for-mail
From: benny+usenet@amorsen.dk (Benny Lyne Amorsen)
Newsgroups: comp.arch
Subject: Re: The synergy of type tags on register file registers
References: <9speM.529570$0dpc.321120@fx33.iad> <cH%fM.4$uh74.2@fx36.iad>
<acc7141f-e876-47f2-873e-086f3738d9b5n@googlegroups.com>
<Eg1gM.398$SaD4.108@fx39.iad> <P73gM.630$tol1.552@fx09.iad>
<55662838-773a-4f18-95f7-c74e99e71d50n@googlegroups.com>
<Jc7gM.9$xwH8.8@fx08.iad> <2023Jun8.183046@mips.complang.tuwien.ac.at>
<3040ce7f-d247-4027-972c-f333199a6012n@googlegroups.com>
<u5ub4d$1qlhf$1@dont-email.me> <dJFgM.14268$d1y5.12227@fx17.iad>
<u5vm2n$1v5lf$1@dont-email.me> <KxJgM.8285$MDLb.3139@fx10.iad>
<u6smn3$2gaa7$4@dont-email.me>
Date: Wed, 21 Jun 2023 00:59:49 +0200
Message-ID: <knyuyabkh9loyi.fsf@amorsen.dk>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Cancel-Lock: sha1:mzY1BVgbL/EtkRhreD10w+SqWgw=
MIME-Version: 1.0
Content-Type: text/plain
Lines: 18
Organization: SunSITE.dk - Supporting Open source
NNTP-Posting-Host: 0e8e5c44.news.sunsite.dk
X-Trace: 1687301989 news.sunsite.dk 712 benny+usenet@amorsen.dk/31.3.75.31:52412
X-Complaints-To: staff@sunsite.dk
 by: Benny Lyne Amorsen - Tue, 20 Jun 2023 22:59 UTC

"Paul A. Clayton" <paaronclayton@gmail.com> writes:

> I wonder why the big datacenter organizations have not developed
> such simpler routing for mostly-Ethernet-compatibility switches
> and "cards".

It is called Segment Routing, and it is available over MPLS or
IPv6. Although you get pretty much the same benefit with plain MPLS.

It does not, however, turn out to be that much lower latency than
regular routing and switching, and at 100Gbps+ the FEC latency starts to
dominate.

It looks like routing/switching latency will be stuck in the low
hundreds of ns for a while.

/Benny

Re: The synergy of type tags on register file registers

<58ae43ee-e8d0-4154-a710-70c9a8c052bbn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=32900&group=comp.arch#32900

Newsgroups: comp.arch
X-Received: by 2002:a37:b482:0:b0:762:1e66:3920 with SMTP id d124-20020a37b482000000b007621e663920mr3097227qkf.11.1687302734481;
Tue, 20 Jun 2023 16:12:14 -0700 (PDT)
X-Received: by 2002:aca:bd8b:0:b0:39e:bf78:be70 with SMTP id
n133-20020acabd8b000000b0039ebf78be70mr2273359oif.1.1687302734089; Tue, 20
Jun 2023 16:12:14 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!panix!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 20 Jun 2023 16:12:13 -0700 (PDT)
In-Reply-To: <u6smn3$2gaa7$4@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2601:18c:4200:a37:6867:62df:8e57:f9f4;
posting-account=5gV3HwoAAAAce05MvbMFVKxb-iBCVVSr
NNTP-Posting-Host: 2601:18c:4200:a37:6867:62df:8e57:f9f4
References: <9speM.529570$0dpc.321120@fx33.iad> <cH%fM.4$uh74.2@fx36.iad>
<acc7141f-e876-47f2-873e-086f3738d9b5n@googlegroups.com> <Eg1gM.398$SaD4.108@fx39.iad>
<P73gM.630$tol1.552@fx09.iad> <55662838-773a-4f18-95f7-c74e99e71d50n@googlegroups.com>
<Jc7gM.9$xwH8.8@fx08.iad> <2023Jun8.183046@mips.complang.tuwien.ac.at>
<3040ce7f-d247-4027-972c-f333199a6012n@googlegroups.com> <u5ub4d$1qlhf$1@dont-email.me>
<dJFgM.14268$d1y5.12227@fx17.iad> <u5vm2n$1v5lf$1@dont-email.me>
<KxJgM.8285$MDLb.3139@fx10.iad> <u6smn3$2gaa7$4@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <58ae43ee-e8d0-4154-a710-70c9a8c052bbn@googlegroups.com>
Subject: Re: The synergy of type tags on register file registers
From: gomijacogeo@gmail.com (JohnG)
Injection-Date: Tue, 20 Jun 2023 23:12:14 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 4409
 by: JohnG - Tue, 20 Jun 2023 23:12 UTC

On Tuesday, June 20, 2023 at 1:13:11 PM UTC-4, Paul A. Clayton wrote:
> On 6/9/23 1:44 PM, Scott Lurndal wrote:
> [snip]
> > Say, like Infiniband (which we used at 3leaf as the coherency transport between
> > nodes). The most interesting characteristic of IB was the low switching
> > latency (less than 100ns cut-through) when compared with 10Gigabit ethernet
> > running at similar speeds. This was due to the routing mechanisms used
> > by infiniband (where the entire route was encapsulated in the packet,
> > with each hop discarding its destination in the packet
> > header until the final destination was reached). RDMA was a bonus.
> I wonder why the big datacenter organizations have not developed
> such simpler routing for mostly-Ethernet-compatibility switches
> and "cards". Perhaps something like DHCP could assign MAC
> addresses to facilitate routing. (A DHCP-like mechanism would
> obviously have overhead for reconnecting, but that seems likely to
> be uncommon.) Such organizations seem to have the scale to support
> semi-custom chips.
>
> This would not just reduce latency but also (presumably) reduce
> power as look-up tables would not be needed.
>
> Since this seems obvious, presumably it is either already done or
> the cost of implementation is higher than suspected by me (or the
> benefits even lower than suspected). (Sometimes I wish I knew more
> about more things — which is one reason why I appreciate others
> casually sharing knowledge.)

The next-hop routing Scott is describing is only used by the Subnet Manager when it is crawling the network topology to discover devices - i.e. go one hop out, if it's a switch, enumerate each port and go an additional hop, repeat. It goes through the Subnet Management Agent on a switch which is typically running on a very underpowered embedded processor and is dog slow. The rest of the time, IB uses LID-based routing (a 16-bit logical address, only up to the first 48k addresses are valid unicast addresses). There is also another addressing method that uses a 40 byte Global Routing Header, but I've only seen that used in geographically dispersed IB networks.

The main reason IB has lower latency is 1) a 16-bit LID is smaller than a 48-bit MAC and you can have a very small ram with all possible LIDs and their associated outbound ports as opposed to a bigger, slower TCAM, and 2) the IB market is VERY interested in that metric rather than layer-3,4,5,.. deep packet inspection or the 500 other things modern datacenter switches are expected to do.
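
A small sketch of point 1, for illustration only: with a 16-bit LID the forwarding table can be a plain array indexed by the destination LID, one entry per possible address. The names below (LID_SPACE, NO_ROUTE, make_lid_table, lid_lookup) are hypothetical and this is not how any particular switch ASIC is implemented.

```python
LID_SPACE = 1 << 16   # every possible 16-bit LID gets a slot
NO_ROUTE = 0xFF       # hypothetical marker for "no outbound port assigned"

def make_lid_table():
    """Return a 64 Ki-entry table, one outbound-port byte per LID."""
    return bytearray([NO_ROUTE]) * LID_SPACE

def lid_lookup(table, dlid):
    """Forwarding decision: a single indexed read, no search or hashing."""
    return table[dlid]
```

The same trick does not scale to a 48-bit MAC: a direct-indexed table would need 2**48 entries, so an associative lookup of some kind is needed instead.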

-JohnG

Re: The synergy of type tags on register file registers

<u79n6h$h3st$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=32934&group=comp.arch#32934

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: paaronclayton@gmail.com (Paul A. Clayton)
Newsgroups: comp.arch
Subject: Re: The synergy of type tags on register file registers
Date: Sun, 25 Jun 2023 11:41:03 -0400
Organization: A noiseless patient Spider
Lines: 73
Message-ID: <u79n6h$h3st$1@dont-email.me>
References: <9speM.529570$0dpc.321120@fx33.iad> <cH%fM.4$uh74.2@fx36.iad>
<acc7141f-e876-47f2-873e-086f3738d9b5n@googlegroups.com>
<Eg1gM.398$SaD4.108@fx39.iad> <P73gM.630$tol1.552@fx09.iad>
<55662838-773a-4f18-95f7-c74e99e71d50n@googlegroups.com>
<Jc7gM.9$xwH8.8@fx08.iad> <2023Jun8.183046@mips.complang.tuwien.ac.at>
<3040ce7f-d247-4027-972c-f333199a6012n@googlegroups.com>
<u5ub4d$1qlhf$1@dont-email.me> <dJFgM.14268$d1y5.12227@fx17.iad>
<u5vm2n$1v5lf$1@dont-email.me> <KxJgM.8285$MDLb.3139@fx10.iad>
<u6smn3$2gaa7$4@dont-email.me>
<58ae43ee-e8d0-4154-a710-70c9a8c052bbn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 25 Jun 2023 15:41:05 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="cfeb778ec84a70ab8e8e955e1292215b";
logging-data="561053"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ukj/NPziQdlR08tIXwLPrkV/Bz/tzrVc="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.0
Cancel-Lock: sha1:5XFQKOMaisjrqkNmIEXTb6YlwIg=
In-Reply-To: <58ae43ee-e8d0-4154-a710-70c9a8c052bbn@googlegroups.com>
 by: Paul A. Clayton - Sun, 25 Jun 2023 15:41 UTC

On 6/20/23 7:12 PM, JohnG wrote:
> On Tuesday, June 20, 2023 at 1:13:11 PM UTC-4, Paul A. Clayton wrote:
>> On 6/9/23 1:44 PM, Scott Lurndal wrote:
>> [snip]
>>> Say, like Infiniband (which we used at 3leaf as the coherency transport between
>>> nodes). The most interesting characteristic of IB was the low switching
>>> latency (less than 100ns cut-through) when compared with 10Gigabit ethernet
>>> running at similar speeds. This was due to the routing mechanisms used
>>> by infiniband (where the entire route was encapsulated in the packet,
>>> with each hop discarding its destination in the packet
>>> header until the final destination was reached). RDMA was a bonus.
>> I wonder why the big datacenter organizations have not developed
>> such simpler routing for mostly-Ethernet-compatibility switches
>> and "cards". Perhaps something like DHCP could assign MAC
>> addresses to facilitate routing. (A DHCP-like mechanism would
>> obviously have overhead for reconnecting, but that seems likely to
>> be uncommon.) Such organizations seem to have the scale to support
>> semi-custom chips.
>>
>> This would not just reduce latency but also (presumably) reduce
>> power as look-up tables would not be needed.
>>
>> Since this seems obvious, presumably it is either already done or
>> the cost of implementation is higher than suspected by me (or the
>> benefits even lower than suspected). (Sometimes I wish I knew more
>> about more things — which is one reason why I appreciate others
>> casually sharing knowledge.)
>
> The next-hop routing Scott is describing is only used by the Subnet
> Manager when it is crawling the network topology to discover devices -
> i.e. go one hop out, if it's a switch, enumerate each port and go an
> additional hop, repeat. It goes through the Subnet Management Agent
> on a switch which is typically running on a very underpowered
> embedded processor and is dog slow. The rest of the time, IB uses
> LID-based routing (a 16-bit logical address, only up to the first
> 48k addresses are valid unicast addresses). There is also another
> addressing method that uses a 40 byte Global Routing Header, but
> I've only seen that used in geographically dispersed IB networks.
>
> The main reason IB has lower latency is 1) a 16-bit LID is
> smaller than a 48-bit MAC and you can have a very small ram with
> all possible LIDs and their associated outbound ports as opposed
> to a bigger, slower TCAM, and

My (small) point was that an extended Ethernet specialized for
large datacenter operators (who could afford the specialization)
could use something like DHCP to give each end-node a trivially
routable address. For such addresses no table/TCAM would be needed
(and an arbitrary number of bits could be ignored/always be zero;
I do not know whether using a different packet format would be
practical). I.e., "ethernet" could mimic this aspect of
Infiniband.
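
A sketch of what "trivially routable" could mean, for illustration
only. It assumes a hypothetical three-level tree and a made-up 16-bit
address layout (pod, rack, host fields); none of the field widths or
names come from any real standard. Each switch picks its downward
output port by extracting a bit field, with no table lookup.

```python
# Hypothetical field widths for an assigned, position-encoding address.
POD_BITS, RACK_BITS, HOST_BITS = 4, 6, 6

def make_address(pod, rack, host):
    """Address a DHCP-like service might hand out, encoding location."""
    return (pod << (RACK_BITS + HOST_BITS)) | (rack << HOST_BITS) | host

def core_switch_port(addr):
    """Core switch: the top field selects the pod-facing port."""
    return addr >> (RACK_BITS + HOST_BITS)

def pod_switch_port(addr):
    """Pod switch: the middle field selects the rack-facing port."""
    return (addr >> HOST_BITS) & ((1 << RACK_BITS) - 1)

def rack_switch_port(addr):
    """Rack switch: the low field selects the host-facing port."""
    return addr & ((1 << HOST_BITS) - 1)
```

This only covers the downward direction; upward routing and failure
handling would need more than bit extraction.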

> 2) the IB market is VERY interested in that metric rather than
> layer-3,4,5,.. deep packet inspection or the 500 other things
> modern datacenter switches are expected to do.

This might well be the primary hindrance. Generality is often
more attractive than efficiency achieved with specialization. Even
so, there might be some scope for specialization (without eye-
watering pricing). (Economics seems to hinder such. It is more
profitable to sell specialized products to those who *need* the
features than to sell modestly more expensive fuller-featured base
products even when the system would be better off having broad
availability of such features.)

Anyway, it seems that it is not so much a _technical_ question as
an economics question: market demand et al. Since this is well
outside of my usual reading, I appreciate the information (even if
it is disappointing to the desire to approach the best technically
possible solution — the information also points to other features
[like deep packet inspection] having more value generally, which
reduces my ignorance).

Re: The synergy of type tags on register file registers

<f2dfbf41-3f2b-4f0a-bcc4-4c0152033733n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33338&group=comp.arch#33338

Newsgroups: comp.arch
X-Received: by 2002:a05:622a:1a26:b0:404:132c:e7da with SMTP id f38-20020a05622a1a2600b00404132ce7damr2921qtb.5.1689902800880;
Thu, 20 Jul 2023 18:26:40 -0700 (PDT)
X-Received: by 2002:a4a:4594:0:b0:566:238a:3c55 with SMTP id
y142-20020a4a4594000000b00566238a3c55mr913639ooa.1.1689902800514; Thu, 20 Jul
2023 18:26:40 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 20 Jul 2023 18:26:40 -0700 (PDT)
In-Reply-To: <u79n6h$h3st$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:22f4:c800:94ad:3953:7b81:89b;
posting-account=5gV3HwoAAAAce05MvbMFVKxb-iBCVVSr
NNTP-Posting-Host: 2600:1700:22f4:c800:94ad:3953:7b81:89b
References: <9speM.529570$0dpc.321120@fx33.iad> <cH%fM.4$uh74.2@fx36.iad>
<acc7141f-e876-47f2-873e-086f3738d9b5n@googlegroups.com> <Eg1gM.398$SaD4.108@fx39.iad>
<P73gM.630$tol1.552@fx09.iad> <55662838-773a-4f18-95f7-c74e99e71d50n@googlegroups.com>
<Jc7gM.9$xwH8.8@fx08.iad> <2023Jun8.183046@mips.complang.tuwien.ac.at>
<3040ce7f-d247-4027-972c-f333199a6012n@googlegroups.com> <u5ub4d$1qlhf$1@dont-email.me>
<dJFgM.14268$d1y5.12227@fx17.iad> <u5vm2n$1v5lf$1@dont-email.me>
<KxJgM.8285$MDLb.3139@fx10.iad> <u6smn3$2gaa7$4@dont-email.me>
<58ae43ee-e8d0-4154-a710-70c9a8c052bbn@googlegroups.com> <u79n6h$h3st$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f2dfbf41-3f2b-4f0a-bcc4-4c0152033733n@googlegroups.com>
Subject: Re: The synergy of type tags on register file registers
From: gomijacogeo@gmail.com (JohnG)
Injection-Date: Fri, 21 Jul 2023 01:26:40 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 7467
 by: JohnG - Fri, 21 Jul 2023 01:26 UTC

On Sunday, June 25, 2023 at 8:41:09 AM UTC-7, Paul A. Clayton wrote:
> On 6/20/23 7:12 PM, JohnG wrote:
> > On Tuesday, June 20, 2023 at 1:13:11 PM UTC-4, Paul A. Clayton wrote:
> >> On 6/9/23 1:44 PM, Scott Lurndal wrote:
> >> [snip]
> >>> Say, like Infiniband (which we used at 3leaf as the coherency transport between
> >>> nodes). The most interesting characteristic of IB was the low switching
> >>> latency (less than 100ns cut-through) when compared with 10Gigabit ethernet
> >>> running at similar speeds. This was due to the routing mechanisms used
> >>> by infiniband (where the entire route was encapsulated in the packet,
> >>> with each hop discarding its destination in the packet
> >>> header until the final destination was reached). RDMA was a bonus.
> >> I wonder why the big datacenter organizations have not developed
> >> such simpler routing for mostly-Ethernet-compatibility switches
> >> and "cards". Perhaps something like DHCP could assign MAC
> >> addresses to facilitate routing. (A DHCP-like mechanism would
> >> obviously have overhead for reconnecting, but that seems likely to
> >> be uncommon.) Such organizations seem to have the scale to support
> >> semi-custom chips.
> >>
> >> This would not just reduce latency but also (presumably) reduce
> >> power as look-up tables would not be needed.
> >>
> >> Since this seems obvious, presumably it is either already done or
> >> the cost of implementation is higher than suspected by me (or the
> >> benefits even lower than suspected). (Sometimes I wish I knew more
> >> about more things — which is one reason why I appreciate others
> >> casually sharing knowledge.)
> >
> > The next-hop routing Scott is describing is only used by the Subnet
> > Manager when it is crawling the network topology to discover devices -
> > i.e. go one hop out, if it's a switch, enumerate each port and go an
> > additional hop, repeat. It goes through the Subnet Management Agent
> > on a switch which is typically running on a very underpowered
> > embedded processor and is dog slow. The rest of the time, IB uses
> > LID-based routing (a 16-bit logical address, only up to the first
> > 48k addresses are valid unicast addresses). There is also another
> > addressing method that uses a 40 byte Global Routing Header, but
> > I've only seen that used in geographically dispersed IB networks.
> >
> > The main reason IB has lower latency is 1) a 16-bit LID is
> > smaller than a 48-bit MAC and you can have a very small ram with
> > all possible LIDs and their associated outbound ports as opposed
> > to a bigger, slower TCAM, and
> My (small) point was that an extended Ethernet specialized for
> large datacenter operators (who could afford the specialization)
> could use something like DHCP to give each end-node a trivially
> routable address. For such addresses no table/TCAM would be needed
> (and an arbitrary number of bits could be ignored/always be zero;
> I do not know whether using a different packet format would be
> practical). I.e., "ethernet" could mimic this aspect of
> Infiniband.
> > 2) the IB market is VERY interested in that metric rather than
> > layer-3,4,5,.. deep packet inspection or the 500 other things
> > modern datacenter switches are expected to do.
> This might well be the primary hindrance. Generality is often
> more attractive than efficiency achieved with specialization. Even
> so, there might be some scope for specialization (without eye-
> watering pricing). (Economics seems to hinder such. It is more
> profitable to sell specialized products to those who *need* the
> features than to sell modestly more expensive fuller-featured base
> products even when the system would be better off having broad
> availability of such features.)
>
> Anyway, it seems that it is not so much a _technical_ question as
> an economics question: market demand et al. Since this is well
> outside of my usual reading, I appreciate the information (even if
> it is disappointing to the desire to approach the best technically
> possible solution — the information also points to other features
> [like deep packet inspection] having more value generally, which
> reduces my ignorance).

You're right in spirit: optimizing for the datacenter (and now AI, which looks a lot like classic HPC) has merit, and a bunch of orgs have been doing that in-house for a while. It feels a lot like the late '90s, when there were a bunch of boutique HPC interconnects, and eventually that drove a few standardization efforts which ultimately led to InfiniBand. Now we appear to have the Ultra Ethernet Consortium. Here are a couple of refs:

https://www.nextplatform.com/2023/07/20/ethernet-consortium-shoots-for-1-million-node-clusters-that-beat-infiniband/
(the links below are also all in-line in the above article)
https://ultraethernet.org/wp-content/uploads/sites/20/2023/07/23.07.12-UEC-1.0-Overview-FINAL-WITH-LOGO.pdf
https://www.nextplatform.com/2023/04/26/broadcom-takes-on-infiniband-with-jericho3-ai-switch-chips/
https://www.nextplatform.com/2023/06/22/cisco-guns-for-infiniband-with-silicon-one-g200/
https://www.nextplatform.com/2023/02/16/luminaries-argue-for-the-interconnect-we-could-have-already-had/
https://www.nextplatform.com/2022/04/12/with-aquila-google-abandons-ethernet-to-outdo-infiniband/

-JohnG
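
To make the table-size argument in the quoted text concrete, here is a minimal Python sketch of the two lookup styles being discussed: a direct-indexed table keyed by a 16-bit LID, and a table-free "trivially routable" address of the kind Paul suggests. It is an illustration only, not how any real switch is built. The names `LidSwitch`, `program`, `out_port` and `structured_out_port`, the 48k unicast limit taken from the post, and the 4-bits-per-level address layout are all assumptions made for the example.

```python
from typing import List, Optional

# Assumption: unicast LIDs are the non-zero values below 48k, per the post.
UNICAST_LIMIT = 48 * 1024


class LidSwitch:
    """Hypothetical switch that forwards by direct index on the LID."""

    def __init__(self) -> None:
        # One slot per possible unicast LID; small enough for a plain RAM.
        self.table: List[Optional[int]] = [None] * UNICAST_LIMIT

    def program(self, lid: int, port: int) -> None:
        # In a real fabric the Subnet Manager would fill this in.
        if not 0 < lid < UNICAST_LIMIT:
            raise ValueError("not a unicast LID")
        self.table[lid] = port

    def out_port(self, lid: int) -> Optional[int]:
        # A single indexed read: no search, no associative match.
        if not 0 < lid < UNICAST_LIMIT:
            return None
        return self.table[lid]


def structured_out_port(addr: int, level: int, bits_per_level: int = 4) -> int:
    """Table-free variant: a bit field of the address names the port.

    Assumed layout: the switch at tree level `level` reads its output
    port from bits [level*bits_per_level, (level+1)*bits_per_level).
    """
    shift = level * bits_per_level
    return (addr >> shift) & ((1 << bits_per_level) - 1)
```

With the first style, a switch does one array read per packet; with the second, it does a shift and a mask and needs no table at all, at the cost of addresses that have to be reassigned when a node moves.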

Re: The synergy of type tags on register file registers

<b48336f5-259f-4d68-929d-90187437213cn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33342&group=comp.arch#33342

X-Received: by 2002:a05:622a:16:b0:403:394c:bf29 with SMTP id x22-20020a05622a001600b00403394cbf29mr4399qtw.2.1689919816149;
Thu, 20 Jul 2023 23:10:16 -0700 (PDT)
X-Received: by 2002:a4a:1483:0:b0:55a:ed02:f433 with SMTP id
125-20020a4a1483000000b0055aed02f433mr1717927ood.0.1689919815841; Thu, 20 Jul
2023 23:10:15 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.cmpublishers.com!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 20 Jul 2023 23:10:15 -0700 (PDT)
In-Reply-To: <u6egf6$bnoh$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=92.19.80.230; posting-account=soFpvwoAAADIBXOYOBcm_mixNPAaxW9p
NNTP-Posting-Host: 92.19.80.230
References: <14159ddc-e516-414b-be70-d96ab898ca8dn@googlegroups.com>
<memo.20230614151201.16808A@jgd.cix.co.uk> <01c767e1-70c7-49b0-832f-38d45c957c11n@googlegroups.com>
<u6egf6$bnoh$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <b48336f5-259f-4d68-929d-90187437213cn@googlegroups.com>
Subject: Re: The synergy of type tags on register file registers
From: luke.leighton@gmail.com (luke.l...@gmail.com)
Injection-Date: Fri, 21 Jul 2023 06:10:16 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2581
 by: luke.l...@gmail.com - Fri, 21 Jul 2023 06:10 UTC

On Thursday, June 15, 2023 at 9:00:43 AM UTC+1, Terje Mathisen wrote:

> > the last ISA i heard of that was capable of such "Unified"
> > programming was the one from ICubeCorp. they had an
> > ex-SGI Compiler expert on board (Simon?) who knew VLIW
> > inside-out.
> What about Larrabee? This has since morphed more or less into AVX-512,
> but the original idea was to have a CPU which could also do GPU style
> work, with cache lines as the unit of work.

sorry, am not keeping up. larrabee failed commercially because
they made some poor decisions on a costly (slow) iterative feedback
loop. 1) they couldn't get good enough GPU performance so went
for "raw compute" instead only to find that 2) scientific computing
users were interested in FP64 where they'd of course put FP32 into
silicon.

jeff bush's nyuzi work (he has a whole blog) explains why larrabee's
approach as a "soft-core-only GPU with insufficient specialised
actual 3D GPU capabilities" was a scant 25% of the power-performance
ratio of other commercial hard-accelerated 3D GPU offerings...
exactly as **any** general-purpose "Vector" ISA will ever be.

tom forsyth's video "design lifecycle of an instruction set" is really
informative and also very funny.

l.

Re: The synergy of type tags on register file registers

<89502cd7-b3e1-48cd-94ad-dda6afec356en@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33345&group=comp.arch#33345

X-Received: by 2002:a05:620a:242:b0:762:407d:3837 with SMTP id q2-20020a05620a024200b00762407d3837mr3282qkn.6.1689932986260;
Fri, 21 Jul 2023 02:49:46 -0700 (PDT)
X-Received: by 2002:a05:6808:1799:b0:3a3:d677:6c78 with SMTP id
bg25-20020a056808179900b003a3d6776c78mr4214361oib.2.1689932985750; Fri, 21
Jul 2023 02:49:45 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 21 Jul 2023 02:49:45 -0700 (PDT)
In-Reply-To: <f2dfbf41-3f2b-4f0a-bcc4-4c0152033733n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2a0d:6fc2:55b0:ca00:908:a2e0:8bf9:63ff;
posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 2a0d:6fc2:55b0:ca00:908:a2e0:8bf9:63ff
References: <9speM.529570$0dpc.321120@fx33.iad> <cH%fM.4$uh74.2@fx36.iad>
<acc7141f-e876-47f2-873e-086f3738d9b5n@googlegroups.com> <Eg1gM.398$SaD4.108@fx39.iad>
<P73gM.630$tol1.552@fx09.iad> <55662838-773a-4f18-95f7-c74e99e71d50n@googlegroups.com>
<Jc7gM.9$xwH8.8@fx08.iad> <2023Jun8.183046@mips.complang.tuwien.ac.at>
<3040ce7f-d247-4027-972c-f333199a6012n@googlegroups.com> <u5ub4d$1qlhf$1@dont-email.me>
<dJFgM.14268$d1y5.12227@fx17.iad> <u5vm2n$1v5lf$1@dont-email.me>
<KxJgM.8285$MDLb.3139@fx10.iad> <u6smn3$2gaa7$4@dont-email.me>
<58ae43ee-e8d0-4154-a710-70c9a8c052bbn@googlegroups.com> <u79n6h$h3st$1@dont-email.me>
<f2dfbf41-3f2b-4f0a-bcc4-4c0152033733n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <89502cd7-b3e1-48cd-94ad-dda6afec356en@googlegroups.com>
Subject: Re: The synergy of type tags on register file registers
From: already5chosen@yahoo.com (Michael S)
Injection-Date: Fri, 21 Jul 2023 09:49:46 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 8123
 by: Michael S - Fri, 21 Jul 2023 09:49 UTC

On Friday, July 21, 2023 at 4:26:42 AM UTC+3, JohnG wrote:
> On Sunday, June 25, 2023 at 8:41:09 AM UTC-7, Paul A. Clayton wrote:
> > On 6/20/23 7:12 PM, JohnG wrote:
> > > On Tuesday, June 20, 2023 at 1:13:11 PM UTC-4, Paul A. Clayton wrote:
> > >> On 6/9/23 1:44 PM, Scott Lurndal wrote:
> > >> [snip]
> > >>> Say, like Infiniband (which we used at 3leaf as the coherency transport between
> > >>> nodes). The most interesting characteristic of IB was the low switching
> > >>> latency (less than 100ns cut-through) when compared with 10Gigabit ethernet
> > >>> running at similar speeds. This was due to the routing mechanisms used
> > >>> by infiniband (where the entire route was encapsulated in the packet,
> > >>> with each hop discarding its destination in the packet
> > >>> header until the final destination was reached). RDMA was a bonus.
> > >> I wonder why the big datacenter organizations have not developed
> > >> such simpler routing for mostly-Ethernet-compatibility switches
> > >> and "cards". Perhaps something like DHCP could assign MAC
> > >> addresses to facilitate routing. (A DHCP-like mechanism would
> > >> obviously have overhead for reconnecting, but that seems likely to
> > >> be uncommon.) Such organizations seem to have the scale to support
> > >> semi-custom chips.
> > >>
> > >> This would not just reduce latency but also (presumably) reduce
> > >> power as look-up tables would not be needed.
> > >>
> > >> Since this seems obvious, presumably it is either already done or
> > >> the cost of implementation is higher than suspected by me (or the
> > >> benefits even lower than suspected). (Sometimes I wish I knew more
> > >> about more things — which is one reason why I appreciate others
> > >> casually sharing knowledge.)
> > >
> > > The next-hop routing Scott is describing is only used by the Subnet
> > > Manager when it is crawling the network topology to discover devices -
> > > i.e. go one hop out, if it's a switch, enumerate each port and go an
> > > additional hop, repeat. It goes through the Subnet Management Agent
> > > on a switch which is typically running on a very underpowered
> > > embedded processor and is dog slow. The rest of the time, IB uses
> > > LID-based routing (a 16-bit logical address, only up to the first
> > > 48k addresses are valid unicast addresses). There is also another
> > > addressing method that uses a 40 byte Global Routing Header, but
> > > I've only seen that used in geographically dispersed IB networks.
> > >
> > > The main reason IB has lower latency is 1) a 16-bit LID is
> > > smaller than a 48-bit MAC and you can have a very small ram with
> > > all possible LIDs and their associated outbound ports as opposed
> > > to a bigger, slower TCAM, and
> > My (small) point was that an extended Ethernet specialized for
> > large datacenter operators (who could afford the specialization)
> > could use something like DHCP to give each end-node a trivially
> > routable address. For such addresses no table/TCAM would be needed
> > (and an arbitrary number of bits could be ignored/always be zero;
> > I do not know whether using a different packet format would be
> > practical). I.e., "ethernet" could mimic this aspect of
> > Infiniband.
> > > 2) the IB market is VERY interested in that metric rather than
> > > layer-3,4,5,.. deep packet inspection or the 500 other things
> > > modern datacenter switches are expected to do.
> > This might well be the primary hindrance. Generality is often
> > more attractive than efficiency achieved with specialization. Even
> > so, there might be some scope for specialization (without eye-
> > watering pricing). (Economics seems to hinder such. It is more
> > profitable to sell specialized products to those who *need* the
> > features than to sell modestly more expensive fuller-featured base
> > products even when the system would be better off having broad
> > availability of such features.)
> >
> > Anyway, it seems that it is not so much a _technical_ question as
> > an economics question: market demand et al. Since this is well
> > outside of my usual reading, I appreciate the information (even if
> > it is disappointing to the desire to approach the best technically
> > possible solution — the information also points to other features
> > [like deep packet inspection] having more value generally, which
> > reduces my ignorance).
> You're right in spirit: optimizing for the datacenter (and now AI, which looks a lot like classic HPC) has merit, and a bunch of orgs have been doing that in-house for a while. It feels a lot like the late '90s, when there were a bunch of boutique HPC interconnects, and eventually that drove a few standardization efforts which ultimately led to InfiniBand. Now we appear to have the Ultra Ethernet Consortium. Here are a couple of refs:
>
> https://www.nextplatform.com/2023/07/20/ethernet-consortium-shoots-for-1-million-node-clusters-that-beat-infiniband/
> (the links below are also all in-line in the above article)
> https://ultraethernet.org/wp-content/uploads/sites/20/2023/07/23.07.12-UEC-1.0-Overview-FINAL-WITH-LOGO.pdf
> https://www.nextplatform.com/2023/04/26/broadcom-takes-on-infiniband-with-jericho3-ai-switch-chips/
> https://www.nextplatform.com/2023/06/22/cisco-guns-for-infiniband-with-silicon-one-g200/
> https://www.nextplatform.com/2023/02/16/luminaries-argue-for-the-interconnect-we-could-have-already-had/
> https://www.nextplatform.com/2022/04/12/with-aquila-google-abandons-ethernet-to-outdo-infiniband/
>
> -JohnG

Yesterday I tried to figure out what Microsoft's Azure Boost actually does.
But that's too far away from my area of competence, so I understood nothing.
Is it in the same batch or completely different?
https://azure.microsoft.com/en-us/updates/preview-azure-boost/
