
Subject  (Author)
* Downfall hardware bug involving vgather instruction  (Anton Ertl)
+* Re: Downfall hardware bug involving vgather instruction  (Stephen Fuld)
|`* Re: Downfall hardware bug involving vgather instruction  (Anton Ertl)
| `* Re: Downfall hardware bug involving vgather instruction  (MitchAlsup)
|  `* Re: Downfall hardware bug involving vgather instruction  (Anton Ertl)
|   `* Re: Downfall hardware bug involving vgather instruction  (Michael S)
|    `* Re: Downfall hardware bug involving vgather instruction  (Anton Ertl)
|     +- Re: Downfall hardware bug involving vgather instruction  (Michael S)
|     `* Re: Downfall hardware bug involving vgather instruction  (EricP)
|      `- Re: Downfall hardware bug involving vgather instruction  (Thomas Koenig)
`* Re: Downfall hardware bug involving vgather instruction  (MitchAlsup)
 `* Re: Downfall hardware bug involving vgather instruction  (Anton Ertl)
  +* Re: Downfall hardware bug involving vgather instruction  (Michael S)
  |+* Re: Downfall hardware bug involving vgather instruction  (Scott Lurndal)
  ||+- Re: Downfall hardware bug involving vgather instruction  (Michael S)
  ||`- Re: Downfall hardware bug involving vgather instruction  (Terje Mathisen)
  |`* Re: Downfall hardware bug involving vgather instruction  (Anton Ertl)
  | +- Re: Downfall hardware bug involving vgather instruction  (Terje Mathisen)
  | +* Re: Downfall hardware bug involving vgather instruction  (Stephen Fuld)
  | |`- Re: Downfall hardware bug involving vgather instruction  (MitchAlsup)
  | +- Re: Downfall hardware bug involving vgather instruction  (MitchAlsup)
  | `- Re: Downfall hardware bug involving vgather instruction  (Michael S)
  `- Re: Downfall hardware bug involving vgather instruction  (MitchAlsup)

Downfall hardware bug involving vgather instruction

<2023Aug9.171316@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=33575&group=comp.arch#33575
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED.80-108-20-68.cable.dynamic.surfer.at!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Downfall hardware bug involving vgather instruction
Date: Wed, 09 Aug 2023 15:13:16 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2023Aug9.171316@mips.complang.tuwien.ac.at>
Injection-Info: dont-email.me; posting-host="80-108-20-68.cable.dynamic.surfer.at:80.108.20.68";
logging-data="7370"; mail-complaints-to="abuse@eternal-september.org"
X-newsreader: xrn 10.11
 by: Anton Ertl - Wed, 9 Aug 2023 15:13 UTC

The paper describing it is:

https://downfall.page/media/downfall.pdf

@inproceedings{moghimi2023downfall,
title={{Downfall}: Exploiting Speculative Data Gathering},
author={Moghimi, Daniel},
booktitle={32nd USENIX Security Symposium (USENIX Security 2023)},
year={2023}
}

There is a lot of stuff in there, and I have not read it thoroughly,
but "Gather Data Sampling" (GDS) appears to be a Meltdown-class attack
that speculatively extracts data from SIMD-load buffers by using
speculatively-executed gather instructions (present since AVX2), and
then uses a side channel to get the data from the speculative to the
architectural state. Then the author explores various ways to further
exploit this vulnerability, and it proves pretty versatile (read for
yourself), including breaking SGX (secure enclaves in Intel).
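
For readers less familiar with gather, here is a minimal scalar sketch
in C of what a masked 32-bit gather computes architecturally. It is
only a model under simplifying assumptions, not Intel's definition:
the names gather32_model, LANES and base_len are made up for
illustration, indices are element indices rather than scaled byte
offsets, and faults and mask clearing are left out. Nothing in it
models the speculative behaviour that GDS exploits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 8  /* eight 32-bit elements, as in a 256-bit AVX2 gather */

/* Scalar model of a masked 32-bit gather (hypothetical helper).
 * A lane is active when the top bit of its mask element is set.
 * Active lanes load base[idx[lane]]; inactive lanes keep their old
 * dst value, which is what makes this a partial register update. */
static void gather32_model(uint32_t dst[LANES],
                           const uint32_t *base, size_t base_len,
                           const int32_t idx[LANES],
                           const uint32_t mask[LANES])
{
    for (int lane = 0; lane < LANES; lane++) {
        if (mask[lane] & 0x80000000u) {
            /* Illustration-only bounds check; real hardware would
             * translate the address and possibly fault instead. */
            assert(idx[lane] >= 0 && (size_t)idx[lane] < base_len);
            dst[lane] = base[idx[lane]];
        }
    }
}
```

Each active lane is an independent memory access, so a single
instruction of this kind can touch up to LANES unrelated cache lines.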

The author tested Tiger Lake, Ice Lake Server, Cascade Lake (a Server
Skylake variant), and Kaby Lake (a client Skylake variant), and they
all are vulnerable. He does not say that any CPU is safe, so
every CPU with AVX2 (which includes the gather instructions) might be
affected.

Disabling HT (SMT) reduces the leakage speed by a lot, but data can
still be leaked across context switches; the instructions that flush
other microarchitectural state on context switches do not flush these
SIMD load buffers.

In the eternal discussion about RISC vs. non-RISC, the RISC advocates
can point to another case where the complexity of accessing multiple
memory locations (and the desire to optimize that, and deal with,
e.g., TLB misses without restarting everything from scratch) has led
to complexity that resulted in a hardware bug.

The strange thing is that gather is ultra-slow on Skylake, so whatever
they optimized does not work very well (maybe it's better on Tiger
Lake).

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<ub2v4c$dk5m$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=33586&group=comp.arch#33586
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: sfuld@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Downfall hardware bug involving vgather instruction
Date: Thu, 10 Aug 2023 08:18:36 -0700
Organization: A noiseless patient Spider
Lines: 41
Message-ID: <ub2v4c$dk5m$1@dont-email.me>
References: <2023Aug9.171316@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 10 Aug 2023 15:18:36 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="e939fdc2adf1b718e6fb0eaa6bcf87d0";
logging-data="446646"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/mK3W6ZWx947aF/o1qCyQ8b0b3YVUSxWo="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:bmwdQStNhD6oXtlkx2aisBZVpCo=
In-Reply-To: <2023Aug9.171316@mips.complang.tuwien.ac.at>
Content-Language: en-US
 by: Stephen Fuld - Thu, 10 Aug 2023 15:18 UTC

On 8/9/2023 8:13 AM, Anton Ertl wrote:
> The paper describing it is:
>
> https://downfall.page/media/downfall.pdf
>
> @inproceedings{moghimi2023downfall,
> title={{Downfall}: Exploiting Speculative Data Gathering},
> author={Moghimi, Daniel},
> booktitle={32nd USENIX Security Symposium (USENIX Security 2023)},
> year={2023}
> }
>
> There is a lot of stuff in there, and I have not read it thoroughly,
> but "Gather Data Sampling" (GDS) appears to be a Meltdown-class attack
> that speculatively extracts data from SIMD-load buffers by using
> speculatively-executed gather instructions (present since AVX2), and
> then uses a side channel to get the data from the speculative to the
> architectural state. Then the author explores various ways to further
> exploit this vulnerability, and it proves pretty versatile (read for
> yourself), including breaking SGX (secure enclaves in Intel).
>
> The author tested Tiger Lake, Ice Lake Server, Cascade Lake (a Server
> Skylake variant), and Kaby Lake (a client Skylake variant), and they
> all are vulnerable. He does not describe that some CPUs are safe, so
> every CPU with AVX2 (which includes the gather instructions) might be
> affected.

According to Wired,

https://www.wired.com/story/downfall-flaw-intel-chips/

> Intel's current generation chips—including those in the Alder Lake, Raptor Lake, and Sapphire Rapids families—are not affected, because attempts to exploit the vulnerability would be blocked by defenses Intel has added recently.

So did they know about this vulnerability for a long time, or did they
fix it inadvertently?

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Downfall hardware bug involving vgather instruction

<2023Aug10.181205@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=33588&group=comp.arch#33588
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Downfall hardware bug involving vgather instruction
Date: Thu, 10 Aug 2023 16:12:05 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 26
Message-ID: <2023Aug10.181205@mips.complang.tuwien.ac.at>
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <ub2v4c$dk5m$1@dont-email.me>
Injection-Info: dont-email.me; posting-host="0c185a2329a38ecb1e627031ee53f227";
logging-data="461560"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18Rn/8PHnYKaZtSjh7WoFXu"
Cancel-Lock: sha1:KRPc8QZbkHK2HcYxR6r7oHm/Dps=
X-newsreader: xrn 10.11
 by: Anton Ertl - Thu, 10 Aug 2023 16:12 UTC

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
>According to Wired,
>
>https://www.wired.com/story/downfall-flaw-intel-chips/
>
>> Intel's current generation chips—including those in the Alder Lake, Raptor Lake, and Sapphire Rapids families—are not affected, because attempts to exploit the vulnerability would be blocked by defenses Intel has added recently.
>
>So did they know about this vulnerability for a long time, or did they
>fix it inadvertently?

Good question. Intel also tells us that Haswell and Broadwell are not
vulnerable. Maybe they added the defenses in Haswell, removed them in
Skylake, and re-added them in Golden Cove and Gracemont. And according
to the author, preliminary results indicate that Zen2 is not
vulnerable, so maybe AMD also knew about this vulnerability and added
a defense against it.

I think a more useful POV is that Intel reimplemented Gather in
Skylake with a bug, and reimplemented it in Golden Cove without the
bug; whether they noticed the bug in the previous implementation when
they reimplemented it in Golden Cove is the question.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<ff76e49b-3f53-46c2-9132-c942cae8c0a0n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33589&group=comp.arch#33589
X-Received: by 2002:a05:620a:2b9a:b0:76d:6c4:1e78 with SMTP id dz26-20020a05620a2b9a00b0076d06c41e78mr39634qkb.7.1691688098697;
Thu, 10 Aug 2023 10:21:38 -0700 (PDT)
X-Received: by 2002:a17:90a:4615:b0:268:8e93:6459 with SMTP id
w21-20020a17090a461500b002688e936459mr729304pjg.8.1691688098118; Thu, 10 Aug
2023 10:21:38 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 10 Aug 2023 10:21:37 -0700 (PDT)
In-Reply-To: <2023Aug10.181205@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:89f1:a1fe:6c56:f4f5;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:89f1:a1fe:6c56:f4f5
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <ub2v4c$dk5m$1@dont-email.me>
<2023Aug10.181205@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ff76e49b-3f53-46c2-9132-c942cae8c0a0n@googlegroups.com>
Subject: Re: Downfall hardware bug involving vgather instruction
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Thu, 10 Aug 2023 17:21:38 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Thu, 10 Aug 2023 17:21 UTC

On Thursday, August 10, 2023 at 11:20:30 AM UTC-5, Anton Ertl wrote:
> Stephen Fuld <sf...@alumni.cmu.edu.invalid> writes:
> >According to Wired,
> >
> >https://www.wired.com/story/downfall-flaw-intel-chips/
> >
> >> Intel's current generation chips—including those in the Alder Lake, Raptor Lake, and Sapphire Rapids families—are not affected, because attempts to exploit the vulnerability would be blocked by defenses Intel has added recently.
> >
> >So did they know about this vulnerability for a long time, or did they
> >fix it inadvertently?
<
> Good question. Intel also tells us that Haswell and Broadwell are not
> vulnerable. Maybe they added the defenses in Haswell, removed it in
> Skylake, and readded it in Golden Cove and Gracemont. And according
> to the author, preliminary results indicate that Zen2 is not
> vulnerable, so maybe AMD also knew about this vulnerability and added
> a defense against it.
<
Either they knew
OR
AMD has to have better verification in order to maintain complete bug-for-bug
compatibility.
>
> I think a more useful POV is that Intel reimplemented Gather in
> Skylake with a bug, and reimplemented it in Golden Cove without the
> bug; whether they noticed the bug in the previous implementation when
> they reimplemented it in Golden Cove is the question.
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33593&group=comp.arch#33593
X-Received: by 2002:a05:622a:1746:b0:407:2c52:2861 with SMTP id l6-20020a05622a174600b004072c522861mr51240qtk.8.1691694488412;
Thu, 10 Aug 2023 12:08:08 -0700 (PDT)
X-Received: by 2002:a17:902:ec90:b0:1b7:f55e:4ab0 with SMTP id
x16-20020a170902ec9000b001b7f55e4ab0mr1014663plg.0.1691694487971; Thu, 10 Aug
2023 12:08:07 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 10 Aug 2023 12:08:07 -0700 (PDT)
In-Reply-To: <2023Aug9.171316@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:89f1:a1fe:6c56:f4f5;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:89f1:a1fe:6c56:f4f5
References: <2023Aug9.171316@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>
Subject: Re: Downfall hardware bug involving vgather instruction
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Thu, 10 Aug 2023 19:08:08 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Thu, 10 Aug 2023 19:08 UTC

On Wednesday, August 9, 2023 at 10:44:33 AM UTC-5, Anton Ertl wrote:
> The paper describing it is:
>
> https://downfall.page/media/downfall.pdf
>
> @inproceedings{moghimi2023downfall,
> title={{Downfall}: Exploiting Speculative Data Gathering},
> author={Moghimi, Daniel},
> booktitle={32nd USENIX Security Symposium (USENIX Security 2023)},
> year={2023}
> }
>
> There is a lot of stuff in there, and I have not read it thoroughly,
> but "Gather Data Sampling" (GDS) appears to be a Meltdown-class attack
> that speculatively extracts data from SIMD-load buffers by using
> speculatively-executed gather instructions (present since AVX2), and
> then uses a side channel to get the data from the speculative to the
> architectural state. Then the author explores various ways to further
> exploit this vulnerability, and it proves pretty versatile (read for
> yourself), including breaking SGX (secure enclaves in Intel).
>
> The author tested Tiger Lake, Ice Lake Server, Cascade Lake (a Server
> Skylake variant), and Kaby Lake (a client Skylake variant), and they
> all are vulnerable. He does not describe that some CPUs are safe, so
> every CPU with AVX2 (which includes the gather instructions) might be
> affected.
>
> Disabling HT (SMT) reduces the leakage speed by a lot, but data can
> still be leaked across context switches; the instructions that flush
> other microarchitectural state on context switches do not flush these
> SIMD load buffers.
>
> In the eternal discussion about RISC vs. non-RISC, the RISC advocates
> can point to another case where the complexity of accessing multiple
> memory locations (and the desire to optimize that, and deal with,
> e.g., TLB misses without restarting everything from scratch) has led
> to complexity, that resulted in a hardware bug.
<
The gather bug is an example of not-obeying the rule:
"If you change 1 bit in an architectural register, you change them all"
<
RISC-people in 1980 already knew the x86 byte stuff was problematic
{along with a host of other mis-features}
<
But I guess, every µArchitecture generation gets to figure it out for
themselves.
>
> The strange thing is that gather is ultra-slow on Skylake, so whatever
> they optimized does not work very well (maybe it's better on Tiger
> Lake).
>
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<2023Aug11.080805@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=33594&group=comp.arch#33594
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Downfall hardware bug involving vgather instruction
Date: Fri, 11 Aug 2023 06:08:05 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 13
Message-ID: <2023Aug11.080805@mips.complang.tuwien.ac.at>
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <ub2v4c$dk5m$1@dont-email.me> <2023Aug10.181205@mips.complang.tuwien.ac.at> <ff76e49b-3f53-46c2-9132-c942cae8c0a0n@googlegroups.com>
Injection-Info: dont-email.me; posting-host="02fb8553b30dff11662fa858573f4531";
logging-data="797237"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/nAw7IBw8mUdHS/XdNIbIW"
Cancel-Lock: sha1:tQMIts0fBlc25GdgRRinZlit1PI=
X-newsreader: xrn 10.11
 by: Anton Ertl - Fri, 11 Aug 2023 06:08 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>Either they knew
>OR
>AMD has to have better verification in order to maintain complete bug-for-bug
>compatibility.

Zenbleed (an architectural bug) shows that AMD's verification has holes.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<2023Aug11.081533@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=33595&group=comp.arch#33595
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Downfall hardware bug involving vgather instruction
Date: Fri, 11 Aug 2023 06:15:33 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 46
Message-ID: <2023Aug11.081533@mips.complang.tuwien.ac.at>
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>
Injection-Info: dont-email.me; posting-host="02fb8553b30dff11662fa858573f4531";
logging-data="807459"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+4IpRz2qPjubW7SkwXyoUw"
Cancel-Lock: sha1:1FxuuGu1yi5Hpi2Q8t5k73cCbzg=
X-newsreader: xrn 10.11
 by: Anton Ertl - Fri, 11 Aug 2023 06:15 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>The gather bug is an example of not-obeying the rule:
>"If you change 1 bit in an architectural register, you change them all"
><
>RISC-people in 1980 already knew the x86 byte stuff was problematic
>{along with a host of other mis-features}

Where can I read about this knowledge of the RISC people? Did they
even waste one thought on the 8086, which was an insignificant
microprocessor in 1980?

As for the gather bug, the gather instructions were added with AVX2,
first implemented in Haswell (released 2013), and with AVX-512F
(announced in 2013, first shipped in Knights Landing in 2016). Intel
had problems with partial register
updates since they introduced OoO with the Pentium Pro (released
1995), so they were aware of the problems of implementing such
instructions. And yet they chose to do so, in addition to choosing to
add an instruction that may perform 16 memory accesses.
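
The partial-register problem can be shown in a few lines of C. This is
an illustrative sketch, not a description of any particular pipeline;
write_byte and write_low32_zext are hypothetical helper names. A byte
write (AL is byte 0, AH is byte 1) has to merge with the old 64-bit
value, so the result depends on the previous contents of the register,
while a 32-bit write in x86-64 zero-extends and needs no old value.

```c
#include <assert.h>
#include <stdint.h>

/* Partial update: replace byte n of a 64-bit register value and keep
 * the other 56 bits. The old value is an input, so an out-of-order
 * core has to wait for (or merge with) the previous producer. */
static uint64_t write_byte(uint64_t old, unsigned n, uint8_t v)
{
    assert(n < 8);
    uint64_t m = UINT64_C(0xFF) << (8 * n);
    return (old & ~m) | ((uint64_t)v << (8 * n));
}

/* Full update, x86-64 style: a 32-bit write zeroes the upper half,
 * so the new register value does not depend on the old one. */
static uint64_t write_low32_zext(uint32_t v)
{
    return (uint64_t)v;
}
```

A masked gather resembles write_byte more than write_low32_zext:
lanes whose mask bit is clear keep whatever the destination held
before.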

Interestingly, the definition of, e.g., VGATHERPS says:

|* Faults are delivered in a right-to-left manner. That is, if a fault
| is triggered by an element and delivered, all elements closer to the
| LSB of the destination will be completed (and
| non-faulting). Individual elements closer to the MSB may or may not
| be completed. If a given element triggers multiple faults, they are
| delivered in the conventional order.
|
|* Elements may be gathered in any order, but faults must be delivered
| in a right-to-left order; thus, elements to the left of a faulting
| one may be gathered before the fault is delivered. A given
| implementation of this instruction is repeatable - given the same
| input values and architectural state, the same set of elements to
| the left of the faulting one will be gathered.

It's not clear to me what this "completed" and "gathered" is about.
Is it about changing the destination register in case of a fault? Is
it about order of memory accesses for concurrent processing?
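
One plausible reading, sketched as a scalar C model: as far as I
understand the gather description, the mask bit of an element is
cleared once that element has been written to the destination, so
after a fault the instruction can be restarted and only the remaining
elements are loaded. The sketch below makes that concrete under loud
assumptions: mapped() is a hypothetical stand-in for address
translation, gather_with_faults is an invented name, and the model
picks the simplest behaviour the text allows (no element to the left
of the faulting one is completed), whereas real hardware may complete
some of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 8

/* Hypothetical stand-in for the MMU: an index is mapped iff in range. */
static bool mapped(size_t len, int32_t idx)
{
    return idx >= 0 && (size_t)idx < len;
}

/* Returns -1 when every active lane has completed, otherwise the lane
 * whose fault is delivered. Lane 0 is the LSB end ("rightmost"). */
static int gather_with_faults(uint32_t dst[LANES], bool mask[LANES],
                              const uint32_t *base, size_t len,
                              const int32_t idx[LANES])
{
    assert(dst && mask && base && idx);
    for (int lane = 0; lane < LANES; lane++) {
        if (!mask[lane])
            continue;              /* inactive or already completed */
        if (!mapped(len, idx[lane]))
            return lane;           /* all lower lanes are complete */
        dst[lane] = base[idx[lane]];
        mask[lane] = false;        /* completed: skipped on restart */
    }
    return -1;
}
```

In this model "completed" means the destination element is written and
its mask bit cleared; calling the function again after the fault has
been handled resumes with the faulting lane rather than starting from
scratch.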

There is a good reason why multi-access instructions are shunned by
most architects.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<b40214fa-ccc2-4f06-9c7e-ddf53eb7b5f5n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33596&group=comp.arch#33596
X-Received: by 2002:a37:9305:0:b0:76c:9791:167e with SMTP id v5-20020a379305000000b0076c9791167emr20699qkd.6.1691758264498;
Fri, 11 Aug 2023 05:51:04 -0700 (PDT)
X-Received: by 2002:a17:903:11c4:b0:1bc:4452:59b6 with SMTP id
q4-20020a17090311c400b001bc445259b6mr661338plh.11.1691758264196; Fri, 11 Aug
2023 05:51:04 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 11 Aug 2023 05:51:03 -0700 (PDT)
In-Reply-To: <2023Aug11.080805@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2a0d:6fc2:55b0:ca00:2587:f642:e15c:3f95;
posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 2a0d:6fc2:55b0:ca00:2587:f642:e15c:3f95
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <ub2v4c$dk5m$1@dont-email.me>
<2023Aug10.181205@mips.complang.tuwien.ac.at> <ff76e49b-3f53-46c2-9132-c942cae8c0a0n@googlegroups.com>
<2023Aug11.080805@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <b40214fa-ccc2-4f06-9c7e-ddf53eb7b5f5n@googlegroups.com>
Subject: Re: Downfall hardware bug involving vgather instruction
From: already5chosen@yahoo.com (Michael S)
Injection-Date: Fri, 11 Aug 2023 12:51:04 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2125
 by: Michael S - Fri, 11 Aug 2023 12:51 UTC

On Friday, August 11, 2023 at 9:13:29 AM UTC+3, Anton Ertl wrote:
> MitchAlsup <Mitch...@aol.com> writes:
> >Either they knew
> >OR
> >AMD has to have better verification in order to maintain complete bug-for-bug
> >compatibility.
>
> Zenbleed (an architectural bug) shows that AMD's verification has holes.
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Also, Zenbleed is the sort of bug that can be discovered relatively easily with
automatic black-box testing.
Downfall, on the other hand, looks impossible to find unless you know
what you are looking for.

Re: Downfall hardware bug involving vgather instruction

<12d80e40-6123-438a-9f02-031c4e0753a1n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33597&group=comp.arch#33597
X-Received: by 2002:ae9:df84:0:b0:76d:1e66:cb0c with SMTP id t126-20020ae9df84000000b0076d1e66cb0cmr22499qkf.5.1691759087621;
Fri, 11 Aug 2023 06:04:47 -0700 (PDT)
X-Received: by 2002:a4a:5882:0:b0:56c:86f2:ae09 with SMTP id
f124-20020a4a5882000000b0056c86f2ae09mr98503oob.0.1691759087255; Fri, 11 Aug
2023 06:04:47 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!1.us.feeder.erje.net!3.us.feeder.erje.net!feeder.erje.net!border-1.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 11 Aug 2023 06:04:47 -0700 (PDT)
In-Reply-To: <2023Aug11.081533@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2a0d:6fc2:55b0:ca00:2587:f642:e15c:3f95;
posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 2a0d:6fc2:55b0:ca00:2587:f642:e15c:3f95
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>
<2023Aug11.081533@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <12d80e40-6123-438a-9f02-031c4e0753a1n@googlegroups.com>
Subject: Re: Downfall hardware bug involving vgather instruction
From: already5chosen@yahoo.com (Michael S)
Injection-Date: Fri, 11 Aug 2023 13:04:47 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 62
 by: Michael S - Fri, 11 Aug 2023 13:04 UTC

On Friday, August 11, 2023 at 9:39:09 AM UTC+3, Anton Ertl wrote:
> MitchAlsup <Mitch...@aol.com> writes:
> >The gather bug is an example of not-obeying the rule:
> >"If you change 1 bit in an architectural register, you change them all"
> ><
> >RISC-people in 1980 already knew the x86 byte stuff was problematic
> >{along with a host of other mis-features}
> Where can I read about this knowledge of the RISC people? Did they
> even waste one thought to the 8086, which was an insignificant
> microprocessor in 1980?
>

The MIPS solution for unaligned loads relies on partial register updates.
IIRC, they thought it was so cool that they even patented the idea.
So, obviously, MIPS was not designed by RISC-people in 1980s.
Or was it?
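
For those without the MIPS manual to hand, a rough C model of the
LWL/LWR pair on a big-endian machine, showing where the partial
register update comes in. The function names and the byte-array memory
are assumptions made for illustration, and the model ignores
exceptions and the little-endian variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* LWL, big-endian model: bytes from addr to the end of its aligned
 * word go into the most significant bytes of rt; the remaining low
 * bytes of rt are preserved (the partial update). */
static uint32_t lwl_be(uint32_t rt, const uint8_t *mem, size_t addr)
{
    assert(mem != NULL);
    size_t n = 4 - (addr & 3);
    for (size_t i = 0; i < n; i++) {
        unsigned shift = (unsigned)(24 - 8 * i);
        rt = (rt & ~(0xFFu << shift)) | ((uint32_t)mem[addr + i] << shift);
    }
    return rt;
}

/* LWR, big-endian model: bytes from the start of the aligned word up
 * to addr go into the least significant bytes of rt; the remaining
 * high bytes of rt are preserved. */
static uint32_t lwr_be(uint32_t rt, const uint8_t *mem, size_t addr)
{
    assert(mem != NULL);
    size_t b = addr & 3;
    size_t base = addr - b;
    for (size_t i = 0; i <= b; i++) {
        unsigned shift = (unsigned)(8 * (b - i));
        rt = (rt & ~(0xFFu << shift)) | ((uint32_t)mem[base + i] << shift);
    }
    return rt;
}

/* The usual two-instruction idiom: LWL rt, 0(a); LWR rt, 3(a). */
static uint32_t load_unaligned_be(const uint8_t *mem, size_t addr)
{
    uint32_t rt = 0;
    rt = lwl_be(rt, mem, addr);
    rt = lwr_be(rt, mem, addr + 3);
    return rt;
}
```

The second instruction takes the result of the first as an input,
which is exactly the "merge with the old register value" dependency
under discussion.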

> As for the gather bug, the gather instructions were added with AVX2,
> first implemented in Haswell (released 2013), and with AVX-512 F
> (Knights Landing, 2013). Intel had problems with partial register
> updates since they introduced OoO with the Pentium Pro (released
> 1995), so they were aware of the problems of implementing such
> instructions. And yet they chose to do so, in addition to choosing to
> add an instruction that may perform 16 memory accesses.
>
> Interestingly, the definition of, e.g., VGATHERPS says:
>
> |* Faults are delivered in a right-to-left manner. That is, if a fault
> | is triggered by an element and delivered, all elements closer to the
> | LSB of the destination will be completed (and
> | non-faulting). Individual elements closer to the MSB may or may not
> | be completed. If a given element triggers multiple faults, they are
> | delivered in the conventional order.
> |
> |* Elements may be gathered in any order, but faults must be delivered
> | in a right-to-left order; thus, elements to the left of a faulting
> | one may be gathered before the fault is delivered. A given
> | implementation of this instruction is repeatable - given the same
> | input values and architectural state, the same set of elements to
> | the left of the faulting one will be gathered.
>
> It's not clear to me what this "completed" and "gathered" is about.
> Is it about changing the destination register in case of a fault? Is
> it about order of memory accesses for concurrent processing?
>
> There is a good reason why multi-access instructions are shunned by
> most architects.

Is there a wider-than-128-bit SIMD ISA that does *not* have one or
another variant of gather?

Personally, I have a low opinion of gather. IMHO, if one's algorithm
needs gather for vectorization then it's a sign that the vectorization gain
will be low and that you had better not bother.
But it seems to me that most architects are enamored of this sort of crap.

> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<1FqBM.109941$8_8a.81845@fx48.iad>

https://news.novabbs.org/devel/article-flat.php?id=33598&group=comp.arch#33598
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx48.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Downfall hardware bug involving vgather instruction
Newsgroups: comp.arch
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com> <2023Aug11.081533@mips.complang.tuwien.ac.at> <12d80e40-6123-438a-9f02-031c4e0753a1n@googlegroups.com>
Lines: 19
Message-ID: <1FqBM.109941$8_8a.81845@fx48.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Fri, 11 Aug 2023 13:25:17 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Fri, 11 Aug 2023 13:25:17 GMT
X-Received-Bytes: 1767
 by: Scott Lurndal - Fri, 11 Aug 2023 13:25 UTC

Michael S <already5chosen@yahoo.com> writes:
>On Friday, August 11, 2023 at 9:39:09 AM UTC+3, Anton Ertl wrote:

>> There is a good reason why multi-access instructions are shunned by
>> most architects.
>
>Is there wider than 128-bit SIMD ISA that does *not* have that or
>another variant of gather?
>
>Personally I am of low opinion about gather. IMHO, if one's algorithm
>needs gather for vectorization then it's a sign that vectorization gain
>will be low and that you better not bother.
>But it seems to me that most architect are enamored but this sort of crap.

It has been my experience that architects are interested in generally improving
performance. They don't just willy-nilly add features for the sake
of adding features, but rather spend a great deal of time analyzing workloads
and modeling performance improvements before they are adopted into an
instruction set architecture.

Re: Downfall hardware bug involving vgather instruction

<97c33882-9796-4ce1-9b5b-2ebb53e16080n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33599&group=comp.arch#33599
X-Received: by 2002:ad4:4e34:0:b0:63c:ef89:1a5e with SMTP id dm20-20020ad44e34000000b0063cef891a5emr35218qvb.0.1691761273414;
Fri, 11 Aug 2023 06:41:13 -0700 (PDT)
X-Received: by 2002:a05:6a00:2488:b0:67d:41a8:3e19 with SMTP id
c8-20020a056a00248800b0067d41a83e19mr794063pfv.3.1691761273184; Fri, 11 Aug
2023 06:41:13 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 11 Aug 2023 06:41:12 -0700 (PDT)
In-Reply-To: <1FqBM.109941$8_8a.81845@fx48.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2a0d:6fc2:55b0:ca00:2587:f642:e15c:3f95;
posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 2a0d:6fc2:55b0:ca00:2587:f642:e15c:3f95
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>
<2023Aug11.081533@mips.complang.tuwien.ac.at> <12d80e40-6123-438a-9f02-031c4e0753a1n@googlegroups.com>
<1FqBM.109941$8_8a.81845@fx48.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <97c33882-9796-4ce1-9b5b-2ebb53e16080n@googlegroups.com>
Subject: Re: Downfall hardware bug involving vgather instruction
From: already5chosen@yahoo.com (Michael S)
Injection-Date: Fri, 11 Aug 2023 13:41:13 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: Michael S - Fri, 11 Aug 2023 13:41 UTC

On Friday, August 11, 2023 at 4:25:21 PM UTC+3, Scott Lurndal wrote:
> Michael S <already...@yahoo.com> writes:
> >On Friday, August 11, 2023 at 9:39:09 AM UTC+3, Anton Ertl wrote:
>
> >> There is a good reason why multi-access instructions are shunned by
> >> most architects.
> >
> >Is there wider than 128-bit SIMD ISA that does *not* have that or
> >another variant of gather?
> >
> >Personally I am of low opinion about gather. IMHO, if one's algorithm
> >needs gather for vectorization then it's a sign that vectorization gain
> >will be low and that you better not bother.
> >But it seems to me that most architect are enamored but this sort of crap.
> It has been my experience that architects are interested in generally improving
> performance. They don't just willy-nilly add features for the sake
> of adding features, but rather spend a great deal of time analyzing workloads
> and modeling performance improvements before they are adopted into a
> instruction set architecture.

Gather is in SVE not because of some deep analysis of its advantages and
disadvantages on today's huge cores with relatively narrow 128-bit or 256-bit
SIMD. It's there because it made sense for Fujitsu A64FX - a relatively spartan
4-wide core with smallish OoO structures, a long load pipeline and wide 512-bit
SIMD. Another big reason - it was present in Fujitsu's previous SPARC64 VIIIfx.

Gather is in AVX-512 for similar reasons - a legacy of LRBni and of KNC/KNL.

Why do they have gather in AVX2? I don't know. IMHO, by mistake.

Re: Downfall hardware bug involving vgather instruction

<ub5ggm$sfal$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=33600&group=comp.arch#33600

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Downfall hardware bug involving vgather instruction
Date: Fri, 11 Aug 2023 16:27:34 +0200
Organization: A noiseless patient Spider
Lines: 30
Message-ID: <ub5ggm$sfal$1@dont-email.me>
References: <2023Aug9.171316@mips.complang.tuwien.ac.at>
<3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>
<2023Aug11.081533@mips.complang.tuwien.ac.at>
<12d80e40-6123-438a-9f02-031c4e0753a1n@googlegroups.com>
<1FqBM.109941$8_8a.81845@fx48.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 11 Aug 2023 14:27:34 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="656e07e0bac898f6a02fe36e3dda52ea";
logging-data="933205"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/5JKst5T6jjQuvoCN0GduMWRmjTvvxxkD1LybTyWYKyg=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.17
Cancel-Lock: sha1:mp9+RYkSglbxwEk/yEV5Y/0T968=
In-Reply-To: <1FqBM.109941$8_8a.81845@fx48.iad>
 by: Terje Mathisen - Fri, 11 Aug 2023 14:27 UTC

Scott Lurndal wrote:
> Michael S <already5chosen@yahoo.com> writes:
>> On Friday, August 11, 2023 at 9:39:09 AM UTC+3, Anton Ertl wrote:
>
>>> There is a good reason why multi-access instructions are shunned by
>>> most architects.
>>
>> Is there wider than 128-bit SIMD ISA that does *not* have that or
>> another variant of gather?
>>
>> Personally I am of low opinion about gather. IMHO, if one's algorithm
>> needs gather for vectorization then it's a sign that vectorization gain
>> will be low and that you better not bother.
>> But it seems to me that most architect are enamored but this sort of crap.
>
> It has been my experience that architects are interested in generally improving
> performance. They don't just willy-nilly add features for the sake
> of adding features, but rather spend a great deal of time analyzing workloads
> and modeling performance improvements before they are adopted into a
> instruction set architecture.

Gather is one of the most obvious missing links (if not the most obvious)
if you want to allow a compiler to vectorize arbitrary scalar looping code,
particularly when any form of sparse matrix storage is involved.
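
To make the sparse-matrix case concrete, here is a minimal sketch of a
matrix-vector product over CSR (compressed sparse row) storage, in Python
for brevity. The names csr_matvec, row_ptr, col_idx and values are
illustrative assumptions, not taken from any particular library. The loads
of values[j] and col_idx[j] are stride-1, but x[col_idx[j]] is an indexed
load, and that is the access a vectorizer can only keep in SIMD form if it
has a gather (otherwise it has to drop to scalar loads for it).

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[r]..row_ptr[r+1] delimits the nonzeros of row r;
    col_idx[j] is the column of the j-th nonzero, values[j] its value.
    """
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for j in range(row_ptr[r], row_ptr[r + 1]):
            # values[j] and col_idx[j]: consecutive (stride-1) loads.
            # x[col_idx[j]]: data-dependent address, i.e. a gather
            # once several iterations are processed at a time.
            acc += values[j] * x[col_idx[j]]
        y.append(acc)
    return y
```

For the 3x3 matrix [[1,0,2],[0,3,0],[4,0,5]] the arrays would be
row_ptr=[0,2,3,5], col_idx=[0,2,1,0,2], values=[1,2,3,4,5].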

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Downfall hardware bug involving vgather instruction

<2023Aug12.083815@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=33609&group=comp.arch#33609

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Downfall hardware bug involving vgather instruction
Date: Sat, 12 Aug 2023 06:38:15 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 35
Message-ID: <2023Aug12.083815@mips.complang.tuwien.ac.at>
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com> <2023Aug11.081533@mips.complang.tuwien.ac.at> <12d80e40-6123-438a-9f02-031c4e0753a1n@googlegroups.com>
Injection-Info: dont-email.me; posting-host="2ff738c98414fa90554011cf37d58574";
logging-data="1316029"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+BvQT2qURu1x3EyNk04rIN"
Cancel-Lock: sha1:mdpExbYZ4GSW2mc9eJxUKp2LnQY=
X-newsreader: xrn 10.11
 by: Anton Ertl - Sat, 12 Aug 2023 06:38 UTC

Michael S <already5chosen@yahoo.com> writes:
>Is there wider than 128-bit SIMD ISA that does *not* have that or
>another variant of gather?

AVX. However, Intel added gather instructions with AVX2 two years
later.

>Personally I am of low opinion about gather. IMHO, if one's algorithm
>needs gather for vectorization then it's a sign that vectorization gain
>will be low and that you better not bother.
>But it seems to me that most architect are enamored but this sort of crap.

Yes, it's better to arrange your data or your computations so you
don't need gather.

However, if the programmer does not do that (either because he does
not think about vectorization, or just does not find such an
arrangement), gather and scatter instructions may be useful for
avoiding having to switch from SIMD to scalar code and back.

AFAIK the Cray-1 had gather and scatter instructions. I wonder how
large the speed difference between stride-1 loads and stores and gather
and scatter was on the Cray-1.

Maybe gather and scatter were added in the Cray-1 (for which people
thought about how to vectorize) because they could be implemented
efficiently, and nowadays SIMD designers add them, because the Cray-1
had them, and code designed with that background has them, even though
nowadays the performance advantages of stride-1 accesses are much
bigger.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<2023Aug12.085106@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=33610&group=comp.arch#33610

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Downfall hardware bug involving vgather instruction
Date: Sat, 12 Aug 2023 06:51:06 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 10
Message-ID: <2023Aug12.085106@mips.complang.tuwien.ac.at>
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <ub2v4c$dk5m$1@dont-email.me> <2023Aug10.181205@mips.complang.tuwien.ac.at> <ff76e49b-3f53-46c2-9132-c942cae8c0a0n@googlegroups.com> <2023Aug11.080805@mips.complang.tuwien.ac.at> <b40214fa-ccc2-4f06-9c7e-ddf53eb7b5f5n@googlegroups.com>
Injection-Info: dont-email.me; posting-host="2ff738c98414fa90554011cf37d58574";
logging-data="1316029"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18KlfQajd1g8zoHMzxD0O2U"
Cancel-Lock: sha1:rTUyxcCJyFkOfjV9AzMUAzZg/kg=
X-newsreader: xrn 10.11
 by: Anton Ertl - Sat, 12 Aug 2023 06:51 UTC

Michael S <already5chosen@yahoo.com> writes:
>Downfall, on the other hand, looks impossible to find unless you know
>what you are looking for.

So how do you think Daniel Moghimi knew what to look for?

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<ub7f1d$18p2t$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=33611&group=comp.arch#33611

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Downfall hardware bug involving vgather instruction
Date: Sat, 12 Aug 2023 10:14:36 +0200
Organization: A noiseless patient Spider
Lines: 56
Message-ID: <ub7f1d$18p2t$1@dont-email.me>
References: <2023Aug9.171316@mips.complang.tuwien.ac.at>
<3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>
<2023Aug11.081533@mips.complang.tuwien.ac.at>
<12d80e40-6123-438a-9f02-031c4e0753a1n@googlegroups.com>
<2023Aug12.083815@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 12 Aug 2023 08:14:37 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c4ca2e48ead0a6abd6f21214f8ee51cc";
logging-data="1336413"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX191Z6PR6yANYp2hgnVKjzvnMplW+EZbfihy8Ixe6aqo0w=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.17
Cancel-Lock: sha1:+2J5emrHPNw34bzJ4LkoaOEruTw=
In-Reply-To: <2023Aug12.083815@mips.complang.tuwien.ac.at>
 by: Terje Mathisen - Sat, 12 Aug 2023 08:14 UTC

Anton Ertl wrote:
> Michael S <already5chosen@yahoo.com> writes:
>> Is there wider than 128-bit SIMD ISA that does *not* have that or
>> another variant of gather?
>
> AVX. However, Intel added gather instructions with AVX2 two years
> later.
>
>> Personally I am of low opinion about gather. IMHO, if one's algorithm
>> needs gather for vectorization then it's a sign that vectorization gain
>> will be low and that you better not bother.
>> But it seems to me that most architect are enamored but this sort of crap.
>
> Yes, it's better to arrange your data or your computations so you
> don't need gather.
>
> However, if the programmer does not do that (either because he does
> not think about vectorization, or just does not find such an
> arrangement), gather and scatter instructions may be useful for
> avoiding to have to switch from SIMD to scalar code and back.

I think this is exactly right.
>
> AFAIK the Cray-1 had gather and scatter instructions. I wonder how
> much the speed difference between stride-1 loads and stores and gather
> and scatter was on the Cray-1.
>
> Maybe gather and scatter were added in the Cray-1 (for which people
> thought about how to vectorize) because they could be implemented
> efficiently, and nowadays SIMD designers add them, because the Cray-1
> had them, and code designed with that background has them, even though
> nowadays the performance advantages of stride-1 accesses are much
> bigger.

I view gather as a form of caching, in that it can be much more
efficient when the source data is mostly sequential, or at least
spatially nearby:

On the original Larrabee the intention/target was for gather to use one
cycle per source cache line, i.e. it would process source addresses from
the left, then while looking at the corresponding cache line, it would
also pick up any subsequent words that happened to lie in the same cache
line.

This meant that for purely sequential data you got maximal speed, while
a gather from 16 random addresses would take the same 16 cycles as a
scalar implementation.

(This all assumes source data in $L1 cache of course, adding wait cycles
for more remote source data.)
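
A toy model of the "one cycle per source cache line" scheme, as a sketch
only: it assumes 64-byte cache lines, byte addresses, and everything
already in L1. The names gather_cycles and line_bytes are hypothetical,
and this makes no claim about how the real hardware was built; it just
counts how many distinct lines a gather touches when handled in the
order described.

```python
def gather_cycles(addresses, line_bytes=64):
    """Count cycles for a gather that retires one cache line per cycle.

    Each cycle takes the first still-pending address, looks at its
    cache line, and also picks up every other pending address that
    falls in that same line.
    """
    pending = list(addresses)
    cycles = 0
    while pending:
        line = pending[0] // line_bytes
        # Everything in the same line is satisfied in this cycle.
        pending = [a for a in pending if a // line_bytes != line]
        cycles += 1
    return cycles
```

With 16 aligned, consecutive 4-byte elements all addresses share one line,
so the model gives a single cycle; 16 addresses in 16 different lines give
16 cycles, matching the scalar case.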

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Downfall hardware bug involving vgather instruction

<ub89r5$1ccmd$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=33613&group=comp.arch#33613

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: sfuld@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Downfall hardware bug involving vgather instruction
Date: Sat, 12 Aug 2023 08:52:04 -0700
Organization: A noiseless patient Spider
Lines: 45
Message-ID: <ub89r5$1ccmd$1@dont-email.me>
References: <2023Aug9.171316@mips.complang.tuwien.ac.at>
<3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>
<2023Aug11.081533@mips.complang.tuwien.ac.at>
<12d80e40-6123-438a-9f02-031c4e0753a1n@googlegroups.com>
<2023Aug12.083815@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 12 Aug 2023 15:52:05 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c14ab4315f5f328415e1e23b15046ae8";
logging-data="1454797"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+suM9Huo+4XZyAssOR82YrQZ54gJRyAx4="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:56ommFBFwLfyMg99rwzc27B4gS8=
In-Reply-To: <2023Aug12.083815@mips.complang.tuwien.ac.at>
Content-Language: en-US
 by: Stephen Fuld - Sat, 12 Aug 2023 15:52 UTC

On 8/11/2023 11:38 PM, Anton Ertl wrote:
> Michael S <already5chosen@yahoo.com> writes:
>> Is there wider than 128-bit SIMD ISA that does *not* have that or
>> another variant of gather?
>
> AVX. However, Intel added gather instructions with AVX2 two years
> later.
>
>> Personally I am of low opinion about gather. IMHO, if one's algorithm
>> needs gather for vectorization then it's a sign that vectorization gain
>> will be low and that you better not bother.
>> But it seems to me that most architect are enamored but this sort of crap.
>
> Yes, it's better to arrange your data or your computations so you
> don't need gather.
>
> However, if the programmer does not do that (either because he does
> not think about vectorization, or just does not find such an
> arrangement), gather and scatter instructions may be useful for
> avoiding to have to switch from SIMD to scalar code and back.
>
> AFAIK the Cray-1 had gather and scatter instructions. I wonder how
> much the speed difference between stride-1 loads and stores and gather
> and scatter was on the Cray-1.

Probably not much at all, perhaps none. Remember, Cray-1 had no cache,
used static RAM for main memory (i.e. no page mode, etc.), and had
highly interleaved memory.

> Maybe gather and scatter were added in the Cray-1 (for which people
> thought about how to vectorize) because they could be implemented
> efficiently, and nowadays SIMD designers add them, because the Cray-1
> had them, and code designed with that background has them, even though
> nowadays the performance advantages of stride-1 accesses are much
> bigger.

Could be.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Downfall hardware bug involving vgather instruction

<6a350941-2ed6-4ea8-8c98-ecd1b14ac640n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33615&group=comp.arch#33615

Newsgroups: comp.arch
X-Received: by 2002:a05:620a:250:b0:762:5081:31a9 with SMTP id q16-20020a05620a025000b00762508131a9mr82391qkn.0.1691859717992;
Sat, 12 Aug 2023 10:01:57 -0700 (PDT)
X-Received: by 2002:a17:903:2444:b0:1b8:5541:9d3e with SMTP id
l4-20020a170903244400b001b855419d3emr2144151pls.6.1691859717718; Sat, 12 Aug
2023 10:01:57 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 12 Aug 2023 10:01:57 -0700 (PDT)
In-Reply-To: <2023Aug12.083815@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:f13f:c8fd:311b:b1a;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:f13f:c8fd:311b:b1a
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>
<2023Aug11.081533@mips.complang.tuwien.ac.at> <12d80e40-6123-438a-9f02-031c4e0753a1n@googlegroups.com>
<2023Aug12.083815@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <6a350941-2ed6-4ea8-8c98-ecd1b14ac640n@googlegroups.com>
Subject: Re: Downfall hardware bug involving vgather instruction
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Sat, 12 Aug 2023 17:01:57 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Sat, 12 Aug 2023 17:01 UTC

On Saturday, August 12, 2023 at 1:49:58 AM UTC-5, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> >Is there wider than 128-bit SIMD ISA that does *not* have that or
> >another variant of gather?
> AVX. However, Intel added gather instructions with AVX2 two years
> later.
> >Personally I am of low opinion about gather. IMHO, if one's algorithm
> >needs gather for vectorization then it's a sign that vectorization gain
> >will be low and that you better not bother.
> >But it seems to me that most architect are enamored but this sort of crap.
> Yes, it's better to arrange your data or your computations so you
> don't need gather.
>
> However, if the programmer does not do that (either because he does
> not think about vectorization, or just does not find such an
> arrangement), gather and scatter instructions may be useful for
> avoiding to have to switch from SIMD to scalar code and back.
>
> AFAIK the Cray-1 had gather and scatter instructions. I wonder how
> much the speed difference between stride-1 loads and stores and gather
> and scatter was on the Cray-1.
<
Cray-1 and Cray-1S did NOT have gather/scatter.
Cray X-MP and Y-MP did.
>
> Maybe gather and scatter were added in the Cray-1 (for which people
> thought about how to vectorize) because they could be implemented
> efficiently, and nowadays SIMD designers add them, because the Cray-1
> had them, and code designed with that background has them, even though
> nowadays the performance advantages of stride-1 accesses are much
> bigger.
<
"Efficiently" in the CRAY X-MP meant AGENing 1 new address per cycle per gather and
sending it out into the heavily interleaved memory system.
<
"Efficiently" in the CRAY Y-MP meant AGENing 3 addresses per cycle and sending these
out into the even more heavily interleaved memory system.
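
A rough way to see why this works without a cache is a small bank-conflict
model, sketched below in Python. Everything in it is an assumption made
for illustration: one address generated per cycle, word addresses
interleaved across banks by addr % banks, and a bank that stays busy for
bank_busy cycles after each access. The defaults (64 banks, 4 busy cycles)
are placeholder values, not figures for any specific Cray machine, and
issue_cycles is a hypothetical name.

```python
def issue_cycles(word_addresses, banks=64, bank_busy=4):
    """Cycles needed to issue all addresses, one AGEN per cycle.

    An access stalls until its bank is free again; a bank is
    occupied for bank_busy cycles after it accepts an access.
    """
    free_at = [0] * banks  # cycle at which each bank is free again
    cycle = 0
    for addr in word_addresses:
        bank = addr % banks
        cycle = max(cycle, free_at[bank])  # stall on a busy bank
        free_at[bank] = cycle + bank_busy
        cycle += 1  # next address is generated in the next cycle
    return cycle
```

In this model a gather whose indices spread over distinct banks issues at
the same rate as a stride-1 load; only index patterns that keep hitting the
same bank (for example a stride equal to the number of banks) slow down.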
<
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<10c5f8f9-59cf-4509-9197-640db24b63aen@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33616&group=comp.arch#33616

Newsgroups: comp.arch
X-Received: by 2002:a05:6214:162e:b0:63c:f952:2d0e with SMTP id e14-20020a056214162e00b0063cf9522d0emr75336qvw.2.1691859800494;
Sat, 12 Aug 2023 10:03:20 -0700 (PDT)
X-Received: by 2002:a17:90b:3004:b0:268:1d63:b9ae with SMTP id
hg4-20020a17090b300400b002681d63b9aemr1090547pjb.3.1691859799967; Sat, 12 Aug
2023 10:03:19 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 12 Aug 2023 10:03:19 -0700 (PDT)
In-Reply-To: <ub89r5$1ccmd$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:f13f:c8fd:311b:b1a;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:f13f:c8fd:311b:b1a
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>
<2023Aug11.081533@mips.complang.tuwien.ac.at> <12d80e40-6123-438a-9f02-031c4e0753a1n@googlegroups.com>
<2023Aug12.083815@mips.complang.tuwien.ac.at> <ub89r5$1ccmd$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <10c5f8f9-59cf-4509-9197-640db24b63aen@googlegroups.com>
Subject: Re: Downfall hardware bug involving vgather instruction
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Sat, 12 Aug 2023 17:03:20 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3534
 by: MitchAlsup - Sat, 12 Aug 2023 17:03 UTC

On Saturday, August 12, 2023 at 10:52:09 AM UTC-5, Stephen Fuld wrote:
> On 8/11/2023 11:38 PM, Anton Ertl wrote:
> > Michael S <already...@yahoo.com> writes:
> >> Is there wider than 128-bit SIMD ISA that does *not* have that or
> >> another variant of gather?
> >
> > AVX. However, Intel added gather instructions with AVX2 two years
> > later.
> >
> >> Personally I am of low opinion about gather. IMHO, if one's algorithm
> >> needs gather for vectorization then it's a sign that vectorization gain
> >> will be low and that you better not bother.
> >> But it seems to me that most architect are enamored but this sort of crap.
> >
> > Yes, it's better to arrange your data or your computations so you
> > don't need gather.
> >
> > However, if the programmer does not do that (either because he does
> > not think about vectorization, or just does not find such an
> > arrangement), gather and scatter instructions may be useful for
> > avoiding to have to switch from SIMD to scalar code and back.
> >
> > AFAIK the Cray-1 had gather and scatter instructions. I wonder how
> > much the speed difference between stride-1 loads and stores and gather
> > and scatter was on the Cray-1.
> Probably not much at all, perhaps none. Remember, Cray-1 had no cache,
<
Yes, CRAY-1
<
> used static RAM for main memory
<
CRAY-1S
<
> (i.e. no page mode, etc.), and had
> highly interleaved memory.
<
64-to-256 banks
<
> > Maybe gather and scatter were added in the Cray-1 (for which people
> > thought about how to vectorize) because they could be implemented
> > efficiently, and nowadays SIMD designers add them, because the Cray-1
> > had them, and code designed with that background has them, even though
> > nowadays the performance advantages of stride-1 accesses are much
> > bigger.
> Could be.
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: Downfall hardware bug involving vgather instruction

<881f1c52-da86-455f-974f-fd8502ef52acn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33617&group=comp.arch#33617

Newsgroups: comp.arch
X-Received: by 2002:a05:6214:8ed:b0:63d:b91:771c with SMTP id dr13-20020a05621408ed00b0063d0b91771cmr62744qvb.0.1691864707009;
Sat, 12 Aug 2023 11:25:07 -0700 (PDT)
X-Received: by 2002:a17:902:f349:b0:1b8:8c7:31e6 with SMTP id
q9-20020a170902f34900b001b808c731e6mr1773242ple.1.1691864706789; Sat, 12 Aug
2023 11:25:06 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 12 Aug 2023 11:25:06 -0700 (PDT)
In-Reply-To: <2023Aug11.081533@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:f13f:c8fd:311b:b1a;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:f13f:c8fd:311b:b1a
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>
<2023Aug11.081533@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <881f1c52-da86-455f-974f-fd8502ef52acn@googlegroups.com>
Subject: Re: Downfall hardware bug involving vgather instruction
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Sat, 12 Aug 2023 18:25:07 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 5512
 by: MitchAlsup - Sat, 12 Aug 2023 18:25 UTC

On Friday, August 11, 2023 at 1:39:09 AM UTC-5, Anton Ertl wrote:
> MitchAlsup <Mitch...@aol.com> writes:
> >The gather bug is an example of not-obeying the rule:
> >"If you change 1 bit in an architectural register, you change them all"
> ><
> >RISC-people in 1980 already knew the x86 byte stuff was problematic
> >{along with a host of other mis-features}
<
> Where can I read about this knowledge of the RISC people? Did they
> even waste one thought to the 8086, which was an insignificant
> microprocessor in 1980?
<
It was well known that early x86 compilers had a hard time treating
the separate containers of the registers sometimes as a pair and
sometimes as a byte.
<
It was very well known that setting some condition codes without setting
all of them was already problematic--not just on the HW but for the
compilers, too.
<
It was well known that dedicated registers for MUL and DIV were
a different kind of problem for the compilers of that era.
<
And it was seen as a significant problem that the IBM 360 has insert
character but not LDB (especially with the K&R C rules for integer
upward casting).
<
I don't know/remember where any of this was actually written down.
<
I certainly knew this stuff in 1983 after having built the DENELCOR
C compiler from PCC base in 1981-2. {This is also where I discovered
my dislike of condition codes, and separate register files.}
>
> As for the gather bug, the gather instructions were added with AVX2,
> first implemented in Haswell (released 2013), and with AVX-512 F
> (Knights Landing, 2013). Intel had problems with partial register
<
AMD had long and significant talks on partial register updates when
designing Opteron (x86-64). There were essentially 2 camps--a) get
rid of them, and b) backwards compatibility is a requirement. The
rest (as they say) is history.
<
> updates since they introduced OoO with the Pentium Pro (released
> 1995), so they were aware of the problems of implementing such
> instructions. And yet they chose to do so, in addition to choosing to
> add an instruction that may perform 16 memory accesses.
<
I think they would feel they were forced into it, and it was not a "free" choice.
>
> Interestingly, the definition of, e.g., VGATHERPS says:
>
> |* Faults are delivered in a right-to-left manner. That is, if a fault
> | is triggered by an element and delivered, all elements closer to the
> | LSB of the destination will be completed (and
> | non-faulting). Individual elements closer to the MSB may or may not
> | be completed. If a given element triggers multiple faults, they are
> | delivered in the conventional order.
> |
> |* Elements may be gathered in any order, but faults must be delivered
> | in a right-to-left order; thus, elements to the left of a faulting
> | one may be gathered before the fault is delivered. A given
> | implementation of this instruction is repeatable - given the same
> | input values and architectural state, the same set of elements to
> | the left of the faulting one will be gathered.
>
> It's not clear to me what this "completed" and "gathered" is about.
> Is it about changing the destination register in case of a fault? Is
> it about order of memory accesses for concurrent processing?
<
When you define gather as above and you have an OoO µArchitecture,
you are tempted to stream operands through AGEN and out through
the memory system. The odd accesses that miss the TLB are then not serviced
in the order of occurrence in the AGENing, and not serviced in the order
of occurrence in filling of the register. This can lead to all sorts of
"interesting stuff happening", not much of which is good for the
implementation.
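
For reference, the simplest behaviour that satisfies the quoted fault rule
can be modelled in a few lines of Python. This is only a sketch under
stated assumptions: elements are handled strictly in order from the LSB
end, a missing key in a dict stands in for a faulting address, and each
completed element clears its mask bit so that re-executing the
instruction after the fault skips the work already done (the architecture
does define the mask as being cleared per gathered element; the rest of
the model and the names gather and ElementFault are made up here). A real
OoO implementation may also complete elements to the left of the faulting
one, which this in-order model never does.

```python
class ElementFault(Exception):
    """Stand-in for a page fault raised by one gather element."""

    def __init__(self, element):
        super().__init__(f"fault on element {element}")
        self.element = element


def gather(memory, indices, mask, dest):
    """In-order gather model: element 0 is the one closest to the LSB.

    memory  -- dict of address -> value; a missing address faults
    indices -- one address per element
    mask    -- list of bools, updated in place (True = still to do)
    dest    -- destination elements, updated in place
    """
    for i, addr in enumerate(indices):
        if not mask[i]:
            continue  # already completed before an earlier fault
        if addr not in memory:
            # All lower elements are complete; higher ones untouched.
            raise ElementFault(i)
        dest[i] = memory[addr]
        mask[i] = False  # mark element as done
    return dest
```

After the handler "pages in" the missing address, calling gather again with
the same mask and dest resumes at the faulting element.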
>
> There is a good reason why multi-access instructions are shunned by
> most architects.
<
Especially when 1 address cannot define the memory access pattern.
<
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<ebd28e97-acbd-4c7e-aa64-8b871620e995n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33623&group=comp.arch#33623

Newsgroups: comp.arch
X-Received: by 2002:a05:622a:1894:b0:40f:c807:f5f8 with SMTP id v20-20020a05622a189400b0040fc807f5f8mr76532qtc.10.1691921205921;
Sun, 13 Aug 2023 03:06:45 -0700 (PDT)
X-Received: by 2002:a17:90a:f0c4:b0:268:5919:a271 with SMTP id
fa4-20020a17090af0c400b002685919a271mr1426116pjb.8.1691921205425; Sun, 13 Aug
2023 03:06:45 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!panix!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 13 Aug 2023 03:06:44 -0700 (PDT)
In-Reply-To: <2023Aug12.085106@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <ub2v4c$dk5m$1@dont-email.me>
<2023Aug10.181205@mips.complang.tuwien.ac.at> <ff76e49b-3f53-46c2-9132-c942cae8c0a0n@googlegroups.com>
<2023Aug11.080805@mips.complang.tuwien.ac.at> <b40214fa-ccc2-4f06-9c7e-ddf53eb7b5f5n@googlegroups.com>
<2023Aug12.085106@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ebd28e97-acbd-4c7e-aa64-8b871620e995n@googlegroups.com>
Subject: Re: Downfall hardware bug involving vgather instruction
From: already5chosen@yahoo.com (Michael S)
Injection-Date: Sun, 13 Aug 2023 10:06:45 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2034
 by: Michael S - Sun, 13 Aug 2023 10:06 UTC

On Saturday, August 12, 2023 at 9:53:33 AM UTC+3, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> >Downfall, on the other hand, looks impossible to find unless you know
> >what you are looking for.
> So how do you think did Daniel Moghimi know what to look for?
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

I don't know. But his article leaves the feeling that he somehow suspected
there was something of this sort.

Re: Downfall hardware bug involving vgather instruction

<460e480b-ccac-4e50-8add-b6b446e72269n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33624&group=comp.arch#33624

Newsgroups: comp.arch
X-Received: by 2002:a05:620a:6649:b0:76c:deff:8c42 with SMTP id qg9-20020a05620a664900b0076cdeff8c42mr66135qkn.14.1691922071247;
Sun, 13 Aug 2023 03:21:11 -0700 (PDT)
X-Received: by 2002:a05:6a00:ac2:b0:676:ba7f:7906 with SMTP id
c2-20020a056a000ac200b00676ba7f7906mr2892624pfl.3.1691922071047; Sun, 13 Aug
2023 03:21:11 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 13 Aug 2023 03:21:10 -0700 (PDT)
In-Reply-To: <2023Aug12.083815@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <3814934d-daaf-4d18-b8e7-a680713e6626n@googlegroups.com>
<2023Aug11.081533@mips.complang.tuwien.ac.at> <12d80e40-6123-438a-9f02-031c4e0753a1n@googlegroups.com>
<2023Aug12.083815@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <460e480b-ccac-4e50-8add-b6b446e72269n@googlegroups.com>
Subject: Re: Downfall hardware bug involving vgather instruction
From: already5chosen@yahoo.com (Michael S)
Injection-Date: Sun, 13 Aug 2023 10:21:11 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3427
 by: Michael S - Sun, 13 Aug 2023 10:21 UTC

On Saturday, August 12, 2023 at 9:49:58 AM UTC+3, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> >Is there wider than 128-bit SIMD ISA that does *not* have that or
> >another variant of gather?
> AVX. However, Intel added gather instructions with AVX2 two years
> later.
> >Personally I am of low opinion about gather. IMHO, if one's algorithm
> >needs gather for vectorization then it's a sign that vectorization gain
> >will be low and that you better not bother.
> >But it seems to me that most architects are enamored of this sort of crap.
> Yes, it's better to arrange your data or your computations so you
> don't need gather.
>
> However, if the programmer does not do that (either because he does
> not think about vectorization, or just does not find such an
> arrangement), gather and scatter instructions may be useful for
> avoiding to have to switch from SIMD to scalar code and back.
>

I wanted to write that the problem with the current form of AVX/AVX2 is that
mixing SIMD ALU operations with scalar loads is expensive. But while writing I
realized that this is not necessarily true. One can use a series of broadcast
loads followed by VBLENDPx. Both are cheap.
I hadn't thought of this simple technique until now.
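
A minimal sketch of the broadcast-plus-blend idea, assuming single-precision
data and plain AVX (no AVX2 vgather). The function name
gather8_broadcast_blend and its signature are made up for illustration, and
the target attribute assumes GCC or Clang on x86-64. Each element is fetched
with a broadcast load (VBROADCASTSS) and merged into its lane with VBLENDPS.

```c
#include <immintrin.h>  /* AVX intrinsics */

/* Hypothetical helper, not from the thread: emulate an 8-lane float gather
 * with broadcast loads and blends only. The result is stored to memory so
 * that callers do not need to be compiled with AVX enabled. */
__attribute__((target("avx")))
void gather8_broadcast_blend(const float *base, const int idx[8], float out[8])
{
    /* Broadcast element 0 to all lanes; lanes 1..7 are replaced below. */
    __m256 r = _mm256_broadcast_ss(&base[idx[0]]);

    /* Bit k of the immediate selects lane k from the second operand,
     * so each blend overwrites exactly one lane of r. */
    r = _mm256_blend_ps(r, _mm256_broadcast_ss(&base[idx[1]]), 0x02);
    r = _mm256_blend_ps(r, _mm256_broadcast_ss(&base[idx[2]]), 0x04);
    r = _mm256_blend_ps(r, _mm256_broadcast_ss(&base[idx[3]]), 0x08);
    r = _mm256_blend_ps(r, _mm256_broadcast_ss(&base[idx[4]]), 0x10);
    r = _mm256_blend_ps(r, _mm256_broadcast_ss(&base[idx[5]]), 0x20);
    r = _mm256_blend_ps(r, _mm256_broadcast_ss(&base[idx[6]]), 0x40);
    r = _mm256_blend_ps(r, _mm256_broadcast_ss(&base[idx[7]]), 0x80);

    _mm256_storeu_ps(out, r);  /* unaligned store of all 8 lanes */
}
```

The blends here form one serial dependency chain. A real implementation
would probably merge pairs of lanes in a tree, and whether it beats a
hardware gather on a given core would have to be measured.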

> AFAIK the Cray-1 had gather and scatter instructions. I wonder how
> much the speed difference between stride-1 loads and stores and gather
> and scatter was on the Cray-1.
>
> Maybe gather and scatter were added in the Cray-1 (for which people
> thought about how to vectorize) because they could be implemented
> efficiently, and nowadays SIMD designers add them, because the Cray-1
> had them, and code designed with that background has them, even though
> nowadays the performance advantages of stride-1 accesses are much
> bigger.
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Downfall hardware bug involving vgather instruction

<dz5CM.79592$zW7d.52846@fx43.iad>


https://news.novabbs.org/devel/article-flat.php?id=33625&group=comp.arch#33625

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!newsreader4.netcologne.de!news.netcologne.de!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx43.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Downfall hardware bug involving vgather instruction
References: <2023Aug9.171316@mips.complang.tuwien.ac.at> <ub2v4c$dk5m$1@dont-email.me> <2023Aug10.181205@mips.complang.tuwien.ac.at> <ff76e49b-3f53-46c2-9132-c942cae8c0a0n@googlegroups.com> <2023Aug11.080805@mips.complang.tuwien.ac.at> <b40214fa-ccc2-4f06-9c7e-ddf53eb7b5f5n@googlegroups.com> <2023Aug12.085106@mips.complang.tuwien.ac.at>
In-Reply-To: <2023Aug12.085106@mips.complang.tuwien.ac.at>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 42
Message-ID: <dz5CM.79592$zW7d.52846@fx43.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Sun, 13 Aug 2023 14:14:33 UTC
Date: Sun, 13 Aug 2023 10:14:04 -0400
X-Received-Bytes: 2897
 by: EricP - Sun, 13 Aug 2023 14:14 UTC

Anton Ertl wrote:
> Michael S <already5chosen@yahoo.com> writes:
>> Downfall, on the other hand, looks impossible to find unless you know
>> what you are looking for.
>
> So how do you think did Daniel Moghimi know what to look for?
>
> - anton

The paper references two patents

US8972697B2 - Gather using index array and finite state machine
https://patents.google.com/patent/US8972697B2/

US9189236B2 - Speculative non-faulting loads and gathers
https://patents.google.com/patent/US9189236/

If he was reading Intel or AMD patents (I do that myself sometimes)
he might have noticed US8972697B2 includes the following paragraph
which has a very Meltdown-y smell to it if an SMT task switch happens
at just the wrong moment.

He'd still have to be a pretty smart cookie to figure that out.

"In one embodiment, the uops schedulers 202, 204, 206, dispatch dependent
operations before the parent load has finished executing. As uops are
speculatively scheduled and executed in processor 200, the processor 200
also includes logic to handle memory misses. If a data load misses in the
data cache, there can be dependent operations in flight in the pipeline
that have left the scheduler with temporarily incorrect data. In some
embodiments, a replay mechanism may track and re-execute instructions
that use incorrect data. Only the dependent operations need to be
replayed and the independent ones are allowed to complete.
The schedulers and replay mechanism of one embodiment of a processor
are also designed to catch instructions that provide vector scatter
and/or gather functionality. In some alternative embodiments without
a replay mechanism, speculative execution of uops may be prevented and
dependent uops may reside in the schedulers 202, 204, 206 until they are
canceled, or until they cannot be canceled."

Re: Downfall hardware bug involving vgather instruction

<ubaskc$2984t$1@newsreader4.netcologne.de>


https://news.novabbs.org/devel/article-flat.php?id=33626&group=comp.arch#33626

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd6-1c08-0-d25-c3ed-1d88-1820.ipv6dyn.netcologne.de!not-for-mail
From: tkoenig@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Downfall hardware bug involving vgather instruction
Date: Sun, 13 Aug 2023 15:25:00 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <ubaskc$2984t$1@newsreader4.netcologne.de>
References: <2023Aug9.171316@mips.complang.tuwien.ac.at>
<ub2v4c$dk5m$1@dont-email.me> <2023Aug10.181205@mips.complang.tuwien.ac.at>
<ff76e49b-3f53-46c2-9132-c942cae8c0a0n@googlegroups.com>
<2023Aug11.080805@mips.complang.tuwien.ac.at>
<b40214fa-ccc2-4f06-9c7e-ddf53eb7b5f5n@googlegroups.com>
<2023Aug12.085106@mips.complang.tuwien.ac.at>
<dz5CM.79592$zW7d.52846@fx43.iad>
Injection-Date: Sun, 13 Aug 2023 15:25:00 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd6-1c08-0-d25-c3ed-1d88-1820.ipv6dyn.netcologne.de:2001:4dd6:1c08:0:d25:c3ed:1d88:1820";
logging-data="2400413"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sun, 13 Aug 2023 15:25 UTC

EricP <ThatWouldBeTelling@thevillage.com> wrote:

> If he was reading Intel or AMD patents (I do that myself sometimes)

You're quite brave; reading and understanding patents is hard work.

This has not gotten easier in recent years, as patent offices
demand more and more detail in the disclosure in patent texts. This
makes them harder to read, because it is hard to be sure whether you are
currently reading about the core of the patent or some arbitrary
description that the author thought the patent office would require,
just to be on the safe side.

And yes, I have been very much guilty in that respect - several
people have told me that they did not understand what I had written,
although they were active in the field. My standard reply was that
these patents are meant to be exact, not necessarily easy to read.

I recently read a patent from the late 1960s/early 1970s. It had
a total of three pages - a title page, a text page, a page with
two drawings. It was a thing of beauty.
