
Re: Another security vulnerability

<2024Mar26.174038@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=38166&group=comp.arch#38166
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Tue, 26 Mar 2024 16:40:38 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 25
Message-ID: <2024Mar26.174038@mips.complang.tuwien.ac.at>
References: <utpoi2$b6to$1@dont-email.me> <utr63b$u40q$1@dont-email.me> <2024Mar25.093751@mips.complang.tuwien.ac.at> <8biMN.162475$46Te.1680@fx38.iad> <2024Mar26.101836@mips.complang.tuwien.ac.at> <rJAMN.122730$SyNd.19177@fx33.iad>
Injection-Date: Tue, 26 Mar 2024 16:46:53 +0100 (CET)
Injection-Info: dont-email.me; posting-host="448288c924380b98e6ec89021008782f";
logging-data="2025169"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19oIhzJy2DV/0uGp/yFo3z/"
Cancel-Lock: sha1:QGsEW6OfGpmyyh3VtOlaKInW1Sg=
X-newsreader: xrn 10.11
 by: Anton Ertl - Tue, 26 Mar 2024 16:40 UTC

scott@slp53.sl.home (Scott Lurndal) writes:
>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>>scott@slp53.sl.home (Scott Lurndal) writes:
>>>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>>>>Also, if the prefetcher works with data in a shared cache (I don't
>>>>know whether the data-dependent prefetchers do that), it may not
>>>>matter on which core the code runs.
>>>
>>>Run it in non-cacheable memory. Slow but safe.
>>
>>To eliminate this particular vulnerability, it's sufficient to disable
>>the data-dependent prefetcher.
>
>That assumes that chicken bit(s) are available to do that.

The hardware designers have put in the chicken bit(s); it's highly
unlikely that they have unconditionally enabled the data-dependent
prefetcher on M1 and M2, and only added a chicken bit on M3. Now that
the hardware indeed turns out to be broken, they just need to activate
it/them.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Another security vulnerability

<20240326195619.0000657e@yahoo.com>

https://news.novabbs.org/devel/article-flat.php?id=38167&group=comp.arch#38167
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Tue, 26 Mar 2024 18:56:19 +0200
Organization: A noiseless patient Spider
Lines: 39
Message-ID: <20240326195619.0000657e@yahoo.com>
References: <utpoi2$b6to$1@dont-email.me>
<589c076598e37c2339473f8ddb8718eb@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 26 Mar 2024 17:56:28 +0100 (CET)
Injection-Info: dont-email.me; posting-host="9a8f240f50772d42a957aee0b59324e0";
logging-data="1961728"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19jIgQF5osp756BoYc2arENDUNNxnVgi3A="
Cancel-Lock: sha1:tS12f1W9rSs7+AaRKA54dflUVmk=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)
 by: Michael S - Tue, 26 Mar 2024 16:56 UTC

On Sun, 24 Mar 2024 18:20:06 +0000
mitchalsup@aol.com (MitchAlsup1) wrote:

> Stephen Fuld wrote:
>
> > https://arstechnica.com/security/2024/03/hackers-can-extract-secret-encryption-keys-from-apples-mac-chips/
> >
>
> > So, is there a way to fix this while maintaining the feature's
> > performance advantage?
>
> They COULD start by not putting prefetched data into the cache
> until after the predicting instruction retires. {{I have a note
> from about 20 months ago where this feature was publicized and
> the note indicates a potential side-channel.}}
>
> An alternative is to notice that [*]cryption instructions are
> being processed and turn DMP off during those intervals of time.
> {Or both}.
>

Their PoC attacks public-key crypto algorithms: RSA-2048, DHKE-2048 and a
couple of exotic new algorithms that nobody uses.
I think neither RSA-2048 nor DHKE-2048 uses any special crypto
instructions.
On Intel/AMD it's likely that these crypto routines use the MULX
and ADX instructions much more often than non-crypto code, but that's not
guaranteed.
On ARM64 you don't even have that much, because the equivalents of MULX and
of ADX are part of the base instruction set.
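
To illustrate the point, here is a minimal sketch of the kind of inner loop that big-number code (RSA, DH) spends its time in. It is not taken from the PoC or from any real crypto library; the function name `mul_add_word` and the 32-bit limb layout are assumptions chosen for illustration. The loop consists only of ordinary multiply, add and carry propagation, so hardware has nothing "crypto-specific" to recognize.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: acc[0..n] += a[0..n-1] * b, with 32-bit limbs
 * stored least-significant first. acc must have room for n + 1 limbs.
 * Returns the carry out of limb n. Only plain multiply and add are used. */
uint32_t mul_add_word(uint32_t *acc, const uint32_t *a, size_t n, uint32_t b)
{
    assert(acc != NULL && a != NULL);

    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* 32x32 -> 64-bit product plus two 32-bit addends cannot overflow 64 bits. */
        uint64_t t = (uint64_t)a[i] * b + acc[i] + carry;
        acc[i] = (uint32_t)t;   /* low half stays in this limb */
        carry = t >> 32;        /* high half moves to the next limb */
    }
    uint64_t top = (uint64_t)acc[n] + carry;
    acc[n] = (uint32_t)top;
    return (uint32_t)(top >> 32);
}
```

On x86-64 a compiler may or may not pick MULX/ADCX/ADOX for a loop like this; nothing in the source requires it.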

> Principle:: an Architecturally visible unit of data can only become
> visible after the causing instruction retires. A high precision timer
> makes cache line [dis]placement visible; so either take away the HPT
> or don't alter cache visible state too early.
>
> And we are off to the races, again.....

Re: Another security vulnerability

<20240326195730.00004209@yahoo.com>

https://news.novabbs.org/devel/article-flat.php?id=38168&group=comp.arch#38168
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5chosen@yahoo.com (Michael S)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Tue, 26 Mar 2024 18:57:30 +0200
Organization: A noiseless patient Spider
Lines: 17
Message-ID: <20240326195730.00004209@yahoo.com>
References: <utpoi2$b6to$1@dont-email.me>
<589c076598e37c2339473f8ddb8718eb@www.novabbs.org>
<jwvedbygubo.fsf-monnier+comp.arch@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 26 Mar 2024 17:57:38 +0100 (CET)
Injection-Info: dont-email.me; posting-host="9a8f240f50772d42a957aee0b59324e0";
logging-data="1961728"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18B3xmju14670MtOix6i5zKYfFs9mhO8XY="
Cancel-Lock: sha1:2gLPhNtQDP4F2rdCBfhQhTKIUpo=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)
 by: Michael S - Tue, 26 Mar 2024 16:57 UTC

On Mon, 25 Mar 2024 12:18:18 -0400
Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> > Principle:: an Architecturally visible unit of data can only become
> > visible after the causing instruction retires. A high precision
> > timer makes cache line [dis]placement visible; so either take away
> > the HPT or don't alter cache visible state too early.
>
> And parallelism (e.g. multicores) can be used to emulate HPT, so "take
> away the HPT" is not really an option.
>
>
> Stefan

That's exactly how they measured time in the PoC.

Re: Another security vulnerability

<uu0kt1$2nr9j$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38176&group=comp.arch#38176
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tkoenig@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Wed, 27 Mar 2024 08:20:49 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 14
Message-ID: <uu0kt1$2nr9j$1@dont-email.me>
References: <utpoi2$b6to$1@dont-email.me>
<2024Mar25.082534@mips.complang.tuwien.ac.at>
<20240326192941.0000314a@yahoo.com>
Injection-Date: Wed, 27 Mar 2024 08:20:49 +0100 (CET)
Injection-Info: dont-email.me; posting-host="d28ef1cdb2468220a3cbbc5a3023845a";
logging-data="2878771"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Fxskp7Jpr69AahD51R20pwpZKkuGzrKc="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:mr9uiTWdZONjAJLt6oNNWQoqnx0=
 by: Thomas Koenig - Wed, 27 Mar 2024 08:20 UTC

Michael S <already5chosen@yahoo.com> schrieb:

> In case you missed it, the web page contains link to pdf:
> https://gofetch.fail/files/gofetch.pdf

Looking at the paper, it seems that a separate "load value" instruction
(where it is guaranteed that no pointer prefetching will be done)
could fix this particular issue. Compilers know what type is being
loaded from memory, and could issue the corresponding instruction.
This would not impact performance.

Only works for new versions of an architecture, and supporting
compilers, but no code change would be required. And, of course,
it would eat up opcode space.
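
A toy model may make the difference clearer. This is only a sketch under stated assumptions: it is not how Apple's DMP is actually implemented, and the names `load_kind`, `addr_range`, `heuristic_dmp_prefetches` and `typed_dmp_prefetches` are invented for illustration. The first predicate prefetches whenever a loaded value merely looks like an address; the second also requires that the instruction declared a pointer load.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: the kind of load the compiler would encode in the opcode. */
enum load_kind { LOAD_VALUE, LOAD_POINTER };

/* Hypothetical: half-open range [lo, hi) of addresses considered valid. */
struct addr_range { uint64_t lo, hi; };

/* Heuristic prefetcher: any loaded value that falls in the valid range
 * is treated as a pointer, so secret data that happens to look like an
 * address can trigger an observable prefetch. */
bool heuristic_dmp_prefetches(uint64_t loaded, struct addr_range r)
{
    return loaded >= r.lo && loaded < r.hi;
}

/* Typed prefetcher: the value is considered only when the instruction
 * itself was a pointer load; "load value" never triggers a prefetch. */
bool typed_dmp_prefetches(uint64_t loaded, enum load_kind kind,
                          struct addr_range r)
{
    return kind == LOAD_POINTER && heuristic_dmp_prefetches(loaded, r);
}
```

In this model a key word that happens to equal a valid address leaks through the heuristic version but not through the typed one, as long as the compiler emitted it as a value load.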

Re: Another security vulnerability

<VpVMN.731075$p%Mb.618266@fx15.iad>

https://news.novabbs.org/devel/article-flat.php?id=38177&group=comp.arch#38177
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx15.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me>
In-Reply-To: <uu0kt1$2nr9j$1@dont-email.me>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 30
Message-ID: <VpVMN.731075$p%Mb.618266@fx15.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Wed, 27 Mar 2024 13:45:25 UTC
Date: Wed, 27 Mar 2024 09:44:34 -0400
X-Received-Bytes: 2050
 by: EricP - Wed, 27 Mar 2024 13:44 UTC

Thomas Koenig wrote:
> Michael S <already5chosen@yahoo.com> schrieb:
>
>> In case you missed it, the web page contains link to pdf:
>> https://gofetch.fail/files/gofetch.pdf
>
> Looking the paper, it seems that a separate "load value" instruction
> (where it is guaranteed that no pointer prefetching will be done)
> could fix this particular issue. Compilers know what type is being
> loaded from memory, and could issue the corresponding instruction.
> This would not impact performance.
>
> Only works for new versions of an architecture, and supporting
> compilers, but no code change would be required. And, of course,
> it would eat up opcode space.

It doesn't need to eat opcode space if you only support one data type,
64-bit ints, and one address mode, [register].
Other address modes can be calculated using LEA.
Since these are rare instructions to solve a particular problem,
they won't be used that often, so a few extra instructions shouldn't matter.

I used this approach for the Atomic Fetch-and-OP instructions.
They only need one or two data types and one address mode.

I also considered the same single [reg] address mode for privileged
Load & Store to Physical Address, though these would need to
support 1,2,4, and 8 byte ints, and need some cache control bits.

Re: Another security vulnerability

<uu1ck0$2tfbq$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38178&group=comp.arch#38178
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tkoenig@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Wed, 27 Mar 2024 15:05:36 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 43
Message-ID: <uu1ck0$2tfbq$1@dont-email.me>
References: <utpoi2$b6to$1@dont-email.me>
<2024Mar25.082534@mips.complang.tuwien.ac.at>
<20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me>
<VpVMN.731075$p%Mb.618266@fx15.iad>
Injection-Date: Wed, 27 Mar 2024 15:05:36 +0100 (CET)
Injection-Info: dont-email.me; posting-host="518bc0805b9cb00d427055eddddea606";
logging-data="3063162"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18OoydnkF3GOZ74aRxlG1EX57lPTcakDcc="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:BFSZux5w2rYt9aTqRMywJsKf6WI=
 by: Thomas Koenig - Wed, 27 Mar 2024 15:05 UTC

EricP <ThatWouldBeTelling@thevillage.com> schrieb:
> Thomas Koenig wrote:
>> Michael S <already5chosen@yahoo.com> schrieb:
>>
>>> In case you missed it, the web page contains link to pdf:
>>> https://gofetch.fail/files/gofetch.pdf
>>
>> Looking the paper, it seems that a separate "load value" instruction
>> (where it is guaranteed that no pointer prefetching will be done)
>> could fix this particular issue. Compilers know what type is being
>> loaded from memory, and could issue the corresponding instruction.
>> This would not impact performance.
>>
>> Only works for new versions of an architecture, and supporting
>> compilers, but no code change would be required. And, of course,
>> it would eat up opcode space.
>
> It doesn't need to eat opcode space if you only support one data type,
> 64-bit ints, and one address mode, [register].
> Other address modes can be calculated using LEA.
> Since these are rare instructions to solve a particular problem,
> they won't be used that often, so a few extra instructions shouldn't matter.

Hm, I'm not sure it would actually be used rarely, at least not
the way I thought about it.

I envisage a "ldp" (load pointer) instruction, which turns on
prefetching, for everything that looks like

foo_t *p = some_expr;

which could also mean something like

*p = ptrarray[i];

with a scaled and indexed load (for example), where prefetching
is turned on, and a "ldd" (load double data) instruction for

long int n = some_other_expr;

where prefetching is explicitly disabled. (Apart from the security
implications, this could also save a tiny bit of power).
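
As a sketch of how a compiler could pick between the two purely from the static type, consider the small function below. The function `sum_both` is a made-up example, and "ldp"/"ldd" are the hypothetical mnemonics from this post, not instructions of any existing ISA; the comments show which one each load would map to.

```c
#include <stddef.h>

/* Adds up the longs reached through an array of pointers plus the
 * entries of a plain data array of the same length. */
long sum_both(long *const *ptrarray, const long *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        long *p = ptrarray[i];  /* result type is a pointer: "ldp", prefetch allowed */
        long v = data[i];       /* result type is long:      "ldd", no prefetch */
        total += *p + v;        /* *p also yields a long:    "ldd", no prefetch */
    }
    return total;
}
```

No source change is needed; the choice follows from the declared types alone, which is the point of the proposal.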

Re: Another security vulnerability

<UvXMN.130838$GX69.82796@fx46.iad>

https://news.novabbs.org/devel/article-flat.php?id=38179&group=comp.arch#38179
Path: i2pn2.org!i2pn.org!nntp.comgw.net!peer02.ams4!peer.am4.highwinds-media.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx46.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me> <VpVMN.731075$p%Mb.618266@fx15.iad> <uu1ck0$2tfbq$1@dont-email.me>
In-Reply-To: <uu1ck0$2tfbq$1@dont-email.me>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 64
Message-ID: <UvXMN.130838$GX69.82796@fx46.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Wed, 27 Mar 2024 16:08:20 UTC
Date: Wed, 27 Mar 2024 12:08:05 -0400
X-Received-Bytes: 3388
 by: EricP - Wed, 27 Mar 2024 16:08 UTC

Thomas Koenig wrote:
> EricP <ThatWouldBeTelling@thevillage.com> schrieb:
>> Thomas Koenig wrote:
>>> Michael S <already5chosen@yahoo.com> schrieb:
>>>
>>>> In case you missed it, the web page contains link to pdf:
>>>> https://gofetch.fail/files/gofetch.pdf
>>> Looking the paper, it seems that a separate "load value" instruction
>>> (where it is guaranteed that no pointer prefetching will be done)
>>> could fix this particular issue. Compilers know what type is being
>>> loaded from memory, and could issue the corresponding instruction.
>>> This would not impact performance.
>>>
>>> Only works for new versions of an architecture, and supporting
>>> compilers, but no code change would be required. And, of course,
>>> it would eat up opcode space.
>> It doesn't need to eat opcode space if you only support one data type,
>> 64-bit ints, and one address mode, [register].
>> Other address modes can be calculated using LEA.
>> Since these are rare instructions to solve a particular problem,
>> they won't be used that often, so a few extra instructions shouldn't matter.
>
> Hm, I'm not sure it would actually be used rarely, at least not
> the way I thought about it.

I'm referring to your load with prefetch disabled.
For these particular loads, its users could likely tolerate the
"overhead" of an extra LEA instruction to calculate the address,
and don't need all 7 integer data types.

> I envisage a "ldp" (load pointer) instruction, which turns on
> prefetaching, for everything that looks like
>
> foo_t *p = some_expr;
>
> which could also mean something like
>
> *p = ptrarray[i];

So this would be
LEA r0,[rBase+rIndex*8+offset]
LDAPQ r0,[r0] // load quad with auto pointer prefetch

though I'm not really sold on the need for this if you have an instruction
(below) that explicitly disables pointer auto-prefetch.
Plus all this does is eliminate an explicit PREFCHR Prefetch-for-Read.

> with a scaled and indexed load (for example), where prefixing
> is turned on, and a "ldd" (load double data) instruction where,
> explicitly, for
>
> long int n = some_other_expr;
>
> where prefetching is explicitly disabled. (Apart from the security
> implicatins, this could also save a tiny bit of power).

LEA r0,[rBase+offset]
LDNPQ r0,[r0] // load quad no-auto-prefetch

costs 1 opcode as it only supports int64 data type and
doesn't need a corresponding STNPQ store.
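
The same split can be modelled in C, as a rough sketch only. The helpers `lea` and `ldnpq` are stand-ins invented here for the LEA instruction and the hypothetical LDNPQ opcode above; C has no way to express "no auto-prefetch", so the model shows only how address calculation and the single-address-mode load separate.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for LEA: computes base + index*scale + offset, touches no memory. */
static const unsigned char *lea(const void *base, size_t index,
                                size_t scale, ptrdiff_t offset)
{
    return (const unsigned char *)base + index * scale + offset;
}

/* Stand-in for the hypothetical LDNPQ: one 64-bit load, [reg] address only. */
static uint64_t ldnpq(const unsigned char *addr)
{
    uint64_t v;
    memcpy(&v, addr, sizeof v);  /* plain 8-byte read from the computed address */
    return v;
}

/* table[i], expressed as the two-instruction sequence. */
uint64_t load_element(const uint64_t *table, size_t i)
{
    return ldnpq(lea(table, i, sizeof table[0], 0));
}
```

All the addressing flexibility lives in `lea`, so the special load needs only one form.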

Re: Another security vulnerability

<uu1h3k$12v4q$2@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38180&group=comp.arch#38180
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: sfuld@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Wed, 27 Mar 2024 09:22:12 -0700
Organization: A noiseless patient Spider
Lines: 23
Message-ID: <uu1h3k$12v4q$2@dont-email.me>
References: <utpoi2$b6to$1@dont-email.me> <utq715$jsuq$3@dont-email.me>
<utsa0k$12v4q$1@dont-email.me> <utspur$1akpd$4@dont-email.me>
<80e708682d016d9c2a36adffa668f58e@www.novabbs.org>
<utt4pm$1d406$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 27 Mar 2024 17:22:15 +0100 (CET)
Injection-Info: dont-email.me; posting-host="8ff8833736304ccba397a3605627ddcb";
logging-data="1146010"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+kTNbVulpHweG5adc6dhtR4ZxyrY3RMk4="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:ITAXUso0QcK4Vr5bT0/PW+mkFSg=
In-Reply-To: <utt4pm$1d406$2@dont-email.me>
Content-Language: en-US
 by: Stephen Fuld - Wed, 27 Mar 2024 16:22 UTC

On 3/25/2024 5:27 PM, Lawrence D'Oliveiro wrote:
> On Mon, 25 Mar 2024 22:17:55 +0000, MitchAlsup1 wrote:
>
>> Lawrence D'Oliveiro wrote:
>>
>>> The basic problem is that building all this complex, bug-prone
>>> functionality into monolithic, nonupgradeable hardware is not really a
>>> good idea.
>>
>> Would you like to inform us of how it can be done otherwise ?
>
> Upgradeable firmware/software, of course.

But microcode is generally slower than dedicated hardware, and most
people seem to be unwilling to give up performance all the time to gain
an advantage in a situation that occurs infrequently, and for most of
them never.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Another security vulnerability

<2024Mar27.185230@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=38181&group=comp.arch#38181
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Wed, 27 Mar 2024 17:52:30 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 38
Message-ID: <2024Mar27.185230@mips.complang.tuwien.ac.at>
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me>
Injection-Date: Wed, 27 Mar 2024 18:14:02 +0100 (CET)
Injection-Info: dont-email.me; posting-host="dd96cb78367ba2d846b5f6040a81f592";
logging-data="3155112"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1955rPXOg4d46tC0AsBu+Zs"
Cancel-Lock: sha1:1KVSXYlHXsx/hTAq8VBEtZH35Hw=
X-newsreader: xrn 10.11
 by: Anton Ertl - Wed, 27 Mar 2024 17:52 UTC

Thomas Koenig <tkoenig@netcologne.de> writes:
>Michael S <already5chosen@yahoo.com> schrieb:
>
>> In case you missed it, the web page contains link to pdf:
>> https://gofetch.fail/files/gofetch.pdf
>
>Looking the paper, it seems that a separate "load value" instruction
>(where it is guaranteed that no pointer prefetching will be done)
>could fix this particular issue. Compilers know what type is being
>loaded from memory, and could issue the corresponding instruction.
>This would not impact performance.
>
>Only works for new versions of an architecture, and supporting
>compilers, but no code change would be required. And, of course,
>it would eat up opcode space.

The other way 'round seems to be a better approach: mark those loads
that load addresses, and then prefetch at most based on the data
loaded by these instructions (of course, the data-dependent prefetcher
may choose to ignore the data based on history). That means that
existing programs are immune to GoFetch, but also don't benefit from
the data-dependent prefetcher (which is a minor issue IMO).

As for opcode space, we already have prefetch instructions, so one
could implement this by letting every load that actually loads an
address be followed by a prefetch instruction. But of course that
would consume more code space and decoding bandwidth than just adding
a load-address instruction.

In any case, passing the prefetch hints to hardware that may ignore
the hint based on history may help reduce the performance
disadvantages that have been seen when using prefetch hint
instructions.
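
For illustration, the "address load followed by a prefetch instruction" variant can be approximated in today's C with the GCC/Clang builtin `__builtin_prefetch`. This is only a sketch: `struct node` and `sum_with_hints` are made-up names, and the builtin is a software hint, not the load-address instruction proposed above.

```c
#include <stddef.h>

struct node {
    struct node *next;
    long value;
};

/* Walks a linked list; each address load is followed by an explicit
 * prefetch hint for the address that was just loaded. */
long sum_with_hints(const struct node *p)
{
    long sum = 0;
    while (p != NULL) {
        const struct node *next = p->next;   /* the address load */
        if (next != NULL)
            __builtin_prefetch(next, 0, 3);  /* hint: read access, keep in cache */
        sum += p->value;                     /* plain data load, no hint issued */
        p = next;
    }
    return sum;
}
```

Only loads the programmer or compiler marked this way feed the prefetcher, so the data load of `value` never influences what gets fetched.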

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Another security vulnerability

<2024Mar27.191411@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=38183&group=comp.arch#38183
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Wed, 27 Mar 2024 18:14:11 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 16
Message-ID: <2024Mar27.191411@mips.complang.tuwien.ac.at>
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me> <VpVMN.731075$p%Mb.618266@fx15.iad>
Injection-Date: Wed, 27 Mar 2024 18:25:38 +0100 (CET)
Injection-Info: dont-email.me; posting-host="dd96cb78367ba2d846b5f6040a81f592";
logging-data="3160632"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18blsC62bNR7xOhkqZzhvK5"
Cancel-Lock: sha1:8DATGjs/03sG4Gf/aW9DeiMRhIw=
X-newsreader: xrn 10.11
 by: Anton Ertl - Wed, 27 Mar 2024 18:14 UTC

EricP <ThatWouldBeTelling@thevillage.com> writes:
>It doesn't need to eat opcode space if you only support one data type,
>64-bit ints, and one address mode, [register].
>Other address modes can be calculated using LEA.
>Since these are rare instructions to solve a particular problem,
>they won't be used that often, so a few extra instructions shouldn't matter.

You lost me here. Do you mean that a load with address mode
[register] is considered to be a non-address load and not followed by
the data-dependent prefetcher? So how would an address load be
encoded if the natural expression would be [register]?

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Another security vulnerability

<HH_MN.732789$p%Mb.8039@fx15.iad>

https://news.novabbs.org/devel/article-flat.php?id=38185&group=comp.arch#38185
Path: i2pn2.org!i2pn.org!newsfeed.bofh.team!2.eu.feeder.erje.net!feeder.erje.net!feeder1-2.proxad.net!proxad.net!feeder1-1.proxad.net!193.141.40.65.MISMATCH!npeer.as286.net!npeer-ng0.as286.net!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx15.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me> <VpVMN.731075$p%Mb.618266@fx15.iad> <2024Mar27.191411@mips.complang.tuwien.ac.at>
In-Reply-To: <2024Mar27.191411@mips.complang.tuwien.ac.at>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 40
Message-ID: <HH_MN.732789$p%Mb.8039@fx15.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Wed, 27 Mar 2024 19:45:43 UTC
Date: Wed, 27 Mar 2024 15:42:34 -0400
X-Received-Bytes: 2709
 by: EricP - Wed, 27 Mar 2024 19:42 UTC

Anton Ertl wrote:
> EricP <ThatWouldBeTelling@thevillage.com> writes:
>> It doesn't need to eat opcode space if you only support one data type,
>> 64-bit ints, and one address mode, [register].
>> Other address modes can be calculated using LEA.
>> Since these are rare instructions to solve a particular problem,
>> they won't be used that often, so a few extra instructions shouldn't matter.
>
> You lost me here. Do you mean that a load with address mode
> [register] is considered to be a non-address load and not followed by
> the data-dependent prefetcher? So how would an address load be
> encoded if the natural expression would be [register]?
>
> - anton

I'm pointing out that not all instructions need to be orthogonal.
There can be savings in opcode space by tempering that based on
expected frequency of occurrence.

The normal LD and ST have all their address modes and data types
because these functions occur frequently enough that we deem it
worthwhile to support these all in one instruction,
such as supporting both sign and zero extended loads
or scaled index addressing.

I note there is this class of relatively rarely used special-purpose
memory access instructions that don't need the all-singing,
all-dancing address modes and/or data types of the regular LD and ST.

Since I need a LEA Load Effective Address instruction anyway
which does rBase+rIndex*scale+offset calculation
(plus I have others, like where rBase is RIP or an absolute address),
then I can drop all but the [reg] address mode for these rare instructions
and in many cases drop some sign or zero extend types for loads.

For example, I use just two opcodes for Atomic Fetch Add int64 and int32
AFADD8 rDst,rSrc,[rAddr]
AFADD4 rDst,rSrc,[rAddr]
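
In C11 terms this corresponds roughly to the sketch below, where the address is formed first and the atomic operation then works through a plain pointer. The function name `fetch_add_slot` is an assumption for illustration, and the comments map each step to the instructions above only loosely.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Atomically adds delta to counters[index] and returns the old value. */
int64_t fetch_add_slot(_Atomic int64_t *counters, size_t index, int64_t delta)
{
    _Atomic int64_t *addr = counters + index;  /* LEA rAddr,[rBase+rIndex*8] */
    return atomic_fetch_add(addr, delta);      /* AFADD8 rDst,rSrc,[rAddr]   */
}
```

The atomic instruction itself never needs scaled or indexed addressing, which is what keeps its opcode cost down.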

Re: Another security vulnerability

<uu1uot$3206d$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38186&group=comp.arch#38186
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tkoenig@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Wed, 27 Mar 2024 20:15:25 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 40
Message-ID: <uu1uot$3206d$1@dont-email.me>
References: <utpoi2$b6to$1@dont-email.me>
<2024Mar25.082534@mips.complang.tuwien.ac.at>
<20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me>
<VpVMN.731075$p%Mb.618266@fx15.iad> <uu1ck0$2tfbq$1@dont-email.me>
<UvXMN.130838$GX69.82796@fx46.iad>
Injection-Date: Wed, 27 Mar 2024 20:15:25 +0100 (CET)
Injection-Info: dont-email.me; posting-host="8efce553cdd8837e74ed172d4b5d03c3";
logging-data="3211469"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX195q06dSgTb4cUI3uxTzFBef5/v8ODHcnQ="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:6x0cZ4wHle/OIPAvaorjGkyw2J0=
 by: Thomas Koenig - Wed, 27 Mar 2024 20:15 UTC

EricP <ThatWouldBeTelling@thevillage.com> schrieb:
> Thomas Koenig wrote:
>> EricP <ThatWouldBeTelling@thevillage.com> schrieb:
>>> Thomas Koenig wrote:
>>>> Michael S <already5chosen@yahoo.com> schrieb:
>>>>
>>>>> In case you missed it, the web page contains link to pdf:
>>>>> https://gofetch.fail/files/gofetch.pdf
>>>> Looking at the paper, it seems that a separate "load value" instruction
>>>> (where it is guaranteed that no pointer prefetching will be done)
>>>> could fix this particular issue. Compilers know what type is being
>>>> loaded from memory, and could issue the corresponding instruction.
>>>> This would not impact performance.
>>>>
>>>> Only works for new versions of an architecture, and supporting
>>>> compilers, but no code change would be required. And, of course,
>>>> it would eat up opcode space.
>>> It doesn't need to eat opcode space if you only support one data type,
>>> 64-bit ints, and one address mode, [register].
>>> Other address modes can be calculated using LEA.
>>> Since these are rare instructions to solve a particular problem,
>>> they won't be used that often, so a few extra instructions shouldn't matter.
>>
>> Hm, I'm not sure it would actually be used rarely, at least not
>> the way I thought about it.
>
> I'm referring to your load with prefetch disable.
> For these particular loads its users could likely tolerate the
> "overhead" of an extra LEA instruction to calculate the address,
> and don't need all 7 integer data types.

If it was LEA-only, it would need some kind of pragma in the code
which said "use this more cumbersome and slower, but safer,
version".

For that reason, I would probably prefer a separate version
which is implicitly safe and does not have any other drawbacks
for performance, with no code changes.

If it's worth the opcode space...

Re: Another security vulnerability

<1G%MN.103070$5Hnd.31046@fx03.iad>

https://news.novabbs.org/devel/article-flat.php?id=38187&group=comp.arch#38187
Path: i2pn2.org!i2pn.org!nntp.comgw.net!peer02.ams4!peer.am4.highwinds-media.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx03.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Another security vulnerability
Newsgroups: comp.arch
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me> <2024Mar27.185230@mips.complang.tuwien.ac.at>
Lines: 25
Message-ID: <1G%MN.103070$5Hnd.31046@fx03.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Wed, 27 Mar 2024 20:52:13 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Wed, 27 Mar 2024 20:52:13 GMT
X-Received-Bytes: 1930
 by: Scott Lurndal - Wed, 27 Mar 2024 20:52 UTC

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>Thomas Koenig <tkoenig@netcologne.de> writes:
>>Michael S <already5chosen@yahoo.com> schrieb:
>>
>>> In case you missed it, the web page contains link to pdf:
>>> https://gofetch.fail/files/gofetch.pdf
>>
>>Looking at the paper, it seems that a separate "load value" instruction
>>(where it is guaranteed that no pointer prefetching will be done)
>>could fix this particular issue. Compilers know what type is being
>>loaded from memory, and could issue the corresponding instruction.
>>This would not impact performance.

It is worth noting (from the paper's Introduction):

In particular, Augury reported that the [Apple M-series ed.] DMP only activates
in the presence of a rather idiosyncratic program memory
access pattern (where the program streams through an array
of pointers and architecturally dereferences those pointers).
This access pattern is not typically found in security critical
software such as side-channel hardened constant-time code--
hence making that code impervious to leakage through the
DMP.
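
The access pattern that passage attributes to Augury can be sketched in a few lines of Python. This is only a model of the program-visible behaviour, assuming a dictionary as memory and made-up addresses. It does not model the DMP itself:

```python
def stream_and_dereference(memory, ptr_array):
    """Walk an array of pointers and load the value each one points at."""
    total = 0
    for p in ptr_array:        # streaming read of the pointer array
        total += memory[p]     # architectural dereference of each pointer
    return total

memory = {0x2000: 7, 0x2040: 11, 0x2080: 13}
result = stream_and_dereference(memory, [0x2000, 0x2040, 0x2080])
```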

Re: Another security vulnerability

<9H%MN.103104$5Hnd.78934@fx03.iad>

https://news.novabbs.org/devel/article-flat.php?id=38188&group=comp.arch#38188
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!npeer.as286.net!npeer-ng0.as286.net!peer03.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx03.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Another security vulnerability
Newsgroups: comp.arch
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me> <2024Mar27.185230@mips.complang.tuwien.ac.at> <1G%MN.103070$5Hnd.31046@fx03.iad>
Lines: 28
Message-ID: <9H%MN.103104$5Hnd.78934@fx03.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Wed, 27 Mar 2024 20:53:25 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Wed, 27 Mar 2024 20:53:25 GMT
X-Received-Bytes: 2106
 by: Scott Lurndal - Wed, 27 Mar 2024 20:53 UTC

scott@slp53.sl.home (Scott Lurndal) writes:
>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>>Thomas Koenig <tkoenig@netcologne.de> writes:
>>>Michael S <already5chosen@yahoo.com> schrieb:
>>>
>>>> In case you missed it, the web page contains link to pdf:
>>>> https://gofetch.fail/files/gofetch.pdf
>>>
>>>Looking at the paper, it seems that a separate "load value" instruction
>>>(where it is guaranteed that no pointer prefetching will be done)
>>>could fix this particular issue. Compilers know what type is being
>>>loaded from memory, and could issue the corresponding instruction.
>>>This would not impact performance.
>
>It is worth noting (from the paper's Introduction):
>
> In particular, Augury reported that the [Apple M-series ed.] DMP only activates
> in the presence of a rather idiosyncratic program memory
> access pattern (where the program streams through an array
> of pointers and architecturally dereferences those pointers).
> This access pattern is not typically found in security critical
> software such as side-channel hardened constant-time code--
> hence making that code impervious to leakage through the
> DMP.

Reminder to self, read rest of article before commenting.

Never mind.

Re: Another security vulnerability

<5fc6ea8088c0afe8618d2862cbacebab@www.novabbs.org>

https://news.novabbs.org/devel/article-flat.php?id=38191&group=comp.arch#38191
Date: Wed, 27 Mar 2024 21:27:16 +0000
Subject: Re: Another security vulnerability
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$RRf57xbgOW9AW4WN0bcf3uRfq5p7WFCL3Xr7DYRutQebg1zmytWZe
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me> <VpVMN.731075$p%Mb.618266@fx15.iad> <2024Mar27.191411@mips.complang.tuwien.ac.at> <HH_MN.732789$p%Mb.8039@fx15.iad>
Organization: Rocksolid Light
Message-ID: <5fc6ea8088c0afe8618d2862cbacebab@www.novabbs.org>
 by: MitchAlsup1 - Wed, 27 Mar 2024 21:27 UTC

EricP wrote:

> Anton Ertl wrote:
>> EricP <ThatWouldBeTelling@thevillage.com> writes:
>>> It doesn't need to eat opcode space if you only support one data type,
>>> 64-bit ints, and one address mode, [register].
>>> Other address modes can be calculated using LEA.
>>> Since these are rare instructions to solve a particular problem,
>>> they won't be used that often, so a few extra instructions shouldn't matter.
>>
>> You lost me here. Do you mean that a load with address mode
>> [register] is considered to be a non-address load and not followed by
>> the data-dependent prefetcher? So how would an address load be
>> encoded if the natural expression would be [register]?
>>
>> - anton

> I'm pointing out that not all instructions need to be orthogonal.
> There can be savings in opcode space by tempering that based on
> expected frequency of occurrence.

> The normal LD and ST have all their address modes and data types
> because these functions occur frequently enough that we deem it
> worthwhile to support these all in one instruction,
> such as supporting both sign and zero extended loads
> or scaled index addressing.

> I note there is this class of relatively rarely used special purpose
> memory access instructions that don't need to have all singing and all
> dancing address modes and/or data types like the regular LD and ST.

> Since I need a LEA Load Effective Address instruction anyway
> which does rBase+rIndex*scale+offset calculation
> (plus I have others, like where rBase is RIP or an absolute address),
> then I can drop all but the [reg] address mode for these rare instructions
> and in many cases drop some sign or zero extend types for loads.

It seems to me that once the core has identified an address, and an offset
from that address contains another address (foo->next, foo->prev), then
only those are prefetched. So this depends on placing next as the first
container in a structure, and remains dependent on chasing next a lot more
often than chasing prev.

Otherwise, knowing a loaded value contains a pointer to a structure (or array)
one cannot predict what to prefetch unless one can assume the offset into the
struct (or array).
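
A minimal sketch of why the offset matters, assuming 64-byte cache lines and a prefetcher that fetches only the line at the loaded pointer value. Both are assumptions, and the helper names are made up for illustration:

```python
LINE = 64  # assumed cache-line size in bytes

def line_of(addr):
    return addr // LINE

def prefetch_covers_field(node_addr, field_offset):
    """True if the line fetched for node_addr also holds node_addr + field_offset."""
    return line_of(node_addr) == line_of(node_addr + field_offset)

first_member = prefetch_covers_field(0x1000, 0)    # next is the first member
far_member = prefetch_covers_field(0x1000, 72)     # next sits past the first line
```

With next at offset 0 the prefetched line always contains it. At a larger offset it may not, and the prefetcher has no way of knowing that offset from the pointer value alone.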

Now Note:: If there were an instruction that loaded the value known to be
a pointer and prefetched based on the received pointer, then the prefetch
is now architectural not µArchitectural and you are allowed to damage the
cache or TLB when/after the instruction retires.

> For example, I use just two opcodes for Atomic Fetch Add int64 and int32
> AFADD8 rDst,rSrc,[rAddr]
> AFADD4 rDst,rSrc,[rAddr]

Re: Another security vulnerability

<uu2ekk$35lku$2@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38195&group=comp.arch#38195
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ldo@nz.invalid (Lawrence D'Oliveiro)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Thu, 28 Mar 2024 00:46:13 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 22
Message-ID: <uu2ekk$35lku$2@dont-email.me>
References: <utpoi2$b6to$1@dont-email.me> <utq715$jsuq$3@dont-email.me>
<utsa0k$12v4q$1@dont-email.me> <utspur$1akpd$4@dont-email.me>
<80e708682d016d9c2a36adffa668f58e@www.novabbs.org>
<utt4pm$1d406$2@dont-email.me> <uu1h3k$12v4q$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 28 Mar 2024 00:46:13 +0100 (CET)
Injection-Info: dont-email.me; posting-host="6d46a86d4234b3675ba060851576d639";
logging-data="3331742"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX192TcNXOYdyI0CTnKEpOkHs"
User-Agent: Pan/0.155 (Kherson; fc5a80b8)
Cancel-Lock: sha1:CRwO85i6JB4u0tDrWpFErzvsmUE=
 by: Lawrence D'Oliv - Thu, 28 Mar 2024 00:46 UTC

On Wed, 27 Mar 2024 09:22:12 -0700, Stephen Fuld wrote:

> On 3/25/2024 5:27 PM, Lawrence D'Oliveiro wrote:
>
>> On Mon, 25 Mar 2024 22:17:55 +0000, MitchAlsup1 wrote:
>>
>>> Lawrence D'Oliveiro wrote:
>>>
>>>> The basic problem is that building all this complex, bug-prone
>>>> functionality into monolithic, nonupgradeable hardware is not really
>>>> a good idea.
>>>
>>> Would you like to inform us of how it can be done otherwise ?
>>
>> Upgradeable firmware/software, of course.
>
> But microcode is generally slower than dedicated hardware, and most
> people seem to be unwilling to give up performance all the time to gain
> an advantage in a situation that occurs infrequently and mostly never.

Bruce Schneier has a saying: “attacks never get worse, they can only get
better”.

Re: Another security vulnerability

<7c9fac56fef02978cfa65a7c12df24ad@www.novabbs.org>

https://news.novabbs.org/devel/article-flat.php?id=38196&group=comp.arch#38196
Date: Thu, 28 Mar 2024 01:22:06 +0000
Subject: Re: Another security vulnerability
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$RatkcIiY0NVbDvJq0UXZFOZhM6hfPFN2DNj1/ecY7TVhjkXv5dbjO
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <utpoi2$b6to$1@dont-email.me> <utq715$jsuq$3@dont-email.me> <utsa0k$12v4q$1@dont-email.me> <utspur$1akpd$4@dont-email.me> <80e708682d016d9c2a36adffa668f58e@www.novabbs.org> <utt4pm$1d406$2@dont-email.me> <uu1h3k$12v4q$2@dont-email.me> <uu2ekk$35lku$2@dont-email.me>
Organization: Rocksolid Light
Message-ID: <7c9fac56fef02978cfa65a7c12df24ad@www.novabbs.org>
 by: MitchAlsup1 - Thu, 28 Mar 2024 01:22 UTC

Lawrence D'Oliveiro wrote:

> On Wed, 27 Mar 2024 09:22:12 -0700, Stephen Fuld wrote:

>> On 3/25/2024 5:27 PM, Lawrence D'Oliveiro wrote:
>>
>>> On Mon, 25 Mar 2024 22:17:55 +0000, MitchAlsup1 wrote:
>>>
>>>> Lawrence D'Oliveiro wrote:
>>>>
>>>>> The basic problem is that building all this complex, bug-prone
>>>>> functionality into monolithic, nonupgradeable hardware is not really
>>>>> a good idea.
>>>>
>>>> Would you like to inform us of how it can be done otherwise ?
>>>
>>> Upgradeable firmware/software, of course.
>>
>> But microcode is generally slower than dedicated hardware, and most
>> people seem to be unwilling to give up performance all the time to gain
>> an advantage in a situation that occurs infrequently and mostly never.

S.E.L. 32/65, 32/67, 32/87 were all microcoded. 95% of the instructions*
ran down the pipeline without using the microcode (which was only there
to pick up the pieces after HW logic sequencers got too complicated).

(*) closer to 97% of the dynamic instruction stream.

Microcode IS generally slower, but not always. PDP-11 or VAX microcode
is too much, IBM, S.E.L. is not too much.

> Bruce Schneier has a saying: “attacks never get worse, they can only get
> better”.

Which is why you have to design for attackers (to be thwarted).

Observing the last 4-odd years, it appears to me that CPU designers will
not be in a position to "give a rat's ass" until there is performance
competitive, cost competitive, alternatives. Given Apple, AMD, ARM, and
Intel not giving a rat's ass, it's going to have to come from somewhere
else.

Re: Another security vulnerability

<TfhNN.110764$_a1e.90012@fx16.iad>

https://news.novabbs.org/devel/article-flat.php?id=38199&group=comp.arch#38199
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx16.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me> <VpVMN.731075$p%Mb.618266@fx15.iad> <2024Mar27.191411@mips.complang.tuwien.ac.at> <HH_MN.732789$p%Mb.8039@fx15.iad> <5fc6ea8088c0afe8618d2862cbacebab@www.novabbs.org>
In-Reply-To: <5fc6ea8088c0afe8618d2862cbacebab@www.novabbs.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 104
Message-ID: <TfhNN.110764$_a1e.90012@fx16.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Thu, 28 Mar 2024 16:53:07 UTC
Date: Thu, 28 Mar 2024 12:52:07 -0400
X-Received-Bytes: 5692
 by: EricP - Thu, 28 Mar 2024 16:52 UTC

MitchAlsup1 wrote:
> EricP wrote:
>
>> Anton Ertl wrote:
>>> EricP <ThatWouldBeTelling@thevillage.com> writes:
>>>> It doesn't need to eat opcode space if you only support one data type,
>>>> 64-bit ints, and one address mode, [register].
>>>> Other address modes can be calculated using LEA.
>>>> Since these are rare instructions to solve a particular problem,
>>>> they won't be used that often, so a few extra instructions shouldn't matter.
>>>
>>> You lost me here. Do you mean that a load with address mode
>>> [register] is considered to be a non-address load and not followed by
>>> the data-dependent prefetcher? So how would an address load be
>>> encoded if the natural expression would be [register]?
>>>
>>> - anton
>
>> I'm pointing out that not all instructions need to be orthogonal.
>> There can be savings in opcode space by tempering that based on
>> expected frequency of occurrence.
>
>> The normal LD and ST have all their address modes and data types
>> because these functions occur frequently enough that we deem it
>> worthwhile to support these all in one instruction,
>> such as supporting both sign and zero extended loads
>> or scaled index addressing.
>
>> I note there is this class of relatively rarely used special purpose
>> memory access instructions that don't need to have all singing and all
>> dancing address modes and/or data types like the regular LD and ST.
>
>> Since I need a LEA Load Effective Address instruction anyway
>> which does rBase+rIndex*scale+offset calculation
>> (plus I have others, like where rBase is RIP or an absolute address),
>> then I can drop all but the [reg] address mode for these rare instructions
>> and in many cases drop some sign or zero extend types for loads.
>
> It seems to me that once the core has identified an address and an offset
> from that address contains another address (foo->next, foo->prev) that
> only those are prefetched. So this depends on placing next as the first
> container in a structure and remains dependent on chasing next a lot more
> often than chasing prev.
>
> Otherwise, knowing a loaded value contains a pointer to a structure (or array)
> one cannot predict what to prefetch unless one can assume the offset into the
> struct (or array).

Right, this is the problem that these "data memory-dependent" prefetchers,
like the one described in that Intel "Programmable and Integrated Unified Memory
Architecture (PIUMA)" paper referenced by Paul Clayton, are trying to solve.

The pointer field to chase can be
(a) at a +/- offset from the current pointer virtual address
(b) at a different offset for each iteration
(c) conditional on some other field at some other offset

and most important:

(d) any new pointers are virtual addresses that have to start back at
the Load Store Queue for VA translation and forwarding testing
after applying (a),(b) and (c) above.

Since each chased pointer starts back at LSQ, the cost is no different
than an explicit Prefetch instruction, except without (a),(b) and (c)
having been applied first.
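
Points (a) to (d) can be sketched as a software pointer chase. The node layout, field offsets and addresses here are hypothetical, chosen only to show that the field to follow depends on data in the node:

```python
# Hypothetical node layout (byte offsets within a node).
KIND_OFF, LEFT_OFF, RIGHT_OFF = 0, 8, 16

def next_pointer(mem, node):
    """Pick the pointer field to chase based on another field of the node."""
    kind = mem[node + KIND_OFF]                    # (c) condition on some other field
    off = LEFT_OFF if kind == 0 else RIGHT_OFF     # (b) offset can differ per iteration
    return mem[node + off]                         # (a) pointer lives at an offset

def chase(mem, node, steps):
    """Return the virtual addresses visited; each would need translation (d)."""
    visited = [node]
    for _ in range(steps):
        node = next_pointer(mem, node)
        visited.append(node)
    return visited

mem = {
    0x100: 0, 0x108: 0x200, 0x110: 0x300,   # kind 0 -> follow left
    0x200: 1, 0x208: 0x100, 0x210: 0x300,   # kind 1 -> follow right
    0x300: 0, 0x308: 0x100, 0x310: 0x200,
}
path = chase(mem, 0x100, 2)
```

A hardware prefetcher that sees only the loaded values has none of the logic in `next_pointer`, which is the information an explicit Prefetch instruction would already have applied.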

So I find the simplistic, blithe data-dependent auto prefetching
as described to be questionable.

> Now Note:: If there were an instruction that loaded the value known to be
> a pointer and prefetched based on the received pointer, then the prefetch
> is now architectural not µArchitectural and you are allowed to damage the
> cache or TLB when/after the instruction retires.

In the PIUMA case those pointers were to sparse data sets
so part of the problem was rolling over the cache, as well as
(and the PIUMA paper didn't mention this) the TLB.

After reading the PIUMA paper I had an idea for a small modification
to the PTE cache control bits to handle sparse data. The PTE's 3 CC bits
can specify the upper page table levels are cached in the TLB but
lower levels are not because they would always roll over the TLB.
However the non-TLB cached PTE's may optionally still be cached
in L1 or L2, or not at all.

This allows one to hold the top page table levels in the TLB,
the upper middle levels in L1, lower middle levels in L2,
and leaf PTE's and sparse code/data not cached at all.
BUT, as PIUMA proposes, we also allow the memory subsystem to read and write
individual aligned 8-byte values from DRAM, rather than whole cache lines,
so we only move the actual 8-byte values we need.
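
A rough sketch of that placement policy, assuming a 4-level page table. No encoding for the 3 CC bits is given here, so the table and names are hypothetical placeholders rather than a proposed format:

```python
# Hypothetical policy for a 4-level page table, level 3 = root, level 0 = leaf.
# No encoding for the 3 CC bits is specified, so these names are made up.
POLICY = {3: "TLB", 2: "L1", 1: "L2", 0: "UNCACHED"}

def where_cached(level):
    """Return the structure assumed to hold a PTE from the given level."""
    return POLICY[level]

def dram_bytes_moved(accesses, granule=8):
    """Bytes moved from DRAM if each uncached access moves one aligned granule."""
    return accesses * granule

leaf_home = where_cached(0)
traffic_8 = dram_bytes_moved(1000)        # 8-byte accesses
traffic_64 = dram_bytes_moved(1000, 64)   # whole 64-byte lines
```

For uncached sparse accesses, the 8-byte granule moves one eighth of the bytes that whole-line transfers would.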

Also note that page table walks are also graph structure walks
but chasing physical addresses at some simple calculated offsets.
These physical addresses might be cached in L1 or L2 so we can't
just chase these pointers in the memory controller but,
if one wants to do this, have to do so in the cache controller.
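
To illustrate the "graph walk at simple calculated offsets" view, here is a minimal sketch assuming an x86-64-like layout (4 levels, 9 index bits per level, 4 KiB pages, 8-byte entries). Those parameters are assumptions, PTE flag bits are ignored, and a dictionary stands in for physical memory:

```python
LEVELS = 4          # assumed number of page-table levels
INDEX_BITS = 9      # assumed 512 entries per table
PAGE_SHIFT = 12     # assumed 4 KiB pages
PTE_SIZE = 8        # assumed 8-byte entries

def walk(phys_mem, root, va):
    """Follow table pointers from root to the leaf.

    Returns (physical address, list of PTE addresses read)."""
    table = root
    touched = []
    for level in reversed(range(LEVELS)):
        index = (va >> (PAGE_SHIFT + level * INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
        pte_addr = table + index * PTE_SIZE   # simple calculated offset
        touched.append(pte_addr)
        table = phys_mem[pte_addr]            # next table (or final frame) base
    return table + (va & ((1 << PAGE_SHIFT) - 1)), touched

va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x123
phys_mem = {
    0x1000 + 1 * 8: 0x2000,
    0x2000 + 2 * 8: 0x3000,
    0x3000 + 3 * 8: 0x4000,
    0x4000 + 4 * 8: 0x9000,
}
pa, touched = walk(phys_mem, 0x1000, va)
```

Each entry in `touched` is a physical address that could be sitting in L1 or L2, which is why the chase has to go through the cache controller.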

Re: Another security vulnerability

<14b25c0880216e54fe36d28c96e8428c@www.novabbs.org>

https://news.novabbs.org/devel/article-flat.php?id=38200&group=comp.arch#38200
Date: Thu, 28 Mar 2024 19:59:45 +0000
Subject: Re: Another security vulnerability
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$gQ5/nGXn2XWXzGEQXvqSVeGleqdHXlBgm6p5.iIW8QNE.NKAstG6e
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me> <VpVMN.731075$p%Mb.618266@fx15.iad> <2024Mar27.191411@mips.complang.tuwien.ac.at> <HH_MN.732789$p%Mb.8039@fx15.iad> <5fc6ea8088c0afe8618d2862cbacebab@www.novabbs.org> <TfhNN.110764$_a1e.90012@fx16.iad>
Organization: Rocksolid Light
Message-ID: <14b25c0880216e54fe36d28c96e8428c@www.novabbs.org>
 by: MitchAlsup1 - Thu, 28 Mar 2024 19:59 UTC

EricP wrote:

> MitchAlsup1 wrote:
>>
>> It seems to me that once the core has identified an address and an offset
>> from that address contains another address (foo->next, foo->prev) that
>> only those are prefetched. So this depends on placing next as the first
>> container in a structure and remains dependent on chasing next a lot more
>> often than chasing prev.
>>
>> Otherwise, knowing a loaded value contains a pointer to a structure (or array)
>> one cannot predict what to prefetch unless one can assume the offset into the
>> struct (or array).

> Right, this is the problem that these "data memory-dependent" prefetchers,
> like the one described in that Intel "Programmable and Integrated Unified Memory
> Architecture (PIUMA)" paper referenced by Paul Clayton, are trying to solve.

> The pointer field to chase can be
> (a) at a +/- offset from the current pointer virtual address
> (b) at a different offset for each iteration
> (c) conditional on some other field at some other offset

> and most important:

> (d) any new pointers are virtual addresses that have to start back at
> the Load Store Queue for VA translation and forwarding testing
> after applying (a),(b) and (c) above.

This is the tidbit that prevents doing prefetches at/in the DRAM controller.
The address so fetched needs translation !! And this requires dragging
stuff over to DRC that is not normally done.

> Since each chased pointer starts back at LSQ, the cost is no different
> than an explicit Prefetch instruction, except without (a),(b) and (c)
> having been applied first.

Latency cost is identical, instruction issue/retire costs are lower.

> So I find the simplistic, blithe data-dependent auto prefetching
> as described to be questionable.

K9 built a SW model of such a prefetcher. For the first 1B cycles of a
SPEC benchmark from ~2004 it performed quite well. {{We later figured out
that this was initialization that built a GB data structure.}} Later on
in the benchmark the pointers got scrambled and the performance of the
prefetcher fell on its face.

Moral: you need close to 1T cycles of simulation to qualify a prefetcher.

>> Now Note:: If there were an instruction that loaded the value known to be
>> a pointer and prefetched based on the received pointer, then the prefetch
>> is now architectural not µArchitectural and you are allowed to damage the
>> cache or TLB when/after the instruction retires.

> In the PIUMA case those pointers were to sparse data sets
> so part of the problem was rolling over the cache, as well as
> (and the PIUMA paper didn't mention this) the TLB.

> After reading the PIUMA paper I had an idea for a small modification
> to the PTE cache control bits to handle sparse data. The PTE's 3 CC bits
> can specify the upper page table levels are cached in the TLB but
> lower levels are not because they would always roll over the TLB.
> However the non-TLB cached PTE's may optionally still be cached
> in L1 or L2, or not at all.

> This allows one to hold the top page table levels in the TLB,
> the upper middle levels in L1, lower middle levels in L2,
> and leaf PTE's and sparse code/data not cached at all.

Given the 2-level TLBs currently in vogue, the first level might
have 32-64 PTEs, while the second might have 2048 PTEs. With this
number of PTEs available, does your scheme still give benefit ??

> BUT, as PIUMA proposes, we also allow the memory subsystem to read and write
> individual aligned 8-byte values from DRAM, rather than whole cache lines,
> so we only move the actual 8-byte values we need.

Busses on cores are reaching the stage where an entire cache line
is transferred in 1-cycle. With such busses, why define anything
smaller than a cache line ?? {other than uncacheable accesses}

> Also note that page table walks are also graph structure walks
> but chasing physical addresses at some simple calculated offsets.
> These physical addresses might be cached in L1 or L2 so we can't
> just chase these pointers in the memory controller but,
> if one wants to do this, have to do so in the cache controller.

Yes, this is why the K9 prefetcher was in the L2 where it had access
to the L2 TLB.

Re: Another security vulnerability

<uu56rq$3u2ve$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38202&group=comp.arch#38202
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: paaronclayton@gmail.com (Paul A. Clayton)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Thu, 28 Mar 2024 21:51:51 -0400
Organization: A noiseless patient Spider
Lines: 53
Message-ID: <uu56rq$3u2ve$1@dont-email.me>
References: <utpoi2$b6to$1@dont-email.me>
<2024Mar25.082534@mips.complang.tuwien.ac.at>
<20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me>
<VpVMN.731075$p%Mb.618266@fx15.iad>
<2024Mar27.191411@mips.complang.tuwien.ac.at>
<HH_MN.732789$p%Mb.8039@fx15.iad>
<5fc6ea8088c0afe8618d2862cbacebab@www.novabbs.org>
<TfhNN.110764$_a1e.90012@fx16.iad>
<14b25c0880216e54fe36d28c96e8428c@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 29 Mar 2024 01:51:55 +0100 (CET)
Injection-Info: dont-email.me; posting-host="e986263148564954c0f3cab57a0b9286";
logging-data="4131822"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/BkL8KUkPJ9xKIlo2MazbNmK9QtzQD5KE="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.0
Cancel-Lock: sha1:nS7d1IxPti8/R0Ef28rE7ozVlBY=
In-Reply-To: <14b25c0880216e54fe36d28c96e8428c@www.novabbs.org>
 by: Paul A. Clayton - Fri, 29 Mar 2024 01:51 UTC

On 3/28/24 3:59 PM, MitchAlsup1 wrote:
> EricP wrote:
[snip]
>> (d) any new pointers are virtual addresses that have to start back at
>>      the Load Store Queue for VA translation and forwarding testing
>>      after applying (a),(b) and (c) above.
>
> This is the tidbit that prevents doing prefetches at/in the DRAM controller.
> The address so fetched needs translation !! And this requires dragging
> stuff over to DRC that is not normally done.

With multiple memory channels having independent memory
controllers (a reasonable design I suspect), a memory controller
may have to send the prefetch request to another memory controller
anyway. If the prefetch has to take a trip on the on-chip network,
a "minor side trip" for translation might not be horrible (though
it seems distasteful to me).

With the Mill having translation at last level cache miss, such
prefetching may be more natural *but* distributing the virtual
address translations and the memory controllers seems challenging
when one wants to minimize hops.

[snip]
>> BUT, as PIUMA proposes, we also allow the memory subsystem to read and write
>> individual aligned 8-byte values from DRAM, rather than whole cache lines,
>> so we only move the actual 8-byte values we need.
>
> Busses on cores are reaching the stage where an entire cache line
> is transferred in 1-cycle. With such busses, why define anything
> smaller than a cache line ?? {other than uncacheable accesses}

The Intel research chip was special-purpose targeting
cache-unfriendly code. Reading 64 bytes when 99% of the time 56
bytes would be unused is rather wasteful (and having more memory
channels helps under high thread count).
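
The arithmetic behind "rather wasteful" can be worked through as a small sketch. It assumes 64-byte lines and 8-byte words, and (as a simplifying assumption) that the remaining 1% of accesses use the whole line:

```python
LINE_BYTES = 64
WORD_BYTES = 8

def useful_fraction(p_word_only, line=LINE_BYTES, word=WORD_BYTES):
    """Expected fraction of transferred bytes that get used, assuming an access
    uses either one word (probability p_word_only) or the whole line."""
    return p_word_only * (word / line) + (1.0 - p_word_only) * 1.0

wasted_per_sparse_access = LINE_BYTES - WORD_BYTES   # the 56 unused bytes
frac = useful_fraction(0.99)                         # about 0.134
```

Under those assumptions only about 13% of the DRAM bandwidth spent on full-line reads carries data the program uses.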

However, even for a "general purpose" processor, "word"-granular
atomic operations could justify not having all data transfers be
cache line size. (Such are rare compared with cache line loads
from memory or other caches, but a design might have narrower
connections for coherence, interrupts, etc. that could be used for
small data communication.)

In-cache compression might also nudge the tradeoffs. Being able to
have higher effective bandwidth when data is transmitted in a
compressed form might be useful. "Lossy compression", where the
recipient does not care about much of the data, would allow
compression even when the data itself is not compressible. For
contiguous useful data, this is comparable to a smaller cache
line.

Re: Another security vulnerability

<%1ANN.756839$p%Mb.622365@fx15.iad>

https://news.novabbs.org/devel/article-flat.php?id=38204&group=comp.arch#38204
Path: i2pn2.org!i2pn.org!nntp.comgw.net!peer03.ams4!peer.am4.highwinds-media.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx15.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Another security vulnerability
Newsgroups: comp.arch
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me> <VpVMN.731075$p%Mb.618266@fx15.iad> <2024Mar27.191411@mips.complang.tuwien.ac.at> <HH_MN.732789$p%Mb.8039@fx15.iad> <5fc6ea8088c0afe8618d2862cbacebab@www.novabbs.org> <TfhNN.110764$_a1e.90012@fx16.iad> <14b25c0880216e54fe36d28c96e8428c@www.novabbs.org> <uu56rq$3u2ve$1@dont-email.me>
Lines: 44
Message-ID: <%1ANN.756839$p%Mb.622365@fx15.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Fri, 29 Mar 2024 14:15:23 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Fri, 29 Mar 2024 14:15:23 GMT
X-Received-Bytes: 2957
 by: Scott Lurndal - Fri, 29 Mar 2024 14:15 UTC

"Paul A. Clayton" <paaronclayton@gmail.com> writes:
>On 3/28/24 3:59 PM, MitchAlsup1 wrote:
>> EricP wrote:
>[snip]
>>> (d) any new pointers are virtual addresses that have to start back at
>>>      the Load Store Queue for VA translation and forwarding testing
>>>      after applying (a),(b) and (c) above.
>>
>> This is the tidbit that prevents doing prefetches at/in the DRAM controller.
>> The address so fetched needs translation !! And this requires dragging
>> stuff over to DRC that is not normally done.
>
>With multiple memory channels having independent memory
>controllers (a reasonable design I suspect), a memory controller
>may have to send the prefetch request to another memory controller
>anyway.

Which is usually handled by the LLC when the address space is
striped across multiple memory controllers.
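
As a rough illustration of striping, here is a minimal sketch that maps a physical address to a home memory controller. The line-granular modulo interleave, the 64-byte line size, and the name home_controller are assumptions made for the example; real designs often hash more address bits.

```python
LINE_BYTES = 64  # assumed cache line size


def home_controller(paddr: int, n_controllers: int) -> int:
    """Return the controller that owns the line containing paddr.

    Assumes consecutive cache lines rotate across controllers
    (simple modulo interleave, hypothetical).
    """
    if paddr < 0 or n_controllers <= 0:
        raise ValueError("address must be non-negative, controller count positive")
    return (paddr // LINE_BYTES) % n_controllers


# A pointer stored in a line owned by controller 0 can easily point at
# a line owned by a different controller.
node_addr = 0x1000   # line holding the pointer
next_addr = 0x10C0   # value of the pointer found in that line
print(home_controller(node_addr, 4), home_controller(next_addr, 4))
```

With this kind of mapping, whichever agent issues the dependent fetch needs a view of all controllers, which is the position the LLC already occupies.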

>> Busses on cores are reaching the stage where an entire cache line
>> is transferred in 1-cycle. With such busses, why define anything
>> smaller than a cache line ?? {other than uncacheable accesses}
>
>The Intel research chip was special-purpose targeting
>cache-unfriendly code. Reading 64 bytes when 99% of the time 56
>bytes would be unused is rather wasteful (and having more memory
>channels helps under high thread count).

Given the lack of both spatial and temporal locality in that
workload, one wonders if the data should be cached at all.

>
>However, even for a "general purpose" processor, "word"-granular
>atomic operations could justify not having all data transfers be
>cache line size. (Such are rare compared with cache line loads
>from memory or other caches, but a design might have narrower
>connections for coherence, interrupts, etc. that could be used for
>small data communication.)

So long as the data transfer is cachable, the atomics can be handled
at the LLC, rather than the memory controller.

Re: Another security vulnerability

<uu7bj0$h78h$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38206&group=comp.arch#38206

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: paaronclayton@gmail.com (Paul A. Clayton)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Fri, 29 Mar 2024 17:24:45 -0400
Organization: A noiseless patient Spider
Lines: 60
Message-ID: <uu7bj0$h78h$1@dont-email.me>
References: <utpoi2$b6to$1@dont-email.me>
<2024Mar25.082534@mips.complang.tuwien.ac.at>
<20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me>
<VpVMN.731075$p%Mb.618266@fx15.iad>
<2024Mar27.191411@mips.complang.tuwien.ac.at>
<HH_MN.732789$p%Mb.8039@fx15.iad>
<5fc6ea8088c0afe8618d2862cbacebab@www.novabbs.org>
<TfhNN.110764$_a1e.90012@fx16.iad>
<14b25c0880216e54fe36d28c96e8428c@www.novabbs.org>
<uu56rq$3u2ve$1@dont-email.me> <%1ANN.756839$p%Mb.622365@fx15.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 29 Mar 2024 21:24:48 +0100 (CET)
Injection-Info: dont-email.me; posting-host="595baf1f222d400d65e10ab7c5a16e34";
logging-data="564497"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18H4xQ+x5bXQEKj/MYuwbupUt3+i88/tfQ="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.0
Cancel-Lock: sha1:VPOzUXLP1At4iZ3pMAwaNKgY11Y=
In-Reply-To: <%1ANN.756839$p%Mb.622365@fx15.iad>
 by: Paul A. Clayton - Fri, 29 Mar 2024 21:24 UTC

On 3/29/24 10:15 AM, Scott Lurndal wrote:
> "Paul A. Clayton" <paaronclayton@gmail.com> writes:
>> On 3/28/24 3:59 PM, MitchAlsup1 wrote:
[snip]
>>> This is the tidbit that prevents doing prefetches at/in the DRAM controller.
>>> The address so fetched needs translation !! And this requires dragging
>>> stuff over to DRC that is not normally done.
>>
>> With multiple memory channels having independent memory
>> controllers (a reasonable design I suspect), a memory controller
>> may have to send the prefetch request to another memory controller
>> anyway.
>
> Which is usually handled by the LLC when the address space is
> striped across multiple memory controllers.

For data-dependent (pointer chasing) prefetching, one would like
the prefetch to start as soon as the data is available. For a
cache miss, that point is the memory controller. Having to do
address translation can be inconvenient even for the LLC.
>
>
>
>>> Busses on cores are reaching the stage where an entire cache line
>>> is transferred in 1-cycle. With such busses, why define anything
>>> smaller than a cache line ?? {other than uncacheable accesses}
>>
>> The Intel research chip was special-purpose targeting
>> cache-unfriendly code. Reading 64 bytes when 99% of the time 56
>> bytes would be unused is rather wasteful (and having more memory
>> channels helps under high thread count).
>
> Given the lack of both spatial and temporal locality in that
> workload, one wonders if the data should be cached at all.

Most of the data was intended not to be cached. Instructions and
stack would still have spatial locality and some temporal
locality.

>> However, even for a "general purpose" processor, "word"-granular
>> atomic operations could justify not having all data transfers be
>> cache line size. (Such are rare compared with cache line loads
>> from memory or other caches, but a design might have narrower
>> connections for coherence, interrupts, etc. that could be used for
>> small data communication.)
>
> So long as the data transfer is cachable, the atomics can be handled
> at the LLC, rather than the memory controller.

Yes, but if the width of the on-chip network — which is what Mitch
was referring to in transferring a cache line in one cycle — is
c.72 bytes (64 bytes for the data and 8 bytes for control
information) it seems that short messages would either have to be
grouped (increasing latency) or waste a significant fraction of
the potential bandwidth for that transfer. Compressed cache lines
would also not save bandwidth. These may not be significant
considerations, but this is an answer to "why define anything
smaller than a cache line?", i.e., seemingly reasonable
motivations may exist.
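
To put rough numbers on that, here is a minimal sketch of the arithmetic. The 72-byte link (64 data plus 8 control) is the figure given above; the 8-byte payload and the function names are assumptions for illustration only.

```python
LINK_BYTES = 72      # per-beat link width from the text: 64 data + 8 control
CONTROL_BYTES = 8
DATA_BYTES = LINK_BYTES - CONTROL_BYTES


def link_utilization(payload_bytes: int) -> float:
    """Fraction of one beat carrying useful bits for a single message."""
    if not 0 <= payload_bytes <= DATA_BYTES:
        raise ValueError("payload must fit in the data field")
    return (CONTROL_BYTES + payload_bytes) / LINK_BYTES


def messages_per_beat(payload_bytes: int) -> int:
    """How many short payloads could share one beat if they were grouped."""
    if not 0 < payload_bytes <= DATA_BYTES:
        raise ValueError("payload must fit in the data field")
    return DATA_BYTES // payload_bytes


print(f"{link_utilization(8):.0%} of the beat used by one 8-byte message")
print(messages_per_beat(8), "such messages would fit if grouped")
```

An ungrouped 8-byte message uses 16 of the 72 bytes; grouping recovers the bandwidth but only after waiting for enough messages to fill the beat.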

Re: Another security vulnerability

<8mHNN.117982$Sf59.36214@fx48.iad>

https://news.novabbs.org/devel/article-flat.php?id=38207&group=comp.arch#38207

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx48.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Another security vulnerability
Newsgroups: comp.arch
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me> <VpVMN.731075$p%Mb.618266@fx15.iad> <2024Mar27.191411@mips.complang.tuwien.ac.at> <HH_MN.732789$p%Mb.8039@fx15.iad> <5fc6ea8088c0afe8618d2862cbacebab@www.novabbs.org> <TfhNN.110764$_a1e.90012@fx16.iad> <14b25c0880216e54fe36d28c96e8428c@www.novabbs.org> <uu56rq$3u2ve$1@dont-email.me> <%1ANN.756839$p%Mb.622365@fx15.iad> <uu7bj0$h78h$1@dont-email.me>
Lines: 40
Message-ID: <8mHNN.117982$Sf59.36214@fx48.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Fri, 29 Mar 2024 22:34:44 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Fri, 29 Mar 2024 22:34:44 GMT
X-Received-Bytes: 2985
 by: Scott Lurndal - Fri, 29 Mar 2024 22:34 UTC

"Paul A. Clayton" <paaronclayton@gmail.com> writes:
>On 3/29/24 10:15 AM, Scott Lurndal wrote:
>> "Paul A. Clayton" <paaronclayton@gmail.com> writes:
>>> On 3/28/24 3:59 PM, MitchAlsup1 wrote:
>[snip]

>>> However, even for a "general purpose" processor, "word"-granular
>>> atomic operations could justify not having all data transfers be
>>> cache line size. (Such are rare compared with cache line loads
>>> from memory or other caches, but a design might have narrower
>>> connections for coherence, interrupts, etc. that could be used for
>>> small data communication.)
>>
>> So long as the data transfer is cachable, the atomics can be handled
>> at the LLC, rather than the memory controller.
>
>Yes, but if the width of the on-chip network — which is what Mitch
>was referring to in transferring a cache line in one cycle — is
>c.72 bytes (64 bytes for the data and 8 bytes for control
>information) it seems that short messages would either have to be
>grouped (increasing latency) or waste a significant fraction of
>the potential bandwidth for that transfer. Compressed cache lines
>would also not save bandwidth. These may not be significant
>considerations, but this is an answer to "why define anything
>smaller than a cache line?", i.e., seemingly reasonable
>motivations may exist.
>

It's not uncommon for the bus/switch/mesh -structure- to be 512 bits wide,
which indeed will support a full cache line transfer in a single transaction;
it also supports high-volume DMA operations (either memory to memory or
device to memory).

Most of the interconnect (bus, switched or point-to-point) implementations
have an overlaying protocol (including the cache coherency
protocol) and are effectively message based, with agents posting requests
that don't need a reply and expecting a reply for the rest.

That doesn't require every transaction over that bus to
utilize the full width of the bus.

Re: Another security vulnerability

<5140da0c7db5686c4bb9948276454914@www.novabbs.org>

https://news.novabbs.org/devel/article-flat.php?id=38208&group=comp.arch#38208

Newsgroups: comp.arch
Date: Sat, 30 Mar 2024 01:06:23 +0000
Subject: Re: Another security vulnerability
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$owmzDK56O5ZGiRHToArvfedFCPJEc9BcNDqur37Zqnx4uER6m/NZm
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <utpoi2$b6to$1@dont-email.me> <2024Mar25.082534@mips.complang.tuwien.ac.at> <20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me> <VpVMN.731075$p%Mb.618266@fx15.iad> <2024Mar27.191411@mips.complang.tuwien.ac.at> <HH_MN.732789$p%Mb.8039@fx15.iad> <5fc6ea8088c0afe8618d2862cbacebab@www.novabbs.org> <TfhNN.110764$_a1e.90012@fx16.iad> <14b25c0880216e54fe36d28c96e8428c@www.novabbs.org> <uu56rq$3u2ve$1@dont-email.me> <%1ANN.756839$p%Mb.622365@fx15.iad> <uu7bj0$h78h$1@dont-email.me> <8mHNN.117982$Sf59.36214@fx48.iad>
Organization: Rocksolid Light
Message-ID: <5140da0c7db5686c4bb9948276454914@www.novabbs.org>
 by: MitchAlsup1 - Sat, 30 Mar 2024 01:06 UTC

Scott Lurndal wrote:

> "Paul A. Clayton" <paaronclayton@gmail.com> writes:
>>On 3/29/24 10:15 AM, Scott Lurndal wrote:
>>> "Paul A. Clayton" <paaronclayton@gmail.com> writes:
>>>> On 3/28/24 3:59 PM, MitchAlsup1 wrote:
>>[snip]

>>>> However, even for a "general purpose" processor, "word"-granular
>>>> atomic operations could justify not having all data transfers be
>>>> cache line size. (Such are rare compared with cache line loads
>>>> from memory or other caches, but a design might have narrower
>>>> connections for coherence, interrupts, etc. that could be used for
>>>> small data communication.)
>>>
>>> So long as the data transfer is cachable, the atomics can be handled
>>> at the LLC, rather than the memory controller.
>>
>>Yes, but if the width of the on-chip network — which is what Mitch
>>was referring to in transferring a cache line in one cycle — is
>>c.72 bytes (64 bytes for the data and 8 bytes for control
>>information) it seems that short messages would either have to be
>>grouped (increasing latency) or waste a significant fraction of
>>the potential bandwidth for that transfer. Compressed cache lines
>>would also not save bandwidth. These may not be significant
>>considerations, but this is an answer to "why define anything
>>smaller than a cache line?", i.e., seemingly reasonable
>>motivations may exist.
>>

> It's not uncommon for the bus/switch/mesh -structure- to be 512 bits wide,
> which indeed will support a full cache line transfer in a single transaction;

It is not the transaction; it is a single beat of the clock. One can have
narrower bus widths and simply divide the cache line size by the bus width
to get the number of required beats.
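
The division can be sketched as follows. The 64-byte line and the list of bus widths are assumptions chosen for the example, and beats_per_line is a hypothetical helper name.

```python
def beats_per_line(line_bytes: int, bus_bytes: int) -> int:
    """Number of clock beats needed to move one cache line over a bus."""
    if line_bytes <= 0 or bus_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division, so a bus width that does not divide the line
    # evenly still gets a final, partly filled beat.
    return -(-line_bytes // bus_bytes)


LINE_BYTES = 64  # assumed cache line size
for bus_bits in (64, 128, 256, 512):
    print(f"{bus_bits:3d}-bit bus: {beats_per_line(LINE_BYTES, bus_bits // 8)} beat(s)")
```

Only the 512-bit case moves the whole line in one beat; the narrower widths trade wires for beats.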

> it also supports high-volume DMA operations (either memory to memory or
> device to memory).

> Most of the interconnect (bus, switched or point-to-point) implementations
> have an

or more than one

> overlaying protocol (including the cache coherency
> protocol) and are effectively message based, with agents posting requests
> that don't need a reply and expecting a reply for the rest.

Many older busses read PTPs and PTEs from memory sizeof( PTE ) at a time,
some of them requesting write permission so that used and modified bits
can be written back immediately. {{Which skirts the distinction between
cacheable and uncacheable in several ways.}}

> That doesn't require every transaction over that bus to
> utilize the full width of the bus.

In my wide bus situation, the line width is used to gang up multiple
responses (from different end-points) into a single beat==message.
For example the chip-to-chip transport can carry multiple independent
SNOOP responses in a single beat (saving cycles and lowering latency).
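
A minimal sketch of the ganging idea, not a description of any actual transport: pending responses are packed into as few beats as the width allows. The 64-byte beat payload, the 8-byte response size, and the name gang_responses are all assumptions for illustration.

```python
def gang_responses(responses, beat_bytes=64, resp_bytes=8):
    """Group pending responses into beats, filling each beat before
    starting the next. Sizes are hypothetical."""
    if resp_bytes <= 0 or beat_bytes < resp_bytes:
        raise ValueError("a beat must hold at least one response")
    per_beat = beat_bytes // resp_bytes
    return [responses[i:i + per_beat]
            for i in range(0, len(responses), per_beat)]


# Ten responses from different end-points need two beats instead of ten.
pending = [f"snoop-rsp-{n}" for n in range(10)]
beats = gang_responses(pending)
print(len(beats), [len(b) for b in beats])
```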

Re: Another security vulnerability

<jwvedbmco81.fsf-monnier+comp.arch@gnu.org>

https://news.novabbs.org/devel/article-flat.php?id=38213&group=comp.arch#38213

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: monnier@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Another security vulnerability
Date: Wed, 03 Apr 2024 14:10:41 -0400
Organization: A noiseless patient Spider
Lines: 9
Message-ID: <jwvedbmco81.fsf-monnier+comp.arch@gnu.org>
References: <utpoi2$b6to$1@dont-email.me>
<2024Mar25.082534@mips.complang.tuwien.ac.at>
<20240326192941.0000314a@yahoo.com> <uu0kt1$2nr9j$1@dont-email.me>
<VpVMN.731075$p%Mb.618266@fx15.iad>
<2024Mar27.191411@mips.complang.tuwien.ac.at>
<HH_MN.732789$p%Mb.8039@fx15.iad>
<5fc6ea8088c0afe8618d2862cbacebab@www.novabbs.org>
<TfhNN.110764$_a1e.90012@fx16.iad>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Date: Wed, 03 Apr 2024 18:13:12 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="d97d774ac232f6e68641575478ad60df";
logging-data="93107"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+UFUNoErvr/5wYx3EfGSmU4oGKUOw+gTw="
User-Agent: Gnus/5.13 (Gnus v5.13)
Cancel-Lock: sha1:onaYJpzdR5zQbh8WNIcD4xUydSI=
sha1:V8cH1LkdiJZblnhp8h60pmmC+bc=
 by: Stefan Monnier - Wed, 3 Apr 2024 18:10 UTC

> Since each chased pointer starts back at LSQ, the cost is no different
> than an explicit Prefetch instruction, except without (a),(b) and (c)
> having been applied first.

I thought the important difference is that the decision to prefetch or
not can be done dynamically based on past history.
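
One way to picture a history-based decision is a small saturating counter per load PC, trained on whether earlier prefetches turned out to be useful. This is only a sketch under that assumption; the class name, the 2-bit counter range, and the threshold are hypothetical and not taken from any real prefetcher.

```python
class PrefetchFilter:
    """Per-load-PC saturating counters gating a pointer-chasing prefetch."""

    def __init__(self, threshold: int = 2, max_count: int = 3):
        self.threshold = threshold
        self.max_count = max_count
        self.counters = {}  # load PC -> confidence counter

    def should_prefetch(self, load_pc: int) -> bool:
        # Unknown loads start at zero confidence, so no prefetch is issued.
        return self.counters.get(load_pc, 0) >= self.threshold

    def train(self, load_pc: int, was_useful: bool) -> None:
        count = self.counters.get(load_pc, 0)
        if was_useful:
            count = min(count + 1, self.max_count)
        else:
            count = max(count - 1, 0)
        self.counters[load_pc] = count


f = PrefetchFilter()
f.train(0x400, True)
f.train(0x400, True)
print(f.should_prefetch(0x400))
```

An explicit Prefetch instruction, by contrast, is issued regardless of how its predecessors fared.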

Stefan

