Rocksolid Light

devel / comp.arch / Re: Interrupts on the Concertina II

Re: Interrupts on the Concertina II

<uokhu9$ehdm$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=37012&group=comp.arch#37012
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
Date: Sun, 21 Jan 2024 17:55:20 -0800
Organization: A noiseless patient Spider
Lines: 85
Message-ID: <uokhu9$ehdm$1@dont-email.me>
References: <uo930v$24cq0$1@dont-email.me>
<c11eb332fac9497c3b5c75bfdc7c2eb3@news.novabbs.org>
<uo9pa2$286lv$1@dont-email.me>
<c168e8bb229ff12236468563107c8822@www.novabbs.org>
<P2yqN.362492$83n7.220225@fx18.iad>
<af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 22 Jan 2024 01:55:21 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f5efac9e35c211335209a589d0078cec";
logging-data="476598"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+EnNzRjOUoEmH2gvBWX8AarUDRfEsb7wI="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:u+AoPhzoUg4GHLyq6/OpXIaqB+s=
In-Reply-To: <af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org>
Content-Language: en-US
 by: Chris M. Thomasson - Mon, 22 Jan 2024 01:55 UTC

On 1/21/2024 12:58 PM, MitchAlsup1 wrote:
> EricP wrote:
>
>> MitchAlsup1 wrote:
>>> Chris M. Thomasson wrote:
>>>
>>>> On 1/17/2024 2:11 PM, MitchAlsup1 wrote:
>>>>> Quadibloc wrote:
>>>>>
>>>>>> When a computer receives an interrupt signal, it needs to save
>>>>>> the complete machine state, so that upon return from the
>>>>>> interrupt, the program thus interrupted is in no way affected.
>>>>>
>>>>> State needs to be saved, whether SW or HW does the save is a free
>>>>> variable.
>>>>>
>>>>>> This is because interrupts can happen at any time, and thus
>>>>>> programs don't prepare for them or expect them. Any disturbance
>>>>>> to the contents of any register would risk causing programs to
>>>>>> crash.
>>>>>
>>>>> Also note:: the ABA problem can happen when an interrupt transpires
>>>>> in the middle of an ATOMIC sequence. Thus, My 66000 fails the event
>>>>> before transferring control to the interrupt handler.
>>>> [...]
>>>
>>>> Just to be clear, an interrupt occurring within the hardware
>>>> implementation of a CAS operation (e.g., lock cmpxchg over on Intel)
>>>> should not affect the outcome of the CAS. Actually, it should not
>>>> happen at all, right? CAS does not have any spurious failures.
>>>
>>> ABA failure happens BECAUSE one uses the value of data to decide if
>>> something appeared ATOMIC. The CAS instruction (itself and all variants)
>>> is ATOMIC, but the setup to CAS is non-ATOMIC, because the original
>>> value to be compared was fetched without any ATOMIC indicator, and
>>> someone else can alter it before CAS. If more than 1 thread alters the
>>> location, it can (seldom) end up with the same data value as the
>>> suspended thread thought it should be.
>>>
>>> CAS is ATOMIC, the code leading to CAS was not and this opens up the
>>> hole.
>>>
>>> Note:: CAS functionality implemented with LL/SC does not suffer ABA
>>> because the core monitors the LL address until the SC is performed.
>>> It is an address-based comparison, not a data-value-based one.
>
>> Yes but an equal point of view is that LL/SC only emulates atomic and
>> uses the cache line ownership grab while "locked" to detect possible
>> interference and infer potential change.
>
> Which, BTW, opens up a different side channel ...
>
>> Note that if LL/SC is implemented with temporary line pinning
>> (as might be done to guarantee forward progress and prevent ping-pong)
>> then it cannot be interfered with, and CAS and atomic-fetch-op sequences
>> are semantically identical to the equivalent single instructions
>> (which may also be implemented with temporary line pinning if their
>> data must move from cache through the core and back).
>
> Line pinning requires a NAK in the coherence protocol. As far as I know,
> only My 66000 interconnect protocol has such a NaK.
>
>> Also LL/SC as implemented on Alpha, MIPS, Power, ARM, RISC-V don't allow
>> any other location loads or stores between them so really aren't useful
>> for detecting ABA because detecting it requires monitoring two memory
>> locations for change.
>
>> The classic example is the single linked list with items head->A->B->C
>> Detecting ABA requires monitoring if either head or head->Next change
>> which LL/SC cannot do as reading head->Next cancels the lock on head.
>
> Detecting ABA requires one to monitor addresses not data values.

Well, the version counter tries to negate this wrt double-wide
compare-and-swap? ;^)

>
>> x86 has cmpxchg8b and ARM has double wide LL/SC which can be used to
>> implement CASD atomic-double-wide-compare-and-swap. The first word holds
>> the head pointer and the second word holds a generation counter whose
>> change is used to infer that head->Next might have changed.

Re: Interrupts on the Concertina II

<uoki5o$ehdm$2@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=37013&group=comp.arch#37013
Path: i2pn2.org!i2pn.org!paganini.bofh.team!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
Date: Sun, 21 Jan 2024 17:59:19 -0800
Organization: A noiseless patient Spider
Lines: 85
Message-ID: <uoki5o$ehdm$2@dont-email.me>
References: <uo930v$24cq0$1@dont-email.me>
<c11eb332fac9497c3b5c75bfdc7c2eb3@news.novabbs.org>
<uo9pa2$286lv$1@dont-email.me>
<c168e8bb229ff12236468563107c8822@www.novabbs.org>
<P2yqN.362492$83n7.220225@fx18.iad>
<af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 22 Jan 2024 01:59:20 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f5efac9e35c211335209a589d0078cec";
logging-data="476598"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19ngmH9YBk7o0lOh2ePuoakY+kmkzaaVkY="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:/xsrw3mNOmo9wJkqOrj0cXtTJXI=
Content-Language: en-US
In-Reply-To: <af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org>
 by: Chris M. Thomasson - Mon, 22 Jan 2024 01:59 UTC

On 1/21/2024 12:58 PM, MitchAlsup1 wrote:
> EricP wrote:
>
>> MitchAlsup1 wrote:
>>> Chris M. Thomasson wrote:
>>>
>>>> On 1/17/2024 2:11 PM, MitchAlsup1 wrote:
>>>>> Quadibloc wrote:
>>>>>
>>>>>> When a computer receives an interrupt signal, it needs to save
>>>>>> the complete machine state, so that upon return from the
>>>>>> interrupt, the program thus interrupted is in no way affected.
>>>>>
>>>>> State needs to be saved, whether SW or HW does the save is a free
>>>>> variable.
>>>>>
>>>>>> This is because interrupts can happen at any time, and thus
>>>>>> programs don't prepare for them or expect them. Any disturbance
>>>>>> to the contents of any register would risk causing programs to
>>>>>> crash.
>>>>>
>>>>> Also note:: the ABA problem can happen when an interrupt transpires
>>>>> in the middle of an ATOMIC sequence. Thus, My 66000 fails the event
>>>>> before transferring control to the interrupt handler.
>>>> [...]
>>>
>>>> Just to be clear, an interrupt occurring within the hardware
>>>> implementation of a CAS operation (e.g., lock cmpxchg over on Intel)
>>>> should not affect the outcome of the CAS. Actually, it should not
>>>> happen at all, right? CAS does not have any spurious failures.
>>>
>>> ABA failure happens BECAUSE one uses the value of data to decide if
>>> something appeared ATOMIC. The CAS instruction (itself and all variants)
>>> is ATOMIC, but the setup to CAS is non-ATOMIC, because the original
>>> value to be compared was fetched without any ATOMIC indicator, and
>>> someone else can alter it before CAS. If more than 1 thread alters the
>>> location, it can (seldom) end up with the same data value as the
>>> suspended thread thought it should be.
>>>
>>> CAS is ATOMIC, the code leading to CAS was not and this opens up the
>>> hole.
>>>
>>> Note:: CAS functionality implemented with LL/SC does not suffer ABA
>>> because the core monitors the LL address until the SC is performed.
>>> It is an address-based comparison, not a data-value-based one.
>
>> Yes but an equal point of view is that LL/SC only emulates atomic and
>> uses the cache line ownership grab while "locked" to detect possible
>> interference and infer potential change.
>
> Which, BTW, opens up a different side channel ...
>
>> Note that if LL/SC is implemented with temporary line pinning
>> (as might be done to guarantee forward progress and prevent ping-pong)
>> then it cannot be interfered with, and CAS and atomic-fetch-op sequences
>> are semantically identical to the equivalent single instructions
>> (which may also be implemented with temporary line pinning if their
>> data must move from cache through the core and back).
>
> Line pinning requires a NAK in the coherence protocol. As far as I know,
> only My 66000 interconnect protocol has such a NaK.
>
>> Also LL/SC as implemented on Alpha, MIPS, Power, ARM, RISC-V don't allow
>> any other location loads or stores between them so really aren't useful
>> for detecting ABA because detecting it requires monitoring two memory
>> locations for change.
>
>> The classic example is the single linked list with items head->A->B->C
>> Detecting ABA requires monitoring if either head or head->Next change
>> which LL/SC cannot do as reading head->Next cancels the lock on head.
>
> Detecting ABA requires one to monitor addresses not data values.

Not 100% true.

>
>> x86 has cmpxchg8b and ARM has double wide LL/SC which can be used to
>> implement CASD atomic-double-wide-compare-and-swap. The first word holds
>> the head pointer and the second word holds a generation counter whose
>> change is used to infer that head->Next might have changed.

Re: Interrupts on the Concertina II

<5a7861b20603040b93d45ad550d67e85@www.novabbs.org>

https://news.novabbs.org/devel/article-flat.php?id=37014&group=comp.arch#37014
Date: Mon, 22 Jan 2024 02:07:22 +0000
Subject: Re: Interrupts on the Concertina II
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$UHg4KqmbX8ucEJkV0fgirOC89n5s4vKuRBugaTGRFUdEAMgtAMhpe
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <uo930v$24cq0$1@dont-email.me> <c11eb332fac9497c3b5c75bfdc7c2eb3@news.novabbs.org> <uo9pa2$286lv$1@dont-email.me> <c168e8bb229ff12236468563107c8822@www.novabbs.org> <P2yqN.362492$83n7.220225@fx18.iad> <af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org> <uoki5o$ehdm$2@dont-email.me>
Organization: Rocksolid Light
Message-ID: <5a7861b20603040b93d45ad550d67e85@www.novabbs.org>
 by: MitchAlsup1 - Mon, 22 Jan 2024 02:07 UTC

Chris M. Thomasson wrote:

> On 1/21/2024 12:58 PM, MitchAlsup1 wrote:

>> Detecting ABA requires one to monitor addresses not data values.

> Not 100% true.

IBM's original ABA problem was encountered when a background task
(once a week or once a month) was swapped out to disk the instruction
prior to CAS, and when it came back the data comparison register
matched the memory data, but the value to be swapped in had no
relationship with the current linked list structure. Machine crashed.

Without knowing the address, how can this particular problem be
rectified ??

Re: Interrupts on the Concertina II

<uokk2j$erqm$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=37015&group=comp.arch#37015
Path: i2pn2.org!rocksolid2!news.neodome.net!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
Date: Sun, 21 Jan 2024 18:31:46 -0800
Organization: A noiseless patient Spider
Lines: 66
Message-ID: <uokk2j$erqm$1@dont-email.me>
References: <uo930v$24cq0$1@dont-email.me>
<c11eb332fac9497c3b5c75bfdc7c2eb3@news.novabbs.org>
<uo9pa2$286lv$1@dont-email.me>
<c168e8bb229ff12236468563107c8822@www.novabbs.org>
<P2yqN.362492$83n7.220225@fx18.iad>
<af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org>
<uoki5o$ehdm$2@dont-email.me>
<5a7861b20603040b93d45ad550d67e85@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 22 Jan 2024 02:31:48 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f5efac9e35c211335209a589d0078cec";
logging-data="487254"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19S2zfJL/aq9+ToCUPyrnlHChqcYrRqsUY="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:d8IH+fKPPTCxgReMF4Ba6cLTQ9s=
Content-Language: en-US
In-Reply-To: <5a7861b20603040b93d45ad550d67e85@www.novabbs.org>
 by: Chris M. Thomasson - Mon, 22 Jan 2024 02:31 UTC

On 1/21/2024 6:07 PM, MitchAlsup1 wrote:
> Chris M. Thomasson wrote:
>
>> On 1/21/2024 12:58 PM, MitchAlsup1 wrote:
>
>>> Detecting ABA requires one to monitor addresses not data values.
>
>> Not 100% true.
>
> IBM's original ABA problem was encountered when a background task (once
> a week or once a month) was swapped out to disk the instruction
> prior to CAS, and when it came back the data comparison register matched
> the memory data, but the value to be swapped in had no
> relationship with the current linked list structure. Machine crashed.
>
> Without knowing the address, how can this particular problem be
> rectified ??

The version counter wrt a double wide compare and swap where:

struct dwcas_anchor
{
    word* next;
    word version;
};

comes into play.

Basically, from IBM:
__________________________________
Consider a chained list of the type used in the LIFO lock/unlock
example. Assume that the first two elements are at locations A and B,
respectively. If one program attempted to remove the first element and
was interrupted between the fourth and fifth instructions of the LUNLK
routine, the list could be changed so that elements A and C are the
first two elements when the interrupted program resumes execution. The
COMPARE AND SWAP instruction would then succeed in storing the value B
into the header, thereby destroying the list.

The probability of the occurrence of such list destruction can be
reduced to near zero by appending to the header a counter that
indicates the number of times elements have been added to the list. The
use of a 32-bit counter guarantees that the list will not be destroyed
unless the following events occur, in the exact sequence:

1. An unlock routine is interrupted between the fetch of the pointer
   from the first element and the update of the header.
2. The list is manipulated, including the deletion of the element
   referenced in 1, and exactly 2^32 (or an integer multiple of 2^32)
   additions to the list are performed. Note that this takes on the
   order of days to perform in any practical situation.
3. The element referenced in 1 is added to the list.
4. The unlock routine interrupted in 1 resumes execution.
__________________________________
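
The scenario in the quoted passage can be replayed deterministically. What follows is a minimal single-threaded C11 sketch, not IBM's LUNLK routine. It assumes nodes are indices into a small pool, so that a 32-bit head index and a 32-bit addition counter pack into one 64-bit word and an ordinary 64-bit CAS stands in for the double-wide CAS. All names (anchor_pack, lifo_push, stale_pop_succeeds, ...) are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NIL 0xFFFFFFFFu               /* hypothetical "null" node index */

enum { A, B, C, NODES };              /* three list elements, as in the quote */
static uint32_t next_of[NODES];       /* next_of[i]: element that follows i */

/* Anchor = {head index, version} packed into one 64-bit word, so a single
   64-bit CAS compares and swaps both halves together. */
static uint64_t anchor_pack(uint32_t head, uint32_t version) {
    return ((uint64_t)version << 32) | head;
}
static uint32_t anchor_head(uint64_t a)    { return (uint32_t)a; }
static uint32_t anchor_version(uint64_t a) { return (uint32_t)(a >> 32); }

/* Push; when count_additions is set, the version counts additions. */
static void lifo_push(_Atomic uint64_t *anchor, uint32_t node, bool count_additions) {
    uint64_t old = atomic_load(anchor), updated;
    do {
        next_of[node] = anchor_head(old);
        updated = anchor_pack(node, anchor_version(old) + (count_additions ? 1u : 0u));
    } while (!atomic_compare_exchange_weak(anchor, &old, updated));
}

/* First half of a pop: fetch the anchor and the first element's next link. */
static uint64_t pop_begin(_Atomic uint64_t *anchor, uint32_t *next_out) {
    uint64_t seen = atomic_load(anchor);
    uint32_t head = anchor_head(seen);
    *next_out = (head == NIL) ? NIL : next_of[head];
    return seen;
}

/* Second half: the CAS. It succeeds only if the whole anchor is unchanged. */
static bool pop_finish(_Atomic uint64_t *anchor, uint64_t seen, uint32_t next) {
    uint64_t updated = anchor_pack(next, anchor_version(seen));
    return atomic_compare_exchange_strong(anchor, &seen, updated);
}

static uint32_t lifo_pop(_Atomic uint64_t *anchor) {
    uint32_t next;
    uint64_t seen;
    do {
        seen = pop_begin(anchor, &next);
    } while (!pop_finish(anchor, seen, next));
    return anchor_head(seen);
}

/* Replays the quoted scenario; returns true if the stale CAS succeeded. */
static bool stale_pop_succeeds(bool count_additions) {
    _Atomic uint64_t anchor;
    atomic_init(&anchor, anchor_pack(NIL, 0));
    lifo_push(&anchor, C, count_additions);
    lifo_push(&anchor, B, count_additions);
    lifo_push(&anchor, A, count_additions);           /* head -> A -> B -> C */

    uint32_t stale_next;
    uint64_t seen = pop_begin(&anchor, &stale_next);  /* sees head A, next B */

    /* "Interrupt": A and B are removed, then A is added back. */
    (void)lifo_pop(&anchor);
    (void)lifo_pop(&anchor);
    lifo_push(&anchor, A, count_additions);           /* head -> A -> C */

    return pop_finish(&anchor, seen, stale_next);     /* tries to store B */
}
```

Without the counter (count_additions false) the stale CAS succeeds and the header ends up pointing at B, which is no longer on the list. With the counter, the version has moved from 3 to 4 by the time the interrupted pop resumes, so its CAS fails and the pop has to retry.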

Re: Interrupts on the Concertina II

<uokk4v$erqm$2@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=37016&group=comp.arch#37016
Path: i2pn2.org!i2pn.org!news.chmurka.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
Date: Sun, 21 Jan 2024 18:33:02 -0800
Organization: A noiseless patient Spider
Lines: 30
Message-ID: <uokk4v$erqm$2@dont-email.me>
References: <uo930v$24cq0$1@dont-email.me>
<c11eb332fac9497c3b5c75bfdc7c2eb3@news.novabbs.org>
<uo9pa2$286lv$1@dont-email.me>
<c168e8bb229ff12236468563107c8822@www.novabbs.org>
<P2yqN.362492$83n7.220225@fx18.iad>
<af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org>
<uoki5o$ehdm$2@dont-email.me>
<5a7861b20603040b93d45ad550d67e85@www.novabbs.org>
<uokk2j$erqm$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 22 Jan 2024 02:33:04 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f5efac9e35c211335209a589d0078cec";
logging-data="487254"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+/8xWs4Qfilb3O4fP+xYKpVcB1lDG2b7Y="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:qYMIpqscBFT8Ci98survVlAMdvA=
Content-Language: en-US
In-Reply-To: <uokk2j$erqm$1@dont-email.me>
 by: Chris M. Thomasson - Mon, 22 Jan 2024 02:33 UTC

On 1/21/2024 6:31 PM, Chris M. Thomasson wrote:
> On 1/21/2024 6:07 PM, MitchAlsup1 wrote:
>> Chris M. Thomasson wrote:
>>
>>> On 1/21/2024 12:58 PM, MitchAlsup1 wrote:
>>
>>>> Detecting ABA requires one to monitor addresses not data values.
>>
>>> Not 100% true.
>>
>> IBM's original ABA problem was encountered when a background task
>> (once a week or once a month) was swapped out to disk the instruction
>> prior to CAS, and when it came back the data comparison register
>> matched the memory data, but the value to be swapped in had no
>> relationship with the current linked list structure. Machine crashed.
>>
>> Without knowing the address, how can this particular problem be
>> rectified ??
>
> The version counter wrt a double wide compare and swap where:
>
> struct dwcas_anchor
> {
>     word* next;
>     word version;
> };

In this setup, sizeof(word*) == sizeof(word) and
sizeof(struct dwcas_anchor) == sizeof(word) * 2, and the two fields must
be contiguous.
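
These layout constraints can be checked at compile time. A minimal C11 sketch, assuming "word" is a pointer-sized unsigned integer (uintptr_t here). The alignment part is an addition not stated in the post: double-wide CAS instructions such as x86-64 cmpxchg16b need the anchor aligned to its full size. anchor_is_aligned and g_anchor are hypothetical names.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word;   /* assumption: "word" is a pointer-sized integer */

struct dwcas_anchor {
    word *next;      /* head pointer */
    word  version;   /* generation counter */
};

/* Pointer and counter must be the same size ... */
static_assert(sizeof(word *) == sizeof(word),
              "pointer and word differ in size");
/* ... and the pair must be exactly two words with no padding ... */
static_assert(sizeof(struct dwcas_anchor) == 2 * sizeof(word),
              "anchor contains padding");
/* ... with the counter immediately after the pointer. */
static_assert(offsetof(struct dwcas_anchor, version) == sizeof(word),
              "fields are not contiguous");

/* An anchor aligned to its own size, as a double-wide CAS would want. */
static alignas(2 * sizeof(word)) struct dwcas_anchor g_anchor;

/* Runtime check for anchors that were allocated elsewhere. */
static int anchor_is_aligned(const struct dwcas_anchor *p) {
    return (uintptr_t)p % sizeof *p == 0;
}
```

If any of the static assertions fires on a given target, the struct cannot be handed to a double-wide CAS as-is on that target.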

Re: Interrupts on the Concertina II

<uoknoa$j55p$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=37020&group=comp.arch#37020
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
Date: Sun, 21 Jan 2024 21:34:32 -0600
Organization: A noiseless patient Spider
Lines: 324
Message-ID: <uoknoa$j55p$1@dont-email.me>
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad>
<uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me>
<ff86faba91c3898f808cce78672bb058@www.novabbs.org>
<uoka6h$dlog$1@dont-email.me>
<55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 22 Jan 2024 03:34:34 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c3f5f7bcf3cbeca25160f5a4be788e24";
logging-data="627897"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19STKUWZUPK5yKpcIIqTCDa"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:1ewsLmkyf47N12usBh2gqJdkpPk=
Content-Language: en-US
In-Reply-To: <55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org>
 by: BGB - Mon, 22 Jan 2024 03:34 UTC

On 1/21/2024 7:22 PM, MitchAlsup1 wrote:
> BGB wrote:
>
>> On 1/21/2024 3:18 PM, MitchAlsup1 wrote:
>>> BGB wrote:
>>>
>>>> On 1/20/2024 12:54 PM, Quadibloc wrote:
>>>>> On Wed, 17 Jan 2024 15:35:56 -0500, EricP wrote:
>>>>>
>>>>> So one can't just have an interrupt behave like on an 8-bit
>>>>> microprocessor, saving only the program counter and the status bits,
>>>>> and leaving any registers to be saved by software. At least some of
>>>>> the general registers have to be saved, and set up with new starting
>>>>> values, for the interrupt routine to be able to save anything else,
>>>>> if need be.
>>>>>
>>>
>>>> IIRC, saving off PC, some flags bits, swapping the stack registers,
>>>> and doing a computed branch relative to a control register (via
>>>> bit-slicing, *). This is effectively the interrupt mechanism I am
>>>> using on a 64-bit ISA.
>>>
>>> And sounds like the interrupt mechanism for an 8-bit µprocessor...
>>>
>
>> It was partly a simplification of the design from the SH-4, which was
>> a 32-bit CPU mostly used in embedded systems (and in the Sega
>> Dreamcast...).
>
>> Though, the SH-4 did bank out half the registers, which was a feature
>> that ended up being dropped for cost-saving reasons.
>
>
>>>> *: For a table that is generally one-off in the kernel or similar,
>>>> it doesn't ask much to mandate that it has a certain alignment. And
>>>> if the required alignment is larger than the size of the table, you
>>>> have just saved yourself needing an adder...
>>>
>>> In a modern system where you have several HyperVisors and a multiplicity
>>> of GuestOSs, a single interrupt table is unworkable looking forward.
>>> What you want and need is every GuestOS to have its own table, and
>>> every HyperVisor have its own table, some kind of routing mechanism to
>>> route device interrupts to the correct table, and inform appropriate
>>> cores of raised and enabled interrupts. All these tables have to be
>>> concurrently available continuously and simultaneously. The old fixed
>>> mapping will no longer work efficiently--you can make them work with
>>> a near-Herculean amount of careful programming.
>>>
>>> Or you can look at the problem from a modern viewpoint and fix the
>>> model so the above is manifest.
>>>
>
>> Presumably, only the actual "bare metal" layer has an actual
>> hardware-level interrupt table, and all of the "guest" tables are
>> faked in software?...
>
>> Much like with MMU:
>> Only the base level needs to actually handle TLB miss events, and
>> everything else (nested translation, etc), can be left to software
>> emulation.
>
> Name a single ISA that fakes the TLB ?? (and has an MMU)
>

Not sure.
At least in software, the BJX2 emulator fakes the TLB.

Nothing says it can't be "turtles all the way down", though care would
be needed in the emulator design to limit how much performance overhead
is added with each HV level.

Assuming that a mechanism is in place, say, to trap LDTLB events, then
potentially each (virtual) LDTLB can hook into the next as a sort of
cascade, until it reaches a top-level virtual TLB (which could then be
treated similar to a software-managed inverted page-table).

Likely, this would need some way to signal to the top-level TLB-Miss ISR
that it is running a VM, and that it should access this IPT for the TLBEs,
or else somehow forward the TLB miss back into the VM (would need to come
up with an API for this, and/or route it in via the "signal()" mechanism
or similar).

>>>> If anything, it is a little simpler than the mechanism used on some
>>>> 8-bit systems, which would have needed a mechanism to push these
>>>> values to the stack, and restore them from the stack.
>>>
>>> Do you think your mechanism would work "well" with 1024 cores in your
>>> system ??
>>>
>
>> Number of cores should not matter that much.
>
> Exactly !! but then try running 1024 cores under differing GuestOSs, and
> HyperVisors under one set of system-wide Tables !!
>

Nothing requires that all cores use the same tables.

>> Presumably, each core gets its own ISR stack, which should not have
>> any reason to need to interact with each other.
>
> I presume an interrupt can be serviced by any number of cores.
> I presume that there are a vast number of devices. Each device assigned
> to a few GuestOSs.
> I presume the core that services the interrupt (ISR) is running the same
> GuestOS under the same HyperVisor that initiated the device.
> I presume the core that services the interrupt was of the lowest priority
> of all the cores then running that GuestOS.
> I presume the core that services the interrupt wasted no time in doing so.
>
> And the GuestOS decides on how its ISR stack is {formatted, allocated,
> used, serviced, ...} which can be different for each GuestOS.
>

I would have assumed that the cores are organized in a hierarchy, say:
  Core 0:
    Starts the boot process;
    Natural target for hardware interrupts.
  Cores 1-11:
    Go to sleep at power-on, woken by the main core;
    Do not receive hardware interrupts.
  Cores 12-15:
    Reserved for nested cores;
    These operate on a sub-ring, repeating the same pattern;
    Sub-rings could potentially add another level of cache.

Communication between cores is via inter-processor interrupts and memory.

Or, possibly, for a 136-core target:
  Core 0:
    Main core;
  Cores 1-7:
    Secondary top-level cores;
  8-15:
    Sub-rings, each holding 16 more cores.
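
As a quick check of the 136 figure: 8 top-level cores plus 8 sub-rings of 16 cores each gives 8 + 8 * 16 = 136. A small C sketch of that arithmetic follows. It assumes (one reading of the layout, not something stated here) that entries 8-15 are ring slots and that cores are numbered flat with the top-level ones first. total_cores and locate_core are hypothetical helpers.

```c
#include <assert.h>

/* Total cores when `top_level` cores sit directly on the main ring and
   each of `sub_rings` ring slots holds `per_ring` cores. */
static unsigned total_cores(unsigned top_level, unsigned sub_rings,
                            unsigned per_ring) {
    return top_level + sub_rings * per_ring;
}

/* Position of a core on the rings; sub is -1 for a top-level core. */
struct core_addr {
    unsigned slot;   /* slot on the main ring, 0-15 */
    int      sub;    /* index within the sub-ring, or -1 */
};

/* Hypothetical flat numbering: ids 0-7 are the top-level cores, ids 8-135
   fill the sub-rings in slots 8-15, sixteen cores per slot. */
static struct core_addr locate_core(unsigned id) {
    if (id < 8)
        return (struct core_addr){ id, -1 };
    unsigned k = id - 8;
    return (struct core_addr){ 8 + k / 16, (int)(k % 16) };
}
```

Under this numbering, core 135 is the last core of the sub-ring in slot 15.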

Though, to go to a larger number of cores, may need to expand the
ringbus routing scheme (and/or apply something similar to NAT regarding
request/response routing).

Though, it is likely if another level of cache were added (say, an
"L2-Sub" cache), it may need to use Write-Through semantics with a
mechanism to selectively ignore cache-lines (for "No-Cache" or possible
"knock detection"), say, so that the weak coherence model still works
with the multi-level cache (well, either this, or it is the same size as
the L1 caches, so that any L1 knock events also knock the L2S cache).

Well, or maybe rename the top-level L2 cache to L3, and call the L2S
caches "L2 Cache"?...

Though, by this point, may need a good way to explicitly signal no-cache
memory accesses, as "cache knocking" isn't a particularly scalable strategy.

Then again, it is not like any viable (cost-effective) FPGAs have the
resources for this many cores.

Well, at least excluding the option of going over to 16-bit CPU cores or
similar (possibly each with its own local non-shared RAM spaces, and a
programming model similar to Occam or Erlang?...).

Assuming fairly basic RISC style CPU cores, could potentially fit around
16x 32-bit CPU cores onto an XC7A200T.

Going too much more than this isn't going to happen short of going over
to a 16-bit design.

Though, this is unlikely to be particularly useful...

And, say, if I wanted to use them for Binary16 FP-SIMD, the cost of the
SIMD units would require a smaller number of cores (would be
hard-pressed to fit more than around 4 cores, if one wants FP-SIMD).

And, for an "FP-SIMD beast", might be cheaper just to make bigger cores
that can do 8x Binary16 vector ops.

Well, and/or come up with a cheaper alternative to Binary16.

>> For extra speed, maybe the ISR stacks could be mapped to some sort of
>> core-local SRAM. This hasn't been done yet though.
>
> Caches either work or they don't.
>

The SRAM, as a cache, would be "technically not working" as a cache...
It is faster by virtue of being invisible to the outside world, and thus
not needing to contribute any activity to the bus outside the area to
which it applies.

> Wasting cycles fetching instructions, translations, and data are genuine
> overhead that can be avoided if one treats thread-state as a cache.
>
> If the interrupt occurs often enough to matter, its instructions, data,
> and translations will be in the cache hierarchy.
>
> HW that knows what it is doing can start fetching these things even
> BEFORE it can execute the first instruction on behalf of the interrupt
> dispatcher. SW can NEVER do any of this prior to starting to run instr.
>
>> Idea here being probably the SRAM region could have a special address
>> range, and any access to this region would be invisible to any other
>> cores (and it need not have backing in external RAM).
>
>> One could maybe debate the cost of giving each core 4K or 8K of
>> dedicated local SRAM though merely for "slightly faster interrupt
>> handling".
>
> Yech: end of debate.....
>


Re: Interrupts on the Concertina II

<uol4on$kl3u$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=37022&group=comp.arch#37022
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
Date: Mon, 22 Jan 2024 08:16:39 +0100
Organization: A noiseless patient Spider
Lines: 68
Message-ID: <uol4on$kl3u$1@dont-email.me>
References: <uo930v$24cq0$1@dont-email.me> <PCUpN.237105$xHn7.12585@fx14.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 22 Jan 2024 07:16:39 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="773b575d0a21626e330ecd5dcb685706";
logging-data="676990"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19EEt8w7F9QBnltqcoKQNV/uIBYtSA8VRk="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:j7zZ+RedJDiaxlTeb5c4P5ZRABU=
In-Reply-To: <PCUpN.237105$xHn7.12585@fx14.iad>
Content-Language: en-GB
 by: David Brown - Mon, 22 Jan 2024 07:16 UTC

On 17/01/2024 19:02, Scott Lurndal wrote:
> Quadibloc <quadibloc@servername.invalid> writes:
>> When a computer receives an interrupt signal, it needs to save
>> the complete machine state, so that upon return from the
>> interrupt, the program thus interrupted is in no way affected.
>>
>> This is because interrupts can happen at any time, and thus
>> programs don't prepare for them or expect them. Any disturbance
>> to the contents of any register would risk causing programs to
>> crash.
>
> Something needs to preserve state, either the hardware or
> the software. Most risc processors lean towards the latter,
> generally for good reason - one may not need to save
> all the state if the interrupt handler only touches part of it.
>
>>
>> The Concertina II has a potentially large machine state which most
>> programs do not use. There are vector registers, of the huge
>> kind found in the Cray I. There are banks of 128 registers to
>> supplement the banks of 32 registers.
>>
>> One obvious step in addressing this is for programs that don't
>> use these registers to run without access to those registers.
>> If this is indicated in the PSW, then the interrupt routine will
>> know what it needs to save and restore.
>
> Just like x86 floating point.

Also ARM floating point (at least, on the 32-bit Cortex-M ARM families).

>
>>
>> A more elaborate and more automated method is also possible.
>>
>> Let us imagine the computer speeds up interrupts by having a
>> second bank of registers that interrupt routines use. But two
>> register banks aren't enough, as many user programs are running
>> concurrently.
>>
>> Here is how I envisage the sequence of events in response to
>> an interrupt could work:
>>
>> 1) The computer, at the beginning of an area of memory
>> sufficient to hold all the contents of the computer's
>> registers, including the PSW and program counter, places
>> a _restore status_.
>
> Slow DRAM or special SRAMs? The former will add
> considerable latency to an interrupt, the latter costs
> area (on a per-hardware-thread basis) and floorplanning
> issues.

An SRAM block sufficient to hold a small number of copies of registers,
even for ISAs with lots of registers, would be small compared to
typical cache blocks. Indeed, it could be considered as a kind of
dedicated cache.

>
> Best is to save the minimal amount of state in hardware
> and let software deal with the rest, perhaps with
> hints from the hardware (e.g. a bit that indicates
> whether the FPRs were modified since the last context
> switch, etc).

A combined effort sounds good to me.
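
As a rough illustration of that split (hardware keeps a hint, software decides what to save), here is a minimal C sketch. Everything in it is hypothetical: the `CpuState` layout, the `fpr_dirty` flag standing in for the hardware hint bit, and the function names are assumptions made for the example, not any real ISA or OS interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_GPR 32
#define NUM_FPR 32

/* Hypothetical model of the per-core architectural state. */
typedef struct {
    uint64_t pc;
    uint64_t gpr[NUM_GPR];
    double   fpr[NUM_FPR];
    bool     fpr_dirty;   /* stands in for the hardware hint bit */
} CpuState;

/* Hypothetical per-thread save area kept by the OS. */
typedef struct {
    uint64_t pc;
    uint64_t gpr[NUM_GPR];
    double   fpr[NUM_FPR];
    unsigned fpr_saves;   /* how often the FPRs really had to be saved */
} ThreadContext;

/* Models an FP instruction writing a register; hardware sets the hint. */
void cpu_write_fpr(CpuState *cpu, int reg, double value)
{
    assert(reg >= 0 && reg < NUM_FPR);
    cpu->fpr[reg] = value;
    cpu->fpr_dirty = true;
}

/* Save the outgoing thread: integer state always, FP state only if touched. */
void context_save(CpuState *cpu, ThreadContext *ctx)
{
    ctx->pc = cpu->pc;
    memcpy(ctx->gpr, cpu->gpr, sizeof ctx->gpr);
    if (cpu->fpr_dirty) {
        memcpy(ctx->fpr, cpu->fpr, sizeof ctx->fpr);
        ctx->fpr_saves++;
        cpu->fpr_dirty = false;   /* software clears the hint after saving */
    }
}

/* Restore the incoming thread and start it with a clean hint bit. */
void context_restore(CpuState *cpu, const ThreadContext *ctx)
{
    cpu->pc = ctx->pc;
    memcpy(cpu->gpr, ctx->gpr, sizeof cpu->gpr);
    memcpy(cpu->fpr, ctx->fpr, sizeof cpu->fpr);
    cpu->fpr_dirty = false;
}
```

A thread that never writes an FPR between switches costs no FP save at all in this model; the save area from its previous switch is still valid because the restore path loaded the registers from it.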

Re: Interrupts on the Concertina II

<mrvrN.46579$5Hnd.4925@fx03.iad>

https://news.novabbs.org/devel/article-flat.php?id=37025&group=comp.arch#37025

Path: i2pn2.org!i2pn.org!news.1d4.us!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx03.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad> <uoag35$2fubj$1@dont-email.me> <OwbqN.60118$Vrtf.6995@fx39.iad> <3f2ebe5753d55c9630e376d28df91395@www.novabbs.org>
In-Reply-To: <3f2ebe5753d55c9630e376d28df91395@www.novabbs.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 72
Message-ID: <mrvrN.46579$5Hnd.4925@fx03.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Mon, 22 Jan 2024 15:01:38 UTC
Date: Mon, 22 Jan 2024 10:00:20 -0500
X-Received-Bytes: 4002
 by: EricP - Mon, 22 Jan 2024 15:00 UTC

MitchAlsup1 wrote:
> EricP wrote:
>
>> There also can be many restrictions on what an ISR is allowed to do
>> because the OS designers did not want to, say, force every ISR to
>> sync with the slow x87 FPU just in case someone wanted to use it.
>
> What about all the architectures that are not x86 and do not need to synch
> to FP, Vectors, SIMD, ..... ?? Why are they constrained by the one badly
> designed long life architecture ??

This is about minimizing what it saves *by default*.

I am not assuming a vector coprocessor would be halted by interrupts.
Not automatically halting for interrupts is one reason to have a coprocessor.

Also I am not assuming that halting these devices is free.

It is also about an OS *by design* discouraging people from putting code
which requires a large state save and restore into latency sensitive
device drivers that can affect the whole system performance.

If you really insist on using SIMD in a driver then
(a) don't put it in an ISR, put it in a post routine,
(b) use utility routines to manually save and restore that state.

>> I would not assume that anything other than integer registers would be
>> available in an ISR.
>
> This is quite reasonable: as long as you have a sufficient number that
> the ISR can be written in some HLL without a bunch of flags to the
> compiler.

Yes, an OS can have a different ABI for ISR routines and everything else.
ISR level routines would have a different declaration as 99% of the time
a small save/restore set is sufficient.

>> In a post processing DPC/SoftIrq routine it might be possible but again
>> there can be limitations. What you don't ever want to happen is to hang
>> the cpu waiting to sync with a piece of hardware so you can save its
>> state,
>> as might happen if it was a co-processor. You also don't want to have to
>> save any state just in case a post routine might want to do something,
>> but rather save/restore the state on demand and just what is needed.
>> So it really depends on the device and the platform.
>
> As long as there is not more than one flag to clue the compiler in,
> I am on board.

I would do it with declarations (routine attributes) as it is less
error prone, just like MS C has stdcall, cdecl calling conventions.

void __isrcall MyDeviceIsr (IoDevice_t *dev, etc);

This would just be for an ISR routine and the few routines it calls.
The driver post routines could use a standard ABI.

The isrcall attribute changes the ABI to be R0:R7 are not preserved,
R8:R31 are preserved. Also there is no need for a frame pointer
as variable allocations are not allowed, neither are exceptions.
It could also do things like change the stack pointer to be in
R7 instead of R31 (just pointing out the possibilities).

The interrupt prologue saves R0:R7 and loops calling the ISR for
each device receiving an interrupt at that interrupt priority level.
After all are serviced it checks if it needs post processing.
If not then it executes the epilogue to restore R0:R7 and REI's.
If it does then it saves R8:R15 to comply with the standard ABI
and jumps into the OS Dispatcher which flushes the post routines.
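
The prologue/epilogue sequence above can be sketched in plain C as a model. This is only an illustration under stated assumptions: `cpu_regs`, `IsrFrame`, `interrupt_entry`, the `wants_post` flag and the two demo ISRs are hypothetical names, the register file is an ordinary array, and `__isrcall` itself is left out because it is a proposed attribute, not something an existing compiler provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct IoDevice IoDevice_t;
typedef void (*IsrFn)(IoDevice_t *dev);   /* would be __isrcall in the proposal */

struct IoDevice {
    IsrFn    isr;
    bool     pending;      /* device is interrupting at this priority level */
    bool     wants_post;   /* ISR asked for deferred (post) processing */
    unsigned serviced;
};

typedef struct { uint64_t r[32]; } Regs;
Regs cpu_regs;             /* hypothetical model of the register file */

typedef struct {
    uint64_t saved_lo[8];  /* R0:R7, always saved by the prologue */
    uint64_t saved_hi[8];  /* R8:R15, saved only before post processing */
    bool     hi_saved;
} IsrFrame;

/* Demo ISRs: both scribble on registers the ISR ABI does not preserve. */
void isr_quick(IoDevice_t *dev)
{
    (void)dev;
    cpu_regs.r[0] = 0xDEAD;
}

void isr_needs_post(IoDevice_t *dev)
{
    cpu_regs.r[1] = 0xBEEF;
    dev->wants_post = true;
}

/* Returns true when control would pass to the OS dispatcher. */
bool interrupt_entry(IsrFrame *frame, IoDevice_t *devs[], size_t ndevs)
{
    bool need_post = false;

    /* Prologue: save only R0:R7. */
    memcpy(frame->saved_lo, &cpu_regs.r[0], sizeof frame->saved_lo);
    frame->hi_saved = false;

    /* Call the ISR of every device interrupting at this level. */
    for (size_t i = 0; i < ndevs; i++) {
        IoDevice_t *dev = devs[i];
        if (!dev->pending)
            continue;
        assert(dev->isr != NULL);
        dev->isr(dev);
        dev->pending = false;
        dev->serviced++;
        if (dev->wants_post)
            need_post = true;
    }

    if (!need_post) {
        /* Epilogue: restore R0:R7, then return from interrupt. */
        memcpy(&cpu_regs.r[0], frame->saved_lo, sizeof frame->saved_lo);
        return false;
    }

    /* Post processing: also save R8:R15 before entering standard-ABI code. */
    memcpy(frame->saved_hi, &cpu_regs.r[8], sizeof frame->saved_hi);
    frame->hi_saved = true;
    return true;
}
```

In the common case only eight registers are copied each way; the second block of eight is touched only on the path that hands off to the dispatcher.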

Re: Interrupts on the Concertina II

<mrwrN.323723$p%Mb.172024@fx15.iad>

https://news.novabbs.org/devel/article-flat.php?id=37026&group=comp.arch#37026

Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx15.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Interrupts on the Concertina II
Newsgroups: comp.arch
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad> <uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me> <ff86faba91c3898f808cce78672bb058@www.novabbs.org> <uoka6h$dlog$1@dont-email.me> <55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org>
Lines: 79
Message-ID: <mrwrN.323723$p%Mb.172024@fx15.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Mon, 22 Jan 2024 16:09:54 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Mon, 22 Jan 2024 16:09:54 GMT
X-Received-Bytes: 4164
 by: Scott Lurndal - Mon, 22 Jan 2024 16:09 UTC

mitchalsup@aol.com (MitchAlsup1) writes:
>BGB wrote:
>

>> Much like with MMU:
>> Only the base level needs to actually handle TLB miss events, and
>> everything else (nested translation, etc), can be left to software
>> emulation.
>
>Name a single ISA that fakes the TLB ?? (and has an MMU)

MIPS?

>> Presumably, each core gets its own ISR stack, which should not have any
>> reason to need to interact with each other.
>
>I presume an interrupt can be serviced by any number of cores.

Or restricted to a specific set of cores (i.e. those currently
owned by the target guest).

The guest OS will generally specify the target virtual core (or set of cores)
for a specific interrupt. The Hypervisor and/or hardware needs
to deal with the case where the interrupt arrives while the target
guest core isn't currently scheduled on a physical core (and poke
the kernel to schedule the guest optionally). Such as recording
the pending interrupt and optionally notifying the hypervisor that
there is a pending guest interrupt so it can schedule the guest
core(s) on physical cores to handle the interrupt.
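
A minimal C sketch of that routing decision, purely as a model: the `VCpu` structure, its fields and both function names are hypothetical, and a real hypervisor or interrupt controller would keep this state in hardware tables or per-vCPU control blocks instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-virtual-core bookkeeping. */
typedef struct {
    bool     resident;        /* currently scheduled on a physical core? */
    uint64_t pending;         /* one bit per virtual interrupt number 0..63 */
    unsigned injected;        /* interrupts delivered directly to the guest */
    bool     wake_requested;  /* hypervisor was asked to schedule this vCPU */
} VCpu;

typedef enum { DELIVERED_DIRECT, RECORDED_PENDING } Delivery;

/* Route one guest interrupt: inject if resident, otherwise record it. */
Delivery route_guest_interrupt(VCpu *v, unsigned virq, bool notify_hypervisor)
{
    assert(virq < 64);
    if (v->resident) {
        v->injected++;
        return DELIVERED_DIRECT;
    }
    v->pending |= (uint64_t)1 << virq;
    if (notify_hypervisor)
        v->wake_requested = true;   /* optional poke to schedule the guest */
    return RECORDED_PENDING;
}

/* Called when the vCPU is scheduled in; returns the interrupts to replay. */
uint64_t vcpu_schedule_in(VCpu *v)
{
    uint64_t to_replay = v->pending;
    v->pending = 0;
    v->wake_requested = false;
    v->resident = true;
    return to_replay;
}
```

The `notify_hypervisor` argument models the "optionally" in the description: a low-urgency interrupt can simply wait in `pending` until the guest core is next scheduled.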

>I presume that there are a vast number of devices. Each device assigned
>to a few GuestOSs.

Or, with SR-IOV, virtual functions are assigned to specific guests
and all interrupts are MSI-X messages from the device to the
interrupt controller (LAPIC, GIC, etc).

Dealing with inter-processor interrupts in a multicore guest can also
be tricky; either trapped by the hypervisor or there must be hardware
support in the interrupt controller to notify the hypervisor that a pending
guest IPI interrupt has arrived. ARM started with the former behavior, but
added a mechanism to handle direct injection of interprocessor interrupts
by the guest, without hypervisor intervention (assuming the guest core
is currently scheduled on a physical core, otherwise the hypervisor gets
notified that there is a pending interrupt for a non-scheduled guest
core).

>I presume the core that services the interrupt (ISR) is running the same
>GuestOS under the same HyperVisor that initiated the device.

Generally a safe assumption. Note that the guest core may not be
resident on any physical core when the guest interrupt arrives.

>I presume the core that services the interrupt was of the lowest priority
>of all the cores then running that GuestOS.
>I presume the core that services the interrupt wasted no time in doing so.
>
>And the GuestOS decides on how its ISR stack is {formatted, allocated, used,
>serviced, ...} which can be different for each GuestOS.

To a certain extent, the format of the ISR stack is hardware defined,
and the rest is completely up to the guest. ARM for example,
saves the current PC into a system register (ELR_ELx) and switches
the stack pointer. Everything else is up to the software interrupt
handler to save/restore. I see little benefit in hardware doing
any state saving other than that.

>
>If the interrupt occurs often enough to matter, its instructions, data,
>and translations will be in the cache hierarchy.

Although there has been a great deal of work mitigating the
number of interrupts (setting interrupt thresholds, RSS,
polling (DPDK, ODP), etc)

I don't see any advantages to all the fancy hardware interrupt
proposals from either of you.

Re: Interrupts on the Concertina II

<CFwrN.68012$GX69.35119@fx46.iad>

https://news.novabbs.org/devel/article-flat.php?id=37027&group=comp.arch#37027

Path: i2pn2.org!rocksolid2!news.neodome.net!tncsrv06.tnetconsulting.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx46.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
References: <uo930v$24cq0$1@dont-email.me> <c11eb332fac9497c3b5c75bfdc7c2eb3@news.novabbs.org> <uo9pa2$286lv$1@dont-email.me> <c168e8bb229ff12236468563107c8822@www.novabbs.org> <P2yqN.362492$83n7.220225@fx18.iad> <af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org>
In-Reply-To: <af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 83
Message-ID: <CFwrN.68012$GX69.35119@fx46.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Mon, 22 Jan 2024 16:25:06 UTC
Date: Mon, 22 Jan 2024 11:24:48 -0500
X-Received-Bytes: 4649
 by: EricP - Mon, 22 Jan 2024 16:24 UTC

MitchAlsup1 wrote:
> EricP wrote:
>
>> MitchAlsup1 wrote:
>>> Chris M. Thomasson wrote:
>>>> Just to be clear an interrupt occurring within the hardware
>>>> implementation of a CAS operation (e.g, lock cmpxchg over on Intel)
>>>> should not affect the outcome of the CAS. Actually, it should not
>>>> happen at all, right? CAS does not have any spurious failures.
>>>
>>> ABA failure happens BECAUSE one uses the value of data to decide if
>>> something appeared ATOMIC. The CAS instruction (itself and all variants)
>>> is ATOMIC; the setup to CAS is non-ATOMIC, because the original
>>> value
>>> to be compared was fetched without any ATOMIC indicator, and someone
>>> else
>>> can alter it before CAS. If more than 1 thread alters the location,
>>> it can (seldom) end up with the same data value as the suspended thread
>>> thought it should be.
>>>
>>> CAS is ATOMIC, the code leading to CAS was not and this opens up the
>>> hole.
>>>
>>> Note:: CAS functionality implemented with LL/SC does not suffer ABA
>>> because the core monitors the LL address until the SC is performed.
>>> It is an address-based comparison, not a data-value-based one.
>
>> Yes but an equal point of view is that LL/SC only emulates atomic and
>> uses the cache line ownership grab while "locked" to detect possible
>> interference and infer potential change.
>
> Which, BTW, opens up a different side channel ...

How so? The location has to be inside the same virtual space.

>> Note that if LL/SC is implemented with temporary line pinning
>> (as might be done to guarantee forward progress and prevent ping-pong)
>> then it cannot be interfered with, and CAS and atomic-fetch-op sequences
>> are semantically identical to the equivalent single instructions
>> (which may also be implemented with temporary line pinning if their
>> data must move from cache through the core and back).
>
> Line pinning requires a NAK in the coherence protocol. As far as I know,
> only My 66000 interconnect protocol has such a NaK.

Not necessarily, provided it is time limited (few tens of clocks).

Also I suspect the worst case latency for moving a line ownership
could be quite large (lots of queues and cache levels to traverse),
and main memory can be many hundreds of clocks away.

So the cache protocol should already be long latency tolerant
and adding some 10's of clocks shouldn't really matter.

>> Also LL/SC as implemented on Alpha, MIPS, Power, ARM, RISC-V don't allow
>> any other location loads or stores between them so really aren't useful
>> for detecting ABA because detecting it requires monitoring two memory
>> locations for change.
>
>> The classic example is the single linked list with items head->A->B->C
>> Detecting ABA requires monitoring if either head or head->Next change
>> which LL/SC cannot do as reading head->Next cancels the lock on head.
>
> Detecting ABA requires one to monitor addresses not data values.

It is a method for reading a pair of addresses, and knowing that
neither of them has changed between those two steps,
proceeding to update the first address.

It requires monitoring a first address while reading a second address,
and then updating the first address (releasing the monitor),
and using any update to the first address between those three steps to
infer there might have been a change to the second and blocking the update.

Which none of the LL/SC implementations guarantee you can do.

>> x86 has cmpxchg8b and ARM has double wide LL/SC which can be used to
>> implement CASD atomic-double-wide-compare-and-swap. The first word holds
>> the head pointer and the second word holds a generation counter whose
>> change is used to infer that head->Next might have changed.
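
The generation-counter trick can be sketched with C11 atomics. This is a minimal model under stated assumptions, not anyone's production code: nodes are addressed by a 32-bit index into a hypothetical `pool` so that the {head, generation} pair fits in one 64-bit word (the same shape as a 32-bit pointer plus counter under cmpxchg8b), and all names (`pack`, `push`, `pop`, `stale_cas_is_rejected`) are invented for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define POOL_SIZE 8
#define NIL 0xFFFFFFFFu

typedef struct {
    _Atomic uint32_t next;   /* index of the next node, or NIL */
    int value;
} Node;

Node pool[POOL_SIZE];
_Atomic uint64_t head;       /* low 32 bits: top index; high 32 bits: generation */

uint64_t pack(uint32_t idx, uint32_t gen) { return ((uint64_t)gen << 32) | idx; }
uint32_t idx_of(uint64_t h) { return (uint32_t)h; }
uint32_t gen_of(uint64_t h) { return (uint32_t)(h >> 32); }

void stack_init(void) { atomic_store(&head, pack(NIL, 0)); }

void push(uint32_t idx)
{
    assert(idx < POOL_SIZE);
    uint64_t old = atomic_load(&head);
    uint64_t desired;
    do {
        atomic_store(&pool[idx].next, idx_of(old));
        desired = pack(idx, gen_of(old) + 1);   /* every update bumps the counter */
    } while (!atomic_compare_exchange_weak(&head, &old, desired));
}

uint32_t pop(void)
{
    uint64_t old = atomic_load(&head);
    uint64_t desired;
    do {
        uint32_t top = idx_of(old);
        if (top == NIL)
            return NIL;
        desired = pack(atomic_load(&pool[top].next), gen_of(old) + 1);
    } while (!atomic_compare_exchange_weak(&head, &old, desired));
    return idx_of(old);
}

/* Single-threaded replay of the ABA interleaving; returns 1 if the
 * stale compare-and-swap is rejected. */
int stale_cas_is_rejected(void)
{
    stack_init();
    push(1);                                   /* B */
    push(0);                                   /* A on top: head -> A -> B */

    /* "Thread 1" snapshots head and A's successor, then is suspended. */
    uint64_t snapshot   = atomic_load(&head);
    uint32_t stale_next = atomic_load(&pool[idx_of(snapshot)].next);

    /* "Thread 2" pops A, pops B, and pushes A back: head -> A again. */
    pop();
    pop();
    push(0);

    /* Same index on top, but the generation differs, so the CAS fails. */
    uint64_t desired = pack(stale_next, gen_of(snapshot) + 1);
    return !atomic_compare_exchange_strong(&head, &snapshot, desired);
}
```

With a CAS on the index alone, the final exchange would succeed and install B, which is no longer on the list. The counter can still wrap after 2^32 updates, so this narrows the ABA window but does not close it in principle.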

Re: Interrupts on the Concertina II

<8258c9888cf0b094bafad0bf9fba9c91@www.novabbs.org>

https://news.novabbs.org/devel/article-flat.php?id=37036&group=comp.arch#37036

Date: Mon, 22 Jan 2024 19:15:49 +0000
Subject: Re: Interrupts on the Concertina II
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$7Ik0O4od6DCKU5ktwJcJRejzQv0/fZiKqrEI2NiOqUmC98PiWg4GK
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad> <uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me> <ff86faba91c3898f808cce78672bb058@www.novabbs.org> <uoka6h$dlog$1@dont-email.me> <55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org> <mrwrN.323723$p%Mb.172024@fx15.iad>
Organization: Rocksolid Light
Message-ID: <8258c9888cf0b094bafad0bf9fba9c91@www.novabbs.org>
 by: MitchAlsup1 - Mon, 22 Jan 2024 19:15 UTC

Scott Lurndal wrote:

> mitchalsup@aol.com (MitchAlsup1) writes:
>>BGB wrote:
>>

>>> Much like with MMU:
>>> Only the base level needs to actually handle TLB miss events, and
>>> everything else (nested translation, etc), can be left to software
>>> emulation.
>>
>>Name a single ISA that fakes the TLB ?? (and has an MMU)

> MIPS?

Even R2000 has a TLB, it is a SW serviced TLB, but the "zero overhead
on hit" part is present.

>>> Presumably, each core gets its own ISR stack, which should not have any
>>> reason to need to interact with each other.
>>
>>I presume an interrupt can be serviced by any number of cores.

> Or restricted to a specific set of cores (i.e. those currently
> owned by the target guest).

Even that gets tricky when you (or the OS) virtualize cores.

> The guest OS will generally specify the target virtual core (or set of cores)

Yes, set of cores.

> for a specific interrupt. The Hypervisor and/or hardware needs
> to deal with the case where the interrupt arrives while the target
> guest core isn't currently scheduled on a physical core (and poke
> the kernel to schedule the guest optionally). Such as recording
> the pending interrupt and optionally notifying the hypervisor that
> there is a pending guest interrupt so it can schedule the guest
> core(s) on physical cores to handle the interrupt.

That is the routing I was talking about.

>>I presume that there are a vast number of devices. Each device assigned
>>to a few GuestOSs.

> Or, with SR-IOV, virtual functions are assigned to specific guests
> and all interrupts are MSI-X messages from the device to the
> interrupt controller (LAPIC, GIC, etc).

In my case, the interrupt controller merely sets bits in the interrupt
table; each watching core watches for changes to its pending interrupt
register (64-bits). Said messages come up from PCIe as MSI-X messages,
and are directed to the interrupt controller over in the Memory Controller
(L3).
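
A minimal C model of that arrangement, with loud caveats: the post above only says that the controller sets bits and that cores watch a 64-bit pending register. The `InterruptTableEntry` type, the "higher bit means higher priority" ordering, and the claim-by-compare-and-swap step are assumptions added so the sketch is complete; they are not a description of the actual My 66000 mechanism.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical: one 64-bit pending word; bit n stands for priority level n. */
typedef struct {
    _Atomic uint64_t pending;
} InterruptTableEntry;

/* Controller side: an MSI-X style message arrives for `level`. */
void controller_post(InterruptTableEntry *e, unsigned level)
{
    assert(level < 64);
    atomic_fetch_or(&e->pending, (uint64_t)1 << level);
}

/* Core side: claim the highest pending level above the level the core is
 * running at (-1 = running nothing prioritized). Returns the claimed level,
 * or -1 if nothing outranks the current work. */
int core_claim(InterruptTableEntry *e, int current_level)
{
    uint64_t seen = atomic_load(&e->pending);
    while (seen != 0) {
        int level = 63;
        while (((seen >> level) & 1) == 0)
            level--;                      /* find the highest set bit */
        if (level <= current_level)
            return -1;
        uint64_t without = seen & ~((uint64_t)1 << level);
        /* Assumed claim step: only one core wins the exchange. */
        if (atomic_compare_exchange_weak(&e->pending, &seen, without))
            return level;
        /* On failure `seen` holds the fresh value; loop and re-evaluate. */
    }
    return -1;
}
```

Because posting is a single atomic OR, a device interrupt and an inter-processor interrupt look identical to the watching core, which matches the "no different than a device interrupt" remark further down.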

> Dealing with inter-processor interrupts in a multicore guest can also
> be tricky;

Core sends MSI-X message to interrupt controller and the rest happens
no different than a device interrupt.

> either trapped by the hypervisor or there must be hardware
> support in the interrupt controller to notify the hypervisor that a pending
> guest IPI interrupt has arrived. ARM started with the former behavior, but
> added a mechanism to handle direct injection of interprocessor interrupts
> by the guest, without hypervisor intervention (assuming the guest core
> is currently scheduled on a physical core, otherwise the hypervisor gets
> notified that there is a pending interrupt for a non-scheduled guest
> core).

>>I presume the core that services the interrupt (ISR) is running the same
>>GuestOS under the same HyperVisor that initiated the device.

> Generally a safe assumption. Note that the guest core may not be
> resident on any physical core when the guest interrupt arrives.

Which is why its table has to be present at all times--even if the threads
are not. When one or more threads from that GuestOS are activated, the
pending interrupt will be serviced.

>>I presume the core that services the interrupt was of the lowest priority
>>of all the cores then running that GuestOS.
>>I presume the core that services the interrupt wasted no time in doing so.
>>
>>And the GuestOS decides on how its ISR stack is {formatted, allocated, used,
>>serviced, ...} which can be different for each GuestOS.

> To a certain extent, the format of the ISR stack is hardware defined,
> and the rest is completely up to the guest. ARM for example,
> saves the current PC into a system register (ELR_ELx) and switches
> the stack pointer. Everything else is up to the software interrupt
> handler to save/restore. I see little benefit in hardware doing
> any state saving other than that.

>>
>>If the interrupt occurs often enough to matter, its instructions, data,
>>and translations will be in the cache hierarchy.

> Although there has been a great deal of work mitigating the
> number of interrupts (setting interrupt thresholds, RSS,
> polling (DPDK, ODP), etc)

> I don't see any advantages to all the fancy hardware interrupt
> proposals from either of you.

I understand.

Re: Interrupts on the Concertina II

<3c40b2226d9cdb889517e7d39b391cdf@www.novabbs.org>

https://news.novabbs.org/devel/article-flat.php?id=37037&group=comp.arch#37037

Date: Mon, 22 Jan 2024 19:19:55 +0000
Subject: Re: Interrupts on the Concertina II
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$ukOmr8fGdndfFry95paDcOtLw7pdybdzi0pLs0M9Xy75of0/EOh2K
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <uo930v$24cq0$1@dont-email.me> <c11eb332fac9497c3b5c75bfdc7c2eb3@news.novabbs.org> <uo9pa2$286lv$1@dont-email.me> <c168e8bb229ff12236468563107c8822@www.novabbs.org> <P2yqN.362492$83n7.220225@fx18.iad> <af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org> <CFwrN.68012$GX69.35119@fx46.iad>
Organization: Rocksolid Light
Message-ID: <3c40b2226d9cdb889517e7d39b391cdf@www.novabbs.org>
 by: MitchAlsup1 - Mon, 22 Jan 2024 19:19 UTC

EricP wrote:

> MitchAlsup1 wrote:
>> EricP wrote:
>>
>>> MitchAlsup1 wrote:
>>>> Chris M. Thomasson wrote:
>>>>> Just to be clear an interrupt occurring within the hardware
>>>>> implementation of a CAS operation (e.g, lock cmpxchg over on Intel)
>>>>> should not affect the outcome of the CAS. Actually, it should not
>>>>> happen at all, right? CAS does not have any spurious failures.
>>>>
>>>> ABA failure happens BECAUSE one uses the value of data to decide if
>>>> something appeared ATOMIC. The CAS instruction (itself and all variants)
>>>> is ATOMIC; the setup to CAS is non-ATOMIC, because the original
>>>> value
>>>> to be compared was fetched without any ATOMIC indicator, and someone
>>>> else
>>>> can alter it before CAS. If more than 1 thread alters the location,
>>>> it can (seldom) end up with the same data value as the suspended thread
>>>> thought it should be.
>>>>
>>>> CAS is ATOMIC, the code leading to CAS was not and this opens up the
>>>> hole.
>>>>
>>>> Note:: CAS functionality implemented with LL/SC does not suffer ABA
>>>> because the core monitors the LL address until the SC is performed.
>>>> It is an address-based comparison, not a data-value-based one.
>>
>>> Yes but an equal point of view is that LL/SC only emulates atomic and
>>> uses the cache line ownership grab while "locked" to detect possible
>>> interference and infer potential change.
>>
>> Which, BTW, opens up a different side channel ...

> How so? The location has to be inside the same virtual space.

Anything that changes the amount of time something takes opens up a
side channel. Whether data can flow through the channel is a different
story. Holding a line changes the bounds on the time taken.

>>> Note that if LL/SC is implemented with temporary line pinning
>>> (as might be done to guarantee forward progress and prevent ping-pong)
>>> then it cannot be interfered with, and CAS and atomic-fetch-op sequences
>>> are semantically identical to the equivalent single instructions
>>> (which may also be implemented with temporary line pinning if their
>>> data must move from cache through the core and back).
>>
>> Line pinning requires a NAK in the coherence protocol. As far as I know,
>> only My 66000 interconnect protocol has such a NaK.

> Not necessarily, provided it is time limited (few tens of clocks).

That I will grant.

> Also I suspect the worst case latency for moving a line ownership
> could be quite large (lots of queues and cache levels to traverse),
> and main memory can be many hundreds of clocks away.

Figure 100-cycles as a loaded system average.

> So the cache protocol should already be long latency tolerant
> and adding some 10's of clocks shouldn't really matter.

But does 100+cycles ??

>>> Also LL/SC as implemented on Alpha, MIPS, Power, ARM, RISC-V don't allow
>>> any other location loads or stores between them so really aren't useful
>>> for detecting ABA because detecting it requires monitoring two memory
>>> locations for change.
>>
>>> The classic example is the single linked list with items head->A->B->C
>>> Detecting ABA requires monitoring if either head or head->Next change
>>> which LL/SC cannot do as reading head->Next cancels the lock on head.
>>
>> Detecting ABA requires one to monitor addresses not data values.

> It is a method for reading a pair of addresses, and knowing that
> neither of them has changed between those two steps,
> proceeding to update the first address.

> It requires monitoring a first address while reading a second address,
> and then updating the first address (releasing the monitor),
> and using any update to the first address between those three steps to
> infer there might have been a change to the second and blocking the update.

> Which none of the LL/SC implementations guarantee you can do.

Right, LL/SC is a single-container synchronization model.

>>> x86 has cmpxchg8b and ARM has double wide LL/SC which can be used to
>>> implement CASD atomic-double-wide-compare-and-swap. The first word holds
>>> the head pointer and the second word holds a generation counter whose
>>> change is used to infer that head->Next might have changed.

Re: Interrupts on the Concertina II

<uomff4$s535$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=37039&group=comp.arch#37039

Path: i2pn2.org!i2pn.org!news.1d4.us!usenet.goja.nl.eu.org!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
Date: Mon, 22 Jan 2024 13:25:21 -0600
Organization: A noiseless patient Spider
Lines: 245
Message-ID: <uomff4$s535$1@dont-email.me>
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad>
<uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me>
<ff86faba91c3898f808cce78672bb058@www.novabbs.org>
<uoka6h$dlog$1@dont-email.me>
<55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org>
<mrwrN.323723$p%Mb.172024@fx15.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 22 Jan 2024 19:25:24 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c3f5f7bcf3cbeca25160f5a4be788e24";
logging-data="922725"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+wFklqcIFy3HkN7RVwm/n1"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:dstF5kkHWecDuR28sYKeFWOjhKg=
In-Reply-To: <mrwrN.323723$p%Mb.172024@fx15.iad>
Content-Language: en-US
 by: BGB - Mon, 22 Jan 2024 19:25 UTC

On 1/22/2024 10:09 AM, Scott Lurndal wrote:
> mitchalsup@aol.com (MitchAlsup1) writes:
>> BGB wrote:
>>
>
>>> Much like with MMU:
>>> Only the base level needs to actually handle TLB miss events, and
>>> everything else (nested translation, etc), can be left to software
>>> emulation.
>>
>> Name a single ISA that fakes the TLB ?? (and has an MMU)
>
> MIPS?
>

Hmm...

In my case, the use of Soft TLB is not strictly required, as the OS may
opt-in to use a hardware page-walker "if it exists", with TLB Miss
interrupts mostly happening if no hardware page walker exists (or if
there is not a valid page in the page table).

This allows the option of implementing a nested page-translation
mechanism in the top-level TLB Miss handler (with a guest able to opt
out of the hardware page walking if it wants to run its own VM, in which
case it will need to recursively emulate the TLB Miss ISRs and LDTLB
handling).

Well, or come up with a convention where the top level can see the VM
state of each guest recursively, so that the top-level ISR can
(directly) handle N levels of nested page-tables (rather than needing to
nest the TLB Miss ISR).
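
To make the "top-level ISR handles N levels directly" idea concrete, here is a small C model. It is a sketch under heavy simplifying assumptions: each level is a flat, single-level table of 16 pages, the tables are directly readable by the handler (a real nested walk would also have to translate the guest's page-table accesses), and every name (`PageTable`, `translate`, `tlb_miss_nested`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((UINT64_C(1) << PAGE_SHIFT) - 1)
#define NUM_PAGES  16
#define INVALID    UINT64_MAX

/* Hypothetical flat page table: index = source page, value = target page. */
typedef struct {
    uint64_t map[NUM_PAGES];
} PageTable;

void page_table_init(PageTable *pt)
{
    for (size_t i = 0; i < NUM_PAGES; i++)
        pt->map[i] = INVALID;
}

/* One translation stage: look up the page number, keep the page offset. */
uint64_t translate(const PageTable *pt, uint64_t addr)
{
    uint64_t page = addr >> PAGE_SHIFT;
    if (page >= NUM_PAGES || pt->map[page] == INVALID)
        return INVALID;
    return (pt->map[page] << PAGE_SHIFT) | (addr & PAGE_MASK);
}

/* Model of a top-level TLB miss handler that walks every nesting level
 * itself. tables[0] belongs to the innermost guest, tables[levels-1] to
 * the host. Returns the final physical address, or INVALID on a fault. */
uint64_t tlb_miss_nested(const PageTable *const tables[], int levels,
                         uint64_t vaddr)
{
    assert(tables != NULL && levels > 0);
    uint64_t addr = vaddr;
    for (int i = 0; i < levels; i++) {
        addr = translate(tables[i], addr);
        if (addr == INVALID)
            return INVALID;   /* would be reflected as a fault to level i */
    }
    return addr;              /* the handler would now load this into the TLB */
}
```

The alternative described above (each guest running its own emulated TLB Miss ISR) would replace the loop with one trap per level, which is where the cost of nesting the ISR comes from.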

Though, the most likely option for this would be to make the nested VM's
express their VM state using the same context structure as normal
threads/processes, and effectively canonizing these parts of the
structure as part of the ISA/ABI spec (and a guest deviating from this
structure would come at potentially significant performance cost).

May also make sense to add specific interrupts for specific privileged
instructions, such that common cases like accessing a CR or using an
LDTLB instruction can be trapped more efficiently (IOW: not needing to
disassemble the offending instruction to figure out what to do).

>
>>> Presumably, each core gets its own ISR stack, which should not have any
>>> reason to need to interact with each other.
>>
>> I presume an interrupt can be serviced by any number of cores.
>
> Or restricted to a specific set of cores (i.e. those currently
> owned by the target guest).
>
> The guest OS will generally specify the target virtual core (or set of cores)
> for a specific interrupt. The Hypervisor and/or hardware needs
> to deal with the case where the interrupt arrives while the target
> guest core isn't currently scheduled on a physical core (and poke
> the kernel to schedule the guest optionally). Such as recording
> the pending interrupt and optionally notifying the hypervisor that
> there is a pending guest interrupt so it can schedule the guest
> core(s) on physical cores to handle the interrupt.
>

I am guessing maybe my assumed approach of always routing all of the
external hardware interrupts to a specific core, is not typical then?...

Say, only Core=0 or Core=1, will get the interrupts.

*: Here, 0 vs 1 is ambiguous partly as '0' was left as a "This core",
with other cores numbered 1-15.

This scheme does work directly with < 15 cores, with trickery for 16
cores, but would require nesting trickery for more cores.

>> I presume that there are a vast number of devices. Each device assigned
>> to a few GuestOSs.
>
> Or, with SR-IOV, virtual functions are assigned to specific guests
> and all interrupts are MSI-X messages from the device to the
> interrupt controller (LAPIC, GIC, etc).
>
> Dealing with inter-processor interrupts in a multicore guest can also
> be tricky; either trapped by the hypervisor or there must be hardware
> support in the interrupt controller to notify the hypervisor that a pending
> guest IPI interrupt has arrived. ARM started with the former behavior, but
> added a mechanism to handle direct injection of interprocessor interrupts
> by the guest, without hypervisor intervention (assuming the guest core
> is currently scheduled on a physical core, otherwise the hypervisor gets
> notified that there is a pending interrupt for a non-scheduled guest
> core).
>

Yeah.

Admittedly, I hadn't really thought about or looked into these parts...

>> I presume the core that services the interrupt (ISR) is running the same
>> GuestOS under the same HyperVisor that initiated the device.
>
> Generally a safe assumption. Note that the guest core may not be
> resident on any physical core when the guest interrupt arrives.
>

Trying to route actual HW interrupts into virtual guest OS's seems like
a pain.

In any case, it needs to be routed to where it needs to go.

>> I presume the core that services the interrupt was of the lowest priority
>> of all the cores then running that GuestOS.
>> I presume the core that services the interrupt wasted no time in doing so.
>>
>> And the GuestOS decides on how its ISR stack is {formatted, allocated, used,
>> serviced, ...} which can be different for each GuestOS.
>
> To a certain extent, the format of the ISR stack is hardware defined,
> and the rest is completely up to the guest. ARM for example,
> saves the current PC into a system register (ELR_ELx) and switches
> the stack pointer. Everything else is up to the software interrupt
> handler to save/restore. I see little benefit in hardware doing
> any state saving other than that.
>

Mostly agreed.

If ARM goes minimal here, and pretty much nowhere else, this seems
telling...

As I see it, the main limiting factor for interrupt performance is not
the instructions to save and restore the registers, but rather the L1
misses that result from doing so.

Short of having special core-local SRAM or similar, this cost is
unavoidable.

Currently there is an SRAM region, but it is shared and in the L2 Ring,
so it will not have L2 misses, but has higher access latency than if it
were in the L1 ring.

But, it is debatable if it really actually matters, and there are
probably reasons not to have core-local memory regions.

But, compared with the RISC-V solution of doing N copies of the register
file, a core-local SRAM for the ISR stack would be cheap.

But, yeah:
Save PC;
Save any CPU flags/state;
Swap the stacks;
Set CPU state to a supervisor+ISR mode;
Branch to ISR entry point (to an offset in a vector table).

Does work, and seems pretty close to the minimum requirement.
Couldn't really think up a good way to trim it down much smaller.
At least without adding a bunch of extra wonk.
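As a rough illustration of the five steps above, here is a minimal Python model of the entry and exit sequence. It is a sketch only: the register names `spc`, `ssr` and `ssp`, and the bit positions of the mode flags, are placeholders made up for the example and are not taken from any actual register layout.

```python
from dataclasses import dataclass

# Hypothetical bit positions for the mode flags (assumptions for this sketch).
SR_SUPERVISOR = 1 << 30
SR_IN_ISR = 1 << 28


@dataclass
class Core:
    pc: int = 0    # program counter
    sr: int = 0    # status register (flags and mode bits)
    sp: int = 0    # stack pointer visible to the running code
    ssp: int = 0   # banked stack pointer that trades places with sp
    spc: int = 0   # saved PC
    ssr: int = 0   # saved status register
    vbr: int = 0   # vector table base


def isr_enter(core, vector_offset):
    core.spc = core.pc                      # save PC
    core.ssr = core.sr                      # save CPU flags/state
    core.sp, core.ssp = core.ssp, core.sp   # swap the stacks
    core.sr |= SR_SUPERVISOR | SR_IN_ISR    # supervisor + ISR mode
    core.pc = core.vbr + vector_offset      # branch into the vector table


def isr_return(core):
    core.pc = core.spc                      # restore PC
    core.sr = core.ssr                      # restore flags and mode
    core.sp, core.ssp = core.ssp, core.sp   # swap the stacks back


core = Core(pc=0x1000, sr=0x3, sp=0x8000, ssp=0xF000, vbr=0x100)
isr_enter(core, 0x18)
in_isr = (core.pc, core.sp)
isr_return(core)
```

Nothing in the model touches memory on entry, which is the point: the general-purpose registers are left for the handler to save, and that is where the L1 misses come from.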

In HW, there are effectively two stack-pointer registers, which
swap places on ISR entry/exit (currently by renumbering the registers in
the decoder).

Can't really get rid of the stack-swap without adding considerably more
wonk to the ISR handling mechanism (if the ISR entry point has 0 free
registers, and no usable stack pointer, well then, we have a bit more of
a puzzle...).

So, a mechanism to swap a pair of stack-pointer registers seemed like a
necessary evil.

With a Soft-TLB, it is also basically required to fall back to physical
addressing for ISR's (and with HW page-walking, if virtual-memory could
exist in ISRs, it would likely be necessary to jump over to a different
set of page-tables from the usermode program).

>
>>
>> If the interrupt occurs often enough to matter, its instructions, data,
>> and translations will be in the cache hierarchy.
>
> Although there has been a great deal of work mitigating the
> number of interrupts (setting interrupt thresholds, RSS,
> polling (DPDK, ODP), etc)
>
> I don't see any advantages to all the fancy hardware interrupt
> proposals from either of you.

?...

In my case, I had not been arguing for any fancy interrupt handling in
hardware...

The most fancy part of my interrupt mechanism, is that one can encode
the ID of a core into the value passed to a "TRAPA", and it will
redirect the interrupt to that specific core.

But, this mechanism currently has the limitations of a 4-bit field, so
going beyond ~ 15 cores is going to require a nesting scheme and
bouncing IPI's across multiple cores.

Though, if needed, I could tweak the format slightly in this case, and
maybe expand the Core-ID for IPI's to 8-bits, albeit limiting it to 16
unique IPI interrupt types.

Or, an intermediate would be 6-bit, and then require nesting for more
than 63 cores.
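The trade-off can be tabulated with a few lines of Python. This is a sketch under one assumption: that the Core-ID and the interrupt type share a fixed 12-bit budget, which is what the numbers above imply (8 bits of Core-ID leaving 16 types). Core-ID 0 is treated as "this core", so it does not count as an addressable target.

```python
PAYLOAD_BITS = 12  # assumed budget shared by Core-ID and interrupt type


def ipi_budget(core_id_bits):
    """Return (directly addressable cores, unique IPI types) for a given split."""
    type_bits = PAYLOAD_BITS - core_id_bits
    addressable_cores = (1 << core_id_bits) - 1  # ID 0 means "this core"
    return addressable_cores, 1 << type_bits


for bits in (4, 6, 8):
    cores, types = ipi_budget(bits)
    print(f"{bits}-bit Core-ID: {cores} cores, {types} IPI types")
```

Under that assumption the three options come out as 15 cores with 256 types, 63 cores with 64 types, and 255 cores with 16 types.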

Doesn't matter for an FPGA, as with the BJX2 Core, I am mostly limited
to 1 or 2 cores on "consumer grade" FPGAs (all of the FPGA's that could
fit more than two cores; well, I can no longer use free Vivado).

In theory, could fit a quad-core on a Kintex-325T that I got off
AliExpress (and probably run at a higher clock-speed as well), but,
can't exactly use this FPGA in the free version of Vivado (and the
open-source tools both didn't work for me, and put up some "major red
flags" regarding their reverse engineering strategies; so even if the
tools did work, using them to generate bitstreams for a Kintex-325T or
similar would be legally suspect).


Re: Interrupts on the Concertina II

<uomj52$sk3o$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=37043&group=comp.arch#37043
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
Date: Mon, 22 Jan 2024 12:28:16 -0800
Organization: A noiseless patient Spider
Lines: 88
Message-ID: <uomj52$sk3o$1@dont-email.me>
References: <uo930v$24cq0$1@dont-email.me>
<c11eb332fac9497c3b5c75bfdc7c2eb3@news.novabbs.org>
<uo9pa2$286lv$1@dont-email.me>
<c168e8bb229ff12236468563107c8822@www.novabbs.org>
<P2yqN.362492$83n7.220225@fx18.iad>
<af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org>
<CFwrN.68012$GX69.35119@fx46.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 22 Jan 2024 20:28:18 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f5efac9e35c211335209a589d0078cec";
logging-data="938104"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+3I6FyUEo+HS3DizLSbaFn+HZbT/9079Y="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:q1C7Nm22Snwy1QhHbOsfbCfdY1o=
Content-Language: en-US
In-Reply-To: <CFwrN.68012$GX69.35119@fx46.iad>
 by: Chris M. Thomasson - Mon, 22 Jan 2024 20:28 UTC

On 1/22/2024 8:24 AM, EricP wrote:
> MitchAlsup1 wrote:
>> EricP wrote:
>>
>>> MitchAlsup1 wrote:
>>>> Chris M. Thomasson wrote:
>>>>> Just to be clear an interrupt occurring within the hardware
>>>>> implementation of a CAS operation (e.g, lock cmpxchg over on Intel)
>>>>> should not affect the outcome of the CAS. Actually, it should not
>>>>> happen at all, right? CAS does not have any spurious failures.
>>>>
>>>> ABA failure happens BECAUSE one uses the value of data to decide if
>>>> something appeared ATOMIC. The CAS instruction (itself and all
>>>> variants)
>>>> is ATOMIC, the setup to CAS is non-ATOMIC, because the original
>>>> value
>>>> to be compared was fetched without any ATOMIC indicator, and someone
>>>> else
>>>> can alter it before CAS. If more than 1 thread alters the location,
>>>> it can (seldom) end up with the same data value as the suspended thread
>>>> thought it should be.
>>>>
>>>> CAS is ATOMIC, the code leading to CAS was not and this opens up the
>>>> hole.
>>>>
>>>> Note:: CAS functionality implemented with LL/SC does not suffer ABA
>>>> because the core monitors the LL address until the SC is performed.
>>>> It is an address-based comparison, not a data-value-based one.
>>
>>> Yes but an equal point of view is that LL/SC only emulates atomic and
>>> uses the cache line ownership grab while "locked" to detect possible
>>> interference and infer potential change.
>>
>> Which, BTW, opens up a different side channel ...
>
> How so? The location has to be inside the same virtual space.
>
>>> Note that if LL/SC is implemented with temporary line pinning
>>> (as might be done to guarantee forward progress and prevent ping-pong)
>>> then it cannot be interfered with, and CAS and atomic-fetch-op sequences
>>> are semantically identical to the equivalent single instructions
>>> (which may also be implemented with temporary line pinning if their
>>> data must move from cache through the core and back).
>>
>> Line pinning requires a NAK in the coherence protocol. As far as I know,
>> only My 66000 interconnect protocol has such a NaK.
>
> Not necessarily, provided it is time limited (few tens of clocks).
>
> Also I suspect the worst case latency for moving a line ownership
> could be quite large (lots of queues and cache levels to traverse),
> and main memory can be many hundreds of clocks away.
>
> So the cache protocol should already be long latency tolerant
> and adding some 10's of clocks shouldn't really matter.
>
>>> Also LL/SC as implemented on Alpha, MIPS, Power, ARM, RISC-V don't allow
>>> any other location loads or stores between them so really aren't useful
>>> for detecting ABA because detecting it requires monitoring two memory
>>> locations for change.
>>
>>> The classic example is the single linked list with items head->A->B->C
>>> Detecting ABA requires monitoring if either head or head->Next change
>>> which LL/SC cannot do as reading head->Next cancels the lock on head.
>>
>> Detecting ABA requires one to monitor addresses not data values.
>
> It is a method for reading a pair of addresses, and knowing that
> neither of them has changed between those two steps,
> proceeding to update the first address.
>
> It requires monitoring a first address while reading a second address,
> and then updating the first address (releasing the monitor),
> and using any update to the first address between those three steps to
> infer there might have been a change to the second and blocking the update.
>
> Which none of the LL/SC guarantee you can do.
>
>>> x86 has cmpxchg8b and ARM has double wide LL/SC which can be used to
>>> implement CASD atomic-double-wide-compare-and-swap. The first word holds
>>> the head pointer and the second word holds a generation counter whose
>>> change is used to infer that head->Next might have changed.
>
>
>

I feel the need to clarify that these words need to be adjacent within
the _same_ cache line.
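To make the generation-counter idea concrete, here is a small Python model of the head->A->B->C example from the quoted text. It is only a sketch: the `Stack` class and the `cas_head_gen` helper are invented for illustration, and a lock stands in for the atomicity that a double-wide CAS instruction would provide in hardware. The model cannot show the adjacency requirement; it simply treats `(head, gen)` as one unit.

```python
import threading


class Node:
    def __init__(self, name, next=None):
        self.name = name
        self.next = next


class Stack:
    """(head, gen) models two adjacent words updated by one double-wide CAS."""

    def __init__(self, head=None):
        self.head = head
        self.gen = 0
        self._lock = threading.Lock()  # emulates the instruction's atomicity

    def cas_head_gen(self, expected_head, expected_gen, new_head):
        with self._lock:
            if self.head is expected_head and self.gen == expected_gen:
                self.head = new_head
                self.gen += 1          # every successful update bumps the counter
                return True
            return False


def push(stack, node):
    while True:
        head, gen = stack.head, stack.gen
        node.next = head
        if stack.cas_head_gen(head, gen, node):
            return


def pop(stack):
    while True:
        head, gen = stack.head, stack.gen
        if head is None:
            return None
        if stack.cas_head_gen(head, gen, head.next):
            return head


c = Node("C")
b = Node("B", c)
a = Node("A", b)
s = Stack(head=a)

# Thread 1 begins a pop and is suspended after taking its snapshot.
t1_head, t1_gen, t1_next = s.head, s.gen, s.head.next   # A, 0, B

# Thread 2 runs meanwhile: pops A, pops B, pushes A back.
pop(s)
pop(s)
push(s, a)                                               # head is A again, A.next is C

# A pointer-only compare would be fooled: the head value matches.
plain_would_succeed = s.head is t1_head

# The double-wide compare is not: the generation counter has moved on.
wide_ok = s.cas_head_gen(t1_head, t1_gen, t1_next)
```

Had the pointer-only CAS gone ahead, it would have installed B as the new head even though B is no longer on the list. With the counter included in the comparison, the stale update is rejected and thread 1 has to retry with a fresh snapshot.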

Re: Interrupts on the Concertina II

<65d2323335476993a8b1aa39720022b6@www.novabbs.org>

https://news.novabbs.org/devel/article-flat.php?id=37044&group=comp.arch#37044
Date: Mon, 22 Jan 2024 20:31:08 +0000
Subject: Re: Interrupts on the Concertina II
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$usf6IkTyAweg12AMlervW.s8yyspBdikX3QIefzDwADrjasBZhLUO
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad> <uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me> <ff86faba91c3898f808cce78672bb058@www.novabbs.org> <uoka6h$dlog$1@dont-email.me> <55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org> <mrwrN.323723$p%Mb.172024@fx15.iad> <uomff4$s535$1@dont-email.me>
Organization: Rocksolid Light
Message-ID: <65d2323335476993a8b1aa39720022b6@www.novabbs.org>
 by: MitchAlsup1 - Mon, 22 Jan 2024 20:31 UTC

BGB wrote:

> On 1/22/2024 10:09 AM, Scott Lurndal wrote:
>> mitchalsup@aol.com (MitchAlsup1) writes:

> In my case, the use of Soft TLB is not strictly required, as the OS may
> opt-in to use a hardware page-walker "if it exists", with TLB Miss
> interrupts mostly happening if no hardware page walker exists (or if
> there is not a valid page in the page table).

Has anyone done a SW refill TLB implementation that has both Hypervisor
and Supervisor page <nested> translations ??

This seems to me a bad idea as HV would end up having to manipulate
GuestOS mappings {Because you cannot allow GuestOS to see HV mappings}.

{{Aside:: At one time I was enamored with SW TLB refill and one could
reduce TLB refill penalty by allocating a "big enough" secondary hashed
TLB (1MB+). When HV + GuestOS came about, I saw the futility of it all}}

>>
>> The guest OS will generally specify the target virtual core (or set of cores)
>> for a specific interrupt. The Hypervisor and/or hardware needs
>> to deal with the case where the interrupt arrives while the target
>> guest core isn't currently scheduled on a physical core (and poke
>> the kernel to schedule the guest optionally). Such as recording
>> the pending interrupt and optionally notifying the hypervisor that
>> there is a pending guest interrupt so it can schedule the guest
>> core(s) on physical cores to handle the interrupt.
>>

> I am guessing maybe my assumed approach of always routing all of the
> external hardware interrupts to a specific core, is not typical then?...

> Say, only Core=0 or Core=1, will get the interrupts.

What do you think happens when there are thousands of cores and thousands
of disks, hundreds of Gigabit Ethernet controllers, where the number of
interrupts per second is larger than 1 or 2 cores can manage ??

<snip>

> So, a mechanism to swap a pair of stack-pointer registers seemed like a
> necessary evil.

> With a Soft-TLB, it is also basically required to fall back to physical
> addressing for ISR's (and with HW page-walking, if virtual-memory could
> exist in ISRs, it would likely be necessary to jump over to a different
> set of page-tables from the usermode program).

Danger Will Robinson, Danger

> In my case, I had not been arguing for any fancy interrupt handling in
> hardware...

In my case, MSI-X interrupts are routed to MC(L3) where each message sets
up to 2 bits, one demarking the unique interrupt, and the other merging
interrupts of a priority level into a second single bit. The setting of
this second bit is SNOOPed by cores to decide if they should attempt to
recognize an interrupt. Cores not associated with that interrupt table
do not see that interrupt; but those that are do. Thus, there are no
pre-assigned cores to service interrupts.

> The most fancy part of my interrupt mechanism, is that one can encode
> the ID of a core into the value passed to a "TRAPA", and it will
> redirect the interrupt to that specific core.

> But, this mechanism currently has the limitations of a 4-bit field, so
> going beyond ~ 15 cores is going to require a nesting scheme and
> bouncing IPI's across multiple cores.

Danger Will Robinson, Danger !!

> Though, if needed, I could tweak the format slightly in this case, and
> maybe expand the Core-ID for IPI's to 8-bits, albeit limiting it to 16
> unique IPI interrupt types.

I have 512 unique interrupts per priority level. There are 64 priority
levels.
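A toy Python model of a table with that shape (64 levels of 512 interrupts, plus one summary bit per level) may help readers follow the "up to 2 bits per message" description. It is a sketch and not the actual table format: the class and method names are invented, and the choice that a higher level number means higher priority is an assumption.

```python
PRIORITY_LEVELS = 64
INTERRUPTS_PER_LEVEL = 512


class InterruptTable:
    def __init__(self):
        self.pending = [0] * PRIORITY_LEVELS  # one 512-bit vector per level
        self.summary = 0                      # 64 bits: bit p set if level p is non-empty

    def post(self, priority, number):
        """An arriving message sets the per-interrupt bit and the per-level bit."""
        if not (0 <= priority < PRIORITY_LEVELS and 0 <= number < INTERRUPTS_PER_LEVEL):
            raise ValueError("priority or interrupt number out of range")
        self.pending[priority] |= 1 << number
        self.summary |= 1 << priority         # may already be set, hence "up to" 2 bits

    def claim(self):
        """A core that saw the summary change takes the most urgent pending interrupt."""
        if self.summary == 0:
            return None
        priority = self.summary.bit_length() - 1   # assumed: higher number wins
        vec = self.pending[priority]
        number = (vec & -vec).bit_length() - 1      # lowest-numbered pending interrupt
        vec &= vec - 1                              # clear that bit
        self.pending[priority] = vec
        if vec == 0:
            self.summary &= ~(1 << priority)        # level drained, clear summary bit
        return priority, number


table = InterruptTable()
table.post(3, 17)
table.post(40, 5)
table.post(40, 2)
first = table.claim()
```

Only the 64-bit summary needs to be watched by the cores; the 512-bit vectors are consulted once a core decides to take something.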

Re: Interrupts on the Concertina II

<apArN.40939$SyNd.29376@fx33.iad>

https://news.novabbs.org/devel/article-flat.php?id=37048&group=comp.arch#37048
Path: i2pn2.org!i2pn.org!newsfeed.endofthelinebbs.com!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx33.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Interrupts on the Concertina II
Newsgroups: comp.arch
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad> <uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me> <ff86faba91c3898f808cce78672bb058@www.novabbs.org> <uoka6h$dlog$1@dont-email.me> <55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org> <mrwrN.323723$p%Mb.172024@fx15.iad> <8258c9888cf0b094bafad0bf9fba9c91@www.novabbs.org>
Lines: 53
Message-ID: <apArN.40939$SyNd.29376@fx33.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Mon, 22 Jan 2024 20:40:38 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Mon, 22 Jan 2024 20:40:38 GMT
X-Received-Bytes: 2908
 by: Scott Lurndal - Mon, 22 Jan 2024 20:40 UTC

mitchalsup@aol.com (MitchAlsup1) writes:
>Scott Lurndal wrote:
>

>> Or restricted to a specific set of cores (i.e. those currently
>> owned by the target guest).
>
>Even that gets tricky when you (or the OS) virtualizes cores.

Oh, indeed. It's helpful to have good hardware support. The
ARM GIC, for example, helps eliminate hypervisor interaction
during normal guest interrupt handling (aside from scheduling the
guest on a host core).

>
>In my case, the interrupt controller merely sets bits in the interrupt
>table, the watching cores watch for changes to its pending interrupt
>register (64-bits). Said messages come up from PCIe as MSI-X messages,

The interrupt space for MSI-X messages is 32-bits. Implementations
may support fewer than 2**32 interrupts - ours support 2**24 distinct
interrupt vectors.

>and are directed to the interrupt controller over in the Memory Controller
>(L3).
>
>> Dealing with inter-processor interrupts in a multicore guest can also
>> be tricky;
>
>Core sends MSI-X message to interrupt controller and the rest happens
>no different than a device interrupt.

Not necessarily, particularly if the guest isn't resident on any
core at the time the interrupt is received.

>
>>>I presume the core that services the interrupt (ISR) is running the same
>>>GuestOS under the same HyperVisor that initiated the device.
>
>> Generally a safe assumption. Note that the guest core may not be
>> resident on any physical core when the guest interrupt arrives.
>
>Which is why its table has to be present at all times--even if the threads
>are not. When one or more threads from that GuestOS are activated, the
>pending interrupt will be serviced.

Yes, but the hypervisor needs to be notified by the hardware when the table
is updated and the target guest VCPU isn't currently scheduled
on any core so that it can decide to schedule the guest (which may,
for instance, have been parked because it executed a WFI, PAUSE
or MWAIT instruction).

Re: Interrupts on the Concertina II

<5OBrN.373128$83n7.109574@fx18.iad>

https://news.novabbs.org/devel/article-flat.php?id=37050&group=comp.arch#37050
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx18.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Interrupts on the Concertina II
Newsgroups: comp.arch
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad> <uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me> <ff86faba91c3898f808cce78672bb058@www.novabbs.org> <uoka6h$dlog$1@dont-email.me> <55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org> <mrwrN.323723$p%Mb.172024@fx15.iad> <uomff4$s535$1@dont-email.me>
Lines: 69
Message-ID: <5OBrN.373128$83n7.109574@fx18.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Mon, 22 Jan 2024 22:15:29 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Mon, 22 Jan 2024 22:15:29 GMT
X-Received-Bytes: 3825
 by: Scott Lurndal - Mon, 22 Jan 2024 22:15 UTC

BGB <cr88192@gmail.com> writes:
>On 1/22/2024 10:09 AM, Scott Lurndal wrote:

>I am guessing maybe my assumed approach of always routing all of the
>external hardware interrupts to a specific core, is not typical then?...
>
>Say, only Core=0 or Core=1, will get the interrupts.

Maybe on a microcontroller :-).

On a desktop or server system (particularly the latter), the kernel
may distribute interrupts however it likes. Network card RSS
(Receive Side Scaling) requires being able to distribute interrupts
over a set of (or all) cores. Any time you make a core "special"
all kinds of new usage constraints arise (not to mention reduced
fault tolerance).

>
>Trying to route actual HW interrupts into virtual guest OS's seems like
>a pain.

Check out the ARM GICv3/v4 implementation to see how it does
this. It has evolved over time to where you see it now. Originally,
they only provided a set of CPU system registers to the hypervisor
that allowed the hypervisor to inject interrupts into the guest. The
hypervisor handled all interrupts itself, then queued them
(in a set of one or more List Registers) to the guest. When the
hypervisor dispatched the guest on the core, it would get
an interrupt and read the same interrupt ack register that
the hypervisor uses but the hardware would, for the guest
access, access one of the list registers and announce that
interrupt to the guest. The guest would end the interrupt
just like a bare-metal os by writing the interrupt number
to an interrupt END system register, which would drop
the running interrupt priority (for nested interrupts).
If the interrupt was level sensitive, unmasked and the
highest priority pending interrupt, the guest would
get another interrupt (wash, rinse, repeat).

Lots of trips (even low cost on AArch64) between exception
levels.

So, they've added a capability (only for message signaled
interrupts) to deliver the MSI interrupt directly to the
guest - if the target guest core isn't resident, the hardware
will ring a doorbell for the hypervisor. Once the HV makes
the guest resident on the CPU, it will take any pending
interrupts recorded for that virtual CPU, in order of
interrupt priority.
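The resident/non-resident split can be sketched in a few lines of Python. This is a behavioural toy, not GIC register programming: `VCpu`, `Hypervisor` and `deliver_msi` are names invented for the example. It assumes a numerically lower priority value is more urgent, as on the GIC.

```python
import heapq


class VCpu:
    def __init__(self, name):
        self.name = name
        self.resident = False
        self.pending = []    # min-heap of (priority, intid) recorded while parked
        self.delivered = []  # interrupts the guest has actually taken


class Hypervisor:
    def __init__(self):
        self.doorbells = []  # names of vCPUs with work waiting

    def doorbell(self, vcpu):
        self.doorbells.append(vcpu.name)

    def make_resident(self, vcpu):
        vcpu.resident = True
        # Pending interrupts are taken in priority order once the guest runs.
        while vcpu.pending:
            vcpu.delivered.append(heapq.heappop(vcpu.pending))


def deliver_msi(hv, vcpu, priority, intid):
    if vcpu.resident:
        vcpu.delivered.append((priority, intid))        # direct, no hypervisor exit
    else:
        heapq.heappush(vcpu.pending, (priority, intid)) # record it
        hv.doorbell(vcpu)                               # and tell the hypervisor


hv = Hypervisor()
v = VCpu("vcpu0")
deliver_msi(hv, v, 0x80, 8200)   # guest parked: recorded, doorbell rung
deliver_msi(hv, v, 0x20, 8193)
hv.make_resident(v)              # more urgent 0x20 is taken first
deliver_msi(hv, v, 0x40, 8300)   # guest running: delivered directly
```

The hypervisor is only involved when the target is not running, which is where the savings over the list-register scheme come from.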

The final enhancement (GICv4.1) adds the ability to issue
virtual inter-processor interrupts between virtual CPU's
without hypervisor intervention (other than making the
guest vcpus resident on real cores).

>As I see it, the main limiting factor for interrupt performance is not
>the instructions to save and restore the registers, but rather the L1
>misses that result from doing so.

If the interrupt is happening at a rate where the L1
cache miss is significant, then the device probably needs to be
redesigned to reduce the number of interrupts (e.g.
interrupt coalescing), use DMA, or do more work per interrupt,
or poll the completion status from the driver rather
than waiting for the interrupt.
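For readers unfamiliar with interrupt coalescing, a minimal Python sketch of the idea follows. The device raises one interrupt per batch of completions, or after a maximum delay so that a lone completion is not stranded. The class name, the parameters and the abstract time units are all invented for the example; real devices expose this through their own registers.

```python
class Coalescer:
    def __init__(self, max_events, max_delay):
        self.max_events = max_events  # raise after this many completions
        self.max_delay = max_delay    # or this long after the first pending one
        self.count = 0
        self.first_time = None
        self.interrupts = 0

    def _fire(self):
        self.interrupts += 1
        self.count = 0
        self.first_time = None

    def completion(self, now):
        """Record one completed operation at time `now`."""
        if self.count == 0:
            self.first_time = now
        self.count += 1
        if self.count >= self.max_events:
            self._fire()

    def tick(self, now):
        """Timer check: flush a partial batch that has waited long enough."""
        if self.count and now - self.first_time >= self.max_delay:
            self._fire()


dev = Coalescer(max_events=4, max_delay=10)
for t in range(8):
    dev.completion(t)    # eight completions, two interrupts
dev.completion(20)       # a lone completion waits for the timer
dev.tick(25)             # too early
dev.tick(30)             # delay reached, third interrupt
```

Nine completions cost three interrupts here instead of nine, at the price of some added latency for the last one.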

Re: Interrupts on the Concertina II

<uompig$tt92$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=37051&group=comp.arch#37051
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bohannonindustriesllc@gmail.com (BGB-Alt)
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
Date: Mon, 22 Jan 2024 16:17:52 -0600
Organization: A noiseless patient Spider
Lines: 192
Message-ID: <uompig$tt92$1@dont-email.me>
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad>
<uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me>
<ff86faba91c3898f808cce78672bb058@www.novabbs.org>
<uoka6h$dlog$1@dont-email.me>
<55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org>
<mrwrN.323723$p%Mb.172024@fx15.iad> <uomff4$s535$1@dont-email.me>
<65d2323335476993a8b1aa39720022b6@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 22 Jan 2024 22:17:52 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="60cc852d543f0e236045ec47cc8fe04a";
logging-data="980258"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1813bJV2XEQWRha4x8Q9LRmYJV6/s1lFOI="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:HlHDHJaJh/kezu1bC61sKpUYT20=
Content-Language: en-US
In-Reply-To: <65d2323335476993a8b1aa39720022b6@www.novabbs.org>
 by: BGB-Alt - Mon, 22 Jan 2024 22:17 UTC

On 1/22/2024 2:31 PM, MitchAlsup1 wrote:
> BGB wrote:
>
>> On 1/22/2024 10:09 AM, Scott Lurndal wrote:
>>> mitchalsup@aol.com (MitchAlsup1) writes:
>
>
>> In my case, the use of Soft TLB is not strictly required, as the OS
>> may opt-in to use a hardware page-walker "if it exists", with TLB Miss
>> interrupts mostly happening if no hardware page walker exists (or if
>> there is not a valid page in the page table).
>
> Has anyone done a SW refill TLB implementation that has both Hypervisor
> and Supervisor page <nested> translations ??
>
> This seems to me a bad idea as HV would end up having to manipulate
> GuestOS mappings {Because you cannot allow GuestOS to see HV mappings}.
>
> {{Aside:: At one time I was enamored with SW TLB refill and one could
> reduce TLB refill penalty by allocating a "big enough" secondary hashed
> TLB (1MB+). When HV + GuestOS came about, I saw the futility of it all}}
>

One would need to standardize on parts of the ABI, and treat them like
one would hardware-level constraints, to allow the top-level HV to cross
multiple levels of mapping.

Or, suffer the performance overhead of using multiple levels of emulation.
One of the two...

Granted, in simple cases, things like DOSBox and QEMU do surprisingly
well on Windows, despite the overheads of these using software emulation
rather than fancy hardware virtualization (so, the HW native stuff may
be overrated).

But, granted, being on a machine where the actual hardware
virtualization apparently doesn't work for some unknown reason, these
are basically the only real option (as most of the other VMs have,
annoyingly, gone over to requiring that the hardware virtualization
"actually work"...).

>>>
>>> The guest OS will generally specify the target virtual core (or set
>>> of cores)
>>> for a specific interrupt.   The Hypervisor and/or hardware needs
>>> to deal with the case where the interrupt arrives while the target
>>> guest core isn't currently scheduled on a physical core (and poke
>>> the kernel to schedule the guest optionally).   Such as recording
>>> the pending interrupt and optionally notifying the hypervisor that
>>> there is a pending guest interrupt so it can schedule the guest
>>> core(s) on physical cores to handle the interrupt.
>>>
>
>> I am guessing maybe my assumed approach of always routing all of the
>> external hardware interrupts to a specific core, is not typical then?...
>
>> Say, only Core=0 or Core=1, will get the interrupts.
>
> What do you think happens when there are thousands of cores and thousands
> of disks, hundreds of Gigabit Ethernet controllers, where the number of
> interrupts per second is larger than 1 or 2 cores can manage ??
>

Dunno.

Most of the systems I was familiar with were focused on using the cores
for large amounts of floating-point or integer math, rather than huge
amounts of IO. Had not considered IO intensive cases.

But, in this case, it makes more sense to have 1 core run the OS and
similar, and most of the other cores are left to grind away at doing
math or similar.

Possibly with the dataset carved up along a grid and each thread
primarily working on its own local section of the grid, ...

> <snip>
>
>> So, a mechanism to swap a pair of stack-pointer registers seemed like
>> a necessary evil.
>
>
>> With a Soft-TLB, it is also basically required to fall back to
>> physical addressing for ISR's (and with HW page-walking, if
>> virtual-memory could exist in ISRs, it would likely be necessary to
>> jump over to a different set of page-tables from the usermode program).
>
> Danger Will Robinson, Danger
>
>
>> In my case, I had not been arguing for any fancy interrupt handling in
>> hardware...
>
> In my case, MSI-X interrupts are routed to MC(L3) where each message sets
> up to 2 bits, one demarking the unique interrupt, and the other merging
> interrupts of a priority level into a second single bit. The setting of
> this second bit is SNOOPed by cores to decide if they should attempt to
> recognize an interrupt. Cores not associated with that interrupt table
> do not see that interrupt; but those that are do. Thus, there are no
> pre-assigned cores to service interrupts.
>

Hmm...

I guess it could be possible there could be a hardware "interrupt
router", where it tries to figure out which device sent the interrupt
and where it needs to go, but dunno...

AFAIK, traditional was more like:
Hardware sends an interrupt, a chip records this in an internal flag
register;
This asserts an IRQ pin or similar on the CPU, somehow this is signaled to
one of several interrupt handlers, which would then access the chip to
figure out which actual IRQ had generated the interrupt.

Well, or my scheme, where a 16-bit ID is used, 8 bits of which was left
for the device to signal which device it was, say:
Cnxx:
n: Core that interrupt is directed to
Typically 0 for HW, maps to Core 1.
xx: Magic number to categorize IRQ or self-identify, 00..FF.
With Dnxx having been left for IPI.

Say:
0xxx-7xxx: Non-interrupt signals.
8nxx: Fault
9nxx: Reserved
Anxx: TLB
Bnxx: Reserved
Cnxx: Interrupt (Hardware)
Dnxx: Interrupt (Inter-Processor)
Enxx: Syscall
Fnxx: CPU Internal Interrupt (RTE and similar)

There is another 48 bits combined to this, generally an address
associated with the fault in question (such as the faulted memory
address, or where the miss occurred for a TLB Miss).
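A short Python sketch of decoding such a value, for illustration. The class table and the `n`/`xx` split follow the list above; placing the 16-bit code in the top 16 bits of a 64-bit word, with the 48-bit address below it, is an assumption made for the example, as are the function and key names.

```python
CLASS_NAMES = {
    0x8: "Fault",
    0x9: "Reserved",
    0xA: "TLB",
    0xB: "Reserved",
    0xC: "Interrupt (Hardware)",
    0xD: "Interrupt (Inter-Processor)",
    0xE: "Syscall",
    0xF: "CPU Internal Interrupt",
}


def decode(value):
    """Split a 64-bit exception value into class, core, tag and address."""
    code = (value >> 48) & 0xFFFF          # assumed: code sits in the top 16 bits
    address = value & ((1 << 48) - 1)      # remaining 48 bits: associated address
    cls = code >> 12                       # top nibble selects the class
    return {
        "class": CLASS_NAMES.get(cls, "Non-interrupt signal"),  # 0xxx-7xxx
        "core": (code >> 8) & 0xF,         # n: target core, 0 = "this core"
        "tag": code & 0xFF,                # xx: category or self-identification
        "address": address,
    }


hw_irq = decode((0xC012 << 48) | 0x12345678)
tlb_miss = decode((0xA300 << 48) | 0x7FFF0000)
```

Here `hw_irq` decodes as a hardware interrupt with core field 0 and tag 0x12, and `tlb_miss` as a TLB event directed at core 3 with the missed address in the low 48 bits.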

>> The most fancy part of my interrupt mechanism, is that one can encode
>> the ID of a core into the value passed to a "TRAPA", and it will
>> redirect the interrupt to that specific core.
>
>
>> But, this mechanism currently has the limitations of a 4-bit field, so
>> going beyond ~ 15 cores is going to require a nesting scheme and
>> bouncing IPI's across multiple cores.
>
> Danger Will Robinson, Danger !!
>

The status code was 16 bits, with the 64-bit value split between a 48-bit
address and the 16-bit status code. Can't really make the field bigger
short of redesigning some things.

>> Though, if needed, I could tweak the format slightly in this case, and
>> maybe expand the Core-ID for IPI's to 8-bits, albeit limiting it to 16
>> unique IPI interrupt types.
>
> I have 512 unique interrupts per priority level. There are 64 priority
> levels.

As part of the design, there were 4 priority levels.
At present, this is more interpreted as two levels though, functioning
more like CLI/STI on x86.

Though, this part of the design was also carried over from SuperH...
The general layout of the SR register, as can be noted, is fairly
similar to the layout used in SuperH (the bits which indicate the
location of User/Supervisor, whether it is in an ISR, and the SR.S /
SR.T and interrupt bits, etc, were mostly all carried over from SuperH).

The design of the interrupt mechanism is similar as well, just I moved
some stuff from MMIO to CRs, and made some design simplifications (vs SH-4).

Say, collapsing down the VBR relative entry points to be 8 bytes apart,
rather than some more ad-hoc offsets. I am guessing originally they had
hard-coded the VBR offsets relative to the layout of their Boot ROM or
something. In this case, 8 bytes was enough to encode a branch to
wherever the entry point was.

....

Re: Interrupts on the Concertina II

<uomppv$ttrk$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=37053&group=comp.arch#37053
Path: i2pn2.org!i2pn.org!paganini.bofh.team!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
Date: Mon, 22 Jan 2024 14:21:49 -0800
Organization: A noiseless patient Spider
Lines: 38
Message-ID: <uomppv$ttrk$1@dont-email.me>
References: <uo930v$24cq0$1@dont-email.me>
<c11eb332fac9497c3b5c75bfdc7c2eb3@news.novabbs.org>
<uo9pa2$286lv$1@dont-email.me>
<c168e8bb229ff12236468563107c8822@www.novabbs.org>
<P2yqN.362492$83n7.220225@fx18.iad>
<af6c065ea55651f0faa67d1cdb768cb0@www.novabbs.org>
<uoki5o$ehdm$2@dont-email.me>
<5a7861b20603040b93d45ad550d67e85@www.novabbs.org>
<uokk2j$erqm$1@dont-email.me> <uokk4v$erqm$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 22 Jan 2024 22:21:51 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f5efac9e35c211335209a589d0078cec";
logging-data="980852"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19j1xktDXSd+mCTXTweYhGd+soeC/oNXjA="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:IJPY8BfsUPi1m7FH65sMY2LE+cE=
Content-Language: en-US
In-Reply-To: <uokk4v$erqm$2@dont-email.me>
 by: Chris M. Thomasson - Mon, 22 Jan 2024 22:21 UTC

On 1/21/2024 6:33 PM, Chris M. Thomasson wrote:
> On 1/21/2024 6:31 PM, Chris M. Thomasson wrote:
>> On 1/21/2024 6:07 PM, MitchAlsup1 wrote:
>>> Chris M. Thomasson wrote:
>>>
>>>> On 1/21/2024 12:58 PM, MitchAlsup1 wrote:
>>>
>>>>> Detecting ABA requires one to monitor addresses not data values.
>>>
>>>> Not 100% true.
>>>
>>> IBM's original ABA problem was encountered when a background task
>>> (once a week or once a month) was swapped out to disk the instruction
>>> prior to CAS, and when it came back the data comparison register
>>> matched the memory data, but the value to be swapped in had no
>>> relationship with the current linked list structure. Machine crashed.
>>>
>>> Without knowing the address, how can this particular problem be
>>> rectified ??
>>
>> The version counter wrt a double wide compare and swap where:
>>
>> struct dwcas_anchor
>> {
>>      word* next;
^^^^^^^^^^^

Actually, I should say head here wrt the name of dwcas_anchor::next.
Sorry for any confusion. The head of the list would be in the main anchor.

>>      word version;
>> };
>
> sizeof(word*) == sizeof(word), sizeof(struct dwcas_anchor) ==
> sizeof(word) * 2, in this setup, and they must be contiguous.
>
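
For readers who want to see the version-counter idea in running form, here is a minimal C11 sketch of a lock-free stack whose anchor carries a version. It is a stand-in, not the double-wide CAS described above: to stay portable it packs a 32-bit node index and a 32-bit version into one 64-bit atomic word instead of using a pointer plus a word. The node pool, the NIL marker and all function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NIL       UINT32_MAX  /* "no node" marker */
#define POOL_SIZE 8           /* hypothetical fixed node pool */

typedef struct {
    _Atomic uint32_t next;    /* index of the next node, or NIL */
    int              value;
} node;

static node pool[POOL_SIZE];

/* Anchor: low 32 bits = head index, high 32 bits = version counter. */
static _Atomic uint64_t anchor = (uint64_t)NIL;

static uint64_t make_anchor(uint32_t head, uint32_t version)
{
    return ((uint64_t)version << 32) | head;
}

static void push(uint32_t idx)
{
    assert(idx < POOL_SIZE);
    uint64_t old = atomic_load(&anchor);
    uint64_t desired;
    do {
        pool[idx].next = (uint32_t)old;  /* link to the current head */
        desired = make_anchor(idx, (uint32_t)(old >> 32) + 1);
    } while (!atomic_compare_exchange_weak(&anchor, &old, desired));
}

static uint32_t pop(void)
{
    uint64_t old = atomic_load(&anchor);
    uint64_t desired;
    do {
        uint32_t head = (uint32_t)old;
        if (head == NIL)
            return NIL;                  /* stack is empty */
        /* A stale head with a matching index still fails the CAS,
           because the version half of the anchor has moved on. */
        desired = make_anchor(pool[head].next, (uint32_t)(old >> 32) + 1);
    } while (!atomic_compare_exchange_weak(&anchor, &old, desired));
    return (uint32_t)old;
}
```

The compare covers the head and the version together, so a head value that went A, B, A between the load and the CAS no longer compares equal. A 32-bit counter can in principle wrap, which is one reason the full-width version field in the double-wide layout is attractive.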

Re: Interrupts on the Concertina II

<uomq6s$ttrm$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=37054&group=comp.arch#37054

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
Date: Mon, 22 Jan 2024 14:28:42 -0800
Organization: A noiseless patient Spider
Lines: 44
Message-ID: <uomq6s$ttrm$1@dont-email.me>
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad>
<uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me>
<ff86faba91c3898f808cce78672bb058@www.novabbs.org>
<uoka6h$dlog$1@dont-email.me>
<55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org>
<mrwrN.323723$p%Mb.172024@fx15.iad> <uomff4$s535$1@dont-email.me>
<65d2323335476993a8b1aa39720022b6@www.novabbs.org>
<uompig$tt92$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 22 Jan 2024 22:28:44 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f5efac9e35c211335209a589d0078cec";
logging-data="980854"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+exSLoD/ml9JJJv/W2FErJPADRDUACCyA="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:HV9Bcw+DISFb0GfP4/Kyq27HJwU=
Content-Language: en-US
In-Reply-To: <uompig$tt92$1@dont-email.me>
 by: Chris M. Thomasson - Mon, 22 Jan 2024 22:28 UTC

On 1/22/2024 2:17 PM, BGB-Alt wrote:
> On 1/22/2024 2:31 PM, MitchAlsup1 wrote:
>> BGB wrote:
>>
>>> On 1/22/2024 10:09 AM, Scott Lurndal wrote:
>>>> mitchalsup@aol.com (MitchAlsup1) writes:
>>
>>
>>> In my case, the use of Soft TLB is not strictly required, as the OS
>>> may opt-in to use a hardware page-walker "if it exists", with TLB
>>> Miss interrupts mostly happening if no hardware page walker exists
>>> (or if there is not a valid page in the page table).
>>
>> Has anyone done a SW refill TLB implementation that has both Hypervisor
>> and Supervisor page <nested> translations ??
>>
>> This seems to me a bad idea as HV would end up having to manipulate
>> GuestOS mappings {Because you cannot allow GuestOS to see HV mappings}.
>>
>> {{Aside:: At one time I was enamored with SW TLB refill and one could
>> reduce TLB refill penalty by allocating a "big enough" secondary hashed
>> TLB (1MB+). When HV + GuestOS came about, I saw the futility of it all}}
>>
>
> One would need to standardize on parts of the ABI, and treat them like
> one would hardware-level constraints, to allow the top-level HV to cross
> multiple levels of mapping.
>
> Or, suffer the performance overhead of using multiple levels of emulation.
> One of the two...
>
>
> Granted, in simple cases, things like DOSBox and QEMU do surprisingly
> well on Windows, despite the overheads of these using software emulation
> rather than fancy hardware virtualization (so, the HW native stuff may
> be overrated).
> [...]

Fwiw, have you ever taken a look at Kegs32?

https://www.emaculation.com/kegs32.htm

Pretty nice! :^)

Re: Interrupts on the Concertina II

<ZtCrN.374797$83n7.249143@fx18.iad>

https://news.novabbs.org/devel/article-flat.php?id=37055&group=comp.arch#37055

Path: i2pn2.org!i2pn.org!news.samoylyk.net!weretis.net!feeder6.news.weretis.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx18.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad> <uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me> <ff86faba91c3898f808cce78672bb058@www.novabbs.org> <uoka6h$dlog$1@dont-email.me> <55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org> <mrwrN.323723$p%Mb.172024@fx15.iad> <uomff4$s535$1@dont-email.me> <65d2323335476993a8b1aa39720022b6@www.novabbs.org>
In-Reply-To: <65d2323335476993a8b1aa39720022b6@www.novabbs.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 39
Message-ID: <ZtCrN.374797$83n7.249143@fx18.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Mon, 22 Jan 2024 23:02:17 UTC
Date: Mon, 22 Jan 2024 18:01:54 -0500
X-Received-Bytes: 2692
 by: EricP - Mon, 22 Jan 2024 23:01 UTC

MitchAlsup1 wrote:
> BGB wrote:
>
>> On 1/22/2024 10:09 AM, Scott Lurndal wrote:
>>> mitchalsup@aol.com (MitchAlsup1) writes:
>
>
>> In my case, the use of Soft TLB is not strictly required, as the OS
>> may opt-in to use a hardware page-walker "if it exists", with TLB Miss
>> interrupts mostly happening if no hardware page walker exists (or if
>> there is not a valid page in the page table).
>
> Has anyone done a SW refill TLB implementation that has both Hypervisor
> and Supervisor page <nested> translations ??
>
> This seems to me a bad idea as HV would end up having to manipulate
> GuestOS mappings {Because you cannot allow GuestOS to see HV mappings}.

I actually pondered something like this to eliminate the two-level table
walk in virtual machines. I was thinking that the HV might propagate its
PTE entries into the GuestOS PTE entries, then mark them (somehow)
so they trap to the HV if GuestOS tries to look at them.
But it got complicated and never really went anywhere.

One accomplishes the same effect by caching the interior PTE nodes
for each of the HV and GuestOS tables separately on the downward walk,
and holding the combined nested table mapping in the TLB.
The bottom-up table walkers on each interior PTE cache should
eliminate 98% of the PTE reads with none of the headaches.
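
A rough software model of the interior-node cache may help show where the savings come from. The sketch below covers a single 5-level table (a nested setup would use one such cache per table) and only counts PTE reads; it does not model real page-table contents. The 9-bit index per level, the 16-entry cache and all names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define LEVELS          5   /* page-table depth */
#define BITS_PER_LEVEL  9   /* index bits consumed per level (assumed) */
#define PAGE_SHIFT     12   /* 4 KiB pages (assumed) */
#define WC_ENTRIES     16   /* hypothetical interior-node cache size */

typedef struct {
    int      valid;
    int      level;   /* number of upper levels this entry covers (1..LEVELS-1) */
    uint64_t prefix;  /* VA bits consumed by those levels */
} wc_entry;

typedef struct {
    wc_entry e[WC_ENTRIES];
    unsigned next_victim;  /* round-robin replacement */
} walk_cache;

static uint64_t va_prefix(uint64_t va, int level)
{
    assert(level >= 1 && level < LEVELS);
    return va >> (PAGE_SHIFT + BITS_PER_LEVEL * (LEVELS - level));
}

static int wc_hit(const walk_cache *wc, uint64_t va, int level)
{
    for (unsigned i = 0; i < WC_ENTRIES; i++)
        if (wc->e[i].valid && wc->e[i].level == level &&
            wc->e[i].prefix == va_prefix(va, level))
            return 1;
    return 0;
}

static void wc_fill(walk_cache *wc, uint64_t va, int level)
{
    if (wc_hit(wc, va, level))
        return;
    wc_entry *v = &wc->e[wc->next_victim];
    wc->next_victim = (wc->next_victim + 1) % WC_ENTRIES;
    v->valid = 1;
    v->level = level;
    v->prefix = va_prefix(va, level);
}

/* Returns the number of PTE reads needed to translate va. */
static int walk(walk_cache *wc, uint64_t va)
{
    int covered = 0;
    /* Bottom-up probe: try the deepest interior level first. */
    for (int level = LEVELS - 1; level >= 1; level--)
        if (wc_hit(wc, va, level)) { covered = level; break; }
    /* Cache every interior node read on the way down. */
    for (int level = covered + 1; level <= LEVELS - 1; level++)
        wc_fill(wc, va, level);
    return LEVELS - covered;
}
```

In this model the first touch of a region costs all 5 reads, a second page under the same leaf table costs 1, and a page that shares only the upper three levels costs 2.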

> {{Aside:: At one time I was enamored with SW TLB refill and one could
> reduce TLB refill penalty by allocating a "big enough" secondary hashed
> TLB (1MB+). When HV + GuestOS came about, I saw the futility of it all}}

I also wondered if a hashed/inverted page table could help here.
But that also went nowhere. The separate bottom-up walkers looked best.

Re: Interrupts on the Concertina II

<d23a16dcdf06df96a208e1c39002d9a3@www.novabbs.org>

https://news.novabbs.org/devel/article-flat.php?id=37059&group=comp.arch#37059

Date: Tue, 23 Jan 2024 00:54:54 +0000
Subject: Re: Interrupts on the Concertina II
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$m58zhyMP/aJ15xUGo5lkE.KQ.hCuxqdn/F3nCzIhU0q4cO7qfzmie
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad> <uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me> <ff86faba91c3898f808cce78672bb058@www.novabbs.org> <uoka6h$dlog$1@dont-email.me> <55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org> <mrwrN.323723$p%Mb.172024@fx15.iad> <uomff4$s535$1@dont-email.me> <65d2323335476993a8b1aa39720022b6@www.novabbs.org> <ZtCrN.374797$83n7.249143@fx18.iad>
Organization: Rocksolid Light
Message-ID: <d23a16dcdf06df96a208e1c39002d9a3@www.novabbs.org>
 by: MitchAlsup1 - Tue, 23 Jan 2024 00:54 UTC

EricP wrote:

> MitchAlsup1 wrote:
>> BGB wrote:
>>
>>> On 1/22/2024 10:09 AM, Scott Lurndal wrote:
>>>> mitchalsup@aol.com (MitchAlsup1) writes:
>>
>>
>>> In my case, the use of Soft TLB is not strictly required, as the OS
>>> may opt-in to use a hardware page-walker "if it exists", with TLB Miss
>>> interrupts mostly happening if no hardware page walker exists (or if
>>> there is not a valid page in the page table).
>>
>> Has anyone done a SW refill TLB implementation that has both Hypervisor
>> and Supervisor page <nested> translations ??
>>
>> This seems to me a bad idea as HV would end up having to manipulate
>> GuestOS mappings {Because you cannot allow GuestOS to see HV mappings}.

> I actually pondered something like this to eliminate the two-level table
> walk in virtual machines. I was thinking that the HV might propagate its
> PTE entries into the GuestOS PTE entries, then mark them (somehow)
> so they trap to the HV if GuestOS tries to look at them.
> But it got complicated and never really went anywhere.

> One accomplishes the same effect by caching the interior PTE nodes
> for each of the HV and GuestOS tables separately on the downward walk,
> and hold the combined nested table mapping in the TLB.
> The bottom-up table walkers on each interior PTE cache should
> eliminate 98% of the PTE reads with none of the headaches.

I call these things:: TableWalk Accelerators.

Given CAMs at your access, one can cache the outer layers and short
circuit most of the MMU accesses--such that you don't simply read the
Accelerator RAM 25 times (two 5-level tables), you CAM down both
GuestOS and HV tables so only walk the parts not in your CAM. {And
then put them in your CAM.} A Density trick is for each CAM to have
access to a whole cache line of PTEs (8 in my case).

>> {{Aside:: At one time I was enamored with SW TLB refill and one could
>> reduce TLB refill penalty by allocating a "big enough" secondary hashed
>> TLB (1MB+). When HV + GuestOS came about, I saw the futility of it all}}

> I also wondered if an hashed/inverted page table could help here.
> But that also went nowhere. The separate bottom-up walkers looked best.

Best I could do was two tables, one mapping application to GuestPA, the
other mapping GuestPA to RealPA. If the former missed, GuestOS fixed its
own table; if the latter, HV fixed its own table.
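
The division of labour between the two tables can be sketched in a few lines of C. This is a toy model with flat 16-entry arrays standing in for the real tables; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NPAGES   16          /* toy table size, for illustration only */
#define UNMAPPED UINT32_MAX

typedef enum { XLATE_OK, FAULT_GUEST_OS, FAULT_HYPERVISOR } xlate_result;

typedef struct {
    uint32_t app_to_guest[NPAGES];   /* application page -> guest-physical page */
    uint32_t guest_to_real[NPAGES];  /* guest-physical page -> real page */
} two_stage;

/* Translate an application page; report which party must repair a miss. */
static xlate_result translate(const two_stage *t, uint32_t app_page,
                              uint32_t *real_page)
{
    assert(app_page < NPAGES);
    uint32_t gpa = t->app_to_guest[app_page];
    if (gpa == UNMAPPED)
        return FAULT_GUEST_OS;       /* GuestOS fixes its own table */
    if (gpa >= NPAGES || t->guest_to_real[gpa] == UNMAPPED)
        return FAULT_HYPERVISOR;     /* HV fixes its own table */
    *real_page = t->guest_to_real[gpa];
    return XLATE_OK;
}
```

Neither side ever needs to read or write the other's table; each only sees misses in its own stage.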

Re: Interrupts on the Concertina II

<SwUrN.301374$xHn7.210887@fx14.iad>

https://news.novabbs.org/devel/article-flat.php?id=37068&group=comp.arch#37068

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx14.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Interrupts on the Concertina II
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad> <uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me> <ff86faba91c3898f808cce78672bb058@www.novabbs.org> <uoka6h$dlog$1@dont-email.me> <55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org> <mrwrN.323723$p%Mb.172024@fx15.iad> <uomff4$s535$1@dont-email.me> <65d2323335476993a8b1aa39720022b6@www.novabbs.org> <ZtCrN.374797$83n7.249143@fx18.iad> <d23a16dcdf06df96a208e1c39002d9a3@www.novabbs.org>
In-Reply-To: <d23a16dcdf06df96a208e1c39002d9a3@www.novabbs.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 39
Message-ID: <SwUrN.301374$xHn7.210887@fx14.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Tue, 23 Jan 2024 19:34:10 UTC
Date: Tue, 23 Jan 2024 14:33:56 -0500
X-Received-Bytes: 3055
 by: EricP - Tue, 23 Jan 2024 19:33 UTC

MitchAlsup1 wrote:
> EricP wrote:
>
>> One accomplishes the same effect by caching the interior PTE nodes
>> for each of the HV and GuestOS tables separately on the downward walk,
>> and hold the combined nested table mapping in the TLB.
>> The bottom-up table walkers on each interior PTE cache should
>> eliminate 98% of the PTE reads with none of the headaches.
>
> I call these things:: TableWalk Accelerators.
>
> Given CAMs at your access, one can cache the outer layers and short
> circuit most of the MMU accesses--such that you don't simply read the
> Accelerator RAM 25 times (two 5-level tables), you CAM down both
> GuestOS and HV tables so only walk the parts not in your CAM. {And
> then put them in your CAM.} A Density trick is for each CAM to have
> access to a whole cache line of PTEs (8 in my case).

An idea I had here was to allow the OS more explicit control
over invalidation of the interior-node caches.

On x86/x64 the interior cache invalidation had to be backwards compatible,
so the INVLPG instruction has to guess what besides the main TLB needs to be
invalidated, and it has to do so in a conservative (ie paranoid) manner.
So it tosses these interior PTE's just in case which means they
have to be reloaded on the next TLB miss.

The OS knows which paging levels it is recycling memory for and
can provide finer-grained control for these TLB invalidates.
The INVLPG and INVPCID instructions need a control bit mask allowing the OS
to invalidate just the TLB levels it is changing for a virtual address.
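
To illustrate what such a level mask might do, here is a small software model. It is purely hypothetical: no existing x86 instruction takes a mask like this, and the entry layout, the 9-bit-per-level split and the names are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define LEVELS          5
#define BITS_PER_LEVEL  9   /* assumed index width per level */
#define PAGE_SHIFT     12   /* assumed 4 KiB pages */
#define TLB_ENTRIES     8   /* toy size */
#define LEVEL_BIT(l)   (1u << ((l) - 1))  /* mask bit for level l (1..LEVELS) */

typedef struct {
    int      valid;
    int      level;  /* 1..LEVELS-1 = interior node, LEVELS = final translation */
    uint64_t tag;    /* VA bits consumed down to this level */
} tlb_entry;

static uint64_t level_tag(uint64_t va, int level)
{
    assert(level >= 1 && level <= LEVELS);
    return va >> (PAGE_SHIFT + BITS_PER_LEVEL * (LEVELS - level));
}

/* Hypothetical operation: drop the entries covering va, but only at the
   levels selected in level_mask. Returns the number of entries dropped. */
static int invalidate_levels(tlb_entry *tlb, uint64_t va, unsigned level_mask)
{
    int dropped = 0;
    for (int i = 0; i < TLB_ENTRIES; i++) {
        tlb_entry *e = &tlb[i];
        if (e->valid && (level_mask & LEVEL_BIT(e->level)) &&
            e->tag == level_tag(va, e->level)) {
            e->valid = 0;
            dropped++;
        }
    }
    return dropped;
}
```

An OS that only changed a leaf PTE would pass just the leaf bit and keep the interior entries warm for the next miss.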

And for OS debugging purposes, all these HW TLB tables need to be readable
and writable by some means (as control registers or whatever).
Because when something craps out, what's in memory may not be the same
as what was loaded into HW some time ago. A debugger should be able to
look into and manipulate these HW structures.

Re: Interrupts on the Concertina II

<f25fc2ceb6287463678e883b4adfc6ff@www.novabbs.org>

https://news.novabbs.org/devel/article-flat.php?id=37071&group=comp.arch#37071

Date: Tue, 23 Jan 2024 21:09:45 +0000
Subject: Re: Interrupts on the Concertina II
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$bJN.15MRsCdLsO4UtEgEfOluxglogKRQ41iLMNVlwdCH1IJvgrrOi
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad> <uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me> <ff86faba91c3898f808cce78672bb058@www.novabbs.org> <uoka6h$dlog$1@dont-email.me> <55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org> <mrwrN.323723$p%Mb.172024@fx15.iad> <uomff4$s535$1@dont-email.me> <65d2323335476993a8b1aa39720022b6@www.novabbs.org> <ZtCrN.374797$83n7.249143@fx18.iad> <d23a16dcdf06df96a208e1c39002d9a3@www.novabbs.org> <SwUrN.301374$xHn7.210887@fx14.iad>
Organization: Rocksolid Light
Message-ID: <f25fc2ceb6287463678e883b4adfc6ff@www.novabbs.org>
 by: MitchAlsup1 - Tue, 23 Jan 2024 21:09 UTC

EricP wrote:

> MitchAlsup1 wrote:
>> EricP wrote:
>>
>>> One accomplishes the same effect by caching the interior PTE nodes
>>> for each of the HV and GuestOS tables separately on the downward walk,
>>> and hold the combined nested table mapping in the TLB.
>>> The bottom-up table walkers on each interior PTE cache should
>>> eliminate 98% of the PTE reads with none of the headaches.
>>
>> I call these things:: TableWalk Accelerators.
>>
>> Given CAMs at your access, one can cache the outer layers and short
>> circuit most of the MMU accesses--such that you don't simply read the
>> Accelerator RAM 25 times (two 5-level tables), you CAM down both
>> GuestOS and HV tables so only walk the parts not in your CAM. {And
>> then put them in your CAM.} A Density trick is for each CAM to have
>> access to a whole cache line of PTEs (8 in my case).

> An idea I had here was to allow the OS more explicit control
> for the invalidates of the interior nodes caches.

The interior nodes, stored in the CAM, retain their physical address, and
are snooped, so no invalidation is required. ANY write to them is seen and
the entry invalidates itself.
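
In software terms the self-invalidation amounts to something like the following sketch. It assumes 64-byte lines (8 PTEs of 8 bytes each) and uses hypothetical names; in hardware the loop would be a parallel CAM match rather than a scan.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_SHIFT 6  /* 64-byte line = 8 PTEs, assuming 8-byte PTEs */

typedef struct {
    int      valid;
    uint64_t va_tag;   /* virtual-side match field */
    uint64_t line_pa;  /* physical address of the line the PTEs came from */
} cam_entry;

/* Called for every snooped write; returns the number of entries dropped. */
static int cam_snoop_write(cam_entry *cam, int n, uint64_t write_pa)
{
    assert(n >= 0);
    int dropped = 0;
    for (int i = 0; i < n; i++) {
        if (cam[i].valid &&
            (cam[i].line_pa >> LINE_SHIFT) == (write_pa >> LINE_SHIFT)) {
            cam[i].valid = 0;  /* entry invalidates itself */
            dropped++;
        }
    }
    return dropped;
}
```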

> On x86/x64 the interior cache invalidation had to be backwards compatible,
> so the INVLPG instruction has to guess what besides the main TLB needs to be
> invalidated, and it has to do so in a conservative (ie paranoid) manner.
> So it tosses these interior PTE's just in case which means they
> have to be reloaded on the next TLB miss.

> The OS knows which paging levels it is recycling memory for and
> can provide a finer grain control for these TLB invalidates.
> The INVLPG and INVPCID instructions need a control bit mask allowing OS
> to invalidate just the TLB levels it is changing for a virtual address.

OS or HV does not need to bother in My 66000.

> And for OS debugging purposes, all these HW TLB tables need to be readable
> and writable by some means (as control registers or whatever).
> Because when something craps out, what's in memory may not be the same
> as what was loaded into HW some time ago. A debugger should be able to
> look into and manipulate these HW structures.

All control registers, including the TLB CAMs, are accessible via MMI/O
accesses. So a remote core can determine what a crashed core was doing at
the instant of the crash.

Re: Page tables and TLBs [was Interrupts on the Concertina II]

<cwusN.276944$PuZ9.103529@fx11.iad>

https://news.novabbs.org/devel/article-flat.php?id=37111&group=comp.arch#37111

Path: i2pn2.org!i2pn.org!news.samoylyk.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx11.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Page tables and TLBs [was Interrupts on the Concertina II]
References: <uo930v$24cq0$1@dont-email.me> <xTWpN.140322$yEgf.868@fx09.iad> <uoh4t9$3pht2$2@dont-email.me> <uojugt$buth$1@dont-email.me> <ff86faba91c3898f808cce78672bb058@www.novabbs.org> <uoka6h$dlog$1@dont-email.me> <55df2c2e4662c064fd9eb8f31c8783b7@www.novabbs.org> <mrwrN.323723$p%Mb.172024@fx15.iad> <uomff4$s535$1@dont-email.me> <65d2323335476993a8b1aa39720022b6@www.novabbs.org> <ZtCrN.374797$83n7.249143@fx18.iad> <d23a16dcdf06df96a208e1c39002d9a3@www.novabbs.org> <SwUrN.301374$xHn7.210887@fx14.iad> <f25fc2ceb6287463678e883b4adfc6ff@www.novabbs.org>
In-Reply-To: <f25fc2ceb6287463678e883b4adfc6ff@www.novabbs.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 82
Message-ID: <cwusN.276944$PuZ9.103529@fx11.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Thu, 25 Jan 2024 14:47:36 UTC
Date: Thu, 25 Jan 2024 09:47:26 -0500
X-Received-Bytes: 4868
 by: EricP - Thu, 25 Jan 2024 14:47 UTC

MitchAlsup1 wrote:
> EricP wrote:
>
>> MitchAlsup1 wrote:
>>> EricP wrote:
>>>
>>>> One accomplishes the same effect by caching the interior PTE nodes
>>>> for each of the HV and GuestOS tables separately on the downward walk,
>>>> and hold the combined nested table mapping in the TLB.
>>>> The bottom-up table walkers on each interior PTE cache should
>>>> eliminate 98% of the PTE reads with none of the headaches.
>>>
>>> I call these things:: TableWalk Accelerators.
>>>
>>> Given CAMs at your access, one can cache the outer layers and short
>>> circuit most of the MMU accesses--such that you don't simply read the
>>> Accelerator RAM 25 times (two 5-level tables), you CAM down both
>>> GuestOS and HV tables so only walk the parts not in your CAM. {And
>>> then put them in your CAM.} A Density trick is for each CAM to have
>>> access to a whole cache line of PTEs (8 in my case).
>
>> An idea I had here was to allow the OS more explicit control
>> for the invalidates of the interior nodes caches.
>
> The interior nodes, stored in the CAM, retain their physical address, and
> are snooped, so no invalidation is required. ANY write to them is seen and
> the entry invalidates itself.

On My66000, but other cores don't have automatically coherent TLB's.
This feature is intended for that general rabble.

Just to play devil's advocate...

To snoop page table updates My66000 TLB would need a large CAM with all
the physical addresses of the PTE's source cache lines parallel to the
virtual and ASID CAM's, and route the cache line invalidates through it.

While the virtual index CAM's are separated in different banks,
one for each page table level, the P.A. CAM is for all entries in all banks.
This extra P.A. CAM will have a lot of entries and therefore be slow.

Also, routing the Invalidate messages through the TLB could slow down all
their ACK messages even though there is a very low probability of a hit,
because page tables are updated relatively infrequently.

Also, the L2-TLB's, called the STLB for Second-level TLB by Intel,
are set-associative, and would have to be virtually indexed and virtually
tagged with both VA and ASID plus table level to select address mask.
On Skylake the STLB for 4k/2M pages is 128-rows*12-way, 1G is 4-rows*4-way.

How can My66000 look up STLB entries by the physical line address of an invalidate?
It would have to scan all 128 rows for each message.
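
The asymmetry can be shown with a toy model of a 128-row, 12-way structure that is indexed by virtual page number. The modulo index function and the field names are assumptions for the sketch, not a description of any real STLB.

```c
#include <assert.h>
#include <stdint.h>

#define STLB_ROWS 128
#define STLB_WAYS 12

typedef struct { int valid; uint64_t vpn; uint64_t line_pa; } stlb_entry;
typedef struct { stlb_entry e[STLB_ROWS][STLB_WAYS]; } stlb;

/* Assumed index function: low bits of the virtual page number. */
static unsigned stlb_row(uint64_t vpn)
{
    return (unsigned)(vpn % STLB_ROWS);
}

/* Lookup by virtual page: only one row is examined.
   Returns the number of entries examined. */
static int stlb_lookup(const stlb *t, uint64_t vpn, int *hit)
{
    assert(t && hit);
    const stlb_entry *row = t->e[stlb_row(vpn)];
    int examined = 0;
    *hit = 0;
    for (int w = 0; w < STLB_WAYS; w++) {
        examined++;
        if (row[w].valid && row[w].vpn == vpn) { *hit = 1; break; }
    }
    return examined;
}

/* Invalidate by physical line address: the row cannot be derived from
   the PA, so every row and way has to be examined. */
static int stlb_invalidate_pa(stlb *t, uint64_t line_pa, int *dropped)
{
    assert(t && dropped);
    int examined = 0;
    *dropped = 0;
    for (int r = 0; r < STLB_ROWS; r++) {
        for (int w = 0; w < STLB_WAYS; w++) {
            examined++;
            if (t->e[r][w].valid && t->e[r][w].line_pa == line_pa) {
                t->e[r][w].valid = 0;
                (*dropped)++;
            }
        }
    }
    return examined;
}
```

A VA lookup touches at most 12 entries, while a PA-based invalidate touches all 128 * 12 = 1536 unless a separate physically indexed structure is added.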

>> On x86/x64 the interior cache invalidation had to be backwards
>> compatible,
>> so the INVLPG instruction has to guess what besides the main TLB needs
>> to be
>> invalidated, and it has to do so in a conservative (ie paranoid) manner.
>> So it tosses these interior PTE's just in case which means they
>> have to be reloaded on the next TLB miss.
>
>> The OS knows which paging levels it is recycling memory for and
>> can provide a finer grain control for these TLB invalidates.
>> The INVLPG and INVPCID instructions need a control bit mask allowing OS
>> to invalidate just the TLB levels it is changing for a virtual address.
>
> OS or HV does not need to bother in My 66000.
>
>> And for OS debugging purposes, all these HW TLB tables need to be
>> readable
>> and writable by some means (as control registers or whatever).
>> Because when something craps out, what's in memory may not be the same
>> as what was loaded into HW some time ago. A debugger should be able to
>> look into and manipulate these HW structures.
>
> All control registers, including the TLB CAMs are accessible via MMI/O
> accesses. So a remote core can decide what a crashed core was doing at
> the instant of the crash.

