
Re: Tonight's tradeoff

<fb335a14621553185589c1daa6812244@news.novabbs.com>


https://news.novabbs.org/devel/article-flat.php?id=35243&group=comp.arch#35243

Path: i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Sun, 26 Nov 2023 01:34:53 +0000
Organization: novaBBS
Message-ID: <fb335a14621553185589c1daa6812244@news.novabbs.com>
References: <uis67u$fkj4$1@dont-email.me> <uj1o0t$1kves$1@dont-email.me> <7761287e80bb22b7742fd7f292664497@news.novabbs.com> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <1cc3cef16ea12c020cb2fd81c9e0e365@news.novabbs.com> <Y2u8N.108363$svP4.76046@fx12.iad> <uju4k6$3040j$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2123877"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Rslight-Posting-User: 7e9c45bcd6d4757c5904fbe9a694742e6f8aa949
X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on novalink.us
X-Rslight-Site: $2y$10$rl5YQWRKpVyrw4Wltp.lUuLFEsR1cdOUeFgssG0GUHBPBoTihzD4u
 by: MitchAlsup - Sun, 26 Nov 2023 01:34 UTC

Robert Finch wrote:

> Are top-level page directory pages shared between tasks?

The HyperVisor tables supporting a single Guest OS certainly are.
The Guest OS tables supporting Guest OS certainly are.

> Suppose a task
> needs a 32-bit address space. With one level of page maps, 27 bits is
> accommodated, that leaves 5 bits of address translation to be done by
> the page directory. Using a whole page which can handle 11 address bits
> would be wasteful. But if root pointers could point into the same page
> directory page then the space would not be wasted. For instance, root
> pointer for task #1 could point the first 32 entries, root pointer for
> task #2 could point into the next 32 entries, and so on.

I should note that My 66000 Root Pointers determine the size of the address
space they map, anything from 8MB through 8EB, with PTEs supporting 8KB through
8EB page sizes. The kicker is that large-page entries can restrict themselves:
for example, you can use an 8MB PTE and enable only 1..1024 pages under that
virtual sub-address space. Furthermore, levels in the hierarchy can be
skipped, all of this to minimize table walk time.
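
The sizes quoted above line up if each level of the hierarchy resolves 10 address bits on top of a 13-bit (8KB) page offset. The following C sketch works through that arithmetic. The 10-bits-per-level figure is an inference from the 8KB/8MB/1024 numbers in the post, and the function names are hypothetical, not part of any My 66000 specification.

```c
#include <stdint.h>

/* Assumption: 8KB base pages (13 offset bits) and 1024 entries per
   table page, so each level resolves 10 more address bits. */
enum { PAGE_SHIFT = 13, BITS_PER_LEVEL = 10 };

/* Bytes mapped by one entry at a given level (level 0 = 8KB leaf page). */
static uint64_t entry_span(unsigned level)
{
    return (uint64_t)1 << (PAGE_SHIFT + BITS_PER_LEVEL * level);
}

/* Bytes actually enabled under a large-page entry at `level` (>= 1)
   that is restricted to `pages` sub-pages, as in the 8MB example. */
static uint64_t restricted_span(unsigned level, unsigned pages)
{
    return (uint64_t)pages * entry_span(level - 1);
}
```

Under these assumptions level 1 spans 8MB and level 5 spans 8EB (2^63 bytes), which matches the range given in the post.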

Re: Tonight's tradeoff

<sTJ8N.7150$Ycdc.3883@fx09.iad>


https://news.novabbs.org/devel/article-flat.php?id=35246&group=comp.arch#35246

Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!2.eu.feeder.erje.net!feeder.erje.net!newsreader4.netcologne.de!news.netcologne.de!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx09.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Tonight's tradeoff
Newsgroups: comp.arch
References: <uis67u$fkj4$1@dont-email.me> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <1cc3cef16ea12c020cb2fd81c9e0e365@news.novabbs.com> <Y2u8N.108363$svP4.76046@fx12.iad> <uju4k6$3040j$1@dont-email.me>
Lines: 26
Message-ID: <sTJ8N.7150$Ycdc.3883@fx09.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Sun, 26 Nov 2023 15:55:04 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Sun, 26 Nov 2023 15:55:04 GMT
X-Received-Bytes: 2385
 by: Scott Lurndal - Sun, 26 Nov 2023 15:55 UTC

Robert Finch <robfi680@gmail.com> writes:
>Are top-level page directory pages shared between tasks?

The top half of the VA space could support this, for
the most part (since the top half is generally shared
by all tasks). For the bottom half, that's much less likely.

>Suppose a task
>needs a 32-bit address space. With one level of page maps, 27 bits is
>accommodated, that leaves 5 bits of address translation to be done by
>the page directory. Using a whole page which can handle 11 address bits
>would be wasteful. But if root pointers could point into the same page
>directory page then the space would not be wasted. For instance, root
>pointer for task #1 could point the first 32 entries, root pointer for
>task #2 could point into the next 32 entries, and so on.

If the VA space is small enough, on ARMv8, the tables can be configured
with fewer than the normal four levels by specifying a smaller VA
size in the TCR_ELx register, so the walk may be only two or three levels
deep instead of four (or five when the VA gets larger than 52 bits).

Using intermediate-level blocks (soi-disant 'huge pages') reduces the
walk overhead as well, but has its issues with allocation (since
the huge pages need to be not just physically contiguous, but aligned
on huge-page-sized boundaries).
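
As a rough illustration of how the walk depth follows from the VA size, and of the alignment constraint on huge pages, here is a generic C sketch. It assumes a 4KB granule (12 offset bits, 9 index bits per level); other granules give different numbers, and the real TCR_ELx configuration rules have more constraints than this arithmetic captures. The function names are hypothetical.

```c
#include <stdint.h>

/* Levels needed to translate `va_bits` of virtual address, given a
   page offset of `page_shift` bits and `index_bits` resolved per level. */
static unsigned walk_levels(unsigned va_bits, unsigned page_shift,
                            unsigned index_bits)
{
    unsigned remaining = va_bits - page_shift;
    return (remaining + index_bits - 1) / index_bits;   /* ceiling */
}

/* A block (huge page) must start on a multiple of its own size.
   `block_size` is assumed to be a power of two. */
static int block_aligned(uint64_t pa, uint64_t block_size)
{
    return (pa & (block_size - 1)) == 0;
}
```

With these parameters a 48-bit VA needs four levels, a 39-bit VA three, and a 30-bit VA two.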

Re: Tonight's tradeoff

<2023Nov26.164506@mips.complang.tuwien.ac.at>


https://news.novabbs.org/devel/article-flat.php?id=35248&group=comp.arch#35248

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Sun, 26 Nov 2023 15:45:06 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 41
Message-ID: <2023Nov26.164506@mips.complang.tuwien.ac.at>
References: <uis67u$fkj4$1@dont-email.me> <643607718b82ff03ae09d2b661963223@news.novabbs.com> <uj1o0t$1kves$1@dont-email.me> <7761287e80bb22b7742fd7f292664497@news.novabbs.com> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad>
Injection-Info: dont-email.me; posting-host="b18e1ec3d766c8ce9447304006b90139";
logging-data="3499467"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/FamV7jAleDoMzQWpdZXx2"
Cancel-Lock: sha1:5NMcWsLSbEniiQi4VVRQiuj52PQ=
X-newsreader: xrn 10.11
 by: Anton Ertl - Sun, 26 Nov 2023 15:45 UTC

scott@slp53.sl.home (Scott Lurndal) writes:
>mitchalsup@aol.com (MitchAlsup) writes:
>>Consider the case where two different processes MMAP the same area
>>of memory.
>
>In which case, the area of memory would be mapped to different
>virtual address ranges in each process,

Says who? Unless the user process asks for MAP_FIXED or the address
range is already occupied in the user process, nothing prevents the OS
from putting the shared area in the same process. If the permissions
are also the same, the OS can then use one ASID for the shared area.

This would be especially useful for the read-only sections (e.g., code)
of common libraries like libc. However, in today's security landscape,
you don't want one process to know where library code is mapped in
other processes (i.e., you want ASLR), so we can no longer make use of
that benefit. And it's doubtful whether other uses are worth the
complications (and even if they are, there might be security issues,
too).

>FWIW, MAP_FIXED is specified as an optional feature by POSIX
>and may not be supported by the OS at all.

As usual, what is specified by a common-subset standard is not
relevant for what an OS implementor has to do if they want to supply
more than a practically unusable checkbox feature like the POSIX
subsystem for Windows. There is a reason why WSL2 includes a full
Linux kernel.

>>Should they both end up using the same ASID ??
>
>They couldn't share an ASID assuming the TLB looks up by VA.

Of course the TLB looks up by VA, what else. But if the VA is the
same and the PA is the same, the same ASID can be used.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Tonight's tradeoff

<PiL8N.53862$%d2c.19977@fx08.iad>


https://news.novabbs.org/devel/article-flat.php?id=35253&group=comp.arch#35253

Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx08.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
References: <uis67u$fkj4$1@dont-email.me> <643607718b82ff03ae09d2b661963223@news.novabbs.com> <uj1o0t$1kves$1@dont-email.me> <7761287e80bb22b7742fd7f292664497@news.novabbs.com> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at>
In-Reply-To: <2023Nov26.164506@mips.complang.tuwien.ac.at>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 23
Message-ID: <PiL8N.53862$%d2c.19977@fx08.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Sun, 26 Nov 2023 17:32:31 UTC
Date: Sun, 26 Nov 2023 12:32:08 -0500
X-Received-Bytes: 2249
 by: EricP - Sun, 26 Nov 2023 17:32 UTC

Anton Ertl wrote:
> scott@slp53.sl.home (Scott Lurndal) writes:
>> mitchalsup@aol.com (MitchAlsup) writes:
>>> Consider the case where two different processes MMAP the same area
>>> of memory.
>> In which case, the area of memory would be mapped to different
>> virtual address ranges in each process,
>
> Says who? Unless the user process asks for MAP_FIXED or the address
> range is already occupied in the user process, nothing prevents the OS
> from putting the shared area in the same process. If the permissions
> are also the same, the OS can then use one ASID for the shared area.

If the mapping range is being selected dynamically, the chance that a
range will already be in use goes up with the number of sharers.
At some point when a new member tries to join the sharing group
the map request will be denied.

Software that does not want to have a mapping request fail should assume
that a shared area will be mapped at a different address in each process.
That implies one should not assume that virtual addresses can be passed
but instead use, say, section-relative offsets to build a linked list.
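
A minimal C sketch of such an offset-based list follows. The node layout, the use of offset 0 as the "no node" marker, and all names are illustrative assumptions. The point is only that links are stored as offsets from the region base, so the list can be followed wherever the region happens to be mapped.

```c
#include <stddef.h>
#include <stdint.h>

#define OFF_NULL ((uint32_t)0)   /* offset 0 is reserved as "no node" */

struct node {
    uint32_t next;    /* offset of the next node from the region base */
    int32_t  value;
};

/* Turn a section-relative offset into a pointer valid in this process. */
static struct node *node_at(void *base, uint32_t off)
{
    return off == OFF_NULL ? NULL : (struct node *)((char *)base + off);
}

/* Push the node stored at offset `off` onto the list headed by *head. */
static void list_push(void *base, uint32_t *head, uint32_t off, int32_t value)
{
    struct node *n = node_at(base, off);
    n->value = value;
    n->next = *head;
    *head = off;
}

/* Walk the list; only offsets are stored, so any base address works. */
static long list_sum(void *base, uint32_t head)
{
    long sum = 0;
    for (struct node *n = node_at(base, head); n != NULL;
         n = node_at(base, n->next))
        sum += n->value;
    return sum;
}
```

Each process passes its own `base` (the address at which it mapped the region); the head offset and the node contents are identical for all sharers.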

Re: Tonight's tradeoff

<0834c143414829227416d6a086c6fbd1@news.novabbs.com>


https://news.novabbs.org/devel/article-flat.php?id=35255&group=comp.arch#35255

Path: i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Sun, 26 Nov 2023 20:52:23 +0000
Organization: novaBBS
Message-ID: <0834c143414829227416d6a086c6fbd1@news.novabbs.com>
References: <uis67u$fkj4$1@dont-email.me> <643607718b82ff03ae09d2b661963223@news.novabbs.com> <uj1o0t$1kves$1@dont-email.me> <7761287e80bb22b7742fd7f292664497@news.novabbs.com> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <PiL8N.53862$%d2c.19977@fx08.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2208572"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Rslight-Posting-User: 7e9c45bcd6d4757c5904fbe9a694742e6f8aa949
X-Rslight-Site: $2y$10$neZqglXERsdeOhmvIfAbYe2S0Q63N4Pf5ym0b7ihZgeaDd8uan00G
X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on novalink.us
 by: MitchAlsup - Sun, 26 Nov 2023 20:52 UTC

EricP wrote:

> Anton Ertl wrote:
>> scott@slp53.sl.home (Scott Lurndal) writes:
>>> mitchalsup@aol.com (MitchAlsup) writes:
>>>> Consider the case where two different processes MMAP the same area
>>>> of memory.
>>> In which case, the area of memory would be mapped to different
>>> virtual address ranges in each process,
>>
>> Says who? Unless the user process asks for MAP_FIXED or the address
>> range is already occupied in the user process, nothing prevents the OS
>> from putting the shared area in the same process. If the permissions
>> are also the same, the OS can then use one ASID for the shared area.

> If the mapping range is being selected dynamically, the chance that a
> range will already be in use goes up with the number of sharers.
> At some point when a new member tries to join the sharing group
> the map request will be denied.

> Software that does not want to have a mapping request fail should assume
> that a shared area will be mapped at a different address in each process.
> That implies one should not assume that virtual address can be passed
> but instead use, say, section relative offsets to build a linked list.

Here you are using shared memory like PL/1 uses AREA and OFFSET types.

Re: Tonight's tradeoff

<2023Nov26.222623@mips.complang.tuwien.ac.at>


https://news.novabbs.org/devel/article-flat.php?id=35258&group=comp.arch#35258

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Sun, 26 Nov 2023 21:26:23 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 41
Message-ID: <2023Nov26.222623@mips.complang.tuwien.ac.at>
References: <uis67u$fkj4$1@dont-email.me> <7761287e80bb22b7742fd7f292664497@news.novabbs.com> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <PiL8N.53862$%d2c.19977@fx08.iad>
Injection-Info: dont-email.me; posting-host="b18e1ec3d766c8ce9447304006b90139";
logging-data="3599903"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18nlME0rp9IuGUZ21wDQqvb"
Cancel-Lock: sha1:ww3hggLAy0lgT3IAvVPCnT3cX/A=
X-newsreader: xrn 10.11
 by: Anton Ertl - Sun, 26 Nov 2023 21:26 UTC

EricP <ThatWouldBeTelling@thevillage.com> writes:
>Anton Ertl wrote:
>> scott@slp53.sl.home (Scott Lurndal) writes:
>>> In which case, the area of memory would be mapped to different
>>> virtual address ranges in each process,
>>
>> Says who? Unless the user process asks for MAP_FIXED or the address
>> range is already occupied in the user process, nothing prevents the OS
>> from putting the shared area in the same process.

s/process/address range/ for the last word.

>> If the permissions
>> are also the same, the OS can then use one ASID for the shared area.
>
>If the mapping range is being selected dynamically, the chance that a
>range will already be in use goes up with the number of sharers.
>At some point when a new member tries to join the sharing group
>the map request will be denied.

It will map, but with a different address range, and therefore a
different ASID. Then, for further mapping requests, the chance that
one of the two address ranges is free increases. So even with a
large number of processes mapping the same library, you will need only
a few ASIDs for this physical memory, so there will be lots of
sharing. Of course with ASLR this is all no longer relevant.

>Software that does not want to have a mapping request fail should assume
>that a shared area will be mapped at a different address in each process.
>That implies one should not assume that virtual address can be passed
>but instead use, say, section relative offsets to build a linked list.

Yes. The other option is to use MAP_FIXED early in the process, and
to have some way of dealing with potential failures. But sharing of
VAs in user code between processes is not what the sharing of ASIDs we
have discussed here would be primarily about.
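
For the user-level side of this, a hedged C sketch of the non-MAP_FIXED variant: pass a preferred address as a hint and check whether the kernel honoured it, so the caller can fall back to offsets when it did not. The helper name is hypothetical. MAP_ANONYMOUS is used only to keep the example self-contained; it is widely supported but historically an extension to POSIX.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Map `len` bytes of shared anonymous memory, passing `hint` as the
   preferred address. Without MAP_FIXED the kernel may choose another
   address. Returns NULL on failure; *at_hint reports whether the
   preferred address was honoured. */
static void *map_shared_at_hint(void *hint, size_t len, int *at_hint)
{
    void *p = mmap(hint, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *at_hint = (hint != NULL && p == hint);
    return p;
}
```

Using MAP_FIXED instead forces the address, but it replaces any mapping already present in that range, which is why it is only safe early in the process or over a range reserved in advance.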

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Tonight's tradeoff

<uk0e96$3dtqn$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=35259&group=comp.arch#35259

Path: i2pn2.org!i2pn.org!news.1d4.us!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Sun, 26 Nov 2023 15:45:08 -0600
Organization: A noiseless patient Spider
Lines: 195
Message-ID: <uk0e96$3dtqn$1@dont-email.me>
References: <uis67u$fkj4$1@dont-email.me>
<643607718b82ff03ae09d2b661963223@news.novabbs.com>
<uj1o0t$1kves$1@dont-email.me>
<7761287e80bb22b7742fd7f292664497@news.novabbs.com>
<uj9bm2$36401$1@dont-email.me>
<71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com>
<ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me>
<987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com>
<ujrm4a$2llie$1@dont-email.me>
<d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com>
<bps8N.150652$wvv7.7314@fx14.iad>
<2023Nov26.164506@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 26 Nov 2023 21:45:10 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="dcf0aa9e6f429425e09f6aa229e16f53";
logging-data="3602263"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/xwyVv6irYAVbai52XZvrd"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:FMeeVo8dh+R3rk1GWenCPBT18So=
Content-Language: en-US
In-Reply-To: <2023Nov26.164506@mips.complang.tuwien.ac.at>
 by: BGB - Sun, 26 Nov 2023 21:45 UTC

On 11/26/2023 9:45 AM, Anton Ertl wrote:
> scott@slp53.sl.home (Scott Lurndal) writes:
>> mitchalsup@aol.com (MitchAlsup) writes:
>>> Consider the case where two different processes MMAP the same area
>>> of memory.
>>
>> In which case, the area of memory would be mapped to different
>> virtual address ranges in each process,
>
> Says who? Unless the user process asks for MAP_FIXED or the address
> range is already occupied in the user process, nothing prevents the OS
> from putting the shared area in the same process. If the permissions
> are also the same, the OS can then use one ASID for the shared area.
>
> This would be especially useful for the read-only sections (e.g, code)
> of common libraries like libc. However, in todays security landscape,
> you don't want one process to know where library code is mapped in
> other processes (i.e., you want ASLR), so we can no longer make use of
> that benefit. And it's doubtful whether other uses are worth the
> complications (and even if they are, there might be security issues,
> too).
>

It seems to me that as long as it is a different place on each system, that is
probably good enough. Demanding a different location in each process
would create a lot of additional memory overhead due to things like
base-relocations or similar.

>> FWIW, MAP_FIXED is specified as an optional feature by POSIX
>> and may not be supported by the OS at all.
>
> As usual, what is specified by a common-subset standard is not
> relevant for what an OS implementor has to do if they want to supply
> more than a practically unusable checkbox feature like the POSIX
> subsystem for Windows. There is a reason why WSL2 includes a full
> Linux kernel.
>

Still using WSL1 here as for whatever reason hardware virtualization has
thus far refused to work on my PC, and is apparently required for WSL2.

I can add this to my list of annoyances, like I can install "just short
of 128GB", but putting in the full 128GB causes my PC to be like "Oh
Crap, I guess there is 3.5GB ..." (but, apparently "112GB with unmatched
RAM sticks is fine I guess...").

But, yeah, the original POSIX is an easier goal to achieve, vs, say, the
ability to port over the GNU userland.

A lot of it is doable, but things like fork+exec are a problem if one
wants to support NOMMU operation or otherwise run all of the logical
processes in a shared address space.

A practical alternative is something more like a CreateProcess style
call, but this is "not exactly POSIX". In theory though, one could treat
"fork()" more like "vfork()" and then turn the exec* call into a
CreateProcess call and then terminate the current thread. Wouldn't
really work "in general" though, for programs that expect to be able to
"fork()" and then continue running the current program as a sub-process.

>>> Should they both end up using the same ASID ??
>>
>> They couldn't share an ASID assuming the TLB looks up by VA.
>
> Of course the TLB looks up by VA, what else. But if the VA is the
> same and the PA is the same, the same ASID can be used.
>

?...

Typically the ASID applies to the whole virtual address space, not to
individual memory objects.

Or, at least, my page-table scheme doesn't have a way to express
per-page ASIDs (merely if a page is Private/Shared, with the results of
this partly depending on the current ASID given for the page-table).

Where, say, I am mostly using 64-bit entries in the page-table, as going
to a 128-bit page-table format would be a bit steep.

Say, PTE layout (16K pages):
  (63:48): ACLID
  (47:14): Physical Address.
  (13:12): Address or OS flag.
  (11:10): For use by OS
  ( 9: 0): Base page-access and similar.
    (9): S1 / U1 (Page-Size or OS Flag)
    (8): S0 / U0 (Page-Size or OS Flag)
    (7): No User (Supervisor Only)
    (6): No Execute
    (5): No Write
    (4): No Read
    (3): No Cache
    (2): Dirty (OS, ignored by TLB)
    (1): Private/Shared (MBZ if not Valid)
    (0): Present/Valid
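
To make the field positions concrete, here is a small C sketch that decodes a 64-bit PTE using the bit numbers from the layout above. The macro and function names are hypothetical, the polarity of the Private/Shared bit is left uninterpreted, and the check covers only the base access bits; it does not model ACL lookups or the special flag combinations.

```c
#include <stdint.h>

/* Bit positions taken from the layout above (16K pages, 64-bit PTE). */
#define PTE_VALID    (UINT64_C(1) << 0)
#define PTE_PRIVSHR  (UINT64_C(1) << 1)   /* Private/Shared, polarity not assumed */
#define PTE_DIRTY    (UINT64_C(1) << 2)
#define PTE_NOCACHE  (UINT64_C(1) << 3)
#define PTE_NOREAD   (UINT64_C(1) << 4)
#define PTE_NOWRITE  (UINT64_C(1) << 5)
#define PTE_NOEXEC   (UINT64_C(1) << 6)
#define PTE_NOUSER   (UINT64_C(1) << 7)

/* ACLID lives in bits 63:48. */
static uint16_t pte_aclid(uint64_t pte)
{
    return (uint16_t)(pte >> 48);
}

/* Physical page address: bits 47:14, with the low 14 bits cleared. */
static uint64_t pte_phys(uint64_t pte)
{
    return pte & UINT64_C(0x0000FFFFFFFFC000);
}

/* Base-access check only. `deny_bit` is one of PTE_NOREAD, PTE_NOWRITE
   or PTE_NOEXEC. ACL checks could restrict access further. */
static int pte_base_allows(uint64_t pte, int is_user, uint64_t deny_bit)
{
    if (!(pte & PTE_VALID))
        return 0;
    if (is_user && (pte & PTE_NOUSER))
        return 0;
    return (pte & deny_bit) == 0;
}
```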

Where, ACLID serves as an index into the ACL table, or to lookup the
VUGID parameters for the page (well, along with an alternate PTE variant
that encodes VUGID directly, but reduces the physical address to 36
bits). It is possible that the original VUGID scheme may be phased out
in favor of using exclusively ACL checking.

Note that the ACL checks don't add new permissions to a page, they add
further restrictions (with the base-access being the most permissive).

Some combinations of flags are special, and encode a few edge-case
modes; such as pages which are Read/Write in Supervisor mode but
Read-Only in user mode (separate from the possible use of ACL's to mark
pages as read-only for certain tasks).

But, FWIW, I ended up adding an extended MAP_GLOBAL flag for "mmap'ed
space should be visible to all of the processes"; which in turn was used
as part of the backing memory for the "GlobalAlloc" style calls (it is
not a global heap, in that each process still manages the memory
locally, but other intersecting processes can see the address within
their own address spaces).

Well, along with a MAP_PHYSICAL flag, for if one needs memory where
VA==PA (this may fail, with the mmap returning NULL, effectively only
allowed for "superusermode"; mostly intended for hardware interfaces).

The usual behavior of MAP_SHARED didn't really make sense outside of the
context of mapping a file, and didn't really serve the needed purpose
(say, one wants to hand off a pointer to a bitmap buffer to the GUI
subsystem to have it drawn into a window).

It is also being used for things like shared scratch buffers, say, for
passing BITMAPINFOHEADER and MIDI commands and similar across the
interprocess calls (the C API style wrapper wraps a lot of this; whereas
the internal COM-style interfaces will require any pointer-style
arguments to point to shared memory).

This is not required for normal syscall handlers, where the usual
assumption is that normal syscalls will have some means of directly
accessing the address space of the caller process. I didn't really want
to require that TKGDI have this same capability.

It is debatable whether calls like BlitImage and similar should require
global memory, or merely recommend it (potentially having the call fall
back to a scratch buffer and internal memcpy if the passed bitmap image
is not already in global memory).

I had originally considered a more complex mechanism for object sharing,
but then ended up going with this for now partly because it was easier
and lower overhead (well, and also because I wanted something that would
still work if/when I started to add proper memory protection). May make
sense to impose a limit on per-process global alloc's though (since it
is intended specifically for shared buffers and not for general heap
allocation; where for heap allocation ANONYMOUS+PRIVATE would be used
instead).

Though, looking at stuff, MAP_GLOBAL semantics may have also been
partially covered by "MAP_ANONYMOUS|MAP_SHARED"?... Though, the
semantics aren't the same.

I guess, another alternative would have been to use shm_open+mmap or
similar.
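
For comparison, here is a minimal sketch of the "MAP_ANONYMOUS|MAP_SHARED" alternative mentioned above, assuming a Linux-like system where MAP_ANONYMOUS is available. The function name shared_anon_roundtrip is made up for illustration. Note the semantic difference: a mapping like this is only visible to the process that created it and to children forked afterwards, not to every process as MAP_GLOBAL is described.

```c
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map one anonymous shared page, let a forked child
 * store `value` into it, and return what the parent reads back afterwards.
 * Returns -1 on any failure. */
int shared_anon_roundtrip(int value)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    int *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    buf[0] = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: same virtual address, same physical page as the parent. */
        buf[0] = value;
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(buf, len);
        return -1;
    }
    int seen = buf[0];             /* the child's store is visible here */
    munmap(buf, len);
    return seen;
}
```

An unrelated process cannot reach this page at all, which is roughly the gap that either shm_open+mmap or a MAP_GLOBAL-style flag would fill.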

Where, say, memory map will look something like:
00yy_xxxxxxxx: Physical mapped and direct-mapped memory.
00yy_xxxxxxxx: Start of global virtual memory (*1);
3FFF_xxxxxxxx: End of global virtual memory;
4000_xxxxxxxx: Start of private/local virtual memory (possible, *2);
7FFF_xxxxxxxx: End of private/local virtual memory (possible);
8000_xxxxxxxx: Start of kernel virtual memory;
BFFF_xxxxxxxx: End of kernel virtual memory;
Cxxx_xxxxxxxx: Physical Address Range (Cached);
Dxxx_xxxxxxxx: Physical Address Range (Volatile/NoCache);
Exxx_xxxxxxxx: Reserved;
Fxxx_xxxxxxxx: MMIO and similar.

*1: The 'yy' division point may move, will depend on things like how
much RAM exists (currently, 00/01; no current "sane cost" FPGA boards
having more than 256 or 512 MB of RAM).

*2: If I go to a scheme of giving processes their own address spaces,
then private memory will be used. It is likely that executable code may
remain shared, but the data sections and heap would be put into private
address ranges.

Re: Tonight's tradeoff

<tDP8N.30031$ayBd.8559@fx07.iad>


https://news.novabbs.org/devel/article-flat.php?id=35262&group=comp.arch#35262

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer03.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx07.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Tonight's tradeoff
Newsgroups: comp.arch
References: <uis67u$fkj4$1@dont-email.me> <uj1o0t$1kves$1@dont-email.me> <7761287e80bb22b7742fd7f292664497@news.novabbs.com> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at>
Lines: 76
Message-ID: <tDP8N.30031$ayBd.8559@fx07.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Sun, 26 Nov 2023 22:27:37 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Sun, 26 Nov 2023 22:27:37 GMT
X-Received-Bytes: 4515
 by: Scott Lurndal - Sun, 26 Nov 2023 22:27 UTC

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>scott@slp53.sl.home (Scott Lurndal) writes:
>>mitchalsup@aol.com (MitchAlsup) writes:
>>>Consider the case where two different processes MMAP the same area
>>>of memory.
>>
>>In which case, the area of memory would be mapped to different
>>virtual address ranges in each process,
>
>Says who? Unless the user process asks for MAP_FIXED or the address
>range is already occupied in the user process, nothing prevents the OS
>from putting the shared area in the same process. If the permissions
>are also the same, the OS can then use one ASID for the shared area.
>
>This would be especially useful for the read-only sections (e.g., code)
>of common libraries like libc. However, in today's security landscape,
>you don't want one process to know where library code is mapped in
>other processes (i.e., you want ASLR), so we can no longer make use of
>that benefit. And it's doubtful whether other uses are worth the
>complications (and even if they are, there might be security issues,
>too).
>
>>FWIW, MAP_FIXED is specified as an optional feature by POSIX
>>and may not be supported by the OS at all.
>
>As usual, what is specified by a common-subset standard is not
>relevant for what an OS implementor has to do if they want to supply
>more than a practically unusable checkbox feature like the POSIX
>subsystem for Windows. There is a reason why WSL2 includes a full
>Linux kernel.

If an implementation claims support for the XSI option of
POSIX, then it must support MAP_FIXED. There were a couple
of vendors who claimed not to be able to support MAP_FIXED
back in the days when it was being discussed in the standards
committee working groups.

In addition, the standard notes:

"Use of MAP_FIXED may result in unspecified behavior in
further use of malloc() and shmat(). The use of MAP_FIXED is
discouraged, as it may prevent an implementation from making
the most effective use of resources."

Because MAP_FIXED unmaps any prior mapping in the range, if the
implementation happened to have allocated the heap or a System V
shared region at that address, the heap would be corrupted, leaving
dangling references which, if stored into, would subsequently
corrupt the newly mapped region.
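
The replace-in-place behavior can be seen with a small sketch, assuming a Linux-like system with MAP_ANONYMOUS (zero-filled anonymous pages). The function name map_fixed_replaces_prior is made up for illustration. It only touches a page it mapped itself; the hazard described above is the same thing happening to a range the heap or shmat() already owns.

```c
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a MAP_FIXED mapping silently replaced
 * an existing mapping at the same address (old contents gone), 0 if not,
 * and -1 on failure. */
int map_fixed_replaces_prior(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);

    /* First mapping: let the kernel choose the address, then dirty it. */
    unsigned char *a = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (a == MAP_FAILED)
        return -1;
    memset(a, 0xAA, len);

    /* Second mapping: force the same address. No error is reported for
     * the overlap; the old page is simply unmapped and replaced. */
    unsigned char *b = mmap(a, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (b == MAP_FAILED) {
        munmap(a, len);
        return -1;
    }

    /* Fresh anonymous memory reads as zero, so the 0xAA pattern is gone. */
    int replaced = (b == a) && (b[0] == 0) && (b[len - 1] == 0);
    munmap(b, len);
    return replaced;
}
```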

>
>>>Should they both end up using the same ASID ??
>>
>>They couldn't share an ASID assuming the TLB looks up by VA.
>
>Of course the TLB looks up by VA, what else. But if the VA is the
>same and the PA is the same, the same ASID can be used.

That sounds like a nightmare scenario. Normally the ASID is
closely associated with a single process and the scope of
necessary TLB maintenance operations (e.g. invalidates
after translation table updates) is usually the process.

It's certainly not possible to do that on ARMv8 systems. The
ASID tag in the TLB entry comes from the translation table base
register and applies to all accesses made to the entire range covered
by the translation table by all the threads of the process.

Likewise the VMID tag in the TLB entry comes from the nested
translation table base address system register at the time
of entry creation.

For a subsequent process (child or detached) sharing memory with
that process, there just isn't any way to tag its TLB entry with
the ASID of the first process to map the shared region.
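
A simplified software model of an ASID-tagged lookup may make the point concrete. This is only a sketch, not any particular architecture's TLB: the struct layout and the names tlb_entry and tlb_lookup are assumptions, and real hardware does the comparison associatively rather than in a loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical TLB entry: tagged with the ASID that was current when the
 * entry was filled, unless it is marked global. */
typedef struct {
    bool     valid;
    bool     global;   /* matches regardless of the current ASID */
    uint16_t asid;
    uint64_t vpn;      /* virtual page number */
    uint64_t ppn;      /* physical page number */
} tlb_entry;

/* Return the index of the matching entry, or -1 on a miss. */
int tlb_lookup(const tlb_entry *tlb, size_t n, uint16_t cur_asid, uint64_t vpn)
{
    for (size_t i = 0; i < n; i++) {
        const tlb_entry *e = &tlb[i];
        if (e->valid && e->vpn == vpn &&
            (e->global || e->asid == cur_asid))
            return (int)i;
    }
    return -1;
}
```

In this model, an entry filled while process A (ASID 1) was running never hits for process B (ASID 2), even when both map the same physical page at the same virtual address. The only sharing on offer is the global bit, and that applies to every address space, not to a chosen pair.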

Re: Tonight's tradeoff

<HKP8N.30032$ayBd.9426@fx07.iad>


https://news.novabbs.org/devel/article-flat.php?id=35263&group=comp.arch#35263

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx07.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Tonight's tradeoff
Newsgroups: comp.arch
References: <uis67u$fkj4$1@dont-email.me> <7761287e80bb22b7742fd7f292664497@news.novabbs.com> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <uk0e96$3dtqn$1@dont-email.me>
Lines: 49
Message-ID: <HKP8N.30032$ayBd.9426@fx07.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Sun, 26 Nov 2023 22:35:19 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Sun, 26 Nov 2023 22:35:19 GMT
X-Received-Bytes: 3204
 by: Scott Lurndal - Sun, 26 Nov 2023 22:35 UTC

BGB <cr88192@gmail.com> writes:
>On 11/26/2023 9:45 AM, Anton Ertl wrote:

>
>
>Where, say, memory map will look something like:
> 00yy_xxxxxxxx: Physical mapped and direct-mapped memory.
> 00yy_xxxxxxxx: Start of global virtual memory (*1);
> 3FFF_xxxxxxxx: End of global virtual memory;
> 4000_xxxxxxxx: Start of private/local virtual memory (possible, *2);
> 7FFF_xxxxxxxx: End of private/local virtual memory (possible);
> 8000_xxxxxxxx: Start of kernel virtual memory;
> BFFF_xxxxxxxx: End of kernel virtual memory;
> Cxxx_xxxxxxxx: Physical Address Range (Cached);
> Dxxx_xxxxxxxx: Physical Address Range (Volatile/NoCache);
> Exxx_xxxxxxxx: Reserved;
> Fxxx_xxxxxxxx: MMIO and similar.

The modern preference is to make the memory map flexible.

Linux, for example, requires that PCI Base Address Registers
be programmable by the operating system, and the OS can
choose any range (subject to host bridge configuration, of
course) for the device.

It is notable that even on non-Intel systems, one may need
to map a 32-bit PCI BAR (AHCI is the classic example) which
requires the address programmed in the BAR to be less than
0x100000000 (4GB). Granted, systems can have custom PCI controllers
that remap that into the larger physical address space with
a bit of extra hardware, however the kernel people don't
like that at all since there is no universal standard for
such remapping and they don't want to support
dozens of independent implementations, constantly
changing from generation to generation.

Many modern SoCs (and ARM SBSA requires this) make their
on-board devices and coprocessors look like PCI express
devices to software, and SBSA requires the PCIe ECAM
region for device discovery. Here again, each of
these on board devices will have from one to six
memory region base address registers (or one to
three for 64-bit bars).
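
The "one to six, or one to three" arithmetic follows from the BAR encoding: a 64-bit memory BAR occupies two consecutive 32-bit slots. A rough sketch of walking the six slots of a type-0 header is below. The function name count_bars is made up, and treating an all-zero slot as unimplemented is a simplification (real sizing writes all-ones to the BAR and reads it back).

```c
#include <stdint.h>

#define PCI_BAR_IO        0x1u   /* bit 0: 1 = I/O space, 0 = memory space */
#define PCI_BAR_TYPE_MASK 0x6u   /* bits 2:1 of a memory BAR */
#define PCI_BAR_TYPE_64   0x4u   /* type 10b: 64-bit, uses two slots */

/* Hypothetical helper: count the BARs present in the six BAR slots.
 * Simplification: an all-zero slot is treated as unimplemented. */
int count_bars(const uint32_t bar[6])
{
    int count = 0;
    for (int i = 0; i < 6; i++) {
        if (bar[i] == 0)
            continue;
        count++;
        if (!(bar[i] & PCI_BAR_IO) &&
            (bar[i] & PCI_BAR_TYPE_MASK) == PCI_BAR_TYPE_64)
            i++;   /* the next slot holds the upper 32 address bits */
    }
    return count;
}
```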

Encoding memory attributes into the address is common
in microcontrollers, but in a general purpose processor
constrains the system to an extent sufficient to make it
unattractive for general purpose workloads.

Re: Tonight's tradeoff

<uk0jsr$3eloe$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=35269&group=comp.arch#35269

Path: i2pn2.org!i2pn.org!paganini.bofh.team!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: robfi680@gmail.com (Robert Finch)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Sun, 26 Nov 2023 18:20:58 -0500
Organization: A noiseless patient Spider
Lines: 242
Message-ID: <uk0jsr$3eloe$1@dont-email.me>
References: <uis67u$fkj4$1@dont-email.me>
<643607718b82ff03ae09d2b661963223@news.novabbs.com>
<uj1o0t$1kves$1@dont-email.me>
<7761287e80bb22b7742fd7f292664497@news.novabbs.com>
<uj9bm2$36401$1@dont-email.me>
<71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com>
<ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me>
<987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com>
<ujrm4a$2llie$1@dont-email.me>
<d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com>
<bps8N.150652$wvv7.7314@fx14.iad>
<2023Nov26.164506@mips.complang.tuwien.ac.at> <uk0e96$3dtqn$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 26 Nov 2023 23:21:00 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="ceec176b45ee2c6fcec01f9530ed8ad4";
logging-data="3626766"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+1V5BgocOVXOK69G9NWq6SbHrSiwqpfMo="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:LfPMT5MD04TwXg0Kvl3LL4rJnbk=
Content-Language: en-US
In-Reply-To: <uk0e96$3dtqn$1@dont-email.me>
 by: Robert Finch - Sun, 26 Nov 2023 23:20 UTC

On 2023-11-26 4:45 p.m., BGB wrote:
> On 11/26/2023 9:45 AM, Anton Ertl wrote:
>> scott@slp53.sl.home (Scott Lurndal) writes:
>>> mitchalsup@aol.com (MitchAlsup) writes:
>>>> Consider the case where two different processes MMAP the same area
>>>> of memory.
>>>
>>> In which case, the area of memory would be mapped to different
>>> virtual address ranges in each process,
>>
>> Says who?  Unless the user process asks for MAP_FIXED or the address
>> range is already occupied in the user process, nothing prevents the OS
>> from putting the shared area in the same process.  If the permissions
>> are also the same, the OS can then use one ASID for the shared area.
>>
>> This would be especially useful for the read-only sections (e.g, code)
>> of common libraries like libc.  However, in todays security landscape,
>> you don't want one process to know where library code is mapped in
>> other processes (i.e., you want ASLR), so we can no longer make use of
>> that benefit.  And it's doubtful whether other uses are worth the
>> complications (and even if they are, there might be security issues,
>> too).
>>
>
> It seems to me, as long as it is a different place on each system,
> probably good enough. Demanding a different location in each process
> would create a lot of additional memory overhead due to things like
> base-relocations or similar.
>
>
>>> FWIW,  MAP_FIXED is specified as an optional feature by POSIX
>>> and may not be supported by the OS at all.
>>
>> As usual, what is specified by a common-subset standard is not
>> relevant for what an OS implementor has to do if they want to supply
>> more than a practically unusable checkbox feature like the POSIX
>> subsystem for Windows.  There is a reason why WSL2 includes a full
>> Linux kernel.
>>
>
> Still using WSL1 here as for whatever reason hardware virtualization has
> thus far refused to work on my PC, and is apparently required for WSL2.
>
> I can add this to my list of annoyances, like I can install "just short
> of 128GB", but putting in the full 128GB causes my PC to be like "Oh
> Crap, I guess there is 3.5GB ..." (but, apparently "112GB with unmatched
> RAM sticks is fine I guess...").
>
>
>
> But, yeah, the original POSIX is an easier goal to achieve, vs, say, the
> ability to port over the GNU userland.
>
>
> A lot of it is doable, but things like fork+exec are a problem if one
> wants to support NOMMU operation or otherwise run all of the logical
> processes in a shared address space.
>
> A practical alternative is something more like a CreateProcess style
> call, but this is "not exactly POSIX". In theory though, one could treat
> "fork()" more like "vfork()" and then turn the exec* call into a
> CreateProcess call and then terminate the current thread. Wouldn't
> really work "in general" though, for programs that expect to be able to
> "fork()" and then continue running the current program as a sub-process.
>
>
>>>> Should they both end up using the same ASID ??
>>>
>>> They couldn't share an ASID assuming the TLB looks up by VA.
>>
>> Of course the TLB looks up by VA, what else.  But if the VA is the
>> same and the PA is the same, the same ASID can be used.
>>
>
> ?...
>
> Typically the ASID applies to the whole virtual address space, not to
> individual memory objects.
>
>
> Or, at least, my page-table scheme doesn't have a way to express
> per-page ASIDs (merely if a page is Private/Shared, with the results of
> this partly depending on the current ASID given for the page-table).
>
> Where, say, I am mostly using 64-bit entries in the page-table, as going
> to a 128-bit page-table format would be a bit steep.
>
> Say, PTE layout (16K pages):
>   (63:48): ACLID
>   (47:14): Physical Address.
>   (13:12): Address or OS flag.
>   (11:10): For use by OS
>   ( 9: 0): Base page-access and similar.
>     (9): S1 / U1 (Page-Size or OS Flag)
>     (8): S0 / U0 (Page-Size or OS Flag)
>     (7): No User (Supervisor Only)
>     (6): No Execute
>     (5): No Write
>     (4): No Read
>     (3): No Cache
>     (2): Dirty (OS, ignored by TLB)
>     (1): Private/Shared (MBZ if not Valid)
>     (0): Present/Valid
>
> Where, ACLID serves as an index into the ACL table, or to lookup the
> VUGID parameters for the page (well, along with an alternate PTE variant
> that encodes VUGID directly, but reduces the physical address to 36
> bits). It is possible that the original VUGID scheme may be phased out
> in favor of using exclusively ACL checking.
>
> Note that the ACL checks don't add new permissions to a page, they add
> further restrictions (with the base-access being the most permissive).
>
> Some combinations of flags are special, and encode a few edge-case
> modes; such as pages which are Read/Write in Supervisor mode but
> Read-Only in user mode (separate from the possible use of ACL's to mark
> pages as read-only for certain tasks).
>
>
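
The quoted 64-bit PTE layout (16K pages) can be decoded with a few shifts and masks. A minimal sketch follows, using only the bit positions given in the quote. The macro and function names are made up, and the user-write check covers the base-access flags only: it models neither the ACL restrictions nor the special flag combinations mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits (9:0) as listed in the quoted layout; names are hypothetical. */
enum {
    PTE_VALID       = 1u << 0,
    PTE_PRIV_SHARED = 1u << 1,
    PTE_DIRTY       = 1u << 2,
    PTE_NOCACHE     = 1u << 3,
    PTE_NOREAD      = 1u << 4,
    PTE_NOWRITE     = 1u << 5,
    PTE_NOEXEC      = 1u << 6,
    PTE_NOUSER      = 1u << 7
};

/* Bits 63:48: ACLID. */
static inline uint16_t pte_aclid(uint64_t pte) { return (uint16_t)(pte >> 48); }

/* Bits 47:14: physical address of the 16K page (low 14 bits cleared). */
static inline uint64_t pte_phys(uint64_t pte) { return pte & 0x0000FFFFFFFFC000ull; }

/* Bits 9:0: base page-access flags. */
static inline unsigned pte_flags(uint64_t pte) { return (unsigned)(pte & 0x3FFu); }

/* Base-access check only; an ACL lookup could still restrict further. */
bool pte_allows_user_write(uint64_t pte)
{
    unsigned f = pte_flags(pte);
    return (f & PTE_VALID) && !(f & (PTE_NOUSER | PTE_NOWRITE));
}
```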
Q+ has a similar setup, but the ACLID is in a separate table.

For Q+, two similar MMUs have been designed, one for a large system and
one for a small system. They differ in the size of the page numbers: the
large system uses 64-bit page numbers, and the small system uses 32-bit
page numbers. The PTE for the large system is 96 bits, 32 bits larger
than the PTE for the small system, due to the extra page-number bits.
Pages are 64kB, so the small system supports a 48-bit address range
(32-bit page number plus 16-bit page offset).

The PTE has the following fields:
  Field    Bits   Description
  PPN      64/32  Physical page number
  URWX     3      User read-write-execute override
  SRWX     3      Supervisor read-write-execute override
  HRWX     3      Hypervisor read-write-execute override
  MRWX     3      Machine read-write-execute override
  CACHE    4      Cache-ability bits
  SW       2      OS software usage
  A        1      1=accessed/used
  M        1      1=modified
  V        1      1 if entry is valid, otherwise 0
  S        1      1=shared page
  G        1      1=global, ignore ASID
  T        1      0=page pointer, 1=table pointer
  RGN      3      Region table index
  LVL/BC   5      The page table level of the entry pointed to

The RWX and CACHE bits are overrides. These values normally come from
the region table, but may be overridden by values in the PTE.
The LVL/BC field is five bits to account for a five-bit bounce counter
for inverted page tables. Only a 3-bit level is in use.
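
As a sanity check on the sizes, the non-PPN field widths listed above add up to 32 bits, which gives 96 bits with a 64-bit PPN and 64 bits with a 32-bit PPN. A small sketch of that arithmetic (the array and function names are made up; only the widths come from the list):

```c
#include <stddef.h>

/* Field widths in bits, in the order listed above, excluding the PPN. */
static const unsigned char qplus_pte_widths[] = {
    3, 3, 3, 3,          /* URWX, SRWX, HRWX, MRWX */
    4,                   /* CACHE */
    2,                   /* SW */
    1, 1, 1, 1, 1, 1,    /* A, M, V, S, G, T */
    3,                   /* RGN */
    5                    /* LVL/BC */
};

/* Total PTE width for a given physical-page-number width. */
unsigned qplus_pte_bits(unsigned ppn_bits)
{
    unsigned total = ppn_bits;
    size_t n = sizeof qplus_pte_widths / sizeof qplus_pte_widths[0];
    for (size_t i = 0; i < n; i++)
        total += qplus_pte_widths[i];
    return total;
}
```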

There is a separate table with per-page information that contains a
reference to an ACL (16 bits), share counts (16 bits), privilege level
(8 bits), and access key (24 bits), and a couple of other fields for
compression / encryption.

I have made the PTBR a full 64-bit address now rather than a page number
with control bits. So, it may now point into the middle of a page
directory which is shared between tasks.

The table walker and region table look like PCI devices to the system.

>
> But, FWIW, I ended up adding an extended MAP_GLOBAL flag for "mmap'ed
> space should be visible to all of the processes"; which in turn was used
> as part of the backing memory for the "GlobalAlloc" style calls (it is
> not a global heap, in that each process still manages the memory
> locally, but other intersecting processes can see the address within
> their own address spaces).
>
> Well, along with a MAP_PHYSICAL flag, for if one needs memory where
> VA==PA (this may fail, with the mmap returning NULL, effectively only
> allowed for "superusermode"; mostly intended for hardware interfaces).
>
>
>
> The usual behavior of MAP_SHARED didn't really make sense outside of the
> context of mapping a file, and didn't really serve the needed purpose
> (say, one wants to hand off a pointer to a bitmap buffer to the GUI
> subsystem to have it drawn into a window).
>
> It is also being used for things like shared scratch buffers, say, for
> passing BITMAPINFOHEADER and MIDI commands and similar across the
> interprocess calls (the C API style wrapper wraps a lot of this; whereas
> the internal COM-style interfaces will require any pointer-style
> arguments to point to shared memory).
>
> This is not required for normal syscall handlers, where the usual
> assumption is that normal syscalls will have some means of directly
> accessing the address space of the caller process. I didn't really want
> to require that TKGDI have this same capability.
>
> It is debatable whether calls like BlitImage and similar should require
> global memory, or merely recommend it (potentially having the call fall
> back to a scratch buffer and internal memcpy if the passed bitmap image
> is not already in global memory).
>
>
>
> I had originally considered a more complex mechanism for object sharing,
> but then ended up going with this for now partly because it was easier
> and lower overhead (well, and also because I wanted something that would
> still work if/when I started to add proper memory protection). May make
> sense to impose a limit on per-process global alloc's though (since it
> is intended specifically for shared buffers and not for general heap
> allocation; where for heap allocation ANONYMOUS+PRIVATE would be used
> instead).
>
> Though, looking at stuff, MAP_GLOBAL semantics may have also been
> partially covered by "MAP_ANONYMOUS|MAP_SHARED"?... Though, the
> semantics aren't the same.
>
> I guess, another alternative would have been to use shm_open+mmap or
> similar.
>
>
> Where, say, memory map will look something like:
>   00yy_xxxxxxxx: Physical mapped and direct-mapped memory.
>   00yy_xxxxxxxx: Start of global virtual memory (*1);
>   3FFF_xxxxxxxx: End of global virtual memory;
>   4000_xxxxxxxx: Start of private/local virtual memory (possible, *2);
>   7FFF_xxxxxxxx: End of private/local virtual memory (possible);
>   8000_xxxxxxxx: Start of kernel virtual memory;
>   BFFF_xxxxxxxx: End of kernel virtual memory;
>   Cxxx_xxxxxxxx: Physical Address Range (Cached);
>   Dxxx_xxxxxxxx: Physical Address Range (Volatile/NoCache);
>   Exxx_xxxxxxxx: Reserved;
>   Fxxx_xxxxxxxx: MMIO and similar.
>
> *1: The 'yy' division point may move, will depend on things like how
> much RAM exists (currently, 00/01; no current "sane cost" FPGA boards
> having more than 256 or 512 MB of RAM).
>
> *2: If I go to a scheme of giving processes their own address spaces,
> then private memory will be used. It is likely that executable code may
> remain shared, but the data sections and heap would be put into private
> address ranges.
>
>


Re: Tonight's tradeoff

<uk0ohi$3fcof$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=35273&group=comp.arch#35273

Path: i2pn2.org!i2pn.org!news.1d4.us!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Sun, 26 Nov 2023 18:40:16 -0600
Organization: A noiseless patient Spider
Lines: 202
Message-ID: <uk0ohi$3fcof$1@dont-email.me>
References: <uis67u$fkj4$1@dont-email.me>
<7761287e80bb22b7742fd7f292664497@news.novabbs.com>
<uj9bm2$36401$1@dont-email.me>
<71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com>
<ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me>
<987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com>
<ujrm4a$2llie$1@dont-email.me>
<d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com>
<bps8N.150652$wvv7.7314@fx14.iad>
<2023Nov26.164506@mips.complang.tuwien.ac.at> <uk0e96$3dtqn$1@dont-email.me>
<HKP8N.30032$ayBd.9426@fx07.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 27 Nov 2023 00:40:18 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="2fcebed1d03588c400a4bb58f77147d2";
logging-data="3650319"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+6dApeta6TIdORtW5He/38"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:oNvNweDmnz5kTlgczYj81lNQToY=
In-Reply-To: <HKP8N.30032$ayBd.9426@fx07.iad>
Content-Language: en-US
 by: BGB - Mon, 27 Nov 2023 00:40 UTC

On 11/26/2023 4:35 PM, Scott Lurndal wrote:
> BGB <cr88192@gmail.com> writes:
>> On 11/26/2023 9:45 AM, Anton Ertl wrote:
>
>>
>>
>> Where, say, memory map will look something like:
>> 00yy_xxxxxxxx: Physical mapped and direct-mapped memory.
>> 00yy_xxxxxxxx: Start of global virtual memory (*1);
>> 3FFF_xxxxxxxx: End of global virtual memory;
>> 4000_xxxxxxxx: Start of private/local virtual memory (possible, *2);
>> 7FFF_xxxxxxxx: End of private/local virtual memory (possible);
>> 8000_xxxxxxxx: Start of kernel virtual memory;
>> BFFF_xxxxxxxx: End of kernel virtual memory;
>> Cxxx_xxxxxxxx: Physical Address Range (Cached);
>> Dxxx_xxxxxxxx: Physical Address Range (Volatile/NoCache);
>> Exxx_xxxxxxxx: Reserved;
>> Fxxx_xxxxxxxx: MMIO and similar.
>
>
> The modern preference is to make the memory map flexible.
>
> Linux, for example, requires that PCI Base Address Registers
> be programmable by the operating system, and the OS can
> choose any range (subject to host bridge configuration, of
> course) for the device.
>

As for the memory map, the actual hardware-relevant part of the map is:
0000_xxxxxxxx..7FFF_xxxxxxxx: User Mode, virtual
8000_xxxxxxxx..BFFF_xxxxxxxx: Supervisor Mode, virtual
Cxxx_xxxxxxxx: Physical Address Range (Cached);
Dxxx_xxxxxxxx: Physical Address Range (Volatile/NoCache);
Exxx_xxxxxxxx: Reserved;
Fxxx_xxxxxxxx: MMIO and similar.
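
Since the split is by the top hex digit of the 48-bit address, the decode the L1 cache has to do is cheap. A sketch in C of that classification follows. The enum and function names are made up, and this ignores the 96-bit VA mode (where a non-zero GBH makes the whole 48-bit space User Mode virtual).

```c
#include <stdint.h>

/* Hypothetical names for the regions in the map above. */
typedef enum {
    REGION_USER_VIRT,      /* 0000_xxxxxxxx..7FFF_xxxxxxxx */
    REGION_SUPER_VIRT,     /* 8000_xxxxxxxx..BFFF_xxxxxxxx */
    REGION_PHYS_CACHED,    /* Cxxx_xxxxxxxx */
    REGION_PHYS_NOCACHE,   /* Dxxx_xxxxxxxx */
    REGION_RESERVED,       /* Exxx_xxxxxxxx */
    REGION_MMIO            /* Fxxx_xxxxxxxx */
} region;

/* Classify by bits 47:44; anything above bit 47 is ignored here. */
region classify_addr48(uint64_t addr)
{
    unsigned top = (unsigned)((addr >> 44) & 0xFu);
    if (top <= 0x7u)
        return REGION_USER_VIRT;
    if (top <= 0xBu)
        return REGION_SUPER_VIRT;
    switch (top) {
    case 0xCu: return REGION_PHYS_CACHED;
    case 0xDu: return REGION_PHYS_NOCACHE;
    case 0xEu: return REGION_RESERVED;
    default:   return REGION_MMIO;
    }
}
```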

No good way to make this much more flexible: some of this stuff requires
special handling from the L1 cache, and by the time it reaches the TLB,
it is too late (unless there were additional logic to be like "Oh, crap,
this was actually meant for MMIO!").

Though, with the 96-bit VA mode, if GBH(47:0)!=0, then the entire 48-bit
space is User Mode Virtual (and it is not possible to access MMIO or
similar at all, short of reloading 0 into GBH, or using XMOV.x
instructions with a 128-bit pointer, say:
0000_0000_00000000-tttt_Fxxx_xxxxxxxx).

Note here that the high 16-bits are ignored for normal pointers
(typically used for type-tagging or bounds-checking by the runtime).

For branches and captured Link-Register values:
  If LSB is 0: High 16 bits are ignored;
    The branch will always be within the same CPU Mode.
  If LSB is 1: High 16 bits encode CPU Mode control flags.
  LSB is always set for created LR values.
  CPU will trap if the LSB is Clear in LR during an RTS/RTSU.

Setting the LSB and putting the mode in the high 16 bits is also often
used on function pointers so that theoretically Baseline and XG2 code
can play along together (though, at present, BGBCC does not generate any
mixed binaries, so this part would mostly apply to DLLs).
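
Roughly, the tagging scheme looks like this in C. This is a sketch under assumptions: the function names are made up, the mode flags are treated as an opaque 16-bit value, and it assumes branch targets are at least 2-byte aligned so that bit 0 is free to act as the tag.

```c
#include <stdbool.h>
#include <stdint.h>

#define ADDR48_MASK 0x0000FFFFFFFFFFFFull

/* Build an LR-style value: mode flags in bits 63:48, LSB set as the tag. */
uint64_t make_tagged_target(uint64_t addr, uint16_t mode)
{
    return ((uint64_t)mode << 48) | (addr & ADDR48_MASK) | 1u;
}

/* LSB set: the high 16 bits are CPU Mode flags rather than ignored bits. */
bool target_carries_mode(uint64_t v) { return (v & 1u) != 0; }

/* Only meaningful when target_carries_mode() is true. */
uint16_t target_mode(uint64_t v) { return (uint16_t)(v >> 48); }

/* Strip the high 16 bits and the tag bit to recover the 48-bit target. */
uint64_t target_addr(uint64_t v) { return v & ADDR48_MASK & ~1ull; }
```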

For the time being, there is no PCI or PCIe in my case.
Nor have I gone up the learning curve for what would be required to
interface with any PCIe devices.

Had tried to get USB working, but didn't have much success as it seemed
I was still missing something (seemed to be sending/receiving bytes, but
the devices would not respond as expected to any requests or commands).

Mostly ended up using a PS2 keyboard, and had realized that (IIRC) if
one pulled the D+ and D- lines high, a USB mouse would instead
implement the PS2 protocol (though this didn't work on the USB
keyboards I had tried).

Most devices are mapped to fixed address ranges in the MMIO space:
  F000Cxxx: Rasterizer / Edge-Walker Control Registers
  F000Exxx: Various basic devices
    SDcard, PS2 Keyboard/Mouse, RS232 UART (*), etc
  F008xxxx: FM Synth / Sample Mixer Control / ...
  F009xxxx: PCM Audio Loop/Registers
  F00Axxxx: MMIO VRAM
  F00Bxxxx: MMIO VRAM and Video Control
    At present, VRAM is also RAM-backed.
    VRAM framebuffer base address in RAM is now movable.

All this existing within:
FFFF_Fxxxxxxx

*: RS232 generally connected to a UART interface that feeds back to a
connected computer via an on-board FTDI chip or similar.

As for physical memory map, it is sorta like:
00000000..00007FFF: Boot ROM
0000C000..0000DFFF: Boot SRAM
00010000..0001FFFF: ZERO's
00020000..0002FFFF: BJX2 NOP's
00030000..0003FFFF: BJX2 BREAK's
...
01000000..1FFFFFFF: Reserved for RAM
20000000..3FFFFFFF: Reserved for More RAM (And/or repeating)
40000000..5FFFFFFF: RAM repeats (and/or Reserved)
60000000..7FFFFFFF: RAM repeats more (and/or Reserved)
80000000..EFFFFFFF: Reserved
F0000000..FFFFFFFF: MMIO in 32-bit Mode (*1)

*1: There used to be an MMIO range at 0000_F0000000, but this has been
eliminated in favor of only recognizing this range as MMIO in 32-bit
mode (where only the low 32-bits of the address are used). Enabling
48-bit addressing will now require using the proper MMIO address.

Currently, nothing past the low 4GB is used in the physical memory map.

> It is notable that even on non-Intel systems, one may need
> to map a 32-bit PCI BAR (AHCI is the classic example) which
> requires the address programmed in the BAR to be less than
> 0x100000000 (4GB). Granted, systems can have custom PCI controllers
> that remap that into the larger physical address space with
> a bit of extra hardware, however the kernel people don't
> like that at all since there is no universal standard for
> such remapping and they don't want to support
> dozens of independent implementations, constantly
> changing from generation to generation.
>
> Many modern SoCs (and ARM SBSA requires this) make their
> on-board devices and coprocessors look like PCI express
> devices to software, and SBSA requires the PCIe ECAM
> region for device discovery. Here again, each of
> these on board devices will have from one to six
> memory region base address registers (or one to
> three for 64-bit bars).
>
> Encoding memory attributes into the address is common
> in microcontrollers, but in a general purpose processor
> constrains the system to an extent sufficient to make it
> unattractive for general purpose workloads.

Possibly, but making things more flexible here would be a non-trivial
level of complexity to deal with at the moment (and, it seemed relevant
at first to design something I could "actually implement").

At the time I started out on this, even maintaining similar hardware
interfaces to a minimalist version of the Sega Dreamcast (what the
BJX1's hardware-interface design was partly based on) was asking a bit
too much (even after leaving out things like the CD-ROM drive and similar).

So, I simplified things somewhat, initially taking some design
inspiration in these areas from the Commodore 64 and MSP430 and similar...

Say:
VRAM was reinterpreted as being an 80x25 grid of 8x8 pixel color cells;
Audio was a simple MMIO-backed PCM loop (with a few registers to adjust
the sample rate and similar).

In terms of output signals, the display module drives a VGA output, and
the audio is generally pulled off by turning an IO pin on and off really
fast.

Or, one drives 2 lines for audio, say:
10: +, 01: -, 11: 0

Using an H-Bridge driver as an amplifier (turns out one needs to drive
like 50-100mA to get any decent level of loudness out of headphones;
which is well beyond the power normal IO pins can deliver). Generally
PCM needs to get turned into PWM/PDM.
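
One simple way to do the PCM to PDM step is a first-order accumulator (a basic delta-sigma modulator). The sketch below is a generic software model of that technique, not the actual audio logic described here, and the names pdm_state, pdm_step and pdm_ones are made up. The density of 1 bits tracks the sample value, so a low-pass filter on the pin (or the headphones themselves) recovers the level.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical modulator state: a running accumulator. */
typedef struct { uint32_t acc; } pdm_state;

/* Emit one output bit for the current 16-bit signed PCM sample. */
int pdm_step(pdm_state *s, int16_t sample)
{
    s->acc += (uint32_t)(sample + 32768);   /* offset-binary, 0..65535 */
    if (s->acc >= 65536u) {                 /* carry out: drive the pin high */
        s->acc -= 65536u;
        return 1;
    }
    return 0;
}

/* Count the 1 bits produced over `ticks` output clocks of a held sample. */
size_t pdm_ones(int16_t sample, size_t ticks)
{
    pdm_state s = { 0 };
    size_t ones = 0;
    for (size_t i = 0; i < ticks; i++)
        ones += (size_t)pdm_step(&s, sample);
    return ones;
}
```

A mid-scale sample (0) gives a 50% density of ones, and a sample of 16384 gives 75%. In practice the output clock would run well above the PCM sample rate, with each sample held for many ticks.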

Driving stereo via a dual H-Bridge driver would get a little wonky
though, since headphones use Left/Right and a Common, effectively one
needs to drive the center as a neutral, with L/R channels (and/or, just
get lazy and drive mono across both the L/R channels using a single
H-Bridge and ignore the center point, which ironically can get more
loudness at less current because now one is dealing with 70 ohm rather
than 35 ohm).

....

Generally, with all of the hardware addresses at fixed locations.
Doing any kind of dynamic configuration or allowing hardware addresses
to be movable would have likely made the MMIO devices significantly more
expensive (vs hard-coding the address of each device).

Did generally go with MMIO rather than x86 style IO ports though.
Partly because IO ports suck, and I wasn't quite *that* limited (say,
could afford to use a 28-bit space, rather than a 16-bit space).

....

Re: Tonight's tradeoff

<ba33cfddf95a05be000caf419cca9e18@news.novabbs.com>


https://news.novabbs.org/devel/article-flat.php?id=35276&group=comp.arch#35276

Path: i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Mon, 27 Nov 2023 02:09:52 +0000
Organization: novaBBS
Message-ID: <ba33cfddf95a05be000caf419cca9e18@news.novabbs.com>
References: <uis67u$fkj4$1@dont-email.me> <7761287e80bb22b7742fd7f292664497@news.novabbs.com> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <uk0e96$3dtqn$1@dont-email.me> <HKP8N.30032$ayBd.9426@fx07.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2230658"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$8KtEjuM9.iI66ktz1NIrz.jffSqbsMyj5RSDUG.7EqZnmIiNhmfBm
X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on novalink.us
X-Rslight-Posting-User: 7e9c45bcd6d4757c5904fbe9a694742e6f8aa949
 by: MitchAlsup - Mon, 27 Nov 2023 02:09 UTC

Scott Lurndal wrote:

> BGB <cr88192@gmail.com> writes:
>>On 11/26/2023 9:45 AM, Anton Ertl wrote:

>>
>>
>>Where, say, memory map will look something like:
>> 00yy_xxxxxxxx: Physical mapped and direct-mapped memory.
>> 00yy_xxxxxxxx: Start of global virtual memory (*1);
>> 3FFF_xxxxxxxx: End of global virtual memory;
>> 4000_xxxxxxxx: Start of private/local virtual memory (possible, *2);
>> 7FFF_xxxxxxxx: End of private/local virtual memory (possible);
>> 8000_xxxxxxxx: Start of kernel virtual memory;
>> BFFF_xxxxxxxx: End of kernel virtual memory;
>> Cxxx_xxxxxxxx: Physical Address Range (Cached);
>> Dxxx_xxxxxxxx: Physical Address Range (Volatile/NoCache);
>> Exxx_xxxxxxxx: Reserved;
>> Fxxx_xxxxxxxx: MMIO and similar.

> The modern preference is to make the memory map flexible.

// cacheable, used, modified bits
CUM kind of access
--- ------------------------------
000 uncacheable DRAM
001 MMI/O
010 config
011 ROM
1xx cacheable DRAM

> Linux, for example, requires that PCI Base Address Registers
> be programmable by the operating system, and the OS can
> choose any range (subject to host bridge configuration, of
> course) for the device.

Easily done, just create an uncacheable PTE and set UM to 10
for config space or 01 for MMI/O space.
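The table above can be read as a small decoder. The following Python sketch only restates that table; the function names are hypothetical and nothing here is taken from the actual My 66000 definition beyond the five rows quoted.

```python
# UM encodings that apply when the C (cacheable) bit is 0, per the table above.
UM_KINDS = {
    0b00: "uncacheable DRAM",
    0b01: "MMI/O",
    0b10: "config",
    0b11: "ROM",
}


def access_kind(cum):
    """Decode a 3-bit CUM field (cacheable, used, modified) into an access kind."""
    if cum & 0b100:
        # 1xx: with C set, U and M keep their usual used/modified meaning.
        return "cacheable DRAM"
    return UM_KINDS[cum & 0b011]


def uncacheable_um_bits(kind):
    """Return the UM bits to put in an uncacheable PTE for the given kind."""
    for um, name in UM_KINDS.items():
        if name == kind:
            return um
    raise ValueError(f"unknown access kind: {kind}")
```

With this reading, mapping a BAR wherever the OS chose just means writing a PTE with C=0 and the UM bits returned by `uncacheable_um_bits("MMI/O")` or `uncacheable_um_bits("config")`.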

> It is notable that even on non-intel systems, one may need
> to map a 32-bit PCI BAR (AHCI is the classic example) which
> requires the address programmed in the bar to be less than
> 0x100000000 (4 GiB).

I/O MMU translates these devices from a 32-bit VAS into the
64-bit PAS.

> Granted systems can have custom PCI controllers
> that remap that into the larger physical address space with
> a bit of extra hardware, however the kernel people don't
> like that at all since there is no universal standard for
> such remapping and they don't want to support
> dozens of independent implementations, constantly
> changing from generation to generation.

You would figure that if they are already supporting 4 incompatible
mapping systems {Intel, AMD, ARM, RISC-V}, they would have
gotten good at these implementations :-)

> Many modern SoCs (and ARM SBSA requires this) make their
> on-board devices and coprocessors look like PCI express
> devices to software,

I made the CPU/cores in My 66000 have a configuration port
that is setup during boot that smells just like a PCIe
port.

> and SBSA requires the PCIe ECAM
> region for device discovery. Here again, each of
> these on board devices will have from one to six
> memory region base address registers (or one to
> three for 64-bit bars).

> Encoding memory attributes into the address is common
> in microcontrollers, but in a general purpose processor
> constrains the system to an extent sufficient to make it
> unattractive for general purpose workloads.

Agreed.

Re: Tonight's tradeoff

<uk1bgo$3lf4g$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=35279&group=comp.arch#35279

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Mon, 27 Nov 2023 00:04:06 -0600
Organization: A noiseless patient Spider
Lines: 180
Message-ID: <uk1bgo$3lf4g$1@dont-email.me>
References: <uis67u$fkj4$1@dont-email.me>
<7761287e80bb22b7742fd7f292664497@news.novabbs.com>
<uj9bm2$36401$1@dont-email.me>
<71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com>
<ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me>
<987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com>
<ujrm4a$2llie$1@dont-email.me>
<d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com>
<bps8N.150652$wvv7.7314@fx14.iad>
<2023Nov26.164506@mips.complang.tuwien.ac.at> <uk0e96$3dtqn$1@dont-email.me>
<HKP8N.30032$ayBd.9426@fx07.iad>
<ba33cfddf95a05be000caf419cca9e18@news.novabbs.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 27 Nov 2023 06:04:08 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="2fcebed1d03588c400a4bb58f77147d2";
logging-data="3849360"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/caqTt2QAZhVNrio9tZZD3"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:fmTntfmAs1VB7AO1s+1wZtQj1Jg=
Content-Language: en-US
In-Reply-To: <ba33cfddf95a05be000caf419cca9e18@news.novabbs.com>
 by: BGB - Mon, 27 Nov 2023 06:04 UTC

On 11/26/2023 8:09 PM, MitchAlsup wrote:
> Scott Lurndal wrote:
>
>> BGB <cr88192@gmail.com> writes:
>>> On 11/26/2023 9:45 AM, Anton Ertl wrote:
>
>>>
>>>
>>> Where, say, memory map will look something like:
>>>   00yy_xxxxxxxx: Physical mapped and direct-mapped memory.
>>>   00yy_xxxxxxxx: Start of global virtual memory (*1);
>>>   3FFF_xxxxxxxx: End of global virtual memory;
>>>   4000_xxxxxxxx: Start of private/local virtual memory (possible, *2);
>>>   7FFF_xxxxxxxx: End of private/local virtual memory (possible);
>>>   8000_xxxxxxxx: Start of kernel virtual memory;
>>>   BFFF_xxxxxxxx: End of kernel virtual memory;
>>>   Cxxx_xxxxxxxx: Physical Address Range (Cached);
>>>   Dxxx_xxxxxxxx: Physical Address Range (Volatile/NoCache);
>>>   Exxx_xxxxxxxx: Reserved;
>>>   Fxxx_xxxxxxxx: MMIO and similar.
>
>> The modern preference is to make the memory map flexible.
>

As noted, some amount of the above would be part of the OS memory map,
rather than a hardware imposed memory map.

Like, say, Windows on x86 typically had:
00000000..000FFFFF: DOS-like map (9x)
00100000..7FFFFFFF: Userland stuff
80000000..BFFFFFFF: Shared stuff
C0000000..FFFFFFFF: Kernel Stuff

Did the hardware enforce this? No.
Did Windows follow such a structure? Yes, generally.

Linux sorta followed a similar structure, except that some versions
gave the full 4GB to userland addresses (which was an annoyance if
trying to use TagRefs, since the OS might actually put memory in the
part of the address space one would have otherwise used to hold
fixnums and similar).

Ironically though, this sort of thing (along with the limits of 32-bit
tagrefs) gave me an incentive to go over to 64-bit tagrefs even on
32-bit machines, and a generally similar tagref scheme got carried into
my later projects.

Say:
0ttt_xxxx_xxxxxxxx: Pointers
1ttt_xxxx_xxxxxxxx: Small Value Spaces
2ttt_xxxx_xxxxxxxx: ...
3yyy_xxxx_xxxxxxxx: Bounds Checked Pointers
4iii_iiii_iiiiiiii: Fixnum
..
7iii_iiii_iiiiiiii: Fixnum
8iii_iiii_iiiiiiii: Flonum
..
Biii_iiii_iiiiiiii: Flonum
...

But, this scheme is more used by the runtime, not so much by the hardware.
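A runtime-side classifier for this layout might look like the sketch below. It only uses the top-nibble ranges listed above; the function names are made up, and treating the fixnum payload as a 62-bit two's complement integer is an assumption, since the listing does not say how the payload bits are interpreted.

```python
FIXNUM_BITS = 62
FIXNUM_MASK = (1 << FIXNUM_BITS) - 1


def tagref_class(value):
    """Classify a 64-bit tagref by its top 4 bits, following the listing above."""
    top = (value >> 60) & 0xF
    if top == 0x0:
        return "pointer"
    if top == 0x1:
        return "small value space"
    if top == 0x3:
        return "bounds-checked pointer"
    if 0x4 <= top <= 0x7:
        return "fixnum"
    if 0x8 <= top <= 0xB:
        return "flonum"
    return "unspecified"  # 2 and C..F are elided in the listing


def encode_fixnum(n):
    """Pack an integer into the 4..7 range (assumed 62-bit two's complement)."""
    return (0b01 << FIXNUM_BITS) | (n & FIXNUM_MASK)


def decode_fixnum(value):
    """Unpack a fixnum, sign-extending from bit 61 (same assumption)."""
    payload = value & FIXNUM_MASK
    if payload >> (FIXNUM_BITS - 1):
        payload -= 1 << FIXNUM_BITS
    return payload
```

Because fixnums occupy the whole 4..7 range, the test for "is this a fixnum" reduces to checking that the top two bits are 01, which is cheap to do in software.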

For the most part, C doesn't use pointer tagging.
However BGBScript/JavaScript and my BASIC variant do make use of
type-tagging.

>                 // cacheable, used, modified bits
>     CUM            kind of access
>     ---            ------------------------------
>     000            uncacheable DRAM
>     001            MMI/O
>     010            config
>     011            ROM
>     1xx            cacheable DRAM
>

Hmm...
Unfortunate acronyms are inescapable it seems...

>> Linux, for example, requires that PCI Base Address Registers
>> be programmable by the operating system, and the OS can
>> choose any range (subject to host bridge configuration, of
>> course) for the device.
>
> Easily done, just create an uncacheable PTE and set UM to 10
> for config space or 01 for MMI/O space.
>

I guess, if PCIe were supported, some scheme could be developed to map
the PCIe space either into part of the MMIO space, into RAM space, or
maybe some other space.

There is a functional difference between MMIO space and RAM space in
terms of how they are accessed:
RAM space: Cache does its thing and works with cache-lines;
MMIO space: A request is sent over the bus, and then it waits for a
response.

If the MMIO bridge sees an MMIO request, it puts it onto the MMIO Bus,
and sees if any device responds (if so, sending the response back to the
origin). Otherwise, if no device responds after a certain number of
clock cycles, an all-zeroes response is sent instead.
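The bridge behavior described here can be modeled in a few lines. This is a toy simulation under assumptions: the `Device` class, the addresses, the latencies, and the 16-cycle timeout are all hypothetical, chosen only to show the "respond or fall back to zeroes" rule.

```python
class Device:
    """A hypothetical MMIO device with a fixed base address and response latency."""

    def __init__(self, base, size, latency, regs):
        self.base = base
        self.size = size
        self.latency = latency
        self.regs = regs  # dict: register offset -> value

    def respond(self, addr, cycle):
        """Return the register value once `latency` cycles have elapsed, else None."""
        if self.base <= addr < self.base + self.size and cycle >= self.latency:
            return self.regs.get(addr - self.base, 0)
        return None


def mmio_read(devices, addr, timeout_cycles=16):
    """Put a read on the bus; return all-zeroes if no device answers in time."""
    for cycle in range(timeout_cycles):
        for dev in devices:
            value = dev.respond(addr, cycle)
            if value is not None:
                return value
    return 0
```

One consequence of this rule is that software cannot distinguish an absent device from a register that genuinely reads as zero, so probing needs a register with a known nonzero value.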

Currently, no sort of general purpose bus is routed outside of the FPGA,
and if it did exist, it is not yet clear what form it would take.

Would need to limit pin counts though, so probably some sort of serial
bus in any case.

PCIe might be sort of tempting in the sense that apparently, 1 PCIe lane
can be subdivided to multiple devices, and bridge cards exist that can
apparently route PCIe over a repurposed USB cable and then connect
multiple devices, PCI, or ISA cards. Albeit apparently with mixed results.

>> It is notable that even on non-intel systems, one may need
>> to map a 32-bit PCI BAR (AHCI is the classic example) which
>> requires the address programmed in the bar to be less than
>> 0x100000000 (4 GiB).
>
> I/O MMU translates these devices from a 32-bit VAS into the 64-bit PAS.
>
>>             Granted systems can have custom PCI controllers
>> that remap that into the larger physical address space with
>> a bit of extra hardware, however the kernel people don't
>> like that at all since there is no universal standard for
>> such remapping and they don't want to support
>> dozens of independent implementations, constantly
>> changing from generation to generation.
>
> You would figure that if they are already supporting 4 incompatible
> mapping systems {Intel, AMD, ARM, RISC-V}, they would have
> gotten good at these implementations :-)
>
>> Many modern SoCs (and ARM SBSA requires this) make their
>> on-board devices and coprocessors look like PCI express
>> devices to software,
>
> I made the CPU/cores in My 66000 have a configuration port
> that is setup during boot that smells just like a PCIe
> port.
>
>>                      and SBSA requires the PCIe ECAM
>> region for device discovery.    Here again, each of
>> these on board devices will have from one to six
>> memory region base address registers (or one to
>> three for 64-bit bars).
>
>> Encoding memory attributes into the address is common
>> in microcontrollers, but in a general purpose processor
>> constrains the system to an extent sufficient to make it
>> unattractive for general purpose workloads.
>
> Agreed.

At least for the userland address ranges, there is less of this going on
than in SH4, which had basically spent the top 3 bits of the 32-bit
address as mode.

Say, IIRC:
(29): No TLB
(30): No Cache
(31): Supervisor

So, in effect, there was only 512MB of usable address space.
The SH-4A had then expanded the lower part to 31 bits, so one could have
2GB of usermode address space.
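To make the arithmetic concrete, here is a small Python sketch that splits a 32-bit address using the three mode bits exactly as recalled above. The bit assignments follow the "IIRC" list in this post rather than the SH-4 manual, and the function name is hypothetical.

```python
OFFSET_BITS = 29
USABLE_BYTES = 1 << OFFSET_BITS  # address space left after the 3 mode bits


def sh4_style_decode(addr):
    """Split a 32-bit address into mode bits (29, 30, 31) and a 29-bit offset."""
    addr &= 0xFFFFFFFF
    return {
        "no_tlb": bool((addr >> 29) & 1),
        "no_cache": bool((addr >> 30) & 1),
        "supervisor": bool((addr >> 31) & 1),
        "offset": addr & (USABLE_BYTES - 1),
    }
```

With three bits spent on mode, `USABLE_BYTES` works out to 512MB; keeping only bit 31 as a mode bit leaves 31 bits, or 2GB, which matches the SH-4A change mentioned above.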

But, say, if one can have 47 bits of freely usable virtual address space
for userland, probably good enough.

Re: Tonight's tradeoff

<2023Nov27.082222@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=35281&group=comp.arch#35281

Path: i2pn2.org!i2pn.org!news.nntp4.net!news.hispagatos.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Mon, 27 Nov 2023 07:22:22 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 74
Message-ID: <2023Nov27.082222@mips.complang.tuwien.ac.at>
References: <uis67u$fkj4$1@dont-email.me> <7761287e80bb22b7742fd7f292664497@news.novabbs.com> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <uk0e96$3dtqn$1@dont-email.me>
Injection-Info: dont-email.me; posting-host="9f161ffc6bf8f4ad3914c9d4166bd6be";
logging-data="3876579"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/HtptciiVkTuz9LOwzFRlb"
Cancel-Lock: sha1:kaUwa944q7XOOWMFvmovmLnS7D8=
X-newsreader: xrn 10.11
 by: Anton Ertl - Mon, 27 Nov 2023 07:22 UTC

BGB <cr88192@gmail.com> writes:
>On 11/26/2023 9:45 AM, Anton Ertl wrote:
>> This would be especially useful for the read-only sections (e.g, code)
>> of common libraries like libc. However, in today's security landscape,
>> you don't want one process to know where library code is mapped in
>> other processes (i.e., you want ASLR), so we can no longer make use of
>> that benefit. And it's doubtful whether other uses are worth the
>> complications (and even if they are, there might be security issues,
>> too).
>>
>
>It seems to me, as long as it is a different place on each system,
>probably good enough. Demanding a different location in each process
>would create a lot of additional memory overhead due to things like
>base-relocations or similar.

If the binary is position-independent (the default on Linux on AMD64),
there is no such overhead.

I just started the same binary twice and looked at the address of the
same piece of code:

Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
see open-file
Code open-file
0x000055c2b76d5833 <gforth_engine+6595>: mov %r15,0x1c126(%rip)
...

For the other process the same instruction is:

Code open-file
0x000055dd606e4833 <gforth_engine+6595>: mov %r15,0x1c126(%rip)

Following the calls until I get to glibc, I get, for the two processes:

0x00007f705c0c3b90 <__libc_open64+0>: push %r12
0x00007f190aa34b90 <__libc_open64+0>: push %r12

So not just the binary, but also glibc resides at different virtual
addresses in the two processes.

So obviously the Linux and glibc maintainers think that per-system
ASLR is not good enough. They obviously want ASLR to work as well as
possible against local attackers.
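The four addresses above can be checked mechanically. This Python sketch splits each one into a page base and an in-page offset, assuming 4 KiB pages (the usual base page size on AMD64); the variable and function names are just for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def split(addr):
    """Return (page base, offset within the page) for a virtual address."""
    return addr & ~(PAGE_SIZE - 1), addr & (PAGE_SIZE - 1)


# Addresses copied from the two gforth sessions above.
gforth_a, gforth_b = 0x000055c2b76d5833, 0x000055dd606e4833
libc_a, libc_b = 0x00007f705c0c3b90, 0x00007f190aa34b90

for name, a, b in (("gforth", gforth_a, gforth_b), ("libc", libc_a, libc_b)):
    (base_a, off_a), (base_b, off_b) = split(a), split(b)
    print(name, hex(off_a), hex(off_b), hex(base_b - base_a))
```

In both pairs the in-page offsets agree while the page bases differ, which is what one expects when the same position-independent image is mapped at a different, page-aligned random base in each process.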

>> Of course the TLB looks up by VA, what else. But if the VA is the
>> same and the PA is the same, the same ASID can be used.
>>
>
>?...
>
>Typically the ASID applies to the whole virtual address space, not to
>individual memory objects.

Yes, one would need more complicated ASID management than setting
"the" ASID on switching to a process if different VMAs in the process
have different ASIDs. Another reason not to go there.

Power (and IIRC HPPA) do something in this direction with their
"segments", where the VA space is split into 16 equal parts, and
IIRC each of the 16 parts extends the address by 16 bits (minus the 4
bits of the segment number), so essentially they have 16 16-bit ASIDs.
The address spaces are somewhat inflexible, but with 64-bit VAs
(i.e. 60-bit address spaces) that may be good enough for quite a
while. The cost is that you now have to manage 16 ASID registers.
And if we ever get to actually making use of more than 60 bits of VA in
other ways, this ASID scheme would have to be combined with that other
use of the VAs.
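A sketch of the segment idea as recalled above, in Python: the top 4 bits of a 64-bit address pick one of 16 segment registers, and the 16-bit value found there replaces those 4 bits. The field widths follow the description in this post (which is itself hedged with "IIRC"), not the Power ISA documents, and all names are hypothetical.

```python
SEG_SHIFT = 60
OFFSET_MASK = (1 << SEG_SHIFT) - 1


def extend_va(ea, seg_regs):
    """Replace the 4-bit segment number of `ea` with that segment's 16-bit ID."""
    seg = (ea >> SEG_SHIFT) & 0xF
    asid = seg_regs[seg] & 0xFFFF
    return (asid << SEG_SHIFT) | (ea & OFFSET_MASK)


# Two processes with private IDs everywhere except segment 3, which they share.
proc_a = [0x0001] * 16
proc_b = [0x0002] * 16
proc_a[3] = proc_b[3] = 0x7777
```

With this setup, an address in segment 3 extends to the same 76-bit value in both processes, so a TLB tagged with the extended address could share those entries, while addresses in any other segment stay distinct.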

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Tonight's tradeoff

<2023Nov27.085708@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=35282&group=comp.arch#35282

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Mon, 27 Nov 2023 07:57:08 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 37
Message-ID: <2023Nov27.085708@mips.complang.tuwien.ac.at>
References: <uis67u$fkj4$1@dont-email.me> <7761287e80bb22b7742fd7f292664497@news.novabbs.com> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <tDP8N.30031$ayBd.8559@fx07.iad>
Injection-Info: dont-email.me; posting-host="9f161ffc6bf8f4ad3914c9d4166bd6be";
logging-data="3901925"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX198dXlZaf5zdMdaJAT2kKI2"
Cancel-Lock: sha1:NlByZq6EBhHyh8Eumu2T57rAwMw=
X-newsreader: xrn 10.11
 by: Anton Ertl - Mon, 27 Nov 2023 07:57 UTC

scott@slp53.sl.home (Scott Lurndal) writes:
>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>>scott@slp53.sl.home (Scott Lurndal) writes:
>>>FWIW, MAP_FIXED is specified as an optional feature by POSIX
>>>and may not be supported by the OS at all.
>>
>>As usual, what is specified by a common-subset standard is not
>>relevant for what an OS implementor has to do if they want to supply
>>more than a practically unusable checkbox feature like the POSIX
>>subsystem for Windows. There is a reason why WSL2 includes a full
>>Linux kernel.
....
>Because the semantics of MAP_FIXED are to unmap any
>prior mapping in the range, if the implementation had happened to
>allocate the heap or shared System V region at that address, the heap
>would have become corrupt with dangling references hanging
>around which, if stored into, would subsequently corrupt the mapped region.

Of course you can provide an address without specifying MAP_FIXED, and
a high-quality OS will satisfy the request if possible (and return a
different address if not), while a work-to-rule OS like the POSIX
subsystem for Windows may then treat that address as if the user had
passed NULL.

Interestingly, Linux (since 4.17) also provides MAP_FIXED_NOREPLACE,
which works like MAP_FIXED except that it returns an error if
MAP_FIXED would replace part of an existing mapping. Makes me wonder
if in the no-conflict case, and given a page-aligned addr there is any
difference between MAP_FIXED, MAP_FIXED_NOREPLACE and just providing
an address without any of these flags in Linux. In the conflict case,
the difference between the latter two variants is how you detect that
it did not work as desired.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Tonight's tradeoff

<uk1nrc$3n756$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=35283&group=comp.arch#35283

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Mon, 27 Nov 2023 03:34:34 -0600
Organization: A noiseless patient Spider
Lines: 96
Message-ID: <uk1nrc$3n756$1@dont-email.me>
References: <uis67u$fkj4$1@dont-email.me>
<7761287e80bb22b7742fd7f292664497@news.novabbs.com>
<uj9bm2$36401$1@dont-email.me>
<71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com>
<ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me>
<987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com>
<ujrm4a$2llie$1@dont-email.me>
<d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com>
<bps8N.150652$wvv7.7314@fx14.iad>
<2023Nov26.164506@mips.complang.tuwien.ac.at> <uk0e96$3dtqn$1@dont-email.me>
<2023Nov27.082222@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 27 Nov 2023 09:34:36 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="2fcebed1d03588c400a4bb58f77147d2";
logging-data="3906726"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+4Ci0AQGHLwuIZrrpCfOJh"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:xd6+X2+cmJRT19STnykQLloAxcM=
In-Reply-To: <2023Nov27.082222@mips.complang.tuwien.ac.at>
Content-Language: en-US
 by: BGB - Mon, 27 Nov 2023 09:34 UTC

On 11/27/2023 1:22 AM, Anton Ertl wrote:
> BGB <cr88192@gmail.com> writes:
>> On 11/26/2023 9:45 AM, Anton Ertl wrote:
>>> This would be especially useful for the read-only sections (e.g, code)
>>> of common libraries like libc. However, in today's security landscape,
>>> you don't want one process to know where library code is mapped in
>>> other processes (i.e., you want ASLR), so we can no longer make use of
>>> that benefit. And it's doubtful whether other uses are worth the
>>> complications (and even if they are, there might be security issues,
>>> too).
>>>
>>
>> It seems to me, as long as it is a different place on each system,
>> probably good enough. Demanding a different location in each process
>> would create a lot of additional memory overhead due to things like
>> base-relocations or similar.
>
> If the binary is position-independent (the default on Linux on AMD64),
> there is no such overhead.
>

OK.

I was thinking mostly of things like PE/COFF, where often a mix of
relative and absolute addressing is used, and loading typically involves
applying base relocations (so, once loaded, the assumption is that the
binary will not move further).

Granted, traditional PE/COFF and ELF manage things like global variables
differently (direct vs GOT).

Though, on x86-64, PC-relative addressing is a thing, so less need for
absolute addressing. PIC with PE/COFF might not be too much of a stretch.

> I just started the same binary twice and looked at the address of the
> same piece of code:
>
> Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
> Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
> Type `bye' to exit
> see open-file
> Code open-file
> 0x000055c2b76d5833 <gforth_engine+6595>: mov %r15,0x1c126(%rip)
> ...
>
> For the other process the same instruction is:
>
> Code open-file
> 0x000055dd606e4833 <gforth_engine+6595>: mov %r15,0x1c126(%rip)
>
> Following the calls until I get to glibc, I get, for the two processes:
>
> 0x00007f705c0c3b90 <__libc_open64+0>: push %r12
> 0x00007f190aa34b90 <__libc_open64+0>: push %r12
>
> So not just the binary, but also glibc resides at different virtual
> addresses in the two processes.
>
> So obviously the Linux and glibc maintainers think that per-system
> ASLR is not good enough. They obviously want ASLR to work as well as
> possible against local attackers.
>

OK.

>>> Of course the TLB looks up by VA, what else. But if the VA is the
>>> same and the PA is the same, the same ASID can be used.
>>>
>>
>> ?...
>>
>> Typically the ASID applies to the whole virtual address space, not to
>> individual memory objects.
>
> Yes, one would need more complicated ASID management than setting
> "the" ASID on switching to a process if different VMAs in the process
> have different ASIDs. Another reason not to go there.
>
> Power (and IIRC HPPA) do something in this direction with their
> "segments", where the VA space is split into 16 equal parts, and
> IIRC each of the 16 parts extends the address by 16 bits (minus the 4
> bits of the segment number), so essentially they have 16 16-bit ASIDs.
> The address spaces are somewhat inflexible, but with 64-bit VAs
> (i.e. 60-bit address spaces) that may be good enough for quite a
> while. The cost is that you now have to manage 16 ASID registers.
> And if we ever get to actually making use of more than 60 bits of VA in
> other ways, this ASID scheme would have to be combined with that other
> use of the VAs.
>

OK.

That seems a bit odd...

Re: Tonight's tradeoff

<s929N.28687$rx%7.18632@fx47.iad>

https://news.novabbs.org/devel/article-flat.php?id=35287&group=comp.arch#35287

Path: i2pn2.org!i2pn.org!news.neodome.net!weretis.net!feeder6.news.weretis.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx47.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Tonight's tradeoff
Newsgroups: comp.arch
References: <uis67u$fkj4$1@dont-email.me> <uj9bm2$36401$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <tDP8N.30031$ayBd.8559@fx07.iad> <2023Nov27.085708@mips.complang.tuwien.ac.at>
Lines: 44
Message-ID: <s929N.28687$rx%7.18632@fx47.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Mon, 27 Nov 2023 14:59:36 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Mon, 27 Nov 2023 14:59:36 GMT
X-Received-Bytes: 3296
 by: Scott Lurndal - Mon, 27 Nov 2023 14:59 UTC

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>scott@slp53.sl.home (Scott Lurndal) writes:
>>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>>>scott@slp53.sl.home (Scott Lurndal) writes:
>>>>FWIW, MAP_FIXED is specified as an optional feature by POSIX
>>>>and may not be supported by the OS at all.
>>>
>>>As usual, what is specified by a common-subset standard is not
>>>relevant for what an OS implementor has to do if they want to supply
>>>more than a practically unusable checkbox feature like the POSIX
>>>subsystem for Windows. There is a reason why WSL2 includes a full
>>>Linux kernel.
>...
>>Because the semantics of MAP_FIXED are to unmap any
>>prior mapping in the range, if the implementation had happened to
>>allocate the heap or shared System V region at that address, the heap
>>would have become corrupt with dangling references hanging
>>around which, if stored into, would subsequently corrupt the mapped region.
>
>Of course you can provide an address without specifying MAP_FIXED, and
>a high-quality OS will satisfy the request if possible (and return a
>different address if not), while a work-to-rule OS like the POSIX
>subsystem for Windows may then treat that address as if the user had
>passed NULL.
>
>Interestingly, Linux (since 4.17) also provides MAP_FIXED_NOREPLACE,
>which works like MAP_FIXED except that it returns an error if
>MAP_FIXED would replace part of an existing mapping. Makes me wonder
>if in the no-conflict case, and given a page-aligned addr there is any
>difference between MAP_FIXED, MAP_FIXED_NOREPLACE and just providing
>an address without any of these flags in Linux. In the conflict case,
>the difference between the latter two variants is how you detect that
>it did not work as desired.
>

I've never seen a case where using MAP_FIXED was useful, and I've
been using mmap since the early 90's. I'm sure there must be one,
probably where someone uses full VAs instead of offsets in data
structures. Using the full VAs in the region will likely cause
issues in the long term as the application is moved to updated or
different POSIX systems, particularly if the data file associated
with the region is expected to work in all subsequent
implementations. MAP_FIXED should be avoided, IMO.

Re: Tonight's tradeoff

<2023Nov27.171049@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=35290&group=comp.arch#35290

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Mon, 27 Nov 2023 16:10:49 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 21
Message-ID: <2023Nov27.171049@mips.complang.tuwien.ac.at>
References: <uis67u$fkj4$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <tDP8N.30031$ayBd.8559@fx07.iad> <2023Nov27.085708@mips.complang.tuwien.ac.at> <s929N.28687$rx%7.18632@fx47.iad>
Injection-Info: dont-email.me; posting-host="9f161ffc6bf8f4ad3914c9d4166bd6be";
logging-data="4033796"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19RoIwBWQ6gPD1xQE7iS6eA"
Cancel-Lock: sha1:fj3DHbC5ml4N/GkDK5txMXhFa6A=
X-newsreader: xrn 10.11
 by: Anton Ertl - Mon, 27 Nov 2023 16:10 UTC

scott@slp53.sl.home (Scott Lurndal) writes:
>I've never seen a case where using MAP_FIXED was useful, and I've
>been using mmap since the early 90's.

Gforth uses it for putting the image into the dictionary (the memory
area for Forth definitions, where more definitions can be put during a
session): It first allocates the space for the dictionary with an
anonymous mmap, then puts the image at the start of this area with a
file mmap with MAP_FIXED.

It also currently uses MAP_FIXED for allocating the memory for
non-relocatable images, but thinking through it again, it's probably
better to use MAP_FIXED_NOREPLACE or nothing, and then check the
address, and report any error. However, we have not received any bug
reports about that, which probably shows that nobody uses
non-relocatable images.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Tonight's tradeoff

<ukaeef$1ecg7$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=35319&group=comp.arch#35319

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: robfi680@gmail.com (Robert Finch)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Thu, 30 Nov 2023 11:49:18 -0500
Organization: A noiseless patient Spider
Lines: 23
Message-ID: <ukaeef$1ecg7$1@dont-email.me>
References: <uis67u$fkj4$1@dont-email.me>
<71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com>
<ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me>
<987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com>
<ujrm4a$2llie$1@dont-email.me>
<d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com>
<bps8N.150652$wvv7.7314@fx14.iad>
<2023Nov26.164506@mips.complang.tuwien.ac.at>
<tDP8N.30031$ayBd.8559@fx07.iad>
<2023Nov27.085708@mips.complang.tuwien.ac.at>
<s929N.28687$rx%7.18632@fx47.iad>
<2023Nov27.171049@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 30 Nov 2023 16:49:19 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="108932fdbdfcb95c13135e42463893dc";
logging-data="1520135"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19byzx4ZBBwpk8dFmbvUHGyU5mhqVudtN8="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:W9Z0izNyFFqT/eHWB7199xRYsEc=
In-Reply-To: <2023Nov27.171049@mips.complang.tuwien.ac.at>
Content-Language: en-US
 by: Robert Finch - Thu, 30 Nov 2023 16:49 UTC

The Q+ register file is implemented with one block-RAM per read port.
With a 64-bit width this gives 512 registers in a block RAM. 192
registers are needed for renaming a 64-entry architectural register
file. That leaves 320 registers unused. My thought was to support two
banks of registers, one for the highest operating mode, and the other
for remaining operating modes. On exceptions the register bank could be
switched. But to do this there are now 128 registers effectively being
renamed, which leads to 384 physical registers to manage. This doubles
the size of the register management code, unless a pipeline flush
occurs for exception processing, which I think would allow the renamer to
reuse the same hardware to manage a new bank of registers. But that
hinges on all references to registers in the current bank being unused.

My other thought was that with approximately three times the number of
architectural registers required, using 256 physical registers would
allow 85 architectural registers. Perhaps some of the registers could be
banked for different operating modes. Banking four registers per mode
would use up 16.

If the 512-register file were divided by three, 170 physical registers
could be available for renaming. This is less than the ideal 192
registers but maybe close enough to not impact performance adversely.
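
The figures in this post can be checked with a few lines of Python. This is only a sketch: the factor of three is the post's own rule of thumb (roughly three physical registers per architectural register), and all names below are made up for illustration.

```python
# Register-budget arithmetic from the post (illustrative names, not Q+ source).
BLOCK_RAM_REGS = 512   # 64-bit registers that fit in one block RAM
RENAME_FACTOR = 3      # assumed: about 3 physical registers per architectural one


def physical_needed(arch_regs: int) -> int:
    """Physical registers wanted to rename `arch_regs` architectural registers."""
    return arch_regs * RENAME_FACTOR


def arch_supported(physical_regs: int) -> int:
    """Architectural registers that `physical_regs` can cover at the same ratio."""
    return physical_regs // RENAME_FACTOR


one_bank = physical_needed(64)          # one 64-entry architectural file
unused = BLOCK_RAM_REGS - one_bank      # left over in the block RAM
two_banks = physical_needed(2 * 64)     # two full banks renamed together
arch_at_256 = arch_supported(256)       # architectural registers at 256 physical
third_of_ram = BLOCK_RAM_REGS // 3      # physical registers if the RAM is split in three
```

The values come out as 192, 320, 384, 85 and 170, matching the numbers quoted in the text.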

Re: Tonight's tradeoff

<Zb3aN.140351$AqO5.33253@fx11.iad>

https://news.novabbs.org/devel/article-flat.php?id=35320&group=comp.arch#35320

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx11.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Tonight's tradeoff
Newsgroups: comp.arch
References: <uis67u$fkj4$1@dont-email.me> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <tDP8N.30031$ayBd.8559@fx07.iad> <2023Nov27.085708@mips.complang.tuwien.ac.at> <s929N.28687$rx%7.18632@fx47.iad> <2023Nov27.171049@mips.complang.tuwien.ac.at> <ukaeef$1ecg7$1@dont-email.me>
Lines: 9
Message-ID: <Zb3aN.140351$AqO5.33253@fx11.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Thu, 30 Nov 2023 16:59:37 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Thu, 30 Nov 2023 16:59:37 GMT
X-Received-Bytes: 1392
 by: Scott Lurndal - Thu, 30 Nov 2023 16:59 UTC

Robert Finch <robfi680@gmail.com> writes:
<snip>
> My thought was to support two
>banks of registers, one for the highest operating mode, and the other
>for remaining operating modes.

How do the operating modes pass data between each other? E.g. for
a system call, the arguments are generally passed to the next higher
privilege level/operating mode via registers.

Re: Tonight's tradeoff

<gC4aN.155430$_Oab.116148@fx15.iad>

https://news.novabbs.org/devel/article-flat.php?id=35323&group=comp.arch#35323

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx15.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
References: <uis67u$fkj4$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <tDP8N.30031$ayBd.8559@fx07.iad> <2023Nov27.085708@mips.complang.tuwien.ac.at> <s929N.28687$rx%7.18632@fx47.iad> <2023Nov27.171049@mips.complang.tuwien.ac.at> <ukaeef$1ecg7$1@dont-email.me>
In-Reply-To: <ukaeef$1ecg7$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 43
Message-ID: <gC4aN.155430$_Oab.116148@fx15.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Thu, 30 Nov 2023 18:35:56 UTC
Date: Thu, 30 Nov 2023 13:35:04 -0500
X-Received-Bytes: 3519
 by: EricP - Thu, 30 Nov 2023 18:35 UTC

Robert Finch wrote:
> The Q+ register file is implemented with one block-RAM per read port.
> With a 64-bit width this gives 512 registers in a block RAM. 192
> registers are needed for renaming a 64-entry architectural register
> file. That leaves 320 registers unused. My thought was to support two
> banks of registers, one for the highest operating mode, and the other
> for remaining operating modes. On exceptions the register bank could be
> switched. But to do this there are now 128-register effectively being
> renamed which leads to 384 physical registers to manage. This doubles
> the size of the register management code. Unless, a pipeline flush
> occurs for exception processing which I think would allow the renamer to
> reuse the same hardware to manage a new bank of registers. But that
> hinges on all references to registers in the current bank being unused.
>
> My other thought was that with approximately three times the number of
> architectural registers required, using 256 physical registers would
> allow 85 architectural registers. Perhaps some of the registers could be
> banked for different operating modes. Banking four registers per mode
> would use up 16.
>
> If the 512-register file were divided by three, 170 physical registers
> could be available for renaming. This is less than the ideal 192
> registers but maybe close enough to not impact performance adversely.
>

I don't understand the problem.
You want 64 architecture registers, each of which needs a physical register,
plus 128 registers for in-flight instructions, so 192 physical registers.

If you add a second bank of 64 architecture registers for interrupts
then each needs a physical register. But that doesn't change the number
of in-flight registers so that's 256 physical total.
Plus two sets of rename banks, one for each mode.

If you drain the pipeline before switching register banks then all
of the 128 in-flight registers will be free at the time of switch.

If you can switch to interrupt mode without draining the pipeline then
some of those 128 will be in-use for the old mode, some for the new mode
(and the uOps carry a privilege mode flag so you can do things like
check LD or ST ops against the appropriate PTE mode access control).
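
The accounting in this reply is one committed physical register per architectural register per bank, plus a shared in-flight pool that does not grow with the number of banks. A small sketch, with a hypothetical function name:

```python
def physical_registers(arch_regs: int, banks: int, in_flight: int) -> int:
    """Committed state needs arch_regs * banks physical registers;
    the in-flight pool is shared by all banks and is counted once."""
    return arch_regs * banks + in_flight


single_bank = physical_registers(64, banks=1, in_flight=128)    # 64 + 128
with_irq_bank = physical_registers(64, banks=2, in_flight=128)  # 64 + 64 + 128
```

With these inputs, `single_bank` is 192 and `with_irq_bank` is 256.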

Re: Tonight's tradeoff

<b32150df757ef8894bac87db8b695882@news.novabbs.com>

https://news.novabbs.org/devel/article-flat.php?id=35328&group=comp.arch#35328

Newsgroups: comp.arch
Path: i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Thu, 30 Nov 2023 20:30:52 +0000
Organization: novaBBS
Message-ID: <b32150df757ef8894bac87db8b695882@news.novabbs.com>
References: <uis67u$fkj4$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <tDP8N.30031$ayBd.8559@fx07.iad> <2023Nov27.085708@mips.complang.tuwien.ac.at> <s929N.28687$rx%7.18632@fx47.iad> <2023Nov27.171049@mips.complang.tuwien.ac.at> <ukaeef$1ecg7$1@dont-email.me> <gC4aN.155430$_Oab.116148@fx15.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2632899"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Rslight-Posting-User: 7e9c45bcd6d4757c5904fbe9a694742e6f8aa949
X-Rslight-Site: $2y$10$WhGwa9z9DH/aCHIR6jNgLOEqKjczO.c4ceQHomWGHrL2ut9GaI1tm
X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on novalink.us
 by: MitchAlsup - Thu, 30 Nov 2023 20:30 UTC

EricP wrote:

> Robert Finch wrote:
>> The Q+ register file is implemented with one block-RAM per read port.
>> With a 64-bit width this gives 512 registers in a block RAM. 192
>> registers are needed for renaming a 64-entry architectural register
>> file. That leaves 320 registers unused. My thought was to support two
>> banks of registers, one for the highest operating mode, and the other
>> for remaining operating modes. On exceptions the register bank could be
>> switched. But to do this there are now 128-register effectively being
>> renamed which leads to 384 physical registers to manage. This doubles
>> the size of the register management code. Unless, a pipeline flush
>> occurs for exception processing which I think would allow the renamer to
>> reuse the same hardware to manage a new bank of registers. But that
>> hinges on all references to registers in the current bank being unused.
>>
>> My other thought was that with approximately three times the number of
>> architectural registers required, using 256 physical registers would
>> allow 85 architectural registers. Perhaps some of the registers could be
>> banked for different operating modes. Banking four registers per mode
>> would use up 16.
>>
>> If the 512-register file were divided by three, 170 physical registers
>> could be available for renaming. This is less than the ideal 192
>> registers but maybe close enough to not impact performance adversely.
>>

> I don't understand the problem.
> You want 64 architecture registers, each which needs a physical register,
> plus 128 registers for in-flight instructions, so 196 physical registers.

> If you add a second bank of 64 architecture registers for interrupts
> then each needs a physical register. But that doesn't change the number
> of in-flight registers so thats 256 physical total.
> Plus two sets of rename banks, one for each mode.

> If you drain the pipeline before switching register banks then all
> of the 128 in-flight registers will be free at the time of switch.

A couple of bits of state and you don't need to drain the pipeline,
you just have to find the youngest instruction with the property
that all older instructions cannot raise an exception; these can be
allowed to finish execution while you are fetching instructions for
the new context.

> If you can switch to interrupt mode without draining the pipeline then
> some of those 128 will be in-use for the old mode, some for the new mode
> (and the uOps carry a privilege mode flag so you can do things like
> check LD or ST ops against the appropriate PTE mode access control).

And 1 bit of state keeps track of which is which.

Re: Tonight's tradeoff

<ukb3ko$1hv5s$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=35330&group=comp.arch#35330

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: robfi680@gmail.com (Robert Finch)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Thu, 30 Nov 2023 17:51:02 -0500
Organization: A noiseless patient Spider
Lines: 79
Message-ID: <ukb3ko$1hv5s$1@dont-email.me>
References: <uis67u$fkj4$1@dont-email.me>
<71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com>
<ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me>
<987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com>
<ujrm4a$2llie$1@dont-email.me>
<d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com>
<bps8N.150652$wvv7.7314@fx14.iad>
<2023Nov26.164506@mips.complang.tuwien.ac.at>
<tDP8N.30031$ayBd.8559@fx07.iad>
<2023Nov27.085708@mips.complang.tuwien.ac.at>
<s929N.28687$rx%7.18632@fx47.iad>
<2023Nov27.171049@mips.complang.tuwien.ac.at> <ukaeef$1ecg7$1@dont-email.me>
<gC4aN.155430$_Oab.116148@fx15.iad>
<b32150df757ef8894bac87db8b695882@news.novabbs.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 30 Nov 2023 22:51:05 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="108932fdbdfcb95c13135e42463893dc";
logging-data="1637564"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19tAlGyB8WyooUDgDZF0HpwVoJbzdNMN3o="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:tlvl3bibQKBgE0uT70UnxgyaMa4=
In-Reply-To: <b32150df757ef8894bac87db8b695882@news.novabbs.com>
Content-Language: en-US
 by: Robert Finch - Thu, 30 Nov 2023 22:51 UTC

On 2023-11-30 3:30 p.m., MitchAlsup wrote:
> EricP wrote:
>
>> Robert Finch wrote:
>>> The Q+ register file is implemented with one block-RAM per read port.
>>> With a 64-bit width this gives 512 registers in a block RAM. 192
>>> registers are needed for renaming a 64-entry architectural register
>>> file. That leaves 320 registers unused. My thought was to support two
>>> banks of registers, one for the highest operating mode, and the other
>>> for remaining operating modes. On exceptions the register bank could
>>> be switched. But to do this there are now 128-register effectively
>>> being renamed which leads to 384 physical registers to manage. This
>>> doubles the size of the register management code. Unless, a pipeline
>>> flush occurs for exception processing which I think would allow the
>>> renamer to reuse the same hardware to manage a new bank of registers.
>>> But that hinges on all references to registers in the current bank
>>> being unused.
>>>
>>> My other thought was that with approximately three times the number
>>> of architectural registers required, using 256 physical registers
>>> would allow 85 architectural registers. Perhaps some of the registers
>>> could be banked for different operating modes. Banking four registers
>>> per mode would use up 16.
>>>
>>> If the 512-register file were divided by three, 170 physical
>>> registers could be available for renaming. This is less than the
>>> ideal 192 registers but maybe close enough to not impact performance
>>> adversely.
>>>
>
>> I don't understand the problem.
>> You want 64 architecture registers, each which needs a physical register,
>> plus 128 registers for in-flight instructions, so 196 physical registers.
>
>> If you add a second bank of 64 architecture registers for interrupts
>> then each needs a physical register. But that doesn't change the number
>> of in-flight registers so thats 256 physical total.
>> Plus two sets of rename banks, one for each mode.
>
>> If you drain the pipeline before switching register banks then all
>> of the 128 in-flight registers will be free at the time of switch.
>
> A couple of bits of state and you don't need to drain the pipeline,
> you just have to find the youngest instruction with the property that
> all older instructions cannot raise an exception; these can be
> allowed to finish execution while you are fetching instruction for
> the new context.

Not quite comprehending. Will not the registers for the new context be
improperly mapped if there are registers in use for the old map? I think
a state bit could be used to pause a fetch of a register still in use in
the old map, but that is draining the pipeline anyway.
When the context swaps, a new set of target registers is always
established before the registers are used. So incoming references in the
new context should always map to the new registers?

>
>> If you can switch to interrupt mode without draining the pipeline then
>> some of those 128 will be in-use for the old mode, some for the new mode
>> (and the uOps carry a privilege mode flag so you can do things like
>> check LD or ST ops against the appropriate PTE mode access control).
>
> And 1 bit of state keeps track of which is which.

Did some experimenting and the RAT turns out to be too large if more
registers are incorporated. Even as few as 256 regs caused the RAT to
increase in size substantially. So, I may go the alternate route of
making registers wider rather than deeper, having 128-bit wide registers
instead.

There is an eight-bit sequence number associated with each
instruction, so the age of an instruction can easily be detected. I
found a really slick way of detecting instruction age using a matrix
approach on the web, but I did not fully understand it. So I just use
eight-bit counters for now.
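
The post does not describe the matrix approach, so the following is only a sketch of one way an age matrix can work, not necessarily the scheme found on the web. It assumes each queue slot keeps a bit vector of the slots allocated before it; the class and method names are hypothetical.

```python
from typing import Optional


class AgeMatrix:
    """Tracks relative age of up to `size` queue slots, one bitmask row per slot.

    older[i] has bit j set when slot j was allocated before slot i.
    """

    def __init__(self, size: int) -> None:
        self.size = size
        self.valid = 0             # bitmask of occupied slots
        self.older = [0] * size    # older[i]: slots that are older than i

    def allocate(self, i: int) -> None:
        """Occupy slot i; every slot that is currently valid is older than it."""
        assert not (self.valid >> i) & 1, "slot already in use"
        self.older[i] = self.valid
        self.valid |= 1 << i

    def free(self, i: int) -> None:
        """Release slot i and clear its column in every row."""
        self.valid &= ~(1 << i)
        for j in range(self.size):
            self.older[j] &= ~(1 << i)

    def oldest(self, candidates: int) -> Optional[int]:
        """Return the oldest slot among the `candidates` bitmask, or None."""
        candidates &= self.valid
        for i in range(self.size):
            # Slot i wins if no other candidate is recorded as older than it.
            if (candidates >> i) & 1 and not (self.older[i] & candidates):
                return i
        return None
```

Unlike a sequence counter, this needs no wraparound handling: age is recorded only as pairwise "older than" bits, and in hardware every row can be checked in parallel rather than in a loop.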

There is a two-bit privilege mode flag for instructions in the ROB. I
suppose the ROB entries could be called uOps.

Re: Tonight's tradeoff

<242110bb39a3a3f4afd4f747b0178333@news.novabbs.com>

https://news.novabbs.org/devel/article-flat.php?id=35331&group=comp.arch#35331

Newsgroups: comp.arch
Path: i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Thu, 30 Nov 2023 23:06:32 +0000
Organization: novaBBS
Message-ID: <242110bb39a3a3f4afd4f747b0178333@news.novabbs.com>
References: <uis67u$fkj4$1@dont-email.me> <71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com> <ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me> <987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com> <ujrm4a$2llie$1@dont-email.me> <d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com> <bps8N.150652$wvv7.7314@fx14.iad> <2023Nov26.164506@mips.complang.tuwien.ac.at> <tDP8N.30031$ayBd.8559@fx07.iad> <2023Nov27.085708@mips.complang.tuwien.ac.at> <s929N.28687$rx%7.18632@fx47.iad> <2023Nov27.171049@mips.complang.tuwien.ac.at> <ukaeef$1ecg7$1@dont-email.me> <gC4aN.155430$_Oab.116148@fx15.iad> <b32150df757ef8894bac87db8b695882@news.novabbs.com> <ukb3ko$1hv5s$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2646668"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on novalink.us
X-Rslight-Posting-User: 7e9c45bcd6d4757c5904fbe9a694742e6f8aa949
X-Rslight-Site: $2y$10$s/7qEg.QwWf6QjogkaUTEeVlgspu2a3xxtSxS1y7Tm3PgsyZFlruy
 by: MitchAlsup - Thu, 30 Nov 2023 23:06 UTC

Robert Finch wrote:

> On 2023-11-30 3:30 p.m., MitchAlsup wrote:
>> EricP wrote:
>>
>>> Robert Finch wrote:
>>>> The Q+ register file is implemented with one block-RAM per read port.
>>>> With a 64-bit width this gives 512 registers in a block RAM. 192
>>>> registers are needed for renaming a 64-entry architectural register
>>>> file. That leaves 320 registers unused. My thought was to support two
>>>> banks of registers, one for the highest operating mode, and the other
>>>> for remaining operating modes. On exceptions the register bank could
>>>> be switched. But to do this there are now 128-register effectively
>>>> being renamed which leads to 384 physical registers to manage. This
>>>> doubles the size of the register management code. Unless, a pipeline
>>>> flush occurs for exception processing which I think would allow the
>>>> renamer to reuse the same hardware to manage a new bank of registers.
>>>> But that hinges on all references to registers in the current bank
>>>> being unused.
>>>>
>>>> My other thought was that with approximately three times the number
>>>> of architectural registers required, using 256 physical registers
>>>> would allow 85 architectural registers. Perhaps some of the registers
>>>> could be banked for different operating modes. Banking four registers
>>>> per mode would use up 16.
>>>>
>>>> If the 512-register file were divided by three, 170 physical
>>>> registers could be available for renaming. This is less than the
>>>> ideal 192 registers but maybe close enough to not impact performance
>>>> adversely.
>>>>
>>
>>> I don't understand the problem.
>>> You want 64 architecture registers, each which needs a physical register,
>>> plus 128 registers for in-flight instructions, so 196 physical registers.
>>
>>> If you add a second bank of 64 architecture registers for interrupts
>>> then each needs a physical register. But that doesn't change the number
>>> of in-flight registers so thats 256 physical total.
>>> Plus two sets of rename banks, one for each mode.
>>
>>> If you drain the pipeline before switching register banks then all
>>> of the 128 in-flight registers will be free at the time of switch.
>>
>> A couple of bits of state and you don't need to drain the pipeline,
>> you just have to find the youngest instruction with the property that
>> all older instructions cannot raise an exception; these can be
>> allowed to finish execution while you are fetching instruction for
>> the new context.

> Not quite comprehending. Will not the registers for the new context be
> improperly mapped if there are registers in use for the old map?

All the in-flight destination registers will get written by the in-flight
instructions. All the instructions of the new context will allocate registers
from the pool which is not currently in-flight. So, while there is mental
confusion on how this gets pulled off in HW, it does get pulled off just
fine. When the new context STs the registers of the old context, it obtains
the correct register from the old context. {{Should HW be doing this, the
same orchestration applies--and it still works.}}

> I think
> a state bit could be used to pause a fetch of a register still in use in
> the old map, but that is draining the pipeline anyway.

You are assuming a RAT, I am not using a RAT but a CAM where I can restore
to any checkpoint by simply rewriting the valid bit vector.
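
The CAM scheme is not spelled out here, so this is only a sketch under stated assumptions: one CAM entry per physical register holding an architectural tag, plus a valid bit marking the current mapping. A checkpoint is then just a copy of the valid bit vector. The class and its methods are hypothetical, and free-list management is left out.

```python
class CamRenamer:
    """CAM-style rename map with one entry per physical register.

    tag[p] is the architectural register last assigned to physical register p;
    bit p of `valid` is set when that assignment is the current mapping.
    """

    def __init__(self, num_arch: int, num_phys: int) -> None:
        assert num_phys >= num_arch
        # Start with an identity mapping for the architectural registers.
        self.tag = [p if p < num_arch else None for p in range(num_phys)]
        self.valid = (1 << num_arch) - 1

    def lookup(self, arch: int) -> int:
        """Associative search: the valid entry whose tag matches `arch`."""
        for p, t in enumerate(self.tag):
            if t == arch and (self.valid >> p) & 1:
                return p
        raise KeyError(arch)

    def rename(self, arch: int, new_phys: int) -> None:
        """Map `arch` to the free register `new_phys`, invalidating the old entry."""
        old = self.lookup(arch)
        self.valid &= ~(1 << old)
        self.tag[new_phys] = arch
        self.valid |= 1 << new_phys

    def checkpoint(self) -> int:
        """A checkpoint is only the valid bit vector."""
        return self.valid

    def restore(self, saved_valid: int) -> None:
        """Back up by rewriting the valid bit vector."""
        self.valid = saved_valid
```

The restore only works if no physical register that was valid at a live checkpoint has had its tag overwritten, so the allocator (not shown) must not recycle such registers until the checkpoint is retired.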

> When the context swaps, a new set of target registers is always
> established before the registers are used.

You still have to deal with the transient state and the CAM version works
with either SW or HW save/restore.

> So incoming references in the
> new context should always map to the new registers?

Which they will--as illustrated above.

>>
>>> If you can switch to interrupt mode without draining the pipeline then
>>> some of those 128 will be in-use for the old mode, some for the new mode
>>> (and the uOps carry a privilege mode flag so you can do things like
>>> check LD or ST ops against the appropriate PTE mode access control).
>>
>> And 1 bit of state keeps track of which is which.

> Did some experimenting and the RAT turns out to be too large if more
> registers are incorporated. Even as few as 256 regs caused the RAT to
> increase in size substantially. So, I may go the alternate route of
> making register wider rather than deeper, having 128-bit wide registers
> instead.

Register ports (or equivalently RAT ports) are one of the things that most
limit issue width. K9 was to have 22 RAT ports, and was similar in size to
the {standard decoded Register File.}

> There is an eight bit sequence number bit associated with each
> instruction. So it can easily be detected the age of an instruction. I

I assign a 4-bit number (16 checkpoints) to all instructions issued in
the same clock cycle. This gives a 6-wide machine up to 96 instructions
in-flight; and makes backing up (misprediction) simple and fast.

> found a really slick way of detecting instruction age using a matrix
> approach on the web. But I did not fully understand it. So I just use
> eight bit counters for now.

> There is a two bit privilege mode flag for instructions in the ROB. I
> suppose the ROB entries could be called uOps.

Re: Tonight's tradeoff

<ukb8rd$1im31$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=35333&group=comp.arch#35333

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: robfi680@gmail.com (Robert Finch)
Newsgroups: comp.arch
Subject: Re: Tonight's tradeoff
Date: Thu, 30 Nov 2023 19:19:56 -0500
Organization: A noiseless patient Spider
Lines: 7
Message-ID: <ukb8rd$1im31$1@dont-email.me>
References: <uis67u$fkj4$1@dont-email.me>
<71cb5ad7604b3d909df865a19ee3d52e@news.novabbs.com>
<ujb40q$3eepe$1@dont-email.me> <ujrfaa$2h1v9$1@dont-email.me>
<987455c358f93a9a7896c9af3d5f2b75@news.novabbs.com>
<ujrm4a$2llie$1@dont-email.me>
<d1f73b9de9ff6f86dac089ebd4bca037@news.novabbs.com>
<bps8N.150652$wvv7.7314@fx14.iad>
<2023Nov26.164506@mips.complang.tuwien.ac.at>
<tDP8N.30031$ayBd.8559@fx07.iad>
<2023Nov27.085708@mips.complang.tuwien.ac.at>
<s929N.28687$rx%7.18632@fx47.iad>
<2023Nov27.171049@mips.complang.tuwien.ac.at> <ukaeef$1ecg7$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 1 Dec 2023 00:19:57 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="579b79356b0ea203efc2b134288f0458";
logging-data="1661025"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18NEJS5NvOSIke8RZEuWj9PgBUVhLb3GpM="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:tRAatEOokMlnhO9vPQyv1rGA1mA=
Content-Language: en-US
In-Reply-To: <ukaeef$1ecg7$1@dont-email.me>
 by: Robert Finch - Fri, 1 Dec 2023 00:19 UTC

Figured it out. Each architectural register in the RAT must refer to N
physical registers, where N is the number of banks. Setting N to 4
results in a RAT that is only about 50% larger than one supporting only
a single bank. The operating mode is used to select the physical
register. The first eight registers are shared between all operating
modes so arguments can be passed to syscalls. It is tempting to have
eight banks of registers, one for each hardware interrupt level.
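
A minimal sketch of the lookup described above, assuming 64 architectural registers, N = 4 banks, and an operating mode numbered 0 to 3 that selects the bank directly. The class, constants, and the choice of bank 0 for the shared registers are illustrative assumptions, not the Q+ source.

```python
NUM_ARCH = 64     # architectural registers (figure used earlier in the thread)
NUM_BANKS = 4     # N in the post
NUM_SHARED = 8    # first eight registers are common to every operating mode


class BankedRat:
    """RAT in which each architectural register has one slot per bank."""

    def __init__(self) -> None:
        # rat[areg][bank] holds a physical register number, or None if unmapped.
        self.rat = [[None] * NUM_BANKS for _ in range(NUM_ARCH)]

    @staticmethod
    def bank_for(areg: int, mode: int) -> int:
        """Shared registers always use bank 0 (an assumption); others use the mode."""
        return 0 if areg < NUM_SHARED else mode

    def write(self, areg: int, mode: int, preg: int) -> None:
        self.rat[areg][self.bank_for(areg, mode)] = preg

    def read(self, areg: int, mode: int):
        return self.rat[areg][self.bank_for(areg, mode)]
```

For example, a value written to register 3 in mode 0 is visible when read in mode 2, which is how syscall arguments would pass between modes, while register 20 maps to a different physical register in each mode.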

