devel / comp.arch / Re: Stealing a Great Idea from the 6600

Re: Stealing a Great Idea from the 6600

<cb96602949ccc7bae484d12748f2b3c9@www.novabbs.org>


https://news.novabbs.org/devel/article-flat.php?id=38457&group=comp.arch#38457

Date: Fri, 26 Apr 2024 21:01:35 +0000
Subject: Re: Stealing a Great Idea from the 6600
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$YiFyTm4J6ktmxdc50Y8s3umVGQXP0ca/.AFm3.ei1JykIo.pS7FQq
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com> <ec69999967361c286afdbe60bc2443ea@www.novabbs.org> <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me> <ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <2024Apr26.173457@mips.complang.tuwien.ac.at>
Organization: Rocksolid Light
Message-ID: <cb96602949ccc7bae484d12748f2b3c9@www.novabbs.org>
 by: MitchAlsup1 - Fri, 26 Apr 2024 21:01 UTC

Anton Ertl wrote:

> mitchalsup@aol.com (MitchAlsup1) writes:
>>What we need is ~16-bit displacements where 82½%-91¼% are positive.

> What are these funny numbers about?

In typical usages in MY 66000 ISA <only> one needs only 18 DW of
negative addressing and we have a 16-bit displacement. So, technically
it might get by at the 99% level with -32..+65500. Other usages might
need a few more on the negative end of things so 1/8..1/16 in the
negative direction, 7/8..15/16 in the positive.

> Do you mean that you want number ranges like -11468..54067 (82.5%
> positive) or -5734..59801 (91.25% positive)? Which one of those? And
> why not, say -8192..57343 (87.5% positive)?

Roughly.
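
The ranges quoted above follow from splitting the 65536 encodings of a 16-bit field according to the desired positive share. A minimal sketch of that arithmetic; the helper name `skewed_range` is made up for illustration, and truncating toward zero is an assumption chosen because it reproduces the quoted numbers:

```python
def skewed_range(bits, positive_fraction):
    """Return (lowest, highest) displacement when `positive_fraction` of the
    2**bits encodings are non-negative. Illustrative helper, not from any ISA."""
    total = 1 << bits
    negatives = int(total * (1 - positive_fraction))  # truncate toward zero
    return -negatives, total - negatives - 1


for share in (0.825, 0.875, 0.9125, 0.9375):
    print(share, skewed_range(16, share))
```

The 7/8 and 15/16 cases are the only ones where the negative part is a power of two (8192 and 4096 encodings), which is presumably what makes them attractive to decode.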

>>How does one use a frame pointer without negative displacements ??

> You let it point to the lowest address you want to access. That moves
> the problem to unwinding frame pointer chains where the unwinder does
> not know the frame-specific difference between the frame pointer and
> the pointer of the next frame.

> An alternative is to have a frame-independent difference that leaves
> enough room that, say 90% (or 99%, or whatever) of the frames don't
> need negative offsets from that frame.

> Likewise, if you have signed displacements, and are unhappy about the
> skewed usage, you can let the frame pointer point at an offset from
> the pointer to the next frame such that the usage is less skewed.

Such a hassle....
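
To make the "frame-independent difference" idea concrete, here is a rough sketch. It assumes the frame pointer is placed a fixed `bias` bytes below its conventional position, so that locals that would sit at negative offsets get non-negative ones. The function names and the frame sizes are invented for illustration; they are not measurements from any compiler.

```python
def biased_offset(conventional_offset, bias):
    """Offset from the biased FP, where biased_fp = conventional_fp - bias."""
    return conventional_offset + bias


def share_without_negative_offsets(local_area_sizes, bias):
    """Share of frames whose locals all land at offsets >= 0 from the biased FP.
    A frame qualifies when its local area (bytes below the conventional FP)
    is no larger than the bias."""
    ok = sum(size <= bias for size in local_area_sizes)
    return ok / len(local_area_sizes)


# Hypothetical local-area sizes in bytes, one entry per frame (made up).
sizes = [16, 32, 48, 64, 96, 128, 160, 256, 512, 4096]
print(share_without_negative_offsets(sizes, bias=512))
print(biased_offset(-144, bias=512))
```

With this made-up sample a 512-byte bias covers nine frames out of ten; the remaining frame still needs a negative offset or some other addressing sequence.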

> - anton

Re: Stealing a Great Idea from the 6600

<c197f829d4e112cf2b4703e59e8cc04c@www.novabbs.org>


https://news.novabbs.org/devel/article-flat.php?id=38458&group=comp.arch#38458

Date: Fri, 26 Apr 2024 21:07:24 +0000
Subject: Re: Stealing a Great Idea from the 6600
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$oK2ios0HWADfuTxhdUOiOO5U10pwfCJCZdSL1pazzuHO77n0NdQ3O
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com> <ec69999967361c286afdbe60bc2443ea@www.novabbs.org> <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me> <ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <v0gobh$3qnis$1@dont-email.me>
Organization: Rocksolid Light
Message-ID: <c197f829d4e112cf2b4703e59e8cc04c@www.novabbs.org>
 by: MitchAlsup1 - Fri, 26 Apr 2024 21:07 UTC

BGB wrote:

> On 4/26/2024 8:25 AM, MitchAlsup1 wrote:
>> BGB wrote:
>>
>>> On 4/25/2024 4:01 PM, George Neuner wrote:
>>>> On Tue, 23 Apr 2024 17:58:41 +0000, mitchalsup@aol.com (MitchAlsup1)
>>>> wrote:
>>>>
>>
>>> Agreed in the sense that negative displacements exist.
>>
>>> However, can note that positive displacements tend to be significantly
>>> more common than negative ones. Whether or not it makes sense to have
>>> a negative displacement depends mostly on the probability of
>>> greater than half of the missed displacements being negative.
>>
>>>  From what I can tell, this seems to be:
>>>    ~ 10 bits, scaled.
>>>    ~ 13 bits, unscaled.
>>
>>
>>> So, say, an ISA like RISC-V might have had a slightly better hit rate with
>>> unsigned displacements than with signed displacements, but if one
>>> added 1 or 2 bits, signed would have still been a clear winner (or,
>>> with 1 or 2 fewer bits, unsigned a clear winner).
>>
>>> I ended up going with signed displacements for XG2, but it was pretty
>>> close to break-even in this case (when expanding from the 9-bit
>>> unsigned displacements in Baseline).
>>
>>
>>> Granted, all signed or all-unsigned might be better from an ISA design
>>> consistency POV.
>>
>>
>>> If one had 16-bit displacements, then unscaled displacements would
>>> make sense; otherwise scaled displacements seem like a win (misaligned
>>> displacements being much less common than aligned displacements).
>>
>> What we need is ~16-bit displacements where 82½%-91¼% are positive.
>>

> I was seeing stats more like 99.8% positive, 0.2% negative.

After pulling out the calculator and thinking about the frames, My
66000 needs no more than 18 DW of negative addressing. This is just
over 0.2% as you indicate.
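
The arithmetic behind that figure, as a small sketch. It assumes "DW" means a 64-bit doubleword (8 bytes) and that the percentage is the share of a 16-bit displacement range; neither is spelled out in the post.

```python
DW_BYTES = 8        # assumption: one doubleword (DW) is 64 bits
NEGATIVE_DW = 18    # figure taken from the post
DISP_BITS = 16      # width of the displacement field

negative_bytes = NEGATIVE_DW * DW_BYTES      # bytes needed below the base
share = negative_bytes / (1 << DISP_BITS)    # fraction of the 65536 encodings
print(f"{negative_bytes} bytes = {share:.2%} of a 16-bit range")
```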

> There was enough of a bias that, below 10 bits, if one takes all the
> remaining cases, zero extending would always win, until reaching 10
> bits, when the number of missed reaches 50% negative (along with
> positive displacements larger than 512).

> So, one can make a choice: -512..511, or 0..1023, ...

> In XG2, I ended up with -512..511, for pros or cons (for some programs,
> this choice is optimal, for others it is not).
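
The break-even argument can be checked mechanically. Going from an N-bit unsigned field to an N-bit signed one gives up the positive displacements in the upper half of the unsigned range and gains the negatives that now fit, so the winner is whichever group is larger. A minimal sketch follows; `hit_rates` is an illustrative helper and the sample list is made up, not measured from any program.

```python
def hit_rates(displacements, bits):
    """Return (unsigned_rate, signed_rate): the fraction of displacements that
    fit a `bits`-wide zero-extended field vs. a sign-extended one."""
    n = len(displacements)
    half = 1 << (bits - 1)
    unsigned_hits = sum(0 <= d < (1 << bits) for d in displacements)
    signed_hits = sum(-half <= d < half for d in displacements)
    return unsigned_hits / n, signed_hits / n


# Hypothetical scaled displacements, constructed so that 10 bits is break-even.
sample = [0, 8, 16, 40, 100, 260, 300, 400, 600, 700, -1, -2]
for bits in (9, 10, 11):
    print(bits, hit_rates(sample, bits))
```

On this sample, unsigned wins at 9 bits, the two tie at 10 bits, and signed wins at 11 bits, which mirrors the shape of the trade-off described above.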

> Where, when scaled for QWORD, this is +/- 4K.

> If one had a 16-bit displacement, it would be a choice between +/- 32K,
> or (scaled) +/- 256K, or 0..512K, ...

We looked at this in Mc88100 (scaling of the displacement). The drawback
was that the ISA and linker were slightly mismatched: The linker wanted
to use a single upper 16-bit LUI <if it were> over several LD/STs of
potentially different sizes, and scaling of the displacement failed in
those regards; so we dropped scaled displacements.

> For the special purpose "LEA.Q (GBR, Disp16), Rn" instruction, I ended
> up going unsigned, where for a lot of the programs I am dealing with,
> this is big enough to cover ".data" and part of ".bss", generally used
> for arrays which need the larger displacements (the compiler lays things
> out so that most of the commonly used variables are closer to the start
> of ".data", so can use smaller displacements).

Not even an issue when one has universal constants.

>> How does one use a frame pointer without negative displacements ??
>>
>> [FP+disp] accesses callee save registers
>> [FP-disp] accesses local stack variables and descriptors
>>
>> [SP+disp] accesses argument and result values
>>

> In my case, all of these are [SP+Disp], granted, there is no frame
> pointer and stack frames are fixed-size in BGBCC.

> This is typically with a frame layout like:
>    Argument/Spill space
>    -- Frame Top
>    Register Save
>    (Stack Canary)
>    Local arrays/structs
>    Local variables
>    Argument/Spill Space
>    -- Frame Bottom

> Contrast with traditional x86 layout, which puts saved registers and
> local variables near the frame-pointer, which points near the top of the
> stack frame.

> Though, in a majority of functions, the MOV.L and MOV.Q instructions have a
> big enough displacement to cover the whole frame (excludes functions
> which have a lot of local arrays or similar, though overly large local
> arrays are auto-folded to using heap allocation, but at present this
> logic is based on the size of individual arrays rather than on the total
> combined size of the stack frame).

> Adding a frame pointer (with negative displacements) wouldn't make a big
> difference in XG2 Mode, but would be more of an issue for (pure)
> Baseline, where options are either to load the displacement into a
> register, or use a jumbo prefix.

>>> But, admittedly, main reason I went with unscaled for GBR-rel and
>>> PC-rel Load/Store, was because using scaled displacements here would
>>> have required more relocation types (nevermind if the hit rate for
>>> unscaled 9-bit displacements is "pretty weak").
>>
>>> Though, did end up later adding specialized Scaled GBR-Rel Load/Store
>>> ops (to improve code density), so it might have been better in
>>> retrospect had I instead just gone with the "keep it scaled and add more
>>> reloc types to compensate" option.
>>
>>
>>> ....
>>
>>
>>>> YMMV.

Re: Stealing a Great Idea from the 6600

<70998f4532923bf28d11f9a8544e089f@www.novabbs.org>


https://news.novabbs.org/devel/article-flat.php?id=38459&group=comp.arch#38459

Date: Fri, 26 Apr 2024 21:16:28 +0000
Subject: Re: Stealing a Great Idea from the 6600
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$XA8qmBl4LMuHnkGGcwQJE.BUjCXbtGjMTuKFTPXzdvgEl7R8RA59C
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com> <ec69999967361c286afdbe60bc2443ea@www.novabbs.org> <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me> <ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <v0gobh$3qnis$1@dont-email.me>
Organization: Rocksolid Light
Message-ID: <70998f4532923bf28d11f9a8544e089f@www.novabbs.org>
 by: MitchAlsup1 - Fri, 26 Apr 2024 21:16 UTC

BGB wrote:

> On 4/26/2024 8:25 AM, MitchAlsup1 wrote:
>>

>> How does one use a frame pointer without negative displacements ??
>>
>> [FP+disp] accesses callee save registers
>> [FP-disp] accesses local stack variables and descriptors
>>
>> [SP+disp] accesses argument and result values
>>

> In my case, all of these are [SP+Disp], granted, there is no frame
> pointer and stack frames are fixed-size in BGBCC.

I only have FP when the base language is block structured and scoped.
Not C, C++ or FORTRAN, but Algol, ADA, Pascal: Yes.

> This is typically with a frame layout like:
>    Argument/Spill space
>    -- Frame Top
>    Register Save
>    (Stack Canary)
>    Local arrays/structs
>    Local variables
>    Argument/Spill Space
>    -- Frame Bottom

     Previous Argument/Result space
     { Register Save area
       Return Pointer }
FP-> Local Descriptors -------------------\
     Local Variables                      |
     Dynamically allocated Stack space <--/
SP-> My Argument/Result space

When safe stack is in use, Register Save area and return pointer are
placed on a separate stack not accessible with LD/ST instructions.

> Contrast with traditional x86 layout, which puts saved registers and
> local variables near the frame-pointer, which points near the top of the
> stack frame.

> Though, in a majority of functions, the MOV.L and MOV.Q instructions have a
> big enough displacement to cover the whole frame (excludes functions
> which have a lot of local arrays or similar, though overly large local
> arrays are auto-folded to using heap allocation, but at present this
> logic is based on the size of individual arrays rather than on the total
> combined size of the stack frame).

By making a Local Descriptor area on the stack, one can access the
descriptors off of FP and access the dynamic stuff via that pointer.
Both Local Descriptors and Local Variables may be allocated into
registers and not actually exist on the stack.

Re: Stealing a Great Idea from the 6600

<v0jldv$i3mh$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=38474&group=comp.arch#38474

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sat, 27 Apr 2024 14:58:52 -0500
Organization: A noiseless patient Spider
Lines: 188
Message-ID: <v0jldv$i3mh$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org>
<v0gobh$3qnis$1@dont-email.me>
<c197f829d4e112cf2b4703e59e8cc04c@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 27 Apr 2024 21:58:56 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="6d46d46453f8be8cb5c0e2a63a99790f";
logging-data="593617"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+DOiQ8PGQ9in5l8+swL1n2sR1O0rLoeSc="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:5xx3Pyc4VwK3lpo17l3IFr4gK3M=
Content-Language: en-US
In-Reply-To: <c197f829d4e112cf2b4703e59e8cc04c@www.novabbs.org>
 by: BGB - Sat, 27 Apr 2024 19:58 UTC

On 4/26/2024 4:07 PM, MitchAlsup1 wrote:
> BGB wrote:
>
>> On 4/26/2024 8:25 AM, MitchAlsup1 wrote:
>>> BGB wrote:
>>>
>>>> On 4/25/2024 4:01 PM, George Neuner wrote:
>>>>> On Tue, 23 Apr 2024 17:58:41 +0000, mitchalsup@aol.com (MitchAlsup1)
>>>>> wrote:
>>>>>
>>>
>>>> Agreed in the sense that negative displacements exist.
>>>
>>>> However, can note that positive displacements tend to be
>>>> significantly more common than negative ones. Whether or not it
>>>> makes sense to have a negative displacement depends mostly on the
>>>> probability of greater than half of the missed displacements being
>>>> negative.
>>>
>>>>  From what I can tell, this seems to be:
>>>>    ~ 10 bits, scaled.
>>>>    ~ 13 bits, unscaled.
>>>
>>>
>>>> So, say, an ISA like RISC-V might have had a slightly better hit rate with
>>>> unsigned displacements than with signed displacements, but if one
>>>> added 1 or 2 bits, signed would have still been a clear winner (or,
>>>> with 1 or 2 fewer bits, unsigned a clear winner).
>>>
>>>> I ended up going with signed displacements for XG2, but it was
>>>> pretty close to break-even in this case (when expanding from the
>>>> 9-bit unsigned displacements in Baseline).
>>>
>>>
>>>> Granted, all signed or all-unsigned might be better from an ISA
>>>> design consistency POV.
>>>
>>>
>>>> If one had 16-bit displacements, then unscaled displacements would
>>>> make sense; otherwise scaled displacements seem like a win
>>>> (misaligned displacements being much less common than aligned
>>>> displacements).
>>>
>>> What we need is ~16-bit displacements where 82½%-91¼% are positive.
>>>
>
>> I was seeing stats more like 99.8% positive, 0.2% negative.
>
> After pulling out the calculator and thinking about the frames, My 66000
> needs no more than 18 DW of negative addressing. This is just
> over 0.2% as you indicate.
>

OK.

Not entirely sure I know what you mean by 'DW' here though...

>
>> There was enough of a bias that, below 10 bits, if one takes all the
>> remaining cases, zero extending would always win, until reaching 10
>> bits, when the number of missed reaches 50% negative (along with
>> positive displacements larger than 512).
>
>> So, one can make a choice: -512..511, or 0..1023, ...
>
>> In XG2, I ended up with -512..511, for pros or cons (for some
>> programs, this choice is optimal, for others it is not).
>
>> Where, when scaled for QWORD, this is +/- 4K.
>
>
>> If one had a 16-bit displacement, it would be a choice between +/-
>> 32K, or (scaled) +/- 256K, or 0..512K, ...
>
> We looked at this in Mc88100 (scaling of the displacement). The drawback
> was that the ISA and linker were slightly mismatched: The linker wanted
> to use a single upper 16-bit LUI <if it were> over several LD/STs of
> potentially different sizes, and scaling of the displacement failed in
> those regards; so we dropped scaled displacements.
>

This is partly why I initially went with unscaled for PC-rel and GBR-rel
cases, since these were being used for globals, and the linker/reloc
stage would need to deal with more complexity for the relocs in this case.

For normal direct displacements, these will not typically be used for
accessing globals or similar, or otherwise need relocs, so scaled made
more sense here.

Some scaled cases (for GBR) were later re-added mostly as an
optimization case.

And, when generating the binary all at once, it is possible to cluster
the commonly used globals close together rather than have them scattered
all across ".data" and ".bss" (so, most of the further reaches of these
sections can be all the bulk arrays and similar).

If doing separate compilation, likely the compiler would need to use the
generic case (or possibly involve the arcane magic known as "linker
relaxation").

>> For the special purpose "LEA.Q (GBR, Disp16), Rn" instruction, I ended
>> up going unsigned, where for a lot of the programs I am dealing with,
>> this is big enough to cover ".data" and part of ".bss", generally used
>> for arrays which need the larger displacements (the compiler lays
>> things out so that most of the commonly used variables are closer to
>> the start of ".data", so can use smaller displacements).
>
> Not even an issue when one has universal constants.
>

In this case:
LEA.B (GBR, Disp33s), Rn
Needs 64 bits to encode, whereas:
LEA.Q (GBR, Disp16u), Rn
Can be encoded in 32 bits (but only applicable if the array is within
512K of GBR).

This saves some space for loading the address of a global array.
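
As a rough sketch of that size decision: a 16-bit unsigned displacement scaled by 8 reaches 65536 * 8 = 512K bytes above GBR. The function name is hypothetical, and the requirement that the offset be a multiple of 8 is an assumption that follows from the scaling, not something stated above.

```python
def lea_encoding_bytes(offset_from_gbr, scale=8, disp_bits=16):
    """Return 4 when the short scaled-unsigned form could be used, else 8
    (standing in for the 64-bit Disp33s form). Illustrative only."""
    reach = (1 << disp_bits) * scale                 # 65536 * 8 = 512K
    in_range = 0 <= offset_from_gbr < reach
    aligned = offset_from_gbr % scale == 0           # assumed: scaled disp needs this
    return 4 if in_range and aligned else 8


print(lea_encoding_bytes(0x1000))        # near the start of ".data"
print(lea_encoding_bytes(512 * 1024))    # first byte past the 512K window
```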

Though, as-is, string literals still always need the longer form:
LEA.B (PC, Disp33s), Rn
....

Similar also applies for function pointers.

>
>>> How does one use a frame pointer without negative displacements ??
>>>
>>> [FP+disp] accesses callee save registers
>>> [FP-disp] accesses local stack variables and descriptors
>>>
>>> [SP+disp] accesses argument and result values
>>>
>
>> In my case, all of these are [SP+Disp], granted, there is no frame
>> pointer and stack frames are fixed-size in BGBCC.
>
>> This is typically with a frame layout like:
>>    Argument/Spill space
>>    -- Frame Top
>>    Register Save
>>    (Stack Canary)
>>    Local arrays/structs
>>    Local variables
>>    Argument/Spill Space
>>    -- Frame Bottom
>
>> Contrast with traditional x86 layout, which puts saved registers and
>> local variables near the frame-pointer, which points near the top of
>> the stack frame.
>
>> Though, in a majority of functions, the MOV.L and MOV.Q instructions have
>> a big enough displacement to cover the whole frame (excludes functions
>> which have a lot of local arrays or similar, though overly large local
>> arrays are auto-folded to using heap allocation, but at present this
>> logic is based on the size of individual arrays rather than on the
>> total combined size of the stack frame).
>
>
>> Adding a frame pointer (with negative displacements) wouldn't make a
>> big difference in XG2 Mode, but would be more of an issue for (pure)
>> Baseline, where options are either to load the displacement into a
>> register, or use a jumbo prefix.
>
>
>>>> But, admittedly, main reason I went with unscaled for GBR-rel and
>>>> PC-rel Load/Store, was because using scaled displacements here would
>>>> have required more relocation types (nevermind if the hit rate for
>>>> unscaled 9-bit displacements is "pretty weak").
>>>
>>>> Though, did end up later adding specialized Scaled GBR-Rel
>>>> Load/Store ops (to improve code density), so it might have been
>>>> better in retrospect had I instead just gone with the "keep it scaled and
>>>> add more reloc types to compensate" option.
>>>
>>>
>>>> ....
>>>
>>>
>>>>> YMMV.

Re: Stealing a Great Idea from the 6600

<v0jlf3$i3mh$2@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=38475&group=comp.arch#38475

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sat, 27 Apr 2024 14:59:30 -0500
Organization: A noiseless patient Spider
Lines: 72
Message-ID: <v0jlf3$i3mh$2@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 27 Apr 2024 21:59:32 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="6d46d46453f8be8cb5c0e2a63a99790f";
logging-data="593617"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18G331M3Oqob0nONLHtiF9GSQAFfR2xarM="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:m59WDiPqDKfXkoxqHSoEghio18s=
In-Reply-To: <IQSWN.4$nQv.0@fx10.iad>
Content-Language: en-US
 by: BGB - Sat, 27 Apr 2024 19:59 UTC

On 4/26/2024 1:59 PM, EricP wrote:
> MitchAlsup1 wrote:
>> BGB wrote:
>>
>>> If one had 16-bit displacements, then unscaled displacements would
>>> make sense; otherwise scaled displacements seem like a win
>>> (misaligned displacements being much less common than aligned
>>> displacements).
>>
>> What we need is ~16-bit displacements where 82½%-91¼% are positive.
>>
>> How does one use a frame pointer without negative displacements ??
>>
>> [FP+disp] accesses callee save registers
>> [FP-disp] accesses local stack variables and descriptors
>>
>> [SP+disp] accesses argument and result values
>
> Sign-extended 16-bit offsets would cover almost all such access needs
> so I really don't see the need for funny business.
>
> But if you really want a skewed range offset it could use something like
> excess-256 encoding which zero extends the immediate then subtracts 256
> (or whatever) from it, to give offsets in the range -256..+65535-256.
> So an immediate value of 0 equals an offset of -256.
>
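
A minimal sketch of the excess-256 scheme quoted above, assuming a 16-bit immediate field. The names `encode` and `decode` are illustrative; real hardware would do the subtraction in the address-generation path.

```python
BIAS = 256          # the "excess" value from the quoted post
FIELD_BITS = 16     # assumed width of the immediate field


def encode(offset):
    """Map a byte offset to the unsigned immediate stored in the instruction."""
    imm = offset + BIAS
    if not 0 <= imm < (1 << FIELD_BITS):
        raise ValueError(f"offset {offset} not encodable")
    return imm


def decode(imm):
    """Zero-extend the immediate, then subtract the bias."""
    return imm - BIAS


print(decode(0), decode((1 << FIELD_BITS) - 1))
```

The decoded range runs from -256 to 65535 - 256 = 65279, matching the range given in the quoted text.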

Yeah, my thinking was that by the time one has 16 bits for Load/Store
displacements, they could almost just go +/- 32K and call it done.

But, much smaller than this, there is an advantage to scaling the
displacements.
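
For reference, the reach figures used in this subthread can be reproduced with a few lines. This is a sketch; `reach` is an illustrative helper and the scale of 8 corresponds to QWORD accesses.

```python
def reach(bits, scale=1, signed=True):
    """Return (lowest, highest) byte offset reachable with a `bits`-wide
    displacement that is multiplied by `scale` before use."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return lo * scale, hi * scale


print(reach(10, scale=8))                 # 10-bit signed, QWORD-scaled: about +/- 4K
print(reach(16))                          # 16-bit signed, unscaled: +/- 32K
print(reach(16, scale=8))                 # 16-bit signed, scaled: about +/- 256K
print(reach(16, scale=8, signed=False))   # 16-bit unsigned, scaled: 0..512K
```

Scaling buys three extra bits of reach for 8-byte accesses, which matters far more at 10 bits than at 16.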

In other news, got around to getting the RISC-V code to build in PIE
mode for Doom (by using "riscv64-unknown-linux-gnu-*").

Can note that RV64 code density takes a hit in this case:
RV64: 299K (.text)
XG2 : 284K (.text)

So, apparently using this version of GCC and using "-fPIE" works in my
favor regarding code density...

I guess a question is what FDPIC would do if GCC supported it, since
this would be the closest direct analog to my own ABI.

I guess some people are dragging their feet on FDPIC, as there is some
debate as to whether or not NOMMU makes sense for RISC-V, along with its
associated performance impact if used.

In my case, if I wanted to go over to simple base-relocatable images,
this would technically eliminate the need for GBR reloading.

Checks:
Simple base-relocatable case actually currently generates bigger
binaries, I suspect because in this case it is less space-efficient to
use PC-rel vs GBR-rel.

Went and added a "pbostatic" option, which sidesteps saving and
restoring GBR (making the simplifying assumption that functions will
never be called from outside the current binary).

This saves roughly 4K (Doom's ".text" shrinks to 280K).

....

Re: Stealing a Great Idea from the 6600

<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>


https://news.novabbs.org/devel/article-flat.php?id=38476&group=comp.arch#38476

Date: Sat, 27 Apr 2024 20:37:34 +0000
Subject: Re: Stealing a Great Idea from the 6600
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$IAFkvKc85GmfqRSVAoUDs.h3zmEZkEX11Os.TSepVbVmitTtL9FxC
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com> <ec69999967361c286afdbe60bc2443ea@www.novabbs.org> <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me> <ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad> <v0jlf3$i3mh$2@dont-email.me>
Organization: Rocksolid Light
Message-ID: <3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
 by: MitchAlsup1 - Sat, 27 Apr 2024 20:37 UTC

BGB wrote:

> On 4/26/2024 1:59 PM, EricP wrote:
>> MitchAlsup1 wrote:
>>> BGB wrote:
>>>
>>>> If one had 16-bit displacements, then unscaled displacements would
>>>> make sense; otherwise scaled displacements seem like a win
>>>> (misaligned displacements being much less common than aligned
>>>> displacements).
>>>
>>> What we need is ~16-bit displacements where 82½%-91¼% are positive.
>>>
>>> How does one use a frame pointer without negative displacements ??
>>>
>>> [FP+disp] accesses callee save registers
>>> [FP-disp] accesses local stack variables and descriptors
>>>
>>> [SP+disp] accesses argument and result values
>>
>> A sign extended 16-bit offsets would cover almost all such access needs
>> so I really don't see the need for funny business.
>>
>> But if you really want a skewed range offset it could use something like
>> excess-256 encoding which zero extends the immediate then subtract 256
>> (or whatever) from it, to give offsets in the range -256..+65535-256.
>> So an immediate value of 0 equals an offset of -256.
>>
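For concreteness, the excess-256 idea quoted above can be sketched in a few lines of Python. The 16-bit field width and the bias of 256 come from the quoted text; the function names are made up for illustration, and the sign-extended decoder is only there for comparison.

```python
IMM_BITS = 16                 # width of the displacement field (from the quoted text)
BIAS = 256                    # excess-256: stored value = offset + 256
IMM_MASK = (1 << IMM_BITS) - 1


def encode_disp(offset: int) -> int:
    """Return the unsigned immediate field for a byte offset."""
    imm = offset + BIAS
    if not 0 <= imm <= IMM_MASK:
        raise ValueError(f"offset {offset} is not encodable")
    return imm


def decode_disp(imm: int) -> int:
    """Zero-extend the immediate, then subtract the bias."""
    return (imm & IMM_MASK) - BIAS


def decode_signed(imm: int) -> int:
    """Plain sign-extended 16-bit decode, for comparison."""
    imm &= IMM_MASK
    return imm - (1 << IMM_BITS) if imm & (1 << (IMM_BITS - 1)) else imm
```

With these definitions an immediate of 0 decodes to -256 and an immediate of 0xFFFF decodes to +65279, whereas the sign-extended decode of 0xFFFF is -1.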

> Yeah, my thinking was that by the time one has 16 bits for Load/Store
> displacements, they could almost just go +/- 32K and call it done.

> But, much smaller than this, there is an advantage to scaling the
> displacements.

> In other news, got around to getting the RISC-V code to build in PIE
> mode for Doom (by using "riscv64-unknown-linux-gnu-*").

> Can note that RV64 code density takes a hit in this case:
> RV64: 299K (.text)
> XG2 : 284K (.text)

Is this indicative that your ISA and RISC-V are within spitting distance
of each other in terms of the number of instructions in .text ?? or not ??

> So, apparently using this version of GCC and using "-fPIE" works in my
> favor regarding code density...

> I guess a question is what FDPIC would do if GCC supported it, since
> this would be the closest direct analog to my own ABI.

What is FDPIC ?? Federal Deposit Processor Insurance Corporation ??
Final Dopey Position Independent Code ??

> I guess some people are dragging their feet on FDPIC, as there is some
> debate as to whether or not NOMMU makes sense for RISC-V, along with its
> associated performance impact if used.

> In my case, if I wanted to go over to simple base-relocatable images,
> this would technically eliminate the need for GBR reloading.

> Checks:
> Simple base-relocatable case actually currently generates bigger
> binaries, I suspect because in this case it is less space-efficient to
> use PC-rel vs GBR-rel.

> Went and added a "pbostatic" option, which sidesteps saving and
> restoring GBR (making the simplifying assumption that functions will
> never be called from outside the current binary).

> This saves roughly 4K (Doom's ".text" shrinks to 280K).

Would you be willing to compile DOOM with Brian's LLVM compiler and
show the results ??

> ....

Re: Stealing a Great Idea from the 6600

<v0k2kb$l21r$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=38479&group=comp.arch#38479

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sat, 27 Apr 2024 18:44:08 -0500
Organization: A noiseless patient Spider
Lines: 210
Message-ID: <v0k2kb$l21r$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 28 Apr 2024 01:44:11 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="437e2caa7a0d326db2a3a89ac4546326";
logging-data="690235"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+0mv4TB6xCAb9EYfGiz+VK8X9Ol5hgsYk="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:AIgY6MsdRlADm1/1TXPrF6/MzOA=
In-Reply-To: <3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
Content-Language: en-US
 by: BGB - Sat, 27 Apr 2024 23:44 UTC

On 4/27/2024 3:37 PM, MitchAlsup1 wrote:
> BGB wrote:
>
>> On 4/26/2024 1:59 PM, EricP wrote:
>>> MitchAlsup1 wrote:
>>>> BGB wrote:
>>>>
>>>>> If one had 16-bit displacements, then unscaled displacements would
>>>>> make sense; otherwise scaled displacements seem like a win
>>>>> (misaligned displacements being much less common than aligned
>>>>> displacements).
>>>>
>>>> What we need is ~16-bit displacements where 82½%-91¼% are positive.
>>>>
>>>> How does one use a frame pointer without negative displacements ??
>>>>
>>>> [FP+disp] accesses callee save registers
>>>> [FP-disp] accesses local stack variables and descriptors
>>>>
>>>> [SP+disp] accesses argument and result values
>>>
>>> A sign extended 16-bit offsets would cover almost all such access needs
>>> so I really don't see the need for funny business.
>>>
>>> But if you really want a skewed range offset it could use something like
>>> excess-256 encoding which zero extends the immediate then subtract 256
>>> (or whatever) from it, to give offsets in the range -256..+65535-256.
>>> So an immediate value of 0 equals an offset of -256.
>>>
>
>> Yeah, my thinking was that by the time one has 16 bits for Load/Store
>> displacements, they could almost just go +/- 32K and call it done.
>
>> But, much smaller than this, there is an advantage to scaling the
>> displacements.
>
>
>
>
>> In other news, got around to getting the RISC-V code to build in PIE
>> mode for Doom (by using "riscv64-unknown-linux-gnu-*").
>
>> Can note that RV64 code density takes a hit in this case:
>>    RV64: 299K (.text)
>>    XG2 : 284K (.text)
>
> Is this indicative that your ISA and RISC-V are within spitting distance
> of each other in terms of the number of instructions in .text ?? or not ??
>

It would appear that, with my current compiler output, both BJX2-XG2 and
RISC-V RV64G are within a few percent of each other...

If adjusting for Jumbo prefixes (with the version that omits GBR reloads):
XG2: 270K (-10K of Jumbo Prefixes)

Implying RISC-V now has around 11% more instructions in this scenario.

It also has an additional 20K of ".rodata" that is likely constants,
which likely overlap significantly with the jumbo prefixes.
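As a rough check of the 11% figure: assuming fixed 4-byte instructions on both sides (RV64G without compressed encodings, and XG2 once the jumbo prefixes are subtracted), the ratio of ".text" sizes is also the ratio of instruction counts. The sizes are the ones quoted in this thread; the variable names are just for illustration.

```python
RV64G_TEXT_K = 299      # RV64G ".text"
XG2_TEXT_K = 280        # XG2 ".text" with GBR reloads omitted
JUMBO_PREFIX_K = 10     # portion of XG2 ".text" that is jumbo prefixes

xg2_adjusted_k = XG2_TEXT_K - JUMBO_PREFIX_K
ratio = RV64G_TEXT_K / xg2_adjusted_k
extra_percent = (ratio - 1.0) * 100.0

print(f"XG2 adjusted: {xg2_adjusted_k}K")
print(f"RV64G / XG2 = {ratio:.3f} (about {extra_percent:.1f}% more)")
```

This comes out at 299/270, or roughly 10.7%, which rounds to the "around 11%" above.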

>> So, apparently using this version of GCC and using "-fPIE" works in my
>> favor regarding code density...
>
>
>> I guess a question is what FDPIC would do if GCC supported it, since
>> this would be the closest direct analog to my own ABI.
>
> What is FDPIC ?? Federal Deposit Processor Insurance   Corporation ??
>                 Final   Dopey   Position  Independent Code ??
>

Required a little digging: "Function Descriptor Position Independent Code".

But, I think the main difference is that, normal PIC does calls like:
   LD Rt, [GOT+Disp]
   BSR Rt

Whereas, FDPIC was typically more like (pseudo ASM):
   MOV SavedGOT, GOT
   LEA Rt, [GOT+Disp]
   MOV GOT, [Rt+8]
   MOV Rt, [Rt+0]
   BSR Rt
   MOV GOT, SavedGOT

But, in my case, noting that function calls tend to be more common than
the functions themselves, and functions will know whether or not they
need to access global variables or call other functions, ... it made
more sense to move this logic into the callee.
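A toy Python model of the two placements, as a sketch only: "Machine.got" stands in for the GOT/GBR register, a function descriptor is a pair of (entry, GOT), and all names here are hypothetical. How the callee finds its own value is simplified to a constant.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Machine:
    got: str  # stand-in for the GOT (or GBR) register


@dataclass
class FuncDesc:
    entry: Callable[[Machine], str]  # descriptor word at +0: code address
    got: str                         # descriptor word at +8: callee's GOT


def call_fdpic(m: Machine, desc: FuncDesc) -> str:
    """Caller-side switch, mirroring the pseudo ASM sequence."""
    saved = m.got            # MOV SavedGOT, GOT
    m.got = desc.got         # MOV GOT, [Rt+8]
    result = desc.entry(m)   # MOV Rt, [Rt+0] ; BSR Rt
    m.got = saved            # MOV GOT, SavedGOT
    return result


def callee_switches(own_got: str):
    """Callee-side switch: the save/reload lives in the prolog/epilog."""
    def wrap(body):
        def fn(m: Machine) -> str:
            saved = m.got        # prolog: save the caller's value
            m.got = own_got      # prolog: reload (lookup simplified to a constant)
            try:
                return body(m)
            finally:
                m.got = saved    # epilog: restore
        return fn
    return wrap


def uses_globals(m: Machine) -> str:
    return f"globals via {m.got}"


def leaf(m: Machine) -> str:
    return "leaf, GOT untouched"  # no globals, no calls: no switch needed


lib_func = callee_switches("lib.got")(uses_globals)
```

In the caller-side form every call site pays for the switch; in the callee-side form the cost is paid once per function body, and a function like "leaf" pays nothing.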

No official RISC-V FDPIC ABI that I am aware of, though some proposals
did seem vaguely similar in some areas to what I was doing with PBO.

Where, they were accessing globals like:
   LUI Xt, DispHi
   ADD Xt, Xt, DispLo
   ADD Xt, Xt, GP
   LD  Xd, Xt, 0

Granted, this is less efficient than, say:
   MOV.Q (GBR, Disp33s), Rd
Though, people didn't really detail the call sequence or prolog/epilog
sequences, so less sure how this would work.

Likely guess, something like:
   MV    Xs, GP
   LUI   Xt, DispHi
   ADD   Xt, Xt, DispLo
   ADD   Xt, Xt, GP
   LD    GP, Xt, 8
   LD    Xt, Xt, 0
   JALR  LR, Xt, 0
   MV    GP, Xs

Well, unless they have a better way to pull this off...

But, yeah, as far as I saw it, my "better solution" was to put this part
into the callee.

Main tradeoff with my design is:
From any GBR, one needs to be able to get to every other GBR;
We need to have a way to know which table entry to reload (not
statically known at compile time).

In my PBO ABI, this was accomplished by using base relocs (but, this is
N/A for ELF, where PE/COFF style base relocs are not a thing).

One other option might be to use a PC-relative load to load the index.
Say:
   AUIPC Xs, DispHi      //"__global_pbo_offset$" ?
   LD    Xs, Xs, DispLo
   LD    Xt, GP, 0       //get table of offsets
   ADD   Xt, Xt, Xs
   LD    GP, Xt, 0

In this case, "__global_pbo_offset$" would be a magic constant variable
that gets fixed up by the ELF loader.
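A toy model of the reload, as a sketch only: each process gets one table of data-section bases, the word at [GP+0] of every data section points at that table, and each image carries an index that the loader fixed up. The class and function names are hypothetical, and keeping the same index for an image in every process is a simplification made here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class DataSection:
    name: str
    table: List[Optional["DataSection"]]  # word at [GP+0]: per-process table


@dataclass(frozen=True)
class Image:
    name: str
    pbo_index: int  # hypothetical stand-in for "__global_pbo_offset$"


def load_process(images: List[Image]) -> List[Optional[DataSection]]:
    """Give every image a private data section in a fresh process."""
    table: List[Optional[DataSection]] = [None] * len(images)
    for img in images:
        table[img.pbo_index] = DataSection(f"{img.name}.data", table)
    return table


def reload_gp(current_gp: DataSection, image: Image) -> DataSection:
    """LD Xt, GP, 0 ; ADD Xt, Xt, Xs ; LD GP, Xt, 0"""
    section = current_gp.table[image.pbo_index]
    assert section is not None
    return section
```

Loading the same two images into two processes gives two independent tables, so the same code reaches a different data section depending on which GP it was entered with.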

>> I guess some people are dragging their feet on FDPIC, as there is some
>> debate as to whether or not NOMMU makes sense for RISC-V, along with
>> its associated performance impact if used.
>
>> In my case, if I wanted to go over to simple base-relocatable images,
>> this would technically eliminate the need for GBR reloading.
>
>> Checks:
>> Simple base-relocatable case actually currently generates bigger
>> binaries, I suspect because in this case it is less space-efficient to
>> use PC-rel vs GBR-rel.
>
>> Went and added a "pbostatic" option, which sidesteps saving and
>> restoring GBR (making the simplifying assumption that functions will
>> never be called from outside the current binary).
>
>> This saves roughly 4K (Doom's ".text" shrinks to 280K).
>
> Would you be willing to compile DOOM with Brian's LLVM compiler and
> show the results ??
>

Will need to download and build this compiler...

Might need to look into this.

But, yeah, current standing for this is:
XG2   :  280K (static linked, Modified PDPCLIB + TestKern)
RV64G :  299K (static linked, Modified PDPCLIB + TestKern)
X86-64:  288K ("gcc -O3", dynamically linked GLIBC)
X64   : 1083K (VS2022, static linked MSVCRT)

But, MSVC is an outlier here for just how bad it is on this front.

To get more reference points, would need to install more compilers.

Could have provided an ARM reference point, except that the compiler
isn't compiling stuff at the moment (would need to beat on stuff a bit
more to try to get it to build; appears to be trying to build with
static-linked Newlib but is missing symbols, ...).

But, yeah, for good comparison, one needs to have everything build with
the same C library, etc.

I am thinking it may be possible to save a little more space by folding
some of the stuff for "va_start()" into an ASM blob (currently, a lot of
stuff is folded off into the function prolog, but probably doesn't need
to be done inline for every varargs function).

Mostly this would be the logic for spilling all of the argument
registers to a location on the stack and similar.

>> ....

Re: Stealing a Great Idea from the 6600

<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>


https://news.novabbs.org/devel/article-flat.php?id=38480&group=comp.arch#38480

Date: Sun, 28 Apr 2024 01:45:59 +0000
Subject: Re: Stealing a Great Idea from the 6600
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$1WXnCXYQr3NWJmo.DtixEO3hdEzfUudP39TPUIGbI2mogDTBOyyVO
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com> <ec69999967361c286afdbe60bc2443ea@www.novabbs.org> <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me> <ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad> <v0jlf3$i3mh$2@dont-email.me> <3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org> <v0k2kb$l21r$1@dont-email.me>
Organization: Rocksolid Light
Message-ID: <58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
 by: MitchAlsup1 - Sun, 28 Apr 2024 01:45 UTC

BGB wrote:

> On 4/27/2024 3:37 PM, MitchAlsup1 wrote:
>> BGB wrote:
>>
>>> On 4/26/2024 1:59 PM, EricP wrote:
>>>> MitchAlsup1 wrote:
>>>>> BGB wrote:
>>>>>
>>>>>> If one had 16-bit displacements, then unscaled displacements would
>>>>>> make sense; otherwise scaled displacements seem like a win
>>>>>> (misaligned displacements being much less common than aligned
>>>>>> displacements).
>>>>>
>>>>> What we need is ~16-bit displacements where 82½%-91¼% are positive.
>>>>>
>>>>> How does one use a frame pointer without negative displacements ??
>>>>>
>>>>> [FP+disp] accesses callee save registers
>>>>> [FP-disp] accesses local stack variables and descriptors
>>>>>
>>>>> [SP+disp] accesses argument and result values
>>>>
>>>> A sign extended 16-bit offsets would cover almost all such access needs
>>>> so I really don't see the need for funny business.
>>>>
>>>> But if you really want a skewed range offset it could use something like
>>>> excess-256 encoding which zero extends the immediate then subtract 256
>>>> (or whatever) from it, to give offsets in the range -256..+65535-256.
>>>> So an immediate value of 0 equals an offset of -256.
>>>>
>>
>>> Yeah, my thinking was that by the time one has 16 bits for Load/Store
>>> displacements, they could almost just go +/- 32K and call it done.
>>
>>> But, much smaller than this, there is an advantage to scaling the
>>> displacements.
>>
>>
>>
>>
>>> In other news, got around to getting the RISC-V code to build in PIE
>>> mode for Doom (by using "riscv64-unknown-linux-gnu-*").
>>
>>> Can note that RV64 code density takes a hit in this case:
>>>    RV64: 299K (.text)
>>>    XG2 : 284K (.text)
>>
>> Is this indicative that your ISA and RISC-V are within spitting distance
>> of each other in terms of the number of instructions in .text ?? or not ??
>>

> It would appear that, with my current compiler output, both BJX2-XG2 and
> RISC-V RV64G are within a few percent of each other...

> If adjusting for Jumbo prefixes (with the version that omits GBR reloads):
> XG2: 270K (-10K of Jumbo Prefixes)

> Implying RISC-V now has around 11% more instructions in this scenario.

Based on Brian's LLVM compiler; RISC-V has about 40% more instructions
than My 66000, or My 66000 has about 70% of the number of instructions
that RISC-V has (same compilation flags, same source code).

> It also has an additional 20K of ".rodata" that is likely constants,
> which likely overlap significantly with the jumbo prefixes.

My 66000 has vastly smaller .rodata because constants are part of .text

>>> So, apparently using this version of GCC and using "-fPIE" works in my
>>> favor regarding code density...
>>
>>
>>> I guess a question is what FDPIC would do if GCC supported it, since
>>> this would be the closest direct analog to my own ABI.
>>
>> What is FDPIC ?? Federal Deposit Processor Insurance   Corporation ??
>>                 Final   Dopey   Position  Independent Code ??
>>

> Required a little digging: "Function Descriptor Position Independent Code".

> But, I think the main difference is that, normal PIC does calls like:
> LD Rt, [GOT+Disp]
> BSR Rt

CALX [IP,,#GOT+#disp-.]

It is unlikely that %GOT can be represented with a 16-bit offset from IP
so the 32-bit displacement form (,,) is used.

> Whereas, FDPIC was typically more like (pseudo ASM):
> MOV SavedGOT, GOT
> LEA Rt, [GOT+Disp]
> MOV GOT, [Rt+8]
> MOV Rt, [Rt+0]
> BSR Rt
> MOV GOT, SavedGOT

Since GOT is not in a register but is an address constant this is also::

CALX [IP,,#GOT+#disp-.]

> But, in my case, noting that function calls tend to be more common than
> the functions themselves, and functions will know whether or not they
> need to access global variables or call other functions, ... it made
> more sense to move this logic into the callee.

> No official RISC-V FDPIC ABI that I am aware of, though some proposals
> did seem vaguely similar in some areas to what I was doing with PBO.

> Where, they were accessing globals like:
> LUI Xt, DispHi
> ADD Xt, Xt, DispLo
> ADD Xt, Xt, GP
> LD Xd, Xt, 0

> Granted, this is less efficient than, say:
> MOV.Q (GBR, Disp33s), Rd

LDD Rd,[IP,,#GOT+#disp-.]

> Though, people didn't really detail the call sequence or prolog/epilog
> sequences, so less sure how this would work.

> Likely guess, something like:
> MV Xs, GP
> LUI Xt, DispHi
> ADD Xt, Xt, DispLo
> ADD Xt, Xt, GP
> LD GP, Xt, 8
> LD Xt, Xt, 0
> JALR LR, Xt, 0
> MV GP, Xs

> Well, unless they have a better way to pull this off...

CALX [IP,,#GOT+#disp-.]

> But, yeah, as far as I saw it, my "better solution" was to put this part
> into the callee.

> Main tradeoff with my design is:
> From any GBR, one needs to be able to get to every other GBR;
> We need to have a way to know which table entry to reload (not
> statically known at compile time).

Resolved by linker or accessed through GOT in mine. Each dynamic
module gets its own GOT.

> In my PBO ABI, this was accomplished by using base relocs (but, this is
> N/A for ELF, where PE/COFF style base relocs are not a thing).

> One other option might be to use a PC-relative load to load the index.
> Say:
> AUIPC Xs, DispHi //"__global_pbo_offset$" ?
> LD Xs, Xs, DispLo
> LD Xt, GP, 0 //get table of offsets
> ADD Xt, Xt, Xs
> LD GP, Xt, 0

> In this case, "__global_pbo_offset$" would be a magic constant variable
> that gets fixed up by the ELF loader.

LDD Rd,[IP,,#GOT+#disp-.]

>>> I guess some people are dragging their feet on FDPIC, as there is some
>>> debate as to whether or not NOMMU makes sense for RISC-V, along with
>>> its associated performance impact if used.
>>
>>> In my case, if I wanted to go over to simple base-relocatable images,
>>> this would technically eliminate the need for GBR reloading.
>>
>>> Checks:
>>> Simple base-relocatable case actually currently generates bigger
>>> binaries, I suspect because in this case it is less space-efficient to
>>> use PC-rel vs GBR-rel.
>>
>>> Went and added a "pbostatic" option, which sidesteps saving and
>>> restoring GBR (making the simplifying assumption that functions will
>>> never be called from outside the current binary).
>>
>>> This saves roughly 4K (Doom's ".text" shrinks to 280K).
>>
>> Would you be willing to compile DOOM with Brian's LLVM compiler and
>> show the results ??
>>

> Will need to download and build this compiler...

> Might need to look into this.

Please do.

> But, yeah, current standing for this is:
> XG2 : 280K (static linked, Modified PDPCLIB + TestKern)
> RV64G : 299K (static linked, Modified PDPCLIB + TestKern)
> X86-64: 288K ("gcc -O3", dynamically linked GLIBC)
> X64 : 1083K (VS2022, static linked MSVCRT)

> But, MSVC is an outlier here for just how bad it is on this front.

> To get more reference points, would need to install more compilers.

> Could have provided an ARM reference point, except that the compiler
> isn't compiling stuff at the moment (would need to beat on stuff a bit
> more to try to get it to build; appears to be trying to build with
> static-linked Newlib but is missing symbols, ...).

> But, yeah, for good comparison, one needs to have everything build with
> the same C library, etc.

> I am thinking it may be possible to save a little more space by folding
> some of the stuff for "va_start()" into an ASM blob (currently, a lot of
> stuff is folded off into the function prolog, but probably doesn't need
> to be done inline for every varargs function).

> Mostly this would be the logic for spilling all of the argument
> registers to a location on the stack and similar.

Part of ENTER already does this: A typical subroutine will use::

ENTER R27,R0,#local_stack_size

Where the varargs subroutine will use::

ENTER R27,R8,#local_stack_size
ADD Rva_ptr,SP,#local_stack_size+64


Re: Stealing a Great Idea from the 6600

<v0kodk$t520$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=38481&group=comp.arch#38481

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 00:56:01 -0500
Organization: A noiseless patient Spider
Lines: 371
Message-ID: <v0kodk$t520$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
<v0k2kb$l21r$1@dont-email.me>
<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 28 Apr 2024 07:56:05 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="437e2caa7a0d326db2a3a89ac4546326";
logging-data="955456"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX185zDtBAa83xK6ptuxq0nu8GVxfBO3Eyok="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:qrDgnCfn4mmHshQJLeMzYx3C5VE=
Content-Language: en-US
In-Reply-To: <58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
 by: BGB - Sun, 28 Apr 2024 05:56 UTC

On 4/27/2024 8:45 PM, MitchAlsup1 wrote:
> BGB wrote:
>
>> On 4/27/2024 3:37 PM, MitchAlsup1 wrote:
>>> BGB wrote:
>>>
>>>> On 4/26/2024 1:59 PM, EricP wrote:
>>>>> MitchAlsup1 wrote:
>>>>>> BGB wrote:
>>>>>>
>>>>>>> If one had 16-bit displacements, then unscaled displacements
>>>>>>> would make sense; otherwise scaled displacements seem like a win
>>>>>>> (misaligned displacements being much less common than aligned
>>>>>>> displacements).
>>>>>>
>>>>>> What we need is ~16-bit displacements where 82½%-91¼% are positive.
>>>>>>
>>>>>> How does one use a frame pointer without negative displacements ??
>>>>>>
>>>>>> [FP+disp] accesses callee save registers
>>>>>> [FP-disp] accesses local stack variables and descriptors
>>>>>>
>>>>>> [SP+disp] accesses argument and result values
>>>>>
>>>>> A sign extended 16-bit offsets would cover almost all such access
>>>>> needs
>>>>> so I really don't see the need for funny business.
>>>>>
>>>>> But if you really want a skewed range offset it could use something
>>>>> like
>>>>> excess-256 encoding which zero extends the immediate then subtract 256
>>>>> (or whatever) from it, to give offsets in the range -256..+65535-256.
>>>>> So an immediate value of 0 equals an offset of -256.
>>>>>
>>>
>>>> Yeah, my thinking was that by the time one has 16 bits for
>>>> Load/Store displacements, they could almost just go +/- 32K and call
>>>> it done.
>>>
>>>> But, much smaller than this, there is an advantage to scaling the
>>>> displacements.
>>>
>>>
>>>
>>>
>>>> In other news, got around to getting the RISC-V code to build in PIE
>>>> mode for Doom (by using "riscv64-unknown-linux-gnu-*").
>>>
>>>> Can note that RV64 code density takes a hit in this case:
>>>>    RV64: 299K (.text)
>>>>    XG2 : 284K (.text)
>>>
>>> Is this indicative that your ISA and RISC-V are within spitting
>>> distance of each other in terms of the number of instructions in
>>> .text ?? or not ??
>>>
>
>> It would appear that, with my current compiler output, both BJX2-XG2
>> and RISC-V RV64G are within a few percent of each other...
>
>> If adjusting for Jumbo prefixes (with the version that omits GBR
>> reloads):
>>    XG2: 270K (-10K of Jumbo Prefixes)
>
>> Implying RISC-V now has around 11% more instructions in this scenario.
>
> Based on Brian's LLVM compiler; RISC-V has about 40% more instructions
> than My 66000, or My 66000 has 70% the number of instructions that
> RISC-V has (same compilation flags, same source code).
>

I have made some progress here recently, but it is still a case of (in
my case):
Stronger ISA, but with a compiler with a weak optimizer;
Vs:
Weaker ISA, but with a compiler with a stronger optimizer.

GCC is very clever at figuring out what to optimize...

Meanwhile, BGBCC may fail to optimize away constant sub-expressions if
operator precedence doesn't fall in a preferable direction.

Say:
y=x*3*4;
Doing two multiply instructions in a row, because:
y=x*12;
Didn't happen to map to the AST as it was written (because parsing was
left-associative in this case).

Yeah, actually ran into this recently, only solution at present is to
put parentheses around the constant parts.
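To illustrate the issue (this is a toy, not how BGBCC or GCC actually does it): with a left-associative parse, "x*3*4" becomes (x*3)*4, so a folder that only looks at nodes whose two children are both constants never sees "3*4". Re-associating (e*c1)*c2 into e*(c1*c2) fixes that for integer multiply. The tuple-based AST and the function names below are made up for the example.

```python
def fold_naive(node):
    """Fold only nodes whose two operands are both constants."""
    if not isinstance(node, tuple):
        return node                      # variable name (str) or constant (int)
    op, lhs, rhs = node
    lhs, rhs = fold_naive(lhs), fold_naive(rhs)
    if op == "mul" and isinstance(lhs, int) and isinstance(rhs, int):
        return lhs * rhs
    return (op, lhs, rhs)


def fold_reassoc(node):
    """As fold_naive, but also rewrite (e * c1) * c2 into e * (c1 * c2)."""
    if not isinstance(node, tuple):
        return node
    op, lhs, rhs = node
    lhs, rhs = fold_reassoc(lhs), fold_reassoc(rhs)
    if op == "mul" and isinstance(rhs, int):
        if isinstance(lhs, int):
            return lhs * rhs
        if isinstance(lhs, tuple) and lhs[0] == "mul" and isinstance(lhs[2], int):
            return ("mul", lhs[1], lhs[2] * rhs)
    return (op, lhs, rhs)


# y = x*3*4 as parsed left-associatively: (x*3)*4
parsed = ("mul", ("mul", "x", 3), 4)
# y = x*(3*4), i.e. with parentheses around the constant parts
parenthesized = ("mul", "x", ("mul", 3, 4))
```

Here fold_naive leaves "parsed" as two multiplies but folds "parenthesized" to x*12, while fold_reassoc gets x*12 for both. The rewrite is safe for wrapping integer multiply, but not in general for floating point.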

But, yeah, seemingly GCC isn't fooled by things like precedence order.
Seemingly, it may even chase constants across basic blocks or across
memory loads and stores, causing chunks of code to disappear, etc...

But, still not enough to make up for RV64G's weaknesses it seems.
Well, and Doom isn't full of a lot of cases for it to leverage its
seemingly aggressive constant-folding might...

>> It also has an additional 20K of ".rodata" that is likely constants,
>> which likely overlap significantly with the jumbo prefixes.
>
> My 66000 has vastly smaller .rodata because constants are part of .text
>

Similar, though in my case they exist as Jumbo prefixes.

Except well, if values are declared as "const double x0=...;", where
BGBCC ends up treating it like a normal variable that does not allow
assignment (so will generate different code than had one used #define or
similar).

Also noted cases of this recently when diffing through my compiler output.

Does seem to be context-dependent to some extent though...

>>>> So, apparently using this version of GCC and using "-fPIE" works in
>>>> my favor regarding code density...
>>>
>>>
>>>> I guess a question is what FDPIC would do if GCC supported it, since
>>>> this would be the closest direct analog to my own ABI.
>>>
>>> What is FDPIC ?? Federal Deposit Processor Insurance   Corporation ??
>>>                  Final   Dopey   Position  Independent Code ??
>>>
>
>> Required a little digging: "Function Descriptor Position Independent
>> Code".
>
>> But, I think the main difference is that, normal PIC does calls like:
>>    LD Rt, [GOT+Disp]
>>    BSR Rt
>
>     CALX   [IP,,#GOT+#disp-.]
>
> It is unlikely that %GOT can be represented with 16-bit offset from IP
> so the 32-bit displacement form (,,) is used.
>
>> Whereas, FDPIC was typically more like (pseudo ASM):
>>    MOV SavedGOT, GOT
>>    LEA Rt, [GOT+Disp]
>>    MOV GOT, [Rt+8]
>>    MOV Rt, [Rt+0]
>>    BSR Rt
>>    MOV GOT, SavedGOT
>
> Since GOT is not in a register but is an address constant this is also::
>
>     CALX   [IP,,#GOT+#disp-.]
>

So... Would this also cause GOT to point to a new address on the callee
side (that is dependent on the GOT on the caller side, and *not* on the
PC address at the destination) ?...

In effect, the context dependent GOT daisy-chaining is a fundamental
aspect of FDPIC that is different from conventional PIC.

>> But, in my case, noting that function calls tend to be more common
>> than the functions themselves, and functions will know whether or not
>> they need to access global variables or call other functions, ... it
>> made more sense to move this logic into the callee.
>
>
>> No official RISC-V FDPIC ABI that I am aware of, though some proposals
>> did seem vaguely similar in some areas to what I was doing with PBO.
>
>> Where, they were accessing globals like:
>>    LUI Xt, DispHi
>>    ADD Xt, Xt, DispLo
>>    ADD Xt, Xt, GP
>>    LD  Xd, Xt, 0
>
>> Granted, this is less efficient than, say:
>>    MOV.Q (GBR, Disp33s), Rd
>
>     LDD   Rd,[IP,,#GOT+#disp-.]
>

As noted, BJX2 can handle this in a single 64-bit instruction, vs 4
instructions.

>> Though, people didn't really detail the call sequence or prolog/epilog
>> sequences, so less sure how this would work.
>
>
>> Likely guess, something like:
>>    MV    Xs, GP
>>    LUI   Xt, DispHi
>>    ADD   Xt, Xt, DispLo
>>    ADD   Xt, Xt, GP
>>    LD    GP, Xt, 8
>>    LD    Xt, Xt, 0
>>    JALR  LR, Xt, 0
>>    MV    GP, Xs
>
>> Well, unless they have a better way to pull this off...
>
>     CALX   [IP,,#GOT+#disp-.]
>

Well, can you explain the semantics of this one...

>> But, yeah, as far as I saw it, my "better solution" was to put this
>> part into the callee.
>
>
>> Main tradeoff with my design is:
>>    From any GBR, one needs to be able to get to every other GBR;
>>    We need to have a way to know which table entry to reload (not
>> statically known at compile time).
>
> Resolved by linker or accessed through GOT in mine. Each dynamic
> module gets its own GOT.
>

The important thing is not associating a GOT with an ELF module, but
with an instance of said module.

So, say, one copy of an ELF image, can have N separate GOTs and data
sections (each associated with a program instance).


Re: Stealing a Great Idea from the 6600

<v0ksrk$u1j2$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=38482&group=comp.arch#38482

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 02:11:45 -0500
Organization: A noiseless patient Spider
Lines: 72
Message-ID: <v0ksrk$u1j2$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
<v0k2kb$l21r$1@dont-email.me>
<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
<v0kodk$t520$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 28 Apr 2024 09:11:49 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="437e2caa7a0d326db2a3a89ac4546326";
logging-data="984674"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/kKHjNFgIV+iVHmmHO3IV2Hc/+y7zFaE8="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:LddNmYWlaije6X/mFegnMM58Hu8=
Content-Language: en-US
In-Reply-To: <v0kodk$t520$1@dont-email.me>
 by: BGB - Sun, 28 Apr 2024 07:11 UTC

On 4/28/2024 12:56 AM, BGB wrote:
> On 4/27/2024 8:45 PM, MitchAlsup1 wrote:
>> BGB wrote:
>>

...

>
>>>>> I guess some people are dragging their feet on FDPIC, as there is
>>>>> some debate as to whether or not NOMMU makes sense for RISC-V,
>>>>> along with its associated performance impact if used.
>>>>
>>>>> In my case, if I wanted to go over to simple base-relocatable
>>>>> images, this would technically eliminate the need for GBR reloading.
>>>>
>>>>> Checks:
>>>>> Simple base-relocatable case actually currently generates bigger
>>>>> binaries, I suspect because in this case it is less space-efficient
>>>>> to use PC-rel vs GBR-rel.
>>>>
>>>>> Went and added a "pbostatic" option, which sidesteps saving and
>>>>> restoring GBR (making the simplifying assumption that functions
>>>>> will never be called from outside the current binary).
>>>>
>>>>> This saves roughly 4K (Doom's ".text" shrinks to 280K).
>>>>
>>>> Would you be willing to compile DOOM with Brian's LLVM compiler and
>>>> show the results ??
>>>>
>>
>>> Will need to download and build this compiler...
>>
>>> Might need to look into this.
>>
>> Please do.
>>
>
> Extracting the ZIP file and "git clone llvm-project" etc, have thus far
> taken hours...
>
> Well, and then the commands to CMake were not working, tried invoking
> cmake more minimally, and it gives a message complaining about the
> version being too old, ...
>
> Seems I have to build it with a different / newer WSL instance (well, I
> guess it was either this or try to rebuild CMake from source).
>
>
> Checks, download for compiler (+ git cloned LLVM) is a little over 6GB.
>
>
> Well, OK, now LLVM is building... I guess, will see if it compiles and
> doesn't explode in the process. Probably going to be a while it seems.
>

A little over an hour later and it still hasn't broken 50% yet...

I think LLVM rebuilds may have actually gotten slower than in the past...

Well, at least my 112GB of RAM means it isn't swapping too much...

Computer is a little sluggish and the "System" process seems kinda
pegged out though...

I guess I will know sometime later whether or not all of this builds...

Re: Stealing a Great Idea from the 6600

<v0l35j$v8sl$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38483&group=comp.arch#38483

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 03:59:28 -0500
Organization: A noiseless patient Spider
Lines: 126
Message-ID: <v0l35j$v8sl$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
<v0k2kb$l21r$1@dont-email.me>
<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
<v0kodk$t520$1@dont-email.me> <v0ksrk$u1j2$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 28 Apr 2024 10:59:31 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="437e2caa7a0d326db2a3a89ac4546326";
logging-data="1024917"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18w9RyjH5JiRd/KShv7j2g89eI2amPVED4="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:ZJeCb7Tk/mFQrQjGq0ShxG3CKWw=
Content-Language: en-US
In-Reply-To: <v0ksrk$u1j2$1@dont-email.me>
 by: BGB - Sun, 28 Apr 2024 08:59 UTC

On 4/28/2024 2:11 AM, BGB wrote:
> On 4/28/2024 12:56 AM, BGB wrote:
>> On 4/27/2024 8:45 PM, MitchAlsup1 wrote:
>>> BGB wrote:
>>>
>
> ...
>
>>
>>>>>> I guess some people are dragging their feet on FDPIC, as there is
>>>>>> some debate as to whether or not NOMMU makes sense for RISC-V,
>>>>>> along with its associated performance impact if used.
>>>>>
>>>>>> In my case, if I wanted to go over to simple base-relocatable
>>>>>> images, this would technically eliminate the need for GBR reloading.
>>>>>
>>>>>> Checks:
>>>>>> Simple base-relocatable case actually currently generates bigger
>>>>>> binaries, I suspect because in this case it is less
>>>>>> space-efficient to use PC-rel vs GBR-rel.
>>>>>
>>>>>> Went and added a "pbostatic" option, which sidesteps saving and
>>>>>> restoring GBR (making the simplifying assumption that functions
>>>>>> will never be called from outside the current binary).
>>>>>
>>>>>> This saves roughly 4K (Doom's ".text" shrinks to 280K).
>>>>>
>>>>> Would you be willing to compile DOOM with Brian's LLVM compiler and
>>>>> show the results ??
>>>>>
>>>
>>>> Will need to download and build this compiler...
>>>
>>>> Might need to look into this.
>>>
>>> Please do.
>>>
>>
>> Extracting the ZIP file and "git clone llvm-project" etc, have thus
>> far taken hours...
>>
>> Well, and then the commands to CMake were not working, tried invoking
>> cmake more minimally, and it gives a message complaining about the
>> version being too old, ...
>>
>> Seems I have to build it with a different / newer WSL instance (well,
>> I guess it was either this or try to rebuild CMake from source).
>>
>>
>> Checks, download for compiler (+ git cloned LLVM) is a little over 6GB.
>>
>>
>> Well, OK, now LLVM is building... I guess, will see if it compiles and
>> doesn't explode in the process. Probably going to be a while it seems.
>>
>
>
>
> A little over an hour later and it still hasn't broken 50% yet...
>
> I think LLVM rebuilds may have actually gotten slower than in the past...
>
>
> Well, at least my 112GB of RAM means it isn't swapping too much...
>
> Computer is a little sluggish and the "System" process seems kinda
> pegged out though...
>
>
>
> I guess I will know sometime later whether or not all of this builds...
>
>

Still watching LLVM build (several hours later), kind of an interesting
meta aspect in its behaviors.

Sub-stage 1:
cc1plus processes, they start out mostly idle, sit around idle for a few
seconds, CPU activity spikes and then they terminate (and a new one
spawns to take its place).
During this sub-stage, the PC is generally sluggish.

Sub-stage 2:
"llvm-tblgen" runs; these processes are short lived and run at high CPU
load; but PC stops being sluggish for the brief moments it is running
these (but, soon enough, it is back to the former stage).

Overall CPU load is fairly modest, and HDD activity is also fairly
reasonable. Seems like there is a bottleneck that "cc1plus" steps on.

The "System" process runs at somewhat higher than usual CPU load (~ 8%),
process description "NT Kernel & System", where spikes in this process
seem correlated with the "general sluggishness".

Seems likely related to "teh crapton" of files it contained...
Also does not escape my notice that on the build drive in question, it
has seemingly eaten over 100GB of disk space during the build process...

Like, seemingly, LLVM has managed to somehow become more absurd than it
was in the past...

Dunno, maybe all this seems like pretty reasonable project design to
some people, but at least to me, it all seems a little absurd.
Well, unless some people experience it building quickly and not eating
huge amounts of HDD space.

Dunno, maybe related to me running the build on a drive with "Folder
Compression" enabled?... (Often times, folder compression makes things
faster, but this might be one of the times where it doesn't).

Some of this is kind of a disincentive to trying to build a compiler
based on LLVM though. Trying to have this as a normal part of the build
process seems implausible.

Like, if it makes Vivado synthesis and implementation seem fast and
lightweight in comparison, this is not an ideal situation...

GCC is an ugly mess, but at least it builds moderately faster.

...

Re: Stealing a Great Idea from the 6600

<v0l3gm$vbr3$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38484&group=comp.arch#38484

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tkoenig@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 09:05:26 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 6
Message-ID: <v0l3gm$vbr3$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
<v0k2kb$l21r$1@dont-email.me>
<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
<v0kodk$t520$1@dont-email.me> <v0ksrk$u1j2$1@dont-email.me>
<v0l35j$v8sl$1@dont-email.me>
Injection-Date: Sun, 28 Apr 2024 11:05:26 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="a2ac21b75c6e4d3fef1594d1c791c20e";
logging-data="1027939"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18+eBjEy6UOAx+BnCWYU7OjJhY7Elq5vNQ="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:HlgySXvbNQRq2ToeAz5lvkXlG3U=
 by: Thomas Koenig - Sun, 28 Apr 2024 09:05 UTC

BGB <cr88192@gmail.com> schrieb:

> Still watching LLVM build (several hours later), kind of an interesting
> meta aspect in its behaviors.

Don't build it in debug mode.

Re: Stealing a Great Idea from the 6600

<v0l5ou$vnf8$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38485&group=comp.arch#38485

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 04:43:56 -0500
Organization: A noiseless patient Spider
Lines: 15
Message-ID: <v0l5ou$vnf8$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
<v0k2kb$l21r$1@dont-email.me>
<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
<v0kodk$t520$1@dont-email.me> <v0ksrk$u1j2$1@dont-email.me>
<v0l35j$v8sl$1@dont-email.me> <v0l3gm$vbr3$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 28 Apr 2024 11:43:59 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="437e2caa7a0d326db2a3a89ac4546326";
logging-data="1039848"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/oL9RnvDgi+A0Ou77fcIse8a1GgsH+Tuc="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:oW08KscrcrKnmwXlr/06tPDslqs=
In-Reply-To: <v0l3gm$vbr3$1@dont-email.me>
Content-Language: en-US
 by: BGB - Sun, 28 Apr 2024 09:43 UTC

On 4/28/2024 4:05 AM, Thomas Koenig wrote:
> BGB <cr88192@gmail.com> schrieb:
>
>> Still watching LLVM build (several hours later), kind of an interesting
>> meta aspect in its behaviors.
>
> Don't build it in debug mode.

I was building it in MinSizeRel mode...

But, yeah, need to go to sleep... May poke with it tomorrow if all goes
well...

Re: Stealing a Great Idea from the 6600

<v0m6ce$17634$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38487&group=comp.arch#38487

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 14:00:28 -0500
Organization: A noiseless patient Spider
Lines: 65
Message-ID: <v0m6ce$17634$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
<v0k2kb$l21r$1@dont-email.me>
<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
<v0kodk$t520$1@dont-email.me> <v0ksrk$u1j2$1@dont-email.me>
<v0l35j$v8sl$1@dont-email.me> <v0l3gm$vbr3$1@dont-email.me>
<v0l5ou$vnf8$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 28 Apr 2024 21:00:30 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="437e2caa7a0d326db2a3a89ac4546326";
logging-data="1284196"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+vEvfsyw25THMrOPcKTNOuJHrat1Lu89M="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:pfs8UX7NnGs1cSpMHchKPeRHmPI=
Content-Language: en-US
In-Reply-To: <v0l5ou$vnf8$1@dont-email.me>
 by: BGB - Sun, 28 Apr 2024 19:00 UTC

On 4/28/2024 4:43 AM, BGB wrote:
> On 4/28/2024 4:05 AM, Thomas Koenig wrote:
>> BGB <cr88192@gmail.com> schrieb:
>>
>>> Still watching LLVM build (several hours later), kind of an interesting
>>> meta aspect in its behaviors.
>>
>> Don't build it in debug mode.
>
> I was building it in MinSizeRel mode...
>

Also "-j 4".

Didn't want to go too much higher, as this would likely bog down the PC
even more.

Some sources say builds should not take quite this long, but this is what
I am seeing...

>
> But, yeah, need to go to sleep... May poke with it tomorrow if all goes
> well...
>

Seems the options I had used did not enable "clang".

Had to enable clang and run again, though as LLVM itself was already
built, it seemed to not need to do much with the parts already built.

So, rerun with "clang" enabled took ~ 1hr.

So, errm... Once built, how does one actually get it to target
the ISA?

"--target my66000-none-elf" or similar just gets it to complain about an
unknown triple, not sure how to query for known targets/triples with clang.

The built "llc --version" makes no mention of MY66000...
Seems to self-identify as version 16.0.0git, host CPU znver1.
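
One way to check which backends a given build registered is to scan the "Registered Targets:" section of the `llc --version` output. The sketch below parses a captured copy of that text rather than running the tool. The sample text is illustrative only (its target list is made up for the example), and the name `my66000` is an assumption about how the backend would identify itself.

```python
# Illustrative text only: the layout mimics typical `llc --version` output,
# but the target list here is made up for the example.
SAMPLE = """\
LLVM (http://llvm.org/):
  LLVM version 16.0.0git
  Optimized build.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: znver1

  Registered Targets:
    aarch64 - AArch64 (little endian)
    x86     - 32-bit X86: Pentium-Pro and above
    x86-64  - 64-bit X86: EM64T and AMD64
"""


def registered_targets(version_text):
    """Return the target names listed under 'Registered Targets:'."""
    targets = []
    in_list = False
    for line in version_text.splitlines():
        stripped = line.strip()
        if stripped == "Registered Targets:":
            in_list = True
            continue
        if in_list and " - " in stripped:
            # Each entry looks like "<name> - <description>".
            targets.append(stripped.split(" - ", 1)[0].strip())
    return targets


def has_target(version_text, name):
    """Case-insensitive check for a backend name (hypothetical: 'my66000')."""
    return name.lower() in (t.lower() for t in registered_targets(version_text))


print(registered_targets(SAMPLE))
print(has_target(SAMPLE, "my66000"))
```

If the name is missing from that list, the backend was not compiled in, which is consistent with clang rejecting the triple.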

Had built the version found here:
https://github.com/bagel99/llvm-my66000

Ironically, it does seem to know about ARM64, though trying to build
anything with it results in it complaining about missing headers
("bits/libc-header-start.h").

No obvious documentation besides that from LLVM itself, and not super
easy to figure out otherwise. Was there some option I was supposed to
give to CMake to enable the target?...

Granted, there is always a possibility that I screwed something up here
(or possible interference from another LLVM version?...).

...

Re: Stealing a Great Idea from the 6600

<c482e88733bb667025cff44314127e14@www.novabbs.org>

https://news.novabbs.org/devel/article-flat.php?id=38488&group=comp.arch#38488

Date: Sun, 28 Apr 2024 19:24:51 +0000
Subject: Re: Stealing a Great Idea from the 6600
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$Ufb2NeWuHSThicAABesEIewDy3fJp6Fl.7zHQMSBHN0U1GtIqC8Ju
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com> <ec69999967361c286afdbe60bc2443ea@www.novabbs.org> <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me> <ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad> <v0jlf3$i3mh$2@dont-email.me> <3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org> <v0k2kb$l21r$1@dont-email.me> <58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org> <v0kodk$t520$1@dont-email.me>
Organization: Rocksolid Light
Message-ID: <c482e88733bb667025cff44314127e14@www.novabbs.org>
 by: MitchAlsup1 - Sun, 28 Apr 2024 19:24 UTC

BGB wrote:

> On 4/27/2024 8:45 PM, MitchAlsup1 wrote:
>>
>>
>>> But, I think the main difference is that, normal PIC does calls
>>> like:
>>>    LD Rt, [GOT+Disp]
>>>    BSR Rt
>>
>>     CALX   [IP,,#GOT+#disp-.]
>>
>> It is unlikely that %GOT can be represented with 16-bit offset from IP
>> so the 32-bit displacement form (,,) is used.
>>
>>> Whereas, FDPIC was typically more like (pseudo ASM):
>>>    MOV SavedGOT, GOT
>>>    LEA Rt, [GOT+Disp]
>>>    MOV GOT, [Rt+8]
>>>    MOV Rt, [Rt+0]
>>>    BSR Rt
>>>    MOV GOT, SavedGOT
>>
>> Since GOT is not in a register but is an address constant this is also::
>>
>>     CALX   [IP,,#GOT+#disp-.]
>>

> So... Would this also cause GOT to point to a new address on the callee
> side (that is dependent on the GOT on the caller side, and *not* on the
> PC address at the destination) ?...

The module on the calling side has its GOT and the module on the called side
has its own GOT where offsets to/in GOT are determined by linker making the
module. There may be cases where multiple link edits on a final module have
some of the functions in this module accessed via GOT in this module and in
these cases one uses

    CALA   [IP,,#GOT+#disp-.]     // LDD ip changes to LDA ip

> In effect, the context dependent GOT daisy-chaining is a fundamental
> aspect of FDPIC that is different from conventional PIC.

Yes, understood, and it happens.
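
The FDPIC pseudo-ASM quoted above can be modeled as a toy simulation to make the "context dependent" part concrete. This is only a sketch: `ToyMachine`, the addresses, and the two-word descriptor layout (entry at +0, callee GOT at +8) are assumptions taken from the quoted sequence, not from any specific ABI document.

```python
class ToyMachine:
    """Hypothetical toy model: a GOT register plus a sparse memory of 64-bit words."""

    def __init__(self):
        self.got = 0      # current GOT base (the caller's context)
        self.mem = {}     # address -> 64-bit value
        self.code = {}    # entry address -> Python callable standing in for a function

    def fdpic_call(self, disp):
        saved_got = self.got              # MOV SavedGOT, GOT
        desc = self.got + disp            # LEA Rt, [GOT+Disp]
        callee_got = self.mem[desc + 8]   # MOV GOT, [Rt+8]  (applied just below)
        entry = self.mem[desc + 0]        # MOV Rt, [Rt+0]
        self.got = callee_got
        result = self.code[entry](self)   # BSR Rt
        self.got = saved_got              # MOV GOT, SavedGOT
        return result


m = ToyMachine()
m.got = 0x1000                            # caller's GOT for this program instance
m.mem[0x1000 + 0x20] = 0x400              # descriptor word 0: callee entry point
m.mem[0x1000 + 0x28] = 0x2000             # descriptor word 1: callee's GOT
m.code[0x400] = lambda mach: mach.got     # callee just reports the GOT it sees

seen = m.fdpic_call(0x20)
print(hex(seen), hex(m.got))
```

The callee's GOT comes from a descriptor found through the caller's GOT, so it depends on which instance made the call and not on the address of the callee's code.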

>>> But, in my case, noting that function calls tend to be more common
>>> than the functions themselves, and functions will know whether or not
>>> they need to access global variables or call other functions, ... it
>>> made more sense to move this logic into the callee.
>>
>>
>>> No official RISC-V FDPIC ABI that I am aware of, though some proposals
>>> did seem vaguely similar in some areas to what I was doing with PBO.
>>
>>> Where, they were accessing globals like:
>>>    LUI Xt, DispHi
>>>    ADD Xt, Xt, DispLo
>>>    ADD Xt, Xt, GP
>>>    LD  Xd, Xt, 0
>>
>>> Granted, this is less efficient than, say:
>>>    MOV.Q (GBR, Disp33s), Rd
>>
>>     LDD   Rd,[IP,,#GOT+#disp-.]
>>

> As noted, BJX2 can handle this in a single 64-bit instruction, vs 4
> instructions.
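
For reference, the LUI/ADD pair in the quoted RISC-V sequence needs the displacement split into a 20-bit upper part and a sign-extended 12-bit lower part (the immediate add is ADDI in actual RISC-V). A minimal sketch of that arithmetic follows; the helper names are made up, and reading "Disp33s" as a signed 33-bit displacement is an assumption.

```python
def split_hi_lo(disp):
    """Split a displacement into LUI (hi20) and ADDI (lo12) immediates."""
    hi = (disp + 0x800) >> 12          # round so that lo lands in [-2048, 2047]
    lo = disp - (hi << 12)
    assert -2048 <= lo <= 2047
    assert -(1 << 19) <= hi < (1 << 19), "displacement out of LUI range"
    return hi, lo


def fits_signed(value, bits):
    """True if value is representable as a signed integer of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


for disp in (0x12345FFF, -1, 0x7FF, 0x800):
    hi, lo = split_hi_lo(disp)
    assert (hi << 12) + lo == disp     # LUI then ADDI reconstructs the value
    print(hex(disp), hi, lo)

# A signed 33-bit field covers everything the LUI+ADDI pair can reach.
print(fits_signed(0x12345FFF, 33), fits_signed(1 << 32, 33))
```

The `+ 0x800` rounding is needed because ADDI sign-extends its immediate, so a low part of 0x800 or more has to be expressed as a negative value with the upper part bumped by one.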

>>> Though, people didn't really detail the call sequence or prolog/epilog
>>> sequences, so less sure how this would work.
>>
>>
>>> Likely guess, something like:
>>>    MV    Xs, GP
>>>    LUI   Xt, DispHi
>>>    ADD   Xt, Xt, DispLo
>>>    ADD   Xt, Xt, GP
>>>    LD    GP, Xt, 8
>>>    LD    Xt, Xt, 0
>>>    JALR  LR, Xt, 0
>>>    MV    GP, Xs
>>
>>> Well, unless they have a better way to pull this off...
>>
>>     CALX   [IP,,#GOT+#disp-.]
>>

> Well, can you explain the semantics of this one...

>>> But, yeah, as far as I saw it, my "better solution" was to put this
>>> part into the callee.
>>
>>
>>> Main tradeoff with my design is:
>>>    From any GBR, one needs to be able to get to every other GBR;
>>>    We need to have a way to know which table entry to reload (not
>>> statically known at compile time).
>>
>> Resolved by linker or accessed through GOT in mine. Each dynamic
>> module gets its own GOT.
>>

> The important thing is not associating a GOT with an ELF module, but
> with an instance of said module.

Yes.

> So, say, one copy of an ELF image, can have N separate GOTs and data
> sections (each associated with a program instance).
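
A toy model of the per-instance idea, using the table-based reload BGB sketches in this thread (load a per-image offset, fetch the table pointer from word 0 of the current data section, index it, reload GP). The addresses, the 8-byte table stride, and the helper names are assumptions for illustration; this is one reading of the scheme, not BGB's actual loader.

```python
def make_instance(mem, table_addr, data_bases):
    """Set up one program instance.

    data_bases holds one data-section base per image, in table order.
    """
    for i, base in enumerate(data_bases):
        mem[table_addr + 8 * i] = base   # table entry for image i
        mem[base + 0] = table_addr       # word 0 of every data section -> table


def reload_gp(mem, gp, pbo_offset):
    """Callee-side reload: find this image's data base from any current GP."""
    table = mem[gp + 0]                  # LD  Xt, GP, 0   (table of bases)
    return mem[table + pbo_offset]       # ADD Xt, Xt, Xs ; LD GP, Xt, 0


mem = {}
# Two instances of the same two images (say, an executable and a library).
make_instance(mem, 0x9000, [0x10000, 0x20000])   # instance A
make_instance(mem, 0xA000, [0x30000, 0x40000])   # instance B

LIB_OFFSET = 8   # hypothetical value of "__global_pbo_offset$" for the library

print(hex(reload_gp(mem, 0x10000, LIB_OFFSET)))  # library data, instance A
print(hex(reload_gp(mem, 0x30000, LIB_OFFSET)))  # library data, instance B
```

The same code and the same offset constant yield different data sections depending on which instance's GP is live at the call, which is the property a per-module GOT alone does not provide.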

>>> In my PBO ABI, this was accomplished by using base relocs (but, this
>>> is N/A for ELF, where PE/COFF style base relocs are not a thing).
>>
>>
>>> One other option might be to use a PC-relative load to load the index.
>>> Say:
>>>    AUIPC Xs, DispHi  //"__global_pbo_offset$" ?
>>>    LD Xs, DispLo
>>>    LD Xt, GP, 0   //get table of offsets
>>>    ADD Xt, Xt, Xs
>>>    LD  GP, Xt, 0
>>
>>> In this case, "__global_pbo_offset$" would be a magic constant
>>> variable that gets fixed up by the ELF loader.
>>
>>     LDD   Rd,[IP,,#GOT+#disp-.]
>>

> Still going to need to explain the semantics here...

IP+&GOT+disp-IP is a 64-bit pointer into GOT where the external linkage
pointer resides.
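
The arithmetic can be spelled out as a short sketch: the assembler or linker encodes `#GOT+#disp-.` (with `.` being the instruction's own address), and the hardware adds IP back in at run time. The function names and addresses below are made up for illustration, and the model assumes code and GOT are relocated together by the same delta.

```python
def encode_disp(got, disp, here):
    """Link-time side: evaluate '#GOT+#disp-.' for an instruction at address 'here'."""
    return got + disp - here


def effective_address(ip, encoded):
    """Run-time side: [IP,,encoded] adds the instruction pointer back in."""
    return ip + encoded


# Link-time layout (hypothetical addresses).
HERE, GOT, DISP = 0x1000, 0x8000, 0x18
encoded = encode_disp(GOT, DISP, HERE)
assert -(1 << 31) <= encoded < (1 << 31)   # fits the 32-bit displacement form

# Load the module at two different bases: the encoded value never changes.
for base in (0x0, 0x400000):
    ea = effective_address(base + HERE, encoded)
    assert ea == base + GOT + DISP
    print(hex(base), hex(ea))
```

The result always lands on the same GOT slot relative to the code, so this form selects a GOT by where the code sits. That fixed code-to-GOT distance is exactly what the FDPIC case does not guarantee when several instances share one copy of the code.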

Re: Stealing a Great Idea from the 6600

<v0maic$180gk$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38489&group=comp.arch#38489

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tkoenig@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 20:11:56 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 11
Message-ID: <v0maic$180gk$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
<v0k2kb$l21r$1@dont-email.me>
<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
<v0kodk$t520$1@dont-email.me> <v0ksrk$u1j2$1@dont-email.me>
<v0l35j$v8sl$1@dont-email.me> <v0l3gm$vbr3$1@dont-email.me>
<v0l5ou$vnf8$1@dont-email.me> <v0m6ce$17634$1@dont-email.me>
Injection-Date: Sun, 28 Apr 2024 22:11:57 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="e42223bf59333610e4180cac5d5d9e57";
logging-data="1311252"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+DJnF6j3ECEe4hWGCVv5unVzJifu7eVMY="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:9aau1qdz/lfklwzVOfZBEnyGiVw=
 by: Thomas Koenig - Sun, 28 Apr 2024 20:11 UTC

BGB <cr88192@gmail.com> schrieb:

> "--target my66000-none-elf" or similar just gets it to complain about an
> unknown triple, not sure how to query for known targets/triples with clang.

Grepping around the CMakeCache.txt file in my build directory, I find

//Semicolon-separated list of experimental targets to build.
LLVM_EXPERIMENTAL_TARGETS_TO_BUILD:STRING=My66000

This is documented in llvm/lib/Target/My66000/README .
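
As a rough sketch, the cache entry above corresponds to passing `-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=My66000` when configuring. The snippet below only assembles and prints a command line without running it. The source directory, the build type, and the choice to enable clang are assumptions pieced together from this thread, and the README remains the authority on what the fork actually needs.

```python
import shlex


def cmake_configure_args(source_dir="../llvm", build_type="MinSizeRel"):
    """Build a cmake configure command line (paths and build type are assumptions)."""
    return [
        "cmake",
        source_dir,
        f"-DCMAKE_BUILD_TYPE={build_type}",
        "-DLLVM_ENABLE_PROJECTS=clang",                   # also build clang
        "-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=My66000",   # from CMakeCache.txt above
    ]


args = cmake_configure_args()
print(shlex.join(args))
```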

Re: Stealing a Great Idea from the 6600

<v0mb1g$184aa$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38490&group=comp.arch#38490

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 15:20:00 -0500
Organization: A noiseless patient Spider
Lines: 27
Message-ID: <v0mb1g$184aa$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
<v0k2kb$l21r$1@dont-email.me>
<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
<v0kodk$t520$1@dont-email.me> <v0ksrk$u1j2$1@dont-email.me>
<v0l35j$v8sl$1@dont-email.me> <v0l3gm$vbr3$1@dont-email.me>
<v0l5ou$vnf8$1@dont-email.me> <v0m6ce$17634$1@dont-email.me>
<v0maic$180gk$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 28 Apr 2024 22:20:01 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="437e2caa7a0d326db2a3a89ac4546326";
logging-data="1315146"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+FuPhMyTPsKuHjcKGHiLgx75RWS8xtd80="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:2fqqVs5gYIYU8l26K+ec1dA5qPM=
Content-Language: en-US
In-Reply-To: <v0maic$180gk$1@dont-email.me>
 by: BGB - Sun, 28 Apr 2024 20:20 UTC

On 4/28/2024 3:11 PM, Thomas Koenig wrote:
> BGB <cr88192@gmail.com> schrieb:
>
>> "--target my66000-none-elf" or similar just gets it to complain about an
>> unknown triple, not sure how to query for known targets/triples with clang.
>
> Grepping around the CMakeCache.txt file in my build directory, I find
>
> //Semicolon-separated list of experimental targets to build.
> LLVM_EXPERIMENTAL_TARGETS_TO_BUILD:STRING=My66000
>
> This is documented in llvm/lib/Target/My66000/README .

I realized after posting this that I had cloned the wrong branch...

I had cloned "main", seems the My66000 stuff was in the "my66000"
branch; somehow didn't realize that there would be multiple branches
with different stuff in each branch.

But, now is the process of waiting for this branch to build...

Probably a few more hours, then I will see what happens.

Build status currently at ~20%.

Re: Stealing a Great Idea from the 6600

<v0mean$18r2p$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=38492&group=comp.arch#38492

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 16:16:07 -0500
Organization: A noiseless patient Spider
Lines: 212
Message-ID: <v0mean$18r2p$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
<v0k2kb$l21r$1@dont-email.me>
<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
<v0kodk$t520$1@dont-email.me>
<c482e88733bb667025cff44314127e14@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 28 Apr 2024 23:16:08 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="437e2caa7a0d326db2a3a89ac4546326";
logging-data="1338457"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19nuDRzzlRyZVjOQ/R5jdj+3s+xZRIzVWI="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:dHIrlScZL74ZNgFgASDeOyTVzAQ=
In-Reply-To: <c482e88733bb667025cff44314127e14@www.novabbs.org>
Content-Language: en-US
 by: BGB - Sun, 28 Apr 2024 21:16 UTC

On 4/28/2024 2:24 PM, MitchAlsup1 wrote:
> BGB wrote:
>
>> On 4/27/2024 8:45 PM, MitchAlsup1 wrote:
>>>
>>>
>>>> But, I think the main difference is that, normal PIC does calls
>>>> like:
>>>>    LD Rt, [GOT+Disp]
>>>>    BSR Rt
>>>
>>>      CALX   [IP,,#GOT+#disp-.]
>>>
>>> It is unlikely that %GOT can be represented with 16-bit offset from IP
>>> so the 32-bit displacement form (,,) is used.
>>>
>>>> Whereas, FDPIC was typically more like (pseudo ASM):
>>>>    MOV SavedGOT, GOT
>>>>    LEA Rt, [GOT+Disp]
>>>>    MOV GOT, [Rt+8]
>>>>    MOV Rt, [Rt+0]
>>>>    BSR Rt
>>>>    MOV GOT, SavedGOT
>>>
>>> Since GOT is not in a register but is an address constant this is also::
>>>
>>>      CALX   [IP,,#GOT+#disp-.]
>>>
>
>> So... Would this also cause GOT to point to a new address on the
>> callee side (that is dependent on the GOT on the caller side, and
>> *not* on the PC address at the destination) ?...
>
> The module on the calling side has its GOT and the module on the called
> side
> has its own GOT where offsets to/in GOT are determined by linker making the
> module. There may be cases where multiple link edits on a final module have
> some of the functions in this module accessed via GOT in this module and in
> these cases one uses
>     CALA   [IP,,#GOT+#disp-.]     // LDD ip changes to LDA ip
>

OK, but it seems I may be failing to understand something here...

>> In effect, the context dependent GOT daisy-chaining is a fundamental
>> aspect of FDPIC that is different from conventional PIC.
>
> Yes, understood, and it happens.
>
>>>> But, in my case, noting that function calls tend to be more common
>>>> than the functions themselves, and functions will know whether or
>>>> not they need to access global variables or call other functions,
>>>> ... it made more sense to move this logic into the callee.
>>>
>>>
>>>> No official RISC-V FDPIC ABI that I am aware of, though some
>>>> proposals did seem vaguely similar in some areas to what I was doing
>>>> with PBO.
>>>
>>>> Where, they were accessing globals like:
>>>>    LUI Xt, DispHi
>>>>    ADD Xt, Xt, DispLo
>>>>    ADD Xt, Xt, GP
>>>>    LD  Xd, Xt, 0
>>>
>>>> Granted, this is less efficient than, say:
>>>>    MOV.Q (GBR, Disp33s), Rd
>>>
>>>      LDD   Rd,[IP,,#GOT+#disp-.]
>>>
>
>> As noted, BJX2 can handle this in a single 64-bit instruction, vs 4
>> instructions.
>
>
>>>> Though, people didn't really detail the call sequence or
>>>> prolog/epilog sequences, so less sure how this would work.
>>>
>>>
>>>> Likely guess, something like:
>>>>    MV    Xs, GP
>>>>    LUI   Xt, DispHi
>>>>    ADD   Xt, Xt, DispLo
>>>>    ADD   Xt, Xt, GP
>>>>    LD    GP, Xt, 8
>>>>    LD    Xt, Xt, 0
>>>>    JALR  LR, Xt, 0
>>>>    MV    GP, Xs
>>>
>>>> Well, unless they have a better way to pull this off...
>>>
>>>      CALX   [IP,,#GOT+#disp-.]
>>>
>
>> Well, can you explain the semantics of this one...
>
>
>>>> But, yeah, as far as I saw it, my "better solution" was to put this
>>>> part into the callee.
>>>
>>>
>>>> Main tradeoff with my design is:
>>>>    From any GBR, one needs to be able to get to every other GBR;
>>>>    We need to have a way to know which table entry to reload (not
>>>> statically known at compile time).
>>>
>>> Resolved by linker or accessed through GOT in mine. Each dynamic
>>> module gets its own GOT.
>>>
>
>> The important thing is not associating a GOT with an ELF module, but
>> with an instance of said module.
>
> Yes.
>
>> So, say, one copy of an ELF image, can have N separate GOTs and data
>> sections (each associated with a program instance).
>
>>>> In my PBO ABI, this was accomplished by using base relocs (but, this
>>>> is N/A for ELF, where PE/COFF style base relocs are not a thing).
>>>
>>>
>>>> One other option might be to use a PC-relative load to load the index.
>>>> Say:
>>>>    AUIPC Xs, DispHi  //"__global_pbo_offset$" ?
>>>>    LD Xs, DispLo
>>>>    LD Xt, GP, 0   //get table of offsets
>>>>    ADD Xt, Xt, Xs
>>>>    LD  GP, Xt, 0
>>>
>>>> In this case, "__global_pbo_offset$" would be a magic constant
>>>> variable that gets fixed up by the ELF loader.
>>>
>>>      LDD   Rd,[IP,,#GOT+#disp-.]
>>>
>
>> Still going to need to explain the semantics here...
>
> IP+&GOT+disp-IP is a 64-bit pointer into GOT where the external linkage
> pointer resides.

OK.

Not sure I follow here what exactly is going on...

As noted, if I did a similar thing to the RISC-V example, but with my
own ISA (with the MOV.C extension):
     MOV.Q (PC, Disp33), R0
     MOV.Q (GBR, 0), R18
     MOV.C (R18, R0), GBR

Differing mostly in that it doesn't require base relocs.

The normal version in my case avoids the extra memory load, but uses a
base reloc for the table index.

....

Though, the reloc format is at least semi-dense, eg, for a block of relocs:
   { DWORD rvaPage;   //address of page (4K)
     DWORD szRelocs;  //size of relocs in block
   }
With each reloc encoded as a 16-bit entry:
   (15:12): Reloc Type
   (11: 0): Address within Page (4K)
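
As a rough illustration of the block layout above, here is a small Python sketch of a decoder. It is not the actual PEL loader code; the function name is made up, and it assumes little-endian fields and that szRelocs counts the 8-byte header as well (as PE/COFF does), which the description above does not state.

```python
import struct

def decode_reloc_block(buf, offset=0):
    """Decode one block: an 8-byte header followed by 16-bit entries."""
    rva_page, sz_relocs = struct.unpack_from("<II", buf, offset)
    # Assumption: szRelocs includes the 8-byte header, as in PE/COFF.
    count = (sz_relocs - 8) // 2
    entries = struct.unpack_from("<%dH" % count, buf, offset + 8)
    # Bits 15:12 are the reloc type, bits 11:0 the address within the page.
    relocs = [(e >> 12, rva_page + (e & 0xFFF)) for e in entries]
    return relocs, offset + sz_relocs

# One block for the 4K page at RVA 0x3000, holding two relocs.
block = struct.pack("<IIHH", 0x3000, 12,
                    (0x8 << 12) | 0x010, (0xA << 12) | 0x7F8)
relocs, next_offset = decode_reloc_block(block)
```

In this example the block spends 8 bytes of header on 4 bytes of entries, which is the overhead that matters when pages hold only one or two relocs.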

One downside is this format is less efficient for sparse relocs (current
situation), where often there are only 1 or 2 relocs per page (typically
the PBO index fixups and similar).

One situation could be to have a modified format that partially omits
the block structuring, say:
   0ddd: Advance current page position by ddd pages (4K);
     0000: Effectively a NOP (as before)
   1ddd..Cddd: Apply the given reloc.
     These represent typical relocs, target dependent.
     HI16, LO16, DIR32, HI32ADJ, ...
     8ddd: Was assigned for PBO fixups;
     Addd: Fixup for a 64-bit address, also semi common.
   Dzzz/Ezzz: Extended Relocs
     These ones are configurable from a larger set of reloc types.
   Fzzz: Command-Escape
   ...

Where, say, rather than needing 1 block per 4K page, it is 1 block per
PE section.
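
A minimal Python sketch of how such a stream might be walked, under some assumptions that are not spelled out above: the page position starts at the section's RVA, the low 12 bits of a 1ddd..Cddd entry are the address within the current page, and the Dzzz/Ezzz/Fzzz cases are simply not modelled. The function name is hypothetical.

```python
def decode_reloc_stream(entries, section_rva):
    """Walk a per-section stream of 16-bit reloc entries."""
    page = section_rva
    out = []
    for e in entries:
        kind, ddd = e >> 12, e & 0xFFF
        if kind == 0x0:
            # 0ddd advances the page position; 0000 advances by zero (a NOP).
            page += ddd * 0x1000
        elif kind <= 0xC:
            # 1ddd..Cddd: reloc of type `kind` at offset ddd in the page.
            out.append((kind, page + ddd))
        else:
            raise NotImplementedError("Dzzz/Ezzz/Fzzz are not modelled here")
    return out

stream = [0x8010, 0x0003, 0xA7F8, 0x0000, 0x8020]
decoded = decode_reloc_stream(stream, 0x10000)
```

Here three relocs spread over two pages take five 16-bit entries, with no per-page header.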

Though, base relocs are a relatively small part of the size of the binary.

To some extent, the PBO reloc is magic in that it works by
pattern-matching the instruction that it finds at the given address. So,
in effect, is only defined for a limited range of instructions.

Contrast with, say, the 1/2/3/4/A relocs, which expect raw 16/32/64 bit
values. Though, a lot of these are not currently used for BJX2 (does not
use 16-bit addressing modes, ...).

Here:
5/6/7/8/9/B/C, ended up used for BJX2 relocs in BJX2 mode.
For other targets, they would have other meanings.
D/E/F were reserved as expanded/escape-case relocs, in case I need to
add more. These would differ partly in that the reloc sub-type would be
assigned as a sort of state-machine.

Re: Stealing a Great Idea from the 6600

<86cyq9unaa.fsf@linuxsc.com>


https://news.novabbs.org/devel/article-flat.php?id=38496&group=comp.arch#38496

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 15:37:17 -0700
Organization: A noiseless patient Spider
Lines: 37
Message-ID: <86cyq9unaa.fsf@linuxsc.com>
References: <in312jlca131khq3vj0i24n6pb0hah2ur5@4ax.com> <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Date: Mon, 29 Apr 2024 00:37:18 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="c36476fd6bd91582cb5d0726d3f1692e";
logging-data="1371881"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19UgAznM5f4bxgYPv5cPu6UD9HThxlI1wY="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:EEFveMALh+YgvQom7l1gbQ1oTjA=
sha1:IXYS7RuYCaJ06t1W6MeudcUJeWM=
 by: Tim Rentsch - Sun, 28 Apr 2024 22:37 UTC

mitchalsup@aol.com (MitchAlsup1) writes:

> John Savard wrote:
>
>> On Sat, 20 Apr 2024 22:03:21 +0000, mitchalsup@aol.com (MitchAlsup1)
>> wrote:
>>
>>> BGB wrote:
>>>
>>>> Sign-extend signed values, zero-extend unsigned values.
>>>
>>> Another mistake I made in Mc 88100.
>>
>> As that is a mistake the IBM 360 made, I make it too. But I make it
>> the way the 360 did: there are no signed and unsigned values, in the
>> sense of a Burroughs machine, there are just Load, Load Unsigned - and
>> Insert - instructions.
>>
>> Index and base register values are assumed to be unsigned.
>
> I would use the term signless as opposed to unsigned.

What's the point of using a non-standard term when there is a
common and firmly established standard term? I don't see how the
non-standard term conveys anything different. Next thing you
know someone will want to say "signful" rather than "signed".

> Address arithmetic is ADD only and does not care about signs or
> overflow. There is no concept of a negative base register or a
> negative index register (or, for that matter, a negative displace-
> ment), overflow, underflow, carry, ...

Some people here have argued that (for some architectures), addresses
with the high-order bit set should be taken as negative rather than
positive. Or did you mean your comment to apply only to certain
architectures (IBM 360, Mc 88100, perhaps others?), and not to
all architectures?
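
For what it is worth, the arithmetic side of this can be checked with a small Python sketch (the helper names are made up, and 64-bit wrap-around addition is assumed): the sum has the same bit pattern whether the operands are read as signed or as unsigned, so the disagreement is about naming and interpretation rather than about the adder.

```python
MASK = (1 << 64) - 1

def as_signed(x):
    """Reinterpret a 64-bit pattern as a two's complement value."""
    return x - (1 << 64) if x >> 63 else x

def effective_address(base, index, disp):
    """Plain ADD, truncated to 64 bits; no sign or overflow checks."""
    return (base + index + disp) & MASK

base = 0xFFFF_FFFF_FFFF_F000    # high-order bit set
disp_bits = (-16) & MASK        # the bit pattern one would write for -16
addr = effective_address(base, 0, disp_bits)

# Same result when both operands are treated as signed quantities.
signed_sum = (as_signed(base) + as_signed(disp_bits)) & MASK
```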

Re: Stealing a Great Idea from the 6600

<14a7b1b370c033c50ac77e3394ac1ea5@www.novabbs.org>


https://news.novabbs.org/devel/article-flat.php?id=38497&group=comp.arch#38497

Date: Sun, 28 Apr 2024 22:56:41 +0000
Subject: Re: Stealing a Great Idea from the 6600
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
X-Rslight-Site: $2y$10$9XQ6srDJl/A5poJ6NUjCaO9OGCHhZ5nTgsM78ggD50u4b1Da0LdXa
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com> <ec69999967361c286afdbe60bc2443ea@www.novabbs.org> <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me> <ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad> <v0jlf3$i3mh$2@dont-email.me> <3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org> <v0k2kb$l21r$1@dont-email.me> <58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org> <v0kodk$t520$1@dont-email.me> <c482e88733bb667025cff44314127e14@www.novabbs.org> <v0mean$18r2p$1@dont-email.me>
Organization: Rocksolid Light
Message-ID: <14a7b1b370c033c50ac77e3394ac1ea5@www.novabbs.org>
 by: MitchAlsup1 - Sun, 28 Apr 2024 22:56 UTC

BGB wrote:

> On 4/28/2024 2:24 PM, MitchAlsup1 wrote:
>>
>>> Still going to need to explain the semantics here...
>>
>> IP+&GOT+disp-IP is a 64-bit pointer into GOT where the external linkage
>> pointer resides.

> OK.

> Not sure I follow here what exactly is going on...

While I am sure I don't understand what is going on....

> As noted, if I did a similar thing to the RISC-V example, but with my
> own ISA (with the MOV.C extension):
>      MOV.Q (PC, Disp33), R0            // What data does this access ?
>      MOV.Q (GBR, 0), R18
>      MOV.C (R18, R0), GBR

It appears to me that you are placing an array of GOT pointers at
the first entry of any particular GOT ?!?

Whereas My 66000 uses IP-relative access to the GOT that the linker
(or LD.so) set up, avoiding the indirection.

Then My 66000 does not have or need a pointer to GOT since it can
synthesize such a pointer at link time and then just use an IP-relative
plus DISP32 to access said GOT.
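
The arithmetic behind the "#GOT+#disp-." form can be sketched in Python. The addresses are made up, and the sketch assumes IP holds the address of the accessing instruction itself (the "." in the expression) and that text and GOT are relocated together as one module.

```python
def link_time_disp(got_addr, slot_offset, insn_addr):
    # The "#GOT+#disp-." expression: distance from the instruction to the slot.
    return got_addr + slot_offset - insn_addr

def effective_address(ip, disp):
    # What the hardware computes at run time: IP plus the 32-bit displacement.
    return ip + disp

GOT, SLOT, INSN = 0x20000, 0x18, 0x1040   # hypothetical link-time addresses
disp = link_time_disp(GOT, SLOT, INSN)

# Load the whole module at two different bases: the same displacement
# still lands on the same GOT slot relative to the module.
eas = [effective_address(base + INSN, disp) - base
       for base in (0, 0x7F000000)]
```

The IP terms cancel, so no register has to hold a GOT pointer as long as the distance from code to GOT is fixed.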

So, say we have some external variables::

    extern uint64_t fred, wilma, barney, betty;

AND we postulate that the linker found all 4 externs in the same module
so that it can access them all via 1 pointer. The linker assigns an
index into GOT and sets up a relocation to that memory segment, and when
LD.so runs, it stores a proper pointer in that index of GOT; call this
index fred_index.

And we access one of these::

    if( fred at_work )

The compiler will obtain the pointer to the area fred is positioned via:

    LDD    Rfp,[IP,,#GOT+fred_index<<3]        // *

and from here one can access barney, betty and wilma using the pointer
to fred and standard offsetting.

    LDD    Rfred,[Rfp,#0]     // fred
    LDD    Rbarn,[Rfp,#16]    // barney
    LDD    Rbett,[Rfp,#24]    // betty
    LDD    Rwilm,[Rfp,#8]     // wilma

These offsets are known at link time and possibly not at compile time.

(*) if the LDD through GOT takes a page fault, we have a procedure set up
so LD.so can run, figure out which entry is missing, look up where it is
(possibly load and resolve it), and insert the required data into GOT.
When control returns to LDD, the entry is now present, and we now have
access to fred, wilma, barney and betty.
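
A toy Python model of the sequence above, purely illustrative: the class, the slot number and the addresses are invented, and a Python "is None" check stands in for the page fault and the LD.so handler.

```python
class LazyGOT:
    """Toy GOT whose empty slots are filled on first use."""
    def __init__(self, size, resolver):
        self.slots = [None] * size
        self.resolver = resolver          # stands in for LD.so
        self.faults = 0

    def load(self, index):
        if self.slots[index] is None:     # models the fault on the LDD
            self.faults += 1
            self.slots[index] = self.resolver(index)
        return self.slots[index]          # the LDD is retried and succeeds

# fred, wilma, barney, betty laid out 8 bytes apart in the other module.
memory = {0x5000: 11, 0x5008: 22, 0x5010: 33, 0x5018: 44}
FRED_INDEX = 3                            # hypothetical GOT slot
got = LazyGOT(8, lambda i: 0x5000 if i == FRED_INDEX else 0)

rfp = got.load(FRED_INDEX)                # LDD Rfp,[IP,,#GOT+fred_index<<3]
fred, wilma = memory[rfp + 0], memory[rfp + 8]
barney, betty = memory[rfp + 16], memory[rfp + 24]
got.load(FRED_INDEX)                      # second access: no fault
```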

> Differing mostly in that it doesn't require base relocs.

> The normal version in my case avoids the extra memory load, but uses a
> base reloc for the table index.

> ....

{{ // this looks like stuff that should be accessible to LD.so

> Though, the reloc format is at least semi-dense, eg, for a block of relocs:
>    { DWORD rvaPage;   //address of page (4K)
>      DWORD szRelocs;  //size of relocs in block
>    }
> With each reloc encoded as a 16-bit entry:
>    (15:12): Reloc Type
>    (11: 0): Address within Page (4K)

> One downside is this format is less efficient for sparse relocs (current
> situation), where often there are only 1 or 2 relocs per page (typically
> the PBO index fixups and similar).

> One situation could be to have a modified format that partially omits
> the block structuring, say:
>    0ddd: Advance current page position by ddd pages (4K);
>      0000: Effectively a NOP (as before)
>    1ddd..Cddd: Apply the given reloc.
>      These represent typical relocs, target dependent.
>      HI16, LO16, DIR32, HI32ADJ, ...
>      8ddd: Was assigned for PBO fixups;
>      Addd: Fixup for a 64-bit address, also semi common.
>    Dzzz/Ezzz: Extended Relocs
>      These ones are configurable from a larger set of reloc types.
>    Fzzz: Command-Escape
>    ...

> Where, say, rather than needing 1 block per 4K page, it is 1 block per
> PE section.

> Though, base relocs are a relatively small part of the size of the binary.

> To some extent, the PBO reloc is magic in that it works by
> pattern-matching the instruction that it finds at the given address. So,
> in effect, is only defined for a limited range of instructions.

> Contrast with, say, the 1/2/3/4/A relocs, which expect raw 16/32/64 bit
> values. Though, a lot of these are not currently used for BJX2 (does not
> use 16-bit addressing modes, ...).

> Here:
> 5/6/7/8/9/B/C, ended up used for BJX2 relocs in BJX2 mode.
> For other targets, they would have other meanings.
> D/E/F were reserved as expanded/escape-case relocs, in case I need to
> add more. These would differ partly in that the reloc sub-type would be
> assigned as a sort of state-machine.

but not the program itself}}

Re: Stealing a Great Idea from the 6600

<v0n8ls$1hv6r$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=38501&group=comp.arch#38501

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 28 Apr 2024 23:45:45 -0500
Organization: A noiseless patient Spider
Lines: 383
Message-ID: <v0n8ls$1hv6r$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
<v0k2kb$l21r$1@dont-email.me>
<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
<v0kodk$t520$1@dont-email.me>
<c482e88733bb667025cff44314127e14@www.novabbs.org>
<v0mean$18r2p$1@dont-email.me>
<14a7b1b370c033c50ac77e3394ac1ea5@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 29 Apr 2024 06:45:49 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7f06ccc94ee66b06f8f8e543c504a29c";
logging-data="1637595"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19jlG4y80iyPzK0xWeqN9GFI6jjULR37kI="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:qUMdRTdmbSZGoaXpaeP5u4nBtRY=
Content-Language: en-US
In-Reply-To: <14a7b1b370c033c50ac77e3394ac1ea5@www.novabbs.org>
 by: BGB - Mon, 29 Apr 2024 04:45 UTC

On 4/28/2024 5:56 PM, MitchAlsup1 wrote:
> BGB wrote:
>
>> On 4/28/2024 2:24 PM, MitchAlsup1 wrote:
>>>
>>>> Still going to need to explain the semantics here...
>>>
>>> IP+&GOT+disp-IP is a 64-bit pointer into GOT where the external linkage
>>> pointer resides.
>
>> OK.
>
>> Not sure I follow here what exactly is going on...
>
> While I am sure I don't understand what is going on....
>
>> As noted, if I did a similar thing to the RISC-V example, but with my
>> own ISA (with the MOV.C extension):
>>      MOV.Q (PC, Disp33), R0            // What data does this access ?
>>      MOV.Q (GBR, 0), R18
>>      MOV.C (R18, R0), GBR
>
> It appears to me that you are placing an array of GOT pointers at the
> first entry of any particular GOT ?!?
>

They are not really GOTs in my PBO ABI, but rather the start of every
".data" section.

In this case, every ".data" section starts with a pointer to an array
that holds a pointer to every other ".data" section in the process
image, and every DLL is assigned an index in this array (except the main
EXE, which always has an index value of 0).

Every program instance exists relative to this array of ".data" sections.

So, say, "Process 1" will have one version of this array, "Process 2"
will have another, etc.

And, all of the data sections in Process 1 will point to the array for
Process 1. And, all of the data sections in Process 2 will point to the
array for Process 2. And so on...

So, even if all the ".text" sections are shared between "Process 1" and
"Process 2" (with both existing within the same address space), because
the data sections are separate; each has its own set of global
variables, so the processes effectively don't see the other versions of
the program running within the shared address space.
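
A small Python model of this arrangement, as a sketch only: the class and function names are invented, Python object references stand in for pointers, and the table index is passed in directly rather than being fixed up by a base reloc or loaded PC-relative.

```python
class Section:
    """A ".data" instance; its first slot points at the process's table."""
    def __init__(self, name):
        self.name = name
        self.table = None
        self.globals = {}

def new_process(image_names):
    # Index 0 is the main EXE; each DLL gets a later index.
    table = [Section(n) for n in image_names]
    for s in table:
        s.table = table
    return table

def reload_gbr(gbr, my_index):
    # MOV.Q (GBR, 0), R18 ; MOV.C (R18, R0), GBR   with R0 = my_index
    return gbr.table[my_index]

p1 = new_process(["main.exe", "libfoo.dll"])
p2 = new_process(["main.exe", "libfoo.dll"])
LIBFOO = 1    # hypothetical index baked into libfoo's shared ".text"

gbr1 = reload_gbr(p1[0], LIBFOO)   # called from main.exe in process 1
gbr2 = reload_gbr(p2[0], LIBFOO)   # same code, called in process 2
gbr1.globals["counter"] = 5
```

The same index reaches a different ".data" instance depending on which process's GBR the callee started from, so the write in process 1 is not visible in process 2.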

In some sense, FDPIC is vaguely similar, but does use GOTs, but
effectively daisy-chains all the GOTs together with all the other GOTs
(having a GOT pointer for every function pointer in the GOT).

But, as can be noted, this does add some overhead.
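
For comparison, the caller-side FDPIC sequence quoted earlier (save GOT, load the descriptor, call, restore GOT) can be modelled in Python roughly as follows. This is a sketch: the dictionaries and names are invented, and a (function, GOT) tuple stands in for the two-word function descriptor.

```python
def fdpic_call(state, got, slot):
    saved = state["got"]                 # MOV SavedGOT, GOT
    entry, callee_got = got[slot]        # descriptor: (function, callee's GOT)
    state["got"] = callee_got            # MOV GOT, [Rt+8]
    result = entry(state)                # MOV Rt, [Rt+0] ; BSR Rt
    state["got"] = saved                 # MOV GOT, SavedGOT
    return result

lib_got_p1 = {"name": "lib GOT, instance 1"}
lib_got_p2 = {"name": "lib GOT, instance 2"}

def lib_func(state):
    return state["got"]["name"]          # callee sees its own instance's GOT

# Each instance's main GOT carries a descriptor pointing at that
# instance's copy of the library GOT: the daisy-chaining.
main_got_p1 = {0: (lib_func, lib_got_p1)}
main_got_p2 = {0: (lib_func, lib_got_p2)}

s1 = {"got": main_got_p1}
s2 = {"got": main_got_p2}
r1 = fdpic_call(s1, s1["got"], 0)
r2 = fdpic_call(s2, s2["got"], 0)
```

Every call site pays for the save, the two loads and the restore, which is the overhead in question.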

In my case, I had wanted to do something similar to FDPIC in the sense
of allowing multiple instances without needing to duplicate the
read-only sections. But, I also wanted a lower performance overhead.

> Whereas My 66000 uses IP relative access to the GOT the linker (or
> LD.so) setup avoiding the indirection.
> Then My 66000 does not have or need a pointer to GOT since it can
> synthesize such a pointer at link time and then just use a IP relative
> plus DISP32 to access said GOT.
>

This approach works so long as one has a one-to-one mapping between
loaded binaries, and their associated sets of global variables (or, if
each mapping exists in its own address space).

Doesn't work so well for a many-to-one mapping within a shared address
space.

So, say, if you only have one instance of a binary, getting the GOT or
data sections relative to PC/IP can work.

But, with multiple instances, it does not work. The data sections can
only be relative to the other data sections (or to the process context).
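
The difference can be shown with a few lines of Python. All addresses here are hypothetical, and the sketch assumes a single shared copy of ".text" with two ".data" instances living in the same address space.

```python
TEXT_BASE = 0x10000      # one shared copy of ".text"
VAR_OFFSET = 0x40        # hypothetical offset of a global within ".data"
PC_REL_DISP = 0x8000     # hypothetical fixed distance from code to data

def pc_relative(pc):
    # The address depends only on where the instruction is.
    return pc + PC_REL_DISP

def base_relative(data_base):
    # The address depends on which instance's data base is in the register.
    return data_base + VAR_OFFSET

instances = {"A": 0x50000, "B": 0x90000}   # per-instance ".data" bases
pc = TEXT_BASE + 0x100                      # same instruction for both

pc_addrs = {name: pc_relative(pc) for name in instances}
gbr_addrs = {name: base_relative(base) for name, base in instances.items()}
```

With PC-relative addressing both instances end up at the same location; only the base-relative form keeps their globals apart.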

Like, say, if you wanted to support a multitasking operating system on
hardware that doesn't have either virtual memory or segments.

Or, if one does have virtual memory, but wants to keep it as optional.

Say, for example, uClinux...

> So, say we have some external variables::
>
>     extern uint64_t fred, wilma, barney, betty;
>
> AND we postulate that the linker found all 4 externs in the same module
> so that it can access them all via 1 pointer. The linker assigns an
> index into GOT and setups a relocation to that memory segment and when
> LD.so runs, it stores a proper pointer in that index of GOT, call this
> index fred_index.
>
> And we access one of these::
>
>     if( fred at_work )
>
> The compiler will obtain the pointer to the area fred is positioned via:
>
>     LDD    Rfp,[IP,,#GOT+fred_index<<3]        // *
>

And, the above is where the problem lies...

Would be valid for ELF PIC or PIE binaries, but is not valid for PBO or
FDPIC.

> and from here one can access barney, betty and wilma using the pointer
> to fred and standard offsetting.
>
>     LDD    Rfred,[Rfp,#0]     // fred
>     LDD    Rbarn,[Rfp,#16]    // barney
>     LDD    Rbett,[Rfp,#24]    // betty
>     LDD    Rwilm,[Rfp,#8]     // wilma
>
> These offsets are known at link time and possibly not at compile time.
>
> (*) if the LDD through GOT takes a page fault, we have a procedure setup
> so LD.so can run figure out which entry is missing, look up where it is
> (possibly load and resolve it) and insert the required data into GOT.
> When control returns to LDD, the entry is now present, and we now have
> access to fred, wilma, barney and betty.
>

Yeah.

>> Differing mostly in that it doesn't require base relocs.
>
>> The normal version in my case avoids the extra memory load, but uses a
>> base reloc for the table index.
>
>> ....
>
> {{ // this looks like stuff that should be accessible to LD.so
>
>> Though, the reloc format is at least semi-dense, eg, for a block of
>> relocs:
>>    { DWORD rvaPage;   //address of page (4K)
>>      DWORD szRelocs;  //size of relocs in block
>>    }
>> With each reloc encoded as a 16-bit entry:
>>    (15:12): Reloc Type
>>    (11: 0): Address within Page (4K)
>
>> One downside is this format is less efficient for sparse relocs
>> (current situation), where often there are only 1 or 2 relocs per page
>> (typically the PBO index fixups and similar).
>
>
>> One situation could be to have a modified format that partially omits
>> the block structuring, say:
>>    0ddd: Advance current page position by ddd pages (4K);
>>      0000: Effectively a NOP (as before)
>>    1ddd..Cddd: Apply the given reloc.
>>      These represent typical relocs, target dependent.
>>      HI16, LO16, DIR32, HI32ADJ, ...
>>      8ddd: Was assigned for PBO fixups;
>>      Addd: Fixup for a 64-bit address, also semi common.
>>    Dzzz/Ezzz: Extended Relocs
>>      These ones are configurable from a larger set of reloc types.
>>    Fzzz: Command-Escape
>>    ...
>
>> Where, say, rather than needing 1 block per 4K page, it is 1 block per
>> PE section.
>

Tested the above tweak, it can reduce the size of the ".reloc" section
by around 20%, but would break compatibility with previous versions of
my PEL loader.

>
>> Though, base relocs are a relatively small part of the size of the
>> binary.
>
>
>> To some extent, the PBO reloc is magic in that it works by
>> pattern-matching the instruction that it finds at the given address.
>> So, in effect, is only defined for a limited range of instructions.
>
>> Contrast with, say, the 1/2/3/4/A relocs, which expect raw 16/32/64
>> bit values. Though, a lot of these are not currently used for BJX2
>> (does not use 16-bit addressing modes, ...).
>
>> Here:
>> 5/6/7/8/9/B/C, ended up used for BJX2 relocs in BJX2 mode.
>>    For other targets, they would have other meanings.
>> D/E/F were reserved as expanded/escape-case relocs, in case I need to
>> add more. These would differ partly in that the reloc sub-type would
>> be assigned as a sort of state-machine.
>
>
> but not the program itself}}

As noted, the base relocs are applied by the PE / PEL loader.

But, annoyingly, this would not map over so well to an ELF loader...

Note that despite PEL keeping the same high level structure as PE/COFF,
in the case of PBO, effectively the binary is split in half.

So, the read-only sections (".text" and friends), and the read/write
sections (".data" and ".bss") are entirely disjoint in memory.

Likewise, read-only sections may not point to read/write sections, and
the reloc's are effectively applied in two different stages (for the
read-only sections when the binary is loaded into memory; and to the
read/write sections when a new instance is created).
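
The two stages might be sketched in Python like this. It is an illustration under simplifying assumptions, not the PEL loader: sections are lists of words, a reloc is just an index whose word gets a base added, and the function names are made up.

```python
def load_image(text, text_relocs, text_base):
    """Stage 1: done once, when the binary is loaded into memory."""
    loaded = list(text)
    for i in text_relocs:
        loaded[i] += text_base
    return loaded

def create_instance(data_template, data_relocs, data_base):
    """Stage 2: done again for every new instance of the program."""
    data = list(data_template)
    for i in data_relocs:
        data[i] += data_base
    return data

text = load_image([0x10, 0, 7], [0], 0x10000)
inst1 = create_instance([0x20, 99], [0], 0x50000)
inst2 = create_instance([0x20, 99], [0], 0x90000)
```

The read-only image is fixed up once and shared, while each instance gets its own freshly relocated copy of the read/write template.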

Meanwhile, got the My66000 LLVM/Clang compiler built far enough that it
at least seems to try to build something (and seems to know that the
target exists).

But, also tends to die in a storm of error messages, eg:

/tmp/m_swap-822054.s:6: Error: no such instruction: `bitr r1,r1,<8:48>'
/tmp/m_swap-822054.s:14: Error: no such instruction: `srl r2,r1,<0:24>'
/tmp/m_swap-822054.s:15: Error: no such instruction: `srl r3,r1,<0:8>'
/tmp/m_swap-822054.s:16: Error: too many memory references for `and'
/tmp/m_swap-822054.s:17: Error: no such instruction: `sll r4,r1,<0:8>'
/tmp/m_swap-822054.s:18: Error: too many memory references for `and'
/tmp/m_swap-822054.s:19: Error: no such instruction: `sll r1,r1,<0:24>'
/tmp/m_swap-822054.s:20: Error: too many memory references for `or'
/tmp/m_swap-822054.s:21: Error: too many memory references for `or'
/tmp/m_swap-822054.s:22: Error: too many memory references for `or'
/tmp/m_cheat-f6c778.s: Assembler messages:
/tmp/m_cheat-f6c778.s:6: Error: no such instruction: `ldub r3,[ip,firsttime]'
/tmp/m_cheat-f6c778.s:7: Error: no such instruction: `bb1 0,r3,.LBB0_3'
/tmp/m_cheat-f6c778.s:8: Error: no such instruction: `stb '
/tmp/m_cheat-f6c778.s:9: Error: expecting operand after ','; got nothing
/tmp/m_cheat-f6c778.s:10: Error: too many memory references for `mov'
/tmp/m_cheat-f6c778.s:12: Error: no such instruction: `bitr r5,r4,<1:56>'
/tmp/m_cheat-f6c778.s:13: Error: no such instruction: `sll r6,r3,<0:1>'
/tmp/m_cheat-f6c778.s:14: Error: too many memory references for `and'
/tmp/m_cheat-f6c778.s:15: Error: no such instruction: `srl r7,r3,<0:1>'
/tmp/m_cheat-f6c778.s:16: Error: too many memory references for `and'
/tmp/m_cheat-f6c778.s:17: Error: no such instruction: `srl r8,r3,<0:5>'
/tmp/m_cheat-f6c778.s:18: Error: too many memory references for `and'
/tmp/m_cheat-f6c778.s:19: Error: no such instruction: `srl r9,r3,<1:7>'
/tmp/m_cheat-f6c778.s:20: Error: too many memory references for `and'
/tmp/m_cheat-f6c778.s:21: Error: too many memory references for `or'
/tmp/m_cheat-f6c778.s:22: Error: too many memory references for `or'
/tmp/m_cheat-f6c778.s:23: Error: too many memory references for `or'
/tmp/m_cheat-f6c778.s:24: Error: too many memory references for `or'
/tmp/m_cheat-f6c778.s:25: Warning: `r5' is not valid here (expected `(%rdi)')
/tmp/m_cheat-f6c778.s:25: Warning: `r5' is not valid here (expected `(%rdi)')
/tmp/m_cheat-f6c778.s:25: Error: too many memory references for `ins'
/tmp/m_cheat-f6c778.s:26: Error: no such instruction: `stb r5,[ip,r3,cheat_xlate_table]'
/tmp/m_cheat-f6c778.s:27: Error: too many memory references for `add'
/tmp/m_cheat-f6c778.s:28: Error: too many memory references for `add'
/tmp/m_cheat-f6c778.s:29: Error: too many memory references for `cmp'
/tmp/m_cheat-f6c778.s:30: Error: no such instruction: `bne r5,.LBB0_2'
/tmp/m_cheat-f6c778.s:32: Error: no such instruction: `ldd r4,[r1]'
/tmp/m_cheat-f6c778.s:33: Error: expecting operand after ','; got nothing
/tmp/m_cheat-f6c778.s:34: Error: no such instruction: `beq0 r4,.LBB0_17'
/tmp/m_cheat-f6c778.s:35: Error: no such instruction: `ldd r5,[r1,8]'
/tmp/m_cheat-f6c778.s:36: Error: no such instruction: `beq0 r5,.LBB0_5'
/tmp/m_cheat-f6c778.s:37: Error: no such instruction: `ldub r6,[r5]'
/tmp/m_cheat-f6c778.s:38: Error: no such instruction: `beq0 r6,.LBB0_7'
/tmp/m_cheat-f6c778.s:40: Error: too many memory references for `and'
/tmp/m_cheat-f6c778.s:41: Error: no such instruction: `ldub r2,[ip,r2,cheat_xlate_table]'
/tmp/m_cheat-f6c778.s:42: Error: too many memory references for `cmp'
/tmp/m_cheat-f6c778.s:43: Error: no such instruction: `bne r2,.LBB0_10'
/tmp/m_cheat-f6c778.s:44: Error: too many memory references for `add'
/tmp/m_cheat-f6c778.s:45: Error: too many memory references for `std'
/tmp/m_cheat-f6c778.s:46: Error: no such instruction: `ldub r2,[r4]'
/tmp/m_cheat-f6c778.s:47: Error: too many memory references for `cmp'
/tmp/m_cheat-f6c778.s:48: Error: no such instruction: `bne r5,.LBB0_12'
/tmp/m_cheat-f6c778.s:49: Error: no such instruction: `br .LBB0_14'
/tmp/m_cheat-f6c778.s:52: Error: too many memory references for `mov'
/tmp/m_cheat-f6c778.s:55: Error: too many memory references for `std'
/tmp/m_cheat-f6c778.s:56: Error: too many memory references for `mov'
/tmp/m_cheat-f6c778.s:57: Error: no such instruction: `ldub r6,[r5]'
/tmp/m_cheat-f6c778.s:58: Error: no such instruction: `bne0 r6,.LBB0_8'
/tmp/m_cheat-f6c778.s:60: Error: too many memory references for `add'
/tmp/m_cheat-f6c778.s:61: Error: too many memory references for `std'
/tmp/m_cheat-f6c778.s:62: Error: no such instruction: `stb r2,[r5]'
/tmp/m_cheat-f6c778.s:63: Error: no such instruction: `ldd r4,[r1,8]'
/tmp/m_cheat-f6c778.s:64: Error: no such instruction: `ldub r2,[r4]'
/tmp/m_cheat-f6c778.s:65: Error: too many memory references for `cmp'
/tmp/m_cheat-f6c778.s:66: Error: no such instruction: `bne r5,.LBB0_12'
/tmp/m_cheat-f6c778.s:68: Error: no such instruction: `ldd r2,[r1]'
/tmp/m_cheat-f6c778.s:69: Error: expecting operand after ','; got nothing
/tmp/m_cheat-f6c778.s:70: Error: too many memory references for `std'
/tmp/m_cheat-f6c778.s:71: Error: too many memory references for `mov'
/tmp/m_cheat-f6c778.s:74: Error: too many memory references for `std'
/tmp/m_cheat-f6c778.s:75: Error: no such instruction: `ldub r2,[r4]'
/tmp/m_cheat-f6c778.s:76: Error: too many memory references for `cmp'
/tmp/m_cheat-f6c778.s:77: Error: no such instruction: `beq r5,.LBB0_14'
/tmp/m_cheat-f6c778.s:79: Error: too many memory references for `cmp'
/tmp/m_cheat-f6c778.s:80: Error: no such instruction: `bne r2,.LBB0_16'
/tmp/m_cheat-f6c778.s:81: Error: too many memory references for `add'
/tmp/m_cheat-f6c778.s:82: Error: expecting operand after ','; got nothing
/tmp/m_cheat-f6c778.s:83: Error: too many memory references for `std'
/tmp/m_cheat-f6c778.s:84: Error: too many memory references for `mov'
/tmp/m_cheat-f6c778.s:92: Error: no such instruction: `ldd r3,[r1]'
/tmp/m_cheat-f6c778.s:94: Error: too many memory references for `add'
/tmp/m_cheat-f6c778.s:95: Error: no such instruction: `ldub r3,[r3]'
/tmp/m_cheat-f6c778.s:96: Error: too many memory references for `cmp'
/tmp/m_cheat-f6c778.s:97: Error: too many memory references for `mov'
/tmp/m_cheat-f6c778.s:98: Error: no such instruction: `bne r4,.LBB1_1'
/tmp/m_cheat-f6c778.s:99: Error: no such instruction: `ldub r4,[r1]'
/tmp/m_cheat-f6c778.s:100: Error: expecting operand after ','; got nothing
/tmp/m_cheat-f6c778.s:102: Error: too many memory references for `mov'
/tmp/m_cheat-f6c778.s:103: Error: no such instruction: `stb r4,[r2,r3,-1]'
/tmp/m_cheat-f6c778.s:104: Error: too many memory references for `and'
/tmp/m_cheat-f6c778.s:105: Error: no such instruction: `stb '
/tmp/m_cheat-f6c778.s:106: Error: no such instruction: `ldub r4,[r1,r3,0]'
/tmp/m_cheat-f6c778.s:107: Error: too many memory references for `and'
/tmp/m_cheat-f6c778.s:108: Error: no such instruction: `beq0 r6,.LBB1_5'
/tmp/m_cheat-f6c778.s:109: Error: too many memory references for `add'
/tmp/m_cheat-f6c778.s:110: Error: too many memory references for `cmp'
/tmp/m_cheat-f6c778.s:111: Error: no such instruction: `bne r5,.LBB1_3'
/tmp/m_cheat-f6c778.s:112: Error: no such instruction: `br .LBB1_6'
/tmp/m_cheat-f6c778.s:114: Error: too many memory references for `cmp'
/tmp/m_cheat-f6c778.s:115: Error: no such instruction: `beq r1,.LBB1_6'
/tmp/m_cheat-f6c778.s:118: Error: no such instruction: `stb '
/tmp/m_random-1b60b6.s: Assembler messages:
/tmp/m_random-1b60b6.s:6: Error: no such instruction: `lduw r1,[ip,prndindex]'
/tmp/m_random-1b60b6.s:7: Error: too many memory references for `add'
/tmp/m_random-1b60b6.s:8: Error: too many memory references for `and'
/tmp/m_random-1b60b6.s:9: Error: no such instruction: `stw r1,[ip,prndindex]'
/tmp/m_random-1b60b6.s:10: Error: no such instruction: `ldub r1,[ip,r1,rndtable]'
/tmp/m_random-1b60b6.s:18: Error: no such instruction: `lduw r1,[ip,rndindex]'
/tmp/m_random-1b60b6.s:19: Error: too many memory references for `add'
/tmp/m_random-1b60b6.s:20: Error: too many memory references for `and'
/tmp/m_random-1b60b6.s:21: Error: no such instruction: `stw r1,[ip,rndindex]'
/tmp/m_random-1b60b6.s:22: Error: no such instruction: `ldub r1,[ip,r1,rndtable]'
/tmp/m_random-1b60b6.s:30: Error: no such instruction: `stw '
/tmp/m_random-1b60b6.s:31: Error: no such instruction: `stw '


Re: Stealing a Great Idea from the 6600

<v0oja9$1rtc8$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=38505&group=comp.arch#38505

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tkoenig@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Mon, 29 Apr 2024 16:53:29 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 18
Message-ID: <v0oja9$1rtc8$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org> <IQSWN.4$nQv.0@fx10.iad>
<v0jlf3$i3mh$2@dont-email.me>
<3458ae0a6b7c1f667ef232c58569b5e1@www.novabbs.org>
<v0k2kb$l21r$1@dont-email.me>
<58f8e9f6925fd21a5526ea45fae82251@www.novabbs.org>
<v0kodk$t520$1@dont-email.me>
<c482e88733bb667025cff44314127e14@www.novabbs.org>
<v0mean$18r2p$1@dont-email.me>
<14a7b1b370c033c50ac77e3394ac1ea5@www.novabbs.org>
<v0n8ls$1hv6r$1@dont-email.me>
Injection-Date: Mon, 29 Apr 2024 18:53:29 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="f6538722a26f1008304ec5c0c907557d";
logging-data="1963400"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX198ASuHbRkYn6lDvHGqC+gtMfjr/bCC+N4="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:XYn9Pf2D3oXa6dmQFlTfV8dTDkM=
 by: Thomas Koenig - Mon, 29 Apr 2024 16:53 UTC

BGB <cr88192@gmail.com> wrote:

>
> Meanwhile, got the My66000 LLVM/Clang compiler built so far as that it
> at least seems to try to build something (and seems to know that the
> target exists).
>
>
> But, also tends to die in a storm of error messages, eg:
>
> /tmp/m_swap-822054.s:6: Error: no such instruction: `bitr r1,r1,<8:48>'

You can only generate assembly code, so just use "-S".

If you want to assemble to object files, you can use my binutils
branch on GitHub. I have not yet started on the linker (there
are still quite a few decisions to be made regarding relocations,
which is a topic that I do not enjoy too much).

Re: Stealing a Great Idea from the 6600

<lm8YN.7354$qdt5.2292@fx35.iad>

https://news.novabbs.org/devel/article-flat.php?id=38509&group=comp.arch#38509

Path: i2pn2.org!i2pn.org!news.swapon.de!news.mixmin.net!feeder1-2.proxad.net!proxad.net!feeder1-1.proxad.net!193.141.40.65.MISMATCH!npeer.as286.net!npeer-ng0.as286.net!peer03.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx35.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Stealing a Great Idea from the 6600
Newsgroups: comp.arch
References: <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <gul82jlmud2gglbf1siupn180r3f5o3qo5@4ax.com> <v0c9v5$2k063$8@dont-email.me> <80e0bf91545212a676011a9ccd0efa06@www.novabbs.org>
Lines: 19
Message-ID: <lm8YN.7354$qdt5.2292@fx35.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Tue, 30 Apr 2024 15:45:21 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Tue, 30 Apr 2024 15:45:21 GMT
X-Received-Bytes: 2017
 by: Scott Lurndal - Tue, 30 Apr 2024 15:45 UTC

mitchalsup@aol.com (MitchAlsup1) writes:
>Lawrence D'Oliveiro wrote:
>
>> On Sat, 20 Apr 2024 18:06:22 -0600, John Savard wrote:
>
>>> Since there was only one set of arithmetic instructions, that meant that
>>> when you wrote code to operate on unsigned values, you had to remember
>>> that the normal names of the condition code values were oriented around
>>> signed arithmetic.
>
>> I thought architectures typically had separate condition codes for “carry”
>> versus “overflow”. That way, you didn’t need signed versus unsigned
>> versions of add, subtract and compare; it was just a matter of looking at
>> the right condition codes on the result.
>
>Maybe now with 4-or-5-bit condition codes yes,
>But the early machines (360) with 2-bit codes were already constricted.

The B3500 (contemporaneous with 360) had COMS toggles (2 bits) and OVERFLOW toggle (1 bit).

Re: a bit of history, Stealing a Great Idea from the 6600

<v0rhj5$1itj$2@gal.iecc.com>

https://news.novabbs.org/devel/article-flat.php?id=38516&group=comp.arch#38516

Path: i2pn2.org!i2pn.org!weretis.net!feeder9.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: johnl@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: a bit of history, Stealing a Great Idea from the 6600
Date: Tue, 30 Apr 2024 19:42:29 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <v0rhj5$1itj$2@gal.iecc.com>
References: <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <v0c9v5$2k063$8@dont-email.me> <80e0bf91545212a676011a9ccd0efa06@www.novabbs.org> <lm8YN.7354$qdt5.2292@fx35.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 30 Apr 2024 19:42:29 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="52147"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <v0c9v5$2k063$8@dont-email.me> <80e0bf91545212a676011a9ccd0efa06@www.novabbs.org> <lm8YN.7354$qdt5.2292@fx35.iad>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Tue, 30 Apr 2024 19:42 UTC

According to Scott Lurndal <slp53@pacbell.net>:
>mitchalsup@aol.com (MitchAlsup1) writes:
>>Maybe now with 4-or-5-bit condition codes yes,
>>But the early machines (360) with 2-bit codes were already constricted.
>
>The B3500 (contemporaneous with 360) had COMS toggles (2 bits) and OVERFLOW toggle (1 bit).

As far as I can tell, the 360 was the only machine with short encoded
condition codes.

In their architecture book Brooks and Blaauw say it was a mistake, but
it also would have been a problem to fit conditional branches into the
instruction set if they needed more instruction bits to say which
codes to test. As it was, the branch instructions had 4 condition bits
which let them check for any combination of two-bit conditions along
with 1111 for unconditional branch and 0000 for no-op.
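
The mask test described above can be sketched in a few lines. This is
only an illustration, assuming the S/360 convention that the leftmost
mask bit selects condition code 0 and the rightmost selects condition
code 3; the function name branch_taken is hypothetical.

```python
def branch_taken(mask: int, cc: int) -> bool:
    """Return True if a 4-bit branch mask selects the current 2-bit condition code."""
    if not 0 <= mask <= 0b1111:
        raise ValueError("mask must fit in 4 bits")
    if not 0 <= cc <= 3:
        raise ValueError("condition code must fit in 2 bits")
    # Mask bit weights 8, 4, 2, 1 correspond to condition codes 0, 1, 2, 3
    # (assumed ordering for this sketch).
    return bool(mask & (0b1000 >> cc))


# 1111 selects every condition code: an unconditional branch.
assert all(branch_taken(0b1111, cc) for cc in range(4))
# 0000 selects none: the branch never happens, i.e. a no-op.
assert not any(branch_taken(0b0000, cc) for cc in range(4))
# 0110 branches when the condition code is 1 or 2, but not 0 or 3.
assert [branch_taken(0b0110, cc) for cc in range(4)] == [False, True, True, False]
```

With 4 mask bits there are 16 masks, one for every subset of the four
condition code values, which is why no further encoding is needed.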

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: a bit of history, Stealing a Great Idea from the 6600

<2024May3.173347@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=38654&group=comp.arch#38654

Path: i2pn2.org!rocksolid2!news.neodome.net!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: a bit of history, Stealing a Great Idea from the 6600
Date: Fri, 03 May 2024 15:33:47 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 33
Message-ID: <2024May3.173347@mips.complang.tuwien.ac.at>
References: <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <v0c9v5$2k063$8@dont-email.me> <80e0bf91545212a676011a9ccd0efa06@www.novabbs.org> <lm8YN.7354$qdt5.2292@fx35.iad> <v0rhj5$1itj$2@gal.iecc.com>
Injection-Date: Fri, 03 May 2024 17:51:26 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="91e091e20b1a6fa18e9a38109ba9bc59";
logging-data="667316"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/swDuQomZUheuNorsniuPs"
Cancel-Lock: sha1:wNHHWFXi37vRoNqDCTFjk3Kcq08=
X-newsreader: xrn 10.11
 by: Anton Ertl - Fri, 3 May 2024 15:33 UTC

John Levine <johnl@taugh.com> writes:
[only 2 condition code bits]
>In their architecture book Brooks and Blaauw say it was a mistake, but
>it also would have been a problem to fit conditional branches into the
>instruction set if they needed more instruction bits to say which
>codes to test. As it was, the branch instructions had 4 condition bits
>which let them check for any combination of two-bit conditions along
>with 1111 for unconditional branch and 0000 for no-op.

The architecture paid for the reduction in branches with an increase
in other instructions; S/360 needed signed and unsigned versions of a
number of instructions, e.g.,

signed   unsigned
A        AL
AR       ALR
S        SL
SR       SLR
C        CL
CR       CLR

Other instruction sets with 16-bit instructions have no problem
supporting the usual NCZV flags.
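
As a rough illustration of why N, C, Z and V flags remove the need for
separate signed and unsigned compares, here is a minimal sketch. It
assumes a 32-bit word and the convention that C set after a subtract
means "no borrow"; the helper names are hypothetical and do not model
any particular instruction set exactly.

```python
MASK = 0xFFFFFFFF  # 32-bit word size, an assumption for this sketch


def sub_flags(a: int, b: int) -> dict:
    """Compute a - b on 32-bit patterns and return the N, Z, C, V flags."""
    a &= MASK
    b &= MASK
    r = (a - b) & MASK
    n = r >> 31                       # sign bit of the result
    z = int(r == 0)                   # result is zero
    c = int(a >= b)                   # C set means no borrow (assumed convention)
    # Signed overflow: operand signs differ and the result sign differs from a.
    v = ((a ^ b) & (a ^ r)) >> 31
    return {"N": n, "Z": z, "C": c, "V": v}


def unsigned_less(flags: dict) -> bool:
    """Unsigned a < b: the subtract borrowed."""
    return flags["C"] == 0


def signed_less(flags: dict) -> bool:
    """Signed a < b: sign and overflow disagree."""
    return flags["N"] != flags["V"]


# One subtract, two interpretations of the same bit patterns.
f = sub_flags(0xFFFFFFFF, 1)
assert not unsigned_less(f)   # 4294967295 < 1 is false
assert signed_less(f)         # -1 < 1 is true
```

The compare itself is the same operation in both cases; only the
branch condition chooses which flags to look at.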

IBM then continued in their tradition by having <, =, > and sticky
overflow bits in the condition register fields of Power (plus separate
carry, sticky overflow and overflow flags); this also requires signed
and unsigned compares (cmp and cmpl along with variants).
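
A small sketch of the consequence: when the compare result is reduced
to less/greater/equal bits, the signedness has to be chosen by the
compare instruction itself. The code below assumes 32-bit operands and
uses a hypothetical compare() helper with a logical switch standing in
for the cmp/cmpl distinction; it is not a model of the actual Power
encoding.

```python
MASK = 0xFFFFFFFF  # 32-bit operands, an assumption for this sketch


def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed value."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def compare(a: int, b: int, logical: bool) -> tuple:
    """Return (LT, GT, EQ) bits; logical=True compares as unsigned."""
    if logical:
        x, y = a & MASK, b & MASK
    else:
        x, y = to_signed32(a), to_signed32(b)
    return (int(x < y), int(x > y), int(x == y))


# The same bit patterns give different results for the two compares.
assert compare(0xFFFFFFFF, 1, logical=False) == (1, 0, 0)  # -1 < 1
assert compare(0xFFFFFFFF, 1, logical=True) == (0, 1, 0)   # 4294967295 > 1
```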

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

