
By Popular Demand

From: quadibloc@servername.invalid (Quadibloc)
Newsgroups: comp.arch
Subject: By Popular Demand
Date: Wed, 24 Jan 2024 16:14:46 -0000 (UTC)
 by: Quadibloc - Wed, 24 Jan 2024 16:14 UTC

A very common comment I have received from several people on my Concertina II
ISA is that making the instruction stream block structured is a mistake.

However, computers having a VLIW architecture do normally have a block
structured instruction scheme, with the block being the very long instruction
word. While I've included VLIW functionality in Concertina II and Concertina
III, this has been to increase performance in some implementations, and, thus,
is a relatively minor part of the ISA.

What I've come up with now has the following characteristics:

The normal instruction set no longer has block structure; it's been squeezed
enough to go without that and still provide variable-length instructions.

But one can also choose to run in VLIW mode; then, the instruction stream
is divided into blocks of eight 32-bit instructions, with one block header
to indicate instruction predication.

So block structure is only present where it belongs. VLIW code can't be
distinguished from normal code by using the block header because
instructions in normal code can cross block boundaries, so the second
half of a 32-bit instruction could look like the start of a block header.
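
As a rough illustration of the boundary-crossing point, here is a minimal Python sketch. The 256-bit block size follows from eight 32-bit instructions; the particular mix of 16-bit and 32-bit lengths and the helper name `straddlers` are invented for illustration and are not taken from the Concertina description.

```python
BLOCK_BITS = 8 * 32  # one VLIW block: eight 32-bit instructions

def straddlers(lengths, block_bits=BLOCK_BITS):
    """Return (index, start_bit) for each instruction crossing a block boundary."""
    crossing = []
    pos = 0
    for i, n in enumerate(lengths):
        first_block = pos // block_bits
        last_block = (pos + n - 1) // block_bits
        if first_block != last_block:
            crossing.append((i, pos))
        pos += n
    return crossing

# Hypothetical normal-mode stream: one 16-bit instruction, then 32-bit ones.
stream = [16] + [32] * 8
print(straddlers(stream))  # [(8, 240)]
```

After a single 16-bit instruction, every later 32-bit instruction sits at an odd halfword offset, so one of them ends up with its second half at bit 256, exactly where a block header would begin in VLIW mode.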

Is this worthwhile, I wonder...

John Savard

Re: By Popular Demand

 by: MitchAlsup1 - Wed, 24 Jan 2024 20:05 UTC

Block structure is only applicable to 1-width of execution
and fails for all other widths.....

So the question becomes:: is your architecture designed for exactly
one width of execution ???

Re: By Popular Demand

 by: Quadibloc - Thu, 25 Jan 2024 01:47 UTC

On Wed, 24 Jan 2024 20:05:23 +0000, MitchAlsup1 wrote:

> Block structure is only applicable to 1-width of execution
> and fails for all other widths.....
>
> So the question becomes:: is your architecture designed for exactly
> one width of execution ???

Well, the VLIW mode is designed for eight-wide execution. But it
can also work well with four-wide or two-wide, I would think, since
it could still specify more efficient execution for those.

But this new design is about _abandoning_ block structure *except*
for VLIW programs. No more pseudo-immediates. 16-bit instructions
no longer come in pairs.

And here it is now:

http://www.quadibloc.com/arch/ct20int.htm

So basically I have finally taken your advice to dump block structure,
except that Concertina IV still offers VLIW in addition to CISC and
RISC; but now, VLIW is separate so the CISC/RISC instruction set is
no longer disfigured by block structure.

John Savard

Re: By Popular Demand

 by: Quadibloc - Thu, 25 Jan 2024 08:22 UTC

On Thu, 25 Jan 2024 01:47:07 +0000, Quadibloc wrote:

> On Wed, 24 Jan 2024 20:05:23 +0000, MitchAlsup1 wrote:
>
>> Block structure is only applicable to 1-width of execution
>> and fails for all other widths.....
>>
>> So the question becomes:: is your architecture designed for exactly
>> one width of execution ???
>
> Well, the VLIW mode is designed for eight-wide execution. But it
> can also work well with four-wide or two-wide, I would think, since
> it could still specify more efficient execution for those.

With Concertina II in its various incarnations, if a header left
seven instructions in a block, I provided _six_ break bits with
them, as there was always a break before the first one because the
instructions needed to be fetched.

With Concertina IV, on the other hand, the break bit happens to be
the first bit of every instruction. So, while the block length of
eight instructions controls the format of the header that provides
predication, there actually would be nothing stopping an implementation
from treating that as simply a notational convention for predication...
and fetching and executing twelve instructions at a time.
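
A minimal sketch of that idea in Python: instruction words are grouped for issue purely by their break bits, with no reference to the eight-instruction block. The polarity (a set top bit meaning "break before this instruction"), the choice of the most significant bit as the "first" bit, and the helper name `issue_groups` are all assumptions for illustration, not taken from the Concertina IV description.

```python
def issue_groups(words):
    """Split 32-bit instruction words into groups that may issue together.

    Assumption: a set top bit (bit 31) means "break before this instruction".
    The first instruction always opens a group.
    """
    groups = []
    for w in words:
        starts_group = (w >> 31) & 1
        if starts_group or not groups:
            groups.append([w])
        else:
            groups[-1].append(w)
    return groups

# Twelve instructions with no break after the first: one group of twelve,
# even though the predication header is defined per block of eight.
wide = [0x80000000] + [0x00000001] * 11
print([len(g) for g in issue_groups(wide)])  # [12]

# Breaks before the first and fourth instructions: groups of three and two.
mixed = [0x80000000, 0x00000001, 0x00000002, 0x80000003, 0x00000004]
print([len(g) for g in issue_groups(mixed)])  # [3, 2]
```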

Although presumably the compiler would take the issue width of the
target machine into account. So Concertina IV isn't necessarily
block structured in a way that limits the issue widths it can
work with, although that's just a happy accident, not something I
intended. And, of course, memory is simpler if powers of two are
fetched and executed.

Remember: now the only header is for predication. No longer is
decoding profoundly changed by some possible header values.

John Savard

Re: By Popular Demand

 by: Paul A. Clayton - Fri, 23 Feb 2024 22:40 UTC

On 1/24/24 11:14 AM, Quadibloc wrote:
> A very common comment I have received from several people on my Concertina II
> ISA is that making the instruction stream block structured is a mistake.

I am not certain that encoding based on a block larger than a
typical instruction is necessarily a mistake. If operations can
receive "left over" bits from a previous block and borrow bits
from the next block, then large immediates, for example, could
still be provided with less space wasted than if instructions
were required to fit within a block.
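
The density argument can be made concrete with a small Python sketch. The 256-bit block and the 96-bit operation size (say, a 32-bit operation plus a 64-bit immediate) are illustrative assumptions, as are the function names; no particular ISA is being modelled.

```python
def blocks_no_straddle(sizes, block_bits):
    """Blocks needed when every operation must fit entirely within one block."""
    blocks = 0
    used = block_bits  # treat as "full" so the first operation opens a block
    for n in sizes:
        if n > block_bits:
            raise ValueError("operation larger than a block")
        if used + n > block_bits:
            blocks += 1  # pad out the current block and start a new one
            used = 0
        used += n
    return blocks

def blocks_with_borrowing(sizes, block_bits):
    """Blocks needed when operations may spill bits into the next block."""
    return -(-sum(sizes) // block_bits)  # ceiling division

ops = [96] * 12  # twelve hypothetical 96-bit operations
print(blocks_no_straddle(ops, 256))     # 6 (only two fit per block; 64 bits wasted each)
print(blocks_with_borrowing(ops, 256))  # 5 (1152 bits, padded only at the end)
```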

Immediate bits would be obvious choices for borrowed bits as they
do not affect the operation and delayed acquisition would often
not be problematic. However, some operation encoding bits might be
safely lent to the next block, especially if the buffer was
cleared on indirect jumps.

Such a block encoding could prevent jumping into the middle of an
instruction. Useful code snippets for return-oriented programming
would still exist, of course.

(My 66000 provides some protection from jumping into the middle of
an instruction by having the most common 32-bit immediates not be
a valid first parcel of an instruction. Various "safe stack"
proposals can prevent return-oriented programming [I suspect
limiting the "main stack" accesses to using immediate offsets from
frame or stack pointer would have a similar effect]. There have
also been proposals for "landing pads" for indirect jumps.)

Block encoding could also be more flexible in field placement, not
being limited to a single operation/instruction. There might also
be opportunities for compression from using a larger encoding
chunk. Dependent operations and "superinstructions" might be made
easier to detect with block-based encoding (at least those within
one block).

Block encoding might also facilitate predecoding. With variable
length encoding like My 66000 (which is simpler by having the
"instruction" in the first parcel with later parcels only being
immediates), hardware would have to know the start of the first
instruction in a cache block to do predecoding of the cache block.
This feature seems especially attractive to me.
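
A rough Python sketch of why the start position matters for predecoding. The encoding here is purely hypothetical (the low two bits of an instruction's first 32-bit parcel give the number of immediate parcels that follow); it is not the actual My 66000 encoding, and `instruction_starts` is an invented helper.

```python
def instruction_starts(parcels, first_start):
    """Return the parcel indices that begin an instruction.

    Hypothetical encoding: the low two bits of an instruction's first
    parcel give the number of immediate parcels that follow it.
    """
    starts = []
    i = first_start
    while i < len(parcels):
        starts.append(i)
        i += 1 + (parcels[i] & 0b11)
    return starts

# A cache block of eight parcels. Parcel 0 is the trailing immediate of an
# instruction that began in the previous cache block.
block = [0x00000003, 0x00000001, 0xDEADBEEF, 0x00000000,
         0x00000002, 0x12345678, 0x9ABCDEF0, 0x00000000]

print(instruction_starts(block, 1))  # [1, 3, 4, 7]  correct start known
print(instruction_starts(block, 0))  # [0, 4, 7]     immediate misread as an instruction
```

With a block-based encoding in which every block begins on an instruction boundary, the predecoder would not need this extra piece of state.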

A block encoding that supports bit-borrowing from other blocks and
seeks to optimize density and staged interpretation of bits
according to criticality would probably be rather complex.

I am not convinced that block-based encoding is worthwhile, but
such an encoding need not be less dense than an instruction-based
encoding.

(I have not thought deeply about it, but I feel that there should
be a way to encode instruction information such that it could be
"parsed" to fill a BTB (to avoid target mispredictions and
redundant tag checks) and perhaps facilitate hoisting of loads
(particularly those with base addresses known early) and otherwise
organize the information for faster or less expensive execution. I
also wonder if a dependence-oriented encoding, where results are
explicitly directed to consumers similar to transport-trigger
architectures, would be helpful. Reducing any-to-any communication
_seems_ desirable as does localization of communication. Sun's
MAJC seemed attractive in having global and "slice" (execution
lane) private registers, though with four slices I would have
thought pair-shared registers might also be useful. Other aspects
of MAJC such as its VLIW nature and uniform capabilities of three
of the four slices seem less than ideal. MAJC also sought to allow
cheaper communication between threads; however, sharing a 16KiB
dual-ported L1 data cache among two cores seems problematic. Sun's
Rock processor was similarly "interesting".)
