Re: "Mini" tags to reduce the number of op codes

From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: "Mini" tags to reduce the number of op codes
Date: Fri, 5 Apr 2024 00:54:54 -0500
Organization: A noiseless patient Spider
Lines: 268
Message-ID: <uuo3pp$16v2r$1@dont-email.me>
References: <uuk100$inj$1@dont-email.me> <uukduu$4o4p$1@dont-email.me>
<420556afacf3ef3eea07b95498bcbef0@www.novabbs.org>
<uulojh$honc$1@dont-email.me> <uun9is$tpk9$1@dont-email.me>
<1eda150f3f6f24095c2204722a2fd541@www.novabbs.org>

On 4/4/2024 8:48 PM, MitchAlsup1 wrote:
> BGB-Alt wrote:
>
>> On 4/4/2024 3:32 AM, Terje Mathisen wrote:
>>> MitchAlsup1 wrote:
>>>>
>
>> As I can note, in my actual ISA, any type-tagging in the registers was
>> explicit and opt-in, generally managed by the compiler/runtime/etc; in
>> this case, the ISA merely providing facilities to assist with this.
>
>
>> The main exception would likely have been the possible "Bounds Check
>> Enforce" mode, which would still need a bit of work to implement, and
>> is not likely to be terribly useful.
>
> A while back (and maybe in the future) My 66000 had what I called the
> Foreign Access Mode. When the HoB of the pointer was set, the first
> entry in the translation table was a 4 doubleword structure, A Root
> pointer, the Lowest addressable Byte, the Highest addressable Byte,
> and a DW of access rights, permissions,... While sort-of like a capability
> I don't think it was close enough to actually be a capability or used as
> one.
>
> So, it fell out of favor, and it was not clear how it fit into the
> HyperVisor/SuperVisor model, either.
>

Possibly true.

The idea with BCE mode would be that the pointers would contain an
address along with an upper and lower bound, and possibly a few access
flags. It would disable the narrower 64-bit pointer instructions,
forcing the use of the 128-bit pointer instructions; which would perform
bounds checks, and some instructions would gain some additional semantics.

In addition, the Boot SRAM and DRAM gain some special "Tag Bits" areas.

However, it is unclear if the enforcing mode gains much over the normal
optional bounds checking to justify the extra cost. The main "merit"
case is that, in theory, it could offer some additional protection
against hostile machine code (whereas the non-enforcing mode is mostly
useful for detecting out-of-bounds memory accesses).

However, the optional mode is compatible with the use of 64-bit pointers
and the existing C ABI, so there is less overhead.
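
As a rough illustration of the bounds-carrying pointer idea, here is a minimal C sketch. The struct layout and the bp_* names are hypothetical and are not the actual BCE encoding; it uses three full-width fields for readability rather than a packed 128-bit format, omits the access flags, and returns 0 where real hardware would presumably raise a fault.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounded pointer; field names and layout are assumptions. */
typedef struct {
    uintptr_t addr;   /* address being referenced */
    uintptr_t lower;  /* lowest valid address (inclusive) */
    uintptr_t upper;  /* one past the highest valid address */
} bounded_ptr;

/* Build a bounded pointer covering [base, base+size). */
static bounded_ptr bp_make(void *base, size_t size) {
    bounded_ptr p;
    p.addr  = (uintptr_t)base;
    p.lower = (uintptr_t)base;
    p.upper = (uintptr_t)base + size;
    return p;
}

/* Pointer arithmetic moves the address but leaves the bounds alone. */
static bounded_ptr bp_add(bounded_ptr p, ptrdiff_t delta) {
    p.addr += (uintptr_t)delta;
    return p;
}

/* Nonzero if an access of len bytes at p.addr stays inside the bounds. */
static int bp_check(bounded_ptr p, size_t len) {
    return p.addr >= p.lower && p.addr <= p.upper &&
           len <= (size_t)(p.upper - p.addr);
}

/* Checked byte load: returns 1 and stores the byte, or 0 if out of bounds. */
static int bp_load_u8(bounded_ptr p, uint8_t *out) {
    if (!bp_check(p, 1))
        return 0;
    *out = *(const uint8_t *)p.addr;
    return 1;
}
```

In the non-enforcing case a check like bp_check() would be something the compiler emits optionally; in an enforcing mode the equivalent check would be part of every load/store.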

>> Most complicated and expensive parts
>> are that it will require implicit register and memory tagging (to flag
>> capabilities). Though, cheaper option is simply to not enable it, in
>> which case things either behave as before, with the new functionality
>> essentially being NOP. Much of the work still needed on this would be
>> getting the 128-bit ABI working, and adding some new tweaks to the ABI
>> to play well with the capability addressing (effectively it requires
>> partly reworking how global variables are accessed).
>
>
>> The type-tagging scheme used in my case is very similar to that used
>> in my previous BGBScript VMs (where, as I can note, BGBCC was itself a
>> fork off of an early version of the BGBScript VM, and effectively
>> using a lax hybrid typesystem masquerading as C). Though, it has long
>> since moved to a more proper C style typesystem, with dynamic types
>> more as an optional extension.
>
> In general, any time one needs to change the type you waste an instruction
> compared to type less registers.

In my case, both tagged and untagged values are used:
  int x;        //x is a bare register
  void *p;      //may or may not have tag, high 16 bits 0000 if untagged
  __variant y;  //y is tagged
  auto z;       //may be tagged or untagged

Here, untagged values will generally be used for non-variant types,
whereas tagged values will be used for variant types.

Here, 'auto' and 'variant' differ, in that variant says "the type is
only known at runtime", whereas 'auto' assumes that a type exists and
may optionally be resolved at compile time (or, alternatively, it may
decay into variant; assumption being that one may not use auto in ways
that are incompatible with variant). In terms of behavior, both cases
may appear superficially similar.

Though:
  auto z = expr;
would instead define 'z' with a type inferred from the expression (in a
similar way to how it works in C++).

Note that:
  __var x;
Would also give a variable of type variant, but is not exactly the same
("__variant" is the type, where "__var" is a statement/expression
keyword that just so happens to declare a variable of type "__variant"
when used in this way).

Say, non-variant:
  int, long, double
  void*, char*, Foo*, ...
  __m128, __vec4f, ...
Variant:
  __variant, __object, __fixnum, __string, ...

Where, for example:
  __variant
    May hold (nearly) any type of value at runtime.
    Though, with some semantic restrictions.
  __object
    Tagged value, like variant;
    But does not allow using operators on it directly.
  __fixnum
    Represents a 62-bit signed integer value.
    Always exists in tagged form.
  __flonum
    Represents a 62-bit floating-point value.
    Effectively a tagged Binary64 shifted-right by 2 bits.
  __string
    Holds a string;
    Essentially 'char*' but with a type-tagged pointer.
    Defaults to CP-1252 at present, but may also hold a UCS-2 string.
    Strings are assumed to be a read-only character array.
  ...

So, say:
  int x, z;
  __variant y;

  y=x;       //implicit int -> __fixnum -> __variant
  z=(int)y;  //coerces y to 'int'
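
The int -> __fixnum -> __variant coercion (and the __flonum case) could be sketched in plain C roughly as follows. This assumes the tag sits in the top 2 bits of a 64-bit word, which is consistent with the 62-bit payloads described above, but the specific tag values (01 for fixnum, 10 for flonum) and the function names are hypothetical and not the actual BGBCC runtime encoding.

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t variant_t;

/* Assumed layout: tag in the top 2 bits, payload in the low 62 bits. */
#define TAG_SHIFT   62
#define TAG_MASK    (3ULL << TAG_SHIFT)
#define TAG_FIXNUM  (1ULL << TAG_SHIFT)  /* hypothetical tag value */
#define TAG_FLONUM  (2ULL << TAG_SHIFT)  /* hypothetical tag value */

static int is_fixnum(variant_t v) { return (v & TAG_MASK) == TAG_FIXNUM; }
static int is_flonum(variant_t v) { return (v & TAG_MASK) == TAG_FLONUM; }

/* int -> __fixnum: keep the low 62 bits and attach the tag. */
static variant_t fixnum_wrap(int64_t i) {
    return ((uint64_t)i & ~TAG_MASK) | TAG_FIXNUM;
}

/* __fixnum -> int: strip the tag, then sign-extend from bit 61. */
static int64_t fixnum_unwrap(variant_t v) {
    uint64_t bits = v & ~TAG_MASK;
    if (bits & (1ULL << 61))
        bits |= TAG_MASK;
    return (int64_t)bits;
}

/* double -> __flonum: Binary64 bits shifted right by 2 (this drops the
   low 2 mantissa bits), with the tag in the vacated top bits. */
static variant_t flonum_wrap(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return (bits >> 2) | TAG_FLONUM;
}

/* __flonum -> double: shifting left by 2 discards the tag. */
static double flonum_unwrap(variant_t v) {
    uint64_t bits = v << 2;
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Under these assumptions, "y=x;" corresponds to fixnum_wrap(x), and "z=(int)y;" to a tag check followed by fixnum_unwrap(y).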

There are some operators that exist for variant types but not for
non-variant types, such as __instanceof.

  if(y __instanceof __fixnum)
  {
    //y is known to be a fixnum here
  }

Where __instanceof can also be used on class instances:
  __class Foo __extends Bar __implements IBaz {
    ... class members ...
  };

In theory, could add a header to #define a lot of these keywords in
non-prefixed forms, in which case one could theoretically write, say:
  public class Foo extends Bar implements IBaz {
    private int x, y;
    public int someMethod()
      { return x+y; }
    public void setX(int val)
      { x=val; }
    ...
  };

And, if one has, say:
  IBaz baz;
  ...
  if(baz instanceof Foo)
  {
    //baz is an instance of the Foo class
  }

Though, will note that object instances are pass-by-reference here (like
in Java and C#) and not by-value. Though, if one is familiar with Java,
probably not too hard to figure out how some of this works. Also, as can
be noted, the object model is more like Java family languages than like C++.

However, unlike Java (and more like ActionScript), one can throw a
'dynamic' (or '__dynamic') keyword on a class, in which case it is
possible to create new members in the object instances merely by
assigning to them (where any members created this way will default to
being 'public variant').

Object member access will differ depending on the type of object.
Direct access to a non-dynamic class member will use a fixed
displacement (like when accessing a struct). Dynamic members will
implicitly access an ex-nihilo object that exists as a hidden member in
the class instance (and using the 'dynamic' modifier on a class will
implicitly create this member).

In this case, interfaces are pulled off by sticking an interface VTable
pointer onto the end of the object, and then encoding the Interface
reference as a pointer to the pointer to this vtable (with the VTable
encoding the offset to adjust the object pointer to give a pointer to
the base class for the virtual method). Note that (unlike in the JVM),
what interfaces a class implements is fixed at compile time ("interface
injection" is not possible in BGBCC).
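
The interface scheme described above might look something like this when written out by hand in plain C. This is only a sketch: the struct layouts, the self_adjust field, and all of the names are hypothetical stand-ins for whatever the compiler actually generates.

```c
#include <stddef.h>

/* Hypothetical vtable for interface IBaz. */
typedef struct IBazVT {
    ptrdiff_t self_adjust;          /* interface slot -> start of object */
    int (*someMethod)(void *self);  /* the interface's virtual method */
} IBazVT;

/* Hypothetical instance layout for class Foo. */
typedef struct Foo {
    int x, y;             /* normal members at fixed displacements */
    const IBazVT *ibaz;   /* interface vtable pointer at the end of the object */
} Foo;

/* An interface reference is a pointer to the vtable pointer in the object. */
typedef const IBazVT **IBazRef;

static int Foo_someMethod(void *self) {
    Foo *f = (Foo *)self;
    return f->x + f->y;
}

static const IBazVT Foo_IBaz_vt = {
    -(ptrdiff_t)offsetof(Foo, ibaz),  /* step back from the slot to the object */
    Foo_someMethod
};

static void Foo_init(Foo *f, int x, int y) {
    f->x = x;
    f->y = y;
    f->ibaz = &Foo_IBaz_vt;
}

/* Converting Foo to IBaz just takes the address of the vtable slot. */
static IBazRef Foo_as_IBaz(Foo *f) {
    return &f->ibaz;
}

/* Interface call: load the vtable, adjust the pointer, call the method. */
static int IBaz_someMethod(IBazRef ref) {
    const IBazVT *vt = *ref;
    void *self = (char *)ref + vt->self_adjust;
    return vt->someMethod(self);
}
```

The caller only ever sees the IBazRef; the offset stored in the vtable is what lets the method recover the object pointer without knowing the concrete class.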

There was an experimental C++ mode, which tries to mimic C++ syntax and
semantics (kinda), sort of trying to awkwardly fake C++'s object system
on top of the Java-like object system (with POD classes decaying into C
structs; value objects faked with object cloning, ...). Will not take
much to see through this illusion though (and almost doesn't really seem
worth it).

If ex-nihilo objects are used, these are treated as separate from the
instance-of-class objects. In the current implementation, these objects
are represented as small B-Trees representing key/value associations.
Here, each key is a 16-bit number (associated with a "symbol") and the
value is a 64-bit value (variant). Each object has a fixed capacity (16
members), and if exceeded, splits apart into a tree (say, a 2-level tree
representing up to 256 members; with the keys in the top-level node
encoding the ranges of keys present in each sub-node).

At present, there is a limit of 64K unique symbols, but this isn't too
big of an issue in practice (each symbol can be seen as a mapping
between a 16-bit number and an ASCII string representing the symbol's name).
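
To make the key/value layout a bit more concrete, here is a minimal sketch of a single 16-entry node with 16-bit symbol keys and 64-bit values. The names and the sorted-array layout are assumptions for illustration; the split into a multi-level tree is not implemented here, and leaf_set() simply reports failure at the point where a real implementation would split.

```c
#include <stdint.h>
#include <string.h>

#define NODE_CAP 16  /* fixed per-node capacity, as described above */

/* Hypothetical leaf node: parallel arrays kept sorted by key. */
typedef struct {
    uint16_t keys[NODE_CAP];  /* symbol indices */
    uint64_t vals[NODE_CAP];  /* 64-bit (variant) values */
    int count;
} exn_leaf;

/* Binary search: returns the index of key, or the index where it
   would need to be inserted; *found says which case applies. */
static int leaf_find(const exn_leaf *n, uint16_t key, int *found) {
    int lo = 0, hi = n->count;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (n->keys[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    *found = (lo < n->count && n->keys[lo] == key);
    return lo;
}

/* Get a member: returns 1 and stores the value, or 0 if absent. */
static int leaf_get(const exn_leaf *n, uint16_t key, uint64_t *out) {
    int found;
    int i = leaf_find(n, key, &found);
    if (!found)
        return 0;
    *out = n->vals[i];
    return 1;
}

/* Set a member: returns 1 on success, 0 if the node is full
   (where a real implementation would split into a tree). */
static int leaf_set(exn_leaf *n, uint16_t key, uint64_t val) {
    int found;
    int i = leaf_find(n, key, &found);
    size_t tail;
    if (found) {
        n->vals[i] = val;
        return 1;
    }
    if (n->count == NODE_CAP)
        return 0;
    tail = (size_t)(n->count - i);
    memmove(&n->keys[i + 1], &n->keys[i], tail * sizeof n->keys[0]);
    memmove(&n->vals[i + 1], &n->vals[i], tail * sizeof n->vals[0]);
    n->keys[i] = key;
    n->vals[i] = val;
    n->count++;
    return 1;
}
```

Assigning to a previously unseen member of a dynamic object would then amount to interning the name to a 16-bit symbol and calling something like leaf_set() on the hidden object.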

If accessing a normal class member, it will be accessed as a direct
memory load or store, or if it is a dynamic member, an implicit runtime
call will be used.

For dynamic types (variant), pretty much all operations involve runtime
calls. These calls will perform a dynamically-typed dispatch based on
the tags of the values they are given.
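
As an example of what such a tag-dispatched runtime call might look like, here is a small C sketch of an "add" helper. It assumes a 2-bit tag in the top of the word with hypothetical tag values, and rt_variant_add() is a made-up name rather than an actual runtime entry point. Overflow past 62 bits in the fixnum case is silently truncated here, which a real runtime would likely need to handle.

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t variant_t;

/* Assumed tag layout; the specific values are hypothetical. */
#define TAG_MASK    (3ULL << 62)
#define TAG_FIXNUM  (1ULL << 62)
#define TAG_FLONUM  (2ULL << 62)

static variant_t mk_fix(int64_t i) {
    return ((uint64_t)i & ~TAG_MASK) | TAG_FIXNUM;
}

static int64_t get_fix(variant_t v) {
    uint64_t bits = v & ~TAG_MASK;
    if (bits & (1ULL << 61))      /* sign-extend from bit 61 */
        bits |= TAG_MASK;
    return (int64_t)bits;
}

static variant_t mk_flo(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return (bits >> 2) | TAG_FLONUM;
}

static double get_flo(variant_t v) {
    uint64_t bits = v << 2;
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Convert a numeric variant to double; returns 0 for other tags. */
static int as_double(variant_t v, double *out) {
    switch (v & TAG_MASK) {
    case TAG_FIXNUM: *out = (double)get_fix(v); return 1;
    case TAG_FLONUM: *out = get_flo(v);         return 1;
    default:         return 0;
    }
}

/* Runtime helper standing in for "a + b" on two variants.
   Returns 1 on success, 0 if the operand types are not handled. */
static int rt_variant_add(variant_t a, variant_t b, variant_t *out) {
    double da, db;
    if ((a & TAG_MASK) == TAG_FIXNUM && (b & TAG_MASK) == TAG_FIXNUM) {
        *out = mk_fix(get_fix(a) + get_fix(b));  /* fast integer path */
        return 1;
    }
    if (as_double(a, &da) && as_double(b, &db)) {
        *out = mk_flo(da + db);                  /* mixed or float path */
        return 1;
    }
    return 0;
}
```

Even in this stripped-down form, each "+" costs a call plus a couple of tag tests, which is the main reason to keep variant out of hot paths.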

Similarly, getting/setting a member in an ex-nihilo object is
accomplished via a runtime call.

For performance reasons, high-traffic areas of the dynamic-type runtime
were written in ASM.

Though, for performance, the rule here is to avoid using variant except
in cases where one actually needs dynamic types (and based on whether or
not compatibility with mainline C compilers is needed).

On the other side of the language border (BS), the syntax differs slightly:
  function foo(x:int, y:int):int
  {
    var z:int;
    z=x+y;
    ...
  }

And, another language (BS2) of mine had switched to a more Java-like
syntax (with parts of C syntax bolted on). Though, the practical
difference between BS2 and the extended C variant is small (if you
#define the keywords, it is possible to do similar things with mostly
only minor syntactic differences).

Similarly, the extended C variant has another advantage:
It is backwards compatible with C.

Though, not quite so compatible with C++, and I don't expect C++ fans to
be all that interested in BGBCC's non-standard dialect.

OTOH: I will argue that it is at least much less horrid looking than
Objective-C.

....
