Re: Inefficiency of FSL matrices

From: krishna.myneni@ccreweb.org (Krishna Myneni)
Newsgroups: comp.lang.forth
Subject: Re: Inefficiency of FSL matrices
Date: Sat, 23 Dec 2023 09:05:03 -0600
Organization: A noiseless patient Spider

On 12/23/23 06:20, minforth wrote:
> Krishna Myneni wrote:
>> I do use code inlining. I don't understand what you mean by this being
>> a counter-example. For example, you may need to give interpretation
>> semantics for a word that performs a sequence of words, but the
>> compilation semantics does inline compiling of the sequence.
>
> I am a little surprised that this is a topic of discussion. See also:
> https://ethz.ch/content/dam/ethz/special-interest/infk/ast-dam/documents/Theodoridis-ASPLOS22-Inlining-Paper.pdf
>
> Well, I consider compiling a single xt to be equivalent to compiling a
> function call. Including the function body (apart from reducing function
> call overhead)
> a) eliminates the xt, so it is no longer there for decompilation or
> introspection, and
> b) creates a wider area for further optimisations.
> For example, peephole optimisation can now extend across the boundaries
> of the host code and inlined code, et cetera.

Aren't we saying the same thing? Inlining avoids compiling a function
call by including the function body in the function into which it is
inlined.

I didn't understand why you claimed inlining is a counterexample to
using dual semantics for optimization.
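
A minimal sketch of the dual-semantics inlining described above, assuming Gforth's INTERPRET/COMPILE: (it appears later in this thread with the stack effect ( xt-int xt-comp "name" -- )). The names SQ, SQ-INT, SQ-COMP and CUBE are hypothetical and chosen only for illustration:

```forth
\ Interpretation semantics: perform the sequence directly.
: sq-int  ( n -- n*n )  dup * ;

\ Compilation semantics: append the sequence to the current
\ definition instead of compiling a call to SQ-INT.
: sq-comp ( -- )  postpone dup  postpone * ;

' sq-int ' sq-comp interpret/compile: sq

\ Inside CUBE, SQ is inlined as DUP * rather than called.
: cube ( n -- n^3 )  dup sq * ;
```

Typing `5 sq .` at the prompt uses SQ-INT, while `3 cube .` runs code in which the body of SQ has been inlined; both paths compute the same result, which is the property an optimization must preserve.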

--
KM

Re: Inefficiency of FSL matrices

From: minforth@gmx.net (minforth)
Newsgroups: comp.lang.forth
Subject: Re: Inefficiency of FSL matrices
Date: Sat, 23 Dec 2023 17:17:25 +0000
Organization: novaBBS

Quoting you:
> Krishna Myneni wrote:
> This is my preference -- COMPILE, should simply compile the code needed
> to execute the xt given to it and not do something clever by
> substituting a different xt for it.

Generally speaking, inlining substitutes the original xt with different xts
(the inlined code) or machine code. But okay, just a misunderstanding.

Merry Xmas !! :-)

Re: Inefficiency of FSL matrices

From: krishna.myneni@ccreweb.org (Krishna Myneni)
Newsgroups: comp.lang.forth
Subject: Re: Inefficiency of FSL matrices
Date: Sat, 23 Dec 2023 14:52:39 -0600
Organization: A noiseless patient Spider

On 12/23/23 11:17, minforth wrote:
> Quoting you:
>> Krishna Myneni wrote:
>> This is my preference -- COMPILE, should simply compile the code
>> needed to execute the xt given to it and not do something clever by
>> substituting a different xt for it.
>
> Generally speaking, inlining substitutes the original xt with different xts
> (the inlined code) or machine code. But okay, just a misunderstanding.
>
> Merry Xmas !!  :-)

Likewise!

--
Krishna

Re: Inefficiency of FSL matrices

From: krishna.myneni@ccreweb.org (Krishna Myneni)
Newsgroups: comp.lang.forth
Subject: Re: Inefficiency of FSL matrices
Date: Thu, 28 Dec 2023 22:27:07 -0600
Organization: A noiseless patient Spider

On 12/22/23 08:14, Anton Ertl wrote:
> Krishna Myneni <krishna.myneni@ccreweb.org> writes:
>> On 12/19/23 10:43, Anton Ertl wrote:
>>> Krishna Myneni <krishna.myneni@ccreweb.org> writes:
>>>> But, I want }} to do the size-specific compilation,
>>>> eventually. To do that properly, I want to go to a dual-xt system and
>>>> avoid explicit STATE-dependence (which has other benefits we've
>>>> discussed in the past).
>>>
>>> Avoiding STATE is a good idea. However, what you have in mind seems
>>> to be an optimization, not something like S". Dual-xt words are good
>>> for stuff like S"; you can use them for optimization, but the
>>> intelligent COMPILE, is better for that.
>>>
>>
>> We had a discussion about this earlier, and I did not like the design of
>> SET-OPTIMIZER changing the behavior of COMPILE, .
>
> If it changes the behaviour of COMPILE, (rather than the
> implementation of that behaviour), that's a mistake in the use of
> SET-OPTIMIZER: Whatever you do, it must not change the behaviour.
>
>> I don't see the
>> drawback of changing the xt for compilation state as a method of
>> optimization. Why add extra complexity with SET-OPTIMIZER when you don't
>> have to?
>
> * It's a separation of concerns: SET-OPTIMIZER is for optimization and
> must not change the behaviour (i.e., if you replace the call to
> SET-OPTIMIZER with DROP, the program should still work), whereas
> SET->COMP (and related words such as INTERPRET/COMPILE:) changes the
> compilation semantics, i.e., the behaviour.
>
> * If you implement [COMPILE], you need to know if a word has
> non-default compilation semantics. If you have the separation of
> concerns above, that is easy: If it does not have the default
> NAME>COMPILE method (DEFAULT-NAME>COMP), it has non-default
> compilation semantics. If you are using this mechanism for a
> purpose that does not change the compilation semantics, you have to
> add a mechanism that tells the compiler about the difference, and
> the user has to provide this information in some way, too (e.g., by
> having INTERPRET/COMPILE: if the resulting word has non-default
> compilation semantics and INTERPRET/OPTIMIZE: if it has default
> compilation semantics).
>
> * There is a difference in performance if an xt is COMPILE,ed; with
> the intelligent COMPILE, the result is as good as going through the
> text interpreter; with INTERPRET/COMPILE:, you get a generic call to
> the xt, while the text interpreter produces better code.
>
> * The COMPILE, methods get the xt that is COMPILE,d (they have the
> same stack effect as COMPILE,), which helps in using the same
> implementation for several xts. E.g., FOLD1-1 is the optimizer of
> 29 words (all with the stack effect ( x -- x )). INTERPRET/COMPILE:
> lacks this flexibility. I guess you could have a way that provides
> the xt or nt of the word for which you are performing the
> compilation semantics; not sure how well that would work.
>
> Finally, my vision for the (far-away) future is that words such as S"
> and TO go away, and with them the need for words such as
> INTERPRET/COMPILE: or (worse) STATE-smart words.
>

I have not had time to respond to this, but will do so in more detail in
a separate thread. This is an important topic, and it will take me a
little while to digest the points you make above, as well as look up our
previous discussion about SET-OPTIMIZER.
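
For orientation, the rule quoted above (replacing the call to SET-OPTIMIZER with DROP must leave the program working) could be illustrated roughly as follows. This is only a sketch: it assumes a development Gforth that provides SET-OPTIMIZER ( xt -- ) acting on the most recently defined word, and DOUBLE and QUAD are hypothetical names:

```forth
\ COMPILE, method for DOUBLE: same stack effect as COMPILE, ( xt -- ).
\ It ignores the xt and compiles 2* inline instead of a call.
:noname ( xt -- )  drop postpone 2* ;

: double ( n -- 2n )  2* ;  set-optimizer   \ try DROP here instead

\ QUAD behaves identically with or without the optimizer; only the
\ generated code differs.
: quad ( n -- 4n )  double double ;
```

With either SET-OPTIMIZER or DROP in the marked position, `5 quad .` should print 20; that equivalence is what distinguishes an optimizer from a change of compilation semantics.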

My impression is that, in Gforth, you implement xt for a word as follows:

{ xt-interp, xt-compile [,xt-opt] }

and SET-OPTIMIZER allows you to specify xt-opt. Perhaps some pseudo-code
for INTERPRET will be helpful in understanding how compilation occurs in
Gforth.

--
Krishna

SET-OPTIMIZER etc. (was: Inefficiency of FSL matrices)

From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Subject: SET-OPTIMIZER etc. (was: Inefficiency of FSL matrices)
Date: Fri, 29 Dec 2023 08:21:51 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien

Krishna Myneni <krishna.myneni@ccreweb.org> writes:
>My impression is that, in Gforth, you implement xt for a word as follows:
>
>{ xt-interp, xt-compile [,xt-opt] }
>
>and SET-OPTIMIZER allows you to specify xt-opt.

Actually, Gforth has 8 header methods (method selectors) and 9 setter
words for specifying the method implementation for a method selector
in the current word:

execute        ( ... xt -- ... )    set-execute   ( code-address -- )
execute        ( ... xt -- ... )    set-does>     ( xt -- )
compile,       ( xt -- )            set-optimizer ( xt -- )
name>interpret ( nt -- xt )         set->int      ( xt -- )
name>compile   ( nt -- xt1 xt2 )    set->comp     ( xt -- )
(to)           ( val xt -- )        set-to        ( xt -- )
defer@         ( xt1 -- xt2 )       set-defer@    ( xt -- )
name>string    ( nt -- c-addr u )   set->string   ( xt -- )
name>link      ( nt1 -- nt2|0 )     set->link     ( xt -- )

In particular, you change the compilation semantics by specifying the
NAME>COMPILE method implementation, which is somewhat different from
providing xt-compsem (your xt-compile). Based on that, we also
provide the convenience words:

set-compsem ( xt -- )
compsem: ( -- ) \ dedicated to Stephen Pelc
intsem: ( -- ) \ special dedication to Stephen Pelc
interpret/compile: ( xt-int xt-comp "name" -- )

Usage examples, all equivalent:

:noname ." compiling" ;
: foo ." interpreting" ; set-compsem

: foo ." interpreting" ; compsem: ." compiling" ;
: foo ." compiling" ; intsem: ." interpreting" ;

: foo-int ." interpreting" ;
: foo-comp ." compiling" ;
' foo-int ' foo-comp interpret/compile: foo

Note that several people (including me) have recommended to define,
for every dual-semantics word like FOO, also FOO-INT and FOO-COMP.

Usage:
9 interpret/compile:
1 set-compsem (in the definition of COMPSEM:)
1 compsem:
0 intsem:
23 set-optimizer (and there are other, derived words)

Read all about it in:

http://www.euroforth.org/ef19/papers/paysan.pdf

Since then, we have moved nt and xt to point to the body.

You can see the header methods (except the code field) with

.hm ( nt -- )

E.g., for seeing the code generator of "!", you say

``! .hm

Let's compare the header methods of "!", ":", and "TO" and the
constant K-F1:

         ``! .hm            ``: .hm            ``to .hm      ``k-f1 .hm
opt:     peephole-compile,  :,                 :,            constant,
to:      no-to              no-to              no-to         no-to
extra:   $0                 $0                 $0            $0
>int:    noop               noop               a>int         noop
>comp:   default-name>comp  default-name>comp  i/c>comp      default-name>comp
>string: named>string       named>string       named>string  named>string
>link:   named>link         named>link         named>link    named>link

The "extra:" field is used for SET-DOES>.

>Perhaps some pseudo-code
>for INTERPRET will be helpful in understanding how compilation occurs in
>Gforth.

For dictionary words, what happens is, in essence:

parse-name find-name dup 0= #-13 and throw ( nt )
state @ if
  name>compile
else
  name>interpret
then
execute

You won't find it in this form in the current text interpreter,
because the text interpreter is now written to cover all kinds of
recognizers.

What may also be interesting to you is what happens then: For words
with default interpretation and compilation semantics (most words),
the NAME>COMPILE implementation is (simplified)

: default-name>comp ( nt -- xt1 xt2 )
  name>interpret ['] compile, ;

For an immediate word, the NAME>COMPILE implementation is:

: imm>comp ( nt -- xt1 xt2 )
  name>interpret ['] execute ;
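
The effect of these two NAME>COMPILE implementations can be tried out by hand. The following is a sketch, assuming a Gforth-like system with FIND-NAME ( c-addr u -- nt|0 ), NAME>COMPILE, and S" usable in interpretation state; COMPILE-NAMED, SQUARE and ABS2 are hypothetical names:

```forth
\ Look a word up by name and perform its compilation semantics,
\ as the text interpreter does in compile state.
: compile-named ( c-addr u -- )
  find-name dup 0= -13 and throw   \ nt, or THROW -13 (undefined word)
  name>compile execute ;

\ Default compilation semantics: DUP is appended via COMPILE, .
: square ( n -- n*n )  [ s" dup" compile-named ] * ;

\ Immediate words: IF and THEN are executed, which compiles the branch.
: abs2 ( n -- |n| )
  dup 0< [ s" if" compile-named ] negate [ s" then" compile-named ] ;
```

Here `3 square` leaves 9 and `-5 abs2` leaves 5, although DUP, IF and THEN were never seen by the text interpreter in compile state.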

Not just optimization, but all code generation happens in the
implementations of COMPILE,. E.g., ":,", "CONSTANT," and
"PEEPHOLE-COMPILE," are (simplified):

: :, ( xt -- ) >body ['] call peephole-compile, , ;

: constant, ( xt -- ) >body @ postpone literal ;

: peephole-compile, ( xt -- )
  \ compile xt, appending its code to the current dynamic superinstruction
  lits, here swap , compile-prim1 ;

LITS, ensures that any literals on the literal stack are compiled
before the primitive (this is part of the constant folding
implementation), and COMPILE-PRIM1 ( addr -- ) is in the C part of
Gforth; in gforth-fast, it performs stack caching, combines primitives
into superinstructions, and performs native-code generation if these
optimizations are enabled (they are by default), but at the very
least, turns the code-field address (ITC) into a code address (DTC).
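
The idea behind CONSTANT, can be imitated with standard words. The sketch below is not Gforth's implementation: MY-CONSTANT, MY-CONSTANT,, SEVEN and INLINE-SEVEN are hypothetical names, and the constant is built with CREATE so that >BODY is applicable in any standard system:

```forth
\ A constant-like defining word whose value lives in the body.
: my-constant ( n "name" -- )  create ,  does> ( -- n ) @ ;

\ Code generator in the style of CONSTANT, : fetch the value now
\ and compile it as a literal instead of compiling a call.
: my-constant, ( xt -- )  >body @ postpone literal ;

7 my-constant seven

\ The definition contains the literal 7, not a reference to SEVEN.
: inline-seven ( -- 7 )  [ ' seven my-constant, ] ;
```

In Gforth, the intelligent COMPILE, performs this step automatically whenever a constant is compiled, so no explicit bracketed call is needed there.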

Note that in the gforth and gforth-fast engines, "," alone does not
work for compiling a word, because these engines use hybrid
direct/indirect threaded code, which requires primitive-centric code
for colon definitions: In a colon definition, every word is compiled
to a primitive, possibly followed by an immediate argument; e.g., a
colon definition is compiled to the primitive CALL followed by the
address of the called word. See:

@InProceedings{ertl02,
  author =    {M. Anton Ertl},
  title =     {Threaded Code Variations and Optimizations (Extended
               Version)},
  booktitle = {Forth-Tagung 2002},
  year =      {2002},
  address =   {Garmisch-Partenkirchen},
  url =       {http://www.complang.tuwien.ac.at/papers/ertl02.ps.gz},
  abstract =  {Forth has been traditionally implemented as indirect
               threaded code, where the code for non-primitives is
               the code-field address of the word. To get the
               maximum benefit from combining sequences of
               primitives into superinstructions, the code produced
               for a non-primitive should be a primitive followed
               by a parameter (e.g., \code{lit} \emph{addr} for
               variables). This paper takes a look at the steps
               from a traditional threaded-code implementation to
               superinstructions, and at the size and speed effects
               of the various steps.\comment{It also compares these
               variants of Gforth to various other Forth
               implementations on contemporary machines.} The use
               of superinstructions gives speedups of up to a
               factor of 2 on large benchmarks on processors with
               branch target buffers, but requires more space for
               the primitives and the optimization tables, and also
               a little more space for the threaded code.}
}

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023: https://euro.theforth.net/2023

Re: Inefficiency of FSL matrices

From: krishna.myneni@ccreweb.org (Krishna Myneni)
Newsgroups: comp.lang.forth
Subject: Re: Inefficiency of FSL matrices
Date: Sun, 31 Dec 2023 10:27:20 -0600
Organization: A noiseless patient Spider

On 12/18/23 18:48, Krishna Myneni wrote:
> On 12/18/23 07:37, Krishna Myneni wrote:
...
> I also introduced the word "*+" in the last commit in kForth-64.
>
> *+ ( a b c -- n )  \ n = a*b + c
>
> Note that *+ is not the same as the sequence "* +", ...
>
> I expect the floating point version, F*+, to be equally, if not more
> useful for improving readability of fp code and increasing efficiency.
>
> F*+ ( F: r1 r2 r3 -- r )  \ r = r1*r2 + r3
>
> F*+ provides an intrinsic scalar linear transformation. I have not yet
> added this word.
>

I need to correct my terminology in the prior statement. It is incorrect
to refer to the word F*+ (and its integer counterpart, *+) as a scalar
"linear transformation." They allow efficient calculation of linear
functions, of the form f(x) = a*x + b.

A "linear transformation" is something else. A function effecting a
linear transformation must have the property f(x=0) = 0, which is most
certainly not true in the general case for F*+ or *+.
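
As a short worked check of that property (added for illustration, using the same a and b as above):

```latex
f(x) = a x + b \;\Rightarrow\; f(0) = b,
\qquad
f(x + y) = a(x + y) + b,
\qquad
f(x) + f(y) = a(x + y) + 2b .
```

The last two expressions agree only when b = 0, so a*x + b is additive (and hence a linear transformation) only in that special case; for b /= 0 it is a linear function in the sense of a straight line, i.e. an affine map.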

--
Krishna

Re: Inefficiency of FSL matrices

From: minforth@gmx.net (minforth)
Newsgroups: comp.lang.forth
Subject: Re: Inefficiency of FSL matrices
Date: Tue, 2 Jan 2024 13:08:48 +0000
Organization: novaBBS

Krishna Myneni wrote:
> I need to correct my terminology in the prior statement. It is incorrect
> to refer to the word F*+ (and its integer counterpart, *+) as a scalar
> "linear transformation." They allow efficient calculation of linear
> functions, of the form f(x) = a*x + b.

> A "linear transformation" is something else. A function effecting a
> linear transformation must have the property f(x=0) = 0, which is most
> certainly not true in the general case for F*+ or *+.

Mathematically correct. FMA or F*+ is used row-wise or column-wise within
many matrix operations, e.g. Gaussian elimination.

Unsurprisingly, many modern CPUs even support vectorized FMA.
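
A sketch of such a row update in standard Forth follows. The F*+ defined here is only a stand-in with the stack effect given earlier in the thread: it multiplies and adds in two separately rounded steps and is not a fused operation. FAXPY and the variables X and Y are hypothetical names:

```forth
\ Stand-in for F*+ ( F: r1 r2 r3 -- r ), r = r1*r2 + r3 (not fused).
: f*+ ( F: r1 r2 r3 -- r )  frot frot f* f+ ;

\ y[k] := a*x[k] + y[k] for k = 0..n-1, the inner step of a
\ Gaussian-elimination row operation.
: faxpy ( x-addr y-addr n -- ) ( F: a -- )
  0 ?do
    fdup  over f@  dup f@   \ F: a a x[k] y[k]
    f*+  dup f!             \ store a*x[k] + y[k] back into y[k]
    float+ swap float+ swap \ advance both pointers
  loop
  2drop fdrop ;

\ One-element demonstration: y becomes 2*3 + 10.
fvariable x   3e x f!
fvariable y  10e y f!
2e x y 1 faxpy
```

After the last line, `y f@` yields 16. A system with a native F*+ or a hardware FMA could replace the stand-in without changing FAXPY.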

Re: SET-OPTIMIZER etc. (was: Inefficiency of FSL matrices)

From: krishna.myneni@ccreweb.org (Krishna Myneni)
Newsgroups: comp.lang.forth
Subject: Re: SET-OPTIMIZER etc. (was: Inefficiency of FSL matrices)
Date: Tue, 2 Jan 2024 20:11:37 -0600
Organization: A noiseless patient Spider

On 12/29/23 02:21, Anton Ertl wrote:
> Krishna Myneni <krishna.myneni@ccreweb.org> writes:
>> My impression is that, in Gforth, you implement xt for a word as follows:
>>
>> { xt-interp, xt-compile [,xt-opt] }
>>
>> and SET-OPTIMIZER allows you to specify xt-opt.
>
> Actually, Gforth has 8 header methods (method selectors) and 9 setter
> words for specifying the method implementation for a method selector
> in the current word:
>
> execute        ( ... xt -- ... )    set-execute   ( code-address -- )
> execute        ( ... xt -- ... )    set-does>     ( xt -- )
> compile,       ( xt -- )            set-optimizer ( xt -- )
> name>interpret ( nt -- xt )         set->int      ( xt -- )
> name>compile   ( nt -- xt1 xt2 )    set->comp     ( xt -- )
> (to)           ( val xt -- )        set-to        ( xt -- )
> defer@         ( xt1 -- xt2 )       set-defer@    ( xt -- )
> name>string    ( nt -- c-addr u )   set->string   ( xt -- )
> name>link      ( nt1 -- nt2|0 )     set->link     ( xt -- )
>
> In particular, you change the compilation semantics by specifying the
> NAME>COMPILE method implementation, which is somewhat different from
> providing xt-compsem (your xt-compile). Based on that, we also
> provide the convenience words:
>
> set-compsem ( xt -- )
> compsem: ( -- ) \ dedicated to Stephen Pelc
> intsem: ( -- ) \ special dedication to Stephen Pelc
> interpret/compile: ( xt-int xt-comp "name" -- )
>
> Usage examples, all equivalent:
>
> :noname ." compiling" ;
> : foo ." interpreting" ; set-compsem
>
> : foo ." interpreting" ; compsem: ." compiling" ;
> : foo ." compiling" ; intsem: ." interpreting" ;
>
> : foo-int ." interpreting" ;
> : foo-comp ." compiling" ;
> ' foo-int ' foo-comp interpret/compile: foo
>
> Note that several people (including me) have recommended to define,
> for every dual-semantics word like FOO, also FOO-INT and FOO-COMP.
>
> Usage:
> 9 interpret/compile:
> 1 set-compsem (in the definition of COMPSEM:)
> 1 compsem:
> 0 intsem:
> 23 set-optimizer (and there are other, derived words)
>
> Read all about it in:
>
> http://www.euroforth.org/ef19/papers/paysan.pdf
>
> Since then, we have moved nt and xt to point to the body.
>
> You can see the header methods (except the code field) with
>
> .hm ( nt -- )
>
> E.g., for seeing the code generator of "!", you say
>
> ``! .hm
>
> Let's compare the header methods of "!", ":", and "TO" and the
> constant K-F1:
>
>          ``! .hm            ``: .hm            ``to .hm      ``k-f1 .hm
> opt:     peephole-compile,  :,                 :,            constant,
> to:      no-to              no-to              no-to         no-to
> extra:   $0                 $0                 $0            $0
> >int:    noop               noop               a>int         noop
> >comp:   default-name>comp  default-name>comp  i/c>comp      default-name>comp
> >string: named>string       named>string       named>string  named>string
> >link:   named>link         named>link         named>link    named>link
>
> The "extra:" field is used for SET-DOES>.
>
>> Perhaps some pseudo-code
>> for INTERPRET will be helpful in understanding how compilation occurs in
>> Gforth.
>
> For dictionary words, what happens is, in essence:
>
> parse-name find-name dup 0= #-13 and throw ( nt )
> state @ if
>    name>compile
> else
>    name>interpret
> then
> execute
>
> You won't find it in this form in the current text interpreter,
> because the text interpreter is now written to cover all kinds of
> recognizers.
>
> What may also be interesting to you is what happens then: For words
> with default interpretation and compilation semantics (most words),
> the NAME>COMPILE implementation is (simplified)
>
> : default-name>comp ( nt -- xt1 xt2 )
>    name>interpret ['] compile, ;
>
> For an immediate word, the NAME>COMPILE implementation is:
>
> : imm>comp ( nt -- xt1 xt2 )
>    name>interpret ['] execute ;
>
> Not just optimization, but all code generation happens in the
> implementations of COMPILE,. E.g., ":,", "CONSTANT," and
> "PEEPHOLE-COMPILE," are (simplified):
>
> : :, ( xt -- ) >body ['] call peephole-compile, , ;
>
> : constant, ( xt -- ) >body @ postpone literal ;
>
> : peephole-compile, ( xt -- )
>      \ compile xt, appending its code to the current dynamic superinstruction
>      lits, here swap , compile-prim1 ;
>
> LITS, ensures that any literals on the literal stack are compiled
> before the primitive (this is part of the constant folding
> implementation), and COMPILE-PRIM1 ( addr -- ) is in the C part of
> Gforth; in gforth-fast, it performs stack caching, combines primitives
> into superinstructions, and performs native-code generation if these
> optimizations are enabled (they are by default), but at the very
> least, turns the code-field address (ITC) into a code address (DTC).
>
> Note that in the gforth and gforth-fast engines, "," alone does not
> work for compiling a word, because these engines use hybrid
> direct/indirect threaded code, which requires primitive-centric code
> for colon definitions: In a colon definition, every word is compiled
> to a primitive, possibly followed by an immediate argument; e.g., a
> colon definition is compiled to the primitive CALL followed by the
> address of the called word. See:
>
> @InProceedings{ertl02,
> author = {M. Anton Ertl},
> title = {Threaded Code Variations and Optimizations (Extended
> Version)},
> booktitle = {Forth-Tagung 2002},
> year = {2002},
> address = {Garmisch-Partenkirchen},
> url = {http://www.complang.tuwien.ac.at/papers/ertl02.ps.gz},
> abstract = {Forth has been traditionally implemented as indirect
> threaded code, where the code for non-primitives is
> the code-field address of the word. To get the
> maximum benefit from combining sequences of
> primitives into superinstructions, the code produced
> for a non-primitive should be a primitive followed
> by a parameter (e.g., \code{lit} \emph{addr} for
> variables). This paper takes a look at the steps
> from a traditional threaded-code implementation to
> superinstructions, and at the size and speed effects
> of the various steps.\comment{It also compares these
> variants of Gforth to various other Forth
> implementations on contemporary machines.} The use
> of superinstructions gives speedups of up to a
> factor of 2 on large benchmarks on processors with
> branch target buffers, but requires more space for
> the primitives and the optimization tables, and also
> a little more space for the threaded code.}
> }
>
> - anton

Thank you for the details. There is a good deal to absorb here.

--
Krishna

Re: SET-OPTIMIZER etc.

<un2hnv$32oig$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=25876&group=comp.lang.forth#25876

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: dxforth@gmail.com (dxf)
Newsgroups: comp.lang.forth
Subject: Re: SET-OPTIMIZER etc.
Date: Wed, 3 Jan 2024 13:45:19 +1100
Organization: A noiseless patient Spider
Lines: 178
Message-ID: <un2hnv$32oig$1@dont-email.me>
References: <uldipg$12lpv$1@dont-email.me> <ulhpd7$1uifl$1@dont-email.me>
<ulog1e$3bbc1$1@dont-email.me>
<76dbb68b1eed8a9242ebab77e2f362d9@news.novabbs.com>
<ulpg1t$3gktg$1@dont-email.me> <2023Dec19.174341@mips.complang.tuwien.ac.at>
<ultadd$8ai8$1@dont-email.me> <2023Dec22.151432@mips.complang.tuwien.ac.at>
<umlhqr$nkl6$1@dont-email.me> <2023Dec29.092151@mips.complang.tuwien.ac.at>
<un2fop$2uni1$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 3 Jan 2024 02:45:21 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="79bac741e1e47902726d1b6667a722d4";
logging-data="3236432"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX183gQvIHRdmJxmlGkQiNN9y"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:YddDeAOJnDY5REy4SlKz+9J26mo=
Content-Language: en-GB
In-Reply-To: <un2fop$2uni1$1@dont-email.me>
 by: dxf - Wed, 3 Jan 2024 02:45 UTC

On 3/01/2024 1:11 pm, Krishna Myneni wrote:
> On 12/29/23 02:21, Anton Ertl wrote:
>> Krishna Myneni <krishna.myneni@ccreweb.org> writes:
>>> My impression is that, in Gforth, you implement xt for a word as follows:
>>>
>>> { xt-interp, xt-compile [,xt-opt] }
>>>
>>> and SET-OPTIMIZER allows you to specify xt-opt.
>>
>> Actually, Gforth has 8 header methods (method selectors) and 9 setter
>> words for specifying the method implementation for a method selector
>> in the current word:
>>
>> execute ( ... xt -- ... )      set-execute   ( code-address -- )
>> execute ( ... xt -- ... )      set-does>     ( xt -- )
>> compile, ( xt -- )             set-optimizer ( xt -- )
>> name>interpret ( nt -- xt )    set->int      ( xt -- )
>> name>compile ( nt -- xt1 xt2 ) set->comp     ( xt -- )
>> (to) ( val xt -- )             set-to        ( xt -- )
>> defer@ ( xt1 -- xt2 )          set-defer@    ( xt -- )
>> name>string ( nt -- c-addr u ) set->string   ( xt -- )
>> name>link ( nt1 -- nt2|0 )     set->link     ( xt -- )
>>
>> In particular, you change the compilation semantics by specifying the
>> NAME>COMPILE method implementation, which is somewhat different from
>> providing xt-compsem (your xt-compile).  Based on that, we also
>> provide the convenience words:
>>
>> set-compsem ( xt -- )
>> compsem: ( -- )                  \ dedicated to Stephen Pelc
>> intsem:  ( -- )                  \ special dedication to Stephen Pelc
>> interpret/compile: ( xt-int xt-comp "name" -- )
>>
>> Usage examples, all equivalent:
>>
>> :noname ." compiling" ;
>> : foo ." interpreting" ; set-compsem
>>
>> : foo ." interpreting" ; compsem: ." compiling" ;
>> : foo ." compiling" ;     intsem: ." interpreting" ;
>>
>> : foo-int  ." interpreting" ;
>> : foo-comp ." compiling" ;
>> ' foo-int ' foo-comp interpret/compile: foo
>>
>> Note that several people (including me) have recommended to define,
>> for every dual-semantics word like FOO, also FOO-INT and FOO-COMP.
>>
>> Usage:
>>   9 interpret/compile:
>>   1 set-compsem: (in the definition of COMPSEM:)
>>   1 compsem:
>>   0 intsem:
>> 23 set-optimizer (and there are other, derived words)
>>
>> Read all about it in:
>>
>> http://www.euroforth.org/ef19/papers/paysan.pdf
>>
>> Since then, we have moved nt and xt to point to the body.
>>
>> You can see the header methods (except the code field) with
>>
>> .hm ( nt -- )
>>
>> E.g., for seeing the code generator of "!", you say
>>
>> ``! .hm
>>
>> Let's compare the header methods of "!", ":", and "TO" and the
>> constant K-F1:
>>
>>           ``! .hm           ``: .hm           ``to .hm     ``k-f1 .hm
>> opt:     peephole-compile, :,                :,           constant,
>> to:      no-to             no-to             no-to        no-to
>> extra:   $0                $0                $0           $0
>> >int:    noop              noop              a>int        noop
>> >comp:   default-name>comp default-name>comp i/c>comp     default-name>comp
>> >string: named>string      named>string      named>string named>string
>> >link:   named>link        named>link        named>link   named>link
>>
>> The "extra:" field is used for SET-DOES>.
>>
>>> Perhaps some pseudo-code
>>> for INTERPRET will be helpful in understanding how compilation occurs in
>>> Gforth.
>>
>> For dictionary words, what happens is, in essence:
>>
>> parse-name find-name dup 0= #-13 and throw ( nt )
>> state @ if
>>    name>compile
>> else
>>    name>interpret
>> then
>> execute
>>
>> You won't find it in this form in the current text interpreter,
>> because the text interpreter is now written to cover all kinds of
>> recognizers.
>>
>> What may also be interesting to you is what happens then: For words
>> with default interpretation and compilation semantics (most words),
>> the NAME>COMPILE implementation is (simplified)
>>
>> : default-name>comp ( nt -- xt1 xt2 )
>>    name>interpret ['] compile, ;
>>
>> For an immediate word, the NAME>COMPILE implementation is:
>>
>> : imm>comp ( nt -- xt1 xt2 )
>>    name>interpret ['] execute ;
>>
>> Not just optimization, but all code generation happens in the
>> implementations of COMPILE,.  E.g., ":,", "CONSTANT," and
>> "PEEPHOLE-COMPILE," are (simplified):
>>
>> : :, ( xt -- ) >body ['] call peephole-compile, , ;
>>
>> : constant, ( xt -- ) >body @ postpone literal ;
>>
>> : peephole-compile, ( xt -- )
>>      \ compile xt, appending its code to the current dynamic superinstruction
>>      lits, here swap , compile-prim1 ;
>>
>> LITS, ensures that any literals on the literal stack are compiled
>> before the primitive (this is part of the constant folding
>> implementation), and COMPILE-PRIM1 ( addr -- ) is in the C part of
>> Gforth; in gforth-fast, it performs stack caching, combines primitives
>> into superinstructions, and performs native-code generation if these
>> optimizations are enabled (they are by default), but at the very
>> least, turns the code-field address (ITC) into a code address (DTC).
>>
>> Note that in the gforth and gforth-fast engines, "," alone does not
>> work for compiling a word, because these engines use hybrid
>> direct/indirect threaded code, which requires primitive-centric code
>> for colon definitions: In a colon definition, every word is compiled
>> to a primitive, possibly followed by an immediate argument; e.g., a
>> colon definition is compiled to the primitive CALL followed by the
>> address of the called word.  See:
>>
>> @InProceedings{ertl02,
>>    author =     {M. Anton Ertl},
>>    title =     {Threaded Code Variations and Optimizations (Extended
>>                    Version)},
>>    booktitle =     {Forth-Tagung 2002},
>>    year =     {2002},
>>    address =     {Garmisch-Partenkirchen},
>>    url =          {http://www.complang.tuwien.ac.at/papers/ertl02.ps.gz},
>>    abstract =     {Forth has been traditionally implemented as indirect
>>                    threaded code, where the code for non-primitives is
>>                    the code-field address of the word. To get the
>>                    maximum benefit from combining sequences of
>>                    primitives into superinstructions, the code produced
>>                    for a non-primitive should be a primitive followed
>>                    by a parameter (e.g., \code{lit} \emph{addr} for
>>                    variables). This paper takes a look at the steps
>>                    from a traditional threaded-code implementation to
>>                    superinstructions, and at the size and speed effects
>>                    of the various steps.\comment{It also compares these
>>                    variants of Gforth to various other Forth
>>                    implementations on contemporary machines.} The use
>>                    of superinstructions gives speedups of up to a
>>                    factor of 2 on large benchmarks on processors with
>>                    branch target buffers, but requires more space for
>>                    the primitives and the optimization tables, and also
>>                    a little more space for the threaded code.}
>> }
>>
>> - anton
>
> Thank you for the details. There is a good deal to absorb here.


Re: SET-OPTIMIZER etc.

<05928adca947347c8ada4107fddb9ba8@news.novabbs.com>


https://news.novabbs.org/devel/article-flat.php?id=25878&group=comp.lang.forth#25878

Path: i2pn2.org!.POSTED!not-for-mail
From: minforth@gmx.net (minforth)
Newsgroups: comp.lang.forth
Subject: Re: SET-OPTIMIZER etc.
Date: Wed, 3 Jan 2024 13:42:26 +0000
Organization: novaBBS
Message-ID: <05928adca947347c8ada4107fddb9ba8@news.novabbs.com>
References: <uldipg$12lpv$1@dont-email.me> <ulhpd7$1uifl$1@dont-email.me> <ulog1e$3bbc1$1@dont-email.me> <76dbb68b1eed8a9242ebab77e2f362d9@news.novabbs.com> <ulpg1t$3gktg$1@dont-email.me> <2023Dec19.174341@mips.complang.tuwien.ac.at> <ultadd$8ai8$1@dont-email.me> <2023Dec22.151432@mips.complang.tuwien.ac.at> <umlhqr$nkl6$1@dont-email.me> <2023Dec29.092151@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2127847"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$q/0YFJlptVIy.5YBPApCH.YtP6PdyHTWxgdMt1p07V5gdOJk8b3Ge
X-Rslight-Posting-User: 0d6d33dbe0e2e1ff58b82acfc1a8a32ac3b1cb72
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: minforth - Wed, 3 Jan 2024 13:42 UTC

Anton Ertl wrote:
> Not just optimization, but all code generation happens in the
> implementations of COMPILE,. E.g., ":,", "CONSTANT," and
> "PEEPHOLE-COMPILE," are (simplified):

> : :, ( xt -- ) >body ['] call peephole-compile, , ;

> : constant, ( xt -- ) >body @ postpone literal ;

> : peephole-compile, ( xt -- )
>      \ compile xt, appending its code to the current dynamic superinstruction
>      lits, here swap , compile-prim1 ;

> LITS, ensures that any literals on the literal stack are compiled
> before the primitive (this is part of the constant folding
> implementation), and COMPILE-PRIM1 ( addr -- ) is in the C part of
> Gforth; in gforth-fast, it performs stack caching, combines primitives
> into superinstructions, and performs native-code generation if these
> optimizations are enabled (they are by default), but at the very
> least, turns the code-field address (ITC) into a code address (DTC).

For constant folding, superinstructions and peephole optimisation,
my compiler has a FIFO token queue for delayed compilation.
Optimisations are based on simple pattern recognition of the queue content.
When active, this is fully automatic and does not require complicated
contortions and a bag of special words.

Re: SET-OPTIMIZER etc.

<2024Jan4.123304@mips.complang.tuwien.ac.at>


https://news.novabbs.org/devel/article-flat.php?id=25883&group=comp.lang.forth#25883

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Subject: Re: SET-OPTIMIZER etc.
Date: Thu, 04 Jan 2024 11:33:04 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 99
Message-ID: <2024Jan4.123304@mips.complang.tuwien.ac.at>
References: <uldipg$12lpv$1@dont-email.me> <ulhpd7$1uifl$1@dont-email.me> <ulog1e$3bbc1$1@dont-email.me> <76dbb68b1eed8a9242ebab77e2f362d9@news.novabbs.com> <ulpg1t$3gktg$1@dont-email.me> <2023Dec19.174341@mips.complang.tuwien.ac.at> <ultadd$8ai8$1@dont-email.me> <2023Dec22.151432@mips.complang.tuwien.ac.at> <umlhqr$nkl6$1@dont-email.me> <2023Dec29.092151@mips.complang.tuwien.ac.at> <05928adca947347c8ada4107fddb9ba8@news.novabbs.com>
Injection-Info: dont-email.me; posting-host="8fb1d67506df1ec5fbbf30451372e7b6";
logging-data="3848356"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+D0PXNN1g3fllUf8WF7ee3"
Cancel-Lock: sha1:nOrqYqiVMg3oJPr/BkvGWbA+RaU=
X-newsreader: xrn 10.11
 by: Anton Ertl - Thu, 4 Jan 2024 11:33 UTC

minforth@gmx.net (minforth) writes:
>Anton Ertl wrote:
>> Not just optimization, but all code generation happens in the
>> implementations of COMPILE,. E.g., ":,", "CONSTANT," and
>> "PEEPHOLE-COMPILE," are (simplified):
>
>> : :, ( xt -- ) >body ['] call peephole-compile, , ;
>
>> : constant, ( xt -- ) >body @ postpone literal ;
>
>> : peephole-compile, ( xt -- )
>>      \ compile xt, appending its code to the current dynamic superinstruction
>>      lits, here swap , compile-prim1 ;
>
>> LITS, ensures that any literals on the literal stack are compiled
>> before the primitive (this is part of the constant folding
>> implementation), and COMPILE-PRIM1 ( addr -- ) is in the C part of
>> Gforth; in gforth-fast, it performs stack caching, combines primitives
>> into superinstructions, and performs native-code generation if these
>> optimizations are enabled (they are by default), but at the very
>> least, turns the code-field address (ITC) into a code address (DTC).
>
>For constant folding, superinstructions and peephole optimisation,
>my compiler has a FIFO token queue for delayed compilation.
>Optimisations are based on simple pattern recognition of the queue content.
>When active, this is fully automatic and does not require complicated
>contortions and a bag of special words.

No words for these optimizations in this compiler? So they are
written in a different language than Forth.

Gforth also has that, and COMPILE-PRIM1 is the interface to it. It
combines primitives into superinstructions, it optimizes stack
caching and it optimizes many instruction-pointer updates away. It
does not use a FIFO, but compiles a superblock at a time. It is quite
complicated (contorted?), but it does produce good results. You can
see the results by disabling the optimizations. Here are results on a
Ryzen 5800X.

 sieve bubble matrix   fib   fft
 0.037  0.035  0.013 0.031 0.021 gforth-fast
 0.058  0.061  0.034 0.054 0.020 gforth-fast --ss-states=0 --ss-number=0 --opt-ip-updates=0

Gforth implements constant folding and some related optimizations,
such as optimizing "2 PICK" to "THIRD", on the Forth level, and that,
of course, means that there is "a bag of special words". Matthias Koch
invented this approach. It uses a literal stack, and "LITERAL" pushes
its argument to that stack rather than compiling a literal.
"COMPILE," implementations of many words look at the literal stack and
perform constant folding if all arguments are constant, or sometimes
other optimizations (e.g., of 2 PICK) if some of the arguments are
constant.
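
As a rough illustration of the literal-stack idea, here is a toy sketch
in standard Forth. It is not Gforth's code: every name with a TOY prefix
is made up for this example, "code generation" is just printing, and
there is no overflow check on the 8-entry stack.

```forth
\ Toy literal stack (hypothetical names; not Gforth's implementation).
8 constant toy-max-lits
create toy-lits  toy-max-lits cells allot  \ pending literals, oldest first
variable toy-lits#   0 toy-lits# !         \ number of pending literals

: toy>lits  ( n -- )  toy-lits toy-lits# @ cells + !  1 toy-lits# +! ;
: toy-lits> ( -- n )  -1 toy-lits# +!  toy-lits toy-lits# @ cells + @ ;

\ Stand-in for code generation: print what would be compiled.
: toy-emit ( c-addr u -- )  type space ;

: toy-lits, ( -- )   \ flush all pending literals, oldest first
   toy-lits# @ 0 ?do
      s" LIT" toy-emit  toy-lits i cells + @ .
   loop
   0 toy-lits# ! ;

: toy-literal ( n -- )  toy>lits ;  \ defer the literal instead of compiling it

: toy-+, ( -- )      \ plays the role of the COMPILE, implementation of +
   toy-lits# @ 2 u>= if
      toy-lits> toy-lits> + toy>lits  exit  \ both operands known: fold
   then
   toy-lits,  s" +" toy-emit ;              \ otherwise flush, then emit +
```

With these definitions, "3 toy-literal 4 toy-literal toy-+, toy-lits,"
should display "LIT 7", while "5 toy-literal toy-+, toy-lits," should
display "LIT 5 +", because only one operand is known at that point.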

This is actually a simple and straightforward optimization, and has
the nice property that most of the optimization code is very local to
the word that is optimized. E.g., the 2 PICK optimization looks as
follows:

:noname ( xt -- )
    lits# 1 u>= if
        lits> case
            0 of postpone dup    drop exit endof
            1 of postpone over   drop exit endof
            2 of postpone third  drop exit endof
            3 of postpone fourth drop exit endof
            dup >lits
        endcase
    then
    peephole-compile, ;
optimizes pick

And note that DUP also has a constant-folding optimization, but that's
local to DUP. So if you actually compile

: foo 3 0 pick ;

you get the same code as with

: foo 3 3 ;

because first PICK is optimized to "COMPILE," DUP, and then DUP is
optimized. I doubt that this particular optimization sequence will
ever happen, but there have been other cases where I have been
positively surprised how far this simple optimization got us.
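
The run-time equivalence is easy to check on any standard system. The
sketch below uses the hypothetical names FOO-PICK and FOO-LIT so that
both variants can coexist. Whether the generated code is also identical
depends on the system; on Gforth one can compare the two with SEE.

```forth
\ Two definitions that must leave the same values on the stack.
: foo-pick ( -- n n )  3 0 pick ;  \ 0 PICK behaves like DUP
: foo-lit  ( -- n n )  3 3 ;

foo-pick . .   \ displays 3 3
foo-lit  . .   \ displays 3 3
```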

Anyway, back to the "special words". I guess you mean "LITS,". Yes,
there are some places where "LITS," has to be used; these are 4 places
in Gforth, three of them corresponding to code generation boundaries
(BEGIN, the start of a closure, and BASIC-BLOCK-END), and the use in
"PEEPHOLE-COMPILE," just before calling the lower-level compiler
COMPILE-PRIM1 shown above. If you look at it from a higher level
(e.g., at the level of COMPILE,), it's "fully automatic".

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023: https://euro.theforth.net/2023

Re: SET-OPTIMIZER etc.

<un7cs1$3q90b$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=25890&group=comp.lang.forth#25890

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: dxforth@gmail.com (dxf)
Newsgroups: comp.lang.forth
Subject: Re: SET-OPTIMIZER etc.
Date: Fri, 5 Jan 2024 09:52:49 +1100
Organization: A noiseless patient Spider
Lines: 75
Message-ID: <un7cs1$3q90b$1@dont-email.me>
References: <uldipg$12lpv$1@dont-email.me> <ulhpd7$1uifl$1@dont-email.me>
<ulog1e$3bbc1$1@dont-email.me>
<76dbb68b1eed8a9242ebab77e2f362d9@news.novabbs.com>
<ulpg1t$3gktg$1@dont-email.me> <2023Dec19.174341@mips.complang.tuwien.ac.at>
<ultadd$8ai8$1@dont-email.me> <2023Dec22.151432@mips.complang.tuwien.ac.at>
<umlhqr$nkl6$1@dont-email.me> <2023Dec29.092151@mips.complang.tuwien.ac.at>
<05928adca947347c8ada4107fddb9ba8@news.novabbs.com>
<2024Jan4.123304@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 4 Jan 2024 22:52:50 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c77590be1374dd279d838308c1a51cb0";
logging-data="4006923"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/V0Xfagb1uFm9G+i3WMBnC"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:tmo1JB7/5s2a2CtQoqj6+iUFvBk=
In-Reply-To: <2024Jan4.123304@mips.complang.tuwien.ac.at>
Content-Language: en-GB
 by: dxf - Thu, 4 Jan 2024 22:52 UTC

On 4/01/2024 10:33 pm, Anton Ertl wrote:
> minforth@gmx.net (minforth) writes:
>> Anton Ertl wrote:
>>> Not just optimization, but all code generation happens in the
>>> implementations of COMPILE,. E.g., ":,", "CONSTANT," and
>>> "PEEPHOLE-COMPILE," are (simplified):
>>
>>> : :, ( xt -- ) >body ['] call peephole-compile, , ;
>>
>>> : constant, ( xt -- ) >body @ postpone literal ;
>>
>>> : peephole-compile, ( xt -- )
>>>      \ compile xt, appending its code to the current dynamic superinstruction
>>>      lits, here swap , compile-prim1 ;
>>
>>> LITS, ensures that any literals on the literal stack are compiled
>>> before the primitive (this is part of the constant folding
>>> implementation), and COMPILE-PRIM1 ( addr -- ) is in the C part of
>>> Gforth; in gforth-fast, it performs stack caching, combines primitives
>>> into superinstructions, and performs native-code generation if these
>>> optimizations are enabled (they are by default), but at the very
>>> least, turns the code-field address (ITC) into a code address (DTC).
>>
>> For constant folding, superinstructions and peephole optimisation,
>> my compiler has a FIFO token queue for delayed compilation.
>> Optimisations are based on simple pattern recognition of the queue content.
>> When active, this is fully automatic and does not require complicated
>> contortions and a bag of special words.
>
> No words for these optimizations in this compiler? So they are
> written in a different language than Forth.
>
> Gforth also has that, and COMPILE-PRIM1 is the interface to it. It
> combines primitives into superinstructions, it optimizes stack
> caching and it optimizes many instruction-pointer updates away. It
> does not use a FIFO, but compiles a superblock at a time. It is quite
> complicated (contorted?), but it does produce good results. You can
> see the results by disabling the optimizations. Here are results on a
> Ryzen 5800X.
>
>  sieve bubble matrix   fib   fft
>  0.037  0.035  0.013 0.031 0.021 gforth-fast
>  0.058  0.061  0.034 0.054 0.020 gforth-fast --ss-states=0 --ss-number=0 --opt-ip-updates=0
>
> Gforth implements constant folding and some related optimizations,
> such as optimizing "2 PICK" to "THIRD" on the Forth level, and that,
> of course means that there is "a bag of special words". Matthias Koch
> invented this approach. It uses a literal stack, and "LITERAL" pushes
> its argument to that stack rather than compiling a literal.
> "COMPILE," implementations of many words look at the literal stack and
> perform constant folding if all arguments are constant, or sometimes
> other optimizations (e.g., of 2 PICK) if some of the arguments are
> constant.
>
> This is actually a simple and straightforward optimization, and has
> the nice property that most of the optimization code is very local to
> the word that is optimized. E.g., the 2 PICK optimization looks as
> follows:
>
> :noname ( xt -- )
>     lits# 1 u>= if
>         lits> case
>             0 of postpone dup    drop exit endof
>             1 of postpone over   drop exit endof
>             2 of postpone third  drop exit endof
>             3 of postpone fourth drop exit endof
>             dup >lits
>         endcase
>     then
>     peephole-compile, ;
> optimizes pick

What this demonstrates is how badly ANS CASE needs optimization.
PICK not so much.

Re: SET-OPTIMIZER etc.

<2024Jan5.122715@mips.complang.tuwien.ac.at>


https://news.novabbs.org/devel/article-flat.php?id=25892&group=comp.lang.forth#25892

Path: i2pn2.org!i2pn.org!nntp.comgw.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Subject: Re: SET-OPTIMIZER etc.
Date: Fri, 05 Jan 2024 11:27:15 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 46
Message-ID: <2024Jan5.122715@mips.complang.tuwien.ac.at>
References: <uldipg$12lpv$1@dont-email.me> <ulog1e$3bbc1$1@dont-email.me> <76dbb68b1eed8a9242ebab77e2f362d9@news.novabbs.com> <ulpg1t$3gktg$1@dont-email.me> <2023Dec19.174341@mips.complang.tuwien.ac.at> <ultadd$8ai8$1@dont-email.me> <2023Dec22.151432@mips.complang.tuwien.ac.at> <umlhqr$nkl6$1@dont-email.me> <2023Dec29.092151@mips.complang.tuwien.ac.at> <05928adca947347c8ada4107fddb9ba8@news.novabbs.com> <2024Jan4.123304@mips.complang.tuwien.ac.at> <un7cs1$3q90b$1@dont-email.me>
Injection-Info: dont-email.me; posting-host="db83b9ed50d0c79ff10ac9c13ee0eb86";
logging-data="157570"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19JYL9kP6hYO0gmwodjXV6w"
Cancel-Lock: sha1:BHOadC+uNkHPFN1ZQdTv1bE3Q9Y=
X-newsreader: xrn 10.11
 by: Anton Ertl - Fri, 5 Jan 2024 11:27 UTC

dxf <dxforth@gmail.com> writes:
>On 4/01/2024 10:33 pm, Anton Ertl wrote:
>> :noname ( xt -- )
>>     lits# 1 u>= if
>>         lits> case
>>             0 of postpone dup    drop exit endof
>>             1 of postpone over   drop exit endof
>>             2 of postpone third  drop exit endof
>>             3 of postpone fourth drop exit endof
>>             dup >lits
>>         endcase
>>     then
>>     peephole-compile, ;
>> optimizes pick
>
>What this demonstrates is how badly ANS CASE needs optimization.
>PICK not so much.

What makes you think so?

Sure, this could be rewritten as:

create npicks ' dup , ' over , ' third , ' fourth ,

:noname ( xt -- )
    lits# 1 u>= if
        lits> dup 4 u< if
            cells npicks + @ compile, drop exit then
        >lits then
    peephole-compile, ;
optimizes pick

and the result would be slightly faster when it is used, but PICK is
not compiled that often, so why optimize it? Especially given that,
if the optimization was really useful, one could just write the latter
code.

Another question in this context is which version is easier to write
correctly and easier to read and understand.
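
For readers who want to compare the two styles outside the compiler,
here is a run-time analogue in standard Forth. It is only a sketch: it
uses EXECUTE at run time where the code above uses COMPILE, at compile
time, and MY-THIRD, MY-FOURTH, PICK-TABLE, TABLE-PICK and CASE-PICK are
names invented for this example.

```forth
\ Helpers, defined here so the sketch does not depend on system-specific words.
: my-third  ( x2 x1 x0 -- x2 x1 x0 x2 )        2 pick ;
: my-fourth ( x3 x2 x1 x0 -- x3 x2 x1 x0 x3 )  3 pick ;

\ Table-driven version: index a table of xts, fall back to PICK.
create pick-table  ' dup ,  ' over ,  ' my-third ,  ' my-fourth ,

: table-pick ( xu ... x0 u -- xu ... x0 xu )
   dup 4 u< if  cells pick-table + @ execute  exit then
   pick ;

\ CASE version: one clause per small index, default clause for the rest.
: case-pick ( xu ... x0 u -- xu ... x0 xu )
   case
      0 of dup       endof
      1 of over      endof
      2 of my-third  endof
      3 of my-fourth endof
      >r r@ pick r>   \ default: ENDCASE drops the selector left on top
   endcase ;
```

Both words should behave like PICK for any valid index, so
"10 20 30 40 3 table-pick ." and "10 20 30 40 3 case-pick ." should each
display 10.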

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023: https://euro.theforth.net/2023

Re: SET-OPTIMIZER etc.

<unaf0p$b7ql$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=25899&group=comp.lang.forth#25899

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: dxforth@gmail.com (dxf)
Newsgroups: comp.lang.forth
Subject: Re: SET-OPTIMIZER etc.
Date: Sat, 6 Jan 2024 13:47:53 +1100
Organization: A noiseless patient Spider
Lines: 50
Message-ID: <unaf0p$b7ql$1@dont-email.me>
References: <uldipg$12lpv$1@dont-email.me> <ulog1e$3bbc1$1@dont-email.me>
<76dbb68b1eed8a9242ebab77e2f362d9@news.novabbs.com>
<ulpg1t$3gktg$1@dont-email.me> <2023Dec19.174341@mips.complang.tuwien.ac.at>
<ultadd$8ai8$1@dont-email.me> <2023Dec22.151432@mips.complang.tuwien.ac.at>
<umlhqr$nkl6$1@dont-email.me> <2023Dec29.092151@mips.complang.tuwien.ac.at>
<05928adca947347c8ada4107fddb9ba8@news.novabbs.com>
<2024Jan4.123304@mips.complang.tuwien.ac.at> <un7cs1$3q90b$1@dont-email.me>
<2024Jan5.122715@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 6 Jan 2024 02:47:53 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="3d0c35f47e379885954fdc77244a0f74";
logging-data="368469"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/nP0o/uRumZVvQszQgQzxx"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:tcHbtf6OA1eJXSupwb/Xb1Dq6fs=
Content-Language: en-GB
In-Reply-To: <2024Jan5.122715@mips.complang.tuwien.ac.at>
 by: dxf - Sat, 6 Jan 2024 02:47 UTC

On 5/01/2024 10:27 pm, Anton Ertl wrote:
> dxf <dxforth@gmail.com> writes:
>> On 4/01/2024 10:33 pm, Anton Ertl wrote:
>>> :noname ( xt -- )
>>>     lits# 1 u>= if
>>>         lits> case
>>>             0 of postpone dup    drop exit endof
>>>             1 of postpone over   drop exit endof
>>>             2 of postpone third  drop exit endof
>>>             3 of postpone fourth drop exit endof
>>>             dup >lits
>>>         endcase
>>>     then
>>>     peephole-compile, ;
>>> optimizes pick
>>
>> What this demonstrates is how badly ANS CASE needs optimization.
>> PICK not so much.
>
> What makes you think so?

Is it not obvious to all that ENDOF jumps in the above are compiled but
never used? I'm curious. What is it about ANS-FORTH that users and
implementers alike have become so uncritical?

> Sure, this could be rewritten as:
>
> create npicks ' dup , ' over , ' third , ' fourth ,
>
> :noname ( xt -- )
>     lits# 1 u>= if
>         lits> dup 4 u< if
>             cells npicks + @ compile, drop exit then
>         >lits then
>     peephole-compile, ;
> optimizes pick
>
> and the result would be slightly faster when it is used, but PICK is
> not compiled that often, so why optimize it? Especially given that,
> if the optimization was really useful, one could just write the latter
> code.

Don't make a computer do what intelligence will do anyway.

> Another question in this context is which version is easier to write
> correctly and easier to read and understand.

When the cases are 0, 1, 2, 3, etc., execution tables are easy to write
and the overall code is minimal. But such is not often the case.

Re: SET-OPTIMIZER etc.

<nnd$0c989229$20f5c9f1@18a5e548ff366b9e>


https://news.novabbs.org/devel/article-flat.php?id=25904&group=comp.lang.forth#25904

Newsgroups: comp.lang.forth
References: <uldipg$12lpv$1@dont-email.me> <un7cs1$3q90b$1@dont-email.me> <2024Jan5.122715@mips.complang.tuwien.ac.at> <unaf0p$b7ql$1@dont-email.me>
Subject: Re: SET-OPTIMIZER etc.
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
From: albert@cherry (none)
Originator: albert@cherry.(none) (albert)
Message-ID: <nnd$0c989229$20f5c9f1@18a5e548ff366b9e>
Organization: KPN B.V.
Date: Sat, 06 Jan 2024 14:14:33 +0100
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!feed.abavia.com!abe005.abavia.com!abp002.abavia.com!news.kpn.nl!not-for-mail
Lines: 134
Injection-Date: Sat, 06 Jan 2024 14:14:33 +0100
Injection-Info: news.kpn.nl; mail-complaints-to="abuse@kpn.com"
 by: none - Sat, 6 Jan 2024 13:14 UTC

In article <unaf0p$b7ql$1@dont-email.me>, dxf <dxforth@gmail.com> wrote:
>On 5/01/2024 10:27 pm, Anton Ertl wrote:
>> dxf <dxforth@gmail.com> writes:
>>> On 4/01/2024 10:33 pm, Anton Ertl wrote:
>>>> :noname ( xt -- )
>>>> lits# 1 u>= if
>>>> lits> case
>>>> 0 of postpone dup drop exit endof
>>>> 1 of postpone over drop exit endof
>>>> 2 of postpone third drop exit endof
>>>> 3 of postpone fourth drop exit endof
>>>> dup >lits
>>>> endcase
>>>> then
>>>> peephole-compile, ;
>>>> optimizes pick
>>>
>>> What this demonstrates is how badly ANS CASE needs optimization.
>>> PICK not so much.
>>
>> What makes you think so?
>
>Is it not obvious to all that ENDOF jumps in the above are compiled but
>never used? I'm curious. What is it about ANS-FORTH that users and
>implementers alike have become so uncritical?
>
>> Sure, this could be rewritten as:
>>
>> create npicks ' dup , ' over , ' third , ' fourth ,
>>
>> :noname ( xt -- )
>> lits# 1 u>= if
>> lits> dup 4 u< if
>> cells npicks + @ compile, exit then
>> >lits then
>> peephole-compile, ;
>> optimizes pick
>>
>> and the result would be slightly faster when it is used, PICK is
>> compiled not that often, so why optimize it? Especially given that,
>> if the optimization was really useful, one could just write the latter
>> code.
>
>Don't make a computer do what intelligence will do anyway.
>
>> Another question in this context is which version is easier to write
>> correctly and easier to read and understand.
>
>When the cases are 0 1 2 3 etc execution tables are easy to write and
>overall code is minimal. But such is not often the case.

An optimiser for each word? IMHO this is peephole optimisation gone too far.
I prefer an optimisation table.

Adding
"
0 PICK | DUP |
1 PICK | OVER |
2 PICK | ROT DUP >R ROT ROT R> |
"
is easy enough.
Is it worth it?
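
As a rough illustration of the table idea, the sketch below stores (literal, word, replacement) triples and looks one up. It is a simplified model in standard Forth, not the format used by the optimiser quoted below; RULES, RULES-END, MATCH and THIRD are names invented for the example.

```forth
\ Replacement for 2 PICK, using the sequence from the table above.
: third ( a b c -- a b c a )  rot dup >r rot rot r> ;

\ Each rule is three cells: literal, xt to match, replacement xt.
create rules
  0 , ' pick , ' dup ,
  1 , ' pick , ' over ,
  2 , ' pick , ' third ,
here constant rules-end

\ Look up the pair (n, xt). On a hit, return the replacement xt and true;
\ on a miss, leave n and xt in place and return false.
: match ( n xt -- xt' true | n xt false )
  rules-end rules ?do
    over i @ =  over i cell+ @ =  and if
      2drop  i 2 cells + @  true  unloop exit
    then
  3 cells +loop
  false ;
```

A real table would match sequences of arbitrary length; a single literal followed by a single word is the smallest case that covers the three PICK rules.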

From my optimiser:

: (MATCH-TABLE) |
\ `` MATCH-TABLE'' points here :
'P EXECUTE | P | \ Execute optimisation
'P + 'P + | 'P 'P + + | \ Associativity optimisation
'P 'P D+ 'P 'P D+ | 'P 'P 'P 'P D+ D+ | \ Associativity optimisation
'P + 'P - | 'P 'P - + |
'P - 'P + | 'P 'P - - |
'P - 'P - | 'P 'P + - |
'P @ 'P ! | NOOP |
'P ! 'P @ | DUP 'P ! |
'P M* DROP 'P M* DROP | 'P 'P M* DROP M* DROP | \ Invalid if last drop removed!
'P OR 'P OR | 'P 'P OR OR |
'P AND 'P AND | 'P 'P AND AND |
'P XOR 'P XOR | 'P 'P XOR XOR |
[ 0 ]L + | NOOP | \ Shortcut evaluations
[ 0 ]L - | NOOP |
[ 0 ]L M* DROP | DROP 0 |
[ 0 ]L OR | NOOP |
[ 0 ]L AND | DROP 0 |
[ 0 ]L XOR | NOOP |
[ 1 ]L M* DROP | NOOP |
[ 1 ]L / | NOOP |
[ -1 ]L M* DROP | NEGATE |
[ -1 ]L / | NEGATE |
[ -1 ]L OR | DROP -1 |
[ -1 ]L AND | NOOP |
[ -1 ]L XOR | INVERT |
'P LSHIFT 'P LSHIFT | 'P 'P + LSHIFT | \ Distributivity optimisation
'P RSHIFT 'P RSHIFT | 'P 'P + RSHIFT |
[ 0 ]L 0BRANCH [ 'P , ] | NOP1 NOP1 BRANCH [ 'P , ] | \ Branch optimisation
'P 0BRANCH [ 'P , ] | NOOP | \ Non-zero, zero is matched by previous
BRANCH [ 0 , ] | NOOP |
0BRANCH [ 0 , ] | DROP |
< 0= | 1- > |
> 0= | 1+ < |
>R R> | NOOP |
R> >R | NOOP |
;

This is on top of generalised folding and before inspecting the
machine code.
The machine code matcher relies heavily on my ciasdis assembler

<! !Q LEA, XO| !!T 0 {L,} ~!!T
!Q LEA, XO| !!T 0 {L,} ~!!T !>
<A QX: !TALLY LEA, XO| 0 L, !TALLY A>
{ bufv L@ $FFFFFF AND bufc OR!U
bufv 3 + L@ bufv 10 + L@ + bufc 3 + L! }
optimisation lealea-pattern

This generates an optimisation object, with a match pattern,
a replace pattern and a replacing xt.
This is definitely ugly. In view of the special treatment of registers on
the x86, I am on the verge of giving up.
Making a RISC-V optimiser is probably a much more fruitful endeavour.

(The optimised version of the original byte sieve benchmark comes
within 10/20 percent of the VFX and SwiftForth versions, but this is unfair.)

Groetjes Albert

--
Don't praise the day before the evening. One swallow doesn't make spring.
You must not say "hey" before you have crossed the bridge. Don't sell the
hide of the bear until you shot it. Better one bird in the hand than ten in
the air. First gain is a cat spinning. - the Wise from Antrim -

Re: SET-OPTIMIZER etc.

<495126c461e2473a28fe1a1a8cfa39f8@news.novabbs.com>

https://news.novabbs.org/devel/article-flat.php?id=25907&group=comp.lang.forth#25907

Path: i2pn2.org!.POSTED!not-for-mail
From: sdwjack69@gmail.com (sjack)
Newsgroups: comp.lang.forth
Subject: Re: SET-OPTIMIZER etc.
Date: Sat, 6 Jan 2024 14:09:45 +0000
Organization: novaBBS
Message-ID: <495126c461e2473a28fe1a1a8cfa39f8@news.novabbs.com>
References: <uldipg$12lpv$1@dont-email.me> <ulog1e$3bbc1$1@dont-email.me> <76dbb68b1eed8a9242ebab77e2f362d9@news.novabbs.com> <ulpg1t$3gktg$1@dont-email.me> <2023Dec19.174341@mips.complang.tuwien.ac.at> <ultadd$8ai8$1@dont-email.me> <2023Dec22.151432@mips.complang.tuwien.ac.at> <umlhqr$nkl6$1@dont-email.me> <2023Dec29.092151@mips.complang.tuwien.ac.at> <05928adca947347c8ada4107fddb9ba8@news.novabbs.com> <2024Jan4.123304@mips.complang.tuwien.ac.at> <un7cs1$3q90b$1@dont-email.me> <2024Jan5.122715@mips.complang.tuwien.ac.at> <unaf0p$b7ql$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2487707"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Spam-Checker-Version: SpamAssassin 4.0.0
X-Rslight-Posting-User: 38a45832bc9b44da212ffa4812b707acd0ff1b7c
X-Rslight-Site: $2y$10$QPby8TahvGxa3nwFFhV9IeZVCZmGby59LBMXCs.rEwVBzvIQV3GEO
 by: sjack - Sat, 6 Jan 2024 14:09 UTC

dxf wrote:

> never used? I'm curious. What is it about ANS-FORTH that users and
> implementers alike have become so uncritical?

"to be prickly in small matters is the nature of a hedgehog"
Nietzsche

Re: SET-OPTIMIZER etc.

<2024Jan6.173536@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=25910&group=comp.lang.forth#25910

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Subject: Re: SET-OPTIMIZER etc.
Date: Sat, 06 Jan 2024 16:35:36 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 48
Message-ID: <2024Jan6.173536@mips.complang.tuwien.ac.at>
References: <uldipg$12lpv$1@dont-email.me> <ulpg1t$3gktg$1@dont-email.me> <2023Dec19.174341@mips.complang.tuwien.ac.at> <ultadd$8ai8$1@dont-email.me> <2023Dec22.151432@mips.complang.tuwien.ac.at> <umlhqr$nkl6$1@dont-email.me> <2023Dec29.092151@mips.complang.tuwien.ac.at> <05928adca947347c8ada4107fddb9ba8@news.novabbs.com> <2024Jan4.123304@mips.complang.tuwien.ac.at> <un7cs1$3q90b$1@dont-email.me> <2024Jan5.122715@mips.complang.tuwien.ac.at> <unaf0p$b7ql$1@dont-email.me>
Injection-Info: dont-email.me; posting-host="444defe7ba9ec3dff1c2e1e9c29c85d7";
logging-data="727605"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18JWODyuwQRalCVKdjHl5iw"
Cancel-Lock: sha1:eGSu2ok+LMhCHceGFvJ5lOEkPtY=
X-newsreader: xrn 10.11
 by: Anton Ertl - Sat, 6 Jan 2024 16:35 UTC

dxf <dxforth@gmail.com> writes:
>On 5/01/2024 10:27 pm, Anton Ertl wrote:
>> dxf <dxforth@gmail.com> writes:
>>> On 4/01/2024 10:33 pm, Anton Ertl wrote:
>>>> :noname ( xt -- )
>>>> lits# 1 u>= if
>>>> lits> case
>>>> 0 of postpone dup drop exit endof
>>>> 1 of postpone over drop exit endof
>>>> 2 of postpone third drop exit endof
>>>> 3 of postpone fourth drop exit endof
>>>> dup >lits
>>>> endcase
>>>> then
>>>> peephole-compile, ;
>>>> optimizes pick
>>>
>>> What this demonstrates is how badly ANS CASE needs optimization.
>>> PICK not so much.
>>
>> What makes you think so?
>
>Is it not obvious to all that ENDOF jumps in the above are compiled but
>never used?

No. It depends on the Forth system. This optimization would be
relatively easy to implement (see below), but I never found it
worthwhile.

How to implement it? When performing the compilation semantics of
EXIT, AHEAD or AGAIN, just set a flag DEAD-CODE, and clear that flag
when performing the compilation semantics of THEN or BEGIN, or when
starting a new definition. Let the code-generation words ("COMPILE,",
LITERAL etc.) skip the code generation action when DEAD-CODE is set.

Gforth maintains DEAD-CODE for the automatic scoping of local
variables, but does not skip code generation.

This optimization would also eliminate the final exit in the pattern

begin ... again ;
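
The flag scheme described above can be modelled without touching a real compiler. The following is a toy sketch in standard Forth in which "generating code" just increments a counter; every name in it (GEN, GEN-JUMP, GEN-THEN, GEN-ENDOF, EMITTED, NEW-DEFINITION) is invented for illustration, and none of it is Gforth's actual implementation.

```forth
variable dead-code   \ true while the code being compiled is unreachable
variable emitted     \ counts items the toy generator really emitted

\ Starting a new definition clears the flag (and, here, the counter).
: new-definition ( -- )  false dead-code !  0 emitted ! ;

\ Stand-in for the code-generation words: emit only when reachable.
: gen ( -- )  dead-code @ 0= if  1 emitted +!  then ;

\ EXIT, AHEAD, AGAIN: emit the item, then mark what follows as dead.
: gen-jump ( -- )  gen  true dead-code ! ;

\ THEN, BEGIN: control flow can arrive here, so code is live again.
: gen-then ( -- )  false dead-code ! ;

\ ENDOF modelled as an unconditional forward jump followed by
\ resolving the OF branch, which acts like THEN.
: gen-endof ( -- )  gen-jump  gen-then ;
```

In this model a clause such as "0 of postpone dup drop exit endof" becomes gen (for OF), gen gen (the body), gen-jump (EXIT) and gen-endof. The jump belonging to ENDOF is suppressed because EXIT has already set the flag, so four items are emitted instead of five.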

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023: https://euro.theforth.net/2023

Re: SET-OPTIMIZER etc.

<uncr25$pu9i$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=25913&group=comp.lang.forth#25913

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: dxforth@gmail.com (dxf)
Newsgroups: comp.lang.forth
Subject: Re: SET-OPTIMIZER etc.
Date: Sun, 7 Jan 2024 11:25:42 +1100
Organization: A noiseless patient Spider
Lines: 12
Message-ID: <uncr25$pu9i$1@dont-email.me>
References: <uldipg$12lpv$1@dont-email.me> <ulog1e$3bbc1$1@dont-email.me>
<76dbb68b1eed8a9242ebab77e2f362d9@news.novabbs.com>
<ulpg1t$3gktg$1@dont-email.me> <2023Dec19.174341@mips.complang.tuwien.ac.at>
<ultadd$8ai8$1@dont-email.me> <2023Dec22.151432@mips.complang.tuwien.ac.at>
<umlhqr$nkl6$1@dont-email.me> <2023Dec29.092151@mips.complang.tuwien.ac.at>
<05928adca947347c8ada4107fddb9ba8@news.novabbs.com>
<2024Jan4.123304@mips.complang.tuwien.ac.at> <un7cs1$3q90b$1@dont-email.me>
<2024Jan5.122715@mips.complang.tuwien.ac.at> <unaf0p$b7ql$1@dont-email.me>
<495126c461e2473a28fe1a1a8cfa39f8@news.novabbs.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 7 Jan 2024 00:25:41 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="07904bc98a728e3f69a582b42caa0801";
logging-data="850226"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/XN5nLpSpyMaML/yFVZbHm"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:JgTSZrO6GkQby0kpXD7MpGSH5fA=
In-Reply-To: <495126c461e2473a28fe1a1a8cfa39f8@news.novabbs.com>
Content-Language: en-GB
 by: dxf - Sun, 7 Jan 2024 00:25 UTC

On 7/01/2024 1:09 am, sjack wrote:
> dxf wrote:
>
>> never used?  I'm curious.  What is it about ANS-FORTH that users and
>> implementers alike have become so uncritical?
>
> "to be prickly in small matters is the nature of a hedgehog"
> Nietzsche

If the tools one uses every day are a small matter, it begs the question
what is greater.

Re: SET-OPTIMIZER etc.

<unde4m$qhrs$2@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=25915&group=comp.lang.forth#25915

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: dxforth@gmail.com (dxf)
Newsgroups: comp.lang.forth
Subject: Re: SET-OPTIMIZER etc.
Date: Sun, 7 Jan 2024 16:51:18 +1100
Organization: A noiseless patient Spider
Lines: 63
Message-ID: <unde4m$qhrs$2@dont-email.me>
References: <uldipg$12lpv$1@dont-email.me> <ulpg1t$3gktg$1@dont-email.me>
<2023Dec19.174341@mips.complang.tuwien.ac.at> <ultadd$8ai8$1@dont-email.me>
<2023Dec22.151432@mips.complang.tuwien.ac.at> <umlhqr$nkl6$1@dont-email.me>
<2023Dec29.092151@mips.complang.tuwien.ac.at>
<05928adca947347c8ada4107fddb9ba8@news.novabbs.com>
<2024Jan4.123304@mips.complang.tuwien.ac.at> <un7cs1$3q90b$1@dont-email.me>
<2024Jan5.122715@mips.complang.tuwien.ac.at> <unaf0p$b7ql$1@dont-email.me>
<2024Jan6.173536@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 7 Jan 2024 05:51:19 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="07904bc98a728e3f69a582b42caa0801";
logging-data="870268"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/tHilVHXGlwZMnhRcmMKZP"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:2IfbZxr4csdP/eez2YajLQl2UZs=
Content-Language: en-GB
In-Reply-To: <2024Jan6.173536@mips.complang.tuwien.ac.at>
 by: dxf - Sun, 7 Jan 2024 05:51 UTC

On 7/01/2024 3:35 am, Anton Ertl wrote:
> dxf <dxforth@gmail.com> writes:
>> On 5/01/2024 10:27 pm, Anton Ertl wrote:
>>> dxf <dxforth@gmail.com> writes:
>>>> On 4/01/2024 10:33 pm, Anton Ertl wrote:
>>>>> :noname ( xt -- )
>>>>> lits# 1 u>= if
>>>>> lits> case
>>>>> 0 of postpone dup drop exit endof
>>>>> 1 of postpone over drop exit endof
>>>>> 2 of postpone third drop exit endof
>>>>> 3 of postpone fourth drop exit endof
>>>>> dup >lits
>>>>> endcase
>>>>> then
>>>>> peephole-compile, ;
>>>>> optimizes pick
>>>>
>>>> What this demonstrates is how badly ANS CASE needs optimization.
>>>> PICK not so much.
>>>
>>> What makes you think so?
>>
>> Is it not obvious to all that ENDOF jumps in the above are compiled but
>> never used?
>
> No. It depends on the Forth system. This optimization would be
> relatively easy to implement (see below), but I never found it
> worthwhile.
>
> How to implement it? When performing the compilation semantics of
> EXIT, AHEAD or AGAIN, just set a flag DEAD-CODE, and clear that flag
> when performing the compilation semantics of THEN or BEGIN, or when
> starting a new definition. Let the code-generation words ("COMPILE,",
> LITERAL etc.) skip the code generation action when DEAD-CODE is set.
>
> Gforth maintains DEAD-CODE for the automatic scoping of local
> variables, but does not skip code generation.
>
> This optimization would also eliminate the final exit in the pattern
>
> begin ... again ;

The source is replete with 'dead code', and this is due to the ANS-FORTH spec.
As the spec is unlikely to change, it is left to compiler writers to
make changes in the best interests of users and Forth in general. A
user ought to be able to write:

:noname ( xt -- )
  lits# 1 u>= if
    lits>
    0 of postpone dup drop end
    1 of postpone over drop end
    2 of postpone third drop end
    3 of postpone fourth drop end
    >lits
  then
  peephole-compile, ;
optimizes pick

Everything to gain and nothing to lose - other than workarounds such
as ?OF et al folks have tacked on since ANS.
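
For comparison, a construct along these lines can be approximated in standard Forth. The sketch assumes that END means EXIT followed by THEN (the post does not define it) and uses the invented name =OF, since using the standard OF outside CASE would not be portable; CLASSIFY is only a demonstration word.

```forth
\ =OF (hypothetical): like OF, but usable without CASE.
\ Run time: ( x1 x2 -- | x1 ) drops both on a match, keeps x1 otherwise.
: =of ( C: -- orig )
  postpone over postpone = postpone if postpone drop ; immediate

\ END (assumed meaning): leave the definition and close the clause.
: end ( C: orig -- )
  postpone exit postpone then ; immediate

\ Demonstration: no ENDOF jumps are compiled, each clause just exits.
: classify ( n -- code )
  0 =of 100 end
  1 =of 200 end
  drop -1 ;
```

Because every clause ends in EXIT, no forward jump to a common exit point is needed, which is the saving argued for above.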

Re: Inefficiency of FSL matrices

<uo8ptl$21vd7$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=25969&group=comp.lang.forth#25969

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: krishna.myneni@ccreweb.org (Krishna Myneni)
Newsgroups: comp.lang.forth
Subject: Re: Inefficiency of FSL matrices
Date: Wed, 17 Jan 2024 08:57:56 -0600
Organization: A noiseless patient Spider
Lines: 37
Message-ID: <uo8ptl$21vd7$1@dont-email.me>
References: <uldipg$12lpv$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 17 Jan 2024 14:57:57 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="695e73cd369caca444baa6cab59613f4";
logging-data="2162087"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18pdU5PfVeKdDyk6Wo3yht+"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.15.1
Cancel-Lock: sha1:hH4OAuxr3tEHE0OswymZcexYJek=
In-Reply-To: <uldipg$12lpv$1@dont-email.me>
Content-Language: en-US
 by: Krishna Myneni - Wed, 17 Jan 2024 14:57 UTC

On 12/13/23 18:38, Krishna Myneni wrote:
> The array and matrix definitions given in the FSL utilities source,
>
> fsl-util.x
> or
> fsl_util.x.
>
> The arrays are quite nice for their flexibility in making arrays out of
> any type of data, and are useful in many instances. However, the source
> definitions are slow on non-optimizing Forth systems.
>
> I believe the design of the arrays and matrices traces back to Julian
> Noble's, "Scientific Forth." A 1-D array is named with a trailing left
> brace "{" while 2-D matrices have a trailing double left brace, "{{".
> This allows a convenient notation
>
> a{ I }     \ resolves to address of array element I
> m{{ J I }} \ resolves to address of matrix element at row J, col I
>
....

For writing code which uses FSL arrays of FLOATS, we can improve the
efficiency significantly by defining ]F@ and ]F! . Recently, I had to
modify my Numerov integrator module in this manner, not primarily for
efficiency, but to be able to pass the address of an arbitrary element
index in an FSL array which is treated as the first element of a sub
array (this can't be done with the usual FSL array indexing method).

See

https://github.com/mynenik/kForth-64/blob/master/forth-src/fsl/extras/numerov.4th
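
A minimal sketch of what such words might look like, assuming the array is a contiguous block of floats and that the address passed in is that of the element to be treated as index 0. These definitions are an assumption for illustration and may differ from those in numerov.4th; V and SUB are hypothetical test names.

```forth
\ Fetch/store element u of a float array whose element 0 is at address a.
: ]F@ ( a u -- ) ( F: -- r )  floats + f@ ;
: ]F! ( a u -- ) ( F: r -- )  floats + f! ;

\ Hypothetical 4-element array, float-aligned.
falign here 4 floats allot constant v

1e v 0 ]F!   2e v 1 ]F!   3e v 2 ]F!   4e v 3 ]F!

\ Treat element 2 of V as the first element of a sub-array.
v 2 floats + constant sub
sub 1 ]F@ f.   \ prints element 3 of V
```

Since the words take a plain address, any element address can serve as the base of a sub-array, which is the property described above.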

--
KM
