Rocksolid Light

Welcome to Rocksolid Light

mail  files  register  newsreader  groups  login

Message-ID:  

Let the machine do the dirty work. -- "Elements of Programming Style", Kernighan and Ritchie


devel / comp.arch / Re: AMD CPU funny

SubjectAuthor
* Re: AMD CPU funnyTheo
+- Re: AMD CPU funnyMitchAlsup
`* Re: AMD CPU funnyVir Campestris
 +* Re: AMD CPU funnyScott Lurndal
 |`* Re: AMD CPU funnyVir Campestris
 | +* Re: AMD CPU funnyMichael S
 | |`* Re: AMD CPU funnyMitchAlsup
 | | `* Re: AMD CPU funnyMichael S
 | |  `- Re: AMD CPU funnyMitchAlsup
 | `* Re: AMD CPU funnyAndy Burns
 |  `* Re: AMD CPU funnyChris M. Thomasson
 |   `- Re: AMD CPU funnyChris M. Thomasson
 +* Re: AMD CPU funnyMitchAlsup
 |`* Re: AMD CPU funnyVir Campestris
 | `* Re: AMD CPU funnyTheo
 |  `* Re: AMD CPU funnyMitchAlsup
 |   `* Re: AMD CPU funnyTheo
 |    +- Re: AMD CPU funnyMitchAlsup
 |    `* Re: AMD CPU funnyVir Campestris
 |     `* Re: AMD CPU funnyThomas Koenig
 |      `* Re: AMD CPU funnyDavid Brown
 |       `* Re: AMD CPU funnyVir Campestris
 |        +- Re: AMD CPU funnyDavid Brown
 |        `* Re: AMD CPU funnyMichael S
 |         `* Re: AMD CPU funnyVir Campestris
 |          `* Re: AMD CPU funnyAndy Burns
 |           `* Re: AMD CPU funnyTerje Mathisen
 |            `* Re: AMD CPU funnyBGB
 |             `* Re: AMD CPU funnyMitchAlsup
 |              +* Re: AMD CPU funnyBGB
 |              |+- Re: AMD CPU funnyMitchAlsup
 |              |`* Re: AMD CPU funnyPancho
 |              | +* Re: AMD CPU funnyDaniel James
 |              | |+* Re: AMD CPU funnyPancho
 |              | ||+* Re: AMD CPU funnyTerje Mathisen
 |              | |||+* Re: AMD CPU funnyPancho
 |              | ||||`- Re: AMD CPU funnyMitchAlsup
 |              | |||`- Re: AMD CPU funnyMitchAlsup
 |              | ||+- Re: AMD CPU funnyPancho
 |              | ||`* Re: AMD CPU funnyDaniel James
 |              | || +* Re: AMD CPU funnyTim Rentsch
 |              | || |`* Re: AMD CPU funnyTerje Mathisen
 |              | || | `- Re: AMD CPU funnyTim Rentsch
 |              | || `* Re: AMD CPU funnyPancho
 |              | ||  `- Re: AMD CPU funnyTim Rentsch
 |              | |+* Re: AMD CPU funnyMitchAlsup
 |              | ||`- Re: AMD CPU funnyTerje Mathisen
 |              | |`* Re: AMD CPU funnyTim Rentsch
 |              | | +* Re: AMD CPU funnyTerje Mathisen
 |              | | |+- Re: AMD CPU funnyMitchAlsup
 |              | | |`* Re: AMD CPU funnyTim Rentsch
 |              | | | `* Re: AMD CPU funnyTerje Mathisen
 |              | | |  `- Re: AMD CPU funnyTim Rentsch
 |              | | `* Re: AMD CPU funnyBGB
 |              | |  `- Re: AMD CPU funnyTim Rentsch
 |              | `* Re: AMD CPU funnyTim Rentsch
 |              |  `* Re: AMD CPU funnyThomas Koenig
 |              |   +* Re: AMD CPU funnyMitchAlsup
 |              |   |+* Re: AMD CPU funnyTerje Mathisen
 |              |   ||`- Re: AMD CPU funnyTim Rentsch
 |              |   |`* Re: AMD CPU funnyThomas Koenig
 |              |   | +- Re: AMD CPU funnyMitchAlsup
 |              |   | `- Re: AMD CPU funnyTerje Mathisen
 |              |   `* Re: AMD CPU funnyTim Rentsch
 |              |    +* Re: AMD CPU funnyTerje Mathisen
 |              |    |`* Re: AMD CPU funnyTim Rentsch
 |              |    | +* Re: AMD CPU funnyMitchAlsup
 |              |    | |`- Re: AMD CPU funnyTim Rentsch
 |              |    | `* Re: AMD CPU funnyTerje Mathisen
 |              |    |  +* Re: AMD CPU funnyMitchAlsup
 |              |    |  |+* Re: AMD CPU funnyTerje Mathisen
 |              |    |  ||`* Re: AMD CPU funnyMitchAlsup
 |              |    |  || +- Re: AMD CPU funnyTim Rentsch
 |              |    |  || `* Re: AMD CPU funnyTerje Mathisen
 |              |    |  ||  `* Re: AMD CPU funnyTerje Mathisen
 |              |    |  ||   +* Re: AMD CPU funnyTim Rentsch
 |              |    |  ||   |`* Re: AMD CPU funnyTerje Mathisen
 |              |    |  ||   | `* Re: AMD CPU funnyTim Rentsch
 |              |    |  ||   |  `* Re: AMD CPU funnyTerje Mathisen
 |              |    |  ||   |   `- Re: AMD CPU funnyTim Rentsch
 |              |    |  ||   `* Re: AMD CPU funnyMitchAlsup
 |              |    |  ||    `* Re: AMD CPU funnyMitchAlsup
 |              |    |  ||     `- Re: AMD CPU funnyTim Rentsch
 |              |    |  |`- Re: AMD CPU funnyTim Rentsch
 |              |    |  `* Re: AMD CPU funnyTim Rentsch
 |              |    |   `* Re: AMD CPU funnyTerje Mathisen
 |              |    |    `* Re: AMD CPU funnyTim Rentsch
 |              |    |     `- Re: AMD CPU funnyTerje Mathisen
 |              |    `* Re: AMD CPU funnyThomas Koenig
 |              |     +* Re: AMD CPU funnyTim Rentsch
 |              |     |`- Re: AMD CPU funnyTerje Mathisen
 |              |     `* Re: AMD CPU funnyMitchAlsup
 |              |      `* Re: AMD CPU funnyPaul A. Clayton
 |              |       +* Re: AMD CPU funnyAnton Ertl
 |              |       |+- Re: AMD CPU funnyMitchAlsup
 |              |       |`* Re: AMD CPU funnyGeorge Neuner
 |              |       | `- Re: AMD CPU funnyAnton Ertl
 |              |       `- Re: AMD CPU funnyBGB-Alt
 |              `- Re: AMD CPU funnyTerje Mathisen
 `* Re: AMD CPU funnyAndy Burns
  +* Re: AMD CPU funnyAndy Burns
  `- Re: AMD CPU funnyMitchAlsup

Pages:12345
Re: AMD CPU funny

<umvfc1$1qq5r$1@newsreader4.netcologne.de>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36474&group=comp.arch#36474

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-594-0-bad0-cab6-b9f6-b3a2.ipv6dyn.netcologne.de!not-for-mail
From: tkoenig@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Mon, 1 Jan 2024 22:46:25 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <umvfc1$1qq5r$1@newsreader4.netcologne.de>
References: <ulv7j4$l0o0$1@dont-email.me>
<hVh*RTmyz@news.chiark.greenend.org.uk> <um1h21$13l52$1@dont-email.me>
<e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com>
<um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<86cyuksoc0.fsf@linuxsc.com>
Injection-Date: Mon, 1 Jan 2024 22:46:25 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-594-0-bad0-cab6-b9f6-b3a2.ipv6dyn.netcologne.de:2a0a:a540:594:0:bad0:cab6:b9f6:b3a2";
logging-data="1927355"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Mon, 1 Jan 2024 22:46 UTC

Tim Rentsch <tr.17687@z991.linuxsc.com> schrieb:
> Pancho <Pancho.Jones@proton.me> writes:
>
>> On 30/12/2023 23:19, BGB wrote:
>>
>>> Granted, There are plenty of other much more efficient ways of
>>> calculating Fibonacci numbers...
>>
>> Yes. In order of effort.
>>
>> 1) Memoize the recursive function.
>> 2) Return values for (n-1,n-2) as a tuple, and make it single (tail)
>> recursive. Which a smart compiler could theoretically convert to an
>> iterative loop.
>> 3) Write an iterative loop yourself.
>> 4) Remember college maths, and solve the recurrence relation to give a
>> very simple closed form solution.
>
> All of these approaches have drawbacks of one sort or another.
>
> For (1), we end up writing more code and using more memory.
> Some problems are a good fit to memoization, but computing
> Fibonacci numbers isn't one of the.
>
> For (2), compilers are not so good at dealing with tail
> recursion when the return value is a pair of results
> rather than a single result.
>
> For (3), iterative code takes more mental effort to see
> that it works than does a simple recursive version.

I don't think that

#include <stdio.h>

unsigned long int fib (int n)
{ unsigned long f1, f2, fn;

if (n < 2)
return 1;

f1 = 1;
f2 = 1;
while (n >= 2) {
fn = f2 + f1;
f1 = f2;
f2 = fn;
n--;
}
return fn;
}

is particularly intellecually challenging.

However, my favorite method is https://oeis.org/A331164 .
Not because it is particularly effective or easy to understand,
but because the sequence is nice to listen to.

Re: AMD CPU funny

<6b99684fe256d50fe3a78803200c438b@news.novabbs.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36475&group=comp.arch#36475

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Mon, 1 Jan 2024 23:51:03 +0000
Organization: novaBBS
Message-ID: <6b99684fe256d50fe3a78803200c438b@news.novabbs.com>
References: <ulv7j4$l0o0$1@dont-email.me> <hVh*RTmyz@news.chiark.greenend.org.uk> <um1h21$13l52$1@dont-email.me> <e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="1946290"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Rslight-Posting-User: 7e9c45bcd6d4757c5904fbe9a694742e6f8aa949
X-Rslight-Site: $2y$10$kqsBiBCVs9ltOsLxJJ4EceeRDh2vqS3zSzZGUtNkshUfPryqeJffi
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: MitchAlsup - Mon, 1 Jan 2024 23:51 UTC

Thomas Koenig wrote:

> #include <stdio.h>

> unsigned long int fib (int n)
> {
> unsigned long f1, f2, fn;

> if (n < 2)
> return 1;

> f1 = 1;
> f2 = 1;
> do { // we know n >=2 so at least 1 path through the loop is performed.
> fn = f2 + f1;
> f1 = f2;
> f2 = fn;
> n--;
> } while( n > 2 );
> return fn;
> }

Saves 1 cycle and and at least 1 instruction of code footprint.

Re: AMD CPU funny

<un0d8s$2ire7$2@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36481&group=comp.arch#36481

  copy link   Newsgroups: uk.comp.homebuilt comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: uk.comp.homebuilt,comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 08:16:44 +0100
Organization: A noiseless patient Spider
Lines: 35
Message-ID: <un0d8s$2ire7$2@dont-email.me>
References: <ulv7j4$l0o0$1@dont-email.me>
<hVh*RTmyz@news.chiark.greenend.org.uk> <um1h21$13l52$1@dont-email.me>
<e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com>
<um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<ums29d$1p6cc$1@dont-email.me>
<fd0ecac8cd27ba889c031da9b90e0c84@news.novabbs.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
Injection-Date: Tue, 2 Jan 2024 07:16:44 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0f31d8973b7141a50c164ff2a77cd8ee";
logging-data="2715079"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/W5Uambo5+Xuk2Dy5QMAqwM7dcSs4hnto2dOFU/LHcHQ=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.18
Cancel-Lock: sha1:c4/W6OoyaBhLOzMLj5K1ggcS4Qk=
In-Reply-To: <fd0ecac8cd27ba889c031da9b90e0c84@news.novabbs.com>
 by: Terje Mathisen - Tue, 2 Jan 2024 07:16 UTC

MitchAlsup wrote:
> Daniel James wrote:
>
>> Unlike, say:
>
>>    double fib( int n )
>>    {
>>        return (pow( (1+sqrt(5.0))/2, n+1 )
>>                - pow( (1-sqrt(5.0))/2, n+1 ))
>>            / sqrt(5.0);
>>    }
>
> Looks fast to me::
>
> fib:
>     ADD    R2,R1,#1
>     POWN    R3,#(1+sqrt(5.0))/2,R2
>     POWN    R4,#(1-sqrt(5.0))/2,R2
>     FADD    R2,R3,R4
>     FDIV    R1,R2,#sqrt(5)
>     RET

The final FDIV should be a FMUL by (1.0/sqrt(5.0))

Probably followed by a round to nearest int.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: AMD CPU funny

<un0djb$2ire7$3@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36482&group=comp.arch#36482

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 08:22:19 +0100
Organization: A noiseless patient Spider
Lines: 20
Message-ID: <un0djb$2ire7$3@dont-email.me>
References: <ulv7j4$l0o0$1@dont-email.me>
<e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com>
<um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<ums29d$1p6cc$1@dont-email.me> <ums5h9$1m63d$2@dont-email.me>
<umuk0t$286vf$1@dont-email.me> <867ckssn67.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 2 Jan 2024 07:22:19 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0f31d8973b7141a50c164ff2a77cd8ee";
logging-data="2715079"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/F5yEjfnDX2mz/FYtONRQHw5HyCGrueAAEXi4wEZuSow=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.18
Cancel-Lock: sha1:CTFUaGsauMUHSpKBxpwUmBKQyvI=
In-Reply-To: <867ckssn67.fsf@linuxsc.com>
 by: Terje Mathisen - Tue, 2 Jan 2024 07:22 UTC

Tim Rentsch wrote:
> Daniel James <daniel@me.invalid> writes:
>
>> However, the (signed) 64-bit integer result overflows for fib(n) where
>> n>45, which is more of a problem.
>
> fib( 92 ) is still less than 1UL << 63 (or fib(93) for a 64-bit
> unsigned type).
>
> And fib( 186 ) still fits in a 128-bit unsigned type.
>
If calculating fib(n) for n in the fixed 0..186 range is actually
important to you, then you would obviously allocate 16*187 (about 3KB)
bytes of table space and do a simple lookup instead.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: AMD CPU funny

<un0g3d$2j7b7$1@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36483&group=comp.arch#36483

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 09:05:00 +0100
Organization: A noiseless patient Spider
Lines: 54
Message-ID: <un0g3d$2j7b7$1@dont-email.me>
References: <ulv7j4$l0o0$1@dont-email.me>
<hVh*RTmyz@news.chiark.greenend.org.uk> <um1h21$13l52$1@dont-email.me>
<e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com>
<um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de>
<6b99684fe256d50fe3a78803200c438b@news.novabbs.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
Injection-Date: Tue, 2 Jan 2024 08:05:01 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0f31d8973b7141a50c164ff2a77cd8ee";
logging-data="2727271"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1804NgfYbaJ3hz8GuzZu8g9GK8vwnxrzTEZc5RGnrIBvQ=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.18
Cancel-Lock: sha1:LVpFaBxhVYZYP/7m94+D4jKKvqg=
In-Reply-To: <6b99684fe256d50fe3a78803200c438b@news.novabbs.com>
 by: Terje Mathisen - Tue, 2 Jan 2024 08:05 UTC

MitchAlsup wrote:
> Thomas Koenig wrote:
>
>> #include <stdio.h>
>
>> unsigned long int fib (int n)
>> {
>>   unsigned long f1, f2, fn;
>
>>   if (n < 2)
>>     return 1;
>
>>   f1 = 1;
>>   f2 = 1;
>>   do { // we know n >=2 so at least 1 path through the loop is performed.
>>     fn = f2 + f1;
>>     f1 = f2;
>>     f2 = fn;
>>     n--;
>>   } while( n > 2 );
>>   return fn;
>> }
>
> Saves 1 cycle and and at least 1 instruction of code footprint.

I have written asm code for this, and noted that on x86 where reg MOVes
used to cost a cycle, it was better to unroll and get rid of the reg-reg
moves.

I think the following should work:

u64 fib(unsigned n)
{ if (n-- < 2) return 1;
a = 1; b = 1;

a += (n & 1); // Odd remaining n
n >>= 1;

while (n--) {
b += a;
a += b;
}
return a;
}

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: AMD CPU funny

<8634vgrwc2.fsf@linuxsc.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36484&group=comp.arch#36484

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 02 Jan 2024 00:08:45 -0800
Organization: A noiseless patient Spider
Lines: 38
Message-ID: <8634vgrwc2.fsf@linuxsc.com>
References: <ulv7j4$l0o0$1@dont-email.me> <hVh*RTmyz@news.chiark.greenend.org.uk> <um1h21$13l52$1@dont-email.me> <e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <ums29d$1p6cc$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="7861cd1ffbd714be6607f64f72508688";
logging-data="2725040"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/njal8/GdIYviH50fSUlcZLKbjCBdKBEE="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:ok3FcyzZQ6Fs1O783fduErq1n5E=
sha1:C1EnHZbwmF8Nwp6f88W0eT76aQs=
 by: Tim Rentsch - Tue, 2 Jan 2024 08:08 UTC

Daniel James <daniel@me.invalid> writes:

> On 31/12/2023 09:12, Pancho wrote:
>
>> In fact, given that fib grows exponentially, so we only ever need to
>> calculate it for small values of n, the only ?bad? solution is the
>> naive double recursive one.
>
> Well, I might disagree.
>
> Given that we only ever need to calculate it for small values of n (as
> you say, and if that's true) its efficiency doesn't matter. If you
> need to compute it often enough that efficiency might become relevant
> just put the results in a table and use that. Given that fib goes over
> 10^12 after about 60 terms it needn't be a very big table.

When the size of the table is more than ten times the size of
the code, and the code is reasonably fast, using a table is a
poor choice. Fibonacci functions don't have to be slow.

> The naive doubly recursive solution has the huge merit of being
> human-readable.

But unacceptably slow.

> Unlike, say:
>
> double fib( int n )
> {
> return (pow( (1+sqrt(5.0))/2, n+1 )
> - pow( (1-sqrt(5.0))/2, n+1 ))
> / sqrt(5.0);
> }
>
> ... which is great for fibs of big values, but not fast for small ones.

Unacceptable for fibs of big values, because the values computed
are wrong.

Re: AMD CPU funny

<86y1d8qh1m.fsf@linuxsc.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36485&group=comp.arch#36485

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 02 Jan 2024 00:24:21 -0800
Organization: A noiseless patient Spider
Lines: 69
Message-ID: <86y1d8qh1m.fsf@linuxsc.com>
References: <ulv7j4$l0o0$1@dont-email.me> <um1h21$13l52$1@dont-email.me> <e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="7861cd1ffbd714be6607f64f72508688";
logging-data="2731790"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX193QNnFPHNgTsK874DATC/xpoADjC8yG78="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:F/+uNvaPmxNpKszpRCJIAhai9Fk=
sha1:F385xRkvhpZkS+u+CFsIoTsK9Dg=
 by: Tim Rentsch - Tue, 2 Jan 2024 08:24 UTC

Thomas Koenig <tkoenig@netcologne.de> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> schrieb:
>
>> Pancho <Pancho.Jones@proton.me> writes:
>>
>>> On 30/12/2023 23:19, BGB wrote:
>>>
>>>> Granted, There are plenty of other much more efficient ways of
>>>> calculating Fibonacci numbers...
>>>
>>> Yes. In order of effort.
>>>
>>> 1) Memoize the recursive function.
>>> 2) Return values for (n-1,n-2) as a tuple, and make it single (tail)
>>> recursive. Which a smart compiler could theoretically convert to an
>>> iterative loop.
>>> 3) Write an iterative loop yourself.
>>> 4) Remember college maths, and solve the recurrence relation to give a
>>> very simple closed form solution.
>>
>> All of these approaches have drawbacks of one sort or another.
>>
>> For (1), we end up writing more code and using more memory.
>> Some problems are a good fit to memoization, but computing
>> Fibonacci numbers isn't one of the.
>>
>> For (2), compilers are not so good at dealing with tail
>> recursion when the return value is a pair of results
>> rather than a single result.
>>
>> For (3), iterative code takes more mental effort to see
>> that it works than does a simple recursive version.
>
> I don't think that
>
> #include <stdio.h>
>
> unsigned long int fib (int n)
> {
> unsigned long f1, f2, fn;
>
> if (n < 2)
> return 1;
>
> f1 = 1;
> f2 = 1;
> while (n >= 2) {
> fn = f2 + f1;
> f1 = f2;
> f2 = fn;
> n--;
> }
> return fn;
> }
>
> is particularly intellecually challenging.

I'm sure there are other people who feel the same way.

Even so, when a short, functional, recursive version
is available, it is often easier to write, easier to
read, and simpler to show that it produces correct
results, in which case doing that is a better choice,
even if an imperative version is not particularly
intellecually challenging.

Incidentally your code produces wrong values, as
fib(0) is 0 and fib(1) is 1.

Re: AMD CPU funny

<un0hcp$2jca3$1@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36486&group=comp.arch#36486

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 09:27:04 +0100
Organization: A noiseless patient Spider
Lines: 58
Message-ID: <un0hcp$2jca3$1@dont-email.me>
References: <ulv7j4$l0o0$1@dont-email.me>
<hVh*RTmyz@news.chiark.greenend.org.uk> <um1h21$13l52$1@dont-email.me>
<e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com>
<um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<ums29d$1p6cc$1@dont-email.me> <8634vgrwc2.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 2 Jan 2024 08:27:05 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0f31d8973b7141a50c164ff2a77cd8ee";
logging-data="2732355"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/Nr/vRk6WS6UyNczY84ykb9yCFeBiqmFeYoUjQCfsBpg=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.18
Cancel-Lock: sha1:Gzm/3pTumo96CJf8ohOsIZxp/kk=
In-Reply-To: <8634vgrwc2.fsf@linuxsc.com>
 by: Terje Mathisen - Tue, 2 Jan 2024 08:27 UTC

Tim Rentsch wrote:
> Daniel James <daniel@me.invalid> writes:
>
>> On 31/12/2023 09:12, Pancho wrote:
>>
>>> In fact, given that fib grows exponentially, so we only ever need to
>>> calculate it for small values of n, the only ?bad? solution is the
>>> naive double recursive one.
>>
>> Well, I might disagree.
>>
>> Given that we only ever need to calculate it for small values of n (as
>> you say, and if that's true) its efficiency doesn't matter. If you
>> need to compute it often enough that efficiency might become relevant
>> just put the results in a table and use that. Given that fib goes over
>> 10^12 after about 60 terms it needn't be a very big table.
>
> When the size of the table is more than ten times the size of
> the code, and the code is reasonably fast, using a table is a
> poor choice. Fibonacci functions don't have to be slow.
>
>> The naive doubly recursive solution has the huge merit of being
>> human-readable.
>
> But unacceptably slow.
>
>> Unlike, say:
>>
>> double fib( int n )
>> {
>> return (pow( (1+sqrt(5.0))/2, n+1 )
>> - pow( (1-sqrt(5.0))/2, n+1 ))
>> / sqrt(5.0);
>> }
>>
>> ... which is great for fibs of big values, but not fast for small ones.
>
> Unacceptable for fibs of big values, because the values computed
> are wrong.
>

So either you move to quad or arbitrary precision (which will of course
be somewhat slow, or you use a hybrid technique, with lookup tables for
smallish n, iterative loop for somewhat larger and your O(log(n))
approach only for very large n?

Working with 256-bit uints, each addition will take 4-8 clock cycles,
and it overflows well before n reaches 400, right? In that case a worst
case iterative solution is less than 3000 clock cycles or about a
microsecond away.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: AMD CPU funny

<un0hu1$2je9f$1@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36487&group=comp.arch#36487

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 09:36:17 +0100
Organization: A noiseless patient Spider
Lines: 79
Message-ID: <un0hu1$2je9f$1@dont-email.me>
References: <ulv7j4$l0o0$1@dont-email.me> <um1h21$13l52$1@dont-email.me>
<e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com>
<um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de>
<86y1d8qh1m.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 2 Jan 2024 08:36:18 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0f31d8973b7141a50c164ff2a77cd8ee";
logging-data="2734383"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+BzGFawwqF3/uM1/KR0qo20YU7ZBMdXR/wk4XcO6IaxA=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.18
Cancel-Lock: sha1:7Gs1WqP1qb3x1IonDEuxU9Ap+nM=
In-Reply-To: <86y1d8qh1m.fsf@linuxsc.com>
 by: Terje Mathisen - Tue, 2 Jan 2024 08:36 UTC

Tim Rentsch wrote:
> Thomas Koenig <tkoenig@netcologne.de> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> schrieb:
>>
>>> Pancho <Pancho.Jones@proton.me> writes:
>>>
>>>> On 30/12/2023 23:19, BGB wrote:
>>>>
>>>>> Granted, There are plenty of other much more efficient ways of
>>>>> calculating Fibonacci numbers...
>>>>
>>>> Yes. In order of effort.
>>>>
>>>> 1) Memoize the recursive function.
>>>> 2) Return values for (n-1,n-2) as a tuple, and make it single (tail)
>>>> recursive. Which a smart compiler could theoretically convert to an
>>>> iterative loop.
>>>> 3) Write an iterative loop yourself.
>>>> 4) Remember college maths, and solve the recurrence relation to give a
>>>> very simple closed form solution.
>>>
>>> All of these approaches have drawbacks of one sort or another.
>>>
>>> For (1), we end up writing more code and using more memory.
>>> Some problems are a good fit to memoization, but computing
>>> Fibonacci numbers isn't one of the.
>>>
>>> For (2), compilers are not so good at dealing with tail
>>> recursion when the return value is a pair of results
>>> rather than a single result.
>>>
>>> For (3), iterative code takes more mental effort to see
>>> that it works than does a simple recursive version.
>>
>> I don't think that
>>
>> #include <stdio.h>
>>
>> unsigned long int fib (int n)
>> {
>> unsigned long f1, f2, fn;
>>
>> if (n < 2)
>> return 1;
>>
>> f1 = 1;
>> f2 = 1;
>> while (n >= 2) {
>> fn = f2 + f1;
>> f1 = f2;
>> f2 = fn;
>> n--;
>> }
>> return fn;
>> }
>>
>> is particularly intellecually challenging.
>
> I'm sure there are other people who feel the same way.
>
> Even so, when a short, functional, recursive version
> is available, it is often easier to write, easier to
> read, and simpler to show that it produces correct
> results, in which case doing that is a better choice,
> even if an imperative version is not particularly
> intellecually challenging.
>
> Incidentally your code produces wrong values, as
> fib(0) is 0 and fib(1) is 1.
>

So return n for (n<2)?

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: AMD CPU funny

<un0ivj$1rg6n$1@newsreader4.netcologne.de>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36488&group=comp.arch#36488

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-594-0-5856-a450-2160-967b.ipv6dyn.netcologne.de!not-for-mail
From: tkoenig@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 08:54:11 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <un0ivj$1rg6n$1@newsreader4.netcologne.de>
References: <ulv7j4$l0o0$1@dont-email.me>
<hVh*RTmyz@news.chiark.greenend.org.uk> <um1h21$13l52$1@dont-email.me>
<e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com>
<um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de>
<6b99684fe256d50fe3a78803200c438b@news.novabbs.com>
Injection-Date: Tue, 2 Jan 2024 08:54:11 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-594-0-5856-a450-2160-967b.ipv6dyn.netcologne.de:2a0a:a540:594:0:5856:a450:2160:967b";
logging-data="1949911"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 2 Jan 2024 08:54 UTC

MitchAlsup <mitchalsup@aol.com> schrieb:
> Thomas Koenig wrote:
>
>> #include <stdio.h>
>
>> unsigned long int fib (int n)
>> {
>> unsigned long f1, f2, fn;
>
>> if (n < 2)
>> return 1;
>
>> f1 = 1;
>> f2 = 1;
>> do { // we know n >=2 so at least 1 path through the loop is performed.
>> fn = f2 + f1;
>> f1 = f2;
>> f2 = fn;
>> n--;
>> } while( n > 2 );

(This should be n >= 2)
>> return fn;
>> }
>
> Saves 1 cycle and and at least 1 instruction of code footprint.

Fortunately, this is one of the transformations which compilers
do automatically these days. For both versions, the code for
fib generated by a recent gcc on x86_64 with -O3 is

fib:
..LFB11:
.cfi_startproc
movl $1, %edx
cmpl $1, %edi
jle .L1
movl $1, %eax
movl $1, %ecx
jmp .L3
.p2align 4,,10
.p2align 3
..L5:
movq %rdx, %rax
..L3:
subl $1, %edi
leaq (%rcx,%rax), %rdx
movq %rax, %rcx
cmpl $1, %edi
jne .L5
..L1:
movq %rdx, %rax
ret

which looks pretty good.

Re: AMD CPU funny

<un0kmr$2jds3$1@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36489&group=comp.arch#36489

  copy link   Newsgroups: uk.comp.homebuilt comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Pancho.Jones@proton.me (Pancho)
Newsgroups: uk.comp.homebuilt,comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 09:23:39 +0000
Organization: A noiseless patient Spider
Lines: 79
Message-ID: <un0kmr$2jds3$1@dont-email.me>
References: <ulv7j4$l0o0$1@dont-email.me>
<hVh*RTmyz@news.chiark.greenend.org.uk> <um1h21$13l52$1@dont-email.me>
<e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com>
<um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<ums29d$1p6cc$1@dont-email.me> <ums5h9$1m63d$2@dont-email.me>
<umuk0t$286vf$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 2 Jan 2024 09:23:39 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0fa519e8df11f2d562a65276b155c658";
logging-data="2733955"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19byEfDDOHfX4wUv+mcmcCqGs4Ye6K/ylM="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.14.0
Cancel-Lock: sha1:fSvDbkrD2lo3QDQN8356FPqu3Q0=
Content-Language: en-GB
In-Reply-To: <umuk0t$286vf$1@dont-email.me>
 by: Pancho - Tue, 2 Jan 2024 09:23 UTC

On 01/01/2024 15:00, Daniel James wrote:
> On 31/12/2023 16:40, Pancho wrote:
>>> The naive doubly recursive solution has the huge merit of being
>>> human-readable.
>>
>> Yes, but you'd be surprised how quickly double recursion goes bad. I
>> haven't tested recently, but off the top of my head, the maximum
>> function call depth would be something like 300, so fib(10) would be
>> a stack overflow. fib(60) would need a stack function call depth ~
>> 1e18.
>
> On 64-bit gcc here on Debian (so, 64-bit ints) I can call fib(50)
> without blowing the stack, but it takes minutes (on a Ryzen 7).
>
> However, the (signed) 64-bit integer result overflows for fib(n) where
> n>45, which is more of a problem.
>

Yes, I explained that was a brain fart on my part. I remembered the
recursive stack limit was low and confused the number of recursive calls
with stack depth.

The real place I've had recursive stuff break, is tree/graph handling,
where a tree may adopt linear characteristics. There are much better
algorithms than a naive recursive one.

>>> Unlike, say:
>>>
>>>    double fib( int n )
>>>    {
>>>        return (pow( (1+sqrt(5.0))/2, n+1 )
>>>                - pow( (1-sqrt(5.0))/2, n+1 ))
>>>            / sqrt(5.0);
>>>    }
>>>
>>> ... which is great for fibs of big values, but not fast for small
>>> ones.
>>>
>>
>> Why isn't it fast for small fibs?
>
> Gut feeling. I haven't timed it, and it may be.
>

Yes, what I'm saying is it normally isn't valid. Floating point
functions are fast, and it is rarely something you need to optimise.

> I mean ... obviously it will be as fast for small ones as for large, but
> for small ones it won't beat an iterative calculation and for very small
> ones I doubt it will beat the naive recursive method.
>

It generally doesn't matter if something very, very fast is slightly
faster than something just very fast. Especially when you also have
very, very slow behaviour elsewhere.

> My point, though, was that clear code is often more important than fast
> execution, and while the naive recursive method for fibs gets horribly
> slow for large values even it is acceptable for small ones.
>

From a mathematical viewpoint, the closed form solution gives clearer
behaviour. It really depends on what you want, giving it a standard name
is probably the most important thing.

>> I seem to recall, if statements are more costly than floating point
>> function calls, like exponential.
>
> if statements are costly if they invalidate instruction prefetch and/or
> jump to code that isn't in the cache ... but optimizers and CPUs are
> pretty good at branch prediction and cacheing.
>

I was remembering software doing real life financial calcs, but I'm old,
so it was a long time ago. Pipelines are longer now, but maybe branch
prediction is better?, So I don't know if it is more of a problem or
less of a problem nowadays.

Re: AMD CPU funny

<un0rpm$1rkc9$1@newsreader4.netcologne.de>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36493&group=comp.arch#36493

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-594-0-5856-a450-2160-967b.ipv6dyn.netcologne.de!not-for-mail
From: tkoenig@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 11:24:38 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <un0rpm$1rkc9$1@newsreader4.netcologne.de>
References: <ulv7j4$l0o0$1@dont-email.me> <um1h21$13l52$1@dont-email.me>
<e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com>
<um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de>
<86y1d8qh1m.fsf@linuxsc.com>
Injection-Date: Tue, 2 Jan 2024 11:24:38 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-594-0-5856-a450-2160-967b.ipv6dyn.netcologne.de:2a0a:a540:594:0:5856:a450:2160:967b";
logging-data="1954185"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 2 Jan 2024 11:24 UTC

Tim Rentsch <tr.17687@z991.linuxsc.com> schrieb:
> Thomas Koenig <tkoenig@netcologne.de> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> schrieb:
>>
>>> Pancho <Pancho.Jones@proton.me> writes:
>>>
>>>> On 30/12/2023 23:19, BGB wrote:
>>>>
>>>>> Granted, There are plenty of other much more efficient ways of
>>>>> calculating Fibonacci numbers...
>>>>
>>>> Yes. In order of effort.
>>>>
>>>> 1) Memoize the recursive function.
>>>> 2) Return values for (n-1,n-2) as a tuple, and make it single (tail)
>>>> recursive. Which a smart compiler could theoretically convert to an
>>>> iterative loop.
>>>> 3) Write an iterative loop yourself.
>>>> 4) Remember college maths, and solve the recurrence relation to give a
>>>> very simple closed form solution.
>>>
>>> All of these approaches have drawbacks of one sort or another.
>>>
>>> For (1), we end up writing more code and using more memory.
>>> Some problems are a good fit to memoization, but computing
>>> Fibonacci numbers isn't one of the.
>>>
>>> For (2), compilers are not so good at dealing with tail
>>> recursion when the return value is a pair of results
>>> rather than a single result.
>>>
>>> For (3), iterative code takes more mental effort to see
>>> that it works than does a simple recursive version.
>>
>> I don't think that
>>
>> #include <stdio.h>
>>
>> unsigned long int fib (int n)
>> {
>> unsigned long f1, f2, fn;
>>
>> if (n < 2)
>> return 1;
>>
>> f1 = 1;
>> f2 = 1;
>> while (n >= 2) {
>> fn = f2 + f1;
>> f1 = f2;
>> f2 = fn;
>> n--;
>> }
>> return fn;
>> }
>>
>> is particularly intellecually challenging.
>
> I'm sure there are other people who feel the same way.
>
> Even so, when a short, functional, recursive version
> is available, it is often easier to write, easier to
> read, and simpler to show that it produces correct
> results, in which case doing that is a better choice,
> even if an imperative version is not particularly
> intellecually challenging.

Unless, of course, it runs in exponential time, as the
primitive version with the recurrence does.

Compilers often do not do a good job of replacing recursion
with iteration (is there a verb for that? Maybe it can be called
iterationizing?), so it is something to consider if an algorithm
is at all time critical or stack space is problematic.

Re: AMD CPU funny

<86ttnvr2om.fsf@linuxsc.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36511&group=comp.arch#36511

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!news.chmurka.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 02 Jan 2024 10:49:13 -0800
Organization: A noiseless patient Spider
Lines: 91
Message-ID: <86ttnvr2om.fsf@linuxsc.com>
References: <ulv7j4$l0o0$1@dont-email.me> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de> <86y1d8qh1m.fsf@linuxsc.com> <un0rpm$1rkc9$1@newsreader4.netcologne.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="7861cd1ffbd714be6607f64f72508688";
logging-data="2954615"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19YGHe+oH5pvCF7uGUcdcJ/AVd8kMltgpg="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:j3+1vnHsfJfkrcl5g3fKffT7ojw=
sha1:RHThigJp/IRpZiZzssicTc61xb4=
 by: Tim Rentsch - Tue, 2 Jan 2024 18:49 UTC

Thomas Koenig <tkoenig@netcologne.de> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> schrieb:
>
>> Thomas Koenig <tkoenig@netcologne.de> writes:
>>
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> schrieb:
>>>
>>>> Pancho <Pancho.Jones@proton.me> writes:
>>>>
>>>>> On 30/12/2023 23:19, BGB wrote:
>>>>>
>>>>>> Granted, There are plenty of other much more efficient ways of
>>>>>> calculating Fibonacci numbers...
>>>>>
>>>>> Yes. In order of effort.
>>>>>
>>>>> 1) Memoize the recursive function.
>>>>> 2) Return values for (n-1,n-2) as a tuple, and make it single (tail)
>>>>> recursive. Which a smart compiler could theoretically convert to an
>>>>> iterative loop.
>>>>> 3) Write an iterative loop yourself.
>>>>> 4) Remember college maths, and solve the recurrence relation to give a
>>>>> very simple closed form solution.
>>>>
>>>> All of these approaches have drawbacks of one sort or another.
>>>>
>>>> For (1), we end up writing more code and using more memory.
>>>> Some problems are a good fit to memoization, but computing
>>>> Fibonacci numbers isn't one of the.
>>>>
>>>> For (2), compilers are not so good at dealing with tail
>>>> recursion when the return value is a pair of results
>>>> rather than a single result.
>>>>
>>>> For (3), iterative code takes more mental effort to see
>>>> that it works than does a simple recursive version.
>>>
>>> I don't think that
>>>
>>> #include <stdio.h>
>>>
>>> unsigned long int fib (int n)
>>> {
>>> unsigned long f1, f2, fn;
>>>
>>> if (n < 2)
>>> return 1;
>>>
>>> f1 = 1;
>>> f2 = 1;
>>> while (n >= 2) {
>>> fn = f2 + f1;
>>> f1 = f2;
>>> f2 = fn;
>>> n--;
>>> }
>>> return fn;
>>> }
>>>
>>> is particularly intellecually challenging.
>>
>> I'm sure there are other people who feel the same way.
>>
>> Even so, when a short, functional, recursive version
>> is available, it is often easier to write, easier to
>> read, and simpler to show that it produces correct
>> results, in which case doing that is a better choice,
>> even if an imperative version is not particularly
>> intellecually challenging.
>
> Unless, of course, it runs in exponential time, as the
> primitive version with the recurrence does.

Sure. Implicit in my comment is that the alternatives being
considered are algorithmically comparable. Otherwise it's
comparing apples and oranges.

> Compilers often do not do a good job of replacing recursion
> with iteration (is there a verb for that? Maybe it can be called
> iterationizing?), so it is something to consider if an algorithm
> is at all time critical or stack space is problematic.

Of course. I take it as given that recursive functions that use
an actual recursive call are unacceptable, except perhaps in
special cases, as for example quicksort. I tend to use functional
recursion and mutual recursion a fair amount, but not in cases
where an actual call is done in the recursive cycle.

By the way a term that may satisfy your question is "tail call
optimization".

Re: AMD CPU funny

<86plyjr2br.fsf@linuxsc.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36512&group=comp.arch#36512

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 02 Jan 2024 10:56:56 -0800
Organization: A noiseless patient Spider
Lines: 36
Message-ID: <86plyjr2br.fsf@linuxsc.com>
References: <ulv7j4$l0o0$1@dont-email.me> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de> <86y1d8qh1m.fsf@linuxsc.com> <un0hu1$2je9f$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="7861cd1ffbd714be6607f64f72508688";
logging-data="2954615"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX185lljFl5e5R8BdGubXsKoajDi4u5ig2M0="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:VNIOdKJSwnD51w+cj2hB0KTZjEs=
sha1:cbWz78k1wXv6x/Vj+jvm1NLj+bc=
 by: Tim Rentsch - Tue, 2 Jan 2024 18:56 UTC

Terje Mathisen <terje.mathisen@tmsw.no> writes:

> Tim Rentsch wrote:
>
>> Thomas Koenig <tkoenig@netcologne.de> writes:
[...]

>>> unsigned long int fib (int n)
>>> {
>>> unsigned long f1, f2, fn;
>>>
>>> if (n < 2)
>>> return 1;
>>>
>>> f1 = 1;
>>> f2 = 1;
>>> while (n >= 2) {
>>> fn = f2 + f1;
>>> f1 = f2;
>>> f2 = fn;
>>> n--;
>>> }
>>> return fn;
>>> }

[...]

>> Incidentally your code produces wrong values, as
>> fib(0) is 0 and fib(1) is 1.
>
> So return n for (n<2)?

In situations like this one, usually I would favor something
more like this:

if( n < 8 ) return 0xd8532110u >> n*4 & 15;

Re: AMD CPU funny

<bb766e244b5472e00fc5a11e2407c974@news.novabbs.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36513&group=comp.arch#36513

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 19:01:00 +0000
Organization: novaBBS
Message-ID: <bb766e244b5472e00fc5a11e2407c974@news.novabbs.com>
References: <ulv7j4$l0o0$1@dont-email.me> <hVh*RTmyz@news.chiark.greenend.org.uk> <um1h21$13l52$1@dont-email.me> <e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <ums29d$1p6cc$1@dont-email.me> <8634vgrwc2.fsf@linuxsc.com> <un0hcp$2jca3$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2039272"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Rslight-Posting-User: 7e9c45bcd6d4757c5904fbe9a694742e6f8aa949
X-Rslight-Site: $2y$10$lHjQEN.z3CQxS9HuvAsUNOVpN3bqxv3Wzo8n1qkeyRejwIM95tioS
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: MitchAlsup - Tue, 2 Jan 2024 19:01 UTC

Terje Mathisen wrote:

> Tim Rentsch wrote:
>> Daniel James <daniel@me.invalid> writes:
>>
>>> On 31/12/2023 09:12, Pancho wrote:
>>>
>>>> In fact, given that fib grows exponentially, so we only ever need to
>>>> calculate it for small values of n, the only ?bad? solution is the
>>>> naive double recursive one.
>>>
>>> Well, I might disagree.
>>>
>>> Given that we only ever need to calculate it for small values of n (as
>>> you say, and if that's true) its efficiency doesn't matter. If you
>>> need to compute it often enough that efficiency might become relevant
>>> just put the results in a table and use that. Given that fib goes over
>>> 10^12 after about 60 terms it needn't be a very big table.
>>
>> When the size of the table is more than ten times the size of
>> the code, and the code is reasonably fast, using a table is a
>> poor choice. Fibonacci functions don't have to be slow.
>>
>>> The naive doubly recursive solution has the huge merit of being
>>> human-readable.
>>
>> But unacceptably slow.
>>
>>> Unlike, say:
>>>
>>> double fib( int n )
>>> {
>>> return (pow( (1+sqrt(5.0))/2, n+1 )
>>> - pow( (1-sqrt(5.0))/2, n+1 ))
>>> / sqrt(5.0);
>>> }
>>>
>>> ... which is great for fibs of big values, but not fast for small ones.
>>
>> Unacceptable for fibs of big values, because the values computed
>> are wrong.
>>

> So either you move to quad or arbitrary precision (which will of course
> be somewhat slow, or you use a hybrid technique, with lookup tables for
> smallish n, iterative loop for somewhat larger and your O(log(n))
> approach only for very large n?

> Working with 256-bit uints, each addition will take 4-8 clock cycles,
> and it overflows well before n reaches 400, right? In that case a worst
> case iterative solution is less than 3000 clock cycles or about a
> microsecond away.

Whereas table lookup is never more than 100ns away (in DRAM).

> Terje

Re: AMD CPU funny

<9d83d1c36c79a9f85a2cc57615910cf4@news.novabbs.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36514&group=comp.arch#36514

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 19:08:49 +0000
Organization: novaBBS
Message-ID: <9d83d1c36c79a9f85a2cc57615910cf4@news.novabbs.com>
References: <ulv7j4$l0o0$1@dont-email.me> <hVh*RTmyz@news.chiark.greenend.org.uk> <um1h21$13l52$1@dont-email.me> <e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de> <6b99684fe256d50fe3a78803200c438b@news.novabbs.com> <un0ivj$1rg6n$1@newsreader4.netcologne.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2040219"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Rslight-Posting-User: 7e9c45bcd6d4757c5904fbe9a694742e6f8aa949
X-Rslight-Site: $2y$10$CE454whfvP2ViLeJ/cR8hOgzIT/RyyhqxLG7zpWLS4PSZghO/Y4W2
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: MitchAlsup - Tue, 2 Jan 2024 19:08 UTC

Thomas Koenig wrote:

> MitchAlsup <mitchalsup@aol.com> schrieb:
>> Thomas Koenig wrote:
>>
>>> #include <stdio.h>
>>
>>> unsigned long int fib (int n)
>>> {
>>> unsigned long f1, f2, fn;
>>
>>> if (n < 2)
>>> return 1;
>>
>>> f1 = 1;
>>> f2 = 1;
>>> do { // we know n >=2 so at least 1 path through the loop is performed.
>>> fn = f2 + f1;
>>> f1 = f2;
>>> f2 = fn;
>>> n--;
>>> } while( n > 2 );

> (This should be n >= 2)

OK

Re: AMD CPU funny

<b0a868b435300a4eb248824c47c81eb7@news.novabbs.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36515&group=comp.arch#36515

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 19:10:08 +0000
Organization: novaBBS
Message-ID: <b0a868b435300a4eb248824c47c81eb7@news.novabbs.com>
References: <ulv7j4$l0o0$1@dont-email.me> <um1h21$13l52$1@dont-email.me> <e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de> <86y1d8qh1m.fsf@linuxsc.com> <un0rpm$1rkc9$1@newsreader4.netcologne.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2040219"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Rslight-Posting-User: 7e9c45bcd6d4757c5904fbe9a694742e6f8aa949
X-Rslight-Site: $2y$10$mI/yOiNcRZZMLIPfw5LuyOXmdzD/0iEt1f5IwrlKwm5r/AnVp3epa
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: MitchAlsup - Tue, 2 Jan 2024 19:10 UTC

Thomas Koenig wrote:

> Compilers often do not do a good job of replacing recursion
> with iteration (is there a verb for that? Maybe it can be called
> iterationizing?), so it is something to consider if an algorithm
> is at all time critical or stack space is problematic.

Tail recursion elimination.

Re: AMD CPU funny

<86le97r1at.fsf@linuxsc.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36517&group=comp.arch#36517

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!rocksolid2!news.neodome.net!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 02 Jan 2024 11:19:06 -0800
Organization: A noiseless patient Spider
Lines: 63
Message-ID: <86le97r1at.fsf@linuxsc.com>
References: <ulv7j4$l0o0$1@dont-email.me> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de> <6b99684fe256d50fe3a78803200c438b@news.novabbs.com> <un0g3d$2j7b7$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="7861cd1ffbd714be6607f64f72508688";
logging-data="2963270"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Hz78dRhY9GZ09vok8R/fnRu4hR4fIwmw="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:4j0K1mLiwl5qci8tcNDfXizF4MI=
sha1:/10tIKOK2dD2SgCALpuktZYVB2s=
 by: Tim Rentsch - Tue, 2 Jan 2024 19:19 UTC

Terje Mathisen <terje.mathisen@tmsw.no> writes:

> MitchAlsup wrote:
>
>> Thomas Koenig wrote:
>>
>>> #include <stdio.h>
>>>
>>> unsigned long int fib (int n)
>>> {
>>> unsigned long f1, f2, fn;
>>>
>>> if (n < 2)
>>> return 1;
>>>
>>> f1 = 1;
>>> f2 = 1;
>>> do { // we know n >=2 so at least 1 path through the loop is performed.
>>> fn = f2 + f1;
>>> f1 = f2;
>>> f2 = fn;
>>> n--;
>>> } while( n > 2 );
>>> return fn;
>>> }
>>
>> Saves 1 cycle and and at least 1 instruction of code footprint.
>
> I have written asm code for this, and noted that on x86 where reg
> MOVes used to cost a cycle, it was better to unroll and get rid of the
> reg-reg moves.
>
> I think the following should work:
>
> u64 fib(unsigned n)
> {
> if (n-- < 2) return 1;
> a = 1; b = 1;
>
> a += (n & 1); // Odd remaining n
> n >>= 1;
>
> while (n--) {
> b += a;
> a += b;
> }
> return a;
> }

This code still has the off-by-one error w.r.t. the
value of n.

We can fix that and simultaneously simplify the code:

u64
fib( unsigned n ){
u64 b = n&1, a = b^1;

n >>= 1;
while( n-- ) a += b, b += a;

return b;
}

Re: AMD CPU funny

<86h6jvr0w7.fsf@linuxsc.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36520&group=comp.arch#36520

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 02 Jan 2024 11:27:52 -0800
Organization: A noiseless patient Spider
Lines: 20
Message-ID: <86h6jvr0w7.fsf@linuxsc.com>
References: <ulv7j4$l0o0$1@dont-email.me> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <ums29d$1p6cc$1@dont-email.me> <ums5h9$1m63d$2@dont-email.me> <umuk0t$286vf$1@dont-email.me> <un0kmr$2jds3$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="7861cd1ffbd714be6607f64f72508688";
logging-data="2963270"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+cIV1KOqZWKfsFDgeHXuyPrMpbi4TwP24="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:/j6SPOVpJGEJTXrH0vwnnS/6ekM=
sha1:IRBzi1MXTuCkqyEWky18xnMu0DU=
 by: Tim Rentsch - Tue, 2 Jan 2024 19:27 UTC

Pancho <Pancho.Jones@proton.me> writes:

> On 01/01/2024 15:00, Daniel James wrote:

>>>> double fib( int n )
>>>> {
>>>> return (pow( (1+sqrt(5.0))/2, n+1 )
>>>> - pow( (1-sqrt(5.0))/2, n+1 ))
>>>> / sqrt(5.0);
>>>> }
>>>>
>>>> ... which is great for fibs of big values, but not fast for small
>>>> ones.

> From a mathematical viewpoint, the closed form solution gives clearer
> behaviour.

The problem with using this mathematically exact formula is
that the code gives wrong answers. Using floating point is
a poor choice for computing Fibonacci values.

Re: AMD CPU funny

<865y0bqyhj.fsf@linuxsc.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36525&group=comp.arch#36525

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 02 Jan 2024 12:19:52 -0800
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <865y0bqyhj.fsf@linuxsc.com>
References: <ulv7j4$l0o0$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <ums29d$1p6cc$1@dont-email.me> <ums5h9$1m63d$2@dont-email.me> <umuk0t$286vf$1@dont-email.me> <867ckssn67.fsf@linuxsc.com> <un0djb$2ire7$3@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="7861cd1ffbd714be6607f64f72508688";
logging-data="2978918"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19nMUDTsBmyG/GJIJZD9RJvq6OGRqU1ChE="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:lAkD+uumqW+Yer97qzVQsZUcjyU=
sha1:l931B8UIiwYUVsbeQCIxHVy7J5w=
 by: Tim Rentsch - Tue, 2 Jan 2024 20:19 UTC

Terje Mathisen <terje.mathisen@tmsw.no> writes:

> Tim Rentsch wrote:
>
>> Daniel James <daniel@me.invalid> writes:
>>
>>> However, the (signed) 64-bit integer result overflows for fib(n) where
>>> n>45, which is more of a problem.
>>
>> fib( 92 ) is still less than 1UL << 63 (or fib(93) for a 64-bit
>> unsigned type).
>>
>> And fib( 186 ) still fits in a 128-bit unsigned type.
>
> If calculating fib(n) for n in the fixed 0..186 range is actually
> important to you, then you would obviously allocate 16*187 (about 3KB)
> bytes of table space and do a simple lookup instead.

I have no doubt that some people would, but I wouldn't. It isn't
hard to write code that has very good performance, and that doesn't
have the associated costs in terms of memory use, cache behavior,
and potential page faults and TLB misses.

(To be clear, I am talking specifically about computing fibonacci
values.)

Re: AMD CPU funny

<861qazqtxu.fsf@linuxsc.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36531&group=comp.arch#36531

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 02 Jan 2024 13:58:05 -0800
Organization: A noiseless patient Spider
Lines: 80
Message-ID: <861qazqtxu.fsf@linuxsc.com>
References: <ulv7j4$l0o0$1@dont-email.me> <e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <ums29d$1p6cc$1@dont-email.me> <8634vgrwc2.fsf@linuxsc.com> <un0hcp$2jca3$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="7861cd1ffbd714be6607f64f72508688";
logging-data="3005351"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/+cKb+Db+HKW73kSCFE25M5CNZvLRWehk="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:gMLRGaa03TZohwcl/MO17gZLLZk=
sha1:0AzK12liY/BqaesghY8CLCd2pnM=
 by: Tim Rentsch - Tue, 2 Jan 2024 21:58 UTC

Terje Mathisen <terje.mathisen@tmsw.no> writes:

> Tim Rentsch wrote:
>
>> Daniel James <daniel@me.invalid> writes:
>>
>>> On 31/12/2023 09:12, Pancho wrote:
>>>
>>>> In fact, given that fib grows exponentially, so we only ever need to
>>>> calculate it for small values of n, the only ?bad? solution is the
>>>> naive double recursive one.
>>>
>>> Well, I might disagree.
>>>
>>> Given that we only ever need to calculate it for small values of n (as
>>> you say, and if that's true) its efficiency doesn't matter. If you
>>> need to compute it often enough that efficiency might become relevant
>>> just put the results in a table and use that. Given that fib goes over
>>> 10^12 after about 60 terms it needn't be a very big table.
>>
>> When the size of the table is more than ten times the size of
>> the code, and the code is reasonably fast, using a table is a
>> poor choice. Fibonacci functions don't have to be slow.
>>
>>> The naive doubly recursive solution has the huge merit of being
>>> human-readable.
>>
>> But unacceptably slow.
>>
>>> Unlike, say:
>>>
>>> double fib( int n )
>>> {
>>> return (pow( (1+sqrt(5.0))/2, n+1 )
>>> - pow( (1-sqrt(5.0))/2, n+1 ))
>>> / sqrt(5.0);
>>> }
>>>
>>> ... which is great for fibs of big values, but not fast for small ones.
>>
>> Unacceptable for fibs of big values, because the values computed
>> are wrong.
>
> So either you move to quad or arbitrary precision (which will of
> course be somewhat slow,

To me, what makes floating point unattractive in this case is
great care must be taken to ensure correct values are produced.
Add to that the increased time cost, and using floating point
just looks like a lot more downside than upside.

> or you use a hybrid technique, with lookup
> tables for smallish n, iterative loop for somewhat larger and your
> O(log(n)) approach only for very large n?

Oh, you remembered I wrote an O(log(n)) fibonacci function. That
is nice. :)

Based on what I know now, my inclination is to use the constant-shift
technique (as seen in another post) for n < 8, the O(log(n)) method
for results that will be more than 64 bits, and a small-constant-linear
method in between those two regions. (Incidentally, the intermediate
method is something I had already written, and might be seen as a
generalization of your two-fold speedup code. You might want to take
a look at enhancing the two-fold speedup idea so it does at most one of
those, and then after that gets a four-fold speedup. The best
performing code I have for 64 bits goes two steps farther, so all the
iterations after the "startup transient" are 16-fold speedup.)

> Working with 256-bit uints, each addition will take 4-8 clock cycles,
> and it overflows well before n reaches 400, right? In that case a
> worst case iterative solution is less than 3000 clock cycles or about
> a microsecond away.

By test, my O(log(n)) code runs in 25-30 nanoseconds for 128-bit
results. At a guess, 256-bit results would run about 5 to 10
times slower than that.

By the way, your two-fold speedup code runs about 1.6 times as
fast as using simple iteration.

Re: AMD CPU funny

<cd3668bf81038e750419de48521e3ab9@news.novabbs.com>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36532&group=comp.arch#36532

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Tue, 2 Jan 2024 22:41:37 +0000
Organization: novaBBS
Message-ID: <cd3668bf81038e750419de48521e3ab9@news.novabbs.com>
References: <ulv7j4$l0o0$1@dont-email.me> <um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk> <3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com> <fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me> <um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me> <umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com> <umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net> <ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me> <bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com> <umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me> <86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de> <86y1d8qh1m.fsf@linuxsc.com> <un0hu1$2je9f$1@dont-email.me> <86plyjr2br.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2058141"; mail-complaints-to="usenet@i2pn2.org";
posting-account="t+lO0yBNO1zGxasPvGSZV1BRu71QKx+JE37DnW+83jQ";
User-Agent: Rocksolid Light
X-Spam-Checker-Version: SpamAssassin 4.0.0
X-Rslight-Site: $2y$10$cvogBY6qRaQMBs1ezZp42u0FE1oR7ibcFq0Qc76vE3HPDXOiY1tuu
X-Rslight-Posting-User: 7e9c45bcd6d4757c5904fbe9a694742e6f8aa949
 by: MitchAlsup - Tue, 2 Jan 2024 22:41 UTC

Tim Rentsch wrote:

> Terje Mathisen <terje.mathisen@tmsw.no> writes:

>>

>>> Incidentally your code produces wrong values, as
>>> fib(0) is 0 and fib(1) is 1.
>>
>> So return n for (n<2)?

> In situations like this one, usually I would favor something
> more like this:

> if( n < 8 ) return 0xd8532110u >> n*4 & 15;

Which is::

CMP R2,R1,#8
PLT R2,TTT
LEA R2,[IP,R1<<2,0xF00000000-.]
SLL R1,0xd853110,R2
RET

.....

Re: AMD CPU funny

<un3lv2$375f1$1@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36543&group=comp.arch#36543

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Wed, 3 Jan 2024 14:03:29 +0100
Organization: A noiseless patient Spider
Lines: 71
Message-ID: <un3lv2$375f1$1@dont-email.me>
References: <ulv7j4$l0o0$1@dont-email.me>
<hVh*RTmyz@news.chiark.greenend.org.uk> <um1h21$13l52$1@dont-email.me>
<e2bb329488167ada6ece600c8f28d3ed@news.novabbs.com>
<um43of$1j3jj$1@dont-email.me> <eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de>
<6b99684fe256d50fe3a78803200c438b@news.novabbs.com>
<un0ivj$1rg6n$1@newsreader4.netcologne.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 3 Jan 2024 13:03:30 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="d35fba0936548e882fb18cd1e333f31f";
logging-data="3380705"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/yeIXGpjBPFyHEVuEibIBwIYWMpWZisuQ/YSjcbWSgxw=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.18
Cancel-Lock: sha1:+J/TGIDRjtGIuaF6uxYj140TG2w=
In-Reply-To: <un0ivj$1rg6n$1@newsreader4.netcologne.de>
 by: Terje Mathisen - Wed, 3 Jan 2024 13:03 UTC

Thomas Koenig wrote:
> MitchAlsup <mitchalsup@aol.com> schrieb:
>> Thomas Koenig wrote:
>>
>>> #include <stdio.h>
>>
>>> unsigned long int fib (int n)
>>> {
>>> unsigned long f1, f2, fn;
>>
>>> if (n < 2)
>>> return 1;
>>
>>> f1 = 1;
>>> f2 = 1;
>>> do { // we know n >=2 so at least 1 path through the loop is performed.
>>> fn = f2 + f1;
>>> f1 = f2;
>>> f2 = fn;
>>> n--;
>>> } while( n > 2 );
>
> (This should be n >= 2)
>>> return fn;
>>> }
>>
>> Saves 1 cycle and and at least 1 instruction of code footprint.
>
> Fortunately, this is one of the transformations which compilers
> do automatically these days. For both versions, the code for
> fib generated by a recent gcc on x86_64 with -O3 is
>
> fib:
> .LFB11:
> .cfi_startproc
> movl $1, %edx
> cmpl $1, %edi
> jle .L1
> movl $1, %eax
> movl $1, %ecx
> jmp .L3
> .p2align 4,,10
> .p2align 3
> .L5:
> movq %rdx, %rax
> .L3:
> subl $1, %edi
> leaq (%rcx,%rax), %rdx
> movq %rax, %rcx
> cmpl $1, %edi
> jne .L5
> .L1:
> movq %rdx, %rax
> ret
>
> which looks pretty good.

OK at best:

The decrement of EDI should be paired with the JNE, saving a CMP, and
the two reg-reg moves could be avoided with a single level of unrolling.

OTOH, with 3-way superscalar processing and zero-cycle MOV, it should
still run at about two cycles/iteration, and a single cycle is a hard
limit due to the latency of the ADD (or in this case, LEA).

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: AMD CPU funny

<un3mfr$377uo$1@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36544&group=comp.arch#36544

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Wed, 3 Jan 2024 14:12:27 +0100
Organization: A noiseless patient Spider
Lines: 108
Message-ID: <un3mfr$377uo$1@dont-email.me>
References: <ulv7j4$l0o0$1@dont-email.me> <um43of$1j3jj$1@dont-email.me>
<eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de>
<86y1d8qh1m.fsf@linuxsc.com> <un0rpm$1rkc9$1@newsreader4.netcologne.de>
<86ttnvr2om.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 3 Jan 2024 13:12:27 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="d35fba0936548e882fb18cd1e333f31f";
logging-data="3383256"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/MnwcQ/UJXLe2jR0EjZLg3Et5+qxpoNw2rnRpPzgV5Tg=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.18
Cancel-Lock: sha1:ixJiF50yMQMHj6ALToZkaBh7bbw=
In-Reply-To: <86ttnvr2om.fsf@linuxsc.com>
 by: Terje Mathisen - Wed, 3 Jan 2024 13:12 UTC

Tim Rentsch wrote:
> Thomas Koenig <tkoenig@netcologne.de> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> schrieb:
>>
>>> Thomas Koenig <tkoenig@netcologne.de> writes:
>>>
>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> schrieb:
>>>>
>>>>> Pancho <Pancho.Jones@proton.me> writes:
>>>>>
>>>>>> On 30/12/2023 23:19, BGB wrote:
>>>>>>
>>>>>>> Granted, There are plenty of other much more efficient ways of
>>>>>>> calculating Fibonacci numbers...
>>>>>>
>>>>>> Yes. In order of effort.
>>>>>>
>>>>>> 1) Memoize the recursive function.
>>>>>> 2) Return values for (n-1,n-2) as a tuple, and make it single (tail)
>>>>>> recursive. Which a smart compiler could theoretically convert to an
>>>>>> iterative loop.
>>>>>> 3) Write an iterative loop yourself.
>>>>>> 4) Remember college maths, and solve the recurrence relation to give a
>>>>>> very simple closed form solution.
>>>>>
>>>>> All of these approaches have drawbacks of one sort or another.
>>>>>
>>>>> For (1), we end up writing more code and using more memory.
>>>>> Some problems are a good fit to memoization, but computing
>>>>> Fibonacci numbers isn't one of the.
>>>>>
>>>>> For (2), compilers are not so good at dealing with tail
>>>>> recursion when the return value is a pair of results
>>>>> rather than a single result.
>>>>>
>>>>> For (3), iterative code takes more mental effort to see
>>>>> that it works than does a simple recursive version.
>>>>
>>>> I don't think that
>>>>
>>>> #include <stdio.h>
>>>>
>>>> unsigned long int fib (int n)
>>>> {
>>>> unsigned long f1, f2, fn;
>>>>
>>>> if (n < 2)
>>>> return 1;
>>>>
>>>> f1 = 1;
>>>> f2 = 1;
>>>> while (n >= 2) {
>>>> fn = f2 + f1;
>>>> f1 = f2;
>>>> f2 = fn;
>>>> n--;
>>>> }
>>>> return fn;
>>>> }
>>>>
>>>> is particularly intellecually challenging.
>>>
>>> I'm sure there are other people who feel the same way.
>>>
>>> Even so, when a short, functional, recursive version
>>> is available, it is often easier to write, easier to
>>> read, and simpler to show that it produces correct
>>> results, in which case doing that is a better choice,
>>> even if an imperative version is not particularly
>>> intellecually challenging.
>>
>> Unless, of course, it runs in exponential time, as the
>> primitive version with the recurrence does.
>
> Sure. Implicit in my comment is that the alternatives being
> considered are algorithmically comparable. Otherwise it's
> comparing apples and oranges.
>
>> Compilers often do not do a good job of replacing recursion
>> with iteration (is there a verb for that? Maybe it can be called
>> iterationizing?), so it is something to consider if an algorithm
>> is at all time critical or stack space is problematic.
>
> Of course. I take it as given that recursive functions that use
> an actual recursive call are unacceptable, except perhaps in
> special cases, as for example quicksort. I tend to use functional

In my own quicksort implementation (from 3-4 decades ago) I used
recursion for the smaller partition and looping for the other/larger
half. That way stack usage/recursion depth was guaranteed to never
exceed log2(n/5). The /5 divisor was because I used a different sort
algorithm on partition sizes <= 5.

> recursion and mutual recursion a fair amount, but not in cases
> where an actual call is done in the recursive cycle.
>
> By the way a term that may satisfy your question is "tail call
> optimization".

As noted, that is always a good idea even when having two (or more!)
recursive calls in the source code.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: AMD CPU funny

<un3n40$37ahl$1@dont-email.me>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=36545&group=comp.arch#36545

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.mathisen@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: AMD CPU funny
Date: Wed, 3 Jan 2024 14:23:11 +0100
Organization: A noiseless patient Spider
Lines: 60
Message-ID: <un3n40$37ahl$1@dont-email.me>
References: <ulv7j4$l0o0$1@dont-email.me> <um43of$1j3jj$1@dont-email.me>
<eVh*Lrxyz@news.chiark.greenend.org.uk>
<3d2dcfa36ceca1ee856b9de3ab416a01@news.novabbs.com>
<fVh*Dgyyz@news.chiark.greenend.org.uk> <um7jom$27ikh$3@dont-email.me>
<um7o72$1bj9r$1@newsreader4.netcologne.de> <umc56l$32ucu$1@dont-email.me>
<umhbqp$3um9d$1@dont-email.me> <20231227183842.00001b4f@yahoo.com>
<umn15d$svun$6@dont-email.me> <kv8gt0Fca4nU6@mid.individual.net>
<ump7l3$19ud7$1@dont-email.me> <umpn9d$1c212$1@dont-email.me>
<bc7bd58cc9f25ca592dc5c62f415ceec@news.novabbs.com>
<umq8hi$1ebuo$1@dont-email.me> <umrb9a$1m3mj$1@dont-email.me>
<86cyuksoc0.fsf@linuxsc.com> <umvfc1$1qq5r$1@newsreader4.netcologne.de>
<86y1d8qh1m.fsf@linuxsc.com> <un0hu1$2je9f$1@dont-email.me>
<86plyjr2br.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 3 Jan 2024 13:23:12 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="d35fba0936548e882fb18cd1e333f31f";
logging-data="3385909"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18BUtzDkc9N4RIMiOfT8C+0K3LWVmN108DspMkEA7kT6w=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.18
Cancel-Lock: sha1:Jz4Nqv7c7iCVItC9TXH2vwDwsi4=
In-Reply-To: <86plyjr2br.fsf@linuxsc.com>
 by: Terje Mathisen - Wed, 3 Jan 2024 13:23 UTC

Tim Rentsch wrote:
> Terje Mathisen <terje.mathisen@tmsw.no> writes:
>
>> Tim Rentsch wrote:
>>
>>> Thomas Koenig <tkoenig@netcologne.de> writes:
> [...]
>
>>>> unsigned long int fib (int n)
>>>> {
>>>> unsigned long f1, f2, fn;
>>>>
>>>> if (n < 2)
>>>> return 1;
>>>>
>>>> f1 = 1;
>>>> f2 = 1;
>>>> while (n >= 2) {
>>>> fn = f2 + f1;
>>>> f1 = f2;
>>>> f2 = fn;
>>>> n--;
>>>> }
>>>> return fn;
>>>> }
>
> [...]
>
>>> Incidentally your code produces wrong values, as
>>> fib(0) is 0 and fib(1) is 1.
>>
>> So return n for (n<2)?
>
> In situations like this one, usually I would favor something
> more like this:
>
> if( n < 8 ) return 0xd8532110u >> n*4 & 15;

Nice!

Could be extended a little bit more with a lookup table, or to 16 values
with a SIMD in-register lookup operation. Using a 64-bit constant allows
11 values to fit inside 6-bit sub-containers, as long as they are stored
in reverse order so that the bottom value (0) can survive without the
top two bits.

Something like

if (n < 11) return ((0<<60)+(1<<54)+(1<<48)+(2<<42)...(21<<6)+34)>>
(60-n*6) & 63;

Anyway, this is code golfing and not a realistic scenario where you
would definitely use a direct lookup table for all the possible values.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"


devel / comp.arch / Re: AMD CPU funny

Pages:12345
server_pubkey.txt

rocksolid light 0.9.81
clearnet tor