Apple AMX instructions

<U6PYM.160362$rbid.40431@fx18.iad>

https://news.novabbs.org/devel/article-flat.php?id=34608&group=comp.arch#34608

Path: i2pn2.org!i2pn.org!news.swapon.de!newsreader4.netcologne.de!news.netcologne.de!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx18.iad.POSTED!not-for-mail
Newsgroups: comp.arch
From: branimir.maksimovic@icloud.com (Branimir Maksimovic)
Subject: Apple AMX instructions
User-Agent: slrn/1.0.3 (Darwin)
Lines: 22
Message-ID: <U6PYM.160362$rbid.40431@fx18.iad>
X-Complaints-To: abuse@usenet-news.net
NNTP-Posting-Date: Sat, 21 Oct 2023 12:04:04 UTC
Organization: usenet-news.net
Date: Sat, 21 Oct 2023 12:04:04 GMT
X-Received-Bytes: 1433
 by: Branimir Maksimovic - Sat, 21 Oct 2023 12:04 UTC

A hacker reverse-engineered Apple's libraries and, by trial and error,
figured out the instructions.
Latency is 1.4 ns per instruction, moving between registers
is 0.35 ns, a 512-bit load is 3.5 ns, and a 512-bit store
is 26 ns. Loads and stores of 1024 bits require 128-byte
alignment, so it's not that practical.
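
To make the 128-byte alignment requirement concrete, here is a minimal C sketch of one way to obtain suitably aligned buffers. It uses only standard C11 (aligned_alloc) and does not execute any AMX instruction. The names PAIR_BYTES, is_aligned_128 and alloc_pairs are made up for illustration and do not come from the linked repositories.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* 1024 bits = 128 bytes, the alignment quoted in the post. */
#define PAIR_BYTES 128u

/* Returns 1 when p sits on a 128-byte boundary, 0 otherwise. */
static int is_aligned_128(const void *p) {
    return ((uintptr_t)p % PAIR_BYTES) == 0;
}

/* Allocates `pairs` zeroed 128-byte blocks on a 128-byte boundary.
 * aligned_alloc is C11; the size passed is always a multiple of the
 * alignment. Returns NULL on failure or when pairs is 0.
 * Release the result with free(). */
static void *alloc_pairs(size_t pairs) {
    if (pairs == 0) {
        return NULL;
    }
    void *p = aligned_alloc(PAIR_BYTES, pairs * PAIR_BYTES);
    if (p == NULL) {
        return NULL;
    }
    memset(p, 0, pairs * PAIR_BYTES);
    assert(is_aligned_128(p));
    return p;
}
```

A pointer halfway into such a block (offset 64) is still fine for a 512-bit access but fails the 128-byte check, which is the inconvenience the post is pointing at.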
Documentation is here:
https://github.com/corsix/amx/blob/main/Instructions.md
My implementation is here: https://github.com/bmaxa/AppleAmx/tree/main
Matrix operations: https://github.com/bmaxa/matrix
You have 8 X registers as rows, 8 Y registers as columns,
and 64 Z registers for the result. Each is 8*64 bits (64 bytes).
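
As a rough picture of that register file, the following C sketch models it in plain software. It assumes the rows/columns description means an outer product (an X register as a row vector, a Y register as a column vector) and assumes 32-bit float lanes, so 16 lanes per 64-byte register. All type and function names are hypothetical, and how a 16x16 result maps onto specific Z registers is deliberately not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REG_BYTES 64   /* 8 * 64 bits = 512 bits per register */
#define F32_LANES 16   /* assumption: 32-bit floats, 64 / 4 lanes */

/* One 512-bit register, viewed as raw bytes. */
typedef struct { uint8_t bytes[REG_BYTES]; } amx_reg;

/* Software model of the register file described in the post. */
typedef struct {
    amx_reg x[8];    /* 8 X registers  */
    amx_reg y[8];    /* 8 Y registers  */
    amx_reg z[64];   /* 64 Z registers */
} amx_state_model;

/* 512 + 512 + 4096 bytes in total. */
_Static_assert(sizeof(amx_state_model) == 5120, "unexpected padding");

/* Accumulating outer product: acc[j][i] += x[i] * y[j].
 * x plays the role of a row vector, y of a column vector.
 * The 16x16 float result occupies 1024 bytes, a quarter of Z. */
static void outer_product_f32(float acc[F32_LANES][F32_LANES],
                              const float x[F32_LANES],
                              const float y[F32_LANES]) {
    for (size_t j = 0; j < F32_LANES; j++) {
        for (size_t i = 0; i < F32_LANES; i++) {
            acc[j][i] += x[i] * y[j];
        }
    }
}
```

Calling outer_product_f32 repeatedly on successive row and column slices accumulates a matrix product tile, which is the usual reason to expose an outer-product unit.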
I invented the instruction names, so look at the implementation
and documentation to figure them out...
Intel added instructions of this kind in January 2023, on Xeons only,
but Apple added them in 2020 on every one of its ARM processors...

--

7-77-777, Evil Sinner!
https://www.linkedin.com/in/branimir-maksimovic-6762bbaa/

Re: Apple AMX instructions

<vmRYM.57679$MJ59.15327@fx10.iad>

https://news.novabbs.org/devel/article-flat.php?id=34609&group=comp.arch#34609

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!npeer.as286.net!npeer-ng0.as286.net!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx10.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Apple AMX instructions
Newsgroups: comp.arch
References: <U6PYM.160362$rbid.40431@fx18.iad>
Lines: 21
Message-ID: <vmRYM.57679$MJ59.15327@fx10.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Sat, 21 Oct 2023 14:37:15 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Sat, 21 Oct 2023 14:37:15 GMT
X-Received-Bytes: 1671
 by: Scott Lurndal - Sat, 21 Oct 2023 14:37 UTC

Branimir Maksimovic <branimir.maksimovic@icloud.com> writes:
>A hacker reverse-engineered Apple's libraries and, by trial and error,
>figured out the instructions.
>Latency is 1.4 ns per instruction, moving between registers
>is 0.35 ns, a 512-bit load is 3.5 ns, and a 512-bit store
>is 26 ns. Loads and stores of 1024 bits require 128-byte
>alignment, so it's not that practical.
>Documentation is here:
>https://github.com/corsix/amx/blob/main/Instructions.md
>My implementation is here: https://github.com/bmaxa/AppleAmx/tree/main
>Matrix operations: https://github.com/bmaxa/matrix
>You have 8 X registers as rows, 8 Y registers as columns,
>and 64 Z registers for the result. Each is 8*64 bits (64 bytes).
>I invented the instruction names, so look at the implementation
>and documentation to figure them out...
>Intel added instructions of this kind in January 2023, on Xeons only,
>but Apple added them in 2020 on every one of its ARM processors...

I wonder if they'll switch to SME in a future processor.

https://developer.arm.com/documentation/ddi0616/latest/

Re: Apple AMX instructions

<qZRYM.17022$rEF.5920@fx47.iad>

https://news.novabbs.org/devel/article-flat.php?id=34610&group=comp.arch#34610

Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx47.iad.POSTED!not-for-mail
Newsgroups: comp.arch
From: branimir.maksimovic@icloud.com (Branimir Maksimovic)
Subject: Re: Apple AMX instructions
References: <U6PYM.160362$rbid.40431@fx18.iad>
<vmRYM.57679$MJ59.15327@fx10.iad>
User-Agent: slrn/1.0.3 (Darwin)
Lines: 27
Message-ID: <qZRYM.17022$rEF.5920@fx47.iad>
X-Complaints-To: abuse@usenet-news.net
NNTP-Posting-Date: Sat, 21 Oct 2023 15:18:46 UTC
Organization: usenet-news.net
Date: Sat, 21 Oct 2023 15:18:46 GMT
X-Received-Bytes: 1885
 by: Branimir Maksimovic - Sat, 21 Oct 2023 15:18 UTC

On 2023-10-21, Scott Lurndal <scott@slp53.sl.home> wrote:
> Branimir Maksimovic <branimir.maksimovic@icloud.com> writes:
>>A hacker reverse-engineered Apple's libraries and, by trial and error, figured
>>out the instructions. Latency is 1.4 ns per instruction, moving between
>>registers is 0.35 ns, a 512-bit load is 3.5 ns, and a 512-bit store is 26 ns.
>>Loads and stores of 1024 bits require 128-byte alignment, so it's not that
>>practical. Documentation is here:
>>https://github.com/corsix/amx/blob/main/Instructions.md My implementation
>>is here: https://github.com/bmaxa/AppleAmx/tree/main Matrix operations:
>>https://github.com/bmaxa/matrix You have 8 X registers as rows, 8 Y
>>registers as columns, and 64 Z registers for the result. Each is 8*64 bits
>>(64 bytes). I invented the instruction names, so look at the implementation
>>and documentation to figure them out... Intel added instructions of this kind
>>in January 2023, on Xeons only, but Apple added them in 2020 on every one of
>>its ARM processors...
>
> I wonder if they'll switch to SME in a future processor.
>
> https://developer.arm.com/documentation/ddi0616/latest/
I guess so, when they make the transition to ARMv9.
These M1/M2 chips are proper ARMv8. AMX has neither compiler
nor assembler support, so they are free to make changes as
they wish...
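
Without assembler support, code that wants such an instruction has to build the 32-bit instruction word itself and hand it to the toolchain as raw data. The C sketch below shows only that packing step. The base pattern and the two 5-bit fields are an assumption drawn from the corsix notes linked earlier in the thread and should be checked there; the names AMX_BASE_WORD and amx_encode are hypothetical. Nothing here executes an AMX instruction.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: base bit pattern and field layout as understood from the
 * linked corsix/amx notes; verify against that documentation. */
#define AMX_BASE_WORD 0x00201000u

/* Packs a 5-bit opcode and a 5-bit operand into one 32-bit word:
 * bits 5..9 hold the opcode, bits 0..4 hold the operand. */
static uint32_t amx_encode(uint32_t op, uint32_t operand) {
    assert(op < 32 && operand < 32);
    return AMX_BASE_WORD | (op << 5) | operand;
}
```

A word built this way would typically be emitted through the assembler's .word directive in inline assembly. Since the encoding is undocumented, the same bits may mean something else or trap on other hardware, which is the freedom to change mentioned above.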

--

7-77-777, Evil Sinner!
https://www.linkedin.com/in/branimir-maksimovic-6762bbaa/
