What did it cost the 8086 to support unaligned access?

<b2711000-2ce5-4c51-b44d-665e97a4c488n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33023&group=comp.arch#33023

 by: Russell Wallace - Wed, 5 Jul 2023 20:14 UTC

The Intel 8086 supported unaligned loads and stores of 16-bit data, e.g. mov ax, foo was guaranteed to work even if foo was odd.

What did this cost, in terms of performance and chip area, compared to an alternative architecture that would have been the same except for unaligned access being a trap or undefined behavior?

To be clear, I'm not talking about the dynamic behavior of code. On the actual 8086, access was still faster if the pointer did happen to be even. I'm asking, suppose all your pointers for word access were actually even, how much bigger and slower was the chip made by having to support the possibility that some of them could have been odd?

My first thought is that obviously a load/store circuit that doesn't need to take into account the possibility of unaligned access (and be ready to do the complicated fallback routine of two accesses and splicing the parts together) must clearly be simpler, therefore smaller and faster, than one that does need to take into account this possibility.

On the other hand, the instruction decoder needs to support unaligned access anyway.

On the third hand, that might not be relevant here; the instruction decoder is probably a completely separate circuit that doesn't share parts with the data load/store circuitry.

Which points back to the conclusion that the chip could have been smaller and faster if it didn't support unaligned access.

Then again, maybe the test for the fallback case wasn't a bottleneck in cycle time? In that case, maybe it only cost chip area?

Or maybe it was an extra microcode stage? In that case, every aligned access would have been at least one full clock cycle slower, just to support the possibility of unaligned access?
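
To make the "two accesses and splicing" fallback concrete, here is a minimal software sketch in C. It is only a model under stated assumptions: a 64 KiB memory behind a 16-bit data bus where one bus cycle fetches one even-aligned word. The names mem, bus_cycles, bus_read_word and load16 are hypothetical, and nothing here claims to describe the real 8086 circuitry or microcode.

```c
#include <stdint.h>

/* Hypothetical model: 64 KiB of memory behind a 16-bit data bus. */
uint8_t mem[65536];
unsigned bus_cycles;          /* running count of bus cycles used */

/* One bus cycle: fetch the even-aligned 16-bit word containing addr. */
uint16_t bus_read_word(uint16_t addr)
{
    uint16_t even = addr & 0xFFFEu;
    bus_cycles++;
    return (uint16_t)(mem[even] | (mem[(uint16_t)(even + 1)] << 8));
}

/* Load a little-endian 16-bit value from any address. */
uint16_t load16(uint16_t addr)
{
    if ((addr & 1u) == 0)     /* aligned: a single bus cycle is enough */
        return bus_read_word(addr);

    /* Misaligned: the low byte is the upper half of the first word,
       the high byte is the lower half of the following word. */
    uint16_t lo = (uint16_t)(bus_read_word(addr) >> 8);
    uint16_t hi = (uint16_t)(bus_read_word((uint16_t)(addr + 1)) & 0xFFu);
    return (uint16_t)(lo | (hi << 8));
}
```

In this model the aligned path pays only for the test of the lowest address bit; the second access and the byte splice are needed on the odd path alone, which is the distinction the question is asking about.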

Re: What did it cost the 8086 to support unaligned access?

<u84l35$1aq4$1@gal.iecc.com>

https://news.novabbs.org/devel/article-flat.php?id=33024&group=comp.arch#33024

 by: John Levine - Wed, 5 Jul 2023 20:50 UTC

According to Russell Wallace <russell.wallace@gmail.com>:
>The Intel 8086 supported unaligned loads and stores of 16-bit data, e.g. mov ax, foo was guaranteed to work even if foo was odd.
>
>What did this cost, in terms of performance and chip area, compared to an alternative architecture that would have been the same except for unaligned
>access being a trap or undefined behavior?

The 8086 was generally constrained by memory speed, so it cost little if anything.

A few years later the 486 and RISC-ish i860 were implemented with
similar technology and the 860 was a lot faster, maybe a factor of
two. But I think that was because it was easier to decode the instructions
and pipeline them, plus some odd hacks to help pipeline repetitive
floating point operations, not anything about aligned memory.

The original S/360 required everything to be aligned, while S/370
allowed all data to be misaligned. IBM did a great deal of simulation
before making architecture choices so I assume they had good reason to
believe it would be an improvement.

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: What did it cost the 8086 to support unaligned access?

<ceded677-28df-4b71-84b8-318bc270740cn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33026&group=comp.arch#33026

 by: MitchAlsup - Wed, 5 Jul 2023 20:57 UTC

On Wednesday, July 5, 2023 at 3:14:05 PM UTC-5, Russell Wallace wrote:
> The Intel 8086 supported unaligned loads and stores of 16-bit data, e.g. mov ax, foo was guaranteed to work even if foo was odd.
>
> What did this cost, in terms of performance and chip area, compared to an alternative architecture that would have been the same except for unaligned access being a trap or undefined behavior?
>
> To be clear, I'm not talking about the dynamic behavior of code. On the actual 8086, access was still faster if the pointer did happen to be even. I'm asking, suppose all your pointers for word access were actually even, how much bigger and slower was the chip made by having to support the possibility that some of them could have been odd?
>
> My first thought is that obviously a load/store circuit that doesn't need to take into account the possibility of unaligned access (and be ready to do the complicated fallback routine of two accesses and splicing the parts together) must clearly be simpler, therefore smaller and faster, than one that does need to take into account this possibility.
>
> On the other hand, the instruction decoder needs to support unaligned access anyway.
>
> On the third hand, that might not be relevant here; the instruction decoder is probably a completely separate circuit that doesn't share parts with the data load/store circuitry.
<
Decoder was byte by byte.
>
> Which points back to the conclusion that the chip could have been smaller and faster if it didn't support unaligned access.
<
It would have been significantly smaller without segmentation but with misaligned support, too.
<
Since it was basically running at bus performance, faster remains in question.
>
> Then again, maybe the test for the fallback case wasn't a bottleneck in cycle time? In that case, maybe it only cost chip area?
>
> Or maybe it was an extra microcode stage? In that case, every aligned access would have been at least one full clock cycle slower, just to support the possibility of unaligned access?
<
Probably something close to 5 gates (to recognize the misalignment), 2 words of microcode
(special sequence, few bits set), and one 8-bit register (data feed-around). And maybe one
16-bit 2:1 multiplexer (byte swap).
<
OR less than one segmentation register !!
<
The 8086 had ~30,000 transistors; in 3 µm NMOS I could build the above in ~145 transistors.
Today, in a gate-only design, it would take ~450 transistors.
Or about ½%.
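
As a rough check on the arithmetic in the estimate above, the following C sketch computes each transistor count as a share of the quoted ~30,000 total. The constants are simply the figures from the post; the function names share_percent and print_shares are hypothetical.

```c
#include <stdio.h>

/* Figures quoted in the post; all of them are rough estimates. */
#define TOTAL_TRANSISTORS   30000.0
#define NMOS_ESTIMATE         145.0
#define GATE_ONLY_ESTIMATE    450.0

/* Share of the whole chip, in percent. */
double share_percent(double transistors)
{
    return 100.0 * transistors / TOTAL_TRANSISTORS;
}

/* Print both shares with two decimal places. */
void print_shares(void)
{
    printf("NMOS estimate:      %.2f%%\n", share_percent(NMOS_ESTIMATE));
    printf("gate-only estimate: %.2f%%\n", share_percent(GATE_ONLY_ESTIMATE));
}
```

By this calculation the ½% figure corresponds to the 145-transistor NMOS estimate; 450 transistors would be 1.5% of the same 30,000-transistor budget.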

Re: What did it cost the 8086 to support unaligned access?

<4733a536-4669-4930-931f-9b58c644b9b4n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33027&group=comp.arch#33027

 by: MitchAlsup - Wed, 5 Jul 2023 20:58 UTC

On Wednesday, July 5, 2023 at 3:53:12 PM UTC-5, John Levine wrote:
> According to Russell Wallace <russell...@gmail.com>:
> >The Intel 8086 supported unaligned loads and stores of 16-bit data, e.g. mov ax, foo was guaranteed to work even if foo was odd.
> >
> >What did this cost, in terms of performance and chip area, compared to an alternative architecture that would have been the same except for unaligned
> >access being a trap or undefined behavior?
> The 8086 was generally constrained by memory speed, so it cost little if anything.
>
> A few years later the 486 and RISC-ish i860 were implemented with
> similar technology and the 860 was a lot faster, maybe a factor of
> two. But I think that was because it was easier to decode the instructions
> and pipeline them, not anything about aligned memory and some odd hacks to
> help pipeline repetitive floating point operations.
>
> The original S/360 required everything to be aligned, while S/370
> allowed all data to be misaligned. IBM did a great deal of simulation
> before making architecture choices so I assume they had good reason to
> believe it would be an improvement.
<
FORTRAN common blocks required misaligned DP FP accesses.
>
> --
> Regards,
> John Levine, jo...@taugh.com, Primary Perpetrator of "The Internet for Dummies",
> Please consider the environment before reading this e-mail. https://jl.ly

Re: What did it cost the 8086 to support unaligned access?

<u84lqq$6ioc$2@newsreader4.netcologne.de>

https://news.novabbs.org/devel/article-flat.php?id=33028&group=comp.arch#33028

 by: Thomas Koenig - Wed, 5 Jul 2023 21:03 UTC

John Levine <johnl@taugh.com> schrieb:
> According to Russell Wallace <russell.wallace@gmail.com>:
>>The Intel 8086 supported unaligned loads and stores of 16-bit data, e.g. mov ax, foo was guaranteed to work even if foo was odd.
>>
>>What did this cost, in terms of performance and chip area, compared to an alternative architecture that would have been the same except for unaligned
>>access being a trap or undefined behavior?
>
> The 8086 was generally constrained by memory speed, so it cost little if anything.
>
> A few years later the 486 and RISC-ish i860 were implemented with
> similar technology and the 860 was a lot faster, maybe a factor of
> two. But I think that was because it was easier to decode the instructions
> and pipeline them, not anything about aligned memory and some odd hacks to
> help pipeline repetitive floating point operations.
>
> The original S/360 required everything to be aligned, while S/370
> allowed all data to be misaligned. IBM did a great deal of simulation
> before making architecture choices so I assume they had good reason to
> believe it would be an improvement.

And then the 801 people came along and found out that compilers rarely,
if ever, issued misaligned loads and stores, and the RISC people turned
back the clock to aligned. High-end RISC systems generally support it
by now (even POWER), low-end microcontrollers may not, and RISC-V is
RISC-V because it may even be trap/emulate there.
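
For readers wondering what software does when the target may trap on a misaligned access: one common portable idiom in C is to copy the bytes with memcpy instead of dereferencing a cast pointer. The sketch below is illustrative only; the function name load_u32_unaligned is hypothetical, and whether a given compiler turns the copy into a single load depends on the target and optimization settings.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value, in host byte order, from a pointer that may be
   misaligned. memcpy has no alignment requirement on its arguments, so
   this is well defined even where a plain *(uint32_t *)p would not be. */
uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

A caller would pass any byte address, for example a field at an odd offset inside a packed network buffer.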

Re: What did it cost the 8086 to support unaligned access?

<84e73677-1edb-487f-8756-d9ee65c57999n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33030&group=comp.arch#33030

 by: MitchAlsup - Wed, 5 Jul 2023 21:06 UTC

On Wednesday, July 5, 2023 at 4:03:26 PM UTC-5, Thomas Koenig wrote:
> John Levine <jo...@taugh.com> schrieb:
> > According to Russell Wallace <russell...@gmail.com>:
> >>The Intel 8086 supported unaligned loads and stores of 16-bit data, e.g. mov ax, foo was guaranteed to work even if foo was odd.
> >>
> >>What did this cost, in terms of performance and chip area, compared to an alternative architecture that would have been the same except for unaligned
> >>access being a trap or undefined behavior?
> >
> > The 8086 was generally constrained by memory speed, so it cost little if anything.
> >
> > A few years later the 486 and RISC-ish i860 were implemented with
> > similar technology and the 860 was a lot faster, maybe a factor of
> > two. But I think that was because it was easier to decode the instructions
> > and pipeline them, not anything about aligned memory and some odd hacks to
> > help pipeline repetitive floating point operations.
> >
> > The original S/360 required everything to be aligned, while S/370
> > allowed all data to be misaligned. IBM did a great deal of simulation
> > before making architecture choices so I assume they had good reason to
> > believe it would be an improvement.
> And then the 801 people came along and found out that compilers rarely,
> if ever, issued misaligned loads and stores, and the RISC people turned
> back the clock to aligned. High-end RISC systems generally support it
> by now (even POWER), low-end microcontrollers may not, and RISC-V is
> RISC-V because it may even be trap/emulate there.
<
How much consternation in SW is worth ½% chip area ???

Re: What did it cost the 8086 to support unaligned access?

<2c9cf050-77f1-49fb-b53c-662712725675n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33031&group=comp.arch#33031

 by: Michael S - Wed, 5 Jul 2023 21:22 UTC

On Thursday, July 6, 2023 at 12:03:26 AM UTC+3, Thomas Koenig wrote:
> John Levine <jo...@taugh.com> schrieb:
> > According to Russell Wallace <russell...@gmail.com>:
> >>The Intel 8086 supported unaligned loads and stores of 16-bit data, e.g. mov ax, foo was guaranteed to work even if foo was odd.
> >>
> >>What did this cost, in terms of performance and chip area, compared to an alternative architecture that would have been the same except for unaligned
> >>access being a trap or undefined behavior?
> >
> > The 8086 was generally constrained by memory speed, so it cost little if anything.
> >
> > A few years later the 486 and RISC-ish i860 were implemented with
> > similar technology and the 860 was a lot faster, maybe a factor of
> > two. But I think that was because it was easier to decode the instructions
> > and pipeline them, not anything about aligned memory and some odd hacks to
> > help pipeline repetitive floating point operations.
> >
> > The original S/360 required everything to be aligned, while S/370
> > allowed all data to be misaligned. IBM did a great deal of simulation
> > before making architecture choices so I assume they had good reason to
> > believe it would be an improvement.
> And then the 801 people came along and found out that compilers rarely,
> if ever, issued misaligned loads and stores, and the RISC people turned
> back the clock to aligned. High-end RISC systems generally support it
> by now (even POWER), low-end microcontrollers may not,

The world's leading 32-bit microcontroller architecture, i.e. Arm Cortex-M, supports
unaligned accesses.

> and RISC-V is
> RISC-V because it may even be trap/emulate there.

Re: What did it cost the 8086 to support unaligned access?

<u84n4f$1ru4$1@gal.iecc.com>

https://news.novabbs.org/devel/article-flat.php?id=33032&group=comp.arch#33032

 by: John Levine - Wed, 5 Jul 2023 21:25 UTC

According to MitchAlsup <MitchAlsup@aol.com>:
>> The original S/360 required everything to be aligned, while S/370
>> allowed all data to be misaligned. IBM did a great deal of simulation
>> before making architecture choices so I assume they had good reason to
>> believe it would be an improvement.
><
>FORTRAN common blocks required misaligned DP FP accesses.

To some extent. You could indeed use COMMON and EQUIVALENCE statements
to misalign data. The Fortran library normally caught the misaligned
data exception and simulated the failed instruction. We all knew how
slow that was so if we cared how fast our programs ran, which we usually
did, we made sure to align correctly, which was not hard.
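
To illustrate the kind of fixup described above, here is a minimal sketch in C of what "simulating the failed instruction" amounts to for an 8-byte load: rebuild the operand one byte at a time, so that no access needs any alignment. This is not IBM's library code; the function name fixup_load64_be and its parameters are hypothetical, and the only assumption carried over from the machine is big-endian byte order.

```c
#include <stdint.h>

/* Hypothetical fixup routine: rebuild the 8-byte big-endian operand
   that a misaligned load would have fetched, one byte at a time.
   mem is the simulated storage, addr the (possibly odd) operand address. */
uint64_t fixup_load64_be(const uint8_t *mem, uint32_t addr)
{
    uint64_t raw = 0;
    for (int i = 0; i < 8; i++)
        raw = (raw << 8) | mem[addr + i];   /* most significant byte first */
    return raw;                             /* raw bit pattern of the operand */
}
```

Eight byte accesses plus the trap entry and exit are far more work than one aligned doubleword load, which is consistent with the advice to align the data in the first place.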

Then the 360/91 came along, with imprecise interrupts so the fixups
didn't work. Oops. By 1968 the 360/195 had the Byte Oriented Operand
feature that handled misaligned data in hardware.

The motivation wasn't performance, it was that there were programs
that worked on slower machines and failed on the /91. That was pretty
embarrassing, to be sure, not the only embarrassing thing about the /91.

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: What did it cost the 8086 to support unaligned access?

<f787d433-f871-4db2-a07d-716b62f7f1a6n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33033&group=comp.arch#33033

 by: Michael S - Wed, 5 Jul 2023 21:28 UTC

On Wednesday, July 5, 2023 at 11:53:12 PM UTC+3, John Levine wrote:
>
> and the 860 was a lot faster, maybe a factor of two.

I have heard widely varying accounts of it, including claims that on the
majority of real-world code the i860 was much slower than both the i486 and
all but the lowest-end members of the i960 family.
In practice, the i860 was quickly relegated to the role of a floating-point
DSP, so very few people had first-hand experience of using it for anything
general-purpose.

Re: What did it cost the 8086 to support unaligned access?

<u84oj3$26is$1@gal.iecc.com>

https://news.novabbs.org/devel/article-flat.php?id=33034&group=comp.arch#33034

 by: John Levine - Wed, 5 Jul 2023 21:50 UTC

According to Michael S <already5chosen@yahoo.com>:
>On Wednesday, July 5, 2023 at 11:53:12 PM UTC+3, John Levine wrote:
>>
>> and the 860 was a lot faster, maybe a factor of two.
>
>I have heard widely varying accounts of it, including claims that on the majority
>of real-world code the i860 was much slower than both the i486 and all
>but the lowest-end members of the i960 family.
>In practice, the i860 was quickly relegated to the role of a floating-point DSP, so
>very few people had first-hand experience of using it for anything
>general-purpose.

I can believe it. Apparently the 860 was originally intended as an
embedded chip and changed at the last minute to a general-purpose RISC,
while the 960 went the other way. System programming on the 860 was a
nightmare; saving process state, context switching, and the like were
nearly impossible.

Also, it was hard to generate good code for the 860 because of that
floating point pipeline thing, so it makes sense to use it for
hand-optimized DSP code.

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: What did it cost the 8086 to support unaligned access?

<u84un0$lc5h$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=33035&group=comp.arch#33035

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: What did it cost the 8086 to support unaligned access?
Date: Wed, 5 Jul 2023 18:34:53 -0500
Organization: A noiseless patient Spider
Lines: 70
Message-ID: <u84un0$lc5h$1@dont-email.me>
References: <b2711000-2ce5-4c51-b44d-665e97a4c488n@googlegroups.com>
<u84l35$1aq4$1@gal.iecc.com> <u84lqq$6ioc$2@newsreader4.netcologne.de>
<2c9cf050-77f1-49fb-b53c-662712725675n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 5 Jul 2023 23:34:56 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="2c0eaf010aa842d93947785f8584e017";
logging-data="700593"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/iP/sd4pzCat1BjbpSZaA4"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.12.0
Cancel-Lock: sha1:3+79v5SF/VzJMWHY8YKQiZHoAS8=
In-Reply-To: <2c9cf050-77f1-49fb-b53c-662712725675n@googlegroups.com>
Content-Language: en-US
 by: BGB - Wed, 5 Jul 2023 23:34 UTC

On 7/5/2023 4:22 PM, Michael S wrote:
> On Thursday, July 6, 2023 at 12:03:26 AM UTC+3, Thomas Koenig wrote:
>> John Levine <jo...@taugh.com> schrieb:
>>> According to Russell Wallace <russell...@gmail.com>:
>>>> The Intel 8086 supported unaligned loads and stores of 16-bit data, e.g. mov ax, foo was guaranteed to work even if foo was odd.
>>>>
>>>> What did this cost, in terms of performance and chip area, compared to an alternative architecture that would have been the same except for unaligned
>>>> access being a trap or undefined behavior?
>>>
>>> The 8086 was generally constrained by memory speed, so it cost little if anything.
>>>
>>> A few years later the 486 and RISC-ish i860 were implemented with
>>> similar technology and the 860 was a lot faster, maybe a factor of
>>> two. But I think that was because it was easier to decode the instructions
>>> and pipeline them, not anything about aligned memory and some odd hacks to
>>> help pipeline repetitive floating point operations.
>>>
>>> The original S/360 required everything to be aligned, while S/370
>>> allowed all data to be misaligned. IBM did a great deal of simulation
>>> before making architecture choices so I assume they had good reason to
>>> believe it would be an improvement.
>> And then the 801 people came along and found out that compilers rarely,
>> if ever, issued misaligned loads and stores, and the RISC people turned
>> back the clock to aligned. High-end RISC systems generally support it
>> by now (even POWER), low-end microcontrollers may not,
>
> World leading 32-bit microcontroller architecture, i.e. Arm Cortex-M, supports
> unaligned accesses.
>

Yeah.

Apart from "smallest core possible", allowing for unaligned access makes
sense.

But, if one is going to skip support for misaligned access, it may also
make sense to skip things like shift and multiply, since, after
all, one can fake these in software without "too much" cost if one
actually needs the smallest core possible.

But, by the time a core is big enough to support things like a floating
point unit or virtual memory, lacking things like misaligned access
(or indexed load/store, for that matter) is kind of a design fail IMO.

That said, misaligned access is still uncommon for normal code, and the
main "selling point" for misaligned access is mostly that it simplifies
things like making "memcpy()" moderately fast (and other related tasks,
like LZ77 decompression, ...).

>> and RISC-V is
>> RISC-V because it may even be trap/emulate there.

Trap emulation is a possible tradeoff.

Main drawback is that it is basically the slowest option possible
(within reason).

As I see it, RISC-V is a little unbalanced, as it adds
many expensive/advanced features without addressing some code
deficiencies that are likely to have a detrimental effect on performance
(and in some cases seems to sway more towards ideology than towards
optimizing for the cost/benefit tradeoffs).

Then again, I guess it is possible that someone could disagree with me
in terms of where the cost/benefit tradeoffs lead to.

Re: What did it cost the 8086 to support unaligned access?

<1jnpM.980$8Ma1.956@fx37.iad>

https://news.novabbs.org/devel/article-flat.php?id=33036&group=comp.arch#33036

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!2.eu.feeder.erje.net!feeder.erje.net!feeder1.feed.usenet.farm!feed.usenet.farm!peer01.ams4!peer.am4.highwinds-media.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx37.iad.POSTED!not-for-mail
From: ThatWouldBeTelling@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: What did it cost the 8086 to support unaligned access?
References: <b2711000-2ce5-4c51-b44d-665e97a4c488n@googlegroups.com> <u84l35$1aq4$1@gal.iecc.com> <u84lqq$6ioc$2@newsreader4.netcologne.de>
In-Reply-To: <u84lqq$6ioc$2@newsreader4.netcologne.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 41
Message-ID: <1jnpM.980$8Ma1.956@fx37.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Wed, 05 Jul 2023 23:48:13 UTC
Date: Wed, 05 Jul 2023 19:47:33 -0400
X-Received-Bytes: 3205
 by: EricP - Wed, 5 Jul 2023 23:47 UTC

Thomas Koenig wrote:
> John Levine <johnl@taugh.com> schrieb:
>> According to Russell Wallace <russell.wallace@gmail.com>:
>>> The Intel 8086 supported unaligned loads and stores of 16-bit data, e.g. mov ax, foo was guaranteed to work even if foo was odd.
>>>
>>> What did this cost, in terms of performance and chip area, compared to an alternative architecture that would have been the same except for unaligned
>>> access being a trap or undefined behavior?
>> The 8086 was generally constrained by memory speed, so it cost little if anything.
>>
>> A few years later the 486 and RISC-ish i860 were implemented with
>> similar technology and the 860 was a lot faster, maybe a factor of
>> two. But I think that was because it was easier to decode the instructions
>> and pipeline them, not anything about aligned memory and some odd hacks to
>> help pipeline repetitive floating point operations.
>>
>> The original S/360 required everything to be aligned, while S/370
>> allowed all data to be misaligned. IBM did a great deal of simulation
>> before making architecture choices so I assume they had good reason to
>> believe it would be an improvement.
>
> And then the 801 people came along and found out that compilers rarely,
> if ever, issued misaligned loads and stores, and the RISC people turned
> back the clock to aligned. High-end RISC systems generally support it
> by now (even POWER), low-end microcontrollers may not, and RISC-V is
> RISC-V because it may even be trap/emulate there.

And then the Stanford MIPS came along, believed this, eliminated unaligned
loads and stores, and found out that while compilers may not issue these,
humans do it *a lot*. Faced with the prospect of rewriting lots of code
to suit their processor, they chose to add it back in for their first
commercial version, the MIPS R2000.

And then the Alpha 21064 came along, believed this, eliminated unaligned
loads and stores, and found out humans do this *a lot*.
Then DEC lied it caused problems, blamed the humans, claimed the code was
broken to begin with (it wasn't), quietly published manuals on how to
rewrite code to suit their processor, and finally added it back in again
claiming it was to support Windows. By which time they had, in my opinion,
driven away Alpha's market due to incompatibility and it never recovered.

Re: What did it cost the 8086 to support unaligned access?

<7c70c3c3-18bc-4030-820e-4779e17ccf04n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33038&group=comp.arch#33038

Newsgroups: comp.arch
X-Received: by 2002:a05:622a:130a:b0:403:2e4c:28a6 with SMTP id v10-20020a05622a130a00b004032e4c28a6mr3542qtk.3.1688625548032;
Wed, 05 Jul 2023 23:39:08 -0700 (PDT)
X-Received: by 2002:a17:903:2607:b0:1b8:a92f:2618 with SMTP id
jd7-20020a170903260700b001b8a92f2618mr879333plb.10.1688625547802; Wed, 05 Jul
2023 23:39:07 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 5 Jul 2023 23:39:07 -0700 (PDT)
In-Reply-To: <4733a536-4669-4930-931f-9b58c644b9b4n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa34:c000:31bb:b49:ee4e:d672;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa34:c000:31bb:b49:ee4e:d672
References: <b2711000-2ce5-4c51-b44d-665e97a4c488n@googlegroups.com>
<u84l35$1aq4$1@gal.iecc.com> <4733a536-4669-4930-931f-9b58c644b9b4n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <7c70c3c3-18bc-4030-820e-4779e17ccf04n@googlegroups.com>
Subject: Re: What did it cost the 8086 to support unaligned access?
From: jsavard@ecn.ab.ca (Quadibloc)
Injection-Date: Thu, 06 Jul 2023 06:39:08 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2257
 by: Quadibloc - Thu, 6 Jul 2023 06:39 UTC

On Wednesday, July 5, 2023 at 2:58:36 PM UTC-6, MitchAlsup wrote:
> On Wednesday, July 5, 2023 at 3:53:12 PM UTC-5, John Levine wrote:

> > The original S/360 required everything to be aligned, while S/370
> > allowed all data to be misaligned. IBM did a great deal of simulation
> > before making architecture choices so I assume they had good reason to
> > believe it would be an improvement.
> <
> FORTRAN common blocks required misaligned DP FP accesses.

In that case, how did IBM offer a FORTRAN compiler for the IBM System/360?

I suppose they _could_ have allowed unaligned floats, and used MVC instructions
in the code generated to access them, but I'm sure they just gave an error message
if you tried to put a double-precision float in an unaligned location.

And if that made their FORTRAN IV compiler noncompliant with something, nobody
minded.

John Savard

Re: What did it cost the 8086 to support unaligned access?

<7f5a951f-4dd1-45ac-9130-9d783c0eeb37n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33039&group=comp.arch#33039

Newsgroups: comp.arch
X-Received: by 2002:a05:620a:254a:b0:767:233b:6703 with SMTP id s10-20020a05620a254a00b00767233b6703mr2193qko.15.1688626057807;
Wed, 05 Jul 2023 23:47:37 -0700 (PDT)
X-Received: by 2002:a05:6a00:1ad2:b0:67a:1788:7653 with SMTP id
f18-20020a056a001ad200b0067a17887653mr1277706pfv.1.1688626055995; Wed, 05 Jul
2023 23:47:35 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 5 Jul 2023 23:47:35 -0700 (PDT)
In-Reply-To: <b2711000-2ce5-4c51-b44d-665e97a4c488n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa34:c000:31bb:b49:ee4e:d672;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa34:c000:31bb:b49:ee4e:d672
References: <b2711000-2ce5-4c51-b44d-665e97a4c488n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <7f5a951f-4dd1-45ac-9130-9d783c0eeb37n@googlegroups.com>
Subject: Re: What did it cost the 8086 to support unaligned access?
From: jsavard@ecn.ab.ca (Quadibloc)
Injection-Date: Thu, 06 Jul 2023 06:47:37 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2034
 by: Quadibloc - Thu, 6 Jul 2023 06:47 UTC

On Wednesday, July 5, 2023 at 2:14:05 PM UTC-6, Russell Wallace wrote:

> What did this cost, in terms of performance and chip area, compared
> to an alternative architecture that would have been the same except
> for unaligned access being a trap or undefined behavior?

What does support for unaligned operations require?

Basically, what has to happen is:

Every time there's a memory access, there has to be a
check for an unaligned access.

If there is an unaligned access, the sequence of events is now
changed: the memory access is broken up into a larger number of
memory accesses, and, in the case of a load, the operand is then
shifted and assembled; in the case of a store, it's broken up and
the parts are shifted before the actual memory access.
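
The sequence above can be sketched in C. This is only an illustrative model, assuming a little-endian memory built from 16-bit words where each bus cycle transfers one aligned word (roughly the 8086 situation, where an odd-address word access takes two bus cycles). The names word_memory, bus_read, bus_write, load16 and store16 are hypothetical, and the mask argument stands in for byte-enable signals.

```c
#include <stdint.h>

/* Hypothetical model: word i holds byte 2*i (low half) and 2*i+1 (high half). */
typedef struct {
    uint16_t *words;
    unsigned  bus_cycles;   /* counts aligned word transfers */
} word_memory;

/* One bus cycle: read a whole aligned word. */
uint16_t bus_read(word_memory *m, uint32_t word_index) {
    m->bus_cycles++;
    return m->words[word_index];
}

/* One bus cycle: write only the byte lanes selected by mask. */
void bus_write(word_memory *m, uint32_t word_index, uint16_t value, uint16_t mask) {
    m->bus_cycles++;
    m->words[word_index] =
        (uint16_t)((m->words[word_index] & ~mask) | (value & mask));
}

uint16_t load16(word_memory *m, uint32_t byte_addr) {
    uint32_t w = byte_addr >> 1;
    if ((byte_addr & 1) == 0)                 /* the alignment check */
        return bus_read(m, w);                /* aligned: one cycle */
    /* Unaligned: two accesses, then shift and assemble. */
    uint16_t lo = (uint16_t)(bus_read(m, w) >> 8);        /* high byte of first word */
    uint16_t hi = (uint16_t)(bus_read(m, w + 1) & 0xFF);  /* low byte of second word */
    return (uint16_t)(lo | (hi << 8));
}

void store16(word_memory *m, uint32_t byte_addr, uint16_t value) {
    uint32_t w = byte_addr >> 1;
    if ((byte_addr & 1) == 0) {
        bus_write(m, w, value, 0xFFFF);
        return;
    }
    /* Unaligned: break the value up and shift each part into its byte lane. */
    bus_write(m, w,     (uint16_t)(value << 8), 0xFF00);
    bus_write(m, w + 1, (uint16_t)(value >> 8), 0x00FF);
}
```

In this model the cost of supporting the unaligned case is one address-bit test, one extra bus cycle, and a byte swap between lanes; wider operands need a more general shifter.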

John Savard

Re: What did it cost the 8086 to support unaligned access?

<923886d2-485f-4447-8629-f62aaebf0cf2n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33040&group=comp.arch#33040

Newsgroups: comp.arch
X-Received: by 2002:ad4:5142:0:b0:635:db0c:95eb with SMTP id g2-20020ad45142000000b00635db0c95ebmr23666qvq.1.1688626709633;
Wed, 05 Jul 2023 23:58:29 -0700 (PDT)
X-Received: by 2002:a05:6a00:9a9:b0:682:69ee:5037 with SMTP id
u41-20020a056a0009a900b0068269ee5037mr1475160pfg.0.1688626709094; Wed, 05 Jul
2023 23:58:29 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 5 Jul 2023 23:58:28 -0700 (PDT)
In-Reply-To: <1jnpM.980$8Ma1.956@fx37.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa34:c000:31bb:b49:ee4e:d672;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa34:c000:31bb:b49:ee4e:d672
References: <b2711000-2ce5-4c51-b44d-665e97a4c488n@googlegroups.com>
<u84l35$1aq4$1@gal.iecc.com> <u84lqq$6ioc$2@newsreader4.netcologne.de> <1jnpM.980$8Ma1.956@fx37.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <923886d2-485f-4447-8629-f62aaebf0cf2n@googlegroups.com>
Subject: Re: What did it cost the 8086 to support unaligned access?
From: jsavard@ecn.ab.ca (Quadibloc)
Injection-Date: Thu, 06 Jul 2023 06:58:29 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2323
 by: Quadibloc - Thu, 6 Jul 2023 06:58 UTC

On Wednesday, July 5, 2023 at 5:48:17 PM UTC-6, EricP wrote:

> And then the Stanford MIPS came along,
> And then the Alpha 21064 came along,

Oh, dear. What a pity. While support for unaligned
memory access is a convenience, I would have thought
of it as a frill, that can, and should, be dispensed with on
an architecture designed, say, for ultimate performance
in high-performance computing.

But if, in the real world, you are actually going to have to
occasionally trap and emulate unaligned accesses, then
unless "occasionally" is _very_ rare indeed, hardware support
will be preferred.

Since the original System/360 got along reasonably well,
and people wrote a version of SNOBOL for it, and a version
of LISP for it, and so on... I had always assumed that
unaligned access is not really needed; it's just a frill. But
the history you're giving certainly suggests that for _some_
reason, the world of computers today _is_ basically
dependent on having this feature.

John Savard

Re: What did it cost the 8086 to support unaligned access?

<2023Jul6.082239@mips.complang.tuwien.ac.at>

https://news.novabbs.org/devel/article-flat.php?id=33041&group=comp.arch#33041

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: What did it cost the 8086 to support unaligned access?
Date: Thu, 06 Jul 2023 06:22:39 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 167
Message-ID: <2023Jul6.082239@mips.complang.tuwien.ac.at>
References: <b2711000-2ce5-4c51-b44d-665e97a4c488n@googlegroups.com> <u84l35$1aq4$1@gal.iecc.com> <u84lqq$6ioc$2@newsreader4.netcologne.de> <1jnpM.980$8Ma1.956@fx37.iad>
Injection-Info: dont-email.me; posting-host="7f62c39cee3a6b8a1706b7d20b04a7ad";
logging-data="928331"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX187QzprXext4b2igFoZtIfV"
Cancel-Lock: sha1:trohl+8VCtt0PX7PB2V5QkgXDYU=
X-newsreader: xrn 10.11
 by: Anton Ertl - Thu, 6 Jul 2023 06:22 UTC

EricP <ThatWouldBeTelling@thevillage.com> writes:
>Thomas Koenig wrote:
>> John Levine <johnl@taugh.com> schrieb:
>>> The original S/360 required everything to be aligned, while S/370
>>> allowed all data to be misaligned. IBM did a great deal of simulation
>>> before making architecture choices so I assume they had good reason to
>>> believe it would be an improvement.
>>
>> And then the 801 people came along and found out that compilers rarely,
>> if ever, issued misaligned loads and stores, and the RISC people turned
>> back the clock to aligned. High-end RISC systems generally support it
>> by now (even POWER), low-end microcontrollers may not, and RISC-V is
>> RISC-V because it may even be trap/emulate there.
>
>And then the Stanford MIPS came along, believed this, eliminated unaligned
>loads and stores, and found out that while compilers may not issue these,
>humans do it *a lot*. Faced with the prospect of rewriting lots of code
>to suit their processor they chose to add it back in for their first
>commercial version, the MIPS R2000.

It's a long time since I used an R2000 based machine (and slightly
less since I last used an R3000 or R4000-based machine), but I am
pretty sure that the usual loads and stores require natural alignment
and trap on misaligned accesses. They also have lwl, lwr, swl and swr
for synthesizing unaligned accesses; and pseudo-instructions ulw, usw,
ulh, ulhu, ush that expand to sequences of machine instructions.
Raymond Chen describes this in
<https://devblogs.microsoft.com/oldnewthing/20180409-00/?p=98465>.

>And then the Alpha 21064 came along, believed this, eliminated unaligned
>loads and stores, and found out humans do this *a lot*.
>Then DEC lied it caused problems, blamed the humans, claimed the code was
>broken to begin with (it wasn't),

I don't remember this for the Alpha, but it's a common pattern.

>quietly published manuals on how to
>rewrite code to suit their processor, and finally added it back in again
>claiming it was to support Windows.

It's also been a while since I used an Alpha, but even on the 21264B
we saw notices of unaligned accesses by programs in the Linux kernel
logs. I wrote <https://www.complang.tuwien.ac.at/anton/uace.c> to
control this Linux kernel feature (a similar program, uac, was available
on Digital OSF/1, or whatever it had been renamed to at the time). As a
result, I (and my students) got SIGBUS signals on unaligned accesses
even on Linux, where the kernel normally emulated unaligned accesses.

Alpha also has ldq_u (which performed an aligned access using a
possibly unaligned address), and pseudo-instructions like ustq (which
expands to 11 instructions IIRC). I actually found that the gas
implementation of ustq was broken, which showed that no Linux software
used this pseudo-instruction.
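
To illustrate the idea behind synthesizing an unaligned load from aligned ones, here is a rough C sketch. It is not the actual Alpha instruction sequence: it only models an ldq_u-style load that ignores the low three address bits, followed by a shift-and-merge step. Memory is modelled as an array of little-endian quadwords, and the function names are hypothetical.

```c
#include <stdint.h>

/* Model of an ldq_u-style load: drop the low 3 address bits and
   return the aligned quadword that contains byte_addr. */
uint64_t ldq_u_model(const uint64_t *mem, uint64_t byte_addr) {
    return mem[byte_addr >> 3];
}

/* Build a possibly unaligned 64-bit little-endian load from two aligned loads. */
uint64_t load64_via_aligned(const uint64_t *mem, uint64_t byte_addr) {
    unsigned shift = (unsigned)(byte_addr & 7) * 8;
    uint64_t lo = ldq_u_model(mem, byte_addr);      /* quadword holding the first byte */
    uint64_t hi = ldq_u_model(mem, byte_addr + 7);  /* quadword holding the last byte */
    if (shift == 0)
        return lo;                                  /* aligned: both loads hit the same quadword */
    /* Low bytes come from the top of lo, high bytes from the bottom of hi. */
    return (lo >> shift) | (hi << (64 - shift));
}
```

The explicit shift == 0 case is there because shifting a 64-bit value by 64 is undefined in C; it is an artifact of this sketch, not a statement about the hardware.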

You may be confusing the unaligned access issue with byte and 16-bit
memory access, which was added in EV56.

Trapping on unaligned accesses was very common at the time: the 68000
and 68010 do it (but the 68020 and onwards support unaligned
accesses), and all the first-generation RISCs also trap on unaligned
accesses: ARM, HPPA, MIPS, SPARC, 88k. In the second generation
(Power and Alpha) only Power in big-endian mode supports unaligned
accesses.

Even Intel jumped on the alignment-requiring bandwagon: they added the
AC flag in the 486 (1989) and alignment-requiring memory accesses with
SSE (1999); of course, they did both in an idiotic way:

1) Setting AC requires 8-byte alignment for 8-byte FP numbers, but the
Intel-designed ABI requires 8-byte FP numbers to be stored with 4-byte
alignment, which means that you could not set AC except in very
confined circumstances.

2) Aligned SSE memory accesses require 16-byte alignments, while the
elements were smaller; this means that you either use the unaligned
load and store instructions, or your vectorized code has to
special-case the first and last elements to achieve alignment. AMD
added a flag for dropping the alignment requirement for implicit SSE
memory accesses. Intel did not follow AMD in that, but dropped the
alignment requirement in AVX; this would have been ok if AVX had been
universally available, but unfortunately AVX was not implemented in
Intel's cheap cores up to Gracemont, and often disabled on lower-end
SKUs of their expensive cores.
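
The "special-case the first and last elements" approach from point 2 can be sketched in plain C. This is only an illustration under stated assumptions: the array is assumed to be at least 4-byte aligned, the body loop is written in scalar code where a real implementation would use aligned 16-byte vector loads, and peel_split, split_for_alignment and sum_floats are hypothetical names.

```c
#include <stdint.h>
#include <stddef.h>

/* How many elements fall in the scalar head, the aligned body, and the scalar tail. */
typedef struct { size_t head, body, tail; } peel_split;

/* Assumes p is aligned to sizeof(float), so the misalignment is a multiple of 4. */
peel_split split_for_alignment(const float *p, size_t n) {
    size_t misalign = (size_t)((uintptr_t)p & 15);
    size_t head = misalign ? (16 - misalign) / sizeof(float) : 0;
    if (head > n)
        head = n;
    size_t body = (n - head) / 4 * 4;      /* whole groups of four floats */
    peel_split s = { head, body, n - head - body };
    return s;
}

float sum_floats(const float *p, size_t n) {
    peel_split s = split_for_alignment(p, n);
    float total = 0.0f;
    size_t i = 0;
    for (; i < s.head; i++)                /* scalar prologue up to a 16-byte boundary */
        total += p[i];
    for (; i < s.head + s.body; i += 4)    /* &p[i] is 16-byte aligned in this loop */
        total += (p[i] + p[i + 1]) + (p[i + 2] + p[i + 3]);
    for (; i < n; i++)                     /* scalar epilogue */
        total += p[i];
    return total;
}
```

With unaligned vector loads available, the head and tail bookkeeping for alignment disappears and only the leftover elements at the end need special handling.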

Back to earlier times: Many RISCs died over the years, with the prospect
of IA-64 being particularly deadly: HPPA, MIPS (as GP-computer
architecture), and Alpha were cancelled and intended to be replaced
with IA-64.

IA-64 also did not guarantee that unaligned accesses work. More
precisely,
<https://www.ece.lsu.edu/ee4720/doc/itanium-arch-2.1-v2.pdf> says on
page 2:78:

|When PSR.ac is 1, any Itanium data memory reference that is not
|aligned on a boundary the size of the operand results in an Unaligned
|Data Reference fault; e.g., 1, 2, 4, 8, 10, and 16-byte datums should
|be aligned on 1, 2, 4, 8, 16, and 16-byte boundaries respectively to
|avoid generation of an Unaligned Data Reference fault.
|...
|When PSR.ac is 0, Itanium data memory references that are not aligned
|may or may not result in an Unaligned Data Reference fault based on
|the implementation. The level of unaligned memory support is
|implementation specific. However, all implementations will raise an
|Unaligned Data Reference fault if the datum referenced by an Itanium
|instruction spans a 4K aligned boundary, and many implementations will
|raise an Unaligned Data Reference fault if the datum spans a cache
|line. Implementations may also raise an Unaligned Data Reference
|fault for any other unaligned Itanium memory reference.
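
The rule quoted above can be restated as a small C classifier. This is a sketch of the quoted text only, not of any particular Itanium implementation; the enum and function names are hypothetical, and the cache-line case is folded into "may fault" because the manual leaves it implementation specific.

```c
#include <stdint.h>
#include <stdbool.h>

typedef enum { ACCESS_OK, ACCESS_MAY_FAULT, ACCESS_FAULTS } access_outcome;

/* Boundary required by the quoted rule: the operand size rounded up to a
   power of two (so 10-byte datums need a 16-byte boundary). */
unsigned required_alignment(unsigned size) {
    unsigned a = 1;
    while (a < size)
        a <<= 1;
    return a;
}

/* True if the bytes [addr, addr + size) span a multiple of boundary. */
bool crosses_boundary(uint64_t addr, unsigned size, uint64_t boundary) {
    return (addr / boundary) != ((addr + size - 1) / boundary);
}

access_outcome classify(uint64_t addr, unsigned size, bool psr_ac) {
    if (addr % required_alignment(size) == 0)
        return ACCESS_OK;                 /* aligned references never fault for alignment */
    if (psr_ac)
        return ACCESS_FAULTS;             /* PSR.ac = 1: every unaligned reference faults */
    if (crosses_boundary(addr, size, 4096))
        return ACCESS_FAULTS;             /* spanning a 4K boundary always faults */
    return ACCESS_MAY_FAULT;              /* otherwise implementation specific */
}
```

The ACCESS_MAY_FAULT result is the practical problem for portable software: code cannot rely on an unaligned access working even with PSR.ac clear.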

The wider support for unaligned accesses came only later: ARM added
support for unaligned accesses, Power had it already for big-endian
(and later added it for little-endian, along with a new form of
little-endian support), IA-64 and SPARC died out, and ARM A64 and
RISC-V support unaligned accesses, leaving only architectures with
unaligned-access support for general-purpose computers.

I guess that the RISC-V decision shows that by that time (early 2010s,
does anybody know a more exact date?) unaligned access support had
won.

It's interesting to speculate why we have had alignment requirements
for so long in RISCs (from the 1980s to the 2000s), and eventually
unaligned won. It seems that the forces involved were relatively
small:

The number of unaligned-access reports in the Linux kernel log on
Alpha was relatively small, so software emulation is a minor cost.

Another cost of alignment requirements is the problem of using the
instructions in a SIMD context (even non-SIMD instructions, e.g.,
performing a hash of a string by loading 8-byte pieces and mixing them
with the hash value up to now). One can use stuff like Alpha's uldq
for such uses and memcpy() to make the compiler generate it in
somewhat recent C compilers, but I guess that doing stuff like *p
(which generates an alignment-requiring ldq on the Alpha) was frequent
enough in the early 2000s (when Linux on IA-32/AMD64 had become the
dominant Unix) that architectures finally added support for unaligned
accesses.
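
A minimal sketch of the memcpy() idiom mentioned above, applied to the string-hash example. The hash itself (its constants and mixing step) is made up for illustration; the point is only that memcpy() expresses an unaligned 8-byte load in well-defined C, which compilers typically turn into whatever unaligned-load sequence the target provides, whereas dereferencing a cast pointer would assume alignment.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Load 8 bytes from any address without assuming alignment. */
uint64_t load64_unaligned(const void *p) {
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical hash: mixes the input 8 bytes at a time into the running value. */
uint64_t hash_bytes(const char *s, size_t len) {
    uint64_t h = 0x9E3779B97F4A7C15ULL;
    while (len >= 8) {
        h = (h ^ load64_unaligned(s)) * 0x100000001B3ULL;
        s += 8;
        len -= 8;
    }
    if (len > 0) {                        /* zero-padded final partial piece */
        uint64_t tail = 0;
        memcpy(&tail, s, len);
        h = (h ^ tail) * 0x100000001B3ULL;
    }
    return h;
}
```

Because the loads never assume alignment, the result depends only on the bytes hashed and not on where the string happens to start in memory.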

The hardware cost of supporting unaligned accesses is not that large,
as Mitch Alsup explained elsewhere in this thread, and that cost got
relatively smaller as transistor density increased.

I guess that the software environment of the years between 2000 and
2010 and the lowering hardware costs of unaligned-access support
eventually tipped the balance in favour of unaligned accesses. It's
interesting that the same had earlier happened on S/360->S/370 and
68010->68020.

Concerning the original question: One may also ask what the support
for unaligned accesses bought the 8086. The 8086 was designed to be
8080-compatible at the assembly level (from what I read, this feature
helped create an initial software base, even though it was not really
used in later years), and this meant that unaligned accesses would
need to be supported, too. If they had skimped on unaligned accesses,
maybe the 8086 would have been much less successful than it was;
but it was actually the 8088 that was successful, and unaligned
accesses were free on that, so it would not have changed much if
the 8086 had required alignment (as long as that requirement was not
inherited by the 8088).

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: What did it cost the 8086 to support unaligned access?

<5887c9c6-3aa7-4b3a-aea3-bd0432a96070n@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=33042&group=comp.arch#33042

Newsgroups: comp.arch
X-Received: by 2002:a05:6214:14f0:b0:634:81f6:569b with SMTP id k16-20020a05621414f000b0063481f6569bmr3192qvw.10.1688632112900;
Thu, 06 Jul 2023 01:28:32 -0700 (PDT)
X-Received: by 2002:a17:903:25d4:b0:1b8:a09b:38c8 with SMTP id
jc20-20020a17090325d400b001b8a09b38c8mr1101687plb.8.1688632112286; Thu, 06
Jul 2023 01:28:32 -0700 (PDT)
Path: i2pn2.org!i2pn.org!news.1d4.us!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 6 Jul 2023 01:28:31 -0700 (PDT)
In-Reply-To: <2023Jul6.082239@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2607:fea8:1dde:6a00:1142:6c91:5fbc:a31f;
posting-account=QId4bgoAAABV4s50talpu-qMcPp519Eb
NNTP-Posting-Host: 2607:fea8:1dde:6a00:1142:6c91:5fbc:a31f
References: <b2711000-2ce5-4c51-b44d-665e97a4c488n@googlegroups.com>
<u84l35$1aq4$1@gal.iecc.com> <u84lqq$6ioc$2@newsreader4.netcologne.de>
<1jnpM.980$8Ma1.956@fx37.iad> <2023Jul6.082239@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <5887c9c6-3aa7-4b3a-aea3-bd0432a96070n@googlegroups.com>
Subject: Re: What did it cost the 8086 to support unaligned access?
From: robfi680@gmail.com (robf...@gmail.com)
Injection-Date: Thu, 06 Jul 2023 08:28:32 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 11461
 by: robf...@gmail.com - Thu, 6 Jul 2023 08:28 UTC

On Thursday, July 6, 2023 at 4:18:08 AM UTC-4, Anton Ertl wrote:
> EricP <ThatWould...@thevillage.com> writes:
> >Thomas Koenig wrote:
> >> John Levine <jo...@taugh.com> schrieb:
> >>> The original S/360 required everything to be aligned, while S/370
> >>> allowed all data to be misaligned. IBM did a great deal of simulation
> >>> before making architecture choices so I assume they had good reason to
> >>> believe it would be an improvement.
> >>
> >> And then the 801 people came along and found out that compilers rarely,
> >> if ever, issued misaligned loads and stores, and the RISC people turned
> >> back the clock to aligned. High-end RISC systems generally support it
> >> by now (even POWER), low-end microcontrollers may not, and RISC-V is
> >> RISC-V because it may even be trap/emulate there.
> >
> >And then the Stanford MIPS came along, believed this, eliminated unaligned
> >loads and stores, and found out that while compilers may not issue these,
> >humans do it *a lot*. Faced with the prospect of rewriting lots of code
> >to suit their processor they chose to add it back in for their first
> >commercial version, the MIPS R2000.
> It's a long time since I used an R2000 based machine (and slightly
> less since I last used an R3000 or R4000-based machine), but I am
> pretty sure that the usual loads and stores require natural alignment
> and trap on misaligned accesses. They also have lwl, lwr, swl and swr
> for synthesizing unaligned accesses; and pseudo-instructions ulw, usw,
> ulh, ulhu, ush that expand to sequences of machine instructions.
> Raymond Chen describes this in
> <https://devblogs.microsoft.com/oldnewthing/20180409-00/?p=98465>.
> >And then the Alpha 21064 came along, believed this, eliminated unaligned
> >loads and stores, and found out humans do this *a lot*.
> >Then DEC lied it caused problems, blamed the humans, claimed the code was
> >broken to begin with (it wasn't),
> I don't remember this for the Alpha, but it's a common pattern.
> >quietly published manuals on how to
> >rewrite code to suit their processor, and finally added it back in again
> >claiming it was to support Windows.
> It's also been a while since I used an Alpha, but even on the 21264B
> we saw notices of unaligned accesses by programs in the Linux kernel
> logs. I wrote <https://www.complang.tuwien.ac.at/anton/uace.c> to
> control this Linux kernel feature (a similar program, uac, was available
> on Digital OSF/1, or whatever it had been renamed to at the time). As a
> result, I (and my students) got SIGBUS signals on unaligned accesses
> even on Linux, where the kernel normally emulated unaligned accesses.
>
> Alpha also has ldq_u (which performed an aligned access using a
> possibly unaligned address), and pseudo-instructions like ustq (which
> expands to 11 instructions IIRC). I actually found that the gas
> implementation of ustq was broken, which showed that no Linux software
> used this pseudo-instruction.
>
> You may be confusing the unaligned access issue with byte and 16-bit
> memory access, which was added in EV56.
>
> Trapping on unaligned accesses was very common at the time: the 68000
> and 68010 do it (but the 68020 and onwards support unaligned
> accesses), and all the first-generation RISCs also trap on unaligned
> accesses: ARM, HPPA, MIPS, SPARC, 88k. In the second generation
> (Power and Alpha) only Power in big-endian mode supports unaligned
> accesses.
>
> Even Intel jumped on the alignment-requiring bandwagon: they added the
> AC flag in the 486 (1989) and alignment-requiring memory accesses with
> SSE (1999); of course, they did both in an idiotic way:
>
> 1) Setting AC requires 8-byte alignment for 8-byte FP numbers, but the
> Intel-designed ABI requires 8-byte FP numbers to be stored with 4-byte
> alignment, which means that you could not set AC except in very
> confined circumstances.
>
> 2) Aligned SSE memory accesses require 16-byte alignments, while the
> elements were smaller; this means that you either use the unaligned
> load and store instructions, or your vectorized code has to
> special-case the first and last elements to achieve alignment. AMD
> added a flag for dropping the alignment requirement for implicit SSE
> memory accesses. Intel did not follow AMD in that, but dropped the
> alignment requirement in AVX; this would have been ok if AVX had been
> universally available, but unfortunately AVX was not implemented in
> Intel's cheap cores up to Gracemont, and often disabled on lower-end
> SKUs of their expensive cores.
>
> Back to earlier times: Many RISCs died over the years, with IA-64's
> prospect being particularly deadly: HPPA, MIPS (as GP-computer
> architecture), and Alpha were cancelled and intended to be replaced
> with IA-64.
>
> IA-64 also did not guarantee that unaligned accesses work. More
> precisely,
> <https://www.ece.lsu.edu/ee4720/doc/itanium-arch-2.1-v2.pdf> says on
> page 2:78:
>
> |When PSR.ac is 1, any Itanium data memory reference that is not
> |aligned on a boundary the size of the operand results in an Unaligned
> |Data Reference fault; e.g., 1, 2, 4, 8, 10, and 16-byte datums should
> |be aligned on 1, 2, 4, 8, 16, and 16-byte boundaries respectively to
> |avoid generation of an Unaligned Data Reference fault.
> |...
> |When PSR.ac is 0, Itanium data memory references that are not aligned
> |may or may not result in an Unaligned Data Reference fault based on
> |the implementation. The level of unaligned memory support is
> |implementation specific. However, all implementations will raise an
> |Unaligned Data Reference fault if the datum referenced by an Itanium
> |instruction spans a 4K aligned boundary, and many implementations will
> |raise an Unaligned Data Reference fault if the datum spans a cache
> |line. Implementations may also raise an Unaligned Data Reference
> |fault for any other unaligned Itanium memory reference.
>
> The wider support for unaligned accesses came only later: ARM added
> support for unaligned accesses, Power had it already for big-endian
> (and later added it for little-endian, along with a new form of
> little-endian support), IA-64 and SPARC died out, and ARM A64 and
> RISC-V support unaligned accesses, leaving only architectures with
> unaligned-access support for general-purpose computers.
>
> I guess that the RISC-V decision shows that by that time (early 2010s,
> does anybody know a more exact date?) unaligned access support had
> won.
>
> It's interesting to speculate why we have had alignment requirements
> for so long in RISCs (from the 1980s to the 2000s), and eventually
> unaligned won. It seems that the forces involved were relatively
> small:
>
> The number of unaligned-access reports in the Linux kernel log on
> Alpha was relatively small, so software emulation is a minor cost.
>
> Another cost of alignment requirements is the problem of using the
> instructions in a SIMD context (even non-SIMD instructions, e.g.,
> performing a hash of a string by loading 8-byte pieces and mixing them
> with the hash value up to now). One can use stuff like Alpha's uldq
> for such uses and memcpy() to make the compiler generate it in
> somewhat recent C compilers, but I guess that doing stuff like *p
> (which generates an alignment-requiring ldq on the Alpha) was frequent
> enough in the early 2000s (when Linux on IA-32/AMD64 had become the
> dominant Unix) that architectures finally added support for unaligned
> accesses.
>
> The hardware cost of supporting unaligned accesses is not that large,
> as Mitch Alsup explained elsewhere in this thread, and that cost got
> relatively smaller as transistor density increased.
>
> I guess that the software environment of the years between 2000 and
> 2010 and the lowering hardware costs of unaligned-access support
> eventually tipped the balance in favour of unaligned accesses. It's
> interesting that the same had earlier happened on S/360->S/370 and
> 68010->68020.
>
> Concerning the original question: One may also ask what the support
> for unaligned accesses bought the 8086: The 8086 was designed to be
> 8080-compatible at the assembly level (from what I read, this feature
> helped create an initial software base, even though it was not really
> used in later years), and this meant that unaligned accesses would
> need to be supported, too. If they had skimped on unaligned accesses,
> maybe the 8086 would have been much less successful than it was;
> but it was actually the 8088 that was successful, and unaligned
> accesses were free on that, so it would not have changed much if
> the 8086 required alignment (if that requirement was not inherited by
> the 8088).
>
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
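
The memcpy() idiom mentioned in the quoted text (loading 8-byte pieces of a string for hashing without assuming alignment) might look roughly like the sketch below. The function names are made up for illustration. The constants are borrowed from 64-bit FNV-1a, but the 8-byte mixing step is only an example and not a vetted hash. Because the load uses native byte order, the result differs between little-endian and big-endian machines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load a 64-bit value from an address that may be unaligned.
 * memcpy() has no alignment requirement, so the compiler can emit
 * whatever the target needs: a plain load where unaligned access is
 * supported, or a multi-instruction sequence where it is not. */
static uint64_t load_u64_unaligned(const void *p)
{
    uint64_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical string hash that consumes 8 bytes per step and
 * handles the remaining tail one byte at a time. */
static uint64_t hash_bytes(const char *s, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    size_t i = 0;
    for (; i + 8 <= len; i += 8) {
        h ^= load_u64_unaligned(s + i);
        h *= 0x100000001b3ULL;
    }
    for (; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}
```

The alternative, casting s + i to a uint64_t pointer and dereferencing it, is the *p pattern from the quoted text: it works where the hardware tolerates unaligned access and traps (or gets emulated) where it does not.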


Re: What did it cost the 8086 to support unaligned access?

<3czpM.1716$eGef.678@fx47.iad>

 by: EricP - Thu, 6 Jul 2023 13:19 UTC

Quadibloc wrote:
> On Wednesday, July 5, 2023 at 5:48:17 PM UTC-6, EricP wrote:
>
>> And then the Stanford MIPS came along,
>> And then the Alpha 21064 came along,
>
> Oh, dear. What a pity. While support for unaligned
> memory access is a convenience, I would have thought
> of it as a frill, that can, and should, be dispensed with on
> an architecture designed, say, for ultimate performance
> in high-performance computing.
>
> But if, in the real world, you are actually going to have to
> occasionally trap and emulate unaligned accesses, then
> unless "occasionally" is _very_ rare indeed, hardware support
> will be preferred.

The overhead of trap-and-emulate is the problem people most often cite
because it is in their face.

The far more insidious hidden problem is that while byte memory access cpus
*define* that reads and writes to adjacent memory cells are independent,
this is NOT true for cpus that use larger alignment and must perform
such updates as a read-modify-write operation in registers.

That RMW sequence means that adjacent byte and word and all unaligned
accesses are not independent as an update to one variable can clobber
and lose a concurrent update to an adjacent variable.

This race condition can show up even on uni-processors due to interrupts
and exceptions.

The fix is that all such byte, word and misaligned writes must be performed
as atomic LL-SC sequences. But since you don't know which variables are
adjacent, to be safe all such updates must be atomic. This turns each
such write into a subroutine call, so the performance is terrible.
And this change must be applied to all code, just in case.
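
As a rough sketch of what such an atomic byte store looks like, here is a C11 version that uses a compare-and-swap retry loop on the containing 32-bit word. On LL/SC machines compilers typically implement atomic_compare_exchange_weak with a load-locked/store-conditional pair. The function name and the "lane" parameter are assumptions for illustration; mapping a byte address to a lane depends on endianness and is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Store one byte into a 32-bit word without disturbing its neighbours.
 * 'lane' selects the byte by significance (0 = least significant). */
static void store_byte_atomic(_Atomic uint32_t *word, unsigned lane,
                              uint8_t value)
{
    assert(lane < 4);
    const unsigned shift = lane * 8;
    const uint32_t mask = (uint32_t)0xFF << shift;
    uint32_t old = atomic_load(word);
    uint32_t desired;
    do {
        /* Build the new word from the most recently observed contents. */
        desired = (old & ~mask) | ((uint32_t)value << shift);
        /* On failure 'old' is refreshed with the current contents,
         * so a concurrent neighbour update is picked up on retry. */
    } while (!atomic_compare_exchange_weak(word, &old, desired));
}
```

Compare that with the single store instruction a byte-addressed machine needs and the cost of applying it to every byte write "just in case" is easy to see.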

Re: What did it cost the 8086 to support unaligned access?

<b016c723-3d7c-483c-8cab-5bb2b01f6e9cn@googlegroups.com>

 by: Quadibloc - Thu, 6 Jul 2023 13:40 UTC

On Thursday, July 6, 2023 at 2:18:08 AM UTC-6, Anton Ertl wrote:
> EricP <ThatWould...@thevillage.com> writes:
> >And then the Alpha 21064 came along, believed this, eliminated unaligned
> >loads and stores, and found out humans do this *a lot*.
> >Then DEC denied it caused problems, blamed the humans, claimed the code was
> >broken to begin with (it wasn't),

> I don't remember this for the Alpha, but it's a common pattern.

I just remembered that unaligned loads and stores weren't
the _only_ thing the original Alpha didn't support.

They also didn't support any loads and stores of data shorter
than the full 64 bit width of the machine. *That* is something I
do tend to regard as a serious mistake.

John Savard

Re: What did it cost the 8086 to support unaligned access?

<77a9c815-775b-4836-b623-73553a5f5a3dn@googlegroups.com>

 by: Quadibloc - Thu, 6 Jul 2023 13:47 UTC

On Thursday, July 6, 2023 at 7:20:03 AM UTC-6, EricP wrote:

> The far more insidious hidden problem is that while byte memory access cpus
> *define* that reads and writes to adjacent memory cells are independent,
> this is NOT true for cpus that use larger alignment and must perform
> such updates as a read-modify-write operation in registers.
>
> That RMW sequence means that adjacent byte and word and all unaligned
> accesses are not independent as an update to one variable can clobber
> and lose a concurrent update to an adjacent variable.

I thought that x86 CPUs were designed to solve this problem.

The memory bus is organized so that there is one enable line for each eight
data lines, and, therefore, any read or write operation can be made to operate
only on specific bytes, no matter how wide the data bus.

And so a misaligned 64-bit data item, on a computer with a 64-bit data bus,
can always be fetched or written in only two accesses.

Of course, fetches are usually of an entire DRAM data line into the cache,
but that's another issue.

John Savard
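
To make the byte-enable idea concrete, here is a small sketch that works out which byte lanes are enabled in each of the (at most) two bus accesses for an item of up to 8 bytes on a 64-bit bus. The struct and function names are made up for illustration and do not model any particular chip's signals.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-enable masks for one item on a hypothetical 64-bit (8-lane) bus.
 * Bit i set in a mask means byte lane i takes part in that access. */
struct bus_plan {
    uint8_t first;   /* enables for the bus word containing addr */
    uint8_t second;  /* enables for the next bus word, 0 if not needed */
};

static struct bus_plan plan_access(uint64_t addr, unsigned size)
{
    assert(size >= 1 && size <= 8);
    unsigned offset = (unsigned)(addr & 7);          /* lane of first byte */
    uint32_t lanes = ((1u << size) - 1u) << offset;  /* at most 15 bits */
    struct bus_plan p;
    p.first  = (uint8_t)(lanes & 0xFF);  /* lanes inside the first word */
    p.second = (uint8_t)(lanes >> 8);    /* lanes spilling into the next */
    return p;
}
```

For example, an 8-byte item at address 0x1003 gives first = 0xF8 and second = 0x07: five bytes in one bus word and three in the next, with no read-modify-write of the neighbouring bytes.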

Re: What did it cost the 8086 to support unaligned access?

<u86nok$vp04$1@dont-email.me>

 by: BGB - Thu, 6 Jul 2023 15:48 UTC

On 7/6/2023 3:28 AM, robf...@gmail.com wrote:
> On Thursday, July 6, 2023 at 4:18:08 AM UTC-4, Anton Ertl wrote:
>> EricP <ThatWould...@thevillage.com> writes:
>>> Thomas Koenig wrote:
>>>> John Levine <jo...@taugh.com> schrieb:
>>>>> The original S/360 required everything to be aligned, while S/370
>>>>> allowed all data to be misaligned. IBM did a great deal of simulation
>>>>> before making architecture choices so I assume they had good reason to
>>>>> believe it would be an improvement.
>>>>
>>>> And then the 801 people came along and found out that compilers rarely,
>>>> if ever, issued misaligned loads and stores, and the RISC people turned
>>>> back the clock to aligned. High-end RISC systems generally support it
>>>> by now (even POWER), low-end microcontrollers may not, and RISC-V is
>>>> RISC-V because it may even be trap/emulate there.
>>>
>>> And then the Stanford MIPS came along, believed this, eliminated unaligned
>>> loads and stores, and found out that while compilers may not issue these
>>> humans do it *a lot*. Faced with the prospect of rewriting lots of code
>>> to suit their processor they chose to add it back in for their first
>>> commercial version, the MIPS R2000.
>> It's a long time since I used an R2000 based machine (and slightly
>> less since I last used an R3000 or R4000-based machine), but I am
>> pretty sure that the usual loads and stores require natural alignment
>> and trap on misaligned accesses. They also have lwl, lwr, swl and swr
>> for synthesizing unaligned accesses; and pseudo-instructions ulw, usw,
>> ulh, ulhu, ush that expand to sequences of machine instructions.
>> Raymond Chen describes this in
>> <https://devblogs.microsoft.com/oldnewthing/20180409-00/?p=98465>.
>>> And then the Alpha 21064 came along, believed this, eliminated unaligned
>>> loads and stores, and found out humans do this *a lot*.
>>> Then DEC denied it caused problems, blamed the humans, claimed the code was
>>> broken to begin with (it wasn't),
>> I don't remember this for the Alpha, but it's a common pattern.
>>> quietly published manuals on how to
>>> rewrite code to suit their processor, and finally added it back in again
>>> claiming it was to support Windows.
>> It's also been a while since I used an Alpha, but even on the 21264B
>> we saw notices of unaligned accesses by programs in the Linux kernel
>> logs, and I wrote <https://www.complang.tuwien.ac.at/anton/uace.c> to
>> control this Linux kernel feature (a similar program uac was available
>> on Digital OSF/1 (or whatever it had been renamed to at the time)), and
>> as a result I (and my students) got SIGBUS signals on unaligned
>> accesses even on Linux (where normally the kernel emulated unaligned
>> accesses).
>>
>> Alpha also has ldq_u (which performed an aligned access using a
>> possibly unaligned address), and pseudo-instructions like ustq (which
>> expands to 11 instructions IIRC). I actually found that the gas
>> implementation of ustq was broken, which showed that no Linux software
>> used this pseudo-instruction.
>>
>> You may be confusing the unaligned access issue with byte and 16-bit
>> memory access, which was added in EV56.
>>
>> Trapping on unaligned accesses was very common at the time: the 68000
>> and 68010 do it (but the 68020 and onwards support unaligned
>> accesses), and all the first-generation RISCs also trap on unaligned
>> accesses: ARM, HPPA, MIPS, SPARC, 88k. In the second generation
>> (Power and Alpha) only Power in big-endian mode supports unaligned
>> accesses.
>>
>> Even Intel jumped on the alignment-requiring bandwagon: they added the
>> AC flag in the 486 (1989) and alignment-requiring memory accesses with
>> SSE (1999); of course, they did both in an idiotic way:
>>
>> 1) Setting AC requires 8-byte alignment for 8-byte FP numbers, but the
>> Intel-designed ABI requires 8-byte FP numbers to be stored with 4-byte
>> alignment, which means that you could not set AC except in very
>> confined circumstances.
>>
>> 2) Aligned SSE memory accesses require 16-byte alignments, while the
>> elements were smaller; this means that you either use the unaligned
>> load and store instructions, or your vectorized code has to
>> special-case the first and last elements to achieve alignment. AMD
>> added a flag for dropping the alignment requirement for implicit SSE
>> memory accesses. Intel did not follow AMD in that, but dropped the
>> alignment requirement in AVX; this would have been ok if AVX had been
>> universally available, but unfortunately AVX was not implemented in
>> Intel's cheap cores up to Gracemont, and often disabled on lower-end
>> SKUs of their expensive cores.
>>
>> Back to earlier times: Many RISCs died over the years, with IA-64's
>> prospect being particularly deadly: HPPA, MIPS (as GP-computer
>> architecture), and Alpha were cancelled and intended to be replaced
>> with IA-64.
>>
>> IA-64 also did not guarantee that unaligned accesses work. More
>> precisely,
>> <https://www.ece.lsu.edu/ee4720/doc/itanium-arch-2.1-v2.pdf> says on
>> page 2:78:
>>
>> |When PSR.ac is 1, any Itanium data memory reference that is not
>> |aligned on a boundary the size of the operand results in an Unaligned
>> |Data Reference fault; e.g., 1, 2, 4, 8, 10, and 16-byte datums should
>> |be aligned on 1, 2, 4, 8, 16, and 16-byte boundaries respectively to
>> |avoid generation of an Unaligned Data Reference fault.
>> |...
>> |When PSR.ac is 0, Itanium data memory references that are not aligned
>> |may or may not result in an Unaligned Data Reference fault based on
>> |the implementation. The level of unaligned memory support is
>> |implementation specific. However, all implementations will raise an
>> |Unaligned Data Reference fault if the datum referenced by an Itanium
>> |instruction spans a 4K aligned boundary, and many implementations will
>> |raise an Unaligned Data Reference fault if the datum spans a cache
>> |line. Implementations may also raise an Unaligned Data Reference
>> |fault for any other unaligned Itanium memory reference.
>>
>> The wider support for unaligned accesses came only later: ARM added
>> support for unaligned accesses, Power had it already for big-endian
>> (and later added it for little-endian, along with a new form of
>> little-endian support), IA-64 and SPARC died out, and ARM A64 and
>> RISC-V support unaligned accesses, leaving only architectures with
>> unaligned-access support for general-purpose computers.
>>
>> I guess that the RISC-V decision shows that by that time (early 2010s,
>> does anybody know a more exact date?) unaligned access support had
>> won.
>>
>> It's interesting to speculate why we have had alignment requirements
>> for so long in RISCs (from the 1980s to the 2000s), and eventually
>> unaligned won. It seems that the forces involved were relatively
>> small:
>>
>> The number of unaligned-access reports in the Linux kernel log on
>> Alpha was relatively small, so software emulation is a minor cost.
>>
>> Another cost of alignment requirements is the problem of using the
>> instructions in a SIMD context (even non-SIMD instructions, e.g.,
>> performing a hash of a string by loading 8-byte pieces and mixing them
>> with the hash value up to now). One can use stuff like Alpha's uldq
>> for such uses and memcpy() to make the compiler generate it in
>> somewhat recent C compilers, but I guess that doing stuff like *p
>> (which generates an alignment-requiring ldq on the Alpha) was frequent
>> enough in the early 2000s (when Linux on IA-32/AMD64 had become the
>> dominant Unix) that architectures finally added support for unaligned
>> accesses.
>>
>> The hardware cost of supporting unaligned accesses is not that large,
>> as Mitch Alsup explained elsewhere in this thread, and that cost got
>> relatively smaller as transistor density increased.
>>
>> I guess that the software environment of the years between 2000 and
>> 2010 and the lowering hardware costs of unaligned-access support
>> eventually tipped the balance in favour of unaligned accesses. It's
>> interesting that the same had earlier happened on S/360->S/370 and
>> 68010->68020.
>>
>> Concerning the original question: One may also ask what the support
>> for unaligned accesses bought the 8086: The 8086 was designed to be
>> 8080-compatible at the assembly level (from what I read, this feature
>> helped create an initial software base, even though it was not really
>> used in later years), and this meant that unaligned accesses would
>> need to be supported, too. If they had skimped on unaligned accesses,
>> maybe the 8086 would have been much less successful than it was;
>> but it was actually the 8088 that was successful, and unaligned
>> accesses were free on that, so it would not have changed much if
>> the 8086 required alignment (if that requirement was not inherited by
>> the 8088).
>>
>> - anton
>> --
>> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
>> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
>
> For a while my Thor core had only partial unaligned access: unaligned
> access was allowed, but only if it fit within a cache line. With a cache
> line being 64 bytes, unaligned access was mostly supported. Thor has
> since been updated to support, in theory, any unaligned access, running
> two memory cycles if the access crosses a 64-byte line boundary. This is
> yet to be tested. A lot of shifting is required even for aligned access
> when data is coming from or going to the cache line.
>
> IMO unaligned access is worth supporting. Even instruction access should
> support unaligned access.
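
The "one cycle if it fits in a line, two if it straddles" rule described above can be sketched as follows. This is only an illustration with made-up names, assuming the 64-byte line size from the post; it says nothing about how Thor actually implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { LINE_BYTES = 64 };  /* cache line size taken from the post */

/* True if the bytes [addr, addr + size) touch two different lines. */
static bool spans_two_lines(uint64_t addr, unsigned size)
{
    assert(size >= 1 && size <= LINE_BYTES);
    return (addr / LINE_BYTES) != ((addr + size - 1) / LINE_BYTES);
}

/* Memory cycles needed under the "fits in one line or takes two" rule. */
static unsigned memory_cycles(uint64_t addr, unsigned size)
{
    return spans_two_lines(addr, size) ? 2u : 1u;
}
```

Of the 64 possible starting offsets for an 8-byte access, only 7 (offsets 57 to 63) cross a line, which fits the remark that the partial scheme already covered most unaligned accesses.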


Re: What did it cost the 8086 to support unaligned access?

<QHBpM.52228$edN3.24198@fx14.iad>

 by: EricP - Thu, 6 Jul 2023 16:09 UTC

Quadibloc wrote:
> On Thursday, July 6, 2023 at 7:20:03 AM UTC-6, EricP wrote:
>
>> The far more insidious hidden problem is that while byte memory access cpus
>> *define* that reads and writes to adjacent memory cells are independent,
>> this is NOT true for cpus that use larger alignment and must perform
>> such updates as a read-modify-write operation in registers.
>>
>> That RMW sequence means that adjacent byte and word and all unaligned
>> accesses are not independent as an update to one variable can clobber
>> and lose a concurrent update to an adjacent variable.
>
> I thought that x86 CPUs were designed to solve this problem.
>
> The memory bus is organized so that there is one enable line for each eight
> data lines, and, therefore, any read or write operation can be made to operate
> only on specific bytes, no matter how wide the data bus.
>
> And so a misaligned 64-bit data item, on a computer with a 64-bit data bus,
> can always be fetched or written in only two accesses.
>
> Of course, fetches are usually of an entire DRAM data line into the cache,
> but that's another issue.
>
> John Savard

On x86 as on VAX the adjacent bytes are accessed as independent memory cells.
If two independent program variables just happen to be located in adjacent
memory bytes then in the two statements
c1 = ...
c2 = ...
the updates cannot interact.

Stanford MIPS had 32-bit aligned and Alpha 21064 32/64-bit aligned memory
cells. Byte, word, unaligned and straddling accesses were done by reading
a 32/64-bit cell into a register, using byte extract and insert instructions
on the register, then writing the whole 32/64-bit memory cell back.

In the second case the variable updates which appear independent may not be,
as concurrent exceptions and IO operations can create hidden race conditions
which result in lost updates due to the RMW operation.

For example, suppose c1 is written by the main program and c2 by a catch
handler which interrupts the RMW sequence for the adjacent c1. The handler
updates c2 and returns; the write of c1 then finishes and overwrites the new
value in c2 with its prior value.

A program which works fine on VAX or x86 might fail for no apparent reason
on an Alpha 21064. When inspected with the debugger no fault is apparent.
If one introduces a multi-threaded OS then the number of potential race
conditions goes up, and then up again for true SMP multi-processors.
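
The c1/c2 scenario can be replayed deterministically in plain C by modelling the 32-bit cell as one variable and calling the "handler" by hand in the middle of the read-modify-write. This is a simulation sketch with invented names, not real interrupt code: c1 is assumed to live in bits 0..7 of the cell and c2 in bits 8..15.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit memory cell holding two adjacent byte variables. */
static uint32_t cell;

static uint8_t get_c1(void) { return (uint8_t)(cell & 0xFF); }
static uint8_t get_c2(void) { return (uint8_t)((cell >> 8) & 0xFF); }

/* Handler: a complete read-modify-write update of c2. */
static void handler_write_c2(uint8_t v)
{
    uint32_t r = cell;
    r = (r & ~0xFF00u) | ((uint32_t)v << 8);
    cell = r;
}

/* Main program's write to c1. If 'interrupt' is nonzero, the handler
 * runs between the read and the write-back of the cell. */
static void main_write_c1(uint8_t v, int interrupt, uint8_t handler_value)
{
    uint32_t r = cell;                    /* read the whole cell */
    if (interrupt)
        handler_write_c2(handler_value);  /* handler updates c2 in memory */
    r = (r & ~0xFFu) | v;                 /* insert byte in the register */
    cell = r;                             /* write back: carries stale c2 */
}

/* Returns 1 if the handler's update to c2 was lost. */
static int update_lost(int interrupt)
{
    cell = 0;
    main_write_c1(0x11, interrupt, 0x22);
    assert(get_c1() == 0x11);             /* c1 always arrives */
    return interrupt && get_c2() != 0x22;
}
```

With the interrupt in the window, update_lost(1) reports the loss: c1 holds 0x11 but c2 is back to 0, even though neither piece of code ever wrote to the other's variable at the source level.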

Re: What did it cost the 8086 to support unaligned access?

<u86pdf$vu1u$1@dont-email.me>

 by: Terje Mathisen - Thu, 6 Jul 2023 16:16 UTC

Russell Wallace wrote:
> The Intel 8086 supported unaligned loads and stores of 16-bit data,
> e.g. mov ax, foo was guaranteed to work even if foo was odd.
>
> What did this cost, in terms of performance and chip area, compared
> to an alternative architecture that would have been the same except
> for unaligned access being a trap or undefined behavior?
>
> To be clear, I'm not talking about the dynamic behavior of code. On
> the actual 8086, access was still faster if the pointer did happen to
> be even. I'm asking, suppose all your pointers for word access were
> actually even, how much bigger and slower was the chip made by having
> to support the possibility that some of them could have been odd?

I would suggest that since they already knew that they would make an
8-bit bus version (the 8088 which ended up in the IBM PC), the control
circuits already knew how to combine two 8-bit accesses into a 16-bit
load. In the '86 an aligned 16-bit load would run a single bus cycle
(taking 4 clock cycles), while the same operation on the '88 took twice
as long. Unless the '86 could do unaligned accesses in less than 8
cycles, I would guess the mechanism was the same!

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
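
The arithmetic behind that guess can be written down as a tiny model: a 16-bit transfer costs one 4-clock bus cycle when it is aligned on the 16-bit bus and two bus cycles otherwise, while the 8-bit bus always needs two. This is a back-of-the-envelope sketch with made-up names; it ignores wait states, the prefetch queue and effective-address calculation.

```c
#include <assert.h>
#include <stdint.h>

enum { CLOCKS_PER_BUS_CYCLE = 4 };  /* figure quoted in the post above */

/* bus_bytes is the data bus width in bytes: 2 for the 8086, 1 for the 8088. */
static unsigned bus_cycles_for_word(uint32_t addr, unsigned bus_bytes)
{
    assert(bus_bytes == 1 || bus_bytes == 2);
    if (bus_bytes == 2 && (addr & 1) == 0)
        return 1;  /* aligned word on a 16-bit bus: one transfer */
    return 2;      /* otherwise two byte-wide transfers */
}

static unsigned transfer_clocks_for_word(uint32_t addr, unsigned bus_bytes)
{
    return bus_cycles_for_word(addr, bus_bytes) * CLOCKS_PER_BUS_CYCLE;
}
```

In this model an odd-address word access on the '86 costs the same 8 clocks as any word access on the '88, which is what you would expect if both reuse the same two-transfer mechanism.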

Re: What did it cost the 8086 to support unaligned access?

<u86sbp$7vac$2@newsreader4.netcologne.de>

 by: Thomas Koenig - Thu, 6 Jul 2023 17:07 UTC

MitchAlsup <MitchAlsup@aol.com> schrieb:

> FORTRAN common blocks required misaligned DP FP accesses.

That is probably the most-ignored part of the Fortran standard.
Even the very first FORTRAN 77 compiler, by Bell Labs, aligned the
data in COMMON blocks.

The x86-64 psABI also specified aligned COMMON blocks (they are
treated the same as structs).
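
To make the difference concrete, here is a rough sketch using Python's struct module. The block is hypothetical: a COMMON with one default INTEGER followed by one DOUBLE PRECISION, with the INTEGER assumed to be 4 bytes. The first offset is what a strict back-to-back storage sequence gives, the second is what a struct-like layout with native alignment gives on whatever machine runs the script.

```python
import struct

# Hypothetical block: COMMON /BLK/ N, X
#   INTEGER N           (assumed 4 bytes)
#   DOUBLE PRECISION X  (8 bytes)

# Back-to-back storage sequence: X starts right after N, no padding.
# "=" selects standard sizes with no alignment.
packed_offset_x = struct.calcsize("=i")

# Struct-like layout: "@" selects native sizes and alignment, and a
# zero-count "0d" pads up to the alignment a double needs here.
aligned_offset_x = struct.calcsize("@i0d")

# Native alignment of a double: one char, then pad for a double.
double_alignment = struct.calcsize("@c0d")

print("offset of X, storage sequence:", packed_offset_x)
print("offset of X, struct-like     :", aligned_offset_x)
print("native alignment of a double :", double_alignment)
print("storage-sequence X aligned?  :", packed_offset_x % double_alignment == 0)
```

Where the native alignment of a double is larger than 4, the storage-sequence layout puts X on a misaligned address, and the struct-like layout inserts padding instead.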

Hm... which ABI actually specifies non-aligned access? I don't
know of any.

Re: What did it cost the 8086 to support unaligned access?

<u86sra$1070j$1@dont-email.me>

https://news.novabbs.org/devel/article-flat.php?id=33054&group=comp.arch#33054

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: What did it cost the 8086 to support unaligned access?
Date: Thu, 6 Jul 2023 12:15:20 -0500
Organization: A noiseless patient Spider
Lines: 42
Message-ID: <u86sra$1070j$1@dont-email.me>
References: <b2711000-2ce5-4c51-b44d-665e97a4c488n@googlegroups.com>
<u86pdf$vu1u$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 6 Jul 2023 17:15:23 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="2c0eaf010aa842d93947785f8584e017";
logging-data="1055763"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+BI3wLjfhmc9LjlfWiLevP"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.12.0
Cancel-Lock: sha1:cO+I09iwKDFtSkUVHlGV1uHmdpw=
Content-Language: en-US
In-Reply-To: <u86pdf$vu1u$1@dont-email.me>
 by: BGB - Thu, 6 Jul 2023 17:15 UTC

On 7/6/2023 11:16 AM, Terje Mathisen wrote:
> Russell Wallace wrote:
>> The Intel 8086 supported unaligned loads and stores of 16-bit data,
>> e.g. mov ax, foo was guaranteed to work even if foo was odd.
>>
>> What did this cost, in terms of performance and chip area, compared
>> to an alternative architecture that would have been the same except
>> for unaligned access being a trap or undefined behavior?
>>
>> To be clear, I'm not talking about the dynamic behavior of code. On
>> the actual 8086, access was still faster if the pointer did happen to
>> be even. I'm asking, suppose all your pointers for word access were
>> actually even, how much bigger and slower was the chip made by having
>> to support the possibility that some of them could have been odd?
>
> I would suggest that since they already knew that they would make an
> 8-bit bus version (the 8088 which ended up in the IBM PC), the control
> circuits already knew how to combine two 8-bit accesses into a 16-bit
> load. In the '86 an aligned 16-bit load would run a single bus cycle
> (taking 4 clock cycles), while the same operation on the '88 took twice
> as long. Unless the '86 could do unaligned accesses in less than 8
> cycles, I would guess the mechanism was the same!
>

This makes it seem like the 8086/8088 would have been painfully slow?...

Like, how exactly did they run programs like Wolfenstein 3D or the
various platformer games?...

Like, even with all my fancy stuff, and a 1-cycle throughput for many
memory accesses to the L1 cache, still difficult to get any semblance of
usable performance with things like Wolf3D much under ~ 10-14 MHz ...

Granted, a lot of these also required VGA, so maybe running them on the
original PC wasn't really a thing even if they were originally written
for 16-bit real-mode?...
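
For a sense of scale on the "painfully slow" question, a back-of-envelope sketch of peak bus throughput from the 4-clocks-per-bus-cycle figure quoted above. The 4.77 MHz clock is my assumption (the original IBM PC clock, not something stated in this thread), the names are made up, and wait states, refresh and instruction fetch competing for the bus are all ignored, so real numbers would be lower.

```python
CLOCK_HZ = 4_770_000        # ASSUMPTION: roughly the original IBM PC clock
CLOCKS_PER_BUS_CYCLE = 4    # from the quoted post: one bus cycle = 4 clocks


def peak_bytes_per_second(bus_width_bytes: int,
                          clock_hz: int = CLOCK_HZ) -> float:
    """Upper bound on bus traffic: one full-width transfer per bus cycle."""
    return clock_hz / CLOCKS_PER_BUS_CYCLE * bus_width_bytes


for name, width in (("8088 (8-bit bus)", 1), ("8086 (16-bit bus)", 2)):
    print(f"{name}: {peak_bytes_per_second(width) / 1e6:.2f} MB/s peak")
```

That ceiling has to cover instruction fetch as well as data, which is part of why the later games mentioned here are not really a fair fit for the original hardware.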

> Terje
>
