
Subject                                                       Author
* Re: Load/Store with auto-increment                          Anton Ertl
+- Re: Load/Store with auto-increment                         Scott Lurndal
+* Re: Load/Store with auto-increment                         BGB
|`* Re: Load/Store with auto-increment                        Anton Ertl
| +- Re: Load/Store with auto-increment                       MitchAlsup
| `- Re: Load/Store with auto-increment                       BGB
`* Re: Load/Store with auto-increment                         MitchAlsup
 +- Re: Load/Store with auto-increment                        Keith Thompson
 `* Re: arcana of encodings, Load/Store with auto-increment   John Levine
  `- Re: arcana of encodings, Load/Store with auto-increment  David Brown

Re: Load/Store with auto-increment

<2023May11.163319@mips.complang.tuwien.ac.at>


https://news.novabbs.org/devel/article-flat.php?id=32159&group=comp.arch#32159

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Load/Store with auto-increment
Date: Thu, 11 May 2023 14:33:19 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 40
Message-ID: <2023May11.163319@mips.complang.tuwien.ac.at>
References: <u35prk$2ssbq$1@dont-email.me> <2c034f32-5954-4c48-b650-16973aa55606n@googlegroups.com> <be260539-b8e5-408d-971c-16070e0e543dn@googlegroups.com> <52e78621-b93d-460a-8b1d-888512284d19n@googlegroups.com> <Nk96M.2805257$9sn9.1655802@fx17.iad> <u3bf3r$3ut5g$1@dont-email.me> <Ksb6M.534497$Olad.124053@fx35.iad> <u3bm2p$3vih2$1@dont-email.me> <2023May9.083355@mips.complang.tuwien.ac.at> <u3d3j4$7k5h$1@dont-email.me> <2023May9.182238@mips.complang.tuwien.ac.at> <u3e2jm$bb98$1@dont-email.me>
Injection-Info: dont-email.me; posting-host="bc2e03655b91bf1121d8368dc6f60a96";
logging-data="1197229"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19NddjFqQ86DMLrDzWEkn/O"
Cancel-Lock: sha1:PIFe6uRBRcx2/ufYPXVIXPcfLgA=
X-newsreader: xrn 10.11
 by: Anton Ertl - Thu, 11 May 2023 14:33 UTC

BGB <cr88192@gmail.com> writes:
>On 5/9/2023 11:22 AM, Anton Ertl wrote:
>> Just do it, and you will see that it's much less painful than
>> converting between UTF-8 and UTF-32 all the time.
>>
>
>There is no straightforward mapping between UTF-8 bytes and the logical
>X position of a cursor on screen.
>
>One would likely need a loop or similar to walk the text to figure out
>where to put the cursor at.

True. But that is already true for plain ASCII, thanks to characters
like TAB. When you then go to UTF-32, you get things like zero-width
spaces, double-wide (on fixed-width fonts) characters (typically CJK)
and combining marks. So UTF-32 buys you absolutely nothing compared
to UTF-8.

You suggest having a 64-bit cell that represents a single glyph.
That's possible, but it's not UTF-32. You can get there from UTF-8
roughly as easily as from UTF-32. It solves the problem with
combining marks, but you still have glyphs with widths 0, 1, or 2, and
the TAB whose width depends on the stuff that came before it.
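
A minimal sketch of the walk discussed above, in Python. It is not from the thread: the names char_width and cursor_column are made up for illustration, and the width rules (0 for combining marks and format characters, 2 for East Asian wide/fullwidth, 1 otherwise) are only a rough approximation of what a terminal or editor really does. It assumes the byte offset falls on a character boundary.

```python
import unicodedata

def char_width(ch: str) -> int:
    """Approximate cell width of one code point (TAB is handled by the caller)."""
    # Combining marks and format characters (e.g. zero-width space) take no cell.
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian wide and fullwidth characters take two cells in fixed-width fonts.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1

def cursor_column(utf8_line: bytes, byte_offset: int, tabstop: int = 8) -> int:
    """Return the 0-based screen column of the character starting at byte_offset.

    Assumption: byte_offset is on a character boundary; otherwise decode() raises.
    """
    col = 0
    for ch in utf8_line[:byte_offset].decode("utf-8"):
        if ch == "\t":
            col += tabstop - col % tabstop   # advance to the next tab stop
        else:
            col += char_width(ch)
    return col
```

The same loop works unchanged on a list of UTF-32 code points, which is the point being made: the walk is needed either way.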

>If Emacs can handle huge files, maybe that is in its merit, but this is
>not likely a common use-case.

I have not made a survey of Emacs users, but I would not be surprised
if many users used it on huge files. I doubt I am the only one who
uses emacs to search, e.g., his mbox file. Another usage of huge
files has been to generate a trace of a program execution and then
search through it with Emacs.

Of course, if your tools don't support a particular usage, you won't
use them that way, and you will think that nobody does.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Load/Store with auto-increment

<DN77M.2684301$iS99.2214402@fx16.iad>


https://news.novabbs.org/devel/article-flat.php?id=32162&group=comp.arch#32162

Path: i2pn2.org!i2pn.org!news.1d4.us!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx16.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: scott@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Load/Store with auto-increment
Newsgroups: comp.arch
References: <u35prk$2ssbq$1@dont-email.me> <be260539-b8e5-408d-971c-16070e0e543dn@googlegroups.com> <52e78621-b93d-460a-8b1d-888512284d19n@googlegroups.com> <Nk96M.2805257$9sn9.1655802@fx17.iad> <u3bf3r$3ut5g$1@dont-email.me> <Ksb6M.534497$Olad.124053@fx35.iad> <u3bm2p$3vih2$1@dont-email.me> <2023May9.083355@mips.complang.tuwien.ac.at> <u3d3j4$7k5h$1@dont-email.me> <2023May9.182238@mips.complang.tuwien.ac.at> <u3e2jm$bb98$1@dont-email.me> <2023May11.163319@mips.complang.tuwien.ac.at>
Lines: 20
Message-ID: <DN77M.2684301$iS99.2214402@fx16.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Thu, 11 May 2023 15:25:23 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Thu, 11 May 2023 15:25:23 GMT
X-Received-Bytes: 1867
 by: Scott Lurndal - Thu, 11 May 2023 15:25 UTC

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>BGB <cr88192@gmail.com> writes:
>>On 5/9/2023 11:22 AM, Anton Ertl wrote:
>>> Just do it, and you will see that it's much less painful than
>>> converting between UTF-8 and UTF-32 all the time.

>>If Emacs can handle huge files, maybe that is in its merit, but this is
>>not likely a common use-case.
>
>I have not made a survey of Emacs users, but I would not be surprised
>if many users used it on huge files. I doubt I am the only one who
>uses emacs to search, e.g., his mbox file. Another usage of huge
>files has been to generate a trace of a program execution and then
>search through it with Emacs.

Even Emacs has trouble with large program traces. As does Vim
(although opening it read-only helps a bit).

Really, split and grep are more useful for large traces ime.

Re: Load/Store with auto-increment

<u3j6g5$1593o$2@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=32164&group=comp.arch#32164

Path: i2pn2.org!i2pn.org!news.swapon.de!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Load/Store with auto-increment
Date: Thu, 11 May 2023 11:52:20 -0500
Organization: A noiseless patient Spider
Lines: 51
Message-ID: <u3j6g5$1593o$2@dont-email.me>
References: <u35prk$2ssbq$1@dont-email.me>
<2c034f32-5954-4c48-b650-16973aa55606n@googlegroups.com>
<be260539-b8e5-408d-971c-16070e0e543dn@googlegroups.com>
<52e78621-b93d-460a-8b1d-888512284d19n@googlegroups.com>
<Nk96M.2805257$9sn9.1655802@fx17.iad> <u3bf3r$3ut5g$1@dont-email.me>
<Ksb6M.534497$Olad.124053@fx35.iad> <u3bm2p$3vih2$1@dont-email.me>
<2023May9.083355@mips.complang.tuwien.ac.at> <u3d3j4$7k5h$1@dont-email.me>
<2023May9.182238@mips.complang.tuwien.ac.at> <u3e2jm$bb98$1@dont-email.me>
<2023May11.163319@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 11 May 2023 16:52:21 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="7adb6d8692f54f2bba25bfdce2224f90";
logging-data="1221752"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+5faxZOPA4JCsU7PbhjbYh"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.10.1
Cancel-Lock: sha1:mhbLU7O6jNM/QmXEGXP+U//cWV0=
Content-Language: en-US
In-Reply-To: <2023May11.163319@mips.complang.tuwien.ac.at>
 by: BGB - Thu, 11 May 2023 16:52 UTC

On 5/11/2023 9:33 AM, Anton Ertl wrote:
> BGB <cr88192@gmail.com> writes:
>> On 5/9/2023 11:22 AM, Anton Ertl wrote:
>>> Just do it, and you will see that it's much less painful than
>>> converting between UTF-8 and UTF-32 all the time.
>>>
>>
>> There is no straightforward mapping between UTF-8 bytes and the logical
>> X position of a cursor on screen.
>>
>> One would likely need a loop or similar to walk the text to figure out
>> where to put the cursor at.
>
> True. But that is already true for plain ASCII, thanks to characters
> like TAB. When you then go to UTF-32, you get things like zero-width
> spaces, double-wide (on fixed-width fonts) characters (typically CJK)
> and combining marks. So UTF-32 buys you absolutely nothing compared
> to UTF-8.
>

The tab would usually become 4 or 8 space-like characters, which are
understood to represent a tab character.

> You suggest having a 64-bit cell that represents a single glyph.
> That's possible, but it's not UTF-32. You can get there from UTF-8
> roughly as easily as from UTF-32. It solves the problem with
> combining marks, but you still have glyphs with widths 0, 1, or 2, and
> the TAB whose width depends on the stuff that came before it.
>

I wasn't really arguing for UTF-32 in the first place.

>> If Emacs can handle huge files, maybe that is in its merit, but this is
>> not likely a common use-case.
>
> I have not made a survey of Emacs users, but I would not be surprised
> if many users used it on huge files. I doubt I am the only one who
> uses emacs to search, e.g., his mbox file. Another usage of huge
> files has been to generate a trace of a program execution and then
> search through it with Emacs.
>
> Of course, if your tools don't support a particular usage, you won't
> use them that way, and you will think that nobody does.
>

OK.

> - anton

Re: Load/Store with auto-increment

<2023May11.192934@mips.complang.tuwien.ac.at>


https://news.novabbs.org/devel/article-flat.php?id=32165&group=comp.arch#32165

Path: i2pn2.org!i2pn.org!news.swapon.de!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Load/Store with auto-increment
Date: Thu, 11 May 2023 17:29:34 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 24
Message-ID: <2023May11.192934@mips.complang.tuwien.ac.at>
References: <u35prk$2ssbq$1@dont-email.me> <52e78621-b93d-460a-8b1d-888512284d19n@googlegroups.com> <Nk96M.2805257$9sn9.1655802@fx17.iad> <u3bf3r$3ut5g$1@dont-email.me> <Ksb6M.534497$Olad.124053@fx35.iad> <u3bm2p$3vih2$1@dont-email.me> <2023May9.083355@mips.complang.tuwien.ac.at> <u3d3j4$7k5h$1@dont-email.me> <2023May9.182238@mips.complang.tuwien.ac.at> <u3e2jm$bb98$1@dont-email.me> <2023May11.163319@mips.complang.tuwien.ac.at> <u3j6g5$1593o$2@dont-email.me>
Injection-Info: dont-email.me; posting-host="bc2e03655b91bf1121d8368dc6f60a96";
logging-data="1231609"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/83ZHKjkliQApwPE3L7QbC"
Cancel-Lock: sha1:RL9jg1ymHbHZwY/No9uNmhNpvac=
X-newsreader: xrn 10.11
 by: Anton Ertl - Thu, 11 May 2023 17:29 UTC

BGB <cr88192@gmail.com> writes:
>On 5/11/2023 9:33 AM, Anton Ertl wrote:
>> True. But that is already true for plain ASCII, thanks to characters
>> like TAB. When you then go to UTF-32, you get things like zero-width
>> spaces, double-wide (on fixed-width fonts) characters (typically CJK)
>> and combining marks. So UTF-32 buys you absolutely nothing compared
>> to UTF-8.
>>
>
>The tab would usually become 4 or 8 space-like characters, which are
>understood to represent a tab character.

Tabs don't work that way. With tab stops at columns divisible by 8, a
tab can be equivalent to 1-8 spaces; and how many depends on the stuff
earlier in the line.
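
A small Python sketch of that rule, assuming 0-based columns and tab stops at every multiple of tabstop. The function names are hypothetical; only str.expandtabs in the usage note is a real library call.

```python
def tab_advance(col: int, tabstop: int = 8) -> int:
    """Number of columns a TAB occupies when the cursor is at 0-based column col."""
    return tabstop - col % tabstop

def expand_tabs(line: str, tabstop: int = 8) -> str:
    """Replace each TAB by the number of spaces it would occupy on screen.

    Assumption: every other character is one column wide (plain ASCII).
    """
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            n = tab_advance(col, tabstop)
            out.append(" " * n)
            col += n
        else:
            out.append(ch)
            col += 1
    return "".join(out)
```

For plain ASCII input this should agree with Python's built-in str.expandtabs(8); a TAB at column 0 takes 8 columns, at column 7 it takes 1.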

>I wasn't really arguing for UTF-32 in the first place.

One less reason to have anything but UTF-8.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Load/Store with auto-increment

<f18ffe2a-9f0f-4c0c-b0f6-91de88eced3en@googlegroups.com>


https://news.novabbs.org/devel/article-flat.php?id=32166&group=comp.arch#32166

X-Received: by 2002:a05:620a:408b:b0:74e:3031:f54c with SMTP id f11-20020a05620a408b00b0074e3031f54cmr6107874qko.10.1683827255848;
Thu, 11 May 2023 10:47:35 -0700 (PDT)
X-Received: by 2002:a05:6870:7ec5:b0:192:8947:8ac with SMTP id
wz5-20020a0568707ec500b00192894708acmr5902632oab.1.1683827255680; Thu, 11 May
2023 10:47:35 -0700 (PDT)
Path: i2pn2.org!i2pn.org!news.swapon.de!newsreader4.netcologne.de!news.netcologne.de!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 11 May 2023 10:47:35 -0700 (PDT)
In-Reply-To: <2023May11.192934@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:b141:ed72:1f40:88ff;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:b141:ed72:1f40:88ff
References: <u35prk$2ssbq$1@dont-email.me> <52e78621-b93d-460a-8b1d-888512284d19n@googlegroups.com>
<Nk96M.2805257$9sn9.1655802@fx17.iad> <u3bf3r$3ut5g$1@dont-email.me>
<Ksb6M.534497$Olad.124053@fx35.iad> <u3bm2p$3vih2$1@dont-email.me>
<2023May9.083355@mips.complang.tuwien.ac.at> <u3d3j4$7k5h$1@dont-email.me>
<2023May9.182238@mips.complang.tuwien.ac.at> <u3e2jm$bb98$1@dont-email.me>
<2023May11.163319@mips.complang.tuwien.ac.at> <u3j6g5$1593o$2@dont-email.me> <2023May11.192934@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f18ffe2a-9f0f-4c0c-b0f6-91de88eced3en@googlegroups.com>
Subject: Re: Load/Store with auto-increment
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Thu, 11 May 2023 17:47:35 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3323
 by: MitchAlsup - Thu, 11 May 2023 17:47 UTC

On Thursday, May 11, 2023 at 12:36:50 PM UTC-5, Anton Ertl wrote:
> BGB <cr8...@gmail.com> writes:
> >On 5/11/2023 9:33 AM, Anton Ertl wrote:
> >> True. But that is already true for plain ASCII, thanks to characters
> >> like TAB. When you then go to UTF-32, you get things like zero-width
> >> spaces, double-wide (on fixed-width fonts) characters (typically CJK)
> >> and combining marks. So UTF-32 buys you absolutely nothing compared
> >> to UTF-8.
> >>
> >
> >The tab would usually become 4 or 8 space-like characters, which are
> >understood to represent a tab character.
<
> Tabs don't work that way. With tab stops at columns divisible by 8, a
> tab can be equivalent to 1-8 spaces; and how many depends on the stuff
> earlier in the line.
<
And then there is how tabs work on an 029 key punch: You place a card in
the hopper and punch marks on the columns you want tabs to stop. You
then remove the card, and wrap it around the <forgot term> and stick this
back into the key punch.
<
Henceforth, every time you key a tab, the card slides over to the next stop.
<
Essentially equivalent to "tabs 5 12 15 27 38 49 64" for example.
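
A rough Python sketch of tabbing to an explicit stop list like the one above. The function name is hypothetical, columns are taken to be 1-based, and returning None past the last stop is an assumption made for illustration, not a description of what any real device does.

```python
from bisect import bisect_right

# Stop list taken from the example above; assumed to be 1-based columns.
STOPS = [5, 12, 15, 27, 38, 49, 64]

def next_stop(col, stops=STOPS):
    """Return the first stop strictly to the right of column col, or None."""
    i = bisect_right(stops, col)          # index of first stop > col
    return stops[i] if i < len(stops) else None
```

Unlike fixed stops every 8 columns, the distance covered by one tab here cannot be computed with a modulo; it needs the table.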
<
FORTRAN formatting tabs are entirely different in that one can tab-backwards...
<
> >I wasn't really arguing for UTF-32 in the first place.
> One less reason to have anything but UTF-8.
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Load/Store with auto-increment

<4cde7139-f78d-490b-b8c7-294ff25328a0n@googlegroups.com>


https://news.novabbs.org/devel/article-flat.php?id=32167&group=comp.arch#32167

X-Received: by 2002:a0c:f9ce:0:b0:5df:47b4:a977 with SMTP id j14-20020a0cf9ce000000b005df47b4a977mr6375041qvo.5.1683827425957;
Thu, 11 May 2023 10:50:25 -0700 (PDT)
X-Received: by 2002:a05:6830:199:b0:6a5:d944:f1c4 with SMTP id
q25-20020a056830019900b006a5d944f1c4mr2951543ota.7.1683827425655; Thu, 11 May
2023 10:50:25 -0700 (PDT)
Path: i2pn2.org!i2pn.org!news.swapon.de!news.mixmin.net!weretis.net!feeder8.news.weretis.net!feeder1.feed.usenet.farm!feed.usenet.farm!peer01.ams4!peer.am4.highwinds-media.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 11 May 2023 10:50:25 -0700 (PDT)
In-Reply-To: <2023May11.163319@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:b141:ed72:1f40:88ff;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:b141:ed72:1f40:88ff
References: <u35prk$2ssbq$1@dont-email.me> <2c034f32-5954-4c48-b650-16973aa55606n@googlegroups.com>
<be260539-b8e5-408d-971c-16070e0e543dn@googlegroups.com> <52e78621-b93d-460a-8b1d-888512284d19n@googlegroups.com>
<Nk96M.2805257$9sn9.1655802@fx17.iad> <u3bf3r$3ut5g$1@dont-email.me>
<Ksb6M.534497$Olad.124053@fx35.iad> <u3bm2p$3vih2$1@dont-email.me>
<2023May9.083355@mips.complang.tuwien.ac.at> <u3d3j4$7k5h$1@dont-email.me>
<2023May9.182238@mips.complang.tuwien.ac.at> <u3e2jm$bb98$1@dont-email.me> <2023May11.163319@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <4cde7139-f78d-490b-b8c7-294ff25328a0n@googlegroups.com>
Subject: Re: Load/Store with auto-increment
From: MitchAlsup@aol.com (MitchAlsup)
Injection-Date: Thu, 11 May 2023 17:50:25 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2213
 by: MitchAlsup - Thu, 11 May 2023 17:50 UTC

On Thursday, May 11, 2023 at 9:59:21 AM UTC-5, Anton Ertl wrote:
>
> True. But that is already true for plain ASCII, thanks to characters
> like TAB. When you then go to UTF-32, you get things like zero-width
> spaces, double-wide (on fixed-width fonts) characters (typically CJK)
> and combining marks. So UTF-32 buys you absolutely nothing compared
> to UTF-8.
<
Why would you want zero-width spaces ?

Re: Load/Store with auto-increment

<u3j9t8$15lil$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=32168&group=comp.arch#32168

Path: i2pn2.org!i2pn.org!news.swapon.de!news.mixmin.net!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88192@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Load/Store with auto-increment
Date: Thu, 11 May 2023 12:50:30 -0500
Organization: A noiseless patient Spider
Lines: 48
Message-ID: <u3j9t8$15lil$1@dont-email.me>
References: <u35prk$2ssbq$1@dont-email.me>
<52e78621-b93d-460a-8b1d-888512284d19n@googlegroups.com>
<Nk96M.2805257$9sn9.1655802@fx17.iad> <u3bf3r$3ut5g$1@dont-email.me>
<Ksb6M.534497$Olad.124053@fx35.iad> <u3bm2p$3vih2$1@dont-email.me>
<2023May9.083355@mips.complang.tuwien.ac.at> <u3d3j4$7k5h$1@dont-email.me>
<2023May9.182238@mips.complang.tuwien.ac.at> <u3e2jm$bb98$1@dont-email.me>
<2023May11.163319@mips.complang.tuwien.ac.at> <u3j6g5$1593o$2@dont-email.me>
<2023May11.192934@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 11 May 2023 17:50:32 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="7adb6d8692f54f2bba25bfdce2224f90";
logging-data="1234517"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX191M5eNQshG6j2H0gd17BjB"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.10.1
Cancel-Lock: sha1:y0PM9KvLqMG3K7FMZzBlqiO8+NY=
In-Reply-To: <2023May11.192934@mips.complang.tuwien.ac.at>
Content-Language: en-US
 by: BGB - Thu, 11 May 2023 17:50 UTC

On 5/11/2023 12:29 PM, Anton Ertl wrote:
> BGB <cr88192@gmail.com> writes:
>> On 5/11/2023 9:33 AM, Anton Ertl wrote:
>>> True. But that is already true for plain ASCII, thanks to characters
>>> like TAB. When you then go to UTF-32, you get things like zero-width
>>> spaces, double-wide (on fixed-width fonts) characters (typically CJK)
>>> and combining marks. So UTF-32 buys you absolutely nothing compared
>>> to UTF-8.
>>>
>>
>> The tab would usually become 4 or 8 space-like characters, which are
>> understood to represent a tab character.
>
> Tabs don't work that way. With tab stops at columns divisible by 8, a
> tab can be equivalent to 1-8 spaces; and how many depends on the stuff
> earlier in the line.
>

Granted, I was assuming the case of tabs at the start of the line.
But a variable number isn't that much different from 4 or 8; one just
needs some special characters to designate them as representing
a tab.

>> I wasn't really arguing for UTF-32 in the first place.
>
> One less reason to have anything but UTF-8.
>

For external storage or sending them from one place to another, granted.

Putting UTF-32 in files or similar would be pointless and wasteful.
Similar for using them in OS API calls, ...

There is an edge case for systems which fall in the "Java legacy", which
tend to assume UTF-16 for a lot of stuff internally.

This leads to the whole "pretend string is UTF16 but store it
internally as 1252 if it maps directly" thing.
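
A toy Python sketch of that idea, assuming "maps directly" means "every character is encodable in Windows-1252". The class CompactString and its methods are hypothetical and not taken from any real runtime; real implementations differ in which 8-bit encoding they use and how they index.

```python
class CompactString:
    """Hypothetical string type: UTF-16 view, one byte per character when possible."""

    def __init__(self, text: str):
        try:
            # Compact form: one byte per character if everything fits in cp1252.
            self._data = text.encode("cp1252")
            self.is_compact = True
        except UnicodeEncodeError:
            # Fallback: real UTF-16, two bytes per code unit.
            self._data = text.encode("utf-16-le")
            self.is_compact = False

    def __len__(self) -> int:
        # Length in UTF-16 code units, whichever storage is in use.
        return len(self._data) if self.is_compact else len(self._data) // 2

    def code_unit(self, i: int) -> int:
        """Return the i-th UTF-16 code unit, as the 'pretend UTF-16' API would."""
        if not 0 <= i < len(self):
            raise IndexError(i)
        if self.is_compact:
            # Every cp1252 character is in the BMP, so it is one code unit.
            return ord(self._data[i:i + 1].decode("cp1252"))
        return int.from_bytes(self._data[2 * i:2 * i + 2], "little")
```

Callers only ever see code units, so the storage choice stays invisible to them; the cost is a branch on every access.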

But, yeah, if designing a new language, might make more sense to assume
UTF-8 or similar for strings.

> - anton

Re: Load/Store with auto-increment

<87o7mqag89.fsf@nosuchdomain.example.com>


https://news.novabbs.org/devel/article-flat.php?id=32170&group=comp.arch#32170

Path: i2pn2.org!i2pn.org!news.swapon.de!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Keith.S.Thompson+u@gmail.com (Keith Thompson)
Newsgroups: comp.arch
Subject: Re: Load/Store with auto-increment
Date: Thu, 11 May 2023 11:13:42 -0700
Organization: None to speak of
Lines: 16
Message-ID: <87o7mqag89.fsf@nosuchdomain.example.com>
References: <u35prk$2ssbq$1@dont-email.me>
<2c034f32-5954-4c48-b650-16973aa55606n@googlegroups.com>
<be260539-b8e5-408d-971c-16070e0e543dn@googlegroups.com>
<52e78621-b93d-460a-8b1d-888512284d19n@googlegroups.com>
<Nk96M.2805257$9sn9.1655802@fx17.iad> <u3bf3r$3ut5g$1@dont-email.me>
<Ksb6M.534497$Olad.124053@fx35.iad> <u3bm2p$3vih2$1@dont-email.me>
<2023May9.083355@mips.complang.tuwien.ac.at>
<u3d3j4$7k5h$1@dont-email.me>
<2023May9.182238@mips.complang.tuwien.ac.at>
<u3e2jm$bb98$1@dont-email.me>
<2023May11.163319@mips.complang.tuwien.ac.at>
<4cde7139-f78d-490b-b8c7-294ff25328a0n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: dont-email.me; posting-host="af23344badd732ee8eb21ba95e43a430";
logging-data="1239913"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+X1nMV0ig6fpke472SiN87"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:yW6zwiLnagDeppzZxO6I+XpmI8Y=
sha1:h3Cw651MwCHKPOKSCQIoS7+ePis=
 by: Keith Thompson - Thu, 11 May 2023 18:13 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
> On Thursday, May 11, 2023 at 9:59:21 AM UTC-5, Anton Ertl wrote:
>> True. But that is already true for plain ASCII, thanks to characters
>> like TAB. When you then go to UTF-32, you get things like zero-width
>> spaces, double-wide (on fixed-width fonts) characters (typically CJK)
>> and combining marks. So UTF-32 buys you absolutely nothing compared
>> to UTF-8.
> <
> Why would you want zero-width spaces ?

https://en.wikipedia.org/wiki/Zero-width_space

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for XCOM Labs
void Void(void) { Void(); } /* The recursive call of the void */

Re: arcana of encodings, Load/Store with auto-increment

<u3jbh8$12hl$1@gal.iecc.com>


https://news.novabbs.org/devel/article-flat.php?id=32171&group=comp.arch#32171

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: johnl@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: arcana of encodings, Load/Store with auto-increment
Date: Thu, 11 May 2023 18:18:16 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <u3jbh8$12hl$1@gal.iecc.com>
References: <u35prk$2ssbq$1@dont-email.me> <u3e2jm$bb98$1@dont-email.me> <2023May11.163319@mips.complang.tuwien.ac.at> <4cde7139-f78d-490b-b8c7-294ff25328a0n@googlegroups.com>
Injection-Date: Thu, 11 May 2023 18:18:16 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="35381"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <u35prk$2ssbq$1@dont-email.me> <u3e2jm$bb98$1@dont-email.me> <2023May11.163319@mips.complang.tuwien.ac.at> <4cde7139-f78d-490b-b8c7-294ff25328a0n@googlegroups.com>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Thu, 11 May 2023 18:18 UTC

According to MitchAlsup <MitchAlsup@aol.com>:
>On Thursday, May 11, 2023 at 9:59:21 AM UTC-5, Anton Ertl wrote:
>>
>> True. But that is already true for plain ASCII, thanks to characters
>> like TAB. When you then go to UTF-32, you get things like zero-width
>> spaces, double-wide (on fixed-width fonts) characters (typically CJK)
>> and combining marks. So UTF-32 buys you absolutely nothing compared
>> to UTF-8.
><
>Why would you want zero-width spaces ?

They show the boundaries between words in scripts that don't use
explicit spaces, and they're useful in Latin text to show places where
you can break a long word. There are also Zero Width Joiner and Zero
Width Nonjoiner codes which are useful in languages like Arabic where
adjacent characters are sometimes combined except when they aren't.
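
For illustration, a short Python look at the three code points just mentioned, using the standard unicodedata module. The helper names describe and strip_zero_width are made up for this sketch. Each of them is a full code point that takes storage (3 bytes in UTF-8, 4 in UTF-32) while occupying no width on screen.

```python
import unicodedata

ZWSP, ZWNJ, ZWJ = "\u200b", "\u200c", "\u200d"
ZERO_WIDTH = {ZWSP, ZWNJ, ZWJ}

def describe(ch: str):
    """Return (name, general category, UTF-8 byte length, UTF-32 byte length)."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        len(ch.encode("utf-8")),
        len(ch.encode("utf-32-le")),
    )

def strip_zero_width(s: str) -> str:
    """Drop the three zero-width code points, e.g. before a naive comparison."""
    return "".join(c for c in s if c not in ZERO_WIDTH)
```

A string such as "long" + ZWSP + "word" has 9 code points but displays in 8 columns, so neither the byte count nor the code point count gives the width.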

Pretty much anything you think you know about text layout is wrong
in at least some of the languages people use in computer text.

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: arcana of encodings, Load/Store with auto-increment

<u3kqmf$1ehf7$1@dont-email.me>


https://news.novabbs.org/devel/article-flat.php?id=32180&group=comp.arch#32180

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: arcana of encodings, Load/Store with auto-increment
Date: Fri, 12 May 2023 09:43:11 +0200
Organization: A noiseless patient Spider
Lines: 35
Message-ID: <u3kqmf$1ehf7$1@dont-email.me>
References: <u35prk$2ssbq$1@dont-email.me> <u3e2jm$bb98$1@dont-email.me>
<2023May11.163319@mips.complang.tuwien.ac.at>
<4cde7139-f78d-490b-b8c7-294ff25328a0n@googlegroups.com>
<u3jbh8$12hl$1@gal.iecc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 12 May 2023 07:43:11 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="d5982df7f9eb55cf664dc862fe6f0403";
logging-data="1525223"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+cu/AjXR8XA1U7NN4ASnw7RIYM60Jlj5E="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.9.0
Cancel-Lock: sha1:KAS1r/7m1a5ex4CuETXqs14qAbs=
In-Reply-To: <u3jbh8$12hl$1@gal.iecc.com>
Content-Language: en-GB
 by: David Brown - Fri, 12 May 2023 07:43 UTC

On 11/05/2023 20:18, John Levine wrote:
> According to MitchAlsup <MitchAlsup@aol.com>:
>> On Thursday, May 11, 2023 at 9:59:21 AM UTC-5, Anton Ertl wrote:
>>>
>>> True. But that is already true for plain ASCII, thanks to characters
>>> like TAB. When you then go to UTF-32, you get things like zero-width
>>> spaces, double-wide (on fixed-width fonts) characters (typically CJK)
>>> and combining marks. So UTF-32 buys you absolutely nothing compared
>>> to UTF-8.
>> <
>> Why would you want zero-width spaces ?
>
> They show the boundaries between words in scripts that don't use
> explicit spaces, and they're useful in Latin text to show places where
> you can break a long word. There are also Zero Width Joiner and Zero
> Width Nonjoiner codes which are useful in languages like Arabic where
> adjacent characters are sometimes combined except when they aren't.
>

They can also be used to break up kerning pairs. In some text rendering
systems, letter combinations such as AV are placed a little closer
together, with the top left of the V above the bottom right of the A. A
zero-width space can break that, stopping the overlap.

> Pretty much anything you think you know about text layout is wrong
> in at least some of the languages people use in computer text.
>

I find that people get many things wrong, or at least imprecise, even
if they stick to the one language.

(The TeXbook should come with a warning label - reading it teaches you
huge amounts about typesetting, but leaves you unable to look at other
people's writing without getting annoyed about their poor layout or
inconsistencies!)
