Re: Does reading an uninitialized object have undefined behavior?

<867cpu5h8w.fsf@linuxsc.com>


https://news.novabbs.org/devel/article-flat.php?id=561&group=comp.std.c#561

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Wed, 16 Aug 2023 23:13:03 -0700
Organization: A noiseless patient Spider
Lines: 37
Message-ID: <867cpu5h8w.fsf@linuxsc.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com> <fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="a8782d2d7d1c356e90db8dd7e2df2f84";
logging-data="3867612"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+e/e8EORVq+9QcK5EG/f0/fcG+a8uTOSo="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:JorTpH85F6J4KLQAXF0INg7P6bk=
sha1:3LN1xP1tyIj6/O9GQUbX/2pOQdU=
 by: Tim Rentsch - Thu, 17 Aug 2023 06:13 UTC

Martin Uecker <ma.uecker@gmail.com> writes:

[some unrelated passages removed]

> On Wednesday, August 16, 2023 at 6:06:43 AM UTC+2, Tim Rentsch wrote:
>
>> Martin Uecker <ma.u...@gmail.com> writes:

[...]

>>> One could still consider the idea that "indeterminate" is an
>>> abstract property that yields UB during read even for types
>>> that do not have trap representations. There is no wording
>>> in the C standard to support this, but I would not call this
>>> idea "fundamentally wrong". You are right that this is different
>>> to pointer provenance, which is about values. What it would
>>> have in common with pointer provenance is that there is hidden
>>> state in the abstract machine associated with memory that
>>> is not part of the representation. With effective types there
>>> is another example of this.
>>
>> I understand that you want to consider a broader topic, and that,
>> in the realm of that broader topic, something like provenance
>> could have a role to play. I think it is worth responding to
>> that thesis, and am expecting to do so in a separate reply (or
>> new thread?) although probably not right away.
>
> I would love to hear your comments, because some people
> want to have such an abstract notion of "indeterminate", and
> some already believe that this is how the standard should
> be understood today.

I've been thinking about this, and am close (I think) to having
something to say in response. Before I do that, though, let me
ask this: what problem or problems are motivating the question?
What problems do you (or "some people") want to solve? I don't
want just examples here; I'm hoping to get a full list.

Re: Does reading an uninitialized object have undefined behavior?

<20230816235712.844@kylheku.com>


https://news.novabbs.org/devel/article-flat.php?id=562&group=comp.std.c#562

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: 864-117-4973@kylheku.com (Kaz Kylheku)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Thu, 17 Aug 2023 07:08:45 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 68
Message-ID: <20230816235712.844@kylheku.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com>
<87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com>
<87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com>
<86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com>
<864jkz7hrm.fsf@linuxsc.com>
<e043af84-3153-4097-9505-666869fcf727n@googlegroups.com>
<867cpu5h8w.fsf@linuxsc.com>
Injection-Date: Thu, 17 Aug 2023 07:08:45 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="1aa1e972d16a2b9e389dd8d4860af990";
logging-data="3886532"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+jHyNLEqzWC0oUcYpuNF6IF0vrgpWTsaw="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:yRukYXeof1nBG5leOh6jronsROA=
 by: Kaz Kylheku - Thu, 17 Aug 2023 07:08 UTC

On 2023-08-17, Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> Martin Uecker <ma.uecker@gmail.com> writes:
>
> [some unrelated passages removed]
>
>> On Wednesday, August 16, 2023 at 6:06:43 AM UTC+2, Tim Rentsch wrote:
>>
>>> Martin Uecker <ma.u...@gmail.com> writes:
>
> [...]
>
>>>> One could still consider the idea that "indeterminate" is an
>>>> abstract property that yields UB during read even for types
>>>> that do not have trap representations. There is no wording
>>>> in the C standard to support this, but I would not call this
>>>> idea "fundamentally wrong". You are right that this is different
>>>> to pointer provenance, which is about values. What it would
>>>> have in common with pointer provenance is that there is hidden
>>>> state in the abstract machine associated with memory that
>>>> is not part of the representation. With effective types there
>>>> is another example of this.
>>>
>>> I understand that you want to consider a broader topic, and that,
>>> in the realm of that broader topic, something like provenance
>>> could have a role to play. I think it is worth responding to
>>> that thesis, and am expecting to do so in a separate reply (or
>>> new thread?) although probably not right away.
>>
>> I would love to hear your comments, because some people
>> want to have such an abstract notion of "indeterminate", and
>> some already believe that this is how the standard should
>> be understood today.
>
> I've been thinking about this, and am close (I think) to having
> something to say in response. Before I do that, though, let me
> ask this: what problem or problems are motivating the question?
> What problems do you (or "some people") want to solve? I don't
> want just examples here; I'm hoping to get a full list.

I'm all about the diagnosis. Even on machines in which all
representations are values, and therefore safe, a program whose external
effect or output depends on uninitialized data, and is therefore
nondeterministic (a bad form of nondeterministic), is a repugnant
program.

I'd like to have clear rules which allow an implementation to
go to great depths to diagnose all such situations, while
remaining conforming. (The language agrees that those situations
are erroneous, granting the tools license to diagnose.)

At the same time, certain situations in which uninitialized data are
used in ways that don't have a visible effect would be a nuisance if they
generated diagnostics, the primary example being the copying of objects.
I would like it so that memcpy isn't magic. I want it so that the
programmer can write a bytewise memcpy which doesn't violate the
rules even if it moves uninitialized data.
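
A minimal sketch of the kind of bytewise copy described above. The name copy_bytes is hypothetical, and the sketch assumes the common reading that an access through unsigned char never traps; whether the standard guarantees this for every uninitialized source is part of what this thread is debating.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-written memcpy: moves n bytes from src to dst one
 * unsigned char at a time. unsigned char has no padding bits and no trap
 * representations, so each byte read yields some value even when the
 * source bytes were never initialized. The regions must not overlap. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (dst != NULL && src != NULL));
    while (n-- > 0)
        *d++ = *s++;
    return dst;
}
```

A call such as copy_bytes(&b, &a, sizeof a) only moves the bytes of a into b; no branch and no output depends on their values, which is why a diagnostic here would be a nuisance.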

I would like a model of uninitialized data which usefully lends itself
to different depths with different trade-offs, like complexity of
analysis and use of run-time resources. Limits should be imposed by
implementations (what cases they want to diagnose) rather than by the
model.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

Re: Does reading an uninitialized object have undefined behavior?

<d6d5f930-1943-424f-a572-7d62cfd2bda0n@googlegroups.com>


https://news.novabbs.org/devel/article-flat.php?id=566&group=comp.std.c#566

X-Received: by 2002:a05:620a:8508:b0:76d:86b1:ece8 with SMTP id pe8-20020a05620a850800b0076d86b1ece8mr1038qkn.12.1692387852223;
Fri, 18 Aug 2023 12:44:12 -0700 (PDT)
X-Received: by 2002:a17:902:c609:b0:1b8:7f21:6d3 with SMTP id
r9-20020a170902c60900b001b87f2106d3mr70396plr.6.1692387851688; Fri, 18 Aug
2023 12:44:11 -0700 (PDT)
Path: i2pn2.org!i2pn.org!news.1d4.us!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.std.c
Date: Fri, 18 Aug 2023 12:44:11 -0700 (PDT)
In-Reply-To: <20230816235712.844@kylheku.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2a02:8388:e203:9700:eddb:fb4f:5189:911d;
posting-account=RQgdUAoAAACC04vq-o2ZyxdALW1NmdRY
NNTP-Posting-Host: 2a02:8388:e203:9700:eddb:fb4f:5189:911d
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk>
<874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk>
<87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com>
<e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com>
<20230816235712.844@kylheku.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <d6d5f930-1943-424f-a572-7d62cfd2bda0n@googlegroups.com>
Subject: Re: Does reading an uninitialized object have undefined behavior?
From: ma.uecker@gmail.com (Martin Uecker)
Injection-Date: Fri, 18 Aug 2023 19:44:12 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 6111
 by: Martin Uecker - Fri, 18 Aug 2023 19:44 UTC

On Thursday, August 17, 2023 at 9:08:48 AM UTC+2, Kaz Kylheku wrote:
> On 2023-08-17, Tim Rentsch <tr.1...@z991.linuxsc.com> wrote:
> > Martin Uecker <ma.u...@gmail.com> writes:
> >
> > [some unrelated passages removed]
> >
> >> On Wednesday, August 16, 2023 at 6:06:43 AM UTC+2, Tim Rentsch wrote:
> >>
> >>> Martin Uecker <ma.u...@gmail.com> writes:
> >
> > [...]
> >
> >>>> One could still consider the idea that "indeterminate" is an
> >>>> abstract property that yields UB during read even for types
> >>>> that do not have trap representations. There is no wording
> >>>> in the C standard to support this, but I would not call this
> >>>> idea "fundamentally wrong". You are right that this is different
> >>>> to pointer provenance, which is about values. What it would
> >>>> have in common with pointer provenance is that there is hidden
> >>>> state in the abstract machine associated with memory that
> >>>> is not part of the representation. With effective types there
> >>>> is another example of this.
> >>>
> >>> I understand that you want to consider a broader topic, and that,
> >>> in the realm of that broader topic, something like provenance
> >>> could have a role to play. I think it is worth responding to
> >>> that thesis, and am expecting to do so in a separate reply (or
> >>> new thread?) although probably not right away.
> >>
> >> I would love to hear your comments, because some people
> >> want to have such an abstract notion of "indeterminate", and
> >> some already believe that this is how the standard should
> >> be understood today.
> >
> > I've been thinking about this, and am close (I think) to having
> > something to say in response. Before I do that, though, let me
> > ask this: what problem or problems are motivating the question?
> > What problems do you (or "some people") want to solve? I don't
> > want just examples here; I'm hoping to get a full list.
> I'm all about the diagnosis. Even on machines in which all
> representations are values, and therefore safe,

I do not agree with the idea that "absence of UB = safe ".

> a program whose external
> effect or output depends on uninitialized data, and is therefore
> nondeterministic (a bad form of nondeterministic), is a repugnant
> program.

I would expect a debugger to output the memory as it is seen
by the CPU. But yes, it would not be a strictly conforming program.

> I'd like to have clear rules which allow an implementation to
> go to great depths to diagnose all such situations, while
> remaining conforming. (The language agrees that those situations
> are erroneous, granting the tools license to diagnose.)

An implementation does not need a license from the standard
to diagnose anything. I can already diagnose whatever seems
useful and this does not affect conformance at all.

But it becomes easier to usefully diagnose behavior which is
undefined, because then one can expect that in portable C it
is not used intentionally.

> At the same time, certain situations in which uninitialized data are
> used in ways that don't have a visible effect would be a nuisance if they
> generated diagnostics, the primary example being the copying of objects.
> I would like it so that memcpy isn't magic. I want it so that the
> programmer can write a bytewise memcpy which doesn't violate the
> rules even if it moves uninitialized data.

Yes, I think for C this is rather important.

> I would like a model of uninitialized data which usefully lends itself
> to different depths with different trade-offs, like complexity of
> analysis and use of run-time resources. Limits should be imposed by
> implementations (what cases they want to diagnose) rather than by the
> model.

Tools can already do complex analysis and track down use of
uninitialized variables. But with respect to conformance, I think
the current standard has very good rules: memcpy/memcmp
and similar code works as expected. Locally, where a compiler
can be expected to give good diagnostics via static analysis
the use of uninitialized variables is UB. But this does not
spread via pointers elsewhere, where useful diagnostics
are unlikely and optimizer induced problems based on UB
might be far more difficult to debug.
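
The distinction drawn above can be sketched as follows, assuming the local case refers to C11 6.3.2.1p2 (an automatic object whose address is never taken, read while uninitialized). The function names are hypothetical, and the undefined case is excluded from compilation with #if 0 so that the rest can be built and run.

```c
#include <assert.h>
#include <string.h>

#if 0   /* Not compiled: shown only to mark the case that is UB. */
int read_local(void)
{
    int x;        /* automatic, address never taken, never assigned */
    return x;     /* C11 6.3.2.1p2: the behavior is undefined */
}
#endif

/* Hypothetical helper: compares the object representations of two
 * objects through pointers. memcmp inspects the bytes as unsigned char,
 * so under the reading above this stays defined even if some bytes
 * (padding, for example) were never written; the result is then merely
 * unspecified. */
int same_representation(const void *a, const void *b, size_t n)
{
    assert(n == 0 || (a != NULL && b != NULL));
    return memcmp(a, b, n) == 0;
}
```

A compiler can see everything it needs to warn about read_local within one function, whereas same_representation cannot know how the caller filled the objects it is given.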

Martin

Re: Does reading an uninitialized object have undefined behavior?

<86wmxr4t22.fsf@linuxsc.com>


https://news.novabbs.org/devel/article-flat.php?id=568&group=comp.std.c#568

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Fri, 18 Aug 2023 20:20:05 -0700
Organization: A noiseless patient Spider
Lines: 17
Message-ID: <86wmxr4t22.fsf@linuxsc.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com> <fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com> <20230816235712.844@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="fd3ada769e75711f4f8ff6c26696c0a7";
logging-data="652185"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18XKjzRQdcWFNw+hma0QfFa0TN2GyhH/x8="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:vN3oFqc4xrcW94MsLSIiWw0HFVA=
sha1:WVWyRdrWRnlsuNaqTQX2UAixiCE=
 by: Tim Rentsch - Sat, 19 Aug 2023 03:20 UTC

Kaz Kylheku <864-117-4973@kylheku.com> writes:

> I'm all about the diagnosis. Even on machines in which all
> representations are values, and therefore safe, a program whose
> external effect or output depends on uninitialized data, and is
> therefore nondeterministic (a bad form of nondeterministic), is a
> repugnant program.
>
> I'd like to have clear rules which allow an implementation to go to
> great depths to diagnose all such situations, while remaining
> conforming. (The language agrees that those situations are
> erroneous, granting the tools license to diagnose.)

The C standard allows compilers to do whatever analysis they
want and to issue diagnostics for whatever conditions or
circumstances they choose. What you want is orthogonal to
what is being discussed.

Re: Does reading an uninitialized object have undefined behavior?

<20230818220442.950@kylheku.com>


https://news.novabbs.org/devel/article-flat.php?id=570&group=comp.std.c#570

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: 864-117-4973@kylheku.com (Kaz Kylheku)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Sat, 19 Aug 2023 05:23:29 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 39
Message-ID: <20230818220442.950@kylheku.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com>
<87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com>
<87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com>
<86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com>
<864jkz7hrm.fsf@linuxsc.com>
<e043af84-3153-4097-9505-666869fcf727n@googlegroups.com>
<867cpu5h8w.fsf@linuxsc.com> <20230816235712.844@kylheku.com>
<86wmxr4t22.fsf@linuxsc.com>
Injection-Date: Sat, 19 Aug 2023 05:23:29 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="588828d29d125ee26b6c8f373c219d7a";
logging-data="686064"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/B0q3V00p7X/QkK9r/007o6AK+I2vNgnM="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:GmMjOyXPSt4IEe+ZW9YJJofjxE0=
 by: Kaz Kylheku - Sat, 19 Aug 2023 05:23 UTC

On 2023-08-19, Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> Kaz Kylheku <864-117-4973@kylheku.com> writes:
>
>> I'm all about the diagnosis. Even on machines in which all
>> representations are values, and therefore safe, a program whose
>> external effect or output depends on uninitialized data, and is
>> therefore nondeterministic (a bad form of nondeterministic), is a
>> repugnant program.
>>
>> I'd like to have clear rules which allow an implementation to go to
>> great depths to diagnose all such situations, while remaining
>> conforming. (The language agrees that those situations are
>> erroneous, granting the tools license to diagnose.)
>
> The C standard allows compilers to do whatever analysis they
> want and to issue diagnostics for whatever conditions or
> circumstances they choose.

And stop translating? If some use of an uninitialized object
isn't undefined, and you make the diagnostic a fatal error,
then you don't have a conforming compiler at that point.

> What you want is orthogonal to what is being discussed.

I'm mainly concerned about run-time.

If the program hasn't invoked undefined behavior, I don't think it's
conforming to inject gratuitous diagnostics into the program's run-time,
such that they appear as if they were its output on stderr or stdout.
Those diagnostics have to go to some special debug port.

Also, it's not conforming to arbitrarily terminate the program. (Other
than in some weaselly language-lawyering way, by declaring that it
has exceeded an implementation limit or something.)

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

Re: Does reading an uninitialized object have undefined behavior?

<86sf8f4lt9.fsf@linuxsc.com>


https://news.novabbs.org/devel/article-flat.php?id=571&group=comp.std.c#571

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Fri, 18 Aug 2023 22:56:34 -0700
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <86sf8f4lt9.fsf@linuxsc.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com> <fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com> <20230816235712.844@kylheku.com> <86wmxr4t22.fsf@linuxsc.com> <20230818220442.950@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="fd3ada769e75711f4f8ff6c26696c0a7";
logging-data="695252"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+CYVZVqDQ5ivbNvlmiiQC7G6PsNp2fUxY="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:4bl42fWyowyExexIorknMZB5A20=
sha1:LcFmDHOm/4wkRYBYVLxQ9BRiVO0=
 by: Tim Rentsch - Sat, 19 Aug 2023 05:56 UTC

Kaz Kylheku <864-117-4973@kylheku.com> writes:

> On 2023-08-19, Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

>> The C standard allows compilers to do whatever analysis they
>> want and to issue diagnostics for whatever conditions or
>> circumstances they choose.
>
> And stop translating? If some use of an uninitialized object
> isn't undefined, and you make the diagnostic a fatal error,
> then you don't have a conforming compiler at that point.
>
> [also]
>
> If the program hasn't invoked undefined behavior, I don't think
> it's conforming to inject gratuitous diagnostics [..or..]
> to arbitrarily terminate the program. [...]

You need to learn how to say what you mean. Your earlier
posting didn't say anything about failing to compile
or altering program behavior. If you can't learn how
to say what you mean then there is roughly a 1e-29 percent
chance that you'll get what you want.

Re: Does reading an uninitialized object have undefined behavior?

<137bed86-8fd4-42d5-aaf0-96ccce615376n@googlegroups.com>


https://news.novabbs.org/devel/article-flat.php?id=572&group=comp.std.c#572

X-Received: by 2002:a05:6214:a4f:b0:641:8c92:29f2 with SMTP id ee15-20020a0562140a4f00b006418c9229f2mr7678qvb.5.1692434184489;
Sat, 19 Aug 2023 01:36:24 -0700 (PDT)
X-Received: by 2002:a17:902:e801:b0:1bc:4452:59c4 with SMTP id
u1-20020a170902e80100b001bc445259c4mr526303plg.4.1692434183973; Sat, 19 Aug
2023 01:36:23 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.std.c
Date: Sat, 19 Aug 2023 01:36:23 -0700 (PDT)
In-Reply-To: <20230818215322.47@kylheku.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2a02:8388:e203:9700:eddb:fb4f:5189:911d;
posting-account=RQgdUAoAAACC04vq-o2ZyxdALW1NmdRY
NNTP-Posting-Host: 2a02:8388:e203:9700:eddb:fb4f:5189:911d
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk>
<874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk>
<87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com>
<e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com>
<20230816235712.844@kylheku.com> <d6d5f930-1943-424f-a572-7d62cfd2bda0n@googlegroups.com>
<20230818215322.47@kylheku.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <137bed86-8fd4-42d5-aaf0-96ccce615376n@googlegroups.com>
Subject: Re: Does reading an uninitialized object have undefined behavior?
From: ma.uecker@gmail.com (Martin Uecker)
Injection-Date: Sat, 19 Aug 2023 08:36:24 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 5074
 by: Martin Uecker - Sat, 19 Aug 2023 08:36 UTC

On Saturday, August 19, 2023 at 7:04:10 AM UTC+2, Kaz Kylheku wrote:
> On 2023-08-18, Martin Uecker <ma.u...@gmail.com> wrote:
> > On Thursday, August 17, 2023 at 9:08:48 AM UTC+2, Kaz Kylheku wrote:
> > An implementation does not need a license from the standard
> > to diagnose anything. I can already diagnose whatever seems
> > useful and this does not affect conformance at all.
> That's true about diagnostics at translation time. It's not clear
> about those that happen at run time and are indistinguishable from the
> program's output on stdout or stderr.

The observable behavior has to stay the same, so yes, it could
not output to stdout or stderr. But there is nothing stopping it
from logging debugging information somewhere else, where it could
be accessed.

> Also, it might be desirable for it to be conforming to terminate the
> program if it has run afoul of the rules.

Yes, this is one main reason to make certain things UB. But
then it can have false positives and needs to be backward
compatible, which limits what is possible.

> >> I would like a model of uninitialized data which usefully lends itself
> >> to different depths with different trade-offs, like complexity of
> >> analysis and use of run-time resources. Limits should be imposed by
> >> implementations (what cases they want to diagnose) rather than by the
> >> model.
> >
> > Tools can already do complex analysis and track down use of
> > uninitialized variables. But with respect to conformance, I think
> > the current standard has very good rules: memcpy/memcmp
> > and similar code works as expected. Locally, where a compiler
> > can be expected to give good diagnostics via static analysis
> > the use of uninitialized variables is UB. But this does not
> > spread via pointers elsewhere, where useful diagnostics
> > are unlikely and optimizer induced problems based on UB
> > might be far more difficult to debug.
> Dynamic instrumentation and tracking makes it possible
> for that information to follow pointer data flows, globally
> in the program.
>
> E.g. under the Valgrind tool, if one module passes an uninitialized
> object into another, and that other one relies on it to make
> a conditional branch, it will be diagnosed. You can get the
> backtrace of where that object was created as well as where
> the use took place.

And valgrind exists and is a useful tool (I use it myself)
even though not everything it diagnoses is UB. But it also has
false positives, so using the same rules for deciding what
should be UB in the standard as valgrind uses seems difficult.
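
A small sketch of the cross-module pattern quoted above, with hypothetical names (struct sample, sample_new, sample_value_or). When the object is created with sample_new(0), the branch in sample_value_or depends on bytes that were never written. Valgrind's Memcheck reports that as a conditional jump or move depending on uninitialised values, and with --track-origins=yes it also reports where the memory was allocated. Whether that read is undefined in C is exactly what this thread is debating, so the sketch takes no position on it.

```c
#include <assert.h>
#include <stdlib.h>

struct sample {
    int ready;
    int value;
};

/* "Module A" (hypothetical): allocates a sample and fills it in only
 * when asked to. With initialize == 0 both members stay unwritten. */
struct sample *sample_new(int initialize)
{
    struct sample *s = malloc(sizeof *s);

    if (s != NULL && initialize) {
        s->ready = 1;
        s->value = 42;
    }
    return s;
}

/* "Module B" (hypothetical): branches on a member it did not set.
 * If the caller never wrote s->ready, this is the conditional branch
 * that Memcheck flags. */
int sample_value_or(const struct sample *s, int fallback)
{
    assert(s != NULL);
    if (s->ready)
        return s->value;
    return fallback;
}
```

Running a program that calls sample_value_or(sample_new(0), -1) under valgrind --track-origins=yes shows both the branch and the allocation site; the initialized path, sample_new(1), produces no report.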

Also note that if the output of a program relies on
unspecified values, then it is already not strictly conforming
even when the behavior itself is not undefined. So if an
implementation is smart enough to see this, it could already
reject the program.

Making even the use of unspecified values in conditional
branches UB seems problematic. E.g. you could not
compute a hash over data structures with padding and
then compare it later to see whether something has
changed (taking into account false positives). This seems
similar to memcpy / memcmp but involves conditions,
and such techniques would become non-conforming.
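
The hashing technique mentioned above might look like the following sketch. The struct layout, the function names, and the choice of 32-bit FNV-1a are all assumptions made for illustration; the thread does not name a particular hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record; most ABIs insert padding between tag and count. */
struct record {
    char tag;
    int  count;
};

/* Zeroing the whole object first makes the padding bytes predictable in
 * practice, which reduces false positives. The standard still allows a
 * later member store to change padding bytes. */
void record_init(struct record *r, char tag, int count)
{
    memset(r, 0, sizeof *r);
    r->tag = tag;
    r->count = count;
}

/* 32-bit FNV-1a over the object representation, padding included. */
uint32_t hash_bytes(const void *obj, size_t n)
{
    const unsigned char *p = obj;
    uint32_t h = 2166136261u;            /* FNV-1a offset basis */

    assert(n == 0 || obj != NULL);
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 16777619u;                  /* FNV-1a prime */
    }
    return h;
}

/* The comparison is a conditional that depends on every byte hashed,
 * including padding whose value may be unspecified. */
int record_changed(const struct record *r, uint32_t saved)
{
    return hash_bytes(r, sizeof *r) != saved;
}
```

A caller saves hash_bytes(&r, sizeof r) and later asks record_changed(&r, saved). A "changed" answer caused only by padding is the false positive mentioned above; if branching on unspecified values were UB, this whole pattern would be ruled out.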

Martin

Re: Does reading an uninitialized object have undefined behavior?

<ui3EM.518408$TCKc.407024@fx13.iad>


https://news.novabbs.org/devel/article-flat.php?id=573&group=comp.std.c#573

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!1.us.feeder.erje.net!feeder.erje.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx13.iad.POSTED!not-for-mail
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: Does reading an uninitialized object have undefined behavior?
Newsgroups: comp.std.c
References: <87zg3pq1ym.fsf@nosuchdomain.example.com>
<87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com>
<87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com>
<86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com>
<864jkz7hrm.fsf@linuxsc.com>
<e043af84-3153-4097-9505-666869fcf727n@googlegroups.com>
<867cpu5h8w.fsf@linuxsc.com> <20230816235712.844@kylheku.com>
<d6d5f930-1943-424f-a572-7d62cfd2bda0n@googlegroups.com>
<20230818215322.47@kylheku.com>
<137bed86-8fd4-42d5-aaf0-96ccce615376n@googlegroups.com>
From: Richard@Damon-Family.org (Richard Damon)
Content-Language: en-US
In-Reply-To: <137bed86-8fd4-42d5-aaf0-96ccce615376n@googlegroups.com>
Message-ID: <ui3EM.518408$TCKc.407024@fx13.iad>
Date: Sat, 19 Aug 2023 09:18:17 -0400
 by: Richard Damon - Sat, 19 Aug 2023 13:18 UTC

On 8/19/23 4:36 AM, Martin Uecker wrote:
> On Saturday, August 19, 2023 at 7:04:10 AM UTC+2, Kaz Kylheku wrote:
>> On 2023-08-18, Martin Uecker <ma.u...@gmail.com> wrote:
>>> On Thursday, August 17, 2023 at 9:08:48 AM UTC+2, Kaz Kylheku wrote:
>>> An implementation does not need a license from the standard
>>> to diagnose anything. I can already diagnose whatever seems
>>> useful and this does not affect conformance at all.
>> That's true about diagnostics at translation time. It's not clear
>> about those that happen at run time and are indistinguishable from
>> the program's output on stdout or stderr.
>
> The observable behavior has to stay the same, so yes, it could
> not output to stdout or stderr. But there is nothing stopping it
> from logging debugging information somewhere else, where it could
> be accessed.
>
>> Also, it might be desirable for it to be conforming to terminate the
>> program if it has run afoul of the rules.
>
> Yes, this is one main reason to make certain things UB. But
> then it can have false positives and needs to be backward
> compatible, which limits what is possible.
>
>>>> I would like a model of uninitialized data which usefully lends itself
>>>> to different depths with different trade-offs, like complexity of
>>>> analysis and use of run-time resources. Limits should be imposed by
>>>> implementations (what cases they want to diagnose) rather than by the
>>>> model.
>>>
>>> Tools can already do complex analysis and track down use of
>>> uninitialized variables. But with respect to conformance, I think
>>> the current standard has very good rules: memcpy/memcmp
>>> and similar code works as expected. Locally, where a compiler
>>> can be expected to give good diagnostics via static analysis
>>> the use of uninitialized variables is UB. But this does not
>>> spread via pointers elsewhere, where useful diagnostics
>>> are unlikely and optimizer induced problems based on UB
>>> might be far more difficult to debug.
>> Dynamic instrumentation and tracking make it possible
>> for that information to follow pointer data flows, globally
>> in the program.
>>
>> E.g. under the Valgrind tool, if one module passes an uninitialized
>> object into another, and that other one relies on it to make
>> a conditional branch, it will be diagnosed. You can get the
>> backtrace of where that object was created as well as where
>> the use took place.
>
> And valgrind exists and is a useful tool (I use it myself)
> even though not everything it diagnoses is UB. But it also has
> false positives, so using the same rules for deciding what
> should be UB in the standard as valgrind uses seems difficult.
>
> Also note that if the output of a program relies on
> unspecified values, then it is already not strictly conforming
> even when the behavior itself is not undefined. So if an
> implementation is smart enough to see this, it could already
> reject the program.
>
> Making even the use of unspecified values in conditional
> branches UB seems problematic. E.g. you could not
> compute a hash over data structures with padding and
> then compare it later to see whether something has
> changed (taking into account false positives). This seems
> similar to memcpy / memcmp but involves conditions,
> and such techniques would become non-conforming.
>
> Martin

My understanding is that there is no requirement that the values of the
padding bytes remain constant over time. I can't imagine a case where
they will just change at an arbitrary time, but setting a member of the
structure to a value (even if it is the same value it had) might easily
affect the value of the padding bytes, so the hash changes.

Re: Does reading an uninitialized object have undefined behavior?

<4a526949-06dd-404d-a299-cb30953e7a5fn@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=574&group=comp.std.c#574

Newsgroups: comp.std.c
Date: Sat, 19 Aug 2023 11:12:53 -0700 (PDT)
In-Reply-To: <ui3EM.518408$TCKc.407024@fx13.iad>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk>
<874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk>
<87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com>
<e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com>
<20230816235712.844@kylheku.com> <d6d5f930-1943-424f-a572-7d62cfd2bda0n@googlegroups.com>
<20230818215322.47@kylheku.com> <137bed86-8fd4-42d5-aaf0-96ccce615376n@googlegroups.com>
<ui3EM.518408$TCKc.407024@fx13.iad>
Message-ID: <4a526949-06dd-404d-a299-cb30953e7a5fn@googlegroups.com>
Subject: Re: Does reading an uninitialized object have undefined behavior?
From: ma.uecker@gmail.com (Martin Uecker)
 by: Martin Uecker - Sat, 19 Aug 2023 18:12 UTC

On Saturday, August 19, 2023 at 3:18:22 PM UTC+2, Richard Damon wrote:
> On 8/19/23 4:36 AM, Martin Uecker wrote:
> > On Saturday, August 19, 2023 at 7:04:10 AM UTC+2, Kaz Kylheku wrote:
> >> On 2023-08-18, Martin Uecker <ma.u...@gmail.com> wrote:
> >>> On Thursday, August 17, 2023 at 9:08:48 AM UTC+2, Kaz Kylheku wrote:
> >>> An implementation does not need a license from the standard
> >>> to diagnose anything. I can already diagnose whatever seems
> >>> useful and this does not affect conformance at all.
> >> That's true about diagnostics at translation time. It's not clear
> >> about those that happen at run time and are indistinguishable from
> >> the program's output on stdout or stderr.
> >
> > The observable behavior has to stay the same, so yes, it could
> > not output to stdout or stderr. But there is nothing stopping it
> > from logging debugging information somewhere else, where it could
> > be accessed.
> >
> >> Also, it might be desirable for it to be conforming to terminate the
> >> program if it has run afoul of the rules.
> >
> > Yes, this is one main reason to make certain things UB. But
> > then it can have false positives and needs to be backward
> > compatible, which limits what is possible.
> >
> >>>> I would like a model of uninitialized data which usefully lends itself
> >>>> to different depths with different trade-offs, like complexity of
> >>>> analysis and use of run-time resources. Limits should be imposed by
> >>>> implementations (what cases they want to diagnose) rather than by the
> >>>> model.
> >>>
> >>> Tools can already do complex analysis and track down use of
> >>> uninitialized variables. But with respect to conformance, I think
> >>> the current standard has very good rules: memcpy/memcmp
> >>> and similar code works as expected. Locally, where a compiler
> >>> can be expected to give good diagnostics via static analysis
> >>> the use of uninitialized variables is UB. But this does not
> >>> spread via pointers elsewhere, where useful diagnostics
> >>> are unlikely and optimizer induced problems based on UB
> >>> might be far more difficult to debug.
> >> Dynamic instrumentation and tracking make it possible
> >> for that information to follow pointer data flows, globally
> >> in the program.
> >>
> >> E.g. under the Valgrind tool, if one module passes an uninitialized
> >> object into another, and that other one relies on it to make
> >> a conditional branch, it will be diagnosed. You can get the
> >> backtrace of where that object was created as well as where
> >> the use took place.
> >
> > And valgrind exists and is a useful tool (I use it myself)
> > even though not everything it diagnoses is UB. But it also has
> > false positives, so using the same rules for deciding what
> > should be UB in the standard as valgrind uses seems difficult.
> >
> > Also note that if the output of a program relies on
> > unspecified values, then it is already not strictly conforming
> > even when the behavior itself is not undefined. So if an
> > implementation is smart enough to see this, it could already
> > reject the program.
> >
> > Making even the use of unspecified values in conditional
> > branches UB seems problematic. E.g. you could not
> > compute a hash over data structures with padding and
> > then compare it later to see whether something has
> > changed (taking into account false positives). This seems
> > similar to memcpy / memcmp but involves conditions,
> > and such techniques would become non-conforming.
> >
> > Martin
> My understanding is that there is no requirement that the values of the
> padding bytes remain constant over time.

The C standard specifies when they can change:

"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values"

> I can't imagine a case where
> they will just change at an arbitrary time, but setting a member of the
> structure to a value (even if it is the same value it had) might easily
> affect the value of the padding bytes, so the hash changes.

Sure, writing to the object may change the padding and then the
hash changes. This is why I mentioned false positives.

Martin

Re: Does reading an uninitialized object have undefined behavior?

<868r9xz0ek.fsf@linuxsc.com>

https://news.novabbs.org/devel/article-flat.php?id=576&group=comp.std.c#576

From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Sat, 26 Aug 2023 19:25:55 -0700
Organization: A noiseless patient Spider
Lines: 349
Message-ID: <868r9xz0ek.fsf@linuxsc.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com> <fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com> <a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com>
 by: Tim Rentsch - Sun, 27 Aug 2023 02:25 UTC

Martin Uecker <ma.uecker@gmail.com> writes:

> On Thursday, August 17, 2023 at 8:13:07 AM UTC+2, Tim Rentsch wrote:
>
>> Martin Uecker <ma.u...@gmail.com> writes:
>>
>> [some unrelated passages removed]
>>
>>> On Wednesday, August 16, 2023 at 6:06:43 AM UTC+2, Tim Rentsch wrote:
>>>
>>>> Martin Uecker <ma.u...@gmail.com> writes:
>>
>> [...]
>>
>>>>> One could still consider the idea that "indeterminate" is an
>>>>> abstract property that yields UB during read even for types
>>>>> that do not have trap representations. There is no wording
>>>>> in the C standard to support this, but I would not call this
>>>>> idea "fundamentally wrong". You are right that this is different
>>>>> to pointer provenance, which is about values. What it would
>>>>> have in common with pointer provenance is that there is hidden
>>>>> state in the abstract machine associated with memory that
>>>>> is not part of the representation. With effective types there
>>>>> is another example of this.
>>>>
>>>> I understand that you want to consider a broader topic, and that,
>>>> in the realm of that broader topic, something like provenance
>>>> could have a role to play. I think it is worth responding to
>>>> that thesis, and am expecting to do so in a separate reply (or
>>>> new thread?) although probably not right away.
>>>
>>> I would love to hear your comments, because some people
>>> want to have such an abstract notion of "indeterminate" and
>>> some believe that this is how the standard should
>>> be understood already today.
>>
>> I've been thinking about this, and am close (I think) to having
>> something to say in response. Before I do that, though, let me
>> ask this: what problem or problems are motivating the question?
>> What problems do you (or "some people") want to solve? I don't
>> want just examples here; I'm hoping to get a full list.
>
> There are essentially two main interests driving this. First,
> there is some interest to precisely formulate the semantics for C.
> The provenance proposal came out of this.
>
> Second, there is the issue of safety problems caused by
> uninitialized reads, together with compiler support for zero
> initialization etc. So there are various people who want to
> change the semantics for uninitialized variables completely
> in the interest of safety.

This response doesn't answer my question. What are the problems,
specifically, that people want to solve? If there isn't a good
understanding of what the problem is, there is little hope of
finding a solution, let alone reaching agreement on whether a
proposed change does in fact solve the problem. If we don't know
where we're going, any choice of road is equally good.

That said, I understand that you are asking not on your own behalf
but on behalf (perhaps indirectly) of others, and the others might
not know what the problem(s) are that they want to solve. I think
it's worth asking the question explicitly, What is the problem
that we want to solve here? Start by simply trying to write a
clear statement of what the problem is; proceed on to looking for
a solution only after there is agreement (and I don't mean just a
majority vote) about what problem it is the group wants to solve.

(Note added after writing: I didn't realize when I started how
difficult this subject is and how much there is to say about it.
I hope readers will appreciate the amount of effort that has
been invested, and get some value out of what has been produced,
even if it spends too much time on some less important issues.)

(Also, after having written the whole posting, I see that there
are some aspects that I didn't relate to the indeterminate
question and so didn't address. If you want me to say more about
formalizing semantics or the issue of safety for uninitialized
variables, I really need some specifics before I can talk about
those.)

(One further thought: on reading through my comments one last
time, I may have more to say about uninitialized variables. But
I am deferring that for now, to get this beast out the door.)

> So far, there was no consensus in WG14 that the rules should
> be changed or what the new rules should be.

That's because they don't know what problem it is that they want
to solve.

Consider the question of what happens with padding bits/bytes,
and unnamed members, in structs (unions too of course, but for
now we consider only structs). The C standard says these bits of
memory take unspecified values whenever there is a store to any
member of the struct (and maybe also at other times, but let's
ignore that). I understand why this decision was made, namely,
to give more freedom to implementations as to how such operations
are actualized. But it leaves behind a problem. Speaking as a
developer, I want the values of these bits to be stable, at least
in certain cases (and I want to be able to choose which cases
those are). The C language doesn't give me any way to do that,
at least not one that isn't horribly inconvenient. In making the
decision about padding bits/bytes, the C committee answered the
/question/ but didn't address the /problem/. I expect that
something similar is going on with the current discussions.
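
One workaround of the inconvenient kind, sketched under assumptions and
not taken from the post: spell out the filler as named members, so that
no implicit padding remains and every byte is an ordinary member whose
value a store to another member does not disturb. The struct
rec_explicit, the member pad_ and the helper rec_init are hypothetical,
and the layout checks assume uint32_t has an alignment of at most 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct rec_explicit {
    uint8_t  tag;
    uint8_t  pad_[3];   /* named filler instead of implicit padding */
    uint32_t value;
};

/* Fail the build if the implementation would add padding of its own. */
static_assert(offsetof(struct rec_explicit, value) == 4,
              "value expected at offset 4");
static_assert(sizeof(struct rec_explicit) == 8,
              "no implicit padding expected");

/* Zero every byte once; later member stores leave pad_ alone because
   pad_ is a member, not padding. */
static void rec_init(struct rec_explicit *r, uint8_t tag, uint32_t value)
{
    memset(r, 0, sizeof *r);
    r->tag = tag;
    r->value = value;
}
```

The cost is that every such struct has to be laid out and checked by
hand, which is presumably part of what makes this route unattractive.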

To better understand the landscape, let's look at three different
kinds of undefined behavior. The illustrating constructions are
signed integer arithmetic, obsolete pointer values, and violating
effective type rules.

Situations where arithmetic on signed integers overflows might be
called /practical/ undefined behavior. Certainly it would be
possible to require a better-defined semantics (such as giving an
unspecified result), but presumably overflow doesn't come up very
often, it's not clear how useful the "better" result would be,
and the cost in some hardware environments might be prohibitive.
Furthermore there is a fairly easy workaround to avoid overflow:
simply convert to unsigned types, do the operations, and then
convert back. Overflow being undefined behavior isn't absolutely
necessary but in practical terms it's acceptable. (I acknowledge
that some people have different views on that last statement.)
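
The unsigned workaround mentioned above might look like the following
sketch. The function name wrap_add is an assumption. Note that the
conversion back to int is implementation-defined for out-of-range
values rather than undefined; GCC, for example, documents it as
reduction modulo 2^N.

```c
#include <limits.h>

/* Addition carried out in unsigned arithmetic, which wraps modulo
   UINT_MAX + 1 and therefore cannot overflow. */
static int wrap_add(int a, int b)
{
    unsigned int ua = (unsigned int)a;
    unsigned int ub = (unsigned int)b;
    unsigned int sum = ua + ub;

    /* Out-of-range values convert in an implementation-defined way
       (or raise an implementation-defined signal); this is not
       undefined behavior. */
    return (int)sum;
}
```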

An obsolete pointer value is a pointer to an object after the end
of the object's lifetime. Attempting to make use of an obsolete
pointer value, in any way whatsoever including simply loading it
by means of lvalue conversion, is undefined behavior. We can
imagine narrowing the scope a bit so simply loading an obsolete
pointer value or comparing one for equality could be better
defined, but any attempt to dereference an obsolete pointer value
is what might be called an /essential/ undefined behavior. The
problem here is both practical and theoretical: there is no way
to be sure the underlying hardware will be able to carry out the
asked-for operation (without a machine check, etc), and even if
there were, there is no way to describe what happens in a way
that can be expressed (usefully) in terms that relate to what's
going on in the abstract machine. There simply is no practical,
useful, sensible way to define the behavior of dereferencing an
obsolete pointer value.

At the other end of the spectrum, violating effective type rules is
what might be called /gratuitous/ undefined behavior. There is no
particular hardware motivation for choosing UB. And there is no
problem defining the semantics of a cross-type access, which can be
done definedly in the same way as accessing union members. So there
is no reason to think that adding cross-type restrictions is
necessary. An argument can be made that cross-type restrictions
are /desirable/, because they allow code transformations that
improve performance in some cases.
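
For reference, a small sketch of the union route to a cross-type
access. The name float_bits is hypothetical, and the code assumes float
and uint32_t have the same size (checked at compile time).

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes a 32-bit float");

/* Reads the object representation of a float as a uint32_t by way of
   a union. */
static uint32_t float_bits(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;   /* bytes stored as float, reinterpreted as uint32_t */
}

/* By contrast, *(uint32_t *)&f reads a float object through an lvalue
   of type uint32_t, which the effective type rules make undefined. */
```

Both forms ask for the same bytes; only the second one is undefined
under the current rules.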

Incidentally, it might seem like effective type rules are similar in
some way to NaT bits or pointer provenance. They aren't. NaT bits
are hardware indicators that actually exist, and pointer provenances
are attached to values, not to objects. Neither of those conditions
holds for effective types. The seeming similarity to hidden memory
bits is a red herring.

(Also, effective type rules are a lot more complicated than they
seem at first blush, and have some peculiar properties as a result.
They seem to work okay if not looked at too closely, but a closer
look shows some serious shortcomings. But I digress.)

There are two significant problems with undefined behavior. The
smaller of the two is that there are no distinctions between the
different classes of undefined behavior. There is no way around
having some sort of undefined behavior for obsolete pointer values,
but cross-typing rules are a completely different story. Yet the C
standard puts all the different kinds of undefined behaviors into
the same absolute category. Sometimes people use compiler options
to turn off, for example, so-called "strict aliasing", and of course
the C standard allows us to do that. But compilers aren't required
to provide such an option, and if they do the option may not do
exactly what we expect it to do, because there is no standard
specification for it. The C standard should define officially
sanctioned mechanisms -- as for example standard #pragma's -- to
give standard-defined semantics to certain constructs of undefined
behavior that resemble, eg, -fno-strict-aliasing.


Re: Does reading an uninitialized object have undefined behavior?

<5+eRe7cp3yQjL4=AX@bongo-ra.co>

https://news.novabbs.org/devel/article-flat.php?id=577&group=comp.std.c#577

From: spibou@gmail.com (Spiros Bousbouras)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Sun, 27 Aug 2023 08:31:26 -0000 (UTC)
Organization: To protect and to server
Message-ID: <5+eRe7cp3yQjL4=AX@bongo-ra.co>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com>
<87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com>
<867cpu5h8w.fsf@linuxsc.com> <a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com> <868r9xz0ek.fsf@linuxsc.com>
 by: Spiros Bousbouras - Sun, 27 Aug 2023 08:31 UTC

On Sat, 26 Aug 2023 19:25:55 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> Sometimes people use compiler options
> to turn off, for example, so-called "strict aliasing", and of course
> the C standard allows us to do that. But compilers aren't required
> to provide such an option, and if they do the option may not do
> exactly what we expect it to do, because there is no standard
> specification for it. The C standard should define officially
> sanctioned mechanisms -- as for example standard #pragma's -- to
> give standard-defined semantics to certain constructs of undefined
> behavior that resemble, eg, -fno-strict-aliasing.

Surely the starting point for this should be the documentation of the
compilers to specify precisely what -fno-strict-aliasing does. If
a consensus emerges out of these precise specifications or C programmers
indicate that they prefer the specification of some particular compiler
then this can become part of the standard. Adding a relevant #pragma
should be trivial.

> The second problem is basically The Law of Unintended Consequences
> smashing into The Law of Least Astonishment. As compiler writers
> have gotten more and more clever at exploiting the implications of
> "undefined behavior", we see more and more cases of code that looks
> reasonable being turned into mush by overly clever "optimizing"
> compilers. There is obviously something wrong with the way this
> trend is going -- ever more clever "optimizations", followed by ever
> more arcane compiler options to work around the problems caused by
> the too-clever compilers. This problem must be addressed by the C
> standard, for if it is not the ecosystem will transform into a
> confused state that is exactly what the C standard was put in place
> to avoid. (I do have some ideas about how to address this issue,
> but I want to make sure everyone appreciates the extent of the
> problem before we start talking about solutions.)

Without specific examples, it's impossible to comment on this. Why did
the "reasonable" code have the undefined behaviour? Could the result
the programmer was aiming for have been achieved with defined behaviour?
For example it has been pointed out on comp.lang.c that it's
impossible to write a malloc() implementation in conforming C. This is
certainly a weakness which should be addressed with some appropriate
#pragma.

> Before leaving the sub-topic of undefined behavior, let me mention
> two success stories. The first is 'restrict': the performance
> implications are local, the choice is under control of the program
> (and programmer), and the default choice is to play safe. Good
> show.

From my point of view, restrict is not a success because the
specification of restrict is the one part of the C1999 standard I have
given up trying to understand. I understand the underlying idea but the
specifics elude me. I remember many years ago someone asked on this
group about some code involving restrict and a member of the standard
committee replied and I found the reply counterintuitive. So I have
decided to not use restrict in my own code taking also into account
that I don't need the microoptimisations which restrict is intended to
allow. But for all I know, people who do need these optimisations find
the specification of restrict in the standard perfectly adequate.

--
It is not widely known that the "CPC" in "Amstrad CPC" actually stands
for "cool people club".

Re: Does reading an uninitialized object have undefined behavior?

<86sf82ulmb.fsf@linuxsc.com>

https://news.novabbs.org/devel/article-flat.php?id=579&group=comp.std.c#579

From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Tue, 29 Aug 2023 04:35:40 -0700
Organization: A noiseless patient Spider
Lines: 110
Message-ID: <86sf82ulmb.fsf@linuxsc.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com> <fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com> <a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com> <868r9xz0ek.fsf@linuxsc.com> <5+eRe7cp3yQjL4=AX@bongo-ra.co>
 by: Tim Rentsch - Tue, 29 Aug 2023 11:35 UTC

Spiros Bousbouras <spibou@gmail.com> writes:

> On Sat, 26 Aug 2023 19:25:55 -0700
> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>
>> Sometimes people use compiler options to turn off, for example,
>> so-called "strict aliasing", and of course the C standard allows
>> us to do that. But compilers aren't required to provide such an
>> option, and if they do the option may not do exactly what we
>> expect it to do, because there is no standard specification for
>> it. The C standard should define officially sanctioned
>> mechanisms -- as for example standard #pragma's -- to give
>> standard-defined semantics to certain constructs of undefined
>> behavior that resemble, eg, -fno-strict-aliasing.
>
> Surely the starting point for this should be the documentation of
> the compilers to specify precisely what -fno-strict-aliasing does.
> [...]

Not at all. It's easy to write a specification that says what we
want to do, along similar lines to what is said in the footnote
about union member access in section 6.5.2.3:

    If the member used to access the contents of a union object
    is not the same as the member last used to store a value in
    the object, the appropriate part of the object representation
    of the value is reinterpreted as an object representation in
    the new type as described in 6.2.6 (a process sometimes called
    "type punning"). This might be a trap representation.

That behavior should be the default, for all accesses. For cases
where a developer wants to give permission to the compiler to
optimize based on cross-type non-interference assumptions, there
should be a #pragma to do something similar to what effective type
rules do now. The effective type rules are in need of re-writing
anyway, and making type punning be the default doesn't break any
programs, because compilers are already free to ignore the
implications of violating effective type conditions.
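
A small sketch of the kind of cross-type non-interference assumption at
issue, under the current effective type rules. The function name
store_both is hypothetical.

```c
/* Because an int object may not be accessed through a float lvalue,
   a compiler may currently assume *ip and *fp do not overlap and may
   return the constant 1 without reloading *ip. With type punning as
   the default, it would have to allow for overlap unless told
   otherwise. */
static int store_both(int *ip, float *fp)
{
    *ip = 1;
    *fp = 0.0f;
    return *ip;
}
```

Called with pointers to two distinct objects, the function returns 1
under either set of rules; the two readings differ only when the
pointers refer to the same storage.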

>> The second problem is basically The Law of Unintended Consequences
>> smashing into The Law of Least Astonishment. As compiler writers
>> have gotten more and more clever at exploiting the implications of
>> "undefined behavior", we see more and more cases of code that looks
>> reasonable being turned into mush by overly clever "optimizing"
>> compilers. There is obviously something wrong with the way this
>> trend is going -- ever more clever "optimizations", followed by
>> ever more arcane compiler options to work around the problems
>> caused by the too-clever compilers. This problem must be addressed
>> by the C standard, for if it is not the ecosystem will transform
>> into a confused state that is exactly what the C standard was put
>> in place to avoid. (I do have some ideas about how to address this
>> issue, but I want to make sure everyone appreciates the extent of
>> the problem before we start talking about solutions.)
>
> Without specific examples, it's impossible to comment on this.
> [...]

I feel that so much has been written about this issue that it
isn't necessary for me to elaborate.

> For example it has been pointed out on comp.lang.c that it's
> impossible to write a malloc() implementation in conforming
> C. This is certainly a weakness which should be addressed with
> some appropriate #pragma.

There isn't any reason to think malloc() should be writable in
completely portable C. That's the point of putting malloc() in
the system library in the first place. By the way, with type
punning semantics mentioned above being the default, and with the
alignment features added in C11, I think it is possible to write
malloc() in portable C without needing any additional language
changes. But even if it isn't, that is no cause for concern; one
of the principal reasons for having a system library is to
provide functionality that the core language cannot express (or
cannot express conveniently).

>> Before leaving the sub-topic of undefined behavior, let me mention
>> two success stories. The first is 'restrict': the performance
>> implications are local, the choice is under control of the program
>> (and programmer), and the default choice is to play safe. Good
>> show.
>
> From my point of view, restrict is not a success because the
> specification of restrict is the one part of the C1999 standard I
> have given up trying to understand. I understand the underlying
> idea but the specifics elude me. [...]

I agree the formal definition of restrict is rather daunting. In
practice though I think using restrict with confidence is not
overly difficult. My working model for restrict is something
like this:

1. Use restrict only in the declarations of function
parameters.

2. For a declaration like const T *restrict foo ,
the compiler may assume that any objects that can be
accessed through 'foo' will not be modified.

3. For a declaration like T *restrict bas ,
the compiler may assume that any changes to objects
that can be accessed through 'bas' will be done
using 'bas' or a pointer value derived from 'bas'
(and in particular that no changes will happen
other than through 'bas' or 'bas'-derived pointer
values).

Is this summary description helpful?
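
To make points 2 and 3 concrete, here is a minimal sketch. The function
name add_scaled and its parameters are made up for illustration; they do
not come from the standard or from this thread. 'src' plays the role of
'foo' and 'dst' plays the role of 'bas'.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Point 2: the compiler may assume that the elements readable through
 * 'src' are not modified while the function runs.
 * Point 3: the compiler may assume that every change to dst[0..n-1] is
 * made through 'dst' or a pointer derived from it.
 * The caller therefore has to pass arrays that do not overlap.
 */
void add_scaled(double *restrict dst, const double *restrict src,
                double factor, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        /* A store to dst[i] cannot change any src[j], so loads from
           'src' may be reordered or vectorized. */
        dst[i] += factor * src[i];
    }
}
```

Calling add_scaled(a, a, 2.0, n) would break the promise made by
restrict; calling it with two distinct arrays is fine.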

Re: Does reading an uninitialized object have undefined behavior?

<KvVxh3+WExIyDnM+5@bongo-ra.co>

https://news.novabbs.org/devel/article-flat.php?id=580&group=comp.std.c#580

Path: i2pn2.org!i2pn.org!paganini.bofh.team!tor-network!not-for-mail
From: spibou@gmail.com (Spiros Bousbouras)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Wed, 30 Aug 2023 19:53:40 -0000 (UTC)
Organization: To protect and to server
Message-ID: <KvVxh3+WExIyDnM+5@bongo-ra.co>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com>
<87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com>
<867cpu5h8w.fsf@linuxsc.com> <a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com> <868r9xz0ek.fsf@linuxsc.com>
<5+eRe7cp3yQjL4=AX@bongo-ra.co> <86sf82ulmb.fsf@linuxsc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 30 Aug 2023 19:53:40 -0000 (UTC)
Injection-Info: paganini.bofh.team; logging-data="56661"; posting-host="9H7U5kayiTdk7VIdYU44Rw.user.paganini.bofh.team"; mail-complaints-to="usenet@bofh.team"; posting-account="9dIQLXBM7WM9KzA+yjdR4A";
Cancel-Lock: sha256:oi++WaHEBABeTER7pIgO28xeipsH5zNhmIMkog29qVk=
X-Server-Commands: nowebcancel
X-Organisation: Weyland-Yutani
X-Notice: Filtered by postfilter v. 0.9.3
X-TOR-Router: 81.187.223.112
 by: Spiros Bousbouras - Wed, 30 Aug 2023 19:53 UTC

On Tue, 29 Aug 2023 04:35:40 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> Spiros Bousbouras <spibou@gmail.com> writes:
>
> > On Sat, 26 Aug 2023 19:25:55 -0700
> > Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> >
> >> Sometimes people use compiler options to turn off, for example,
> >> so-called "strict aliasing", and of course the C standard allows
> >> us to do that. But compilers aren't required to provide such an
> >> option, and if they do the option may not do exactly what we
> >> expect it to do, because there is no standard specification for
> >> it. The C standard should define officially sanctioned
> >> mechanisms -- as for example standard #pragma's -- to give
> >> standard-defined semantics to certain constructs of undefined
> >> behavior that resemble, eg, -fno-strict-aliasing.
> >
> > Surely the starting point for this should be the documentation of
> > the compilers to specify precisely what -fno-strict-aliasing does.
> > [...]
>
> Not at all. It's easy to write a specification that says what we
> want to do, along similar lines to what is said in the footnote
> about union member access in section 6.5.2.3
>
> If the member used to access the contents of a union object
> is not the same as the member last used to store a value in
> the object, the appropriate part of the object representation
> of the value is reinterpreted as an object representation in
> the new type as described in 6.2.6 (a process sometimes called
> "type punning"). This might be a trap representation.

Works for me but it would be good to know that this is how compiler
writers actually understand -fno-strict-aliasing. Is there any compiler
documentation which says something like this?
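
For reference, the union footnote quoted above describes behaviour along
the lines of the following sketch. The function names are hypothetical,
and the code assumes that float and uint32_t have the same size, which
the static_assert checks. The memcpy variant is included only for
comparison, since it gives the same reinterpretation without relying on
the union rule.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes float and uint32_t have the same size");

/* Store through one union member, read through another: the bytes of
   the float are reinterpreted as a uint32_t ("type punning"). */
uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}

/* The same reinterpretation done by copying the object representation. */
uint32_t float_bits_memcpy(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Both functions should return the same value for any argument; on an
implementation where float is IEEE 754 binary32, 1.0f gives 0x3F800000.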

> That behavior should be the default, for all accesses. For cases
> where a developer wants to give permission to the compiler to
> optimize based on cross-type non-interference assumptions, there
> should be a #pragma to do something similar to what effective type
> rules do now. The effective type rules are in need of re-writing
> anyway, and making type punning be the default doesn't break any
> programs, because compilers are already free to ignore the
> implications of violating effective type conditions.

[...]

> > For example it has been pointed out on comp.lang.c that it's
> > impossible to write a malloc() implementation in conforming
> > C. This is certainly a weakness which should be addressed with
> > some appropriate #pragma.
>
> There isn't any reason to think malloc() should be writable in
> completely portable C. That's the point of putting malloc() in
> the system library in the first place. By the way, with type
> punning semantics mentioned above being the default, and with the
> alignment features added in C11, I think it is possible to write
> malloc() in portable C without needing any additional language
> changes. But even if it isn't, that is no cause for concern; one
> of the principal reasons for having a system library is to
> provide functionality that the core language cannot express (or
> cannot express conveniently).

One might want to experiment with different allocation algorithms
and it seems to me that this sort of thing is within the "remit" of
C. So ideally one should be able to write it in C and prove, starting
from the standard or precise specifications in compiler documentation,
that it works correctly. I don't necessarily mean prove the correctness
of the whole code but certain key parts.

Another application I have in mind is languages which get translated
to C and support garbage collection. Again one might want to use the
standard malloc() to allocate a large block of memory and use different
parts of this memory for different types of objects.
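
A minimal sketch of what I have in mind, with made-up names (struct
arena, arena_init, arena_alloc, arena_free). It assumes that the
requested alignment is a power of two no larger than
_Alignof(max_align_t), so that an aligned offset from the malloc()
result gives an aligned pointer. As far as I can tell it stays within
the current effective type rules, because memory obtained from malloc()
has no declared type, but that is exactly the kind of claim one would
like to be able to prove.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump allocator over a single malloc() block. */
struct arena {
    unsigned char *base;   /* start of the block returned by malloc() */
    size_t size;           /* total number of bytes in the block */
    size_t used;           /* number of bytes handed out so far */
};

/* Returns nonzero on success. */
int arena_init(struct arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

/* Returns 'size' bytes whose offset is a multiple of 'align',
   or NULL if the block does not have enough room left. */
void *arena_alloc(struct arena *a, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->size || size > a->size - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

void arena_free(struct arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

A garbage collector for a language translated to C would then take,
say, a char from one part of the block and a double from another by
calling arena_alloc() with sizeof and _Alignof of the type wanted.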

If these things are possible with the semantics you propose, I'm happy.
I'm not bothered which is the default as long as there is a precise
specification from which you can reason that you get the desired behaviour.

> >> Before leaving the sub-topic of undefined behavior, let me mention
> >> two success stories. The first is 'restrict': the performance
> >> implications are local, the choice is under control of the program
> >> (and programmer), and the default choice is to play safe. Good
> >> show.
> >
> > From my point of view, restrict is not a success because the
> > specification of restrict is the one part of the C1999 standard I
> > have given up trying to understand. I understand the underlying
> > idea but the specifics elude me. [...]
>
> I agree the formal definition of restrict is rather daunting. In
> practice though I think using restrict with confidence is not
> overly difficult. My working model for restrict is something
> like this:
>
> 1. Use restrict only in the declarations of function
> parameters.
>
> 2. For a declaration like const T *restrict foo ,
> the compiler may assume that any objects that can be
> accessed through 'foo' will not be modified.

Wouldn't that also be the case with just const T *foo?

> 3. For a declaration like T *restrict bas ,
> the compiler may assume that any changes to objects
> that can be accessed through 'bas' will be done
> using 'bas' or a pointer value derived from 'bas'
> (and in particular that no changes will happen
> other than through 'bas' or 'bas'-derived pointer
> values).
>
> Is this summary description helpful?

It seems clear enough but, as I've said, I don't have any use for
restrict anyway and it's not worth it for me to expend the additional
mental effort to confirm that my code obeys the additional restrictions
of restrict. If I call a function with a preexisting interface which
involves restrict then it seems easy enough to obey the restrictions.

--
Carrie also narrates the film, providing useful guidelines for those
challenged by its intricacies. Sample: "Later that day, Big and I
arrived home."
http://www.rogerebert.com/reviews/sex-and-the-city-2-2010

Re: Does reading an uninitialized object have undefined behavior?

<86zg28t563.fsf@linuxsc.com>

https://news.novabbs.org/devel/article-flat.php?id=581&group=comp.std.c#581

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Wed, 30 Aug 2023 17:40:52 -0700
Organization: A noiseless patient Spider
Lines: 105
Message-ID: <86zg28t563.fsf@linuxsc.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com> <fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com> <a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com> <868r9xz0ek.fsf@linuxsc.com> <5+eRe7cp3yQjL4=AX@bongo-ra.co> <86sf82ulmb.fsf@linuxsc.com> <KvVxh3+WExIyDnM+5@bongo-ra.co>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="49a0c7fba7d7c0f06cea865d80b29294";
logging-data="3098900"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/95rjl+oVAGbjt81ES9CG7oIKOzQs06QE="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:Ze1BXDV1P3+7t3aIatUSA5AEohQ=
sha1:drZdhrhKPVQILRkqN7UwYVb4dAw=
 by: Tim Rentsch - Thu, 31 Aug 2023 00:40 UTC

Spiros Bousbouras <spibou@gmail.com> writes:

> On Tue, 29 Aug 2023 04:35:40 -0700
> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>
>> Spiros Bousbouras <spibou@gmail.com> writes:
>>
>>> On Sat, 26 Aug 2023 19:25:55 -0700
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>>>
>>>> Sometimes people use compiler options to turn off, for example,
>>>> so-called "strict aliasing", and of course the C standard allows
>>>> us to do that. But compilers aren't required to provide such an
>>>> option, and if they do the option may not do exactly what we
>>>> expect it to do, because there is no standard specification for
>>>> it. The C standard should define officially sanctioned
>>>> mechanisms -- as for example standard #pragma's -- to give
>>>> standard-defined semantics to certain constructs of undefined
>>>> behavior that resemble, eg, -fno-strict-aliasing.
>>>
>>> Surely the starting point for this should be the documentation of
>>> the compilers to specify precisely what -fno-strict-aliasing does.
>>> [...]
>>
>> Not at all. It's easy to write a specification that says what we
>> want to do, along similar lines to what is said in the footnote
>> about union member access in section 6.5.2.3
>>
>> If the member used to access the contents of a union object
>> is not the same as the member last used to store a value in
>> the object, the appropriate part of the object representation
>> of the value is reinterpreted as an object representation in
>> the new type as described in 6.2.6 (a process sometimes called
>> "type punning"). This might be a trap representation.
>
> Works for me but it would be good to know that this is how compiler
> writers actually understand -fno-strict-aliasing. [...]

No, it wouldn't. Implementations follow the C standard, not
the other way around. Looking at what implementations do for
the -fno-strict-aliasing flag is worse than a waste of time.

>>> For example it has been pointed out on comp.lang.c that it's
>>> impossible to write a malloc() implementation in conforming
>>> C. This is certainly a weakness which should be addressed with
>>> some appropriate #pragma.
>>
>> There isn't any reason to think malloc() should be writable in
>> completely portable C. That's the point of putting malloc() in
>> the system library in the first place. By the way, with type
>> punning semantics mentioned above being the default, and with the
>> alignment features added in C11, I think it is possible to write
>> malloc() in portable C without needing any additional language
>> changes. But even if it isn't, that is no cause for concern; one
>> of the principal reasons for having a system library is to
>> provide functionality that the core language cannot express (or
>> cannot express conveniently).
>
> One might want to experiment with different allocation algorithms
> and it seems to me that this sort of thing is within the "remit" of
> C. So ideally one should be able to write it in C [...]

You're conflating writing something in C and writing something
in completely portable C. It's already possible to do these
things writing in C.

>>> From my point of view, restrict is not a success because the
>>> specification of restrict is the one part of the C1999 standard I
>>> have given up trying to understand. I understand the underlying
>>> idea but the specifics elude me. [...]
>>
>> I agree the formal definition of restrict is rather daunting. In
>> practice though I think using restrict with confidence is not
>> overly difficult. My working model for restrict is something
>> like this:
>>
>> 1. Use restrict only in the declarations of function
>> parameters.
>>
>> 2. For a declaration like const T *restrict foo ,
>> the compiler may assume that any objects that can be
>> accessed through 'foo' will not be modified.
>
> Wouldn't that also be the case with just const T *foo?

No.
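
A minimal sketch of the difference, using hypothetical function names.
With plain const T *foo, the const only limits what may be done through
'foo' itself; the object can still be changed through some other
pointer, so the compiler cannot assume it stays the same.

```c
#include <assert.h>
#include <stddef.h>

/* Plain const: '*in' may not be changed through 'in', but 'out' is
   allowed to point at the same object, so '*in' has to be read again
   after the store. */
int reload_plain(const int *in, int *out)
{
    assert(in != NULL && out != NULL);
    int first = *in;
    *out = first + 1;       /* changes *in when out and in are equal */
    return first + *in;
}

/* With restrict: the caller promises that the object read through
   'in' is not modified during the call, so the compiler may reuse
   'first' instead of reading '*in' a second time. */
int reload_restrict(const int *restrict in, int *restrict out)
{
    assert(in != NULL && out != NULL);
    int first = *in;
    *out = first + 1;
    return first + *in;     /* may be compiled as first + first */
}
```

Calling reload_plain(&x, &x) is well defined and shows the second read
seeing the new value; the same call to reload_restrict would have
undefined behavior.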

>> 3. For a declaration like T *restrict bas ,
>> the compiler may assume that any changes to objects
>> that can be accessed through 'bas' will be done
>> using 'bas' or a pointer value derived from 'bas'
>> (and in particular that no changes will happen
>> other than through 'bas' or 'bas'-derived pointer
>> values).
>>
>> Is this summary description helpful?
>
> It seems clear enough but, as I've said, I don't have any use
> for restrict anyway and it's not worth it for me to expend the
> additional mental effort to confirm that my code obeys the
> additional restrictions of restrict. [...]

If you don't want to use restrict that is quite okay. Part of
why I call restrict a success is that it can be ignored, with
only minimal effort, by any developer who doesn't want to use it.

Re: Does reading an uninitialized object have undefined behavior?

<S+3le+7=sc9SPmPL3@bongo-ra.co>

https://news.novabbs.org/devel/article-flat.php?id=582&group=comp.std.c#582

Path: i2pn2.org!i2pn.org!paganini.bofh.team!not-for-mail
From: spibou@gmail.com (Spiros Bousbouras)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Thu, 31 Aug 2023 18:18:59 -0000 (UTC)
Organization: To protect and to server
Message-ID: <S+3le+7=sc9SPmPL3@bongo-ra.co>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com>
<87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com>
<867cpu5h8w.fsf@linuxsc.com> <a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com> <868r9xz0ek.fsf@linuxsc.com>
<5+eRe7cp3yQjL4=AX@bongo-ra.co> <86sf82ulmb.fsf@linuxsc.com> <KvVxh3+WExIyDnM+5@bongo-ra.co>
<86zg28t563.fsf@linuxsc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 31 Aug 2023 18:18:59 -0000 (UTC)
Injection-Info: paganini.bofh.team; logging-data="606078"; posting-host="9H7U5kayiTdk7VIdYU44Rw.user.paganini.bofh.team"; mail-complaints-to="usenet@bofh.team"; posting-account="9dIQLXBM7WM9KzA+yjdR4A";
Cancel-Lock: sha256:j/9v/48WvXI176WJ/T3SSfbXRWzCGo4fohGbIaNGAsY=
X-Notice: Filtered by postfilter v. 0.9.3
X-Organisation: Weyland-Yutani
X-Server-Commands: nowebcancel
 by: Spiros Bousbouras - Thu, 31 Aug 2023 18:18 UTC

On Wed, 30 Aug 2023 17:40:52 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> Spiros Bousbouras <spibou@gmail.com> writes:
>
> > On Tue, 29 Aug 2023 04:35:40 -0700
> > Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> >
> >> Spiros Bousbouras <spibou@gmail.com> writes:

[...]

> >> Not at all. It's easy to write a specification that says what we
> >> want to do, along similar lines to what is said in the footnote
> >> about union member access in section 6.5.2.3
> >>
> >> If the member used to access the contents of a union object
> >> is not the same as the member last used to store a value in
> >> the object, the appropriate part of the object representation
> >> of the value is reinterpreted as an object representation in
> >> the new type as described in 6.2.6 (a process sometimes called
> >> "type punning"). This might be a trap representation.
> >
> > Works for me but it would be good to know that this is how compiler
> > writers actually understand -fno-strict-aliasing. [...]
>
> No, it wouldn't. Implementations follow the C standard, not
> the other way around. Looking at what implementations do for
> the -fno-strict-aliasing flag is worse than a waste of time.

Actually the influence goes in both directions. In theory the standard is the
ultimate authority; in practice it is whatever C compilers one has access to.
For now the standard doesn't have anything like -fno-strict-aliasing, so if one
needs it then looking at what implementations do is the only option. But even
the standard committee should look at it, and at whether C programmers find it
useful, in order to decide what along such lines (if anything) should go into
the standard.

> >> There isn't any reason to think malloc() should be writable in
> >> completely portable C. That's the point of putting malloc() in
> >> the system library in the first place. By the way, with type
> >> punning semantics mentioned above being the default, and with the
> >> alignment features added in C11, I think it is possible to write
> >> malloc() in portable C without needing any additional language
> >> changes. But even if it isn't, that is no cause for concern; one
> >> of the principal reasons for having a system library is to
> >> provide functionality that the core language cannot express (or
> >> cannot express conveniently).
> >
> > One might want to experiment with different allocation algorithms
> > and it seems to me that this sort of thing is within the "remit" of
> > C. So ideally one should be able to write it in C [...]
>
> You're conflating writing something in C and writing something
> in completely portable C. It's already possible to do these
> things writing in C.

I wrote

One might want to experiment with different allocation algorithms and it
seems to me that this sort of thing is within the "remit" of C. So
ideally one should be able to write it in C and prove, starting from the
standard or precise specifications in compiler documentation, that it
works correctly. I don't necessarily mean prove the correctness of the
whole code but certain key parts.

This doesn't conflate anything. One can do the writing, but can one do the
proving, or something close?

--
vlaho.ninja/prog

Re: Does reading an uninitialized object have undefined behavior?

<867cp4pzdu.fsf@linuxsc.com>

https://news.novabbs.org/devel/article-flat.php?id=873&group=comp.std.c#873

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Tue, 05 Sep 2023 05:39:57 -0700
Organization: A noiseless patient Spider
Lines: 30
Message-ID: <867cp4pzdu.fsf@linuxsc.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com> <fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com> <a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com> <868r9xz0ek.fsf@linuxsc.com> <5+eRe7cp3yQjL4=AX@bongo-ra.co> <86sf82ulmb.fsf@linuxsc.com> <KvVxh3+WExIyDnM+5@bongo-ra.co> <86zg28t563.fsf@linuxsc.com> <S+3le+7=sc9SPmPL3@bongo-ra.co>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="6793c8bc2747e0ba6d3890411794594c";
logging-data="2084334"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19oTDwT01H2dtFhgr3uP4EzP3byC8dCgaE="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:rMFZ8QPqtIiZe+fzthTyn5yqK8Y=
sha1:ZR7O2PME4rcI6p4iAPwWQre6VwA=
 by: Tim Rentsch - Tue, 5 Sep 2023 12:39 UTC

Spiros Bousbouras <spibou@gmail.com> writes:

> On Wed, 30 Aug 2023 17:40:52 -0700
> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

>> You're conflating writing something in C and writing something
>> in completely portable C. It's already possible to do these
>> things writing in C.
>
> I wrote
>
> One might want to experiment with different allocation
> algorithms and it seems to me that this sort of thing is
> within the "remit" of C. So ideally one should be able to
> write it in C and prove, starting from the standard or
> precise specifications in compiler documentation, that it
> works correctly. I don't necessarily mean prove the
> correctness of the whole code but certain key parts.
>
> This doesn't conflate anything. One can do the writing, but
> can one do the proving, or something close?

A substitute for malloc()/free() can be written in standard C.

A substitute for malloc()/free() cannot be written in completely
portable standard C.

I hope this clarifies my earlier comments.

Re: Does reading an uninitialized object have undefined behavior?

<861qfcp3q5.fsf@linuxsc.com>

https://news.novabbs.org/devel/article-flat.php?id=874&group=comp.std.c#874

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Tue, 05 Sep 2023 17:03:46 -0700
Organization: A noiseless patient Spider
Lines: 85
Message-ID: <861qfcp3q5.fsf@linuxsc.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com> <fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com> <a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="f6301c9a6ce7286ea3bcf92b0a94924d";
logging-data="2295855"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+KB0WOG4uc3zwOGpwALzTrnQNt5wGXMM4="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:/esJ1yjkV8T0kPtOD5iSAPqyRiY=
sha1:5muYHZhAOyReSPLpgirNY29mcc4=
 by: Tim Rentsch - Wed, 6 Sep 2023 00:03 UTC

Martin Uecker <ma.uecker@gmail.com> writes:

[...]

> There are essentially two main interests driving this. First,
> there is some interest to precisely formulate the semantics for
> C. The provenance proposal came out of this.
>
> Second, there is the issue of safety problems caused by
> uninitialized reads, together with compiler support for zero
> initialization etc. So there are various people who want to
> change the semantics for uninitialized variables completely
> in the interest of safety.
>
> So far, there was no consensus in WG14 that the rules should
> be changed or what the new rules should be.

I have a second reply here, which I hope will come closer to
being relevant to the issues of interest.

What I think is being looked for is a way to describe the
language semantics in areas such as cross-type interference and
what is meant when an uninitialized object is read. I thought
about this question both while I was writing the longer earlier
reply and then more deeply afterwards.

What I think is most important is that these areas in particular
are not about language semantics in the same way as, for example,
array indexing. Rather they are about what transformations a
compiler is allowed to do in the presence of various combinations
of program constructs. That difference means the C standard
should express the rules in a way that more directly reflects
what's going on. More specifically, the standard should say or
explain what can be done, not by describing language semantics
(which is indirect), but explicitly in terms of what compiler
transformations are allowed (which is direct). Note that there
is precedent for this idea, in how the C standard talks about
looping constructs and when they may be assumed to terminate.
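
One concrete reading of that precedent, assuming the rule meant is the
one in C11 6.8.5 paragraph 6: an iteration statement whose controlling
expression is not a constant expression, and which performs no I/O,
volatile accesses, synchronization or atomic operations, may be assumed
by the implementation to terminate. The function name collatz_steps in
this sketch is only an illustration.

```c
#include <assert.h>

/*
 * Counts the Collatz steps needed to get from n to 1.  Nobody has
 * proved that this loop terminates for every n, but it does no I/O,
 * touches no volatile or atomic objects, and its controlling
 * expression is not a constant, so an implementation may assume that
 * it terminates.  A call whose result is never used may therefore be
 * removed entirely, whatever the loop would have done.
 */
unsigned collatz_steps(unsigned long n)
{
    assert(n != 0);   /* starting from 0 the loop would never reach 1 */
    unsigned steps = 0;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        steps++;
    }
    return steps;
}
```

The rule is stated as a permission given to the compiler rather than as
a property of the program's meaning, which is the style being suggested
here for uninitialized reads and cross-type interference.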

To give an example, take uninitialized objects, either automatic
variables without an initializer, or memory allocated by malloc or
added by realloc. The most natural semantics for such situations
is to say that newly "created" memory gets an unspecified object
representation at the start of its lifetime. (Yes I know that C
in its current form lets automatic objects be "uninitialized"
whenever their declaration points are reached, but let's ignore
that for now.) Now suppose a program has a read access where it
is easy to deduce that the object being read is still in the
"unspecified object representation" initial state. To simplify
the discussion, suppose the type of the access is a pointer type,
and so is known to have trap representations (the name is changed
in the C23 draft, but the idea is what's important).

What is a compiler allowed to do in such circumstances? One thing
it might reasonably be allowed to do is to cause the program to be
terminated if it ever reaches such an access. Or there might be
an option to initialize the pointer to NULL. Or, if a suitable
compiler option were invoked, the construct might be flagged with
a fatal error (or of course a warning). There are all sorts of
actions a developer might want the compiler to take, and a
compiler could offer many of those options, as choices selected
under control of command line switches (or equivalent). I think a
few points are worth making.

One, there must be some sort of default action that all compilers
have to support. The default action in this case might be to
issue a non-fatal diagnostic.

Two, there must be a way for the developer to tell the compiler to
"proceed blindly" - saying, in effect, I accept that the compiled
code might misbehave, but let me take that risk, and generate code
like it's going to work. (In other words, for the read access, go
ahead and load whatever unspecified object representation happens
to be there.) A "proceed blindly" choice probably shouldn't be
the default, but it must be available.

Three, the consequence must never be "undefined behavior", unless
there is an explicit stipulation to that effect. The stipulation
might take the form of a #pragma, or a compiler option, or a code
decoration using "attribute" (whatever the syntax for such things
is).

I know my comments here are somewhat sketchy, but hopefully a
general sense of the ideas gets across. The suggestions should at
least serve to stimulate further discussion.

Re: Does reading an uninitialized object have undefined behavior?

<b4qdnRse5OVYemT5nZ2dnZfqn_idnZ2d@giganews.com>

https://news.novabbs.org/devel/article-flat.php?id=1010&group=comp.std.c#1010

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!newsfeed.hasname.com!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!Xl.tags.giganews.com!local-1.nntp.ord.giganews.com!news.giganews.com.POSTED!not-for-mail
NNTP-Posting-Date: Thu, 07 Sep 2023 15:09:57 +0000
Subject: Re: Does reading an uninitialized object have undefined behavior?
Newsgroups: comp.std.c
References: <87zg3pq1ym.fsf@nosuchdomain.example.com>
<87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com>
<87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com>
<86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com>
<864jkz7hrm.fsf@linuxsc.com>
<e043af84-3153-4097-9505-666869fcf727n@googlegroups.com>
<867cpu5h8w.fsf@linuxsc.com>
<a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com>
<861qfcp3q5.fsf@linuxsc.com>
From: jb-usenet@wisemo.com.invalid (Jakob Bohm)
Organization: WiseMo A/S
Date: Thu, 7 Sep 2023 17:09:56 +0200
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:6.2) Goanna/20230604
Epyrus/2.0.2
MIME-Version: 1.0
In-Reply-To: <861qfcp3q5.fsf@linuxsc.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Message-ID: <b4qdnRse5OVYemT5nZ2dnZfqn_idnZ2d@giganews.com>
Lines: 122
X-Usenet-Provider: http://www.giganews.com
X-Trace: sv3-H2RdyYFBteMvblV5mdphGKkjMFB7sXT6finU06B5KLMyHY3E7uFpL5qtCycajigaEz2zMqdkaXU6/f0!mZNGLNUTzAHzHHoP/Vj4bNDCBnCn8WjBnEV96onBm4otoa6Ahi+KHiAkDfLaZV/7KlXV48kRf8g=
X-Complaints-To: abuse@giganews.com
X-DMCA-Notifications: http://www.giganews.com/info/dmca.html
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.40
X-Received-Bytes: 7756
 by: Jakob Bohm - Thu, 7 Sep 2023 15:09 UTC

On 2023-09-06 02:03, Tim Rentsch wrote:
> Martin Uecker <ma.uecker@gmail.com> writes:
>
> [...]
>
>> There are essentially two main interests driving this. First,
>> there is some interest to precisely formulate the semantics for
>> C. The provenance proposal came out of this.
>>
>> Second, there is the issue of safety problems caused by
>> uninitialized reads, together with compiler support for zero
>> initialization etc. So there are various people who want to
>> change the semantics for uninitialized variables completely
>> in the interest of safety.
>>
>> So far, there was no consensus in WG14 that the rules should
>> be changed or what the new rules should be.
>
> I have a second reply here, which I hope will come closer to
> being relevant to the issues of interest.
>
> What I think is being looked for is a way to describe the
> language semantics in areas such as cross-type interference and
> what is meant when an uninitialized object is read. I thought
> about this question both while I was writing the longer earlier
> reply and then more deeply afterwards.
>
> What I think is most important is that these areas in particular
> are not about language semantics in the same way as, for example,
> array indexing. Rather they are about what transformations a
> compiler is allowed to do in the presence of various combinations
> of program constructs. That difference means the C standard
> should express the rules in a way that more directly reflects
> what's going on. More specifically, the standard should say or
> explain what can be done, not by describing language semantics
> (which is indirect), but explicitly in terms of what compiler
> transformations are allowed (which is direct). Note that there
> is precedent for this idea, in how the C standard talks about
> looping constructs and when they may be assumed to terminate.
>
> To give an example, take uninitialized objects, either automatic
> variables without an initializer, or memory allocated by malloc or
> added by realloc. The most natural semantics for such situations
> is to say that newly "created" memory gets an unspecified object
> representation at the start of its lifetime. (Yes I know that C
> in its current form lets automatic objects be "uninitialized"
> whenever their declaration points are reached, but let's ignore
> that for now.) Now suppose a program has a read access where it
> is easy to deduce that the object being read is still in the
> "unspecified object representation" initial state. To simplify
> the discussion, suppose the type of the access is a pointer type,
> and so is known to have trap representations (the name is changed
> in the C23 draft, but the idea is what's important).
>
> What is a compiler allowed to do in such circumstances? One thing
> it might reasonably be allowed to do is to cause the program to be
> terminated if it ever reaches such an access. Or there might be
> an option to initialize the pointer to NULL. Or, if a suitable
> compiler option were invoked, the construct might be flagged with
> a fatal error (or of course a warning). There are all sorts of
> actions a developer might want the compiler to take, and a
> compiler could offer many of those options, as choices selected
> under control of command line switches (or equivalent). I think a
> few points are worth making.
>
> One, there must be some sort of default action that all compilers
> have to support. The default action in this case might be to
> issue a non-fatal diagnostic.
>
> Two, there must be a way for the developer to tell the compiler to
> "proceed blindly" - saying, in effect, I accept that the compiled
> code might misbehave, but let me take that risk, and generate code
> like it's going to work. (In other words, for the read access, go
> ahead and load whatever unspecified object representation happens
> to be there.) A "proceed blindly" choice probably shouldn't be
> the default, but it must be available.
>
> Three, the consequence must never be "undefined behavior", unless
> there is an explicit stipulation to that effect. The stipulation
> might take the form of a #pragma, or a compiler option, or a code
> decoration using "attribute" (whatever the syntax for such things
> is).
>

Agreed so far!
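
As a rough illustration of the third point (an explicit opt-in through a
code decoration), here is a minimal sketch built on an existing compiler
extension rather than on anything in the C standard. It assumes the
"uninitialized" variable attribute that recent Clang and GCC (12 and
later) offer alongside the -ftrivial-auto-var-init=zero option. The names
MAYBE_UNINIT and sum_first_n are hypothetical, made up for this example.

```c
/* MAYBE_UNINIT is a hypothetical macro: it expands to the GCC/Clang
   "uninitialized" attribute where the compiler reports support for it,
   and to nothing everywhere else. */
#if defined(__has_attribute)
#  if __has_attribute(uninitialized)
#    define MAYBE_UNINIT __attribute__((uninitialized))
#  endif
#endif
#ifndef MAYBE_UNINIT
#  define MAYBE_UNINIT
#endif

/* Sums 1..n (n clamped to 16) through a scratch buffer. The buffer is
   marked so that automatic zero-initialization, if enabled on the
   command line, is skipped for it. Every element is written before it
   is read, so no uninitialized value is ever used. */
static int sum_first_n(int n)
{
    int scratch[16] MAYBE_UNINIT;
    int i, total = 0;

    if (n > 16)
        n = 16;
    for (i = 0; i < n; i++)
        scratch[i] = i + 1;
    for (i = 0; i < n; i++)
        total += scratch[i];
    return total;
}
```

With such an option enabled, zero-initialization plays the role of the
safe default and the attribute is the per-object "proceed blindly" request.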

As a developer of programs in C with practical but not infinite
portability, I very much abhor the mad optimizations that use
language lawyering to state that any code path that might,
hypothetically, exceed the boundaries of standard-enforced behavior
is allowed to be arbitrarily mangled to get a faster bad result.

For example, I have one function which intentionally reads an
uninitialized variable to get a somewhat arbitrary value of a type
with no known trap representation. I have a number of other
programs which extensively process a block of data before deciding
in some other way if the data is garbage or useful. This is done
for sound technical reasons but requires that the compiler doesn't
plant landmines all over virgin land.

As another example, I have speed critical code that relies on running
on 2s complement machines with wraparound on signed integer overflow,
and that code is being very clear and explicit in doing so, but there
is no C90 notation to tell all ISO-C implementations that this is the
intention, thus it is explicit only in comments, not in the tokens
passed to the C compiler.

> I know my comments here are somewhat sketchy, but hopefully a
> general sense of the ideas gets across. The suggestions should at
> least serve to stimulate further discussion.
>

I am writing from a similar perspective.

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

Re: Does reading an uninitialized object have undefined behavior?

<87sf7qnefn.fsf@bsb.me.uk>

https://news.novabbs.org/devel/article-flat.php?id=1011&group=comp.std.c#1011

Newsgroups: comp.std.c
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ben.usenet@bsb.me.uk (Ben Bacarisse)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Thu, 07 Sep 2023 17:19:56 +0100
Organization: A noiseless patient Spider
Lines: 21
Message-ID: <87sf7qnefn.fsf@bsb.me.uk>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com>
<87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com>
<87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com>
<86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com>
<864jkz7hrm.fsf@linuxsc.com>
<e043af84-3153-4097-9505-666869fcf727n@googlegroups.com>
<867cpu5h8w.fsf@linuxsc.com>
<a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com>
<861qfcp3q5.fsf@linuxsc.com>
<b4qdnRse5OVYemT5nZ2dnZfqn_idnZ2d@giganews.com>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Info: dont-email.me; posting-host="347d35c8051c6481b9483e2b861d4761";
logging-data="3213906"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX181p/KnrcP2rfJki8ggsdVf3m0L+Bt91RM="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Cancel-Lock: sha1:dCcid7HrBJ9od27DdBukSd+KDDU=
sha1:yeHcAGQBgrRQFdtAZjVlNGhkgdc=
X-BSB-Auth: 1.ebbaa5d3a35201676371.20230907171956BST.87sf7qnefn.fsf@bsb.me.uk
 by: Ben Bacarisse - Thu, 7 Sep 2023 16:19 UTC

Jakob Bohm <jb-usenet@wisemo.com.invalid> writes:

> As another example, I have speed critical code that relies on running
> on 2s complement machines with wraparound on signed integer overflow, and
> that code is being very clear and explicit in doing so, but there
> is no C90 notation to tell all ISO-C implementations that this is the
> intention, thus it is explicit only in comments, not in the tokens
> passed to the C compiler.

You can tell the compiler you want 2s complement by using the intN_t
types if you can find one that suits your portability requirements.

And can you not use unsigned arithmetic, re-interpreting as signed for
those places where it matters? The "overflow" can only happen in
the arithmetic, not in the re-interpretation.
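
A minimal sketch of that idea, for illustration only: do the addition in
unsigned int, where wraparound is well defined, and then reinterpret the
bits as int. It assumes a two's complement int without padding bits; the
function name wrapping_add is hypothetical, and memcpy is just one way to
do the reinterpretation.

```c
#include <limits.h>
#include <string.h>

/* Adds two ints with wraparound. The addition itself is done on
   unsigned operands, so it is reduced modulo UINT_MAX + 1 and cannot
   overflow. Assumes two's complement and no padding bits in int. */
static int wrapping_add(int a, int b)
{
    unsigned int sum = (unsigned int)a + (unsigned int)b;
    int result;

    /* Reinterpret the object representation instead of converting the
       value, so no out-of-range conversion to a signed type occurs. */
    memcpy(&result, &sum, sizeof result);
    return result;
}
```

Under those assumptions, wrapping_add(INT_MAX, 1) yields INT_MIN rather
than invoking undefined behavior.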

I know this is a deviation from the topic, so feel free to ignore if you
don't want to get into it.

--
Ben.

Re: Does reading an uninitialized object have undefined behavior?

<p5KdnX4UMaaDE2b5nZ2dnZeNn_pj4p2d@giganews.com>

https://news.novabbs.org/devel/article-flat.php?id=1017&group=comp.std.c#1017

Newsgroups: comp.std.c
Path: rocksolid2!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!feeder.usenetexpress.com!tr1.iad1.usenetexpress.com!69.80.99.23.MISMATCH!Xl.tags.giganews.com!local-2.nntp.ord.giganews.com!news.giganews.com.POSTED!not-for-mail
NNTP-Posting-Date: Fri, 08 Sep 2023 21:11:58 +0000
Subject: Re: Does reading an uninitialized object have undefined behavior?
Newsgroups: comp.std.c
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com> <fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com> <e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com> <a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com> <861qfcp3q5.fsf@linuxsc.com> <b4qdnRse5OVYemT5nZ2dnZfqn_idnZ2d@giganews.com> <87sf7qnefn.fsf@bsb.me.uk>
From: jb-usenet@wisemo.com.invalid (Jakob Bohm)
Organization: WiseMo A/S
Date: Fri, 8 Sep 2023 23:12:00 +0200
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:6.2) Goanna/20230604 Epyrus/2.0.2
MIME-Version: 1.0
In-Reply-To: <87sf7qnefn.fsf@bsb.me.uk>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Message-ID: <p5KdnX4UMaaDE2b5nZ2dnZeNn_pj4p2d@giganews.com>
Lines: 49
X-Usenet-Provider: http://www.giganews.com
X-Trace: sv3-KHDvlr5KPIz5D86fnaKZatu9djuuDOVuz+3Tco2rFs2UBMi+SAxXK6ahQOtUOrXjd9QZH0ljLTlKpVh!qHslSOKemXHY2OAx+AmIqcnfuzJroXP41R8BBF+BYnzZMi6g9SDZPVR2BmFqL6fl9MYZaZMjYB4=
X-Complaints-To: abuse@giganews.com
X-DMCA-Notifications: http://www.giganews.com/info/dmca.html
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.40
 by: Jakob Bohm - Fri, 8 Sep 2023 21:12 UTC

On 2023-09-07 18:19, Ben Bacarisse wrote:
> Jakob Bohm <jb-usenet@wisemo.com.invalid> writes:
>
>> As another example, I have speed critical code that relies on running
>> on 2s complement machines with wraparound on signed integer overflow, and
>> that code is being very clear and explicit in doing so, but there
>> is no C90 notation to tell all ISO-C implementations that this is the
>> intention, thus it is explicit only in comments, not in the tokens
>> passed to the C compiler.
>
> You can tell the compiler you want 2s complement by using the intN_t
> types if you can find one that suits your portability requirements.
>
> And can you not use unsigned arithmetic, re-interpreting as signed for
> those places where it matters? The "overflow" can only happen in
> the arithmetic, not in the re-interpretation.
>
> I know this is a deviation from the topic, so feel free to ignore if you
> don't want to get into it.
>

The code in question has as an explicit design condition that the compiler
implements signed versions with wraparound for each unsigned int type.

The code cannot rely on the intN_t types because they were not part of
C90 and thus do not exist as separate types in some targeted compilers.

In the world of C90 compilers, stdint.h was a non-standard system header
that provided convenience names for the most closely matching C90 types
on the platform, and some platforms simply didn't provide that header,
instead documenting how each C90 type mapped to data sizes.
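
For illustration, a sketch of how such a design condition can be partly
checked using only C90 facilities. It verifies the representation (two's
complement, and a hypothetical 32-bit width chosen for the example), not
the overflow behavior; wraparound itself still has to come from the
implementation, for instance through an option such as GCC's -fwrapv.
The names assume_int_is_32_bits and int_is_twos_complement are made up.

```c
#include <limits.h>

/* Two's complement has exactly one more negative value than positive,
   so INT_MIN + INT_MAX is -1 only for that representation. */
#if (INT_MIN + INT_MAX) != -1
#error "this code assumes a two's complement int"
#endif

/* C90 has no static assertion; a negative array size forces a
   compile-time error instead. The 32-bit width is a hypothetical
   requirement used only for this example. */
typedef char assume_int_is_32_bits[(sizeof(int) * CHAR_BIT == 32) ? 1 : -1];

/* Run-time counterpart of the preprocessor check: the low two bits of
   -1 are both set only in two's complement. */
static int int_is_twos_complement(void)
{
    return (-1 & 3) == 3;
}
```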

Excessive casting where directly using the desired type seems possible
is highly counter-intuitive and thus it is inherently wrong for an
optimizer to presume the right to mangle code using types such as "int",
"short int", "long int" and "signed char".

Once again this comes down to a language drift from "undefined" meaning
"not defined by this standard" to "An extremely toxic trap condition".

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

Re: Does reading an uninitialized object have undefined behavior?

<87fs3omjxj.fsf@bsb.me.uk>

https://news.novabbs.org/devel/article-flat.php?id=1018&group=comp.std.c#1018

Newsgroups: comp.std.c
Path: rocksolid2!news.neodome.net!news.mixmin.net!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ben.usenet@bsb.me.uk (Ben Bacarisse)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Fri, 08 Sep 2023 22:31:04 +0100
Organization: A noiseless patient Spider
Lines: 39
Message-ID: <87fs3omjxj.fsf@bsb.me.uk>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com>
<87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com>
<87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com>
<86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com>
<864jkz7hrm.fsf@linuxsc.com>
<e043af84-3153-4097-9505-666869fcf727n@googlegroups.com>
<867cpu5h8w.fsf@linuxsc.com>
<a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com>
<861qfcp3q5.fsf@linuxsc.com>
<b4qdnRse5OVYemT5nZ2dnZfqn_idnZ2d@giganews.com>
<87sf7qnefn.fsf@bsb.me.uk>
<p5KdnX4UMaaDE2b5nZ2dnZeNn_pj4p2d@giganews.com>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Info: dont-email.me; posting-host="b39237310d6b9749d50605be679823cd";
logging-data="3907119"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18dMnY+R5Nm2LzNw6kEaoOnzQ4tCzwARig="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Cancel-Lock: sha1:iGNj4rt7B0eWNNp9d8IsUtBUJ1Q=
sha1:Eak3Zz2vJ8MS7yYSb63mBaXZ3SI=
X-BSB-Auth: 1.b8198f5001a4a9087a54.20230908223104BST.87fs3omjxj.fsf@bsb.me.uk
 by: Ben Bacarisse - Fri, 8 Sep 2023 21:31 UTC

Jakob Bohm <jb-usenet@wisemo.com.invalid> writes:

> On 2023-09-07 18:19, Ben Bacarisse wrote:
>> Jakob Bohm <jb-usenet@wisemo.com.invalid> writes:
>>
>>> As another example, I have speed critical code that relies on running
>>> on 2s complement machines with wraparound on signed integer overflow, and
>>> that code is being very clear and explicit in doing so, but there
>>> is no C90 notation to tell all ISO-C implementations that this is the
>>> intention, thus it is explicit only in comments, not in the tokens
>>> passed to the C compiler.
>> You can tell the compiler you want 2s complement by using the intN_t
>> types if you can find one that suits your portability requirements.
>> And can you not use unsigned arithmetic, re-interpreting as signed for
>> those places where it matters? The "overflow" can only happen in
>> the arithmetic, not in the re-interpretation.
>> I know this is a deviation from the topic, so feel free to ignore if you
>> don't want to get into it.
>
> The code in question has as an explicit design condition that the compiler
> implements signed versions with wraparound for each unsigned int type.
>
> The code cannot rely on the intN_t types because they were not part of
> C90 and thus do not exist as separate types in some targeted
> compilers.

Ah, I didn't know targeting C90 was still a thing. I've been out of
the business for many years.

> Excessive casting where directly using the desired type seems possible
> is highly counter-intuitive and thus it is inherently wrong for an
> optimizer to presume the right to mangle code using types such as "int",
> "short int", "long int" and "signed char".

I wasn't suggesting casts as they don't remove the undefined behaviour.
But you have a design that suits your needs so it's all good.

--
Ben.

Re: Does reading an uninitialized object have undefined behavior?

<a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com>

https://news.novabbs.org/devel/article-flat.php?id=1019&group=comp.std.c#1019

Newsgroups: comp.std.c
X-Received: by 2002:a05:622a:288:b0:403:fe96:5779 with SMTP id z8-20020a05622a028800b00403fe965779mr1844qtw.5.1692388363786;
Fri, 18 Aug 2023 12:52:43 -0700 (PDT)
X-Received: by 2002:a17:902:da8b:b0:1b8:8fe2:6627 with SMTP id
j11-20020a170902da8b00b001b88fe26627mr64708plx.8.1692388363244; Fri, 18 Aug
2023 12:52:43 -0700 (PDT)
Path: rocksolid2!i2pn.org!weretis.net!feeder6.news.weretis.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.std.c
Date: Fri, 18 Aug 2023 12:52:42 -0700 (PDT)
In-Reply-To: <867cpu5h8w.fsf@linuxsc.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2a02:8388:e203:9700:eddb:fb4f:5189:911d;
posting-account=RQgdUAoAAACC04vq-o2ZyxdALW1NmdRY
NNTP-Posting-Host: 2a02:8388:e203:9700:eddb:fb4f:5189:911d
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk>
<874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk>
<87a5vpnegz.fsf@nosuchdomain.example.com> <86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com> <864jkz7hrm.fsf@linuxsc.com>
<e043af84-3153-4097-9505-666869fcf727n@googlegroups.com> <867cpu5h8w.fsf@linuxsc.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <a3199783-d8b7-4065-836b-08f647a6808en@googlegroups.com>
Subject: Re: Does reading an uninitialized object have undefined behavior?
From: ma.uecker@gmail.com (Martin Uecker)
Injection-Date: Fri, 18 Aug 2023 19:52:43 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 4032
 by: Martin Uecker - Fri, 18 Aug 2023 19:52 UTC

On Thursday, August 17, 2023 at 8:13:07 AM UTC+2, Tim Rentsch wrote:
> Martin Uecker <ma.u...@gmail.com> writes:
>
> [some unrelated passages removed]
> > On Wednesday, August 16, 2023 at 6:06:43 AM UTC+2, Tim Rentsch wrote:
> >
> >> Martin Uecker <ma.u...@gmail.com> writes:
> [...]
> >>> One could still consider the idea that "indeterminate" is an
> >>> abstract property that yields UB during read even for types
> >>> that do not have trap representations. There is no wording
> >>> in the C standard to support this, but I would not call this
> >>> idea "fundamentally wrong". You are right that this is different
> >>> to pointer provenance which is about values. What it would
> >>> have in common with pointer provenance is that there is hidden
> >>> state in the abstract machine associated with memory that
> >>> is not part of the representation. With effective types there
> >>> is another example of this.
> >>
> >> I understand that you want to consider a broader topic, and that,
> >> in the realm of that broader topic, something like provenance
> >> could have a role to play. I think it is worth responding to
> >> that thesis, and am expecting to do so in a separate reply (or
> >> new thread?) although probably not right away.
> >
> > I would love to hear your comments, because some people
> > want to have such an abstract notion of "indeterminate" and
> > some already believe that this is how the standard should
> > be understood already today.
> I've been thinking about this, and am close (I think) to having
> something to say in response. Before I do that, though, let me
> ask this: what problem or problems are motivating the question?
> What problems do you (or "some people") want to solve? I don't
> want just examples here; I'm hoping to get a full list.

There are essentially two main interests driving this. First, there
is some interest to precisely formulate the semantics for C.
The provenance proposal came out of this.

Second, there is the issue of safety problems caused by
uninitialized reads, together with compiler support for zero
initialization etc. So there are various people who want to
change the semantics for uninitialized variables completely
in the interest of safety.

So far, there was no consensus in WG14 that the rules should
be changed or what the new rules should be.

Martin

Re: Does reading an uninitialized object have undefined behavior?

<20230818215322.47@kylheku.com>

https://news.novabbs.org/devel/article-flat.php?id=1020&group=comp.std.c#1020

Newsgroups: comp.std.c
Path: rocksolid2!news.neodome.net!weretis.net!feeder8.news.weretis.net!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: 864-117-4973@kylheku.com (Kaz Kylheku)
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Sat, 19 Aug 2023 05:04:06 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 43
Message-ID: <20230818215322.47@kylheku.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com>
<87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com>
<87fs5hnipv.fsf@bsb.me.uk> <87a5vpnegz.fsf@nosuchdomain.example.com>
<86a5uv95g7.fsf@linuxsc.com>
<fcb2be8f-b346-421f-9804-5f94c93266b0n@googlegroups.com>
<864jkz7hrm.fsf@linuxsc.com>
<e043af84-3153-4097-9505-666869fcf727n@googlegroups.com>
<867cpu5h8w.fsf@linuxsc.com> <20230816235712.844@kylheku.com>
<d6d5f930-1943-424f-a572-7d62cfd2bda0n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 19 Aug 2023 05:04:06 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="588828d29d125ee26b6c8f373c219d7a";
logging-data="681351"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/HscedvCvTGwNoSfVXn7kubq6Q9P+dYIg="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:7ertVE9Nk4CB+HXOOU6rKaYeZvU=
 by: Kaz Kylheku - Sat, 19 Aug 2023 05:04 UTC

On 2023-08-18, Martin Uecker <ma.uecker@gmail.com> wrote:
> On Thursday, August 17, 2023 at 9:08:48 AM UTC+2, Kaz Kylheku wrote:
> An implementation does not need a license from the standard
> to diagnose anything. I can already diagnose whatever seems
> useful and this does not affect conformance at all.

That's true about diagnostics at translation time. It's not clear
about diagnostics that happen at run time and are indistinguishable
from the program's output on stdout or stderr.

Also, it might be desirable for it to be conforming to terminate the
program if it has run afoul of the rules.

>> I would like a model of uninitialized data which usefully lends itself
>> to different depths with different trade-offs, like complexity of
>> analysis and use of run-time resources. Limits should be imposed by
>> implementations (what cases they want to diagnose) rather than by the
>> model.
>
> Tools can already do complex analysis and track down use of
> uninitialized variables. But with respect to conformance, I think
> the current standard has very good rules: memcpy/memcmp
> and similar code works as expected. Locally, where a compiler
> can be expected to give good diagnostics via static analysis
> the use of uninitialized variables is UB. But this does not
> spread via pointers elsewhere, where useful diagnostics
> are unlikely and optimizer induced problems based on UB
> might be far more difficult to debug.
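
A small sketch of the memcpy case mentioned in the quoted paragraph,
under the reading of the current standard described there: a partially
initialized object is copied byte by byte, and only the member that was
actually set is read afterwards. The names struct pair and
copy_partially_initialized are hypothetical.

```c
#include <string.h>

struct pair {
    int a;
    int b;
};

/* Copies a struct whose member b was never written. memcpy moves the
   bytes as character data, so the copy itself is expected to work;
   only the initialized member a is read from the destination. */
static int copy_partially_initialized(void)
{
    struct pair src;
    struct pair dst;

    src.a = 42;                      /* src.b is deliberately left unset */
    memcpy(&dst, &src, sizeof dst);  /* copies every byte, b's included  */
    return dst.a;
}
```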

Dynamic instrumentation and tracking makes it possible
for that information to follow pointer data flows, globally
in the program.

E.g. under the Valgrind tool, if one module passes an uninitialized
object into another, and that other one relies on it to make
a conditional branch, it will be diagnosed. You can get the
backtrace of where that object was created as well as where
the use took place.
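
A minimal sketch of that scenario, with two functions standing in for
the two modules. All names (struct record, make_record, describe) are
hypothetical. The initialize parameter exists only so the well-defined
path can be exercised; passing 0 gives the case described above.

```c
#include <stdlib.h>

struct record {
    int ready;
};

/* "Module A": allocates a record and, when initialize is zero,
   hands it out without ever writing to it. */
static struct record *make_record(int initialize)
{
    struct record *r = malloc(sizeof *r);

    if (r != NULL && initialize)
        r->ready = 1;
    return r;
}

/* "Module B": makes a conditional branch on data it received
   through a pointer. */
static const char *describe(const struct record *r)
{
    return r->ready ? "ready" : "not ready";
}
```

Calling describe(make_record(0)) in a program run under Valgrind's
Memcheck should produce a "Conditional jump or move depends on
uninitialised value(s)" report at the branch, and running with
--track-origins=yes adds the allocation site, although an optimizing
compiler may transform the code enough to change what is reported.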

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
