From: Jakub Jelinek <jakub@redhat.com>
To: Alexandre Oliva <oliva@adacore.com>,
Jonathan Wakely <jwakely@redhat.com>
Cc: gcc-patches@gcc.gnu.org, libstdc++@gcc.gnu.org
Subject: Re: [PATCH] [PR77760] [libstdc++] encode __time_get_state in tm
Date: Fri, 17 Feb 2023 13:56:32 +0100 [thread overview]
Message-ID: <Y+95gEhz3zQj3ngU@tucnak> (raw)
In-Reply-To: <orh6vkravy.fsf@lxoliva.fsfla.org>
On Fri, Feb 17, 2023 at 04:47:45AM -0300, Alexandre Oliva wrote:
>
> On platforms that fail the ptrtomemfn-cast-to-pfn hack, such as
> arm-*-vxworks*, time_get fails with %I and %p because the state is not
> preserved across do_get calls.
>
> This patch introduces an alternate hack, that encodes the state in
> unused bits of struct tm before calling do_get, extracts them in
> do_get, does the processing, and encodes it back, so that get extracts
> it.
>
> The finalizer is adjusted for idempotence, because both do_get and get
> may call it.
As I said in the PR, I'm worried about this approach, but Jonathan
is the expert on the standard...
My worry is that people often invoke the time_get APIs on uninitialized
struct tm.
https://eel.is/c++draft/locale.time.get#general-1 says that corresponding
members of the struct tm are set, in the past we used to set mostly just
a single tm_* for ultimate format specifiers, with the addition of state
we try to set some further ones as well in the spirit of what strptime
does in glibc. But still, say if the format specifiers only mention
hours/minutes/seconds, we don't try to update year/day/month related stuff
and vice versa. Or say minute format specifier will only update tm_min
and nothing else.
For the encoding in get, the questions are:
1) if it is ok if we clear some normally unused bits of certain tm_*
fields even if they wouldn't be otherwise touched; perhaps nothing
will care, but it is still a user visible change
2) when those tm_* fields we want to encode the state into are unitialized,
don't we effectively invoke UB by using the uninitialized member
(admittedly we just overwrite some bits in it and leave others as is,
but won't that e.g. cause -Wuninitialized warnings)?
More importantly, in the do_get the questions are:
3) while do_get is protected, are we guaranteed that it will never be called
except from the public members of time_get? I mean, if we are called
from the libstdc++ time_get::get, then there is a state and it is already
encoded in struct tm, so those bits are initialized and all is fine;
but if it is called from elsewhere (either a class derived from
time_get which just calls do_get alone on uninitialized struct tm,
or say get in a derived class which calls do_get but doesn't encode the
state, or even mixing of older GCC compiled get with newer GCC compiled
do_get - get can be inlined into user code, while do_get is called
through vtable and so could come up from another shared library), those
struct tm bits could be effectively random and we'd finalize the state
on the random stuff
4) I think we even shouldn't __state._M_finalize_state(__tm); in do_get
if it is nested, then you wouldn't need to tweak src/c++98/locale_facets.cc;
generally state should be finalized once everything is parsed together,
not many times in between (of course if user code will parse stuff
separately that will not work, say get with "%I" in one call, then get
with "%h" in another one etc., but that is user's decision). That
would be another ABI issue, if the _M_finalize_state is called in
do_get and the containing get calls it as well, then depending on which
_M_finalize_state is used at runtime it would either cope well with the
double finalization or not; now, if users never called time_get stuff
on uninitialized struct tm, we could just add another bit to the state,
whether the state has been actually encoded there, and call
_M_finalize_state in do_get only if it hasn't been.
Just curious, why doesn't the pmf hack work on arm-vxworks7?
Sure, clang++ doesn't support it and if one derives a class from time_get
and overrides do_get and/or get things will not work properly either.
Jakub
next prev parent reply other threads:[~2023-02-17 12:56 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-17 7:47 Alexandre Oliva
2023-02-17 12:56 ` Jakub Jelinek [this message]
2023-02-22 14:27 ` Alexandre Oliva
2023-02-23 17:55 ` Alexandre Oliva
2023-02-23 18:21 ` Jonathan Wakely
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y+95gEhz3zQj3ngU@tucnak \
--to=jakub@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jwakely@redhat.com \
--cc=libstdc++@gcc.gnu.org \
--cc=oliva@adacore.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).