From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id D58143858C31 for ; Fri, 17 Feb 2023 12:56:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D58143858C31 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1676638598; h=from:from:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:in-reply-to:in-reply-to: references:references; bh=FJY37c2jop8OE0F82smg+ejpGGDMFyEisbK5C0Kpv7A=; b=AfV0lRj9qxXtEvZSrRR+jhtiIiG9fMmTVceSEhH1cbChQvasdZcumR/NnMaM6LB28oTyPk gMfNsZZ3W4eczFm6RpdgM50CscbrP19VwdNPXYZ/5RYi1QQ9fgNqo6hnLpUQl0q9GS7hY+ 3DapcrAdicdDrcJGm/9AUgeux7H5bPY= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-179-ntzsSzGvOvaCzFvzkJ--3g-1; Fri, 17 Feb 2023 07:56:37 -0500 X-MC-Unique: ntzsSzGvOvaCzFvzkJ--3g-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id BB98D811E6E; Fri, 17 Feb 2023 12:56:36 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.39.192.211]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 78491492C14; Fri, 17 Feb 2023 12:56:36 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 31HCuXAw2923178 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Fri, 17 Feb 2023 13:56:34 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 31HCuXVh2923175; Fri, 17 Feb 2023 13:56:33 +0100 Date: Fri, 17 Feb 2023 13:56:32 +0100 From: Jakub Jelinek To: Alexandre Oliva , Jonathan Wakely Cc: gcc-patches@gcc.gnu.org, libstdc++@gcc.gnu.org Subject: Re: [PATCH] [PR77760] [libstdc++] encode __time_get_state in tm Message-ID: Reply-To: Jakub Jelinek References: MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-3.5 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, Feb 17, 2023 at 04:47:45AM -0300, Alexandre Oliva wrote: > > On platforms that fail the ptrtomemfn-cast-to-pfn hack, such as > arm-*-vxworks*, time_get fails with %I and %p because the state is not > preserved across do_get calls. > > This patch introduces an alternate hack, that encodes the state in > unused bits of struct tm before calling do_get, extracts them in > do_get, does the processing, and encodes it back, so that get extracts > it. > > The finalizer is adjusted for idempotence, because both do_get and get > may call it. As I said in the PR, I'm worried about this approach, but Jonathan is the expert on the standard... My worry is that people often invoke the time_get APIs on uninitialized struct tm. https://eel.is/c++draft/locale.time.get#general-1 says that corresponding members of the struct tm are set, in the past we used to set mostly just a single tm_* for ultimate format specifiers, with the addition of state we try to set some further ones as well in the spirit of what strptime does in glibc. But still, say if the format specifiers only mention hours/minutes/seconds, we don't try to update year/day/month related stuff and vice versa. Or say minute format specifier will only update tm_min and nothing else. For the encoding in get, the questions are: 1) if it is ok if we clear some normally unused bits of certain tm_* fields even if they wouldn't be otherwise touched; perhaps nothing will care, but it is still a user visible change 2) when those tm_* fields we want to encode the state into are unitialized, don't we effectively invoke UB by using the uninitialized member (admittedly we just overwrite some bits in it and leave others as is, but won't that e.g. cause -Wuninitialized warnings)? More importantly, in the do_get the questions are: 3) while do_get is protected, are we guaranteed that it will never be called except from the public members of time_get? I mean, if we are called from the libstdc++ time_get::get, then there is a state and it is already encoded in struct tm, so those bits are initialized and all is fine; but if it is called from elsewhere (either a class derived from time_get which just calls do_get alone on uninitialized struct tm, or say get in a derived class which calls do_get but doesn't encode the state, or even mixing of older GCC compiled get with newer GCC compiled do_get - get can be inlined into user code, while do_get is called through vtable and so could come up from another shared library), those struct tm bits could be effectively random and we'd finalize the state on the random stuff 4) I think we even shouldn't __state._M_finalize_state(__tm); in do_get if it is nested, then you wouldn't need to tweak src/c++98/locale_facets.cc; generally state should be finalized once everything is parsed together, not many times in between (of course if user code will parse stuff separately that will not work, say get with "%I" in one call, then get with "%h" in another one etc., but that is user's decision). That would be another ABI issue, if the _M_finalize_state is called in do_get and the containing get calls it as well, then depending on which _M_finalize_state is used at runtime it would either cope well with the double finalization or not; now, if users never called time_get stuff on uninitialized struct tm, we could just add another bit to the state, whether the state has been actually encoded there, and call _M_finalize_state in do_get only if it hasn't been. Just curious, why doesn't the pmf hack work on arm-vxworks7? Sure, clang++ doesn't support it and if one derives a class from time_get and overrides do_get and/or get things will not work properly either. Jakub