From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp.gentoo.org (woodpecker.gentoo.org [140.211.166.183]) by sourceware.org (Postfix) with ESMTP id 4DA503858D20 for ; Sun, 12 Nov 2023 09:33:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4DA503858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gentoo.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gentoo.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 4DA503858D20 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=140.211.166.183 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699781626; cv=none; b=bZqeShZMw4Pui2a91KF8/AZevrTKrogadNnMwWM6DUU/Hfr0BN8/mZOcixANfrjBf/EHtDAWuAZ6cEPNfjLg3kwYORTB+Yan8UFIffauWOJxgIt0BNGR1f47vSXTfNUWiYisru6pVv0LUyn/Q5zFjMbRVTrHI75c28kzmijQayg= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699781626; c=relaxed/simple; bh=N9Hyzc6SWDs0sOwkr1FSDdzLako+0/GC0P7Ofee7WVE=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=iyYAw5NIReAMDyxOPWcrCe1PxgXHFLeJP/RcLIsmEAgBUvuE8f6JKINXVur2yZSz7I75f0s4GVOwWbJSMXRgscKDgh36S9lUTK6qZrdmTAVXitIn147Syt9TcCp2sK3MIyoEsc9N/9xNO+tSc6ghIi/o4QK5D5Nby8nUfGKdMMc= ARC-Authentication-Results: i=1; server2.sourceware.org References: <20231024141124.210708-1-exactlywb@ispras.ru> <87zfzj6hvx.fsf@gentoo.org> User-agent: mu4e 1.10.8; emacs 30.0.50 From: Sam James To: Alexander Monakov Cc: Sam James , Daniil Frolov , Arsen =?utf-8?Q?Arsenovi=C4=87?= , gcc-patches@gcc.gnu.org Subject: Re: [RFC PATCH] Detecting lifetime-dse issues via Valgrind [PR66487] Date: Sun, 12 Nov 2023 09:05:21 +0000 Organization: Gentoo In-reply-to: Message-ID: <87zfzj48p9.fsf@gentoo.org> MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-4.0 required=5.0 tests=BAYES_00,JMQ_SPF_NEUTRAL,KAM_DMARC_STATUS,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_PASS,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Alexander Monakov writes: > On Sat, 11 Nov 2023, Sam James wrote: > >> > Valgrind client requests are offered as macros that emit inline asm. For use >> > in code generation, we need to wrap it in a built-in. We know that implementing >> > such a built-in in libgcc is undesirable, [...]. >> >> Perhaps less objectionable than you think, at least given the new CFR >> stuff from oliva from the other week that landed. > > Yeah; we haven't found any better solution anyway. > >> This is a really neat idea (it also makes me wonder if there's any other >> opportunities for Valgrind integration like this?). > > To (attempt to) answer the parenthetical question, note that the patch is not > limited to instrumenting C++ cdtors, it annotates all lifetime CLOBBER marks, > so Valgrind should see lifetime boundaries of various on-stack arrays too. Oh, right! > > (I hope positioning the new pass after build_ssa is sufficient to avoid > annotating too much, like non-address-taken local scalars) > >> LLVM was the most recent example but it wasn't the first, and this has >> come up with LLVM in a few other places too (same root cause, wasn't >> obvious at all). > > I'm very curious what you mean by "this has come up with LLVM [] too": ttbomk, > LLVM doesn't do such lifetime-based optimization yet, which is why compiling > LLVM with LLVM doesn't break it. Can you share some examples? Or do you mean > instances when libLLVM-miscompiled-with-GCC was linked elsewhere, and that > program crashed mysteriously as a result? > > Indeed this work is inspired by the LLVM incident in PR 106943. For that part, I meant that we had a _lot_ of different variations of the LLVM miscompilation over the years where people would report issues when building LLVM with GCC but trying to investigate it didn't get very far. i.e. It was very hard to debug and something like this would've made things substantially easier (it takes us from the realm of "uhh maybe compiler bug" to provable UB, which we're very used to handling). It doesn't help that the fact that ubsan and friends can't find this means it's often not-determined and may be worked around with either disabling LTO or disabling various passes as a heavy hammer. I didn't think to try toggling it when I hit issues until PR 106943. > we don't see many other instances with -flifetime-dse workarounds in public. > Grepping Gentoo Portage reveals only openjade. Arch applies the workaround to > a jvm package too, and we know that Firefox and LLVM apply it on their own. > I had some vague memories in the back of my head so I went digging because I enjoy this: * Qt has hit this in the past, and I actually wonder if it's the cause of a few other nebulous LTO issues (which we've never had a solid report for) as well: https://codereview.qt-project.org/c/qt/qtdeclarative/+/176272 https://bugs.gentoo.org/584818 https://bugs.gentoo.org/626070 We've given up for Qt 5 and I plan on revisiting any possible issues with LTO with Qt 6 instead. * Firefox as you noted: https://bugzilla.mozilla.org/show_bug.cgi?id=1232696. * TBB (no real details available, unfortunately): https://github.com/oneapi-src/oneTBB/commit/51c0b2f742920535178560f31c6e91065bf87b41 * crypto++ has had a series of mysterious miscompilations and I suspect it may be a victim here. See https://github.com/weidai11/cryptopp/issues/1141#issuecomment-1208169530 onwards but it's not the first bug like this crypto++ has had. * codeblocks: https://bugs.gentoo.org/625696 * coin: https://bugs.gentoo.org/619378 (I would not be surprised if other wxwidgets applications are victims, e.g. older kicad or audacity.) Some of the results for GCC 6 at https://bugs.gentoo.org/showdependencytree.cgi?id=582084&hide_resolved=0 were from other optimisation changes, but some are DSE (and some might have been DSE but masked by another flag as I mentioned earlier). If someone is extremely bored, they may want to look through our LTO tracker / various -fno-lto/filter-lto grep results in gentoo.git to see if any of them seem fishy. In the past, people would filter LTO without examining the real problem. I am trying to fix this but you can't do that overnight. Another good heuristic, I bet, is anything passing -O1, filtering -O3, or if you want a general sense of crustiness, where the ebuild has to pass -std=c++98 or so. But I'm just trying to give you fodder if you want more examples with that last suggestion. With less detail, I have mentions for the following in some git checkouts from other distributions. I could not find any bug references: * injeqt (Void Linux) * opencollada (Void Linux) > This patch finds the issue in LLVM and openjade; testing it on Spidermonkey > is TODO. Suggestions for other interesting tests would be welcome. > >> > --- a/libgcc/configure.ac >> > +++ b/libgcc/configure.ac >> > @@ -269,6 +269,54 @@ GCC_CHECK_SJLJ_EXCEPTIONS >> > GCC_CET_FLAGS(CET_FLAGS) >> > AC_SUBST(CET_FLAGS) >> > >> > +AC_CHECK_HEADER(valgrind.h, have_valgrind_h=yes, have_valgrind_h=no) >> >> Consider using PKG_CHECK_MODULES and falling back to a manual search. > > Thanks. autotools bits in this patch are one-to-one copy of the pre-existing > Valgrind detection in the 'gcc' subdirectory where it's necessary for > running the compiler under Valgrind without false positives. > > I guess the right solution is to move Valgrind detection into the top-level > 'config' directory (and apply the cleanups you mention), but as we are not > familiar with autotools we just made the copy-paste for this RFC. ack > > With the patch, --enable-valgrind-annotations becomes "overloaded" to > simultaneously instrument the compiler and enhance libgcc to support > -fvalgrind-emit-annotations, but those are independent and in practice > people may need the latter without the former. Yeah, I did notice this and wasn't sure how I felt about it, but I didn't want to bikeshed at this point. I don't think it matters that much given you need to build with a flag for them to be emitted anyway. > > Alexander thanks, sam