From: "H.J. Lu"
Date: Wed, 16 Sep 2020 10:32:18 -0700
Subject: Re: Problem with static const objects and LTO
To: Jeff Law
Cc: GCC Patches

On Wed, Sep 16, 2020 at 10:24 AM Jeff Law wrote:
>
> On 9/16/20 11:13 AM, H.J. Lu wrote:
> > On Wed, Sep 16, 2020 at 10:10 AM Jeff Law wrote:
> >>
> >> On 9/16/20 11:05 AM, H.J. Lu wrote:
> >>> On Wed, Sep 16, 2020 at 9:53 AM Jeff Law via Gcc-patches wrote:
> >>>> Consider a TU with file scoped "static const object utf8_sb_map".  A
> >>>> routine within the TU will stuff &utf8_sb_map into an object, something
> >>>> like:
> >>>>
> >>>> fu (...)
> >>>> {
> >>>>   if (cond)
> >>>>     dfa->sb_char = utf8_sb_map;
> >>>>   else
> >>>>     dfa->sb_char = malloc (...);
> >>>> }
> >>>>
> >>>> There is another routine in the TU which looks like
> >>>>
> >>>> bar (...)
> >>>> {
> >>>>   if (dfa->sb_char != utf8_sb_map)
> >>>>     free (dfa->sb_char);
> >>>> }
> >>>>
> >>>> Now imagine that the TU is compiled (with LTO) into a static library,
> >>>> libgl.a and there's a DSO (libdso.so) which gets linked against libgl.a
> >>>> and references the first routine (fu).  We get a copy of fu in the DSO
> >>>> along with a copy of utf8_sb_map.
> >>>>
> >>>> Then imagine there's a main executable that dynamically links against
> >>>> libdso.so, then links statically against libgl.a.
> >>>> Assume the main executable does not directly reference fu(), but does
> >>>> call a routine in libdso.so which eventually calls fu().  Also assume
> >>>> the main executable directly calls bar().  Again, remember we're
> >>>> compiling with LTO, so we don't suck in the entire TU, just the
> >>>> routines/data we need.
> >>>>
> >>>> In this scenario, both libdso.so and the main executable are going to
> >>>> get a copy of utf8_sb_map and they'll be at different addresses.  The
> >>>> main executable calls into libdso.so, which in turn calls libdso's copy
> >>>> of fu(), which stuffs the address of utf8_sb_map from the DSO into
> >>>> dfa->sb_char.  Later the main executable calls bar() that's in the main
> >>>> executable.  It does the comparison to see if dfa->sb_char is equal to
> >>>> utf8_sb_map -- but it's using the main executable's copy of utf8_sb_map
> >>>> and naturally free() blows up because it was passed a static object, not
> >>>> a malloc'd object.
> >>>>
> >>>> ISTM this is a lot like the problem we have where we inline functions
> >>>> with static data.  To fix those we use STB_GNU_UNIQUE.  But I don't see
> >>>> any code in the C front-end which would utilize STB_GNU_UNIQUE.  Its
> >>>> support seems limited to C++.
> >>>>
> >>>> How is this supposed to work for C?
> >>>>
> >>>> Jeff
> >>>>
> >>> Can you group utf8_sb_map, fu and bar together so that they are defined
> >>> together?
> >> They're all defined within the same TU in gnulib.  It's the LTO
> >> dead/unreachable code elimination that results in just parts of the TU
> >> being copied into the DSO and a different set copied into the main
> >> executable.  In many ways LTO makes this look a lot like the static data
> >> member problems we've had to deal with in the C++ world.
> > In this case, LTO should treat them as in a single group.  Removing
> > one group member should remove the whole group.  Keeping one member
> > should keep the whole group.
>
> Do you mean ensure they're all in a partition together?  I think that
> might work in the immediate term, but is probably brittle in the long
> term.  I'd tend to lean towards forcing these static data objects to be
> STB_GNU_UNIQUE -- that seems more robust to me.

Isn't STB_GNU_UNIQUE binding global?  How does it work with

static const int foo;

and

static const double foo;

in different files?

-- 
H.J.