From: "H.J. Lu"
Date: Mon, 8 Aug 2022 10:02:08 -0700
Subject: Re: [PATCH] x86-64: Restore LD_PREFER_MAP_32BIT_EXEC support [BZ #28656]
To: Florian Weimer
Cc: "H.J. Lu via Libc-alpha"
References: <20220801195150.2160919-1-hjl.tools@gmail.com> <87o7x3847o.fsf@oldenburg.str.redhat.com> <871qtqood7.fsf@oldenburg.str.redhat.com>
In-Reply-To: <871qtqood7.fsf@oldenburg.str.redhat.com>

On Mon, Aug 8, 2022 at 6:29 AM Florian Weimer wrote:
>
> * H. J. Lu:
>
> > On Tue, Aug 2, 2022 at 1:00 AM Florian Weimer wrote:
> >>
> >> * H. J. Lu via Libc-alpha:
> >>
> >> > Crossing 2GB boundaries with indirect calls and jumps can use more
> >> > branch prediction resources on several Intel CPUs.  There is visible
> >> > performance improvement on workloads with many PLT calls when executable
> >> > and shared libraries are mmapped below 2GB.  Add the Prefer_MAP_32BIT_EXEC
> >> > bit so that mmap will try to map executable or denywrite pages with
> >> > MAP_32BIT first.
> >> >
> >> > NB: Prefer_MAP_32BIT_EXEC reduces bits available for address space
> >> > layout randomization (ASLR), which is always disabled for SUID programs
> >> > and can only be enabled by setting environment variable,
> >> > LD_PREFER_MAP_32BIT_EXEC.
> >>
> >> If the performance benefits are significant, this should be handled at
> >> the kernel level.  Only the kernel can put the main program, ld.so and
> >> the vDSO into the same 2GB window (presumably with the main program at
> >> the top, so that the heap can grow almost indefinitely).
> >
> > ld.so and vDSO aren't performance sensitive.  But we need to handle PIE.
>
> I don't think this is necessarily true.  It depends on execution
> profile.

True.

> clock_gettime in the vDSO could certainly matter to some workloads.
>
> >> For mapping shared objects, we can give the kernel a hint that they will
> >> eventually contain an executable mapping.  If the kernel could reuse
> >> MAP_DENYWRITE for that, no glibc changes would be needed after all.
> >>
> >> Doing this in glibc is only a very partial solution, and so I'd
> >> appreciate if it could be fixed properly in the kernel.
> >>
> >
> > There is no easy way for the kernel to selectively mmap PIE with MAP_32BIT.
> > Can ld.so re-exec PIE with "ld.so PIE" so that ld.so can mmap PIE with
> > MAP_32BIT?
>
> In theory, yes, but that still leaves the vDSO issue.  The kernel could
> cover that as well.

Kernel changes may not be easy.  Glibc changes can cover most of the
performance issues.  However, "ld.so PIE" may be difficult to debug.
Is it possible for ld.so to unmap PIE and map PIE with MAP_32BIT?

> Regarding the performance issue, does everything have to be in the first
> 2 GiB or 4 GiB, or is it sufficient if everything is in the same
> +/- 2 GiB window?

This doesn't apply since the issue is with indirect calls and jumps.

-- 
H.J.
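
For readers following the thread, the "try MAP_32BIT first" behavior
described in the quoted patch summary can be sketched roughly as below.
This is an illustrative sketch only, not the code from the patch: the
helper name mmap_prefer_32bit is hypothetical, and the condition used here
(no address hint, PROT_EXEC requested, no MAP_FIXED) is an assumption that
covers only the executable case, not the denywrite one.  On targets
without MAP_32BIT the hint is defined away and the helper behaves like
plain mmap.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

#ifndef MAP_32BIT
/* Assumption: outside x86-64 Linux the flag does not exist, so the
   preference becomes a no-op in this sketch.  */
# define MAP_32BIT 0
#endif

/* Hypothetical helper (not glibc's implementation): for executable
   mappings without an address hint, first ask the kernel for a low
   address via MAP_32BIT; if that fails, retry as an ordinary mmap.  */
static void *
mmap_prefer_32bit (void *addr, size_t len, int prot, int flags,
                   int fd, off_t offset)
{
  if (addr == NULL
      && (prot & PROT_EXEC) != 0
      && (flags & MAP_FIXED) == 0)
    {
      void *p = mmap (NULL, len, prot, flags | MAP_32BIT, fd, offset);
      if (p != MAP_FAILED)
        return p;
      /* Low address space may be exhausted; fall through and let the
         kernel place the mapping anywhere.  */
    }
  return mmap (addr, len, prot, flags, fd, offset);
}
```

A caller would use it exactly like mmap, for example
mmap_prefer_32bit (NULL, len, PROT_READ | PROT_EXEC, MAP_PRIVATE, fd, 0).
Because the low mapping is only a preference, the returned address is not
guaranteed to be below 2GB.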