From: Florian Weimer
To: "H.J. Lu"
Cc: "H.J. Lu via Libc-alpha"
Subject: Re: [PATCH] x86-64: Restore LD_PREFER_MAP_32BIT_EXEC support [BZ #28656]
References: <20220801195150.2160919-1-hjl.tools@gmail.com> <87o7x3847o.fsf@oldenburg.str.redhat.com>
Date: Mon, 08 Aug 2022 15:29:08 +0200
In-Reply-To: (H. J. Lu's message of "Fri, 5 Aug 2022 14:53:23 -0700")
Message-ID: <871qtqood7.fsf@oldenburg.str.redhat.com>
List-Id: Libc-alpha mailing list

* H. J. Lu:

> On Tue, Aug 2, 2022 at 1:00 AM Florian Weimer wrote:
>>
>> * H. J. Lu via Libc-alpha:
>>
>> > Crossing 2GB boundaries with indirect calls and jumps can use more
>> > branch prediction resources on several Intel CPUs. There is visible
>> > performance improvement on workloads with many PLT calls when executable
>> > and shared libraries are mmapped below 2GB. Add the Prefer_MAP_32BIT_EXEC
>> > bit so that mmap will try to map executable or denywrite pages with
>> > MAP_32BIT first.
>> >
>> > NB: Prefer_MAP_32BIT_EXEC reduces bits available for address space
>> > layout randomization (ASLR), which is always disabled for SUID programs
>> > and can only be enabled by setting environment variable,
>> > LD_PREFER_MAP_32BIT_EXEC.
>>
>> If the performance benefits are significant, this should be handled at
>> the kernel level. Only the kernel can put the main program, ld.so and
>> the vDSO into the same 2GB window (presumably with the main program at
>> the top, so that the heap can grow almost indefinitely).
>
> ld.so and vDSO aren't performance sensitive. But we need to handle PIE.

I don't think this is necessarily true. It depends on the execution
profile. clock_gettime in the vDSO could certainly matter to some
workloads.

>> For mapping shared objects, we can give the kernel a hint that they will
>> eventually contain an executable mapping. If the kernel could reuse
>> MAP_DENYWRITE for that, no glibc changes would be needed after all.
>>
>> Doing this is in glibc is only a very partial solution, and so I'd
>> appreciate if it could be fixed properly in the kernel.
>>
>
> There is no easy way for kernel to selectively mmap PIE with MAP_32BIT.
> Can ld.so re-exec PIE with "ld.so PIE" so that ld.so can mmap PIE with
> MAP_32BIT?

In theory, yes, but that still leaves the vDSO issue. The kernel could
cover that as well.

Regarding the performance issue, does everything have to be in the first
2 GiB or 4 GiB, or is it sufficient if everything is in the same +/- 2
GiB window?

Thanks,
Florian
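
For illustration, the two placement conditions in the closing question can be
written as address-range predicates. This is only a sketch under stated
assumptions, not glibc or kernel code: the helper names and the WINDOW_2GIB
macro are hypothetical, the "same window" test is one possible reading (both
mappings fit in a single span of at most 2 GiB, anywhere in the address
space), and address arithmetic is assumed not to wrap. Which condition the
affected CPUs actually care about is exactly what the question leaves open.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2 GiB, the reach of a signed 32-bit displacement.  Hypothetical name. */
#define WINDOW_2GIB (UINT64_C (1) << 31)

/* Hypothetical helper: true if [addr, addr + len) lies entirely within
   the first 2 GiB of the address space, which is where MAP_32BIT places
   mappings on x86-64 Linux.  */
static bool
in_first_2gib (uint64_t addr, uint64_t len)
{
  return addr <= WINDOW_2GIB && len <= WINDOW_2GIB - addr;
}

/* Hypothetical helper: true if two mappings together span at most 2 GiB,
   regardless of where that span sits.  Assumes a + a_len and b + b_len
   do not overflow.  */
static bool
in_same_2gib_window (uint64_t a, uint64_t a_len, uint64_t b, uint64_t b_len)
{
  uint64_t lo = a < b ? a : b;
  uint64_t a_end = a + a_len;
  uint64_t b_end = b + b_len;
  uint64_t hi = a_end > b_end ? a_end : b_end;
  return hi - lo <= WINDOW_2GIB;
}
```

Under this reading, two mappings near 0x7f0000000000 that sit 1 MiB apart
satisfy the second predicate but not the first, so the weaker condition
would not require giving up the high address range.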