From: Jakub Jelinek
To: Andrew Stubbs
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libgomp, openmp: pinned memory
Date: Tue, 4 Jan 2022 19:28:29 +0100
Message-ID: <20220104182829.GK2646553@tucnak>
References: <20220104155558.GG2646553@tucnak> <48ee767a-0d90-53b4-ea54-9deba9edd805@codesourcery.com>
In-Reply-To: <48ee767a-0d90-53b4-ea54-9deba9edd805@codesourcery.com>

On Tue, Jan 04, 2022 at 04:58:19PM +0000, Andrew Stubbs wrote:
> > I think perror is the wrong thing to do, omp_alloc etc. has a well-defined
> > interface for what to do in such cases - the allocation should just fail
> > (not be allocated) and, depending on the user's choice, that can be fatal,
> > or return NULL, or chain to some other allocator with other properties etc.
> 
> I did it this way because pinning feels more like an optimization, and
> falling back to "just works" seemed like what users would want to happen.
> The perror was added because it turns out the default ulimit is tiny and I
> wanted to hint at the solution.

Something like perror might be acceptable in GOMP_DEBUG mode, but not in
normal operation, so perhaps use gomp_debug there instead?
If it is just an optimization for the user, they should use chaining to a
corresponding allocator without the pinning, to make it clear what they
want and to stay standard-conforming.
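E.g. something along these lines (just a sketch of the standard OpenMP 5.x
allocator-trait chaining, not part of the patch; the fallback to
omp_default_mem_alloc is only an example choice):

#include <omp.h>
#include <stdio.h>

int
main (void)
{
  /* Request pinned memory, but chain to the default allocator when the
     pinned allocation cannot be satisfied (e.g. because of a low
     ulimit -l).  */
  omp_alloctrait_t traits[3]
    = { { omp_atk_pinned, omp_atv_true },
        { omp_atk_fallback, omp_atv_allocator_fb },
        { omp_atk_fb_data, (omp_uintptr_t) omp_default_mem_alloc } };
  omp_allocator_handle_t a
    = omp_init_allocator (omp_default_mem_space, 3, traits);

  double *p = (double *) omp_alloc (1024 * sizeof (double), a);
  if (p == NULL)
    fprintf (stderr, "allocation failed\n");
  else
    omp_free (p, a);
  omp_destroy_allocator (a);
  return 0;
}

That way the user says explicitly what should happen when pinning fails,
instead of the library guessing for them.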
> > Other issues in the patch are that it doesn't munlock on deallocation and
> > that because of that deallocation we need to figure out what to do on page
> > boundaries.  As documented, mlock can be passed an address and/or address +
> > size that aren't at page boundaries, and pinning happens even for
> > partially touched pages.  But munlock unpins even the partially
> > overlapping pages, and at that point we don't know whether some other
> > pinned allocations live in those pages.
> 
> Right, it doesn't munlock because of these issues. I don't know of any way
> to solve this that wouldn't involve building tables of locked ranges (and
> knowing what the page size is).
> 
> I considered using mmap with the lock flag instead, but the failure mode
> looked unhelpful. I guess we could mmap with the regular flags, then mlock
> after. That should bypass the regular heap and ensure each allocation has
> its own page. I'm not sure what the unintended side-effects of that might
> be.

But the munlock is even more important because of the low ulimit -l: if
munlock isn't done on deallocation, the default limit (64KB, I think) will
be reached even sooner.  And if most users have just a 64KB limit on pinned
memory per process, that most likely calls for grabbing such memory in
whole pages and doing the memory management of that resource ourselves,
because spending that precious memory on partially used pages (which will
most likely also end up holding non-pinned allocations when we only have
16 such pages) is a big waste.

	Jakub
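P.S. For illustration only, a rough sketch of the mmap + mlock direction
mentioned above, assuming each pinned allocation gets its own whole pages
so that unpinning at deallocation can never touch pages shared with another
allocation (the helper names are made up, this is not what the patch does):

#include <sys/mman.h>
#include <unistd.h>
#include <stddef.h>

/* Give every pinned allocation its own anonymous mapping rounded up to
   whole pages, so munmap/munlock of one allocation cannot unpin another.  */

static void *
pinned_alloc_sketch (size_t size)
{
  size_t pagesize = (size_t) sysconf (_SC_PAGESIZE);
  size_t mapsize = (size + pagesize - 1) & ~(pagesize - 1);
  void *p = mmap (NULL, mapsize, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;
  /* mlock fails with ENOMEM when RLIMIT_MEMLOCK (ulimit -l) would be
     exceeded; fail the allocation and let the allocator traits
     (fallback etc.) decide what happens next.  */
  if (mlock (p, mapsize))
    {
      munmap (p, mapsize);
      return NULL;
    }
  return p;
}

static void
pinned_free_sketch (void *p, size_t size)
{
  size_t pagesize = (size_t) sysconf (_SC_PAGESIZE);
  size_t mapsize = (size + pagesize - 1) & ~(pagesize - 1);
  /* munmap implicitly unlocks the pages, and no other allocation can
     live in them.  */
  munmap (p, mapsize);
}

Of course that trades the partial-page problem for one page of overhead per
allocation, which with a 64KB limit is still only 16 allocations, so real
memory management of the pinned pool is probably unavoidable either way.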