From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5e75a64c-a8d3-2d2a-162a-a3ea79358b48@suse.de>
Date: Thu, 6 Jan 2022 10:29:29 +0100
Subject: Re: [PATCH] libgomp, OpenMP, nvptx: Low-latency memory allocator
To: Andrew Stubbs, "gcc-patches@gcc.gnu.org"
Cc: Tobias Burnus, Jakub Jelinek
References: <25ad524d-f0d6-1970-b8e9-9b11b6cde68b@codesourcery.com> <42c70624-2b10-340c-8945-601203768d48@suse.de> <664653d3-cf64-b800-6ffb-c27e50dc15bf@suse.de>
From: Tom de Vries

On 1/5/22 15:36, Andrew Stubbs wrote:
> On 05/01/2022 13:04, Tom de Vries wrote:
>> On 1/5/22 12:08, Tom de Vries wrote:
>>> The allocators-1.c test-case doesn't compile because:
>>> ...
>>> FAIL: libgomp.c/allocators-1.c (test for excess errors)
>>> Excess errors:
>>> /home/vries/oacc/trunk/source-gcc/libgomp/testsuite/libgomp.c/allocators-1.c:7:22:
>>> sorry, unimplemented: '    ' clause on 'requires' directive not
>>> supported yet
>>>
>>> UNRESOLVED: libgomp.c/allocators-1.c compilation failed to produce
>>> executable
>>> ...
>>>
>>> So, I suppose I need "[PATCH] OpenMP front-end: allow requires
>>> dynamic_allocators" as well, I'll try again with that applied.
>>
>> After applying that, I get:
>> ...
>> WARNING: program timed out.
>> FAIL: libgomp.c/allocators-2.c execution test
>> WARNING: program timed out.
>> FAIL: libgomp.c/allocators-3.c execution test
>> ...
>
> It works for me.....
>
> Those tests are doing some large number of allocations repeatedly and in
> parallel to stress the atomics. They're also slightly longer running
> than the other tests.
>   - allocators-2 calls omp_alloc 8080 times, over 16 kernel launches,
> some of which will fall back to PTX malloc.

I've minimized the test-case by enabling a single call in main at a
time.  All but the last 4 take about two seconds; the last 4 hang (and
time out at 5min).

So, this already times out for me:
...
int main ()
{
  test (1000, omp_low_lat_mem_alloc);
  return 0;
}
...

I tried varying n: roughly, there's no hang below 100 and a hang above
200, and in between there may or may not be a hang.
Again the same dynamic: if there's no hang, it just takes a few seconds.

>   - allocators-3 calls omp_alloc and omp_free 8 million times each,
> over 8 kernel launches, and takes about a minute to run on my device
> (whether that falls back depends entirely on how the free calls
> interleave).
>
> Either there is a flaw in the concurrency causing some kind of deadlock,
> or else your timeout is set too short for your device. I hope it's the
> latter. We may need to tweak this.

At first glance, the above behaviour doesn't look like a too-short
timeout.

[ FTR, I'm using a GT 1030 with production branch driver version 470.86
(which is one version behind the latest 470.94) ]

Thanks,
- Tom
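
[ Editorial sketch, not part of the patch or the libgomp testsuite: the
hang / no-hang classification described above (a run either finishes in
a few seconds or never finishes) could be automated with a small POSIX
helper that runs a command under a wall-clock limit.  The name
run_with_timeout and its return-value convention are hypothetical, and
it assumes the test has been built as a standalone binary. ]

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Monotonic wall-clock time in seconds.  */
static double
now_sec (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

/* Hypothetical helper (an assumption, not a libgomp or DejaGnu API).
   Runs the NULL-terminated command ARGV and waits at most TIMEOUT_SEC
   seconds for it.  Returns the exit status (0..255) if the command
   exited by itself, -1 if it had to be killed after the timeout (the
   "hang" case), and -2 on any other failure.  */
static int
run_with_timeout (char *const argv[], double timeout_sec)
{
  assert (argv != NULL && argv[0] != NULL);

  pid_t pid = fork ();
  if (pid < 0)
    return -2;
  if (pid == 0)
    {
      execvp (argv[0], argv);
      _exit (127);		/* Only reached if exec failed.  */
    }

  const struct timespec poll_interval = { 0, 50 * 1000 * 1000 }; /* 50 ms */
  double deadline = now_sec () + timeout_sec;
  for (;;)
    {
      int status;
      pid_t r = waitpid (pid, &status, WNOHANG);
      if (r == pid)
	return WIFEXITED (status) ? WEXITSTATUS (status) : -2;
      if (r < 0)
	return -2;
      if (now_sec () >= deadline)
	{
	  /* Treat as a hang: kill the child and reap it.  */
	  kill (pid, SIGKILL);
	  waitpid (pid, &status, 0);
	  return -1;
	}
      nanosleep (&poll_interval, NULL);
    }
}
```

[ With a limit well above the few seconds a good run takes, a return
value of -1 would separate the hanging values of n from the passing ones
without waiting for the 5min testsuite timeout.  That presumes a variant
of the test that takes n from the command line, which the test-case
quoted above does not do. ]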