From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1461)
	id B5EE93858C39; Fri, 13 Jan 2023 17:44:32 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B5EE93858C39
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1673631872;
	bh=TuScQrSYsKIVm77pLLkDq6kmxxD7AoV7rg5VPTolnDY=;
	h=From:To:Subject:Date:From;
	b=CM8Or3idBol3hPNWvrNP8Gl9f8lRWsWIof8dhTSx2ChHv0vbAMdvDD3WOtfKG+6Pn
	 bSA+7r/H7nQ9XLO5cwOnE9q7re7RHqhZDqSo5EkLiLh1s9lMlv8S6BpGd6XPLgQzu0
	 SLeJvpDyBLepDKAK4LczxccgrH9WPok4B8EARFBE=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Andrew Stubbs
To: gcc-cvs@gcc.gnu.org
Subject: [gcc/devel/omp/gcc-12] libgomp, amdgcn: Switch USM to 128-byte alignment
X-Act-Checkin: gcc
X-Git-Author: Andrew Stubbs
X-Git-Refname: refs/heads/devel/omp/gcc-12
X-Git-Oldrev: 2d6fb9681337077a3b639f94ed725245f17bc596
X-Git-Newrev: c9b47ccf32a91d8c851a3e20d1a2f47ed0aaa47e
Message-Id: <20230113174432.B5EE93858C39@sourceware.org>
Date: Fri, 13 Jan 2023 17:44:32 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:c9b47ccf32a91d8c851a3e20d1a2f47ed0aaa47e

commit c9b47ccf32a91d8c851a3e20d1a2f47ed0aaa47e
Author: Andrew Stubbs
Date:   Fri Jan 13 17:38:39 2023 +0000

    libgomp, amdgcn: Switch USM to 128-byte alignment

    This should optimize cache-lines on the AMD GPUs somewhat.

    libgomp/ChangeLog:

            * usm-allocator.c (ALIGN): Use 128-byte alignment.

Diff:
---
 libgomp/usm-allocator.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libgomp/usm-allocator.c b/libgomp/usm-allocator.c
index c45109169ca..68c1ebafec2 100644
--- a/libgomp/usm-allocator.c
+++ b/libgomp/usm-allocator.c
@@ -57,7 +57,8 @@ static int usm_lock = 0;
 static struct usm_splay_tree_s usm_allocations = { NULL };
 static struct usm_splay_tree_s usm_free_space = { NULL };
 
-#define ALIGN(VAR) (((VAR) + 7) & ~7) /* 8-byte granularity. */
+/* 128-byte granularity means GPU cache-line aligned. */
+#define ALIGN(VAR) (((VAR) + 127) & ~127)
 
 /* Coalesce contiguous free space into one entry. This considers the entries
    either side of the root node only, so it should be called each time a new