From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa2.mentor.iphmx.com (esa2.mentor.iphmx.com [68.232.141.98]) by sourceware.org (Postfix) with ESMTPS id 1D0273858D1E for ; Sat, 19 Nov 2022 10:46:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1D0273858D1E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.96,176,1665475200"; d="scan'208";a="87597089" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa2.mentor.iphmx.com with ESMTP; 19 Nov 2022 02:46:39 -0800 IronPort-SDR: EawDMMH83JTV5AxOFygiw7sZ5+JbymowQ+5qyMr9Hm1GQ7wVfhZlenBVttCSC1t4ENxDDEa2B4 sWuG0pxRayTO7rciHAiBIc2JY40ndi8PYQ3X/nM+AO+e2n2+Efvyo72II9ib4uBOdFIPNVkP6w uYuQNjN+Zl3aYPZvJRy4c+uS83gG0fXKfLQBsf9O5wD6/4vQsLZl2elHGhg8fuvH6jdD1Eqsag HXrSaI05LrheViwzmDFP+n8AyElx6TRjnLGNuGV6zZnB50HLJ0BRlZHZnPu9g6Em5+UG/jFuMo Adk= Message-ID: Date: Sat, 19 Nov 2022 11:46:32 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.5.0 Subject: Re: [Patch] gcn: Add __builtin_gcn_{get_stack_limit,first_call_this_thread_p} Content-Language: en-US To: Andrew Stubbs , gcc-patches References: <1bec26d6-e2c5-3408-4f61-0fb17e730b3e@codesourcery.com> From: Tobias Burnus In-Reply-To: Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: quoted-printable X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) To svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) X-Spam-Status: No, score=-5.4 required=5.0 tests=BAYES_00,HEADER_FROM_DIFFERENT_DOMAINS,KAM_DMARC_STATUS,NICE_REPLY_A,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 18.11.22 18:49, Andrew Stubbs wrote: > On 18/11/2022 17:20, Tobias Burnus wrote: > > This looks wrong: > >> + /* stackbase =3D (stack_segment_decr & 0x0000ffffffffffff) >> + + stack_wave_offset); >> + seg_size =3D dispatch_ptr->private_segment_size; >> + stacklimit =3D stackbase + seg_size*64; (this should be '*seg_size' not 'seg_size' and the name should be s/seg_size/seg_size_ptr/.) >> + with segsize =3D dispatch_ptr + 6*sizeof(int16_t) + >> 3*sizeof(int32_t); >> + cf. struct hsa_kernel_dispatch_packet_s in the HSA doc. */ >> + rtx ptr; >> + if (cfun->machine->args.reg[DISPATCH_PTR_ARG] >=3D 0 >> + && cfun->machine->args.reg[PRIVATE_SEGMENT_BUFFER_ARG] >=3D 0) >> + { >> + rtx size_rtx =3D gen_rtx_REG (DImode, >> + cfun->machine->args.reg[DISPATCH_PTR_ARG]); >> + size_rtx =3D gen_rtx_MEM (DImode, >> + gen_rtx_PLUS (DImode, size_rtx, >> + GEN_INT (6*16 + 3*32))); >> + size_rtx =3D gen_rtx_MULT (DImode, size_rtx, GEN_INT (64)); >> + (Reading it, I think it should be '..._MEM(SImode,' and '..._MULT(SImode' instead of DImode.) > seg_size is calculated from the private_segment_size loaded from the > dispatch_ptr, not calculated from the dispatch_ptr itself. Isn't this what thee code tries to do? Namely: My understanding is that dispatch_ptr->private_segment_size =3D=3D *((char*)dispatch_ptr + 192) And the latter is what I attempt to do. I have a very limited knowledge of insn/rtx/RTL and of GCN assemply; thus, I likely have done something stupid. Having said this, Here is what I get: (Where asm("s4") =3D=3D dispatch_ptr) s_add_u32 s2, s4, 192 s_addc_u32 s3, s5, 0 v_writelane_b32 v4, s2, 0 v_writelane_b32 v5, s3, 0 s_mov_b64 exec, 1 flat_load_dwordx2 v[4:5], v[4:5] s_waitcnt 0 v_lshlrev_b64 v[4:5], 6, v[4:5] v_readlane_b32 s2, v4, 0 v_readlane_b32 s3, v5, 0 Not that I really understand every line, but at a glance it looks okay. The 192 is because of (quoting newlib/libc/machine/amdgcn/getreent.c): typedef struct hsa_kernel_dispatch_packet_s { uint16_t header ; uint16_t setup; uint16_t workgroup_size_x ; uint16_t workgroup_size_y ; uint16_t workgroup_size_z; uint16_t reserved0; uint32_t grid_size_x ; uint32_t grid_size_y ; uint32_t grid_size_z; uint32_t private_segment_size; i.e. 6*16 + 3*32 =3D 192 =E2=80=93 and we want to read a 32bit unsigned int= . * * * Admittedly, there is probably something not quite right as I see with gfx90= 8 # of expected passes 27476 # of unexpected failures 317 where 317 FAIL comes from 88 testcase files. That's not a a very high number but more than the usual fails, which shows = that something is not quite right. * * * I am pretty sure that I missed something - but the question is what. I hope you can help me pinpoint the place where it goes wrong. Thanks, Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstra=C3=9Fe 201= , 80634 M=C3=BCnchen; Gesellschaft mit beschr=C3=A4nkter Haftung; Gesch=C3= =A4ftsf=C3=BChrer: Thomas Heurung, Frank Th=C3=BCrauf; Sitz der Gesellschaf= t: M=C3=BCnchen; Registergericht M=C3=BCnchen, HRB 106955