From: Jiufu Guo
To: Jiufu Guo via Gcc-patches
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org, rguenther@suse.de, jeffreyalaw@gmail.com
Subject: Ping^^: [PATCH V2] extract DF/SF/SI/HI/QI subreg from parameter word on stack
References: <20230104134557.196235-1-guojiufu@linux.ibm.com> <7nilfjlcow.fsf@ltcden2-lp1.aus.stglabs.ibm.com>
Date: Thu, 11 May 2023 09:20:01 +0800
In-Reply-To: <7nilfjlcow.fsf@ltcden2-lp1.aus.stglabs.ibm.com> (Jiufu Guo via Gcc-patches's message of "Thu, 02 Mar 2023 17:33:51 +0800")
Message-ID: <7nr0rnbr5q.fsf_-_@ltcden2-lp1.aus.stglabs.ibm.com>

Hi,

I would like to ping:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609396.html

We know there are a few issues related to aggregate parameters and
returns.  I'm wondering whether it is OK for trunk to use this patch to
resolve part of those issues.

BR,
Jeff (Jiufu)

Jiufu Guo via Gcc-patches writes:

> Hi,
>
> Gentle ping:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609396.html
>
> Thanks for comments and suggestions!
>
> I'm thinking that we may use these patches to fix some of the issues
> on parm and returns.
>
> Sorry for the late ping for this patch to ask if this is acceptable.
>
>
> BR,
> Jeff (Jiufu)
>
> Jiufu Guo writes:
>
>> Hi,
>>
>> This patch fixes an issue with accessing a parameter that has
>> struct type and is passed through integer registers, when a
>> floating-point member of it is accessed.  For example:
>>
>> typedef struct DF {double a[4]; long l; } DF;
>> double foo_df (DF arg){return arg.a[3];}
>>
>> On ppc64le, with trunk gcc, "std 6,-24(1) ; lfd 1,-24(1)" is
>> generated, while the instruction "mtvsrd 1, 6" would be enough for
>> this case.
>>
>> This patch updates the behavior when loading floating-point members
>> of a parameter: if that member is stored via an integer register,
>> then load it in integer mode first, and convert it to floating-point
>> mode.
>>
>> Compared with the previous patch:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608872.html
>> The previous version supports conversion from DImode to DF/SF; this
>> version also supports conversion from DImode to SI/HI/QI modes.
>>
>> I also tried to enhance CSE/DSE for this issue.  But because of their
>> limitations (e.g. CSE does not like new pseudos, DSE is not good
>> across blocks), some cases (such as the one in this patch) cannot be
>> handled.
>>
>> Bootstrap and regtest pass on ppc64{,le}.
>> Is this ok for trunk?  Thanks for comments!
>>
>>
>> BR,
>> Jeff (Jiufu)
>>
>>
>> PR target/108073
>>
>> gcc/ChangeLog:
>>
>>     * expr.cc (extract_subreg_from_loading_word): New function.
>>     (expand_expr_real_1): Call extract_subreg_from_loading_word.
>>
>> gcc/testsuite/ChangeLog:
>>
>>     * g++.target/powerpc/pr102024.C: Updated.
>>     * gcc.target/powerpc/pr108073.c: New test.
>>
>> ---
>>  gcc/expr.cc                                 | 76 +++++++++++++++++++++
>>  gcc/testsuite/g++.target/powerpc/pr102024.C |  2 +-
>>  gcc/testsuite/gcc.target/powerpc/pr108073.c | 30 ++++++++
>>  3 files changed, 107 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108073.c
>>
>> diff --git a/gcc/expr.cc b/gcc/expr.cc
>> index d9407432ea5..6de4a985c8b 100644
>> --- a/gcc/expr.cc
>> +++ b/gcc/expr.cc
>> @@ -10631,6 +10631,69 @@ stmt_is_replaceable_p (gimple *stmt)
>>    return false;
>>  }
>>
>> +/* Return the content of the memory slot SOURCE as MODE.
>> +   SOURCE is based on BASE.  BASE is a memory block that is stored via words.
>> +
>> +   To get the content from SOURCE:
>> +   first load the word from the memory which covers the SOURCE slot first;
>> +   next return the word's subreg which offsets to SOURCE slot;
>> +   then convert to MODE as necessary.
*/
>> +
>> +static rtx
>> +extract_subreg_from_loading_word (machine_mode mode, rtx source, rtx base)
>> +{
>> +  rtx src_base = XEXP (source, 0);
>> +  poly_uint64 offset = MEM_OFFSET (source);
>> +
>> +  if (GET_CODE (src_base) == PLUS && CONSTANT_P (XEXP (src_base, 1)))
>> +    {
>> +      offset += INTVAL (XEXP (src_base, 1));
>> +      src_base = XEXP (src_base, 0);
>> +    }
>> +
>> +  if (!rtx_equal_p (XEXP (base, 0), src_base))
>> +    return NULL_RTX;
>> +
>> +  /* Subreg(DI,n) -> DF/SF/SI/HI/QI */
>> +  poly_uint64 word_size = GET_MODE_SIZE (word_mode);
>> +  poly_uint64 mode_size = GET_MODE_SIZE (mode);
>> +  poly_uint64 byte_off;
>> +  unsigned int start;
>> +  machine_mode int_mode;
>> +  if (known_ge (word_size, mode_size) && multiple_p (word_size, mode_size)
>> +      && int_mode_for_mode (mode).exists (&int_mode)
>> +      && can_div_trunc_p (offset, word_size, &start, &byte_off)
>> +      && multiple_p (byte_off, mode_size))
>> +    {
>> +      rtx word_mem = copy_rtx (source);
>> +      PUT_MODE (word_mem, word_mode);
>> +      word_mem = adjust_address (word_mem, word_mode, -byte_off);
>> +
>> +      rtx word_reg = gen_reg_rtx (word_mode);
>> +      emit_move_insn (word_reg, word_mem);
>> +
>> +      poly_uint64 low_off = subreg_lowpart_offset (int_mode, word_mode);
>> +      if (!known_eq (byte_off, low_off))
>> +        {
>> +          poly_uint64 shift_bytes = known_gt (byte_off, low_off)
>> +                                      ?
byte_off - low_off
>> +                                      : low_off - byte_off;
>> +          word_reg = expand_shift (RSHIFT_EXPR, word_mode, word_reg,
>> +                                   shift_bytes * BITS_PER_UNIT, word_reg, 0);
>> +        }
>> +
>> +      rtx int_subreg = gen_lowpart (int_mode, word_reg);
>> +      if (mode == int_mode)
>> +        return int_subreg;
>> +
>> +      rtx int_mode_reg = gen_reg_rtx (int_mode);
>> +      emit_move_insn (int_mode_reg, int_subreg);
>> +      return gen_lowpart (mode, int_mode_reg);
>> +    }
>> +
>> +  return NULL_RTX;
>> +}
>> +
>>  rtx
>>  expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>                      enum expand_modifier modifier, rtx *alt_rtl,
>> @@ -11812,6 +11875,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>              && modifier != EXPAND_WRITE)
>>            op0 = flip_storage_order (mode1, op0);
>>
>> +        /* Accessing sub-field of struct parameter which passed via integer
>> +           registers.  */
>> +        if (mode == mode1 && TREE_CODE (tem) == PARM_DECL
>> +            && DECL_INCOMING_RTL (tem) && REG_P (DECL_INCOMING_RTL (tem))
>> +            && GET_MODE (DECL_INCOMING_RTL (tem)) == BLKmode && MEM_P (op0)
>> +            && MEM_OFFSET_KNOWN_P (op0))
>> +          {
>> +            rtx subreg
>> +              = extract_subreg_from_loading_word (mode, op0, DECL_RTL (tem));
>> +            if (subreg)
>> +              op0 = subreg;
>> +          }
>> +
>>          if (mode == mode1 || mode1 == BLKmode || mode1 == tmode
>>              || modifier == EXPAND_CONST_ADDRESS
>>              || modifier == EXPAND_INITIALIZER)
>> diff --git a/gcc/testsuite/g++.target/powerpc/pr102024.C b/gcc/testsuite/g++.target/powerpc/pr102024.C
>> index 769585052b5..c8995cae707 100644
>> --- a/gcc/testsuite/g++.target/powerpc/pr102024.C
>> +++ b/gcc/testsuite/g++.target/powerpc/pr102024.C
>> @@ -5,7 +5,7 @@
>>  // Test that a zero-width bit field in an otherwise homogeneous aggregate
>>  // generates a psabi warning and passes arguments in GPRs.
>>
>> -// { dg-final { scan-assembler-times {\mstd\M} 4 } }
>> +// { dg-final { scan-assembler-times {\mmtvsrd\M} 4 } }
>>
>>  struct a_thing
>>  {
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr108073.c b/gcc/testsuite/gcc.target/powerpc/pr108073.c
>> new file mode 100644
>> index 00000000000..91c13d896b7
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr108073.c
>> @@ -0,0 +1,30 @@
>> +/* { dg-do run } */
>> +/* { dg-options "-O2 -save-temps" } */
>> +
>> +typedef struct DF {double a[4]; short s1; short s2; short s3; short s4; } DF;
>> +typedef struct SF {float a[4]; int i1; int i2; } SF;
>> +
>> +/* Each of below function contains one mtvsrd.  */
>> +/* { dg-final { scan-assembler-times {\mmtvsrd\M} 3 {target { has_arch_ppc64 && has_arch_pwr8 } } } } */
>> +/* { dg-final { scan-assembler-not {\mlwz\M} {target { has_arch_ppc64 && has_arch_pwr8 } } } } */
>> +/* { dg-final { scan-assembler-not {\mlhz\M} {target { has_arch_ppc64 && has_arch_pwr8 } } } } */
>> +short __attribute__ ((noipa)) foo_hi (DF a, int flag){if (flag == 2)return a.s2+a.s3;return 0;}
>> +int __attribute__ ((noipa)) foo_si (SF a, int flag){if (flag == 2)return a.i2+a.i1;return 0;}
>> +double __attribute__ ((noipa)) foo_df (DF arg, int flag){if (flag == 2)return arg.a[3];else return 0.0;}
>> +float __attribute__ ((noipa)) foo_sf (SF arg, int flag){if (flag == 2)return arg.a[2]; return 0;}
>> +float __attribute__ ((noipa)) foo_sf1 (SF arg, int flag){if (flag == 2)return arg.a[1];return 0;}
>> +
>> +DF gdf = {{1.0,2.0,3.0,4.0}, 1, 2, 3, 4};
>> +SF gsf = {{1.0f,2.0f,3.0f,4.0f}, 1, 2};
>> +
>> +int main()
>> +{
>> +  if (!(foo_hi (gdf, 2) == 5 && foo_si (gsf, 2) == 3 && foo_df (gdf, 2) == 4.0
>> +       && foo_sf (gsf, 2) == 3.0 && foo_sf1 (gsf, 2) == 2.0))
>> +    __builtin_abort ();
>> +  if (!(foo_hi (gdf, 1) == 0 && foo_si (gsf, 1) == 0 && foo_df (gdf, 1) == 0
>> +       && foo_sf (gsf, 1) == 0 && foo_sf1 (gsf, 1) == 0))
>> +    __builtin_abort ();
>> +  return 0;
>> +}
>> +
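
For readers following the patch, the word / shift / lowpart sequence that
extract_subreg_from_loading_word emits can be modelled in plain C.  This is
only a rough host-side sketch and not code from the patch: the names
extract_field_bits, load_sf_float and host_is_big_endian are hypothetical,
a uint64_t array stands in for the incoming integer registers, and the
host's byte order is assumed to play the role of the target's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the SF struct in the new test: three 64-bit words.  */
typedef struct SF { float a[4]; int i1; int i2; } SF;
_Static_assert (sizeof (SF) == 3 * sizeof (uint64_t),
                "sketch assumes 4-byte float and int");

/* Hypothetical helper: nonzero if the host stores the most significant
   byte first.  */
static int
host_is_big_endian (void)
{
  const uint16_t probe = 1;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 0;
}

/* Hypothetical model of the patch's steps: split OFFSET into a word
   index and a byte offset inside that word, shift the wanted bytes down
   to the low part, then keep only SIZE bytes.  */
static uint64_t
extract_field_bits (const uint64_t *words, size_t offset, size_t size)
{
  const size_t word_size = sizeof (uint64_t);
  /* The field must fit in a word and divide it evenly.  */
  assert (size > 0 && size <= word_size && word_size % size == 0);

  size_t start = offset / word_size;     /* which word holds the field */
  size_t byte_off = offset % word_size;  /* where it sits in that word */
  assert (byte_off % size == 0);

  /* Memory offset of the low SIZE bytes of a word: 0 on little-endian,
     word_size - size on big-endian.  */
  size_t low_off = host_is_big_endian () ? word_size - size : 0;
  size_t shift_bytes = byte_off > low_off ? byte_off - low_off
                                          : low_off - byte_off;

  uint64_t bits = words[start] >> (shift_bytes * 8);
  if (size < word_size)
    bits &= (UINT64_C (1) << (size * 8)) - 1;
  return bits;
}

/* Read arg->a[index] by way of the word array instead of a float load.  */
static float
load_sf_float (const SF *arg, size_t index)
{
  uint64_t words[sizeof (SF) / sizeof (uint64_t)];
  memcpy (words, arg, sizeof words);  /* stand-in for the incoming GPRs */

  uint32_t bits
    = (uint32_t) extract_field_bits (words,
                                     offsetof (SF, a) + index * sizeof (float),
                                     sizeof (float));
  float value;
  memcpy (&value, &bits, sizeof value);  /* reinterpret, like gen_lowpart */
  return value;
}
```

With the SF value from the new test, {{1.0f, 2.0f, 3.0f, 4.0f}, 1, 2},
load_sf_float (&gsf, 1) gives 2.0f: on a little-endian host that is a
right shift of word 0 by 32 bits, on a big-endian host no shift is needed
because a[1] already sits in the low half of the word.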