From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTP id 3BCB13938380 for ; Thu, 13 May 2021 17:01:16 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 3BCB13938380 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-110-M2hG5reqP4SyrQIR3RErVw-1; Thu, 13 May 2021 13:01:07 -0400 X-MC-Unique: M2hG5reqP4SyrQIR3RErVw-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id B22A01926DA4; Thu, 13 May 2021 17:01:05 +0000 (UTC) Received: from tucnak.zalov.cz (ovpn-114-59.ams2.redhat.com [10.36.114.59]) by smtp.corp.redhat.com (Postfix) with ESMTPS id B0DD810016FC; Thu, 13 May 2021 17:01:04 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.16.1/8.16.1) with ESMTPS id 14DH13iu3330061 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 13 May 2021 19:01:03 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.16.1/8.16.1/Submit) id 14DH12Zo3330060; Thu, 13 May 2021 19:01:02 +0200 Date: Thu, 13 May 2021 19:01:02 +0200 From: Jakub Jelinek To: Hongtao Liu , Eric Botcazou , Jeff Law , gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: Re: [PATCH] regcprop: Fix another cprop_hardreg bug [PR100342] Message-ID: <20210513170102.GQ3748@tucnak> Reply-To: Jakub Jelinek References: <20210119144514.GA4020736@tucnak> <20210505174446.GU1179226@tucnak> <20210513153736.GK1179226@tucnak> MIME-Version: 1.0 In-Reply-To: <20210513153736.GK1179226@tucnak> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-6.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 May 2021 17:01:18 -0000 On Thu, May 13, 2021 at 05:37:36PM +0200, Jakub Jelinek wrote: > So, do you want something like (I've deleted the old comment as I think > the new one is enough, but am open to keep both) the patch below, where > it REG_CAN_CHANGE_MODE_P is false, we punt (return), otherwise call > set_value_regno? > Am not sure if those REG_CAN_CHANGE_MODE_P arguments is what you want > though. Oops, missing !, meant following which works on 11 branch for the testcase: 2021-05-13 Jakub Jelinek PR rtl-optimization/100342 * regcprop.c (copy_value): When copying a source reg in a wider mode than it has recorded for the value, adjust recorded destination mode too or punt if !REG_CAN_CHANGE_MODE_P. * gcc.target/i386/pr100342.c: New test. --- gcc/regcprop.c.jj 2021-03-23 10:21:07.176447920 +0100 +++ gcc/regcprop.c 2021-05-13 17:36:46.443192451 +0200 @@ -358,34 +358,25 @@ copy_value (rtx dest, rtx src, struct va else if (sn > hard_regno_nregs (sr, vd->e[sr].mode)) return; - /* It is not safe to link DEST into the chain if SRC was defined in some - narrower mode M and if M is also narrower than the mode of the first - register in the chain. For example: - (set (reg:DI r1) (reg:DI r0)) - (set (reg:HI r2) (reg:HI r1)) - (set (reg:SI r3) (reg:SI r2)) //Should be a new chain start at r3 - (set (reg:SI r4) (reg:SI r1)) - (set (reg:SI r5) (reg:SI r4)) - - the upper part of r3 is undefined. If we added it to the chain, - it may be used to replace r5, which has defined upper bits. - See PR98694 for details. - - [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src)) - [B] partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode) - Condition B is added to to catch optimization opportunities of - - (set (reg:HI R1) (reg:HI R0)) - (set (reg:SI R2) (reg:SI R1)) // [A] - (set (reg:DI R3) (reg:DI R2)) // [A] - (set (reg:SI R4) (reg:SI R[0-3])) - (set (reg:HI R5) (reg:HI R[0-4])) - - in which all registers have only 16 defined bits. */ - else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src)) - && partial_subreg_p (vd->e[sr].mode, - vd->e[vd->e[sr].oldest_regno].mode)) - return; + /* If a narrower value is copied using wider mode, the upper bits + are undefined (could be e.g. a former paradoxical subreg). Signal + in that case we've only copied value using the narrower mode. + Consider: + (set (reg:DI r14) (mem:DI ...)) + (set (reg:QI si) (reg:QI r14)) + (set (reg:DI bp) (reg:DI r14)) + (set (reg:DI r14) (const_int ...)) + (set (reg:DI dx) (reg:DI si)) + (set (reg:DI si) (const_int ...)) + (set (reg:DI dx) (reg:DI bp)) + The last set is not redundant, while the low 8 bits of dx are already + equal to low 8 bits of bp, the other bits are undefined. */ + else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))) + { + if (!REG_CAN_CHANGE_MODE_P (sr, GET_MODE (src), vd->e[sr].mode)) + return; + set_value_regno (dr, vd->e[sr].mode, vd); + } /* Link DR at the end of the value chain used by SR. */ --- gcc/testsuite/gcc.target/i386/pr100342.c.jj 2021-05-13 17:28:41.181460465 +0200 +++ gcc/testsuite/gcc.target/i386/pr100342.c 2021-05-13 17:28:41.181460465 +0200 @@ -0,0 +1,70 @@ +/* PR rtl-optimization/100342 */ +/* { dg-do run { target int128 } } */ +/* { dg-options "-O2 -fno-dse -fno-forward-propagate -Wno-psabi -mno-sse2" } */ + +#define SHL(x, y) ((x) << ((y) & (sizeof(x) * 8 - 1))) +#define SHR(x, y) ((x) >> ((y) & (sizeof(x) * 8 - 1))) +#define ROR(x, y) (SHR(x, y)) | (SHL(x, (sizeof(x) * 8 - (y)))) +#define SHLV(x, y) ((x) << ((y) & (sizeof((x)[0]) * 8 - 1))) +#define SHLSV(x, y) ((x) << ((y) & (sizeof((y)[0]) * 8 - 1))) +typedef unsigned char A; +typedef unsigned char __attribute__((__vector_size__ (8))) B; +typedef unsigned char __attribute__((__vector_size__ (16))) C; +typedef unsigned char __attribute__((__vector_size__ (32))) D; +typedef unsigned char __attribute__((__vector_size__ (64))) E; +typedef unsigned short F; +typedef unsigned short __attribute__((__vector_size__ (16))) G; +typedef unsigned int H; +typedef unsigned int __attribute__((__vector_size__ (32))) I; +typedef unsigned long long J; +typedef unsigned long long __attribute__((__vector_size__ (8))) K; +typedef unsigned long long __attribute__((__vector_size__ (32))) L; +typedef unsigned long long __attribute__((__vector_size__ (64))) M; +typedef unsigned __int128 N; +typedef unsigned __int128 __attribute__((__vector_size__ (16))) O; +typedef unsigned __int128 __attribute__((__vector_size__ (32))) P; +typedef unsigned __int128 __attribute__((__vector_size__ (64))) Q; +B v1; +D v2; +L v3; +K v4; +I v5; +O v6; + +B +foo (A a, C b, E c, F d, G e, H f, J g, M h, N i, P j, Q k) +{ + b &= (A) f; + k += a; + G l = e; + D m = v2 >= (A) (J) v1; + J r = a + g; + L n = v3 <= f; + k -= i / f; + l -= (A) g; + c |= (A) d; + b -= (A) i; + J o = ROR (__builtin_clz (r), a); + K p = v4 | f, q = v4 <= f; + P s = SHLV (SHLSV (__builtin_bswap64 (i), (P) (0 < j)) <= 0, j); + n += a <= r; + M t = (M) (a / SHLV (c, 0)) != __builtin_bswap64 (i); + I u = f - v5; + E v = (E) h + (E) t + (E) k; + D w = (union { D b[2]; }) { }.b[0] + ((union { E b; }) v).b[1] + m + (D) u + (D) n + (D) s; + C x = ((union { D b; }) w).b[1] + b + (C) l + (C) v6; + B y = ((union { C a; B b; }) x).b + ((union { C a; B b[2]; }) x).b[1] + (B) p + (B) q; + J z = i + o; + F z2 = z; + A z3 = z2; + return y + z3; +} + +int +main () +{ + B x = foo (0, (C) { }, (E) { }, 10, (G) { }, 4, 2, (M) { }, 123842323652213865LL, (P) { 1 }, (Q) { }); + if ((J) x != 0x2e2c2e2c2e2c2e30ULL) + __builtin_abort(); + return 0; +} Jakub