From: "segher at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/100622] Conversion to smaller unsigned type in loop
Date: Mon, 17 May 2021 22:02:47 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Changed-Fields: bug_status everconfirmed

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100622

Segher Boessenkool changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #5 from Segher Boessenkool ---
(In reply to Thomas Koenig from comment #4)
> Yes, the masking should be only performed at the end.
>
> However, the inner loop could be further simplified to
>
> label:
>         lwzu r8,4(r10)
>         add r3,r8,r3
>         bdnz label
>
> without the need to do anything with r9, so this is probably
> more than one topic in one test case.

Please use -O2 instead, no one will care much about -O1.  You can use
-fno-unroll-loops to make it easier to read.

The core for foo is

.L3:
        lwzu 10,4(9)
        add 3,10,3
        rldicl 3,3,0,32
        bdnz .L3

and for foo2 is

.L10:
        lwzu 10,4(9)
        add 3,3,10
        bdnz .L10

This is this way in Gimple already: the IV is a DImode, while it would be
better as a SImode.  That is the root of the problem here.  Sinking
extensions could well help, but the IV should not be DImode in the first
place!

Confirmed.
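
For readers following along, here is a rough C sketch of why the masking can be moved out of the loop or dropped altogether. It is not the test case from the bug report: the function names are hypothetical, and it assumes the IV in question is the running sum. The first function mimics the foo loop (a 64-bit value re-truncated on every iteration, which is what the rldicl does), the second sinks the truncation to the end, and the third keeps the sum in 32 bits from the start, as in the foo2 loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the foo loop: the sum lives in a 64-bit
   (DImode-like) value and is zero-extended from 32 bits each time. */
uint32_t sum_mask_each_iter(const uint32_t *v, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum = (sum + v[i]) & 0xFFFFFFFFu;   /* per-iteration mask */
    return (uint32_t)sum;
}

/* Same 64-bit sum, but the truncation is sunk to after the loop.
   Unsigned addition wraps, so the low 32 bits are unchanged. */
uint32_t sum_mask_at_end(const uint32_t *v, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return (uint32_t)(sum & 0xFFFFFFFFu);   /* single mask */
}

/* The sum is 32 bits wide (SImode-like) from the start, so no
   explicit masking is needed at all. */
uint32_t sum_narrow(const uint32_t *v, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}
```

All three return the same value for any input, because only the low 32 bits of the sum are ever observed. Whether a given GCC version generates the shorter loop for each variant is not something this sketch claims.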