From: "rguenther at suse dot de"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2
Date: Tue, 31 Jan 2023 14:33:08 +0000
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: tnfchris at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 13.0

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583

--- Comment #16 from rguenther at suse dot de ---
On Tue, 31 Jan 2023, tnfchris at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583
>
> --- Comment #15 from Tamar Christina ---
> > OK, hopefully I understand now.  Sorry for being slow.
>
> Not at all, Sorry if it came across a bit cranky, it wasn't meant that way!
>
> > If that's the condition we want to test for, it seems like something
> > we need to check in the vectoriser rather than the hook.  And it's
> > not something we can easily do in the vector form, since we don't
> > track ranges for vectors (AFAIK).
>
> Ack, that also tracks with what I tried before, we don't indeed track ranges
> for vector ops. The general case can still be handled slightly better (I think)
> but it doesn't become as clear of a win as this one.
>
> > You probably did so elsewhere some time ago, but what exactly are those
> > four instructions?  (pointers to specifications appreciated)
>
> For NEON we use:
> https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/ADDHN--ADDHN2--Add-returning-High-Narrow-

so that's an add + pack high

> https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/UADDW--UADDW2--Unsigned-Add-Wide-

and that unpacks (zero-extends) the high/low part of one operand of an add

I wonder, if we open-coded the pack / unpack and used a regular add,
whether combine could synthesize uaddw and addhn?  The pack and unpack
would be vec_perms on GIMPLE (plus V_C_E).

> In that order, and for SVE we use two
> https://developer.arm.com/documentation/ddi0602/2022-12/SVE-Instructions/ADDHNB--Add-narrow-high-part--bottom--

probably similar.

So the difficulty here will be to decide whether that's in the end
better than what the pattern handling code does now, right?  Because
I think most targets will be able to do the above, but lacking the
special adds it will be slower because of the extra packing/unpacking?

That said, can we possibly do just that costing (would be a first
in the pattern code I guess) with a target hook?  Or add optabs for
the addh operations so we can query support?
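
For reference, a rough per-lane scalar model of the two NEON instructions linked above, for the 16-bit to 8-bit case. This is only a sketch of the semantics as I read the Arm descriptions (ADDHN: add, keep the high half; UADDW: zero-extend the narrow operand, add to the wide one). The function names are made up for illustration; none of this is GCC code or the Arm pseudocode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One lane of ADDHN on 16-bit inputs (hypothetical helper name):
   add modulo 2^16, then keep only the high 8 bits ("pack high"). */
static uint8_t addhn_lane(uint16_t a, uint16_t b)
{
    uint16_t sum = (uint16_t)(a + b);   /* regular add, wraps at 16 bits */
    return (uint8_t)(sum >> 8);         /* narrow to the high half */
}

/* One lane of UADDW (hypothetical helper name): zero-extend the
   narrow operand ("unpack") and add it to the wide operand, modulo 2^16. */
static uint16_t uaddw_lane(uint16_t wide, uint8_t narrow)
{
    return (uint16_t)(wide + (uint16_t)narrow);
}

/* The open-coded form: a regular add followed by an explicit pack of
   the high halves, written as a loop over lanes. */
static void addhn_open_coded(const uint16_t *a, const uint16_t *b,
                             uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint16_t sum = (uint16_t)(a[i] + b[i]);
        out[i] = (uint8_t)(sum >> 8);
    }
}

/* Small self-check tying the three helpers together. */
void addhn_uaddw_self_check(void)
{
    const uint16_t a[2] = { 0x12ff, 0xffff };
    const uint16_t b[2] = { 0x0001, 0x0001 };
    uint8_t out[2];

    addhn_open_coded(a, b, out, 2);
    assert(out[0] == addhn_lane(a[0], b[0]));
    assert(out[1] == addhn_lane(a[1], b[1]));
    assert(uaddw_lane(0x0100, 0xff) == 0x01ff);
}
```

On a target without the fused instructions, the shift/narrow and the zero-extend are separate steps, which is the extra packing/unpacking cost in question.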