From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id B4A993858C56; Mon, 28 Mar 2022 10:15:40 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B4A993858C56
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/105075] [nvptx] Generate sad insn (sum of absolute
 differences)
Date: Mon, 28 Mar 2022 10:15:40 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-105075-4-OPKNe3T3FU@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-105075-4@http.gcc.gnu.org/bugzilla/>
References: <bug-105075-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Mon, 28 Mar 2022 10:15:40 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105075
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
@cindex @code{ssad@var{m}} instruction pattern
@item @samp{ssad@var{m}}
@cindex @code{usad@var{m}} instruction pattern
@item @samp{usad@var{m}}
Compute the sum of absolute differences of two signed/unsigned elements.
Operand 1 and operand 2 are of the same mode. Their absolute difference, which
is of a wider mode, is computed and added to operand 3. Operand 3 is of a mode
equal or wider than the mode of the absolute difference. The result is placed
in operand 0, which is of the same mode as operand 3.


That crucially "misses" a detail for the vector case, where the sum will
also sum across (unspecified!) lanes when operand 3 is wider than the
absolute difference and has a lower number of lanes than the input vectors.
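
To illustrate the lane problem, here is a minimal C sketch, not GCC code.
It models a hypothetical usad with 8 unsigned 8-bit input lanes and 2
32-bit accumulator lanes. The lane counts, the function names and the two
groupings are assumptions chosen for illustration; the documentation text
quoted above rules out neither grouping.

```c
#include <stdint.h>

#define NIN  8  /* input lanes (assumed for illustration) */
#define NOUT 2  /* accumulator lanes (assumed for illustration) */

/* Widened absolute difference of two unsigned 8-bit elements. */
static uint32_t absdiff_u8(uint8_t a, uint8_t b)
{
  return a > b ? (uint32_t)(a - b) : (uint32_t)(b - a);
}

/* Grouping A: adjacent input lanes feed the same output lane. */
static void usad_adjacent(uint32_t out[NOUT], const uint8_t a[NIN],
                          const uint8_t b[NIN], const uint32_t acc[NOUT])
{
  for (int j = 0; j < NOUT; j++)
    out[j] = acc[j];
  for (int i = 0; i < NIN; i++)
    out[i / (NIN / NOUT)] += absdiff_u8(a[i], b[i]);
}

/* Grouping B: input lanes are distributed round-robin (interleaved). */
static void usad_interleaved(uint32_t out[NOUT], const uint8_t a[NIN],
                             const uint8_t b[NIN], const uint32_t acc[NOUT])
{
  for (int j = 0; j < NOUT; j++)
    out[j] = acc[j];
  for (int i = 0; i < NIN; i++)
    out[i % NOUT] += absdiff_u8(a[i], b[i]);
}

/* Full reduction of the accumulator lanes to a single scalar. */
static uint32_t lane_total(const uint32_t v[NOUT])
{
  uint32_t sum = 0;
  for (int j = 0; j < NOUT; j++)
    sum += v[j];
  return sum;
}
```

With a = {10, 0, 5, 7, 1, 2, 3, 4}, b all zero and a zero accumulator,
grouping A yields lanes {22, 10} and grouping B yields {19, 13}. Only the
fully reduced total (32) agrees, which is why the pattern is only well
defined when the lanes end up reduced to a single scalar.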

The unspecified part makes it a hard fit for pattern matching (unrolled)
code when actual output lanes are used and they are not being reduced to
a single scalar in the end.

For scalar instruction matching the patterns should be usable.
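
For the scalar case there are no lanes, so the documented semantics are
unambiguous. A minimal C sketch follows, assuming 8-bit inputs and a
32-bit accumulator; the function names and the chosen widths are
hypothetical and are not taken from GCC or from this bug.

```c
#include <stdint.h>

/* Model of a scalar usad: op0 = op3 + |op1 - op2|, with the absolute
   difference computed in the wider type.  */
static uint32_t usad_scalar(uint8_t op1, uint8_t op2, uint32_t op3)
{
  uint32_t diff = op1 > op2 ? (uint32_t)op1 - op2 : (uint32_t)op2 - op1;
  return op3 + diff;
}

/* Model of a scalar ssad: same, but for signed inputs.  Widening before
   subtracting keeps the difference from overflowing the narrow type.  */
static int32_t ssad_scalar(int8_t op1, int8_t op2, int32_t op3)
{
  int32_t diff = (int32_t)op1 - (int32_t)op2;
  return op3 + (diff < 0 ? -diff : diff);
}

/* A reduction loop of the shape a scalar SAD instruction could serve.  */
static uint32_t sad_loop(const uint8_t *a, const uint8_t *b, int n)
{
  uint32_t sum = 0;
  for (int i = 0; i < n; i++)
    sum = usad_scalar(a[i], b[i], sum);
  return sum;
}
```

Whether GCC actually matches such a loop to a target's sad insn depends on
the optabs being allowed for scalar modes, as discussed below.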

Note the SAD_EXPR on GENERIC has the same issue when vector types are
used - the exact semantics are unspecified.

The same is true for DOT_PROD_EXPR and WIDEN_SUM_EXPR and a bunch of others.

These days we'd go for matching them to direct internal function calls
using the {u,s}sad optabs and I don't see any reason to not allow scalar
modes for them.  I'd rather get rid of all the tree codes we have for
vectorizer reduction patterns in favor of those so if you can avoid
introducing new ones or adding more uses of existing ones that would be nice.