From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 1DFF63858C60; Wed, 17 Jan 2024 20:29:01 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1DFF63858C60
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1705523341;
	bh=G3dEAJL6DOvdCVSFYj3l/Rqj4etmdw7zurkV0qzQeWk=;
	h=From:To:Subject:Date:From;
	b=eML3BSGg5wFipq/sZYkXp1rUPBU0BYlhWOAOLV8bDHngO3cK6Zy8FpJObU0gxPoNP
	 qaUWAnPqThVlyVy8Eyhqgyb1G/i9BcO2r8BmxRgovO9fNtJZ7AarBGP0akzP2xkJcs
	 vtvql8ZMFsEGWw6mKlAqrT8ri0ua9E2Akt1conzU=
From: "pinskia at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/113458] New: Missed SLP for reduction of
 multiplication/addition with promotion
Date: Wed, 17 Jan 2024 20:29:00 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: pinskia at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status
 keywords bug_severity priority component assigned_to reporter
 target_milestone cf_gcctarget
Message-ID: <bug-113458-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113458

            Bug ID: 113458
           Summary: Missed SLP for reduction of multiplication/addition
                    with promotion
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64-*-*

Take:
```
int f(short *a, signed char *b)
{
        int sum = 0;
        sum += a[0]*b[0];
        sum += a[1]*b[1];
        sum += a[2]*b[2];
        sum += a[3]*b[3];
        return sum;
}
```

GCC does not SLP-vectorize this.

With `-fno-vect-cost-model` it is SLP-vectorized, but in a very inefficient way.

LLVM produces:
```
        ldr     s0, [x1]
        ldr     d1, [x0]
        sshll   v0.8h, v0.8b, #0 // promote to short
        smull   v0.4s, v0.4h, v1.4h // widening multiply: shorts to ints
        addv    s0, v0.4s // do the reduction
        fmov    w0, s0
```
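
For reference, the LLVM sequence above can be read as three steps: promote the four `signed char` values to `short`, do a widening multiply into four `int` lanes, then add the lanes together. Below is a minimal portable C sketch of that reading, not actual compiler output. The names `f_ref` and `f_model` are hypothetical and only for illustration, and the sketch assumes a 16-bit `short` and 32-bit `int`, as on aarch64.

```c
#include <stdint.h>

/* Scalar reference: same computation as f() in the report. */
int f_ref(const short *a, const signed char *b)
{
        int sum = 0;
        sum += a[0]*b[0];
        sum += a[1]*b[1];
        sum += a[2]*b[2];
        sum += a[3]*b[3];
        return sum;
}

/* Hypothetical lane-by-lane model of the vector sequence. */
int f_model(const short *a, const signed char *b)
{
        int16_t wide_b[4];
        int32_t prod[4];
        int32_t sum = 0;
        int i;

        /* sshll #0: sign-extend each char lane to a short lane */
        for (i = 0; i < 4; i++)
                wide_b[i] = (int16_t)b[i];

        /* smull: multiply short lanes, producing int lanes */
        for (i = 0; i < 4; i++)
                prod[i] = (int32_t)a[i] * (int32_t)wide_b[i];

        /* addv: horizontal add of the four int lanes */
        for (i = 0; i < 4; i++)
                sum += prod[i];

        return sum;
}
```

Both functions should agree for every input, since each product fits comfortably in an `int` and the sum of four such products cannot overflow.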

GCC should be able to produce this too.