From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <Tamar.Christina@arm.com>
Received: from EUR05-VI1-obe.outbound.protection.outlook.com
 (mail-vi1eur05on2087.outbound.protection.outlook.com [40.107.21.87])
 by sourceware.org (Postfix) with ESMTPS id CAF2A385802B
 for <gcc-patches@gcc.gnu.org>; Wed,  5 May 2021 17:38:51 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org CAF2A385802B
Received: from MR2P264CA0120.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:33::36)
 by DBBPR08MB5962.eurprd08.prod.outlook.com (2603:10a6:10:202::15) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4087.35; Wed, 5 May
 2021 17:38:46 +0000
Received: from VE1EUR03FT026.eop-EUR03.prod.protection.outlook.com
 (2603:10a6:500:33:cafe::9d) by MR2P264CA0120.outlook.office365.com
 (2603:10a6:500:33::36) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4087.28 via Frontend
 Transport; Wed, 5 May 2021 17:38:46 +0000
X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123)
 smtp.mailfrom=arm.com; gcc.gnu.org; dkim=pass (signature was verified)
 header.d=armh.onmicrosoft.com;gcc.gnu.org; dmarc=pass action=none
 header.from=arm.com;
Received-SPF: Pass (protection.outlook.com: domain of arm.com designates
 63.35.35.123 as permitted sender) receiver=protection.outlook.com;
 client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com;
Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by
 VE1EUR03FT026.mail.protection.outlook.com (10.152.18.148) with
 Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.4108.25 via Frontend Transport; Wed, 5 May 2021 17:38:45 +0000
Received: ("Tessian outbound 6c4b4bc1cefb:v91");
 Wed, 05 May 2021 17:38:45 +0000
X-CheckRecipientChecked: true
X-CR-MTA-CID: f9ce36d56203cdcb
X-CR-MTA-TID: 64aa7808
Received: from 619aef73dd3c.1
 by 64aa7808-outbound-1.mta.getcheckrecipient.com id
 FE22A4D5-2E28-4C9C-A5C3-FCC43FBF7C2D.1; 
 Wed, 05 May 2021 17:38:39 +0000
Received: from EUR02-VE1-obe.outbound.protection.outlook.com
 by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 619aef73dd3c.1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384);
 Wed, 05 May 2021 17:38:39 +0000
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
 b=KvDXsQSaUBhtGuE9BOe5MzPHoMVp702n45KfG9C+3uKGczMeDKnXBSef5QYpLewzxtPPIpwXr5gfFcnt65/1V7IZyX01s1knI4aQbKrHDAl020OwI/sWpYV+MCTxEgs3S5XAU58tlSTQmrcLTL3Ff8GqFT+nXUJJeq0D4oBKCeAd9Z9urd5ZfUVmLQ65/a/0GKkkcbkmfmu6gjjfK/1JMMSaHZAontp5G2yQuOD4KUg8J3eFcyjI8if0xGOGvanRYlB48AaMecnMw/a8XPJJKMXPS0m1wjgG/gKdEJJLH0Vk1A20Jfu0CzGvfvCKpy2elBLHZeDKkGnPHk20DvJ2BA==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; 
 s=arcselector9901;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=dWfYDi0b5DUq9tcIzIwVLrDQAzlmXmiRi27lyws9tkk=;
 b=UcGzi0vBvI9EmBNHYXN7e3mQoouxejAbNv1409bBBj5GdsCFH6a2w2nGQ88jRiG9IMuYuvbeJUH6mBSIapANb0USZqCgEWM0+j6tqDmYR0rBo+TgqQYgdnRouoqxX9GanIyJz1LJR+YZVdKMSWXnoFZT8/LplsWSaxylCdZOosEsVu4IYt04cX93doUQw1BCpXZ7TOi9TE4eVxJvAHm+qmigjYM1/G0KWkOKx62/jlaIiPADcqHx31gEUQUoEXWQ7UULQLco8Lp5TDeDlsFD//K8+5OJyjgw18i/ltFhbwtlQIFHPuTx8i5VKbGhbptTBEKLJmTEbVJgATk4wxRwHQ==
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass
 smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass
 header.d=arm.com; arc=none
Authentication-Results-Original: gcc.gnu.org; dkim=none (message not signed)
 header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com;
Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17)
 by VI1PR0801MB1680.eurprd08.prod.outlook.com (2603:10a6:800:5a::11)
 with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4108.25; Wed, 5 May
 2021 17:38:35 +0000
Received: from VI1PR08MB5325.eurprd08.prod.outlook.com
 ([fe80::5828:531c:5ccb:5bae]) by VI1PR08MB5325.eurprd08.prod.outlook.com
 ([fe80::5828:531c:5ccb:5bae%3]) with mapi id 15.20.4087.044; Wed, 5 May 2021
 17:38:35 +0000
Date: Wed, 5 May 2021 18:38:33 +0100
From: Tamar Christina <tamar.christina@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de
Subject: [PATCH 1/4]middle-end Vect: Add support for dot-product where the
 sign for the multiplicant changes. 
Message-ID: <patch-14433-tamar@arm.com>
Content-Type: multipart/mixed; boundary="k1lZvvs/B4yU6o8G"
Content-Disposition: inline
User-Agent: Mutt/1.9.4 (2018-02-28)
X-Originating-IP: [217.140.106.51]
X-ClientProxiedBy: LO4P123CA0159.GBRP123.PROD.OUTLOOK.COM
 (2603:10a6:600:188::20) To VI1PR08MB5325.eurprd08.prod.outlook.com
 (2603:10a6:803:13e::17)
MIME-Version: 1.0
X-MS-Exchange-MessageSentRepresentingType: 1
Received: from arm.com (217.140.106.51) by
 LO4P123CA0159.GBRP123.PROD.OUTLOOK.COM (2603:10a6:600:188::20) with Microsoft
 SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.4108.25 via Frontend Transport; Wed, 5 May 2021 17:38:35 +0000
X-MS-PublicTrafficType: Email
X-MS-Office365-Filtering-Correlation-Id: 9b60a219-62cd-499e-c651-08d90feca230
X-MS-TrafficTypeDiagnostic: VI1PR0801MB1680:|DBBPR08MB5962:
X-Microsoft-Antispam-PRVS: <DBBPR08MB59621FA90BD048F423A05414FF599@DBBPR08MB5962.eurprd08.prod.outlook.com>
x-checkrecipientrouted: true
NoDisclaimer: true
X-MS-Oob-TLC-OOBClassifiers: OLM:4303;OLM:4303;
X-MS-Exchange-SenderADCheck: 1
X-Microsoft-Antispam-Untrusted: BCL:0;
X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en;
 SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com;
 PTR:; CAT:NONE;
 SFS:(4636009)(366004)(55016002)(83380400001)(6916009)(186003)(8676002)(86362001)(36756003)(44832011)(2616005)(956004)(26005)(30864003)(16526019)(5660300002)(235185007)(4743002)(66556008)(66476007)(66616009)(2906002)(44144004)(38350700002)(7696005)(33964004)(52116002)(4326008)(38100700002)(498600001)(66946007)(8886007)(8936002)(4216001)(2700100001)(357404004);
 DIR:OUT; SFP:1101; 
X-MS-Exchange-Transport-Forked: True
X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR0801MB1680
Original-Authentication-Results: gcc.gnu.org; dkim=none (message not signed)
 header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com;
X-EOPAttributedMessage: 0
X-MS-Exchange-Transport-CrossTenantHeadersStripped: VE1EUR03FT026.eop-EUR03.prod.protection.outlook.com
X-MS-Office365-Filtering-Correlation-Id-Prvs: 976f9862-f8f6-4096-6202-08d90fec9c08
X-Microsoft-Antispam: BCL:0;
X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:;
 IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com;
 PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE;
 SFS:(4636009)(346002)(396003)(376002)(39830400003)(46966006)(36840700001)(6916009)(2906002)(956004)(7696005)(8886007)(4326008)(2616005)(86362001)(4743002)(107886003)(55016002)(5660300002)(70586007)(70206006)(47076005)(16526019)(66616009)(36860700001)(336012)(508600001)(186003)(235185007)(44144004)(36756003)(81166007)(8676002)(83380400001)(44832011)(8936002)(26005)(356005)(30864003)(33964004)(82310400003)(4216001)(2700100001)(357404004);
 DIR:OUT; SFP:1101; 
X-OriginatorOrg: arm.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 05 May 2021 17:38:45.5730 (UTC)
X-MS-Exchange-CrossTenant-Network-Message-Id: 9b60a219-62cd-499e-c651-08d90feca230
X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123];
 Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com]
X-MS-Exchange-CrossTenant-AuthSource: VE1EUR03FT026.eop-EUR03.prod.protection.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: DBBPR08MB5962
X-Spam-Status: No, score=-14.2 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, GIT_PATCH_0, KAM_LOTSOFHASH, MSGID_FROM_MTA_HEADER,
 RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP,
 UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Wed, 05 May 2021 17:38:56 -0000

--k1lZvvs/B4yU6o8G
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

Hi All,

This patch adds support for a dot product where the signs of the multiplication
arguments differ, i.e. one is signed and one is unsigned but the precisions are
the same.

#define N 480
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 signed
#define SIGNEDNESS_4 unsigned

SIGNEDNESS_1 int __attribute__ ((noipa))
f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
   SIGNEDNESS_4 char *restrict b)
{
  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
    {
      int av = a[i];
      int bv = b[i];
      SIGNEDNESS_2 short mult = av * bv;
      res += mult;
    }
  return res;
}

The operations are performed as if the operands were extended to a 32-bit value.
As such, this operation isn't valid if there is an intermediate conversion to an
unsigned value, i.e. if SIGNEDNESS_2 is unsigned.
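
As a rough scalar illustration (not part of the patch), the sketch below
compares one accumulation step done on operands widened to int with the same
step routed through a signed or an unsigned short.  The function names are
made up for this example, and it assumes 8-bit char, 16-bit short and 32-bit
int.

```c
/* One accumulation step with both operands widened to int.  This is the
   behaviour the mixed-sign dot product is expected to reproduce.  */
int
usdot_step_widened (int acc, signed char a, unsigned char b)
{
  return acc + (int) a * (int) b;
}

/* Same step, but the product first goes through a signed short, as when
   SIGNEDNESS_2 is signed.  The product of a signed char and an unsigned
   char always fits in a 16-bit signed short, so nothing is lost.  */
int
step_via_signed_short (int acc, signed char a, unsigned char b)
{
  short mult = (short) ((int) a * (int) b);
  return acc + mult;
}

/* Same step through an unsigned short, as when SIGNEDNESS_2 is unsigned.
   A negative product wraps to a large positive value before it is added.  */
int
step_via_unsigned_short (int acc, signed char a, unsigned char b)
{
  unsigned short mult = (unsigned short) ((int) a * (int) b);
  return acc + mult;
}
```

With a = -1 and b = 2 the first two functions add -2 to the accumulator,
while the unsigned-short variant adds 65534, which is why that form cannot be
mapped onto the widened operation.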

Moreover, if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are flipped, the same
optab is used but the operands are swapped in the optab expansion.
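
The operand swap works because the multiplication is commutative once both
inputs are widened.  A minimal scalar sketch, with hypothetical function names
that only model the idea and are not the optab interface:

```c
/* Hypothetical scalar model of a single mixed-sign step in which the
   first multiplicand is unsigned and the second is signed.  */
int
usdot_model (int acc, unsigned char u, signed char s)
{
  return acc + (int) u * (int) s;
}

/* The signed-by-unsigned case reuses the same model by swapping the
   two multiplicands before the call.  */
int
sudot_via_swap (int acc, signed char s, unsigned char u)
{
  return usdot_model (acc, u, s);
}
```

Both orderings produce the same accumulator value, so only one pattern needs
to exist on the target side.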

To support this, the patch extends the dot-product detection to optionally
ignore sign mismatches between the operands and stores this information in the
optab subtype, which is now a bitfield.

The subtype can now additionally control which optab an EXPR expands to.
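
For readers unfamiliar with the flag idea, here is a small standalone sketch of
how a bit-flag subtype could select between three dot-product kinds.  The enum
and function names are invented for this illustration and only mirror the
shape of the change; they are not the GCC declarations.

```c
/* Hypothetical flag values, one bit each so that they can be combined.  */
enum subtype_flags
{
  SUBTYPE_DEFAULT = 1 << 0,
  SUBTYPE_SCALAR = 1 << 1,
  SUBTYPE_VECTOR = 1 << 2,
  SUBTYPE_SIGNED_TO_UNSIGNED = 1 << 3,
  SUBTYPE_UNSIGNED_TO_SIGNED = 1 << 4
};

/* Hypothetical stand-ins for the signed, unsigned and mixed-sign optabs.  */
enum dot_kind
{
  DOT_SIGNED,
  DOT_UNSIGNED,
  DOT_MIXED
};

/* Pick the mixed-sign kind if either sign-change flag is set; otherwise
   fall back to the signedness of the operand type.  */
enum dot_kind
pick_dot_kind (unsigned int flags, int type_is_unsigned)
{
  if (flags & (SUBTYPE_SIGNED_TO_UNSIGNED | SUBTYPE_UNSIGNED_TO_SIGNED))
    return DOT_MIXED;
  return type_is_unsigned ? DOT_UNSIGNED : DOT_SIGNED;
}
```

A caller can pass SUBTYPE_VECTOR | SUBTYPE_UNSIGNED_TO_SIGNED to keep the
existing vector meaning while also requesting the mixed-sign variant.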

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* optabs.def (usdot_prod_optab): New.
	* doc/md.texi: Document it.
	* optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
	* optabs-tree.h (enum optab_subtype): Likewise.
	* optabs.c (expand_widen_pattern_expr): Likewise.
	* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
	* tree-vect-loop.c (vect_determine_dot_kind): New.
	(vectorizable_reduction): Query dot-product kind.
	* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take optional
	optab subtype.
	(vect_joust_widened_type, vect_widened_op_tree): Optionally ignore
	sign mismatches.
	(vect_recog_dot_prod_pattern): Support usdot_prod_optab.

--- inline copy of patch -- 
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..baf20416e63745097825fc30fdf2e66bc80d7d23 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5440,11 +5440,13 @@ Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand
 @item @samp{sdot_prod@var{m}}
 @cindex @code{udot_prod@var{m}} instruction pattern
 @itemx @samp{udot_prod@var{m}}
+@cindex @code{usdot_prod@var{m}} instruction pattern
+@itemx @samp{usdot_prod@var{m}}
 Compute the sum of the products of two signed/unsigned elements.
-Operand 1 and operand 2 are of the same mode. Their product, which is of a
-wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or
-wider than the mode of the product. The result is placed in operand 0, which
-is of the same mode as operand 3.
+Operand 1 and operand 2 are of the same mode but may differ in signs. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
 
 @cindex @code{ssad@var{m}} instruction pattern
 @item @samp{ssad@var{m}}
diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
index c3aaa1a416991e856d3e24da45968a92ebada82c..ebc23ac86fe99057f375781c2f1990e0548ba08d 100644
--- a/gcc/optabs-tree.h
+++ b/gcc/optabs-tree.h
@@ -27,11 +27,29 @@ along with GCC; see the file COPYING3.  If not see
    shift amount vs. machines that take a vector for the shift amount.  */
 enum optab_subtype
 {
-  optab_default,
-  optab_scalar,
-  optab_vector
+  optab_default = 1 << 0,
+  optab_scalar = 1 << 1,
+  optab_vector = 1 << 2,
+  optab_signed_to_unsigned = 1 << 3,
+  optab_unsigned_to_signed = 1 << 4
 };
 
+/* Override the OrEqual-operator so we can use optab_subtype as a bit flag.  */
+inline enum optab_subtype&
+operator |= (enum optab_subtype& a, enum optab_subtype b)
+{
+    return a = static_cast<optab_subtype>(static_cast<int>(a)
+					  | static_cast<int>(b));
+}
+
+/* Override the Or-operator so we can use optab_subtype as a bit flag.  */
+inline enum optab_subtype
+operator | (enum optab_subtype a, enum optab_subtype b)
+{
+    return static_cast<optab_subtype>(static_cast<int>(a)
+				      | static_cast<int>(b));
+}
+
 /* Return the optab used for computing the given operation on the type given by
    the second argument.  The third argument distinguishes between the types of
    vector shifts and rotates.  */
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 95ffe397c23e80c105afea52e9d47216bf52f55a..2f60004545defc53182e004eea1e5c22b7453072 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -127,7 +127,17 @@ optab_for_tree_code (enum tree_code code, const_tree type,
       return TYPE_UNSIGNED (type) ? usum_widen_optab : ssum_widen_optab;
 
     case DOT_PROD_EXPR:
-      return TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab;
+      {
+	gcc_assert (subtype & optab_default
+		    || subtype & optab_vector
+		    || subtype & optab_signed_to_unsigned
+		    || subtype & optab_unsigned_to_signed);
+
+	if (subtype & (optab_unsigned_to_signed | optab_signed_to_unsigned))
+	  return usdot_prod_optab;
+
+	return (TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab);
+      }
 
     case SAD_EXPR:
       return TYPE_UNSIGNED (type) ? usad_optab : ssad_optab;
diff --git a/gcc/optabs.c b/gcc/optabs.c
index f4614a394587787293dc8b680a38901f7906f61c..2e18b76de1412eab71971753ac678597c0d00098 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -262,6 +262,11 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
   bool sbool = false;
 
   oprnd0 = ops->op0;
+  if (nops >= 2)
+    oprnd1 = ops->op1;
+  if (nops >= 3)
+    oprnd2 = ops->op2;
+
   tmode0 = TYPE_MODE (TREE_TYPE (oprnd0));
   if (ops->code == VEC_UNPACK_FIX_TRUNC_HI_EXPR
       || ops->code == VEC_UNPACK_FIX_TRUNC_LO_EXPR)
@@ -285,6 +290,27 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
 	   ? vec_unpacks_sbool_hi_optab : vec_unpacks_sbool_lo_optab);
       sbool = true;
     }
+  else if (ops->code == DOT_PROD_EXPR)
+    {
+      enum optab_subtype subtype = optab_default;
+      signop sign1 = TYPE_SIGN (TREE_TYPE (oprnd0));
+      signop sign2 = TYPE_SIGN (TREE_TYPE (oprnd1));
+      if (sign1 == sign2)
+	;
+      else if (sign1 == SIGNED && sign2 == UNSIGNED)
+	{
+	  subtype |= optab_signed_to_unsigned;
+	  /* Same as optab_unsigned_to_signed but flip the operands.  */
+	  std::swap (op0, op1);
+	}
+      else if (sign1 == UNSIGNED && sign2 == SIGNED)
+	subtype |= optab_unsigned_to_signed;
+      else
+	gcc_unreachable ();
+
+      widen_pattern_optab
+	= optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), subtype);
+    }
   else
     widen_pattern_optab
       = optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
@@ -298,10 +324,7 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
   gcc_assert (icode != CODE_FOR_nothing);
 
   if (nops >= 2)
-    {
-      oprnd1 = ops->op1;
-      tmode1 = TYPE_MODE (TREE_TYPE (oprnd1));
-    }
+    tmode1 = TYPE_MODE (TREE_TYPE (oprnd1));
   else if (sbool)
     {
       nops = 2;
@@ -316,7 +339,6 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
     {
       gcc_assert (tmode1 == tmode0);
       gcc_assert (op1);
-      oprnd2 = ops->op2;
       wmode = TYPE_MODE (TREE_TYPE (oprnd2));
     }
 
diff --git a/gcc/optabs.def b/gcc/optabs.def
index b192a9d070b8aa72e5676b2eaa020b5bdd7ffcc8..f470c2168378cec840edf7fbdb7c18615baae928 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -352,6 +352,7 @@ OPTAB_D (uavg_ceil_optab, "uavg$a3_ceil")
 OPTAB_D (sdot_prod_optab, "sdot_prod$I$a")
 OPTAB_D (ssum_widen_optab, "widen_ssum$I$a3")
 OPTAB_D (udot_prod_optab, "udot_prod$I$a")
+OPTAB_D (usdot_prod_optab, "usdot_prod$I$a")
 OPTAB_D (usum_widen_optab, "widen_usum$I$a3")
 OPTAB_D (usad_optab, "usad$I$a")
 OPTAB_D (ssad_optab, "ssad$I$a")
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 7e3aae5f9c28a49feedc7cc66e8ac0d476b9f28a..58b55bb648ad97d514f1fa18bb00808fd2678b42 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -4421,7 +4421,8 @@ verify_gimple_assign_ternary (gassign *stmt)
 		  && !SCALAR_FLOAT_TYPE_P (rhs1_type))
 		 || (!INTEGRAL_TYPE_P (lhs_type)
 		     && !SCALAR_FLOAT_TYPE_P (lhs_type))))
-	    || !types_compatible_p (rhs1_type, rhs2_type)
+	    || (!types_compatible_p (rhs1_type, rhs2_type)
+		&& TYPE_SIGN (rhs1_type) == TYPE_SIGN (rhs2_type))
 	    || !useless_type_conversion_p (lhs_type, rhs3_type)
 	    || maybe_lt (GET_MODE_SIZE (element_mode (rhs3_type)),
 			 2 * GET_MODE_SIZE (element_mode (rhs1_type))))
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 93fa2928e001c154bd4a9a73ac1dbbbf73c456df..cb8f5fbb6abca181c4171194d19fec29ec6e4176 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6401,6 +6401,33 @@ build_vect_cond_expr (enum tree_code code, tree vop[3], tree mask,
     }
 }
 
+/* Determine the optab_subtype to use for the given CODE and STMT.  For
+   most CODEs this will be optab_vector, however for certain operations such
+   as DOT_PROD_EXPR, where the operands can have different signs, we need to
+   be able to pick the right optab.  */
+
+static enum optab_subtype
+vect_determine_dot_kind (tree_code code, stmt_vec_info stmt_vinfo)
+{
+  enum optab_subtype subtype = optab_vector;
+  switch (code)
+    {
+      case DOT_PROD_EXPR:
+	{
+	  gassign *stmt = as_a <gassign *> (STMT_VINFO_STMT (stmt_vinfo));
+	  signop rhs1_sign = TYPE_SIGN (TREE_TYPE (gimple_assign_rhs1 (stmt)));
+	  signop rhs2_sign = TYPE_SIGN (TREE_TYPE (gimple_assign_rhs2 (stmt)));
+	  if (rhs1_sign != rhs2_sign)
+	    subtype |= optab_unsigned_to_signed;
+	  break;
+	}
+      default:
+	break;
+    }
+
+  return subtype;
+}
+
 /* Function vectorizable_reduction.
 
    Check if STMT_INFO performs a reduction operation that can be vectorized.
@@ -7189,7 +7216,8 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
       bool ok = true;
 
       /* 4.1. check support for the operation in the loop  */
-      optab optab = optab_for_tree_code (code, vectype_in, optab_vector);
+      enum optab_subtype subtype = vect_determine_dot_kind (code, stmt_info);
+      optab optab = optab_for_tree_code (code, vectype_in, subtype);
       if (!optab)
 	{
 	  if (dump_enabled_p ())
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 441d6cd28c4eaded7abd756164890dbcffd2f3b8..943c001fb13777b4d1513841fa84942316846d5e 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -201,7 +201,8 @@ vect_get_external_def_edge (vec_info *vinfo, tree var)
 static bool
 vect_supportable_direct_optab_p (vec_info *vinfo, tree otype, tree_code code,
 				 tree itype, tree *vecotype_out,
-				 tree *vecitype_out = NULL)
+				 tree *vecitype_out = NULL,
+				 enum optab_subtype subtype = optab_default)
 {
   tree vecitype = get_vectype_for_scalar_type (vinfo, itype);
   if (!vecitype)
@@ -211,7 +212,7 @@ vect_supportable_direct_optab_p (vec_info *vinfo, tree otype, tree_code code,
   if (!vecotype)
     return false;
 
-  optab optab = optab_for_tree_code (code, vecitype, optab_default);
+  optab optab = optab_for_tree_code (code, vecitype, subtype);
   if (!optab)
     return false;
 
@@ -487,14 +488,31 @@ vect_joust_widened_integer (tree type, bool shift_p, tree op,
 }
 
 /* Return true if the common supertype of NEW_TYPE and *COMMON_TYPE
-   is narrower than type, storing the supertype in *COMMON_TYPE if so.  */
+   is narrower than type, storing the supertype in *COMMON_TYPE if so.
+   If ALLOW_SHORT_SIGN_MISMATCH then accept that *COMMON_TYPE and NEW_TYPE
+   may be of different signs but equal precision.  */
 
 static bool
-vect_joust_widened_type (tree type, tree new_type, tree *common_type)
+vect_joust_widened_type (tree type, tree new_type, tree *common_type,
+			 bool allow_short_sign_mismatch = false)
 {
   if (types_compatible_p (*common_type, new_type))
     return true;
 
+  /* Check if the mismatch is only in the sign and if we have
+     allow_short_sign_mismatch then allow it.  */
+  if (allow_short_sign_mismatch
+      && TYPE_SIGN (*common_type) != TYPE_SIGN (new_type))
+    {
+      bool sign = TYPE_SIGN (*common_type) == UNSIGNED;
+      tree eq_type
+	= build_nonstandard_integer_type (TYPE_PRECISION (new_type),
+					  sign);
+
+      if (types_compatible_p (*common_type, eq_type))
+	return true;
+    }
+
   /* See if *COMMON_TYPE can hold all values of NEW_TYPE.  */
   if ((TYPE_PRECISION (new_type) < TYPE_PRECISION (*common_type))
       && (TYPE_UNSIGNED (new_type) || !TYPE_UNSIGNED (*common_type)))
@@ -532,6 +550,9 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
    to a type that (a) is narrower than the result of STMT_INFO and
    (b) can hold all leaf operand values.
 
+   If ALLOW_SHORT_SIGN_MISMATCH then allow the operands to differ in sign
+   but not in precision.
+
    Return 0 if STMT_INFO isn't such a tree, or if no such COMMON_TYPE
    exists.  */
 
@@ -539,7 +560,8 @@ static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
 		      tree_code widened_code, bool shift_p,
 		      unsigned int max_nops,
-		      vect_unpromoted_value *unprom, tree *common_type)
+		      vect_unpromoted_value *unprom, tree *common_type,
+		      bool allow_short_sign_mismatch = false)
 {
   /* Check for an integer operation with the right code.  */
   gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
@@ -600,7 +622,8 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
 		= vinfo->lookup_def (this_unprom->op);
 	      nops = vect_widened_op_tree (vinfo, def_stmt_info, code,
 					   widened_code, shift_p, max_nops,
-					   this_unprom, common_type);
+					   this_unprom, common_type,
+					   allow_short_sign_mismatch);
 	      if (nops == 0)
 		return 0;
 
@@ -617,7 +640,8 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
 	      if (i == 0)
 		*common_type = this_unprom->type;
 	      else if (!vect_joust_widened_type (type, this_unprom->type,
-						 common_type))
+						 common_type,
+						 allow_short_sign_mismatch))
 		return 0;
 	    }
 	}
@@ -888,21 +912,24 @@ vect_reassociating_reduction_p (vec_info *vinfo,
 
    Try to find the following pattern:
 
-     type x_t, y_t;
+     type1a x_t;
+     type1b y_t;
      TYPE1 prod;
      TYPE2 sum = init;
    loop:
      sum_0 = phi <init, sum_1>
      S1  x_t = ...
      S2  y_t = ...
-     S3  x_T = (TYPE1) x_t;
-     S4  y_T = (TYPE1) y_t;
+     S3  x_T = (TYPE3) x_t;
+     S4  y_T = (TYPE4) y_t;
      S5  prod = x_T * y_T;
      [S6  prod = (TYPE2) prod;  #optional]
      S7  sum_1 = prod + sum_0;
 
-   where 'TYPE1' is exactly double the size of type 'type', and 'TYPE2' is the
-   same size of 'TYPE1' or bigger. This is a special case of a reduction
+   where 'TYPE1' is exactly double the size of type 'type1a' and 'type1b',
+   the sign of 'TYPE1' must be one of 'type1a' or 'type1b' but the sign of
+   'type1a' and 'type1b' can differ. 'TYPE2' is the same size of 'TYPE1' or
+   bigger and must be the same sign. This is a special case of a reduction
    computation.
 
    Input:
@@ -939,15 +966,16 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
 
   /* Look for the following pattern
           DX = (TYPE1) X;
-          DY = (TYPE1) Y;
+	  DY = (TYPE2) Y;
           DPROD = DX * DY;
-          DDPROD = (TYPE2) DPROD;
+	  DDPROD = (TYPE3) DPROD;
           sum_1 = DDPROD + sum_0;
      In which
      - DX is double the size of X
      - DY is double the size of Y
      - DX, DY, DPROD all have the same type but the sign
-       between DX, DY and DPROD can differ.
+       between DX, DY and DPROD can differ. The sign of DPROD
+       is one of the signs of DX or DY.
      - sum is the same size of DPROD or bigger
      - sum has been recognized as a reduction variable.
 
@@ -986,14 +1014,41 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom0[2];
   if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR, WIDEN_MULT_EXPR,
-			     false, 2, unprom0, &half_type))
+			     false, 2, unprom0, &half_type, true))
     return NULL;
 
+  /* Check to see if there is a sign change happening in the operands of the
+     multiplication and pick the appropriate optab subtype.  */
+  enum optab_subtype subtype;
+  tree rhs_type1 = unprom0[0].type;
+  tree rhs_type2 = unprom0[1].type;
+  if (TYPE_SIGN (rhs_type1) == TYPE_SIGN (rhs_type2))
+     subtype = optab_default;
+  else if (TYPE_SIGN (rhs_type1) == SIGNED
+	   && TYPE_SIGN (rhs_type2) == UNSIGNED)
+     subtype = optab_signed_to_unsigned;
+  else if (TYPE_SIGN (rhs_type1) == UNSIGNED
+	   && TYPE_SIGN (rhs_type2) == SIGNED)
+     subtype = optab_unsigned_to_signed;
+  else
+    gcc_unreachable ();
+
+  /* If we have a sign changing dot product we need to check that the
+     promoted type if unsigned has at least the same precision as the final
+     type of the dot-product.  */
+  if (subtype != optab_default)
+    {
+      tree mult_type = TREE_TYPE (unprom_mult.op);
+      if (TYPE_SIGN (mult_type) == UNSIGNED
+	  && TYPE_PRECISION (mult_type) < TYPE_PRECISION (type))
+	return NULL;
+    }
+
   vect_pattern_detected ("vect_recog_dot_prod_pattern", last_stmt);
 
   tree half_vectype;
   if (!vect_supportable_direct_optab_p (vinfo, type, DOT_PROD_EXPR, half_type,
-					type_out, &half_vectype))
+					type_out, &half_vectype, subtype))
     return NULL;
 
   /* Get the inputs in the appropriate types.  */
@@ -1002,8 +1057,22 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
 		       unprom0, half_vectype);
 
   var = vect_recog_temp_ssa_var (type, NULL);
+
+  /* If we have a sign changing dot-product the dot-product itself does any
+     sign conversions, so consume the type and use the unpromoted types.  */
+  tree mult_arg1, mult_arg2;
+  if (subtype == optab_default)
+    {
+      mult_arg1 = mult_oprnd[0];
+      mult_arg2 = mult_oprnd[1];
+    }
+  else
+    {
+      mult_arg1 = unprom0[0].op;
+      mult_arg2 = unprom0[1].op;
+    }
   pattern_stmt = gimple_build_assign (var, DOT_PROD_EXPR,
-				      mult_oprnd[0], mult_oprnd[1], oprnd1);
+				      mult_arg1, mult_arg2, oprnd1);
 
   return pattern_stmt;
 }


-- 

--k1lZvvs/B4yU6o8G
Content-Type: text/x-diff; charset=utf-8
Content-Disposition: attachment; filename="rb14433.patch"

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..baf20416e63745097825fc30fdf2e66bc80d7d23 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5440,11 +5440,13 @@ Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand
 @item @samp{sdot_prod@var{m}}
 @cindex @code{udot_prod@var{m}} instruction pattern
 @itemx @samp{udot_prod@var{m}}
+@cindex @code{usdot_prod@var{m}} instruction pattern
+@itemx @samp{usdot_prod@var{m}}
 Compute the sum of the products of two signed/unsigned elements.
-Operand 1 and operand 2 are of the same mode. Their product, which is of a
-wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or
-wider than the mode of the product. The result is placed in operand 0, which
-is of the same mode as operand 3.
+Operand 1 and operand 2 are of the same mode but may differ in sign. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
 
 @cindex @code{ssad@var{m}} instruction pattern
 @item @samp{ssad@var{m}}
diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
index c3aaa1a416991e856d3e24da45968a92ebada82c..ebc23ac86fe99057f375781c2f1990e0548ba08d 100644
--- a/gcc/optabs-tree.h
+++ b/gcc/optabs-tree.h
@@ -27,11 +27,29 @@ along with GCC; see the file COPYING3.  If not see
    shift amount vs. machines that take a vector for the shift amount.  */
 enum optab_subtype
 {
-  optab_default,
-  optab_scalar,
-  optab_vector
+  optab_default = 1 << 0,
+  optab_scalar = 1 << 1,
+  optab_vector = 1 << 2,
+  optab_signed_to_unsigned = 1 << 3,
+  optab_unsigned_to_signed = 1 << 4
 };
 
+/* Overload operator |= so that optab_subtype can be used as a bit flag.  */
+inline enum optab_subtype&
+operator |= (enum optab_subtype& a, enum optab_subtype b)
+{
+    return a = static_cast<optab_subtype>(static_cast<int>(a)
+					  | static_cast<int>(b));
+}
+
+/* Overload operator | so that optab_subtype can be used as a bit flag.  */
+inline enum optab_subtype
+operator | (enum optab_subtype a, enum optab_subtype b)
+{
+    return static_cast<optab_subtype>(static_cast<int>(a)
+				      | static_cast<int>(b));
+}
+
 /* Return the optab used for computing the given operation on the type given by
    the second argument.  The third argument distinguishes between the types of
    vector shifts and rotates.  */
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 95ffe397c23e80c105afea52e9d47216bf52f55a..2f60004545defc53182e004eea1e5c22b7453072 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -127,7 +127,17 @@ optab_for_tree_code (enum tree_code code, const_tree type,
       return TYPE_UNSIGNED (type) ? usum_widen_optab : ssum_widen_optab;
 
     case DOT_PROD_EXPR:
-      return TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab;
+      {
+	gcc_assert (subtype & optab_default
+		    || subtype & optab_vector
+		    || subtype & optab_signed_to_unsigned
+		    || subtype & optab_unsigned_to_signed);
+
+	if (subtype & (optab_unsigned_to_signed | optab_signed_to_unsigned))
+	  return usdot_prod_optab;
+
+	return (TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab);
+      }
 
     case SAD_EXPR:
       return TYPE_UNSIGNED (type) ? usad_optab : ssad_optab;
diff --git a/gcc/optabs.c b/gcc/optabs.c
index f4614a394587787293dc8b680a38901f7906f61c..2e18b76de1412eab71971753ac678597c0d00098 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -262,6 +262,11 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
   bool sbool = false;
 
   oprnd0 = ops->op0;
+  if (nops >= 2)
+    oprnd1 = ops->op1;
+  if (nops >= 3)
+    oprnd2 = ops->op2;
+
   tmode0 = TYPE_MODE (TREE_TYPE (oprnd0));
   if (ops->code == VEC_UNPACK_FIX_TRUNC_HI_EXPR
       || ops->code == VEC_UNPACK_FIX_TRUNC_LO_EXPR)
@@ -285,6 +290,27 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
 	   ? vec_unpacks_sbool_hi_optab : vec_unpacks_sbool_lo_optab);
       sbool = true;
     }
+  else if (ops->code == DOT_PROD_EXPR)
+    {
+      enum optab_subtype subtype = optab_default;
+      signop sign1 = TYPE_SIGN (TREE_TYPE (oprnd0));
+      signop sign2 = TYPE_SIGN (TREE_TYPE (oprnd1));
+      if (sign1 == sign2)
+	;
+      else if (sign1 == SIGNED && sign2 == UNSIGNED)
+	{
+	  subtype |= optab_signed_to_unsigned;
+	  /* Same as optab_unsigned_to_signed but flip the operands.  */
+	  std::swap (op0, op1);
+	}
+      else if (sign1 == UNSIGNED && sign2 == SIGNED)
+	subtype |= optab_unsigned_to_signed;
+      else
+	gcc_unreachable ();
+
+      widen_pattern_optab
+	= optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), subtype);
+    }
   else
     widen_pattern_optab
       = optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
@@ -298,10 +324,7 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
   gcc_assert (icode != CODE_FOR_nothing);
 
   if (nops >= 2)
-    {
-      oprnd1 = ops->op1;
-      tmode1 = TYPE_MODE (TREE_TYPE (oprnd1));
-    }
+    tmode1 = TYPE_MODE (TREE_TYPE (oprnd1));
   else if (sbool)
     {
       nops = 2;
@@ -316,7 +339,6 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
     {
       gcc_assert (tmode1 == tmode0);
       gcc_assert (op1);
-      oprnd2 = ops->op2;
       wmode = TYPE_MODE (TREE_TYPE (oprnd2));
     }
 
diff --git a/gcc/optabs.def b/gcc/optabs.def
index b192a9d070b8aa72e5676b2eaa020b5bdd7ffcc8..f470c2168378cec840edf7fbdb7c18615baae928 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -352,6 +352,7 @@ OPTAB_D (uavg_ceil_optab, "uavg$a3_ceil")
 OPTAB_D (sdot_prod_optab, "sdot_prod$I$a")
 OPTAB_D (ssum_widen_optab, "widen_ssum$I$a3")
 OPTAB_D (udot_prod_optab, "udot_prod$I$a")
+OPTAB_D (usdot_prod_optab, "usdot_prod$I$a")
 OPTAB_D (usum_widen_optab, "widen_usum$I$a3")
 OPTAB_D (usad_optab, "usad$I$a")
 OPTAB_D (ssad_optab, "ssad$I$a")
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 7e3aae5f9c28a49feedc7cc66e8ac0d476b9f28a..58b55bb648ad97d514f1fa18bb00808fd2678b42 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -4421,7 +4421,8 @@ verify_gimple_assign_ternary (gassign *stmt)
 		  && !SCALAR_FLOAT_TYPE_P (rhs1_type))
 		 || (!INTEGRAL_TYPE_P (lhs_type)
 		     && !SCALAR_FLOAT_TYPE_P (lhs_type))))
-	    || !types_compatible_p (rhs1_type, rhs2_type)
+	    || (!types_compatible_p (rhs1_type, rhs2_type)
+		&& TYPE_SIGN (rhs1_type) == TYPE_SIGN (rhs2_type))
 	    || !useless_type_conversion_p (lhs_type, rhs3_type)
 	    || maybe_lt (GET_MODE_SIZE (element_mode (rhs3_type)),
 			 2 * GET_MODE_SIZE (element_mode (rhs1_type))))
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 93fa2928e001c154bd4a9a73ac1dbbbf73c456df..cb8f5fbb6abca181c4171194d19fec29ec6e4176 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6401,6 +6401,33 @@ build_vect_cond_expr (enum tree_code code, tree vop[3], tree mask,
     }
 }
 
+/* Determine the optab_subtype to use for the given CODE and STMT_VINFO.  For
+   most CODEs this will be optab_vector.  However, for operations such as
+   DOT_PROD_EXPR, whose operands can have different signs, we need to be able
+   to pick the right optab.  */
+
+static enum optab_subtype
+vect_determine_dot_kind (tree_code code, stmt_vec_info stmt_vinfo)
+{
+  enum optab_subtype subtype = optab_vector;
+  switch (code)
+    {
+      case DOT_PROD_EXPR:
+	{
+	  gassign *stmt = as_a <gassign *> (STMT_VINFO_STMT (stmt_vinfo));
+	  signop rhs1_sign = TYPE_SIGN (TREE_TYPE (gimple_assign_rhs1 (stmt)));
+	  signop rhs2_sign = TYPE_SIGN (TREE_TYPE (gimple_assign_rhs2 (stmt)));
+	  if (rhs1_sign != rhs2_sign)
+	    subtype |= optab_unsigned_to_signed;
+	  break;
+	}
+      default:
+	break;
+    }
+
+  return subtype;
+}
+
 /* Function vectorizable_reduction.
 
    Check if STMT_INFO performs a reduction operation that can be vectorized.
@@ -7189,7 +7216,8 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
       bool ok = true;
 
       /* 4.1. check support for the operation in the loop  */
-      optab optab = optab_for_tree_code (code, vectype_in, optab_vector);
+      enum optab_subtype subtype = vect_determine_dot_kind (code, stmt_info);
+      optab optab = optab_for_tree_code (code, vectype_in, subtype);
       if (!optab)
 	{
 	  if (dump_enabled_p ())
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 441d6cd28c4eaded7abd756164890dbcffd2f3b8..943c001fb13777b4d1513841fa84942316846d5e 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -201,7 +201,8 @@ vect_get_external_def_edge (vec_info *vinfo, tree var)
 static bool
 vect_supportable_direct_optab_p (vec_info *vinfo, tree otype, tree_code code,
 				 tree itype, tree *vecotype_out,
-				 tree *vecitype_out = NULL)
+				 tree *vecitype_out = NULL,
+				 enum optab_subtype subtype = optab_default)
 {
   tree vecitype = get_vectype_for_scalar_type (vinfo, itype);
   if (!vecitype)
@@ -211,7 +212,7 @@ vect_supportable_direct_optab_p (vec_info *vinfo, tree otype, tree_code code,
   if (!vecotype)
     return false;
 
-  optab optab = optab_for_tree_code (code, vecitype, optab_default);
+  optab optab = optab_for_tree_code (code, vecitype, subtype);
   if (!optab)
     return false;
 
@@ -487,14 +488,31 @@ vect_joust_widened_integer (tree type, bool shift_p, tree op,
 }
 
 /* Return true if the common supertype of NEW_TYPE and *COMMON_TYPE
-   is narrower than type, storing the supertype in *COMMON_TYPE if so.  */
+   is narrower than type, storing the supertype in *COMMON_TYPE if so.
+   If ALLOW_SHORT_SIGN_MISMATCH, accept that *COMMON_TYPE and NEW_TYPE
+   may differ in sign as long as they have equal precision.  */
 
 static bool
-vect_joust_widened_type (tree type, tree new_type, tree *common_type)
+vect_joust_widened_type (tree type, tree new_type, tree *common_type,
+			 bool allow_short_sign_mismatch = false)
 {
   if (types_compatible_p (*common_type, new_type))
     return true;
 
+  /* Check if the mismatch is only in the sign and if we have
+     allow_short_sign_mismatch then allow it.  */
+  if (allow_short_sign_mismatch
+      && TYPE_SIGN (*common_type) != TYPE_SIGN (new_type))
+    {
+      bool sign = TYPE_SIGN (*common_type) == UNSIGNED;
+      tree eq_type
+	= build_nonstandard_integer_type (TYPE_PRECISION (new_type),
+					  sign);
+
+      if (types_compatible_p (*common_type, eq_type))
+	return true;
+    }
+
   /* See if *COMMON_TYPE can hold all values of NEW_TYPE.  */
   if ((TYPE_PRECISION (new_type) < TYPE_PRECISION (*common_type))
       && (TYPE_UNSIGNED (new_type) || !TYPE_UNSIGNED (*common_type)))
@@ -532,6 +550,9 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
    to a type that (a) is narrower than the result of STMT_INFO and
    (b) can hold all leaf operand values.
 
+   If ALLOW_SHORT_SIGN_MISMATCH, the operands may differ in sign but not in
+   precision.
+
    Return 0 if STMT_INFO isn't such a tree, or if no such COMMON_TYPE
    exists.  */
 
@@ -539,7 +560,8 @@ static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
 		      tree_code widened_code, bool shift_p,
 		      unsigned int max_nops,
-		      vect_unpromoted_value *unprom, tree *common_type)
+		      vect_unpromoted_value *unprom, tree *common_type,
+		      bool allow_short_sign_mismatch = false)
 {
   /* Check for an integer operation with the right code.  */
   gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
@@ -600,7 +622,8 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
 		= vinfo->lookup_def (this_unprom->op);
 	      nops = vect_widened_op_tree (vinfo, def_stmt_info, code,
 					   widened_code, shift_p, max_nops,
-					   this_unprom, common_type);
+					   this_unprom, common_type,
+					   allow_short_sign_mismatch);
 	      if (nops == 0)
 		return 0;
 
@@ -617,7 +640,8 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
 	      if (i == 0)
 		*common_type = this_unprom->type;
 	      else if (!vect_joust_widened_type (type, this_unprom->type,
-						 common_type))
+						 common_type,
+						 allow_short_sign_mismatch))
 		return 0;
 	    }
 	}
@@ -888,21 +912,24 @@ vect_reassociating_reduction_p (vec_info *vinfo,
 
    Try to find the following pattern:
 
-     type x_t, y_t;
+     type1a x_t;
+     type1b y_t;
      TYPE1 prod;
      TYPE2 sum = init;
    loop:
      sum_0 = phi <init, sum_1>
      S1  x_t = ...
      S2  y_t = ...
-     S3  x_T = (TYPE1) x_t;
-     S4  y_T = (TYPE1) y_t;
+     S3  x_T = (TYPE3) x_t;
+     S4  y_T = (TYPE4) y_t;
      S5  prod = x_T * y_T;
      [S6  prod = (TYPE2) prod;  #optional]
      S7  sum_1 = prod + sum_0;
 
-   where 'TYPE1' is exactly double the size of type 'type', and 'TYPE2' is the
-   same size of 'TYPE1' or bigger. This is a special case of a reduction
+   where 'TYPE1' is exactly double the size of 'type1a' and 'type1b'.  The
+   sign of 'TYPE1' must match that of 'type1a' or 'type1b', but the signs of
+   'type1a' and 'type1b' may differ.  'TYPE2' is the same size as 'TYPE1' or
+   bigger and must have the same sign.  This is a special case of a reduction
    computation.
 
    Input:
@@ -939,15 +966,16 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
 
   /* Look for the following pattern
           DX = (TYPE1) X;
-          DY = (TYPE1) Y;
+	  DY = (TYPE2) Y;
           DPROD = DX * DY;
-          DDPROD = (TYPE2) DPROD;
+	  DDPROD = (TYPE3) DPROD;
           sum_1 = DDPROD + sum_0;
      In which
      - DX is double the size of X
      - DY is double the size of Y
      - DX, DY, DPROD all have the same type but the sign
-       between DX, DY and DPROD can differ.
+       between DX, DY and DPROD can differ. The sign of DPROD
+       is one of the signs of DX or DY.
      - sum is the same size of DPROD or bigger
      - sum has been recognized as a reduction variable.
 
@@ -986,14 +1014,41 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom0[2];
   if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR, WIDEN_MULT_EXPR,
-			     false, 2, unprom0, &half_type))
+			     false, 2, unprom0, &half_type, true))
     return NULL;
 
+  /* Check to see if there is a sign change happening in the operands of the
+     multiplication and pick the appropriate optab subtype.  */
+  enum optab_subtype subtype;
+  tree rhs_type1 = unprom0[0].type;
+  tree rhs_type2 = unprom0[1].type;
+  if (TYPE_SIGN (rhs_type1) == TYPE_SIGN (rhs_type2))
+    subtype = optab_default;
+  else if (TYPE_SIGN (rhs_type1) == SIGNED
+	   && TYPE_SIGN (rhs_type2) == UNSIGNED)
+    subtype = optab_signed_to_unsigned;
+  else if (TYPE_SIGN (rhs_type1) == UNSIGNED
+	   && TYPE_SIGN (rhs_type2) == SIGNED)
+    subtype = optab_unsigned_to_signed;
+  else
+    gcc_unreachable ();
+
+  /* For a sign-changing dot-product, check that the promoted type, if it is
+     unsigned, has at least the same precision as the final type of the
+     dot-product.  */
+  if (subtype != optab_default)
+    {
+      tree mult_type = TREE_TYPE (unprom_mult.op);
+      if (TYPE_SIGN (mult_type) == UNSIGNED
+	  && TYPE_PRECISION (mult_type) < TYPE_PRECISION (type))
+	return NULL;
+    }
+
   vect_pattern_detected ("vect_recog_dot_prod_pattern", last_stmt);
 
   tree half_vectype;
   if (!vect_supportable_direct_optab_p (vinfo, type, DOT_PROD_EXPR, half_type,
-					type_out, &half_vectype))
+					type_out, &half_vectype, subtype))
     return NULL;
 
   /* Get the inputs in the appropriate types.  */
@@ -1002,8 +1057,22 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
 		       unprom0, half_vectype);
 
   var = vect_recog_temp_ssa_var (type, NULL);
+
+  /* For a sign-changing dot-product the operation itself performs the sign
+     conversions, so pass the unpromoted operands directly.  */
+  tree mult_arg1, mult_arg2;
+  if (subtype == optab_default)
+    {
+      mult_arg1 = mult_oprnd[0];
+      mult_arg2 = mult_oprnd[1];
+    }
+  else
+    {
+      mult_arg1 = unprom0[0].op;
+      mult_arg2 = unprom0[1].op;
+    }
   pattern_stmt = gimple_build_assign (var, DOT_PROD_EXPR,
-				      mult_oprnd[0], mult_oprnd[1], oprnd1);
+				      mult_arg1, mult_arg2, oprnd1);
 
   return pattern_stmt;
 }


--k1lZvvs/B4yU6o8G--