From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <Tamar.Christina@arm.com>
Received: from EUR04-DB3-obe.outbound.protection.outlook.com
 (mail-eopbgr60066.outbound.protection.outlook.com [40.107.6.66])
 by sourceware.org (Postfix) with ESMTPS id 985C6384400A
 for <gcc-patches@gcc.gnu.org>; Wed,  2 Jun 2021 09:28:35 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 985C6384400A
Received: from PR2P264CA0002.FRAP264.PROD.OUTLOOK.COM (2603:10a6:101::14) by
 VI1PR08MB2653.eurprd08.prod.outlook.com (2603:10a6:802:1b::32) with Microsoft
 SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.4173.24; Wed, 2 Jun 2021 09:28:31 +0000
Received: from VE1EUR03FT020.eop-EUR03.prod.protection.outlook.com
 (2603:10a6:101:0:cafe::e9) by PR2P264CA0002.outlook.office365.com
 (2603:10a6:101::14) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4195.15 via Frontend
 Transport; Wed, 2 Jun 2021 09:28:31 +0000
X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123)
 smtp.mailfrom=arm.com; gcc.gnu.org; dkim=pass (signature was verified)
 header.d=armh.onmicrosoft.com;gcc.gnu.org; dmarc=pass action=none
 header.from=arm.com;
Received-SPF: Pass (protection.outlook.com: domain of arm.com designates
 63.35.35.123 as permitted sender) receiver=protection.outlook.com;
 client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com;
Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by
 VE1EUR03FT020.mail.protection.outlook.com (10.152.18.242) with
 Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.4150.30 via Frontend Transport; Wed, 2 Jun 2021 09:28:30 +0000
Received: ("Tessian outbound 836922dda4f1:v93");
 Wed, 02 Jun 2021 09:28:30 +0000
X-CR-MTA-TID: 64aa7808
Received: from 3ad08724d7b9.1
 by 64aa7808-outbound-1.mta.getcheckrecipient.com id
 60BBDF08-D1CB-445F-90CE-FFA45996304D.1; 
 Wed, 02 Jun 2021 09:28:20 +0000
Received: from EUR05-VI1-obe.outbound.protection.outlook.com
 by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 3ad08724d7b9.1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384);
 Wed, 02 Jun 2021 09:28:20 +0000
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
 b=FyIz5+YvDL111OaFtmTmBpOn+QES16NvKT/Cc7sMNXf1YvLgD6amWj3TF6JzzMkzo4HfwpYE0Va4kX82lwZOPg6JbjBDLG2kFcUyj7nMW14qu++vKkKxR3a0JnJsCF1rrWlAJiQ344bZDzlxSdKFS+rcOG7u7KptaTwIWjxOpP9YA82sg9ho7oU7/NvSdCMp4HrG7BNvkKJqs84IqzvitvB0nzk7XeUoWmQiwBu25R+PggzphMOIVKEWPMzK/j12L1ORgVeLDZaJqJiskUO35DUzTf6tN6u+qoxLN5nDmx67WDn7jK5DgeiV6mBe3BWVbZEBl5tWEE5YaEZj+3R02w==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; 
 s=arcselector9901;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=Qg9GY1bG6fHDan0VK8mg4itiMGMToiN4L+NHXSGmvoQ=;
 b=EIEDvue7onnN0dq3eCa+dDax4fii+GJMvGyZl3G0Htp9yxkogFWMVac3e3C1vf27Cl0HgsNjY8oHgqqiXiyqBvRU8CfZi/VJJG95A+m0/RmGJn6D0pby/R8AvgR3tTsCz4Hy1hB2WcgDF/s6D7gjWcPPxZZpxwT/o2LMhbhKbYvzSwznjMcYlkR6GtdKE6qW1EWZ8T+AYo+JgUf+pTXhGzQF4xWR3K3dyVOxqemGGqNyAUwCj6De+eTp834Wd032DFO1e7F2SJFLLW0zhgU9jL3wQVAhzq8tWpWpgzg7lMH3m5TAJHld4fGbJyde48g5KzSf3ztMyU7tXuHfMwF0tw==
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass
 smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass
 header.d=arm.com; arc=none
Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17)
 by VI1PR0801MB1727.eurprd08.prod.outlook.com (2603:10a6:800:5a::17)
 with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4173.24; Wed, 2 Jun
 2021 09:28:17 +0000
Received: from VI1PR08MB5325.eurprd08.prod.outlook.com
 ([fe80::f557:1fb2:62cc:5243]) by VI1PR08MB5325.eurprd08.prod.outlook.com
 ([fe80::f557:1fb2:62cc:5243%9]) with mapi id 15.20.4173.030; Wed, 2 Jun 2021
 09:28:17 +0000
From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Biener <rguenther@suse.de>
CC: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,
 Richard Sandiford <Richard.Sandiford@arm.com>
Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the
 sign for the multiplicant changes.
Thread-Topic: [PATCH 1/4]middle-end Vect: Add support for dot-product where
 the sign for the multiplicant changes.
Thread-Index: AQHXQdV4g2WFnSd+rEyE6iZSy1EmNqrX6WiAgAABSHCABLQJgIAAAwwAgAAbeYCAF6c1YIABMgmAgAsI3tA=
Date: Wed, 2 Jun 2021 09:28:17 +0000
Message-ID: <VI1PR08MB53256B0E3990410089DB3333FF3D9@VI1PR08MB5325.eurprd08.prod.outlook.com>
References: <patch-14433-tamar@arm.com>
 <nycvar.YFH.7.76.2105071337560.9200@zhemvz.fhfr.qr>
 <VI1PR08MB5325998C3057A611E268B740FF579@VI1PR08MB5325.eurprd08.prod.outlook.com>
 <nycvar.YFH.7.76.2105101325270.9200@zhemvz.fhfr.qr>
 <VI1PR08MB5325748CA53CAFBD3B39AFB1FF549@VI1PR08MB5325.eurprd08.prod.outlook.com>
 <nycvar.YFH.7.76.2105101522390.9200@zhemvz.fhfr.qr>
 <VI1PR08MB532566E22D4A25DCD37DBC99FF259@VI1PR08MB5325.eurprd08.prod.outlook.com>
 <7q3oonr2-92r0-8o9q-s27q-9r735s4n3s3@fhfr.qr>
In-Reply-To: <7q3oonr2-92r0-8o9q-s27q-9r735s4n3s3@fhfr.qr>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-ts-tracking-id: BFD89EEE4BA2BF409ACA24F80E3D6DF8.0
x-checkrecipientchecked: true
Authentication-Results-Original: suse.de; dkim=none (message not signed)
 header.d=none;suse.de; dmarc=none action=none header.from=arm.com;
x-originating-ip: [82.11.185.166]
x-ms-publictraffictype: Email
X-MS-Office365-Filtering-Correlation-Id: acd36a27-4a96-4f7d-eb03-08d925a8c939
x-ms-traffictypediagnostic: VI1PR0801MB1727:|VI1PR08MB2653:
x-ms-exchange-transport-forked: True
X-Microsoft-Antispam-PRVS: <VI1PR08MB265308431B037CCC7134267AFF3D9@VI1PR08MB2653.eurprd08.prod.outlook.com>
x-checkrecipientrouted: true
nodisclaimer: true
x-ms-oob-tlc-oobclassifiers: OLM:4941;OLM:4941;
X-MS-Exchange-SenderADCheck: 1
X-Microsoft-Antispam-Untrusted: BCL:0;
X-Microsoft-Antispam-Message-Info-Original: Q0TNx7xivh8Uv8be2bZQQRPvfyLYc7t46pFTF1n7uZ+K8ROuVq8q69qZfWYcwf9ENMP3B7sKPkp0Et5JgoitWYzLjLZJu9UAjCnbl4D/S1898hzfXJ7bSzOO0meP0c5XKcNowNqyvljE8tfkceLRW76/FkgPGK959kDo+3e5Vv7QQAMi1rRl/Sjaa/dDotGKm/Wpnp1jjDS5KBaoZNFtN7KqYeBDrRzLfc/lrNSfXZoYGHkn6D/8T3ES9enJgirJSj7/Cagp+5BooRtUIxtBAvSPZaoW5wwnIM4dco+8OZqWTHWo9hbf3W3Dg9QcNNX0TtxQwBvOWMQkEq3HEaJBeW4d/ohBpXUFiKtJAd4ARyB6ZwVHxvNf4pZToaDCuerMRVXv8jgT3asU1mKFprWQwY5HfEaa1ZWC+jnZe5c0iVtrFlHj1ff1zVEsFQ4Ab8ZgN2gaiXgWjQfhbAhqOiitAR16WspgBvpevnWbcqpsB7zgP279MMnvqVX8t3n0e1LSmSlUWzJJdKiC4g7/OOWRQE/eLJWa323vz618GiX1bBJIs5QWxUyHo5sIhKTXcp9bBhVxL9xmmt4iNrzSTSTDaMvSvsmvx/RZfOYJtWjsI/J4LdVzdyhQvT0+M+3jI0HZnNyTW1eVhaIIzBId8WEmPQ==
X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en;
 SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com;
 PTR:; CAT:NONE;
 SFS:(4636009)(366004)(136003)(376002)(396003)(346002)(39860400002)(30864003)(5660300002)(86362001)(26005)(8936002)(66446008)(66556008)(6506007)(53546011)(83380400001)(478600001)(71200400001)(66476007)(6916009)(76116006)(64756008)(55016002)(66946007)(52536014)(122000001)(9686003)(8676002)(186003)(7696005)(2906002)(4326008)(38100700002)(316002)(33656002)(54906003)(579004)(559001)(357404004);
 DIR:OUT; SFP:1101; 
x-ms-exchange-antispam-messagedata: =?us-ascii?Q?5jQa5Esvn9iY2xXOD6/Y6YTZpTwSYFF59zSw817zwo+ABEeXhv72qbaqTD0W?=
 =?us-ascii?Q?YgKS3/teX6OJN+WykQn3qufmo/Kr+ImXlTaas+P7zUDg4L31HXn27CJMos6C?=
 =?us-ascii?Q?ATQIFZzAKGGhKgj+QlCAGhr3o4LlweuekGyqm8N2xNdqdC9AURSI7WvwNHoj?=
 =?us-ascii?Q?ft402VbR0c3GjuZ7CrzsMLhbzGxzCP7CB9ksoz/7wJlP3TM+0LCuYThTS/p6?=
 =?us-ascii?Q?ZR7qaAZGgCQS1s7jbcSltcDcKbGh6gyv4yColjLRF7UxUbOaJsRXEfRXSZWV?=
 =?us-ascii?Q?AAF/ZqL9FkCdCMgdvHIoV+Tqq/La9M0+Kx3PxSF7qaVJasYKrDjayhhsGrkE?=
 =?us-ascii?Q?dtj9doI4MNyvRM1xclzVCA4pdm9xlWsFI0OCh6wY2mR55G9yqWhoUqonRRda?=
 =?us-ascii?Q?TixPojT2bs5C6kIsTT2iK4Pc7IUU336KaM6pRrdMZF2SrbtVeAFFoE3NLcH5?=
 =?us-ascii?Q?ZahzEZyBwtZEwQDu3ieZXgqbzgcntxU2V9srzm7xYZgQZJvFblH4LTQeV7dy?=
 =?us-ascii?Q?Etfhw8unVBclAcTlv/H4bEURxEtTyefE4bhN6Du7YVIu60lIZwJEwu/ASPCj?=
 =?us-ascii?Q?k6yi1cimKeRWQOFuyH4FvJ3Zz1xMGwn6Nf1oWlEUV6TDweL1GWC3YMbmHU/u?=
 =?us-ascii?Q?yoGcyTKQUtBywvh4AQ4OzJueUO9Tq7pBSbO01Nga9wYl/dKuBOKgSNDZKVzl?=
 =?us-ascii?Q?4lue/CTWnctvc5hJzyqcdasNUz+YTFlPHq+ojyw/B+SA5YsL8ulOdTzVP+/c?=
 =?us-ascii?Q?QG6NtMcUG7XL49sSldujsoXgAdXofmFpdXX2Stg1a1RHVnGT28vikRcbI4E1?=
 =?us-ascii?Q?XmnHnAgazQWvJj05m1bsQ5dpYhXUOw49wA2DQNEqc7RwFyDfZLKpHZqblChp?=
 =?us-ascii?Q?KbmNQe3y6mvRpU13l9spn0aBzgZmj4rdZdIL7AKpa5ylIERFihGJ7CGKApdD?=
 =?us-ascii?Q?TgejvT4nSb01VBtvNySMtDuZoiaShnn//xdkHIlUmaZM4MZZp7jMT5XoT8XF?=
 =?us-ascii?Q?5xiZsaYvB8zjCB0H5Ht5+TjDYbAIpAfmBJM8KLVjR5EYX70QaFglzcJmtQlD?=
 =?us-ascii?Q?dGRLWF5/3Hx7z0TRdo9bAV0tRbPHU7jIhFH/DKQuxuf63uIlMvPfIWNlNvBG?=
 =?us-ascii?Q?8zdds86SpmSRUGfMP1rpdiGEBZ9jVQOsWn/9IycF+CU7f8mdYUYogs17eUXO?=
 =?us-ascii?Q?JS6XQFthUutIqHz0fN9WI27IWxVd/2zKSNAfaHv/uetfTpyri/js6Au3l8Fp?=
 =?us-ascii?Q?EF2QyZ0alT6YrSj72ksRdUGpYtTzOeWYxjFUQKVjKqyWFtnFkyihE86Dtf5G?=
 =?us-ascii?Q?9DM=3D?=
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR0801MB1727
Original-Authentication-Results: suse.de; dkim=none (message not signed)
 header.d=none;suse.de; dmarc=none action=none header.from=arm.com;
X-EOPAttributedMessage: 0
X-MS-Exchange-Transport-CrossTenantHeadersStripped: VE1EUR03FT020.eop-EUR03.prod.protection.outlook.com
X-MS-Office365-Filtering-Correlation-Id-Prvs: 966659e0-6417-4d6f-5dc3-08d925a8c14e
X-Microsoft-Antispam: BCL:0;
X-Microsoft-Antispam-Message-Info: x8ROByOB3UdFcPFw4yxIHZXWVFK0ZzV0ptjKD7hISDMSoW6VPNkS7IdfBidbzhX7dnoGw9J55TXdT+7dt4OKZYgxTn3h9UBVoTUBtFR2ggmkwzYkhNUNt5S/WVzOBOrgIAANfLza2mljOqoqChGsGpCR5s0nDHnG1xzFfKT5WeGKKyulnwtspmBlAT0ihhA726lm8Ets65VC0z/Zb0XEqvv72A8jC8+Zkkwfc5pmCAYwOBwHEVTHxp/sLNimy8ehSTPvtdveQl7ABOL48KiD8ZXBUcN7HD6qOVvbkuENqYyxLTnOWPP0/IVb2Mj/+WPWIDfmW9AjMjIBljoi0OkNnbyYhWbukHLTEIHMXkiCKqGKRLUfu4OETa6CFtxkfnvPjl/LTH2pT8sSIE+6sJGtxDPFEBzsKfkUTcIrduMumLPNvrbAo1FBzxjKzXVisWyO/A2ibjJrlaNIomh1puybPvnIxvepopzzyW2isTJz6kVAVY3EEvF/JqjAIneO4QrjT1DAFd5qSelbX9kHGKJMm9dIJMqLmK7ij6rE9nCkRoLIRzqN3szEBi61dFz7wTPUDF6moGrKLVMYJhnoArofqTkCkFPB/jCEqrPLCHCckChnmhDNm5Bk/RAb/YuTu+Lgh2lsITvKeg+HdiyDOAuq90arS1yTEaThQK5kXe/8Eyv0hLyzvMthtU8mfsrcLpCc
X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:;
 IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com;
 PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE;
 SFS:(4636009)(36840700001)(46966006)(9686003)(53546011)(6506007)(8936002)(36860700001)(82310400003)(54906003)(4326008)(26005)(83380400001)(52536014)(7696005)(356005)(47076005)(86362001)(5660300002)(8676002)(70206006)(336012)(498600001)(70586007)(33656002)(30864003)(55016002)(6862004)(186003)(81166007)(2906002)(579004)(357404004);
 DIR:OUT; SFP:1101; 
X-OriginatorOrg: arm.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 02 Jun 2021 09:28:30.8676 (UTC)
X-MS-Exchange-CrossTenant-Network-Message-Id: acd36a27-4a96-4f7d-eb03-08d925a8c939
X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123];
 Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com]
X-MS-Exchange-CrossTenant-AuthSource: VE1EUR03FT020.eop-EUR03.prod.protection.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR08MB2653
X-Spam-Status: No, score=-14.1 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, GIT_PATCH_0, KAM_LOTSOFHASH, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2,
 SPF_HELO_PASS, SPF_PASS, TXREP,
 UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Wed, 02 Jun 2021 09:28:46 -0000

Ping,

Did you have any comments Richard S?

Otherwise I'll proceed with respinning according to Richi's comments.
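
For anyone picking the thread up here, below is a minimal scalar sketch of
the semantics discussed further down.  It is not part of the patch: the
names usdot_ref, udot_ref, sdot_ref, add_via_unsigned_short and usdot_demo
are hypothetical, and it assumes 8-bit char, 16-bit short and 32-bit int.
It only shows why the same input bytes give different sums under the
same-sign and mixed-sign readings, and why narrowing the product to an
unsigned short first changes the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference for the mixed-sign case: a[i] is
   sign-extended and b[i] is zero-extended before the multiply.  */
static int
usdot_ref (int acc, const signed char *a, const unsigned char *b, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    acc += (int) a[i] * (int) b[i];
  return acc;
}

/* Same-sign unsigned variant, for comparison only.  */
static int
udot_ref (int acc, const unsigned char *a, const unsigned char *b, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    acc += (int) a[i] * (int) b[i];
  return acc;
}

/* Same-sign signed variant, for comparison only.  */
static int
sdot_ref (int acc, const signed char *a, const signed char *b, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    acc += (int) a[i] * (int) b[i];
  return acc;
}

/* Effect of narrowing the product to an unsigned short before the add.  */
static int
add_via_unsigned_short (int acc, int product)
{
  unsigned short mult = (unsigned short) product;
  return acc + mult;
}

static void
usdot_demo (void)
{
  const signed char sa[] = { -1 };     /* bit pattern 0xff */
  const unsigned char ua[] = { 255 };  /* same bits, read as unsigned */
  const unsigned char ub[] = { 200 };  /* bit pattern 0xc8 */
  const signed char sb[] = { -56 };    /* same bits, read as signed */

  assert (usdot_ref (0, sa, ub, 1) == -200);   /* mixed sign */
  assert (udot_ref (0, ua, ub, 1) == 51000);   /* both zero-extended */
  assert (sdot_ref (0, sa, sb, 1) == 56);      /* both sign-extended */
  assert (add_via_unsigned_short (0, -200) == 65336);
}
```

Calling usdot_demo () runs the four checks; none of the same-sign results
matches the mixed-sign one for a negative signed input.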

Regards,
Tamar

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Wednesday, May 26, 2021 9:57 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard Sandiford
> <Richard.Sandiford@arm.com>
> Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product
> where the sign for the multiplicant changes.
>=20
> On Tue, 25 May 2021, Tamar Christina wrote:
>=20
> > Hi Richi,
> >
> > Here's a respun version of the patch.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
>=20
> index
> 7e3aae5f9c28a49feedc7cc66e8ac0d476b9f28a..13e405edd765dde704c64348d
> 2d0b3cd88f0af7c
> 100644
> --- a/gcc/tree-cfg.c
> +++ b/gcc/tree-cfg.c
> @@ -4421,7 +4421,9 @@ verify_gimple_assign_ternary (gassign *stmt)
>                   && !SCALAR_FLOAT_TYPE_P (rhs1_type))
>                  || (!INTEGRAL_TYPE_P (lhs_type)
>                      && !SCALAR_FLOAT_TYPE_P (lhs_type))))
> -           || !types_compatible_p (rhs1_type, rhs2_type)
> +           || (!types_compatible_p (rhs1_type, rhs2_type)
> +               && TYPE_SIGN (rhs1_type) =3D=3D TYPE_SIGN (rhs2_type)
> +               && TYPE_PRECISION (rhs1_type) !=3D TYPE_PRECISION
> (rhs2_type))
>=20
> I think this doesn't capture the constraints - instead please do
>=20
> -           || !types_compatible_p (rhs1_type, rhs2_type)
> +           /* rhs1_type and rhs2_type may differ in sign.  */
> +           || !tree_nop_conversion_p (rhs1_type, rhs2_type)
>=20
>=20
> +/* Determine the optab_subtype to use for the given CODE and STMT.  For
> +   most CODE this will be optab_vector, however for certain operations
> such as
> +   DOT_PROD_EXPR where the operation can have different signs for the
> operands
> we
> +   need to be able to pick the right optabs.  */
> +
> +static enum optab_subtype
> +vect_determine_dot_kind (tree_code code, stmt_vec_info stmt_vinfo)
>=20
> vect_determine_optab_subkind would be a better name.  'code' is
> redundant (or should better match stmt_vinfo->stmts code).  I wonder
> if it might be clearer to compute the subtype where we compute 'code'
> and the relation to stmt_info is obvious, I mean here:
>=20
>   /* 3. Check the operands of the operation.  The first operands are
> defined
>         inside the loop body. The last operand is the reduction variable,
>         which is defined by the loop-header-phi.  */
>=20
>   tree vectype_out =3D STMT_VINFO_VECTYPE (stmt_info);
>   STMT_VINFO_REDUC_VECTYPE (reduc_info) =3D vectype_out;
>   gassign *stmt =3D as_a <gassign *> (stmt_info->stmt);
>   enum tree_code code =3D gimple_assign_rhs_code (stmt);
>   bool lane_reduc_code_p
>     =3D (code =3D=3D DOT_PROD_EXPR || code =3D=3D WIDEN_SUM_EXPR || code =
=3D=3D
> SAD_EXPR);
>=20
> so just add
>=20
>   enum optab_subtype optab_query_kind =3D optab_vector;
>   if (code =3D=3D DOT_PROD_EXPR
>       && <sign test>)
>     optab_query_kind =3D optab_vector_mixed_sign;
>=20
> in this place and avoid adding the new function?
>=20
> I'm not too familiar with the pattern recog code, a 2nd eye would be
> preferred (Richard?), but
>=20
> +  /* Check if the mismatch is only in the sign and if we have
> +     allow_short_sign_mismatch then allow it.  */
> +  if (unprom_type
> +      && TYPE_SIGN (unprom_type) =3D=3D SIGNED
> +      && TYPE_SIGN (*common_type) !=3D TYPE_SIGN (new_type))
> +    {
> +      bool sign =3D TYPE_SIGN (*common_type) =3D=3D UNSIGNED;
> +      tree eq_type
> +       =3D build_nonstandard_integer_type (TYPE_PRECISION (new_type),
> +                                         sign);
> +
> +      if (types_compatible_p (*common_type, eq_type))
> +       return true;
> +    }
>=20
> looks somewhat complicated - is that equal to
>=20
>   if (unprom_type
>       && tree_nop_conversion_p (*common_type, new_type))
>     return true;
>=20
> ?  That is, *common_type and new_type only differ in sign?
>=20
> @@ -812,8 +844,13 @@ vect_convert_inputs (vec_info *vinfo,
> stmt_vec_info
> stmt_info, unsigned int n,
>        for (j =3D 0; j < i; ++j)
>         if (unprom[j].op =3D=3D unprom[i].op)
>           break;
> +      bool only_sign =3D allow_short_sign_mismatch
> +                      && TYPE_SIGN (type) !=3D TYPE_SIGN (unprom[i].type=
)
> +                      && TYPE_PRECISION (type) =3D=3D TYPE_PRECISION
> (unprom[i].type);
>=20
> this could use the same tree_nop_conversion_p predicate.
>=20
> Otherwise the patch looks good.
>=20
> Thanks,
> Richard.
>=20
>=20
>=20
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	* optabs.def (usdot_prod_optab): New.
> > 	* doc/md.texi: Document it and clarify other dot prod optabs.
> > 	* optabs-tree.h (enum optab_subtype): Add
> optab_vector_mixed_sign.
> > 	* optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
> > 	* optabs.c (expand_widen_pattern_expr): Likewise.
> > 	* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
> > 	* tree-vect-loop.c (vect_determine_dot_kind): New.
> > 	(vectorizable_reduction): Query dot-product kind.
> > 	* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take
> optional
> > 	optab subtype.
> > 	(vect_joust_widened_type, vect_widened_op_tree): Optionally
> ignore
> > 	mismatch types.
> > 	(vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> >
> >
> > > -----Original Message-----
> > > From: Richard Biener <rguenther@suse.de>
> > > Sent: Monday, May 10, 2021 2:29 PM
> > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>
> > > Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product
> > > where the sign for the multiplicant changes.
> > >
> > > On Mon, 10 May 2021, Tamar Christina wrote:
> > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Richard Biener <rguenther@suse.de>
> > > > > Sent: Monday, May 10, 2021 12:40 PM
> > > > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>
> > > > > Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-
> product
> > > > > where the sign for the multiplicant changes.
> > > > >
> > > > > On Fri, 7 May 2021, Tamar Christina wrote:
> > > > >
> > > > > > Hi Richi,
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Richard Biener <rguenther@suse.de>
> > > > > > > Sent: Friday, May 7, 2021 12:46 PM
> > > > > > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > > > > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>
> > > > > > > Subject: Re: [PATCH 1/4]middle-end Vect: Add support for
> > > > > > > dot-product where the sign for the multiplicant changes.
> > > > > > >
> > > > > > > On Wed, 5 May 2021, Tamar Christina wrote:
> > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > This patch adds support for a dot product where the sign of
> > > > > > > > the multiplication arguments differ. i.e. one is signed and
> > > > > > > > one is unsigned but the precisions are the same.
> > > > > > > >
> > > > > > > > #define N 480
> > > > > > > > #define SIGNEDNESS_1 unsigned
> > > > > > > > #define SIGNEDNESS_2 signed
> > > > > > > > #define SIGNEDNESS_3 signed
> > > > > > > > #define SIGNEDNESS_4 unsigned
> > > > > > > >
> > > > > > > > SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 in=
t
> > > > > > > > res,
> > > > > > > > SIGNEDNESS_3 char *restrict a,
> > > > > > > >    SIGNEDNESS_4 char *restrict b) {
> > > > > > > >   for (__INTPTR_TYPE__ i =3D 0; i < N; ++i)
> > > > > > > >     {
> > > > > > > >       int av =3D a[i];
> > > > > > > >       int bv =3D b[i];
> > > > > > > >       SIGNEDNESS_2 short mult =3D av * bv;
> > > > > > > >       res +=3D mult;
> > > > > > > >     }
> > > > > > > >   return res;
> > > > > > > > }
> > > > > > > >
> > > > > > > > The operations are performed as if the operands were
> extended
> > > > > > > > to a 32-bit
> > > > > > > value.
> > > > > > > > As such this operation isn't valid if there is an intermedi=
ate
> > > > > > > > conversion to an unsigned value. i.e.  if SIGNEDNESS_2 is
> unsigned.
> > > > > > > >
> > > > > > > > > Moreover, if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are
> > > > > > > > > flipped, the same optab is used but the operands are flipped
> > > > > > > > > in the optab expansion.
> > > > > > > >
> > > > > > > > To support this the patch extends the dot-product detection=
 to
> > > > > > > > optionally ignore operands with different signs and stores
> > > > > > > > this information in the optab subtype which is now made a
> bitfield.
> > > > > > > >
> > > > > > > > > The subtype can now additionally control which optab an EXPR
> > > > > > > > > can expand to.
> > > > > > > >
> > > > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no
> issues.
> > > > > > > >
> > > > > > > > Ok for master?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Tamar
> > > > > > > >
> > > > > > > > gcc/ChangeLog:
> > > > > > > >
> > > > > > > > 	* optabs.def (usdot_prod_optab): New.
> > > > > > > > 	* doc/md.texi: Document it.
> > > > > > > > 	* optabs-tree.c (optab_for_tree_code): Support
> > > usdot_prod_optab.
> > > > > > > > 	* optabs-tree.h (enum optab_subtype): Likewise.
> > > > > > > > 	* optabs.c (expand_widen_pattern_expr): Likewise.
> > > > > > > > 	* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
> > > > > > > > 	* tree-vect-loop.c (vect_determine_dot_kind): New.
> > > > > > > > 	(vectorizable_reduction): Query dot-product kind.
> > > > > > > > 	* tree-vect-patterns.c (vect_supportable_direct_optab_p):
> > > > > > > > Take
> > > > > > > optional
> > > > > > > > 	optab subtype.
> > > > > > > > 	(vect_joust_widened_type, vect_widened_op_tree):
> > > Optionally
> > > > > > > ignore
> > > > > > > > 	mismatch types.
> > > > > > > > 	(vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> > > > > > > >
> > > > > > > > --- inline copy of patch --
> > > > > > > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index
> > > > > > > >
> > > > > > >
> > > > >
> > >
> d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..baf20416e63745097825fc30fd
> > > > > > > f2
> > > > > > > > e66bc80d7d23 100644
> > > > > > > > --- a/gcc/doc/md.texi
> > > > > > > > +++ b/gcc/doc/md.texi
> > > > > > > > @@ -5440,11 +5440,13 @@ Like
> @samp{fold_left_plus_@var{m}},
> > > > > > > > but
> > > > > > > takes
> > > > > > > > an additional mask operand  @item @samp{sdot_prod@var{m}}
> > > > > @cindex
> > > > > > > > @code{udot_prod@var{m}} instruction pattern  @itemx
> > > > > > > > @samp{udot_prod@var{m}}
> > > > > > > > +@cindex @code{usdot_prod@var{m}} instruction pattern
> @itemx
> > > > > > > > +@samp{usdot_prod@var{m}}
> > > > > > > >  Compute the sum of the products of two signed/unsigned
> > > elements.
> > > > > > > > -Operand 1 and operand 2 are of the same mode. Their produc=
t,
> > > > > > > > which is of a -wider mode, is computed and added to operand=
 3.
> > > > > > > > Operand 3 is of a mode equal or -wider than the mode of the
> > > > > > > > product. The result is placed in operand 0, which -is of th=
e
> > > > > > > > same mode
> > > > > as operand 3.
> > > > > > > > +Operand 1 and operand 2 are of the same mode but may diffe=
r
> > > > > > > > +in
> > > > > signs.
> > > > > > > > +Their product, which is of a wider mode, is computed and
> > > > > > > > +added to
> > > > > > > operand 3.
> > > > > > > > +Operand 3 is of a mode equal or wider than the mode of the
> > > product.
> > > > > > > > +The result is placed in operand 0, which is of the same mo=
de
> > > > > > > > +as
> > > > > operand 3.
> > > > > > >
> > > > > > > This doesn't really say what the 's', 'u' and 'us' specify.
> > > > > > > Since we're doing a widen multiplication and then a non-widen=
ing
> > > > > > > addition we only need to know the effective sign of the
> > > > > > > multiplication so I think
> > > > > the existing 's' and 'u'
> > > > > > > are enough to cover all cases?
> > > > > >
> > > > > > > The existing 's' and 'u' enforce that both operands of the
> > > > > > > multiplication are of the same sign.  So for e.g. 'u' both operands
> > > > > > > must be unsigned.
> > > > > >
> > > > > > > In the `us` case one can be signed and one unsigned. Operationally
> > > > > > > this sign extends the signed value to the wider type, zero extends
> > > > > > > the unsigned value, and then converts the sign-extended value to
> > > > > > > unsigned to perform an unsigned multiplication, conforming to the C
> > > > > > > promotion rules.
> > > > > >
> > > > > > TL;DR; Without a new optab I can't tell during expansion which
> > > > > > semantic the operation had at the gimple/C level as modes don't
> carry
> > > signs.
> > > > > >
> > > > > > Long version:
> > > > > >
> > > > > > The problem with using the existing patterns, because of their
> > > > > > enforcement of `av` and `bv` being the same sign is that we can=
't
> > > > > > remove the explicit sign extensions, but the multiplication mus=
t
> > > > > > be done on
> > > > > the sign/zero extended char input in the same sign.
> > > > > >
> > > > > > > Which means (unless I am mistaken) to get the correct result, you
> > > > > > > can use neither `udot` nor `sdot`, as semantically these would
> > > > > > > zero or sign extend both operands from char to int to perform the
> > > > > > > multiplication in the same sign.  Whereas in this case, one
> > > > > > > parameter is zero extended and one parameter is sign extended, and
> > > > > > > the result is always an unsigned number.
> > > > > >
> > > > > > So basically
> > > > > >
> > > > > > > udot<unsigned c, unsigned a, unsigned b> =3D=3D
> > > > > > >    c =3D zero-ext (a) * zero-ext (b)
> > > > > > > sdot<signed c, signed a, signed b> =3D=3D
> > > > > > >    c =3D sign-ext (a) * sign-ext (b)
> > > > > > > usdot<unsigned c, unsigned a, signed b> =3D=3D
> > > > > > >    c =3D ((unsigned-conv) sign-ext (a)) * zero-ext (b)
> > > > > >
> > > > > > So semantically the existing optabs won't fit here. udot would
> > > > > > internally promote to unsigned types before the multiplication =
so
> > > > > > the result of the multiplication would be wrong.  sdot would
> > > > > > promote both to
> > > > > signed and do signed multiplication, so the result is also wrong.
> > > > > >
> > > > > > Now if I relax the constraint on the signs of udot and sdot the=
re
> > > > > > are two
> > > > > problems:
> > > > > > RTL Modes don't contain signs.  So a target can't tell me how t=
he
> > > > > > operands
> > > > > will be promoted.
> > > > > > So:
> > > > > >
> > > > > > 1) I can't really check which semantics the target will adhere =
to
> > > > > > on
> > > > > expansion.
> > > > > > 2) at expand time I have no way to differentiate between the tw=
o
> > > > > instructions variants, given just modes
> > > > > >      I can't tell whether I expand to the normal dot-product or
> > > > > > the new
> > > > > instruction.
> > > > >
> > > > > Ah, OK.  Indeed with such a weird instruction the new variant mak=
es
> > > sense.
> > > > > Still can you please amend the optab documentation to say which
> > > > > operand is unsigned and which is signed?  Just 'may differ in sig=
ns'
> > > > > is bad.
> > > >
> > > > Sure, will expand on it.
> > > >
> > > > >
> > > > > Since the multiplication is commutative I wonder why you need to
> > > > > handle both signed_to_unsigned and unsigned_to_signed - we
> should
> > > > > just enforce a canonical order (like the optab does).
> > > >
> > > > Sure, I thought it would have been better to change the order at
> > > > expand time, but can do so at detection time.
> > > >
> > > > > I also think it's a particular bad fit for the bad
> > > > > optab_for_tree_code API - would any of that improve when using a
> > > > > direct internal function here?
> > > >
> > > > Somewhat, but this has considerable knock on effects, e.g. currentl=
y
> > > > DOT_PROD is treated as a widening operation and so is handled by
> > > > supportable_widening_operation which does not support calls. There'=
s
> a
> > > > significant number of places which work on the tree EXPR (including
> > > constant folding) which all need to be changed.
> > > >
> > > > > In particular all the changes around optab_subtype look like they
> > > > > make a bad API worse ... at least a single optab_vector_mixed_sig=
n
> > > > > should suffice here, no need to make it a flags kind.
> > > >
> > > > The reason I did so is because depending on where the query is done=
 it
> > > > does use different subtypes currently.  During detection it uses
> > > > optab_default, and during vectorization optab_vector.  For this
> > > > instruction this difference doesn't seem to be used, but did not wa=
nt to
> > > lose this information in case something depended on it.
> > > >
> > > > But can make it just one.
> > > >
> > > > >
> > > > > +  /* If we have a sign changing dot product we need to check tha=
t
> the
> > > > > +     promoted type if unsigned has at least the same precision a=
s
> > > > > + the
> > > > > final
> > > > > +     type of the dot-product.  */
> > > > > +  if (subtype !=3D optab_default)
> > > > > +    {
> > > > > +      tree mult_type =3D TREE_TYPE (unprom_mult.op);
> > > > > +      if (TYPE_SIGN (mult_type) =3D=3D UNSIGNED
> > > > > +         && TYPE_PRECISION (mult_type) < TYPE_PRECISION (type))
> > > > > +       return NULL;
> > > > > +    }
> > > > >
> > > > > I don't understand this - how do we ever arrive at a result with =
less
> > > precision?
> > > >
> > > > The user could have manually truncated the results, i.e. in the
> > > > detection code notice `mult`
> > > >
> > > >       int av =3D a[i];
> > > >       int bv =3D b[i];
> > > >       SIGNEDNESS_2 short mult =3D av * bv;
> > > >       res +=3D mult;
> > > >
> > > > which is a short, so it's manually truncating the multiplication wh=
ich
> > > > is done as int by the instruction. If `mult` is unsigned then it wi=
ll
> > > > truncate the result if the signed input to usdot was negative, unle=
ss
> > > > the Intermediate calculation is of the same precision as the
> > > > instruction. i.e. if mult is unsigned int then there's no truncatio=
n
> > > > going on, it's casting from int to unsigned int so it's safe to use
> > > > then as the instruction does the same thing internally.
> > >
> > > It looks to me that we simply should only ever allow sign changes from
> > > multiplication result to the sum.  At least your example above is not
> > > special to mixed sign multiplications, no?
> > >
> > > > > And why's this not an issue for signed multiplication?
> > > >
> > > > It is, but in that case it's handled by the type jousting, which
> > > > doesn't allow the type mismatch. i.e.
> > > >
> > > > #define SIGNEDNESS_1 unsigned
> > > > #define SIGNEDNESS_2 unsigned
> > > > #define SIGNEDNESS_3 signed
> > > > #define SIGNEDNESS_4 signed
> > > >
> > > > SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
> > > > SIGNEDNESS_3 char *restrict a,
> > > >    SIGNEDNESS_4 char *restrict b)
> > > > {
> > > >   for (__INTPTR_TYPE__ i =3D 0; i < N; ++i)
> > > >     {
> > > >       int av =3D a[i];
> > > >       int bv =3D b[i];
> > > >       SIGNEDNESS_2 short mult =3D av * bv;
> > > >       res +=3D mult;
> > > >     }
> > > >   return res;
> > > > }
> > > >
> > > > > Is also not detected as a dot product.  By adding the carve out to the
> > > > > widen multiplication detection it now allows this case through so I
> > > > > handle it in the detection code.  Thinking about it now, it seems more
> > > > > logical to add this case handling inside the type jousting code as I
> > > > > don't think it's ever something you'd want.
> > >
> > > > Yeah, I think we only need to look through sign changes on the
> > > > multiplication result.
> > >
> > > > > Also...
> > > > >
> > > > > +  /* If we have a sign changing dot-product the dot-product itse=
lf
> > > > > + does
> > > > > any
> > > > > +     sign conversions, so consume the type and use the unpromote=
d
> > > types.
> > > > > */
> > > > > +  tree mult_arg1, mult_arg2;
> > > > > +  if (subtype =3D=3D optab_default)
> > > > > +    {
> > > > > +      mult_arg1 =3D mult_oprnd[0];
> > > > > +      mult_arg2 =3D mult_oprnd[1];
> > > > > +    }
> > > > > +  else
> > > > > +    {
> > > > > +      mult_arg1 =3D unprom0[0].op;
> > > > > +      mult_arg2 =3D unprom0[1].op;
> > > > > +    }
> > > > >    pattern_stmt =3D gimple_build_assign (var, DOT_PROD_EXPR,
> > > > > -                                     mult_oprnd[0], mult_oprnd[1=
],
> > > > > oprnd1);
> > > > > +                                     mult_arg1, mult_arg2, oprnd=
1);
> > > > >
> > > > > > I thought DOT_PROD always performs the promotion.  Maybe mult_oprnd
> > > > > > and unprom0 are just misnamed here?
> > > >
> > > > > Somewhat.  In a normal dot-product the signs of the multiplication
> > > > > operands are the same as those of the "unpromoted" types, so after
> > > > > vect_convert_input these two types are the same.
> > > >
> > > > > However, because the sign changes here, and to maintain the semantics
> > > > > of the C code, there's an extra conversion to get the arguments in
> > > > > the same sign.  That needs to be stripped before being given to the
> > > > > instruction, which does the conversion internally.
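As a rough scalar model of what "does the conversion internally" means, the sketch below takes the narrow operands with their own signs and widens each one itself.  It is an illustration only, not taken from the patch: the function name is hypothetical, and it assumes the target instruction multiplies four unsigned 8-bit elements by four signed 8-bit elements and accumulates into a signed 32-bit lane.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane of a mixed-sign dot
   product.  U holds the unsigned narrow inputs and S the signed ones;
   each element is widened according to its own sign before the multiply,
   so the caller passes the unpromoted values and performs no sign
   conversion beforehand.  */
static int32_t
usdot_lane_model (int32_t acc, const uint8_t u[4], const int8_t s[4])
{
  for (int i = 0; i < 4; ++i)
    acc += (int32_t) u[i] * (int32_t) s[i];
  return acc;
}
```

For u = { 255, 1, 2, 3 } and s = { -1, 2, -3, 4 } starting from an accumulator of 0, the model returns -255 + 2 - 6 + 12 = -247.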
> > >
> > > > Yes, but then why's that not done by the detection code?  That is, does it
> > > > (mis-)handle the (int)short_a * (int)(unsigned short)short_b where we'd
> > > > want the mixed-sign handling and not strip the unsigned short conversion
> > > > from short_b?
> > >
> > > Richard.
> > >
> > > >
> > > > Regards,
> > > > Tamar
> > > >
> > > > >
> > > > > Richard.
> > > > >
> > > > > > Regards,
> > > > > > Tamar
> > > > > >
> > > > > > >
> > > > > > > > The tree.def docs say the sum is also possibly widening but I
> > > > > > > > don't see this covered by the optab so we should eventually
> > > > > > > > remove this feature from the tree side.  In fact the tree-cfg.c
> > > > > > > > verifier requires the addition to be not widening - thus only
> > > > > > > > tree.def needs adjustment.
> > > > > > >
> > > > > > > >  @cindex @code{ssad@var{m}} instruction pattern  @item
> > > > > > > > @samp{ssad@var{m}} diff --git a/gcc/optabs-tree.h
> > > > > > > > b/gcc/optabs-tree.h index
> > > > > > > >
> > > > > > >
> > > > >
> > >
> c3aaa1a416991e856d3e24da45968a92ebada82c..ebc23ac86fe99057f375781c2f
> > > > > > > 19
> > > > > > > > 90e0548ba08d 100644
> > > > > > > > --- a/gcc/optabs-tree.h
> > > > > > > > +++ b/gcc/optabs-tree.h
> > > > > > > > @@ -27,11 +27,29 @@ along with GCC; see the file COPYING3.
> If
> > > > > > > > not
> > > > > see
> > > > > > > >     shift amount vs. machines that take a vector for the sh=
ift
> amount.
> > > > > > > > */  enum optab_subtype  {
> > > > > > > > -  optab_default,
> > > > > > > > -  optab_scalar,
> > > > > > > > -  optab_vector
> > > > > > > > +  optab_default =3D 1 << 0,
> > > > > > > > +  optab_scalar =3D 1 << 1,
> > > > > > > > +  optab_vector =3D 1 << 2,
> > > > > > > > +  optab_signed_to_unsigned =3D 1 << 3,
> > > > > > > > + optab_unsigned_to_signed =3D
> > > > > > > > + 1 << 4
> > > > > > > >  };
> > > > > > > >
> > > > > > > > +/* Override the OrEqual-operator so we can use
> optab_subtype
> > > > > > > > +as a bit flag.  */ inline enum optab_subtype& operator |=
=3D
> > > > > > > > +(enum
> > > > > > > optab_subtype&
> > > > > > > > +a, enum optab_subtype b) {
> > > > > > > > +    return a =3D static_cast<optab_subtype>(static_cast<in=
t>(a)
> > > > > > > > +					  | static_cast<int>(b)); }
> > > > > > > > +
> > > > > > > > +/* Override the Or-operator so we can use optab_subtype as=
 a
> > > > > > > > +bit flag.  */ inline enum optab_subtype operator | (enum
> > > > > > > > +optab_subtype a, enum optab_subtype b) {
> > > > > > > > +    return static_cast<optab_subtype>(static_cast<int>(a)
> > > > > > > > +				      | static_cast<int>(b)); }
> > > > > > > > +
> > > > > > > >  /* Return the optab used for computing the given operation=
 on
> > > > > > > > the type
> > > > > > > given by
> > > > > > > >     the second argument.  The third argument distinguishes
> > > > > > > > between the
> > > > > > > types of
> > > > > > > >     vector shifts and rotates.  */ diff --git
> > > > > > > > a/gcc/optabs-tree.c b/gcc/optabs-tree.c index
> > > > > > > >
> > > > > > >
> > > > >
> > >
> 95ffe397c23e80c105afea52e9d47216bf52f55a..2f60004545defc53182e004eea
> > > > > > > 1e
> > > > > > > > 5c22b7453072 100644
> > > > > > > > --- a/gcc/optabs-tree.c
> > > > > > > > +++ b/gcc/optabs-tree.c
> > > > > > > > @@ -127,7 +127,17 @@ optab_for_tree_code (enum tree_code
> > > code,
> > > > > > > const_tree type,
> > > > > > > >        return TYPE_UNSIGNED (type) ? usum_widen_optab :
> > > > > > > > ssum_widen_optab;
> > > > > > > >
> > > > > > > >      case DOT_PROD_EXPR:
> > > > > > > > -      return TYPE_UNSIGNED (type) ? udot_prod_optab :
> > > > > sdot_prod_optab;
> > > > > > > > +      {
> > > > > > > > +	gcc_assert (subtype & optab_default
> > > > > > > > +		    || subtype & optab_vector
> > > > > > > > +		    || subtype & optab_signed_to_unsigned
> > > > > > > > +		    || subtype & optab_unsigned_to_signed);
> > > > > > > > +
> > > > > > > > +	if (subtype & (optab_unsigned_to_signed |
> > > > > > > optab_signed_to_unsigned))
> > > > > > > > +	  return usdot_prod_optab;
> > > > > > > > +
> > > > > > > > +	return (TYPE_UNSIGNED (type) ? udot_prod_optab :
> > > > > > > sdot_prod_optab);
> > > > > > > > +      }
> > > > > > > >
> > > > > > > >      case SAD_EXPR:
> > > > > > > >        return TYPE_UNSIGNED (type) ? usad_optab : ssad_opta=
b;
> > > > > > > > diff --git a/gcc/optabs.c b/gcc/optabs.c index
> > > > > > > >
> > > > > > >
> > > > >
> > >
> f4614a394587787293dc8b680a38901f7906f61c..2e18b76de1412eab71971753ac
> > > > > > > 67
> > > > > > > > 8597c0d00098 100644
> > > > > > > > --- a/gcc/optabs.c
> > > > > > > > +++ b/gcc/optabs.c
> > > > > > > > @@ -262,6 +262,11 @@ expand_widen_pattern_expr (sepops
> ops,
> > > > > > > > rtx op0,
> > > > > > > rtx op1, rtx wide_op,
> > > > > > > >    bool sbool =3D false;
> > > > > > > >
> > > > > > > >    oprnd0 =3D ops->op0;
> > > > > > > > +  if (nops >=3D 2)
> > > > > > > > +    oprnd1 =3D ops->op1;
> > > > > > > > +  if (nops >=3D 3)
> > > > > > > > +    oprnd2 =3D ops->op2;
> > > > > > > > +
> > > > > > > >    tmode0 =3D TYPE_MODE (TREE_TYPE (oprnd0));
> > > > > > > >    if (ops->code =3D=3D VEC_UNPACK_FIX_TRUNC_HI_EXPR
> > > > > > > >        || ops->code =3D=3D VEC_UNPACK_FIX_TRUNC_LO_EXPR) @@=
 -
> > > 285,6
> > > > > > > +290,27
> > > > > > > > @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1,
> > > > > > > > rtx
> > > > > > > wide_op,
> > > > > > > >  	   ? vec_unpacks_sbool_hi_optab :
> > > vec_unpacks_sbool_lo_optab);
> > > > > > > >        sbool =3D true;
> > > > > > > >      }
> > > > > > > > +  else if (ops->code =3D=3D DOT_PROD_EXPR)
> > > > > > > > +    {
> > > > > > > > +      enum optab_subtype subtype =3D optab_default;
> > > > > > > > +      signop sign1 =3D TYPE_SIGN (TREE_TYPE (oprnd0));
> > > > > > > > +      signop sign2 =3D TYPE_SIGN (TREE_TYPE (oprnd1));
> > > > > > > > +      if (sign1 =3D=3D sign2)
> > > > > > > > +	;
> > > > > > > > +      else if (sign1 =3D=3D SIGNED && sign2 =3D=3D UNSIGNE=
D)
> > > > > > > > +	{
> > > > > > > > +	  subtype |=3D optab_signed_to_unsigned;
> > > > > > > > +	  /* Same as optab_unsigned_to_signed but flip the
> > > operands.  */
> > > > > > > > +	  std::swap (op0, op1);
> > > > > > > > +	}
> > > > > > > > +      else if (sign1 =3D=3D UNSIGNED && sign2 =3D=3D SIGNE=
D)
> > > > > > > > +	subtype |=3D optab_unsigned_to_signed;
> > > > > > > > +      else
> > > > > > > > +	gcc_unreachable ();
> > > > > > > > +
> > > > > > > > +      widen_pattern_optab
> > > > > > > > +	=3D optab_for_tree_code (ops->code, TREE_TYPE (oprnd0),
> > > subtype);
> > > > > > > > +    }
> > > > > > > >    else
> > > > > > > >      widen_pattern_optab
> > > > > > > >        =3D optab_for_tree_code (ops->code, TREE_TYPE (oprnd=
0),
> > > > > > > > optab_default); @@ -298,10 +324,7 @@
> > > expand_widen_pattern_expr
> > > > > > > (sepops ops, rtx op0, rtx op1, rtx wide_op,
> > > > > > > >    gcc_assert (icode !=3D CODE_FOR_nothing);
> > > > > > > >
> > > > > > > >    if (nops >=3D 2)
> > > > > > > > -    {
> > > > > > > > -      oprnd1 =3D ops->op1;
> > > > > > > > -      tmode1 =3D TYPE_MODE (TREE_TYPE (oprnd1));
> > > > > > > > -    }
> > > > > > > > +    tmode1 =3D TYPE_MODE (TREE_TYPE (oprnd1));
> > > > > > > >    else if (sbool)
> > > > > > > >      {
> > > > > > > >        nops =3D 2;
> > > > > > > > @@ -316,7 +339,6 @@ expand_widen_pattern_expr (sepops
> ops,
> > > rtx
> > > > > > > > op0,
> > > > > > > rtx op1, rtx wide_op,
> > > > > > > >      {
> > > > > > > >        gcc_assert (tmode1 =3D=3D tmode0);
> > > > > > > >        gcc_assert (op1);
> > > > > > > > -      oprnd2 =3D ops->op2;
> > > > > > > >        wmode =3D TYPE_MODE (TREE_TYPE (oprnd2));
> > > > > > > >      }
> > > > > > > >
> > > > > > > > diff --git a/gcc/optabs.def b/gcc/optabs.def index
> > > > > > > >
> > > > > > >
> > > > >
> > >
> b192a9d070b8aa72e5676b2eaa020b5bdd7ffcc8..f470c2168378cec840edf7fbd
> > > > > > > b7c
> > > > > > > > 18615baae928 100644
> > > > > > > > --- a/gcc/optabs.def
> > > > > > > > +++ b/gcc/optabs.def
> > > > > > > > @@ -352,6 +352,7 @@ OPTAB_D (uavg_ceil_optab,
> "uavg$a3_ceil")
> > > > > > > OPTAB_D
> > > > > > > > (sdot_prod_optab, "sdot_prod$I$a")  OPTAB_D
> > > (ssum_widen_optab,
> > > > > > > > "widen_ssum$I$a3")  OPTAB_D (udot_prod_optab,
> > > "udot_prod$I$a")
> > > > > > > > +OPTAB_D (usdot_prod_optab, "usdot_prod$I$a")
> > > > > > > >  OPTAB_D (usum_widen_optab, "widen_usum$I$a3")
> OPTAB_D
> > > > > > > (usad_optab,
> > > > > > > > "usad$I$a")  OPTAB_D (ssad_optab, "ssad$I$a") diff --git
> > > > > > > > a/gcc/tree-cfg.c b/gcc/tree-cfg.c index
> > > > > > > >
> > > > > > >
> > > > >
> > >
> 7e3aae5f9c28a49feedc7cc66e8ac0d476b9f28a..58b55bb648ad97d514f1fa18bb
> > > > > > > 00
> > > > > > > > 808fd2678b42 100644
> > > > > > > > --- a/gcc/tree-cfg.c
> > > > > > > > +++ b/gcc/tree-cfg.c
> > > > > > > > @@ -4421,7 +4421,8 @@ verify_gimple_assign_ternary (gassign
> > > *stmt)
> > > > > > > >  		  && !SCALAR_FLOAT_TYPE_P (rhs1_type))
> > > > > > > >  		 || (!INTEGRAL_TYPE_P (lhs_type)
> > > > > > > >  		     && !SCALAR_FLOAT_TYPE_P (lhs_type))))
> > > > > > > > -	    || !types_compatible_p (rhs1_type, rhs2_type)
> > > > > > > > +	    || (!types_compatible_p (rhs1_type, rhs2_type)
> > > > > > > > +		&& TYPE_SIGN (rhs1_type) =3D=3D TYPE_SIGN
> > > (rhs2_type))
> > > > > > >
> > > > > > > That's not restrictive enough.  I suggest you use
> > > > > > >
> > > > > > >             && element_precision (rhs1_type) !=3D
> > > > > > > element_precision
> > > > > > > (rhs2_type)
> > > > > > >
> > > > > > > instead.
> > > > > > >
> > > > > > > > As said, I'm not sure all the changes in this patch are required.
> > > > > > >
> > > > > > > Please elaborate.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Richard.
> > > > > > >
> > > > > > > >  	    || !useless_type_conversion_p (lhs_type, rhs3_type)
> > > > > > > >  	    || maybe_lt (GET_MODE_SIZE (element_mode
> > > (rhs3_type)),
> > > > > > > >  			 2 * GET_MODE_SIZE (element_mode
> > > (rhs1_type))))
> > > > > > > diff --git
> > > > > > > > a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index
> > > > > > > >
> > > > > > >
> > > > >
> > >
> 93fa2928e001c154bd4a9a73ac1dbbbf73c456df..cb8f5fbb6abca181c4171194d1
> > > > > > > 9f
> > > > > > > > ec29ec6e4176 100644
> > > > > > > > --- a/gcc/tree-vect-loop.c
> > > > > > > > +++ b/gcc/tree-vect-loop.c
> > > > > > > > @@ -6401,6 +6401,33 @@ build_vect_cond_expr (enum
> tree_code
> > > > > code,
> > > > > > > tree vop[3], tree mask,
> > > > > > > >      }
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +/* Determine the optab_subtype to use for the given CODE
> and
> > > STMT.
> > > > > > > For
> > > > > > > > +   most CODE this will be optab_vector, however for certai=
n
> > > > > > > > + operations
> > > > > > > such as
> > > > > > > > +   DOT_PROD_EXPR where the operation can different signs f=
or
> > > > > > > > + the
> > > > > > > operands we
> > > > > > > > +   need to be able to pick the right optabs.  */
> > > > > > > > +
> > > > > > > > +static enum optab_subtype
> > > > > > > > +vect_determine_dot_kind (tree_code code, stmt_vec_info
> > > > > > > > +stmt_vinfo) {
> > > > > > > > +  enum optab_subtype subtype =3D optab_vector;
> > > > > > > > +  switch (code)
> > > > > > > > +    {
> > > > > > > > +      case DOT_PROD_EXPR:
> > > > > > > > +	{
> > > > > > > > +	  gassign *stmt =3D as_a <gassign *> (STMT_VINFO_STMT
> > > (stmt_vinfo));
> > > > > > > > +	  signop rhs1_sign =3D TYPE_SIGN (TREE_TYPE
> > > > > > > > +(gimple_assign_rhs1
> > > > > > > (stmt)));
> > > > > > > > +	  signop rhs2_sign =3D TYPE_SIGN (TREE_TYPE
> > > > > > > > +(gimple_assign_rhs2
> > > > > > > (stmt)));
> > > > > > > > +	  if (rhs1_sign !=3D rhs2_sign)
> > > > > > > > +	    subtype |=3D optab_unsigned_to_signed;
> > > > > > > > +	  break;
> > > > > > > > +	}
> > > > > > > > +      default:
> > > > > > > > +	break;
> > > > > > > > +    }
> > > > > > > > +
> > > > > > > > +  return subtype;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >  /* Function vectorizable_reduction.
> > > > > > > >
> > > > > > > >     Check if STMT_INFO performs a reduction operation that =
can
> > > > > > > > be
> > > > > > > vectorized.
> > > > > > > > @@ -7189,7 +7216,8 @@ vectorizable_reduction (loop_vec_info
> > > > > > > loop_vinfo,
> > > > > > > >        bool ok =3D true;
> > > > > > > >
> > > > > > > >        /* 4.1. check support for the operation in the loop =
 */
> > > > > > > > -      optab optab =3D optab_for_tree_code (code, vectype_i=
n,
> > > > > optab_vector);
> > > > > > > > +      enum optab_subtype subtype =3D vect_determine_dot_ki=
nd
> > > > > > > > + (code,
> > > > > > > stmt_info);
> > > > > > > > +      optab optab =3D optab_for_tree_code (code, vectype_i=
n,
> > > > > > > > + subtype);
> > > > > > > >        if (!optab)
> > > > > > > >  	{
> > > > > > > >  	  if (dump_enabled_p ())
> > > > > > > > diff --git a/gcc/tree-vect-patterns.c
> > > > > > > > b/gcc/tree-vect-patterns.c index
> > > > > > > >
> > > > > > >
> > > > >
> > >
> 441d6cd28c4eaded7abd756164890dbcffd2f3b8..943c001fb13777b4d1513841f
> > > > > > > a84
> > > > > > > > 942316846d5e 100644
> > > > > > > > --- a/gcc/tree-vect-patterns.c
> > > > > > > > +++ b/gcc/tree-vect-patterns.c
> > > > > > > > @@ -201,7 +201,8 @@ vect_get_external_def_edge (vec_info
> > > > > > > > *vinfo, tree
> > > > > > > > var)  static bool  vect_supportable_direct_optab_p (vec_inf=
o
> > > > > > > > *vinfo, tree otype, tree_code code,
> > > > > > > >  				 tree itype, tree *vecotype_out,
> > > > > > > > -				 tree *vecitype_out =3D NULL)
> > > > > > > > +				 tree *vecitype_out =3D NULL,
> > > > > > > > +				 enum optab_subtype subtype =3D
> > > > > > > optab_default)
> > > > > > > >  {
> > > > > > > >    tree vecitype =3D get_vectype_for_scalar_type (vinfo, it=
ype);
> > > > > > > >    if (!vecitype)
> > > > > > > > @@ -211,7 +212,7 @@ vect_supportable_direct_optab_p
> (vec_info
> > > > > > > > *vinfo,
> > > > > > > tree otype, tree_code code,
> > > > > > > >    if (!vecotype)
> > > > > > > >      return false;
> > > > > > > >
> > > > > > > > -  optab optab =3D optab_for_tree_code (code, vecitype,
> > > > > > > > optab_default);
> > > > > > > > +  optab optab =3D optab_for_tree_code (code, vecitype,
> > > > > > > > + subtype);
> > > > > > > >    if (!optab)
> > > > > > > >      return false;
> > > > > > > >
> > > > > > > > @@ -487,14 +488,31 @@ vect_joust_widened_integer (tree
> type,
> > > > > > > > bool shift_p, tree op,  }
> > > > > > > >
> > > > > > > >  /* Return true if the common supertype of NEW_TYPE and
> > > > > > > *COMMON_TYPE
> > > > > > > > -   is narrower than type, storing the supertype in
> *COMMON_TYPE
> > > if
> > > > > so.
> > > > > > > */
> > > > > > > > +   is narrower than type, storing the supertype in
> > > > > > > > + *COMMON_TYPE if
> > > > > so.
> > > > > > > > +   If ALLOW_SHORT_SIGN_MISMATCH then accept that
> > > > > *COMMON_TYPE
> > > > > > > and NEW_TYPE
> > > > > > > > +   may be of different signs but equal precision.   */
> > > > > > > >
> > > > > > > >  static bool
> > > > > > > > -vect_joust_widened_type (tree type, tree new_type, tree
> > > > > > > *common_type)
> > > > > > > > +vect_joust_widened_type (tree type, tree new_type, tree
> > > > > > > *common_type,
> > > > > > > > +			 bool allow_short_sign_mismatch =3D false)
> > > > > > > >  {
> > > > > > > >    if (types_compatible_p (*common_type, new_type))
> > > > > > > >      return true;
> > > > > > > >
> > > > > > > > +  /* Check if the mismatch is only in the sign and if we h=
ave
> > > > > > > > +     allow_short_sign_mismatch then allow it.  */
> > > > > > > > +  if (allow_short_sign_mismatch
> > > > > > > > +      && TYPE_SIGN (*common_type) !=3D TYPE_SIGN (new_type=
))
> > > > > > > > +    {
> > > > > > > > +      bool sign =3D TYPE_SIGN (*common_type) =3D=3D UNSIGN=
ED;
> > > > > > > > +      tree eq_type
> > > > > > > > +	=3D build_nonstandard_integer_type (TYPE_PRECISION
> > > (new_type),
> > > > > > > > +					  sign);
> > > > > > > > +
> > > > > > > > +      if (types_compatible_p (*common_type, eq_type))
> > > > > > > > +	return true;
> > > > > > > > +    }
> > > > > > > > +
> > > > > > > >    /* See if *COMMON_TYPE can hold all values of NEW_TYPE.
> */
> > > > > > > >    if ((TYPE_PRECISION (new_type) < TYPE_PRECISION
> > > (*common_type))
> > > > > > > >        && (TYPE_UNSIGNED (new_type) || !TYPE_UNSIGNED
> > > > > > > (*common_type)))
> > > > > > > > @@ -532,6 +550,9 @@ vect_joust_widened_type (tree type,
> tree
> > > > > > > new_type, tree *common_type)
> > > > > > > >     to a type that (a) is narrower than the result of STMT_=
INFO
> and
> > > > > > > >     (b) can hold all leaf operand values.
> > > > > > > >
> > > > > > > > +   If ALLOW_SHORT_SIGN_MISMATCH then allow that the signs
> of
> > > > > > > > + the
> > > > > > > operands
> > > > > > > > +   may differ in signs but not in precision.
> > > > > > > > +
> > > > > > > >     Return 0 if STMT_INFO isn't such a tree, or if no such
> > > COMMON_TYPE
> > > > > > > >     exists.  */
> > > > > > > >
> > > > > > > > @@ -539,7 +560,8 @@ static unsigned int
> vect_widened_op_tree
> > > > > > > > (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
> > > > > > > >  		      tree_code widened_code, bool shift_p,
> > > > > > > >  		      unsigned int max_nops,
> > > > > > > > -		      vect_unpromoted_value *unprom, tree
> > > *common_type)
> > > > > > > > +		      vect_unpromoted_value *unprom, tree
> > > *common_type,
> > > > > > > > +		      bool allow_short_sign_mismatch =3D false)
> > > > > > > >  {
> > > > > > > >    /* Check for an integer operation with the right code.  =
*/
> > > > > > > >    gassign *assign =3D dyn_cast <gassign *> (stmt_info->stm=
t);
> > > > > > > > @@
> > > > > > > > -600,7
> > > > > > > > +622,8 @@ vect_widened_op_tree (vec_info *vinfo,
> > > stmt_vec_info
> > > > > > > stmt_info, tree_code code,
> > > > > > > >  		=3D vinfo->lookup_def (this_unprom->op);
> > > > > > > >  	      nops =3D vect_widened_op_tree (vinfo, def_stmt_info=
,
> > > code,
> > > > > > > >  					   widened_code, shift_p,
> > > max_nops,
> > > > > > > > -					   this_unprom,
> > > common_type);
> > > > > > > > +					   this_unprom,
> > > common_type,
> > > > > > > > +
> > > allow_short_sign_mismatch);
> > > > > > > >  	      if (nops =3D=3D 0)
> > > > > > > >  		return 0;
> > > > > > > >
> > > > > > > > @@ -617,7 +640,8 @@ vect_widened_op_tree (vec_info *vinfo,
> > > > > > > stmt_vec_info stmt_info, tree_code code,
> > > > > > > >  	      if (i =3D=3D 0)
> > > > > > > >  		*common_type =3D this_unprom->type;
> > > > > > > >  	      else if (!vect_joust_widened_type (type, this_unpro=
m-
> > > >type,
> > > > > > > > -						 common_type))
> > > > > > > > +						 common_type,
> > > > > > > > +
> > > allow_short_sign_mismatch))
> > > > > > > >  		return 0;
> > > > > > > >  	    }
> > > > > > > >  	}
> > > > > > > > @@ -888,21 +912,24 @@ vect_reassociating_reduction_p
> (vec_info
> > > > > > > > *vinfo,
> > > > > > > >
> > > > > > > >     Try to find the following pattern:
> > > > > > > >
> > > > > > > > -     type x_t, y_t;
> > > > > > > > +     type1a x_t
> > > > > > > > +     type1b y_t;
> > > > > > > >       TYPE1 prod;
> > > > > > > >       TYPE2 sum =3D init;
> > > > > > > >     loop:
> > > > > > > >       sum_0 =3D phi <init, sum_1>
> > > > > > > >       S1  x_t =3D ...
> > > > > > > >       S2  y_t =3D ...
> > > > > > > > -     S3  x_T =3D (TYPE1) x_t;
> > > > > > > > -     S4  y_T =3D (TYPE1) y_t;
> > > > > > > > +     S3  x_T =3D (TYPE3) x_t;
> > > > > > > > +     S4  y_T =3D (TYPE4) y_t;
> > > > > > > >       S5  prod =3D x_T * y_T;
> > > > > > > >       [S6  prod =3D (TYPE2) prod;  #optional]
> > > > > > > >       S7  sum_1 =3D prod + sum_0;
> > > > > > > >
> > > > > > > > -   where 'TYPE1' is exactly double the size of type 'type'=
, and
> 'TYPE2'
> > > is
> > > > > the
> > > > > > > > -   same size of 'TYPE1' or bigger. This is a special case =
of a
> reduction
> > > > > > > > +   where 'TYPE1' is exactly double the size of type 'type1=
a' and
> > > 'type1b',
> > > > > > > > +   the sign of 'TYPE1' must be one of 'type1a' or 'type1b'=
 but the
> > > sign of
> > > > > > > > +   'type1a' and 'type1b' can differ. 'TYPE2' is the same s=
ize of
> 'TYPE1'
> > > or
> > > > > > > > +   bigger and must be the same sign. This is a special cas=
e
> > > > > > > > + of a reduction
> > > > > > > >     computation.
> > > > > > > >
> > > > > > > >     Input:
> > > > > > > > @@ -939,15 +966,16 @@ vect_recog_dot_prod_pattern
> (vec_info
> > > > > > > > *vinfo,
> > > > > > > >
> > > > > > > >    /* Look for the following pattern
> > > > > > > >            DX =3D (TYPE1) X;
> > > > > > > > -          DY =3D (TYPE1) Y;
> > > > > > > > +	  DY =3D (TYPE2) Y;
> > > > > > > >            DPROD =3D DX * DY;
> > > > > > > > -          DDPROD =3D (TYPE2) DPROD;
> > > > > > > > +	  DDPROD =3D (TYPE3) DPROD;
> > > > > > > >            sum_1 =3D DDPROD + sum_0;
> > > > > > > >       In which
> > > > > > > >       - DX is double the size of X
> > > > > > > >       - DY is double the size of Y
> > > > > > > >       - DX, DY, DPROD all have the same type but the sign
> > > > > > > > -       between DX, DY and DPROD can differ.
> > > > > > > > +       between DX, DY and DPROD can differ. The sign of DP=
ROD
> > > > > > > > +       is one of the signs of DX or DY.
> > > > > > > >       - sum is the same size of DPROD or bigger
> > > > > > > >       - sum has been recognized as a reduction variable.
> > > > > > > >
> > > > > > > > @@ -986,14 +1014,41 @@ vect_recog_dot_prod_pattern
> (vec_info
> > > > > *vinfo,
> > > > > > > >       inside the loop (in case we are analyzing an outer-lo=
op).  */
> > > > > > > >    vect_unpromoted_value unprom0[2];
> > > > > > > >    if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR,
> > > > > > > WIDEN_MULT_EXPR,
> > > > > > > > -			     false, 2, unprom0, &half_type))
> > > > > > > > +			     false, 2, unprom0, &half_type, true))
> > > > > > > >      return NULL;
> > > > > > > >
> > > > > > > > +  /* Check to see if there is a sign change happening in t=
he
> > > > > > > > + operands of
> > > > > > > the
> > > > > > > > +     multiplication and pick the appropriate optab subtype=
.
> > > > > > > > +*/
> > > > > > > > +  enum optab_subtype subtype;
> > > > > > > > +  tree rhs_type1 =3D unprom0[0].type;
> > > > > > > > +  tree rhs_type2 =3D unprom0[1].type;
> > > > > > > > +  if (TYPE_SIGN (rhs_type1) =3D=3D TYPE_SIGN (rhs_type2))
> > > > > > > > +     subtype =3D optab_default;
> > > > > > > > +  else if (TYPE_SIGN (rhs_type1) =3D=3D SIGNED
> > > > > > > > +	   && TYPE_SIGN (rhs_type2) =3D=3D UNSIGNED)
> > > > > > > > +     subtype =3D optab_signed_to_unsigned;
> > > > > > > > +  else if (TYPE_SIGN (rhs_type1) =3D=3D UNSIGNED
> > > > > > > > +	   && TYPE_SIGN (rhs_type2) =3D=3D SIGNED)
> > > > > > > > +     subtype =3D optab_unsigned_to_signed;
> > > > > > > > +  else
> > > > > > > > +    gcc_unreachable ();
> > > > > > > > +
> > > > > > > > +  /* If we have a sign changing dot product we need to che=
ck
> that
> > > the
> > > > > > > > +     promoted type if unsigned has at least the same
> > > > > > > > + precision as the
> > > > > final
> > > > > > > > +     type of the dot-product.  */
> > > > > > > > +  if (subtype !=3D optab_default)
> > > > > > > > +    {
> > > > > > > > +      tree mult_type =3D TREE_TYPE (unprom_mult.op);
> > > > > > > > +      if (TYPE_SIGN (mult_type) =3D=3D UNSIGNED
> > > > > > > > +	  && TYPE_PRECISION (mult_type) < TYPE_PRECISION (type))
> > > > > > > > +	return NULL;
> > > > > > > > +    }
> > > > > > > > +
> > > > > > > >    vect_pattern_detected ("vect_recog_dot_prod_pattern",
> > > > > > > > last_stmt);
> > > > > > > >
> > > > > > > >    tree half_vectype;
> > > > > > > >    if (!vect_supportable_direct_optab_p (vinfo, type,
> > > > > > > > DOT_PROD_EXPR,
> > > > > > > half_type,
> > > > > > > > -					type_out, &half_vectype))
> > > > > > > > +					type_out, &half_vectype,
> > > subtype))
> > > > > > > >      return NULL;
> > > > > > > >
> > > > > > > >    /* Get the inputs in the appropriate types.  */ @@ -1002=
,8
> > > > > > > > +1057,22 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
> > > > > > > >  		       unprom0, half_vectype);
> > > > > > > >
> > > > > > > >    var =3D vect_recog_temp_ssa_var (type, NULL);
> > > > > > > > +
> > > > > > > > +  /* If we have a sign changing dot-product the dot-produc=
t
> > > > > > > > + itself does
> > > > > any
> > > > > > > > +     sign conversions, so consume the type and use the
> > > > > > > > + unpromoted types.  */  tree mult_arg1, mult_arg2;  if
> > > > > > > > + (subtype =3D=3D
> > > > > > > > + optab_default)
> > > > > > > > +    {
> > > > > > > > +      mult_arg1 =3D mult_oprnd[0];
> > > > > > > > +      mult_arg2 =3D mult_oprnd[1];
> > > > > > > > +    }
> > > > > > > > +  else
> > > > > > > > +    {
> > > > > > > > +      mult_arg1 =3D unprom0[0].op;
> > > > > > > > +      mult_arg2 =3D unprom0[1].op;
> > > > > > > > +    }
> > > > > > > >    pattern_stmt =3D gimple_build_assign (var, DOT_PROD_EXPR=
,
> > > > > > > > -				      mult_oprnd[0], mult_oprnd[1],
> > > oprnd1);
> > > > > > > > +				      mult_arg1, mult_arg2, oprnd1);
> > > > > > > >
> > > > > > > >    return pattern_stmt;
> > > > > > > >  }
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Richard Biener <rguenther@suse.de> SUSE Software Solutions
> > > > > > > Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
> GF:
> > > > > > > Felix Imend?rffer; HRB 36809 (AG Nuernberg)
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de>
> > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > > > > Nuernberg, Germany; GF: Felix Imend?rffer; HRB 36809 (AG
> Nuernberg)
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > > Nuernberg, Germany; GF: Felix Imend?rffer; HRB 36809 (AG Nuernberg)
> >
>=20
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> Nuernberg,
> Germany; GF: Felix Imend