From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 28 Dec 2020 13:38:08 +0000
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de, ook@ucw.cz
Subject: [PATCH 7/8 v9]middle-end slp: support complex FMS and complex FMS conjugate
Message-ID: <20201228133806.GA32350@arm.com>
Content-Type: multipart/mixed; boundary="AqsLC8rIMeq19msA"
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.9.4 (2018-02-28)
MIME-Version: 1.0
List-Id: Gcc-patches mailing list

--AqsLC8rIMeq19msA
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

Hi All,

This adds support for complex FMS and complex FMS conjugate to the SLP
pattern matcher.

Bootstrapped and regtested on aarch64-none-linux-gnu and
x86_64-pc-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* internal-fn.def (COMPLEX_FMS, COMPLEX_FMS_CONJ): New.
	* optabs.def (cmls_optab, cmls_conj_optab): New.
	* doc/md.texi: Document them.
	* tree-vect-slp-patterns.c (class complex_fms_pattern,
	complex_fms_pattern::matches, complex_fms_pattern::recognize,
	complex_fms_pattern::build): New.

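For readers less familiar with the operation, the semantics that the patch
documents for cmls and cmls_conj in md.texi can be sketched in scalar C over
interleaved lanes.  This is an illustrative sketch only and is not part of
the patch: the helper names fms_lanes, fms_conj_lanes and fms_self_check are
hypothetical, and the arrays are assumed to hold the real parts in the even
lanes and the imaginary parts in the odd lanes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: lane-wise complex
   multiply and subtract, c -= a * b.  Each array holds N complex values
   as 2 * N doubles, real parts in even lanes, imaginary parts in odd
   lanes.  */
static void
fms_lanes (double *c, const double *a, const double *b, size_t n)
{
  for (size_t i = 0; i < 2 * n; i += 2)
    {
      double re = a[i] * b[i] - a[i + 1] * b[i + 1];
      double im = a[i] * b[i + 1] + a[i + 1] * b[i];
      c[i] -= re;
      c[i + 1] -= im;
    }
}

/* Same as above, but with the second multiply operand conjugated:
   c -= a * conj (b).  */
static void
fms_conj_lanes (double *c, const double *a, const double *b, size_t n)
{
  for (size_t i = 0; i < 2 * n; i += 2)
    {
      double re = a[i] * b[i] + a[i + 1] * b[i + 1];
      double im = a[i + 1] * b[i] - a[i] * b[i + 1];
      c[i] -= re;
      c[i + 1] -= im;
    }
}

/* Return 1 if both helpers reproduce hand-computed results on a small
   sample, 0 otherwise.  All values are exactly representable, so exact
   comparison is safe here.  */
static int
fms_self_check (void)
{
  const double a[4] = { 1.0, 2.0, -3.0, 0.5 };	/* 1+2i, -3+0.5i  */
  const double b[4] = { 4.0, -1.0, 2.0, 2.0 };	/* 4-i, 2+2i  */
  double c1[4] = { 10.0, 20.0, 30.0, 40.0 };
  double c2[4] = { 10.0, 20.0, 30.0, 40.0 };
  /* Hand-computed c - a * b and c - a * conj (b).  */
  const double want1[4] = { 4.0, 13.0, 37.0, 45.0 };
  const double want2[4] = { 8.0, 11.0, 35.0, 33.0 };

  fms_lanes (c1, a, b, 2);
  fms_conj_lanes (c2, a, b, 2);

  for (size_t i = 0; i < 4; i++)
    if (c1[i] != want1[i] || c2[i] != want2[i])
      return 0;
  return 1;
}
```

Calling fms_self_check () from a test driver should return 1.  The two
helpers differ only in the sign of the terms involving the imaginary lane
of b, which is the only distinction between the plain and the conjugate
variants.
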
--- inline copy of patch --
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 6d5a98c4946d3ff4c2b8abea5c29caa6863fd3f7..3f5a42df285b3ee162edc9ec661f25c0eec5e4fa 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -6247,6 +6247,51 @@ The operation is only supported for vector modes @var{m}.
 
 This pattern is not allowed to @code{FAIL}.
 
+@cindex @code{cmls@var{m}4} instruction pattern
+@item @samp{cmls@var{m}4}
+Perform a vector multiply and subtract that is semantically the same as
+a multiply and subtract of complex numbers.
+
+@smallexample
+  complex TYPE c[N];
+  complex TYPE a[N];
+  complex TYPE b[N];
+  for (int i = 0; i < N; i += 1)
+    @{
+      c[i] -= a[i] * b[i];
+    @}
+@end smallexample
+
+In GCC lane ordering the real part of the number must be in the even lanes with
+the imaginary part in the odd lanes.
+
+The operation is only supported for vector modes @var{m}.
+
+This pattern is not allowed to @code{FAIL}.
+
+@cindex @code{cmls_conj@var{m}4} instruction pattern
+@item @samp{cmls_conj@var{m}4}
+Perform a vector multiply by conjugate and subtract that is semantically
+the same as a multiply and subtract of complex numbers where the second
+multiply argument is conjugated.
+
+@smallexample
+  complex TYPE c[N];
+  complex TYPE a[N];
+  complex TYPE b[N];
+  for (int i = 0; i < N; i += 1)
+    @{
+      c[i] -= a[i] * conj (b[i]);
+    @}
+@end smallexample
+
+In GCC lane ordering the real part of the number must be in the even lanes with
+the imaginary part in the odd lanes.
+
+The operation is only supported for vector modes @var{m}.
+
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{cmul@var{m}4} instruction pattern
 @item @samp{cmul@var{m}4}
 Perform a vector multiply that is semantically the same as multiply of
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 305450e026d4b94ab62ceb9ca719ec5570ff43eb..c8161509d9497afe58f32bde12d8e6bd7b876a3c 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -290,6 +290,8 @@ DEF_INTERNAL_FLT_FN (LDEXP, ECF_CONST, ldexp, binary)
 DEF_INTERNAL_FLT_FLOATN_FN (FMA, ECF_CONST, fma, ternary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_FMA, ECF_CONST, cmla, ternary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_FMA_CONJ, ECF_CONST, cmla_conj, ternary)
+DEF_INTERNAL_OPTAB_FN (COMPLEX_FMS, ECF_CONST, cmls, ternary)
+DEF_INTERNAL_OPTAB_FN (COMPLEX_FMS_CONJ, ECF_CONST, cmls_conj, ternary)
 
 /* Unary integer ops.  */
 DEF_INTERNAL_INT_FN (CLRSB, ECF_CONST | ECF_NOTHROW, clrsb, unary)
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 8e2758d685ed85e02df10dac571eb40d45a294ed..320bb5f3dce31867d312bbbb6a4c6e31c534254e 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -296,6 +296,8 @@ OPTAB_D (cmul_optab, "cmul$a3")
 OPTAB_D (cmul_conj_optab, "cmul_conj$a3")
 OPTAB_D (cmla_optab, "cmla$a4")
 OPTAB_D (cmla_conj_optab, "cmla_conj$a4")
+OPTAB_D (cmls_optab, "cmls$a4")
+OPTAB_D (cmls_conj_optab, "cmls_conj$a4")
 OPTAB_D (cos_optab, "cos$a2")
 OPTAB_D (cosh_optab, "cosh$a2")
 OPTAB_D (exp10_optab, "exp10$a2")
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index 3625a80c08e3d70fd362fc52e17e65b3b2c7da83..ab6587f0b8522ec5f916f74e7e7401b1f7a35bbb 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -1254,6 +1254,181 @@ complex_fma_pattern::build (vec_info *vinfo)
   complex_pattern::build (vinfo);
 }
 
+/*******************************************************************************
+ * complex_fms_pattern class
+ ******************************************************************************/
+
+class complex_fms_pattern : public complex_pattern
+{
+  protected:
+    complex_fms_pattern (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
+      : complex_pattern (node, m_ops, ifn)
+    {
+      this->m_num_args = 3;
+    }
+
+  public:
+    void build (vec_info *);
+    static internal_fn
+    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *,
+	     vec<slp_tree> *);
+
+    static vect_pattern*
+    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+
+    static vect_pattern*
+    mkInstance (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
+    {
+      return new complex_fms_pattern (node, m_ops, ifn);
+    }
+};
+
+
+/* Pattern matcher for trying to match complex multiply and subtract
+   patterns, with or without conjugation, in the SLP tree.
+   If the operation matches then the matched IFN is returned and
+   the arguments to the two replacement statements are put in OPS.
+
+   If no match is found then IFN_LAST is returned and OPS is unchanged.
+
+   This function matches the patterns shaped as:
+
+   double ax = (b[i+1] * a[i]) + (b[i] * a[i]);
+   double bx = (a[i+1] * b[i]) - (a[i+1] * b[i+1]);
+
+   c[i] = c[i] - ax;
+   c[i+1] = c[i+1] + bx;
+
+   The initial match is expected to be in OP and the initial match
+   operands in OPS.  */
+
+internal_fn
+complex_fms_pattern::matches (complex_operation_t op,
+			      slp_tree_to_load_perm_map_t *perm_cache,
+			      slp_tree * ref_node, vec<slp_tree> *ops)
+{
+  internal_fn ifn = IFN_LAST;
+
+  /* Find the two components.  We match Complex MUL first which reduces the
+     amount of work this pattern has to do.  After that we just match the
+     head node and we're done:
+
+     * FMS: - +.  */
+  slp_tree child = NULL;
+
+  /* We need to ignore the two_operands nodes that may also match,
+     for that we can check if they have any scalar statements and also
+     check that it's not a permute node as we're looking for a normal
+     PLUS_EXPR operation.  */
+  if (op != PLUS_MINUS)
+    return IFN_LAST;
+
+  child = SLP_TREE_CHILDREN ((*ops)[1])[1];
+  if (vect_detect_pair_op (child) != MINUS_PLUS)
+    return IFN_LAST;
+
+  /* First two nodes must be a multiply.  */
+  auto_vec<slp_tree> muls;
+  if (vect_match_call_complex_mla (child, 0) != MULT_MULT
+      || vect_match_call_complex_mla (child, 1, &muls) != MULT_MULT)
+    return IFN_LAST;
+
+  /* Now operand2+4 may lead to another expression.  */
+  auto_vec<slp_tree> left_op, right_op;
+  left_op.safe_splice (SLP_TREE_CHILDREN (muls[0]));
+  right_op.safe_splice (SLP_TREE_CHILDREN (muls[1]));
+
+  bool is_neg = vect_normalize_conj_loc (left_op);
+
+  child = SLP_TREE_CHILDREN ((*ops)[1])[0];
+  bool conj_first_operand;
+  if (!vect_validate_multiplication (perm_cache, right_op, left_op, false,
+				     &conj_first_operand, true))
+    return IFN_LAST;
+
+  if (!is_neg)
+    ifn = IFN_COMPLEX_FMS;
+  else
+    ifn = IFN_COMPLEX_FMS_CONJ;
+
+  if (!vect_pattern_validate_optab (ifn, *ref_node))
+    return IFN_LAST;
+
+  ops->truncate (0);
+  ops->create (4);
+
+  complex_perm_kinds_t kind = linear_loads_p (perm_cache, right_op[0]).first;
+  if (kind == PERM_EVENODD)
+    {
+      ops->quick_push (child);
+      ops->quick_push (right_op[0]);
+      ops->quick_push (right_op[1]);
+      ops->quick_push (left_op[0]);
+    }
+  else if (kind == PERM_TOP)
+    {
+      ops->quick_push (child);
+      ops->quick_push (right_op[1]);
+      ops->quick_push (right_op[0]);
+      ops->quick_push (left_op[0]);
+    }
+  else
+    {
+      ops->quick_push (child);
+      ops->quick_push (right_op[1]);
+      ops->quick_push (right_op[0]);
+      ops->quick_push (left_op[1]);
+    }
+
+  return ifn;
+}
+
+/* Attempt to recognize a complex FMS pattern.  */
+
+vect_pattern*
+complex_fms_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
+				slp_tree *node)
+{
+  auto_vec<slp_tree> ops;
+  complex_operation_t op
+    = vect_detect_pair_op (*node, true, &ops);
+  internal_fn ifn
+    = complex_fms_pattern::matches (op, perm_cache, node, &ops);
+  if (ifn == IFN_LAST)
+    return NULL;
+
+  return new complex_fms_pattern (node, &ops, ifn);
+}
+
+/* Perform a replacement of the detected complex FMS pattern with the new
+   instruction sequences.  */
+
+void
+complex_fms_pattern::build (vec_info *vinfo)
+{
+  auto_vec<slp_tree> nodes;
+
+  /* First re-arrange the children.  */
+  nodes.create (3);
+
+  nodes.quick_push (this->m_ops[0]);
+  nodes.quick_push (this->m_ops[1]);
+  nodes.quick_push (
+    vect_build_combine_node (this->m_ops[2], this->m_ops[3], *this->m_node));
+  SLP_TREE_REF_COUNT (this->m_ops[0])++;
+  SLP_TREE_REF_COUNT (this->m_ops[1])++;
+
+  slp_tree node;
+  unsigned i;
+  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (*this->m_node), i, node)
+    vect_free_slp_tree (node);
+
+  SLP_TREE_CHILDREN (*this->m_node).truncate (0);
+  SLP_TREE_CHILDREN (*this->m_node).safe_splice (nodes);
+
+  complex_pattern::build (vinfo);
+}
+
 /*******************************************************************************
  * Pattern matching definitions
  ******************************************************************************/
-- 

--AqsLC8rIMeq19msA--