From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 21 Jun 2023 08:27:51 +0200
Subject: [PATCH 4/5] x86: further PR target/100711-like splitting
From: Jan Beulich
To: "gcc-patches@gcc.gnu.org"
Cc: Hongtao Liu, Kirill Yukhin
References: <04f99abe-a563-d093-23b7-4abf0f91633d@suse.com>
In-Reply-To: <04f99abe-a563-d093-23b7-4abf0f91633d@suse.com>

With respective two-operand bitwise operations now expressible by a
single VPTERNLOG, add splitters to also deal with ior and xor
counterparts of the original and-only case. Note that the splitters need
to be separate, as the placement of "not" differs in the final insns
(*iornot<mode>3, *xnor<mode>3) which are intended to pick up one half of
the result.

gcc/

	* config/i386/sse.md: New splitters to simplify
	not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}.

gcc/testsuite/

	* gcc.target/i386/pr100711-4.c: New test.
	* gcc.target/i386/pr100711-5.c: New test.

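For reference, the immediates the new tests scan for (0xbb, 0xdd, 0x99)
can be derived from the usual VPTERNLOG truth-table constants. The sketch
below is illustrative only and not part of the patch: the names TL_A,
TL_B, TL_C, imm_iornot, imm_xnor and ternlog are made up for this note,
and ternlog() is a plain scalar reference model, not anything GCC
provides.

```c
#include <assert.h>
#include <stdint.h>

/* Truth-table columns of the three VPTERNLOG inputs.  Bit i of the
   immediate is the result for the input combination whose bits
   (A, B, C) are bits 2, 1, 0 of i.  A is the destination operand.  */
enum { TL_A = 0xF0, TL_B = 0xCC, TL_C = 0xAA };

/* Immediate for "~x | y", given the truth-table columns of x and y.  */
static uint8_t
imm_iornot (uint8_t x, uint8_t y)
{
  return (uint8_t) (~x | y);
}

/* Immediate for "~(x ^ y)", given the truth-table columns of x and y.  */
static uint8_t
imm_xnor (uint8_t x, uint8_t y)
{
  return (uint8_t) ~(x ^ y);
}

/* Scalar reference model (hypothetical helper): apply the ternary-logic
   immediate IMM to each of the 64 bit positions of A, B and C.  */
static uint64_t
ternlog (uint64_t a, uint64_t b, uint64_t c, uint8_t imm)
{
  uint64_t r = 0;

  for (int bit = 0; bit < 64; bit++)
    {
      unsigned idx = (unsigned) (((a >> bit) & 1) << 2
                                 | ((b >> bit) & 1) << 1
                                 | ((c >> bit) & 1));
      r |= (uint64_t) ((imm >> idx) & 1) << bit;
    }
  return r;
}
```

With these, imm_iornot (TL_B, TL_C) and imm_iornot (TL_C, TL_B) give the
two iornot immediates, and imm_xnor (TL_B, TL_C) gives the xnor one.
Which of the two iornot forms shows up presumably depends on which source
operand ends up holding the broadcast value; the ia32 expectations in
pr100711-4.c list both.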
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -17366,6 +17366,36 @@
                 (match_dup 2)))]
   "operands[3] = gen_reg_rtx (<MODE>mode);")
 
+(define_split
+  [(set (match_operand:VI 0 "register_operand")
+        (ior:VI
+          (vec_duplicate:VI
+            (not:<ssescalarmode>
+              (match_operand:<ssescalarmode> 1 "nonimmediate_operand")))
+          (match_operand:VI 2 "vector_operand")))]
+  "<MODE_SIZE> == 64 || TARGET_AVX512VL
+   || (TARGET_AVX512F && !TARGET_PREFER_AVX256)"
+  [(set (match_dup 3)
+        (vec_duplicate:VI (match_dup 1)))
+   (set (match_dup 0)
+        (ior:VI (not:VI (match_dup 3)) (match_dup 2)))]
+  "operands[3] = gen_reg_rtx (<MODE>mode);")
+
+(define_split
+  [(set (match_operand:VI 0 "register_operand")
+        (xor:VI
+          (vec_duplicate:VI
+            (not:<ssescalarmode>
+              (match_operand:<ssescalarmode> 1 "nonimmediate_operand")))
+          (match_operand:VI 2 "vector_operand")))]
+  "<MODE_SIZE> == 64 || TARGET_AVX512VL
+   || (TARGET_AVX512F && !TARGET_PREFER_AVX256)"
+  [(set (match_dup 3)
+        (vec_duplicate:VI (match_dup 1)))
+   (set (match_dup 0)
+        (not:VI (xor:VI (match_dup 3) (match_dup 2))))]
+  "operands[3] = gen_reg_rtx (<MODE>mode);")
+
 (define_insn "*andnot<mode>3_mask"
   [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
         (vec_merge:VI48_AVX512VL
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100711-4.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -mno-avx512vl -mprefer-vector-width=512 -O2" } */
+
+typedef char v64qi __attribute__ ((vector_size (64)));
+typedef short v32hi __attribute__ ((vector_size (64)));
+typedef int v16si __attribute__ ((vector_size (64)));
+typedef long long v8di __attribute__((vector_size (64)));
+
+v64qi foo_v64qi (char a, v64qi b)
+{
+    return (__extension__ (v64qi) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b;
+}
+
+v32hi foo_v32hi (short a, v32hi b)
+{
+    return (__extension__ (v32hi) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b;
+}
+
+v16si foo_v16si (int a, v16si b)
+{
+    return (__extension__ (v16si) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b;
+}
+
+v8di foo_v8di (long long a, v8di b)
+{
+    return (__extension__ (v8di) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b;
+}
+
+/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xbb" 4 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xbb" 2 { target { ia32 } } } } */
+/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xdd" 2 { target { ia32 } } } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100711-5.c
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -mno-avx512vl -mprefer-vector-width=512 -O2" } */
+
+typedef char v64qi __attribute__ ((vector_size (64)));
+typedef short v32hi __attribute__ ((vector_size (64)));
+typedef int v16si __attribute__ ((vector_size (64)));
+typedef long long v8di __attribute__((vector_size (64)));
+
+v64qi foo_v64qi (char a, v64qi b)
+{
+    return (__extension__ (v64qi) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) ^ b;
+}
+
+v32hi foo_v32hi (short a, v32hi b)
+{
+    return (__extension__ (v32hi) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) ^ b;
+}
+
+v16si foo_v16si (int a, v16si b)
+{
+    return (__extension__ (v16si) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
+                                   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) ^ b;
+}
+
+v8di foo_v8di (long long a, v8di b)
+{
+    return (__extension__ (v8di) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) ^ b;
+}
+
+/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0x99" 4 } } */