From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR05-AM6-obe.outbound.protection.outlook.com (mail-am6eur05on2049.outbound.protection.outlook.com [40.107.22.49]) by sourceware.org (Postfix) with ESMTPS id DB52B3858D1E for ; Wed, 21 Jun 2023 06:27:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DB52B3858D1E Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.com ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=HMoAWh4OXf8G9R6u+gnGewTU46Uj6sTQMsAQ9r8hJP7poVDgOaxFcbte9W2/UtkUSMQzJfIap8iyySgSZOxxiQdq57NUYbDMo3Hq3oa5tsK6f321ecC+aybzrj+TVnBQVy3Qk+RrlByz1DtsShIZ2aMSogjPG0mN+Rphd77rkUFwzIPf35+7MdbGON6GK1A0xRpCeYqSMfAhxsQVJJPW1TL+DBwdl8rli9+NgUU0d7SsH3MPq0LMxLww6wHny87yZe4DpKiherDpzTWp2pyx/8CvTdQZMYDUlWwNw8W757OJAo89b/sKkgTj/ZmJ7BvllDDKrW6mRCOK0wm7DIQNoQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=H4H+gPSOIQO23blyADL5m5UDuhZssSGUlBE3obSsSoM=; b=Gp2Ed374nVuYYF+5zcNMSoMAl2rJAg++RX8Ubp+tmFUxgXCkS5dW6oV2DwtJ6N3EtzUwj61d7n9b3cenhGKx5q/vJHy6hliM+KYVjciTVBD/77RhgSQfc63zasQuXzjYpx/JmcG+7c7zkW71hMFEjMCeMpuglaB3gAS6qP1CSoAucwvki0nLFcwtFtBIUvb74egLTrH9xMiQZAj3wQVi8VThb9z/GjP6LmLk6E2OwhzOV5PijWyqhe/7mIuU6CDBukVR9vToSzspNANYBC+sZeN4OX7oP/itTJDX9TWs55S0U80avJWBbXdwb+zu3Yv3TDa6kYX+bBhVAx+nFxakng== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=H4H+gPSOIQO23blyADL5m5UDuhZssSGUlBE3obSsSoM=; b=pHBQrKVA6VCo0ghLjoFy9hjzrMmqPKhRLpwjegBh4HSXSzGzB/kZ40mN2/edYe6jvtgyBdiBsyCIWHFC6T/pDlToRwyN3Ihcp9JY4GZONSgv/cIrmC2lKOUNMbS9Nkeij5Eg4CdeW9MvouVfkUHQ6t6UEQRqF+KD9glwUczvx7mtLscUQ6ErVeXzSmiCh7l0uNgJz3UMezrU30G2t5QXCr19wNgzypcldvjlnJ3pnikxV91p12a6veqK67md0im2SgMJOddCOdiWU8UwISGbJRF6VNd/2CVm+K/RJ/AMbLY8LfX4f+0F3ogYdYarU33EN1+MJdMT2lW1gVbCG6MEUA== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com; Received: from VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) by DUZPR04MB9982.eurprd04.prod.outlook.com (2603:10a6:10:4db::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6521.23; Wed, 21 Jun 2023 06:27:13 +0000 Received: from VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::e442:306f:7711:e24c]) by VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::e442:306f:7711:e24c%5]) with mapi id 15.20.6500.036; Wed, 21 Jun 2023 06:27:13 +0000 Message-ID: <3cf55c98-d18a-d1ad-2fc2-015c63e217ca@suse.com> Date: Wed, 21 Jun 2023 08:27:11 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.12.0 Subject: [PATCH 2/5] x86: use VPTERNLOG also for certain andnot forms Content-Language: en-US From: Jan Beulich To: "gcc-patches@gcc.gnu.org" Cc: Hongtao Liu , Kirill Yukhin References: <04f99abe-a563-d093-23b7-4abf0f91633d@suse.com> In-Reply-To: <04f99abe-a563-d093-23b7-4abf0f91633d@suse.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-ClientProxiedBy: FR3P281CA0106.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:a3::16) To VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: VE1PR04MB6560:EE_|DUZPR04MB9982:EE_ X-MS-Office365-Filtering-Correlation-Id: 5807154e-659b-4d31-60d2-08db72208cd7 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: uh2P5j62wDl+L8j8DJMpWQUjt9tVooPcvEiJNgZB4FbTamMxyUWEstvJiMmjXdEQTBUYS5UZyDawE3B5w7Wf4VMxZjeXes+mZU/NPKQObeW97bgFs0U4Zk8p7wPRDQNzQ4NyPAElj95un9ApKeFEamTn2V7IFylr4z7m6Rk7o77w3u9UevGmlcp4KRn430AFzpbjYUGksLq5ncycIq3ynOzz76AwbjQINXzzm695G7i1cP+1oWB66MUTjG1E9mVhRPzhGdhNR+UPfiHqL7qauPeL0OLzBoqT+J6uiMVDS5DG1IoVEFHQBIZNQPVAv4JtJROivguL3yGCaq5bqXWZCs5BUbW4yjPWLbwzcA1eBberwWJPWCZIuD9imeVb+WK4zGQMsDVeMssdfzEGrhitzwHX81tspsCi3iNVij4mQ0BK0BYXl8M/sWmTDEmuTgRXbbialwejhD/BV2+xOI8xBhp+3b85HDNdn3SyjIgmN7SivIKpNeYiq/3Jm7uzHd5W7W3rM/EYWAIgIbTY6oAcVOpwhQ3BR/lyAzKpcKcZw+oeVc3ufU5eAl8sBWNQmvaFJSNhnvC0I+PuPTXqHCmM2IBydcDzHP/kjrOHmRaTXSQqGD0SFgDDvHLyv8Crtnc99TKwM3lW2noF2rC/CuQgUQ== X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:VE1PR04MB6560.eurprd04.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230028)(366004)(39860400002)(376002)(136003)(346002)(396003)(451199021)(2616005)(84970400001)(6506007)(83380400001)(26005)(186003)(6512007)(38100700002)(6486002)(478600001)(54906003)(31686004)(36756003)(31696002)(86362001)(66556008)(66946007)(66476007)(8936002)(8676002)(4326008)(316002)(6916009)(2906002)(5660300002)(41300700001)(45980500001)(43740500002);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?B?b1YvUVQzZGM3amNLVkRSNW9KU2lidXU1SDAvbFZNTk00WEhkUzNzajFsNlEr?= =?utf-8?B?K25DcXhWc1c0ZllZM2FxVmZLN0Z5R2VkL3RTV3FaRXBHMUFZdUtnYitBc0pM?= =?utf-8?B?K3psMnYrZmU1djJPR1RoV1B2VkhZcERIUDRQUkpyZnZuNitzdFlrKzZKWXFa?= =?utf-8?B?TEdzMGxkK0RaRjQ0bnUvUGxIVGxNVGVVTlVhUDRYdVhab3JjUnhheWtNMlNw?= =?utf-8?B?L1ZLQ0lNZDM1dml3eXlORnRET1ZQYnM3SG1NMnhDQWRseXlTYVpaRVpJOVNZ?= =?utf-8?B?UmNhSE1ySDhJSDBwNEN4NmRxbDVIdEdudWxPb3B2RU1MZ2luWHhHWjJQbklH?= =?utf-8?B?U1hlby8xQ25weFJSTGptZEphN3RGZzRTVW91c0JZNmE5aDRuN3V1VW40eXdl?= =?utf-8?B?cjF5RkZIRm91SElRUXBrdFhmeDJ5a3RleEpzM0w4MkZQTHVLazdydlMvYlQw?= =?utf-8?B?M2d2VVlpWnZmQWJtTTMrSGFycnN6d2lJNnFIOWVaWWFFOUNiK2hWdm5rZ3Zl?= =?utf-8?B?V1lrTzZZUWNLOTNydm5wWG5HUGNVNHNWbTJnN2hEVUlrVzdiQUVHRXpxMmgv?= =?utf-8?B?ZGM5eWRkaVliQ28wQ1c5Q3pVWmJ1WWFYOEVkUFFveE9QeHBxTzVIVXpjalpl?= =?utf-8?B?Vkh0cFJ0STV0QW83dTd2QlAwdWU5WSt2MCt4TUlrcy9iY2JGV0Z3dkswcmxI?= =?utf-8?B?c1ZITnVqQWUvU3VjeWhGUDdjcHhjUmc4UFlQdG9EbVZjU2RYbk95eTBzbGZJ?= =?utf-8?B?bWJuZFdMQm10MFlFQXd4OUQwVWdsYzg5c3JrVHBuaXdEbHZKSWNpQWg2d3hM?= =?utf-8?B?amhMNWNTeUtId1lxVW1HNWYwRjB1YzNCOGdxN2hFdHMvd3NNR1N3UWVFYm1K?= =?utf-8?B?SGNocUVkY1JvZE9seWloUEFJdE5uRlIySjlPNjJvZW16amhpcFh5N1NLWjNZ?= =?utf-8?B?SGQxR1NwSmE3UkdCZVgvNmxDeWcwdmQ0SlNaYWI4U1o3SHFOZmxYVFRRUVht?= =?utf-8?B?UzBlWncwd2tuZHJoWWEvWS9uRi9pNkU1QmsxYmZFYjE3OUFuVUFhMS9BY0RC?= =?utf-8?B?S0t3OC9VRlZRTHFxdEhRRUdYTmRFK25ycDdFcGw0bEhpU0pScG4zWEo2M0pn?= =?utf-8?B?dkZaRkhiVzVQNmZvSmV1TXhzRy83SnIvNzVJYkRtVHBzaGZ1UlUweHFVYkFL?= =?utf-8?B?eE1xMTFaTzdscEE2R3pkMHhyUnB6ejJNT3I0Z0UvUllmbVlYUnBZaGhWa1RW?= =?utf-8?B?OHFCdzRYNEdwZGlGcGtGelNEU01nazUrYURjRGVtcjNTUmJMeXQ2RDdFMjhi?= =?utf-8?B?d0J6cFQ4UkQ3azlMVFB3ZkNzcytUZitSWXA4djdUNkc3S2JNT1VYUkhCQzNu?= =?utf-8?B?d0RReGhTYWxGKzhLZDVDMjcxMTMzTHFaL0JOZFhJZzdTOW5qVGk5TDUxcm9U?= =?utf-8?B?Q0FtVWVKcXlQMmdULzJvWHhCdW9LaERDVy9xSVRRcnVXN3pYODMrVEhCMEho?= =?utf-8?B?dk11RWc2VDFNSUYwRERESEs0S2xCbDllRmpGcXhUbjR1Z0VrZzF0emh4Mmgy?= =?utf-8?B?TllZUW5lQVpvMTZJSmxyd04yV0FoaVV6SzF4R0cvWGJvWU9RYWpZdmZOcGx5?= =?utf-8?B?L3c1T0hheFpucEVJNDAxYVRwRWs3WFJ1V1ByNDZOOEpsdER2bk04MXFWMXlS?= =?utf-8?B?SHo3QWtmU0QxRnM2TFdSTGZJMEoyWm5WZzR5dG9xZng2dC9saTEyMEdKV1RW?= =?utf-8?B?U282Mm1wYVQ3OXM5dk14aUo2ZE5vaVVoV1ZnVEJmL3k4b2F4ZjlMdUxQQWNk?= =?utf-8?B?V0ZuQSs5Y2JzOU4xNXBHL2FSL0F2dE15eXdUUXlhRnNDN3oxWkJWMlBFZnlE?= =?utf-8?B?UkRDWTAyQmU3NGxVK29rRVhMcVZibUF2QjlLM0NlRU5ITTdNaFNUdExLdEsv?= =?utf-8?B?ZGZJZ0Z6RkJYK2R2cGxLRi9GbEpUR1RtRUFUY3NnNDFTbUZmT0oxM2cvcFJE?= =?utf-8?B?ZEtqMHFFUzJtZmtEelNUTUF6QXl5blhZTVJHK0dVR1hWcTZWQ3VscU5oRGVO?= =?utf-8?B?eFFocVBzeUJDSE9vYUdibC9pUk5CaFRLK3MrY0NGY3doNlRlUDFnM1V4QWlY?= =?utf-8?Q?8Z0uKnmVQ4ypaGYE8AdSwwaMY?= X-OriginatorOrg: suse.com X-MS-Exchange-CrossTenant-Network-Message-Id: 5807154e-659b-4d31-60d2-08db72208cd7 X-MS-Exchange-CrossTenant-AuthSource: VE1PR04MB6560.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 21 Jun 2023 06:27:13.1673 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: pQca5zu7YW5Gzq8Z1JryppIpaOTS2Iy6dv9vDHclA0IM733yW9OnVjGnx5EkYACPIjfh5ywt2QtzRW3+enQfHQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DUZPR04MB9982 X-Spam-Status: No, score=-3027.6 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: When it's the memory operand which is to be inverted, using VPANDN* requires a further load instruction. The same can be achieved by a single VPTERNLOG*. Add two new alternatives (for plain memory and embedded broadcast), adjusting the predicate for the first operand accordingly. Two pre-existing testcases actually end up being affected (improved) by the change, which is reflected in updated expectations there. gcc/ PR target/93768 * config/i386/sse.md (*andnot3): Add new alternatives for memory form operand 1. gcc/testsuite/ PR target/93768 * gcc.target/i386/avx512f-andn-di-zmm-2.c: New test. * gcc.target/i386/avx512f-andn-si-zmm-2.c: Adjust expecations towards generated code. * gcc.target/i386/pr100711-3.c: Adjust expectations for 32-bit code. --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -17210,11 +17210,13 @@ "TARGET_AVX512F") (define_insn "*andnot3" - [(set (match_operand:VI 0 "register_operand" "=x,x,v") + [(set (match_operand:VI 0 "register_operand" "=x,x,v,v,v") (and:VI - (not:VI (match_operand:VI 1 "vector_operand" "0,x,v")) - (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr")))] - "TARGET_SSE" + (not:VI (match_operand:VI 1 "bcst_vector_operand" "0,x,v,m,Br")) + (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr,v,v")))] + "TARGET_SSE + && (register_operand (operands[1], mode) + || register_operand (operands[2], mode))" { char buf[64]; const char *ops; @@ -17281,6 +17283,15 @@ case 2: ops = "v%s%s\t{%%2, %%1, %%0|%%0, %%1, %%2}"; break; + case 3: + case 4: + tmp = "pternlog"; + ssesuffix = ""; + if (which_alternative != 4 || TARGET_AVX512VL) + ops = "v%s%s\t{$0x44, %%1, %%2, %%0|%%0, %%2, %%1, $0x44}"; + else + ops = "v%s%s\t{$0x44, %%g1, %%g2, %%g0|%%g0, %%g2, %%g1, $0x44}"; + break; default: gcc_unreachable (); } @@ -17289,7 +17300,7 @@ output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx") + [(set_attr "isa" "noavx,avx,avx,*,*") (set_attr "type" "sselog") (set (attr "prefix_data16") (if_then_else @@ -17297,9 +17308,12 @@ (eq_attr "mode" "TI")) (const_string "1") (const_string "*"))) - (set_attr "prefix" "orig,vex,evex") + (set_attr "prefix" "orig,vex,evex,evex,evex") (set (attr "mode") - (cond [(match_test "TARGET_AVX2") + (cond [(and (eq_attr "alternative" "3,4") + (match_test " < 64 && !TARGET_AVX512VL")) + (const_string "XI") + (match_test "TARGET_AVX2") (const_string "") (match_test "TARGET_AVX") (if_then_else @@ -17310,7 +17324,15 @@ (match_test "optimize_function_for_size_p (cfun)")) (const_string "V4SF") ] - (const_string "")))]) + (const_string ""))) + (set (attr "enabled") + (cond [(eq_attr "alternative" "3") + (symbol_ref " == 64 || TARGET_AVX512VL") + (eq_attr "alternative" "4") + (symbol_ref " == 64 || TARGET_AVX512VL + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)") + ] + (const_string "*")))]) ;; PR target/100711: Split notl; vpbroadcastd; vpand as vpbroadcastd; vpandn (define_split --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-di-zmm-2.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=512 -O2" } */ +/* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\\\$0x44, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */ +/* { dg-final { scan-assembler-not "vpbroadcast" } } */ + +#define type __m512i +#define vec 512 +#define op andnot +#define suffix epi64 +#define SCALAR long long + +#include "avx512-binop-2.h" --- a/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c +++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c @@ -1,7 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-mavx512f -O2" } */ -/* { dg-final { scan-assembler-times "vpbroadcastd\[^\n\]*%zmm\[0-9\]+" 1 } } */ -/* { dg-final { scan-assembler-times "vpandnd\[^\n\]*%zmm\[0-9\]+" 1 } } */ +/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$0x44, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */ +/* { dg-final { scan-assembler-not "vpbroadcast" } } */ #define type __m512i #define vec 512 --- a/gcc/testsuite/gcc.target/i386/pr100711-3.c +++ b/gcc/testsuite/gcc.target/i386/pr100711-3.c @@ -37,4 +37,6 @@ v8di foo_v8di (long long a, v8di b) return (__extension__ (v8di) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) & b; } -/* { dg-final { scan-assembler-times "vpandn" 4 } } */ +/* { dg-final { scan-assembler-times "vpandn" 4 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "vpandn" 2 { target { ia32 } } } } */ +/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0x44" 2 { target { ia32 } } } } */