From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tejas Belagod
Date: Wed, 26 Jul 2023 12:51:17 +0530
Subject: Re: [RFC] GNU Vector Extension -- Packed Boolean Vectors
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org
Content-Type: text/plain; charset=UTF-8; format=flowed
On 7/17/23 5:46 PM, Richard Biener wrote:
> On Fri, Jul 14, 2023 at 12:18 PM Tejas Belagod wrote:
>>
>> On 7/13/23 4:05 PM, Richard Biener wrote:
>>> On Thu, Jul 13, 2023 at 12:15 PM Tejas Belagod wrote:
>>>>
>>>> On 7/3/23 1:31 PM, Richard Biener wrote:
>>>>> On Mon, Jul 3, 2023 at 8:50 AM Tejas Belagod wrote:
>>>>>>
>>>>>> On 6/29/23 6:55 PM, Richard Biener wrote:
>>>>>>> On Wed, Jun 28, 2023 at 1:26 PM Tejas Belagod wrote:
>>>>>>>>
>>>>>>>> From: Richard Biener
>>>>>>>> Date: Tuesday, June 27, 2023 at 12:58 PM
>>>>>>>> To: Tejas Belagod
>>>>>>>> Cc: gcc-patches@gcc.gnu.org
>>>>>>>> Subject: Re: [RFC] GNU Vector Extension -- Packed Boolean Vectors
>>>>>>>>
>>>>>>>> On Tue, Jun 27, 2023 at 8:30 AM Tejas Belagod wrote:
>>>>>>>>>
>>>>>>>>> From: Richard Biener
>>>>>>>>> Date: Monday, June 26, 2023 at 2:23 PM
>>>>>>>>> To: Tejas Belagod
>>>>>>>>> Cc: gcc-patches@gcc.gnu.org
>>>>>>>>> Subject: Re: [RFC] GNU Vector Extension -- Packed Boolean Vectors
>>>>>>>>>
>>>>>>>>> On Mon, Jun 26, 2023 at 8:24 AM Tejas Belagod via Gcc-patches wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Packed Boolean Vectors
>>>>>>>>>> ----------------------
>>>>>>>>>>
>>>>>>>>>> I'd like to propose a feature addition to GNU Vector extensions to add
>>>>>>>>>> packed boolean vectors (PBV).
>>>>>>>>>> This has been discussed in the past here[1] and a variant has
>>>>>>>>>> been implemented in Clang recently[2].
>>>>>>>>>>
>>>>>>>>>> With predication features being added to vector architectures (SVE, MVE, AVX),
>>>>>>>>>> it is a useful feature to have to model predication on targets. This could
>>>>>>>>>> find its use in intrinsics or just be used as is as a GNU vector extension
>>>>>>>>>> mapped to underlying target features. For example, the packed boolean vector
>>>>>>>>>> could directly map to a predicate register on SVE.
>>>>>>>>>>
>>>>>>>>>> Also, this new packed boolean type GNU extension can be used with SVE ACLE
>>>>>>>>>> intrinsics to replace a fixed-length svbool_t.
>>>>>>>>>>
>>>>>>>>>> Here are a few options to represent the packed boolean vector type.
>>>>>>>>>
>>>>>>>>> The GIMPLE frontend uses a new 'vector_mask' attribute:
>>>>>>>>>
>>>>>>>>> typedef int v8si __attribute__((vector_size(8*sizeof(int))));
>>>>>>>>> typedef v8si v8sib __attribute__((vector_mask));
>>>>>>>>>
>>>>>>>>> it gets you a vector type that's the appropriate (dependent on the
>>>>>>>>> target) vector mask type for the vector data type (v8si in this case).
>>>>>>>>>
>>>>>>>>> Thanks Richard.
>>>>>>>>>
>>>>>>>>> Having had a quick look at the implementation, it does seem to tick the
>>>>>>>>> boxes. I must admit I haven't dug deep, but if the target hook allows the
>>>>>>>>> mask to be defined in a way that is target-friendly (and I don't know how
>>>>>>>>> much effort it will be to migrate the attribute to more front-ends), it
>>>>>>>>> should do the job nicely. Let me go back and dig a bit deeper and get back
>>>>>>>>> with questions if any.
>>>>>>>>
>>>>>>>> Let me add that the advantage of this is the compiler doesn't need
>>>>>>>> to support weird explicitly laid out packed boolean vectors that do
>>>>>>>> not match what the target supports, and the user doesn't need to know
>>>>>>>> what the target supports (and thus have an #ifdef maze around explicitly
>>>>>>>> specified layouts).
>>>>>>>>
>>>>>>>> Sorry for the delayed response - I spent a day experimenting with
>>>>>>>> vector_mask.
>>>>>>>>
>>>>>>>> Yeah, this is what option 4 in the RFC is trying to achieve - be portable
>>>>>>>> enough to avoid having to sprinkle the code with ifdefs.
>>>>>>>>
>>>>>>>> It does remove some flexibility though, for example with -mavx512f -mavx512vl
>>>>>>>> you'll get AVX512 style masks for V4SImode data vectors, but of course the
>>>>>>>> target still supports SSE2/AVX2 style masks as well. Those would not be
>>>>>>>> available as "packed boolean vectors", though they are of course in fact
>>>>>>>> equal to V4SImode data vectors with -1 or 0 values, so in this particular
>>>>>>>> case it might not matter.
>>>>>>>>
>>>>>>>> That said, the vector_mask attribute will get you V4SImode vectors with
>>>>>>>> signed boolean elements of 32 bits for V4SImode data vectors with
>>>>>>>> SSE2/AVX2.
>>>>>>>>
>>>>>>>> This sounds very much like what the scenario would be with NEON vs SVE.
>>>>>>>> Coming to think of it, vector_mask resembles option 4 in the proposal with
>>>>>>>> 'n' implied by the 'base' vector type and a 'w' specified for the type.
>>>>>>>>
>>>>>>>> Given its current implementation, if vector_mask is exposed to the CFE,
>>>>>>>> would there be any major challenges wrt implementation or defining
>>>>>>>> behaviour semantics? I played around with a few examples from the
>>>>>>>> testsuite and wrote some new ones.
>>>>>>>> I mostly tried operations that the new type would have to support
>>>>>>>> (unary, binary bitwise, initializations etc.) - with a couple of
>>>>>>>> exceptions, most of the ops seem to be supported. I also triggered a
>>>>>>>> couple of ICEs in some tests involving implicit conversions to
>>>>>>>> wider/narrower vector_mask types (will raise reports for these).
>>>>>>>> Correct me if I'm wrong here, but we'd probably have to support a couple
>>>>>>>> of new ops if vector_mask is exposed to the CFE - initialization and
>>>>>>>> subscript operations?
>>>>>>>
>>>>>>> Yes, either that or restrict how the mask vectors can be used, and thus
>>>>>>> properly diagnose improper uses.
>>>>>>
>>>>>> Indeed.
>>>>>>
>>>>>>> A question would be for example how to write common mask test
>>>>>>> operations like
>>>>>>> if (any (mask)) or if (all (mask)).
>>>>>>
>>>>>> I see 2 options here. New builtins could support new types - they'd
>>>>>> provide a target-independent way to test any and all conditions. Another
>>>>>> would be to let the target use its intrinsics to do them in the most
>>>>>> efficient way possible (which the builtins would get lowered down to
>>>>>> anyway).
>>>>>>
>>>>>>> Likewise writing merge operations
>>>>>>> - do those as
>>>>>>>
>>>>>>> a = a | (mask ? b : 0);
>>>>>>>
>>>>>>> thus use ternary ?: for this?
>>>>>>
>>>>>> Yes, like now, the ternary could just translate to
>>>>>>
>>>>>> {mask[0] ? b[0] : 0, mask[1] ? b[1] : 0, ... }
>>>>>>
>>>>>> One thing to flesh out is the semantics. Should we allow this operation
>>>>>> as long as the number of elements is the same, even if the mask type is
>>>>>> different, i.e.
>>>>>>
>>>>>> v4hib ? v4si : v4si;
>>>>>>
>>>>>> I don't see why this can't be allowed, as now we let
>>>>>>
>>>>>> v4si ? v4sf : v4sf;
>>>>>>
>>>>>>> For initialization regular vector
>>>>>>> syntax should work:
>>>>>>>
>>>>>>> mtype mask = (mtype){ -1, -1, 0, 0, ...
>>>>>>> };
>>>>>>>
>>>>>>> there's the question of the signedness of the mask elements. GCC
>>>>>>> internally uses signed bools with values -1 for true and 0 for false.
>>>>>>
>>>>>> One of the things is the value that represents true. This is largely
>>>>>> target-dependent when it comes to the vector_mask type. When vector_mask
>>>>>> types are created from GCC's internal representation of bool vectors
>>>>>> (signed ints), the point about implicit/explicit conversions from signed
>>>>>> int vectors to mask types in the proposal covers this. So mask in
>>>>>>
>>>>>> v4sib mask = (v4sib){-1, -1, 0, 0, ... }
>>>>>>
>>>>>> will probably end up being represented as 0x3xxxx on AVX512 and 0x11xxx
>>>>>> on SVE. On AVX2/SSE they'd still be represented as vectors of signed ints
>>>>>> {-1, -1, 0, 0, ... }. I'm not entirely confident what ramifications this
>>>>>> new mask type representation will have in the mid-end while being
>>>>>> converted back and forth to and from GCC's internal representation, but
>>>>>> I'm guessing this is already being handled at some level by the
>>>>>> vector_mask type's current support?
>>>>>
>>>>> Yes, I would guess so. Of course what the middle-end is currently exposed
>>>>> to is simply what the vectorizer generates - once fuzzers discover this
>>>>> feature we'll see "interesting" uses that might run into missed or wrong
>>>>> handling of them.
>>>>>
>>>>> So whatever we do on the side of exposing this to users, a good portion
>>>>> of testsuite coverage for the allowed use cases is important.
>>>>>
>>>>> Richard.
>>>>
>>>> Apologies for the long-ish reply, but here's a TLDR and the gory details
>>>> follow.
>>>>
>>>> TLDR:
>>>> GIMPLE's vector_mask type semantics seem to be target-dependent, so
>>>> elevating vector_mask to the CFE with the same semantics is undesirable.
>>>> OTOH, changing vector_mask to have target-independent CFE semantics will
>>>> cause a dichotomy between its CFE and GFE behaviours.
>>>> But the vector_mask approach scales well for sizeless types. Is the
>>>> solution to have something like vector_mask with defined
>>>> target-independent type semantics, but call it something else to prevent
>>>> conflation with GIMPLE, a viable option?
>>>>
>>>> Details:
>>>> After some more analysis of the proposed options, here are some
>>>> interesting findings:
>>>>
>>>> vector_mask looked like a very interesting option until I ran into some
>>>> semantic uncertainty. This code:
>>>>
>>>> typedef int v8si __attribute__((vector_size(8*sizeof(int))));
>>>> typedef v8si v8sib __attribute__((vector_mask));
>>>>
>>>> typedef short v8hi __attribute__((vector_size(8*sizeof(short))));
>>>> typedef v8hi v8hib __attribute__((vector_mask));
>>>>
>>>> v8si res;
>>>> v8hi resh;
>>>>
>>>> v8hib __GIMPLE () foo (v8hib x, v8sib y)
>>>> {
>>>>   v8hib res;
>>>>
>>>>   res = x & y;
>>>>   return res;
>>>> }
>>>>
>>>> when compiled on AArch64, produces a type-mismatch error for the binary
>>>> expression involving '&', because the 'derived' types 'v8hib' and 'v8sib'
>>>> have a different target layout. If the layout of these two 'derived'
>>>> types matches, the above code compiles without issue - which is the case
>>>> on the amdgcn-amdhsa target (amdgcn uses a scalar DImode mask mode). IoW,
>>>> such code seems to be allowed on some targets and not on others.
>>>>
>>>> With the same code, I tried putting casts in and it worked fine on
>>>> AArch64 and amdgcn. This target-specific behaviour of vector_mask derived
>>>> types will be difficult to specify once we move it to the CFE - in fact
>>>> we probably don't want target-specific behaviour once it moves to the CFE.
>>>>
>>>> If we expose vector_mask to the CFE, we'd have to specify consistent
>>>> semantics for vector_mask types. We'd have to resolve ambiguities like
>>>> 'v4hib & v4sib' clearly to be able to specify the semantics of the type
>>>> system involving vector_mask.
>>>> If we do this, don't we run the risk of a dichotomy between the CFE and
>>>> GFE semantics of vector_mask? I'm assuming we'd want to retain
>>>> vector_mask semantics as they are in GIMPLE.
>>>>
>>>> If we want to enforce consistent semantics for vector_mask in the CFE,
>>>> one way is to treat vector_mask types as distinct if they're 'attached'
>>>> to distinct data vector types. In such a scenario, vector_mask types
>>>> attached to two data vector types with the same lane-width and number of
>>>> lanes would be classified as distinct. For eg:
>>>>
>>>> typedef int v8si __attribute__((vector_size(8*sizeof(int))));
>>>> typedef v8si v8sib __attribute__((vector_mask));
>>>>
>>>> typedef float v8sf __attribute__((vector_size(8*sizeof(float))));
>>>> typedef v8sf v8sfb __attribute__((vector_mask));
>>>>
>>>> v8si foo (v8sf x, v8sf y, v8si i, v8si j)
>>>> {
>>>>   (i == j) & (v8sib)(x == y) ? i : (v8si){0};
>>>> }
>>>>
>>>> This could be the case for unsigned vs signed int vectors too, for eg. -
>>>> which seems a bit unnecessary tbh.
>>>>
>>>> Though vector_mask's being 'attached' to a type has its drawbacks, it
>>>> does seem to have an advantage when sizeless types are considered. If we
>>>> have to define a sizeless vector boolean type that is implied by the
>>>> lane size, we could do something like
>>>>
>>>> typedef svint32_t svbool32_t __attribute__((vector_mask));
>>>>
>>>> int32_t foo (svint32_t a, svint32_t b)
>>>> {
>>>>   svbool32_t pred = a > b;
>>>>
>>>>   return pred[2] ? a[2] : b[2];
>>>> }
>>>>
>>>> This is harder to do in the other schemes proposed so far as they're
>>>> size-based.
>>>>
>>>> To be able to free the boolean from the base type (not size) and retain
>>>> vector_mask's flexibility to declare sizeless types, we could have an
>>>> attribute that is more flexibly-typed and only 'derives' the lane-size
>>>> and number of lanes from its 'base' type without actually inheriting the
>>>> actual base type (char, short, int etc.) or its signedness.
>>>> This creates a purer, stand-alone boolean type without the associated
>>>> semantic complexity of having to cast between two same-size types with
>>>> the same number of lanes. Eg.
>>>>
>>>> typedef int v8si __attribute__((vector_size(8*sizeof(int))));
>>>> typedef v8si v8b __attribute__((vector_bool));
>>>>
>>>> However, with differing lane-sizes, there will have to be a cast, as the
>>>> 'derived' element size is different, which could impact the layout of
>>>> the vector mask. Eg.
>>>>
>>>> v8si foo (v8hi x, v8hi y, v8si i, v8si j)
>>>> {
>>>>   (v8sib)(x == y) & (i == j) ? i : (v8si){0};
>>>> }
>>>>
>>>> Such conversions on targets like AVX512/AMDGCN will be a NOP, but
>>>> non-trivial on SVE (depending on the implemented layout of the bool
>>>> vector).
>>>>
>>>> vector_bool decouples us from having to retain the behaviour of
>>>> vector_mask and provides the flexibility of not having to cast across
>>>> same-element-size vector types. Wrt sizeless types, it could scale well.
>>>>
>>>> typedef svint32_t svbool32_t __attribute__((vector_bool));
>>>> typedef svint16_t svbool16_t __attribute__((vector_bool));
>>>>
>>>> int32_t foo (svint32_t a, svint32_t b)
>>>> {
>>>>   svbool32_t pred = a > b;
>>>>
>>>>   return pred[2] ? a[2] : b[2];
>>>> }
>>>>
>>>> int16_t bar (svint16_t a, svint16_t b)
>>>> {
>>>>   svbool16_t pred = a > b;
>>>>
>>>>   return pred[2] ? a[2] : b[2];
>>>> }
>>>>
>>>> On SVE, pred[2] refers to bit 4 for svint16_t and bit 8 for svint32_t in
>>>> the target predicate.
>>>>
>>>> Thoughts?
>>>
>>> The GIMPLE frontend accepts just what is valid on the target here. Any
>>> "plumbing" such as implicit conversions (if we do not want to require
>>> explicit ones even when NOP) needs to be done/enforced by the C frontend.
>>>
>> Sorry, I'm not sure I follow - correct me if I'm wrong here.
>>
>> If we desire to define/allow operations like implicit/explicit
>> conversion on vector_mask types in the CFE, don't we have to start from a
>> position of defining what makes vector_mask types distinct and therefore
>> require implicit/explicit conversions?
>
> We need to look at which operations we want to produce vector masks,
> which operations consume them, and which operations operate on them.
>
> In GIMPLE, comparisons produce them, conditionals consume them and
> we allow bitwise ops to operate on them directly (GIMPLE doesn't have
> logical &&, it just has bitwise &).
>

Thanks for your thoughts - after I spent more cycles researching and
experimenting, I think I understand the driving factors here.

Comparison producers generate signed integer vectors of the same
lane-width as the comparison operands. This means mixed-type vectors
can't be applied to conditional consumers or bitwise operators, eg:

v8si foo (v8si a, v8si b, v8hi c, v8hi d)
{
  return a > b || c > d;                                  // error!
  return a > b || __builtin_convertvector (c > d, v8si);  // OK.
  return a | b && c | d;                                  // error!
  return a | b && __builtin_convertvector (c | d, v8si);  // OK.
}

Similarly, if we extend these 'stricter-typing' rules to vector_mask, it
could look like:

typedef v8si v8sib __attribute__((vector_mask));
typedef v8hi v8hib __attribute__((vector_mask));

v8sib foo (v8si a, v8si b, v8hi c, v8hi d)
{
  v8sib psi = a > b;
  v8hib phi = c > d;

  return psi || phi;                                   // error!
  return psi || __builtin_convertvector (phi, v8sib);  // OK.
  return psi | phi;                                    // error!
  return psi | __builtin_convertvector (phi, v8sib);   // OK.
}

At the GIMPLE stage, on targets where the layout allows it (eg AMDGCN),
expressions like psi | __builtin_convertvector (phi, v8sib) can be
optimized to psi | phi because __builtin_convertvector (phi, v8sib) is a
NOP. I think this could make vector_mask more portable across targets.
If one wants to take CFE vector_mask code and run it on the GFE, it
should work, while the reverse won't, as the CFE vector_mask rules are
more restrictive. Does this look like a sensible approach for progress?

>> IIUC, the GFE's distinctness of vector_mask types depends on how the
>> mask mode is implemented on the target. If implemented in the CFE,
>> vector_mask types' distinctness probably shouldn't be based on target
>> layout and could be based on the type they're 'attached' to.
>
> But since we eventually run on the target, the layout should ideally
> match that of the target. Now, the question is whether that's ever
> OK behavior - it effectively makes the mask somewhat opaque and
> only "observable" by probing it in defined manners.
>
>> Wouldn't that diverge from target-specific GFE behaviour - or are you
>> suggesting it's OK for vector_mask type semantics to be different in the
>> CFE and GFE?
>
> It's definitely undesirable, but as said I'm not sure it has to differ
> [the layout].
>

I agree it is best to have a consistent layout of vector_mask across the
CFE and GFE and also implement it to match the target layout for optimal
code quality. For observability, I think it makes sense to allow
operations that are relevant and have a consistent meaning irrespective
of the target. Eg. 'vector_mask & 2' might not mean the same thing on all
targets, but vector_mask[2] does. Therefore, I think the opaqueness is
useful and necessary to some extent.

Thanks,
Tejas.

>>> There's one issue I can see that wasn't mentioned yet - GCC currently
>>> accepts
>>>
>>> typedef long gv1024di __attribute__((vector_size(1024*8)));
>>>
>>> even if there's no underlying support on the target, which either has
>>> support only for smaller vectors or no vectors at all. Currently
>>> vector_mask will simply fail to produce sth desirable here. What's your
>>> idea of making that not target dependent?
>>> GCC will later lower operations with such vectors, possibly splitting
>>> them up into sizes supported by the hardware natively, possibly
>>> performing elementwise operations. For the former one would need to
>>> guess the "decomposition type" and based on that select the mask type
>>> [layout]?
>>>
>>> One idea would be to specify that the mask layout follows the largest
>>> vector kind supported by the target, and if there is none, follow the
>>> layout of (signed?) _Bool [n]? When there's no target support for
>>> vectors, GCC will generally use elementwise operations apart from some
>>> special-cases.
>>>
>> That is a very good point - thanks for raising it. For when GCC chooses
>> to lower to a vector type supported by the target, my initial thought
>> would be to, as you say, choose a mask that has enough bits to represent
>> the largest vector size with the smallest lane-width. The actual layout
>> of the mask will depend on how the target implements its mask mode.
>> Decomposition of the vector_mask type ought to follow the decomposition
>> of the GNU vector type, and each decomposed vector_mask type ought to
>> have enough bits to represent the decomposed GNU vector shape. It sounds
>> nice on paper, but I haven't really worked through a design for this. Do
>> you see any gotchas here?
>
> Not really. In the end it comes down to what the C writer is allowed to
> do with a vector mask. I would for example expect that I could do
>
> auto m = v1 < v2;
> _mm512_mask_sub_epi32 (a, m, b, c);
>
> so generic masks should inter-operate with intrinsics (when the
> appropriate ISA is enabled). That works for the data vectors themselves,
> for example (quite some intrinsics are implemented with GCC's generic
> vector code).
>
> I for example can't do
>
> _Bool lane2 = m[2];
>
> to inspect lane two of a mask with AVX512. I can do m & 2, but I wouldn't
> expect that to work (should I?) with a vector_mask mask (it's at least
> not valid directly in GIMPLE).
> There's _mm512_int2mask and _mm512_mask2int which transfer between
> masks and ints (but the mask types are really just typedef'd to
> integer types).
>
>>> While using a different name than vector_mask is certainly possible,
>>> it wouldn't be me to decide that, but I'm also not yet convinced it's
>>> really necessary. As said, what the GIMPLE frontend accepts
>>> or not shouldn't limit us here - just the actual chosen layout of the
>>> boolean vectors.
>>>
>> I'm just concerned about creating an alternate vector_mask functionality
>> in the CFE and risking not being consistent with the GFE.
>
> I think it's more important to double-check usability from the user's
> side. If the implementation necessarily diverges from GIMPLE then we can
> choose a different attribute name, but then it will also inevitably have
> code-generation (quality) issues, as GIMPLE matches what the hardware
> can do.
>
> Richard.
>
>> Thanks,
>> Tejas.
>
[...]
>>>>>>>>>> 1. __attribute__((vector_size (n))) where n represents bytes
>>>>>>>>>>
>>>>>>>>>> typedef bool vbool __attribute__ ((vector_size (1)));
>>>>>>>>>>
>>>>>>>>>> In this approach, the shape of the boolean vector is unclear. IoW, it is
>>>>>>>>>> not clear if each bit in 'n' controls a byte or an element.
>>>>>>>>>> On targets like SVE, it would be natural to have each bit
>>>>>>>>>> control a byte of the target vector (therefore resulting in an
>>>>>>>>>> 'unpacked' layout of the PBV) and on AVX, each bit would
>>>>>>>>>> control one element/lane of the target vector (therefore
>>>>>>>>>> resulting in a 'packed' layout with all significant bits at
>>>>>>>>>> the LSB).
>>>>>>>>>>
>>>>>>>>>> 2. __attribute__((vector_size (n))) where n represents the
>>>>>>>>>> number of lanes
>>>>>>>>>>
>>>>>>>>>> typedef int v4si __attribute__ ((vector_size (4 * sizeof (int))));
>>>>>>>>>> typedef bool v4bi __attribute__ ((vector_size (sizeof (v4si) / sizeof ((v4si){0}[0]))));
>>>>>>>>>>
>>>>>>>>>> Here the 'n' in the vector_size attribute represents the
>>>>>>>>>> number of bits needed to represent a vector quantity.  In this
>>>>>>>>>> case, this packed boolean vector can represent up to 'n'
>>>>>>>>>> vector lanes.  The size of the type is rounded up to the
>>>>>>>>>> nearest byte.  For example, sizeof (v4bi) in the above example
>>>>>>>>>> is 1.
>>>>>>>>>>
>>>>>>>>>> In this approach, because of the nature of the representation,
>>>>>>>>>> the n bits required to represent the n lanes of the vector are
>>>>>>>>>> packed at the LSB.  This does not naturally align with the SVE
>>>>>>>>>> approach of each bit representing a byte of the target vector,
>>>>>>>>>> with the PBV therefore having an 'unpacked' layout.
>>>>>>>>>>
>>>>>>>>>> More importantly, another drawback here is that the change in
>>>>>>>>>> units for vector_size might be confusing to programmers.  The
>>>>>>>>>> units will have to be interpreted based on the base type of
>>>>>>>>>> the typedef.  It does not offer any flexibility in terms of
>>>>>>>>>> the layout of the bool vector - it is fixed.
>>>>>>>>>>
>>>>>>>>>> 3. Combination of 1 and 2.
>>>>>>>>>>
>>>>>>>>>> Combining the best of 1 and 2, we can introduce extra
>>>>>>>>>> parameters to vector_size that will unambiguously represent
>>>>>>>>>> the layout of the PBV.
>>>>>>>>>> Consider
>>>>>>>>>>
>>>>>>>>>> typedef bool vbool __attribute__((vector_size (s, n[, w])));
>>>>>>>>>>
>>>>>>>>>> where 's' is the size in bytes, 'n' is the number of lanes and
>>>>>>>>>> the optional 3rd parameter 'w' is the number of bits of the
>>>>>>>>>> PBV that represent a lane of the target vector.  'w' would
>>>>>>>>>> allow a target to force a certain layout of the PBV.
>>>>>>>>>>
>>>>>>>>>> The 2-parameter form of vector_size allows the target to have
>>>>>>>>>> an implementation-defined layout of the PBV.  The target is
>>>>>>>>>> free to choose 'w', if it is not specified, to mirror the
>>>>>>>>>> target layout of predicate registers.  For example, AVX would
>>>>>>>>>> choose 'w' as 1 and SVE would choose s*8/n.
>>>>>>>>>>
>>>>>>>>>> As an example, to represent the result of a comparison on 2
>>>>>>>>>> int16x8_t, we'd need 8 lanes of booleans, which could be
>>>>>>>>>> represented by
>>>>>>>>>>
>>>>>>>>>> typedef bool v8b __attribute__ ((vector_size (2, 8)));
>>>>>>>>>>
>>>>>>>>>> SVE would implement the v8b layout to make every 2nd bit
>>>>>>>>>> significant, i.e. w == 2, and AVX would choose a layout where
>>>>>>>>>> all 8 consecutive bits packed at the LSB would be significant,
>>>>>>>>>> i.e. w == 1.
>>>>>>>>>>
>>>>>>>>>> This scheme would accommodate more than one target effectively
>>>>>>>>>> representing vector bools that mirror the target properties.
>>>>>>>>>>
>>>>>>>>>> 4. A new attribute
>>>>>>>>>>
>>>>>>>>>> This is based on a suggestion from Richard S in [3].
>>>>>>>>>> The idea is to introduce a new attribute to define the PBV and
>>>>>>>>>> make it general enough to
>>>>>>>>>>
>>>>>>>>>> * represent all targets flexibly (SVE, AVX etc.)
>>>>>>>>>> * represent sub-byte length predicates
>>>>>>>>>> * have no change in the units of vector_size/no new
>>>>>>>>>>   vector_size signature
>>>>>>>>>> * not have the number of bytes constrain the representation
>>>>>>>>>>
>>>>>>>>>> If we call the new attribute 'bool_vec' (for lack of a better
>>>>>>>>>> name), consider
>>>>>>>>>>
>>>>>>>>>> typedef bool vbool __attribute__((bool_vec (n[, w])))
>>>>>>>>>>
>>>>>>>>>> where 'n' represents the number of lanes/elements and the
>>>>>>>>>> optional 'w' is bits-per-lane.
>>>>>>>>>>
>>>>>>>>>> If 'w' is not specified, it and bytes-per-predicate are
>>>>>>>>>> implementation-defined based on the target.  If 'w' is
>>>>>>>>>> specified, sizeof (vbool) will be ceil (n*w/8).
>>>>>>>>>>
>>>>>>>>>> 5. Behaviour of the packed vector boolean type.
>>>>>>>>>>
>>>>>>>>>> Taking the example of one of the options above, the following
>>>>>>>>>> is an illustration of its behaviour.
>>>>>>>>>>
>>>>>>>>>> * ABI
>>>>>>>>>>
>>>>>>>>>> New ABI rules will need to be defined for this type - e.g.
>>>>>>>>>> alignment, PCS, mangling etc.
>>>>>>>>>>
>>>>>>>>>> * Initialization:
>>>>>>>>>>
>>>>>>>>>> Packed Boolean Vectors (PBVs) can be initialized like so:
>>>>>>>>>>
>>>>>>>>>> typedef bool v4bi __attribute__ ((vector_size (2, 4, 4)));
>>>>>>>>>> v4bi p = {false, true, false, false};
>>>>>>>>>>
>>>>>>>>>> Each value in the initializer constant is of type bool.  The
>>>>>>>>>> lowest-numbered element in the const array corresponds to the
>>>>>>>>>> LSbit of p, element 1 is assigned to bit 4, etc.
>>>>>>>>>> p is effectively a 2-byte bitmask with value 0x0010.
>>>>>>>>>>
>>>>>>>>>> With a different layout
>>>>>>>>>>
>>>>>>>>>> typedef bool v4bi __attribute__ ((vector_size (2, 4, 1)));
>>>>>>>>>> v4bi p = {false, true, false, false};
>>>>>>>>>>
>>>>>>>>>> p is effectively a 2-byte bitmask with value 0x0002.
>>>>>>>>>>
>>>>>>>>>> * Operations:
>>>>>>>>>>
>>>>>>>>>> Packed Boolean Vectors support the following operations:
>>>>>>>>>> . unary ~
>>>>>>>>>> . unary !
>>>>>>>>>> . binary &, | and ^
>>>>>>>>>> . assignments &=, |= and ^=
>>>>>>>>>> . comparisons <, <=, ==, !=, >= and >
>>>>>>>>>> . ternary operator ?:
>>>>>>>>>>
>>>>>>>>>> Operations are defined as applied to the individual elements,
>>>>>>>>>> i.e. the bits that are significant in the PBV.  Whether the
>>>>>>>>>> PBVs are treated as bitmasks or otherwise is
>>>>>>>>>> implementation-defined.
>>>>>>>>>>
>>>>>>>>>> Insignificant bits could affect the results of comparisons or
>>>>>>>>>> ternary operators.  In such cases, it is implementation-defined
>>>>>>>>>> how the unused bits are treated.
>>>>>>>>>>
>>>>>>>>>> . Subscript operator []
>>>>>>>>>>
>>>>>>>>>> For the subscript operator, the packed boolean vector acts
>>>>>>>>>> like an array of elements - the first or 0th-indexed element
>>>>>>>>>> being the LSbit of the PBV.  The subscript operator yields a
>>>>>>>>>> scalar boolean value.  For example:
>>>>>>>>>>
>>>>>>>>>> typedef bool v8b __attribute__ ((vector_size (2, 8, 2)));
>>>>>>>>>>
>>>>>>>>>> // Subscript operator result yields a boolean value.
>>>>>>>>>> // p[3] is the 7th LSbit and p[1] is the 3rd LSbit of p.
>>>>>>>>>> bool foo (v8b p, int n) { p[3] = true; return p[1]; }
>>>>>>>>>>
>>>>>>>>>> Out-of-bounds access: OOB access can be determined at compile
>>>>>>>>>> time given the strong typing of the PBVs.
>>>>>>>>>>
>>>>>>>>>> PBVs do not support the address-of operator (&) for elements
>>>>>>>>>> of PBVs.
>>>>>>>>>> . Implicit conversion from integer vectors to PBVs
>>>>>>>>>>
>>>>>>>>>> We would like to support the output of comparison operations
>>>>>>>>>> being PBVs.  This requires us to define the implicit
>>>>>>>>>> conversion from an integer vector to a PBV, as the results of
>>>>>>>>>> vector comparisons are integer vectors.
>>>>>>>>>>
>>>>>>>>>> To define this operation:
>>>>>>>>>>
>>>>>>>>>> bool_vector = vector <cmp> vector
>>>>>>>>>>
>>>>>>>>>> There is no change in the vector <cmp> vector behaviour, i.e.
>>>>>>>>>> this comparison would still produce an int_vector type as it
>>>>>>>>>> does now.
>>>>>>>>>>
>>>>>>>>>> temp_int_vec = vector <cmp> vector
>>>>>>>>>> bool_vec = temp_int_vec  // Implicit conversion from int_vec to bool_vec
>>>>>>>>>>
>>>>>>>>>> The implicit conversion from int_vec to bool_vec I'd define
>>>>>>>>>> simply to be:
>>>>>>>>>>
>>>>>>>>>> bool_vec[n] = (_Bool) int_vec[n]
>>>>>>>>>>
>>>>>>>>>> where the C11 standard rules apply:
>>>>>>>>>> 6.3.1.2 Boolean type - When any scalar value is converted to
>>>>>>>>>> _Bool, the result is 0 if the value compares equal to 0;
>>>>>>>>>> otherwise, the result is 1.
>>>>>>>>>>
>>>>>>>>>> [1] https://lists.llvm.org/pipermail/cfe-dev/2020-May/065434.html
>>>>>>>>>> [2] https://reviews.llvm.org/D88905
>>>>>>>>>> [3] https://reviews.llvm.org/D81083
>>>>>>>>>>
>>>>>>>>>> Thoughts?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Tejas.