From: Sylvain Noiry
To: Toon Moene, gcc@gcc.gnu.org
Cc: piannetta@kalrayinc.com
Date: Wed, 18 Oct 2023 09:24:49 +0200
Subject: Re: Complex numbers support: discussions summary
In-Reply-To: <567d0202-bb1a-4242-96f0-e6bc93d237a1@moene.org>

Hello Toon,

The implementation is not finished; we have only run some tests so far.
If no one sees major problems with this new approach, we will continue
to implement and stabilize it.

Thank you for your interest!

Sylvain

On 10/17/23 22:37, Toon Moene wrote:
> Sylvain,
>
> Is this on a branch in your github repository
>
>     https://github.com/kalray/gcc
>
> somewhere ?
>
> That would make it easier to test it for me (and probably others).
>
> See for instance my mail here (d.d. Thu Oct 5 14:45:05 GMT 2023):
>
> https://gcc.gnu.org/pipermail/gcc/2023-October/242643.html
>
> Thanks in advance.
>
> Kind regards,
>
> Toon Moene.
>
> On 10/16/23 11:14, Sylvain Noiry via Gcc wrote:
>
>> Hi,
>>
>> We are trying to update our patches on complex numbers to take into
>> account what has been discussed.
>>
>> The main change from our previous patches consists of replacing
>> vectors of complex types with classical vectors of real types (e.g.
>> V4SF instead of V2SC) associated with existing complex opcodes (like
>> .COMPLEX_MUL) when vectorizing.  Non-vectorized complex modes are also
>> replaced by vectors of two reals at the end of the middle-end (e.g. SC
>> to V2SF), so that already existing patterns can be reused.
>> Indeed, operations that are not complex-specific, such as an
>> addition, no longer require a specific pattern, and already
>> implemented patterns like cmul, cmul_conj, cadd90,... can be used.
>>
>> To do so, the cplxlower pass has been cut into two passes:
>>    - The first one replaces complex-specific operations with dedicated
>> opcodes (like .COMPLEX_MUL replacing MUL_EXPR with SC mode), but
>> complex modes are kept at this point.  Unsupported native operations
>> are also lowered, because we assume that it is better to lower and
>> hope for standard optimizations in the middle-end than to try to
>> vectorize with a near-zero chance of success and lower only afterwards.
>>    - The second one does little more than remap non-vectorized complex
>> modes into vectors of two reals (like SC to V2SF).
>>
>> So the vectorizer takes complex modes as input but vectorizes with
>> vectors of real modes (e.g. V4SF vector mode for SC). Because
>> complex-specific opcodes have been set before, no confusion with real
>> operations is possible. We could also use vectors of two reals as
>> inputs, but vectorizing small vector modes into bigger ones (like
>> V2SF to V4SF) is not possible.
>>
>> Here are some advantages of this new approach:
>>    - No more vectors of complex modes
>>    - The vectorization of complex operations is improved, because
>> split and unified vectorized statements can easily be mixed, as they
>> use the same vector type. We can also imagine testing multiple options
>> (First: native vectored, second: split vectored, third: unified
>> scalar,...).
>>    - It reuses patterns for vectors of two reals for operations that
>> are not complex-specific, and also already existing complex patterns
>> like cmul implemented on aarch64, which could mean almost free
>> performance gains on many targets.
>>
>> On the performance side, we can still exploit the full potential of
>> complex instructions on KVX.
>> To illustrate the gains on aarch64
>> without rewriting any patterns (except a mov), here is the assembly
>> generated for a vector complex mul mul add with -O2 -mcpu=neoverse-v1
>> (and without -ffast-math like with SLP):
>>
>> void vfmma (_Complex float a[restrict N], _Complex float b[restrict N],
>>             _Complex float c[restrict N], _Complex float d[restrict N])
>> {
>>    for (int i = 0; i < N; i++)
>>      c[i] += a[i] * b[i] * d[i];
>> }
>>
>>
>> vfmma:
>>          movi    v3.4s, 0
>>          mov     x4, 0
>>          .align  5
>> .L2:
>>          ldr     q2, [x1, x4]
>>          mov     v1.16b, v3.16b
>>          ldr     q0, [x0, x4]
>>          fcmla   v1.4s, v0.4s, v2.4s, #0
>>          fcmla   v1.4s, v0.4s, v2.4s, #90
>>          ldr     q0, [x2, x4]
>>          ldr     q2, [x3, x4]
>>          fcmla   v0.4s, v2.4s, v1.4s, #0
>>          fcmla   v0.4s, v2.4s, v1.4s, #90
>>          str     q0, [x2, x4]
>>          add     x4, x4, 16
>>          cmp     x4, 256
>>          bne     .L2
>>          ret
>>
>> We have only done some experimentation with this approach.  If you
>> think that it could be interesting we will try to develop it more.
>>
>> Thanks,
>>
>> Sylvain
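
As a rough illustration of why interleaved vectors of reals plus the two
fcmla rotations are enough for the loop quoted above, here is a minimal C
sketch. It is not code from the patches. N = 32 is an assumption inferred
from the 256-byte loop bound in the listing (8 bytes per _Complex float),
and fcmla_rot0, fcmla_rot90, vfmma_interleaved and vfmma_versions_agree
are hypothetical helper names that model one complex lane in scalar code
rather than using real NEON intrinsics.

```c
#include <string.h>

/* Assumption: 256-byte loop bound / 8 bytes per _Complex float. */
#define N 32

/* Reference: the loop from the mail, on native complex types. */
void vfmma(_Complex float a[restrict N], _Complex float b[restrict N],
           _Complex float c[restrict N], _Complex float d[restrict N])
{
    for (int i = 0; i < N; i++)
        c[i] += a[i] * b[i] * d[i];
}

/* Scalar model of one complex lane of fcmla with rotation #0:
   accumulate the partial products that use Re(x). */
static void fcmla_rot0(float acc[2], const float x[2], const float y[2])
{
    acc[0] += x[0] * y[0];
    acc[1] += x[0] * y[1];
}

/* Rotation #90: accumulate the partial products that use Im(x).
   Rotations #0 and #90 together give acc += x * y. */
static void fcmla_rot90(float acc[2], const float x[2], const float y[2])
{
    acc[0] -= x[1] * y[1];
    acc[1] += x[1] * y[0];
}

/* Same loop on interleaved reals (re, im, re, im, ...), following the
   instruction order of the listing: t = a * b, then c += d * t. */
void vfmma_interleaved(const float *a, const float *b, float *c,
                       const float *d)
{
    for (int i = 0; i < N; i++) {
        float t[2] = { 0.0f, 0.0f };
        fcmla_rot0(t, &a[2 * i], &b[2 * i]);
        fcmla_rot90(t, &a[2 * i], &b[2 * i]);
        fcmla_rot0(&c[2 * i], &d[2 * i], t);
        fcmla_rot90(&c[2 * i], &d[2 * i], t);
    }
}

/* Returns 1 if both versions agree on small integer-valued inputs. */
int vfmma_versions_agree(void)
{
    float fa[2 * N], fb[2 * N], fc[2 * N], fd[2 * N], ref[2 * N];
    _Complex float a[N], b[N], c[N], d[N];

    for (int i = 0; i < N; i++) {
        fa[2 * i] = (float)(i % 5 + 1);  fa[2 * i + 1] = (float)(i % 3 - 1);
        fb[2 * i] = (float)(i % 4 - 2);  fb[2 * i + 1] = (float)(i % 7);
        fc[2 * i] = (float)i;            fc[2 * i + 1] = (float)-i;
        fd[2 * i] = (float)(2 - i % 3);  fd[2 * i + 1] = (float)(i % 2 + 1);
    }

    /* A complex value has the same layout as an array of two reals,
       so the interleaved buffers can be copied in directly. */
    memcpy(a, fa, sizeof a);
    memcpy(b, fb, sizeof b);
    memcpy(c, fc, sizeof c);
    memcpy(d, fd, sizeof d);

    vfmma(a, b, c, d);
    vfmma_interleaved(fa, fb, fc, fd);
    memcpy(ref, c, sizeof ref);

    for (int k = 0; k < 2 * N; k++)
        if (ref[k] != fc[k])
            return 0;
    return 1;
}
```

Calling vfmma_versions_agree() from a small main should return 1. The
inputs are small integers so that every intermediate value is exactly
representable in single precision, which lets the two versions be
compared without a tolerance.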