From: Kito Cheng
Date: Tue, 30 May 2023 23:45:45 +0800
Subject: Re: Re: [PATCH] RISC-V: Basic VLS code gen for RISC-V
To: "juzhe.zhong@rivai.ai"
Cc: Richard Biener, Robin Dapp, "Kito.cheng", gcc-patches, palmer, jeffreyalaw, "pan2.li"

This is a long mail, but I think it explains most of the high-level reasoning behind what I did.

I guess I skipped too much of the story about VLS-mode support. VLS-mode support can be split into the middle-end and the back-end.

# Middle-end

As Richard mentioned, those VLS types can be held by VLA modes; for example, int32x4_t can be held by VNx4SI mode. So IMO there are three different options here:

1) Use VLS type with VLS mode in the middle-end.
2) Use VLS type with VLA mode in the middle-end.
3) Use VLA type with VLA mode.

Option 2 might be weird and not natural to implement in GCC, so let me ignore that.

Option 3 is a possible way, and actually I did that in our downstream compiler, and then... we found that it is not friendly to optimization. A few practical examples:

With a VLA type it is hard to represent a vector constructor other than a step or a splat/duplicated value; we need to push those values into memory first and then load them with len_load. So constant propagation and folding can't work well here, since it is hard to evaluate them with an unknown vector length.

It is also not friendly to pointer alias analysis: because the length is unknown, GCC must be conservative, which blocks some optimizations due to AA issues.

So IMO using the VLS type with VLS mode is the best way in the middle-end.

# Back-end

OK, it's back-end time. We have two options in the back-end to support the VLS type: support it with VLS mode or with VLA mode.

What does supporting it with VLA mode mean? Convert the VLS-type stuff into VLA-mode patterns and give the right length information, and then everything works.

But what is wrong with this path? Again, similar issues in the back-end: propagation and folding of constant vectors are limited when we hold them in a VLA type. We can't hold a const_vector other than a splat/duplicated value or a step value; it can't even be held during the combine process. For example, we have a = {1, 2, 3, 4} and b = {4, 3, 2, 1}. These can easily be represented in VLS-mode RTL but are impossible to represent in VLA-mode RTL. With VLS mode we can then fold a+b to {5, 5, 5, 5}, but VLA mode will have a bunch of problems optimizing that kind of thing.

And there is also the stack issue mentioned before: unless we can teach RA to track the length used for each register with VLA mode, I believe it would be terrible for RA...
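
To make the a+b example above concrete, here is a minimal sketch at the C source level, assuming the GNU C vector extension (vector_size) as the way to spell a VLS type. The type name int32x4_t comes from the discussion above; the helpers add_constants and all_lanes_equal are hypothetical names made up for illustration and are not part of the patch. Whether a given GCC actually folds the addition depends on the target and optimization level, so this only shows why folding is possible in principle: the length and every lane are known at compile time.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-length (VLS) vector of four 32-bit ints, written with the
   GNU C vector extension.  The name int32x4_t follows the mail.  */
typedef int32_t int32x4_t __attribute__ ((vector_size (16)));

/* The size is a compile-time constant, unlike a VLA vector type.  */
_Static_assert (sizeof (int32x4_t) == 16, "VLS vector has a known size");

/* Hypothetical helper: both operands are full constant vectors that
   are neither a splat nor a step, so they need the complete lane list
   to be represented.  With a known length the sum is foldable.  */
static inline int32x4_t
add_constants (void)
{
  int32x4_t a = {1, 2, 3, 4};
  int32x4_t b = {4, 3, 2, 1};
  return a + b;
}

/* Hypothetical helper: return 1 if every lane of V equals X.  */
static inline int
all_lanes_equal (int32x4_t v, int32_t x)
{
  for (int i = 0; i < 4; i++)
    if (v[i] != x)
      return 0;
  return 1;
}

/* Hypothetical self-check: a + b should be {5, 5, 5, 5}.  */
static inline void
check_fold_example (void)
{
  assert (all_lanes_equal (add_constants (), 5));
}
```

Calling check_fold_example () from a main function is enough to exercise it. With a VLA type there is no way to write the {1, 2, 3, 4} initializer directly, which is the constructor problem described in the middle-end section.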

# Back to this patch

Ju-Zhe has suggested we could reuse the VLA patterns for VLS mode. I considered that before; however, I feel it might not be friendly to the combine pass, because our VLA patterns are more complicated than plain VLS patterns. BUT I believe we will improve that in the near future :P so I think it should be reasonable to just use the same patterns. Then we could just add VLS modes to the mode iterators to support that without magic mode changing, which I can understand really seems very unsafe.