From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <d1819835-5eba-7a23-c2c3-1e875c1f34d1@gmail.com>
Date: Tue, 27 Dec 2022 13:46:10 -0700
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.5.1
Subject: Re: [PATCH] RISC-V: Fix RVV mask mode size
Content-Language: en-US
To: Richard Biener <richard.guenther@gmail.com>
Cc: =?UTF-8?B?6ZKf5bGF5ZOy?= <juzhe.zhong@rivai.ai>,
 gcc-patches <gcc-patches@gcc.gnu.org>, "kito.cheng" <kito.cheng@gmail.com>,
 palmer <palmer@dabbelt.com>
References: <20221214064825.240605-1-juzhe.zhong@rivai.ai>
 <190019d9-155b-e0d1-43d3-d9baae96a2cc@gmail.com>
 <27D1642B23C2C0D1+2022121709442960704166@rivai.ai>
 <75eb29fd-6449-e2d1-2702-d297373cecf3@gmail.com>
 <CAFiYyc2pbx43X4_gObAX3yGrLFiiJKa_NvqfUJb2Wjkfa4Hzmw@mail.gmail.com>
From: Jeff Law <jeffreyalaw@gmail.com>
In-Reply-To: <CAFiYyc2pbx43X4_gObAX3yGrLFiiJKa_NvqfUJb2Wjkfa4Hzmw@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
List-Id: <gcc-patches.gcc.gnu.org>


On 12/19/22 00:44, Richard Biener wrote:
> On Sat, Dec 17, 2022 at 2:54 AM Jeff Law via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>>
>>
>> On 12/16/22 18:44, 钟居哲 wrote:
>>> Yes, VNx4DF only has 4 bits in mask mode in the case of load and store.
>>> For example, with vlm or vsm we will load/store 8 bits? (I am not sure
>>> the hardware can load/store 4 bits, but I am sure it definitely does not
>>> load/store the whole register size.)
>> More likely than not you end up loading a larger quantity with the high
>> bits zeroed.  Interesting that we're using a packed model.  I'd been
>> told it was fairly expensive to implement in hardware relative to the
>> cost of implementing the sparse model.
> 
> Since the masks are extra inputs, if you use a packed model you need
> to wire fewer bits into the execution units for the masks, which I guess
> is actually cheaper.  Yes, producing the masks might be more complicated.
We went through this at a prior employer and the hardware guys argued 
strongly that a packed model for mask registers was just too expensive 
to implement.  I don't think it was the # of wires, but the muxes.  The 
number of wires into the unit was an issue when we started talking about 
sub-byte masking :-)

Conceptually, on the hardware side each bit in the mask corresponds to a
byte in a vector register.  When the element size is 8 bits, then
obviously there is a 1:1 correspondence between potentially masked
elements and bits in the mask register.

When the element size is 32 bits, each element covers 4 bits of the mask
register: a single bit that is queried for masked operations and 3
don't-care bits.  So if you had a 128-bit vector with 32 bits per element,
a mask register might have a value like:

0xxx 1xxx 1xxx 0xxx

A 128 bit vector with 64 bits per element might be:

0xxx xxxx 1xxx xxxx

Where the xxxs are don't cares and the 0/1 are the masks.
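
To make the two layouts concrete, here is a minimal Python sketch.  The
function names (sparse_mask, packed_mask) are made up for illustration,
and it assumes the written order used above, with the first element's
bit leftmost; it does not model any particular hardware.

```python
def sparse_mask(vector_bits, element_bits, active):
    """Sparse layout: one mask bit per vector byte.

    Only the first bit of each element is queried; the remaining
    bits belonging to that element are don't-cares ('x').
    """
    bytes_per_element = element_bits // 8
    lanes = vector_bits // element_bits
    if len(active) != lanes:
        raise ValueError("need one flag per element")
    # Each element contributes its mask bit followed by its don't-care bits.
    chars = "".join(
        ("1" if flag else "0") + "x" * (bytes_per_element - 1)
        for flag in active
    )
    # Group in fours, matching the way the examples above are written.
    return " ".join(chars[i:i + 4] for i in range(0, len(chars), 4))


def packed_mask(active):
    """Packed layout: one mask bit per element, no don't-care bits."""
    return "".join("1" if flag else "0" for flag in active)


print(sparse_mask(128, 32, [0, 1, 1, 0]))  # 0xxx 1xxx 1xxx 0xxx
print(sparse_mask(128, 64, [0, 1]))        # 0xxx xxxx 1xxx xxxx
print(packed_mask([0, 1, 1, 0]))           # 0110
```

In the sparse form the position of an element's mask bit does not depend
on the element size, which is presumably where the mux savings come
from; the packed form needs fewer bits overall.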


> 
> The only "issue" might be with 4, 2 and 1 bit masks, which would
> have a size of 8 bits but a smaller precision, so endianness might
> play a role.
> 
> Btw, this is all similar to AVX512, where we don't even use
> vector BI modes but integer modes for the mask, which
> then becomes QImode for 1, 2, 4 and 8 bit masks and
> HImode for 16, SImode for 32 and DImode for 64 bit masks.
Right.  I think in hindsight that might have been a mistake.
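
For reference, the mapping Richard describes can be written down as a
small Python sketch.  The table only restates the mask widths listed
above; the helper name mask_mode and the unused-bit count are
illustrative, not GCC code.

```python
# Integer mode used for each AVX512 mask width, as listed above.
MASK_MODE = {1: "QImode", 2: "QImode", 4: "QImode", 8: "QImode",
             16: "HImode", 32: "SImode", 64: "DImode"}

# Size in bits of each integer mode.
MODE_BITS = {"QImode": 8, "HImode": 16, "SImode": 32, "DImode": 64}


def mask_mode(mask_bits):
    """Return (mode, unused_bits) for a mask of the given precision."""
    mode = MASK_MODE[mask_bits]
    # Bits of the mode not covered by the mask's precision.
    return mode, MODE_BITS[mode] - mask_bits


for width in (1, 2, 4, 8, 16, 32, 64):
    print(width, *mask_mode(width))
```

The 1, 2 and 4 bit masks are the cases where the mode is wider than the
mask, i.e. the size is 8 bits but the precision is less.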

jeff