Message-ID: <29cc4487-c76f-d098-127e-946ff8a92291@linaro.org>
Date: Thu, 14 Apr 2022 16:40:06 -0300
Subject: Re: [PATCH 4/7] x86: Add SSSE3 optimized chacha20
To: Noah Goldstein
Cc: GNU C Library
References: <20220413202401.408267-1-adhemerval.zanella@linaro.org> <20220413202401.408267-5-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella
List-Id: Libc-alpha mailing list

On 14/04/2022 16:25, Noah Goldstein wrote:
> On Wed, Apr 13, 2022 at 3:27 PM Adhemerval Zanella via Libc-alpha wrote:
>>
>> +
>> +static inline void
>> +chacha20_crypt (struct chacha20_state *state, uint8_t *dst, const uint8_t *src,
>> +                size_t bytes)
>> +{
>> +  if (CPU_FEATURE_USABLE_P (cpu_features, SSSE3) && bytes >= CHACHA20_BLOCK_SIZE * 4)
>> +    {
>> +      size_t nblocks = bytes / CHACHA20_BLOCK_SIZE;
>> +      nblocks -= nblocks % 4;
>
> Are we locking ourselves into the api of __chacha_* expecting
> this precomputation?
> I imagine we might want to move this to
> assembly unless `nblocks` is a compile-time constant.
>
>> +      __chacha20_ssse3_blocks8 (state->ctx, dst, src, nblocks);
>> +      bytes -= nblocks * CHACHA20_BLOCK_SIZE;
>> +      dst += nblocks * CHACHA20_BLOCK_SIZE;
>> +      src += nblocks * CHACHA20_BLOCK_SIZE;
>> +    }
>> +
>> +  if (bytes > 0)
>> +    chacha20_crypt_generic (state, dst, src, bytes);
>> +}
>> --
>> 2.32.0
>>

I think it should be OK to _Static_assert that CHACHA20_BUFSIZE is a
multiple of the number of blocks the optimized version expects, and just
call it without the need to handle nblocks. I will change it in v2.
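
Roughly, the idea could look like the sketch below. It is only an illustration, not the v2 code: the values of CHACHA20_BLOCK_SIZE and CHACHA20_BUFSIZE are assumed, CHACHA20_OPT_NBLOCKS is a hypothetical name for the block group the optimized routine consumes, and fake_chacha20_blocks is a stand-in for the assembly routine (it XORs with a fixed byte and is not a real keystream).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values for illustration only; the real definitions are in the
   patch series.  */
#define CHACHA20_BLOCK_SIZE 64
#define CHACHA20_BUFSIZE (8 * CHACHA20_BLOCK_SIZE)

/* Hypothetical name: number of blocks the optimized routine handles per
   iteration.  */
#define CHACHA20_OPT_NBLOCKS 4

/* Checked at compile time, so callers need no runtime rounding of
   nblocks and no generic tail path.  */
_Static_assert (CHACHA20_BUFSIZE
                % (CHACHA20_OPT_NBLOCKS * CHACHA20_BLOCK_SIZE) == 0,
                "CHACHA20_BUFSIZE must be a multiple of the optimized "
                "block group");

/* Counts blocks handled by the stand-in routine, for the usage check.  */
static size_t blocks_processed;

/* Stand-in for the optimized assembly routine.  It XORs each byte with a
   constant; it does NOT compute ChaCha20.  */
static void
fake_chacha20_blocks (uint8_t *dst, const uint8_t *src, size_t nblocks)
{
  /* The routine may assume a whole number of block groups.  */
  assert (nblocks % CHACHA20_OPT_NBLOCKS == 0);
  for (size_t i = 0; i < nblocks * CHACHA20_BLOCK_SIZE; i++)
    dst[i] = src[i] ^ 0x5a;
  blocks_processed += nblocks;
}

/* Processes exactly CHACHA20_BUFSIZE bytes.  The block count is a
   compile-time constant, so there is no remainder to handle.  */
static void
fill_buffer (uint8_t *dst, const uint8_t *src)
{
  fake_chacha20_blocks (dst, src, CHACHA20_BUFSIZE / CHACHA20_BLOCK_SIZE);
}
```

With the size fixed this way, the call site passes a constant block count, and the rounding done by `nblocks -= nblocks % 4` is no longer needed at the call site.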