Date: Fri, 3 Nov 2023 19:28:23 +0100
Subject: Re: [AVR PATCH] Improvements to SImode and PSImode shifts by constants.
To: Roger Sayle, gcc-patches@gcc.gnu.org
Cc: 'Denis Chertykov'
From: Georg-Johann Lay
References: <026501da0d83$457b0d20$d0712760$@nextmovesoftware.com>
In-Reply-To: <026501da0d83$457b0d20$d0712760$@nextmovesoftware.com>

On 02.11.23 at 12:54, Roger Sayle wrote:
>
> This patch provides non-looping implementations for more SImode (32-bit)
> and PSImode (24-bit) shifts on AVR.  For most cases, these are shorter
> and faster than using a loop, but for a few (controlled by optimize_size)

Maybe this should also adjust the insn costs, like in avr_rtx_costs_1?

Depending on what you are outputting, avr_asm_len() might be more
convenient.

What I am not sure about are the test cases that expect exact
instruction sequences; those might become annoying in the future.

Johann

> they are a little larger but significantly faster.  The approach is to
> perform byte-based shifts by 1, 2 or 3 bytes, followed by bit-based shifts
> (effectively in a narrower type) for the remaining bits, beyond 8, 16 or 24.
>
> For example, the simple test case below (inspired by PR 112268):
>
> unsigned long foo(unsigned long x)
> {
>   return x >> 26;
> }
>
> gcc -O2 currently generates:
>
> foo:    ldi r18,26
> 1:      lsr r25
>         ror r24
>         ror r23
>         ror r22
>         dec r18
>         brne 1b
>         ret
>
> which is 8 instructions, and takes ~158 cycles.
> With this patch, we now generate:
>
> foo:    mov r22,r25
>         clr r23
>         clr r24
>         clr r25
>         lsr r22
>         lsr r22
>         ret
>
> which is 7 instructions, and takes ~7 cycles.
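
For readers following the thread, the decomposition described in the quoted text can be sketched in plain C. This is only an illustration under stated assumptions, not code from the patch or from avr.cc: the helper name lshr32_by_bytes_then_bits and the byte array standing in for r22..r25 are made up here. It performs a logical right shift by a constant as a whole-byte move followed by single-bit shifts over the bytes that can still be non-zero:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of avr.cc: logical right shift of a
   32-bit value by n (0..31), split into a byte-based shift followed
   by a bit-based shift on the remaining, narrower value.  */
static uint32_t
lshr32_by_bytes_then_bits (uint32_t x, unsigned n)
{
  unsigned bytes = n / 8;   /* whole-byte part: 0, 1, 2 or 3 */
  unsigned bits = n % 8;    /* remaining single-bit shifts: 0..7 */
  uint8_t b[4];

  /* Split into bytes, least significant first (think r22..r25).  */
  for (unsigned i = 0; i < 4; i++)
    b[i] = (uint8_t) (x >> (8 * i));

  /* Byte-based shift: move surviving bytes down, clear the rest.  */
  for (unsigned i = 0; i < 4; i++)
    b[i] = (i + bytes < 4) ? b[i + bytes] : 0;

  /* Bit-based shift: only the low 4 - bytes bytes need shifting.  */
  for (unsigned k = 0; k < bits; k++)
    {
      unsigned carry = 0;
      for (int i = 3 - (int) bytes; i >= 0; i--)
        {
          unsigned next = b[i] & 1u;
          b[i] = (uint8_t) ((b[i] >> 1) | (carry << 7));
          carry = next;
        }
    }

  return (uint32_t) b[0] | ((uint32_t) b[1] << 8)
         | ((uint32_t) b[2] << 16) | ((uint32_t) b[3] << 24);
}

/* Self-check mirroring the quoted foo() example: x >> 26.  */
void
check_shift_by_26 (void)
{
  assert (lshr32_by_bytes_then_bits (0xFFFFFFFFu, 26) == 0x3Fu);
}
```

For n = 26 this gives bytes = 3 and bits = 2, which corresponds to the mov/clr/clr/clr followed by two lsr instructions in the quoted output. The sketch loops over the bit count for brevity; the point of the patch is that the compiler emits those single-bit shifts unrolled.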
>
> One complication is that the modified functions sometimes use spaces instead
> of TABs, with occasional mistakes in GNU-style formatting, so I've fixed
> these indentation/whitespace issues.  There's no change in the code for the
> cases previously handled/special-cased, with the exception of ashrqi3 reg,5
> where with -Os a (4-instruction) loop is shorter than the five single-bit
> shifts of a fully unrolled implementation.
>
> This patch has been (partially) tested with a cross-compiler to avr-elf
> hosted on x86_64, without a simulator, where the compile-only tests in
> the gcc testsuite show no regressions.  If someone could test this more
> thoroughly that would be great.
>
>
> 2023-11-02  Roger Sayle
>
> gcc/ChangeLog
>         * config/avr/avr.cc (ashlqi3_out): Fix indentation whitespace.
>         (ashlhi3_out): Likewise.
>         (avr_out_ashlpsi3): Likewise.  Handle shifts by 9 and 17-22.
>         (ashlsi3_out): Fix formatting.  Handle shifts by 9 and 25-30.
>         (ashrqi3_out): Use loop for shifts by 5 when optimizing for size.
>         Fix indentation whitespace.
>         (ashrhi3_out): Likewise.
>         (avr_out_ashrpsi3): Likewise.  Handle shifts by 17.
>         (ashrsi3_out): Fix indentation.  Handle shifts by 17 and 25.
>         (lshrqi3_out): Fix whitespace.
>         (lshrhi3_out): Likewise.
>         (avr_out_lshrpsi3): Likewise.  Handle shifts by 9 and 17-22.
>         (lshrsi3_out): Fix indentation.  Handle shifts by 9,17,18 and 25-30.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/avr/ashlsi-1.c: New test case.
>         * gcc.target/avr/ashlsi-2.c: Likewise.
>         * gcc.target/avr/ashrsi-1.c: Likewise.
>         * gcc.target/avr/ashrsi-2.c: Likewise.
>         * gcc.target/avr/lshrsi-1.c: Likewise.
>         * gcc.target/avr/lshrsi-2.c: Likewise.
>
>
> Thanks in advance,
> Roger
> --
>