From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from server.nextmovesoftware.com (server.nextmovesoftware.com [162.254.253.69]) by sourceware.org (Postfix) with ESMTPS id 8D2863856DEC for ; Thu, 7 Dec 2023 16:04:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8D2863856DEC Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=nextmovesoftware.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=nextmovesoftware.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 8D2863856DEC Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=162.254.253.69 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701965074; cv=none; b=vP95cA5Jofa2vdQT5lEuovMzrntiMmJVafHSQBZKV3wJUZOxSsz4FCcrOIUT0oCatuSVFA4BtjMnLBgfx6dp6nBG7UR1BL/E028DO/nKdZqWoWs+q7p0ovpdRrfSYks5b/h169o5f/MzWRdpIOSPbirw0p+ci/PShKqs+fhnBa8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701965074; c=relaxed/simple; bh=y10m4xSajBwswBYmwC9x32PiJlRkatUCknijHLlpp/c=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=N53wr8U5jETN8yfoSHHTCx2oUaRTSeFHO9XpFAZ0UknC+syvuEbfO/33u53veOfwvUoPm9AjBkBficAdGBlOn6qc1WVDv7rE/vzovOOnd6uGVIUhytgjpKtMuB5Il2fpHXG6ESMzbmX0SOLOW1XhDLiY+z87kQvkLyYtxbfqSEs= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=nextmovesoftware.com; s=default; h=Content-Type:MIME-Version:Message-ID: Date:Subject:In-Reply-To:References:Cc:To:From:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=eF/m5WdJaB/matIzIuXV5as0nK/NLbfnw0cpsy0B6p8=; b=flPMG74NPZs13ylrGIFc0ECAE2 LuTzjmfhxsp9ioqUsDAiwBRkwbbTDDqbTVnw9sXEFhw5Fz4uQrYan3fp+Mpov1fRwubl8Z5jDXmrb ggJr8/fi+wXH0GVU6pyj7XjDdhPHikBkZw8IC2lTWzsVeYjdqcbUXYIoV9NqvjB4kHH1h/eSFFnKr 1nvqlpru7PgWAcNpI4W4ERiLI1imswzu/RLUPuRa619lgQW2pF/vXdsKF3FJVv/T2k2VEB3+8G6/f u0Tiri6CDV6nWVsqRFj/MDqB6cuT+pdpraYR72wHldxNGusisy7bV8f6n9ObOi4DKuVUQobarSKjm tN0oAggA==; Received: from [185.62.158.67] (port=55390 helo=Dell) by server.nextmovesoftware.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96.2) (envelope-from ) id 1rBGrP-0008F8-25; Thu, 07 Dec 2023 11:04:23 -0500 From: "Roger Sayle" To: "'Jeff Law'" , Cc: "'Claudiu Zissulescu'" References: <007e01da2783$50bb6b70$f2324250$@nextmovesoftware.com> In-Reply-To: Subject: RE: [ARC PATCH] Add *extvsi_n_0 define_insn_and_split for PR 110717. Date: Thu, 7 Dec 2023 16:04:22 -0000 Message-ID: <00ad01da2927$0c1afe90$2450fbb0$@nextmovesoftware.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_00AE_01DA2927.0C1AFE90" X-Mailer: Microsoft Outlook 16.0 Thread-Index: AQGntB44aVSwGP4OcW00jh2G3uHTRgJ5p9S8sO8P2BA= Content-Language: en-gb X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - server.nextmovesoftware.com X-AntiAbuse: Original Domain - gcc.gnu.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - nextmovesoftware.com X-Get-Message-Sender-Via: server.nextmovesoftware.com: authenticated_id: roger@nextmovesoftware.com X-Authenticated-Sender: server.nextmovesoftware.com: roger@nextmovesoftware.com X-Source: X-Source-Args: X-Source-Dir: X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: This is a multipart message in MIME format. ------=_NextPart_000_00AE_01DA2927.0C1AFE90 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Hi Jeff, Doh! Great catch. The perils of not (yet) being able to actually run any ARC execution tests myself. > Shouldn't operands[4] be GEN_INT ((HOST_WIDE_INT_1U << tmp) - 1)? Yes(-ish), operands[4] should be GEN_INT(HOST_WIDE_INT_1U << (tmp - 1)). And the 32s in the test cases need to be 16s (the MSB of a five bit = field is 16). You're probably also thinking the same thing that I am... that it might = be possible to implement this in the middle-end, but things are complicated by = combine's make_compound_operation/expand_compound_operation, and that combine doesn't (normally) like turning two instructions into three. Fingers-crossed the attached patch works better on the nightly testers. Thanks in advance, Roger -- > -----Original Message----- > From: Jeff Law > Sent: 07 December 2023 14:47 > To: Roger Sayle ; gcc-patches@gcc.gnu.org > Cc: 'Claudiu Zissulescu' > Subject: Re: [ARC PATCH] Add *extvsi_n_0 define_insn_and_split for PR = 110717. >=20 > On 12/5/23 06:59, Roger Sayle wrote: > > This patch improves the code generated for bitfield sign extensions = on > > ARC cpus without a barrel shifter. > > > > > > Compiling the following test case: > > > > int foo(int x) { return (x<<27)>>27; } > > > > with -O2 -mcpu=3Dem, generates two loops: > > > > foo: mov lp_count,27 > > lp 2f > > add r0,r0,r0 > > nop > > 2: # end single insn loop > > mov lp_count,27 > > lp 2f > > asr r0,r0 > > nop > > 2: # end single insn loop > > j_s [blink] > > > > > > and the closely related test case: > > > > struct S { int a : 5; }; > > int bar (struct S *p) { return p->a; } > > > > generates the slightly better: > > > > bar: ldb_s r0,[r0] > > mov_s r2,0 ;3 > > add3 r0,r2,r0 > > sexb_s r0,r0 > > asr_s r0,r0 > > asr_s r0,r0 > > j_s.d [blink] > > asr_s r0,r0 > > > > which uses 6 instructions to perform this particular sign extension. > > It turns out that sign extensions can always be implemented using at > > most three instructions on ARC (without a barrel shifter) using the > > idiom ((x&mask)^msb)-msb [as described in section "2-5 Sign = Extension" > > of Henry Warren's book "Hacker's Delight"]. Using this, the sign > > extensions above on ARC's EM both become: > > > > bmsk_s r0,r0,4 > > xor r0,r0,32 > > sub r0,r0,32 > > > > which takes about 3 cycles, compared to the ~112 cycles for the = loops > > in foo. > > > > > > Tested with a cross-compiler to arc-linux hosted on x86_64, with no > > new (compile-only) regressions from make -k check. > > Ok for mainline if this passes Claudiu's nightly testing? > > > > > > 2023-12-05 Roger Sayle > > > > gcc/ChangeLog > > * config/arc/arc.md (*extvsi_n_0): New = define_insn_and_split to > > implement SImode sign extract using a AND, XOR and MINUS = sequence. > > > > gcc/testsuite/ChangeLog > > * gcc.target/arc/extvsi-1.c: New test case. > > * gcc.target/arc/extvsi-2.c: Likewise. > > > > > > Thanks in advance, > > Roger > > -- > > > > > > patchar.txt > > > > diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md index > > bf9f88eff047..5ebaf2e20ab0 100644 > > --- a/gcc/config/arc/arc.md > > +++ b/gcc/config/arc/arc.md > > @@ -6127,6 +6127,26 @@ archs4x, archs4xd" > > "" > > [(set_attr "length" "8")]) > > > > +(define_insn_and_split "*extvsi_n_0" > > + [(set (match_operand:SI 0 "register_operand" "=3Dr") > > + (sign_extract:SI (match_operand:SI 1 "register_operand" "0") > > + (match_operand:QI 2 "const_int_operand") > > + (const_int 0)))] > > + "!TARGET_BARREL_SHIFTER > > + && IN_RANGE (INTVAL (operands[2]), 2, > > + (optimize_insn_for_size_p () ? 28 : 30))" > > + "#" > > + "&& 1" > > +[(set (match_dup 0) (and:SI (match_dup 0) (match_dup 3))) (set > > +(match_dup 0) (xor:SI (match_dup 0) (match_dup 4))) (set = (match_dup > > +0) (minus:SI (match_dup 0) (match_dup 4)))] { > > + int tmp =3D INTVAL (operands[2]); > > + operands[3] =3D GEN_INT (~(HOST_WIDE_INT_M1U << tmp)); > > + operands[4] =3D GEN_INT (HOST_WIDE_INT_1U << tmp); > Shouldn't operands[4] be GEN_INT ((HOST_WIDE_INT_1U << tmp) - 1)? > Otherwise it's flipping the wrong bit AFAICT. >=20 > H8 can benefit from the same transformation which is how I found this = little goof. > It's not as big a gain as ARC, but it does affect one of those = builtin-overflow tests > which tend to dominate testing time on the H8. >=20 > jeff ------=_NextPart_000_00AE_01DA2927.0C1AFE90 Content-Type: text/plain; name="patcharb.txt" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="patcharb.txt" diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md=0A= index bf9f88eff047..d980876eff8f 100644=0A= --- a/gcc/config/arc/arc.md=0A= +++ b/gcc/config/arc/arc.md=0A= @@ -6127,6 +6127,26 @@ archs4x, archs4xd"=0A= ""=0A= [(set_attr "length" "8")])=0A= =0A= +(define_insn_and_split "*extvsi_n_0"=0A= + [(set (match_operand:SI 0 "register_operand" "=3Dr")=0A= + (sign_extract:SI (match_operand:SI 1 "register_operand" "0")=0A= + (match_operand:QI 2 "const_int_operand")=0A= + (const_int 0)))]=0A= + "!TARGET_BARREL_SHIFTER=0A= + && IN_RANGE (INTVAL (operands[2]), 2,=0A= + (optimize_insn_for_size_p () ? 28 : 30))"=0A= + "#"=0A= + "&& 1"=0A= +[(set (match_dup 0) (and:SI (match_dup 0) (match_dup 3)))=0A= + (set (match_dup 0) (xor:SI (match_dup 0) (match_dup 4)))=0A= + (set (match_dup 0) (minus:SI (match_dup 0) (match_dup 4)))]=0A= +{=0A= + int tmp =3D INTVAL (operands[2]);=0A= + operands[3] =3D GEN_INT (~(HOST_WIDE_INT_M1U << tmp));=0A= + operands[4] =3D GEN_INT (HOST_WIDE_INT_1U << (tmp - 1));=0A= +}=0A= + [(set_attr "length" "14")])=0A= +=0A= (define_insn_and_split "rotlsi3_cnt1"=0A= [(set (match_operand:SI 0 "dest_reg_operand" "=3Dr")=0A= (rotate:SI (match_operand:SI 1 "register_operand" "r")=0A= diff --git a/gcc/testsuite/gcc.target/arc/extvsi-1.c = b/gcc/testsuite/gcc.target/arc/extvsi-1.c=0A= new file mode 100644=0A= index 000000000000..5ac6feafae30=0A= --- /dev/null=0A= +++ b/gcc/testsuite/gcc.target/arc/extvsi-1.c=0A= @@ -0,0 +1,15 @@=0A= +/* { dg-do compile } */=0A= +/* { dg-options "-O2 -mcpu=3Dem" } */=0A= +struct S { int a : 5; };=0A= +=0A= +int foo (struct S *p)=0A= +{=0A= + return p->a;=0A= +}=0A= +=0A= +/* { dg-final { scan-assembler "msk_s\\s+r0,r0,4" } } */=0A= +/* { dg-final { scan-assembler "xor\\s+r0,r0,16" } } */=0A= +/* { dg-final { scan-assembler "sub\\s+r0,r0,16" } } */=0A= +/* { dg-final { scan-assembler-not "add3\\s+r0,r2,r0" } } */=0A= +/* { dg-final { scan-assembler-not "sext_s\\s+r0,r0" } } */=0A= +/* { dg-final { scan-assembler-not "asr_s\\s+r0,r0" } } */=0A= diff --git a/gcc/testsuite/gcc.target/arc/extvsi-2.c = b/gcc/testsuite/gcc.target/arc/extvsi-2.c=0A= new file mode 100644=0A= index 000000000000..953ea6a8b243=0A= --- /dev/null=0A= +++ b/gcc/testsuite/gcc.target/arc/extvsi-2.c=0A= @@ -0,0 +1,12 @@=0A= +/* { dg-do compile } */=0A= +/* { dg-options "-O2 -mcpu=3Dem" } */=0A= +=0A= +int foo(int x)=0A= +{=0A= + return (x<<27)>>27;=0A= +}=0A= +=0A= +/* { dg-final { scan-assembler "msk_s\\s+r0,r0,4" } } */=0A= +/* { dg-final { scan-assembler "xor\\s+r0,r0,16" } } */=0A= +/* { dg-final { scan-assembler "sub\\s+r0,r0,16" } } */=0A= +/* { dg-final { scan-assembler-not "lp\\s+2f" } } */=0A= ------=_NextPart_000_00AE_01DA2927.0C1AFE90--