From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qt1-x829.google.com (mail-qt1-x829.google.com [IPv6:2607:f8b0:4864:20::829]) by sourceware.org (Postfix) with ESMTPS id A4FF13858C2C for ; Thu, 18 Nov 2021 08:20:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A4FF13858C2C Received: by mail-qt1-x829.google.com with SMTP id n15so5353669qta.0 for ; Thu, 18 Nov 2021 00:20:07 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=q71rxFAEz2BR82B5bY9CnAp1hR0xR2z6xcMPy7GaVqo=; b=B/wjjAGyCUXlqH5pSB5LHuqMAQGeP7aJ0Su5Z50MJW7eNV0yFc/W85Q2lk8ixW8pcn AqXycv3QjxBbOZSbTyO3tfhk8/OHusyYsFj6xmBBFMYxPxpF5RHsG7YANJRbgO81w2s+ lcEoql8NRXpsnpU7PDBUORwdzGNEmQIzdLk5YpOSIVsIK/udoPVSs4jUQA1GfWyRv28N X6vGMpLAbTTEAeF038L9sKOXz4V3wacpKMsqq3pUfu3zoW1/GA0KDD5bT6uiT17BmnNk mMfeyNC2jft0LYOoRoAD8n+ktI41OMcNewzW+CUJUg6yjN7qa4qBpcfAPDxOWzszL9iK wjnA== X-Gm-Message-State: AOAM531VxQ+JV61p73NaHIejJxjqzD2TzRX+CrGsQZ9oDwh5N7cfhRrK gvZ05XkbrD8HcrsKpUtmMRsO5qTXorOpCrqS7lU= X-Google-Smtp-Source: ABdhPJwa3Y6C0Qys6z0UK8T9KSAydVX2JyxKjAjLuQo713n7I1Oi95K70oCG72WjxOPh0HEFVfR4VTgZtsO21e6Zsbs= X-Received: by 2002:ac8:5710:: with SMTP id 16mr24195166qtw.140.1637223607080; Thu, 18 Nov 2021 00:20:07 -0800 (PST) MIME-Version: 1.0 References: <20211118071825.1796-1-hongtao.liu@intel.com> In-Reply-To: <20211118071825.1796-1-hongtao.liu@intel.com> From: Uros Bizjak Date: Thu, 18 Nov 2021 09:19:55 +0100 Message-ID: Subject: Re: [PATCH] Don't allow mask/sse/mmx mov in TLS code sequences. To: liuhongt Cc: "gcc-patches@gcc.gnu.org" , "H. J. Lu" , Hongtao Liu Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-8.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Nov 2021 08:20:10 -0000 On Thu, Nov 18, 2021 at 8:18 AM liuhongt wrote: > > As change in assembler, refer to [1], this patch disallow mask/sse/mmx > mov in TLS code sequences which require integer MOV instructions. > > [1] https://sourceware.org/git/?p=binutils-gdb.git;a=patch;h=d7e3e627027fcf37d63e284144fe27ff4eba36b5 > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk and GCC11 upstream branch? > > gcc/ChangeLog: > > PR target/103275 > * config/i386/i386-protos.h (ix86_notls_memory): Declare. > * config/i386/i386.c (ix86_notls_memory): New function. > * config/i386/i386.md (*movsi_internal): Don't allow > mask/sse/mmx move in TLS code sequences. > (*movdi_internal): Ditto. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr103275.c: New test. > --- > gcc/config/i386/constraints.md | 5 + > gcc/config/i386/i386-protos.h | 1 + > gcc/config/i386/i386.c | 34 ++++++ > gcc/config/i386/i386.md | 18 +-- > gcc/testsuite/gcc.target/i386/pr103275.c | 148 +++++++++++++++++++++++ > 5 files changed, 197 insertions(+), 9 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr103275.c > > diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md > index 87cceac4cfb..489c76164a1 100644 > --- a/gcc/config/i386/constraints.md > +++ b/gcc/config/i386/constraints.md > @@ -186,6 +186,11 @@ (define_special_memory_constraint "Bc" > (and (match_operand 0 "memory_operand") > (match_test "constant_address_p (XEXP (op, 0))"))) > > +(define_special_memory_constraint "Bk" > + "@internal notls memory operand." > + (and (match_operand 0 "memory_operand") > + (match_test "ix86_notls_memory (op)"))) Why is the above declared as a special memory constraint? Also the predicate comment is missing and the description should say something like: @internal TLS address that allows insn using non-integer registers > (define_special_memory_constraint "Bn" > "@internal Memory operand without REX prefix." > (match_operand 0 "norex_memory_operand")) > diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h > index 7e05510c679..1fb09be8b7e 100644 > --- a/gcc/config/i386/i386-protos.h > +++ b/gcc/config/i386/i386-protos.h > @@ -243,6 +243,7 @@ extern unsigned int ix86_get_callcvt (const_tree); > #endif > > extern rtx ix86_tls_module_base (void); > +extern bool ix86_notls_memory (rtx); > extern bool ix86_tls_address_pattern_p (rtx); > extern rtx ix86_rewrite_tls_address (rtx); > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index c246c8736f5..f1b7f57b0ca 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -11628,6 +11628,40 @@ legitimize_tls_address (rtx x, enum tls_model model, bool for_mov) > return dest; > } > > +/* Return true if it's not tls memory, > + NB: it's different from ix86_tls_address_pattern_p since it also matchs > + gottpoff/gotntpoff. > + It's used to prevent KMOV/VMOV in TLS code sequences which require integer > + MOV instructions, refer to PR103275. */ > +bool > +ix86_notls_memory (rtx mem) I think it is better to avoid negative logic. So, something like Return true if the TLS address requires insn using integer registers. bool ix86_gpr_tls_address_pattern_p (and use not in the predicate) > +{ > + gcc_assert (MEM_P (mem)); > + > + rtx addr = XEXP (mem, 0); > + subrtx_var_iterator::array_type array; > + FOR_EACH_SUBRTX_VAR (iter, array, addr, ALL) > + { > + rtx op = *iter; > + if (GET_CODE (op) == UNSPEC) > + switch (XINT (op, 1)) > + { > + case UNSPEC_GOTNTPOFF: > + return false; > + case UNSPEC_TPOFF: > + if (!TARGET_64BIT) > + return false; > + break; > + default: > + break; > + } > + /* Should iter.skip_subrtxes (); > + if there's no inner UNSPEC in addr???. */ You should figure the above before submitting the patch. > + } > + > + return true; > +} > + > /* Return true if OP refers to a TLS address. */ > bool > ix86_tls_address_pattern_p (rtx op) > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > index 7b2de60706d..9feba81974b 100644 > --- a/gcc/config/i386/i386.md > +++ b/gcc/config/i386/i386.md > @@ -2164,9 +2164,9 @@ (define_split > > (define_insn "*movdi_internal" > [(set (match_operand:DI 0 "nonimmediate_operand" > - "=r ,o ,r,r ,r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,m,?r ,?*Yd,?r,?*v,?*y,?*x,*k,*k ,*r,*m,*k") > + "=r ,o ,r,r ,r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,m,?r ,?*Yd,?r,?*v,?*y,?*x,*k,*k,*k ,*r,*m,*k") > (match_operand:DI 1 "general_operand" > - "riFo,riF,Z,rem,i,re,C ,*y,m ,*y,*y,r ,C ,*v,m ,*v,v,*Yd,r ,*v,r ,*x ,*y ,*r,*km,*k,*k,CBC"))] > + "riFo,riF,Z,rem,i,re,C ,*y,Bk ,*y,*y,r ,C ,*v,Bk,*v,v,*Yd,r ,*v,r ,*x ,*y ,*r,*k,*Bk,*k,*k,CBC"))] > "!(MEM_P (operands[0]) && MEM_P (operands[1])) > && ix86_hardreg_mov_ok (operands[0], operands[1])" > { > @@ -2228,7 +2228,7 @@ (define_insn "*movdi_internal" > [(set (attr "isa") > (cond [(eq_attr "alternative" "0,1,17,18") > (const_string "nox64") > - (eq_attr "alternative" "2,3,4,5,10,11,23,25") > + (eq_attr "alternative" "2,3,4,5,10,11,23,26") > (const_string "x64") > (eq_attr "alternative" "19,20") > (const_string "x64_sse2") > @@ -2249,9 +2249,9 @@ (define_insn "*movdi_internal" > (const_string "ssemov") > (eq_attr "alternative" "21,22") > (const_string "ssecvt") > - (eq_attr "alternative" "23,24,25,26") > + (eq_attr "alternative" "23,24,25,26,27") > (const_string "mskmov") > - (eq_attr "alternative" "27") > + (eq_attr "alternative" "28") > (const_string "msklog") > (and (match_operand 0 "register_operand") > (match_operand 1 "pic_32bit_operand")) > @@ -2385,9 +2385,9 @@ (define_peephole2 > > (define_insn "*movsi_internal" > [(set (match_operand:SI 0 "nonimmediate_operand" > - "=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*rm,*k") > + "=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*k,*rm,*k") > (match_operand:SI 1 "general_operand" > - "g ,re,C ,*y,m ,*y,*y,r ,C ,*v,m ,*v,*v,r ,*r,*km,*k ,CBC"))] > + "g ,re,C ,*y,Bk ,*y,*y,r ,C ,*v,Bk,*v,*v,r ,*r,*k, *Bk,*k ,CBC"))] > "!(MEM_P (operands[0]) && MEM_P (operands[1])) > && ix86_hardreg_mov_ok (operands[0], operands[1])" > { > @@ -2452,9 +2452,9 @@ (define_insn "*movsi_internal" > (const_string "sselog1") > (eq_attr "alternative" "9,10,11,12,13") > (const_string "ssemov") > - (eq_attr "alternative" "14,15,16") > + (eq_attr "alternative" "14,15,16,17") > (const_string "mskmov") > - (eq_attr "alternative" "17") > + (eq_attr "alternative" "18") > (const_string "msklog") > (and (match_operand 0 "register_operand") > (match_operand 1 "pic_32bit_operand")) > diff --git a/gcc/testsuite/gcc.target/i386/pr103275.c b/gcc/testsuite/gcc.target/i386/pr103275.c > new file mode 100644 > index 00000000000..288e680ffb2 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr103275.c > @@ -0,0 +1,148 @@ > +/* { dg-do compile { target ia32 } } */ > +/* { dg-options "-O2 -march=tigerlake -fPIC" } */ > +/* { dg-final { scan-assembler-not {(?n)kmovd.*@gotntpoff} } } */ Can you please minimize the testcase? Uros. > + > +typedef unsigned short uint16_t; > +typedef int int32_t; > +typedef unsigned int uint32_t; > +typedef unsigned char uint8_t; > + > +typedef uint32_t in_addr_t; > +struct in_addr { in_addr_t s_addr; }; > + > +extern __thread const uint16_t * __libc_tsd_CTYPE_B __attribute__ ((tls_model ("initial-exec"))); > +extern __thread int __libc_errno __attribute__ ((tls_model ("initial-exec"))); > + > +enum > +{ > + _ISspace = ((5) < 8 ? ((1 << (5)) << 8) : ((1 << (5)) >> 8)), > +}; > + > + > +extern inline const uint16_t ** __attribute__ ((const)) > +__ctype_b_loc (void) > +{ > + return (&__libc_tsd_CTYPE_B); > +} > + > +extern unsigned long strtoul (const char*, char**, int); > +extern uint32_t __bswap_32 (in_addr_t); > +static int > +inet_aton_end (const char *cp, struct in_addr *addr, const char **endp) > +{ > + static const in_addr_t max[4] = { 0xffffffff, 0xffffff, 0xffff, 0xff }; > + in_addr_t val; > + char c; > + union iaddr > + { > + uint8_t bytes[4]; > + uint32_t word; > + } res; > + uint8_t *pp = res.bytes; > + int digit; > + > + int saved_errno = __libc_errno; > + (__libc_errno = (0)); > + > + res.word = 0; > + > + c = *cp; > + for (;;) > + { > + > + > + if (!({ int __c = (c); __c >= '0' && __c <= '9'; })) > + goto ret_0; > + { > + char *endp; > + unsigned long ul = strtoul (cp, &endp, 0); > + if (ul == > + (0x7fffffffL * 2UL + 1UL) > + && __libc_errno == > + 34 > + ) > + goto ret_0; > + if (ul > 0xfffffffful) > + goto ret_0; > + val = ul; > + digit = cp != endp; > + cp = endp; > + } > + c = *cp; > + if (c == '.') > + { > + > + > + > + > + if (pp > res.bytes + 2 || val > 0xff) > + goto ret_0; > + *pp++ = val; > + c = *++cp; > + } > + else > + break; > + } > + > + if (c != '\0' && (!(((c) & ~0x7f) == 0) || !((*__ctype_b_loc ())[(int) ((c))] & (unsigned short int) _ISspace))) > + goto ret_0; > + > + if (!digit) > + goto ret_0; > + > + > + > + if (val > max[pp - res.bytes]) > + goto ret_0; > + > + if (addr != > + ((void *)0) > + ) > + addr->s_addr = res.word | __bswap_32 (val); > + *endp = cp; > + > + (__libc_errno = (saved_errno)); > + return 1; > + > + ret_0: > + (__libc_errno = (saved_errno)); > + return 0; > +} > + > +int > +__inet_aton_exact (const char *cp, struct in_addr *addr) > +{ > + struct in_addr val; > + const char *endp; > + > + if (inet_aton_end (cp, &val, &endp) != 0 && *endp == 0) > + { > + *addr = val; > + return 1; > + } > + else > + return 0; > +} > + > + > + > +int > +__inet_aton_ignore_trailing (const char *cp, struct in_addr *addr) > +{ > + const char *endp; > + return inet_aton_end (cp, addr, &endp); > +} > +extern __typeof (__inet_aton_ignore_trailing) inet_aton __attribute__ ((weak, alias ("__inet_aton_ignore_trailing"))) __attribute__ ((__copy__ (__inet_aton_ignore_trailing))); > + > + > + > +in_addr_t > +__inet_addr (const char *cp) > +{ > + struct in_addr val; > + const char *endp; > + if (inet_aton_end (cp, &val, &endp)) > + return val.s_addr; > + return ((in_addr_t) 0xffffffff); > +} > +extern __typeof (__inet_addr) inet_addr __attribute__ ((weak, alias ("__inet_addr"))) __attribute__ ((__copy__ (__inet_addr))); > -- > 2.18.1 >