* Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different @ 2021-11-04 20:37 Qing Zhao 2021-11-05 1:34 ` Kewen.Lin 2021-11-05 6:05 ` Andrew Pinski 0 siblings, 2 replies; 18+ messages in thread From: Qing Zhao @ 2021-11-04 20:37 UTC (permalink / raw) To: gcc-patches Nick Alcock via Hi, I noticed that the macro “WIDE_INT_MAX_ELTS” has different values in GCC11 and GCC12 (on the same X86 machine) For gcc11: wide int max elts =3 For gcc12: wide int max elts =9 Does anyone know what’s the reason for this difference? Thanks a lot for any help. Qing ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-04 20:37 Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different Qing Zhao @ 2021-11-05 1:34 ` Kewen.Lin 2021-11-05 6:05 ` Andrew Pinski 1 sibling, 0 replies; 18+ messages in thread From: Kewen.Lin @ 2021-11-05 1:34 UTC (permalink / raw) To: Qing Zhao; +Cc: gcc-patches Nick Alcock via, H.J. Lu Hi Qing, on 2021/11/5 上午4:37, Qing Zhao via Gcc-patches wrote: > Hi, > > I noticed that the macro “WIDE_INT_MAX_ELTS” has different values in GCC11 and GCC12 (on the same X86 machine) > > For gcc11: > > wide int max elts =3 > > For gcc12: > > wide int max elts =9 > > Does anyone know what’s the reason for this difference? > I guess it's due to commit r12-979 (782e57f2c09). For #define WIDE_INT_MAX_ELTS \ ((MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT) / HOST_BITS_PER_WIDE_INT Before the change, the MAX_BITSIZE_MODE_ANY_INT is explicitly set as 160. -#define MAX_BITSIZE_MODE_ANY_INT (160) it's (160+64)/64 = 3 After the change, MAX_BITSIZE_MODE_ANY_INT is counted in function emit_max_int and becomes 512. it's (512+64)/64 = 9 As the commit log, the previous 160 bits seems a workaround for some gone problem, now the commit makes it use the default way to align with the other ports. BR, Kewen > Thanks a lot for any help. > > Qing > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-04 20:37 Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different Qing Zhao 2021-11-05 1:34 ` Kewen.Lin @ 2021-11-05 6:05 ` Andrew Pinski 2021-11-05 6:53 ` Jakub Jelinek 1 sibling, 1 reply; 18+ messages in thread From: Andrew Pinski @ 2021-11-05 6:05 UTC (permalink / raw) To: Qing Zhao; +Cc: gcc-patches Nick Alcock via On Thu, Nov 4, 2021 at 1:38 PM Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Hi, > > I noticed that the macro “WIDE_INT_MAX_ELTS” has different values in GCC11 and GCC12 (on the same X86 machine) > > For gcc11: > > wide int max elts =3 > > For gcc12: > > wide int max elts =9 > > Does anyone know what’s the reason for this difference? > > Thanks a lot for any help. Yes originally, the x86 backend only used OI and XI modes for vectors during data movement. This changed with r10-5741-gc57b4c22089 which added the use of OI mode for TImode adding with overflow and then MAX_BITSIZE_MODE_ANY_INT changed from 128 to 160 (in r10-6178-gc124b345e46078) to fix the ICE introduced by that change . And then with r12-979-g782e57f2c09 removed the define of MAX_BITSIZE_MODE_ANY_INT. Now what was not mentioned in r12-979-g782e57f2c09 (or before) of why MAX_BITSIZE_MODE_ANY_INT was defined in the first place for x86. HJL assumed there was some problem of why it was defined that way but not realizing memory usage was the reason. It was defined to keep the memory usage down as you see that it is now almost a 3x memory increase for all wi::wide_int. I do think r12-979-g782e57f2c09 should be reverted with an added comment on saying defining MAX_BITSIZE_MODE_ANY_INT here is to decrease the memory footprint. Thanks, Andrew Pinski > > Qing ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-05 6:05 ` Andrew Pinski @ 2021-11-05 6:53 ` Jakub Jelinek 2021-11-05 10:01 ` Richard Biener 0 siblings, 1 reply; 18+ messages in thread From: Jakub Jelinek @ 2021-11-05 6:53 UTC (permalink / raw) To: Andrew Pinski, hjl.tools; +Cc: Qing Zhao, gcc-patches Nick Alcock via On Thu, Nov 04, 2021 at 11:05:35PM -0700, Andrew Pinski via Gcc-patches wrote: > > I noticed that the macro “WIDE_INT_MAX_ELTS” has different values in GCC11 and GCC12 (on the same X86 machine) > > > > For gcc11: > > > > wide int max elts =3 > > > > For gcc12: > > > > wide int max elts =9 > > > > Does anyone know what’s the reason for this difference? > > > > Thanks a lot for any help. > > Yes originally, the x86 backend only used OI and XI modes for vectors > during data movement. > This changed with r10-5741-gc57b4c22089 which added the use of OI mode > for TImode adding with overflow and then MAX_BITSIZE_MODE_ANY_INT > changed from 128 to 160 (in r10-6178-gc124b345e46078) to fix the ICE > introduced by that change . > And then with r12-979-g782e57f2c09 removed the define of > MAX_BITSIZE_MODE_ANY_INT. > Now what was not mentioned in r12-979-g782e57f2c09 (or before) of why > MAX_BITSIZE_MODE_ANY_INT was defined in the first place for x86. HJL > assumed there was some problem of why it was defined that way but not > realizing memory usage was the reason. > It was defined to keep the memory usage down as you see that it is now > almost a 3x memory increase for all wi::wide_int. > I do think r12-979-g782e57f2c09 should be reverted with an added > comment on saying defining MAX_BITSIZE_MODE_ANY_INT here is to > decrease the memory footprint. I completely agree. Jakub ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-05 6:53 ` Jakub Jelinek @ 2021-11-05 10:01 ` Richard Biener 2021-11-05 12:25 ` H.J. Lu 2021-11-05 16:11 ` Qing Zhao 0 siblings, 2 replies; 18+ messages in thread From: Richard Biener @ 2021-11-05 10:01 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Andrew Pinski, H. J. Lu, gcc-patches Nick Alcock via On Fri, Nov 5, 2021 at 7:54 AM Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > On Thu, Nov 04, 2021 at 11:05:35PM -0700, Andrew Pinski via Gcc-patches wrote: > > > I noticed that the macro “WIDE_INT_MAX_ELTS” has different values in GCC11 and GCC12 (on the same X86 machine) > > > > > > For gcc11: > > > > > > wide int max elts =3 > > > > > > For gcc12: > > > > > > wide int max elts =9 > > > > > > Does anyone know what’s the reason for this difference? > > > > > > Thanks a lot for any help. > > > > Yes originally, the x86 backend only used OI and XI modes for vectors > > during data movement. > > This changed with r10-5741-gc57b4c22089 which added the use of OI mode > > for TImode adding with overflow and then MAX_BITSIZE_MODE_ANY_INT > > changed from 128 to 160 (in r10-6178-gc124b345e46078) to fix the ICE > > introduced by that change . > > And then with r12-979-g782e57f2c09 removed the define of > > MAX_BITSIZE_MODE_ANY_INT. > > Now what was not mentioned in r12-979-g782e57f2c09 (or before) of why > > MAX_BITSIZE_MODE_ANY_INT was defined in the first place for x86. HJL > > assumed there was some problem of why it was defined that way but not > > realizing memory usage was the reason. > > It was defined to keep the memory usage down as you see that it is now > > almost a 3x memory increase for all wi::wide_int. > > I do think r12-979-g782e57f2c09 should be reverted with an added > > comment on saying defining MAX_BITSIZE_MODE_ANY_INT here is to > > decrease the memory footprint. > > I completely agree. Do we have permanent objects embedding wide[st]_int? I know of class loop and loop_bound. Btw, there are other targets with large integer modes (aarch64 with XImode) and not defining MAX_BITSIZE_MODE_ANY_INT Richard. > Jakub > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-05 10:01 ` Richard Biener @ 2021-11-05 12:25 ` H.J. Lu 2021-11-05 16:11 ` Qing Zhao 1 sibling, 0 replies; 18+ messages in thread From: H.J. Lu @ 2021-11-05 12:25 UTC (permalink / raw) To: Richard Biener; +Cc: Jakub Jelinek, Andrew Pinski, gcc-patches Nick Alcock via On Fri, Nov 5, 2021 at 3:01 AM Richard Biener <richard.guenther@gmail.com> wrote: > > On Fri, Nov 5, 2021 at 7:54 AM Jakub Jelinek via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > > > On Thu, Nov 04, 2021 at 11:05:35PM -0700, Andrew Pinski via Gcc-patches wrote: > > > > I noticed that the macro “WIDE_INT_MAX_ELTS” has different values in GCC11 and GCC12 (on the same X86 machine) > > > > > > > > For gcc11: > > > > > > > > wide int max elts =3 > > > > > > > > For gcc12: > > > > > > > > wide int max elts =9 > > > > > > > > Does anyone know what’s the reason for this difference? > > > > > > > > Thanks a lot for any help. > > > > > > Yes originally, the x86 backend only used OI and XI modes for vectors > > > during data movement. > > > This changed with r10-5741-gc57b4c22089 which added the use of OI mode > > > for TImode adding with overflow and then MAX_BITSIZE_MODE_ANY_INT > > > changed from 128 to 160 (in r10-6178-gc124b345e46078) to fix the ICE > > > introduced by that change . > > > And then with r12-979-g782e57f2c09 removed the define of > > > MAX_BITSIZE_MODE_ANY_INT. > > > Now what was not mentioned in r12-979-g782e57f2c09 (or before) of why > > > MAX_BITSIZE_MODE_ANY_INT was defined in the first place for x86. HJL > > > assumed there was some problem of why it was defined that way but not > > > realizing memory usage was the reason. > > > It was defined to keep the memory usage down as you see that it is now > > > almost a 3x memory increase for all wi::wide_int. > > > I do think r12-979-g782e57f2c09 should be reverted with an added > > > comment on saying defining MAX_BITSIZE_MODE_ANY_INT here is to > > > decrease the memory footprint. > > > > I completely agree. > > Do we have permanent objects embedding wide[st]_int? I know of > class loop and loop_bound. Btw, there are other targets with large > integer modes (aarch64 with XImode) and not defining > MAX_BITSIZE_MODE_ANY_INT > MAX_BITSIZE_MODE_ANY_INT was removed so that YMM and ZMM registers can be used for by_pieces operations. -- H.J. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-05 10:01 ` Richard Biener 2021-11-05 12:25 ` H.J. Lu @ 2021-11-05 16:11 ` Qing Zhao 2021-11-05 16:17 ` Jakub Jelinek 1 sibling, 1 reply; 18+ messages in thread From: Qing Zhao @ 2021-11-05 16:11 UTC (permalink / raw) To: Richard Biener Cc: Jakub Jelinek, gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski Thanks all for the information. Based on the information so far, my understanding is that we cannot revert r12-979-g782e57f2c09 Since it’s for enabling YMM and ZMM registers to be used for by_pieces operations on X86. Let me know if I miss anything here. FYI. This issue was found during my work to back port all the patches of -ftrivial-auto-var-init so far from GCC12 to GCC11. The following small testing case (_Complex long double) _Complex long double result; _Complex long double foo() { _Complex long double temp3; result = temp3; return result; } Failed with -ftrivial-auto-var-init=pattern with GCC11 at the following Line 3087 at call to “build_nonstandard_integer_type” : (expand_DEFERRED_INIT): 3076 if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE 3077 && tree_fits_uhwi_p (var_size) 3078 && (init_type == AUTO_INIT_PATTERN 3079 || !is_gimple_reg_type (var_type)) 3080 && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, 3081 0).exists ()) 3082 { 3083 unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); 3084 unsigned char *buf = (unsigned char *) xmalloc (total_bytes); 3085 memset (buf, (init_type == AUTO_INIT_PATTERN 3086 ? INIT_PATTERN_VALUE : 0), total_bytes); 3087 tree itype = build_nonstandard_integer_type 3088 (total_bytes * BITS_PER_UNIT, 1); The exact failing point is at function “set_min_and_max_values_for_integral_type”: 2851 gcc_assert (precision <= WIDE_INT_MAX_PRECISION); For _Complex long double, “precision” is 256. In GCC11, “WIDE_INT_MAX_PRECISION” is 192, in GCC12, it’s 512. As a result, the above assertion failed on GCC11. I am wondering what’s the best fix for this issue in gcc11? Qing > On Nov 5, 2021, at 5:01 AM, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > On Fri, Nov 5, 2021 at 7:54 AM Jakub Jelinek via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: >> >> On Thu, Nov 04, 2021 at 11:05:35PM -0700, Andrew Pinski via Gcc-patches wrote: >>>> I noticed that the macro “WIDE_INT_MAX_ELTS” has different values in GCC11 and GCC12 (on the same X86 machine) >>>> >>>> For gcc11: >>>> >>>> wide int max elts =3 >>>> >>>> For gcc12: >>>> >>>> wide int max elts =9 >>>> >>>> Does anyone know what’s the reason for this difference? >>>> >>>> Thanks a lot for any help. >>> >>> Yes originally, the x86 backend only used OI and XI modes for vectors >>> during data movement. >>> This changed with r10-5741-gc57b4c22089 which added the use of OI mode >>> for TImode adding with overflow and then MAX_BITSIZE_MODE_ANY_INT >>> changed from 128 to 160 (in r10-6178-gc124b345e46078) to fix the ICE >>> introduced by that change . >>> And then with r12-979-g782e57f2c09 removed the define of >>> MAX_BITSIZE_MODE_ANY_INT. >>> Now what was not mentioned in r12-979-g782e57f2c09 (or before) of why >>> MAX_BITSIZE_MODE_ANY_INT was defined in the first place for x86. HJL >>> assumed there was some problem of why it was defined that way but not >>> realizing memory usage was the reason. >>> It was defined to keep the memory usage down as you see that it is now >>> almost a 3x memory increase for all wi::wide_int. >>> I do think r12-979-g782e57f2c09 should be reverted with an added >>> comment on saying defining MAX_BITSIZE_MODE_ANY_INT here is to >>> decrease the memory footprint. >> >> I completely agree. > > Do we have permanent objects embedding wide[st]_int? I know of > class loop and loop_bound. Btw, there are other targets with large > integer modes (aarch64 with XImode) and not defining > MAX_BITSIZE_MODE_ANY_INT > > Richard. > >> Jakub ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-05 16:11 ` Qing Zhao @ 2021-11-05 16:17 ` Jakub Jelinek 2021-11-05 17:37 ` Qing Zhao 0 siblings, 1 reply; 18+ messages in thread From: Jakub Jelinek @ 2021-11-05 16:17 UTC (permalink / raw) To: Qing Zhao Cc: Richard Biener, gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski On Fri, Nov 05, 2021 at 04:11:36PM +0000, Qing Zhao wrote: > 3076 if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > 3077 && tree_fits_uhwi_p (var_size) > 3078 && (init_type == AUTO_INIT_PATTERN > 3079 || !is_gimple_reg_type (var_type)) > 3080 && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > 3081 0).exists ()) > 3082 { > 3083 unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > 3084 unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > 3085 memset (buf, (init_type == AUTO_INIT_PATTERN > 3086 ? INIT_PATTERN_VALUE : 0), total_bytes); > 3087 tree itype = build_nonstandard_integer_type > 3088 (total_bytes * BITS_PER_UNIT, 1); > > The exact failing point is at function “set_min_and_max_values_for_integral_type”: > > 2851 gcc_assert (precision <= WIDE_INT_MAX_PRECISION); > > For _Complex long double, “precision” is 256. > In GCC11, “WIDE_INT_MAX_PRECISION” is 192, in GCC12, it’s 512. > As a result, the above assertion failed on GCC11. > > I am wondering what’s the best fix for this issue in gcc11? Even for gcc 12 the above is wrong, you can't blindly assume that build_nonstandard_integer_type will work for arbitrary precisions, and even if it works that it will actually work. The fact that such a mode exist is one thing, but targetm.scalar_mode_supported_p should be tested for whether the mode is actually supported. Jakub ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-05 16:17 ` Jakub Jelinek @ 2021-11-05 17:37 ` Qing Zhao 2021-11-06 9:56 ` Jakub Jelinek 0 siblings, 1 reply; 18+ messages in thread From: Qing Zhao @ 2021-11-05 17:37 UTC (permalink / raw) To: Jakub Jelinek Cc: Richard Biener, gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski > On Nov 5, 2021, at 11:17 AM, Jakub Jelinek <jakub@redhat.com> wrote: > > On Fri, Nov 05, 2021 at 04:11:36PM +0000, Qing Zhao wrote: >> 3076 if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE >> 3077 && tree_fits_uhwi_p (var_size) >> 3078 && (init_type == AUTO_INIT_PATTERN >> 3079 || !is_gimple_reg_type (var_type)) >> 3080 && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, >> 3081 0).exists ()) >> 3082 { >> 3083 unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); >> 3084 unsigned char *buf = (unsigned char *) xmalloc (total_bytes); >> 3085 memset (buf, (init_type == AUTO_INIT_PATTERN >> 3086 ? INIT_PATTERN_VALUE : 0), total_bytes); >> 3087 tree itype = build_nonstandard_integer_type >> 3088 (total_bytes * BITS_PER_UNIT, 1); >> >> The exact failing point is at function “set_min_and_max_values_for_integral_type”: >> >> 2851 gcc_assert (precision <= WIDE_INT_MAX_PRECISION); >> >> For _Complex long double, “precision” is 256. >> In GCC11, “WIDE_INT_MAX_PRECISION” is 192, in GCC12, it’s 512. >> As a result, the above assertion failed on GCC11. >> >> I am wondering what’s the best fix for this issue in gcc11? > > Even for gcc 12 the above is wrong, you can't blindly assume that > build_nonstandard_integer_type will work for arbitrary precisions, > and even if it works that it will actually work. > The fact that such a mode exist is one thing, but > targetm.scalar_mode_supported_p should be tested for whether the mode > is actually supported. You mean “int_mode_for_size().exists()” is not enough to make sure “build_nonstandard_integer_type” to be valid? We should add “targetm.scalar_mode_supported_p” too ? Qing > > Jakub > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-05 17:37 ` Qing Zhao @ 2021-11-06 9:56 ` Jakub Jelinek 2021-11-08 8:41 ` Richard Biener 0 siblings, 1 reply; 18+ messages in thread From: Jakub Jelinek @ 2021-11-06 9:56 UTC (permalink / raw) To: Qing Zhao Cc: Richard Biener, gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski On Fri, Nov 05, 2021 at 05:37:25PM +0000, Qing Zhao wrote: > > On Nov 5, 2021, at 11:17 AM, Jakub Jelinek <jakub@redhat.com> wrote: > > > > On Fri, Nov 05, 2021 at 04:11:36PM +0000, Qing Zhao wrote: > >> 3076 if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > >> 3077 && tree_fits_uhwi_p (var_size) > >> 3078 && (init_type == AUTO_INIT_PATTERN > >> 3079 || !is_gimple_reg_type (var_type)) > >> 3080 && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > >> 3081 0).exists ()) > >> 3082 { > >> 3083 unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > >> 3084 unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > >> 3085 memset (buf, (init_type == AUTO_INIT_PATTERN > >> 3086 ? INIT_PATTERN_VALUE : 0), total_bytes); > >> 3087 tree itype = build_nonstandard_integer_type > >> 3088 (total_bytes * BITS_PER_UNIT, 1); > >> > >> The exact failing point is at function “set_min_and_max_values_for_integral_type”: > >> > >> 2851 gcc_assert (precision <= WIDE_INT_MAX_PRECISION); > >> > >> For _Complex long double, “precision” is 256. > >> In GCC11, “WIDE_INT_MAX_PRECISION” is 192, in GCC12, it’s 512. > >> As a result, the above assertion failed on GCC11. > >> > >> I am wondering what’s the best fix for this issue in gcc11? > > > > Even for gcc 12 the above is wrong, you can't blindly assume that > > build_nonstandard_integer_type will work for arbitrary precisions, > > and even if it works that it will actually work. > > The fact that such a mode exist is one thing, but > > targetm.scalar_mode_supported_p should be tested for whether the mode > > is actually supported. > > You mean “int_mode_for_size().exists()” is not enough to make sure > “build_nonstandard_integer_type” to be valid? We should add > “targetm.scalar_mode_supported_p” too ? Yeah. The former says whether the backend has that mode at all. But some modes may be there only in some specific patterns but without support for mov, add, etc. Only for targetm.scalar_mode_supported_p modes the backend guarantees that one can use them e.g. in mode attribute and can expect expansion to expand everything with that mode that is needed in some way. E.g. only if targetm.scalar_mode_supported_p (TImode) the FEs support __int128_t type, etc. Jakub ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-06 9:56 ` Jakub Jelinek @ 2021-11-08 8:41 ` Richard Biener 2021-11-08 23:47 ` Qing Zhao 0 siblings, 1 reply; 18+ messages in thread From: Richard Biener @ 2021-11-08 8:41 UTC (permalink / raw) To: Jakub Jelinek Cc: Qing Zhao, gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski On Sat, Nov 6, 2021 at 10:56 AM Jakub Jelinek <jakub@redhat.com> wrote: > > On Fri, Nov 05, 2021 at 05:37:25PM +0000, Qing Zhao wrote: > > > On Nov 5, 2021, at 11:17 AM, Jakub Jelinek <jakub@redhat.com> wrote: > > > > > > On Fri, Nov 05, 2021 at 04:11:36PM +0000, Qing Zhao wrote: > > >> 3076 if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > > >> 3077 && tree_fits_uhwi_p (var_size) > > >> 3078 && (init_type == AUTO_INIT_PATTERN > > >> 3079 || !is_gimple_reg_type (var_type)) > > >> 3080 && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > > >> 3081 0).exists ()) > > >> 3082 { > > >> 3083 unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > > >> 3084 unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > > >> 3085 memset (buf, (init_type == AUTO_INIT_PATTERN > > >> 3086 ? INIT_PATTERN_VALUE : 0), total_bytes); > > >> 3087 tree itype = build_nonstandard_integer_type > > >> 3088 (total_bytes * BITS_PER_UNIT, 1); > > >> > > >> The exact failing point is at function “set_min_and_max_values_for_integral_type”: > > >> > > >> 2851 gcc_assert (precision <= WIDE_INT_MAX_PRECISION); > > >> > > >> For _Complex long double, “precision” is 256. > > >> In GCC11, “WIDE_INT_MAX_PRECISION” is 192, in GCC12, it’s 512. > > >> As a result, the above assertion failed on GCC11. > > >> > > >> I am wondering what’s the best fix for this issue in gcc11? > > > > > > Even for gcc 12 the above is wrong, you can't blindly assume that > > > build_nonstandard_integer_type will work for arbitrary precisions, > > > and even if it works that it will actually work. > > > The fact that such a mode exist is one thing, but > > > targetm.scalar_mode_supported_p should be tested for whether the mode > > > is actually supported. > > > > You mean “int_mode_for_size().exists()” is not enough to make sure > > “build_nonstandard_integer_type” to be valid? We should add > > “targetm.scalar_mode_supported_p” too ? > > Yeah. The former says whether the backend has that mode at all. > But some modes may be there only in some specific patterns but > without support for mov, add, etc. Only for > targetm.scalar_mode_supported_p modes the backend guarantees that > one can use them e.g. in mode attribute and can expect expansion > to expand everything with that mode that is needed in some way. > E.g. only if targetm.scalar_mode_supported_p (TImode) the FEs > support __int128_t type, etc. The memcpy folding code now checks scalar_int_mode mode; if (int_mode_for_size (ilen * 8, 0).exists (&mode) && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8 && have_insn_for (SET, mode) thus specifically only have_insn_for (SET, mode), which I guess is good enough for this case as well? > Jakub > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-08 8:41 ` Richard Biener @ 2021-11-08 23:47 ` Qing Zhao 2021-11-09 7:13 ` Richard Biener 0 siblings, 1 reply; 18+ messages in thread From: Qing Zhao @ 2021-11-08 23:47 UTC (permalink / raw) To: Richard Biener, Jakub Jelinek Cc: gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski Hi, I tried both the following patches: Patch1: [opc@qinzhao-ol8u3-x86 gcc]$ git diff diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c index 0cba95411a6..ca49d2b4514 100644 --- a/gcc/internal-fn.c +++ b/gcc/internal-fn.c @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) /* If this variable is in a register use expand_assignment. For boolean scalars force zero-init. */ tree init; + scalar_int_mode var_mode; if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE && tree_fits_uhwi_p (var_size) && (init_type == AUTO_INIT_PATTERN || !is_gimple_reg_type (var_type)) && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, - 0).exists ()) + 0).exists (&var_mode) + && targetm.scalar_mode_supported_p (var_mode)) { unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); unsigned char *buf = (unsigned char *) xmalloc (total_bytes); AND Patch2: diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c index 0cba95411a6..7f129655926 100644 --- a/gcc/internal-fn.c +++ b/gcc/internal-fn.c @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) /* If this variable is in a register use expand_assignment. For boolean scalars force zero-init. */ tree init; + scalar_int_mode var_mode; if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE && tree_fits_uhwi_p (var_size) && (init_type == AUTO_INIT_PATTERN || !is_gimple_reg_type (var_type)) && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, - 0).exists ()) + 0).exists (&var_mode) + && have_insn_for (SET, var_mode)) { unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); unsigned char *buf = (unsigned char *) xmalloc (total_bytes); Have the same effect: 1. Resolved the ICE in gcc11; 2. For _Complex long double variables, both return FALSE, as a result, for PATTERN initialization of _Complex long double variables, now they are initialization with ZEROs instead of FEs. Let me know you opinion on this, If the above 2 is okay, then I might pick the above Patch 1 for the final patch to this issue. Thanks. Qing > On Nov 8, 2021, at 2:41 AM, Richard Biener <richard.guenther@gmail.com> wrote: > > On Sat, Nov 6, 2021 at 10:56 AM Jakub Jelinek <jakub@redhat.com> wrote: >> >> On Fri, Nov 05, 2021 at 05:37:25PM +0000, Qing Zhao wrote: >>>> On Nov 5, 2021, at 11:17 AM, Jakub Jelinek <jakub@redhat.com> wrote: >>>> >>>> On Fri, Nov 05, 2021 at 04:11:36PM +0000, Qing Zhao wrote: >>>>> 3076 if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE >>>>> 3077 && tree_fits_uhwi_p (var_size) >>>>> 3078 && (init_type == AUTO_INIT_PATTERN >>>>> 3079 || !is_gimple_reg_type (var_type)) >>>>> 3080 && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, >>>>> 3081 0).exists ()) >>>>> 3082 { >>>>> 3083 unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); >>>>> 3084 unsigned char *buf = (unsigned char *) xmalloc (total_bytes); >>>>> 3085 memset (buf, (init_type == AUTO_INIT_PATTERN >>>>> 3086 ? INIT_PATTERN_VALUE : 0), total_bytes); >>>>> 3087 tree itype = build_nonstandard_integer_type >>>>> 3088 (total_bytes * BITS_PER_UNIT, 1); >>>>> >>>>> The exact failing point is at function “set_min_and_max_values_for_integral_type”: >>>>> >>>>> 2851 gcc_assert (precision <= WIDE_INT_MAX_PRECISION); >>>>> >>>>> For _Complex long double, “precision” is 256. >>>>> In GCC11, “WIDE_INT_MAX_PRECISION” is 192, in GCC12, it’s 512. >>>>> As a result, the above assertion failed on GCC11. >>>>> >>>>> I am wondering what’s the best fix for this issue in gcc11? >>>> >>>> Even for gcc 12 the above is wrong, you can't blindly assume that >>>> build_nonstandard_integer_type will work for arbitrary precisions, >>>> and even if it works that it will actually work. >>>> The fact that such a mode exist is one thing, but >>>> targetm.scalar_mode_supported_p should be tested for whether the mode >>>> is actually supported. >>> >>> You mean “int_mode_for_size().exists()” is not enough to make sure >>> “build_nonstandard_integer_type” to be valid? We should add >>> “targetm.scalar_mode_supported_p” too ? >> >> Yeah. The former says whether the backend has that mode at all. >> But some modes may be there only in some specific patterns but >> without support for mov, add, etc. Only for >> targetm.scalar_mode_supported_p modes the backend guarantees that >> one can use them e.g. in mode attribute and can expect expansion >> to expand everything with that mode that is needed in some way. >> E.g. only if targetm.scalar_mode_supported_p (TImode) the FEs >> support __int128_t type, etc. > > The memcpy folding code now checks > > scalar_int_mode mode; > if (int_mode_for_size (ilen * 8, 0).exists (&mode) > && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8 > && have_insn_for (SET, mode) > > thus specifically only have_insn_for (SET, mode), which I guess is > good enough for this case as well? > >> Jakub ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-08 23:47 ` Qing Zhao @ 2021-11-09 7:13 ` Richard Biener 2021-11-09 9:10 ` Jakub Jelinek 0 siblings, 1 reply; 18+ messages in thread From: Richard Biener @ 2021-11-09 7:13 UTC (permalink / raw) To: Qing Zhao Cc: Jakub Jelinek, gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski On Tue, Nov 9, 2021 at 12:47 AM Qing Zhao <qing.zhao@oracle.com> wrote: > > Hi, I tried both the following patches: > > Patch1: > > [opc@qinzhao-ol8u3-x86 gcc]$ git diff > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > index 0cba95411a6..ca49d2b4514 100644 > --- a/gcc/internal-fn.c > +++ b/gcc/internal-fn.c > @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) > /* If this variable is in a register use expand_assignment. > For boolean scalars force zero-init. */ > tree init; > + scalar_int_mode var_mode; > if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > && tree_fits_uhwi_p (var_size) > && (init_type == AUTO_INIT_PATTERN > || !is_gimple_reg_type (var_type)) > && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > - 0).exists ()) > + 0).exists (&var_mode) > + && targetm.scalar_mode_supported_p (var_mode)) > { > unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > > AND > > Patch2: > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > index 0cba95411a6..7f129655926 100644 > --- a/gcc/internal-fn.c > +++ b/gcc/internal-fn.c > @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) > /* If this variable is in a register use expand_assignment. > For boolean scalars force zero-init. */ > tree init; > + scalar_int_mode var_mode; > if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > && tree_fits_uhwi_p (var_size) > && (init_type == AUTO_INIT_PATTERN > || !is_gimple_reg_type (var_type)) > && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > - 0).exists ()) > + 0).exists (&var_mode) > + && have_insn_for (SET, var_mode)) > { > unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > > Have the same effect: > > 1. Resolved the ICE in gcc11; > 2. For _Complex long double variables, both return FALSE, as a result, for PATTERN initialization of _Complex long double variables, now they are initialization with ZEROs instead of FEs. > > Let me know you opinion on this, If the above 2 is okay, then I might pick the above Patch 1 for the final patch to this issue. I think zero-initialization is OK, but I'd choose Patch2 for consistency with what we do in the memcpy folding. Richard. > Thanks. > > Qing > > > On Nov 8, 2021, at 2:41 AM, Richard Biener <richard.guenther@gmail.com> wrote: > > > > On Sat, Nov 6, 2021 at 10:56 AM Jakub Jelinek <jakub@redhat.com> wrote: > >> > >> On Fri, Nov 05, 2021 at 05:37:25PM +0000, Qing Zhao wrote: > >>>> On Nov 5, 2021, at 11:17 AM, Jakub Jelinek <jakub@redhat.com> wrote: > >>>> > >>>> On Fri, Nov 05, 2021 at 04:11:36PM +0000, Qing Zhao wrote: > >>>>> 3076 if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > >>>>> 3077 && tree_fits_uhwi_p (var_size) > >>>>> 3078 && (init_type == AUTO_INIT_PATTERN > >>>>> 3079 || !is_gimple_reg_type (var_type)) > >>>>> 3080 && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > >>>>> 3081 0).exists ()) > >>>>> 3082 { > >>>>> 3083 unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > >>>>> 3084 unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > >>>>> 3085 memset (buf, (init_type == AUTO_INIT_PATTERN > >>>>> 3086 ? INIT_PATTERN_VALUE : 0), total_bytes); > >>>>> 3087 tree itype = build_nonstandard_integer_type > >>>>> 3088 (total_bytes * BITS_PER_UNIT, 1); > >>>>> > >>>>> The exact failing point is at function “set_min_and_max_values_for_integral_type”: > >>>>> > >>>>> 2851 gcc_assert (precision <= WIDE_INT_MAX_PRECISION); > >>>>> > >>>>> For _Complex long double, “precision” is 256. > >>>>> In GCC11, “WIDE_INT_MAX_PRECISION” is 192, in GCC12, it’s 512. > >>>>> As a result, the above assertion failed on GCC11. > >>>>> > >>>>> I am wondering what’s the best fix for this issue in gcc11? > >>>> > >>>> Even for gcc 12 the above is wrong, you can't blindly assume that > >>>> build_nonstandard_integer_type will work for arbitrary precisions, > >>>> and even if it works that it will actually work. > >>>> The fact that such a mode exist is one thing, but > >>>> targetm.scalar_mode_supported_p should be tested for whether the mode > >>>> is actually supported. > >>> > >>> You mean “int_mode_for_size().exists()” is not enough to make sure > >>> “build_nonstandard_integer_type” to be valid? We should add > >>> “targetm.scalar_mode_supported_p” too ? > >> > >> Yeah. The former says whether the backend has that mode at all. > >> But some modes may be there only in some specific patterns but > >> without support for mov, add, etc. Only for > >> targetm.scalar_mode_supported_p modes the backend guarantees that > >> one can use them e.g. in mode attribute and can expect expansion > >> to expand everything with that mode that is needed in some way. > >> E.g. only if targetm.scalar_mode_supported_p (TImode) the FEs > >> support __int128_t type, etc. > > > > The memcpy folding code now checks > > > > scalar_int_mode mode; > > if (int_mode_for_size (ilen * 8, 0).exists (&mode) > > && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8 > > && have_insn_for (SET, mode) > > > > thus specifically only have_insn_for (SET, mode), which I guess is > > good enough for this case as well? > > > >> Jakub > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-09 7:13 ` Richard Biener @ 2021-11-09 9:10 ` Jakub Jelinek 2021-11-09 10:44 ` Richard Biener 0 siblings, 1 reply; 18+ messages in thread From: Jakub Jelinek @ 2021-11-09 9:10 UTC (permalink / raw) To: Richard Biener Cc: Qing Zhao, gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski On Tue, Nov 09, 2021 at 08:13:57AM +0100, Richard Biener wrote: > > Hi, I tried both the following patches: > > > > Patch1: > > > > [opc@qinzhao-ol8u3-x86 gcc]$ git diff > > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > > index 0cba95411a6..ca49d2b4514 100644 > > --- a/gcc/internal-fn.c > > +++ b/gcc/internal-fn.c > > @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) > > /* If this variable is in a register use expand_assignment. > > For boolean scalars force zero-init. */ > > tree init; > > + scalar_int_mode var_mode; > > if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > > && tree_fits_uhwi_p (var_size) > > && (init_type == AUTO_INIT_PATTERN > > || !is_gimple_reg_type (var_type)) > > && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > > - 0).exists ()) > > + 0).exists (&var_mode) > > + && targetm.scalar_mode_supported_p (var_mode)) > > { > > unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > > unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > > > > AND > > > > Patch2: > > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > > index 0cba95411a6..7f129655926 100644 > > --- a/gcc/internal-fn.c > > +++ b/gcc/internal-fn.c > > @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) > > /* If this variable is in a register use expand_assignment. > > For boolean scalars force zero-init. */ > > tree init; > > + scalar_int_mode var_mode; > > if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > > && tree_fits_uhwi_p (var_size) > > && (init_type == AUTO_INIT_PATTERN > > || !is_gimple_reg_type (var_type)) > > && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > > - 0).exists ()) > > + 0).exists (&var_mode) > > + && have_insn_for (SET, var_mode)) > > { > > unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > > unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > > > > Have the same effect: > > > > 1. Resolved the ICE in gcc11; > > 2. For _Complex long double variables, both return FALSE, as a result, for PATTERN initialization of _Complex long double variables, now they are initialization with ZEROs instead of FEs. > > > > Let me know you opinion on this, If the above 2 is okay, then I might pick the above Patch 1 for the final patch to this issue. > > I think zero-initialization is OK, but I'd choose Patch2 for > consistency with what we do in the memcpy > folding. Note, I think the code leaks memory (buf is never freed) and should be allocated using unsigned char *buf = XNEWVEC (unsigned char *, total_bytes); and deallocated with XDELETEVEC (buf); If lhs is SSA_NAME, I think it would be easiest to use native_interpret_expr directly instead of finding some integral mode, after memset just native_interpret_expr. Should be guarded by BITS_PER_UNIT == 8 && CHAR_BIT == 8 && BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN but I think the current code assumes at least that BITS_PER_UNIT == CHAR_BIT too anyway. That way you get directly a constant init instead of hopping through some integral mode which might not exist at all. If lhs is not SSA_NAME but var_type has COMPLEX_TYPE or VECTOR_TYPE, there is always an option to find var_mode only for the element type (i.e. TREE_TYPE (var_type), use var_size for the element size too, and after you compute init fold_build2 (COMPLEX_EXPR, ) or build_vector_from_val. For VECTOR_TYPE one would need to check if the vector mode is supported, sure (but if it isn't, it will go the BUILT_IN_MEMSET way, I'm pretty sure). Also, tree value = (init_type == AUTO_INIT_PATTERN) ? build_int_cst (integer_type_node, INIT_PATTERN_VALUE) : integer_zero_node; a few lines before has bad formatting, both ? and : should be at the start of next lines rather than on end of the previous ones. Jakub ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-09 9:10 ` Jakub Jelinek @ 2021-11-09 10:44 ` Richard Biener 2021-11-09 17:19 ` Qing Zhao 0 siblings, 1 reply; 18+ messages in thread From: Richard Biener @ 2021-11-09 10:44 UTC (permalink / raw) To: Jakub Jelinek Cc: Qing Zhao, gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski On Tue, Nov 9, 2021 at 10:10 AM Jakub Jelinek <jakub@redhat.com> wrote: > > On Tue, Nov 09, 2021 at 08:13:57AM +0100, Richard Biener wrote: > > > Hi, I tried both the following patches: > > > > > > Patch1: > > > > > > [opc@qinzhao-ol8u3-x86 gcc]$ git diff > > > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > > > index 0cba95411a6..ca49d2b4514 100644 > > > --- a/gcc/internal-fn.c > > > +++ b/gcc/internal-fn.c > > > @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) > > > /* If this variable is in a register use expand_assignment. > > > For boolean scalars force zero-init. */ > > > tree init; > > > + scalar_int_mode var_mode; > > > if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > > > && tree_fits_uhwi_p (var_size) > > > && (init_type == AUTO_INIT_PATTERN > > > || !is_gimple_reg_type (var_type)) > > > && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > > > - 0).exists ()) > > > + 0).exists (&var_mode) > > > + && targetm.scalar_mode_supported_p (var_mode)) > > > { > > > unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > > > unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > > > > > > AND > > > > > > Patch2: > > > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > > > index 0cba95411a6..7f129655926 100644 > > > --- a/gcc/internal-fn.c > > > +++ b/gcc/internal-fn.c > > > @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) > > > /* If this variable is in a register use expand_assignment. > > > For boolean scalars force zero-init. */ > > > tree init; > > > + scalar_int_mode var_mode; > > > if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > > > && tree_fits_uhwi_p (var_size) > > > && (init_type == AUTO_INIT_PATTERN > > > || !is_gimple_reg_type (var_type)) > > > && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > > > - 0).exists ()) > > > + 0).exists (&var_mode) > > > + && have_insn_for (SET, var_mode)) > > > { > > > unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > > > unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > > > > > > Have the same effect: > > > > > > 1. Resolved the ICE in gcc11; > > > 2. For _Complex long double variables, both return FALSE, as a result, for PATTERN initialization of _Complex long double variables, now they are initialization with ZEROs instead of FEs. > > > > > > Let me know you opinion on this, If the above 2 is okay, then I might pick the above Patch 1 for the final patch to this issue. > > > > I think zero-initialization is OK, but I'd choose Patch2 for > > consistency with what we do in the memcpy > > folding. > > Note, I think the code leaks memory (buf is never freed) and > should be allocated using > unsigned char *buf = XNEWVEC (unsigned char *, total_bytes); > and deallocated with XDELETEVEC (buf); Oops. I think the intention was to use XALLOCAVEC > If lhs is SSA_NAME, I think it would be easiest to use > native_interpret_expr directly instead of finding some integral > mode, after memset just native_interpret_expr. I've removed the native_interpret_expr path, I think it would be best to not rely on expand_assignment for the case of a register destination. The intent is to do an integer move into the destination and avoid issues with things like XFmode or other modes with padding (and possibly trapping loads). So it should become some (set (subreg:IntMode ...) (const_wide_int ...)) but my RTL fu is too weak to try and so we ended up with expand_assignment which doesn't like V_C_E on SSA names on the LHS (well, because that's not valid GIMPLE...). > Should be guarded by > BITS_PER_UNIT == 8 && CHAR_BIT == 8 && BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN > but I think the current code assumes at least that BITS_PER_UNIT == CHAR_BIT > too anyway. > That way you get directly a constant init instead of hopping through > some integral mode which might not exist at all. > If lhs is not SSA_NAME but var_type has COMPLEX_TYPE or VECTOR_TYPE, there > is always an option to find var_mode only for the element type (i.e. > TREE_TYPE (var_type), use var_size for the element size too, and after > you compute init fold_build2 (COMPLEX_EXPR, ) or build_vector_from_val. > For VECTOR_TYPE one would need to check if the vector mode is supported, > sure (but if it isn't, it will go the BUILT_IN_MEMSET way, I'm pretty sure). > > Also, > tree value = (init_type == AUTO_INIT_PATTERN) ? > build_int_cst (integer_type_node, > INIT_PATTERN_VALUE) : > integer_zero_node; > a few lines before has bad formatting, both ? and : should be at the start > of next lines rather than on end of the previous ones. > > Jakub > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-09 10:44 ` Richard Biener @ 2021-11-09 17:19 ` Qing Zhao 2021-11-10 8:37 ` Richard Biener 0 siblings, 1 reply; 18+ messages in thread From: Qing Zhao @ 2021-11-09 17:19 UTC (permalink / raw) To: Richard Biener, Jakub Jelinek Cc: gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski So, based on the discussion so far, is the following patch good to go? Let me know if you have more comments on the following patch: (At the same time, I am testing this patch on both x86 and aarch64) thanks. Qing diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c index 0cba95411a6..e8fd16b9c21 100644 --- a/gcc/internal-fn.c +++ b/gcc/internal-fn.c @@ -3059,10 +3059,10 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) mark_addressable (lhs); tree var_addr = build_fold_addr_expr (lhs); - tree value = (init_type == AUTO_INIT_PATTERN) ? - build_int_cst (integer_type_node, - INIT_PATTERN_VALUE) : - integer_zero_node; + tree value = (init_type == AUTO_INIT_PATTERN) + ? build_int_cst (integer_type_node, + INIT_PATTERN_VALUE) + : integer_zero_node; tree m_call = build_call_expr (builtin_decl_implicit (BUILT_IN_MEMSET), 3, var_addr, value, var_size); /* Expand this memset call. */ @@ -3073,15 +3073,17 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) /* If this variable is in a register use expand_assignment. For boolean scalars force zero-init. */ tree init; + scalar_int_mode var_mode; if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE && tree_fits_uhwi_p (var_size) && (init_type == AUTO_INIT_PATTERN || !is_gimple_reg_type (var_type)) && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, - 0).exists ()) + 0).exists (&var_mode) + && have_insn_for (SET, var_mode)) { unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); - unsigned char *buf = (unsigned char *) xmalloc (total_bytes); + unsigned char *buf = XALLOCAVEC (unsigned char, total_bytes); memset (buf, (init_type == AUTO_INIT_PATTERN ? INIT_PATTERN_VALUE : 0), total_bytes); tree itype = build_nonstandard_integer_type diff --git a/gcc/testsuite/gcc.target/i386/auto-init-6.c b/gcc/testsuite/gcc.target/i386/auto-init-6.c index 339f8bc2966..e53385f0eb7 100644 --- a/gcc/testsuite/gcc.target/i386/auto-init-6.c +++ b/gcc/testsuite/gcc.target/i386/auto-init-6.c @@ -1,4 +1,6 @@ /* Verify pattern initialization for complex type automatic variables. */ +/* Note, _Complex long double is initialized to zeroes due to the current + implemenation limitation. */ /* { dg-do compile } */ /* { dg-options "-ftrivial-auto-var-init=pattern -march=x86-64 -mtune=generic -msse" } */ @@ -15,6 +17,6 @@ _Complex long double foo() return result; } -/* { dg-final { scan-assembler-times "long\t-16843010" 10 { target { ! ia32 } } } } */ -/* { dg-final { scan-assembler-times "long\t-16843010" 6 { target { ia32 } } } } */ +/* { dg-final { scan-assembler-times "long\t0" 8 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "long\t-16843010" 6 } } */ > On Nov 9, 2021, at 4:44 AM, Richard Biener <richard.guenther@gmail.com> wrote: > > On Tue, Nov 9, 2021 at 10:10 AM Jakub Jelinek <jakub@redhat.com> wrote: >> >> On Tue, Nov 09, 2021 at 08:13:57AM +0100, Richard Biener wrote: >>>> Hi, I tried both the following patches: >>>> >>>> Patch1: >>>> >>>> [opc@qinzhao-ol8u3-x86 gcc]$ git diff >>>> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c >>>> index 0cba95411a6..ca49d2b4514 100644 >>>> --- a/gcc/internal-fn.c >>>> +++ b/gcc/internal-fn.c >>>> @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) >>>> /* If this variable is in a register use expand_assignment. >>>> For boolean scalars force zero-init. */ >>>> tree init; >>>> + scalar_int_mode var_mode; >>>> if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE >>>> && tree_fits_uhwi_p (var_size) >>>> && (init_type == AUTO_INIT_PATTERN >>>> || !is_gimple_reg_type (var_type)) >>>> && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, >>>> - 0).exists ()) >>>> + 0).exists (&var_mode) >>>> + && targetm.scalar_mode_supported_p (var_mode)) >>>> { >>>> unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); >>>> unsigned char *buf = (unsigned char *) xmalloc (total_bytes); >>>> >>>> AND >>>> >>>> Patch2: >>>> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c >>>> index 0cba95411a6..7f129655926 100644 >>>> --- a/gcc/internal-fn.c >>>> +++ b/gcc/internal-fn.c >>>> @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) >>>> /* If this variable is in a register use expand_assignment. >>>> For boolean scalars force zero-init. */ >>>> tree init; >>>> + scalar_int_mode var_mode; >>>> if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE >>>> && tree_fits_uhwi_p (var_size) >>>> && (init_type == AUTO_INIT_PATTERN >>>> || !is_gimple_reg_type (var_type)) >>>> && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, >>>> - 0).exists ()) >>>> + 0).exists (&var_mode) >>>> + && have_insn_for (SET, var_mode)) >>>> { >>>> unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); >>>> unsigned char *buf = (unsigned char *) xmalloc (total_bytes); >>>> >>>> Have the same effect: >>>> >>>> 1. Resolved the ICE in gcc11; >>>> 2. For _Complex long double variables, both return FALSE, as a result, for PATTERN initialization of _Complex long double variables, now they are initialization with ZEROs instead of FEs. >>>> >>>> Let me know you opinion on this, If the above 2 is okay, then I might pick the above Patch 1 for the final patch to this issue. >>> >>> I think zero-initialization is OK, but I'd choose Patch2 for >>> consistency with what we do in the memcpy >>> folding. >> >> Note, I think the code leaks memory (buf is never freed) and >> should be allocated using >> unsigned char *buf = XNEWVEC (unsigned char *, total_bytes); >> and deallocated with XDELETEVEC (buf); > > Oops. I think the intention was to use XALLOCAVEC > >> If lhs is SSA_NAME, I think it would be easiest to use >> native_interpret_expr directly instead of finding some integral >> mode, after memset just native_interpret_expr. > > I've removed the native_interpret_expr path, I think it would be best to not > rely on expand_assignment for the case of a register destination. The > intent is to do an integer move into the destination and avoid issues with > things like XFmode or other modes with padding (and possibly trapping loads). > So it should become some > > (set (subreg:IntMode ...) (const_wide_int ...)) > > but my RTL fu is too weak to try and so we ended up with expand_assignment > which doesn't like V_C_E on SSA names on the LHS (well, because that's > not valid GIMPLE...). > >> Should be guarded by >> BITS_PER_UNIT == 8 && CHAR_BIT == 8 && BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN >> but I think the current code assumes at least that BITS_PER_UNIT == CHAR_BIT >> too anyway. >> That way you get directly a constant init instead of hopping through >> some integral mode which might not exist at all. >> If lhs is not SSA_NAME but var_type has COMPLEX_TYPE or VECTOR_TYPE, there >> is always an option to find var_mode only for the element type (i.e. >> TREE_TYPE (var_type), use var_size for the element size too, and after >> you compute init fold_build2 (COMPLEX_EXPR, ) or build_vector_from_val. >> For VECTOR_TYPE one would need to check if the vector mode is supported, >> sure (but if it isn't, it will go the BUILT_IN_MEMSET way, I'm pretty sure). >> >> Also, >> tree value = (init_type == AUTO_INIT_PATTERN) ? >> build_int_cst (integer_type_node, >> INIT_PATTERN_VALUE) : >> integer_zero_node; >> a few lines before has bad formatting, both ? and : should be at the start >> of next lines rather than on end of the previous ones. >> >> Jakub ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-09 17:19 ` Qing Zhao @ 2021-11-10 8:37 ` Richard Biener 2021-11-10 18:02 ` Qing Zhao 0 siblings, 1 reply; 18+ messages in thread From: Richard Biener @ 2021-11-10 8:37 UTC (permalink / raw) To: Qing Zhao Cc: Jakub Jelinek, gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski On Tue, Nov 9, 2021 at 6:48 PM Qing Zhao <qing.zhao@oracle.com> wrote: > > So, based on the discussion so far, is the following patch good to go? OK. Thanks, Richard. > Let me know if you have more comments on the following patch: > > (At the same time, I am testing this patch on both x86 and aarch64) > > thanks. > > Qing > > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > index 0cba95411a6..e8fd16b9c21 100644 > --- a/gcc/internal-fn.c > +++ b/gcc/internal-fn.c > @@ -3059,10 +3059,10 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) > mark_addressable (lhs); > tree var_addr = build_fold_addr_expr (lhs); > > - tree value = (init_type == AUTO_INIT_PATTERN) ? > - build_int_cst (integer_type_node, > - INIT_PATTERN_VALUE) : > - integer_zero_node; > + tree value = (init_type == AUTO_INIT_PATTERN) > + ? build_int_cst (integer_type_node, > + INIT_PATTERN_VALUE) > + : integer_zero_node; > tree m_call = build_call_expr (builtin_decl_implicit (BUILT_IN_MEMSET), > 3, var_addr, value, var_size); > /* Expand this memset call. */ > @@ -3073,15 +3073,17 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) > /* If this variable is in a register use expand_assignment. > For boolean scalars force zero-init. */ > tree init; > + scalar_int_mode var_mode; > if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > && tree_fits_uhwi_p (var_size) > && (init_type == AUTO_INIT_PATTERN > || !is_gimple_reg_type (var_type)) > && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > - 0).exists ()) > + 0).exists (&var_mode) > + && have_insn_for (SET, var_mode)) > { > unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > - unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > + unsigned char *buf = XALLOCAVEC (unsigned char, total_bytes); > memset (buf, (init_type == AUTO_INIT_PATTERN > ? INIT_PATTERN_VALUE : 0), total_bytes); > tree itype = build_nonstandard_integer_type > diff --git a/gcc/testsuite/gcc.target/i386/auto-init-6.c b/gcc/testsuite/gcc.target/i386/auto-init-6.c > index 339f8bc2966..e53385f0eb7 100644 > --- a/gcc/testsuite/gcc.target/i386/auto-init-6.c > +++ b/gcc/testsuite/gcc.target/i386/auto-init-6.c > @@ -1,4 +1,6 @@ > /* Verify pattern initialization for complex type automatic variables. */ > +/* Note, _Complex long double is initialized to zeroes due to the current > + implemenation limitation. */ > /* { dg-do compile } */ > /* { dg-options "-ftrivial-auto-var-init=pattern -march=x86-64 -mtune=generic -msse" } */ > > @@ -15,6 +17,6 @@ _Complex long double foo() > return result; > } > > -/* { dg-final { scan-assembler-times "long\t-16843010" 10 { target { ! ia32 } } } } */ > -/* { dg-final { scan-assembler-times "long\t-16843010" 6 { target { ia32 } } } } */ > +/* { dg-final { scan-assembler-times "long\t0" 8 { target { ! ia32 } } } } */ > +/* { dg-final { scan-assembler-times "long\t-16843010" 6 } } */ > > > > > > On Nov 9, 2021, at 4:44 AM, Richard Biener <richard.guenther@gmail.com> wrote: > > > > On Tue, Nov 9, 2021 at 10:10 AM Jakub Jelinek <jakub@redhat.com> wrote: > >> > >> On Tue, Nov 09, 2021 at 08:13:57AM +0100, Richard Biener wrote: > >>>> Hi, I tried both the following patches: > >>>> > >>>> Patch1: > >>>> > >>>> [opc@qinzhao-ol8u3-x86 gcc]$ git diff > >>>> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > >>>> index 0cba95411a6..ca49d2b4514 100644 > >>>> --- a/gcc/internal-fn.c > >>>> +++ b/gcc/internal-fn.c > >>>> @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) > >>>> /* If this variable is in a register use expand_assignment. > >>>> For boolean scalars force zero-init. */ > >>>> tree init; > >>>> + scalar_int_mode var_mode; > >>>> if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > >>>> && tree_fits_uhwi_p (var_size) > >>>> && (init_type == AUTO_INIT_PATTERN > >>>> || !is_gimple_reg_type (var_type)) > >>>> && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > >>>> - 0).exists ()) > >>>> + 0).exists (&var_mode) > >>>> + && targetm.scalar_mode_supported_p (var_mode)) > >>>> { > >>>> unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > >>>> unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > >>>> > >>>> AND > >>>> > >>>> Patch2: > >>>> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > >>>> index 0cba95411a6..7f129655926 100644 > >>>> --- a/gcc/internal-fn.c > >>>> +++ b/gcc/internal-fn.c > >>>> @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) > >>>> /* If this variable is in a register use expand_assignment. > >>>> For boolean scalars force zero-init. */ > >>>> tree init; > >>>> + scalar_int_mode var_mode; > >>>> if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE > >>>> && tree_fits_uhwi_p (var_size) > >>>> && (init_type == AUTO_INIT_PATTERN > >>>> || !is_gimple_reg_type (var_type)) > >>>> && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, > >>>> - 0).exists ()) > >>>> + 0).exists (&var_mode) > >>>> + && have_insn_for (SET, var_mode)) > >>>> { > >>>> unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); > >>>> unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > >>>> > >>>> Have the same effect: > >>>> > >>>> 1. Resolved the ICE in gcc11; > >>>> 2. For _Complex long double variables, both return FALSE, as a result, for PATTERN initialization of _Complex long double variables, now they are initialization with ZEROs instead of FEs. > >>>> > >>>> Let me know you opinion on this, If the above 2 is okay, then I might pick the above Patch 1 for the final patch to this issue. > >>> > >>> I think zero-initialization is OK, but I'd choose Patch2 for > >>> consistency with what we do in the memcpy > >>> folding. > >> > >> Note, I think the code leaks memory (buf is never freed) and > >> should be allocated using > >> unsigned char *buf = XNEWVEC (unsigned char *, total_bytes); > >> and deallocated with XDELETEVEC (buf); > > > > Oops. I think the intention was to use XALLOCAVEC > > > >> If lhs is SSA_NAME, I think it would be easiest to use > >> native_interpret_expr directly instead of finding some integral > >> mode, after memset just native_interpret_expr. > > > > I've removed the native_interpret_expr path, I think it would be best to not > > rely on expand_assignment for the case of a register destination. The > > intent is to do an integer move into the destination and avoid issues with > > things like XFmode or other modes with padding (and possibly trapping loads). > > So it should become some > > > > (set (subreg:IntMode ...) (const_wide_int ...)) > > > > but my RTL fu is too weak to try and so we ended up with expand_assignment > > which doesn't like V_C_E on SSA names on the LHS (well, because that's > > not valid GIMPLE...). > > > >> Should be guarded by > >> BITS_PER_UNIT == 8 && CHAR_BIT == 8 && BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN > >> but I think the current code assumes at least that BITS_PER_UNIT == CHAR_BIT > >> too anyway. > >> That way you get directly a constant init instead of hopping through > >> some integral mode which might not exist at all. > >> If lhs is not SSA_NAME but var_type has COMPLEX_TYPE or VECTOR_TYPE, there > >> is always an option to find var_mode only for the element type (i.e. > >> TREE_TYPE (var_type), use var_size for the element size too, and after > >> you compute init fold_build2 (COMPLEX_EXPR, ) or build_vector_from_val. > >> For VECTOR_TYPE one would need to check if the vector mode is supported, > >> sure (but if it isn't, it will go the BUILT_IN_MEMSET way, I'm pretty sure). > >> > >> Also, > >> tree value = (init_type == AUTO_INIT_PATTERN) ? > >> build_int_cst (integer_type_node, > >> INIT_PATTERN_VALUE) : > >> integer_zero_node; > >> a few lines before has bad formatting, both ? and : should be at the start > >> of next lines rather than on end of the previous ones. > >> > >> Jakub > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different 2021-11-10 8:37 ` Richard Biener @ 2021-11-10 18:02 ` Qing Zhao 0 siblings, 0 replies; 18+ messages in thread From: Qing Zhao @ 2021-11-10 18:02 UTC (permalink / raw) To: Richard Biener Cc: Jakub Jelinek, gcc-patches Nick Alcock via, Kewen. Lin, H.J. Lu, andrew Pinski Pushed the patch as: https://gcc.gnu.org/pipermail/gcc-cvs/2021-November/356543.html Qing > On Nov 10, 2021, at 2:37 AM, Richard Biener <richard.guenther@gmail.com> wrote: > > On Tue, Nov 9, 2021 at 6:48 PM Qing Zhao <qing.zhao@oracle.com> wrote: >> >> So, based on the discussion so far, is the following patch good to go? > > OK. > > Thanks, > Richard. > >> Let me know if you have more comments on the following patch: >> >> (At the same time, I am testing this patch on both x86 and aarch64) >> >> thanks. >> >> Qing >> >> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c >> index 0cba95411a6..e8fd16b9c21 100644 >> --- a/gcc/internal-fn.c >> +++ b/gcc/internal-fn.c >> @@ -3059,10 +3059,10 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) >> mark_addressable (lhs); >> tree var_addr = build_fold_addr_expr (lhs); >> >> - tree value = (init_type == AUTO_INIT_PATTERN) ? >> - build_int_cst (integer_type_node, >> - INIT_PATTERN_VALUE) : >> - integer_zero_node; >> + tree value = (init_type == AUTO_INIT_PATTERN) >> + ? build_int_cst (integer_type_node, >> + INIT_PATTERN_VALUE) >> + : integer_zero_node; >> tree m_call = build_call_expr (builtin_decl_implicit (BUILT_IN_MEMSET), >> 3, var_addr, value, var_size); >> /* Expand this memset call. */ >> @@ -3073,15 +3073,17 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) >> /* If this variable is in a register use expand_assignment. >> For boolean scalars force zero-init. */ >> tree init; >> + scalar_int_mode var_mode; >> if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE >> && tree_fits_uhwi_p (var_size) >> && (init_type == AUTO_INIT_PATTERN >> || !is_gimple_reg_type (var_type)) >> && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, >> - 0).exists ()) >> + 0).exists (&var_mode) >> + && have_insn_for (SET, var_mode)) >> { >> unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); >> - unsigned char *buf = (unsigned char *) xmalloc (total_bytes); >> + unsigned char *buf = XALLOCAVEC (unsigned char, total_bytes); >> memset (buf, (init_type == AUTO_INIT_PATTERN >> ? INIT_PATTERN_VALUE : 0), total_bytes); >> tree itype = build_nonstandard_integer_type >> diff --git a/gcc/testsuite/gcc.target/i386/auto-init-6.c b/gcc/testsuite/gcc.target/i386/auto-init-6.c >> index 339f8bc2966..e53385f0eb7 100644 >> --- a/gcc/testsuite/gcc.target/i386/auto-init-6.c >> +++ b/gcc/testsuite/gcc.target/i386/auto-init-6.c >> @@ -1,4 +1,6 @@ >> /* Verify pattern initialization for complex type automatic variables. */ >> +/* Note, _Complex long double is initialized to zeroes due to the current >> + implemenation limitation. */ >> /* { dg-do compile } */ >> /* { dg-options "-ftrivial-auto-var-init=pattern -march=x86-64 -mtune=generic -msse" } */ >> >> @@ -15,6 +17,6 @@ _Complex long double foo() >> return result; >> } >> >> -/* { dg-final { scan-assembler-times "long\t-16843010" 10 { target { ! ia32 } } } } */ >> -/* { dg-final { scan-assembler-times "long\t-16843010" 6 { target { ia32 } } } } */ >> +/* { dg-final { scan-assembler-times "long\t0" 8 { target { ! ia32 } } } } */ >> +/* { dg-final { scan-assembler-times "long\t-16843010" 6 } } */ >> >> >> >> >>> On Nov 9, 2021, at 4:44 AM, Richard Biener <richard.guenther@gmail.com> wrote: >>> >>> On Tue, Nov 9, 2021 at 10:10 AM Jakub Jelinek <jakub@redhat.com> wrote: >>>> >>>> On Tue, Nov 09, 2021 at 08:13:57AM +0100, Richard Biener wrote: >>>>>> Hi, I tried both the following patches: >>>>>> >>>>>> Patch1: >>>>>> >>>>>> [opc@qinzhao-ol8u3-x86 gcc]$ git diff >>>>>> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c >>>>>> index 0cba95411a6..ca49d2b4514 100644 >>>>>> --- a/gcc/internal-fn.c >>>>>> +++ b/gcc/internal-fn.c >>>>>> @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) >>>>>> /* If this variable is in a register use expand_assignment. >>>>>> For boolean scalars force zero-init. */ >>>>>> tree init; >>>>>> + scalar_int_mode var_mode; >>>>>> if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE >>>>>> && tree_fits_uhwi_p (var_size) >>>>>> && (init_type == AUTO_INIT_PATTERN >>>>>> || !is_gimple_reg_type (var_type)) >>>>>> && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, >>>>>> - 0).exists ()) >>>>>> + 0).exists (&var_mode) >>>>>> + && targetm.scalar_mode_supported_p (var_mode)) >>>>>> { >>>>>> unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); >>>>>> unsigned char *buf = (unsigned char *) xmalloc (total_bytes); >>>>>> >>>>>> AND >>>>>> >>>>>> Patch2: >>>>>> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c >>>>>> index 0cba95411a6..7f129655926 100644 >>>>>> --- a/gcc/internal-fn.c >>>>>> +++ b/gcc/internal-fn.c >>>>>> @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) >>>>>> /* If this variable is in a register use expand_assignment. >>>>>> For boolean scalars force zero-init. */ >>>>>> tree init; >>>>>> + scalar_int_mode var_mode; >>>>>> if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE >>>>>> && tree_fits_uhwi_p (var_size) >>>>>> && (init_type == AUTO_INIT_PATTERN >>>>>> || !is_gimple_reg_type (var_type)) >>>>>> && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT, >>>>>> - 0).exists ()) >>>>>> + 0).exists (&var_mode) >>>>>> + && have_insn_for (SET, var_mode)) >>>>>> { >>>>>> unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size); >>>>>> unsigned char *buf = (unsigned char *) xmalloc (total_bytes); >>>>>> >>>>>> Have the same effect: >>>>>> >>>>>> 1. Resolved the ICE in gcc11; >>>>>> 2. For _Complex long double variables, both return FALSE, as a result, for PATTERN initialization of _Complex long double variables, now they are initialization with ZEROs instead of FEs. >>>>>> >>>>>> Let me know you opinion on this, If the above 2 is okay, then I might pick the above Patch 1 for the final patch to this issue. >>>>> >>>>> I think zero-initialization is OK, but I'd choose Patch2 for >>>>> consistency with what we do in the memcpy >>>>> folding. >>>> >>>> Note, I think the code leaks memory (buf is never freed) and >>>> should be allocated using >>>> unsigned char *buf = XNEWVEC (unsigned char *, total_bytes); >>>> and deallocated with XDELETEVEC (buf); >>> >>> Oops. I think the intention was to use XALLOCAVEC >>> >>>> If lhs is SSA_NAME, I think it would be easiest to use >>>> native_interpret_expr directly instead of finding some integral >>>> mode, after memset just native_interpret_expr. >>> >>> I've removed the native_interpret_expr path, I think it would be best to not >>> rely on expand_assignment for the case of a register destination. The >>> intent is to do an integer move into the destination and avoid issues with >>> things like XFmode or other modes with padding (and possibly trapping loads). >>> So it should become some >>> >>> (set (subreg:IntMode ...) (const_wide_int ...)) >>> >>> but my RTL fu is too weak to try and so we ended up with expand_assignment >>> which doesn't like V_C_E on SSA names on the LHS (well, because that's >>> not valid GIMPLE...). >>> >>>> Should be guarded by >>>> BITS_PER_UNIT == 8 && CHAR_BIT == 8 && BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN >>>> but I think the current code assumes at least that BITS_PER_UNIT == CHAR_BIT >>>> too anyway. >>>> That way you get directly a constant init instead of hopping through >>>> some integral mode which might not exist at all. >>>> If lhs is not SSA_NAME but var_type has COMPLEX_TYPE or VECTOR_TYPE, there >>>> is always an option to find var_mode only for the element type (i.e. >>>> TREE_TYPE (var_type), use var_size for the element size too, and after >>>> you compute init fold_build2 (COMPLEX_EXPR, ) or build_vector_from_val. >>>> For VECTOR_TYPE one would need to check if the vector mode is supported, >>>> sure (but if it isn't, it will go the BUILT_IN_MEMSET way, I'm pretty sure). >>>> >>>> Also, >>>> tree value = (init_type == AUTO_INIT_PATTERN) ? >>>> build_int_cst (integer_type_node, >>>> INIT_PATTERN_VALUE) : >>>> integer_zero_node; >>>> a few lines before has bad formatting, both ? and : should be at the start >>>> of next lines rather than on end of the previous ones. >>>> >>>> Jakub >> ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2021-11-10 18:02 UTC | newest] Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-11-04 20:37 Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different Qing Zhao 2021-11-05 1:34 ` Kewen.Lin 2021-11-05 6:05 ` Andrew Pinski 2021-11-05 6:53 ` Jakub Jelinek 2021-11-05 10:01 ` Richard Biener 2021-11-05 12:25 ` H.J. Lu 2021-11-05 16:11 ` Qing Zhao 2021-11-05 16:17 ` Jakub Jelinek 2021-11-05 17:37 ` Qing Zhao 2021-11-06 9:56 ` Jakub Jelinek 2021-11-08 8:41 ` Richard Biener 2021-11-08 23:47 ` Qing Zhao 2021-11-09 7:13 ` Richard Biener 2021-11-09 9:10 ` Jakub Jelinek 2021-11-09 10:44 ` Richard Biener 2021-11-09 17:19 ` Qing Zhao 2021-11-10 8:37 ` Richard Biener 2021-11-10 18:02 ` Qing Zhao
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).