* [001/nnn] poly_int: add poly-int.h
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
@ 2017-10-23 16:58 ` Richard Sandiford
2017-10-25 16:17 ` Martin Sebor
2017-11-08 10:03 ` Richard Sandiford
2017-10-23 16:59 ` [002/nnn] poly_int: IN_TARGET_CODE Richard Sandiford
` (106 subsequent siblings)
107 siblings, 2 replies; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 16:58 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 3967 bytes --]
This patch adds a new "poly_int" class to represent polynomial integers
of the form:
C0 + C1*X1 + C2*X2 ... + Cn*Xn
It also adds poly_int-based typedefs for offsets and sizes of various
precisions. In these typedefs, the Ci coefficients are compile-time
constants and the Xi indeterminates are run-time invariants. The number
of coefficients is controlled by the target and is initially 1 for all
ports.
Most routines can handle general coefficient counts, but for now a few
are specific to one or two coefficients. Support for other coefficient
counts can be added when needed.
The patch also adds a new macro, IN_TARGET_CODE, that can be
set to indicate that a TU contains target-specific rather than
target-independent code. When this macro is set and the number of
coefficients is 1, the poly-int.h classes define a conversion operator
to a constant. This allows most existing target code to work without
modification. The main exceptions are:
- values passed through ..., which need an explicit conversion to a
constant
- ?: expression in which one arm ends up being a polynomial and the
other remains a constant. In these cases it would be valid to convert
the constant to a polynomial and the polynomial to a constant, so a
cast is needed to break the ambiguity.
The patch also adds a new target hook to return the estimated
value of a polynomial for costing purposes.
The patch also adds operator<< on wide_ints (it was already defined
for offset_int and widest_int). I think this was originally excluded
because >> is ambiguous for wide_int, but << is useful for converting
bytes to bits, etc., so is worth defining on its own. The patch also
adds operator% and operator/ for offset_int and widest_int, since those
types are always signed. These changes allow the poly_int interface to
be more predictable.
I'd originally tried adding the tests as selftests, but that ended up
bloating cc1 by at least a third. It also took a while to build them
at -O2. The patch therefore uses plugin tests instead, where we can
force the tests to be built at -O0. They still run in negligible time
when built that way.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* poly-int.h: New file.
* poly-int-types.h: Likewise.
* coretypes.h: Include them.
(POLY_INT_CONVERSION): Define.
* target.def (estimated_poly_value): New hook.
* doc/tm.texi.in (TARGET_ESTIMATED_POLY_VALUE): New hook.
* doc/tm.texi: Regenerate.
* doc/poly-int.texi: New file.
* doc/gccint.texi: Include it.
* doc/rtl.texi: Describe restrictions on subreg modes.
* Makefile.in (TEXI_GCCINT_FILES): Add poly-int.texi.
* genmodes.c (NUM_POLY_INT_COEFFS): Provide a default definition.
(emit_insn_modes_h): Emit a definition of NUM_POLY_INT_COEFFS.
* targhooks.h (default_estimated_poly_value): Declare.
* targhooks.c (default_estimated_poly_value): New function.
* target.h (estimated_poly_value): Likewise.
* wide-int.h (WI_UNARY_RESULT): Use wi::binary_traits.
(wi::unary_traits): Delete.
(wi::binary_traits::signed_shift_result_type): Define for
offset_int << HOST_WIDE_INT, etc.
(generic_wide_int::operator <<=): Define for all types and use
wi::lshift instead of <<.
(wi::hwi_with_prec): Add a default constructor.
(wi::ints_for): New class.
(operator <<): Define for all wide-int types.
(operator /): New function.
(operator %): Likewise.
* selftest.h (ASSERT_MUST_EQ, ASSERT_MUST_EQ_AT, ASSERT_MAY_NE)
(ASSERT_MAY_NE_AT): New macros.
gcc/testsuite/
* gcc.dg/plugin/poly-int-tests.h,
gcc.dg/plugin/poly-int-test-1.c,
gcc.dg/plugin/poly-int-01_plugin.c,
gcc.dg/plugin/poly-int-02_plugin.c,
gcc.dg/plugin/poly-int-03_plugin.c,
gcc.dg/plugin/poly-int-04_plugin.c,
gcc.dg/plugin/poly-int-05_plugin.c,
gcc.dg/plugin/poly-int-06_plugin.c,
gcc.dg/plugin/poly-int-07_plugin.c: New tests.
* gcc.dg/plugin/plugin.exp: Run them.
[-- Attachment #2: poly-int.diff.bz2 --]
[-- Type: application/x-bzip2, Size: 39587 bytes --]
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-10-23 16:58 ` [001/nnn] poly_int: add poly-int.h Richard Sandiford
@ 2017-10-25 16:17 ` Martin Sebor
2017-11-08 9:44 ` Richard Sandiford
2017-11-08 10:03 ` Richard Sandiford
1 sibling, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-10-25 16:17 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 10:57 AM, Richard Sandiford wrote:
> This patch adds a new "poly_int" class to represent polynomial integers
> of the form:
>
> C0 + C1*X1 + C2*X2 ... + Cn*Xn
>
> It also adds poly_int-based typedefs for offsets and sizes of various
> precisions. In these typedefs, the Ci coefficients are compile-time
> constants and the Xi indeterminates are run-time invariants. The number
> of coefficients is controlled by the target and is initially 1 for all
> ports.
>
> Most routines can handle general coefficient counts, but for now a few
> are specific to one or two coefficients. Support for other coefficient
> counts can be added when needed.
>
> The patch also adds a new macro, IN_TARGET_CODE, that can be
> set to indicate that a TU contains target-specific rather than
> target-independent code. When this macro is set and the number of
> coefficients is 1, the poly-int.h classes define a conversion operator
> to a constant. This allows most existing target code to work without
> modification. The main exceptions are:
>
> - values passed through ..., which need an explicit conversion to a
> constant
>
> - ?: expression in which one arm ends up being a polynomial and the
> other remains a constant. In these cases it would be valid to convert
> the constant to a polynomial and the polynomial to a constant, so a
> cast is needed to break the ambiguity.
>
> The patch also adds a new target hook to return the estimated
> value of a polynomial for costing purposes.
>
> The patch also adds operator<< on wide_ints (it was already defined
> for offset_int and widest_int). I think this was originally excluded
> because >> is ambiguous for wide_int, but << is useful for converting
> bytes to bits, etc., so is worth defining on its own. The patch also
> adds operator% and operator/ for offset_int and widest_int, since those
> types are always signed. These changes allow the poly_int interface to
> be more predictable.
>
> I'd originally tried adding the tests as selftests, but that ended up
> bloating cc1 by at least a third. It also took a while to build them
> at -O2. The patch therefore uses plugin tests instead, where we can
> force the tests to be built at -O0. They still run in negligible time
> when built that way.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * poly-int.h: New file.
> * poly-int-types.h: Likewise.
> * coretypes.h: Include them.
> (POLY_INT_CONVERSION): Define.
> * target.def (estimated_poly_value): New hook.
> * doc/tm.texi.in (TARGET_ESTIMATED_POLY_VALUE): New hook.
> * doc/tm.texi: Regenerate.
> * doc/poly-int.texi: New file.
> * doc/gccint.texi: Include it.
> * doc/rtl.texi: Describe restrictions on subreg modes.
> * Makefile.in (TEXI_GCCINT_FILES): Add poly-int.texi.
> * genmodes.c (NUM_POLY_INT_COEFFS): Provide a default definition.
> (emit_insn_modes_h): Emit a definition of NUM_POLY_INT_COEFFS.
> * targhooks.h (default_estimated_poly_value): Declare.
> * targhooks.c (default_estimated_poly_value): New function.
> * target.h (estimated_poly_value): Likewise.
> * wide-int.h (WI_UNARY_RESULT): Use wi::binary_traits.
> (wi::unary_traits): Delete.
> (wi::binary_traits::signed_shift_result_type): Define for
> offset_int << HOST_WIDE_INT, etc.
> (generic_wide_int::operator <<=): Define for all types and use
> wi::lshift instead of <<.
> (wi::hwi_with_prec): Add a default constructor.
> (wi::ints_for): New class.
> (operator <<): Define for all wide-int types.
> (operator /): New function.
> (operator %): Likewise.
> * selftest.h (ASSERT_MUST_EQ, ASSERT_MUST_EQ_AT, ASSERT_MAY_NE)
> (ASSERT_MAY_NE_AT): New macros.
>
> gcc/testsuite/
> * gcc.dg/plugin/poly-int-tests.h,
> gcc.dg/plugin/poly-int-test-1.c,
> gcc.dg/plugin/poly-int-01_plugin.c,
> gcc.dg/plugin/poly-int-02_plugin.c,
> gcc.dg/plugin/poly-int-03_plugin.c,
> gcc.dg/plugin/poly-int-04_plugin.c,
> gcc.dg/plugin/poly-int-05_plugin.c,
> gcc.dg/plugin/poly-int-06_plugin.c,
> gcc.dg/plugin/poly-int-07_plugin.c: New tests.
> * gcc.dg/plugin/plugin.exp: Run them.
I haven't done nearly a thorough review but the dtor followed by
the placement new in the POLY_SET_COEFF() macro caught my eye so
I thought I'd ask sooner rather than later. Given the macro
definition:
+ The dummy comparison against a null C * is just a way of checking
+ that C gives the right type. */
+#define POLY_SET_COEFF(C, RES, I, VALUE) \
+ ((void) (&(RES).coeffs[0] == (C *) 0), \
+ wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
+ ? (void) ((RES).coeffs[I] = VALUE) \
+ : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
is the following use well-defined?
+template<unsigned int N, typename C>
+inline poly_int_pod<N, C>&
+poly_int_pod<N, C>::operator <<= (unsigned int a)
+{
+ POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
It looks to me as though the VALUE argument in the ctor invoked
by the placement new expression is evaluated after the dtor has
destroyed the very array element the VALUE argument expands to.
Or am I misreading the code?
Whether or not is, in fact, a problem, it seems to me that using
a function template rather than a macro would be a clearer and
safer way to do the same thing. (Safer in that the macro also
evaluates its arguments multiple times, which is often a source
of subtle bugs.)
Other than that, I would suggest changing 't' to something a bit
less terse, like perhaps 'type' in traits like the following:
+struct if_lossless;
+template<typename T1, typename T2, typename T3>
+struct if_lossless<T1, T2, T3, true>
+{
+ typedef T3 t;
+};
Lastly (for now), I note that the default ply_int ctor, like
that of the other xxx_int types, is a no-op. That makes using
all these types error prone, e.g., as arrays in ctor-initializer
lists:
struct Pair {
poly_int<...> poly[2];
Pair (): poly () { } // poly[] (unexpectedly) uninitialized
}
Martin
PS My initial interest in this class was to to see if the it is
less prone to error than wide_int and offset_int. Specifically,
if it's easier to convert the various flavors of xxx_int among
one another and between basic integers and the xxx_ints. But
after reading the documentation I have the impression it might
help with some of the range work I've been doing recently, so
I'll try to do a more thorough review in the (hopefully) near
future.
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-10-25 16:17 ` Martin Sebor
@ 2017-11-08 9:44 ` Richard Sandiford
2017-11-08 16:51 ` Martin Sebor
0 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-11-08 9:44 UTC (permalink / raw)
To: Martin Sebor; +Cc: gcc-patches
Martin Sebor <msebor@gmail.com> writes:
> I haven't done nearly a thorough review but the dtor followed by
> the placement new in the POLY_SET_COEFF() macro caught my eye so
> I thought I'd ask sooner rather than later. Given the macro
> definition:
>
> + The dummy comparison against a null C * is just a way of checking
> + that C gives the right type. */
> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
> + ((void) (&(RES).coeffs[0] == (C *) 0), \
> + wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
> + ? (void) ((RES).coeffs[I] = VALUE) \
> + : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>
> is the following use well-defined?
>
> +template<unsigned int N, typename C>
> +inline poly_int_pod<N, C>&
> +poly_int_pod<N, C>::operator <<= (unsigned int a)
> +{
> + POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>
> It looks to me as though the VALUE argument in the ctor invoked
> by the placement new expression is evaluated after the dtor has
> destroyed the very array element the VALUE argument expands to.
Good catch! It should simply have been doing <<= on each coefficient --
I must have got carried away when converting to POLY_SET_COEFF.
I double-checked the other uses and think that's the only one.
> Whether or not is, in fact, a problem, it seems to me that using
> a function template rather than a macro would be a clearer and
> safer way to do the same thing. (Safer in that the macro also
> evaluates its arguments multiple times, which is often a source
> of subtle bugs.)
That would slow down -O0 builds though, by introducing an extra
function call and set of temporaries even when the coefficients
are primitive integers.
> Other than that, I would suggest changing 't' to something a bit
> less terse, like perhaps 'type' in traits like the following:
>
> +struct if_lossless;
> +template<typename T1, typename T2, typename T3>
> +struct if_lossless<T1, T2, T3, true>
> +{
> + typedef T3 t;
> +};
OK, done in v2.
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-08 9:44 ` Richard Sandiford
@ 2017-11-08 16:51 ` Martin Sebor
2017-11-08 16:56 ` Richard Sandiford
0 siblings, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-11-08 16:51 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 11/08/2017 02:32 AM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>> I haven't done nearly a thorough review but the dtor followed by
>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>> I thought I'd ask sooner rather than later. Given the macro
>> definition:
>>
>> + The dummy comparison against a null C * is just a way of checking
>> + that C gives the right type. */
>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>> + ((void) (&(RES).coeffs[0] == (C *) 0), \
>> + wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
>> + ? (void) ((RES).coeffs[I] = VALUE) \
>> + : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>
>> is the following use well-defined?
>>
>> +template<unsigned int N, typename C>
>> +inline poly_int_pod<N, C>&
>> +poly_int_pod<N, C>::operator <<= (unsigned int a)
>> +{
>> + POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>
>> It looks to me as though the VALUE argument in the ctor invoked
>> by the placement new expression is evaluated after the dtor has
>> destroyed the very array element the VALUE argument expands to.
>
> Good catch! It should simply have been doing <<= on each coefficient --
> I must have got carried away when converting to POLY_SET_COEFF.
>
> I double-checked the other uses and think that's the only one.
>
>> Whether or not is, in fact, a problem, it seems to me that using
>> a function template rather than a macro would be a clearer and
>> safer way to do the same thing. (Safer in that the macro also
>> evaluates its arguments multiple times, which is often a source
>> of subtle bugs.)
>
> That would slow down -O0 builds though, by introducing an extra
> function call and set of temporaries even when the coefficients
> are primitive integers.
Would decorating the function template with attribute always_inline
help?
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-08 16:51 ` Martin Sebor
@ 2017-11-08 16:56 ` Richard Sandiford
2017-11-08 17:33 ` Martin Sebor
2017-11-08 17:34 ` Martin Sebor
0 siblings, 2 replies; 302+ messages in thread
From: Richard Sandiford @ 2017-11-08 16:56 UTC (permalink / raw)
To: Martin Sebor; +Cc: gcc-patches
Martin Sebor <msebor@gmail.com> writes:
> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
>> Martin Sebor <msebor@gmail.com> writes:
>>> I haven't done nearly a thorough review but the dtor followed by
>>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>>> I thought I'd ask sooner rather than later. Given the macro
>>> definition:
>>>
>>> + The dummy comparison against a null C * is just a way of checking
>>> + that C gives the right type. */
>>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>>> + ((void) (&(RES).coeffs[0] == (C *) 0), \
>>> + wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
>>> + ? (void) ((RES).coeffs[I] = VALUE) \
>>> + : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>>
>>> is the following use well-defined?
>>>
>>> +template<unsigned int N, typename C>
>>> +inline poly_int_pod<N, C>&
>>> +poly_int_pod<N, C>::operator <<= (unsigned int a)
>>> +{
>>> + POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>>
>>> It looks to me as though the VALUE argument in the ctor invoked
>>> by the placement new expression is evaluated after the dtor has
>>> destroyed the very array element the VALUE argument expands to.
>>
>> Good catch! It should simply have been doing <<= on each coefficient --
>> I must have got carried away when converting to POLY_SET_COEFF.
>>
>> I double-checked the other uses and think that's the only one.
>>
>>> Whether or not is, in fact, a problem, it seems to me that using
>>> a function template rather than a macro would be a clearer and
>>> safer way to do the same thing. (Safer in that the macro also
>>> evaluates its arguments multiple times, which is often a source
>>> of subtle bugs.)
>>
>> That would slow down -O0 builds though, by introducing an extra
>> function call and set of temporaries even when the coefficients
>> are primitive integers.
>
> Would decorating the function template with attribute always_inline
> help?
It would remove the call itself, but we'd still have the extra temporary
objects that were the function argument and return value.
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-08 16:56 ` Richard Sandiford
@ 2017-11-08 17:33 ` Martin Sebor
2017-11-08 17:34 ` Martin Sebor
1 sibling, 0 replies; 302+ messages in thread
From: Martin Sebor @ 2017-11-08 17:33 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 11/08/2017 09:51 AM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
>>> Martin Sebor <msebor@gmail.com> writes:
>>>> I haven't done nearly a thorough review but the dtor followed by
>>>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>>>> I thought I'd ask sooner rather than later. Given the macro
>>>> definition:
>>>>
>>>> + The dummy comparison against a null C * is just a way of checking
>>>> + that C gives the right type. */
>>>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>>>> + ((void) (&(RES).coeffs[0] == (C *) 0), \
>>>> + wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
>>>> + ? (void) ((RES).coeffs[I] = VALUE) \
>>>> + : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>>>
>>>> is the following use well-defined?
>>>>
>>>> +template<unsigned int N, typename C>
>>>> +inline poly_int_pod<N, C>&
>>>> +poly_int_pod<N, C>::operator <<= (unsigned int a)
>>>> +{
>>>> + POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>>>
>>>> It looks to me as though the VALUE argument in the ctor invoked
>>>> by the placement new expression is evaluated after the dtor has
>>>> destroyed the very array element the VALUE argument expands to.
>>>
>>> Good catch! It should simply have been doing <<= on each coefficient --
>>> I must have got carried away when converting to POLY_SET_COEFF.
>>>
>>> I double-checked the other uses and think that's the only one.
>>>
>>>> Whether or not is, in fact, a problem, it seems to me that using
>>>> a function template rather than a macro would be a clearer and
>>>> safer way to do the same thing. (Safer in that the macro also
>>>> evaluates its arguments multiple times, which is often a source
>>>> of subtle bugs.)
>>>
>>> That would slow down -O0 builds though, by introducing an extra
>>> function call and set of temporaries even when the coefficients
>>> are primitive integers.
>>
>> Would decorating the function template with attribute always_inline
>> help?
>
> It would remove the call itself, but we'd still have the extra temporary
> objects that were the function argument and return value.
Sorry, I do not want to get into another long discussion about
trade-offs between safety and efficiency but I'm not sure I see
what extra temporaries it would create. It seems to me that
an inline function template that took arguments of user-defined
types by reference and others by value should be just as efficient
as a macro.
From GCC's own manual:
6.43 An Inline Function is As Fast As a Macro
https://gcc.gnu.org/onlinedocs/gcc/Inline.html
If that's not the case and there is a significant performance
penalty associated with inline functions at -O0 then GCC should
be fixed to avoid it.
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-08 16:56 ` Richard Sandiford
2017-11-08 17:33 ` Martin Sebor
@ 2017-11-08 17:34 ` Martin Sebor
2017-11-08 18:34 ` Richard Sandiford
1 sibling, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-11-08 17:34 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 11/08/2017 09:51 AM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
>>> Martin Sebor <msebor@gmail.com> writes:
>>>> I haven't done nearly a thorough review but the dtor followed by
>>>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>>>> I thought I'd ask sooner rather than later. Given the macro
>>>> definition:
>>>>
>>>> + The dummy comparison against a null C * is just a way of checking
>>>> + that C gives the right type. */
>>>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>>>> + ((void) (&(RES).coeffs[0] == (C *) 0), \
>>>> + wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
>>>> + ? (void) ((RES).coeffs[I] = VALUE) \
>>>> + : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>>>
>>>> is the following use well-defined?
>>>>
>>>> +template<unsigned int N, typename C>
>>>> +inline poly_int_pod<N, C>&
>>>> +poly_int_pod<N, C>::operator <<= (unsigned int a)
>>>> +{
>>>> + POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>>>
>>>> It looks to me as though the VALUE argument in the ctor invoked
>>>> by the placement new expression is evaluated after the dtor has
>>>> destroyed the very array element the VALUE argument expands to.
>>>
>>> Good catch! It should simply have been doing <<= on each coefficient --
>>> I must have got carried away when converting to POLY_SET_COEFF.
>>>
>>> I double-checked the other uses and think that's the only one.
>>>
>>>> Whether or not is, in fact, a problem, it seems to me that using
>>>> a function template rather than a macro would be a clearer and
>>>> safer way to do the same thing. (Safer in that the macro also
>>>> evaluates its arguments multiple times, which is often a source
>>>> of subtle bugs.)
>>>
>>> That would slow down -O0 builds though, by introducing an extra
>>> function call and set of temporaries even when the coefficients
>>> are primitive integers.
>>
>> Would decorating the function template with attribute always_inline
>> help?
>
> It would remove the call itself, but we'd still have the extra temporary
> objects that were the function argument and return value.
Sorry, I do not want to get into another long discussion about
trade-offs between safety and efficiency but I'm not sure I see
what extra temporaries it would create. It seems to me that
an inline function template that took arguments of user-defined
types by reference and others by value should be just as efficient
as a macro.
From GCC's own manual:
6.43 An Inline Function is As Fast As a Macro
https://gcc.gnu.org/onlinedocs/gcc/Inline.html
If that's not the case and there is a significant performance
penalty associated with inline functions at -O0 then GCC should
be fixed to avoid it.
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-08 17:34 ` Martin Sebor
@ 2017-11-08 18:34 ` Richard Sandiford
2017-11-09 9:10 ` Martin Sebor
0 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-11-08 18:34 UTC (permalink / raw)
To: Martin Sebor; +Cc: gcc-patches
Martin Sebor <msebor@gmail.com> writes:
> On 11/08/2017 09:51 AM, Richard Sandiford wrote:
>> Martin Sebor <msebor@gmail.com> writes:
>>> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
>>>> Martin Sebor <msebor@gmail.com> writes:
>>>>> I haven't done nearly a thorough review but the dtor followed by
>>>>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>>>>> I thought I'd ask sooner rather than later. Given the macro
>>>>> definition:
>>>>>
>>>>> + The dummy comparison against a null C * is just a way of checking
>>>>> + that C gives the right type. */
>>>>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>>>>> + ((void) (&(RES).coeffs[0] == (C *) 0), \
>>>>> + wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
>>>>> + ? (void) ((RES).coeffs[I] = VALUE) \
>>>>> + : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>>>>
>>>>> is the following use well-defined?
>>>>>
>>>>> +template<unsigned int N, typename C>
>>>>> +inline poly_int_pod<N, C>&
>>>>> +poly_int_pod<N, C>::operator <<= (unsigned int a)
>>>>> +{
>>>>> + POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>>>>
>>>>> It looks to me as though the VALUE argument in the ctor invoked
>>>>> by the placement new expression is evaluated after the dtor has
>>>>> destroyed the very array element the VALUE argument expands to.
>>>>
>>>> Good catch! It should simply have been doing <<= on each coefficient --
>>>> I must have got carried away when converting to POLY_SET_COEFF.
>>>>
>>>> I double-checked the other uses and think that's the only one.
>>>>
>>>>> Whether or not is, in fact, a problem, it seems to me that using
>>>>> a function template rather than a macro would be a clearer and
>>>>> safer way to do the same thing. (Safer in that the macro also
>>>>> evaluates its arguments multiple times, which is often a source
>>>>> of subtle bugs.)
>>>>
>>>> That would slow down -O0 builds though, by introducing an extra
>>>> function call and set of temporaries even when the coefficients
>>>> are primitive integers.
>>>
>>> Would decorating the function template with attribute always_inline
>>> help?
>>
>> It would remove the call itself, but we'd still have the extra temporary
>> objects that were the function argument and return value.
>
> Sorry, I do not want to get into another long discussion about
> trade-offs between safety and efficiency but I'm not sure I see
> what extra temporaries it would create. It seems to me that
> an inline function template that took arguments of user-defined
> types by reference and others by value should be just as efficient
> as a macro.
>
> From GCC's own manual:
>
> 6.43 An Inline Function is As Fast As a Macro
> https://gcc.gnu.org/onlinedocs/gcc/Inline.html
You can see the difference with something like:
inline
void __attribute__((always_inline))
f(int &dst, const int &src) { dst = src; }
int g1(const int &y) { int x; f(x, y); return x; }
int g2(const int &y) { int x; x = y; return x; }
where *.optimized from GCC 7.1 at -O0 is:
int g1(const int&) (const int & y)
{
int & dst;
const int & src;
int x;
int D.2285;
int _3;
int _6;
<bb 2> [0.00%]:
src_5 = y_2(D);
_6 = *src_5;
x = _6;
_3 = x;
x ={v} {CLOBBER};
<L1> [0.00%]:
return _3;
}
vs:
int g2(const int&) (const int & y)
{
int x;
int D.2288;
int _4;
<bb 2> [0.00%]:
x_3 = *y_2(D);
_4 = x_3;
<L0> [0.00%]:
return _4;
}
> If that's not the case and there is a significant performance
> penalty associated with inline functions at -O0 then GCC should
> be fixed to avoid it.
I think those docs are really talking about inline functions being as
fast as macros when optimisation is enabled. I don't think we make
any guarantees about -O0 code quality.
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-08 18:34 ` Richard Sandiford
@ 2017-11-09 9:10 ` Martin Sebor
2017-11-09 11:14 ` Richard Sandiford
0 siblings, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-11-09 9:10 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 11/08/2017 11:28 AM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>> On 11/08/2017 09:51 AM, Richard Sandiford wrote:
>>> Martin Sebor <msebor@gmail.com> writes:
>>>> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
>>>>> Martin Sebor <msebor@gmail.com> writes:
>>>>>> I haven't done nearly a thorough review but the dtor followed by
>>>>>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>>>>>> I thought I'd ask sooner rather than later. Given the macro
>>>>>> definition:
>>>>>>
>>>>>> + The dummy comparison against a null C * is just a way of checking
>>>>>> + that C gives the right type. */
>>>>>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>>>>>> + ((void) (&(RES).coeffs[0] == (C *) 0), \
>>>>>> + wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
>>>>>> + ? (void) ((RES).coeffs[I] = VALUE) \
>>>>>> + : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>>>>>
>>>>>> is the following use well-defined?
>>>>>>
>>>>>> +template<unsigned int N, typename C>
>>>>>> +inline poly_int_pod<N, C>&
>>>>>> +poly_int_pod<N, C>::operator <<= (unsigned int a)
>>>>>> +{
>>>>>> + POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>>>>>
>>>>>> It looks to me as though the VALUE argument in the ctor invoked
>>>>>> by the placement new expression is evaluated after the dtor has
>>>>>> destroyed the very array element the VALUE argument expands to.
>>>>>
>>>>> Good catch! It should simply have been doing <<= on each coefficient --
>>>>> I must have got carried away when converting to POLY_SET_COEFF.
>>>>>
>>>>> I double-checked the other uses and think that's the only one.
>>>>>
>>>>>> Whether or not is, in fact, a problem, it seems to me that using
>>>>>> a function template rather than a macro would be a clearer and
>>>>>> safer way to do the same thing. (Safer in that the macro also
>>>>>> evaluates its arguments multiple times, which is often a source
>>>>>> of subtle bugs.)
>>>>>
>>>>> That would slow down -O0 builds though, by introducing an extra
>>>>> function call and set of temporaries even when the coefficients
>>>>> are primitive integers.
>>>>
>>>> Would decorating the function template with attribute always_inline
>>>> help?
>>>
>>> It would remove the call itself, but we'd still have the extra temporary
>>> objects that were the function argument and return value.
>>
>> Sorry, I do not want to get into another long discussion about
>> trade-offs between safety and efficiency but I'm not sure I see
>> what extra temporaries it would create. It seems to me that
>> an inline function template that took arguments of user-defined
>> types by reference and others by value should be just as efficient
>> as a macro.
>>
>> From GCC's own manual:
>>
>> 6.43 An Inline Function is As Fast As a Macro
>> https://gcc.gnu.org/onlinedocs/gcc/Inline.html
>
> You can see the difference with something like:
>
> inline
> void __attribute__((always_inline))
> f(int &dst, const int &src) { dst = src; }
>
> int g1(const int &y) { int x; f(x, y); return x; }
> int g2(const int &y) { int x; x = y; return x; }
Let me say at the outset that I struggle to comprehend that a few
instructions is even a consideration when not optimizing, especially
in light of the bug the macro caused that would have been prevented
by using a function instead. But...
...I don't think your example above is representative of using
the POLY_SET_COEFF macro. The function template I'm suggesting
might look something to this:
template <unsigned N, class C>
inline void __attribute__ ((always_inline))
poly_set_coeff (poly_int_pod<N, C> *p, unsigned idx, C val)
{
((void) (&(*p).coeffs[0] == (C *) 0),
wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION ? (void)
((*p).coeffs[0] = val) : (void) ((*p).coeffs[0].~C (), new
(&(*p).coeffs[0]) C (val)));
if (N >= 2)
for (unsigned int i = 1; i < N; i++)
((void) (&(*p).coeffs[0] == (C *) 0),
wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION ? (void)
((*p).coeffs[i] = val) : (void) ((*p).coeffs[i].~C (), new
(&(*p).coeffs[i]) C (val)));
}
To compare apples to apples I suggest to instead compare the shift
operator (or any other poly_int function that uses the macro) that
doesn't suffer from the bug vs one that makes use of the function
template. I see a difference of 2 instructions on x86_64 (21 vs
23) for operator<<=.
Are two assembly instructions even worth talking about?
>> If that's not the case and there is a significant performance
>> penalty associated with inline functions at -O0 then GCC should
>> be fixed to avoid it.
>
> I think those docs are really talking about inline functions being as
> fast as macros when optimisation is enabled. I don't think we make
> any guarantees about -O0 code quality.
Sure, but you are using unsafe macros in preference to a safer
inline function even with optimization, introducing a bug as
a result, and making an argument that the performance impact
of a few instructions when not using optimization is what should
drive the decision between one and the other in all situations.
With all respect, I fail to see the logic in this like of
reasoning. By that argument we would never be able to define
any inline functions.
That being said, if the performance implications of using inline
functions with no optimization are so serious here then I suggest
you should be concerned about introducing the poly_int API in its
current form at all: every access to the class is an inline
function.
On a more serious/constructive note, if you really are worried
about efficiency at this level then introducing an intrinsic
primitive into the compiler instead of a set of classes might
be worth thinking about. It will only benefit GCC but it might
lay a foundation for all sorts of infinite precision integer
classes (including the C++ proposal that was pointed out in
the other thread).
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-09 9:10 ` Martin Sebor
@ 2017-11-09 11:14 ` Richard Sandiford
2017-11-09 17:42 ` Martin Sebor
2017-11-13 17:59 ` Jeff Law
0 siblings, 2 replies; 302+ messages in thread
From: Richard Sandiford @ 2017-11-09 11:14 UTC (permalink / raw)
To: Martin Sebor; +Cc: gcc-patches
Martin Sebor <msebor@gmail.com> writes:
> On 11/08/2017 11:28 AM, Richard Sandiford wrote:
>> Martin Sebor <msebor@gmail.com> writes:
>>> On 11/08/2017 09:51 AM, Richard Sandiford wrote:
>>>> Martin Sebor <msebor@gmail.com> writes:
>>>>> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
>>>>>> Martin Sebor <msebor@gmail.com> writes:
>>>>>>> I haven't done nearly a thorough review but the dtor followed by
>>>>>>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>>>>>>> I thought I'd ask sooner rather than later. Given the macro
>>>>>>> definition:
>>>>>>>
>>>>>>> + The dummy comparison against a null C * is just a way of checking
>>>>>>> + that C gives the right type. */
>>>>>>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>>>>>>> + ((void) (&(RES).coeffs[0] == (C *) 0), \
>>>>>>> + wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
>>>>>>> + ? (void) ((RES).coeffs[I] = VALUE) \
>>>>>>> + : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>>>>>>
>>>>>>> is the following use well-defined?
>>>>>>>
>>>>>>> +template<unsigned int N, typename C>
>>>>>>> +inline poly_int_pod<N, C>&
>>>>>>> +poly_int_pod<N, C>::operator <<= (unsigned int a)
>>>>>>> +{
>>>>>>> + POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>>>>>>
>>>>>>> It looks to me as though the VALUE argument in the ctor invoked
>>>>>>> by the placement new expression is evaluated after the dtor has
>>>>>>> destroyed the very array element the VALUE argument expands to.
>>>>>>
>>>>>> Good catch! It should simply have been doing <<= on each coefficient --
>>>>>> I must have got carried away when converting to POLY_SET_COEFF.
>>>>>>
>>>>>> I double-checked the other uses and think that's the only one.
>>>>>>
>>>>>>> Whether or not is, in fact, a problem, it seems to me that using
>>>>>>> a function template rather than a macro would be a clearer and
>>>>>>> safer way to do the same thing. (Safer in that the macro also
>>>>>>> evaluates its arguments multiple times, which is often a source
>>>>>>> of subtle bugs.)
>>>>>>
>>>>>> That would slow down -O0 builds though, by introducing an extra
>>>>>> function call and set of temporaries even when the coefficients
>>>>>> are primitive integers.
>>>>>
>>>>> Would decorating the function template with attribute always_inline
>>>>> help?
>>>>
>>>> It would remove the call itself, but we'd still have the extra temporary
>>>> objects that were the function argument and return value.
>>>
>>> Sorry, I do not want to get into another long discussion about
>>> trade-offs between safety and efficiency but I'm not sure I see
>>> what extra temporaries it would create. It seems to me that
>>> an inline function template that took arguments of user-defined
>>> types by reference and others by value should be just as efficient
>>> as a macro.
>>>
>>> From GCC's own manual:
>>>
>>> 6.43 An Inline Function is As Fast As a Macro
>>> https://gcc.gnu.org/onlinedocs/gcc/Inline.html
>>
>> You can see the difference with something like:
>>
>> inline
>> void __attribute__((always_inline))
>> f(int &dst, const int &src) { dst = src; }
>>
>> int g1(const int &y) { int x; f(x, y); return x; }
>> int g2(const int &y) { int x; x = y; return x; }
>
> Let me say at the outset that I struggle to comprehend that a few
> instructions is even a consideration when not optimizing, especially
> in light of the bug the macro caused that would have been prevented
> by using a function instead. But...
Many people still build at -O0 though. One of the things I was asked
for was the time it takes to build stage 2 with an -O0 stage 1
(where stage 1 would usually be built by the host compiler).
> ...I don't think your example above is representative of using
> the POLY_SET_COEFF macro. The function template I'm suggesting
> might look something to this:
>
> template <unsigned N, class C>
> inline void __attribute__ ((always_inline))
> poly_set_coeff (poly_int_pod<N, C> *p, unsigned idx, C val)
> {
> ((void) (&(*p).coeffs[0] == (C *) 0),
> wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION ? (void)
> ((*p).coeffs[0] = val) : (void) ((*p).coeffs[0].~C (), new
> (&(*p).coeffs[0]) C (val)));
>
> if (N >= 2)
> for (unsigned int i = 1; i < N; i++)
> ((void) (&(*p).coeffs[0] == (C *) 0),
> wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION ? (void)
> ((*p).coeffs[i] = val) : (void) ((*p).coeffs[i].~C (), new
> (&(*p).coeffs[i]) C (val)));
> }
That ignores the idx parameter and sets all coefficents to val. Did you
mean somnething like:
template <unsigned N, typename C1, typename C2>
inline void __attribute__ ((always_inline))
poly_set_coeff (poly_int_pod<N, C1> *p, unsigned idx, C2 val)
{
wi::int_traits<C1>::precision_type == wi::FLEXIBLE_PRECISION ? (void) ((*p).coeffs[idx] = val) : (void) ((*p).coeffs[idx].~C1 (), new (&(*p).coeffs[idx]) C1 (val));
}
? If so...
> To compare apples to apples I suggest to instead compare the shift
> operator (or any other poly_int function that uses the macro) that
> doesn't suffer from the bug vs one that makes use of the function
> template. I see a difference of 2 instructions on x86_64 (21 vs
> 23) for operator<<=.
>
> Are two assembly instructions even worth talking about?
...the problem is that passing C by value defeats the point of the
optimisation:
/* RES is a poly_int result that has coefficients of type C and that
is being built up a coefficient at a time. Set coefficient number I
to VALUE in the most efficient way possible.
For primitive C it is better to assign directly, since it avoids
any further calls and so is more efficient when the compiler is
built at -O0. But for wide-int based C it is better to construct
the value in-place. This means that calls out to a wide-int.cc
routine can take the address of RES rather than the address of
a temporary.
With the inline function, the wide-int.cc routines will be taking
the address of the temporary "val" object, which will then be used
to initialise the target object via a copy. The macro was there
to avoid the copy.
E.g. for a normal --enable-checking=release build of current sources
on x86_64, mem_ref_offset is:
0000000000000034 T mem_ref_offset(tree_node const*)
With the POLY_SET_COEFF macro it's the same size (and code) with
poly-int.h:
0000000000000034 T mem_ref_offset(tree_node const*)
But using the function above gives:
0000000000000058 T mem_ref_offset(tree_node const*)
which is very similar to what we'd get by assigning to the coefficients
normally.
This kind of thing happened in quite a few other places. mem_ref_offset
is just a nice example because it's so self-contained. And it did have
a measurable effect on the speed of the compiler.
That's why the cut-down version quoted above passed the source by
reference too. Doing that, i.e.:
template <unsigned N, typename C1, typename C2>
inline void __attribute__ ((always_inline))
poly_set_coeff (poly_int_pod<N, C1> *p, unsigned idx, const C2 &val)
gives:
0000000000000052 T mem_ref_offset(tree_node const*)
But the use of this inline function in <<= would be just as incorrect as
using the macro.
[These are all sizes for normally-optimised release builds]
>>> If that's not the case and there is a significant performance
>>> penalty associated with inline functions at -O0 then GCC should
>>> be fixed to avoid it.
>>
>> I think those docs are really talking about inline functions being as
>> fast as macros when optimisation is enabled. I don't think we make
>> any guarantees about -O0 code quality.
>
> Sure, but you are using unsafe macros in preference to a safer
> inline function even with optimization, introducing a bug as
> a result, and making an argument that the performance impact
> of a few instructions when not using optimization is what should
> drive the decision between one and the other in all situations.
> With all respect, I fail to see the logic in this like of
> reasoning. By that argument we would never be able to define
> any inline functions.
>
> That being said, if the performance implications of using inline
> functions with no optimization are so serious here then I suggest
> you should be concerned about introducing the poly_int API in its
> current form at all: every access to the class is an inline
> function.
It's a trade-off. It would be very difficult to do poly-int.h
via macros without making the changes even more invasive.
But with a case like this, where we *can* do something common
via a macro, I think using the macro makes sense. Especially
when it's local to the file rather than a "public" interface.
> On a more serious/constructive note, if you really are worried
> about efficiency at this level then introducing an intrinsic
> primitive into the compiler instead of a set of classes might
> be worth thinking about. It will only benefit GCC but it might
> lay a foundation for all sorts of infinite precision integer
> classes (including the C++ proposal that was pointed out in
> the other thread).
This has to work with host compilers other than GCC though.
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-09 11:14 ` Richard Sandiford
@ 2017-11-09 17:42 ` Martin Sebor
2017-11-13 17:59 ` Jeff Law
1 sibling, 0 replies; 302+ messages in thread
From: Martin Sebor @ 2017-11-09 17:42 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 11/09/2017 04:06 AM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>> On 11/08/2017 11:28 AM, Richard Sandiford wrote:
>>> Martin Sebor <msebor@gmail.com> writes:
>>>> On 11/08/2017 09:51 AM, Richard Sandiford wrote:
>>>>> Martin Sebor <msebor@gmail.com> writes:
>>>>>> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
>>>>>>> Martin Sebor <msebor@gmail.com> writes:
>>>>>>>> I haven't done nearly a thorough review but the dtor followed by
>>>>>>>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>>>>>>>> I thought I'd ask sooner rather than later. Given the macro
>>>>>>>> definition:
>>>>>>>>
>>>>>>>> + The dummy comparison against a null C * is just a way of checking
>>>>>>>> + that C gives the right type. */
>>>>>>>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>>>>>>>> + ((void) (&(RES).coeffs[0] == (C *) 0), \
>>>>>>>> + wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
>>>>>>>> + ? (void) ((RES).coeffs[I] = VALUE) \
>>>>>>>> + : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>>>>>>>
>>>>>>>> is the following use well-defined?
>>>>>>>>
>>>>>>>> +template<unsigned int N, typename C>
>>>>>>>> +inline poly_int_pod<N, C>&
>>>>>>>> +poly_int_pod<N, C>::operator <<= (unsigned int a)
>>>>>>>> +{
>>>>>>>> + POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>>>>>>>
>>>>>>>> It looks to me as though the VALUE argument in the ctor invoked
>>>>>>>> by the placement new expression is evaluated after the dtor has
>>>>>>>> destroyed the very array element the VALUE argument expands to.
>>>>>>>
>>>>>>> Good catch! It should simply have been doing <<= on each coefficient --
>>>>>>> I must have got carried away when converting to POLY_SET_COEFF.
>>>>>>>
>>>>>>> I double-checked the other uses and think that's the only one.
>>>>>>>
>>>>>>>> Whether or not is, in fact, a problem, it seems to me that using
>>>>>>>> a function template rather than a macro would be a clearer and
>>>>>>>> safer way to do the same thing. (Safer in that the macro also
>>>>>>>> evaluates its arguments multiple times, which is often a source
>>>>>>>> of subtle bugs.)
>>>>>>>
>>>>>>> That would slow down -O0 builds though, by introducing an extra
>>>>>>> function call and set of temporaries even when the coefficients
>>>>>>> are primitive integers.
>>>>>>
>>>>>> Would decorating the function template with attribute always_inline
>>>>>> help?
>>>>>
>>>>> It would remove the call itself, but we'd still have the extra temporary
>>>>> objects that were the function argument and return value.
>>>>
>>>> Sorry, I do not want to get into another long discussion about
>>>> trade-offs between safety and efficiency but I'm not sure I see
>>>> what extra temporaries it would create. It seems to me that
>>>> an inline function template that took arguments of user-defined
>>>> types by reference and others by value should be just as efficient
>>>> as a macro.
>>>>
>>>> From GCC's own manual:
>>>>
>>>> 6.43 An Inline Function is As Fast As a Macro
>>>> https://gcc.gnu.org/onlinedocs/gcc/Inline.html
>>>
>>> You can see the difference with something like:
>>>
>>> inline
>>> void __attribute__((always_inline))
>>> f(int &dst, const int &src) { dst = src; }
>>>
>>> int g1(const int &y) { int x; f(x, y); return x; }
>>> int g2(const int &y) { int x; x = y; return x; }
>>
>> Let me say at the outset that I struggle to comprehend that a few
>> instructions is even a consideration when not optimizing, especially
>> in light of the bug the macro caused that would have been prevented
>> by using a function instead. But...
>
> Many people still build at -O0 though. One of the things I was asked
> for was the time it takes to build stage 2 with an -O0 stage 1
> (where stage 1 would usually be built by the host compiler).
Yes, of course. I do all my development and basic testing at
-O0. But I don't expect the performance to be comparable to
-O2. I'd be surprised if anyone did. What I do expect at -O0
(and what I'm grateful for) is GCC to make it easier to find
bugs in my code by enabling extra checks, even if they come at
the expense of slower execution.
That said, if your enhancement has such dramatic performance
implications at -O0 that the only way to avoid them is by using
macros then I would say it's not appropriate.
>> ...I don't think your example above is representative of using
>> the POLY_SET_COEFF macro. The function template I'm suggesting
>> might look something to this:
>>
>> template <unsigned N, class C>
>> inline void __attribute__ ((always_inline))
>> poly_set_coeff (poly_int_pod<N, C> *p, unsigned idx, C val)
>> {
>> ((void) (&(*p).coeffs[0] == (C *) 0),
>> wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION ? (void)
>> ((*p).coeffs[0] = val) : (void) ((*p).coeffs[0].~C (), new
>> (&(*p).coeffs[0]) C (val)));
>>
>> if (N >= 2)
>> for (unsigned int i = 1; i < N; i++)
>> ((void) (&(*p).coeffs[0] == (C *) 0),
>> wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION ? (void)
>> ((*p).coeffs[i] = val) : (void) ((*p).coeffs[i].~C (), new
>> (&(*p).coeffs[i]) C (val)));
>> }
>
> That ignores the idx parameter and sets all coefficents to val. Did you
> mean somnething like:
>
> template <unsigned N, typename C1, typename C2>
> inline void __attribute__ ((always_inline))
> poly_set_coeff (poly_int_pod<N, C1> *p, unsigned idx, C2 val)
> {
> wi::int_traits<C1>::precision_type == wi::FLEXIBLE_PRECISION ? (void) ((*p).coeffs[idx] = val) : (void) ((*p).coeffs[idx].~C1 (), new (&(*p).coeffs[idx]) C1 (val));
> }
>
> ? If so...
Yes, I didn't have it quite right. With the above there's
a difference of three x86_64 mov instructions at -O0. The code
at -O2 is still identical.
>> To compare apples to apples I suggest to instead compare the shift
>> operator (or any other poly_int function that uses the macro) that
>> doesn't suffer from the bug vs one that makes use of the function
>> template. I see a difference of 2 instructions on x86_64 (21 vs
>> 23) for operator<<=.
>>
>> Are two assembly instructions even worth talking about?
>
> ...the problem is that passing C by value defeats the point of the
> optimisation:
>
> /* RES is a poly_int result that has coefficients of type C and that
> is being built up a coefficient at a time. Set coefficient number I
> to VALUE in the most efficient way possible.
>
> For primitive C it is better to assign directly, since it avoids
> any further calls and so is more efficient when the compiler is
> built at -O0. But for wide-int based C it is better to construct
> the value in-place. This means that calls out to a wide-int.cc
> routine can take the address of RES rather than the address of
> a temporary.
>
> With the inline function, the wide-int.cc routines will be taking
> the address of the temporary "val" object, which will then be used
> to initialise the target object via a copy. The macro was there
> to avoid the copy.
There are many ways to write code that is both safe and efficient
that don't involve resorting to convoluted, error-prone macros.
I don't quite see how passing a fundamental type by value to
to an inline function can be a bottleneck. If C can be both
a fundamental type or an aggregate or then the aggregate case
can be optimized by providing an overload or specialization.
But an answer that "the only way to write the code is by using
a macro" cannot possibly be acceptable (maybe 20 years ago but
not today).
> But the use of this inline function in <<= would be just as incorrect as
> using the macro.
>
> [These are all sizes for normally-optimised release builds]
We seem to have very different priorities. I make mistakes all
the time and so I need all the help I can get to find problems
in my code. Microoptimizations that make it easier to get
things wrong make debugging harder. In both of the cases we
have discussed -- the no-op ctor and the macro this has actually
happened, to both of us. -O0 is for meant for development and
should make it easy to avoid mistakes (as all the GCC checking
tries to do). If the overhead of an always-inline function is
unacceptable when optimizing then use a macro if you must but
only then and please at least make the -O0 default safe. And
open a bug to get the inefficiency fixed. Otherwise we can't
really claim that An Inline Function is As Fast As a Macro.
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-09 11:14 ` Richard Sandiford
2017-11-09 17:42 ` Martin Sebor
@ 2017-11-13 17:59 ` Jeff Law
2017-11-13 23:57 ` Richard Sandiford
1 sibling, 1 reply; 302+ messages in thread
From: Jeff Law @ 2017-11-13 17:59 UTC (permalink / raw)
To: Martin Sebor, gcc-patches, richard.sandiford
On 11/09/2017 04:06 AM, Richard Sandiford wrote:
>> Let me say at the outset that I struggle to comprehend that a few
>> instructions is even a consideration when not optimizing, especially
>> in light of the bug the macro caused that would have been prevented
>> by using a function instead. But...
>
> Many people still build at -O0 though. One of the things I was asked
> for was the time it takes to build stage 2 with an -O0 stage 1
> (where stage 1 would usually be built by the host compiler).
I suspect folks are concerned about this because it potentially affects
their daily development cycle times. So they're looking to see if the
introduction of the poly types has a significant impact. It's a
legitimate question, particularly for the introduction of low level
infrastructure that potentially gets hit a lot.
Richard, what were the results of that test (if it's elsewhere in the
thread I'll eventually find it... I'm just starting to try and make
some headway on this kit).
Jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-13 17:59 ` Jeff Law
@ 2017-11-13 23:57 ` Richard Sandiford
2017-11-14 1:21 ` Martin Sebor
2017-11-17 3:31 ` Jeff Law
0 siblings, 2 replies; 302+ messages in thread
From: Richard Sandiford @ 2017-11-13 23:57 UTC (permalink / raw)
To: Jeff Law; +Cc: Martin Sebor, gcc-patches
Jeff Law <law@redhat.com> writes:
> On 11/09/2017 04:06 AM, Richard Sandiford wrote:
>
>>> Let me say at the outset that I struggle to comprehend that a few
>>> instructions is even a consideration when not optimizing, especially
>>> in light of the bug the macro caused that would have been prevented
>>> by using a function instead. But...
>>
>> Many people still build at -O0 though. One of the things I was asked
>> for was the time it takes to build stage 2 with an -O0 stage 1
>> (where stage 1 would usually be built by the host compiler).
> I suspect folks are concerned about this because it potentially affects
> their daily development cycle times. So they're looking to see if the
> introduction of the poly types has a significant impact. It's a
> legitimate question, particularly for the introduction of low level
> infrastructure that potentially gets hit a lot.
>
> Richard, what were the results of that test (if it's elsewhere in the
> thread I'll eventually find it...
On an x86_64 box I got:
real: +7%
user: +8.6%
for building stage2 with an -O0 -g stage1. For aarch64 with the
NUM_POLY_INT_COEFFS==2 change it was:
real: +17%
user: +20%
That's obviously not ideal, but C++11 would get rid of some of the
inefficiencies, once we can switch to that.
You've probably already seen this, but it's compile-time neutral on
x86_64 in terms of running a gcc built with --enable-checking=release,
within a margin of about [-0.1%, 0.1%].
For aarch64 with NUM_POLY_INT_COEFFS==2, a gcc built with
--enable-checking=release is ~1% slower when using -g and ~2%
slower with -O2 -g.
> I'm just starting to try and make some headway on this kit).
Thanks :-) I guess it's going to be a real slog going through them,
sorry, even despite the attempt to split them up.
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-13 23:57 ` Richard Sandiford
@ 2017-11-14 1:21 ` Martin Sebor
2017-11-14 9:46 ` Richard Sandiford
2017-11-17 3:31 ` Jeff Law
1 sibling, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-11-14 1:21 UTC (permalink / raw)
To: Jeff Law, gcc-patches, richard.sandiford
On 11/13/2017 04:36 PM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 11/09/2017 04:06 AM, Richard Sandiford wrote:
>>
>>>> Let me say at the outset that I struggle to comprehend that a few
>>>> instructions is even a consideration when not optimizing, especially
>>>> in light of the bug the macro caused that would have been prevented
>>>> by using a function instead. But...
>>>
>>> Many people still build at -O0 though. One of the things I was asked
>>> for was the time it takes to build stage 2 with an -O0 stage 1
>>> (where stage 1 would usually be built by the host compiler).
>> I suspect folks are concerned about this because it potentially affects
>> their daily development cycle times. So they're looking to see if the
>> introduction of the poly types has a significant impact. It's a
>> legitimate question, particularly for the introduction of low level
>> infrastructure that potentially gets hit a lot.
>>
>> Richard, what were the results of that test (if it's elsewhere in the
>> thread I'll eventually find it...
>
> On an x86_64 box I got:
>
> real: +7%
> user: +8.6%
>
> for building stage2 with an -O0 -g stage1. For aarch64 with the
> NUM_POLY_INT_COEFFS==2 change it was:
>
> real: +17%
> user: +20%
>
> That's obviously not ideal, but C++11 would get rid of some of the
> inefficiencies, once we can switch to that.
For the purposes of this discussion, what would the numbers look
like if the macro were replaced with the inline function as I
suggested?
What impact on the numbers would having the default ctor actually
initialize the object have? (As opposed to leaving it uninitialized.)
I don't want to make a bigger deal out of this macro than it
already is. Unlike the wide int constructors, it's
an implementation detail that, when correct, almost no-one will
have to worry about. The main reason for my strenuous objections
is not the macro itself but the philosophy that performance,
especially at -O0, should be an overriding consideration. Code
should be safe first and foremost. Otherwise, the few cycles we
might save by writing unsafe but fast code will be wasted in
debugging sessions.
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-14 1:21 ` Martin Sebor
@ 2017-11-14 9:46 ` Richard Sandiford
0 siblings, 0 replies; 302+ messages in thread
From: Richard Sandiford @ 2017-11-14 9:46 UTC (permalink / raw)
To: Martin Sebor; +Cc: Jeff Law, gcc-patches
Martin Sebor <msebor@gmail.com> writes:
> On 11/13/2017 04:36 PM, Richard Sandiford wrote:
>> Jeff Law <law@redhat.com> writes:
>>> On 11/09/2017 04:06 AM, Richard Sandiford wrote:
>>>
>>>>> Let me say at the outset that I struggle to comprehend that a few
>>>>> instructions is even a consideration when not optimizing, especially
>>>>> in light of the bug the macro caused that would have been prevented
>>>>> by using a function instead. But...
>>>>
>>>> Many people still build at -O0 though. One of the things I was asked
>>>> for was the time it takes to build stage 2 with an -O0 stage 1
>>>> (where stage 1 would usually be built by the host compiler).
>>> I suspect folks are concerned about this because it potentially affects
>>> their daily development cycle times. So they're looking to see if the
>>> introduction of the poly types has a significant impact. It's a
>>> legitimate question, particularly for the introduction of low level
>>> infrastructure that potentially gets hit a lot.
>>>
>>> Richard, what were the results of that test (if it's elsewhere in the
>>> thread I'll eventually find it...
>>
>> On an x86_64 box I got:
>>
>> real: +7%
>> user: +8.6%
>>
>> for building stage2 with an -O0 -g stage1. For aarch64 with the
>> NUM_POLY_INT_COEFFS==2 change it was:
>>
>> real: +17%
>> user: +20%
>>
>> That's obviously not ideal, but C++11 would get rid of some of the
>> inefficiencies, once we can switch to that.
>
> For the purposes of this discussion, what would the numbers look
> like if the macro were replaced with the inline function as I
> suggested?
>
> What impact on the numbers would having the default ctor actually
> initialize the object have? (As opposed to leaving it uninitialized.)
I was objecting to that for semantic reasons[*], not performance when
built with -O0. I realise you don't agree, but I don't think either of
us is going to convince the other here.
> I don't want to make a bigger deal out of this macro than it
> already is. Unlike the wide int constructors, it's
> an implementation detail that, when correct, almost no-one will
> have to worry about. The main reason for my strenuous objections
> is not the macro itself but the philosophy that performance,
> especially at -O0, should be an overriding consideration. Code
> should be safe first and foremost. Otherwise, the few cycles we
> might save by writing unsafe but fast code will be wasted in
> debugging sessions.
But the macro was originally added to improve release builds,
not -O0 builds. It replaced plain assignments of the form:
r.coeffs[0] = ...;
Using an inline function instead of a macro is no better than
the original call to operator=(); see the mem_ref_offset figures
I gave earlier.
Thanks,
Richard
[*] Which were:
1) Not all types used with poly_int have a single meaningful initial
value (wide_int).
2) It prevents useful static warnings about uninitialised variables.
The fact that we don't warn in every case doesn't defeat this IMO.
3) Using C++11 "= default" is a good compromise, but making poly_ints
always initialised by default now would make it too dangerous to
switch to "= default" in future.
4) Conditionally using "= default" when being built with C++11
compilers and something else when being built with C++03 compilers
would be too dangerous, since we don't want the semantics of the
class to depend on host compiler.
5) AFAIK, the only case that would be handled differently by the
current "() {}" constructor and "= default" is the use of
"T ()" to construct a zeroed T. IMO, "T (0)" is better than "T ()"
anyway because (a) it makes it obvious that you're initialising it to
the numerical value 0, rather than simply zeroed memory contents, and
(b) it will give a compile error if T doesn't have a single zero
representation (as for wide_int).
Those are all independent of whatever the -O0 performance regression
would be from unconditional initialisation.
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-13 23:57 ` Richard Sandiford
2017-11-14 1:21 ` Martin Sebor
@ 2017-11-17 3:31 ` Jeff Law
1 sibling, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-17 3:31 UTC (permalink / raw)
To: Martin Sebor, gcc-patches, richard.sandiford
On 11/13/2017 04:36 PM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 11/09/2017 04:06 AM, Richard Sandiford wrote:
>>
>>>> Let me say at the outset that I struggle to comprehend that a few
>>>> instructions is even a consideration when not optimizing, especially
>>>> in light of the bug the macro caused that would have been prevented
>>>> by using a function instead. But...
>>>
>>> Many people still build at -O0 though. One of the things I was asked
>>> for was the time it takes to build stage 2 with an -O0 stage 1
>>> (where stage 1 would usually be built by the host compiler).
>> I suspect folks are concerned about this because it potentially affects
>> their daily development cycle times. So they're looking to see if the
>> introduction of the poly types has a significant impact. It's a
>> legitimate question, particularly for the introduction of low level
>> infrastructure that potentially gets hit a lot.
>>
>> Richard, what were the results of that test (if it's elsewhere in the
>> thread I'll eventually find it...
>
> On an x86_64 box I got:
>
> real: +7%
> user: +8.6%
>
> for building stage2 with an -O0 -g stage1. For aarch64 with the
> NUM_POLY_INT_COEFFS==2 change it was:
>
> real: +17%
> user: +20%
>
> That's obviously not ideal, but C++11 would get rid of some of the
> inefficiencies, once we can switch to that.
Ouch. But I guess to some extent it has to be expected given what
you've got to do under the hood.
>
> You've probably already seen this, but it's compile-time neutral on
> x86_64 in terms of running a gcc built with --enable-checking=release,
> within a margin of about [-0.1%, 0.1%].
Good. Presumably that's because it all just falls out and turns into
wi:: stuff on targets that don't need the poly stuff.
>
> For aarch64 with NUM_POLY_INT_COEFFS==2, a gcc built with
> --enable-checking=release is ~1% slower when using -g and ~2%
> slower with -O2 -g.
That's not terrible given what's going on here.
I'm still pondering the whole construction/initialization and temporary
objects issue. I may try to work through some of the actual patches,
then come back to those issues.
>
>> I'm just starting to try and make some headway on this kit).
>
> Thanks :-) I guess it's going to be a real slog going through them,
> sorry, even despite the attempt to split them up.
No worries. It's what we sign up for :-) Your deep testing and long
history with the project really help in that if something goes wrong I
know you're going to be around to fix it.
Jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-10-23 16:58 ` [001/nnn] poly_int: add poly-int.h Richard Sandiford
2017-10-25 16:17 ` Martin Sebor
@ 2017-11-08 10:03 ` Richard Sandiford
2017-11-14 0:42 ` Richard Sandiford
1 sibling, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-11-08 10:03 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 4555 bytes --]
Richard Sandiford <richard.sandiford@linaro.org> writes:
> This patch adds a new "poly_int" class to represent polynomial integers
> of the form:
>
> C0 + C1*X1 + C2*X2 ... + Cn*Xn
>
> It also adds poly_int-based typedefs for offsets and sizes of various
> precisions. In these typedefs, the Ci coefficients are compile-time
> constants and the Xi indeterminates are run-time invariants. The number
> of coefficients is controlled by the target and is initially 1 for all
> ports.
>
> Most routines can handle general coefficient counts, but for now a few
> are specific to one or two coefficients. Support for other coefficient
> counts can be added when needed.
>
> The patch also adds a new macro, IN_TARGET_CODE, that can be
> set to indicate that a TU contains target-specific rather than
> target-independent code. When this macro is set and the number of
> coefficients is 1, the poly-int.h classes define a conversion operator
> to a constant. This allows most existing target code to work without
> modification. The main exceptions are:
>
> - values passed through ..., which need an explicit conversion to a
> constant
>
> - ?: expression in which one arm ends up being a polynomial and the
> other remains a constant. In these cases it would be valid to convert
> the constant to a polynomial and the polynomial to a constant, so a
> cast is needed to break the ambiguity.
>
> The patch also adds a new target hook to return the estimated
> value of a polynomial for costing purposes.
>
> The patch also adds operator<< on wide_ints (it was already defined
> for offset_int and widest_int). I think this was originally excluded
> because >> is ambiguous for wide_int, but << is useful for converting
> bytes to bits, etc., so is worth defining on its own. The patch also
> adds operator% and operator/ for offset_int and widest_int, since those
> types are always signed. These changes allow the poly_int interface to
> be more predictable.
>
> I'd originally tried adding the tests as selftests, but that ended up
> bloating cc1 by at least a third. It also took a while to build them
> at -O2. The patch therefore uses plugin tests instead, where we can
> force the tests to be built at -O0. They still run in negligible time
> when built that way.
Changes in v2:
- Drop the controversial known_zero etc. wrapper functions.
- Fix the operator<<= bug that Martin found.
- Switch from "t" to "type" in SFINAE classes (requested by Martin).
Not changed in v2:
- Default constructors are still empty. I agree it makes sense to use
"= default" when we switch to C++11, but it would be dangerous for
that to make "poly_int64 x;" less defined than it is now.
Tested as before.
Thanks,
Richard
2017-11-08 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* poly-int.h: New file.
* poly-int-types.h: Likewise.
* coretypes.h: Include them.
(POLY_INT_CONVERSION): Define.
* target.def (estimated_poly_value): New hook.
* doc/tm.texi.in (TARGET_ESTIMATED_POLY_VALUE): New hook.
* doc/tm.texi: Regenerate.
* doc/poly-int.texi: New file.
* doc/gccint.texi: Include it.
* doc/rtl.texi: Describe restrictions on subreg modes.
* Makefile.in (TEXI_GCCINT_FILES): Add poly-int.texi.
* genmodes.c (NUM_POLY_INT_COEFFS): Provide a default definition.
(emit_insn_modes_h): Emit a definition of NUM_POLY_INT_COEFFS.
* targhooks.h (default_estimated_poly_value): Declare.
* targhooks.c (default_estimated_poly_value): New function.
* target.h (estimated_poly_value): Likewise.
* wide-int.h (WI_UNARY_RESULT): Use wi::binary_traits.
(wi::unary_traits): Delete.
(wi::binary_traits::signed_shift_result_type): Define for
offset_int << HOST_WIDE_INT, etc.
(generic_wide_int::operator <<=): Define for all types and use
wi::lshift instead of <<.
(wi::hwi_with_prec): Add a default constructor.
(wi::ints_for): New class.
(operator <<): Define for all wide-int types.
(operator /): New function.
(operator %): Likewise.
* selftest.h (ASSERT_MUST_EQ, ASSERT_MUST_EQ_AT, ASSERT_MAY_NE)
(ASSERT_MAY_NE_AT): New macros.
gcc/testsuite/
* gcc.dg/plugin/poly-int-tests.h,
gcc.dg/plugin/poly-int-test-1.c,
gcc.dg/plugin/poly-int-01_plugin.c,
gcc.dg/plugin/poly-int-02_plugin.c,
gcc.dg/plugin/poly-int-03_plugin.c,
gcc.dg/plugin/poly-int-04_plugin.c,
gcc.dg/plugin/poly-int-05_plugin.c,
gcc.dg/plugin/poly-int-06_plugin.c,
gcc.dg/plugin/poly-int-07_plugin.c: New tests.
* gcc.dg/plugin/plugin.exp: Run them.
[-- Attachment #2: poly-001-poly-int-h.diff.gz --]
[-- Type: application/gzip, Size: 47953 bytes --]
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-08 10:03 ` Richard Sandiford
@ 2017-11-14 0:42 ` Richard Sandiford
2017-12-06 20:11 ` Jeff Law
0 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-11-14 0:42 UTC (permalink / raw)
To: gcc-patches
Richard Sandiford <richard.sandiford@linaro.org> writes:
> Richard Sandiford <richard.sandiford@linaro.org> writes:
>> This patch adds a new "poly_int" class to represent polynomial integers
>> of the form:
>>
>> C0 + C1*X1 + C2*X2 ... + Cn*Xn
>>
>> It also adds poly_int-based typedefs for offsets and sizes of various
>> precisions. In these typedefs, the Ci coefficients are compile-time
>> constants and the Xi indeterminates are run-time invariants. The number
>> of coefficients is controlled by the target and is initially 1 for all
>> ports.
>>
>> Most routines can handle general coefficient counts, but for now a few
>> are specific to one or two coefficients. Support for other coefficient
>> counts can be added when needed.
>>
>> The patch also adds a new macro, IN_TARGET_CODE, that can be
>> set to indicate that a TU contains target-specific rather than
>> target-independent code. When this macro is set and the number of
>> coefficients is 1, the poly-int.h classes define a conversion operator
>> to a constant. This allows most existing target code to work without
>> modification. The main exceptions are:
>>
>> - values passed through ..., which need an explicit conversion to a
>> constant
>>
>> - ?: expression in which one arm ends up being a polynomial and the
>> other remains a constant. In these cases it would be valid to convert
>> the constant to a polynomial and the polynomial to a constant, so a
>> cast is needed to break the ambiguity.
>>
>> The patch also adds a new target hook to return the estimated
>> value of a polynomial for costing purposes.
>>
>> The patch also adds operator<< on wide_ints (it was already defined
>> for offset_int and widest_int). I think this was originally excluded
>> because >> is ambiguous for wide_int, but << is useful for converting
>> bytes to bits, etc., so is worth defining on its own. The patch also
>> adds operator% and operator/ for offset_int and widest_int, since those
>> types are always signed. These changes allow the poly_int interface to
>> be more predictable.
>>
>> I'd originally tried adding the tests as selftests, but that ended up
>> bloating cc1 by at least a third. It also took a while to build them
>> at -O2. The patch therefore uses plugin tests instead, where we can
>> force the tests to be built at -O0. They still run in negligible time
>> when built that way.
>
> Changes in v2:
>
> - Drop the controversial known_zero etc. wrapper functions.
> - Fix the operator<<= bug that Martin found.
> - Switch from "t" to "type" in SFINAE classes (requested by Martin).
>
> Not changed in v2:
>
> - Default constructors are still empty. I agree it makes sense to use
> "= default" when we switch to C++11, but it would be dangerous for
> that to make "poly_int64 x;" less defined than it is now.
After talking about this a bit more internally, it was obvious that
the choice of "must" and "may" for the predicate names was a common
sticking point. The idea was to match the names of alias predicates,
but given my track record with names ("too_empty_p" being a recently
questioned example :-)), I'd be happy to rename them to something else.
Some alternatives we came up with were:
- known_eq / maybe_eq / known_lt / maybe_lt etc.
Some functions already use "known" and "maybe", so this would arguably
be more consistent than using "must" and "may".
- always_eq / sometimes_eq / always_lt / sometimes_lt
Similar to the previous one in intent. It's just a question of which
wordng is clearer.
- forall_eq / exists_eq / forall_lt / exists_lt etc.
Matches the usual logic quantifiers. This seems quite appealing,
as long as it's obvious that in:
forall_eq (v0, v1)
v0 and v1 themselves are already bound: if vi == ai + bi*X then
what we really saying is:
forall X, a0 + b0*X == a1 + b1*X
Which of those sounds best? Any other suggestions?
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-11-14 0:42 ` Richard Sandiford
@ 2017-12-06 20:11 ` Jeff Law
2017-12-07 14:46 ` Richard Biener
0 siblings, 1 reply; 302+ messages in thread
From: Jeff Law @ 2017-12-06 20:11 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 11/13/2017 05:04 PM, Richard Sandiford wrote:
> Richard Sandiford <richard.sandiford@linaro.org> writes:
>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>> This patch adds a new "poly_int" class to represent polynomial integers
>>> of the form:
>>>
>>> C0 + C1*X1 + C2*X2 ... + Cn*Xn
>>>
>>> It also adds poly_int-based typedefs for offsets and sizes of various
>>> precisions. In these typedefs, the Ci coefficients are compile-time
>>> constants and the Xi indeterminates are run-time invariants. The number
>>> of coefficients is controlled by the target and is initially 1 for all
>>> ports.
>>>
>>> Most routines can handle general coefficient counts, but for now a few
>>> are specific to one or two coefficients. Support for other coefficient
>>> counts can be added when needed.
>>>
>>> The patch also adds a new macro, IN_TARGET_CODE, that can be
>>> set to indicate that a TU contains target-specific rather than
>>> target-independent code. When this macro is set and the number of
>>> coefficients is 1, the poly-int.h classes define a conversion operator
>>> to a constant. This allows most existing target code to work without
>>> modification. The main exceptions are:
>>>
>>> - values passed through ..., which need an explicit conversion to a
>>> constant
>>>
>>> - ?: expression in which one arm ends up being a polynomial and the
>>> other remains a constant. In these cases it would be valid to convert
>>> the constant to a polynomial and the polynomial to a constant, so a
>>> cast is needed to break the ambiguity.
>>>
>>> The patch also adds a new target hook to return the estimated
>>> value of a polynomial for costing purposes.
>>>
>>> The patch also adds operator<< on wide_ints (it was already defined
>>> for offset_int and widest_int). I think this was originally excluded
>>> because >> is ambiguous for wide_int, but << is useful for converting
>>> bytes to bits, etc., so is worth defining on its own. The patch also
>>> adds operator% and operator/ for offset_int and widest_int, since those
>>> types are always signed. These changes allow the poly_int interface to
>>> be more predictable.
>>>
>>> I'd originally tried adding the tests as selftests, but that ended up
>>> bloating cc1 by at least a third. It also took a while to build them
>>> at -O2. The patch therefore uses plugin tests instead, where we can
>>> force the tests to be built at -O0. They still run in negligible time
>>> when built that way.
>>
>> Changes in v2:
>>
>> - Drop the controversial known_zero etc. wrapper functions.
>> - Fix the operator<<= bug that Martin found.
>> - Switch from "t" to "type" in SFINAE classes (requested by Martin).
>>
>> Not changed in v2:
>>
>> - Default constructors are still empty. I agree it makes sense to use
>> "= default" when we switch to C++11, but it would be dangerous for
>> that to make "poly_int64 x;" less defined than it is now.
>
> After talking about this a bit more internally, it was obvious that
> the choice of "must" and "may" for the predicate names was a common
> sticking point. The idea was to match the names of alias predicates,
> but given my track record with names ("too_empty_p" being a recently
> questioned example :-)), I'd be happy to rename them to something else.
> Some alternatives we came up with were:
I didn't find the must vs may naming problematical as I was going
through the changes. What I did find much more difficult was
determining if the behavior was correct when we used a "may" predicate.
It really relies a good deal on knowing the surrounding code.
In places where I knew the code reasonably well could tell without much
surrounding context. In other places I had to look at the code and
deduce proper behavior in the "may" cases -- and often I resorted to
spot checking and relying on your reputation & testing to DTRT.
>
> - known_eq / maybe_eq / known_lt / maybe_lt etc.
>
> Some functions already use "known" and "maybe", so this would arguably
> be more consistent than using "must" and "may".
>
> - always_eq / sometimes_eq / always_lt / sometimes_lt
>
> Similar to the previous one in intent. It's just a question of which
> wordng is clearer.
>
> - forall_eq / exists_eq / forall_lt / exists_lt etc.
>
> Matches the usual logic quantifiers. This seems quite appealing,
> as long as it's obvious that in:
>
> forall_eq (v0, v1)
>
> v0 and v1 themselves are already bound: if vi == ai + bi*X then
> what we really saying is:
>
> forall X, a0 + b0*X == a1 + b1*X
>
> Which of those sounds best? Any other suggestions?
I can live with any of them. I tend to prefer one of the first two, but
it's not a major concern for me. So if you or others have a clear
preference, go with it.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-12-06 20:11 ` Jeff Law
@ 2017-12-07 14:46 ` Richard Biener
2017-12-07 15:08 ` Jeff Law
0 siblings, 1 reply; 302+ messages in thread
From: Richard Biener @ 2017-12-07 14:46 UTC (permalink / raw)
To: Jeff Law; +Cc: GCC Patches, Richard Sandiford
On Wed, Dec 6, 2017 at 9:11 PM, Jeff Law <law@redhat.com> wrote:
> On 11/13/2017 05:04 PM, Richard Sandiford wrote:
>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>>> This patch adds a new "poly_int" class to represent polynomial integers
>>>> of the form:
>>>>
>>>> C0 + C1*X1 + C2*X2 ... + Cn*Xn
>>>>
>>>> It also adds poly_int-based typedefs for offsets and sizes of various
>>>> precisions. In these typedefs, the Ci coefficients are compile-time
>>>> constants and the Xi indeterminates are run-time invariants. The number
>>>> of coefficients is controlled by the target and is initially 1 for all
>>>> ports.
>>>>
>>>> Most routines can handle general coefficient counts, but for now a few
>>>> are specific to one or two coefficients. Support for other coefficient
>>>> counts can be added when needed.
>>>>
>>>> The patch also adds a new macro, IN_TARGET_CODE, that can be
>>>> set to indicate that a TU contains target-specific rather than
>>>> target-independent code. When this macro is set and the number of
>>>> coefficients is 1, the poly-int.h classes define a conversion operator
>>>> to a constant. This allows most existing target code to work without
>>>> modification. The main exceptions are:
>>>>
>>>> - values passed through ..., which need an explicit conversion to a
>>>> constant
>>>>
>>>> - ?: expression in which one arm ends up being a polynomial and the
>>>> other remains a constant. In these cases it would be valid to convert
>>>> the constant to a polynomial and the polynomial to a constant, so a
>>>> cast is needed to break the ambiguity.
>>>>
>>>> The patch also adds a new target hook to return the estimated
>>>> value of a polynomial for costing purposes.
>>>>
>>>> The patch also adds operator<< on wide_ints (it was already defined
>>>> for offset_int and widest_int). I think this was originally excluded
>>>> because >> is ambiguous for wide_int, but << is useful for converting
>>>> bytes to bits, etc., so is worth defining on its own. The patch also
>>>> adds operator% and operator/ for offset_int and widest_int, since those
>>>> types are always signed. These changes allow the poly_int interface to
>>>> be more predictable.
>>>>
>>>> I'd originally tried adding the tests as selftests, but that ended up
>>>> bloating cc1 by at least a third. It also took a while to build them
>>>> at -O2. The patch therefore uses plugin tests instead, where we can
>>>> force the tests to be built at -O0. They still run in negligible time
>>>> when built that way.
>>>
>>> Changes in v2:
>>>
>>> - Drop the controversial known_zero etc. wrapper functions.
>>> - Fix the operator<<= bug that Martin found.
>>> - Switch from "t" to "type" in SFINAE classes (requested by Martin).
>>>
>>> Not changed in v2:
>>>
>>> - Default constructors are still empty. I agree it makes sense to use
>>> "= default" when we switch to C++11, but it would be dangerous for
>>> that to make "poly_int64 x;" less defined than it is now.
>>
>> After talking about this a bit more internally, it was obvious that
>> the choice of "must" and "may" for the predicate names was a common
>> sticking point. The idea was to match the names of alias predicates,
>> but given my track record with names ("too_empty_p" being a recently
>> questioned example :-)), I'd be happy to rename them to something else.
>> Some alternatives we came up with were:
> I didn't find the must vs may naming problematical as I was going
> through the changes. What I did find much more difficult was
> determining if the behavior was correct when we used a "may" predicate.
> It really relies a good deal on knowing the surrounding code.
>
> In places where I knew the code reasonably well could tell without much
> surrounding context. In other places I had to look at the code and
> deduce proper behavior in the "may" cases -- and often I resorted to
> spot checking and relying on your reputation & testing to DTRT.
>
>
>>
>> - known_eq / maybe_eq / known_lt / maybe_lt etc.
>>
>> Some functions already use "known" and "maybe", so this would arguably
>> be more consistent than using "must" and "may".
>>
>> - always_eq / sometimes_eq / always_lt / sometimes_lt
>>
>> Similar to the previous one in intent. It's just a question of which
>> wordng is clearer.
>>
>> - forall_eq / exists_eq / forall_lt / exists_lt etc.
>>
>> Matches the usual logic quantifiers. This seems quite appealing,
>> as long as it's obvious that in:
>>
>> forall_eq (v0, v1)
>>
>> v0 and v1 themselves are already bound: if vi == ai + bi*X then
>> what we really saying is:
>>
>> forall X, a0 + b0*X == a1 + b1*X
>>
>> Which of those sounds best? Any other suggestions?
> I can live with any of them. I tend to prefer one of the first two, but
> it's not a major concern for me. So if you or others have a clear
> preference, go with it.
Whatever you do use a consistent naming which I guess means
using known_eq / maybe_eq?
Otherwise ok.
Richard.
>
> jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-12-07 14:46 ` Richard Biener
@ 2017-12-07 15:08 ` Jeff Law
2017-12-07 22:39 ` Richard Sandiford
0 siblings, 1 reply; 302+ messages in thread
From: Jeff Law @ 2017-12-07 15:08 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches, Richard Sandiford
On 12/07/2017 07:46 AM, Richard Biener wrote:
> On Wed, Dec 6, 2017 at 9:11 PM, Jeff Law <law@redhat.com> wrote:
>> On 11/13/2017 05:04 PM, Richard Sandiford wrote:
>>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>>>> This patch adds a new "poly_int" class to represent polynomial integers
>>>>> of the form:
>>>>>
>>>>> C0 + C1*X1 + C2*X2 ... + Cn*Xn
>>>>>
>>>>> It also adds poly_int-based typedefs for offsets and sizes of various
>>>>> precisions. In these typedefs, the Ci coefficients are compile-time
>>>>> constants and the Xi indeterminates are run-time invariants. The number
>>>>> of coefficients is controlled by the target and is initially 1 for all
>>>>> ports.
>>>>>
>>>>> Most routines can handle general coefficient counts, but for now a few
>>>>> are specific to one or two coefficients. Support for other coefficient
>>>>> counts can be added when needed.
>>>>>
>>>>> The patch also adds a new macro, IN_TARGET_CODE, that can be
>>>>> set to indicate that a TU contains target-specific rather than
>>>>> target-independent code. When this macro is set and the number of
>>>>> coefficients is 1, the poly-int.h classes define a conversion operator
>>>>> to a constant. This allows most existing target code to work without
>>>>> modification. The main exceptions are:
>>>>>
>>>>> - values passed through ..., which need an explicit conversion to a
>>>>> constant
>>>>>
>>>>> - ?: expression in which one arm ends up being a polynomial and the
>>>>> other remains a constant. In these cases it would be valid to convert
>>>>> the constant to a polynomial and the polynomial to a constant, so a
>>>>> cast is needed to break the ambiguity.
>>>>>
>>>>> The patch also adds a new target hook to return the estimated
>>>>> value of a polynomial for costing purposes.
>>>>>
>>>>> The patch also adds operator<< on wide_ints (it was already defined
>>>>> for offset_int and widest_int). I think this was originally excluded
>>>>> because >> is ambiguous for wide_int, but << is useful for converting
>>>>> bytes to bits, etc., so is worth defining on its own. The patch also
>>>>> adds operator% and operator/ for offset_int and widest_int, since those
>>>>> types are always signed. These changes allow the poly_int interface to
>>>>> be more predictable.
>>>>>
>>>>> I'd originally tried adding the tests as selftests, but that ended up
>>>>> bloating cc1 by at least a third. It also took a while to build them
>>>>> at -O2. The patch therefore uses plugin tests instead, where we can
>>>>> force the tests to be built at -O0. They still run in negligible time
>>>>> when built that way.
>>>>
>>>> Changes in v2:
>>>>
>>>> - Drop the controversial known_zero etc. wrapper functions.
>>>> - Fix the operator<<= bug that Martin found.
>>>> - Switch from "t" to "type" in SFINAE classes (requested by Martin).
>>>>
>>>> Not changed in v2:
>>>>
>>>> - Default constructors are still empty. I agree it makes sense to use
>>>> "= default" when we switch to C++11, but it would be dangerous for
>>>> that to make "poly_int64 x;" less defined than it is now.
>>>
>>> After talking about this a bit more internally, it was obvious that
>>> the choice of "must" and "may" for the predicate names was a common
>>> sticking point. The idea was to match the names of alias predicates,
>>> but given my track record with names ("too_empty_p" being a recently
>>> questioned example :-)), I'd be happy to rename them to something else.
>>> Some alternatives we came up with were:
>> I didn't find the must vs may naming problematical as I was going
>> through the changes. What I did find much more difficult was
>> determining if the behavior was correct when we used a "may" predicate.
>> It really relies a good deal on knowing the surrounding code.
>>
>> In places where I knew the code reasonably well could tell without much
>> surrounding context. In other places I had to look at the code and
>> deduce proper behavior in the "may" cases -- and often I resorted to
>> spot checking and relying on your reputation & testing to DTRT.
>>
>>
>>>
>>> - known_eq / maybe_eq / known_lt / maybe_lt etc.
>>>
>>> Some functions already use "known" and "maybe", so this would arguably
>>> be more consistent than using "must" and "may".
>>>
>>> - always_eq / sometimes_eq / always_lt / sometimes_lt
>>>
>>> Similar to the previous one in intent. It's just a question of which
>>> wordng is clearer.
>>>
>>> - forall_eq / exists_eq / forall_lt / exists_lt etc.
>>>
>>> Matches the usual logic quantifiers. This seems quite appealing,
>>> as long as it's obvious that in:
>>>
>>> forall_eq (v0, v1)
>>>
>>> v0 and v1 themselves are already bound: if vi == ai + bi*X then
>>> what we really saying is:
>>>
>>> forall X, a0 + b0*X == a1 + b1*X
>>>
>>> Which of those sounds best? Any other suggestions?
>> I can live with any of them. I tend to prefer one of the first two, but
>> it's not a major concern for me. So if you or others have a clear
>> preference, go with it.
>
> Whatever you do use a consistent naming which I guess means
> using known_eq / maybe_eq?
>
> Otherwise ok.
So I think that's the final ack on this series. Richard S. can you
confirm? I fully expect the trunk has moved some and the patches will
need adjustments -- consider adjustments which work in a manner similar
to the patches to date pre-approved.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-12-07 15:08 ` Jeff Law
@ 2017-12-07 22:39 ` Richard Sandiford
2017-12-07 22:48 ` Jeff Law
0 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-12-07 22:39 UTC (permalink / raw)
To: Jeff Law; +Cc: Richard Biener, GCC Patches
Jeff Law <law@redhat.com> writes:
> On 12/07/2017 07:46 AM, Richard Biener wrote:
>> On Wed, Dec 6, 2017 at 9:11 PM, Jeff Law <law@redhat.com> wrote:
>>> On 11/13/2017 05:04 PM, Richard Sandiford wrote:
>>>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>>>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>>>>> This patch adds a new "poly_int" class to represent polynomial integers
>>>>>> of the form:
>>>>>>
>>>>>> C0 + C1*X1 + C2*X2 ... + Cn*Xn
>>>>>>
>>>>>> It also adds poly_int-based typedefs for offsets and sizes of various
>>>>>> precisions. In these typedefs, the Ci coefficients are compile-time
>>>>>> constants and the Xi indeterminates are run-time invariants. The number
>>>>>> of coefficients is controlled by the target and is initially 1 for all
>>>>>> ports.
>>>>>>
>>>>>> Most routines can handle general coefficient counts, but for now a few
>>>>>> are specific to one or two coefficients. Support for other coefficient
>>>>>> counts can be added when needed.
>>>>>>
>>>>>> The patch also adds a new macro, IN_TARGET_CODE, that can be
>>>>>> set to indicate that a TU contains target-specific rather than
>>>>>> target-independent code. When this macro is set and the number of
>>>>>> coefficients is 1, the poly-int.h classes define a conversion operator
>>>>>> to a constant. This allows most existing target code to work without
>>>>>> modification. The main exceptions are:
>>>>>>
>>>>>> - values passed through ..., which need an explicit conversion to a
>>>>>> constant
>>>>>>
>>>>>> - ?: expression in which one arm ends up being a polynomial and the
>>>>>> other remains a constant. In these cases it would be valid to convert
>>>>>> the constant to a polynomial and the polynomial to a constant, so a
>>>>>> cast is needed to break the ambiguity.
>>>>>>
>>>>>> The patch also adds a new target hook to return the estimated
>>>>>> value of a polynomial for costing purposes.
>>>>>>
>>>>>> The patch also adds operator<< on wide_ints (it was already defined
>>>>>> for offset_int and widest_int). I think this was originally excluded
>>>>>> because >> is ambiguous for wide_int, but << is useful for converting
>>>>>> bytes to bits, etc., so is worth defining on its own. The patch also
>>>>>> adds operator% and operator/ for offset_int and widest_int, since those
>>>>>> types are always signed. These changes allow the poly_int interface to
>>>>>> be more predictable.
>>>>>>
>>>>>> I'd originally tried adding the tests as selftests, but that ended up
>>>>>> bloating cc1 by at least a third. It also took a while to build them
>>>>>> at -O2. The patch therefore uses plugin tests instead, where we can
>>>>>> force the tests to be built at -O0. They still run in negligible time
>>>>>> when built that way.
>>>>>
>>>>> Changes in v2:
>>>>>
>>>>> - Drop the controversial known_zero etc. wrapper functions.
>>>>> - Fix the operator<<= bug that Martin found.
>>>>> - Switch from "t" to "type" in SFINAE classes (requested by Martin).
>>>>>
>>>>> Not changed in v2:
>>>>>
>>>>> - Default constructors are still empty. I agree it makes sense to use
>>>>> "= default" when we switch to C++11, but it would be dangerous for
>>>>> that to make "poly_int64 x;" less defined than it is now.
>>>>
>>>> After talking about this a bit more internally, it was obvious that
>>>> the choice of "must" and "may" for the predicate names was a common
>>>> sticking point. The idea was to match the names of alias predicates,
>>>> but given my track record with names ("too_empty_p" being a recently
>>>> questioned example :-)), I'd be happy to rename them to something else.
>>>> Some alternatives we came up with were:
>>> I didn't find the must vs may naming problematical as I was going
>>> through the changes. What I did find much more difficult was
>>> determining if the behavior was correct when we used a "may" predicate.
>>> It really relies a good deal on knowing the surrounding code.
>>>
>>> In places where I knew the code reasonably well could tell without much
>>> surrounding context. In other places I had to look at the code and
>>> deduce proper behavior in the "may" cases -- and often I resorted to
>>> spot checking and relying on your reputation & testing to DTRT.
>>>
>>>
>>>>
>>>> - known_eq / maybe_eq / known_lt / maybe_lt etc.
>>>>
>>>> Some functions already use "known" and "maybe", so this would arguably
>>>> be more consistent than using "must" and "may".
>>>>
>>>> - always_eq / sometimes_eq / always_lt / sometimes_lt
>>>>
>>>> Similar to the previous one in intent. It's just a question of which
>>>> wordng is clearer.
>>>>
>>>> - forall_eq / exists_eq / forall_lt / exists_lt etc.
>>>>
>>>> Matches the usual logic quantifiers. This seems quite appealing,
>>>> as long as it's obvious that in:
>>>>
>>>> forall_eq (v0, v1)
>>>>
>>>> v0 and v1 themselves are already bound: if vi == ai + bi*X then
>>>> what we really saying is:
>>>>
>>>> forall X, a0 + b0*X == a1 + b1*X
>>>>
>>>> Which of those sounds best? Any other suggestions?
>>> I can live with any of them. I tend to prefer one of the first two, but
>>> it's not a major concern for me. So if you or others have a clear
>>> preference, go with it.
>>
>> Whatever you do use a consistent naming which I guess means
>> using known_eq / maybe_eq?
>>
>> Otherwise ok.
> So I think that's the final ack on this series.
Thanks to both of you, really appreciate it!
> Richard S. can you confirm? I fully expect the trunk has moved some
> and the patches will need adjustments -- consider adjustments which
> work in a manner similar to the patches to date pre-approved.
Yeah, that's now all of the poly_int patches. I still owe you replies
to some of them -- I'll get to that as soon as I can.
I'll make the name changes and propagate through the series and then
commit this first patch. I was thinking that for the rest it would
make sense to commit them individually, with individual testing of
each patch, so that it's easier to bisect. I'll try to make sure
I don't repeat the merge mistake in the machine-mode series.
I think it'd also make sense to divide the commits up into groups rather
than do them all at once, since it's easier to do the individual testing
that way. Does that sound OK?
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-12-07 22:39 ` Richard Sandiford
@ 2017-12-07 22:48 ` Jeff Law
2017-12-15 3:40 ` Martin Sebor
0 siblings, 1 reply; 302+ messages in thread
From: Jeff Law @ 2017-12-07 22:48 UTC (permalink / raw)
To: Richard Biener, GCC Patches, richard.sandiford
On 12/07/2017 03:38 PM, Richard Sandiford wrote:
>> So I think that's the final ack on this series.
>
> Thanks to both of you, really appreciate it!
Sorry it took so long.
>
>> Richard S. can you confirm? I fully expect the trunk has moved some
>> and the patches will need adjustments -- consider adjustments which
>> work in a manner similar to the patches to date pre-approved.
>
> Yeah, that's now all of the poly_int patches. I still owe you replies
> to some of them -- I'll get to that as soon as I can.
NP. I don't think any of the questions were all that significant.
Those which were I think you already responded to.
>
> I'll make the name changes and propagate through the series and then
> commit this first patch. I was thinking that for the rest it would
> make sense to commit them individually, with individual testing of
> each patch, so that it's easier to bisect. I'll try to make sure
> I don't repeat the merge mistake in the machine-mode series.
>
> I think it'd also make sense to divide the commits up into groups rather
> than do them all at once, since it's easier to do the individual testing
> that way. Does that sound OK?
Your call on the best way to stage in.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-12-07 22:48 ` Jeff Law
@ 2017-12-15 3:40 ` Martin Sebor
2017-12-15 9:08 ` Richard Biener
0 siblings, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-12-15 3:40 UTC (permalink / raw)
To: Jeff Law, Richard Biener, GCC Patches, richard.sandiford
On 12/07/2017 03:48 PM, Jeff Law wrote:
> On 12/07/2017 03:38 PM, Richard Sandiford wrote:
>
>>> So I think that's the final ack on this series.
>>
>> Thanks to both of you, really appreciate it!
> Sorry it took so long.
>
>>
>>> Richard S. can you confirm? I fully expect the trunk has moved some
>>> and the patches will need adjustments -- consider adjustments which
>>> work in a manner similar to the patches to date pre-approved.
>>
>> Yeah, that's now all of the poly_int patches. I still owe you replies
>> to some of them -- I'll get to that as soon as I can.
> NP. I don't think any of the questions were all that significant.
> Those which were I think you already responded to.
I am disappointed that the no-op ctor issue hasn't been adequately
addressed. No numbers were presented as to the difference it makes
to have the ctor do the expected thing (i.e., initialize the object).
In my view, the choice seems arbitrarily in favor of a hypothetical
performance improvement at -O0 without regard to the impact on
correctness. We have recently seen the adverse effects of similar
choices in other areas: the hash table insertion[*] and the related
offset_int initialization.
Martin
[*] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82977
PS To be clear, the numbers I asked for were those showing
the difference between a no-op ctor and one that initializes
the object to some determinate state, whatever that is. IIUC
the numbers in the following post show the aggregate slowdown
for many or most of the changes in the series, not just
the ctor. If the numbers were significant, I suggested
a solution to explicitly request a non-op ctor to make
the default safe and eliminate the overhead where it mattered.
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01028.html
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-12-15 3:40 ` Martin Sebor
@ 2017-12-15 9:08 ` Richard Biener
2017-12-15 15:19 ` Jeff Law
0 siblings, 1 reply; 302+ messages in thread
From: Richard Biener @ 2017-12-15 9:08 UTC (permalink / raw)
To: Martin Sebor; +Cc: Jeff Law, GCC Patches, Richard Sandiford
On Fri, Dec 15, 2017 at 4:40 AM, Martin Sebor <msebor@gmail.com> wrote:
> On 12/07/2017 03:48 PM, Jeff Law wrote:
>>
>> On 12/07/2017 03:38 PM, Richard Sandiford wrote:
>>
>>>> So I think that's the final ack on this series.
>>>
>>>
>>> Thanks to both of you, really appreciate it!
>>
>> Sorry it took so long.
>>
>>>
>>>> Richard S. can you confirm? I fully expect the trunk has moved some
>>>> and the patches will need adjustments -- consider adjustments which
>>>> work in a manner similar to the patches to date pre-approved.
>>>
>>>
>>> Yeah, that's now all of the poly_int patches. I still owe you replies
>>> to some of them -- I'll get to that as soon as I can.
>>
>> NP. I don't think any of the questions were all that significant.
>> Those which were I think you already responded to.
>
>
> I am disappointed that the no-op ctor issue hasn't been adequately
> addressed. No numbers were presented as to the difference it makes
> to have the ctor do the expected thing (i.e., initialize the object).
> In my view, the choice seems arbitrarily in favor of a hypothetical
> performance improvement at -O0 without regard to the impact on
> correctness. We have recently seen the adverse effects of similar
> choices in other areas: the hash table insertion[*] and the related
> offset_int initialization.
As were coming from a C code base not initializing stuff is what I expect.
I'm really surprised to see lot of default initialization done in places
where it only hurts compile-time (of GCC at least where we need to
optimize that away).
Richard.
> Martin
>
> [*] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82977
>
> PS To be clear, the numbers I asked for were those showing
> the difference between a no-op ctor and one that initializes
> the object to some determinate state, whatever that is. IIUC
> the numbers in the following post show the aggregate slowdown
> for many or most of the changes in the series, not just
> the ctor. If the numbers were significant, I suggested
> a solution to explicitly request a non-op ctor to make
> the default safe and eliminate the overhead where it mattered.
>
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01028.html
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [001/nnn] poly_int: add poly-int.h
2017-12-15 9:08 ` Richard Biener
@ 2017-12-15 15:19 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-12-15 15:19 UTC (permalink / raw)
To: Richard Biener, Martin Sebor; +Cc: GCC Patches, Richard Sandiford
On 12/15/2017 02:08 AM, Richard Biener wrote:
> On Fri, Dec 15, 2017 at 4:40 AM, Martin Sebor <msebor@gmail.com> wrote:
>> On 12/07/2017 03:48 PM, Jeff Law wrote:
>>>
>>> On 12/07/2017 03:38 PM, Richard Sandiford wrote:
>>>
>>>>> So I think that's the final ack on this series.
>>>>
>>>>
>>>> Thanks to both of you, really appreciate it!
>>>
>>> Sorry it took so long.
>>>
>>>>
>>>>> Richard S. can you confirm? I fully expect the trunk has moved some
>>>>> and the patches will need adjustments -- consider adjustments which
>>>>> work in a manner similar to the patches to date pre-approved.
>>>>
>>>>
>>>> Yeah, that's now all of the poly_int patches. I still owe you replies
>>>> to some of them -- I'll get to that as soon as I can.
>>>
>>> NP. I don't think any of the questions were all that significant.
>>> Those which were I think you already responded to.
>>
>>
>> I am disappointed that the no-op ctor issue hasn't been adequately
>> addressed. No numbers were presented as to the difference it makes
>> to have the ctor do the expected thing (i.e., initialize the object).
>> In my view, the choice seems arbitrarily in favor of a hypothetical
>> performance improvement at -O0 without regard to the impact on
>> correctness. We have recently seen the adverse effects of similar
>> choices in other areas: the hash table insertion[*] and the related
>> offset_int initialization.
>
> As were coming from a C code base not initializing stuff is what I expect.
> I'm really surprised to see lot of default initialization done in places
> where it only hurts compile-time (of GCC at least where we need to
> optimize that away).
I suspect a lot of the default initializations were done when Kaveh and
others were working to get us -Wuninitialized clean -- which happened
when uninitialized warnings were still done in RTL (flow.c).
I've long wished we had marked the initializations which were done
solely to make -Wuninitialized happy because it would be a good way to
measure progress on our analysis & optimization passes's ability to
prove the paths weren't executable.
WRT the nop ctor issue, I had a slight leaning towards initializing
them, but I certainly could argue either side. I think the long term
goal really should be to move to C++11 where it can be done right.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [002/nnn] poly_int: IN_TARGET_CODE
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
2017-10-23 16:58 ` [001/nnn] poly_int: add poly-int.h Richard Sandiford
@ 2017-10-23 16:59 ` Richard Sandiford
2017-11-17 3:35 ` Jeff Law
2017-10-23 17:00 ` [003/nnn] poly_int: MACRO_MODE Richard Sandiford
` (105 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 16:59 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 8110 bytes --]
This patch makes each target-specific TU define an IN_TARGET_CODE macro,
which is used to decide whether poly_int<1, C> should convert to C.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* genattrtab.c (write_header): Define IN_TARGET_CODE to 1 in the
target C file.
* genautomata.c (main): Likewise.
* genconditions.c (write_header): Likewise.
* genemit.c (main): Likewise.
* genextract.c (print_header): Likewise.
* genopinit.c (main): Likewise.
* genoutput.c (output_prologue): Likewise.
* genpeep.c (main): Likewise.
* genpreds.c (write_insn_preds_c): Likewise.
* genrecog.c (writer_header): Likewise.
* config/aarch64/aarch64-builtins.c (IN_TARGET_CODE): Define.
* config/aarch64/aarch64-c.c (IN_TARGET_CODE): Likewise.
* config/aarch64/aarch64.c (IN_TARGET_CODE): Likewise.
* config/aarch64/cortex-a57-fma-steering.c (IN_TARGET_CODE): Likewise.
* config/aarch64/driver-aarch64.c (IN_TARGET_CODE): Likewise.
* config/alpha/alpha.c (IN_TARGET_CODE): Likewise.
* config/alpha/driver-alpha.c (IN_TARGET_CODE): Likewise.
* config/arc/arc-c.c (IN_TARGET_CODE): Likewise.
* config/arc/arc.c (IN_TARGET_CODE): Likewise.
* config/arc/driver-arc.c (IN_TARGET_CODE): Likewise.
* config/arm/aarch-common.c (IN_TARGET_CODE): Likewise.
* config/arm/arm-builtins.c (IN_TARGET_CODE): Likewise.
* config/arm/arm-c.c (IN_TARGET_CODE): Likewise.
* config/arm/arm.c (IN_TARGET_CODE): Likewise.
* config/arm/driver-arm.c (IN_TARGET_CODE): Likewise.
* config/avr/avr-c.c (IN_TARGET_CODE): Likewise.
* config/avr/avr-devices.c (IN_TARGET_CODE): Likewise.
* config/avr/avr-log.c (IN_TARGET_CODE): Likewise.
* config/avr/avr.c (IN_TARGET_CODE): Likewise.
* config/avr/driver-avr.c (IN_TARGET_CODE): Likewise.
* config/avr/gen-avr-mmcu-specs.c (IN_TARGET_CODE): Likewise.
* config/bfin/bfin.c (IN_TARGET_CODE): Likewise.
* config/c6x/c6x.c (IN_TARGET_CODE): Likewise.
* config/cr16/cr16.c (IN_TARGET_CODE): Likewise.
* config/cris/cris.c (IN_TARGET_CODE): Likewise.
* config/darwin.c (IN_TARGET_CODE): Likewise.
* config/epiphany/epiphany.c (IN_TARGET_CODE): Likewise.
* config/epiphany/mode-switch-use.c (IN_TARGET_CODE): Likewise.
* config/epiphany/resolve-sw-modes.c (IN_TARGET_CODE): Likewise.
* config/fr30/fr30.c (IN_TARGET_CODE): Likewise.
* config/frv/frv.c (IN_TARGET_CODE): Likewise.
* config/ft32/ft32.c (IN_TARGET_CODE): Likewise.
* config/h8300/h8300.c (IN_TARGET_CODE): Likewise.
* config/i386/djgpp.c (IN_TARGET_CODE): Likewise.
* config/i386/driver-i386.c (IN_TARGET_CODE): Likewise.
* config/i386/driver-mingw32.c (IN_TARGET_CODE): Likewise.
* config/i386/host-cygwin.c (IN_TARGET_CODE): Likewise.
* config/i386/host-i386-darwin.c (IN_TARGET_CODE): Likewise.
* config/i386/host-mingw32.c (IN_TARGET_CODE): Likewise.
* config/i386/i386-c.c (IN_TARGET_CODE): Likewise.
* config/i386/i386.c (IN_TARGET_CODE): Likewise.
* config/i386/intelmic-mkoffload.c (IN_TARGET_CODE): Likewise.
* config/i386/msformat-c.c (IN_TARGET_CODE): Likewise.
* config/i386/winnt-cxx.c (IN_TARGET_CODE): Likewise.
* config/i386/winnt-stubs.c (IN_TARGET_CODE): Likewise.
* config/i386/winnt.c (IN_TARGET_CODE): Likewise.
* config/i386/x86-tune-sched-atom.c (IN_TARGET_CODE): Likewise.
* config/i386/x86-tune-sched-bd.c (IN_TARGET_CODE): Likewise.
* config/i386/x86-tune-sched-core.c (IN_TARGET_CODE): Likewise.
* config/i386/x86-tune-sched.c (IN_TARGET_CODE): Likewise.
* config/ia64/ia64-c.c (IN_TARGET_CODE): Likewise.
* config/ia64/ia64.c (IN_TARGET_CODE): Likewise.
* config/iq2000/iq2000.c (IN_TARGET_CODE): Likewise.
* config/lm32/lm32.c (IN_TARGET_CODE): Likewise.
* config/m32c/m32c-pragma.c (IN_TARGET_CODE): Likewise.
* config/m32c/m32c.c (IN_TARGET_CODE): Likewise.
* config/m32r/m32r.c (IN_TARGET_CODE): Likewise.
* config/m68k/m68k.c (IN_TARGET_CODE): Likewise.
* config/mcore/mcore.c (IN_TARGET_CODE): Likewise.
* config/microblaze/microblaze-c.c (IN_TARGET_CODE): Likewise.
* config/microblaze/microblaze.c (IN_TARGET_CODE): Likewise.
* config/mips/driver-native.c (IN_TARGET_CODE): Likewise.
* config/mips/frame-header-opt.c (IN_TARGET_CODE): Likewise.
* config/mips/mips.c (IN_TARGET_CODE): Likewise.
* config/mmix/mmix.c (IN_TARGET_CODE): Likewise.
* config/mn10300/mn10300.c (IN_TARGET_CODE): Likewise.
* config/moxie/moxie.c (IN_TARGET_CODE): Likewise.
* config/msp430/driver-msp430.c (IN_TARGET_CODE): Likewise.
* config/msp430/msp430-c.c (IN_TARGET_CODE): Likewise.
* config/msp430/msp430.c (IN_TARGET_CODE): Likewise.
* config/nds32/nds32-cost.c (IN_TARGET_CODE): Likewise.
* config/nds32/nds32-fp-as-gp.c (IN_TARGET_CODE): Likewise.
* config/nds32/nds32-intrinsic.c (IN_TARGET_CODE): Likewise.
* config/nds32/nds32-isr.c (IN_TARGET_CODE): Likewise.
* config/nds32/nds32-md-auxiliary.c (IN_TARGET_CODE): Likewise.
* config/nds32/nds32-memory-manipulation.c (IN_TARGET_CODE): Likewise.
* config/nds32/nds32-pipelines-auxiliary.c (IN_TARGET_CODE): Likewise.
* config/nds32/nds32-predicates.c (IN_TARGET_CODE): Likewise.
* config/nds32/nds32.c (IN_TARGET_CODE): Likewise.
* config/nios2/nios2.c (IN_TARGET_CODE): Likewise.
* config/nvptx/mkoffload.c (IN_TARGET_CODE): Likewise.
* config/nvptx/nvptx.c (IN_TARGET_CODE): Likewise.
* config/pa/pa.c (IN_TARGET_CODE): Likewise.
* config/pdp11/pdp11.c (IN_TARGET_CODE): Likewise.
* config/powerpcspe/driver-powerpcspe.c (IN_TARGET_CODE): Likewise.
* config/powerpcspe/host-darwin.c (IN_TARGET_CODE): Likewise.
* config/powerpcspe/host-ppc64-darwin.c (IN_TARGET_CODE): Likewise.
* config/powerpcspe/powerpcspe-c.c (IN_TARGET_CODE): Likewise.
* config/powerpcspe/powerpcspe-linux.c (IN_TARGET_CODE): Likewise.
* config/powerpcspe/powerpcspe.c (IN_TARGET_CODE): Likewise.
* config/riscv/riscv-builtins.c (IN_TARGET_CODE): Likewise.
* config/riscv/riscv-c.c (IN_TARGET_CODE): Likewise.
* config/riscv/riscv.c (IN_TARGET_CODE): Likewise.
* config/rl78/rl78-c.c (IN_TARGET_CODE): Likewise.
* config/rl78/rl78.c (IN_TARGET_CODE): Likewise.
* config/rs6000/driver-rs6000.c (IN_TARGET_CODE): Likewise.
* config/rs6000/host-darwin.c (IN_TARGET_CODE): Likewise.
* config/rs6000/host-ppc64-darwin.c (IN_TARGET_CODE): Likewise.
* config/rs6000/rs6000-c.c (IN_TARGET_CODE): Likewise.
* config/rs6000/rs6000-linux.c (IN_TARGET_CODE): Likewise.
* config/rs6000/rs6000-p8swap.c (IN_TARGET_CODE): Likewise.
* config/rs6000/rs6000-string.c (IN_TARGET_CODE): Likewise.
* config/rs6000/rs6000.c (IN_TARGET_CODE): Likewise.
* config/rx/rx.c (IN_TARGET_CODE): Likewise.
* config/s390/driver-native.c (IN_TARGET_CODE): Likewise.
* config/s390/s390-c.c (IN_TARGET_CODE): Likewise.
* config/s390/s390.c (IN_TARGET_CODE): Likewise.
* config/sh/sh-c.c (IN_TARGET_CODE): Likewise.
* config/sh/sh-mem.cc (IN_TARGET_CODE): Likewise.
* config/sh/sh.c (IN_TARGET_CODE): Likewise.
* config/sh/sh_optimize_sett_clrt.cc (IN_TARGET_CODE): Likewise.
* config/sh/sh_treg_combine.cc (IN_TARGET_CODE): Likewise.
* config/sparc/driver-sparc.c (IN_TARGET_CODE): Likewise.
* config/sparc/sparc-c.c (IN_TARGET_CODE): Likewise.
* config/sparc/sparc.c (IN_TARGET_CODE): Likewise.
* config/spu/spu-c.c (IN_TARGET_CODE): Likewise.
* config/spu/spu.c (IN_TARGET_CODE): Likewise.
* config/stormy16/stormy16.c (IN_TARGET_CODE): Likewise.
* config/tilegx/mul-tables.c (IN_TARGET_CODE): Likewise.
* config/tilegx/tilegx-c.c (IN_TARGET_CODE): Likewise.
* config/tilegx/tilegx.c (IN_TARGET_CODE): Likewise.
* config/tilepro/mul-tables.c (IN_TARGET_CODE): Likewise.
* config/tilepro/tilepro-c.c (IN_TARGET_CODE): Likewise.
* config/tilepro/tilepro.c (IN_TARGET_CODE): Likewise.
* config/v850/v850-c.c (IN_TARGET_CODE): Likewise.
* config/v850/v850.c (IN_TARGET_CODE): Likewise.
* config/vax/vax.c (IN_TARGET_CODE): Likewise.
* config/visium/visium.c (IN_TARGET_CODE): Likewise.
* config/vms/vms-c.c (IN_TARGET_CODE): Likewise.
* config/vms/vms-f.c (IN_TARGET_CODE): Likewise.
* config/vms/vms.c (IN_TARGET_CODE): Likewise.
* config/xtensa/xtensa.c (IN_TARGET_CODE): Likewise.
[-- Attachment #2: in-target-code.diff.bz2 --]
[-- Type: application/x-bzip2, Size: 4083 bytes --]
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [002/nnn] poly_int: IN_TARGET_CODE
2017-10-23 16:59 ` [002/nnn] poly_int: IN_TARGET_CODE Richard Sandiford
@ 2017-11-17 3:35 ` Jeff Law
2017-12-15 1:08 ` Richard Sandiford
0 siblings, 1 reply; 302+ messages in thread
From: Jeff Law @ 2017-11-17 3:35 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 10:58 AM, Richard Sandiford wrote:
> This patch makes each target-specific TU define an IN_TARGET_CODE macro,
> which is used to decide whether poly_int<1, C> should convert to C.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * genattrtab.c (write_header): Define IN_TARGET_CODE to 1 in the
> target C file.
> * genautomata.c (main): Likewise.
> * genconditions.c (write_header): Likewise.
> * genemit.c (main): Likewise.
> * genextract.c (print_header): Likewise.
> * genopinit.c (main): Likewise.
> * genoutput.c (output_prologue): Likewise.
> * genpeep.c (main): Likewise.
> * genpreds.c (write_insn_preds_c): Likewise.
> * genrecog.c (writer_header): Likewise.
> * config/aarch64/aarch64-builtins.c (IN_TARGET_CODE): Define.
> * config/aarch64/aarch64-c.c (IN_TARGET_CODE): Likewise.
> * config/aarch64/aarch64.c (IN_TARGET_CODE): Likewise.
> * config/aarch64/cortex-a57-fma-steering.c (IN_TARGET_CODE): Likewise.
> * config/aarch64/driver-aarch64.c (IN_TARGET_CODE): Likewise.
> * config/alpha/alpha.c (IN_TARGET_CODE): Likewise.
> * config/alpha/driver-alpha.c (IN_TARGET_CODE): Likewise.
> * config/arc/arc-c.c (IN_TARGET_CODE): Likewise.
> * config/arc/arc.c (IN_TARGET_CODE): Likewise.
> * config/arc/driver-arc.c (IN_TARGET_CODE): Likewise.
> * config/arm/aarch-common.c (IN_TARGET_CODE): Likewise.
> * config/arm/arm-builtins.c (IN_TARGET_CODE): Likewise.
> * config/arm/arm-c.c (IN_TARGET_CODE): Likewise.
> * config/arm/arm.c (IN_TARGET_CODE): Likewise.
> * config/arm/driver-arm.c (IN_TARGET_CODE): Likewise.
> * config/avr/avr-c.c (IN_TARGET_CODE): Likewise.
> * config/avr/avr-devices.c (IN_TARGET_CODE): Likewise.
> * config/avr/avr-log.c (IN_TARGET_CODE): Likewise.
> * config/avr/avr.c (IN_TARGET_CODE): Likewise.
> * config/avr/driver-avr.c (IN_TARGET_CODE): Likewise.
> * config/avr/gen-avr-mmcu-specs.c (IN_TARGET_CODE): Likewise.
> * config/bfin/bfin.c (IN_TARGET_CODE): Likewise.
> * config/c6x/c6x.c (IN_TARGET_CODE): Likewise.
> * config/cr16/cr16.c (IN_TARGET_CODE): Likewise.
> * config/cris/cris.c (IN_TARGET_CODE): Likewise.
> * config/darwin.c (IN_TARGET_CODE): Likewise.
> * config/epiphany/epiphany.c (IN_TARGET_CODE): Likewise.
> * config/epiphany/mode-switch-use.c (IN_TARGET_CODE): Likewise.
> * config/epiphany/resolve-sw-modes.c (IN_TARGET_CODE): Likewise.
> * config/fr30/fr30.c (IN_TARGET_CODE): Likewise.
> * config/frv/frv.c (IN_TARGET_CODE): Likewise.
> * config/ft32/ft32.c (IN_TARGET_CODE): Likewise.
> * config/h8300/h8300.c (IN_TARGET_CODE): Likewise.
> * config/i386/djgpp.c (IN_TARGET_CODE): Likewise.
> * config/i386/driver-i386.c (IN_TARGET_CODE): Likewise.
> * config/i386/driver-mingw32.c (IN_TARGET_CODE): Likewise.
> * config/i386/host-cygwin.c (IN_TARGET_CODE): Likewise.
> * config/i386/host-i386-darwin.c (IN_TARGET_CODE): Likewise.
> * config/i386/host-mingw32.c (IN_TARGET_CODE): Likewise.
> * config/i386/i386-c.c (IN_TARGET_CODE): Likewise.
> * config/i386/i386.c (IN_TARGET_CODE): Likewise.
> * config/i386/intelmic-mkoffload.c (IN_TARGET_CODE): Likewise.
> * config/i386/msformat-c.c (IN_TARGET_CODE): Likewise.
> * config/i386/winnt-cxx.c (IN_TARGET_CODE): Likewise.
> * config/i386/winnt-stubs.c (IN_TARGET_CODE): Likewise.
> * config/i386/winnt.c (IN_TARGET_CODE): Likewise.
> * config/i386/x86-tune-sched-atom.c (IN_TARGET_CODE): Likewise.
> * config/i386/x86-tune-sched-bd.c (IN_TARGET_CODE): Likewise.
> * config/i386/x86-tune-sched-core.c (IN_TARGET_CODE): Likewise.
> * config/i386/x86-tune-sched.c (IN_TARGET_CODE): Likewise.
> * config/ia64/ia64-c.c (IN_TARGET_CODE): Likewise.
> * config/ia64/ia64.c (IN_TARGET_CODE): Likewise.
> * config/iq2000/iq2000.c (IN_TARGET_CODE): Likewise.
> * config/lm32/lm32.c (IN_TARGET_CODE): Likewise.
> * config/m32c/m32c-pragma.c (IN_TARGET_CODE): Likewise.
> * config/m32c/m32c.c (IN_TARGET_CODE): Likewise.
> * config/m32r/m32r.c (IN_TARGET_CODE): Likewise.
> * config/m68k/m68k.c (IN_TARGET_CODE): Likewise.
> * config/mcore/mcore.c (IN_TARGET_CODE): Likewise.
> * config/microblaze/microblaze-c.c (IN_TARGET_CODE): Likewise.
> * config/microblaze/microblaze.c (IN_TARGET_CODE): Likewise.
> * config/mips/driver-native.c (IN_TARGET_CODE): Likewise.
> * config/mips/frame-header-opt.c (IN_TARGET_CODE): Likewise.
> * config/mips/mips.c (IN_TARGET_CODE): Likewise.
> * config/mmix/mmix.c (IN_TARGET_CODE): Likewise.
> * config/mn10300/mn10300.c (IN_TARGET_CODE): Likewise.
> * config/moxie/moxie.c (IN_TARGET_CODE): Likewise.
> * config/msp430/driver-msp430.c (IN_TARGET_CODE): Likewise.
> * config/msp430/msp430-c.c (IN_TARGET_CODE): Likewise.
> * config/msp430/msp430.c (IN_TARGET_CODE): Likewise.
> * config/nds32/nds32-cost.c (IN_TARGET_CODE): Likewise.
> * config/nds32/nds32-fp-as-gp.c (IN_TARGET_CODE): Likewise.
> * config/nds32/nds32-intrinsic.c (IN_TARGET_CODE): Likewise.
> * config/nds32/nds32-isr.c (IN_TARGET_CODE): Likewise.
> * config/nds32/nds32-md-auxiliary.c (IN_TARGET_CODE): Likewise.
> * config/nds32/nds32-memory-manipulation.c (IN_TARGET_CODE): Likewise.
> * config/nds32/nds32-pipelines-auxiliary.c (IN_TARGET_CODE): Likewise.
> * config/nds32/nds32-predicates.c (IN_TARGET_CODE): Likewise.
> * config/nds32/nds32.c (IN_TARGET_CODE): Likewise.
> * config/nios2/nios2.c (IN_TARGET_CODE): Likewise.
> * config/nvptx/mkoffload.c (IN_TARGET_CODE): Likewise.
> * config/nvptx/nvptx.c (IN_TARGET_CODE): Likewise.
> * config/pa/pa.c (IN_TARGET_CODE): Likewise.
> * config/pdp11/pdp11.c (IN_TARGET_CODE): Likewise.
> * config/powerpcspe/driver-powerpcspe.c (IN_TARGET_CODE): Likewise.
> * config/powerpcspe/host-darwin.c (IN_TARGET_CODE): Likewise.
> * config/powerpcspe/host-ppc64-darwin.c (IN_TARGET_CODE): Likewise.
> * config/powerpcspe/powerpcspe-c.c (IN_TARGET_CODE): Likewise.
> * config/powerpcspe/powerpcspe-linux.c (IN_TARGET_CODE): Likewise.
> * config/powerpcspe/powerpcspe.c (IN_TARGET_CODE): Likewise.
> * config/riscv/riscv-builtins.c (IN_TARGET_CODE): Likewise.
> * config/riscv/riscv-c.c (IN_TARGET_CODE): Likewise.
> * config/riscv/riscv.c (IN_TARGET_CODE): Likewise.
> * config/rl78/rl78-c.c (IN_TARGET_CODE): Likewise.
> * config/rl78/rl78.c (IN_TARGET_CODE): Likewise.
> * config/rs6000/driver-rs6000.c (IN_TARGET_CODE): Likewise.
> * config/rs6000/host-darwin.c (IN_TARGET_CODE): Likewise.
> * config/rs6000/host-ppc64-darwin.c (IN_TARGET_CODE): Likewise.
> * config/rs6000/rs6000-c.c (IN_TARGET_CODE): Likewise.
> * config/rs6000/rs6000-linux.c (IN_TARGET_CODE): Likewise.
> * config/rs6000/rs6000-p8swap.c (IN_TARGET_CODE): Likewise.
> * config/rs6000/rs6000-string.c (IN_TARGET_CODE): Likewise.
> * config/rs6000/rs6000.c (IN_TARGET_CODE): Likewise.
> * config/rx/rx.c (IN_TARGET_CODE): Likewise.
> * config/s390/driver-native.c (IN_TARGET_CODE): Likewise.
> * config/s390/s390-c.c (IN_TARGET_CODE): Likewise.
> * config/s390/s390.c (IN_TARGET_CODE): Likewise.
> * config/sh/sh-c.c (IN_TARGET_CODE): Likewise.
> * config/sh/sh-mem.cc (IN_TARGET_CODE): Likewise.
> * config/sh/sh.c (IN_TARGET_CODE): Likewise.
> * config/sh/sh_optimize_sett_clrt.cc (IN_TARGET_CODE): Likewise.
> * config/sh/sh_treg_combine.cc (IN_TARGET_CODE): Likewise.
> * config/sparc/driver-sparc.c (IN_TARGET_CODE): Likewise.
> * config/sparc/sparc-c.c (IN_TARGET_CODE): Likewise.
> * config/sparc/sparc.c (IN_TARGET_CODE): Likewise.
> * config/spu/spu-c.c (IN_TARGET_CODE): Likewise.
> * config/spu/spu.c (IN_TARGET_CODE): Likewise.
> * config/stormy16/stormy16.c (IN_TARGET_CODE): Likewise.
> * config/tilegx/mul-tables.c (IN_TARGET_CODE): Likewise.
> * config/tilegx/tilegx-c.c (IN_TARGET_CODE): Likewise.
> * config/tilegx/tilegx.c (IN_TARGET_CODE): Likewise.
> * config/tilepro/mul-tables.c (IN_TARGET_CODE): Likewise.
> * config/tilepro/tilepro-c.c (IN_TARGET_CODE): Likewise.
> * config/tilepro/tilepro.c (IN_TARGET_CODE): Likewise.
> * config/v850/v850-c.c (IN_TARGET_CODE): Likewise.
> * config/v850/v850.c (IN_TARGET_CODE): Likewise.
> * config/vax/vax.c (IN_TARGET_CODE): Likewise.
> * config/visium/visium.c (IN_TARGET_CODE): Likewise.
> * config/vms/vms-c.c (IN_TARGET_CODE): Likewise.
> * config/vms/vms-f.c (IN_TARGET_CODE): Likewise.
> * config/vms/vms.c (IN_TARGET_CODE): Likewise.
> * config/xtensa/xtensa.c (IN_TARGET_CODE): Likewise.
ISTM this needs documenting somewhere.
OK with a suitable doc patch.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [002/nnn] poly_int: IN_TARGET_CODE
2017-11-17 3:35 ` Jeff Law
@ 2017-12-15 1:08 ` Richard Sandiford
2017-12-15 15:22 ` Jeff Law
0 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-12-15 1:08 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches
Jeff Law <law@redhat.com> writes:
> On 10/23/2017 10:58 AM, Richard Sandiford wrote:
>> This patch makes each target-specific TU define an IN_TARGET_CODE macro,
>> which is used to decide whether poly_int<1, C> should convert to C.
>>
>>
>> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>> Alan Hayward <alan.hayward@arm.com>
>> David Sherwood <david.sherwood@arm.com>
>>
>> gcc/
>> * genattrtab.c (write_header): Define IN_TARGET_CODE to 1 in the
>> target C file.
>> * genautomata.c (main): Likewise.
>> * genconditions.c (write_header): Likewise.
>> * genemit.c (main): Likewise.
>> * genextract.c (print_header): Likewise.
>> * genopinit.c (main): Likewise.
>> * genoutput.c (output_prologue): Likewise.
>> * genpeep.c (main): Likewise.
>> * genpreds.c (write_insn_preds_c): Likewise.
>> * genrecog.c (writer_header): Likewise.
>> * config/aarch64/aarch64-builtins.c (IN_TARGET_CODE): Define.
>> * config/aarch64/aarch64-c.c (IN_TARGET_CODE): Likewise.
>> * config/aarch64/aarch64.c (IN_TARGET_CODE): Likewise.
>> * config/aarch64/cortex-a57-fma-steering.c (IN_TARGET_CODE): Likewise.
>> * config/aarch64/driver-aarch64.c (IN_TARGET_CODE): Likewise.
>> * config/alpha/alpha.c (IN_TARGET_CODE): Likewise.
>> * config/alpha/driver-alpha.c (IN_TARGET_CODE): Likewise.
>> * config/arc/arc-c.c (IN_TARGET_CODE): Likewise.
>> * config/arc/arc.c (IN_TARGET_CODE): Likewise.
>> * config/arc/driver-arc.c (IN_TARGET_CODE): Likewise.
>> * config/arm/aarch-common.c (IN_TARGET_CODE): Likewise.
>> * config/arm/arm-builtins.c (IN_TARGET_CODE): Likewise.
>> * config/arm/arm-c.c (IN_TARGET_CODE): Likewise.
>> * config/arm/arm.c (IN_TARGET_CODE): Likewise.
>> * config/arm/driver-arm.c (IN_TARGET_CODE): Likewise.
>> * config/avr/avr-c.c (IN_TARGET_CODE): Likewise.
>> * config/avr/avr-devices.c (IN_TARGET_CODE): Likewise.
>> * config/avr/avr-log.c (IN_TARGET_CODE): Likewise.
>> * config/avr/avr.c (IN_TARGET_CODE): Likewise.
>> * config/avr/driver-avr.c (IN_TARGET_CODE): Likewise.
>> * config/avr/gen-avr-mmcu-specs.c (IN_TARGET_CODE): Likewise.
>> * config/bfin/bfin.c (IN_TARGET_CODE): Likewise.
>> * config/c6x/c6x.c (IN_TARGET_CODE): Likewise.
>> * config/cr16/cr16.c (IN_TARGET_CODE): Likewise.
>> * config/cris/cris.c (IN_TARGET_CODE): Likewise.
>> * config/darwin.c (IN_TARGET_CODE): Likewise.
>> * config/epiphany/epiphany.c (IN_TARGET_CODE): Likewise.
>> * config/epiphany/mode-switch-use.c (IN_TARGET_CODE): Likewise.
>> * config/epiphany/resolve-sw-modes.c (IN_TARGET_CODE): Likewise.
>> * config/fr30/fr30.c (IN_TARGET_CODE): Likewise.
>> * config/frv/frv.c (IN_TARGET_CODE): Likewise.
>> * config/ft32/ft32.c (IN_TARGET_CODE): Likewise.
>> * config/h8300/h8300.c (IN_TARGET_CODE): Likewise.
>> * config/i386/djgpp.c (IN_TARGET_CODE): Likewise.
>> * config/i386/driver-i386.c (IN_TARGET_CODE): Likewise.
>> * config/i386/driver-mingw32.c (IN_TARGET_CODE): Likewise.
>> * config/i386/host-cygwin.c (IN_TARGET_CODE): Likewise.
>> * config/i386/host-i386-darwin.c (IN_TARGET_CODE): Likewise.
>> * config/i386/host-mingw32.c (IN_TARGET_CODE): Likewise.
>> * config/i386/i386-c.c (IN_TARGET_CODE): Likewise.
>> * config/i386/i386.c (IN_TARGET_CODE): Likewise.
>> * config/i386/intelmic-mkoffload.c (IN_TARGET_CODE): Likewise.
>> * config/i386/msformat-c.c (IN_TARGET_CODE): Likewise.
>> * config/i386/winnt-cxx.c (IN_TARGET_CODE): Likewise.
>> * config/i386/winnt-stubs.c (IN_TARGET_CODE): Likewise.
>> * config/i386/winnt.c (IN_TARGET_CODE): Likewise.
>> * config/i386/x86-tune-sched-atom.c (IN_TARGET_CODE): Likewise.
>> * config/i386/x86-tune-sched-bd.c (IN_TARGET_CODE): Likewise.
>> * config/i386/x86-tune-sched-core.c (IN_TARGET_CODE): Likewise.
>> * config/i386/x86-tune-sched.c (IN_TARGET_CODE): Likewise.
>> * config/ia64/ia64-c.c (IN_TARGET_CODE): Likewise.
>> * config/ia64/ia64.c (IN_TARGET_CODE): Likewise.
>> * config/iq2000/iq2000.c (IN_TARGET_CODE): Likewise.
>> * config/lm32/lm32.c (IN_TARGET_CODE): Likewise.
>> * config/m32c/m32c-pragma.c (IN_TARGET_CODE): Likewise.
>> * config/m32c/m32c.c (IN_TARGET_CODE): Likewise.
>> * config/m32r/m32r.c (IN_TARGET_CODE): Likewise.
>> * config/m68k/m68k.c (IN_TARGET_CODE): Likewise.
>> * config/mcore/mcore.c (IN_TARGET_CODE): Likewise.
>> * config/microblaze/microblaze-c.c (IN_TARGET_CODE): Likewise.
>> * config/microblaze/microblaze.c (IN_TARGET_CODE): Likewise.
>> * config/mips/driver-native.c (IN_TARGET_CODE): Likewise.
>> * config/mips/frame-header-opt.c (IN_TARGET_CODE): Likewise.
>> * config/mips/mips.c (IN_TARGET_CODE): Likewise.
>> * config/mmix/mmix.c (IN_TARGET_CODE): Likewise.
>> * config/mn10300/mn10300.c (IN_TARGET_CODE): Likewise.
>> * config/moxie/moxie.c (IN_TARGET_CODE): Likewise.
>> * config/msp430/driver-msp430.c (IN_TARGET_CODE): Likewise.
>> * config/msp430/msp430-c.c (IN_TARGET_CODE): Likewise.
>> * config/msp430/msp430.c (IN_TARGET_CODE): Likewise.
>> * config/nds32/nds32-cost.c (IN_TARGET_CODE): Likewise.
>> * config/nds32/nds32-fp-as-gp.c (IN_TARGET_CODE): Likewise.
>> * config/nds32/nds32-intrinsic.c (IN_TARGET_CODE): Likewise.
>> * config/nds32/nds32-isr.c (IN_TARGET_CODE): Likewise.
>> * config/nds32/nds32-md-auxiliary.c (IN_TARGET_CODE): Likewise.
>> * config/nds32/nds32-memory-manipulation.c (IN_TARGET_CODE): Likewise.
>> * config/nds32/nds32-pipelines-auxiliary.c (IN_TARGET_CODE): Likewise.
>> * config/nds32/nds32-predicates.c (IN_TARGET_CODE): Likewise.
>> * config/nds32/nds32.c (IN_TARGET_CODE): Likewise.
>> * config/nios2/nios2.c (IN_TARGET_CODE): Likewise.
>> * config/nvptx/mkoffload.c (IN_TARGET_CODE): Likewise.
>> * config/nvptx/nvptx.c (IN_TARGET_CODE): Likewise.
>> * config/pa/pa.c (IN_TARGET_CODE): Likewise.
>> * config/pdp11/pdp11.c (IN_TARGET_CODE): Likewise.
>> * config/powerpcspe/driver-powerpcspe.c (IN_TARGET_CODE): Likewise.
>> * config/powerpcspe/host-darwin.c (IN_TARGET_CODE): Likewise.
>> * config/powerpcspe/host-ppc64-darwin.c (IN_TARGET_CODE): Likewise.
>> * config/powerpcspe/powerpcspe-c.c (IN_TARGET_CODE): Likewise.
>> * config/powerpcspe/powerpcspe-linux.c (IN_TARGET_CODE): Likewise.
>> * config/powerpcspe/powerpcspe.c (IN_TARGET_CODE): Likewise.
>> * config/riscv/riscv-builtins.c (IN_TARGET_CODE): Likewise.
>> * config/riscv/riscv-c.c (IN_TARGET_CODE): Likewise.
>> * config/riscv/riscv.c (IN_TARGET_CODE): Likewise.
>> * config/rl78/rl78-c.c (IN_TARGET_CODE): Likewise.
>> * config/rl78/rl78.c (IN_TARGET_CODE): Likewise.
>> * config/rs6000/driver-rs6000.c (IN_TARGET_CODE): Likewise.
>> * config/rs6000/host-darwin.c (IN_TARGET_CODE): Likewise.
>> * config/rs6000/host-ppc64-darwin.c (IN_TARGET_CODE): Likewise.
>> * config/rs6000/rs6000-c.c (IN_TARGET_CODE): Likewise.
>> * config/rs6000/rs6000-linux.c (IN_TARGET_CODE): Likewise.
>> * config/rs6000/rs6000-p8swap.c (IN_TARGET_CODE): Likewise.
>> * config/rs6000/rs6000-string.c (IN_TARGET_CODE): Likewise.
>> * config/rs6000/rs6000.c (IN_TARGET_CODE): Likewise.
>> * config/rx/rx.c (IN_TARGET_CODE): Likewise.
>> * config/s390/driver-native.c (IN_TARGET_CODE): Likewise.
>> * config/s390/s390-c.c (IN_TARGET_CODE): Likewise.
>> * config/s390/s390.c (IN_TARGET_CODE): Likewise.
>> * config/sh/sh-c.c (IN_TARGET_CODE): Likewise.
>> * config/sh/sh-mem.cc (IN_TARGET_CODE): Likewise.
>> * config/sh/sh.c (IN_TARGET_CODE): Likewise.
>> * config/sh/sh_optimize_sett_clrt.cc (IN_TARGET_CODE): Likewise.
>> * config/sh/sh_treg_combine.cc (IN_TARGET_CODE): Likewise.
>> * config/sparc/driver-sparc.c (IN_TARGET_CODE): Likewise.
>> * config/sparc/sparc-c.c (IN_TARGET_CODE): Likewise.
>> * config/sparc/sparc.c (IN_TARGET_CODE): Likewise.
>> * config/spu/spu-c.c (IN_TARGET_CODE): Likewise.
>> * config/spu/spu.c (IN_TARGET_CODE): Likewise.
>> * config/stormy16/stormy16.c (IN_TARGET_CODE): Likewise.
>> * config/tilegx/mul-tables.c (IN_TARGET_CODE): Likewise.
>> * config/tilegx/tilegx-c.c (IN_TARGET_CODE): Likewise.
>> * config/tilegx/tilegx.c (IN_TARGET_CODE): Likewise.
>> * config/tilepro/mul-tables.c (IN_TARGET_CODE): Likewise.
>> * config/tilepro/tilepro-c.c (IN_TARGET_CODE): Likewise.
>> * config/tilepro/tilepro.c (IN_TARGET_CODE): Likewise.
>> * config/v850/v850-c.c (IN_TARGET_CODE): Likewise.
>> * config/v850/v850.c (IN_TARGET_CODE): Likewise.
>> * config/vax/vax.c (IN_TARGET_CODE): Likewise.
>> * config/visium/visium.c (IN_TARGET_CODE): Likewise.
>> * config/vms/vms-c.c (IN_TARGET_CODE): Likewise.
>> * config/vms/vms-f.c (IN_TARGET_CODE): Likewise.
>> * config/vms/vms.c (IN_TARGET_CODE): Likewise.
>> * config/xtensa/xtensa.c (IN_TARGET_CODE): Likewise.
> ISTM this needs documenting somewhere.
>
> OK with a suitable doc patch.
OK. I couldn't find anywhere that was an obvious fit, so in the end
I went for the "Anatomy of a target back end". (The effect on
IN_TARGET_CODE on poly-int is already documented in poly-int.texi.)
Does this look OK?
Thanks,
Richard
2017-12-15 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/sourcebuild.texi: Document IN_TARGET_CODE.
[...]
Index: gcc/doc/sourcebuild.texi
===================================================================
--- gcc/doc/sourcebuild.texi 2017-12-05 14:24:53.348999076 +0000
+++ gcc/doc/sourcebuild.texi 2017-12-15 01:04:55.748638203 +0000
@@ -822,6 +822,17 @@ manual needs to be installed as info for
chapter of this manual.
@end itemize
+GCC uses the macro @code{IN_TARGET_CODE} to distinguish between
+machine-specific @file{.c} and @file{.cc} files and
+machine-independent @file{.c} and @file{.cc} files. Machine-specific
+files should use the directive:
+
+@example
+#define IN_TARGET_CODE 1
+@end example
+
+before including @code{config.h}.
+
If the back end is added to the official GCC source repository, the
following are also necessary:
[...]
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [002/nnn] poly_int: IN_TARGET_CODE
2017-12-15 1:08 ` Richard Sandiford
@ 2017-12-15 15:22 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-12-15 15:22 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 12/14/2017 06:08 PM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 10/23/2017 10:58 AM, Richard Sandiford wrote:
>>> This patch makes each target-specific TU define an IN_TARGET_CODE macro,
>>> which is used to decide whether poly_int<1, C> should convert to C.
>>>
>>>
>>> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>>> Alan Hayward <alan.hayward@arm.com>
>>> David Sherwood <david.sherwood@arm.com>
>>>
>>> gcc/
>>> * genattrtab.c (write_header): Define IN_TARGET_CODE to 1 in the
>>> target C file.
>>> * genautomata.c (main): Likewise.
>>> * genconditions.c (write_header): Likewise.
>>> * genemit.c (main): Likewise.
>>> * genextract.c (print_header): Likewise.
>>> * genopinit.c (main): Likewise.
>>> * genoutput.c (output_prologue): Likewise.
>>> * genpeep.c (main): Likewise.
>>> * genpreds.c (write_insn_preds_c): Likewise.
>>> * genrecog.c (writer_header): Likewise.
>>> * config/aarch64/aarch64-builtins.c (IN_TARGET_CODE): Define.
>>> * config/aarch64/aarch64-c.c (IN_TARGET_CODE): Likewise.
>>> * config/aarch64/aarch64.c (IN_TARGET_CODE): Likewise.
>>> * config/aarch64/cortex-a57-fma-steering.c (IN_TARGET_CODE): Likewise.
>>> * config/aarch64/driver-aarch64.c (IN_TARGET_CODE): Likewise.
>>> * config/alpha/alpha.c (IN_TARGET_CODE): Likewise.
>>> * config/alpha/driver-alpha.c (IN_TARGET_CODE): Likewise.
>>> * config/arc/arc-c.c (IN_TARGET_CODE): Likewise.
>>> * config/arc/arc.c (IN_TARGET_CODE): Likewise.
>>> * config/arc/driver-arc.c (IN_TARGET_CODE): Likewise.
>>> * config/arm/aarch-common.c (IN_TARGET_CODE): Likewise.
>>> * config/arm/arm-builtins.c (IN_TARGET_CODE): Likewise.
>>> * config/arm/arm-c.c (IN_TARGET_CODE): Likewise.
>>> * config/arm/arm.c (IN_TARGET_CODE): Likewise.
>>> * config/arm/driver-arm.c (IN_TARGET_CODE): Likewise.
>>> * config/avr/avr-c.c (IN_TARGET_CODE): Likewise.
>>> * config/avr/avr-devices.c (IN_TARGET_CODE): Likewise.
>>> * config/avr/avr-log.c (IN_TARGET_CODE): Likewise.
>>> * config/avr/avr.c (IN_TARGET_CODE): Likewise.
>>> * config/avr/driver-avr.c (IN_TARGET_CODE): Likewise.
>>> * config/avr/gen-avr-mmcu-specs.c (IN_TARGET_CODE): Likewise.
>>> * config/bfin/bfin.c (IN_TARGET_CODE): Likewise.
>>> * config/c6x/c6x.c (IN_TARGET_CODE): Likewise.
>>> * config/cr16/cr16.c (IN_TARGET_CODE): Likewise.
>>> * config/cris/cris.c (IN_TARGET_CODE): Likewise.
>>> * config/darwin.c (IN_TARGET_CODE): Likewise.
>>> * config/epiphany/epiphany.c (IN_TARGET_CODE): Likewise.
>>> * config/epiphany/mode-switch-use.c (IN_TARGET_CODE): Likewise.
>>> * config/epiphany/resolve-sw-modes.c (IN_TARGET_CODE): Likewise.
>>> * config/fr30/fr30.c (IN_TARGET_CODE): Likewise.
>>> * config/frv/frv.c (IN_TARGET_CODE): Likewise.
>>> * config/ft32/ft32.c (IN_TARGET_CODE): Likewise.
>>> * config/h8300/h8300.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/djgpp.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/driver-i386.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/driver-mingw32.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/host-cygwin.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/host-i386-darwin.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/host-mingw32.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/i386-c.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/i386.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/intelmic-mkoffload.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/msformat-c.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/winnt-cxx.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/winnt-stubs.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/winnt.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/x86-tune-sched-atom.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/x86-tune-sched-bd.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/x86-tune-sched-core.c (IN_TARGET_CODE): Likewise.
>>> * config/i386/x86-tune-sched.c (IN_TARGET_CODE): Likewise.
>>> * config/ia64/ia64-c.c (IN_TARGET_CODE): Likewise.
>>> * config/ia64/ia64.c (IN_TARGET_CODE): Likewise.
>>> * config/iq2000/iq2000.c (IN_TARGET_CODE): Likewise.
>>> * config/lm32/lm32.c (IN_TARGET_CODE): Likewise.
>>> * config/m32c/m32c-pragma.c (IN_TARGET_CODE): Likewise.
>>> * config/m32c/m32c.c (IN_TARGET_CODE): Likewise.
>>> * config/m32r/m32r.c (IN_TARGET_CODE): Likewise.
>>> * config/m68k/m68k.c (IN_TARGET_CODE): Likewise.
>>> * config/mcore/mcore.c (IN_TARGET_CODE): Likewise.
>>> * config/microblaze/microblaze-c.c (IN_TARGET_CODE): Likewise.
>>> * config/microblaze/microblaze.c (IN_TARGET_CODE): Likewise.
>>> * config/mips/driver-native.c (IN_TARGET_CODE): Likewise.
>>> * config/mips/frame-header-opt.c (IN_TARGET_CODE): Likewise.
>>> * config/mips/mips.c (IN_TARGET_CODE): Likewise.
>>> * config/mmix/mmix.c (IN_TARGET_CODE): Likewise.
>>> * config/mn10300/mn10300.c (IN_TARGET_CODE): Likewise.
>>> * config/moxie/moxie.c (IN_TARGET_CODE): Likewise.
>>> * config/msp430/driver-msp430.c (IN_TARGET_CODE): Likewise.
>>> * config/msp430/msp430-c.c (IN_TARGET_CODE): Likewise.
>>> * config/msp430/msp430.c (IN_TARGET_CODE): Likewise.
>>> * config/nds32/nds32-cost.c (IN_TARGET_CODE): Likewise.
>>> * config/nds32/nds32-fp-as-gp.c (IN_TARGET_CODE): Likewise.
>>> * config/nds32/nds32-intrinsic.c (IN_TARGET_CODE): Likewise.
>>> * config/nds32/nds32-isr.c (IN_TARGET_CODE): Likewise.
>>> * config/nds32/nds32-md-auxiliary.c (IN_TARGET_CODE): Likewise.
>>> * config/nds32/nds32-memory-manipulation.c (IN_TARGET_CODE): Likewise.
>>> * config/nds32/nds32-pipelines-auxiliary.c (IN_TARGET_CODE): Likewise.
>>> * config/nds32/nds32-predicates.c (IN_TARGET_CODE): Likewise.
>>> * config/nds32/nds32.c (IN_TARGET_CODE): Likewise.
>>> * config/nios2/nios2.c (IN_TARGET_CODE): Likewise.
>>> * config/nvptx/mkoffload.c (IN_TARGET_CODE): Likewise.
>>> * config/nvptx/nvptx.c (IN_TARGET_CODE): Likewise.
>>> * config/pa/pa.c (IN_TARGET_CODE): Likewise.
>>> * config/pdp11/pdp11.c (IN_TARGET_CODE): Likewise.
>>> * config/powerpcspe/driver-powerpcspe.c (IN_TARGET_CODE): Likewise.
>>> * config/powerpcspe/host-darwin.c (IN_TARGET_CODE): Likewise.
>>> * config/powerpcspe/host-ppc64-darwin.c (IN_TARGET_CODE): Likewise.
>>> * config/powerpcspe/powerpcspe-c.c (IN_TARGET_CODE): Likewise.
>>> * config/powerpcspe/powerpcspe-linux.c (IN_TARGET_CODE): Likewise.
>>> * config/powerpcspe/powerpcspe.c (IN_TARGET_CODE): Likewise.
>>> * config/riscv/riscv-builtins.c (IN_TARGET_CODE): Likewise.
>>> * config/riscv/riscv-c.c (IN_TARGET_CODE): Likewise.
>>> * config/riscv/riscv.c (IN_TARGET_CODE): Likewise.
>>> * config/rl78/rl78-c.c (IN_TARGET_CODE): Likewise.
>>> * config/rl78/rl78.c (IN_TARGET_CODE): Likewise.
>>> * config/rs6000/driver-rs6000.c (IN_TARGET_CODE): Likewise.
>>> * config/rs6000/host-darwin.c (IN_TARGET_CODE): Likewise.
>>> * config/rs6000/host-ppc64-darwin.c (IN_TARGET_CODE): Likewise.
>>> * config/rs6000/rs6000-c.c (IN_TARGET_CODE): Likewise.
>>> * config/rs6000/rs6000-linux.c (IN_TARGET_CODE): Likewise.
>>> * config/rs6000/rs6000-p8swap.c (IN_TARGET_CODE): Likewise.
>>> * config/rs6000/rs6000-string.c (IN_TARGET_CODE): Likewise.
>>> * config/rs6000/rs6000.c (IN_TARGET_CODE): Likewise.
>>> * config/rx/rx.c (IN_TARGET_CODE): Likewise.
>>> * config/s390/driver-native.c (IN_TARGET_CODE): Likewise.
>>> * config/s390/s390-c.c (IN_TARGET_CODE): Likewise.
>>> * config/s390/s390.c (IN_TARGET_CODE): Likewise.
>>> * config/sh/sh-c.c (IN_TARGET_CODE): Likewise.
>>> * config/sh/sh-mem.cc (IN_TARGET_CODE): Likewise.
>>> * config/sh/sh.c (IN_TARGET_CODE): Likewise.
>>> * config/sh/sh_optimize_sett_clrt.cc (IN_TARGET_CODE): Likewise.
>>> * config/sh/sh_treg_combine.cc (IN_TARGET_CODE): Likewise.
>>> * config/sparc/driver-sparc.c (IN_TARGET_CODE): Likewise.
>>> * config/sparc/sparc-c.c (IN_TARGET_CODE): Likewise.
>>> * config/sparc/sparc.c (IN_TARGET_CODE): Likewise.
>>> * config/spu/spu-c.c (IN_TARGET_CODE): Likewise.
>>> * config/spu/spu.c (IN_TARGET_CODE): Likewise.
>>> * config/stormy16/stormy16.c (IN_TARGET_CODE): Likewise.
>>> * config/tilegx/mul-tables.c (IN_TARGET_CODE): Likewise.
>>> * config/tilegx/tilegx-c.c (IN_TARGET_CODE): Likewise.
>>> * config/tilegx/tilegx.c (IN_TARGET_CODE): Likewise.
>>> * config/tilepro/mul-tables.c (IN_TARGET_CODE): Likewise.
>>> * config/tilepro/tilepro-c.c (IN_TARGET_CODE): Likewise.
>>> * config/tilepro/tilepro.c (IN_TARGET_CODE): Likewise.
>>> * config/v850/v850-c.c (IN_TARGET_CODE): Likewise.
>>> * config/v850/v850.c (IN_TARGET_CODE): Likewise.
>>> * config/vax/vax.c (IN_TARGET_CODE): Likewise.
>>> * config/visium/visium.c (IN_TARGET_CODE): Likewise.
>>> * config/vms/vms-c.c (IN_TARGET_CODE): Likewise.
>>> * config/vms/vms-f.c (IN_TARGET_CODE): Likewise.
>>> * config/vms/vms.c (IN_TARGET_CODE): Likewise.
>>> * config/xtensa/xtensa.c (IN_TARGET_CODE): Likewise.
>> ISTM this needs documenting somewhere.
>>
>> OK with a suitable doc patch.
>
> OK. I couldn't find anywhere that was an obvious fit, so in the end
> I went for the "Anatomy of a target back end". (The effect on
> IN_TARGET_CODE on poly-int is already documented in poly-int.texi.)
>
> Does this look OK?
>
> Thanks,
> Richard
>
>
> 2017-12-15 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * doc/sourcebuild.texi: Document IN_TARGET_CODE.
> [...]
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [003/nnn] poly_int: MACRO_MODE
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
2017-10-23 16:58 ` [001/nnn] poly_int: add poly-int.h Richard Sandiford
2017-10-23 16:59 ` [002/nnn] poly_int: IN_TARGET_CODE Richard Sandiford
@ 2017-10-23 17:00 ` Richard Sandiford
2017-11-17 3:36 ` Jeff Law
2017-10-23 17:00 ` [004/nnn] poly_int: mode query functions Richard Sandiford
` (104 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:00 UTC (permalink / raw)
To: gcc-patches
This patch uses a MACRO_MODE wrapper for the target macro invocations
in targhooks.c and address.h, so that macros for non-AArch64 targets
can continue to treat modes as fixed-size.
It didn't seem worth converting the address macros to hooks since
(a) they're heavily used, (b) they should be probably be replaced
with a different interface rather than converted to hooks as-is,
and most importantly (c) addresses.h already localises the problem.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (MACRO_MODE): New macro.
* addresses.h (base_reg_class, ok_for_base_p_1): Use it.
* targhooks.c (default_libcall_value, default_secondary_reload)
(default_memory_move_cost, default_register_move_cost)
(default_class_max_nregs): Likewise.
Index: gcc/machmode.h
===================================================================
--- gcc/machmode.h 2017-10-23 16:52:20.675923636 +0100
+++ gcc/machmode.h 2017-10-23 17:00:49.664349224 +0100
@@ -685,6 +685,17 @@ fixed_size_mode::includes_p (machine_mod
return true;
}
+/* Wrapper for mode arguments to target macros, so that if a target
+ doesn't need polynomial-sized modes, its header file can continue
+ to treat everything as fixed_size_mode. This should go away once
+ macros are moved to target hooks. It shouldn't be used in other
+ contexts. */
+#if NUM_POLY_INT_COEFFS == 1
+#define MACRO_MODE(MODE) (as_a <fixed_size_mode> (MODE))
+#else
+#define MACRO_MODE(MODE) (MODE)
+#endif
+
extern opt_machine_mode mode_for_size (unsigned int, enum mode_class, int);
/* Return the machine mode to use for a MODE_INT of SIZE bits, if one
Index: gcc/addresses.h
===================================================================
--- gcc/addresses.h 2017-10-23 16:52:20.675923636 +0100
+++ gcc/addresses.h 2017-10-23 17:00:49.663350133 +0100
@@ -31,14 +31,15 @@ base_reg_class (machine_mode mode ATTRIB
enum rtx_code index_code ATTRIBUTE_UNUSED)
{
#ifdef MODE_CODE_BASE_REG_CLASS
- return MODE_CODE_BASE_REG_CLASS (mode, as, outer_code, index_code);
+ return MODE_CODE_BASE_REG_CLASS (MACRO_MODE (mode), as, outer_code,
+ index_code);
#else
#ifdef MODE_BASE_REG_REG_CLASS
if (index_code == REG)
- return MODE_BASE_REG_REG_CLASS (mode);
+ return MODE_BASE_REG_REG_CLASS (MACRO_MODE (mode));
#endif
#ifdef MODE_BASE_REG_CLASS
- return MODE_BASE_REG_CLASS (mode);
+ return MODE_BASE_REG_CLASS (MACRO_MODE (mode));
#else
return BASE_REG_CLASS;
#endif
@@ -58,15 +59,15 @@ ok_for_base_p_1 (unsigned regno ATTRIBUT
enum rtx_code index_code ATTRIBUTE_UNUSED)
{
#ifdef REGNO_MODE_CODE_OK_FOR_BASE_P
- return REGNO_MODE_CODE_OK_FOR_BASE_P (regno, mode, as,
+ return REGNO_MODE_CODE_OK_FOR_BASE_P (regno, MACRO_MODE (mode), as,
outer_code, index_code);
#else
#ifdef REGNO_MODE_OK_FOR_REG_BASE_P
if (index_code == REG)
- return REGNO_MODE_OK_FOR_REG_BASE_P (regno, mode);
+ return REGNO_MODE_OK_FOR_REG_BASE_P (regno, MACRO_MODE (mode));
#endif
#ifdef REGNO_MODE_OK_FOR_BASE_P
- return REGNO_MODE_OK_FOR_BASE_P (regno, mode);
+ return REGNO_MODE_OK_FOR_BASE_P (regno, MACRO_MODE (mode));
#else
return REGNO_OK_FOR_BASE_P (regno);
#endif
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c 2017-10-23 17:00:20.920834919 +0100
+++ gcc/targhooks.c 2017-10-23 17:00:49.664349224 +0100
@@ -941,7 +941,7 @@ default_libcall_value (machine_mode mode
const_rtx fun ATTRIBUTE_UNUSED)
{
#ifdef LIBCALL_VALUE
- return LIBCALL_VALUE (mode);
+ return LIBCALL_VALUE (MACRO_MODE (mode));
#else
gcc_unreachable ();
#endif
@@ -1071,11 +1071,13 @@ default_secondary_reload (bool in_p ATTR
}
#ifdef SECONDARY_INPUT_RELOAD_CLASS
if (in_p)
- rclass = SECONDARY_INPUT_RELOAD_CLASS (reload_class, reload_mode, x);
+ rclass = SECONDARY_INPUT_RELOAD_CLASS (reload_class,
+ MACRO_MODE (reload_mode), x);
#endif
#ifdef SECONDARY_OUTPUT_RELOAD_CLASS
if (! in_p)
- rclass = SECONDARY_OUTPUT_RELOAD_CLASS (reload_class, reload_mode, x);
+ rclass = SECONDARY_OUTPUT_RELOAD_CLASS (reload_class,
+ MACRO_MODE (reload_mode), x);
#endif
if (rclass != NO_REGS)
{
@@ -1603,7 +1605,7 @@ default_memory_move_cost (machine_mode m
#ifndef MEMORY_MOVE_COST
return (4 + memory_move_secondary_cost (mode, (enum reg_class) rclass, in));
#else
- return MEMORY_MOVE_COST (mode, (enum reg_class) rclass, in);
+ return MEMORY_MOVE_COST (MACRO_MODE (mode), (enum reg_class) rclass, in);
#endif
}
@@ -1618,7 +1620,8 @@ default_register_move_cost (machine_mode
#ifndef REGISTER_MOVE_COST
return 2;
#else
- return REGISTER_MOVE_COST (mode, (enum reg_class) from, (enum reg_class) to);
+ return REGISTER_MOVE_COST (MACRO_MODE (mode),
+ (enum reg_class) from, (enum reg_class) to);
#endif
}
@@ -1807,7 +1810,8 @@ default_class_max_nregs (reg_class_t rcl
machine_mode mode ATTRIBUTE_UNUSED)
{
#ifdef CLASS_MAX_NREGS
- return (unsigned char) CLASS_MAX_NREGS ((enum reg_class) rclass, mode);
+ return (unsigned char) CLASS_MAX_NREGS ((enum reg_class) rclass,
+ MACRO_MODE (mode));
#else
return ((GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD);
#endif
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [003/nnn] poly_int: MACRO_MODE
2017-10-23 17:00 ` [003/nnn] poly_int: MACRO_MODE Richard Sandiford
@ 2017-11-17 3:36 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-17 3:36 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 10:59 AM, Richard Sandiford wrote:
> This patch uses a MACRO_MODE wrapper for the target macro invocations
> in targhooks.c and address.h, so that macros for non-AArch64 targets
> can continue to treat modes as fixed-size.
>
> It didn't seem worth converting the address macros to hooks since
> (a) they're heavily used, (b) they should be probably be replaced
> with a different interface rather than converted to hooks as-is,
> and most importantly (c) addresses.h already localises the problem.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * machmode.h (MACRO_MODE): New macro.
> * addresses.h (base_reg_class, ok_for_base_p_1): Use it.
> * targhooks.c (default_libcall_value, default_secondary_reload)
> (default_memory_move_cost, default_register_move_cost)
> (default_class_max_nregs): Likewise.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [004/nnn] poly_int: mode query functions
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (2 preceding siblings ...)
2017-10-23 17:00 ` [003/nnn] poly_int: MACRO_MODE Richard Sandiford
@ 2017-10-23 17:00 ` Richard Sandiford
2017-11-17 3:37 ` Jeff Law
2017-10-23 17:01 ` [005/nnn] poly_int: rtx constants Richard Sandiford
` (103 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:00 UTC (permalink / raw)
To: gcc-patches
This patch changes the bit size and vector count arguments to the
machmode.h functions from unsigned int to poly_uint64.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (mode_for_size, int_mode_for_size, float_mode_for_size)
(smallest_mode_for_size, smallest_int_mode_for_size): Take the mode
size as a poly_uint64.
(mode_for_vector, mode_for_int_vector): Take the number of vector
elements as a poly_uint64.
* stor-layout.c (mode_for_size, smallest_mode_for_size): Take the mode
size as a poly_uint64.
(mode_for_vector, mode_for_int_vector): Take the number of vector
elements as a poly_uint64.
Index: gcc/machmode.h
===================================================================
--- gcc/machmode.h 2017-10-23 17:00:49.664349224 +0100
+++ gcc/machmode.h 2017-10-23 17:00:52.669615373 +0100
@@ -696,14 +696,14 @@ #define MACRO_MODE(MODE) (as_a <fixed_si
#define MACRO_MODE(MODE) (MODE)
#endif
-extern opt_machine_mode mode_for_size (unsigned int, enum mode_class, int);
+extern opt_machine_mode mode_for_size (poly_uint64, enum mode_class, int);
/* Return the machine mode to use for a MODE_INT of SIZE bits, if one
exists. If LIMIT is nonzero, modes wider than MAX_FIXED_MODE_SIZE
will not be used. */
inline opt_scalar_int_mode
-int_mode_for_size (unsigned int size, int limit)
+int_mode_for_size (poly_uint64 size, int limit)
{
return dyn_cast <scalar_int_mode> (mode_for_size (size, MODE_INT, limit));
}
@@ -712,7 +712,7 @@ int_mode_for_size (unsigned int size, in
exists. */
inline opt_scalar_float_mode
-float_mode_for_size (unsigned int size)
+float_mode_for_size (poly_uint64 size)
{
return dyn_cast <scalar_float_mode> (mode_for_size (size, MODE_FLOAT, 0));
}
@@ -726,21 +726,21 @@ decimal_float_mode_for_size (unsigned in
(mode_for_size (size, MODE_DECIMAL_FLOAT, 0));
}
-extern machine_mode smallest_mode_for_size (unsigned int, enum mode_class);
+extern machine_mode smallest_mode_for_size (poly_uint64, enum mode_class);
/* Find the narrowest integer mode that contains at least SIZE bits.
Such a mode must exist. */
inline scalar_int_mode
-smallest_int_mode_for_size (unsigned int size)
+smallest_int_mode_for_size (poly_uint64 size)
{
return as_a <scalar_int_mode> (smallest_mode_for_size (size, MODE_INT));
}
extern opt_scalar_int_mode int_mode_for_mode (machine_mode);
extern opt_machine_mode bitwise_mode_for_mode (machine_mode);
-extern opt_machine_mode mode_for_vector (scalar_mode, unsigned);
-extern opt_machine_mode mode_for_int_vector (unsigned int, unsigned int);
+extern opt_machine_mode mode_for_vector (scalar_mode, poly_uint64);
+extern opt_machine_mode mode_for_int_vector (unsigned int, poly_uint64);
/* Return the integer vector equivalent of MODE, if one exists. In other
words, return the mode for an integer vector that has the same number
Index: gcc/stor-layout.c
===================================================================
--- gcc/stor-layout.c 2017-10-23 16:52:20.627879504 +0100
+++ gcc/stor-layout.c 2017-10-23 17:00:52.669615373 +0100
@@ -297,22 +297,22 @@ finalize_size_functions (void)
MAX_FIXED_MODE_SIZE. */
opt_machine_mode
-mode_for_size (unsigned int size, enum mode_class mclass, int limit)
+mode_for_size (poly_uint64 size, enum mode_class mclass, int limit)
{
machine_mode mode;
int i;
- if (limit && size > MAX_FIXED_MODE_SIZE)
+ if (limit && may_gt (size, (unsigned int) MAX_FIXED_MODE_SIZE))
return opt_machine_mode ();
/* Get the first mode which has this size, in the specified class. */
FOR_EACH_MODE_IN_CLASS (mode, mclass)
- if (GET_MODE_PRECISION (mode) == size)
+ if (must_eq (GET_MODE_PRECISION (mode), size))
return mode;
if (mclass == MODE_INT || mclass == MODE_PARTIAL_INT)
for (i = 0; i < NUM_INT_N_ENTS; i ++)
- if (int_n_data[i].bitsize == size
+ if (must_eq (int_n_data[i].bitsize, size)
&& int_n_enabled_p[i])
return int_n_data[i].m;
@@ -340,7 +340,7 @@ mode_for_size_tree (const_tree size, enu
SIZE bits. Abort if no such mode exists. */
machine_mode
-smallest_mode_for_size (unsigned int size, enum mode_class mclass)
+smallest_mode_for_size (poly_uint64 size, enum mode_class mclass)
{
machine_mode mode = VOIDmode;
int i;
@@ -348,19 +348,18 @@ smallest_mode_for_size (unsigned int siz
/* Get the first mode which has at least this size, in the
specified class. */
FOR_EACH_MODE_IN_CLASS (mode, mclass)
- if (GET_MODE_PRECISION (mode) >= size)
+ if (must_ge (GET_MODE_PRECISION (mode), size))
break;
+ gcc_assert (mode != VOIDmode);
+
if (mclass == MODE_INT || mclass == MODE_PARTIAL_INT)
for (i = 0; i < NUM_INT_N_ENTS; i ++)
- if (int_n_data[i].bitsize >= size
- && int_n_data[i].bitsize < GET_MODE_PRECISION (mode)
+ if (must_ge (int_n_data[i].bitsize, size)
+ && must_lt (int_n_data[i].bitsize, GET_MODE_PRECISION (mode))
&& int_n_enabled_p[i])
mode = int_n_data[i].m;
- if (mode == VOIDmode)
- gcc_unreachable ();
-
return mode;
}
@@ -475,7 +474,7 @@ bitwise_type_for_mode (machine_mode mode
either an integer mode or a vector mode. */
opt_machine_mode
-mode_for_vector (scalar_mode innermode, unsigned nunits)
+mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
{
machine_mode mode;
@@ -496,14 +495,14 @@ mode_for_vector (scalar_mode innermode,
/* Do not check vector_mode_supported_p here. We'll do that
later in vector_type_mode. */
FOR_EACH_MODE_FROM (mode, mode)
- if (GET_MODE_NUNITS (mode) == nunits
+ if (must_eq (GET_MODE_NUNITS (mode), nunits)
&& GET_MODE_INNER (mode) == innermode)
return mode;
/* For integers, try mapping it to a same-sized scalar mode. */
if (GET_MODE_CLASS (innermode) == MODE_INT)
{
- unsigned int nbits = nunits * GET_MODE_BITSIZE (innermode);
+ poly_uint64 nbits = nunits * GET_MODE_BITSIZE (innermode);
if (int_mode_for_size (nbits, 0).exists (&mode)
&& have_regs_of_mode[mode])
return mode;
@@ -517,7 +516,7 @@ mode_for_vector (scalar_mode innermode,
an integer mode or a vector mode. */
opt_machine_mode
-mode_for_int_vector (unsigned int int_bits, unsigned int nunits)
+mode_for_int_vector (unsigned int int_bits, poly_uint64 nunits)
{
scalar_int_mode int_mode;
machine_mode vec_mode;
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [004/nnn] poly_int: mode query functions
2017-10-23 17:00 ` [004/nnn] poly_int: mode query functions Richard Sandiford
@ 2017-11-17 3:37 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-17 3:37 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 10:59 AM, Richard Sandiford wrote:
> This patch changes the bit size and vector count arguments to the
> machmode.h functions from unsigned int to poly_uint64.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * machmode.h (mode_for_size, int_mode_for_size, float_mode_for_size)
> (smallest_mode_for_size, smallest_int_mode_for_size): Take the mode
> size as a poly_uint64.
> (mode_for_vector, mode_for_int_vector): Take the number of vector
> elements as a poly_uint64.
> * stor-layout.c (mode_for_size, smallest_mode_for_size): Take the mode
> size as a poly_uint64.
> (mode_for_vector, mode_for_int_vector): Take the number of vector
> elements as a poly_uint64.
OK.
I think that in general a change from a integer to a poly_uint64 should
generally be considered OK without the need for review. Ultimately
those are highly mechanical changes with little risk for mucking
something up badly.
Obviously the changes wouldn't go in until we settled the poly_uint64
questions though.
Jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [005/nnn] poly_int: rtx constants
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (3 preceding siblings ...)
2017-10-23 17:00 ` [004/nnn] poly_int: mode query functions Richard Sandiford
@ 2017-10-23 17:01 ` Richard Sandiford
2017-11-17 4:17 ` Jeff Law
2017-10-23 17:02 ` [007/nnn] poly_int: dump routines Richard Sandiford
` (102 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:01 UTC (permalink / raw)
To: gcc-patches
This patch adds an rtl representation of poly_int values.
There were three possible ways of doing this:
(1) Add a new rtl code for the poly_ints themselves and store the
coefficients as trailing wide_ints. This would give constants like:
(const_poly_int [c0 c1 ... cn])
The runtime value would be:
c0 + c1 * x1 + ... + cn * xn
(2) Like (1), but use rtxes for the coefficients. This would give
constants like:
(const_poly_int [(const_int c0)
(const_int c1)
...
(const_int cn)])
although the coefficients could be const_wide_ints instead
of const_ints where appropriate.
(3) Add a new rtl code for the polynomial indeterminates,
then use them in const wrappers. A constant like c0 + c1 * x1
would then look like:
(const:M (plus:M (mult:M (const_param:M x1)
(const_int c1))
(const_int c0)))
There didn't seem to be that much to choose between them. The main
advantage of (1) is that it's a more efficient representation and
that we can refer to the cofficients directly as wide_int_storage.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/rtl.texi (const_poly_int): Document.
* gengenrtl.c (excluded_rtx): Return true for CONST_POLY_INT.
* rtl.h (const_poly_int_def): New struct.
(rtx_def::u): Add a cpi field.
(CASE_CONST_UNIQUE, CASE_CONST_ANY): Add CONST_POLY_INT.
(CONST_POLY_INT_P, CONST_POLY_INT_COEFFS): New macros.
(wi::rtx_to_poly_wide_ref): New typedef
(const_poly_int_value, wi::to_poly_wide, rtx_to_poly_int64)
(poly_int_rtx_p): New functions.
(trunc_int_for_mode): Declare a poly_int64 version.
(plus_constant): Take a poly_int64 instead of a HOST_WIDE_INT.
(immed_wide_int_const): Take a poly_wide_int_ref rather than
a wide_int_ref.
(strip_offset): Declare.
(strip_offset_and_add): New function.
* rtl.def (CONST_POLY_INT): New rtx code.
* rtl.c (rtx_size): Handle CONST_POLY_INT.
(shared_const_p): Use poly_int_rtx_p.
* emit-rtl.h (gen_int_mode): Take a poly_int64 instead of a
HOST_WIDE_INT.
(gen_int_shift_amount): Likewise.
* emit-rtl.c (const_poly_int_hasher): New class.
(const_poly_int_htab): New variable.
(init_emit_once): Initialize it when NUM_POLY_INT_COEFFS > 1.
(const_poly_int_hasher::hash): New function.
(const_poly_int_hasher::equal): Likewise.
(gen_int_mode): Take a poly_int64 instead of a HOST_WIDE_INT.
(immed_wide_int_const): Rename to...
(immed_wide_int_const_1): ...this and make static.
(immed_wide_int_const): New function, taking a poly_wide_int_ref
instead of a wide_int_ref.
(gen_int_shift_amount): Take a poly_int64 instead of a HOST_WIDE_INT.
(gen_lowpart_common): Handle CONST_POLY_INT.
* cse.c (hash_rtx_cb, equiv_constant): Likewise.
* cselib.c (cselib_hash_rtx): Likewise.
* dwarf2out.c (const_ok_for_output_1): Likewise.
* expr.c (convert_modes): Likewise.
* print-rtl.c (rtx_writer::print_rtx, print_value): Likewise.
* rtlhash.c (add_rtx): Likewise.
* explow.c (trunc_int_for_mode): Add a poly_int64 version.
(plus_constant): Take a poly_int64 instead of a HOST_WIDE_INT.
Handle existing CONST_POLY_INT rtxes.
* expmed.h (expand_shift): Take a poly_int64 instead of a
HOST_WIDE_INT.
* expmed.c (expand_shift): Likewise.
* rtlanal.c (strip_offset): New function.
(commutative_operand_precedence): Give CONST_POLY_INT the same
precedence as CONST_DOUBLE and put CONST_WIDE_INT between that
and CONST_INT.
* rtl-tests.c (const_poly_int_tests): New struct.
(rtl_tests_c_tests): Use it.
* simplify-rtx.c (simplify_const_unary_operation): Handle
CONST_POLY_INT.
(simplify_const_binary_operation): Likewise.
(simplify_binary_operation_1): Fold additions of symbolic constants
and CONST_POLY_INTs.
(simplify_subreg): Handle extensions and truncations of
CONST_POLY_INTs.
(simplify_const_poly_int_tests): New struct.
(simplify_rtx_c_tests): Use it.
* wide-int.h (storage_ref): Add default constructor.
(wide_int_ref_storage): Likewise.
(trailing_wide_ints): Use GTY((user)).
(trailing_wide_ints::operator[]): Add a const version.
(trailing_wide_ints::get_precision): New function.
(trailing_wide_ints::extra_size): Likewise.
Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi 2017-10-23 17:00:20.916834036 +0100
+++ gcc/doc/rtl.texi 2017-10-23 17:00:54.437007600 +0100
@@ -1621,6 +1621,15 @@ is accessed with the macro @code{CONST_F
data is accessed with @code{CONST_FIXED_VALUE_HIGH}; the low part is
accessed with @code{CONST_FIXED_VALUE_LOW}.
+@findex const_poly_int
+@item (const_poly_int:@var{m} [@var{c0} @var{c1} @dots{}])
+Represents a @code{poly_int}-style polynomial integer with coefficients
+@var{c0}, @var{c1}, @dots{}. The coefficients are @code{wide_int}-based
+integers rather than rtxes. @code{CONST_POLY_INT_COEFFS} gives the
+values of individual coefficients (which is mostly only useful in
+low-level routines) and @code{const_poly_int_value} gives the full
+@code{poly_int} value.
+
@findex const_vector
@item (const_vector:@var{m} [@var{x0} @var{x1} @dots{}])
Represents a vector constant. The square brackets stand for the vector
Index: gcc/gengenrtl.c
===================================================================
--- gcc/gengenrtl.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/gengenrtl.c 2017-10-23 17:00:54.442003055 +0100
@@ -157,6 +157,7 @@ excluded_rtx (int idx)
return (strcmp (defs[idx].enumname, "VAR_LOCATION") == 0
|| strcmp (defs[idx].enumname, "CONST_DOUBLE") == 0
|| strcmp (defs[idx].enumname, "CONST_WIDE_INT") == 0
+ || strcmp (defs[idx].enumname, "CONST_POLY_INT") == 0
|| strcmp (defs[idx].enumname, "CONST_FIXED") == 0);
}
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h 2017-10-23 16:52:20.579835373 +0100
+++ gcc/rtl.h 2017-10-23 17:00:54.444001238 +0100
@@ -280,6 +280,10 @@ #define CWI_GET_NUM_ELEM(RTX) \
#define CWI_PUT_NUM_ELEM(RTX, NUM) \
(RTL_FLAG_CHECK1("CWI_PUT_NUM_ELEM", (RTX), CONST_WIDE_INT)->u2.num_elem = (NUM))
+struct GTY((variable_size)) const_poly_int_def {
+ trailing_wide_ints<NUM_POLY_INT_COEFFS> coeffs;
+};
+
/* RTL expression ("rtx"). */
/* The GTY "desc" and "tag" options below are a kludge: we need a desc
@@ -424,6 +428,7 @@ struct GTY((desc("0"), tag("0"),
struct real_value rv;
struct fixed_value fv;
struct hwivec_def hwiv;
+ struct const_poly_int_def cpi;
} GTY ((special ("rtx_def"), desc ("GET_CODE (&%0)"))) u;
};
@@ -734,6 +739,7 @@ #define CASE_CONST_SCALAR_INT \
#define CASE_CONST_UNIQUE \
case CONST_INT: \
case CONST_WIDE_INT: \
+ case CONST_POLY_INT: \
case CONST_DOUBLE: \
case CONST_FIXED
@@ -741,6 +747,7 @@ #define CASE_CONST_UNIQUE \
#define CASE_CONST_ANY \
case CONST_INT: \
case CONST_WIDE_INT: \
+ case CONST_POLY_INT: \
case CONST_DOUBLE: \
case CONST_FIXED: \
case CONST_VECTOR
@@ -773,6 +780,11 @@ #define CONST_INT_P(X) (GET_CODE (X) ==
/* Predicate yielding nonzero iff X is an rtx for a constant integer. */
#define CONST_WIDE_INT_P(X) (GET_CODE (X) == CONST_WIDE_INT)
+/* Predicate yielding nonzero iff X is an rtx for a polynomial constant
+ integer. */
+#define CONST_POLY_INT_P(X) \
+ (NUM_POLY_INT_COEFFS > 1 && GET_CODE (X) == CONST_POLY_INT)
+
/* Predicate yielding nonzero iff X is an rtx for a constant fixed-point. */
#define CONST_FIXED_P(X) (GET_CODE (X) == CONST_FIXED)
@@ -1871,6 +1883,12 @@ #define CONST_WIDE_INT_VEC(RTX) HWIVEC_C
#define CONST_WIDE_INT_NUNITS(RTX) CWI_GET_NUM_ELEM (RTX)
#define CONST_WIDE_INT_ELT(RTX, N) CWI_ELT (RTX, N)
+/* For a CONST_POLY_INT, CONST_POLY_INT_COEFFS gives access to the
+ individual coefficients, in the form of a trailing_wide_ints structure. */
+#define CONST_POLY_INT_COEFFS(RTX) \
+ (RTL_FLAG_CHECK1("CONST_POLY_INT_COEFFS", (RTX), \
+ CONST_POLY_INT)->u.cpi.coeffs)
+
/* For a CONST_DOUBLE:
#if TARGET_SUPPORTS_WIDE_INT == 0
For a VOIDmode, there are two integers CONST_DOUBLE_LOW is the
@@ -2184,6 +2202,84 @@ wi::max_value (machine_mode mode, signop
return max_value (GET_MODE_PRECISION (as_a <scalar_mode> (mode)), sgn);
}
+namespace wi
+{
+ typedef poly_int<NUM_POLY_INT_COEFFS,
+ generic_wide_int <wide_int_ref_storage <false, false> > >
+ rtx_to_poly_wide_ref;
+ rtx_to_poly_wide_ref to_poly_wide (const_rtx, machine_mode);
+}
+
+/* Return the value of a CONST_POLY_INT in its native precision. */
+
+inline wi::rtx_to_poly_wide_ref
+const_poly_int_value (const_rtx x)
+{
+ poly_int<NUM_POLY_INT_COEFFS, WIDE_INT_REF_FOR (wide_int)> res;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ res.coeffs[i] = CONST_POLY_INT_COEFFS (x)[i];
+ return res;
+}
+
+/* Return true if X is a scalar integer or a CONST_POLY_INT. The value
+ can then be extracted using wi::to_poly_wide. */
+
+inline bool
+poly_int_rtx_p (const_rtx x)
+{
+ return CONST_SCALAR_INT_P (x) || CONST_POLY_INT_P (x);
+}
+
+/* Access X (which satisfies poly_int_rtx_p) as a poly_wide_int.
+ MODE is the mode of X. */
+
+inline wi::rtx_to_poly_wide_ref
+wi::to_poly_wide (const_rtx x, machine_mode mode)
+{
+ if (CONST_POLY_INT_P (x))
+ return const_poly_int_value (x);
+ return rtx_mode_t (const_cast<rtx> (x), mode);
+}
+
+/* Return the value of X as a poly_int64. */
+
+inline poly_int64
+rtx_to_poly_int64 (const_rtx x)
+{
+ if (CONST_POLY_INT_P (x))
+ {
+ poly_int64 res;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ res.coeffs[i] = CONST_POLY_INT_COEFFS (x)[i].to_shwi ();
+ return res;
+ }
+ return INTVAL (x);
+}
+
+/* Return true if arbitrary value X is an integer constant that can
+ be represented as a poly_int64. Store the value in *RES if so,
+ otherwise leave it unmodified. */
+
+inline bool
+poly_int_rtx_p (const_rtx x, poly_int64_pod *res)
+{
+ if (CONST_INT_P (x))
+ {
+ *res = INTVAL (x);
+ return true;
+ }
+ if (CONST_POLY_INT_P (x))
+ {
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ if (!wi::fits_shwi_p (CONST_POLY_INT_COEFFS (x)[i]))
+ return false;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ res->coeffs[i] = CONST_POLY_INT_COEFFS (x)[i].to_shwi ();
+ return true;
+ }
+ return false;
+}
+
extern void init_rtlanal (void);
extern int rtx_cost (rtx, machine_mode, enum rtx_code, int, bool);
extern int address_cost (rtx, machine_mode, addr_space_t, bool);
@@ -2721,7 +2817,8 @@ #define EXTRACT_ARGS_IN_RANGE(SIZE, POS,
/* In explow.c */
extern HOST_WIDE_INT trunc_int_for_mode (HOST_WIDE_INT, machine_mode);
-extern rtx plus_constant (machine_mode, rtx, HOST_WIDE_INT, bool = false);
+extern poly_int64 trunc_int_for_mode (poly_int64, machine_mode);
+extern rtx plus_constant (machine_mode, rtx, poly_int64, bool = false);
extern HOST_WIDE_INT get_stack_check_protect (void);
/* In rtl.c */
@@ -3032,13 +3129,11 @@ extern void end_sequence (void);
extern double_int rtx_to_double_int (const_rtx);
#endif
extern void cwi_output_hex (FILE *, const_rtx);
-#ifndef GENERATOR_FILE
-extern rtx immed_wide_int_const (const wide_int_ref &, machine_mode);
-#endif
#if TARGET_SUPPORTS_WIDE_INT == 0
extern rtx immed_double_const (HOST_WIDE_INT, HOST_WIDE_INT,
machine_mode);
#endif
+extern rtx immed_wide_int_const (const poly_wide_int_ref &, machine_mode);
/* In varasm.c */
extern rtx force_const_mem (machine_mode, rtx);
@@ -3226,6 +3321,7 @@ extern HOST_WIDE_INT get_integer_term (c
extern rtx get_related_value (const_rtx);
extern bool offset_within_block_p (const_rtx, HOST_WIDE_INT);
extern void split_const (rtx, rtx *, rtx *);
+extern rtx strip_offset (rtx, poly_int64_pod *);
extern bool unsigned_reg_p (rtx);
extern int reg_mentioned_p (const_rtx, const_rtx);
extern int count_occurrences (const_rtx, const_rtx, int);
@@ -4160,6 +4256,21 @@ load_extend_op (machine_mode mode)
return UNKNOWN;
}
+/* If X is a PLUS of a base and a constant offset, add the constant to *OFFSET
+ and return the base. Return X otherwise. */
+
+inline rtx
+strip_offset_and_add (rtx x, poly_int64_pod *offset)
+{
+ if (GET_CODE (x) == PLUS)
+ {
+ poly_int64 suboffset;
+ x = strip_offset (x, &suboffset);
+ *offset += suboffset;
+ }
+ return x;
+}
+
/* gtype-desc.c. */
extern void gt_ggc_mx (rtx &);
extern void gt_pch_nx (rtx &);
Index: gcc/rtl.def
===================================================================
--- gcc/rtl.def 2017-10-23 16:52:20.579835373 +0100
+++ gcc/rtl.def 2017-10-23 17:00:54.443002147 +0100
@@ -348,6 +348,9 @@ DEF_RTL_EXPR(CONST_INT, "const_int", "w"
/* numeric integer constant */
DEF_RTL_EXPR(CONST_WIDE_INT, "const_wide_int", "", RTX_CONST_OBJ)
+/* An rtx representation of a poly_wide_int. */
+DEF_RTL_EXPR(CONST_POLY_INT, "const_poly_int", "", RTX_CONST_OBJ)
+
/* fixed-point constant */
DEF_RTL_EXPR(CONST_FIXED, "const_fixed", "www", RTX_CONST_OBJ)
Index: gcc/rtl.c
===================================================================
--- gcc/rtl.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/rtl.c 2017-10-23 17:00:54.443002147 +0100
@@ -189,6 +189,10 @@ rtx_size (const_rtx x)
+ sizeof (struct hwivec_def)
+ ((CONST_WIDE_INT_NUNITS (x) - 1)
* sizeof (HOST_WIDE_INT)));
+ if (CONST_POLY_INT_P (x))
+ return (RTX_HDR_SIZE
+ + sizeof (struct const_poly_int_def)
+ + CONST_POLY_INT_COEFFS (x).extra_size ());
if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_HAS_BLOCK_INFO_P (x))
return RTX_HDR_SIZE + sizeof (struct block_symbol);
return RTX_CODE_SIZE (GET_CODE (x));
@@ -257,9 +261,10 @@ shared_const_p (const_rtx orig)
/* CONST can be shared if it contains a SYMBOL_REF. If it contains
a LABEL_REF, it isn't sharable. */
+ poly_int64 offset;
return (GET_CODE (XEXP (orig, 0)) == PLUS
&& GET_CODE (XEXP (XEXP (orig, 0), 0)) == SYMBOL_REF
- && CONST_INT_P (XEXP (XEXP (orig, 0), 1)));
+ && poly_int_rtx_p (XEXP (XEXP (orig, 0), 1), &offset));
}
Index: gcc/emit-rtl.h
===================================================================
--- gcc/emit-rtl.h 2017-10-23 16:52:20.579835373 +0100
+++ gcc/emit-rtl.h 2017-10-23 17:00:54.440004873 +0100
@@ -362,14 +362,14 @@ extern rtvec gen_rtvec (int, ...);
extern rtx copy_insn_1 (rtx);
extern rtx copy_insn (rtx);
extern rtx_insn *copy_delay_slot_insn (rtx_insn *);
-extern rtx gen_int_mode (HOST_WIDE_INT, machine_mode);
+extern rtx gen_int_mode (poly_int64, machine_mode);
extern rtx_insn *emit_copy_of_insn_after (rtx_insn *, rtx_insn *);
extern void set_reg_attrs_from_value (rtx, rtx);
extern void set_reg_attrs_for_parm (rtx, rtx);
extern void set_reg_attrs_for_decl_rtl (tree t, rtx x);
extern void adjust_reg_mode (rtx, machine_mode);
extern int mem_expr_equal_p (const_tree, const_tree);
-extern rtx gen_int_shift_amount (machine_mode, HOST_WIDE_INT);
+extern rtx gen_int_shift_amount (machine_mode, poly_int64);
extern bool need_atomic_barrier_p (enum memmodel, bool);
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/emit-rtl.c 2017-10-23 17:00:54.440004873 +0100
@@ -148,6 +148,16 @@ struct const_wide_int_hasher : ggc_cache
static GTY ((cache)) hash_table<const_wide_int_hasher> *const_wide_int_htab;
+struct const_poly_int_hasher : ggc_cache_ptr_hash<rtx_def>
+{
+ typedef std::pair<machine_mode, poly_wide_int_ref> compare_type;
+
+ static hashval_t hash (rtx x);
+ static bool equal (rtx x, const compare_type &y);
+};
+
+static GTY ((cache)) hash_table<const_poly_int_hasher> *const_poly_int_htab;
+
/* A hash table storing register attribute structures. */
struct reg_attr_hasher : ggc_cache_ptr_hash<reg_attrs>
{
@@ -257,6 +267,31 @@ const_wide_int_hasher::equal (rtx x, rtx
}
#endif
+/* Returns a hash code for CONST_POLY_INT X. */
+
+hashval_t
+const_poly_int_hasher::hash (rtx x)
+{
+ inchash::hash h;
+ h.add_int (GET_MODE (x));
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ h.add_wide_int (CONST_POLY_INT_COEFFS (x)[i]);
+ return h.end ();
+}
+
+/* Returns nonzero if CONST_POLY_INT X is an rtx representation of Y. */
+
+bool
+const_poly_int_hasher::equal (rtx x, const compare_type &y)
+{
+ if (GET_MODE (x) != y.first)
+ return false;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ if (CONST_POLY_INT_COEFFS (x)[i] != y.second.coeffs[i])
+ return false;
+ return true;
+}
+
/* Returns a hash code for X (which is really a CONST_DOUBLE). */
hashval_t
const_double_hasher::hash (rtx x)
@@ -520,9 +555,13 @@ gen_rtx_CONST_INT (machine_mode mode ATT
}
rtx
-gen_int_mode (HOST_WIDE_INT c, machine_mode mode)
+gen_int_mode (poly_int64 c, machine_mode mode)
{
- return GEN_INT (trunc_int_for_mode (c, mode));
+ c = trunc_int_for_mode (c, mode);
+ if (c.is_constant ())
+ return GEN_INT (c.coeffs[0]);
+ unsigned int prec = GET_MODE_PRECISION (as_a <scalar_mode> (mode));
+ return immed_wide_int_const (poly_wide_int::from (c, prec, SIGNED), mode);
}
/* CONST_DOUBLEs might be created from pairs of integers, or from
@@ -626,8 +665,8 @@ lookup_const_wide_int (rtx wint)
a CONST_DOUBLE (if !TARGET_SUPPORTS_WIDE_INT) or a CONST_WIDE_INT
(if TARGET_SUPPORTS_WIDE_INT). */
-rtx
-immed_wide_int_const (const wide_int_ref &v, machine_mode mode)
+static rtx
+immed_wide_int_const_1 (const wide_int_ref &v, machine_mode mode)
{
unsigned int len = v.get_len ();
/* Not scalar_int_mode because we also allow pointer bound modes. */
@@ -714,6 +753,53 @@ immed_double_const (HOST_WIDE_INT i0, HO
}
#endif
+/* Return an rtx representation of C in mode MODE. */
+
+rtx
+immed_wide_int_const (const poly_wide_int_ref &c, machine_mode mode)
+{
+ if (c.is_constant ())
+ return immed_wide_int_const_1 (c.coeffs[0], mode);
+
+ /* Not scalar_int_mode because we also allow pointer bound modes. */
+ unsigned int prec = GET_MODE_PRECISION (as_a <scalar_mode> (mode));
+
+ /* Allow truncation but not extension since we do not know if the
+ number is signed or unsigned. */
+ gcc_assert (prec <= c.coeffs[0].get_precision ());
+ poly_wide_int newc = poly_wide_int::from (c, prec, SIGNED);
+
+ /* See whether we already have an rtx for this constant. */
+ inchash::hash h;
+ h.add_int (mode);
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ h.add_wide_int (newc.coeffs[i]);
+ const_poly_int_hasher::compare_type typed_value (mode, newc);
+ rtx *slot = const_poly_int_htab->find_slot_with_hash (typed_value,
+ h.end (), INSERT);
+ rtx x = *slot;
+ if (x)
+ return x;
+
+ /* Create a new rtx. There's a choice to be made here between installing
+ the actual mode of the rtx or leaving it as VOIDmode (for consistency
+ with CONST_INT). In practice the handling of the codes is different
+ enough that we get no benefit from using VOIDmode, and various places
+ assume that VOIDmode implies CONST_INT. Using the real mode seems like
+ the right long-term direction anyway. */
+ typedef trailing_wide_ints<NUM_POLY_INT_COEFFS> twi;
+ size_t extra_size = twi::extra_size (prec);
+ x = rtx_alloc_v (CONST_POLY_INT,
+ sizeof (struct const_poly_int_def) + extra_size);
+ PUT_MODE (x, mode);
+ CONST_POLY_INT_COEFFS (x).set_precision (prec);
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ CONST_POLY_INT_COEFFS (x)[i] = newc.coeffs[i];
+
+ *slot = x;
+ return x;
+}
+
rtx
gen_rtx_REG (machine_mode mode, unsigned int regno)
{
@@ -1502,7 +1588,8 @@ gen_lowpart_common (machine_mode mode, r
}
else if (GET_CODE (x) == SUBREG || REG_P (x)
|| GET_CODE (x) == CONCAT || const_vec_p (x)
- || CONST_DOUBLE_AS_FLOAT_P (x) || CONST_SCALAR_INT_P (x))
+ || CONST_DOUBLE_AS_FLOAT_P (x) || CONST_SCALAR_INT_P (x)
+ || CONST_POLY_INT_P (x))
return lowpart_subreg (mode, x, innermode);
/* Otherwise, we can't do this. */
@@ -6089,6 +6176,9 @@ init_emit_once (void)
#endif
const_double_htab = hash_table<const_double_hasher>::create_ggc (37);
+ if (NUM_POLY_INT_COEFFS > 1)
+ const_poly_int_htab = hash_table<const_poly_int_hasher>::create_ggc (37);
+
const_fixed_htab = hash_table<const_fixed_hasher>::create_ggc (37);
reg_attrs_htab = hash_table<reg_attr_hasher>::create_ggc (37);
@@ -6482,7 +6572,7 @@ need_atomic_barrier_p (enum memmodel mod
by VALUE bits. */
rtx
-gen_int_shift_amount (machine_mode mode, HOST_WIDE_INT value)
+gen_int_shift_amount (machine_mode mode, poly_int64 value)
{
return gen_int_mode (value, get_shift_amount_mode (mode));
}
Index: gcc/cse.c
===================================================================
--- gcc/cse.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/cse.c 2017-10-23 17:00:54.436008509 +0100
@@ -2323,6 +2323,15 @@ hash_rtx_cb (const_rtx x, machine_mode m
hash += CONST_WIDE_INT_ELT (x, i);
return hash;
+ case CONST_POLY_INT:
+ {
+ inchash::hash h;
+ h.add_int (hash);
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ h.add_wide_int (CONST_POLY_INT_COEFFS (x)[i]);
+ return h.end ();
+ }
+
case CONST_DOUBLE:
/* This is like the general case, except that it only counts
the integers representing the constant. */
@@ -3781,6 +3790,8 @@ equiv_constant (rtx x)
/* See if we previously assigned a constant value to this SUBREG. */
if ((new_rtx = lookup_as_function (x, CONST_INT)) != 0
|| (new_rtx = lookup_as_function (x, CONST_WIDE_INT)) != 0
+ || (NUM_POLY_INT_COEFFS > 1
+ && (new_rtx = lookup_as_function (x, CONST_POLY_INT)) != 0)
|| (new_rtx = lookup_as_function (x, CONST_DOUBLE)) != 0
|| (new_rtx = lookup_as_function (x, CONST_FIXED)) != 0)
return new_rtx;
Index: gcc/cselib.c
===================================================================
--- gcc/cselib.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/cselib.c 2017-10-23 17:00:54.436008509 +0100
@@ -1128,6 +1128,15 @@ cselib_hash_rtx (rtx x, int create, mach
hash += CONST_WIDE_INT_ELT (x, i);
return hash;
+ case CONST_POLY_INT:
+ {
+ inchash::hash h;
+ h.add_int (hash);
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ h.add_wide_int (CONST_POLY_INT_COEFFS (x)[i]);
+ return h.end ();
+ }
+
case CONST_DOUBLE:
/* This is like the general case, except that it only counts
the integers representing the constant. */
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/dwarf2out.c 2017-10-23 17:00:54.439005782 +0100
@@ -13753,6 +13753,9 @@ const_ok_for_output_1 (rtx rtl)
return false;
}
+ if (CONST_POLY_INT_P (rtl))
+ return false;
+
if (targetm.const_not_ok_for_debug_p (rtl))
{
expansion_failed (NULL_TREE, rtl,
Index: gcc/expr.c
===================================================================
--- gcc/expr.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/expr.c 2017-10-23 17:00:54.442003055 +0100
@@ -692,6 +692,7 @@ convert_modes (machine_mode mode, machin
&& is_int_mode (oldmode, &int_oldmode)
&& GET_MODE_PRECISION (int_mode) <= GET_MODE_PRECISION (int_oldmode)
&& ((MEM_P (x) && !MEM_VOLATILE_P (x) && direct_load[(int) int_mode])
+ || CONST_POLY_INT_P (x)
|| (REG_P (x)
&& (!HARD_REGISTER_P (x)
|| targetm.hard_regno_mode_ok (REGNO (x), int_mode))
Index: gcc/print-rtl.c
===================================================================
--- gcc/print-rtl.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/print-rtl.c 2017-10-23 17:00:54.443002147 +0100
@@ -898,6 +898,17 @@ rtx_writer::print_rtx (const_rtx in_rtx)
fprintf (m_outfile, " ");
cwi_output_hex (m_outfile, in_rtx);
break;
+
+ case CONST_POLY_INT:
+ fprintf (m_outfile, " [");
+ print_dec (CONST_POLY_INT_COEFFS (in_rtx)[0], m_outfile, SIGNED);
+ for (unsigned int i = 1; i < NUM_POLY_INT_COEFFS; ++i)
+ {
+ fprintf (m_outfile, ", ");
+ print_dec (CONST_POLY_INT_COEFFS (in_rtx)[i], m_outfile, SIGNED);
+ }
+ fprintf (m_outfile, "]");
+ break;
#endif
case CODE_LABEL:
@@ -1568,6 +1579,17 @@ print_value (pretty_printer *pp, const_r
}
break;
+ case CONST_POLY_INT:
+ pp_left_bracket (pp);
+ pp_wide_int (pp, CONST_POLY_INT_COEFFS (x)[0], SIGNED);
+ for (unsigned int i = 1; i < NUM_POLY_INT_COEFFS; ++i)
+ {
+ pp_string (pp, ", ");
+ pp_wide_int (pp, CONST_POLY_INT_COEFFS (x)[i], SIGNED);
+ }
+ pp_right_bracket (pp);
+ break;
+
case CONST_DOUBLE:
if (FLOAT_MODE_P (GET_MODE (x)))
{
Index: gcc/rtlhash.c
===================================================================
--- gcc/rtlhash.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/rtlhash.c 2017-10-23 17:00:54.444001238 +0100
@@ -55,6 +55,10 @@ add_rtx (const_rtx x, hash &hstate)
for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
hstate.add_object (CONST_WIDE_INT_ELT (x, i));
return;
+ case CONST_POLY_INT:
+ for (i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ hstate.add_wide_int (CONST_POLY_INT_COEFFS (x)[i]);
+ break;
case SYMBOL_REF:
if (XSTR (x, 0))
hstate.add (XSTR (x, 0), strlen (XSTR (x, 0)) + 1);
Index: gcc/explow.c
===================================================================
--- gcc/explow.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/explow.c 2017-10-23 17:00:54.440004873 +0100
@@ -77,13 +77,23 @@ trunc_int_for_mode (HOST_WIDE_INT c, mac
return c;
}
+/* Likewise for polynomial values, using the sign-extended representation
+ for each individual coefficient. */
+
+poly_int64
+trunc_int_for_mode (poly_int64 x, machine_mode mode)
+{
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ x.coeffs[i] = trunc_int_for_mode (x.coeffs[i], mode);
+ return x;
+}
+
/* Return an rtx for the sum of X and the integer C, given that X has
mode MODE. INPLACE is true if X can be modified inplace or false
if it must be treated as immutable. */
rtx
-plus_constant (machine_mode mode, rtx x, HOST_WIDE_INT c,
- bool inplace)
+plus_constant (machine_mode mode, rtx x, poly_int64 c, bool inplace)
{
RTX_CODE code;
rtx y;
@@ -92,7 +102,7 @@ plus_constant (machine_mode mode, rtx x,
gcc_assert (GET_MODE (x) == VOIDmode || GET_MODE (x) == mode);
- if (c == 0)
+ if (known_zero (c))
return x;
restart:
@@ -180,10 +190,12 @@ plus_constant (machine_mode mode, rtx x,
break;
default:
+ if (CONST_POLY_INT_P (x))
+ return immed_wide_int_const (const_poly_int_value (x) + c, mode);
break;
}
- if (c != 0)
+ if (maybe_nonzero (c))
x = gen_rtx_PLUS (mode, x, gen_int_mode (c, mode));
if (GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF)
Index: gcc/expmed.h
===================================================================
--- gcc/expmed.h 2017-10-23 16:52:20.579835373 +0100
+++ gcc/expmed.h 2017-10-23 17:00:54.441003964 +0100
@@ -712,8 +712,8 @@ extern unsigned HOST_WIDE_INT choose_mul
#ifdef TREE_CODE
extern rtx expand_variable_shift (enum tree_code, machine_mode,
rtx, tree, rtx, int);
-extern rtx expand_shift (enum tree_code, machine_mode, rtx, int, rtx,
- int);
+extern rtx expand_shift (enum tree_code, machine_mode, rtx, poly_int64, rtx,
+ int);
extern rtx expand_divmod (int, enum tree_code, machine_mode, rtx, rtx,
rtx, int);
#endif
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/expmed.c 2017-10-23 17:00:54.441003964 +0100
@@ -2541,7 +2541,7 @@ expand_shift_1 (enum tree_code code, mac
rtx
expand_shift (enum tree_code code, machine_mode mode, rtx shifted,
- int amount, rtx target, int unsignedp)
+ poly_int64 amount, rtx target, int unsignedp)
{
return expand_shift_1 (code, mode, shifted,
gen_int_shift_amount (mode, amount),
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/rtlanal.c 2017-10-23 17:00:54.444001238 +0100
@@ -915,6 +915,28 @@ split_const (rtx x, rtx *base_out, rtx *
*base_out = x;
*offset_out = const0_rtx;
}
+
+/* Express integer value X as some value Y plus a polynomial offset,
+ where Y is either const0_rtx, X or something within X (as opposed
+ to a new rtx). Return the Y and store the offset in *OFFSET_OUT. */
+
+rtx
+strip_offset (rtx x, poly_int64_pod *offset_out)
+{
+ rtx base = const0_rtx;
+ rtx test = x;
+ if (GET_CODE (test) == CONST)
+ test = XEXP (test, 0);
+ if (GET_CODE (test) == PLUS)
+ {
+ base = XEXP (test, 0);
+ test = XEXP (test, 1);
+ }
+ if (poly_int_rtx_p (test, offset_out))
+ return base;
+ *offset_out = 0;
+ return x;
+}
\f
/* Return the number of places FIND appears within X. If COUNT_DEST is
zero, we do not count occurrences inside the destination of a SET. */
@@ -3406,13 +3428,15 @@ commutative_operand_precedence (rtx op)
/* Constants always become the second operand. Prefer "nice" constants. */
if (code == CONST_INT)
- return -8;
+ return -10;
if (code == CONST_WIDE_INT)
- return -7;
+ return -9;
+ if (code == CONST_POLY_INT)
+ return -8;
if (code == CONST_DOUBLE)
- return -7;
+ return -8;
if (code == CONST_FIXED)
- return -7;
+ return -8;
op = avoid_constant_pool_reference (op);
code = GET_CODE (op);
@@ -3420,13 +3444,15 @@ commutative_operand_precedence (rtx op)
{
case RTX_CONST_OBJ:
if (code == CONST_INT)
- return -6;
+ return -7;
if (code == CONST_WIDE_INT)
- return -6;
+ return -6;
+ if (code == CONST_POLY_INT)
+ return -5;
if (code == CONST_DOUBLE)
- return -5;
+ return -5;
if (code == CONST_FIXED)
- return -5;
+ return -5;
return -4;
case RTX_EXTRA:
Index: gcc/rtl-tests.c
===================================================================
--- gcc/rtl-tests.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/rtl-tests.c 2017-10-23 17:00:54.443002147 +0100
@@ -228,6 +228,62 @@ test_uncond_jump ()
jump_insn);
}
+template<unsigned int N>
+struct const_poly_int_tests
+{
+ static void run ();
+};
+
+template<>
+struct const_poly_int_tests<1>
+{
+ static void run () {}
+};
+
+/* Test various CONST_POLY_INT properties. */
+
+template<unsigned int N>
+void
+const_poly_int_tests<N>::run ()
+{
+ rtx x1 = gen_int_mode (poly_int64 (1, 1), QImode);
+ rtx x255 = gen_int_mode (poly_int64 (1, 255), QImode);
+
+ /* Test that constants are unique. */
+ ASSERT_EQ (x1, gen_int_mode (poly_int64 (1, 1), QImode));
+ ASSERT_NE (x1, gen_int_mode (poly_int64 (1, 1), HImode));
+ ASSERT_NE (x1, x255);
+
+ /* Test const_poly_int_value. */
+ ASSERT_MUST_EQ (const_poly_int_value (x1), poly_int64 (1, 1));
+ ASSERT_MUST_EQ (const_poly_int_value (x255), poly_int64 (1, -1));
+
+ /* Test rtx_to_poly_int64. */
+ ASSERT_MUST_EQ (rtx_to_poly_int64 (x1), poly_int64 (1, 1));
+ ASSERT_MUST_EQ (rtx_to_poly_int64 (x255), poly_int64 (1, -1));
+ ASSERT_MAY_NE (rtx_to_poly_int64 (x255), poly_int64 (1, 255));
+
+ /* Test plus_constant of a symbol. */
+ rtx symbol = gen_rtx_SYMBOL_REF (Pmode, "foo");
+ rtx offset1 = gen_int_mode (poly_int64 (9, 11), Pmode);
+ rtx sum1 = gen_rtx_CONST (Pmode, gen_rtx_PLUS (Pmode, symbol, offset1));
+ ASSERT_RTX_EQ (plus_constant (Pmode, symbol, poly_int64 (9, 11)), sum1);
+
+ /* Test plus_constant of a CONST. */
+ rtx offset2 = gen_int_mode (poly_int64 (12, 20), Pmode);
+ rtx sum2 = gen_rtx_CONST (Pmode, gen_rtx_PLUS (Pmode, symbol, offset2));
+ ASSERT_RTX_EQ (plus_constant (Pmode, sum1, poly_int64 (3, 9)), sum2);
+
+ /* Test a cancelling plus_constant. */
+ ASSERT_EQ (plus_constant (Pmode, sum2, poly_int64 (-12, -20)), symbol);
+
+ /* Test plus_constant on integer constants. */
+ ASSERT_EQ (plus_constant (QImode, const1_rtx, poly_int64 (4, -2)),
+ gen_int_mode (poly_int64 (5, -2), QImode));
+ ASSERT_EQ (plus_constant (QImode, x1, poly_int64 (4, -2)),
+ gen_int_mode (poly_int64 (5, -1), QImode));
+}
+
/* Run all of the selftests within this file. */
void
@@ -238,6 +294,7 @@ rtl_tests_c_tests ()
test_dumping_rtx_reuse ();
test_single_set ();
test_uncond_jump ();
+ const_poly_int_tests<NUM_POLY_INT_COEFFS>::run ();
/* Purge state. */
set_first_insn (NULL);
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c 2017-10-23 16:52:20.579835373 +0100
+++ gcc/simplify-rtx.c 2017-10-23 17:00:54.445000329 +0100
@@ -2039,6 +2039,26 @@ simplify_const_unary_operation (enum rtx
}
}
+ /* Handle polynomial integers. */
+ else if (CONST_POLY_INT_P (op))
+ {
+ poly_wide_int result;
+ switch (code)
+ {
+ case NEG:
+ result = -const_poly_int_value (op);
+ break;
+
+ case NOT:
+ result = ~const_poly_int_value (op);
+ break;
+
+ default:
+ return NULL_RTX;
+ }
+ return immed_wide_int_const (result, mode);
+ }
+
return NULL_RTX;
}
\f
@@ -2219,6 +2239,7 @@ simplify_binary_operation_1 (enum rtx_co
rtx tem, reversed, opleft, opright, elt0, elt1;
HOST_WIDE_INT val;
scalar_int_mode int_mode, inner_mode;
+ poly_int64 offset;
/* Even if we can't compute a constant result,
there are some cases worth simplifying. */
@@ -2531,6 +2552,12 @@ simplify_binary_operation_1 (enum rtx_co
return simplify_gen_binary (MINUS, mode, tem, XEXP (op0, 0));
}
+ if ((GET_CODE (op0) == CONST
+ || GET_CODE (op0) == SYMBOL_REF
+ || GET_CODE (op0) == LABEL_REF)
+ && poly_int_rtx_p (op1, &offset))
+ return plus_constant (mode, op0, trunc_int_for_mode (-offset, mode));
+
/* Don't let a relocatable value get a negative coeff. */
if (CONST_INT_P (op1) && GET_MODE (op0) != VOIDmode)
return simplify_gen_binary (PLUS, mode,
@@ -4325,6 +4352,57 @@ simplify_const_binary_operation (enum rt
return immed_wide_int_const (result, int_mode);
}
+ /* Handle polynomial integers. */
+ if (NUM_POLY_INT_COEFFS > 1
+ && is_a <scalar_int_mode> (mode, &int_mode)
+ && poly_int_rtx_p (op0)
+ && poly_int_rtx_p (op1))
+ {
+ poly_wide_int result;
+ switch (code)
+ {
+ case PLUS:
+ result = wi::to_poly_wide (op0, mode) + wi::to_poly_wide (op1, mode);
+ break;
+
+ case MINUS:
+ result = wi::to_poly_wide (op0, mode) - wi::to_poly_wide (op1, mode);
+ break;
+
+ case MULT:
+ if (CONST_SCALAR_INT_P (op1))
+ result = wi::to_poly_wide (op0, mode) * rtx_mode_t (op1, mode);
+ else
+ return NULL_RTX;
+ break;
+
+ case ASHIFT:
+ if (CONST_SCALAR_INT_P (op1))
+ {
+ wide_int shift = rtx_mode_t (op1, mode);
+ if (SHIFT_COUNT_TRUNCATED)
+ shift = wi::umod_trunc (shift, GET_MODE_PRECISION (int_mode));
+ else if (wi::geu_p (shift, GET_MODE_PRECISION (int_mode)))
+ return NULL_RTX;
+ result = wi::to_poly_wide (op0, mode) << shift;
+ }
+ else
+ return NULL_RTX;
+ break;
+
+ case IOR:
+ if (!CONST_SCALAR_INT_P (op1)
+ || !can_ior_p (wi::to_poly_wide (op0, mode),
+ rtx_mode_t (op1, mode), &result))
+ return NULL_RTX;
+ break;
+
+ default:
+ return NULL_RTX;
+ }
+ return immed_wide_int_const (result, int_mode);
+ }
+
return NULL_RTX;
}
@@ -6317,13 +6395,27 @@ simplify_subreg (machine_mode outermode,
scalar_int_mode int_outermode, int_innermode;
if (is_a <scalar_int_mode> (outermode, &int_outermode)
&& is_a <scalar_int_mode> (innermode, &int_innermode)
- && (GET_MODE_PRECISION (int_outermode)
- < GET_MODE_PRECISION (int_innermode))
&& byte == subreg_lowpart_offset (int_outermode, int_innermode))
{
- rtx tem = simplify_truncation (int_outermode, op, int_innermode);
- if (tem)
- return tem;
+ /* Handle polynomial integers. The upper bits of a paradoxical
+ subreg are undefined, so this is safe regardless of whether
+ we're truncating or extending. */
+ if (CONST_POLY_INT_P (op))
+ {
+ poly_wide_int val
+ = poly_wide_int::from (const_poly_int_value (op),
+ GET_MODE_PRECISION (int_outermode),
+ SIGNED);
+ return immed_wide_int_const (val, int_outermode);
+ }
+
+ if (GET_MODE_PRECISION (int_outermode)
+ < GET_MODE_PRECISION (int_innermode))
+ {
+ rtx tem = simplify_truncation (int_outermode, op, int_innermode);
+ if (tem)
+ return tem;
+ }
}
return NULL_RTX;
@@ -6629,12 +6721,60 @@ test_vector_ops ()
}
}
+template<unsigned int N>
+struct simplify_const_poly_int_tests
+{
+ static void run ();
+};
+
+template<>
+struct simplify_const_poly_int_tests<1>
+{
+ static void run () {}
+};
+
+/* Test various CONST_POLY_INT properties. */
+
+template<unsigned int N>
+void
+simplify_const_poly_int_tests<N>::run ()
+{
+ rtx x1 = gen_int_mode (poly_int64 (1, 1), QImode);
+ rtx x2 = gen_int_mode (poly_int64 (-80, 127), QImode);
+ rtx x3 = gen_int_mode (poly_int64 (-79, -128), QImode);
+ rtx x4 = gen_int_mode (poly_int64 (5, 4), QImode);
+ rtx x5 = gen_int_mode (poly_int64 (30, 24), QImode);
+ rtx x6 = gen_int_mode (poly_int64 (20, 16), QImode);
+ rtx x7 = gen_int_mode (poly_int64 (7, 4), QImode);
+ rtx x8 = gen_int_mode (poly_int64 (30, 24), HImode);
+ rtx x9 = gen_int_mode (poly_int64 (-30, -24), HImode);
+ rtx x10 = gen_int_mode (poly_int64 (-31, -24), HImode);
+ rtx two = GEN_INT (2);
+ rtx six = GEN_INT (6);
+ HOST_WIDE_INT offset = subreg_lowpart_offset (QImode, HImode);
+
+ /* These tests only try limited operation combinations. Fuller arithmetic
+ testing is done directly on poly_ints. */
+ ASSERT_EQ (simplify_unary_operation (NEG, HImode, x8, HImode), x9);
+ ASSERT_EQ (simplify_unary_operation (NOT, HImode, x8, HImode), x10);
+ ASSERT_EQ (simplify_unary_operation (TRUNCATE, QImode, x8, HImode), x5);
+ ASSERT_EQ (simplify_binary_operation (PLUS, QImode, x1, x2), x3);
+ ASSERT_EQ (simplify_binary_operation (MINUS, QImode, x3, x1), x2);
+ ASSERT_EQ (simplify_binary_operation (MULT, QImode, x4, six), x5);
+ ASSERT_EQ (simplify_binary_operation (MULT, QImode, six, x4), x5);
+ ASSERT_EQ (simplify_binary_operation (ASHIFT, QImode, x4, two), x6);
+ ASSERT_EQ (simplify_binary_operation (IOR, QImode, x4, two), x7);
+ ASSERT_EQ (simplify_subreg (HImode, x5, QImode, 0), x8);
+ ASSERT_EQ (simplify_subreg (QImode, x8, HImode, offset), x5);
+}
+
/* Run all of the selftests within this file. */
void
simplify_rtx_c_tests ()
{
test_vector_ops ();
+ simplify_const_poly_int_tests<NUM_POLY_INT_COEFFS>::run ();
}
} // namespace selftest
Index: gcc/wide-int.h
===================================================================
--- gcc/wide-int.h 2017-10-23 17:00:20.923835582 +0100
+++ gcc/wide-int.h 2017-10-23 17:00:54.445999420 +0100
@@ -613,6 +613,7 @@ #define SHIFT_FUNCTION \
access. */
struct storage_ref
{
+ storage_ref () {}
storage_ref (const HOST_WIDE_INT *, unsigned int, unsigned int);
const HOST_WIDE_INT *val;
@@ -944,6 +945,8 @@ struct wide_int_ref_storage : public wi:
HOST_WIDE_INT scratch[2];
public:
+ wide_int_ref_storage () {}
+
wide_int_ref_storage (const wi::storage_ref &);
template <typename T>
@@ -1323,7 +1326,7 @@ typedef generic_wide_int <trailing_wide_
bytes beyond the sizeof need to be allocated. Use set_precision
to initialize the structure. */
template <int N>
-class GTY(()) trailing_wide_ints
+class GTY((user)) trailing_wide_ints
{
private:
/* The shared precision of each number. */
@@ -1340,9 +1343,14 @@ class GTY(()) trailing_wide_ints
HOST_WIDE_INT m_val[1];
public:
+ typedef WIDE_INT_REF_FOR (trailing_wide_int_storage) const_reference;
+
void set_precision (unsigned int);
+ unsigned int get_precision () const { return m_precision; }
trailing_wide_int operator [] (unsigned int);
+ const_reference operator [] (unsigned int) const;
static size_t extra_size (unsigned int);
+ size_t extra_size () const { return extra_size (m_precision); }
};
inline trailing_wide_int_storage::
@@ -1414,6 +1422,14 @@ trailing_wide_ints <N>::operator [] (uns
&m_val[index * m_max_len]);
}
+template <int N>
+inline typename trailing_wide_ints <N>::const_reference
+trailing_wide_ints <N>::operator [] (unsigned int index) const
+{
+ return wi::storage_ref (&m_val[index * m_max_len],
+ m_len[index], m_precision);
+}
+
/* Return how many extra bytes need to be added to the end of the structure
in order to handle N wide_ints of precision PRECISION. */
template <int N>
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [005/nnn] poly_int: rtx constants
2017-10-23 17:01 ` [005/nnn] poly_int: rtx constants Richard Sandiford
@ 2017-11-17 4:17 ` Jeff Law
2017-12-15 1:25 ` Richard Sandiford
0 siblings, 1 reply; 302+ messages in thread
From: Jeff Law @ 2017-11-17 4:17 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:00 AM, Richard Sandiford wrote:
> This patch adds an rtl representation of poly_int values.
> There were three possible ways of doing this:
>
> (1) Add a new rtl code for the poly_ints themselves and store the
> coefficients as trailing wide_ints. This would give constants like:
>
> (const_poly_int [c0 c1 ... cn])
>
> The runtime value would be:
>
> c0 + c1 * x1 + ... + cn * xn
>
> (2) Like (1), but use rtxes for the coefficients. This would give
> constants like:
>
> (const_poly_int [(const_int c0)
> (const_int c1)
> ...
> (const_int cn)])
>
> although the coefficients could be const_wide_ints instead
> of const_ints where appropriate.
>
> (3) Add a new rtl code for the polynomial indeterminates,
> then use them in const wrappers. A constant like c0 + c1 * x1
> would then look like:
>
> (const:M (plus:M (mult:M (const_param:M x1)
> (const_int c1))
> (const_int c0)))
>
> There didn't seem to be that much to choose between them. The main
> advantage of (1) is that it's a more efficient representation and
> that we can refer to the cofficients directly as wide_int_storage.
Well, and #1 feels more like how we handle CONST_INT :-)
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * doc/rtl.texi (const_poly_int): Document.
> * gengenrtl.c (excluded_rtx): Return true for CONST_POLY_INT.
> * rtl.h (const_poly_int_def): New struct.
> (rtx_def::u): Add a cpi field.
> (CASE_CONST_UNIQUE, CASE_CONST_ANY): Add CONST_POLY_INT.
> (CONST_POLY_INT_P, CONST_POLY_INT_COEFFS): New macros.
> (wi::rtx_to_poly_wide_ref): New typedef
> (const_poly_int_value, wi::to_poly_wide, rtx_to_poly_int64)
> (poly_int_rtx_p): New functions.
> (trunc_int_for_mode): Declare a poly_int64 version.
> (plus_constant): Take a poly_int64 instead of a HOST_WIDE_INT.
> (immed_wide_int_const): Take a poly_wide_int_ref rather than
> a wide_int_ref.
> (strip_offset): Declare.
> (strip_offset_and_add): New function.
> * rtl.def (CONST_POLY_INT): New rtx code.
> * rtl.c (rtx_size): Handle CONST_POLY_INT.
> (shared_const_p): Use poly_int_rtx_p.
> * emit-rtl.h (gen_int_mode): Take a poly_int64 instead of a
> HOST_WIDE_INT.
> (gen_int_shift_amount): Likewise.
> * emit-rtl.c (const_poly_int_hasher): New class.
> (const_poly_int_htab): New variable.
> (init_emit_once): Initialize it when NUM_POLY_INT_COEFFS > 1.
> (const_poly_int_hasher::hash): New function.
> (const_poly_int_hasher::equal): Likewise.
> (gen_int_mode): Take a poly_int64 instead of a HOST_WIDE_INT.
> (immed_wide_int_const): Rename to...
> (immed_wide_int_const_1): ...this and make static.
> (immed_wide_int_const): New function, taking a poly_wide_int_ref
> instead of a wide_int_ref.
> (gen_int_shift_amount): Take a poly_int64 instead of a HOST_WIDE_INT.
> (gen_lowpart_common): Handle CONST_POLY_INT.
> * cse.c (hash_rtx_cb, equiv_constant): Likewise.
> * cselib.c (cselib_hash_rtx): Likewise.
> * dwarf2out.c (const_ok_for_output_1): Likewise.
> * expr.c (convert_modes): Likewise.
> * print-rtl.c (rtx_writer::print_rtx, print_value): Likewise.
> * rtlhash.c (add_rtx): Likewise.
> * explow.c (trunc_int_for_mode): Add a poly_int64 version.
> (plus_constant): Take a poly_int64 instead of a HOST_WIDE_INT.
> Handle existing CONST_POLY_INT rtxes.
> * expmed.h (expand_shift): Take a poly_int64 instead of a
> HOST_WIDE_INT.
> * expmed.c (expand_shift): Likewise.
> * rtlanal.c (strip_offset): New function.
> (commutative_operand_precedence): Give CONST_POLY_INT the same
> precedence as CONST_DOUBLE and put CONST_WIDE_INT between that
> and CONST_INT.
> * rtl-tests.c (const_poly_int_tests): New struct.
> (rtl_tests_c_tests): Use it.
> * simplify-rtx.c (simplify_const_unary_operation): Handle
> CONST_POLY_INT.
> (simplify_const_binary_operation): Likewise.
> (simplify_binary_operation_1): Fold additions of symbolic constants
> and CONST_POLY_INTs.
> (simplify_subreg): Handle extensions and truncations of
> CONST_POLY_INTs.
> (simplify_const_poly_int_tests): New struct.
> (simplify_rtx_c_tests): Use it.
> * wide-int.h (storage_ref): Add default constructor.
> (wide_int_ref_storage): Likewise.
> (trailing_wide_ints): Use GTY((user)).
> (trailing_wide_ints::operator[]): Add a const version.
> (trailing_wide_ints::get_precision): New function.
> (trailing_wide_ints::extra_size): Likewise.
Do we need to define anything WRT structure sharing in rtl.texi for a
CONST_POLY_INT?
>
> Index: gcc/rtl.c
> ===================================================================
> --- gcc/rtl.c 2017-10-23 16:52:20.579835373 +0100
> +++ gcc/rtl.c 2017-10-23 17:00:54.443002147 +0100
> @@ -257,9 +261,10 @@ shared_const_p (const_rtx orig)
>
> /* CONST can be shared if it contains a SYMBOL_REF. If it contains
> a LABEL_REF, it isn't sharable. */
> + poly_int64 offset;
> return (GET_CODE (XEXP (orig, 0)) == PLUS
> && GET_CODE (XEXP (XEXP (orig, 0), 0)) == SYMBOL_REF
> - && CONST_INT_P (XEXP (XEXP (orig, 0), 1)));
> + && poly_int_rtx_p (XEXP (XEXP (orig, 0), 1), &offset));
Did this just change structure sharing for CONST_WIDE_INT?
> + /* Create a new rtx. There's a choice to be made here between installing
> + the actual mode of the rtx or leaving it as VOIDmode (for consistency
> + with CONST_INT). In practice the handling of the codes is different
> + enough that we get no benefit from using VOIDmode, and various places
> + assume that VOIDmode implies CONST_INT. Using the real mode seems like
> + the right long-term direction anyway. */
Certainly my preference is to get the mode in there. I see modeless
CONST_INTs as a long standing wart and I'm not keen to repeat it.
> Index: gcc/wide-int.h
> ===================================================================
> --- gcc/wide-int.h 2017-10-23 17:00:20.923835582 +0100
> +++ gcc/wide-int.h 2017-10-23 17:00:54.445999420 +0100
> @@ -613,6 +613,7 @@ #define SHIFT_FUNCTION \
> access. */
> struct storage_ref
> {
> + storage_ref () {}
> storage_ref (const HOST_WIDE_INT *, unsigned int, unsigned int);
>
> const HOST_WIDE_INT *val;
> @@ -944,6 +945,8 @@ struct wide_int_ref_storage : public wi:
> HOST_WIDE_INT scratch[2];
>
> public:
> + wide_int_ref_storage () {}
> +
> wide_int_ref_storage (const wi::storage_ref &);
>
> template <typename T>
So doesn't this play into the whole question about initialization of
these objects. So I'll defer on this hunk until we settle that
question, but the rest is OK.
Jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [005/nnn] poly_int: rtx constants
2017-11-17 4:17 ` Jeff Law
@ 2017-12-15 1:25 ` Richard Sandiford
2017-12-19 4:52 ` Jeff Law
0 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-12-15 1:25 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches
Jeff Law <law@redhat.com> writes:
> On 10/23/2017 11:00 AM, Richard Sandiford wrote:
>> This patch adds an rtl representation of poly_int values.
>> There were three possible ways of doing this:
>>
>> (1) Add a new rtl code for the poly_ints themselves and store the
>> coefficients as trailing wide_ints. This would give constants like:
>>
>> (const_poly_int [c0 c1 ... cn])
>>
>> The runtime value would be:
>>
>> c0 + c1 * x1 + ... + cn * xn
>>
>> (2) Like (1), but use rtxes for the coefficients. This would give
>> constants like:
>>
>> (const_poly_int [(const_int c0)
>> (const_int c1)
>> ...
>> (const_int cn)])
>>
>> although the coefficients could be const_wide_ints instead
>> of const_ints where appropriate.
>>
>> (3) Add a new rtl code for the polynomial indeterminates,
>> then use them in const wrappers. A constant like c0 + c1 * x1
>> would then look like:
>>
>> (const:M (plus:M (mult:M (const_param:M x1)
>> (const_int c1))
>> (const_int c0)))
>>
>> There didn't seem to be that much to choose between them. The main
>> advantage of (1) is that it's a more efficient representation and
>> that we can refer to the cofficients directly as wide_int_storage.
> Well, and #1 feels more like how we handle CONST_INT :-)
>>
>>
>> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>> Alan Hayward <alan.hayward@arm.com>
>> David Sherwood <david.sherwood@arm.com>
>>
>> gcc/
>> * doc/rtl.texi (const_poly_int): Document.
>> * gengenrtl.c (excluded_rtx): Return true for CONST_POLY_INT.
>> * rtl.h (const_poly_int_def): New struct.
>> (rtx_def::u): Add a cpi field.
>> (CASE_CONST_UNIQUE, CASE_CONST_ANY): Add CONST_POLY_INT.
>> (CONST_POLY_INT_P, CONST_POLY_INT_COEFFS): New macros.
>> (wi::rtx_to_poly_wide_ref): New typedef
>> (const_poly_int_value, wi::to_poly_wide, rtx_to_poly_int64)
>> (poly_int_rtx_p): New functions.
>> (trunc_int_for_mode): Declare a poly_int64 version.
>> (plus_constant): Take a poly_int64 instead of a HOST_WIDE_INT.
>> (immed_wide_int_const): Take a poly_wide_int_ref rather than
>> a wide_int_ref.
>> (strip_offset): Declare.
>> (strip_offset_and_add): New function.
>> * rtl.def (CONST_POLY_INT): New rtx code.
>> * rtl.c (rtx_size): Handle CONST_POLY_INT.
>> (shared_const_p): Use poly_int_rtx_p.
>> * emit-rtl.h (gen_int_mode): Take a poly_int64 instead of a
>> HOST_WIDE_INT.
>> (gen_int_shift_amount): Likewise.
>> * emit-rtl.c (const_poly_int_hasher): New class.
>> (const_poly_int_htab): New variable.
>> (init_emit_once): Initialize it when NUM_POLY_INT_COEFFS > 1.
>> (const_poly_int_hasher::hash): New function.
>> (const_poly_int_hasher::equal): Likewise.
>> (gen_int_mode): Take a poly_int64 instead of a HOST_WIDE_INT.
>> (immed_wide_int_const): Rename to...
>> (immed_wide_int_const_1): ...this and make static.
>> (immed_wide_int_const): New function, taking a poly_wide_int_ref
>> instead of a wide_int_ref.
>> (gen_int_shift_amount): Take a poly_int64 instead of a HOST_WIDE_INT.
>> (gen_lowpart_common): Handle CONST_POLY_INT.
>> * cse.c (hash_rtx_cb, equiv_constant): Likewise.
>> * cselib.c (cselib_hash_rtx): Likewise.
>> * dwarf2out.c (const_ok_for_output_1): Likewise.
>> * expr.c (convert_modes): Likewise.
>> * print-rtl.c (rtx_writer::print_rtx, print_value): Likewise.
>> * rtlhash.c (add_rtx): Likewise.
>> * explow.c (trunc_int_for_mode): Add a poly_int64 version.
>> (plus_constant): Take a poly_int64 instead of a HOST_WIDE_INT.
>> Handle existing CONST_POLY_INT rtxes.
>> * expmed.h (expand_shift): Take a poly_int64 instead of a
>> HOST_WIDE_INT.
>> * expmed.c (expand_shift): Likewise.
>> * rtlanal.c (strip_offset): New function.
>> (commutative_operand_precedence): Give CONST_POLY_INT the same
>> precedence as CONST_DOUBLE and put CONST_WIDE_INT between that
>> and CONST_INT.
>> * rtl-tests.c (const_poly_int_tests): New struct.
>> (rtl_tests_c_tests): Use it.
>> * simplify-rtx.c (simplify_const_unary_operation): Handle
>> CONST_POLY_INT.
>> (simplify_const_binary_operation): Likewise.
>> (simplify_binary_operation_1): Fold additions of symbolic constants
>> and CONST_POLY_INTs.
>> (simplify_subreg): Handle extensions and truncations of
>> CONST_POLY_INTs.
>> (simplify_const_poly_int_tests): New struct.
>> (simplify_rtx_c_tests): Use it.
>> * wide-int.h (storage_ref): Add default constructor.
>> (wide_int_ref_storage): Likewise.
>> (trailing_wide_ints): Use GTY((user)).
>> (trailing_wide_ints::operator[]): Add a const version.
>> (trailing_wide_ints::get_precision): New function.
>> (trailing_wide_ints::extra_size): Likewise.
> Do we need to define anything WRT structure sharing in rtl.texi for a
> CONST_POLY_INT?
Good catch. Fixed in the patch below.
>> Index: gcc/rtl.c
>> ===================================================================
>> --- gcc/rtl.c 2017-10-23 16:52:20.579835373 +0100
>> +++ gcc/rtl.c 2017-10-23 17:00:54.443002147 +0100
>> @@ -257,9 +261,10 @@ shared_const_p (const_rtx orig)
>>
>> /* CONST can be shared if it contains a SYMBOL_REF. If it contains
>> a LABEL_REF, it isn't sharable. */
>> + poly_int64 offset;
>> return (GET_CODE (XEXP (orig, 0)) == PLUS
>> && GET_CODE (XEXP (XEXP (orig, 0), 0)) == SYMBOL_REF
>> - && CONST_INT_P (XEXP (XEXP (orig, 0), 1)));
>> + && poly_int_rtx_p (XEXP (XEXP (orig, 0), 1), &offset));
> Did this just change structure sharing for CONST_WIDE_INT?
No, we'd only use CONST_WIDE_INT for things that don't fit in
poly_int64.
>> + /* Create a new rtx. There's a choice to be made here between installing
>> + the actual mode of the rtx or leaving it as VOIDmode (for consistency
>> + with CONST_INT). In practice the handling of the codes is different
>> + enough that we get no benefit from using VOIDmode, and various places
>> + assume that VOIDmode implies CONST_INT. Using the real mode seems like
>> + the right long-term direction anyway. */
> Certainly my preference is to get the mode in there. I see modeless
> CONST_INTs as a long standing wart and I'm not keen to repeat it.
Yeah. Still regularly hit problems related to modeless CONST_INTs
today (including the gen_int_shift_amount patch).
>> Index: gcc/wide-int.h
>> ===================================================================
>> --- gcc/wide-int.h 2017-10-23 17:00:20.923835582 +0100
>> +++ gcc/wide-int.h 2017-10-23 17:00:54.445999420 +0100
>> @@ -613,6 +613,7 @@ #define SHIFT_FUNCTION \
>> access. */
>> struct storage_ref
>> {
>> + storage_ref () {}
>> storage_ref (const HOST_WIDE_INT *, unsigned int, unsigned int);
>>
>> const HOST_WIDE_INT *val;
>> @@ -944,6 +945,8 @@ struct wide_int_ref_storage : public wi:
>> HOST_WIDE_INT scratch[2];
>>
>> public:
>> + wide_int_ref_storage () {}
>> +
>> wide_int_ref_storage (const wi::storage_ref &);
>>
>> template <typename T>
> So doesn't this play into the whole question about initialization of
> these objects. So I'll defer on this hunk until we settle that
> question, but the rest is OK.
Any more thoughts on this? In the end the 001 patch went in with
the empty constructors. Like I say, I'm happy to switch to C++-11
"= default;" once we require C++11, but I think having well-defined
implicit construction would make switching to "= default" harder
in future.
Thanks,
Richard
2017-11-15 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/rtl.texi (const_poly_int): Document. Also document the
rtl sharing behavior.
* gengenrtl.c (excluded_rtx): Return true for CONST_POLY_INT.
* rtl.h (const_poly_int_def): New struct.
(rtx_def::u): Add a cpi field.
(CASE_CONST_UNIQUE, CASE_CONST_ANY): Add CONST_POLY_INT.
(CONST_POLY_INT_P, CONST_POLY_INT_COEFFS): New macros.
(wi::rtx_to_poly_wide_ref): New typedef
(const_poly_int_value, wi::to_poly_wide, rtx_to_poly_int64)
(poly_int_rtx_p): New functions.
(trunc_int_for_mode): Declare a poly_int64 version.
(plus_constant): Take a poly_int64 instead of a HOST_WIDE_INT.
(immed_wide_int_const): Take a poly_wide_int_ref rather than
a wide_int_ref.
(strip_offset): Declare.
(strip_offset_and_add): New function.
* rtl.def (CONST_POLY_INT): New rtx code.
* rtl.c (rtx_size): Handle CONST_POLY_INT.
(shared_const_p): Use poly_int_rtx_p.
* emit-rtl.h (gen_int_mode): Take a poly_int64 instead of a
HOST_WIDE_INT.
(gen_int_shift_amount): Likewise.
* emit-rtl.c (const_poly_int_hasher): New class.
(const_poly_int_htab): New variable.
(init_emit_once): Initialize it when NUM_POLY_INT_COEFFS > 1.
(const_poly_int_hasher::hash): New function.
(const_poly_int_hasher::equal): Likewise.
(gen_int_mode): Take a poly_int64 instead of a HOST_WIDE_INT.
(immed_wide_int_const): Rename to...
(immed_wide_int_const_1): ...this and make static.
(immed_wide_int_const): New function, taking a poly_wide_int_ref
instead of a wide_int_ref.
(gen_int_shift_amount): Take a poly_int64 instead of a HOST_WIDE_INT.
(gen_lowpart_common): Handle CONST_POLY_INT.
* cse.c (hash_rtx_cb, equiv_constant): Likewise.
* cselib.c (cselib_hash_rtx): Likewise.
* dwarf2out.c (const_ok_for_output_1): Likewise.
* expr.c (convert_modes): Likewise.
* print-rtl.c (rtx_writer::print_rtx, print_value): Likewise.
* rtlhash.c (add_rtx): Likewise.
* explow.c (trunc_int_for_mode): Add a poly_int64 version.
(plus_constant): Take a poly_int64 instead of a HOST_WIDE_INT.
Handle existing CONST_POLY_INT rtxes.
* expmed.h (expand_shift): Take a poly_int64 instead of a
HOST_WIDE_INT.
* expmed.c (expand_shift): Likewise.
* rtlanal.c (strip_offset): New function.
(commutative_operand_precedence): Give CONST_POLY_INT the same
precedence as CONST_DOUBLE and put CONST_WIDE_INT between that
and CONST_INT.
* rtl-tests.c (const_poly_int_tests): New struct.
(rtl_tests_c_tests): Use it.
* simplify-rtx.c (simplify_const_unary_operation): Handle
CONST_POLY_INT.
(simplify_const_binary_operation): Likewise.
(simplify_binary_operation_1): Fold additions of symbolic constants
and CONST_POLY_INTs.
(simplify_subreg): Handle extensions and truncations of
CONST_POLY_INTs.
(simplify_const_poly_int_tests): New struct.
(simplify_rtx_c_tests): Use it.
* wide-int.h (storage_ref): Add default constructor.
(wide_int_ref_storage): Likewise.
(trailing_wide_ints): Use GTY((user)).
(trailing_wide_ints::operator[]): Add a const version.
(trailing_wide_ints::get_precision): New function.
(trailing_wide_ints::extra_size): Likewise.
Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi 2017-12-15 01:16:50.894351263 +0000
+++ gcc/doc/rtl.texi 2017-12-15 01:16:51.235339239 +0000
@@ -1633,6 +1633,15 @@ is accessed with the macro @code{CONST_F
data is accessed with @code{CONST_FIXED_VALUE_HIGH}; the low part is
accessed with @code{CONST_FIXED_VALUE_LOW}.
+@findex const_poly_int
+@item (const_poly_int:@var{m} [@var{c0} @var{c1} @dots{}])
+Represents a @code{poly_int}-style polynomial integer with coefficients
+@var{c0}, @var{c1}, @dots{}. The coefficients are @code{wide_int}-based
+integers rather than rtxes. @code{CONST_POLY_INT_COEFFS} gives the
+values of individual coefficients (which is mostly only useful in
+low-level routines) and @code{const_poly_int_value} gives the full
+@code{poly_int} value.
+
@findex const_vector
@item (const_vector:@var{m} [@var{x0} @var{x1} @dots{}])
Represents a vector constant. The square brackets stand for the vector
@@ -4236,6 +4245,11 @@ referring to it.
@item
All @code{const_int} expressions with equal values are shared.
+@cindex @code{const_poly_int}, RTL sharing
+@item
+All @code{const_poly_int} expressions with equal modes and values
+are shared.
+
@cindex @code{pc}, RTL sharing
@item
There is only one @code{pc} expression.
Index: gcc/gengenrtl.c
===================================================================
--- gcc/gengenrtl.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/gengenrtl.c 2017-12-15 01:16:51.240339063 +0000
@@ -157,6 +157,7 @@ excluded_rtx (int idx)
return (strcmp (defs[idx].enumname, "VAR_LOCATION") == 0
|| strcmp (defs[idx].enumname, "CONST_DOUBLE") == 0
|| strcmp (defs[idx].enumname, "CONST_WIDE_INT") == 0
+ || strcmp (defs[idx].enumname, "CONST_POLY_INT") == 0
|| strcmp (defs[idx].enumname, "CONST_FIXED") == 0);
}
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h 2017-12-15 01:16:50.894351263 +0000
+++ gcc/rtl.h 2017-12-15 01:16:51.241339028 +0000
@@ -280,6 +280,10 @@ #define CWI_GET_NUM_ELEM(RTX) \
#define CWI_PUT_NUM_ELEM(RTX, NUM) \
(RTL_FLAG_CHECK1("CWI_PUT_NUM_ELEM", (RTX), CONST_WIDE_INT)->u2.num_elem = (NUM))
+struct GTY((variable_size)) const_poly_int_def {
+ trailing_wide_ints<NUM_POLY_INT_COEFFS> coeffs;
+};
+
/* RTL expression ("rtx"). */
/* The GTY "desc" and "tag" options below are a kludge: we need a desc
@@ -424,6 +428,7 @@ struct GTY((desc("0"), tag("0"),
struct real_value rv;
struct fixed_value fv;
struct hwivec_def hwiv;
+ struct const_poly_int_def cpi;
} GTY ((special ("rtx_def"), desc ("GET_CODE (&%0)"))) u;
};
@@ -734,6 +739,7 @@ #define CASE_CONST_SCALAR_INT \
#define CASE_CONST_UNIQUE \
case CONST_INT: \
case CONST_WIDE_INT: \
+ case CONST_POLY_INT: \
case CONST_DOUBLE: \
case CONST_FIXED
@@ -741,6 +747,7 @@ #define CASE_CONST_UNIQUE \
#define CASE_CONST_ANY \
case CONST_INT: \
case CONST_WIDE_INT: \
+ case CONST_POLY_INT: \
case CONST_DOUBLE: \
case CONST_FIXED: \
case CONST_VECTOR
@@ -773,6 +780,11 @@ #define CONST_INT_P(X) (GET_CODE (X) ==
/* Predicate yielding nonzero iff X is an rtx for a constant integer. */
#define CONST_WIDE_INT_P(X) (GET_CODE (X) == CONST_WIDE_INT)
+/* Predicate yielding nonzero iff X is an rtx for a polynomial constant
+ integer. */
+#define CONST_POLY_INT_P(X) \
+ (NUM_POLY_INT_COEFFS > 1 && GET_CODE (X) == CONST_POLY_INT)
+
/* Predicate yielding nonzero iff X is an rtx for a constant fixed-point. */
#define CONST_FIXED_P(X) (GET_CODE (X) == CONST_FIXED)
@@ -1914,6 +1926,12 @@ #define CONST_WIDE_INT_VEC(RTX) HWIVEC_C
#define CONST_WIDE_INT_NUNITS(RTX) CWI_GET_NUM_ELEM (RTX)
#define CONST_WIDE_INT_ELT(RTX, N) CWI_ELT (RTX, N)
+/* For a CONST_POLY_INT, CONST_POLY_INT_COEFFS gives access to the
+ individual coefficients, in the form of a trailing_wide_ints structure. */
+#define CONST_POLY_INT_COEFFS(RTX) \
+ (RTL_FLAG_CHECK1("CONST_POLY_INT_COEFFS", (RTX), \
+ CONST_POLY_INT)->u.cpi.coeffs)
+
/* For a CONST_DOUBLE:
#if TARGET_SUPPORTS_WIDE_INT == 0
For a VOIDmode, there are two integers CONST_DOUBLE_LOW is the
@@ -2227,6 +2245,84 @@ wi::max_value (machine_mode mode, signop
return max_value (GET_MODE_PRECISION (as_a <scalar_mode> (mode)), sgn);
}
+namespace wi
+{
+ typedef poly_int<NUM_POLY_INT_COEFFS,
+ generic_wide_int <wide_int_ref_storage <false, false> > >
+ rtx_to_poly_wide_ref;
+ rtx_to_poly_wide_ref to_poly_wide (const_rtx, machine_mode);
+}
+
+/* Return the value of a CONST_POLY_INT in its native precision. */
+
+inline wi::rtx_to_poly_wide_ref
+const_poly_int_value (const_rtx x)
+{
+ poly_int<NUM_POLY_INT_COEFFS, WIDE_INT_REF_FOR (wide_int)> res;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ res.coeffs[i] = CONST_POLY_INT_COEFFS (x)[i];
+ return res;
+}
+
+/* Return true if X is a scalar integer or a CONST_POLY_INT. The value
+ can then be extracted using wi::to_poly_wide. */
+
+inline bool
+poly_int_rtx_p (const_rtx x)
+{
+ return CONST_SCALAR_INT_P (x) || CONST_POLY_INT_P (x);
+}
+
+/* Access X (which satisfies poly_int_rtx_p) as a poly_wide_int.
+ MODE is the mode of X. */
+
+inline wi::rtx_to_poly_wide_ref
+wi::to_poly_wide (const_rtx x, machine_mode mode)
+{
+ if (CONST_POLY_INT_P (x))
+ return const_poly_int_value (x);
+ return rtx_mode_t (const_cast<rtx> (x), mode);
+}
+
+/* Return the value of X as a poly_int64. */
+
+inline poly_int64
+rtx_to_poly_int64 (const_rtx x)
+{
+ if (CONST_POLY_INT_P (x))
+ {
+ poly_int64 res;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ res.coeffs[i] = CONST_POLY_INT_COEFFS (x)[i].to_shwi ();
+ return res;
+ }
+ return INTVAL (x);
+}
+
+/* Return true if arbitrary value X is an integer constant that can
+ be represented as a poly_int64. Store the value in *RES if so,
+ otherwise leave it unmodified. */
+
+inline bool
+poly_int_rtx_p (const_rtx x, poly_int64_pod *res)
+{
+ if (CONST_INT_P (x))
+ {
+ *res = INTVAL (x);
+ return true;
+ }
+ if (CONST_POLY_INT_P (x))
+ {
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ if (!wi::fits_shwi_p (CONST_POLY_INT_COEFFS (x)[i]))
+ return false;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ res->coeffs[i] = CONST_POLY_INT_COEFFS (x)[i].to_shwi ();
+ return true;
+ }
+ return false;
+}
+
extern void init_rtlanal (void);
extern int rtx_cost (rtx, machine_mode, enum rtx_code, int, bool);
extern int address_cost (rtx, machine_mode, addr_space_t, bool);
@@ -2764,7 +2860,8 @@ #define EXTRACT_ARGS_IN_RANGE(SIZE, POS,
/* In explow.c */
extern HOST_WIDE_INT trunc_int_for_mode (HOST_WIDE_INT, machine_mode);
-extern rtx plus_constant (machine_mode, rtx, HOST_WIDE_INT, bool = false);
+extern poly_int64 trunc_int_for_mode (poly_int64, machine_mode);
+extern rtx plus_constant (machine_mode, rtx, poly_int64, bool = false);
extern HOST_WIDE_INT get_stack_check_protect (void);
/* In rtl.c */
@@ -3075,13 +3172,11 @@ extern void end_sequence (void);
extern double_int rtx_to_double_int (const_rtx);
#endif
extern void cwi_output_hex (FILE *, const_rtx);
-#ifndef GENERATOR_FILE
-extern rtx immed_wide_int_const (const wide_int_ref &, machine_mode);
-#endif
#if TARGET_SUPPORTS_WIDE_INT == 0
extern rtx immed_double_const (HOST_WIDE_INT, HOST_WIDE_INT,
machine_mode);
#endif
+extern rtx immed_wide_int_const (const poly_wide_int_ref &, machine_mode);
/* In varasm.c */
extern rtx force_const_mem (machine_mode, rtx);
@@ -3269,6 +3364,7 @@ extern HOST_WIDE_INT get_integer_term (c
extern rtx get_related_value (const_rtx);
extern bool offset_within_block_p (const_rtx, HOST_WIDE_INT);
extern void split_const (rtx, rtx *, rtx *);
+extern rtx strip_offset (rtx, poly_int64_pod *);
extern bool unsigned_reg_p (rtx);
extern int reg_mentioned_p (const_rtx, const_rtx);
extern int count_occurrences (const_rtx, const_rtx, int);
@@ -4203,6 +4299,21 @@ load_extend_op (machine_mode mode)
return UNKNOWN;
}
+/* If X is a PLUS of a base and a constant offset, add the constant to *OFFSET
+ and return the base. Return X otherwise. */
+
+inline rtx
+strip_offset_and_add (rtx x, poly_int64_pod *offset)
+{
+ if (GET_CODE (x) == PLUS)
+ {
+ poly_int64 suboffset;
+ x = strip_offset (x, &suboffset);
+ *offset += suboffset;
+ }
+ return x;
+}
+
/* gtype-desc.c. */
extern void gt_ggc_mx (rtx &);
extern void gt_pch_nx (rtx &);
Index: gcc/rtl.def
===================================================================
--- gcc/rtl.def 2017-12-15 01:16:50.894351263 +0000
+++ gcc/rtl.def 2017-12-15 01:16:51.240339063 +0000
@@ -348,6 +348,9 @@ DEF_RTL_EXPR(CONST_INT, "const_int", "w"
/* numeric integer constant */
DEF_RTL_EXPR(CONST_WIDE_INT, "const_wide_int", "", RTX_CONST_OBJ)
+/* An rtx representation of a poly_wide_int. */
+DEF_RTL_EXPR(CONST_POLY_INT, "const_poly_int", "", RTX_CONST_OBJ)
+
/* fixed-point constant */
DEF_RTL_EXPR(CONST_FIXED, "const_fixed", "www", RTX_CONST_OBJ)
Index: gcc/rtl.c
===================================================================
--- gcc/rtl.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/rtl.c 2017-12-15 01:16:51.240339063 +0000
@@ -189,6 +189,10 @@ rtx_size (const_rtx x)
+ sizeof (struct hwivec_def)
+ ((CONST_WIDE_INT_NUNITS (x) - 1)
* sizeof (HOST_WIDE_INT)));
+ if (CONST_POLY_INT_P (x))
+ return (RTX_HDR_SIZE
+ + sizeof (struct const_poly_int_def)
+ + CONST_POLY_INT_COEFFS (x).extra_size ());
if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_HAS_BLOCK_INFO_P (x))
return RTX_HDR_SIZE + sizeof (struct block_symbol);
return RTX_CODE_SIZE (GET_CODE (x));
@@ -257,9 +261,10 @@ shared_const_p (const_rtx orig)
/* CONST can be shared if it contains a SYMBOL_REF. If it contains
a LABEL_REF, it isn't sharable. */
+ poly_int64 offset;
return (GET_CODE (XEXP (orig, 0)) == PLUS
&& GET_CODE (XEXP (XEXP (orig, 0), 0)) == SYMBOL_REF
- && CONST_INT_P (XEXP (XEXP (orig, 0), 1)));
+ && poly_int_rtx_p (XEXP (XEXP (orig, 0), 1), &offset));
}
Index: gcc/emit-rtl.h
===================================================================
--- gcc/emit-rtl.h 2017-12-15 01:16:50.894351263 +0000
+++ gcc/emit-rtl.h 2017-12-15 01:16:51.238339134 +0000
@@ -362,14 +362,14 @@ extern rtvec gen_rtvec (int, ...);
extern rtx copy_insn_1 (rtx);
extern rtx copy_insn (rtx);
extern rtx_insn *copy_delay_slot_insn (rtx_insn *);
-extern rtx gen_int_mode (HOST_WIDE_INT, machine_mode);
+extern rtx gen_int_mode (poly_int64, machine_mode);
extern rtx_insn *emit_copy_of_insn_after (rtx_insn *, rtx_insn *);
extern void set_reg_attrs_from_value (rtx, rtx);
extern void set_reg_attrs_for_parm (rtx, rtx);
extern void set_reg_attrs_for_decl_rtl (tree t, rtx x);
extern void adjust_reg_mode (rtx, machine_mode);
extern int mem_expr_equal_p (const_tree, const_tree);
-extern rtx gen_int_shift_amount (machine_mode, HOST_WIDE_INT);
+extern rtx gen_int_shift_amount (machine_mode, poly_int64);
extern bool need_atomic_barrier_p (enum memmodel, bool);
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/emit-rtl.c 2017-12-15 01:16:51.238339134 +0000
@@ -148,6 +148,16 @@ struct const_wide_int_hasher : ggc_cache
static GTY ((cache)) hash_table<const_wide_int_hasher> *const_wide_int_htab;
+struct const_poly_int_hasher : ggc_cache_ptr_hash<rtx_def>
+{
+ typedef std::pair<machine_mode, poly_wide_int_ref> compare_type;
+
+ static hashval_t hash (rtx x);
+ static bool equal (rtx x, const compare_type &y);
+};
+
+static GTY ((cache)) hash_table<const_poly_int_hasher> *const_poly_int_htab;
+
/* A hash table storing register attribute structures. */
struct reg_attr_hasher : ggc_cache_ptr_hash<reg_attrs>
{
@@ -257,6 +267,31 @@ const_wide_int_hasher::equal (rtx x, rtx
}
#endif
+/* Returns a hash code for CONST_POLY_INT X. */
+
+hashval_t
+const_poly_int_hasher::hash (rtx x)
+{
+ inchash::hash h;
+ h.add_int (GET_MODE (x));
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ h.add_wide_int (CONST_POLY_INT_COEFFS (x)[i]);
+ return h.end ();
+}
+
+/* Returns nonzero if CONST_POLY_INT X is an rtx representation of Y. */
+
+bool
+const_poly_int_hasher::equal (rtx x, const compare_type &y)
+{
+ if (GET_MODE (x) != y.first)
+ return false;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ if (CONST_POLY_INT_COEFFS (x)[i] != y.second.coeffs[i])
+ return false;
+ return true;
+}
+
/* Returns a hash code for X (which is really a CONST_DOUBLE). */
hashval_t
const_double_hasher::hash (rtx x)
@@ -520,9 +555,13 @@ gen_rtx_CONST_INT (machine_mode mode ATT
}
rtx
-gen_int_mode (HOST_WIDE_INT c, machine_mode mode)
+gen_int_mode (poly_int64 c, machine_mode mode)
{
- return GEN_INT (trunc_int_for_mode (c, mode));
+ c = trunc_int_for_mode (c, mode);
+ if (c.is_constant ())
+ return GEN_INT (c.coeffs[0]);
+ unsigned int prec = GET_MODE_PRECISION (as_a <scalar_mode> (mode));
+ return immed_wide_int_const (poly_wide_int::from (c, prec, SIGNED), mode);
}
/* CONST_DOUBLEs might be created from pairs of integers, or from
@@ -626,8 +665,8 @@ lookup_const_wide_int (rtx wint)
a CONST_DOUBLE (if !TARGET_SUPPORTS_WIDE_INT) or a CONST_WIDE_INT
(if TARGET_SUPPORTS_WIDE_INT). */
-rtx
-immed_wide_int_const (const wide_int_ref &v, machine_mode mode)
+static rtx
+immed_wide_int_const_1 (const wide_int_ref &v, machine_mode mode)
{
unsigned int len = v.get_len ();
/* Not scalar_int_mode because we also allow pointer bound modes. */
@@ -714,6 +753,53 @@ immed_double_const (HOST_WIDE_INT i0, HO
}
#endif
+/* Return an rtx representation of C in mode MODE. */
+
+rtx
+immed_wide_int_const (const poly_wide_int_ref &c, machine_mode mode)
+{
+ if (c.is_constant ())
+ return immed_wide_int_const_1 (c.coeffs[0], mode);
+
+ /* Not scalar_int_mode because we also allow pointer bound modes. */
+ unsigned int prec = GET_MODE_PRECISION (as_a <scalar_mode> (mode));
+
+ /* Allow truncation but not extension since we do not know if the
+ number is signed or unsigned. */
+ gcc_assert (prec <= c.coeffs[0].get_precision ());
+ poly_wide_int newc = poly_wide_int::from (c, prec, SIGNED);
+
+ /* See whether we already have an rtx for this constant. */
+ inchash::hash h;
+ h.add_int (mode);
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ h.add_wide_int (newc.coeffs[i]);
+ const_poly_int_hasher::compare_type typed_value (mode, newc);
+ rtx *slot = const_poly_int_htab->find_slot_with_hash (typed_value,
+ h.end (), INSERT);
+ rtx x = *slot;
+ if (x)
+ return x;
+
+ /* Create a new rtx. There's a choice to be made here between installing
+ the actual mode of the rtx or leaving it as VOIDmode (for consistency
+ with CONST_INT). In practice the handling of the codes is different
+ enough that we get no benefit from using VOIDmode, and various places
+ assume that VOIDmode implies CONST_INT. Using the real mode seems like
+ the right long-term direction anyway. */
+ typedef trailing_wide_ints<NUM_POLY_INT_COEFFS> twi;
+ size_t extra_size = twi::extra_size (prec);
+ x = rtx_alloc_v (CONST_POLY_INT,
+ sizeof (struct const_poly_int_def) + extra_size);
+ PUT_MODE (x, mode);
+ CONST_POLY_INT_COEFFS (x).set_precision (prec);
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ CONST_POLY_INT_COEFFS (x)[i] = newc.coeffs[i];
+
+ *slot = x;
+ return x;
+}
+
rtx
gen_rtx_REG (machine_mode mode, unsigned int regno)
{
@@ -1517,7 +1603,8 @@ gen_lowpart_common (machine_mode mode, r
}
else if (GET_CODE (x) == SUBREG || REG_P (x)
|| GET_CODE (x) == CONCAT || const_vec_p (x)
- || CONST_DOUBLE_AS_FLOAT_P (x) || CONST_SCALAR_INT_P (x))
+ || CONST_DOUBLE_AS_FLOAT_P (x) || CONST_SCALAR_INT_P (x)
+ || CONST_POLY_INT_P (x))
return lowpart_subreg (mode, x, innermode);
/* Otherwise, we can't do this. */
@@ -6124,6 +6211,9 @@ init_emit_once (void)
#endif
const_double_htab = hash_table<const_double_hasher>::create_ggc (37);
+ if (NUM_POLY_INT_COEFFS > 1)
+ const_poly_int_htab = hash_table<const_poly_int_hasher>::create_ggc (37);
+
const_fixed_htab = hash_table<const_fixed_hasher>::create_ggc (37);
reg_attrs_htab = hash_table<reg_attr_hasher>::create_ggc (37);
@@ -6517,7 +6607,7 @@ need_atomic_barrier_p (enum memmodel mod
by VALUE bits. */
rtx
-gen_int_shift_amount (machine_mode mode, HOST_WIDE_INT value)
+gen_int_shift_amount (machine_mode mode, poly_int64 value)
{
/* ??? Using the inner mode should be wide enough for all useful
cases (e.g. QImode usually has 8 shiftable bits, while a QImode
Index: gcc/cse.c
===================================================================
--- gcc/cse.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/cse.c 2017-12-15 01:16:51.234339275 +0000
@@ -2323,6 +2323,15 @@ hash_rtx_cb (const_rtx x, machine_mode m
hash += CONST_WIDE_INT_ELT (x, i);
return hash;
+ case CONST_POLY_INT:
+ {
+ inchash::hash h;
+ h.add_int (hash);
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ h.add_wide_int (CONST_POLY_INT_COEFFS (x)[i]);
+ return h.end ();
+ }
+
case CONST_DOUBLE:
/* This is like the general case, except that it only counts
the integers representing the constant. */
@@ -3781,6 +3790,8 @@ equiv_constant (rtx x)
/* See if we previously assigned a constant value to this SUBREG. */
if ((new_rtx = lookup_as_function (x, CONST_INT)) != 0
|| (new_rtx = lookup_as_function (x, CONST_WIDE_INT)) != 0
+ || (NUM_POLY_INT_COEFFS > 1
+ && (new_rtx = lookup_as_function (x, CONST_POLY_INT)) != 0)
|| (new_rtx = lookup_as_function (x, CONST_DOUBLE)) != 0
|| (new_rtx = lookup_as_function (x, CONST_FIXED)) != 0)
return new_rtx;
Index: gcc/cselib.c
===================================================================
--- gcc/cselib.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/cselib.c 2017-12-15 01:16:51.234339275 +0000
@@ -1128,6 +1128,15 @@ cselib_hash_rtx (rtx x, int create, mach
hash += CONST_WIDE_INT_ELT (x, i);
return hash;
+ case CONST_POLY_INT:
+ {
+ inchash::hash h;
+ h.add_int (hash);
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ h.add_wide_int (CONST_POLY_INT_COEFFS (x)[i]);
+ return h.end ();
+ }
+
case CONST_DOUBLE:
/* This is like the general case, except that it only counts
the integers representing the constant. */
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/dwarf2out.c 2017-12-15 01:16:51.237339169 +0000
@@ -13781,6 +13781,16 @@ const_ok_for_output_1 (rtx rtl)
return false;
}
+ if (CONST_POLY_INT_P (rtl))
+ return false;
+
+ if (targetm.const_not_ok_for_debug_p (rtl))
+ {
+ expansion_failed (NULL_TREE, rtl,
+ "Expression rejected for debug by the backend.\n");
+ return false;
+ }
+
/* FIXME: Refer to PR60655. It is possible for simplification
of rtl expressions in var tracking to produce such expressions.
We should really identify / validate expressions
Index: gcc/expr.c
===================================================================
--- gcc/expr.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/expr.c 2017-12-15 01:16:51.239339098 +0000
@@ -692,6 +692,7 @@ convert_modes (machine_mode mode, machin
&& is_int_mode (oldmode, &int_oldmode)
&& GET_MODE_PRECISION (int_mode) <= GET_MODE_PRECISION (int_oldmode)
&& ((MEM_P (x) && !MEM_VOLATILE_P (x) && direct_load[(int) int_mode])
+ || CONST_POLY_INT_P (x)
|| (REG_P (x)
&& (!HARD_REGISTER_P (x)
|| targetm.hard_regno_mode_ok (REGNO (x), int_mode))
Index: gcc/print-rtl.c
===================================================================
--- gcc/print-rtl.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/print-rtl.c 2017-12-15 01:16:51.240339063 +0000
@@ -908,6 +908,17 @@ rtx_writer::print_rtx (const_rtx in_rtx)
fprintf (m_outfile, " ");
cwi_output_hex (m_outfile, in_rtx);
break;
+
+ case CONST_POLY_INT:
+ fprintf (m_outfile, " [");
+ print_dec (CONST_POLY_INT_COEFFS (in_rtx)[0], m_outfile, SIGNED);
+ for (unsigned int i = 1; i < NUM_POLY_INT_COEFFS; ++i)
+ {
+ fprintf (m_outfile, ", ");
+ print_dec (CONST_POLY_INT_COEFFS (in_rtx)[i], m_outfile, SIGNED);
+ }
+ fprintf (m_outfile, "]");
+ break;
#endif
case CODE_LABEL:
@@ -1595,6 +1606,17 @@ print_value (pretty_printer *pp, const_r
}
break;
+ case CONST_POLY_INT:
+ pp_left_bracket (pp);
+ pp_wide_int (pp, CONST_POLY_INT_COEFFS (x)[0], SIGNED);
+ for (unsigned int i = 1; i < NUM_POLY_INT_COEFFS; ++i)
+ {
+ pp_string (pp, ", ");
+ pp_wide_int (pp, CONST_POLY_INT_COEFFS (x)[i], SIGNED);
+ }
+ pp_right_bracket (pp);
+ break;
+
case CONST_DOUBLE:
if (FLOAT_MODE_P (GET_MODE (x)))
{
Index: gcc/rtlhash.c
===================================================================
--- gcc/rtlhash.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/rtlhash.c 2017-12-15 01:16:51.241339028 +0000
@@ -55,6 +55,10 @@ add_rtx (const_rtx x, hash &hstate)
for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
hstate.add_object (CONST_WIDE_INT_ELT (x, i));
return;
+ case CONST_POLY_INT:
+ for (i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ hstate.add_wide_int (CONST_POLY_INT_COEFFS (x)[i]);
+ break;
case SYMBOL_REF:
if (XSTR (x, 0))
hstate.add (XSTR (x, 0), strlen (XSTR (x, 0)) + 1);
Index: gcc/explow.c
===================================================================
--- gcc/explow.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/explow.c 2017-12-15 01:16:51.238339134 +0000
@@ -77,13 +77,23 @@ trunc_int_for_mode (HOST_WIDE_INT c, mac
return c;
}
+/* Likewise for polynomial values, using the sign-extended representation
+ for each individual coefficient. */
+
+poly_int64
+trunc_int_for_mode (poly_int64 x, machine_mode mode)
+{
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ x.coeffs[i] = trunc_int_for_mode (x.coeffs[i], mode);
+ return x;
+}
+
/* Return an rtx for the sum of X and the integer C, given that X has
mode MODE. INPLACE is true if X can be modified inplace or false
if it must be treated as immutable. */
rtx
-plus_constant (machine_mode mode, rtx x, HOST_WIDE_INT c,
- bool inplace)
+plus_constant (machine_mode mode, rtx x, poly_int64 c, bool inplace)
{
RTX_CODE code;
rtx y;
@@ -92,7 +102,7 @@ plus_constant (machine_mode mode, rtx x,
gcc_assert (GET_MODE (x) == VOIDmode || GET_MODE (x) == mode);
- if (c == 0)
+ if (known_eq (c, 0))
return x;
restart:
@@ -180,10 +190,12 @@ plus_constant (machine_mode mode, rtx x,
break;
default:
+ if (CONST_POLY_INT_P (x))
+ return immed_wide_int_const (const_poly_int_value (x) + c, mode);
break;
}
- if (c != 0)
+ if (maybe_ne (c, 0))
x = gen_rtx_PLUS (mode, x, gen_int_mode (c, mode));
if (GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF)
Index: gcc/expmed.h
===================================================================
--- gcc/expmed.h 2017-12-15 01:16:50.894351263 +0000
+++ gcc/expmed.h 2017-12-15 01:16:51.239339098 +0000
@@ -712,8 +712,8 @@ extern unsigned HOST_WIDE_INT choose_mul
#ifdef TREE_CODE
extern rtx expand_variable_shift (enum tree_code, machine_mode,
rtx, tree, rtx, int);
-extern rtx expand_shift (enum tree_code, machine_mode, rtx, int, rtx,
- int);
+extern rtx expand_shift (enum tree_code, machine_mode, rtx, poly_int64, rtx,
+ int);
extern rtx expand_divmod (int, enum tree_code, machine_mode, rtx, rtx,
rtx, int);
#endif
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/expmed.c 2017-12-15 01:16:51.239339098 +0000
@@ -2541,7 +2541,7 @@ expand_shift_1 (enum tree_code code, mac
rtx
expand_shift (enum tree_code code, machine_mode mode, rtx shifted,
- int amount, rtx target, int unsignedp)
+ poly_int64 amount, rtx target, int unsignedp)
{
return expand_shift_1 (code, mode, shifted,
gen_int_shift_amount (mode, amount),
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/rtlanal.c 2017-12-15 01:16:51.241339028 +0000
@@ -915,6 +915,28 @@ split_const (rtx x, rtx *base_out, rtx *
*base_out = x;
*offset_out = const0_rtx;
}
+
+/* Express integer value X as some value Y plus a polynomial offset,
+ where Y is either const0_rtx, X or something within X (as opposed
+ to a new rtx). Return the Y and store the offset in *OFFSET_OUT. */
+
+rtx
+strip_offset (rtx x, poly_int64_pod *offset_out)
+{
+ rtx base = const0_rtx;
+ rtx test = x;
+ if (GET_CODE (test) == CONST)
+ test = XEXP (test, 0);
+ if (GET_CODE (test) == PLUS)
+ {
+ base = XEXP (test, 0);
+ test = XEXP (test, 1);
+ }
+ if (poly_int_rtx_p (test, offset_out))
+ return base;
+ *offset_out = 0;
+ return x;
+}
\f
/* Return the number of places FIND appears within X. If COUNT_DEST is
zero, we do not count occurrences inside the destination of a SET. */
@@ -3406,13 +3428,15 @@ commutative_operand_precedence (rtx op)
/* Constants always become the second operand. Prefer "nice" constants. */
if (code == CONST_INT)
- return -8;
+ return -10;
if (code == CONST_WIDE_INT)
- return -7;
+ return -9;
+ if (code == CONST_POLY_INT)
+ return -8;
if (code == CONST_DOUBLE)
- return -7;
+ return -8;
if (code == CONST_FIXED)
- return -7;
+ return -8;
op = avoid_constant_pool_reference (op);
code = GET_CODE (op);
@@ -3420,13 +3444,15 @@ commutative_operand_precedence (rtx op)
{
case RTX_CONST_OBJ:
if (code == CONST_INT)
- return -6;
+ return -7;
if (code == CONST_WIDE_INT)
- return -6;
+ return -6;
+ if (code == CONST_POLY_INT)
+ return -5;
if (code == CONST_DOUBLE)
- return -5;
+ return -5;
if (code == CONST_FIXED)
- return -5;
+ return -5;
return -4;
case RTX_EXTRA:
Index: gcc/rtl-tests.c
===================================================================
--- gcc/rtl-tests.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/rtl-tests.c 2017-12-15 01:16:51.240339063 +0000
@@ -228,6 +228,62 @@ test_uncond_jump ()
jump_insn);
}
+template<unsigned int N>
+struct const_poly_int_tests
+{
+ static void run ();
+};
+
+template<>
+struct const_poly_int_tests<1>
+{
+ static void run () {}
+};
+
+/* Test various CONST_POLY_INT properties. */
+
+template<unsigned int N>
+void
+const_poly_int_tests<N>::run ()
+{
+ rtx x1 = gen_int_mode (poly_int64 (1, 1), QImode);
+ rtx x255 = gen_int_mode (poly_int64 (1, 255), QImode);
+
+ /* Test that constants are unique. */
+ ASSERT_EQ (x1, gen_int_mode (poly_int64 (1, 1), QImode));
+ ASSERT_NE (x1, gen_int_mode (poly_int64 (1, 1), HImode));
+ ASSERT_NE (x1, x255);
+
+ /* Test const_poly_int_value. */
+ ASSERT_KNOWN_EQ (const_poly_int_value (x1), poly_int64 (1, 1));
+ ASSERT_KNOWN_EQ (const_poly_int_value (x255), poly_int64 (1, -1));
+
+ /* Test rtx_to_poly_int64. */
+ ASSERT_KNOWN_EQ (rtx_to_poly_int64 (x1), poly_int64 (1, 1));
+ ASSERT_KNOWN_EQ (rtx_to_poly_int64 (x255), poly_int64 (1, -1));
+ ASSERT_MAYBE_NE (rtx_to_poly_int64 (x255), poly_int64 (1, 255));
+
+ /* Test plus_constant of a symbol. */
+ rtx symbol = gen_rtx_SYMBOL_REF (Pmode, "foo");
+ rtx offset1 = gen_int_mode (poly_int64 (9, 11), Pmode);
+ rtx sum1 = gen_rtx_CONST (Pmode, gen_rtx_PLUS (Pmode, symbol, offset1));
+ ASSERT_RTX_EQ (plus_constant (Pmode, symbol, poly_int64 (9, 11)), sum1);
+
+ /* Test plus_constant of a CONST. */
+ rtx offset2 = gen_int_mode (poly_int64 (12, 20), Pmode);
+ rtx sum2 = gen_rtx_CONST (Pmode, gen_rtx_PLUS (Pmode, symbol, offset2));
+ ASSERT_RTX_EQ (plus_constant (Pmode, sum1, poly_int64 (3, 9)), sum2);
+
+ /* Test a cancelling plus_constant. */
+ ASSERT_EQ (plus_constant (Pmode, sum2, poly_int64 (-12, -20)), symbol);
+
+ /* Test plus_constant on integer constants. */
+ ASSERT_EQ (plus_constant (QImode, const1_rtx, poly_int64 (4, -2)),
+ gen_int_mode (poly_int64 (5, -2), QImode));
+ ASSERT_EQ (plus_constant (QImode, x1, poly_int64 (4, -2)),
+ gen_int_mode (poly_int64 (5, -1), QImode));
+}
+
/* Run all of the selftests within this file. */
void
@@ -238,6 +294,7 @@ rtl_tests_c_tests ()
test_dumping_rtx_reuse ();
test_single_set ();
test_uncond_jump ();
+ const_poly_int_tests<NUM_POLY_INT_COEFFS>::run ();
/* Purge state. */
set_first_insn (NULL);
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c 2017-12-15 01:16:50.894351263 +0000
+++ gcc/simplify-rtx.c 2017-12-15 01:16:51.242338992 +0000
@@ -2038,6 +2038,26 @@ simplify_const_unary_operation (enum rtx
}
}
+ /* Handle polynomial integers. */
+ else if (CONST_POLY_INT_P (op))
+ {
+ poly_wide_int result;
+ switch (code)
+ {
+ case NEG:
+ result = -const_poly_int_value (op);
+ break;
+
+ case NOT:
+ result = ~const_poly_int_value (op);
+ break;
+
+ default:
+ return NULL_RTX;
+ }
+ return immed_wide_int_const (result, mode);
+ }
+
return NULL_RTX;
}
\f
@@ -2218,6 +2238,7 @@ simplify_binary_operation_1 (enum rtx_co
rtx tem, reversed, opleft, opright, elt0, elt1;
HOST_WIDE_INT val;
scalar_int_mode int_mode, inner_mode;
+ poly_int64 offset;
/* Even if we can't compute a constant result,
there are some cases worth simplifying. */
@@ -2530,6 +2551,12 @@ simplify_binary_operation_1 (enum rtx_co
return simplify_gen_binary (MINUS, mode, tem, XEXP (op0, 0));
}
+ if ((GET_CODE (op0) == CONST
+ || GET_CODE (op0) == SYMBOL_REF
+ || GET_CODE (op0) == LABEL_REF)
+ && poly_int_rtx_p (op1, &offset))
+ return plus_constant (mode, op0, trunc_int_for_mode (-offset, mode));
+
/* Don't let a relocatable value get a negative coeff. */
if (CONST_INT_P (op1) && GET_MODE (op0) != VOIDmode)
return simplify_gen_binary (PLUS, mode,
@@ -4327,6 +4354,57 @@ simplify_const_binary_operation (enum rt
return immed_wide_int_const (result, int_mode);
}
+ /* Handle polynomial integers. */
+ if (NUM_POLY_INT_COEFFS > 1
+ && is_a <scalar_int_mode> (mode, &int_mode)
+ && poly_int_rtx_p (op0)
+ && poly_int_rtx_p (op1))
+ {
+ poly_wide_int result;
+ switch (code)
+ {
+ case PLUS:
+ result = wi::to_poly_wide (op0, mode) + wi::to_poly_wide (op1, mode);
+ break;
+
+ case MINUS:
+ result = wi::to_poly_wide (op0, mode) - wi::to_poly_wide (op1, mode);
+ break;
+
+ case MULT:
+ if (CONST_SCALAR_INT_P (op1))
+ result = wi::to_poly_wide (op0, mode) * rtx_mode_t (op1, mode);
+ else
+ return NULL_RTX;
+ break;
+
+ case ASHIFT:
+ if (CONST_SCALAR_INT_P (op1))
+ {
+ wide_int shift = rtx_mode_t (op1, mode);
+ if (SHIFT_COUNT_TRUNCATED)
+ shift = wi::umod_trunc (shift, GET_MODE_PRECISION (int_mode));
+ else if (wi::geu_p (shift, GET_MODE_PRECISION (int_mode)))
+ return NULL_RTX;
+ result = wi::to_poly_wide (op0, mode) << shift;
+ }
+ else
+ return NULL_RTX;
+ break;
+
+ case IOR:
+ if (!CONST_SCALAR_INT_P (op1)
+ || !can_ior_p (wi::to_poly_wide (op0, mode),
+ rtx_mode_t (op1, mode), &result))
+ return NULL_RTX;
+ break;
+
+ default:
+ return NULL_RTX;
+ }
+ return immed_wide_int_const (result, int_mode);
+ }
+
return NULL_RTX;
}
@@ -6370,13 +6448,27 @@ simplify_subreg (machine_mode outermode,
scalar_int_mode int_outermode, int_innermode;
if (is_a <scalar_int_mode> (outermode, &int_outermode)
&& is_a <scalar_int_mode> (innermode, &int_innermode)
- && (GET_MODE_PRECISION (int_outermode)
- < GET_MODE_PRECISION (int_innermode))
&& byte == subreg_lowpart_offset (int_outermode, int_innermode))
{
- rtx tem = simplify_truncation (int_outermode, op, int_innermode);
- if (tem)
- return tem;
+ /* Handle polynomial integers. The upper bits of a paradoxical
+ subreg are undefined, so this is safe regardless of whether
+ we're truncating or extending. */
+ if (CONST_POLY_INT_P (op))
+ {
+ poly_wide_int val
+ = poly_wide_int::from (const_poly_int_value (op),
+ GET_MODE_PRECISION (int_outermode),
+ SIGNED);
+ return immed_wide_int_const (val, int_outermode);
+ }
+
+ if (GET_MODE_PRECISION (int_outermode)
+ < GET_MODE_PRECISION (int_innermode))
+ {
+ rtx tem = simplify_truncation (int_outermode, op, int_innermode);
+ if (tem)
+ return tem;
+ }
}
return NULL_RTX;
@@ -6685,12 +6777,60 @@ test_vector_ops ()
}
}
+template<unsigned int N>
+struct simplify_const_poly_int_tests
+{
+ static void run ();
+};
+
+template<>
+struct simplify_const_poly_int_tests<1>
+{
+ static void run () {}
+};
+
+/* Test various CONST_POLY_INT properties. */
+
+template<unsigned int N>
+void
+simplify_const_poly_int_tests<N>::run ()
+{
+ rtx x1 = gen_int_mode (poly_int64 (1, 1), QImode);
+ rtx x2 = gen_int_mode (poly_int64 (-80, 127), QImode);
+ rtx x3 = gen_int_mode (poly_int64 (-79, -128), QImode);
+ rtx x4 = gen_int_mode (poly_int64 (5, 4), QImode);
+ rtx x5 = gen_int_mode (poly_int64 (30, 24), QImode);
+ rtx x6 = gen_int_mode (poly_int64 (20, 16), QImode);
+ rtx x7 = gen_int_mode (poly_int64 (7, 4), QImode);
+ rtx x8 = gen_int_mode (poly_int64 (30, 24), HImode);
+ rtx x9 = gen_int_mode (poly_int64 (-30, -24), HImode);
+ rtx x10 = gen_int_mode (poly_int64 (-31, -24), HImode);
+ rtx two = GEN_INT (2);
+ rtx six = GEN_INT (6);
+ HOST_WIDE_INT offset = subreg_lowpart_offset (QImode, HImode);
+
+ /* These tests only try limited operation combinations. Fuller arithmetic
+ testing is done directly on poly_ints. */
+ ASSERT_EQ (simplify_unary_operation (NEG, HImode, x8, HImode), x9);
+ ASSERT_EQ (simplify_unary_operation (NOT, HImode, x8, HImode), x10);
+ ASSERT_EQ (simplify_unary_operation (TRUNCATE, QImode, x8, HImode), x5);
+ ASSERT_EQ (simplify_binary_operation (PLUS, QImode, x1, x2), x3);
+ ASSERT_EQ (simplify_binary_operation (MINUS, QImode, x3, x1), x2);
+ ASSERT_EQ (simplify_binary_operation (MULT, QImode, x4, six), x5);
+ ASSERT_EQ (simplify_binary_operation (MULT, QImode, six, x4), x5);
+ ASSERT_EQ (simplify_binary_operation (ASHIFT, QImode, x4, two), x6);
+ ASSERT_EQ (simplify_binary_operation (IOR, QImode, x4, two), x7);
+ ASSERT_EQ (simplify_subreg (HImode, x5, QImode, 0), x8);
+ ASSERT_EQ (simplify_subreg (QImode, x8, HImode, offset), x5);
+}
+
/* Run all of the selftests within this file. */
void
simplify_rtx_c_tests ()
{
test_vector_ops ();
+ simplify_const_poly_int_tests<NUM_POLY_INT_COEFFS>::run ();
}
} // namespace selftest
Index: gcc/wide-int.h
===================================================================
--- gcc/wide-int.h 2017-12-15 01:16:50.894351263 +0000
+++ gcc/wide-int.h 2017-12-15 01:16:51.242338992 +0000
@@ -613,6 +613,7 @@ #define SHIFT_FUNCTION \
access. */
struct storage_ref
{
+ storage_ref () {}
storage_ref (const HOST_WIDE_INT *, unsigned int, unsigned int);
const HOST_WIDE_INT *val;
@@ -944,6 +945,8 @@ struct wide_int_ref_storage : public wi:
HOST_WIDE_INT scratch[2];
public:
+ wide_int_ref_storage () {}
+
wide_int_ref_storage (const wi::storage_ref &);
template <typename T>
@@ -1323,7 +1326,7 @@ typedef generic_wide_int <trailing_wide_
bytes beyond the sizeof need to be allocated. Use set_precision
to initialize the structure. */
template <int N>
-class GTY(()) trailing_wide_ints
+class GTY((user)) trailing_wide_ints
{
private:
/* The shared precision of each number. */
@@ -1340,9 +1343,14 @@ class GTY(()) trailing_wide_ints
HOST_WIDE_INT m_val[1];
public:
+ typedef WIDE_INT_REF_FOR (trailing_wide_int_storage) const_reference;
+
void set_precision (unsigned int);
+ unsigned int get_precision () const { return m_precision; }
trailing_wide_int operator [] (unsigned int);
+ const_reference operator [] (unsigned int) const;
static size_t extra_size (unsigned int);
+ size_t extra_size () const { return extra_size (m_precision); }
};
inline trailing_wide_int_storage::
@@ -1414,6 +1422,14 @@ trailing_wide_ints <N>::operator [] (uns
&m_val[index * m_max_len]);
}
+template <int N>
+inline typename trailing_wide_ints <N>::const_reference
+trailing_wide_ints <N>::operator [] (unsigned int index) const
+{
+ return wi::storage_ref (&m_val[index * m_max_len],
+ m_len[index], m_precision);
+}
+
/* Return how many extra bytes need to be added to the end of the structure
in order to handle N wide_ints of precision PRECISION. */
template <int N>
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [005/nnn] poly_int: rtx constants
2017-12-15 1:25 ` Richard Sandiford
@ 2017-12-19 4:52 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-12-19 4:52 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 12/14/2017 06:25 PM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 10/23/2017 11:00 AM, Richard Sandiford wrote:
>>> This patch adds an rtl representation of poly_int values.
>>> There were three possible ways of doing this:
>>>
>>> (1) Add a new rtl code for the poly_ints themselves and store the
>>> coefficients as trailing wide_ints. This would give constants like:
>>>
>>> (const_poly_int [c0 c1 ... cn])
>>>
>>> The runtime value would be:
>>>
>>> c0 + c1 * x1 + ... + cn * xn
>>>
>>> (2) Like (1), but use rtxes for the coefficients. This would give
>>> constants like:
>>>
>>> (const_poly_int [(const_int c0)
>>> (const_int c1)
>>> ...
>>> (const_int cn)])
>>>
>>> although the coefficients could be const_wide_ints instead
>>> of const_ints where appropriate.
>>>
>>> (3) Add a new rtl code for the polynomial indeterminates,
>>> then use them in const wrappers. A constant like c0 + c1 * x1
>>> would then look like:
>>>
>>> (const:M (plus:M (mult:M (const_param:M x1)
>>> (const_int c1))
>>> (const_int c0)))
>>>
>>> There didn't seem to be that much to choose between them. The main
>>> advantage of (1) is that it's a more efficient representation and
>>> that we can refer to the cofficients directly as wide_int_storage.
>> Well, and #1 feels more like how we handle CONST_INT :-)
>>>
>>>
>>> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>>> Alan Hayward <alan.hayward@arm.com>
>>> David Sherwood <david.sherwood@arm.com>
>>>
>>> gcc/
>>> * doc/rtl.texi (const_poly_int): Document.
>>> * gengenrtl.c (excluded_rtx): Return true for CONST_POLY_INT.
>>> * rtl.h (const_poly_int_def): New struct.
>>> (rtx_def::u): Add a cpi field.
>>> (CASE_CONST_UNIQUE, CASE_CONST_ANY): Add CONST_POLY_INT.
>>> (CONST_POLY_INT_P, CONST_POLY_INT_COEFFS): New macros.
>>> (wi::rtx_to_poly_wide_ref): New typedef
>>> (const_poly_int_value, wi::to_poly_wide, rtx_to_poly_int64)
>>> (poly_int_rtx_p): New functions.
>>> (trunc_int_for_mode): Declare a poly_int64 version.
>>> (plus_constant): Take a poly_int64 instead of a HOST_WIDE_INT.
>>> (immed_wide_int_const): Take a poly_wide_int_ref rather than
>>> a wide_int_ref.
>>> (strip_offset): Declare.
>>> (strip_offset_and_add): New function.
>>> * rtl.def (CONST_POLY_INT): New rtx code.
>>> * rtl.c (rtx_size): Handle CONST_POLY_INT.
>>> (shared_const_p): Use poly_int_rtx_p.
>>> * emit-rtl.h (gen_int_mode): Take a poly_int64 instead of a
>>> HOST_WIDE_INT.
>>> (gen_int_shift_amount): Likewise.
>>> * emit-rtl.c (const_poly_int_hasher): New class.
>>> (const_poly_int_htab): New variable.
>>> (init_emit_once): Initialize it when NUM_POLY_INT_COEFFS > 1.
>>> (const_poly_int_hasher::hash): New function.
>>> (const_poly_int_hasher::equal): Likewise.
>>> (gen_int_mode): Take a poly_int64 instead of a HOST_WIDE_INT.
>>> (immed_wide_int_const): Rename to...
>>> (immed_wide_int_const_1): ...this and make static.
>>> (immed_wide_int_const): New function, taking a poly_wide_int_ref
>>> instead of a wide_int_ref.
>>> (gen_int_shift_amount): Take a poly_int64 instead of a HOST_WIDE_INT.
>>> (gen_lowpart_common): Handle CONST_POLY_INT.
>>> * cse.c (hash_rtx_cb, equiv_constant): Likewise.
>>> * cselib.c (cselib_hash_rtx): Likewise.
>>> * dwarf2out.c (const_ok_for_output_1): Likewise.
>>> * expr.c (convert_modes): Likewise.
>>> * print-rtl.c (rtx_writer::print_rtx, print_value): Likewise.
>>> * rtlhash.c (add_rtx): Likewise.
>>> * explow.c (trunc_int_for_mode): Add a poly_int64 version.
>>> (plus_constant): Take a poly_int64 instead of a HOST_WIDE_INT.
>>> Handle existing CONST_POLY_INT rtxes.
>>> * expmed.h (expand_shift): Take a poly_int64 instead of a
>>> HOST_WIDE_INT.
>>> * expmed.c (expand_shift): Likewise.
>>> * rtlanal.c (strip_offset): New function.
>>> (commutative_operand_precedence): Give CONST_POLY_INT the same
>>> precedence as CONST_DOUBLE and put CONST_WIDE_INT between that
>>> and CONST_INT.
>>> * rtl-tests.c (const_poly_int_tests): New struct.
>>> (rtl_tests_c_tests): Use it.
>>> * simplify-rtx.c (simplify_const_unary_operation): Handle
>>> CONST_POLY_INT.
>>> (simplify_const_binary_operation): Likewise.
>>> (simplify_binary_operation_1): Fold additions of symbolic constants
>>> and CONST_POLY_INTs.
>>> (simplify_subreg): Handle extensions and truncations of
>>> CONST_POLY_INTs.
>>> (simplify_const_poly_int_tests): New struct.
>>> (simplify_rtx_c_tests): Use it.
>>> * wide-int.h (storage_ref): Add default constructor.
>>> (wide_int_ref_storage): Likewise.
>>> (trailing_wide_ints): Use GTY((user)).
>>> (trailing_wide_ints::operator[]): Add a const version.
>>> (trailing_wide_ints::get_precision): New function.
>>> (trailing_wide_ints::extra_size): Likewise.
>> Do we need to define anything WRT structure sharing in rtl.texi for a
>> CONST_POLY_INT?
>
> Good catch. Fixed in the patch below.
>
>>> Index: gcc/rtl.c
>>> ===================================================================
>>> --- gcc/rtl.c 2017-10-23 16:52:20.579835373 +0100
>>> +++ gcc/rtl.c 2017-10-23 17:00:54.443002147 +0100
>>> @@ -257,9 +261,10 @@ shared_const_p (const_rtx orig)
>>>
>>> /* CONST can be shared if it contains a SYMBOL_REF. If it contains
>>> a LABEL_REF, it isn't sharable. */
>>> + poly_int64 offset;
>>> return (GET_CODE (XEXP (orig, 0)) == PLUS
>>> && GET_CODE (XEXP (XEXP (orig, 0), 0)) == SYMBOL_REF
>>> - && CONST_INT_P (XEXP (XEXP (orig, 0), 1)));
>>> + && poly_int_rtx_p (XEXP (XEXP (orig, 0), 1), &offset));
>> Did this just change structure sharing for CONST_WIDE_INT?
>
> No, we'd only use CONST_WIDE_INT for things that don't fit in
> poly_int64.
>
>>> + /* Create a new rtx. There's a choice to be made here between installing
>>> + the actual mode of the rtx or leaving it as VOIDmode (for consistency
>>> + with CONST_INT). In practice the handling of the codes is different
>>> + enough that we get no benefit from using VOIDmode, and various places
>>> + assume that VOIDmode implies CONST_INT. Using the real mode seems like
>>> + the right long-term direction anyway. */
>> Certainly my preference is to get the mode in there. I see modeless
>> CONST_INTs as a long standing wart and I'm not keen to repeat it.
>
> Yeah. Still regularly hit problems related to modeless CONST_INTs
> today (including the gen_int_shift_amount patch).
>
>>> Index: gcc/wide-int.h
>>> ===================================================================
>>> --- gcc/wide-int.h 2017-10-23 17:00:20.923835582 +0100
>>> +++ gcc/wide-int.h 2017-10-23 17:00:54.445999420 +0100
>>> @@ -613,6 +613,7 @@ #define SHIFT_FUNCTION \
>>> access. */
>>> struct storage_ref
>>> {
>>> + storage_ref () {}
>>> storage_ref (const HOST_WIDE_INT *, unsigned int, unsigned int);
>>>
>>> const HOST_WIDE_INT *val;
>>> @@ -944,6 +945,8 @@ struct wide_int_ref_storage : public wi:
>>> HOST_WIDE_INT scratch[2];
>>>
>>> public:
>>> + wide_int_ref_storage () {}
>>> +
>>> wide_int_ref_storage (const wi::storage_ref &);
>>>
>>> template <typename T>
>> So doesn't this play into the whole question about initialization of
>> these objects. So I'll defer on this hunk until we settle that
>> question, but the rest is OK.
>
> Any more thoughts on this? In the end the 001 patch went in with
> the empty constructors. Like I say, I'm happy to switch to C++-11
> "= default;" once we require C++11, but I think having well-defined
> implicit construction would make switching to "= default" harder
> in future.
I think we're good to go. I would have slightly preferred to avoid the
empty ctor, but not enough to raise an objection to Richi's ACK and
ultimately make the switch to = default harder later.
And just to be clear, I'd like to propose we step forward to C++11 in
the gcc-9 timeframe. I haven't run that by anyone, but that's the
timeframe I'd personally prefer.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [007/nnn] poly_int: dump routines
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (4 preceding siblings ...)
2017-10-23 17:01 ` [005/nnn] poly_int: rtx constants Richard Sandiford
@ 2017-10-23 17:02 ` Richard Sandiford
2017-11-17 3:38 ` Jeff Law
2017-10-23 17:02 ` [006/nnn] poly_int: tree constants Richard Sandiford
` (101 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:02 UTC (permalink / raw)
To: gcc-patches
Add poly_int routines for the dumpfile.h and pretty-print.h frameworks.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* dumpfile.h (dump_dec): Declare.
* dumpfile.c (dump_dec): New function.
* pretty-print.h (pp_wide_integer): Turn into a function and
declare a poly_int version.
* pretty-print.c (pp_wide_integer): New function for poly_ints.
Index: gcc/dumpfile.h
===================================================================
--- gcc/dumpfile.h 2017-10-23 16:52:20.417686430 +0100
+++ gcc/dumpfile.h 2017-10-23 17:01:00.431554440 +0100
@@ -174,6 +174,9 @@ extern void dump_gimple_stmt (dump_flags
extern void print_combine_total_stats (void);
extern bool enable_rtl_dump_file (void);
+template<unsigned int N, typename C>
+void dump_dec (int, const poly_int<N, C> &);
+
/* In tree-dump.c */
extern void dump_node (const_tree, dump_flags_t, FILE *);
Index: gcc/dumpfile.c
===================================================================
--- gcc/dumpfile.c 2017-10-23 16:52:20.417686430 +0100
+++ gcc/dumpfile.c 2017-10-23 17:01:00.431554440 +0100
@@ -473,6 +473,27 @@ dump_printf_loc (dump_flags_t dump_kind,
}
}
+/* Output VALUE in decimal to appropriate dump streams. */
+
+template<unsigned int N, typename C>
+void
+dump_dec (int dump_kind, const poly_int<N, C> &value)
+{
+ STATIC_ASSERT (poly_coeff_traits<C>::signedness >= 0);
+ signop sgn = poly_coeff_traits<C>::signedness ? SIGNED : UNSIGNED;
+ if (dump_file && (dump_kind & pflags))
+ print_dec (value, dump_file, sgn);
+
+ if (alt_dump_file && (dump_kind & alt_flags))
+ print_dec (value, alt_dump_file, sgn);
+}
+
+template void dump_dec (int, const poly_uint16 &);
+template void dump_dec (int, const poly_int64 &);
+template void dump_dec (int, const poly_uint64 &);
+template void dump_dec (int, const poly_offset_int &);
+template void dump_dec (int, const poly_widest_int &);
+
/* Start a dump for PHASE. Store user-supplied dump flags in
*FLAG_PTR. Return the number of streams opened. Set globals
DUMP_FILE, and ALT_DUMP_FILE to point to the opened streams, and
Index: gcc/pretty-print.h
===================================================================
--- gcc/pretty-print.h 2017-10-23 16:52:20.417686430 +0100
+++ gcc/pretty-print.h 2017-10-23 17:01:00.431554440 +0100
@@ -328,8 +328,6 @@ #define pp_wide_int(PP, W, SGN) \
pp_string (PP, pp_buffer (PP)->digit_buffer); \
} \
while (0)
-#define pp_wide_integer(PP, I) \
- pp_scalar (PP, HOST_WIDE_INT_PRINT_DEC, (HOST_WIDE_INT) I)
#define pp_pointer(PP, P) pp_scalar (PP, "%p", P)
#define pp_identifier(PP, ID) pp_string (PP, (pp_translate_identifiers (PP) \
@@ -401,4 +399,15 @@ extern const char *identifier_to_locale
extern void *(*identifier_to_locale_alloc) (size_t);
extern void (*identifier_to_locale_free) (void *);
+/* Print I to PP in decimal. */
+
+inline void
+pp_wide_integer (pretty_printer *pp, HOST_WIDE_INT i)
+{
+ pp_scalar (pp, HOST_WIDE_INT_PRINT_DEC, i);
+}
+
+template<unsigned int N, typename T>
+void pp_wide_integer (pretty_printer *pp, const poly_int_pod<N, T> &);
+
#endif /* GCC_PRETTY_PRINT_H */
Index: gcc/pretty-print.c
===================================================================
--- gcc/pretty-print.c 2017-10-23 16:52:20.417686430 +0100
+++ gcc/pretty-print.c 2017-10-23 17:01:00.431554440 +0100
@@ -795,6 +795,30 @@ pp_clear_state (pretty_printer *pp)
pp_indentation (pp) = 0;
}
+/* Print X to PP in decimal. */
+template<unsigned int N, typename T>
+void
+pp_wide_integer (pretty_printer *pp, const poly_int_pod<N, T> &x)
+{
+ if (x.is_constant ())
+ pp_wide_integer (pp, x.coeffs[0]);
+ else
+ {
+ pp_left_bracket (pp);
+ for (unsigned int i = 0; i < N; ++i)
+ {
+ if (i != 0)
+ pp_comma (pp);
+ pp_wide_integer (pp, x.coeffs[i]);
+ }
+ pp_right_bracket (pp);
+ }
+}
+
+template void pp_wide_integer (pretty_printer *, const poly_uint16_pod &);
+template void pp_wide_integer (pretty_printer *, const poly_int64_pod &);
+template void pp_wide_integer (pretty_printer *, const poly_uint64_pod &);
+
/* Flush the formatted text of PRETTY-PRINTER onto the attached stream. */
void
pp_write_text_to_stream (pretty_printer *pp)
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [007/nnn] poly_int: dump routines
2017-10-23 17:02 ` [007/nnn] poly_int: dump routines Richard Sandiford
@ 2017-11-17 3:38 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-17 3:38 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:02 AM, Richard Sandiford wrote:
> Add poly_int routines for the dumpfile.h and pretty-print.h frameworks.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * dumpfile.h (dump_dec): Declare.
> * dumpfile.c (dump_dec): New function.
> * pretty-print.h (pp_wide_integer): Turn into a function and
> declare a poly_int version.
> * pretty-print.c (pp_wide_integer): New function for poly_ints.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [006/nnn] poly_int: tree constants
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (5 preceding siblings ...)
2017-10-23 17:02 ` [007/nnn] poly_int: dump routines Richard Sandiford
@ 2017-10-23 17:02 ` Richard Sandiford
2017-10-25 17:14 ` Martin Sebor
2017-11-17 4:51 ` Jeff Law
2017-10-23 17:03 ` [008/nnn] poly_int: create_integer_operand Richard Sandiford
` (100 subsequent siblings)
107 siblings, 2 replies; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:02 UTC (permalink / raw)
To: gcc-patches
This patch adds a tree representation for poly_ints. Unlike the
rtx version, the coefficients are INTEGER_CSTs rather than plain
integers, so that we can easily access them as poly_widest_ints
and poly_offset_ints.
The patch also adjusts some places that previously
relied on "constant" meaning "INTEGER_CST". It also makes
sure that the TYPE_SIZE agrees with the TYPE_SIZE_UNIT for
vector booleans, given the existing:
/* Several boolean vector elements may fit in a single unit. */
if (VECTOR_BOOLEAN_TYPE_P (type)
&& type->type_common.mode != BLKmode)
TYPE_SIZE_UNIT (type)
= size_int (GET_MODE_SIZE (type->type_common.mode));
else
TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
TYPE_SIZE_UNIT (innertype),
size_int (nunits));
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/generic.texi (POLY_INT_CST): Document.
* tree.def (POLY_INT_CST): New tree code.
* treestruct.def (TS_POLY_INT_CST): New tree layout.
* tree-core.h (tree_poly_int_cst): New struct.
(tree_node): Add a poly_int_cst field.
* tree.h (POLY_INT_CST_P, POLY_INT_CST_COEFF): New macros.
(wide_int_to_tree, force_fit_type): Take a poly_wide_int_ref
instead of a wide_int_ref.
(build_int_cst, build_int_cst_type): Take a poly_int64 instead
of a HOST_WIDE_INT.
(build_int_cstu, build_array_type_nelts): Take a poly_uint64
instead of an unsigned HOST_WIDE_INT.
(build_poly_int_cst, tree_fits_poly_int64_p, tree_fits_poly_uint64_p)
(ptrdiff_tree_p): Declare.
(tree_to_poly_int64, tree_to_poly_uint64): Likewise. Provide
extern inline implementations if the target doesn't use POLY_INT_CST.
(poly_int_tree_p): New function.
(wi::unextended_tree): New class.
(wi::int_traits <unextended_tree>): New override.
(wi::extended_tree): Add a default constructor.
(wi::extended_tree::get_tree): New function.
(wi::widest_extended_tree, wi::offset_extended_tree): New typedefs.
(wi::tree_to_widest_ref, wi::tree_to_offset_ref): Use them.
(wi::tree_to_poly_widest_ref, wi::tree_to_poly_offset_ref)
(wi::tree_to_poly_wide_ref): New typedefs.
(wi::ints_for): Provide overloads for extended_tree and
unextended_tree.
(poly_int_cst_value, wi::to_poly_widest, wi::to_poly_offset)
(wi::to_wide): New functions.
(wi::fits_to_boolean_p, wi::fits_to_tree_p): Handle poly_ints.
* tree.c (poly_int_cst_hasher): New struct.
(poly_int_cst_hash_table): New variable.
(tree_node_structure_for_code, tree_code_size, simple_cst_equal)
(valid_constant_size_p, add_expr, drop_tree_overflow): Handle
POLY_INT_CST.
(initialize_tree_contains_struct): Handle TS_POLY_INT_CST.
(init_ttree): Initialize poly_int_cst_hash_table.
(build_int_cst, build_int_cst_type, build_invariant_address): Take
a poly_int64 instead of a HOST_WIDE_INT.
(build_int_cstu, build_array_type_nelts): Take a poly_uint64
instead of an unsigned HOST_WIDE_INT.
(wide_int_to_tree): Rename to...
(wide_int_to_tree_1): ...this.
(build_new_poly_int_cst, build_poly_int_cst): New functions.
(force_fit_type): Take a poly_wide_int_ref instead of a wide_int_ref.
(wide_int_to_tree): New function that takes a poly_wide_int_ref.
(ptrdiff_tree_p, tree_to_poly_int64, tree_to_poly_uint64)
(tree_fits_poly_int64_p, tree_fits_poly_uint64_p): New functions.
* lto-streamer-out.c (DFS::DFS_write_tree_body, hash_tree): Handle
TS_POLY_INT_CST.
* tree-streamer-in.c (lto_input_ts_poly_tree_pointers): Likewise.
(streamer_read_tree_body): Likewise.
* tree-streamer-out.c (write_ts_poly_tree_pointers): Likewise.
(streamer_write_tree_body): Likewise.
* tree-streamer.c (streamer_check_handled_ts_structures): Likewise.
* asan.c (asan_protect_global): Require the size to be an INTEGER_CST.
* cfgexpand.c (expand_debug_expr): Handle POLY_INT_CST.
* expr.c (const_vector_element, expand_expr_real_1): Likewise.
* gimple-expr.h (is_gimple_constant): Likewise.
* gimplify.c (maybe_with_size_expr): Likewise.
* print-tree.c (print_node): Likewise.
* tree-data-ref.c (data_ref_compare_tree): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree-ssa-address.c (addr_for_mem_ref): Likewise.
* tree-vect-data-refs.c (dr_group_sort_cmp): Likewise.
* tree-vrp.c (compare_values_warnv): Likewise.
* tree-ssa-loop-ivopts.c (determine_base_object, constant_multiple_of)
(get_loop_invariant_expr, add_candidate_1, get_computation_aff_1)
(force_expr_to_var_cost): Likewise.
* tree-ssa-loop.c (for_each_index): Likewise.
* fold-const.h (build_invariant_address, size_int_kind): Take a
poly_int64 instead of a HOST_WIDE_INT.
* fold-const.c (fold_negate_expr_1, const_binop, const_unop)
(fold_convert_const, multiple_of_p, fold_negate_const): Handle
POLY_INT_CST.
(size_binop_loc): Likewise. Allow int_const_binop_1 to fail.
(int_const_binop_2): New function, split out from...
(int_const_binop_1): ...here. Handle POLY_INT_CST.
(size_int_kind): Take a poly_int64 instead of a HOST_WIDE_INT.
* expmed.c (make_tree): Handle CONST_POLY_INT_P.
* gimple-ssa-strength-reduction.c (slsr_process_add)
(slsr_process_mul): Check for INTEGER_CSTs before using them
as candidates.
* stor-layout.c (bits_from_bytes): New function.
(bit_from_pos): Use it.
(layout_type): Likewise. For vectors, multiply the TYPE_SIZE_UNIT
by BITS_PER_UNIT to get the TYPE_SIZE.
* tree-cfg.c (verify_expr, verify_types_in_gimple_reference): Allow
MEM_REF and TARGET_MEM_REF offsets to be a POLY_INT_CST.
Index: gcc/doc/generic.texi
===================================================================
--- gcc/doc/generic.texi 2017-10-23 16:52:20.504766418 +0100
+++ gcc/doc/generic.texi 2017-10-23 17:00:57.771973825 +0100
@@ -1039,6 +1039,7 @@ As this example indicates, the operands
@tindex VEC_DUPLICATE_CST
@tindex VEC_SERIES_CST
@tindex STRING_CST
+@tindex POLY_INT_CST
@findex TREE_STRING_LENGTH
@findex TREE_STRING_POINTER
@@ -1128,6 +1129,16 @@ of the @code{STRING_CST}.
FIXME: The formats of string constants are not well-defined when the
target system bytes are not the same width as host system bytes.
+@item POLY_INT_CST
+These nodes represent invariants that depend on some target-specific
+runtime parameters. They consist of @code{NUM_POLY_INT_COEFFS}
+coefficients, with the first coefficient being the constant term and
+the others being multipliers that are applied to the runtime parameters.
+
+@code{POLY_INT_CST_ELT (@var{x}, @var{i})} references coefficient number
+@var{i} of @code{POLY_INT_CST} node @var{x}. Each coefficient is an
+@code{INTEGER_CST}.
+
@end table
@node Storage References
Index: gcc/tree.def
===================================================================
--- gcc/tree.def 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree.def 2017-10-23 17:00:57.783962919 +0100
@@ -291,6 +291,9 @@ DEFTREECODE (VOID_CST, "void_cst", tcc_c
some circumstances. */
DEFTREECODE (INTEGER_CST, "integer_cst", tcc_constant, 0)
+/* Contents are given by POLY_INT_CST_COEFF. */
+DEFTREECODE (POLY_INT_CST, "poly_int_cst", tcc_constant, 0)
+
/* Contents are in TREE_REAL_CST field. */
DEFTREECODE (REAL_CST, "real_cst", tcc_constant, 0)
Index: gcc/treestruct.def
===================================================================
--- gcc/treestruct.def 2017-10-23 16:52:20.504766418 +0100
+++ gcc/treestruct.def 2017-10-23 17:00:57.784962010 +0100
@@ -34,6 +34,7 @@ DEFTREESTRUCT(TS_BASE, "base")
DEFTREESTRUCT(TS_TYPED, "typed")
DEFTREESTRUCT(TS_COMMON, "common")
DEFTREESTRUCT(TS_INT_CST, "integer cst")
+DEFTREESTRUCT(TS_POLY_INT_CST, "poly_int_cst")
DEFTREESTRUCT(TS_REAL_CST, "real cst")
DEFTREESTRUCT(TS_FIXED_CST, "fixed cst")
DEFTREESTRUCT(TS_VECTOR, "vector")
Index: gcc/tree-core.h
===================================================================
--- gcc/tree-core.h 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-core.h 2017-10-23 17:00:57.778967463 +0100
@@ -1336,6 +1336,11 @@ struct GTY(()) tree_vector {
tree GTY ((length ("((tree) &%h)->base.u.nelts"))) elts[1];
};
+struct GTY(()) tree_poly_int_cst {
+ struct tree_typed typed;
+ tree coeffs[NUM_POLY_INT_COEFFS];
+};
+
struct GTY(()) tree_identifier {
struct tree_common common;
struct ht_identifier id;
@@ -1861,6 +1866,7 @@ union GTY ((ptr_alias (union lang_tree_n
struct tree_typed GTY ((tag ("TS_TYPED"))) typed;
struct tree_common GTY ((tag ("TS_COMMON"))) common;
struct tree_int_cst GTY ((tag ("TS_INT_CST"))) int_cst;
+ struct tree_poly_int_cst GTY ((tag ("TS_POLY_INT_CST"))) poly_int_cst;
struct tree_real_cst GTY ((tag ("TS_REAL_CST"))) real_cst;
struct tree_fixed_cst GTY ((tag ("TS_FIXED_CST"))) fixed_cst;
struct tree_vector GTY ((tag ("TS_VECTOR"))) vector;
Index: gcc/tree.h
===================================================================
--- gcc/tree.h 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree.h 2017-10-23 17:00:57.784962010 +0100
@@ -1008,6 +1008,15 @@ #define TREE_INT_CST_ELT(NODE, I) TREE_I
#define TREE_INT_CST_LOW(NODE) \
((unsigned HOST_WIDE_INT) TREE_INT_CST_ELT (NODE, 0))
+/* Return true if NODE is a POLY_INT_CST. This is only ever true on
+ targets with variable-sized modes. */
+#define POLY_INT_CST_P(NODE) \
+ (NUM_POLY_INT_COEFFS > 1 && TREE_CODE (NODE) == POLY_INT_CST)
+
+/* In a POLY_INT_CST node. */
+#define POLY_INT_CST_COEFF(NODE, I) \
+ (POLY_INT_CST_CHECK (NODE)->poly_int_cst.coeffs[I])
+
#define TREE_REAL_CST_PTR(NODE) (REAL_CST_CHECK (NODE)->real_cst.real_cst_ptr)
#define TREE_REAL_CST(NODE) (*TREE_REAL_CST_PTR (NODE))
@@ -4025,15 +4034,15 @@ build5_loc (location_t loc, enum tree_co
extern tree double_int_to_tree (tree, double_int);
-extern tree wide_int_to_tree (tree type, const wide_int_ref &cst);
-extern tree force_fit_type (tree, const wide_int_ref &, int, bool);
+extern tree wide_int_to_tree (tree type, const poly_wide_int_ref &cst);
+extern tree force_fit_type (tree, const poly_wide_int_ref &, int, bool);
/* Create an INT_CST node with a CST value zero extended. */
/* static inline */
-extern tree build_int_cst (tree, HOST_WIDE_INT);
-extern tree build_int_cstu (tree type, unsigned HOST_WIDE_INT cst);
-extern tree build_int_cst_type (tree, HOST_WIDE_INT);
+extern tree build_int_cst (tree, poly_int64);
+extern tree build_int_cstu (tree type, poly_uint64);
+extern tree build_int_cst_type (tree, poly_int64);
extern tree make_vector (unsigned CXX_MEM_STAT_INFO);
extern tree build_vec_duplicate_cst (tree, tree CXX_MEM_STAT_INFO);
extern tree build_vec_series_cst (tree, tree, tree CXX_MEM_STAT_INFO);
@@ -4056,6 +4065,7 @@ extern tree build_minus_one_cst (tree);
extern tree build_all_ones_cst (tree);
extern tree build_zero_cst (tree);
extern tree build_string (int, const char *);
+extern tree build_poly_int_cst (tree, const poly_wide_int_ref &);
extern tree build_tree_list (tree, tree CXX_MEM_STAT_INFO);
extern tree build_tree_list_vec (const vec<tree, va_gc> * CXX_MEM_STAT_INFO);
extern tree build_decl (location_t, enum tree_code,
@@ -4104,7 +4114,7 @@ extern tree build_opaque_vector_type (tr
extern tree build_index_type (tree);
extern tree build_array_type (tree, tree, bool = false);
extern tree build_nonshared_array_type (tree, tree);
-extern tree build_array_type_nelts (tree, unsigned HOST_WIDE_INT);
+extern tree build_array_type_nelts (tree, poly_uint64);
extern tree build_function_type (tree, tree);
extern tree build_function_type_list (tree, ...);
extern tree build_varargs_function_type_list (tree, ...);
@@ -4128,12 +4138,14 @@ extern tree chain_index (int, tree);
extern int tree_int_cst_equal (const_tree, const_tree);
-extern bool tree_fits_shwi_p (const_tree)
- ATTRIBUTE_PURE;
-extern bool tree_fits_uhwi_p (const_tree)
- ATTRIBUTE_PURE;
+extern bool tree_fits_shwi_p (const_tree) ATTRIBUTE_PURE;
+extern bool tree_fits_poly_int64_p (const_tree) ATTRIBUTE_PURE;
+extern bool tree_fits_uhwi_p (const_tree) ATTRIBUTE_PURE;
+extern bool tree_fits_poly_uint64_p (const_tree) ATTRIBUTE_PURE;
extern HOST_WIDE_INT tree_to_shwi (const_tree);
+extern poly_int64 tree_to_poly_int64 (const_tree);
extern unsigned HOST_WIDE_INT tree_to_uhwi (const_tree);
+extern poly_uint64 tree_to_poly_uint64 (const_tree);
#if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003)
extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
tree_to_shwi (const_tree t)
@@ -4148,6 +4160,21 @@ tree_to_uhwi (const_tree t)
gcc_assert (tree_fits_uhwi_p (t));
return TREE_INT_CST_LOW (t);
}
+#if NUM_POLY_INT_COEFFS == 1
+extern inline __attribute__ ((__gnu_inline__)) poly_int64
+tree_to_poly_int64 (const_tree t)
+{
+ gcc_assert (tree_fits_poly_int64_p (t));
+ return TREE_INT_CST_LOW (t);
+}
+
+extern inline __attribute__ ((__gnu_inline__)) poly_uint64
+tree_to_poly_uint64 (const_tree t)
+{
+ gcc_assert (tree_fits_poly_uint64_p (t));
+ return TREE_INT_CST_LOW (t);
+}
+#endif
#endif
extern int tree_int_cst_sgn (const_tree);
extern int tree_int_cst_sign_bit (const_tree);
@@ -4156,6 +4183,33 @@ extern tree strip_array_types (tree);
extern tree excess_precision_type (tree);
extern bool valid_constant_size_p (const_tree);
+/* Return true if T holds a value that can be represented as a poly_int64
+ without loss of precision. Store the value in *VALUE if so. */
+
+inline bool
+poly_int_tree_p (const_tree t, poly_int64_pod *value)
+{
+ if (tree_fits_poly_int64_p (t))
+ {
+ *value = tree_to_poly_int64 (t);
+ return true;
+ }
+ return false;
+}
+
+/* Return true if T holds a value that can be represented as a poly_uint64
+ without loss of precision. Store the value in *VALUE if so. */
+
+inline bool
+poly_int_tree_p (const_tree t, poly_uint64_pod *value)
+{
+ if (tree_fits_poly_uint64_p (t))
+ {
+ *value = tree_to_poly_uint64 (t);
+ return true;
+ }
+ return false;
+}
/* From expmed.c. Since rtl.h is included after tree.h, we can't
put the prototype here. Rtl.h does declare the prototype if
@@ -4702,8 +4756,17 @@ complete_or_array_type_p (const_tree typ
&& COMPLETE_TYPE_P (TREE_TYPE (type)));
}
+/* Return true if the value of T could be represented as a poly_widest_int. */
+
+inline bool
+poly_int_tree_p (const_tree t)
+{
+ return (TREE_CODE (t) == INTEGER_CST || POLY_INT_CST_P (t));
+}
+
extern tree strip_float_extensions (tree);
extern int really_constant_p (const_tree);
+extern bool ptrdiff_tree_p (const_tree, poly_int64_pod *);
extern bool decl_address_invariant_p (const_tree);
extern bool decl_address_ip_invariant_p (const_tree);
extern bool int_fits_type_p (const_tree, const_tree);
@@ -5132,6 +5195,29 @@ extern bool anon_aggrname_p (const_tree)
/* The tree and const_tree overload templates. */
namespace wi
{
+ class unextended_tree
+ {
+ private:
+ const_tree m_t;
+
+ public:
+ unextended_tree () {}
+ unextended_tree (const_tree t) : m_t (t) {}
+
+ unsigned int get_precision () const;
+ const HOST_WIDE_INT *get_val () const;
+ unsigned int get_len () const;
+ const_tree get_tree () const { return m_t; }
+ };
+
+ template <>
+ struct int_traits <unextended_tree>
+ {
+ static const enum precision_type precision_type = VAR_PRECISION;
+ static const bool host_dependent_precision = false;
+ static const bool is_sign_extended = false;
+ };
+
template <int N>
class extended_tree
{
@@ -5139,11 +5225,13 @@ extern bool anon_aggrname_p (const_tree)
const_tree m_t;
public:
+ extended_tree () {}
extended_tree (const_tree);
unsigned int get_precision () const;
const HOST_WIDE_INT *get_val () const;
unsigned int get_len () const;
+ const_tree get_tree () const { return m_t; }
};
template <int N>
@@ -5155,10 +5243,11 @@ extern bool anon_aggrname_p (const_tree)
static const unsigned int precision = N;
};
- typedef const generic_wide_int <extended_tree <WIDE_INT_MAX_PRECISION> >
- tree_to_widest_ref;
- typedef const generic_wide_int <extended_tree <ADDR_MAX_PRECISION> >
- tree_to_offset_ref;
+ typedef extended_tree <WIDE_INT_MAX_PRECISION> widest_extended_tree;
+ typedef extended_tree <ADDR_MAX_PRECISION> offset_extended_tree;
+
+ typedef const generic_wide_int <widest_extended_tree> tree_to_widest_ref;
+ typedef const generic_wide_int <offset_extended_tree> tree_to_offset_ref;
typedef const generic_wide_int<wide_int_ref_storage<false, false> >
tree_to_wide_ref;
@@ -5166,6 +5255,34 @@ extern bool anon_aggrname_p (const_tree)
tree_to_offset_ref to_offset (const_tree);
tree_to_wide_ref to_wide (const_tree);
wide_int to_wide (const_tree, unsigned int);
+
+ typedef const poly_int <NUM_POLY_INT_COEFFS,
+ generic_wide_int <widest_extended_tree> >
+ tree_to_poly_widest_ref;
+ typedef const poly_int <NUM_POLY_INT_COEFFS,
+ generic_wide_int <offset_extended_tree> >
+ tree_to_poly_offset_ref;
+ typedef const poly_int <NUM_POLY_INT_COEFFS,
+ generic_wide_int <unextended_tree> >
+ tree_to_poly_wide_ref;
+
+ tree_to_poly_widest_ref to_poly_widest (const_tree);
+ tree_to_poly_offset_ref to_poly_offset (const_tree);
+ tree_to_poly_wide_ref to_poly_wide (const_tree);
+
+ template <int N>
+ struct ints_for <generic_wide_int <extended_tree <N> >, CONST_PRECISION>
+ {
+ typedef generic_wide_int <extended_tree <N> > extended;
+ static extended zero (const extended &);
+ };
+
+ template <>
+ struct ints_for <generic_wide_int <unextended_tree>, VAR_PRECISION>
+ {
+ typedef generic_wide_int <unextended_tree> unextended;
+ static unextended zero (const unextended &);
+ };
}
/* Refer to INTEGER_CST T as though it were a widest_int.
@@ -5310,6 +5427,95 @@ wi::extended_tree <N>::get_len () const
gcc_unreachable ();
}
+inline unsigned int
+wi::unextended_tree::get_precision () const
+{
+ return TYPE_PRECISION (TREE_TYPE (m_t));
+}
+
+inline const HOST_WIDE_INT *
+wi::unextended_tree::get_val () const
+{
+ return &TREE_INT_CST_ELT (m_t, 0);
+}
+
+inline unsigned int
+wi::unextended_tree::get_len () const
+{
+ return TREE_INT_CST_NUNITS (m_t);
+}
+
+/* Return the value of a POLY_INT_CST in its native precision. */
+
+inline wi::tree_to_poly_wide_ref
+poly_int_cst_value (const_tree x)
+{
+ poly_int <NUM_POLY_INT_COEFFS, generic_wide_int <wi::unextended_tree> > res;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ res.coeffs[i] = POLY_INT_CST_COEFF (x, i);
+ return res;
+}
+
+/* Access INTEGER_CST or POLY_INT_CST tree T as if it were a
+ poly_widest_int. See wi::to_widest for more details. */
+
+inline wi::tree_to_poly_widest_ref
+wi::to_poly_widest (const_tree t)
+{
+ if (POLY_INT_CST_P (t))
+ {
+ poly_int <NUM_POLY_INT_COEFFS,
+ generic_wide_int <widest_extended_tree> > res;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ res.coeffs[i] = POLY_INT_CST_COEFF (t, i);
+ return res;
+ }
+ return t;
+}
+
+/* Access INTEGER_CST or POLY_INT_CST tree T as if it were a
+ poly_offset_int. See wi::to_offset for more details. */
+
+inline wi::tree_to_poly_offset_ref
+wi::to_poly_offset (const_tree t)
+{
+ if (POLY_INT_CST_P (t))
+ {
+ poly_int <NUM_POLY_INT_COEFFS,
+ generic_wide_int <offset_extended_tree> > res;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ res.coeffs[i] = POLY_INT_CST_COEFF (t, i);
+ return res;
+ }
+ return t;
+}
+
+/* Access INTEGER_CST or POLY_INT_CST tree T as if it were a
+ poly_wide_int. See wi::to_wide for more details. */
+
+inline wi::tree_to_poly_wide_ref
+wi::to_poly_wide (const_tree t)
+{
+ if (POLY_INT_CST_P (t))
+ return poly_int_cst_value (t);
+ return t;
+}
+
+template <int N>
+inline generic_wide_int <wi::extended_tree <N> >
+wi::ints_for <generic_wide_int <wi::extended_tree <N> >,
+ wi::CONST_PRECISION>::zero (const extended &x)
+{
+ return build_zero_cst (TREE_TYPE (x.get_tree ()));
+}
+
+inline generic_wide_int <wi::unextended_tree>
+wi::ints_for <generic_wide_int <wi::unextended_tree>,
+ wi::VAR_PRECISION>::zero (const unextended &x)
+{
+ return build_zero_cst (TREE_TYPE (x.get_tree ()));
+}
+
namespace wi
{
template <typename T>
@@ -5327,7 +5533,8 @@ wi::extended_tree <N>::get_len () const
bool
wi::fits_to_boolean_p (const T &x, const_tree type)
{
- return eq_p (x, 0) || eq_p (x, TYPE_UNSIGNED (type) ? 1 : -1);
+ return (known_zero (x)
+ || (TYPE_UNSIGNED (type) ? known_one (x) : known_all_ones (x)));
}
template <typename T>
@@ -5340,9 +5547,9 @@ wi::fits_to_tree_p (const T &x, const_tr
return fits_to_boolean_p (x, type);
if (TYPE_UNSIGNED (type))
- return eq_p (x, zext (x, TYPE_PRECISION (type)));
+ return must_eq (x, zext (x, TYPE_PRECISION (type)));
else
- return eq_p (x, sext (x, TYPE_PRECISION (type)));
+ return must_eq (x, sext (x, TYPE_PRECISION (type)));
}
/* Produce the smallest number that is represented in TYPE. The precision
Index: gcc/tree.c
===================================================================
--- gcc/tree.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree.c 2017-10-23 17:00:57.783962919 +0100
@@ -203,6 +203,17 @@ struct int_cst_hasher : ggc_cache_ptr_ha
static GTY ((cache)) hash_table<int_cst_hasher> *int_cst_hash_table;
+/* Class and variable for making sure that there is a single POLY_INT_CST
+ for a given value. */
+struct poly_int_cst_hasher : ggc_cache_ptr_hash<tree_node>
+{
+ typedef std::pair<tree, const poly_wide_int *> compare_type;
+ static hashval_t hash (tree t);
+ static bool equal (tree x, const compare_type &y);
+};
+
+static GTY ((cache)) hash_table<poly_int_cst_hasher> *poly_int_cst_hash_table;
+
/* Hash table for optimization flags and target option flags. Use the same
hash table for both sets of options. Nodes for building the current
optimization and target option nodes. The assumption is most of the time
@@ -460,6 +471,7 @@ tree_node_structure_for_code (enum tree_
/* tcc_constant cases. */
case VOID_CST: return TS_TYPED;
case INTEGER_CST: return TS_INT_CST;
+ case POLY_INT_CST: return TS_POLY_INT_CST;
case REAL_CST: return TS_REAL_CST;
case FIXED_CST: return TS_FIXED_CST;
case COMPLEX_CST: return TS_COMPLEX;
@@ -519,6 +531,7 @@ initialize_tree_contains_struct (void)
case TS_COMMON:
case TS_INT_CST:
+ case TS_POLY_INT_CST:
case TS_REAL_CST:
case TS_FIXED_CST:
case TS_VECTOR:
@@ -652,6 +665,8 @@ init_ttree (void)
int_cst_hash_table = hash_table<int_cst_hasher>::create_ggc (1024);
+ poly_int_cst_hash_table = hash_table<poly_int_cst_hasher>::create_ggc (64);
+
int_cst_node = make_int_cst (1, 1);
cl_option_hash_table = hash_table<cl_option_hasher>::create_ggc (64);
@@ -814,6 +829,7 @@ tree_code_size (enum tree_code code)
{
case VOID_CST: return sizeof (struct tree_typed);
case INTEGER_CST: gcc_unreachable ();
+ case POLY_INT_CST: return sizeof (struct tree_poly_int_cst);
case REAL_CST: return sizeof (struct tree_real_cst);
case FIXED_CST: return sizeof (struct tree_fixed_cst);
case COMPLEX_CST: return sizeof (struct tree_complex);
@@ -1298,31 +1314,51 @@ build_new_int_cst (tree type, const wide
return nt;
}
-/* Create an INT_CST node with a LOW value sign extended to TYPE. */
+/* Return a new POLY_INT_CST with coefficients COEFFS and type TYPE. */
+
+static tree
+build_new_poly_int_cst (tree type, tree (&coeffs)[NUM_POLY_INT_COEFFS])
+{
+ size_t length = sizeof (struct tree_poly_int_cst);
+ record_node_allocation_statistics (POLY_INT_CST, length);
+
+ tree t = ggc_alloc_cleared_tree_node_stat (length PASS_MEM_STAT);
+
+ TREE_SET_CODE (t, POLY_INT_CST);
+ TREE_CONSTANT (t) = 1;
+ TREE_TYPE (t) = type;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ POLY_INT_CST_COEFF (t, i) = coeffs[i];
+ return t;
+}
+
+/* Create a constant tree that contains CST sign-extended to TYPE. */
tree
-build_int_cst (tree type, HOST_WIDE_INT low)
+build_int_cst (tree type, poly_int64 cst)
{
/* Support legacy code. */
if (!type)
type = integer_type_node;
- return wide_int_to_tree (type, wi::shwi (low, TYPE_PRECISION (type)));
+ return wide_int_to_tree (type, wi::shwi (cst, TYPE_PRECISION (type)));
}
+/* Create a constant tree that contains CST zero-extended to TYPE. */
+
tree
-build_int_cstu (tree type, unsigned HOST_WIDE_INT cst)
+build_int_cstu (tree type, poly_uint64 cst)
{
return wide_int_to_tree (type, wi::uhwi (cst, TYPE_PRECISION (type)));
}
-/* Create an INT_CST node with a LOW value sign extended to TYPE. */
+/* Create a constant tree that contains CST sign-extended to TYPE. */
tree
-build_int_cst_type (tree type, HOST_WIDE_INT low)
+build_int_cst_type (tree type, poly_int64 cst)
{
gcc_assert (type);
- return wide_int_to_tree (type, wi::shwi (low, TYPE_PRECISION (type)));
+ return wide_int_to_tree (type, wi::shwi (cst, TYPE_PRECISION (type)));
}
/* Constructs tree in type TYPE from with value given by CST. Signedness
@@ -1350,7 +1386,7 @@ double_int_to_tree (tree type, double_in
tree
-force_fit_type (tree type, const wide_int_ref &cst,
+force_fit_type (tree type, const poly_wide_int_ref &cst,
int overflowable, bool overflowed)
{
signop sign = TYPE_SIGN (type);
@@ -1362,8 +1398,21 @@ force_fit_type (tree type, const wide_in
|| overflowable < 0
|| (overflowable > 0 && sign == SIGNED))
{
- wide_int tmp = wide_int::from (cst, TYPE_PRECISION (type), sign);
- tree t = build_new_int_cst (type, tmp);
+ poly_wide_int tmp = poly_wide_int::from (cst, TYPE_PRECISION (type),
+ sign);
+ tree t;
+ if (tmp.is_constant ())
+ t = build_new_int_cst (type, tmp.coeffs[0]);
+ else
+ {
+ tree coeffs[NUM_POLY_INT_COEFFS];
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ {
+ coeffs[i] = build_new_int_cst (type, tmp.coeffs[i]);
+ TREE_OVERFLOW (coeffs[i]) = 1;
+ }
+ t = build_new_poly_int_cst (type, coeffs);
+ }
TREE_OVERFLOW (t) = 1;
return t;
}
@@ -1420,8 +1469,8 @@ int_cst_hasher::equal (tree x, tree y)
the upper bits and ensures that hashing and value equality based
upon the underlying HOST_WIDE_INTs works without masking. */
-tree
-wide_int_to_tree (tree type, const wide_int_ref &pcst)
+static tree
+wide_int_to_tree_1 (tree type, const wide_int_ref &pcst)
{
tree t;
int ix = -1;
@@ -1566,6 +1615,66 @@ wide_int_to_tree (tree type, const wide_
return t;
}
+hashval_t
+poly_int_cst_hasher::hash (tree t)
+{
+ inchash::hash hstate;
+
+ hstate.add_int (TYPE_UID (TREE_TYPE (t)));
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ hstate.add_wide_int (wi::to_wide (POLY_INT_CST_COEFF (t, i)));
+
+ return hstate.end ();
+}
+
+bool
+poly_int_cst_hasher::equal (tree x, const compare_type &y)
+{
+ if (TREE_TYPE (x) != y.first)
+ return false;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ if (wi::to_wide (POLY_INT_CST_COEFF (x, i)) != y.second->coeffs[i])
+ return false;
+ return true;
+}
+
+/* Build a POLY_INT_CST node with type TYPE and with the elements in VALUES.
+ The elements must also have type TYPE. */
+
+tree
+build_poly_int_cst (tree type, const poly_wide_int_ref &values)
+{
+ unsigned int prec = TYPE_PRECISION (type);
+ gcc_assert (prec <= values.coeffs[0].get_precision ());
+ poly_wide_int c = poly_wide_int::from (values, prec, SIGNED);
+
+ inchash::hash h;
+ h.add_int (TYPE_UID (type));
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ h.add_wide_int (c.coeffs[i]);
+ poly_int_cst_hasher::compare_type comp (type, &c);
+ tree *slot = poly_int_cst_hash_table->find_slot_with_hash (comp, h.end (),
+ INSERT);
+ if (*slot == NULL_TREE)
+ {
+ tree coeffs[NUM_POLY_INT_COEFFS];
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ coeffs[i] = wide_int_to_tree_1 (type, c.coeffs[i]);
+ *slot = build_new_poly_int_cst (type, coeffs);
+ }
+ return *slot;
+}
+
+/* Create a constant tree with value VALUE in type TYPE. */
+
+tree
+wide_int_to_tree (tree type, const poly_wide_int_ref &value)
+{
+ if (value.is_constant ())
+ return wide_int_to_tree_1 (type, value.coeffs[0]);
+ return build_poly_int_cst (type, value);
+}
+
void
cache_integer_cst (tree t)
{
@@ -2791,6 +2900,59 @@ really_constant_p (const_tree exp)
exp = TREE_OPERAND (exp, 0);
return TREE_CONSTANT (exp);
}
+
+/* Return true if T holds a polynomial pointer difference, storing it in
+ *VALUE if so. A true return means that T's precision is no greater
+ than 64 bits, which is the largest address space we support, so *VALUE
+ never loses precision. However, the signedness of the result is
+ somewhat arbitrary, since if B lives near the end of a 64-bit address
+ range and A lives near the beginning, B - A is a large positive value
+ outside the range of int64_t. A - B is likewise a large negative value
+ outside the range of int64_t. All the pointer difference really
+ gives is a raw pointer-sized bitstring that can be added to the first
+ pointer value to get the second. */
+
+bool
+ptrdiff_tree_p (const_tree t, poly_int64_pod *value)
+{
+ if (!t)
+ return false;
+ if (TREE_CODE (t) == INTEGER_CST)
+ {
+ if (!cst_and_fits_in_hwi (t))
+ return false;
+ *value = int_cst_value (t);
+ return true;
+ }
+ if (POLY_INT_CST_P (t))
+ {
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ if (!cst_and_fits_in_hwi (POLY_INT_CST_COEFF (t, i)))
+ return false;
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ value->coeffs[i] = int_cst_value (POLY_INT_CST_COEFF (t, i));
+ return true;
+ }
+ return false;
+}
+
+poly_int64
+tree_to_poly_int64 (const_tree t)
+{
+ gcc_assert (tree_fits_poly_int64_p (t));
+ if (POLY_INT_CST_P (t))
+ return poly_int_cst_value (t).force_shwi ();
+ return TREE_INT_CST_LOW (t);
+}
+
+poly_uint64
+tree_to_poly_uint64 (const_tree t)
+{
+ gcc_assert (tree_fits_poly_uint64_p (t));
+ if (POLY_INT_CST_P (t))
+ return poly_int_cst_value (t).force_uhwi ();
+ return TREE_INT_CST_LOW (t);
+}
\f
/* Return first list element whose TREE_VALUE is ELEM.
Return 0 if ELEM is not in LIST. */
@@ -4773,7 +4935,7 @@ mem_ref_offset (const_tree t)
offsetted by OFFSET units. */
tree
-build_invariant_address (tree type, tree base, HOST_WIDE_INT offset)
+build_invariant_address (tree type, tree base, poly_int64 offset)
{
tree ref = fold_build2 (MEM_REF, TREE_TYPE (type),
build_fold_addr_expr (base),
@@ -6661,6 +6823,25 @@ tree_fits_shwi_p (const_tree t)
&& wi::fits_shwi_p (wi::to_widest (t)));
}
+/* Return true if T is an INTEGER_CST or POLY_INT_CST whose numerical
+ value (extended according to TYPE_UNSIGNED) fits in a poly_int64. */
+
+bool
+tree_fits_poly_int64_p (const_tree t)
+{
+ if (t == NULL_TREE)
+ return false;
+ if (POLY_INT_CST_P (t))
+ {
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; i++)
+ if (!wi::fits_shwi_p (wi::to_wide (POLY_INT_CST_COEFF (t, i))))
+ return false;
+ return true;
+ }
+ return (TREE_CODE (t) == INTEGER_CST
+ && wi::fits_shwi_p (wi::to_widest (t)));
+}
+
/* Return true if T is an INTEGER_CST whose numerical value (extended
according to TYPE_UNSIGNED) fits in an unsigned HOST_WIDE_INT. */
@@ -6672,6 +6853,25 @@ tree_fits_uhwi_p (const_tree t)
&& wi::fits_uhwi_p (wi::to_widest (t)));
}
+/* Return true if T is an INTEGER_CST or POLY_INT_CST whose numerical
+ value (extended according to TYPE_UNSIGNED) fits in a poly_uint64. */
+
+bool
+tree_fits_poly_uint64_p (const_tree t)
+{
+ if (t == NULL_TREE)
+ return false;
+ if (POLY_INT_CST_P (t))
+ {
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; i++)
+ if (!wi::fits_uhwi_p (wi::to_widest (POLY_INT_CST_COEFF (t, i))))
+ return false;
+ return true;
+ }
+ return (TREE_CODE (t) == INTEGER_CST
+ && wi::fits_uhwi_p (wi::to_widest (t)));
+}
+
/* T is an INTEGER_CST whose numerical value (extended according to
TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT. Return that
HOST_WIDE_INT. */
@@ -6880,6 +7080,12 @@ simple_cst_equal (const_tree t1, const_t
return 0;
default:
+ if (POLY_INT_CST_P (t1))
+ /* A false return means may_ne rather than must_ne. */
+ return must_eq (poly_widest_int::from (poly_int_cst_value (t1),
+ TYPE_SIGN (TREE_TYPE (t1))),
+ poly_widest_int::from (poly_int_cst_value (t2),
+ TYPE_SIGN (TREE_TYPE (t2))));
break;
}
@@ -6939,8 +7145,16 @@ compare_tree_int (const_tree t, unsigned
bool
valid_constant_size_p (const_tree size)
{
+ if (TREE_OVERFLOW (size))
+ return false;
+ if (POLY_INT_CST_P (size))
+ {
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ if (!valid_constant_size_p (POLY_INT_CST_COEFF (size, i)))
+ return false;
+ return true;
+ }
if (! tree_fits_uhwi_p (size)
- || TREE_OVERFLOW (size)
|| tree_int_cst_sign_bit (size) != 0)
return false;
return true;
@@ -7239,6 +7453,12 @@ add_expr (const_tree t, inchash::hash &h
}
/* FALL THROUGH */
default:
+ if (POLY_INT_CST_P (t))
+ {
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ hstate.add_wide_int (wi::to_wide (POLY_INT_CST_COEFF (t, i)));
+ return;
+ }
tclass = TREE_CODE_CLASS (code);
if (tclass == tcc_declaration)
@@ -7776,7 +7996,7 @@ build_nonshared_array_type (tree elt_typ
sizetype. */
tree
-build_array_type_nelts (tree elt_type, unsigned HOST_WIDE_INT nelts)
+build_array_type_nelts (tree elt_type, poly_uint64 nelts)
{
return build_array_type (elt_type, build_index_type (size_int (nelts - 1)));
}
@@ -12459,8 +12679,8 @@ drop_tree_overflow (tree t)
gcc_checking_assert (TREE_OVERFLOW (t));
/* For tree codes with a sharing machinery re-build the result. */
- if (TREE_CODE (t) == INTEGER_CST)
- return wide_int_to_tree (TREE_TYPE (t), wi::to_wide (t));
+ if (poly_int_tree_p (t))
+ return wide_int_to_tree (TREE_TYPE (t), wi::to_poly_wide (t));
/* Otherwise, as all tcc_constants are possibly shared, copy the node
and drop the flag. */
Index: gcc/lto-streamer-out.c
===================================================================
--- gcc/lto-streamer-out.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/lto-streamer-out.c 2017-10-23 17:00:57.776969281 +0100
@@ -751,6 +751,10 @@ #define DFS_follow_tree_edge(DEST) \
DFS_follow_tree_edge (VECTOR_CST_ELT (expr, i));
}
+ if (CODE_CONTAINS_STRUCT (code, TS_POLY_INT_CST))
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ DFS_follow_tree_edge (POLY_INT_CST_COEFF (expr, i));
+
if (CODE_CONTAINS_STRUCT (code, TS_COMPLEX))
{
DFS_follow_tree_edge (TREE_REALPART (expr));
@@ -1202,6 +1206,10 @@ #define visit(SIBLING) \
for (unsigned i = 0; i < VECTOR_CST_NELTS (t); ++i)
visit (VECTOR_CST_ELT (t, i));
+ if (CODE_CONTAINS_STRUCT (code, TS_POLY_INT_CST))
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ visit (POLY_INT_CST_COEFF (t, i));
+
if (CODE_CONTAINS_STRUCT (code, TS_COMPLEX))
{
visit (TREE_REALPART (t));
Index: gcc/tree-streamer-in.c
===================================================================
--- gcc/tree-streamer-in.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-streamer-in.c 2017-10-23 17:00:57.780965645 +0100
@@ -654,6 +654,19 @@ lto_input_ts_vector_tree_pointers (struc
}
+/* Read all pointer fields in the TS_POLY_INT_CST structure of EXPR from
+ input block IB. DATA_IN contains tables and descriptors for the
+ file being read. */
+
+static void
+lto_input_ts_poly_tree_pointers (struct lto_input_block *ib,
+ struct data_in *data_in, tree expr)
+{
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ POLY_INT_CST_COEFF (expr, i) = stream_read_tree (ib, data_in);
+}
+
+
/* Read all pointer fields in the TS_COMPLEX structure of EXPR from input
block IB. DATA_IN contains tables and descriptors for the
file being read. */
@@ -1037,6 +1050,9 @@ streamer_read_tree_body (struct lto_inpu
if (CODE_CONTAINS_STRUCT (code, TS_VECTOR))
lto_input_ts_vector_tree_pointers (ib, data_in, expr);
+ if (CODE_CONTAINS_STRUCT (code, TS_POLY_INT_CST))
+ lto_input_ts_poly_tree_pointers (ib, data_in, expr);
+
if (CODE_CONTAINS_STRUCT (code, TS_COMPLEX))
lto_input_ts_complex_tree_pointers (ib, data_in, expr);
Index: gcc/tree-streamer-out.c
===================================================================
--- gcc/tree-streamer-out.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-streamer-out.c 2017-10-23 17:00:57.780965645 +0100
@@ -539,6 +539,18 @@ write_ts_vector_tree_pointers (struct ou
}
+/* Write all pointer fields in the TS_POLY_INT_CST structure of EXPR to
+ output block OB. If REF_P is true, write a reference to EXPR's pointer
+ fields. */
+
+static void
+write_ts_poly_tree_pointers (struct output_block *ob, tree expr, bool ref_p)
+{
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ stream_write_tree (ob, POLY_INT_CST_COEFF (expr, i), ref_p);
+}
+
+
/* Write all pointer fields in the TS_COMPLEX structure of EXPR to output
block OB. If REF_P is true, write a reference to EXPR's pointer
fields. */
@@ -880,6 +892,9 @@ streamer_write_tree_body (struct output_
if (CODE_CONTAINS_STRUCT (code, TS_VECTOR))
write_ts_vector_tree_pointers (ob, expr, ref_p);
+ if (CODE_CONTAINS_STRUCT (code, TS_POLY_INT_CST))
+ write_ts_poly_tree_pointers (ob, expr, ref_p);
+
if (CODE_CONTAINS_STRUCT (code, TS_COMPLEX))
write_ts_complex_tree_pointers (ob, expr, ref_p);
Index: gcc/tree-streamer.c
===================================================================
--- gcc/tree-streamer.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-streamer.c 2017-10-23 17:00:57.780965645 +0100
@@ -55,6 +55,7 @@ streamer_check_handled_ts_structures (vo
handled_p[TS_TYPED] = true;
handled_p[TS_COMMON] = true;
handled_p[TS_INT_CST] = true;
+ handled_p[TS_POLY_INT_CST] = true;
handled_p[TS_REAL_CST] = true;
handled_p[TS_FIXED_CST] = true;
handled_p[TS_VECTOR] = true;
Index: gcc/asan.c
===================================================================
--- gcc/asan.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/asan.c 2017-10-23 17:00:57.770974734 +0100
@@ -1647,6 +1647,7 @@ asan_protect_global (tree decl)
&& !section_sanitized_p (DECL_SECTION_NAME (decl)))
|| DECL_SIZE (decl) == 0
|| ASAN_RED_ZONE_SIZE * BITS_PER_UNIT > MAX_OFILE_ALIGNMENT
+ || TREE_CODE (DECL_SIZE_UNIT (decl)) != INTEGER_CST
|| !valid_constant_size_p (DECL_SIZE_UNIT (decl))
|| DECL_ALIGN_UNIT (decl) > 2 * ASAN_RED_ZONE_SIZE
|| TREE_TYPE (decl) == ubsan_get_source_location_type ()
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/cfgexpand.c 2017-10-23 17:00:57.770974734 +0100
@@ -4244,6 +4244,9 @@ expand_debug_expr (tree exp)
op0 = expand_expr (exp, NULL_RTX, mode, EXPAND_INITIALIZER);
return op0;
+ case POLY_INT_CST:
+ return immed_wide_int_const (poly_int_cst_value (exp), mode);
+
case COMPLEX_CST:
gcc_assert (COMPLEX_MODE_P (mode));
op0 = expand_debug_expr (TREE_REALPART (exp));
Index: gcc/expr.c
===================================================================
--- gcc/expr.c 2017-10-23 17:00:54.442003055 +0100
+++ gcc/expr.c 2017-10-23 17:00:57.772972916 +0100
@@ -7717,6 +7717,8 @@ const_vector_element (scalar_mode mode,
return const_double_from_real_value (TREE_REAL_CST (elt), mode);
if (TREE_CODE (elt) == FIXED_CST)
return CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt), mode);
+ if (POLY_INT_CST_P (elt))
+ return immed_wide_int_const (poly_int_cst_value (elt), mode);
return immed_wide_int_const (wi::to_wide (elt), mode);
}
@@ -10132,6 +10134,9 @@ expand_expr_real_1 (tree exp, rtx target
copy_rtx (XEXP (temp, 0)));
return temp;
+ case POLY_INT_CST:
+ return immed_wide_int_const (poly_int_cst_value (exp), mode);
+
case SAVE_EXPR:
{
tree val = treeop0;
Index: gcc/gimple-expr.h
===================================================================
--- gcc/gimple-expr.h 2017-10-23 16:52:20.504766418 +0100
+++ gcc/gimple-expr.h 2017-10-23 17:00:57.774971099 +0100
@@ -130,6 +130,7 @@ is_gimple_constant (const_tree t)
switch (TREE_CODE (t))
{
case INTEGER_CST:
+ case POLY_INT_CST:
case REAL_CST:
case FIXED_CST:
case COMPLEX_CST:
Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/gimplify.c 2017-10-23 17:00:57.776969281 +0100
@@ -3028,7 +3028,7 @@ maybe_with_size_expr (tree *expr_p)
/* If the size isn't known or is a constant, we have nothing to do. */
size = TYPE_SIZE_UNIT (type);
- if (!size || TREE_CODE (size) == INTEGER_CST)
+ if (!size || poly_int_tree_p (size))
return;
/* Otherwise, make a WITH_SIZE_EXPR. */
Index: gcc/print-tree.c
===================================================================
--- gcc/print-tree.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/print-tree.c 2017-10-23 17:00:57.776969281 +0100
@@ -814,6 +814,18 @@ print_node (FILE *file, const char *pref
}
break;
+ case POLY_INT_CST:
+ {
+ char buf[10];
+ for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ {
+ snprintf (buf, sizeof (buf), "elt%u: ", i);
+ print_node (file, buf, POLY_INT_CST_COEFF (node, i),
+ indent + 4);
+ }
+ }
+ break;
+
case IDENTIFIER_NODE:
lang_hooks.print_identifier (file, node, indent);
break;
Index: gcc/tree-data-ref.c
===================================================================
--- gcc/tree-data-ref.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-data-ref.c 2017-10-23 17:00:57.778967463 +0100
@@ -1235,6 +1235,10 @@ data_ref_compare_tree (tree t1, tree t2)
break;
default:
+ if (POLY_INT_CST_P (t1))
+ return compare_sizes_for_sort (wi::to_poly_widest (t1),
+ wi::to_poly_widest (t2));
+
tclass = TREE_CODE_CLASS (code);
/* For decls, compare their UIDs. */
Index: gcc/tree-pretty-print.c
===================================================================
--- gcc/tree-pretty-print.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-pretty-print.c 2017-10-23 17:00:57.779966554 +0100
@@ -1744,6 +1744,18 @@ dump_generic_node (pretty_printer *pp, t
pp_string (pp, "(OVF)");
break;
+ case POLY_INT_CST:
+ pp_string (pp, "POLY_INT_CST [");
+ dump_generic_node (pp, POLY_INT_CST_COEFF (node, 0), spc, flags, false);
+ for (unsigned int i = 1; i < NUM_POLY_INT_COEFFS; ++i)
+ {
+ pp_string (pp, ", ");
+ dump_generic_node (pp, POLY_INT_CST_COEFF (node, i),
+ spc, flags, false);
+ }
+ pp_string (pp, "]");
+ break;
+
case REAL_CST:
/* Code copied from print_node. */
{
Index: gcc/tree-ssa-address.c
===================================================================
--- gcc/tree-ssa-address.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-ssa-address.c 2017-10-23 17:00:57.779966554 +0100
@@ -203,7 +203,8 @@ addr_for_mem_ref (struct mem_address *ad
if (addr->offset && !integer_zerop (addr->offset))
{
- offset_int dc = offset_int::from (wi::to_wide (addr->offset), SIGNED);
+ poly_offset_int dc
+ = poly_offset_int::from (wi::to_poly_wide (addr->offset), SIGNED);
off = immed_wide_int_const (dc, pointer_mode);
}
else
Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-vect-data-refs.c 2017-10-23 17:00:57.781964737 +0100
@@ -2753,7 +2753,7 @@ dr_group_sort_cmp (const void *dra_, con
return cmp;
/* Then sort after DR_INIT. In case of identical DRs sort after stmt UID. */
- cmp = tree_int_cst_compare (DR_INIT (dra), DR_INIT (drb));
+ cmp = data_ref_compare_tree (DR_INIT (dra), DR_INIT (drb));
if (cmp == 0)
return gimple_uid (DR_STMT (dra)) < gimple_uid (DR_STMT (drb)) ? -1 : 1;
return cmp;
Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-vrp.c 2017-10-23 17:00:57.782963828 +0100
@@ -1121,7 +1121,24 @@ compare_values_warnv (tree val1, tree va
if (TREE_OVERFLOW (val1) || TREE_OVERFLOW (val2))
return -2;
- return tree_int_cst_compare (val1, val2);
+ if (TREE_CODE (val1) == INTEGER_CST
+ && TREE_CODE (val2) == INTEGER_CST)
+ return tree_int_cst_compare (val1, val2);
+
+ if (poly_int_tree_p (val1) && poly_int_tree_p (val2))
+ {
+ if (must_eq (wi::to_poly_widest (val1),
+ wi::to_poly_widest (val2)))
+ return 0;
+ if (must_lt (wi::to_poly_widest (val1),
+ wi::to_poly_widest (val2)))
+ return -1;
+ if (must_gt (wi::to_poly_widest (val1),
+ wi::to_poly_widest (val2)))
+ return 1;
+ }
+
+ return -2;
}
else
{
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-ssa-loop-ivopts.c 2017-10-23 17:00:57.780965645 +0100
@@ -1127,6 +1127,8 @@ determine_base_object (tree expr)
gcc_unreachable ();
default:
+ if (POLY_INT_CST_P (expr))
+ return NULL_TREE;
return fold_convert (ptr_type_node, expr);
}
}
@@ -2168,6 +2170,12 @@ constant_multiple_of (tree top, tree bot
return res == 0;
default:
+ if (POLY_INT_CST_P (top)
+ && POLY_INT_CST_P (bot)
+ && constant_multiple_p (wi::to_poly_widest (top),
+ wi::to_poly_widest (bot), mul))
+ return true;
+
return false;
}
}
@@ -2967,7 +2975,8 @@ get_loop_invariant_expr (struct ivopts_d
{
STRIP_NOPS (inv_expr);
- if (TREE_CODE (inv_expr) == INTEGER_CST || TREE_CODE (inv_expr) == SSA_NAME)
+ if (poly_int_tree_p (inv_expr)
+ || TREE_CODE (inv_expr) == SSA_NAME)
return NULL;
/* Don't strip constant part away as we used to. */
@@ -3064,7 +3073,7 @@ add_candidate_1 (struct ivopts_data *dat
cand->incremented_at = incremented_at;
data->vcands.safe_push (cand);
- if (TREE_CODE (step) != INTEGER_CST)
+ if (!poly_int_tree_p (step))
{
find_inv_vars (data, &step, &cand->inv_vars);
@@ -3800,7 +3809,7 @@ get_computation_aff_1 (struct loop *loop
if (TYPE_PRECISION (utype) < TYPE_PRECISION (ctype))
{
if (cand->orig_iv != NULL && CONVERT_EXPR_P (cbase)
- && (CONVERT_EXPR_P (cstep) || TREE_CODE (cstep) == INTEGER_CST))
+ && (CONVERT_EXPR_P (cstep) || poly_int_tree_p (cstep)))
{
tree inner_base, inner_step, inner_type;
inner_base = TREE_OPERAND (cbase, 0);
@@ -4058,7 +4067,7 @@ force_expr_to_var_cost (tree expr, bool
if (is_gimple_min_invariant (expr))
{
- if (TREE_CODE (expr) == INTEGER_CST)
+ if (poly_int_tree_p (expr))
return comp_cost (integer_cost [speed], 0);
if (TREE_CODE (expr) == ADDR_EXPR)
Index: gcc/tree-ssa-loop.c
===================================================================
--- gcc/tree-ssa-loop.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-ssa-loop.c 2017-10-23 17:00:57.780965645 +0100
@@ -620,6 +620,7 @@ for_each_index (tree *addr_p, bool (*cbc
case VEC_SERIES_CST:
case COMPLEX_CST:
case INTEGER_CST:
+ case POLY_INT_CST:
case REAL_CST:
case FIXED_CST:
case CONSTRUCTOR:
Index: gcc/fold-const.h
===================================================================
--- gcc/fold-const.h 2017-10-23 16:52:20.504766418 +0100
+++ gcc/fold-const.h 2017-10-23 17:00:57.774971099 +0100
@@ -115,7 +115,7 @@ extern tree build_simple_mem_ref_loc (lo
#define build_simple_mem_ref(T)\
build_simple_mem_ref_loc (UNKNOWN_LOCATION, T)
extern offset_int mem_ref_offset (const_tree);
-extern tree build_invariant_address (tree, tree, HOST_WIDE_INT);
+extern tree build_invariant_address (tree, tree, poly_int64);
extern tree constant_boolean_node (bool, tree);
extern tree div_if_zero_remainder (const_tree, const_tree);
@@ -152,7 +152,7 @@ #define round_up(T,N) round_up_loc (UNKN
extern tree round_up_loc (location_t, tree, unsigned int);
#define round_down(T,N) round_down_loc (UNKNOWN_LOCATION, T, N)
extern tree round_down_loc (location_t, tree, int);
-extern tree size_int_kind (HOST_WIDE_INT, enum size_type_kind);
+extern tree size_int_kind (poly_int64, enum size_type_kind);
#define size_binop(CODE,T1,T2)\
size_binop_loc (UNKNOWN_LOCATION, CODE, T1, T2)
extern tree size_binop_loc (location_t, enum tree_code, tree, tree);
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/fold-const.c 2017-10-23 17:00:57.774971099 +0100
@@ -553,10 +553,8 @@ fold_negate_expr_1 (location_t loc, tree
return tem;
break;
+ case POLY_INT_CST:
case REAL_CST:
- tem = fold_negate_const (t, type);
- return tem;
-
case FIXED_CST:
tem = fold_negate_const (t, type);
return tem;
@@ -986,13 +984,10 @@ int_binop_types_match_p (enum tree_code
&& TYPE_MODE (type1) == TYPE_MODE (type2);
}
-
-/* Combine two integer constants PARG1 and PARG2 under operation CODE
- to produce a new constant. Return NULL_TREE if we don't know how
- to evaluate CODE at compile-time. */
+/* Subroutine of int_const_binop_1 that handles two INTEGER_CSTs. */
static tree
-int_const_binop_1 (enum tree_code code, const_tree parg1, const_tree parg2,
+int_const_binop_2 (enum tree_code code, const_tree parg1, const_tree parg2,
int overflowable)
{
wide_int res;
@@ -1140,6 +1135,74 @@ int_const_binop_1 (enum tree_code code,
return t;
}
+/* Combine two integer constants PARG1 and PARG2 under operation CODE
+ to produce a new constant. Return NULL_TREE if we don't know how
+ to evaluate CODE at compile-time. */
+
+static tree
+int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree arg2,
+ int overflowable)
+{
+ if (TREE_CODE (arg1) == INTEGER_CST && TREE_CODE (arg2) == INTEGER_CST)
+ return int_const_binop_2 (code, arg1, arg2, overflowable);
+
+ gcc_assert (NUM_POLY_INT_COEFFS != 1);
+
+ if (poly_int_tree_p (arg1) && poly_int_tree_p (arg2))
+ {
+ poly_wide_int res;
+ bool overflow;
+ tree type = TREE_TYPE (arg1);
+ signop sign = TYPE_SIGN (type);
+ switch (code)
+ {
+ case PLUS_EXPR:
+ res = wi::add (wi::to_poly_wide (arg1),
+ wi::to_poly_wide (arg2), sign, &overflow);
+ break;
+
+ case MINUS_EXPR:
+ res = wi::sub (wi::to_poly_wide (arg1),
+ wi::to_poly_wide (arg2), sign, &overflow);
+ break;
+
+ case MULT_EXPR:
+ if (TREE_CODE (arg2) == INTEGER_CST)
+ res = wi::mul (wi::to_poly_wide (arg1),
+ wi::to_wide (arg2), sign, &overflow);
+ else if (TREE_CODE (arg1) == INTEGER_CST)
+ res = wi::mul (wi::to_poly_wide (arg2),
+ wi::to_wide (arg1), sign, &overflow);
+ else
+ return NULL_TREE;
+ break;
+
+ case LSHIFT_EXPR:
+ if (TREE_CODE (arg2) == INTEGER_CST)
+ res = wi::to_poly_wide (arg1) << wi::to_wide (arg2);
+ else
+ return NULL_TREE;
+ break;
+
+ case BIT_IOR_EXPR:
+ if (TREE_CODE (arg2) != INTEGER_CST
+ || !can_ior_p (wi::to_poly_wide (arg1), wi::to_wide (arg2),
+ &res))
+ return NULL_TREE;
+ break;
+
+ default:
+ return NULL_TREE;
+ }
+ return force_fit_type (type, res, overflowable,
+ (((sign == SIGNED || overflowable == -1)
+ && overflow)
+ | TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2)));
+ }
+
+ return NULL_TREE;
+}
+
tree
int_const_binop (enum tree_code code, const_tree arg1, const_tree arg2)
{
@@ -1183,7 +1246,7 @@ const_binop (enum tree_code code, tree a
STRIP_NOPS (arg1);
STRIP_NOPS (arg2);
- if (TREE_CODE (arg1) == INTEGER_CST && TREE_CODE (arg2) == INTEGER_CST)
+ if (poly_int_tree_p (arg1) && poly_int_tree_p (arg2))
{
if (code == POINTER_PLUS_EXPR)
return int_const_binop (PLUS_EXPR,
@@ -1721,6 +1784,8 @@ const_unop (enum tree_code code, tree ty
case BIT_NOT_EXPR:
if (TREE_CODE (arg0) == INTEGER_CST)
return fold_not_const (arg0, type);
+ else if (POLY_INT_CST_P (arg0))
+ return wide_int_to_tree (type, -poly_int_cst_value (arg0));
/* Perform BIT_NOT_EXPR on each element individually. */
else if (TREE_CODE (arg0) == VECTOR_CST)
{
@@ -1847,7 +1912,7 @@ const_unop (enum tree_code code, tree ty
indicates which particular sizetype to create. */
tree
-size_int_kind (HOST_WIDE_INT number, enum size_type_kind kind)
+size_int_kind (poly_int64 number, enum size_type_kind kind)
{
return build_int_cst (sizetype_tab[(int) kind], number);
}
@@ -1868,8 +1933,8 @@ size_binop_loc (location_t loc, enum tre
gcc_assert (int_binop_types_match_p (code, TREE_TYPE (arg0),
TREE_TYPE (arg1)));
- /* Handle the special case of two integer constants faster. */
- if (TREE_CODE (arg0) == INTEGER_CST && TREE_CODE (arg1) == INTEGER_CST)
+ /* Handle the special case of two poly_int constants faster. */
+ if (poly_int_tree_p (arg0) && poly_int_tree_p (arg1))
{
/* And some specific cases even faster than that. */
if (code == PLUS_EXPR)
@@ -1893,7 +1958,9 @@ size_binop_loc (location_t loc, enum tre
/* Handle general case of two integer constants. For sizetype
constant calculations we always want to know about overflow,
even in the unsigned case. */
- return int_const_binop_1 (code, arg0, arg1, -1);
+ tree res = int_const_binop_1 (code, arg0, arg1, -1);
+ if (res != NULL_TREE)
+ return res;
}
return fold_build2_loc (loc, code, type, arg0, arg1);
@@ -2217,9 +2284,20 @@ fold_convert_const_fixed_from_real (tree
static tree
fold_convert_const (enum tree_code code, tree type, tree arg1)
{
- if (TREE_TYPE (arg1) == type)
+ tree arg_type = TREE_TYPE (arg1);
+ if (arg_type == type)
return arg1;
+ /* We can't widen types, since the runtime value could overflow the
+ original type before being extended to the new type. */
+ if (POLY_INT_CST_P (arg1)
+ && (POINTER_TYPE_P (type) || INTEGRAL_TYPE_P (type))
+ && TYPE_PRECISION (type) <= TYPE_PRECISION (arg_type))
+ return build_poly_int_cst (type,
+ poly_wide_int::from (poly_int_cst_value (arg1),
+ TYPE_PRECISION (type),
+ TYPE_SIGN (arg_type)));
+
if (POINTER_TYPE_P (type) || INTEGRAL_TYPE_P (type)
|| TREE_CODE (type) == OFFSET_TYPE)
{
@@ -12666,6 +12744,10 @@ multiple_of_p (tree type, const_tree top
/* fall through */
default:
+ if (POLY_INT_CST_P (top) && poly_int_tree_p (bottom))
+ return multiple_p (wi::to_poly_widest (top),
+ wi::to_poly_widest (bottom));
+
return 0;
}
}
@@ -13722,16 +13804,6 @@ fold_negate_const (tree arg0, tree type)
switch (TREE_CODE (arg0))
{
- case INTEGER_CST:
- {
- bool overflow;
- wide_int val = wi::neg (wi::to_wide (arg0), &overflow);
- t = force_fit_type (type, val, 1,
- (overflow && ! TYPE_UNSIGNED (type))
- || TREE_OVERFLOW (arg0));
- break;
- }
-
case REAL_CST:
t = build_real (type, real_value_negate (&TREE_REAL_CST (arg0)));
break;
@@ -13750,6 +13822,16 @@ fold_negate_const (tree arg0, tree type)
}
default:
+ if (poly_int_tree_p (arg0))
+ {
+ bool overflow;
+ poly_wide_int res = wi::neg (wi::to_poly_wide (arg0), &overflow);
+ t = force_fit_type (type, res, 1,
+ (overflow && ! TYPE_UNSIGNED (type))
+ || TREE_OVERFLOW (arg0));
+ break;
+ }
+
gcc_unreachable ();
}
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c 2017-10-23 17:00:54.441003964 +0100
+++ gcc/expmed.c 2017-10-23 17:00:57.771973825 +0100
@@ -5276,6 +5276,9 @@ make_tree (tree type, rtx x)
/* fall through. */
default:
+ if (CONST_POLY_INT_P (x))
+ return wide_int_to_tree (t, const_poly_int_value (x));
+
t = build_decl (RTL_LOCATION (x), VAR_DECL, NULL_TREE, type);
/* If TYPE is a POINTER_TYPE, we might need to convert X from
Index: gcc/gimple-ssa-strength-reduction.c
===================================================================
--- gcc/gimple-ssa-strength-reduction.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/gimple-ssa-strength-reduction.c 2017-10-23 17:00:57.775970190 +0100
@@ -1258,7 +1258,7 @@ slsr_process_mul (gimple *gs, tree rhs1,
c2 = create_mul_ssa_cand (gs, rhs2, rhs1, speed);
c->next_interp = c2->cand_num;
}
- else
+ else if (TREE_CODE (rhs2) == INTEGER_CST)
{
/* Record an interpretation for the multiply-immediate. */
c = create_mul_imm_cand (gs, rhs1, rhs2, speed);
@@ -1499,7 +1499,7 @@ slsr_process_add (gimple *gs, tree rhs1,
add_cand_for_stmt (gs, c2);
}
}
- else
+ else if (TREE_CODE (rhs2) == INTEGER_CST)
{
/* Record an interpretation for the add-immediate. */
widest_int index = wi::to_widest (rhs2);
Index: gcc/stor-layout.c
===================================================================
--- gcc/stor-layout.c 2017-10-23 17:00:52.669615373 +0100
+++ gcc/stor-layout.c 2017-10-23 17:00:57.777968372 +0100
@@ -840,6 +840,28 @@ start_record_layout (tree t)
return rli;
}
+/* Fold sizetype value X to bitsizetype, given that X represents a type
+ size or offset. */
+
+static tree
+bits_from_bytes (tree x)
+{
+ if (POLY_INT_CST_P (x))
+ /* The runtime calculation isn't allowed to overflow sizetype;
+ increasing the runtime values must always increase the size
+ or offset of the object. This means that the object imposes
+ a maximum value on the runtime parameters, but we don't record
+ what that is. */
+ return build_poly_int_cst
+ (bitsizetype,
+ poly_wide_int::from (poly_int_cst_value (x),
+ TYPE_PRECISION (bitsizetype),
+ TYPE_SIGN (TREE_TYPE (x))));
+ x = fold_convert (bitsizetype, x);
+ gcc_checking_assert (x);
+ return x;
+}
+
/* Return the combined bit position for the byte offset OFFSET and the
bit position BITPOS.
@@ -853,8 +875,7 @@ start_record_layout (tree t)
bit_from_pos (tree offset, tree bitpos)
{
return size_binop (PLUS_EXPR, bitpos,
- size_binop (MULT_EXPR,
- fold_convert (bitsizetype, offset),
+ size_binop (MULT_EXPR, bits_from_bytes (offset),
bitsize_unit_node));
}
@@ -2268,9 +2289,10 @@ layout_type (tree type)
TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
TYPE_SIZE_UNIT (innertype),
size_int (nunits));
- TYPE_SIZE (type) = int_const_binop (MULT_EXPR,
- TYPE_SIZE (innertype),
- bitsize_int (nunits));
+ TYPE_SIZE (type) = int_const_binop
+ (MULT_EXPR,
+ bits_from_bytes (TYPE_SIZE_UNIT (type)),
+ bitsize_int (BITS_PER_UNIT));
/* For vector types, we do not default to the mode's alignment.
Instead, query a target hook, defaulting to natural alignment.
@@ -2383,8 +2405,7 @@ layout_type (tree type)
length = size_zero_node;
TYPE_SIZE (type) = size_binop (MULT_EXPR, element_size,
- fold_convert (bitsizetype,
- length));
+ bits_from_bytes (length));
/* If we know the size of the element, calculate the total size
directly, rather than do some division thing below. This
Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c 2017-10-23 16:52:20.504766418 +0100
+++ gcc/tree-cfg.c 2017-10-23 17:00:57.777968372 +0100
@@ -2952,7 +2952,7 @@ #define CHECK_OP(N, MSG) \
error ("invalid first operand of MEM_REF");
return x;
}
- if (TREE_CODE (TREE_OPERAND (t, 1)) != INTEGER_CST
+ if (!poly_int_tree_p (TREE_OPERAND (t, 1))
|| !POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (t, 1))))
{
error ("invalid offset operand of MEM_REF");
@@ -3358,7 +3358,7 @@ verify_types_in_gimple_reference (tree e
debug_generic_stmt (expr);
return true;
}
- if (TREE_CODE (TREE_OPERAND (expr, 1)) != INTEGER_CST
+ if (!poly_int_tree_p (TREE_OPERAND (expr, 1))
|| !POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (expr, 1))))
{
error ("invalid offset operand in MEM_REF");
@@ -3375,7 +3375,7 @@ verify_types_in_gimple_reference (tree e
return true;
}
if (!TMR_OFFSET (expr)
- || TREE_CODE (TMR_OFFSET (expr)) != INTEGER_CST
+ || !poly_int_tree_p (TMR_OFFSET (expr))
|| !POINTER_TYPE_P (TREE_TYPE (TMR_OFFSET (expr))))
{
error ("invalid offset operand in TARGET_MEM_REF");
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-23 17:02 ` [006/nnn] poly_int: tree constants Richard Sandiford
@ 2017-10-25 17:14 ` Martin Sebor
2017-10-25 21:35 ` Richard Sandiford
2017-11-17 4:51 ` Jeff Law
1 sibling, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-10-25 17:14 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:00 AM, Richard Sandiford wrote:
> +#if NUM_POLY_INT_COEFFS == 1
> +extern inline __attribute__ ((__gnu_inline__)) poly_int64
> +tree_to_poly_int64 (const_tree t)
I'm curious about the extern inline and __gnu_inline__ here and
not in poly_int_tree_p below. Am I correct in assuming that
the combination is a holdover from the days when GCC was compiled
using a C compiler, and that the way to write the same definition
in C++ 98 is simply:
inline poly_int64
tree_to_poly_int64 (const_tree t)
> +{
> + gcc_assert (tree_fits_poly_int64_p (t));
> + return TREE_INT_CST_LOW (t);
> +}
If yes, I would suggest to use the C++ form (and at some point,
changing the existing uses of the GCC/C idiom to the C++ form
as well).
Otherwise, if something requires the use of the C form I would
suggest to add a brief comment explaining it.
...
> +
> +inline bool
> +poly_int_tree_p (const_tree t, poly_int64_pod *value)
> +{
...
> /* The tree and const_tree overload templates. */
> namespace wi
> {
> + class unextended_tree
> + {
> + private:
> + const_tree m_t;
> +
> + public:
> + unextended_tree () {}
Defining no-op ctors is quite dangerous and error-prone. I suggest
to instead default initialize the member(s):
unextended_tree (): m_t () {}
Ditto everywhere else, such as in:
...
> template <int N>
> class extended_tree
> {
> @@ -5139,11 +5225,13 @@ extern bool anon_aggrname_p (const_tree)
> const_tree m_t;
>
> public:
> + extended_tree () {}
> extended_tree (const_tree);
...
> Index: gcc/tree.c
> ===================================================================
...
> +
> +/* Return true if T holds a polynomial pointer difference, storing it in
> + *VALUE if so. A true return means that T's precision is no greater
> + than 64 bits, which is the largest address space we support, so *VALUE
> + never loses precision. However, the signedness of the result is
> + somewhat arbitrary, since if B lives near the end of a 64-bit address
> + range and A lives near the beginning, B - A is a large positive value
> + outside the range of int64_t. A - B is likewise a large negative value
> + outside the range of int64_t. All the pointer difference really
> + gives is a raw pointer-sized bitstring that can be added to the first
> + pointer value to get the second. */
I'm not sure I understand the comment about the sign correctly, but
if I do, I don't think it's correct.
Because their difference wouldn't representable in any basic integer
type (i.,e., in ptrdiff_t) the pointers described above could never
point to the same object (or array), and so their difference is not
meaningful. C/C++ only define the semantics of a difference between
pointers to the same object. That restricts the size of the largest
possible object typically to SIZE_MAX / 2, or at most SIZE_MAX on
the handful of targets where ptrdiff_t has greater precision than
size_t. But even on those targets, the difference between any two
pointers to the same object must be representable in ptrdiff_t,
including the sign.
> +bool
> +ptrdiff_tree_p (const_tree t, poly_int64_pod *value)
> +{
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-25 17:14 ` Martin Sebor
@ 2017-10-25 21:35 ` Richard Sandiford
2017-10-26 5:52 ` Martin Sebor
0 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-25 21:35 UTC (permalink / raw)
To: Martin Sebor; +Cc: gcc-patches
Martin Sebor <msebor@gmail.com> writes:
> On 10/23/2017 11:00 AM, Richard Sandiford wrote:
>> +#if NUM_POLY_INT_COEFFS == 1
>> +extern inline __attribute__ ((__gnu_inline__)) poly_int64
>> +tree_to_poly_int64 (const_tree t)
>
> I'm curious about the extern inline and __gnu_inline__ here and
> not in poly_int_tree_p below. Am I correct in assuming that
> the combination is a holdover from the days when GCC was compiled
> using a C compiler, and that the way to write the same definition
> in C++ 98 is simply:
>
> inline poly_int64
> tree_to_poly_int64 (const_tree t)
>
>> +{
>> + gcc_assert (tree_fits_poly_int64_p (t));
>> + return TREE_INT_CST_LOW (t);
>> +}
>
> If yes, I would suggest to use the C++ form (and at some point,
> changing the existing uses of the GCC/C idiom to the C++ form
> as well).
>
> Otherwise, if something requires the use of the C form I would
> suggest to add a brief comment explaining it.
You probably saw that this is based on tree_to_[su]hwi. AIUI the
differences from plain C++ inline are that:
a) with __gnu_inline__, an out-of-line definition must still exist.
That fits this use case well, because the inline is conditional on
the #ifdef and tree.c has an out-of-line definition either way.
If we used normal inline, we'd need to add extra #ifs to tree.c
as well, to avoid multiple definitions.
b) __gnu_inline__ has the strength of __always_inline__, but without the
correctness implications if inlining is impossible for any reason.
I did try normal inline first, but it wasn't strong enough. The
compiler ended up measurably faster if I copied the tree_to_[su]hwi
approach.
> ...
>> +
>> +inline bool
>> +poly_int_tree_p (const_tree t, poly_int64_pod *value)
>> +{
> ...
[This one is unconditionally inline.]
>> /* The tree and const_tree overload templates. */
>> namespace wi
>> {
>> + class unextended_tree
>> + {
>> + private:
>> + const_tree m_t;
>> +
>> + public:
>> + unextended_tree () {}
>
> Defining no-op ctors is quite dangerous and error-prone. I suggest
> to instead default initialize the member(s):
>
> unextended_tree (): m_t () {}
>
> Ditto everywhere else, such as in:
This is really performance-senesitive code though, so I don't think
we want to add any unnecessary initialisation. Primitive types are
uninitalised by default too, and the point of this class is to
provide an integer-like interface.
In your other message you used the example of explicit default
initialisation, such as:
class foo
{
foo () : x () {}
unextended_tree x;
};
But I think we should strongly discourage that kind of thing.
If someone wants to initialise x to a particular value, like
integer_zero_node, then it would be better to do it explicitly.
If they don't care what the initial value is, then for these
integer-mimicing classes, uninitialised is as good as anything
else. :-)
Note that for this class NULL_TREE is not a safe default value.
The same goes for the wide-int.h classes, where a length or precision
of 0 is undefined and isn't necessarily going to be handled gracefully
or predictably.
>> template <int N>
>> class extended_tree
>> {
>> @@ -5139,11 +5225,13 @@ extern bool anon_aggrname_p (const_tree)
>> const_tree m_t;
>>
>> public:
>> + extended_tree () {}
>> extended_tree (const_tree);
> ...
>> Index: gcc/tree.c
>> ===================================================================
> ...
>> +
>> +/* Return true if T holds a polynomial pointer difference, storing it in
>> + *VALUE if so. A true return means that T's precision is no greater
>> + than 64 bits, which is the largest address space we support, so *VALUE
>> + never loses precision. However, the signedness of the result is
>> + somewhat arbitrary, since if B lives near the end of a 64-bit address
>> + range and A lives near the beginning, B - A is a large positive value
>> + outside the range of int64_t. A - B is likewise a large negative value
>> + outside the range of int64_t. All the pointer difference really
>> + gives is a raw pointer-sized bitstring that can be added to the first
>> + pointer value to get the second. */
>
> I'm not sure I understand the comment about the sign correctly, but
> if I do, I don't think it's correct.
>
> Because their difference wouldn't representable in any basic integer
> type (i.,e., in ptrdiff_t) the pointers described above could never
> point to the same object (or array), and so their difference is not
> meaningful. C/C++ only define the semantics of a difference between
> pointers to the same object. That restricts the size of the largest
> possible object typically to SIZE_MAX / 2, or at most SIZE_MAX on
> the handful of targets where ptrdiff_t has greater precision than
> size_t. But even on those targets, the difference between any two
> pointers to the same object must be representable in ptrdiff_t,
> including the sign.
But does that apply even when no pointer difference of that size
occurs in the original source? I.e., is:
char *x = malloc (0x80000001)
undefined in itself on 32-bit targets? Does it become undefined after:
for (unsigned int i = 0; i < 0x80000001; ++i)
x[i++] = 0;
where no large pointer difference is calculated? But I realise
gcc's support for this kind of thing is limited, and that we do
try to emit a diagnostic for obvious instances...
In the (two) places that need this -- both conversions from
cst_and_fits_in_hwi -- the immediate problem is that the sign
of the type doesn't necessarily match the logical sign of the
difference. E.g. a negative offset can be represented as a large
unsigned value of sizetype.
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-25 21:35 ` Richard Sandiford
@ 2017-10-26 5:52 ` Martin Sebor
2017-10-26 8:40 ` Richard Sandiford
0 siblings, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-10-26 5:52 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/25/2017 03:31 PM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>> On 10/23/2017 11:00 AM, Richard Sandiford wrote:
>>> +#if NUM_POLY_INT_COEFFS == 1
>>> +extern inline __attribute__ ((__gnu_inline__)) poly_int64
>>> +tree_to_poly_int64 (const_tree t)
>>
>> I'm curious about the extern inline and __gnu_inline__ here and
>> not in poly_int_tree_p below. Am I correct in assuming that
>> the combination is a holdover from the days when GCC was compiled
>> using a C compiler, and that the way to write the same definition
>> in C++ 98 is simply:
>>
>> inline poly_int64
>> tree_to_poly_int64 (const_tree t)
>>
>>> +{
>>> + gcc_assert (tree_fits_poly_int64_p (t));
>>> + return TREE_INT_CST_LOW (t);
>>> +}
>>
>> If yes, I would suggest to use the C++ form (and at some point,
>> changing the existing uses of the GCC/C idiom to the C++ form
>> as well).
>>
>> Otherwise, if something requires the use of the C form I would
>> suggest to add a brief comment explaining it.
>
> You probably saw that this is based on tree_to_[su]hwi. AIUI the
> differences from plain C++ inline are that:
>
> a) with __gnu_inline__, an out-of-line definition must still exist.
> That fits this use case well, because the inline is conditional on
> the #ifdef and tree.c has an out-of-line definition either way.
> If we used normal inline, we'd need to add extra #ifs to tree.c
> as well, to avoid multiple definitions.
>
> b) __gnu_inline__ has the strength of __always_inline__, but without the
> correctness implications if inlining is impossible for any reason.
> I did try normal inline first, but it wasn't strong enough. The
> compiler ended up measurably faster if I copied the tree_to_[su]hwi
> approach.
Thanks for the clarification. I'm not sure I fully understand
it but I'm happy to take your word for it that's necessary. I
would just recommend adding a brief comment to this effect since
it isn't obvious.
>>> +
>>> +inline bool
>>> +poly_int_tree_p (const_tree t, poly_int64_pod *value)
>>> +{
>> ...
>
> [This one is unconditionally inline.]
>
>>> /* The tree and const_tree overload templates. */
>>> namespace wi
>>> {
>>> + class unextended_tree
>>> + {
>>> + private:
>>> + const_tree m_t;
>>> +
>>> + public:
>>> + unextended_tree () {}
>>
>> Defining no-op ctors is quite dangerous and error-prone. I suggest
>> to instead default initialize the member(s):
>>
>> unextended_tree (): m_t () {}
>>
>> Ditto everywhere else, such as in:
>
> This is really performance-senesitive code though, so I don't think
> we want to add any unnecessary initialisation. Primitive types are
> uninitalised by default too, and the point of this class is to
> provide an integer-like interface.
I understand the performance concern (more on that below), but
to clarify the usability issues, I don't think the analogy with
primitive types is quite fitting here: int() evaluates to zero,
as do the values of i and a[0] and a[1] after an object of type
S is constructed using its default ctor, i.e., S ():
struct S {
int i;
int a[2];
S (): i (), a () { }
};
With the new (and some existing) classes that's not so, and it
makes them harder and more error-prone to use (I just recently
learned this the hard way about offset_int and the debugging
experience is still fresh in my memory).
When the cor is inline and the initialization unnecessary then
GCC will in most instances eliminate it, so I also don't think
the suggested change would have a significant impact on
the efficiency of optimized code, but...
...if it is thought essential to provide a no-op ctor, I would
suggest to consider making its property explicit, e.g., like so:
struct unextended_tree {
struct Uninit { };
// ...
unextended_tree (Uninit) { /* no initialization */ }
// ...
};
This way the programmer has to explicitly opt in to using the
unsafe ctor. (This ctor is suitable for single objects, not
arrays of such things, but presumably that would be sufficient.
If not, there are tricks to make that work too.)
> In your other message you used the example of explicit default
> initialisation, such as:
>
> class foo
> {
> foo () : x () {}
> unextended_tree x;
> };
>
> But I think we should strongly discourage that kind of thing.
> If someone wants to initialise x to a particular value, like
> integer_zero_node, then it would be better to do it explicitly.
> If they don't care what the initial value is, then for these
> integer-mimicing classes, uninitialised is as good as anything
> else. :-)
Efficiency is certainly important, but it doesn't have to come
at the expense of usability or correctness. I think it's possible
(and important) to design interfaces that are usable safely and
intuitively, and difficult to misuse, while also accommodating
advanced efficient use cases.
> Note that for this class NULL_TREE is not a safe default value.
> The same goes for the wide-int.h classes, where a length or precision
> of 0 is undefined and isn't necessarily going to be handled gracefully
> or predictably.
For offset_int both precision and length are known so I think
it would make sense to have the default ctor value-initialize
the object. For wide_int, it seems to me that choosing some
default precision and length in the default ctor would still
be preferable to leaving the members indeterminate. (That
functionality could still be provided by some other ctor as
I suggested above).
>>> template <int N>
>>> class extended_tree
>>> {
>>> @@ -5139,11 +5225,13 @@ extern bool anon_aggrname_p (const_tree)
>>> const_tree m_t;
>>>
>>> public:
>>> + extended_tree () {}
>>> extended_tree (const_tree);
>> ...
>>> Index: gcc/tree.c
>>> ===================================================================
>> ...
>>> +
>>> +/* Return true if T holds a polynomial pointer difference, storing it in
>>> + *VALUE if so. A true return means that T's precision is no greater
>>> + than 64 bits, which is the largest address space we support, so *VALUE
>>> + never loses precision. However, the signedness of the result is
>>> + somewhat arbitrary, since if B lives near the end of a 64-bit address
>>> + range and A lives near the beginning, B - A is a large positive value
>>> + outside the range of int64_t. A - B is likewise a large negative value
>>> + outside the range of int64_t. All the pointer difference really
>>> + gives is a raw pointer-sized bitstring that can be added to the first
>>> + pointer value to get the second. */
>>
>> I'm not sure I understand the comment about the sign correctly, but
>> if I do, I don't think it's correct.
>>
>> Because their difference wouldn't representable in any basic integer
>> type (i.,e., in ptrdiff_t) the pointers described above could never
>> point to the same object (or array), and so their difference is not
>> meaningful. C/C++ only define the semantics of a difference between
>> pointers to the same object. That restricts the size of the largest
>> possible object typically to SIZE_MAX / 2, or at most SIZE_MAX on
>> the handful of targets where ptrdiff_t has greater precision than
>> size_t. But even on those targets, the difference between any two
>> pointers to the same object must be representable in ptrdiff_t,
>> including the sign.
>
> But does that apply even when no pointer difference of that size
> occurs in the original source? I.e., is:
>
> char *x = malloc (0x80000001)
>
> undefined in itself on 32-bit targets?
No, the call itself isn't undefined, but it shouldn't succeed
on a conforming implementation where ptrdiff_t is a 32-bit type
(which is why GCC diagnoses it). If the call were to succeed
then pointers to the allocated object would fail to meet the
C requirements on additive operators.
> Does it become undefined after:
>
> for (unsigned int i = 0; i < 0x80000001; ++i)
> x[i++] = 0;
>
> where no large pointer difference is calculated? But I realise
> gcc's support for this kind of thing is limited, and that we do
> try to emit a diagnostic for obvious instances...
Yes, this is undefined, both in C (unless ptrdiff_t is wider
than 32 bits) and in GCC, because x[0x80000000] doesn't refer
to the 2147483648-th element of x.
> In the (two) places that need this -- both conversions from
> cst_and_fits_in_hwi -- the immediate problem is that the sign
> of the type doesn't necessarily match the logical sign of the
> difference. E.g. a negative offset can be represented as a large
> unsigned value of sizetype.
I only meant to suggest that the comment be reworded so as
not to imply that such pointers (that are farther apart than
PTRDIFF_MAX) can point to the same object and be subtracted.
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-26 5:52 ` Martin Sebor
@ 2017-10-26 8:40 ` Richard Sandiford
2017-10-26 16:45 ` Martin Sebor
0 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-26 8:40 UTC (permalink / raw)
To: Martin Sebor; +Cc: gcc-patches
Martin Sebor <msebor@gmail.com> writes:
> On 10/25/2017 03:31 PM, Richard Sandiford wrote:
>> Martin Sebor <msebor@gmail.com> writes:
>>> On 10/23/2017 11:00 AM, Richard Sandiford wrote:
>>>> +#if NUM_POLY_INT_COEFFS == 1
>>>> +extern inline __attribute__ ((__gnu_inline__)) poly_int64
>>>> +tree_to_poly_int64 (const_tree t)
>>>
>>> I'm curious about the extern inline and __gnu_inline__ here and
>>> not in poly_int_tree_p below. Am I correct in assuming that
>>> the combination is a holdover from the days when GCC was compiled
>>> using a C compiler, and that the way to write the same definition
>>> in C++ 98 is simply:
>>>
>>> inline poly_int64
>>> tree_to_poly_int64 (const_tree t)
>>>
>>>> +{
>>>> + gcc_assert (tree_fits_poly_int64_p (t));
>>>> + return TREE_INT_CST_LOW (t);
>>>> +}
>>>
>>> If yes, I would suggest to use the C++ form (and at some point,
>>> changing the existing uses of the GCC/C idiom to the C++ form
>>> as well).
>>>
>>> Otherwise, if something requires the use of the C form I would
>>> suggest to add a brief comment explaining it.
>>
>> You probably saw that this is based on tree_to_[su]hwi. AIUI the
>> differences from plain C++ inline are that:
>>
>> a) with __gnu_inline__, an out-of-line definition must still exist.
>> That fits this use case well, because the inline is conditional on
>> the #ifdef and tree.c has an out-of-line definition either way.
>> If we used normal inline, we'd need to add extra #ifs to tree.c
>> as well, to avoid multiple definitions.
>>
>> b) __gnu_inline__ has the strength of __always_inline__, but without the
>> correctness implications if inlining is impossible for any reason.
>> I did try normal inline first, but it wasn't strong enough. The
>> compiler ended up measurably faster if I copied the tree_to_[su]hwi
>> approach.
>
> Thanks for the clarification. I'm not sure I fully understand
> it but I'm happy to take your word for it that's necessary. I
> would just recommend adding a brief comment to this effect since
> it isn't obvious.
>
>>>> +
>>>> +inline bool
>>>> +poly_int_tree_p (const_tree t, poly_int64_pod *value)
>>>> +{
>>> ...
>>
>> [This one is unconditionally inline.]
>>
>>>> /* The tree and const_tree overload templates. */
>>>> namespace wi
>>>> {
>>>> + class unextended_tree
>>>> + {
>>>> + private:
>>>> + const_tree m_t;
>>>> +
>>>> + public:
>>>> + unextended_tree () {}
>>>
>>> Defining no-op ctors is quite dangerous and error-prone. I suggest
>>> to instead default initialize the member(s):
>>>
>>> unextended_tree (): m_t () {}
>>>
>>> Ditto everywhere else, such as in:
>>
>> This is really performance-senesitive code though, so I don't think
>> we want to add any unnecessary initialisation. Primitive types are
>> uninitalised by default too, and the point of this class is to
>> provide an integer-like interface.
>
> I understand the performance concern (more on that below), but
> to clarify the usability issues, I don't think the analogy with
> primitive types is quite fitting here: int() evaluates to zero,
> as do the values of i and a[0] and a[1] after an object of type
> S is constructed using its default ctor, i.e., S ():
>
> struct S {
> int i;
> int a[2];
>
> S (): i (), a () { }
> };
Sure, I realise that. I meant that:
int x;
doesn't initialise x to zero. So it's a question of which case is the
most motivating one: using "x ()" to initialise x to 0 in a constructor
or "int x;" to declare a variable of type x, uninitialised. I think the
latter use case is much more common (at least in GCC). Rearranging
things, I said later:
>> In your other message you used the example of explicit default
>> initialisation, such as:
>>
>> class foo
>> {
>> foo () : x () {}
>> unextended_tree x;
>> };
>>
>> But I think we should strongly discourage that kind of thing.
>> If someone wants to initialise x to a particular value, like
>> integer_zero_node, then it would be better to do it explicitly.
>> If they don't care what the initial value is, then for these
>> integer-mimicing classes, uninitialised is as good as anything
>> else. :-)
What I meant was: if you want to initialise "i" to 1 in your example,
you'd have to write "i (1)". Being able to write "i ()" instead of
"i (0)" saves one character but I don't think it adds much clarity.
Explicitly initialising something only seems worthwhile if you say
what you're initialising it to.
> With the new (and some existing) classes that's not so, and it
> makes them harder and more error-prone to use (I just recently
> learned this the hard way about offset_int and the debugging
> experience is still fresh in my memory).
Sorry about the bad experience. But that kind of thing cuts
both ways. If I write:
poly_int64
foo (void)
{
poly_int64 x;
x += 2;
return x;
}
then I get a warning about x being used uninitialised, without
having had to run anything. If we add default initialisation
then this becomes something that has to be debugged against
a particular test case, i.e. we've stopped the compiler from
giving us useful static analysis.
> When the cor is inline and the initialization unnecessary then
> GCC will in most instances eliminate it, so I also don't think
> the suggested change would have a significant impact on
> the efficiency of optimized code, but...
>
> ...if it is thought essential to provide a no-op ctor, I would
> suggest to consider making its property explicit, e.g., like so:
>
> struct unextended_tree {
>
> struct Uninit { };
>
> // ...
> unextended_tree (Uninit) { /* no initialization */ }
> // ...
> };
>
> This way the programmer has to explicitly opt in to using the
> unsafe ctor. (This ctor is suitable for single objects, not
> arrays of such things, but presumably that would be sufficient.
> If not, there are tricks to make that work too.)
The default constructors for unextended_tree and extended_tree
are only there for the array case (in poly-int.h).
Part of the problem here is that we still have to live by C++03
POD rules. If we moved to C++11, the need for the poly_int_pod/
poly_int split would go away and things would probably be much
simpler. :-)
[...]
>> Note that for this class NULL_TREE is not a safe default value.
>> The same goes for the wide-int.h classes, where a length or precision
>> of 0 is undefined and isn't necessarily going to be handled gracefully
>> or predictably.
>
> For offset_int both precision and length are known so I think
> it would make sense to have the default ctor value-initialize
> the object. For wide_int, it seems to me that choosing some
> default precision and length in the default ctor would still
> be preferable to leaving the members indeterminate. (That
> functionality could still be provided by some other ctor as
> I suggested above).
But which precision though? If we pick a commonly-used one
then we make a missing initialisation bug very data-dependent.
Even if we pick a rarely-used one, we create a bug in which
the wide_int has the wrong precision even though all assignments
to it "obviously" have the right precision.
>>>> template <int N>
>>>> class extended_tree
>>>> {
>>>> @@ -5139,11 +5225,13 @@ extern bool anon_aggrname_p (const_tree)
>>>> const_tree m_t;
>>>>
>>>> public:
>>>> + extended_tree () {}
>>>> extended_tree (const_tree);
>>> ...
>>>> Index: gcc/tree.c
>>>> ===================================================================
>>> ...
>>>> +
>>>> +/* Return true if T holds a polynomial pointer difference, storing it in
>>>> + *VALUE if so. A true return means that T's precision is no greater
>>>> + than 64 bits, which is the largest address space we support, so *VALUE
>>>> + never loses precision. However, the signedness of the result is
>>>> + somewhat arbitrary, since if B lives near the end of a 64-bit address
>>>> + range and A lives near the beginning, B - A is a large positive value
>>>> + outside the range of int64_t. A - B is likewise a large negative value
>>>> + outside the range of int64_t. All the pointer difference really
>>>> + gives is a raw pointer-sized bitstring that can be added to the first
>>>> + pointer value to get the second. */
>>>
>>> I'm not sure I understand the comment about the sign correctly, but
>>> if I do, I don't think it's correct.
>>>
>>> Because their difference wouldn't representable in any basic integer
>>> type (i.,e., in ptrdiff_t) the pointers described above could never
>>> point to the same object (or array), and so their difference is not
>>> meaningful. C/C++ only define the semantics of a difference between
>>> pointers to the same object. That restricts the size of the largest
>>> possible object typically to SIZE_MAX / 2, or at most SIZE_MAX on
>>> the handful of targets where ptrdiff_t has greater precision than
>>> size_t. But even on those targets, the difference between any two
>>> pointers to the same object must be representable in ptrdiff_t,
>>> including the sign.
>>
>> But does that apply even when no pointer difference of that size
>> occurs in the original source? I.e., is:
>>
>> char *x = malloc (0x80000001)
>>
>> undefined in itself on 32-bit targets?
>
> No, the call itself isn't undefined, but it shouldn't succeed
> on a conforming implementation where ptrdiff_t is a 32-bit type
> (which is why GCC diagnoses it). If the call were to succeed
> then pointers to the allocated object would fail to meet the
> C requirements on additive operators.
>
>> Does it become undefined after:
>>
>> for (unsigned int i = 0; i < 0x80000001; ++i)
>> x[i++] = 0;
>>
>> where no large pointer difference is calculated? But I realise
>> gcc's support for this kind of thing is limited, and that we do
>> try to emit a diagnostic for obvious instances...
>
> Yes, this is undefined, both in C (unless ptrdiff_t is wider
> than 32 bits) and in GCC, because x[0x80000000] doesn't refer
> to the 2147483648-th element of x.
>
>> In the (two) places that need this -- both conversions from
>> cst_and_fits_in_hwi -- the immediate problem is that the sign
>> of the type doesn't necessarily match the logical sign of the
>> difference. E.g. a negative offset can be represented as a large
>> unsigned value of sizetype.
>
> I only meant to suggest that the comment be reworded so as
> not to imply that such pointers (that are farther apart than
> PTRDIFF_MAX) can point to the same object and be subtracted.
OK, how about:
/* Return true if T holds a polynomial pointer difference, storing it in
*VALUE if so. A true return means that T's precision is no greater
than 64 bits, which is the largest address space we support, so *VALUE
never loses precision. However, the signedness of the result does
not necessarily match the signedness of T: sometimes an unsigned type
like sizetype is used to encode a value that is actually negative. */
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-26 8:40 ` Richard Sandiford
@ 2017-10-26 16:45 ` Martin Sebor
2017-10-26 18:05 ` Richard Sandiford
2017-10-26 18:11 ` Pedro Alves
0 siblings, 2 replies; 302+ messages in thread
From: Martin Sebor @ 2017-10-26 16:45 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
>>>>> /* The tree and const_tree overload templates. */
>>>>> namespace wi
>>>>> {
>>>>> + class unextended_tree
>>>>> + {
>>>>> + private:
>>>>> + const_tree m_t;
>>>>> +
>>>>> + public:
>>>>> + unextended_tree () {}
>>>>
>>>> Defining no-op ctors is quite dangerous and error-prone. I suggest
>>>> to instead default initialize the member(s):
>>>>
>>>> unextended_tree (): m_t () {}
>>>>
>>>> Ditto everywhere else, such as in:
>>>
>>> This is really performance-senesitive code though, so I don't think
>>> we want to add any unnecessary initialisation. Primitive types are
>>> uninitalised by default too, and the point of this class is to
>>> provide an integer-like interface.
>>
>> I understand the performance concern (more on that below), but
>> to clarify the usability issues, I don't think the analogy with
>> primitive types is quite fitting here: int() evaluates to zero,
>> as do the values of i and a[0] and a[1] after an object of type
>> S is constructed using its default ctor, i.e., S ():
>>
>> struct S {
>> int i;
>> int a[2];
>>
>> S (): i (), a () { }
>> };
>
> Sure, I realise that. I meant that:
>
> int x;
>
> doesn't initialise x to zero. So it's a question of which case is the
> most motivating one: using "x ()" to initialise x to 0 in a constructor
> or "int x;" to declare a variable of type x, uninitialised. I think the
> latter use case is much more common (at least in GCC). Rearranging
> things, I said later:
I agree that the latter use case is more common in GCC, but I don't
see it as a good thing. GCC was written in C and most code still
uses now outdated C practices such as declaring variables at the top
of a (often long) function, and usually without initializing them.
It's been established that it's far better to declare variables with
the smallest scope, and to initialize them on declaration. Compilers
are smart enough these days to eliminate redundant initialization or
assignments.
>>> In your other message you used the example of explicit default
>>> initialisation, such as:
>>>
>>> class foo
>>> {
>>> foo () : x () {}
>>> unextended_tree x;
>>> };
>>>
>>> But I think we should strongly discourage that kind of thing.
>>> If someone wants to initialise x to a particular value, like
>>> integer_zero_node, then it would be better to do it explicitly.
>>> If they don't care what the initial value is, then for these
>>> integer-mimicing classes, uninitialised is as good as anything
>>> else. :-)
>
> What I meant was: if you want to initialise "i" to 1 in your example,
> you'd have to write "i (1)". Being able to write "i ()" instead of
> "i (0)" saves one character but I don't think it adds much clarity.
> Explicitly initialising something only seems worthwhile if you say
> what you're initialising it to.
My comment is not motivated by convenience. What I'm concerned
about is that defining a default ctor to be a no-op defeats the
zero-initialization semantics most users expect of T().
This is particularly concerning for a class designed to behave
like an [improved] basic integer type. Such a class should act
as closely as possible to the type it emulates and in the least
surprising ways. Any sort of a deviation that replaces well-
defined behavior with undefined is a gotcha and a bug waiting
to happen.
It's also a concern in generic (template) contexts where T() is
expected to zero-initialize. A template designed to work with
a fundamental integer type should also work with a user-defined
type designed to behave like an integer.
>> With the new (and some existing) classes that's not so, and it
>> makes them harder and more error-prone to use (I just recently
>> learned this the hard way about offset_int and the debugging
>> experience is still fresh in my memory).
>
> Sorry about the bad experience. But that kind of thing cuts
> both ways. If I write:
>
> poly_int64
> foo (void)
> {
> poly_int64 x;
> x += 2;
> return x;
> }
>
> then I get a warning about x being used uninitialised, without
> having had to run anything. If we add default initialisation
> then this becomes something that has to be debugged against
> a particular test case, i.e. we've stopped the compiler from
> giving us useful static analysis.
With default initialization the code above becomes valid and has
the expected effect of adding 2 to zero. It's just more robust
than the same code with that uses a basic type instead. This
seems no more unexpected and no less desirable than the well-
defined semantics of something like:
std::string x;
x += "2";
return x;
or using any other C++ standard library type in a similar way.
(Incidentally, although I haven't tried with poly_int, I get no
warnings for the code above with offset_int or wide_int.)
>> When the cor is inline and the initialization unnecessary then
>> GCC will in most instances eliminate it, so I also don't think
>> the suggested change would have a significant impact on
>> the efficiency of optimized code, but...
>>
>> ...if it is thought essential to provide a no-op ctor, I would
>> suggest to consider making its property explicit, e.g., like so:
>>
>> struct unextended_tree {
>>
>> struct Uninit { };
>>
>> // ...
>> unextended_tree (Uninit) { /* no initialization */ }
>> // ...
>> };
>>
>> This way the programmer has to explicitly opt in to using the
>> unsafe ctor. (This ctor is suitable for single objects, not
>> arrays of such things, but presumably that would be sufficient.
>> If not, there are tricks to make that work too.)
>
> The default constructors for unextended_tree and extended_tree
> are only there for the array case (in poly-int.h).
My main concern is with the new poly_int classes (and the existing
wide int classes) because I think those are or will be widely used,
far more so than the unextended_tree class (I confess this review
is the first time I've ever noticed it).
> Part of the problem here is that we still have to live by C++03
> POD rules. If we moved to C++11, the need for the poly_int_pod/
> poly_int split would go away and things would probably be much
> simpler. :-)
Understood. With the heavy use of templates, template templates,
and partial specialization, the poly_int classes will push older
C++ 98 compilers to their limits. It seems that for stability's
sake it would make sense to require a more modern compiler.
>>> Note that for this class NULL_TREE is not a safe default value.
>>> The same goes for the wide-int.h classes, where a length or precision
>>> of 0 is undefined and isn't necessarily going to be handled gracefully
>>> or predictably.
>>
>> For offset_int both precision and length are known so I think
>> it would make sense to have the default ctor value-initialize
>> the object. For wide_int, it seems to me that choosing some
>> default precision and length in the default ctor would still
>> be preferable to leaving the members indeterminate. (That
>> functionality could still be provided by some other ctor as
>> I suggested above).
>
> But which precision though? If we pick a commonly-used one
> then we make a missing initialisation bug very data-dependent.
> Even if we pick a rarely-used one, we create a bug in which
> the wide_int has the wrong precision even though all assignments
> to it "obviously" have the right precision.
For offset_int the default precision is 128-bits. Making that
the default also for wide_int should be unsurprising.
>> I only meant to suggest that the comment be reworded so as
>> not to imply that such pointers (that are farther apart than
>> PTRDIFF_MAX) can point to the same object and be subtracted.
>
> OK, how about:
>
> /* Return true if T holds a polynomial pointer difference, storing it in
> *VALUE if so. A true return means that T's precision is no greater
> than 64 bits, which is the largest address space we support, so *VALUE
> never loses precision. However, the signedness of the result does
> not necessarily match the signedness of T: sometimes an unsigned type
> like sizetype is used to encode a value that is actually negative. */
That looks good to me.
Thanks
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-26 16:45 ` Martin Sebor
@ 2017-10-26 18:05 ` Richard Sandiford
2017-10-26 23:53 ` Martin Sebor
2017-10-26 18:11 ` Pedro Alves
1 sibling, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-26 18:05 UTC (permalink / raw)
To: Martin Sebor; +Cc: gcc-patches
Martin Sebor <msebor@gmail.com> writes:
>>>>>> /* The tree and const_tree overload templates. */
>>>>>> namespace wi
>>>>>> {
>>>>>> + class unextended_tree
>>>>>> + {
>>>>>> + private:
>>>>>> + const_tree m_t;
>>>>>> +
>>>>>> + public:
>>>>>> + unextended_tree () {}
>>>>>
>>>>> Defining no-op ctors is quite dangerous and error-prone. I suggest
>>>>> to instead default initialize the member(s):
>>>>>
>>>>> unextended_tree (): m_t () {}
>>>>>
>>>>> Ditto everywhere else, such as in:
>>>>
>>>> This is really performance-senesitive code though, so I don't think
>>>> we want to add any unnecessary initialisation. Primitive types are
>>>> uninitalised by default too, and the point of this class is to
>>>> provide an integer-like interface.
>>>
>>> I understand the performance concern (more on that below), but
>>> to clarify the usability issues, I don't think the analogy with
>>> primitive types is quite fitting here: int() evaluates to zero,
>>> as do the values of i and a[0] and a[1] after an object of type
>>> S is constructed using its default ctor, i.e., S ():
>>>
>>> struct S {
>>> int i;
>>> int a[2];
>>>
>>> S (): i (), a () { }
>>> };
>>
>> Sure, I realise that. I meant that:
>>
>> int x;
>>
>> doesn't initialise x to zero. So it's a question of which case is the
>> most motivating one: using "x ()" to initialise x to 0 in a constructor
>> or "int x;" to declare a variable of type x, uninitialised. I think the
>> latter use case is much more common (at least in GCC). Rearranging
>> things, I said later:
>
> I agree that the latter use case is more common in GCC, but I don't
> see it as a good thing. GCC was written in C and most code still
> uses now outdated C practices such as declaring variables at the top
> of a (often long) function, and usually without initializing them.
> It's been established that it's far better to declare variables with
> the smallest scope, and to initialize them on declaration. Compilers
> are smart enough these days to eliminate redundant initialization or
> assignments.
>
>>>> In your other message you used the example of explicit default
>>>> initialisation, such as:
>>>>
>>>> class foo
>>>> {
>>>> foo () : x () {}
>>>> unextended_tree x;
>>>> };
>>>>
>>>> But I think we should strongly discourage that kind of thing.
>>>> If someone wants to initialise x to a particular value, like
>>>> integer_zero_node, then it would be better to do it explicitly.
>>>> If they don't care what the initial value is, then for these
>>>> integer-mimicing classes, uninitialised is as good as anything
>>>> else. :-)
>>
>> What I meant was: if you want to initialise "i" to 1 in your example,
>> you'd have to write "i (1)". Being able to write "i ()" instead of
>> "i (0)" saves one character but I don't think it adds much clarity.
>> Explicitly initialising something only seems worthwhile if you say
>> what you're initialising it to.
>
> My comment is not motivated by convenience. What I'm concerned
> about is that defining a default ctor to be a no-op defeats the
> zero-initialization semantics most users expect of T().
>
> This is particularly concerning for a class designed to behave
> like an [improved] basic integer type. Such a class should act
> as closely as possible to the type it emulates and in the least
> surprising ways. Any sort of a deviation that replaces well-
> defined behavior with undefined is a gotcha and a bug waiting
> to happen.
>
> It's also a concern in generic (template) contexts where T() is
> expected to zero-initialize. A template designed to work with
> a fundamental integer type should also work with a user-defined
> type designed to behave like an integer.
But that kind of situation is one where using "T (0)" over "T ()"
is useful. It means that template substitution will succeed for
T that are sufficiently integer-like to have a single well-defined
zero but not for T that aren't (such as wide_int).
>>> With the new (and some existing) classes that's not so, and it
>>> makes them harder and more error-prone to use (I just recently
>>> learned this the hard way about offset_int and the debugging
>>> experience is still fresh in my memory).
>>
>> Sorry about the bad experience. But that kind of thing cuts
>> both ways. If I write:
>>
>> poly_int64
>> foo (void)
>> {
>> poly_int64 x;
>> x += 2;
>> return x;
>> }
>>
>> then I get a warning about x being used uninitialised, without
>> having had to run anything. If we add default initialisation
>> then this becomes something that has to be debugged against
>> a particular test case, i.e. we've stopped the compiler from
>> giving us useful static analysis.
>
> With default initialization the code above becomes valid and has
> the expected effect of adding 2 to zero. It's just more robust
> than the same code with that uses a basic type instead. This
> seems no more unexpected and no less desirable than the well-
> defined semantics of something like:
>
> std::string x;
> x += "2";
> return x;
>
> or using any other C++ standard library type in a similar way.
>
> (Incidentally, although I haven't tried with poly_int, I get no
> warnings for the code above with offset_int or wide_int.)
>
>>> When the cor is inline and the initialization unnecessary then
>>> GCC will in most instances eliminate it, so I also don't think
>>> the suggested change would have a significant impact on
>>> the efficiency of optimized code, but...
>>>
>>> ...if it is thought essential to provide a no-op ctor, I would
>>> suggest to consider making its property explicit, e.g., like so:
>>>
>>> struct unextended_tree {
>>>
>>> struct Uninit { };
>>>
>>> // ...
>>> unextended_tree (Uninit) { /* no initialization */ }
>>> // ...
>>> };
>>>
>>> This way the programmer has to explicitly opt in to using the
>>> unsafe ctor. (This ctor is suitable for single objects, not
>>> arrays of such things, but presumably that would be sufficient.
>>> If not, there are tricks to make that work too.)
>>
>> The default constructors for unextended_tree and extended_tree
>> are only there for the array case (in poly-int.h).
>
> My main concern is with the new poly_int classes (and the existing
> wide int classes) because I think those are or will be widely used,
> far more so than the unextended_tree class (I confess this review
> is the first time I've ever noticed it).
>
>> Part of the problem here is that we still have to live by C++03
>> POD rules. If we moved to C++11, the need for the poly_int_pod/
>> poly_int split would go away and things would probably be much
>> simpler. :-)
>
> Understood. With the heavy use of templates, template templates,
> and partial specialization, the poly_int classes will push older
> C++ 98 compilers to their limits. It seems that for stability's
> sake it would make sense to require a more modern compiler.
>
>>>> Note that for this class NULL_TREE is not a safe default value.
>>>> The same goes for the wide-int.h classes, where a length or precision
>>>> of 0 is undefined and isn't necessarily going to be handled gracefully
>>>> or predictably.
>>>
>>> For offset_int both precision and length are known so I think
>>> it would make sense to have the default ctor value-initialize
>>> the object. For wide_int, it seems to me that choosing some
>>> default precision and length in the default ctor would still
>>> be preferable to leaving the members indeterminate. (That
>>> functionality could still be provided by some other ctor as
>>> I suggested above).
>>
>> But which precision though? If we pick a commonly-used one
>> then we make a missing initialisation bug very data-dependent.
>> Even if we pick a rarely-used one, we create a bug in which
>> the wide_int has the wrong precision even though all assignments
>> to it "obviously" have the right precision.
>
> For offset_int the default precision is 128-bits. Making that
> the default also for wide_int should be unsurprising.
I think it'd be surprising. offset_int should always be used in
preference to wide_int if the precision is known to be 128 bits
in advance, and there doesn't seem any reason to prefer the
precision of offset_int over widest_int, HOST_WIDE_INT or int.
We would end up with:
wide_int
f (const wide_int &y)
{
wide_int x;
x += y;
return x;
}
being valid if y happens to have 128 bits as well, and a runtime error
otherwise.
Also, I think it'd be inconsistent to allow the specific case of 0
to be assigned by default construction, but not also allow:
wide_int x (0);
wide_int x;
x = 0;
wide_int x;
x = 1;
etc. And wide_int wasn't intended for that use case.
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-26 18:05 ` Richard Sandiford
@ 2017-10-26 23:53 ` Martin Sebor
2017-10-27 8:33 ` Richard Sandiford
0 siblings, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-10-26 23:53 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/26/2017 11:52 AM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>>>>>>> /* The tree and const_tree overload templates. */
>>>>>>> namespace wi
>>>>>>> {
>>>>>>> + class unextended_tree
>>>>>>> + {
>>>>>>> + private:
>>>>>>> + const_tree m_t;
>>>>>>> +
>>>>>>> + public:
>>>>>>> + unextended_tree () {}
>>>>>>
>>>>>> Defining no-op ctors is quite dangerous and error-prone. I suggest
>>>>>> to instead default initialize the member(s):
>>>>>>
>>>>>> unextended_tree (): m_t () {}
>>>>>>
>>>>>> Ditto everywhere else, such as in:
>>>>>
>>>>> This is really performance-senesitive code though, so I don't think
>>>>> we want to add any unnecessary initialisation. Primitive types are
>>>>> uninitalised by default too, and the point of this class is to
>>>>> provide an integer-like interface.
>>>>
>>>> I understand the performance concern (more on that below), but
>>>> to clarify the usability issues, I don't think the analogy with
>>>> primitive types is quite fitting here: int() evaluates to zero,
>>>> as do the values of i and a[0] and a[1] after an object of type
>>>> S is constructed using its default ctor, i.e., S ():
>>>>
>>>> struct S {
>>>> int i;
>>>> int a[2];
>>>>
>>>> S (): i (), a () { }
>>>> };
>>>
>>> Sure, I realise that. I meant that:
>>>
>>> int x;
>>>
>>> doesn't initialise x to zero. So it's a question of which case is the
>>> most motivating one: using "x ()" to initialise x to 0 in a constructor
>>> or "int x;" to declare a variable of type x, uninitialised. I think the
>>> latter use case is much more common (at least in GCC). Rearranging
>>> things, I said later:
>>
>> I agree that the latter use case is more common in GCC, but I don't
>> see it as a good thing. GCC was written in C and most code still
>> uses now outdated C practices such as declaring variables at the top
>> of a (often long) function, and usually without initializing them.
>> It's been established that it's far better to declare variables with
>> the smallest scope, and to initialize them on declaration. Compilers
>> are smart enough these days to eliminate redundant initialization or
>> assignments.
>>
>>>>> In your other message you used the example of explicit default
>>>>> initialisation, such as:
>>>>>
>>>>> class foo
>>>>> {
>>>>> foo () : x () {}
>>>>> unextended_tree x;
>>>>> };
>>>>>
>>>>> But I think we should strongly discourage that kind of thing.
>>>>> If someone wants to initialise x to a particular value, like
>>>>> integer_zero_node, then it would be better to do it explicitly.
>>>>> If they don't care what the initial value is, then for these
>>>>> integer-mimicing classes, uninitialised is as good as anything
>>>>> else. :-)
>>>
>>> What I meant was: if you want to initialise "i" to 1 in your example,
>>> you'd have to write "i (1)". Being able to write "i ()" instead of
>>> "i (0)" saves one character but I don't think it adds much clarity.
>>> Explicitly initialising something only seems worthwhile if you say
>>> what you're initialising it to.
>>
>> My comment is not motivated by convenience. What I'm concerned
>> about is that defining a default ctor to be a no-op defeats the
>> zero-initialization semantics most users expect of T().
>>
>> This is particularly concerning for a class designed to behave
>> like an [improved] basic integer type. Such a class should act
>> as closely as possible to the type it emulates and in the least
>> surprising ways. Any sort of a deviation that replaces well-
>> defined behavior with undefined is a gotcha and a bug waiting
>> to happen.
>>
>> It's also a concern in generic (template) contexts where T() is
>> expected to zero-initialize. A template designed to work with
>> a fundamental integer type should also work with a user-defined
>> type designed to behave like an integer.
>
> But that kind of situation is one where using "T (0)" over "T ()"
> is useful. It means that template substitution will succeed for
> T that are sufficiently integer-like to have a single well-defined
> zero but not for T that aren't (such as wide_int).
That strikes me as a little too subtle. But it also doesn't
sound like wide_int is as close to an integer as its name
suggests. After all, it doesn't support relational operators
either, or even assignment from other integer types. It's
really a different beast. But that still doesn't in my mind
justify the no-op initialization semantics.
>> For offset_int the default precision is 128-bits. Making that
>> the default also for wide_int should be unsurprising.
>
> I think it'd be surprising. offset_int should always be used in
> preference to wide_int if the precision is known to be 128 bits
> in advance, and there doesn't seem any reason to prefer the
> precision of offset_int over widest_int, HOST_WIDE_INT or int.
>
> We would end up with:
>
> wide_int
> f (const wide_int &y)
> {
> wide_int x;
> x += y;
> return x;
> }
>
> being valid if y happens to have 128 bits as well, and a runtime error
> otherwise.
Surely that would be far better than the undefined behavior we
have today.
>
> Also, I think it'd be inconsistent to allow the specific case of 0
> to be assigned by default construction, but not also allow:
>
> wide_int x (0);
>
> wide_int x;
> x = 0;
>
> wide_int x;
> x = 1;
>
> etc. And wide_int wasn't intended for that use case.
Then perhaps I don't fully understand wide_int. I would expect
the above assignments to also "just work" and I can't imagine
why we would not want them to. In what way is rejecting
the above helpful when the following is accepted but undefined?
wide_int f ()
{
wide_int x;
x += 0;
return x;
}
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-26 23:53 ` Martin Sebor
@ 2017-10-27 8:33 ` Richard Sandiford
2017-10-29 16:56 ` Martin Sebor
0 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-27 8:33 UTC (permalink / raw)
To: Martin Sebor; +Cc: gcc-patches
Martin Sebor <msebor@gmail.com> writes:
> On 10/26/2017 11:52 AM, Richard Sandiford wrote:
>> Martin Sebor <msebor@gmail.com> writes:
>>> For offset_int the default precision is 128-bits. Making that
>>> the default also for wide_int should be unsurprising.
>>
>> I think it'd be surprising. offset_int should always be used in
>> preference to wide_int if the precision is known to be 128 bits
>> in advance, and there doesn't seem any reason to prefer the
>> precision of offset_int over widest_int, HOST_WIDE_INT or int.
>>
>> We would end up with:
>>
>> wide_int
>> f (const wide_int &y)
>> {
>> wide_int x;
>> x += y;
>> return x;
>> }
>>
>> being valid if y happens to have 128 bits as well, and a runtime error
>> otherwise.
>
> Surely that would be far better than the undefined behavior we
> have today.
I disagree. People shouldn't rely on the above behaviour because
it's never useful. If y is known to be 128 bits in advance then
the code should be using offset_int instead of wide_int. And if
y isn't known to be 128 bits in advance, the code is incorrect,
because it needs to cope with precisions other than 128 but doesn't
do so.
The motivation for doing this was to initialise wide_ints to zero,
but the behaviour of f() wouldn't be same as:
wide_int x = 0 + y;
return x;
That's always valid, because in an operation involving a wide_int
and a primitive type, the primitive type promotes or demotes
to the same precision as the wide_int.
>> Also, I think it'd be inconsistent to allow the specific case of 0
>> to be assigned by default construction, but not also allow:
>>
>> wide_int x (0);
>>
>> wide_int x;
>> x = 0;
>>
>> wide_int x;
>> x = 1;
>>
>> etc. And wide_int wasn't intended for that use case.
>
> Then perhaps I don't fully understand wide_int. I would expect
> the above assignments to also "just work" and I can't imagine
> why we would not want them to. In what way is rejecting
> the above helpful when the following is accepted but undefined?
>
> wide_int f ()
> {
> wide_int x;
> x += 0;
> return x;
> }
Well, it compiles, but with sufficiently good static analysis
it should trigger a warning. (GCC might not be there yet,
but these things improve.) As mentioned above:
wide_int f ()
{
wide_int x = ...;
x += 0;
return x;
}
(or some value other than 0) is well-defined because the int
promotes to whatever precision x has.
The problem with the examples I gave was that wide_int always needs
to have a precision and nothing in that code says what the precision
should be. The "right" way of writing it would be:
wide_int x = wi::shwi (0, prec);
wide_int x;
x = wi::shwi (0, prec);
wide_int x;
x = wi::shwi (1, prec);
where prec specifies the precision of the integer.
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-27 8:33 ` Richard Sandiford
@ 2017-10-29 16:56 ` Martin Sebor
2017-10-30 6:36 ` Trevor Saunders
0 siblings, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-10-29 16:56 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/27/2017 02:08 AM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>> On 10/26/2017 11:52 AM, Richard Sandiford wrote:
>>> Martin Sebor <msebor@gmail.com> writes:
>>>> For offset_int the default precision is 128-bits. Making that
>>>> the default also for wide_int should be unsurprising.
>>>
>>> I think it'd be surprising. offset_int should always be used in
>>> preference to wide_int if the precision is known to be 128 bits
>>> in advance, and there doesn't seem any reason to prefer the
>>> precision of offset_int over widest_int, HOST_WIDE_INT or int.
>>>
>>> We would end up with:
>>>
>>> wide_int
>>> f (const wide_int &y)
>>> {
>>> wide_int x;
>>> x += y;
>>> return x;
>>> }
>>>
>>> being valid if y happens to have 128 bits as well, and a runtime error
>>> otherwise.
>>
>> Surely that would be far better than the undefined behavior we
>> have today.
>
> I disagree. People shouldn't rely on the above behaviour because
> it's never useful.
Well, yes, but the main point of my feedback on the poly_int default
ctor (and the ctor of the extended_tree class, and the existing wide
int classes) is that it makes them easy to misuse. That they're not
meant to be [mis]used like that isn't an answer.
You explained earlier that the no-op initialization is necessary
for efficiency and I suggested a safer alternative: an API that
makes the lack of initialization explicit, while providing a safe
default. I still believe this is the right approach for the new
poly_int classes. I also think it's the right solution for
offset_int.
>> wide_int f ()
>> {
>> wide_int x;
>> x += 0;
>> return x;
>> }
>
> Well, it compiles, but with sufficiently good static analysis
> it should trigger a warning. (GCC might not be there yet,
> but these things improve.) As mentioned above:
Forgive me, but knowingly designing classes to be unsafe with
the hope that their accidental misuses may some day be detected
by sufficiently advanced static analyzers is not helpful. It's
also unnecessary when less error-prone and equally efficient
alternatives exist.
> wide_int f ()
> {
> wide_int x = ...;
> x += 0;
> return x;
> }
>
> (or some value other than 0) is well-defined because the int
> promotes to whatever precision x has.
>
> The problem with the examples I gave was that wide_int always needs
> to have a precision and nothing in that code says what the precision
> should be. The "right" way of writing it would be:
>
> wide_int x = wi::shwi (0, prec);
>
> wide_int x;
> x = wi::shwi (0, prec);
>
> wide_int x;
> x = wi::shwi (1, prec);
>
> where prec specifies the precision of the integer.
Yes, I realize that. But we got here by exploring the effects
of default zero-initialization. You have given examples showing
where relying on the zero-intialization could lead to bugs. Sure,
no one is disputing that there are such instances. Those exist
with any type and are, in general, unavoidable.
My argument is that default initialization that leaves the object
in an indeterminate state suffers from all the same problems your
examples do plus infinitely many others (i.e., undefined behavior),
and so is an obviously inferior choice. It's a design error that
should be avoided.
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-29 16:56 ` Martin Sebor
@ 2017-10-30 6:36 ` Trevor Saunders
2017-10-31 20:25 ` Martin Sebor
0 siblings, 1 reply; 302+ messages in thread
From: Trevor Saunders @ 2017-10-30 6:36 UTC (permalink / raw)
To: Martin Sebor; +Cc: gcc-patches, richard.sandiford
On Sun, Oct 29, 2017 at 10:25:38AM -0600, Martin Sebor wrote:
> On 10/27/2017 02:08 AM, Richard Sandiford wrote:
> > Martin Sebor <msebor@gmail.com> writes:
> > > On 10/26/2017 11:52 AM, Richard Sandiford wrote:
> > > > Martin Sebor <msebor@gmail.com> writes:
> > > > > For offset_int the default precision is 128-bits. Making that
> > > > > the default also for wide_int should be unsurprising.
> > > >
> > > > I think it'd be surprising. offset_int should always be used in
> > > > preference to wide_int if the precision is known to be 128 bits
> > > > in advance, and there doesn't seem any reason to prefer the
> > > > precision of offset_int over widest_int, HOST_WIDE_INT or int.
> > > >
> > > > We would end up with:
> > > >
> > > > wide_int
> > > > f (const wide_int &y)
> > > > {
> > > > wide_int x;
> > > > x += y;
> > > > return x;
> > > > }
> > > >
> > > > being valid if y happens to have 128 bits as well, and a runtime error
> > > > otherwise.
> > >
> > > Surely that would be far better than the undefined behavior we
> > > have today.
> >
> > I disagree. People shouldn't rely on the above behaviour because
> > it's never useful.
>
> Well, yes, but the main point of my feedback on the poly_int default
> ctor (and the ctor of the extended_tree class, and the existing wide
> int classes) is that it makes them easy to misuse. That they're not
> meant to be [mis]used like that isn't an answer.
I think Richard's point is different from saying don't misuse it. I
think its that 0 initializing is also always a bug, and the user needs
to choosesome initialization to follow the default ctor in either case.
> You explained earlier that the no-op initialization is necessary
> for efficiency and I suggested a safer alternative: an API that
> makes the lack of initialization explicit, while providing a safe
> default. I still believe this is the right approach for the new
> poly_int classes. I also think it's the right solution for
> offset_int.
>
> > > wide_int f ()
> > > {
> > > wide_int x;
> > > x += 0;
> > > return x;
> > > }
> >
> > Well, it compiles, but with sufficiently good static analysis
> > it should trigger a warning. (GCC might not be there yet,
> > but these things improve.) As mentioned above:
>
> Forgive me, but knowingly designing classes to be unsafe with
> the hope that their accidental misuses may some day be detected
> by sufficiently advanced static analyzers is not helpful. It's
> also unnecessary when less error-prone and equally efficient
> alternatives exist.
If only the world was that nice, unfortunately whenever I go looking at
generated code I find things that make me sad.
> > wide_int f ()
> > {
> > wide_int x = ...;
> > x += 0;
> > return x;
> > }
> >
> > (or some value other than 0) is well-defined because the int
> > promotes to whatever precision x has.
> >
> > The problem with the examples I gave was that wide_int always needs
> > to have a precision and nothing in that code says what the precision
> > should be. The "right" way of writing it would be:
> >
> > wide_int x = wi::shwi (0, prec);
> >
> > wide_int x;
> > x = wi::shwi (0, prec);
> >
> > wide_int x;
> > x = wi::shwi (1, prec);
> >
> > where prec specifies the precision of the integer.
>
> Yes, I realize that. But we got here by exploring the effects
> of default zero-initialization. You have given examples showing
> where relying on the zero-intialization could lead to bugs. Sure,
> no one is disputing that there are such instances. Those exist
> with any type and are, in general, unavoidable.
>
> My argument is that default initialization that leaves the object
> in an indeterminate state suffers from all the same problems your
> examples do plus infinitely many others (i.e., undefined behavior),
> and so is an obviously inferior choice. It's a design error that
> should be avoided.
I'd argue its not strictly inferior, one big advantage it has is
that its much easier for tools like valgrind or msan to find bugs where
something is uninitialized than ones where its initialized with garbage.
deciding a program exhibits undefined behavior in some case is a lot
easier than reasoning about if it did what it was supposed to.
The other problem is that 0 is an especially bad value to pick if it
isn't very likely to always be correct. If you are going to initialize
something with a known garbage value it would be better to pick
something that is more likely to blow up immediately than something that
can hide bugs. Sure, unitialized things change from run to run, but
they are much more likely to look like garbage than 0 is.
Trev
>
> Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-30 6:36 ` Trevor Saunders
@ 2017-10-31 20:25 ` Martin Sebor
0 siblings, 0 replies; 302+ messages in thread
From: Martin Sebor @ 2017-10-31 20:25 UTC (permalink / raw)
To: Trevor Saunders; +Cc: gcc-patches, richard.sandiford
On 10/29/2017 09:14 PM, Trevor Saunders wrote:
> On Sun, Oct 29, 2017 at 10:25:38AM -0600, Martin Sebor wrote:
>> On 10/27/2017 02:08 AM, Richard Sandiford wrote:
>>> Martin Sebor <msebor@gmail.com> writes:
>>>> On 10/26/2017 11:52 AM, Richard Sandiford wrote:
>>>>> Martin Sebor <msebor@gmail.com> writes:
>>>>>> For offset_int the default precision is 128-bits. Making that
>>>>>> the default also for wide_int should be unsurprising.
>>>>>
>>>>> I think it'd be surprising. offset_int should always be used in
>>>>> preference to wide_int if the precision is known to be 128 bits
>>>>> in advance, and there doesn't seem any reason to prefer the
>>>>> precision of offset_int over widest_int, HOST_WIDE_INT or int.
>>>>>
>>>>> We would end up with:
>>>>>
>>>>> wide_int
>>>>> f (const wide_int &y)
>>>>> {
>>>>> wide_int x;
>>>>> x += y;
>>>>> return x;
>>>>> }
>>>>>
>>>>> being valid if y happens to have 128 bits as well, and a runtime error
>>>>> otherwise.
>>>>
>>>> Surely that would be far better than the undefined behavior we
>>>> have today.
>>>
>>> I disagree. People shouldn't rely on the above behaviour because
>>> it's never useful.
>>
>> Well, yes, but the main point of my feedback on the poly_int default
>> ctor (and the ctor of the extended_tree class, and the existing wide
>> int classes) is that it makes them easy to misuse. That they're not
>> meant to be [mis]used like that isn't an answer.
>
> I think Richard's point is different from saying don't misuse it. I
> think its that 0 initializing is also always a bug, and the user needs
> to choosesome initialization to follow the default ctor in either case.
Initializing offset_int to zero isn't a bug and there are examples
of it in GCC sources. Some of those are now being replaced with
those of poly_int xxx = 0. Here's one example from [015/nnn]
poly_int: ao_ref and vn_reference_op_t:
@@ -1365,8 +1369,8 @@ indirect_refs_may_alias_p (tree ref1 ATT
refs_may_alias_p_1 (ao_ref *ref1, ao_ref *ref2, bool tbaa_p)
{
tree base1, base2;
- HOST_WIDE_INT offset1 = 0, offset2 = 0;
- HOST_WIDE_INT max_size1 = -1, max_size2 = -1;
+ poly_int64 offset1 = 0, offset2 = 0;
+ poly_int64 max_size1 = -1, max_size2 = -1;
I'm not suggesting these be changed to avoid the explicit
initialization. But I show this to disprove the claim above.
Clearly, zero initialization is valid and useful.
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-26 16:45 ` Martin Sebor
2017-10-26 18:05 ` Richard Sandiford
@ 2017-10-26 18:11 ` Pedro Alves
2017-10-26 19:12 ` Martin Sebor
1 sibling, 1 reply; 302+ messages in thread
From: Pedro Alves @ 2017-10-26 18:11 UTC (permalink / raw)
To: Martin Sebor, gcc-patches, richard.sandiford
On 10/26/2017 05:37 PM, Martin Sebor wrote:
> I agree that the latter use case is more common in GCC, but I don't
> see it as a good thing. GCC was written in C and most code still
> uses now outdated C practices such as declaring variables at the top
> of a (often long) function, and usually without initializing them.
> It's been established that it's far better to declare variables with
> the smallest scope, and to initialize them on declaration. Compilers
> are smart enough these days to eliminate redundant initialization or
> assignments.
I don't agree that that's established. FWIW, I'm on the
"prefer the -Wuninitialized" warnings camp. I've been looking
forward to all the VRP and threader improvements hoping that that
warning (and -Wmaybe-uninitialized...) will improve along.
> My comment is not motivated by convenience. What I'm concerned
> about is that defining a default ctor to be a no-op defeats the
> zero-initialization semantics most users expect of T().
This sounds like it's a problem because GCC is written in C++98.
You can get the semantics you want in C++11 by defining
the constructor with "= default;" :
struct T
{
T(int); // some other constructor forcing me to
// add a default constructor.
T() = default; // give me default construction using
// default initialization.
int i;
};
And now 'T t;' leaves T::i default initialized, i.e.,
uninitialized, while T() value-initializes T::i, i.e.,
initializes it to zero.
So if that's a concern, maybe you could use "= default"
conditionally depending on #if __cplusplus >= C++11, so that
you'd get it for stages after stage1.
Or just start requiring C++11 already. :-)
Thanks,
Pedro Alves
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-26 18:11 ` Pedro Alves
@ 2017-10-26 19:12 ` Martin Sebor
2017-10-26 19:19 ` Pedro Alves
0 siblings, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-10-26 19:12 UTC (permalink / raw)
To: Pedro Alves, gcc-patches, richard.sandiford
On 10/26/2017 12:05 PM, Pedro Alves wrote:
> On 10/26/2017 05:37 PM, Martin Sebor wrote:
>
>> I agree that the latter use case is more common in GCC, but I don't
>> see it as a good thing. GCC was written in C and most code still
>> uses now outdated C practices such as declaring variables at the top
>> of a (often long) function, and usually without initializing them.
>> It's been established that it's far better to declare variables with
>> the smallest scope, and to initialize them on declaration. Compilers
>> are smart enough these days to eliminate redundant initialization or
>> assignments.
>
> I don't agree that that's established. FWIW, I'm on the
> "prefer the -Wuninitialized" warnings camp. I've been looking
> forward to all the VRP and threader improvements hoping that that
> warning (and -Wmaybe-uninitialized...) will improve along.
You're by far not alone, but it's a shrinking camp as
the majority have come to appreciate the benefits of the practice.
You can see it reflected in most popular coding standards, including
the CERT C++ Secure Coding Standard, Google C++ Style Guide, Sutter
and Alexandrescu's C++ Coding Standards, Scott Meyer's books, and
so on. Just like with every rule, there are exceptions, but there
should be no doubt that in the general case, it is preferable to
declare each variable at the point where its initial value is known
(or can be computed) and initialize it on its declaration.
>> My comment is not motivated by convenience. What I'm concerned
>> about is that defining a default ctor to be a no-op defeats the
>> zero-initialization semantics most users expect of T().
>
> This sounds like it's a problem because GCC is written in C++98.
>
> You can get the semantics you want in C++11 by defining
> the constructor with "= default;" :
>
> struct T
> {
> T(int); // some other constructor forcing me to
> // add a default constructor.
>
> T() = default; // give me default construction using
> // default initialization.
> int i;
> };
>
> And now 'T t;' leaves T::i default initialized, i.e.,
> uninitialized, while T() value-initializes T::i, i.e.,
> initializes it to zero.
>
> So if that's a concern, maybe you could use "= default"
> conditionally depending on #if __cplusplus >= C++11, so that
> you'd get it for stages after stage1.
>
> Or just start requiring C++11 already. :-)
That would make sense to me and from the sound of some of his
comments also Richard's preference.
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-26 19:12 ` Martin Sebor
@ 2017-10-26 19:19 ` Pedro Alves
2017-10-26 23:41 ` Martin Sebor
0 siblings, 1 reply; 302+ messages in thread
From: Pedro Alves @ 2017-10-26 19:19 UTC (permalink / raw)
To: Martin Sebor, gcc-patches, richard.sandiford
On 10/26/2017 07:54 PM, Martin Sebor wrote:
> (...) in the general case, it is preferable to
> declare each variable at the point where its initial value is known
> (or can be computed) and initialize it on its declaration.
With that I fully agree, except it's not always possible or
natural. The issue at hand usually turns up with
conditional initialization, like:
void foo ()
{
int t;
if (something)
t = 1;
else if (something_else)
t = 2;
if (t == 1)
bar ();
}
That's a simple example of course, but more complicated
conditionals aren't so easy to grok and spot the bug.
In the case above, I'd much prefer if the compiler tells me
I missed initializing 't' than initializing it to 0 "just
in case".
Thanks,
Pedro Alves
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-26 19:19 ` Pedro Alves
@ 2017-10-26 23:41 ` Martin Sebor
2017-10-30 10:26 ` Pedro Alves
0 siblings, 1 reply; 302+ messages in thread
From: Martin Sebor @ 2017-10-26 23:41 UTC (permalink / raw)
To: Pedro Alves, gcc-patches, richard.sandiford
On 10/26/2017 01:17 PM, Pedro Alves wrote:
> On 10/26/2017 07:54 PM, Martin Sebor wrote:
>
>> (...) in the general case, it is preferable to
>> declare each variable at the point where its initial value is known
>> (or can be computed) and initialize it on its declaration.
>
> With that I fully agree, except it's not always possible or
> natural. The issue at hand usually turns up with
> conditional initialization, like:
>
> void foo ()
> {
> int t;
> if (something)
> t = 1;
> else if (something_else)
> t = 2;
> if (t == 1)
> bar ();
> }
>
> That's a simple example of course, but more complicated
> conditionals aren't so easy to grok and spot the bug.
>
> In the case above, I'd much prefer if the compiler tells me
> I missed initializing 't' than initializing it to 0 "just
> in case".
Sure. A similar observation could be made about std::string
or std::vector vs a plain C-style array, or for that matter,
about almost any other class. But it would be absurd to use
such examples as arguments that it's better to define classes
with a no-op default constructor. It's safer to initialize
objects to some value (whatever that might be) than not to
initialize them at all. That's true for fundamental types
and even more so for user-defined classes with constructors.
IMO, a good rule of thumb to follow in class design is to have
every class with any user-defined ctor either define a default
ctor that puts the object into a determinate state, or make
the default ctor inaccessible (or deleted in new C++ versions).
If there is a use case for leaving newly constructed objects
of a class in an uninitialized state that's an exception to
the rule that can be accommodated by providing a special API
(or in C++ 11, a defaulted ctor).
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-26 23:41 ` Martin Sebor
@ 2017-10-30 10:26 ` Pedro Alves
2017-10-31 16:12 ` Martin Sebor
0 siblings, 1 reply; 302+ messages in thread
From: Pedro Alves @ 2017-10-30 10:26 UTC (permalink / raw)
To: Martin Sebor, gcc-patches, richard.sandiford
On 10/27/2017 12:29 AM, Martin Sebor wrote:
>
> IMO, a good rule of thumb to follow in class design is to have
> every class with any user-defined ctor either define a default
> ctor that puts the object into a determinate state, or make
> the default ctor inaccessible (or deleted in new C++ versions).
> If there is a use case for leaving newly constructed objects
> of a class in an uninitialized state that's an exception to
> the rule that can be accommodated by providing a special API
> (or in C++ 11, a defaulted ctor).
Yet another rule of thumb is to make classes that model
built-in types behave as close to the built-in types as
possible, making it easier to migrate between the custom
types and the built-in types (and vice versa), to follow
expectations, and to avoid pessimization around e.g., otherwise
useless forcing initialization of such types in containers/arrays
when you're going to immediately fill in the container/array with
real values.
BTW, there's a proposal for adding a wide_int class to C++20:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0539r1.html
and I noticed:
~~~
26.??.2.?? wide_integer constructors [numeric.wide_integer.cons]
constexpr wide_integer() noexcept = default;
Effects: A Constructs an object with undefined value.
~~~
Thanks,
Pedro Alves
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-30 10:26 ` Pedro Alves
@ 2017-10-31 16:12 ` Martin Sebor
0 siblings, 0 replies; 302+ messages in thread
From: Martin Sebor @ 2017-10-31 16:12 UTC (permalink / raw)
To: Pedro Alves, gcc-patches, richard.sandiford
On 10/30/2017 04:19 AM, Pedro Alves wrote:
> On 10/27/2017 12:29 AM, Martin Sebor wrote:
>
>>
>> IMO, a good rule of thumb to follow in class design is to have
>> every class with any user-defined ctor either define a default
>> ctor that puts the object into a determinate state, or make
>> the default ctor inaccessible (or deleted in new C++ versions).
>> If there is a use case for leaving newly constructed objects
>> of a class in an uninitialized state that's an exception to
>> the rule that can be accommodated by providing a special API
>> (or in C++ 11, a defaulted ctor).
>
> Yet another rule of thumb is to make classes that model
> built-in types behave as close to the built-in types as
> possible, making it easier to migrate between the custom
> types and the built-in types (and vice versa), to follow
> expectations, and to avoid pessimization around e.g., otherwise
> useless forcing initialization of such types in containers/arrays
> when you're going to immediately fill in the container/array with
> real values.
>
> BTW, there's a proposal for adding a wide_int class to C++20:
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0539r1.html
>
> and I noticed:
>
> ~~~
> 26.??.2.?? wide_integer constructors [numeric.wide_integer.cons]
>
> constexpr wide_integer() noexcept = default;
>
> Effects: A Constructs an object with undefined value.
> ~~~
Thanks for the reference. As I said in an earlier reply, this
would make sense to me if we could use C++ 11 or later. Unlike
a no-op default ctor, the = default constructor provides syntax
to initialize the object, so both the safe use case and the
efficient one are supported. I.e., the proposed wide_int is
zero-initialized by using the 'wide_int()' syntax. The GCC
wide int and poly_int classes, on the other hand, are left in
an indeterminate state. That's a bug waiting to happen (as I
already experienced with offset_int.)
Martin
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-10-23 17:02 ` [006/nnn] poly_int: tree constants Richard Sandiford
2017-10-25 17:14 ` Martin Sebor
@ 2017-11-17 4:51 ` Jeff Law
2017-11-18 15:48 ` Richard Sandiford
1 sibling, 1 reply; 302+ messages in thread
From: Jeff Law @ 2017-11-17 4:51 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:00 AM, Richard Sandiford wrote:
> This patch adds a tree representation for poly_ints. Unlike the
> rtx version, the coefficients are INTEGER_CSTs rather than plain
> integers, so that we can easily access them as poly_widest_ints
> and poly_offset_ints.
>
> The patch also adjusts some places that previously
> relied on "constant" meaning "INTEGER_CST". It also makes
> sure that the TYPE_SIZE agrees with the TYPE_SIZE_UNIT for
> vector booleans, given the existing:
>
> /* Several boolean vector elements may fit in a single unit. */
> if (VECTOR_BOOLEAN_TYPE_P (type)
> && type->type_common.mode != BLKmode)
> TYPE_SIZE_UNIT (type)
> = size_int (GET_MODE_SIZE (type->type_common.mode));
> else
> TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
> TYPE_SIZE_UNIT (innertype),
> size_int (nunits));
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * doc/generic.texi (POLY_INT_CST): Document.
> * tree.def (POLY_INT_CST): New tree code.
> * treestruct.def (TS_POLY_INT_CST): New tree layout.
> * tree-core.h (tree_poly_int_cst): New struct.
> (tree_node): Add a poly_int_cst field.
> * tree.h (POLY_INT_CST_P, POLY_INT_CST_COEFF): New macros.
> (wide_int_to_tree, force_fit_type): Take a poly_wide_int_ref
> instead of a wide_int_ref.
> (build_int_cst, build_int_cst_type): Take a poly_int64 instead
> of a HOST_WIDE_INT.
> (build_int_cstu, build_array_type_nelts): Take a poly_uint64
> instead of an unsigned HOST_WIDE_INT.
> (build_poly_int_cst, tree_fits_poly_int64_p, tree_fits_poly_uint64_p)
> (ptrdiff_tree_p): Declare.
> (tree_to_poly_int64, tree_to_poly_uint64): Likewise. Provide
> extern inline implementations if the target doesn't use POLY_INT_CST.
> (poly_int_tree_p): New function.
> (wi::unextended_tree): New class.
> (wi::int_traits <unextended_tree>): New override.
> (wi::extended_tree): Add a default constructor.
> (wi::extended_tree::get_tree): New function.
> (wi::widest_extended_tree, wi::offset_extended_tree): New typedefs.
> (wi::tree_to_widest_ref, wi::tree_to_offset_ref): Use them.
> (wi::tree_to_poly_widest_ref, wi::tree_to_poly_offset_ref)
> (wi::tree_to_poly_wide_ref): New typedefs.
> (wi::ints_for): Provide overloads for extended_tree and
> unextended_tree.
> (poly_int_cst_value, wi::to_poly_widest, wi::to_poly_offset)
> (wi::to_wide): New functions.
> (wi::fits_to_boolean_p, wi::fits_to_tree_p): Handle poly_ints.
> * tree.c (poly_int_cst_hasher): New struct.
> (poly_int_cst_hash_table): New variable.
> (tree_node_structure_for_code, tree_code_size, simple_cst_equal)
> (valid_constant_size_p, add_expr, drop_tree_overflow): Handle
> POLY_INT_CST.
> (initialize_tree_contains_struct): Handle TS_POLY_INT_CST.
> (init_ttree): Initialize poly_int_cst_hash_table.
> (build_int_cst, build_int_cst_type, build_invariant_address): Take
> a poly_int64 instead of a HOST_WIDE_INT.
> (build_int_cstu, build_array_type_nelts): Take a poly_uint64
> instead of an unsigned HOST_WIDE_INT.
> (wide_int_to_tree): Rename to...
> (wide_int_to_tree_1): ...this.
> (build_new_poly_int_cst, build_poly_int_cst): New functions.
> (force_fit_type): Take a poly_wide_int_ref instead of a wide_int_ref.
> (wide_int_to_tree): New function that takes a poly_wide_int_ref.
> (ptrdiff_tree_p, tree_to_poly_int64, tree_to_poly_uint64)
> (tree_fits_poly_int64_p, tree_fits_poly_uint64_p): New functions.
> * lto-streamer-out.c (DFS::DFS_write_tree_body, hash_tree): Handle
> TS_POLY_INT_CST.
> * tree-streamer-in.c (lto_input_ts_poly_tree_pointers): Likewise.
> (streamer_read_tree_body): Likewise.
> * tree-streamer-out.c (write_ts_poly_tree_pointers): Likewise.
> (streamer_write_tree_body): Likewise.
> * tree-streamer.c (streamer_check_handled_ts_structures): Likewise.
> * asan.c (asan_protect_global): Require the size to be an INTEGER_CST.
> * cfgexpand.c (expand_debug_expr): Handle POLY_INT_CST.
> * expr.c (const_vector_element, expand_expr_real_1): Likewise.
> * gimple-expr.h (is_gimple_constant): Likewise.
> * gimplify.c (maybe_with_size_expr): Likewise.
> * print-tree.c (print_node): Likewise.
> * tree-data-ref.c (data_ref_compare_tree): Likewise.
> * tree-pretty-print.c (dump_generic_node): Likewise.
> * tree-ssa-address.c (addr_for_mem_ref): Likewise.
> * tree-vect-data-refs.c (dr_group_sort_cmp): Likewise.
> * tree-vrp.c (compare_values_warnv): Likewise.
> * tree-ssa-loop-ivopts.c (determine_base_object, constant_multiple_of)
> (get_loop_invariant_expr, add_candidate_1, get_computation_aff_1)
> (force_expr_to_var_cost): Likewise.
> * tree-ssa-loop.c (for_each_index): Likewise.
> * fold-const.h (build_invariant_address, size_int_kind): Take a
> poly_int64 instead of a HOST_WIDE_INT.
> * fold-const.c (fold_negate_expr_1, const_binop, const_unop)
> (fold_convert_const, multiple_of_p, fold_negate_const): Handle
> POLY_INT_CST.
> (size_binop_loc): Likewise. Allow int_const_binop_1 to fail.
> (int_const_binop_2): New function, split out from...
> (int_const_binop_1): ...here. Handle POLY_INT_CST.
> (size_int_kind): Take a poly_int64 instead of a HOST_WIDE_INT.
> * expmed.c (make_tree): Handle CONST_POLY_INT_P.
> * gimple-ssa-strength-reduction.c (slsr_process_add)
> (slsr_process_mul): Check for INTEGER_CSTs before using them
> as candidates.
> * stor-layout.c (bits_from_bytes): New function.
> (bit_from_pos): Use it.
> (layout_type): Likewise. For vectors, multiply the TYPE_SIZE_UNIT
> by BITS_PER_UNIT to get the TYPE_SIZE.
> * tree-cfg.c (verify_expr, verify_types_in_gimple_reference): Allow
> MEM_REF and TARGET_MEM_REF offsets to be a POLY_INT_CST.
>
> Index: gcc/tree.h
> ===================================================================
> --- gcc/tree.h 2017-10-23 16:52:20.504766418 +0100
> +++ gcc/tree.h 2017-10-23 17:00:57.784962010 +0100
> @@ -5132,6 +5195,29 @@ extern bool anon_aggrname_p (const_tree)
> /* The tree and const_tree overload templates. */
> namespace wi
> {
> + class unextended_tree
> + {
> + private:
> + const_tree m_t;
> +
> + public:
> + unextended_tree () {}
> + unextended_tree (const_tree t) : m_t (t) {}
> +
> + unsigned int get_precision () const;
> + const HOST_WIDE_INT *get_val () const;
> + unsigned int get_len () const;
> + const_tree get_tree () const { return m_t; }
> + };
> +
> + template <>
> + struct int_traits <unextended_tree>
> + {
> + static const enum precision_type precision_type = VAR_PRECISION;
> + static const bool host_dependent_precision = false;
> + static const bool is_sign_extended = false;
> + };
> +
> template <int N>
> class extended_tree
> {
> @@ -5139,11 +5225,13 @@ extern bool anon_aggrname_p (const_tree)
> const_tree m_t;
>
> public:
> + extended_tree () {}
> extended_tree (const_tree);
>
> unsigned int get_precision () const;
> const HOST_WIDE_INT *get_val () const;
> unsigned int get_len () const;
> + const_tree get_tree () const { return m_t; }
> };
Similarly I'll defer on part of the patch since the empty ctors play
into the initialization question that's still on the table.
Otherwise this is OK.
Jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [006/nnn] poly_int: tree constants
2017-11-17 4:51 ` Jeff Law
@ 2017-11-18 15:48 ` Richard Sandiford
0 siblings, 0 replies; 302+ messages in thread
From: Richard Sandiford @ 2017-11-18 15:48 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches
Jeff Law <law@redhat.com> writes:
> On 10/23/2017 11:00 AM, Richard Sandiford wrote:
>> This patch adds a tree representation for poly_ints. Unlike the
>> rtx version, the coefficients are INTEGER_CSTs rather than plain
>> integers, so that we can easily access them as poly_widest_ints
>> and poly_offset_ints.
>>
>> The patch also adjusts some places that previously
>> relied on "constant" meaning "INTEGER_CST". It also makes
>> sure that the TYPE_SIZE agrees with the TYPE_SIZE_UNIT for
>> vector booleans, given the existing:
>>
>> /* Several boolean vector elements may fit in a single unit. */
>> if (VECTOR_BOOLEAN_TYPE_P (type)
>> && type->type_common.mode != BLKmode)
>> TYPE_SIZE_UNIT (type)
>> = size_int (GET_MODE_SIZE (type->type_common.mode));
>> else
>> TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
>> TYPE_SIZE_UNIT (innertype),
>> size_int (nunits));
>>
>>
>> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>> Alan Hayward <alan.hayward@arm.com>
>> David Sherwood <david.sherwood@arm.com>
>>
>> gcc/
>> * doc/generic.texi (POLY_INT_CST): Document.
>> * tree.def (POLY_INT_CST): New tree code.
>> * treestruct.def (TS_POLY_INT_CST): New tree layout.
>> * tree-core.h (tree_poly_int_cst): New struct.
>> (tree_node): Add a poly_int_cst field.
>> * tree.h (POLY_INT_CST_P, POLY_INT_CST_COEFF): New macros.
>> (wide_int_to_tree, force_fit_type): Take a poly_wide_int_ref
>> instead of a wide_int_ref.
>> (build_int_cst, build_int_cst_type): Take a poly_int64 instead
>> of a HOST_WIDE_INT.
>> (build_int_cstu, build_array_type_nelts): Take a poly_uint64
>> instead of an unsigned HOST_WIDE_INT.
>> (build_poly_int_cst, tree_fits_poly_int64_p, tree_fits_poly_uint64_p)
>> (ptrdiff_tree_p): Declare.
>> (tree_to_poly_int64, tree_to_poly_uint64): Likewise. Provide
>> extern inline implementations if the target doesn't use POLY_INT_CST.
>> (poly_int_tree_p): New function.
>> (wi::unextended_tree): New class.
>> (wi::int_traits <unextended_tree>): New override.
>> (wi::extended_tree): Add a default constructor.
>> (wi::extended_tree::get_tree): New function.
>> (wi::widest_extended_tree, wi::offset_extended_tree): New typedefs.
>> (wi::tree_to_widest_ref, wi::tree_to_offset_ref): Use them.
>> (wi::tree_to_poly_widest_ref, wi::tree_to_poly_offset_ref)
>> (wi::tree_to_poly_wide_ref): New typedefs.
>> (wi::ints_for): Provide overloads for extended_tree and
>> unextended_tree.
>> (poly_int_cst_value, wi::to_poly_widest, wi::to_poly_offset)
>> (wi::to_wide): New functions.
>> (wi::fits_to_boolean_p, wi::fits_to_tree_p): Handle poly_ints.
>> * tree.c (poly_int_cst_hasher): New struct.
>> (poly_int_cst_hash_table): New variable.
>> (tree_node_structure_for_code, tree_code_size, simple_cst_equal)
>> (valid_constant_size_p, add_expr, drop_tree_overflow): Handle
>> POLY_INT_CST.
>> (initialize_tree_contains_struct): Handle TS_POLY_INT_CST.
>> (init_ttree): Initialize poly_int_cst_hash_table.
>> (build_int_cst, build_int_cst_type, build_invariant_address): Take
>> a poly_int64 instead of a HOST_WIDE_INT.
>> (build_int_cstu, build_array_type_nelts): Take a poly_uint64
>> instead of an unsigned HOST_WIDE_INT.
>> (wide_int_to_tree): Rename to...
>> (wide_int_to_tree_1): ...this.
>> (build_new_poly_int_cst, build_poly_int_cst): New functions.
>> (force_fit_type): Take a poly_wide_int_ref instead of a wide_int_ref.
>> (wide_int_to_tree): New function that takes a poly_wide_int_ref.
>> (ptrdiff_tree_p, tree_to_poly_int64, tree_to_poly_uint64)
>> (tree_fits_poly_int64_p, tree_fits_poly_uint64_p): New functions.
>> * lto-streamer-out.c (DFS::DFS_write_tree_body, hash_tree): Handle
>> TS_POLY_INT_CST.
>> * tree-streamer-in.c (lto_input_ts_poly_tree_pointers): Likewise.
>> (streamer_read_tree_body): Likewise.
>> * tree-streamer-out.c (write_ts_poly_tree_pointers): Likewise.
>> (streamer_write_tree_body): Likewise.
>> * tree-streamer.c (streamer_check_handled_ts_structures): Likewise.
>> * asan.c (asan_protect_global): Require the size to be an INTEGER_CST.
>> * cfgexpand.c (expand_debug_expr): Handle POLY_INT_CST.
>> * expr.c (const_vector_element, expand_expr_real_1): Likewise.
>> * gimple-expr.h (is_gimple_constant): Likewise.
>> * gimplify.c (maybe_with_size_expr): Likewise.
>> * print-tree.c (print_node): Likewise.
>> * tree-data-ref.c (data_ref_compare_tree): Likewise.
>> * tree-pretty-print.c (dump_generic_node): Likewise.
>> * tree-ssa-address.c (addr_for_mem_ref): Likewise.
>> * tree-vect-data-refs.c (dr_group_sort_cmp): Likewise.
>> * tree-vrp.c (compare_values_warnv): Likewise.
>> * tree-ssa-loop-ivopts.c (determine_base_object, constant_multiple_of)
>> (get_loop_invariant_expr, add_candidate_1, get_computation_aff_1)
>> (force_expr_to_var_cost): Likewise.
>> * tree-ssa-loop.c (for_each_index): Likewise.
>> * fold-const.h (build_invariant_address, size_int_kind): Take a
>> poly_int64 instead of a HOST_WIDE_INT.
>> * fold-const.c (fold_negate_expr_1, const_binop, const_unop)
>> (fold_convert_const, multiple_of_p, fold_negate_const): Handle
>> POLY_INT_CST.
>> (size_binop_loc): Likewise. Allow int_const_binop_1 to fail.
>> (int_const_binop_2): New function, split out from...
>> (int_const_binop_1): ...here. Handle POLY_INT_CST.
>> (size_int_kind): Take a poly_int64 instead of a HOST_WIDE_INT.
>> * expmed.c (make_tree): Handle CONST_POLY_INT_P.
>> * gimple-ssa-strength-reduction.c (slsr_process_add)
>> (slsr_process_mul): Check for INTEGER_CSTs before using them
>> as candidates.
>> * stor-layout.c (bits_from_bytes): New function.
>> (bit_from_pos): Use it.
>> (layout_type): Likewise. For vectors, multiply the TYPE_SIZE_UNIT
>> by BITS_PER_UNIT to get the TYPE_SIZE.
>> * tree-cfg.c (verify_expr, verify_types_in_gimple_reference): Allow
>> MEM_REF and TARGET_MEM_REF offsets to be a POLY_INT_CST.
>>
>> Index: gcc/tree.h
>> ===================================================================
>> --- gcc/tree.h 2017-10-23 16:52:20.504766418 +0100
>> +++ gcc/tree.h 2017-10-23 17:00:57.784962010 +0100
>> @@ -5132,6 +5195,29 @@ extern bool anon_aggrname_p (const_tree)
>> /* The tree and const_tree overload templates. */
>> namespace wi
>> {
>> + class unextended_tree
>> + {
>> + private:
>> + const_tree m_t;
>> +
>> + public:
>> + unextended_tree () {}
>> + unextended_tree (const_tree t) : m_t (t) {}
>> +
>> + unsigned int get_precision () const;
>> + const HOST_WIDE_INT *get_val () const;
>> + unsigned int get_len () const;
>> + const_tree get_tree () const { return m_t; }
>> + };
>> +
>> + template <>
>> + struct int_traits <unextended_tree>
>> + {
>> + static const enum precision_type precision_type = VAR_PRECISION;
>> + static const bool host_dependent_precision = false;
>> + static const bool is_sign_extended = false;
>> + };
>> +
>> template <int N>
>> class extended_tree
>> {
>> @@ -5139,11 +5225,13 @@ extern bool anon_aggrname_p (const_tree)
>> const_tree m_t;
>>
>> public:
>> + extended_tree () {}
>> extended_tree (const_tree);
>>
>> unsigned int get_precision () const;
>> const HOST_WIDE_INT *get_val () const;
>> unsigned int get_len () const;
>> + const_tree get_tree () const { return m_t; }
>> };
> Similarly I'll defer on part of the patch since the empty ctors play
> into the initialization question that's still on the table.
FWIW, I'd expect these two constructors to go away if we switch
to C++11 in future, rather than become "() = default". We only
really need them because of C++03 restrictions.
> Otherwise this is OK.
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* [008/nnn] poly_int: create_integer_operand
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (6 preceding siblings ...)
2017-10-23 17:02 ` [006/nnn] poly_int: tree constants Richard Sandiford
@ 2017-10-23 17:03 ` Richard Sandiford
2017-11-17 3:40 ` Jeff Law
2017-10-23 17:04 ` [009/nnn] poly_int: TRULY_NOOP_TRUNCATION Richard Sandiford
` (99 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:03 UTC (permalink / raw)
To: gcc-patches
This patch generalises create_integer_operand so that it accepts
poly_int64s rather than HOST_WIDE_INTs.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* optabs.h (expand_operand): Add an int_value field.
(create_expand_operand): Add an int_value parameter and use it
to initialize the new expand_operand field.
(create_integer_operand): Replace with a declaration of a function
that accepts poly_int64s. Move the implementation to...
* optabs.c (create_integer_operand): ...here.
(maybe_legitimize_operand): For EXPAND_INTEGER, check whether the
mode preserves the value of int_value, instead of calling
const_int_operand on the rtx.
Index: gcc/optabs.h
===================================================================
--- gcc/optabs.h 2017-10-23 16:52:20.393664364 +0100
+++ gcc/optabs.h 2017-10-23 17:01:02.532643107 +0100
@@ -60,6 +60,9 @@ struct expand_operand {
/* The value of the operand. */
rtx value;
+
+ /* The value of an EXPAND_INTEGER operand. */
+ poly_int64 int_value;
};
/* Initialize OP with the given fields. Initialise the other fields
@@ -69,13 +72,14 @@ struct expand_operand {
create_expand_operand (struct expand_operand *op,
enum expand_operand_type type,
rtx value, machine_mode mode,
- bool unsigned_p)
+ bool unsigned_p, poly_int64 int_value = 0)
{
op->type = type;
op->unsigned_p = unsigned_p;
op->unused = 0;
op->mode = mode;
op->value = value;
+ op->int_value = int_value;
}
/* Make OP describe an operand that must use rtx X, even if X is volatile. */
@@ -142,18 +146,7 @@ create_address_operand (struct expand_op
create_expand_operand (op, EXPAND_ADDRESS, value, Pmode, false);
}
-/* Make OP describe an input operand that has value INTVAL and that has
- no inherent mode. This function should only be used for operands that
- are always expand-time constants. The backend may request that INTVAL
- be copied into a different kind of rtx, but it must specify the mode
- of that rtx if so. */
-
-static inline void
-create_integer_operand (struct expand_operand *op, HOST_WIDE_INT intval)
-{
- create_expand_operand (op, EXPAND_INTEGER, GEN_INT (intval), VOIDmode, false);
-}
-
+extern void create_integer_operand (struct expand_operand *, poly_int64);
/* Passed to expand_simple_binop and expand_binop to say which options
to try to use if the requested operation can't be open-coded on the
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c 2017-10-23 16:52:20.393664364 +0100
+++ gcc/optabs.c 2017-10-23 17:01:02.531644016 +0100
@@ -6959,6 +6959,20 @@ valid_multiword_target_p (rtx target)
return true;
}
+/* Make OP describe an input operand that has value INTVAL and that has
+ no inherent mode. This function should only be used for operands that
+ are always expand-time constants. The backend may request that INTVAL
+ be copied into a different kind of rtx, but it must specify the mode
+ of that rtx if so. */
+
+void
+create_integer_operand (struct expand_operand *op, poly_int64 intval)
+{
+ create_expand_operand (op, EXPAND_INTEGER,
+ gen_int_mode (intval, MAX_MODE_INT),
+ VOIDmode, false, intval);
+}
+
/* Like maybe_legitimize_operand, but do not change the code of the
current rtx value. */
@@ -7071,7 +7085,9 @@ maybe_legitimize_operand (enum insn_code
case EXPAND_INTEGER:
mode = insn_data[(int) icode].operand[opno].mode;
- if (mode != VOIDmode && const_int_operand (op->value, mode))
+ if (mode != VOIDmode
+ && must_eq (trunc_int_for_mode (op->int_value, mode),
+ op->int_value))
goto input;
break;
}
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [008/nnn] poly_int: create_integer_operand
2017-10-23 17:03 ` [008/nnn] poly_int: create_integer_operand Richard Sandiford
@ 2017-11-17 3:40 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-17 3:40 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:02 AM, Richard Sandiford wrote:
> This patch generalises create_integer_operand so that it accepts
> poly_int64s rather than HOST_WIDE_INTs.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * optabs.h (expand_operand): Add an int_value field.
> (create_expand_operand): Add an int_value parameter and use it
> to initialize the new expand_operand field.
> (create_integer_operand): Replace with a declaration of a function
> that accepts poly_int64s. Move the implementation to...
> * optabs.c (create_integer_operand): ...here.
> (maybe_legitimize_operand): For EXPAND_INTEGER, check whether the
> mode preserves the value of int_value, instead of calling
> const_int_operand on the rtx.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [009/nnn] poly_int: TRULY_NOOP_TRUNCATION
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (7 preceding siblings ...)
2017-10-23 17:03 ` [008/nnn] poly_int: create_integer_operand Richard Sandiford
@ 2017-10-23 17:04 ` Richard Sandiford
2017-11-17 3:40 ` Jeff Law
2017-10-23 17:04 ` [010/nnn] poly_int: REG_OFFSET Richard Sandiford
` (98 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:04 UTC (permalink / raw)
To: gcc-patches
This patch makes TRULY_NOOP_TRUNCATION take the mode sizes as
poly_uint64s instead of unsigned ints. The function bodies
don't need to change.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* target.def (truly_noop_truncation): Take poly_uint64s instead of
unsigned ints. Change default to hook_bool_puint64_puint64_true.
* doc/tm.texi: Regenerate.
* hooks.h (hook_bool_uint_uint_true): Delete.
(hook_bool_puint64_puint64_true): Declare.
* hooks.c (hook_bool_uint_uint_true): Delete.
(hook_bool_puint64_puint64_true): New function.
* config/mips/mips.c (mips_truly_noop_truncation): Take poly_uint64s
instead of unsigned ints.
* config/spu/spu.c (spu_truly_noop_truncation): Likewise.
* config/tilegx/tilegx.c (tilegx_truly_noop_truncation): Likewise.
Index: gcc/target.def
===================================================================
--- gcc/target.def 2017-10-23 17:00:20.920834919 +0100
+++ gcc/target.def 2017-10-23 17:01:04.215112587 +0100
@@ -3155,8 +3155,8 @@ is correct for most machines.\n\
If @code{TARGET_MODES_TIEABLE_P} returns false for a pair of modes,\n\
suboptimal code can result if this hook returns true for the corresponding\n\
mode sizes. Making this hook return false in such cases may improve things.",
- bool, (unsigned int outprec, unsigned int inprec),
- hook_bool_uint_uint_true)
+ bool, (poly_uint64 outprec, poly_uint64 inprec),
+ hook_bool_puint64_puint64_true)
/* If the representation of integral MODE is such that values are
always sign-extended to a wider mode MODE_REP then return
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi 2017-10-23 17:00:20.917834257 +0100
+++ gcc/doc/tm.texi 2017-10-23 17:01:04.214113496 +0100
@@ -10823,7 +10823,7 @@ nevertheless truncate the shift count, y
by overriding it.
@end deftypefn
-@deftypefn {Target Hook} bool TARGET_TRULY_NOOP_TRUNCATION (unsigned int @var{outprec}, unsigned int @var{inprec})
+@deftypefn {Target Hook} bool TARGET_TRULY_NOOP_TRUNCATION (poly_uint64 @var{outprec}, poly_uint64 @var{inprec})
This hook returns true if it is safe to ``convert'' a value of
@var{inprec} bits to one of @var{outprec} bits (where @var{outprec} is
smaller than @var{inprec}) by merely operating on it as if it had only
Index: gcc/hooks.h
===================================================================
--- gcc/hooks.h 2017-10-23 16:52:20.369642299 +0100
+++ gcc/hooks.h 2017-10-23 17:01:04.214113496 +0100
@@ -39,7 +39,7 @@ extern bool hook_bool_const_rtx_insn_con
const rtx_insn *);
extern bool hook_bool_mode_uhwi_false (machine_mode,
unsigned HOST_WIDE_INT);
-extern bool hook_bool_uint_uint_true (unsigned int, unsigned int);
+extern bool hook_bool_puint64_puint64_true (poly_uint64, poly_uint64);
extern bool hook_bool_uint_mode_false (unsigned int, machine_mode);
extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
extern bool hook_bool_tree_false (tree);
Index: gcc/hooks.c
===================================================================
--- gcc/hooks.c 2017-10-23 16:52:20.369642299 +0100
+++ gcc/hooks.c 2017-10-23 17:01:04.214113496 +0100
@@ -133,9 +133,9 @@ hook_bool_mode_uhwi_false (machine_mode,
return false;
}
-/* Generic hook that takes (unsigned int, unsigned int) and returns true. */
+/* Generic hook that takes (poly_uint64, poly_uint64) and returns true. */
bool
-hook_bool_uint_uint_true (unsigned int, unsigned int)
+hook_bool_puint64_puint64_true (poly_uint64, poly_uint64)
{
return true;
}
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c 2017-10-23 17:00:43.528930533 +0100
+++ gcc/config/mips/mips.c 2017-10-23 17:01:04.211116223 +0100
@@ -22322,7 +22322,7 @@ mips_promote_function_mode (const_tree t
/* Implement TARGET_TRULY_NOOP_TRUNCATION. */
static bool
-mips_truly_noop_truncation (unsigned int outprec, unsigned int inprec)
+mips_truly_noop_truncation (poly_uint64 outprec, poly_uint64 inprec)
{
return !TARGET_64BIT || inprec <= 32 || outprec > 32;
}
Index: gcc/config/spu/spu.c
===================================================================
--- gcc/config/spu/spu.c 2017-10-23 17:00:43.548912356 +0100
+++ gcc/config/spu/spu.c 2017-10-23 17:01:04.212115314 +0100
@@ -7182,7 +7182,7 @@ spu_can_change_mode_class (machine_mode
/* Implement TARGET_TRULY_NOOP_TRUNCATION. */
static bool
-spu_truly_noop_truncation (unsigned int outprec, unsigned int inprec)
+spu_truly_noop_truncation (poly_uint64 outprec, poly_uint64 inprec)
{
return inprec <= 32 && outprec <= inprec;
}
Index: gcc/config/tilegx/tilegx.c
===================================================================
--- gcc/config/tilegx/tilegx.c 2017-10-23 17:00:43.551909629 +0100
+++ gcc/config/tilegx/tilegx.c 2017-10-23 17:01:04.213114405 +0100
@@ -5566,7 +5566,7 @@ tilegx_file_end (void)
as sign-extended DI values in registers. */
static bool
-tilegx_truly_noop_truncation (unsigned int outprec, unsigned int inprec)
+tilegx_truly_noop_truncation (poly_uint64 outprec, poly_uint64 inprec)
{
return inprec <= 32 || outprec > 32;
}
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [009/nnn] poly_int: TRULY_NOOP_TRUNCATION
2017-10-23 17:04 ` [009/nnn] poly_int: TRULY_NOOP_TRUNCATION Richard Sandiford
@ 2017-11-17 3:40 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-17 3:40 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:03 AM, Richard Sandiford wrote:
> This patch makes TRULY_NOOP_TRUNCATION take the mode sizes as
> poly_uint64s instead of unsigned ints. The function bodies
> don't need to change.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * target.def (truly_noop_truncation): Take poly_uint64s instead of
> unsigned ints. Change default to hook_bool_puint64_puint64_true.
> * doc/tm.texi: Regenerate.
> * hooks.h (hook_bool_uint_uint_true): Delete.
> (hook_bool_puint64_puint64_true): Declare.
> * hooks.c (hook_bool_uint_uint_true): Delete.
> (hook_bool_puint64_puint64_true): New function.
> * config/mips/mips.c (mips_truly_noop_truncation): Take poly_uint64s
> instead of unsigned ints.
> * config/spu/spu.c (spu_truly_noop_truncation): Likewise.
> * config/tilegx/tilegx.c (tilegx_truly_noop_truncation): Likewise.
OK
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [010/nnn] poly_int: REG_OFFSET
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (8 preceding siblings ...)
2017-10-23 17:04 ` [009/nnn] poly_int: TRULY_NOOP_TRUNCATION Richard Sandiford
@ 2017-10-23 17:04 ` Richard Sandiford
2017-11-17 3:41 ` Jeff Law
2017-10-23 17:05 ` [011/nnn] poly_int: DWARF locations Richard Sandiford
` (97 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:04 UTC (permalink / raw)
To: gcc-patches
This patch changes the type of the reg_attrs offset field
from HOST_WIDE_INT to poly_int64 and updates uses accordingly.
This includes changing reg_attr_hasher::hash to use inchash.
(Doing this has no effect on code generation since the only
use of the hasher is to avoid creating duplicate objects.)
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* rtl.h (reg_attrs::offset): Change from HOST_WIDE_INT to poly_int64.
(gen_rtx_REG_offset): Take the offset as a poly_int64.
* inchash.h (inchash::hash::add_poly_hwi): New function.
* gengtype.c (main): Register poly_int64.
* emit-rtl.c (reg_attr_hasher::hash): Use inchash. Treat the
offset as a poly_int.
(reg_attr_hasher::equal): Use must_eq to compare offsets.
(get_reg_attrs, update_reg_offset, gen_rtx_REG_offset): Take the
offset as a poly_int64.
(set_reg_attrs_from_value): Treat the offset as a poly_int64.
* print-rtl.c (print_poly_int): New function.
(rtx_writer::print_rtx_operand_code_r): Treat REG_OFFSET as
a poly_int.
* var-tracking.c (track_offset_p, get_tracked_reg_offset): New
functions.
(var_reg_set, var_reg_delete_and_set, var_reg_delete): Use them.
(same_variable_part_p, track_loc_p): Take the offset as a poly_int64.
(vt_get_decl_and_offset): Return the offset as a poly_int64.
Enforce track_offset_p for parts of a PARALLEL.
(vt_add_function_parameter): Use const_offset for the final
offset to track. Use get_tracked_reg_offset for the parts
of a PARALLEL.
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h 2017-10-23 17:01:15.119130016 +0100
+++ gcc/rtl.h 2017-10-23 17:01:43.314993320 +0100
@@ -187,7 +187,7 @@ struct GTY(()) mem_attrs
struct GTY((for_user)) reg_attrs {
tree decl; /* decl corresponding to REG. */
- HOST_WIDE_INT offset; /* Offset from start of DECL. */
+ poly_int64 offset; /* Offset from start of DECL. */
};
/* Common union for an element of an rtx. */
@@ -2997,7 +2997,7 @@ subreg_promoted_mode (rtx x)
extern rtvec gen_rtvec_v (int, rtx *);
extern rtvec gen_rtvec_v (int, rtx_insn **);
extern rtx gen_reg_rtx (machine_mode);
-extern rtx gen_rtx_REG_offset (rtx, machine_mode, unsigned int, int);
+extern rtx gen_rtx_REG_offset (rtx, machine_mode, unsigned int, poly_int64);
extern rtx gen_reg_rtx_offset (rtx, machine_mode, int);
extern rtx gen_reg_rtx_and_attrs (rtx);
extern rtx_code_label *gen_label_rtx (void);
Index: gcc/inchash.h
===================================================================
--- gcc/inchash.h 2017-10-23 17:01:29.530765486 +0100
+++ gcc/inchash.h 2017-10-23 17:01:43.314993320 +0100
@@ -63,6 +63,14 @@ hashval_t iterative_hash_hashval_t (hash
val = iterative_hash_host_wide_int (v, val);
}
+ /* Add polynomial value V, treating each element as a HOST_WIDE_INT. */
+ template<unsigned int N, typename T>
+ void add_poly_hwi (const poly_int_pod<N, T> &v)
+ {
+ for (unsigned int i = 0; i < N; ++i)
+ add_hwi (v.coeffs[i]);
+ }
+
/* Add wide_int-based value V. */
template<typename T>
void add_wide_int (const generic_wide_int<T> &x)
Index: gcc/gengtype.c
===================================================================
--- gcc/gengtype.c 2017-10-23 17:01:15.119130016 +0100
+++ gcc/gengtype.c 2017-10-23 17:01:43.313994743 +0100
@@ -5190,6 +5190,7 @@ #define POS_HERE(Call) do { pos.file = t
POS_HERE (do_scalar_typedef ("offset_int", &pos));
POS_HERE (do_scalar_typedef ("widest_int", &pos));
POS_HERE (do_scalar_typedef ("int64_t", &pos));
+ POS_HERE (do_scalar_typedef ("poly_int64", &pos));
POS_HERE (do_scalar_typedef ("uint64_t", &pos));
POS_HERE (do_scalar_typedef ("uint8", &pos));
POS_HERE (do_scalar_typedef ("uintptr_t", &pos));
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c 2017-10-23 17:01:15.119130016 +0100
+++ gcc/emit-rtl.c 2017-10-23 17:01:43.313994743 +0100
@@ -205,7 +205,6 @@ static rtx lookup_const_wide_int (rtx);
#endif
static rtx lookup_const_double (rtx);
static rtx lookup_const_fixed (rtx);
-static reg_attrs *get_reg_attrs (tree, int);
static rtx gen_const_vector (machine_mode, int);
static void copy_rtx_if_shared_1 (rtx *orig);
@@ -424,7 +423,10 @@ reg_attr_hasher::hash (reg_attrs *x)
{
const reg_attrs *const p = x;
- return ((p->offset * 1000) ^ (intptr_t) p->decl);
+ inchash::hash h;
+ h.add_ptr (p->decl);
+ h.add_poly_hwi (p->offset);
+ return h.end ();
}
/* Returns nonzero if the value represented by X is the same as that given by
@@ -436,19 +438,19 @@ reg_attr_hasher::equal (reg_attrs *x, re
const reg_attrs *const p = x;
const reg_attrs *const q = y;
- return (p->decl == q->decl && p->offset == q->offset);
+ return (p->decl == q->decl && must_eq (p->offset, q->offset));
}
/* Allocate a new reg_attrs structure and insert it into the hash table if
one identical to it is not already in the table. We are doing this for
MEM of mode MODE. */
static reg_attrs *
-get_reg_attrs (tree decl, int offset)
+get_reg_attrs (tree decl, poly_int64 offset)
{
reg_attrs attrs;
/* If everything is the default, we can just return zero. */
- if (decl == 0 && offset == 0)
+ if (decl == 0 && known_zero (offset))
return 0;
attrs.decl = decl;
@@ -1241,10 +1243,10 @@ reg_is_parm_p (rtx reg)
to the REG_OFFSET. */
static void
-update_reg_offset (rtx new_rtx, rtx reg, int offset)
+update_reg_offset (rtx new_rtx, rtx reg, poly_int64 offset)
{
REG_ATTRS (new_rtx) = get_reg_attrs (REG_EXPR (reg),
- REG_OFFSET (reg) + offset);
+ REG_OFFSET (reg) + offset);
}
/* Generate a register with same attributes as REG, but with OFFSET
@@ -1252,7 +1254,7 @@ update_reg_offset (rtx new_rtx, rtx reg,
rtx
gen_rtx_REG_offset (rtx reg, machine_mode mode, unsigned int regno,
- int offset)
+ poly_int64 offset)
{
rtx new_rtx = gen_rtx_REG (mode, regno);
@@ -1288,7 +1290,7 @@ adjust_reg_mode (rtx reg, machine_mode m
void
set_reg_attrs_from_value (rtx reg, rtx x)
{
- int offset;
+ poly_int64 offset;
bool can_be_reg_pointer = true;
/* Don't call mark_reg_pointer for incompatible pointer sign
Index: gcc/print-rtl.c
===================================================================
--- gcc/print-rtl.c 2017-10-23 17:01:15.119130016 +0100
+++ gcc/print-rtl.c 2017-10-23 17:01:43.314993320 +0100
@@ -178,6 +178,23 @@ print_mem_expr (FILE *outfile, const_tre
fputc (' ', outfile);
print_generic_expr (outfile, CONST_CAST_TREE (expr), dump_flags);
}
+
+/* Print X to FILE. */
+
+static void
+print_poly_int (FILE *file, poly_int64 x)
+{
+ HOST_WIDE_INT const_x;
+ if (x.is_constant (&const_x))
+ fprintf (file, HOST_WIDE_INT_PRINT_DEC, const_x);
+ else
+ {
+ fprintf (file, "[" HOST_WIDE_INT_PRINT_DEC, x.coeffs[0]);
+ for (int i = 1; i < NUM_POLY_INT_COEFFS; ++i)
+ fprintf (file, ", " HOST_WIDE_INT_PRINT_DEC, x.coeffs[i]);
+ fprintf (file, "]");
+ }
+}
#endif
/* Subroutine of print_rtx_operand for handling code '0'.
@@ -499,9 +516,11 @@ rtx_writer::print_rtx_operand_code_r (co
if (REG_EXPR (in_rtx))
print_mem_expr (m_outfile, REG_EXPR (in_rtx));
- if (REG_OFFSET (in_rtx))
- fprintf (m_outfile, "+" HOST_WIDE_INT_PRINT_DEC,
- REG_OFFSET (in_rtx));
+ if (maybe_nonzero (REG_OFFSET (in_rtx)))
+ {
+ fprintf (m_outfile, "+");
+ print_poly_int (m_outfile, REG_OFFSET (in_rtx));
+ }
fputs (" ]", m_outfile);
}
if (regno != ORIGINAL_REGNO (in_rtx))
Index: gcc/var-tracking.c
===================================================================
--- gcc/var-tracking.c 2017-10-23 17:01:15.119130016 +0100
+++ gcc/var-tracking.c 2017-10-23 17:01:43.315991896 +0100
@@ -673,7 +673,6 @@ static bool dataflow_set_different (data
static void dataflow_set_destroy (dataflow_set *);
static bool track_expr_p (tree, bool);
-static bool same_variable_part_p (rtx, tree, HOST_WIDE_INT);
static void add_uses_1 (rtx *, void *);
static void add_stores (rtx, const_rtx, void *);
static bool compute_bb_dataflow (basic_block);
@@ -704,7 +703,6 @@ static void delete_variable_part (datafl
static void emit_notes_in_bb (basic_block, dataflow_set *);
static void vt_emit_notes (void);
-static bool vt_get_decl_and_offset (rtx, tree *, HOST_WIDE_INT *);
static void vt_add_function_parameters (void);
static bool vt_initialize (void);
static void vt_finalize (void);
@@ -1850,6 +1848,32 @@ var_reg_decl_set (dataflow_set *set, rtx
set_variable_part (set, loc, dv, offset, initialized, set_src, iopt);
}
+/* Return true if we should track a location that is OFFSET bytes from
+ a variable. Store the constant offset in *OFFSET_OUT if so. */
+
+static bool
+track_offset_p (poly_int64 offset, HOST_WIDE_INT *offset_out)
+{
+ HOST_WIDE_INT const_offset;
+ if (!offset.is_constant (&const_offset)
+ || !IN_RANGE (const_offset, 0, MAX_VAR_PARTS - 1))
+ return false;
+ *offset_out = const_offset;
+ return true;
+}
+
+/* Return the offset of a register that track_offset_p says we
+ should track. */
+
+static HOST_WIDE_INT
+get_tracked_reg_offset (rtx loc)
+{
+ HOST_WIDE_INT offset;
+ if (!track_offset_p (REG_OFFSET (loc), &offset))
+ gcc_unreachable ();
+ return offset;
+}
+
/* Set the register to contain REG_EXPR (LOC), REG_OFFSET (LOC). */
static void
@@ -1857,7 +1881,7 @@ var_reg_set (dataflow_set *set, rtx loc,
rtx set_src)
{
tree decl = REG_EXPR (loc);
- HOST_WIDE_INT offset = REG_OFFSET (loc);
+ HOST_WIDE_INT offset = get_tracked_reg_offset (loc);
var_reg_decl_set (set, loc, initialized,
dv_from_decl (decl), offset, set_src, INSERT);
@@ -1903,7 +1927,7 @@ var_reg_delete_and_set (dataflow_set *se
enum var_init_status initialized, rtx set_src)
{
tree decl = REG_EXPR (loc);
- HOST_WIDE_INT offset = REG_OFFSET (loc);
+ HOST_WIDE_INT offset = get_tracked_reg_offset (loc);
attrs *node, *next;
attrs **nextp;
@@ -1944,10 +1968,10 @@ var_reg_delete (dataflow_set *set, rtx l
attrs **nextp = &set->regs[REGNO (loc)];
attrs *node, *next;
- if (clobber)
+ HOST_WIDE_INT offset;
+ if (clobber && track_offset_p (REG_OFFSET (loc), &offset))
{
tree decl = REG_EXPR (loc);
- HOST_WIDE_INT offset = REG_OFFSET (loc);
decl = var_debug_decl (decl);
@@ -5245,10 +5269,10 @@ track_expr_p (tree expr, bool need_rtl)
EXPR+OFFSET. */
static bool
-same_variable_part_p (rtx loc, tree expr, HOST_WIDE_INT offset)
+same_variable_part_p (rtx loc, tree expr, poly_int64 offset)
{
tree expr2;
- HOST_WIDE_INT offset2;
+ poly_int64 offset2;
if (! DECL_P (expr))
return false;
@@ -5272,7 +5296,7 @@ same_variable_part_p (rtx loc, tree expr
expr = var_debug_decl (expr);
expr2 = var_debug_decl (expr2);
- return (expr == expr2 && offset == offset2);
+ return (expr == expr2 && must_eq (offset, offset2));
}
/* LOC is a REG or MEM that we would like to track if possible.
@@ -5286,7 +5310,7 @@ same_variable_part_p (rtx loc, tree expr
from EXPR in *OFFSET_OUT (if nonnull). */
static bool
-track_loc_p (rtx loc, tree expr, HOST_WIDE_INT offset, bool store_reg_p,
+track_loc_p (rtx loc, tree expr, poly_int64 offset, bool store_reg_p,
machine_mode *mode_out, HOST_WIDE_INT *offset_out)
{
machine_mode mode;
@@ -5320,19 +5344,20 @@ track_loc_p (rtx loc, tree expr, HOST_WI
|| (store_reg_p
&& !COMPLEX_MODE_P (DECL_MODE (expr))
&& hard_regno_nregs (REGNO (loc), DECL_MODE (expr)) == 1))
- && offset + byte_lowpart_offset (DECL_MODE (expr), mode) == 0)
+ && known_zero (offset + byte_lowpart_offset (DECL_MODE (expr), mode)))
{
mode = DECL_MODE (expr);
offset = 0;
}
- if (offset < 0 || offset >= MAX_VAR_PARTS)
+ HOST_WIDE_INT const_offset;
+ if (!track_offset_p (offset, &const_offset))
return false;
if (mode_out)
*mode_out = mode;
if (offset_out)
- *offset_out = offset;
+ *offset_out = const_offset;
return true;
}
@@ -9544,7 +9569,7 @@ vt_emit_notes (void)
assign declaration to *DECLP and offset to *OFFSETP, and return true. */
static bool
-vt_get_decl_and_offset (rtx rtl, tree *declp, HOST_WIDE_INT *offsetp)
+vt_get_decl_and_offset (rtx rtl, tree *declp, poly_int64 *offsetp)
{
if (REG_P (rtl))
{
@@ -9570,8 +9595,10 @@ vt_get_decl_and_offset (rtx rtl, tree *d
decl = REG_EXPR (reg);
if (REG_EXPR (reg) != decl)
break;
- if (REG_OFFSET (reg) < offset)
- offset = REG_OFFSET (reg);
+ HOST_WIDE_INT this_offset;
+ if (!track_offset_p (REG_OFFSET (reg), &this_offset))
+ break;
+ offset = MIN (offset, this_offset);
}
if (i == len)
@@ -9615,7 +9642,7 @@ vt_add_function_parameter (tree parm)
rtx incoming = DECL_INCOMING_RTL (parm);
tree decl;
machine_mode mode;
- HOST_WIDE_INT offset;
+ poly_int64 offset;
dataflow_set *out;
decl_or_value dv;
@@ -9738,7 +9765,8 @@ vt_add_function_parameter (tree parm)
offset = 0;
}
- if (!track_loc_p (incoming, parm, offset, false, &mode, &offset))
+ HOST_WIDE_INT const_offset;
+ if (!track_loc_p (incoming, parm, offset, false, &mode, &const_offset))
return;
out = &VTI (ENTRY_BLOCK_PTR_FOR_FN (cfun))->out;
@@ -9759,7 +9787,7 @@ vt_add_function_parameter (tree parm)
arguments passed by invisible reference aren't dealt with
above: incoming-rtl will have Pmode rather than the
expected mode for the type. */
- if (offset)
+ if (const_offset)
return;
lowpart = var_lowpart (mode, incoming);
@@ -9774,7 +9802,7 @@ vt_add_function_parameter (tree parm)
if (val)
{
preserve_value (val);
- set_variable_part (out, val->val_rtx, dv, offset,
+ set_variable_part (out, val->val_rtx, dv, const_offset,
VAR_INIT_STATUS_INITIALIZED, NULL, INSERT);
dv = dv_from_value (val->val_rtx);
}
@@ -9795,9 +9823,9 @@ vt_add_function_parameter (tree parm)
{
incoming = var_lowpart (mode, incoming);
gcc_assert (REGNO (incoming) < FIRST_PSEUDO_REGISTER);
- attrs_list_insert (&out->regs[REGNO (incoming)], dv, offset,
+ attrs_list_insert (&out->regs[REGNO (incoming)], dv, const_offset,
incoming);
- set_variable_part (out, incoming, dv, offset,
+ set_variable_part (out, incoming, dv, const_offset,
VAR_INIT_STATUS_INITIALIZED, NULL, INSERT);
if (dv_is_value_p (dv))
{
@@ -9828,17 +9856,19 @@ vt_add_function_parameter (tree parm)
for (i = 0; i < XVECLEN (incoming, 0); i++)
{
rtx reg = XEXP (XVECEXP (incoming, 0, i), 0);
- offset = REG_OFFSET (reg);
+ /* vt_get_decl_and_offset has already checked that the offset
+ is a valid variable part. */
+ const_offset = get_tracked_reg_offset (reg);
gcc_assert (REGNO (reg) < FIRST_PSEUDO_REGISTER);
- attrs_list_insert (&out->regs[REGNO (reg)], dv, offset, reg);
- set_variable_part (out, reg, dv, offset,
+ attrs_list_insert (&out->regs[REGNO (reg)], dv, const_offset, reg);
+ set_variable_part (out, reg, dv, const_offset,
VAR_INIT_STATUS_INITIALIZED, NULL, INSERT);
}
}
else if (MEM_P (incoming))
{
incoming = var_lowpart (mode, incoming);
- set_variable_part (out, incoming, dv, offset,
+ set_variable_part (out, incoming, dv, const_offset,
VAR_INIT_STATUS_INITIALIZED, NULL, INSERT);
}
}
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [010/nnn] poly_int: REG_OFFSET
2017-10-23 17:04 ` [010/nnn] poly_int: REG_OFFSET Richard Sandiford
@ 2017-11-17 3:41 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-17 3:41 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:04 AM, Richard Sandiford wrote:
> This patch changes the type of the reg_attrs offset field
> from HOST_WIDE_INT to poly_int64 and updates uses accordingly.
> This includes changing reg_attr_hasher::hash to use inchash.
> (Doing this has no effect on code generation since the only
> use of the hasher is to avoid creating duplicate objects.)
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * rtl.h (reg_attrs::offset): Change from HOST_WIDE_INT to poly_int64.
> (gen_rtx_REG_offset): Take the offset as a poly_int64.
> * inchash.h (inchash::hash::add_poly_hwi): New function.
> * gengtype.c (main): Register poly_int64.
> * emit-rtl.c (reg_attr_hasher::hash): Use inchash. Treat the
> offset as a poly_int.
> (reg_attr_hasher::equal): Use must_eq to compare offsets.
> (get_reg_attrs, update_reg_offset, gen_rtx_REG_offset): Take the
> offset as a poly_int64.
> (set_reg_attrs_from_value): Treat the offset as a poly_int64.
> * print-rtl.c (print_poly_int): New function.
> (rtx_writer::print_rtx_operand_code_r): Treat REG_OFFSET as
> a poly_int.
> * var-tracking.c (track_offset_p, get_tracked_reg_offset): New
> functions.
> (var_reg_set, var_reg_delete_and_set, var_reg_delete): Use them.
> (same_variable_part_p, track_loc_p): Take the offset as a poly_int64.
> (vt_get_decl_and_offset): Return the offset as a poly_int64.
> Enforce track_offset_p for parts of a PARALLEL.
> (vt_add_function_parameter): Use const_offset for the final
> offset to track. Use get_tracked_reg_offset for the parts
> of a PARALLEL.
>
OK
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [011/nnn] poly_int: DWARF locations
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (9 preceding siblings ...)
2017-10-23 17:04 ` [010/nnn] poly_int: REG_OFFSET Richard Sandiford
@ 2017-10-23 17:05 ` Richard Sandiford
2017-11-17 17:40 ` Jeff Law
2017-10-23 17:05 ` [012/nnn] poly_int: fold_ctor_reference Richard Sandiford
` (96 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:05 UTC (permalink / raw)
To: gcc-patches
This patch adds support for DWARF location expressions
that involve polynomial offsets. It adds a target hook that
says how the runtime invariants used in the offsets should be
represented in DWARF. SVE vectors have to be a multiple of
128 bits in size, so the GCC port uses the number of 128-bit
blocks minus one as the runtime invariant. However, in DWARF,
the vector length is exposed via a pseudo "VG" register that
holds the number of 64-bit elements in a vector. Thus:
indeterminate 1 == (VG / 2) - 1
The hook needs to be general enough to express this.
Note that in most cases the division and subtraction fold
away into surrounding expressions.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* target.def (dwarf_poly_indeterminate_value): New hook.
* targhooks.h (default_dwarf_poly_indeterminate_value): Declare.
* targhooks.c (default_dwarf_poly_indeterminate_value): New function.
* doc/tm.texi.in (TARGET_DWARF_POLY_INDETERMINATE_VALUE): Document.
* doc/tm.texi: Regenerate.
* dwarf2out.h (build_cfa_loc, build_cfa_aligned_loc): Take the
offset as a poly_int64.
* dwarf2out.c (new_reg_loc_descr): Move later in file. Take the
offset as a poly_int64.
(loc_descr_plus_const, loc_list_plus_const, build_cfa_aligned_loc):
Take the offset as a poly_int64.
(build_cfa_loc): Likewise. Use loc_descr_plus_const.
(frame_pointer_fb_offset): Change to a poly_int64.
(int_loc_descriptor): Take the offset as a poly_int64. Use
targetm.dwarf_poly_indeterminate_value for polynomial offsets.
(based_loc_descr): Take the offset as a poly_int64.
Use strip_offset_and_add to handle (plus X (const)).
Use new_reg_loc_descr instead of an open-coded version of the
previous implementation.
(mem_loc_descriptor): Handle CONST_POLY_INT.
(compute_frame_pointer_to_fb_displacement): Take the offset as a
poly_int64. Use strip_offset_and_add to handle (plus X (const)).
Index: gcc/target.def
===================================================================
--- gcc/target.def 2017-10-23 17:01:04.215112587 +0100
+++ gcc/target.def 2017-10-23 17:01:45.057509456 +0100
@@ -4124,6 +4124,21 @@ the CFI label attached to the insn, @var
the insn and @var{index} is @code{UNSPEC_INDEX} or @code{UNSPECV_INDEX}.",
void, (const char *label, rtx pattern, int index), NULL)
+DEFHOOK
+(dwarf_poly_indeterminate_value,
+ "Express the value of @code{poly_int} indeterminate @var{i} as a DWARF\n\
+expression, with @var{i} counting from 1. Return the number of a DWARF\n\
+register @var{R} and set @samp{*@var{factor}} and @samp{*@var{offset}} such\n\
+that the value of the indeterminate is:\n\
+@smallexample\n\
+value_of(@var{R}) / @var{factor} - @var{offset}\n\
+@end smallexample\n\
+\n\
+A target only needs to define this hook if it sets\n\
+@samp{NUM_POLY_INT_COEFFS} to a value greater than 1.",
+ unsigned int, (unsigned int i, unsigned int *factor, int *offset),
+ default_dwarf_poly_indeterminate_value)
+
/* ??? Documenting this hook requires a GFDL license grant. */
DEFHOOK_UNDOC
(stdarg_optimize_hook,
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h 2017-10-23 17:00:20.920834919 +0100
+++ gcc/targhooks.h 2017-10-23 17:01:45.057509456 +0100
@@ -234,6 +234,9 @@ extern int default_label_align_max_skip
extern int default_jump_align_max_skip (rtx_insn *);
extern section * default_function_section(tree decl, enum node_frequency freq,
bool startup, bool exit);
+extern unsigned int default_dwarf_poly_indeterminate_value (unsigned int,
+ unsigned int *,
+ int *);
extern machine_mode default_dwarf_frame_reg_mode (int);
extern fixed_size_mode default_get_reg_raw_mode (int);
extern bool default_keep_leaf_when_profiled ();
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c 2017-10-23 17:00:49.664349224 +0100
+++ gcc/targhooks.c 2017-10-23 17:01:45.057509456 +0100
@@ -1838,6 +1838,15 @@ default_debug_unwind_info (void)
return UI_NONE;
}
+/* Targets that set NUM_POLY_INT_COEFFS to something greater than 1
+ must define this hook. */
+
+unsigned int
+default_dwarf_poly_indeterminate_value (unsigned int, unsigned int *, int *)
+{
+ gcc_unreachable ();
+}
+
/* Determine the correct mode for a Dwarf frame register that represents
register REGNO. */
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in 2017-10-23 17:00:20.918834478 +0100
+++ gcc/doc/tm.texi.in 2017-10-23 17:01:45.053515150 +0100
@@ -2553,6 +2553,8 @@ terminate the stack backtrace. New port
@hook TARGET_DWARF_HANDLE_FRAME_UNSPEC
+@hook TARGET_DWARF_POLY_INDETERMINATE_VALUE
+
@defmac INCOMING_FRAME_SP_OFFSET
A C expression whose value is an integer giving the offset, in bytes,
from the value of the stack pointer register to the top of the stack
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi 2017-10-23 17:01:04.214113496 +0100
+++ gcc/doc/tm.texi 2017-10-23 17:01:45.052516573 +0100
@@ -3133,6 +3133,19 @@ the CFI label attached to the insn, @var
the insn and @var{index} is @code{UNSPEC_INDEX} or @code{UNSPECV_INDEX}.
@end deftypefn
+@deftypefn {Target Hook} {unsigned int} TARGET_DWARF_POLY_INDETERMINATE_VALUE (unsigned int @var{i}, unsigned int *@var{factor}, int *@var{offset})
+Express the value of @code{poly_int} indeterminate @var{i} as a DWARF
+expression, with @var{i} counting from 1. Return the number of a DWARF
+register @var{R} and set @samp{*@var{factor}} and @samp{*@var{offset}} such
+that the value of the indeterminate is:
+@smallexample
+value_of(@var{R}) / @var{factor} - @var{offset}
+@end smallexample
+
+A target only needs to define this hook if it sets
+@samp{NUM_POLY_INT_COEFFS} to a value greater than 1.
+@end deftypefn
+
@defmac INCOMING_FRAME_SP_OFFSET
A C expression whose value is an integer giving the offset, in bytes,
from the value of the stack pointer register to the top of the stack
Index: gcc/dwarf2out.h
===================================================================
--- gcc/dwarf2out.h 2017-10-23 16:52:20.259541165 +0100
+++ gcc/dwarf2out.h 2017-10-23 17:01:45.056510879 +0100
@@ -267,9 +267,9 @@ struct GTY(()) dw_discr_list_node {
/* Interface from dwarf2out.c to dwarf2cfi.c. */
extern struct dw_loc_descr_node *build_cfa_loc
- (dw_cfa_location *, HOST_WIDE_INT);
+ (dw_cfa_location *, poly_int64);
extern struct dw_loc_descr_node *build_cfa_aligned_loc
- (dw_cfa_location *, HOST_WIDE_INT offset, HOST_WIDE_INT alignment);
+ (dw_cfa_location *, poly_int64, HOST_WIDE_INT);
extern struct dw_loc_descr_node *mem_loc_descriptor
(rtx, machine_mode mode, machine_mode mem_mode,
enum var_init_status);
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c 2017-10-23 17:00:54.439005782 +0100
+++ gcc/dwarf2out.c 2017-10-23 17:01:45.056510879 +0100
@@ -1307,7 +1307,7 @@ typedef struct GTY(()) dw_loc_list_struc
bool force;
} dw_loc_list_node;
-static dw_loc_descr_ref int_loc_descriptor (HOST_WIDE_INT);
+static dw_loc_descr_ref int_loc_descriptor (poly_int64);
static dw_loc_descr_ref uint_loc_descriptor (unsigned HOST_WIDE_INT);
/* Convert a DWARF stack opcode into its string name. */
@@ -1344,19 +1344,6 @@ new_loc_descr (enum dwarf_location_atom
return descr;
}
-/* Return a pointer to a newly allocated location description for
- REG and OFFSET. */
-
-static inline dw_loc_descr_ref
-new_reg_loc_descr (unsigned int reg, unsigned HOST_WIDE_INT offset)
-{
- if (reg <= 31)
- return new_loc_descr ((enum dwarf_location_atom) (DW_OP_breg0 + reg),
- offset, 0);
- else
- return new_loc_descr (DW_OP_bregx, reg, offset);
-}
-
/* Add a location description term to a location description expression. */
static inline void
@@ -1489,23 +1476,31 @@ loc_descr_equal_p (dw_loc_descr_ref a, d
}
-/* Add a constant OFFSET to a location expression. */
+/* Add a constant POLY_OFFSET to a location expression. */
static void
-loc_descr_plus_const (dw_loc_descr_ref *list_head, HOST_WIDE_INT offset)
+loc_descr_plus_const (dw_loc_descr_ref *list_head, poly_int64 poly_offset)
{
dw_loc_descr_ref loc;
HOST_WIDE_INT *p;
gcc_assert (*list_head != NULL);
- if (!offset)
+ if (known_zero (poly_offset))
return;
/* Find the end of the chain. */
for (loc = *list_head; loc->dw_loc_next != NULL; loc = loc->dw_loc_next)
;
+ HOST_WIDE_INT offset;
+ if (!poly_offset.is_constant (&offset))
+ {
+ loc->dw_loc_next = int_loc_descriptor (poly_offset);
+ add_loc_descr (&loc->dw_loc_next, new_loc_descr (DW_OP_plus, 0, 0));
+ return;
+ }
+
p = NULL;
if (loc->dw_loc_opc == DW_OP_fbreg
|| (loc->dw_loc_opc >= DW_OP_breg0 && loc->dw_loc_opc <= DW_OP_breg31))
@@ -1531,10 +1526,33 @@ loc_descr_plus_const (dw_loc_descr_ref *
}
}
+/* Return a pointer to a newly allocated location description for
+ REG and OFFSET. */
+
+static inline dw_loc_descr_ref
+new_reg_loc_descr (unsigned int reg, poly_int64 offset)
+{
+ HOST_WIDE_INT const_offset;
+ if (offset.is_constant (&const_offset))
+ {
+ if (reg <= 31)
+ return new_loc_descr ((enum dwarf_location_atom) (DW_OP_breg0 + reg),
+ const_offset, 0);
+ else
+ return new_loc_descr (DW_OP_bregx, reg, const_offset);
+ }
+ else
+ {
+ dw_loc_descr_ref ret = new_reg_loc_descr (reg, 0);
+ loc_descr_plus_const (&ret, offset);
+ return ret;
+ }
+}
+
/* Add a constant OFFSET to a location list. */
static void
-loc_list_plus_const (dw_loc_list_ref list_head, HOST_WIDE_INT offset)
+loc_list_plus_const (dw_loc_list_ref list_head, poly_int64 offset)
{
dw_loc_list_ref d;
for (d = list_head; d != NULL; d = d->dw_loc_next)
@@ -2614,7 +2632,7 @@ output_loc_sequence_raw (dw_loc_descr_re
expression. */
struct dw_loc_descr_node *
-build_cfa_loc (dw_cfa_location *cfa, HOST_WIDE_INT offset)
+build_cfa_loc (dw_cfa_location *cfa, poly_int64 offset)
{
struct dw_loc_descr_node *head, *tmp;
@@ -2627,11 +2645,7 @@ build_cfa_loc (dw_cfa_location *cfa, HOS
head->dw_loc_oprnd1.val_entry = NULL;
tmp = new_loc_descr (DW_OP_deref, 0, 0);
add_loc_descr (&head, tmp);
- if (offset != 0)
- {
- tmp = new_loc_descr (DW_OP_plus_uconst, offset, 0);
- add_loc_descr (&head, tmp);
- }
+ loc_descr_plus_const (&head, offset);
}
else
head = new_reg_loc_descr (cfa->reg, offset);
@@ -2645,7 +2659,7 @@ build_cfa_loc (dw_cfa_location *cfa, HOS
struct dw_loc_descr_node *
build_cfa_aligned_loc (dw_cfa_location *cfa,
- HOST_WIDE_INT offset, HOST_WIDE_INT alignment)
+ poly_int64 offset, HOST_WIDE_INT alignment)
{
struct dw_loc_descr_node *head;
unsigned int dwarf_fp
@@ -3331,7 +3345,7 @@ static GTY(()) vec<tree, va_gc> *generic
/* Offset from the "steady-state frame pointer" to the frame base,
within the current function. */
-static HOST_WIDE_INT frame_pointer_fb_offset;
+static poly_int64 frame_pointer_fb_offset;
static bool frame_pointer_fb_offset_valid;
static vec<dw_die_ref> base_types;
@@ -3505,7 +3519,7 @@ static dw_loc_descr_ref one_reg_loc_desc
enum var_init_status);
static dw_loc_descr_ref multiple_reg_loc_descriptor (rtx, rtx,
enum var_init_status);
-static dw_loc_descr_ref based_loc_descr (rtx, HOST_WIDE_INT,
+static dw_loc_descr_ref based_loc_descr (rtx, poly_int64,
enum var_init_status);
static int is_based_loc (const_rtx);
static bool resolve_one_addr (rtx *);
@@ -13202,13 +13216,58 @@ int_shift_loc_descriptor (HOST_WIDE_INT
return ret;
}
-/* Return a location descriptor that designates a constant. */
+/* Return a location descriptor that designates constant POLY_I. */
static dw_loc_descr_ref
-int_loc_descriptor (HOST_WIDE_INT i)
+int_loc_descriptor (poly_int64 poly_i)
{
enum dwarf_location_atom op;
+ HOST_WIDE_INT i;
+ if (!poly_i.is_constant (&i))
+ {
+ /* Create location descriptions for the non-constant part and
+ add any constant offset at the end. */
+ dw_loc_descr_ref ret = NULL;
+ HOST_WIDE_INT constant = poly_i.coeffs[0];
+ for (unsigned int j = 1; j < NUM_POLY_INT_COEFFS; ++j)
+ {
+ HOST_WIDE_INT coeff = poly_i.coeffs[j];
+ if (coeff != 0)
+ {
+ dw_loc_descr_ref start = ret;
+ unsigned int factor;
+ int bias;
+ unsigned int regno = targetm.dwarf_poly_indeterminate_value
+ (j, &factor, &bias);
+
+ /* Add COEFF * ((REGNO / FACTOR) - BIAS) to the value:
+ add COEFF * (REGNO / FACTOR) now and subtract
+ COEFF * BIAS from the final constant part. */
+ constant -= coeff * bias;
+ add_loc_descr (&ret, new_reg_loc_descr (regno, 0));
+ if (coeff % factor == 0)
+ coeff /= factor;
+ else
+ {
+ int amount = exact_log2 (factor);
+ gcc_assert (amount >= 0);
+ add_loc_descr (&ret, int_loc_descriptor (amount));
+ add_loc_descr (&ret, new_loc_descr (DW_OP_shr, 0, 0));
+ }
+ if (coeff != 1)
+ {
+ add_loc_descr (&ret, int_loc_descriptor (coeff));
+ add_loc_descr (&ret, new_loc_descr (DW_OP_mul, 0, 0));
+ }
+ if (start)
+ add_loc_descr (&ret, new_loc_descr (DW_OP_plus, 0, 0));
+ }
+ }
+ loc_descr_plus_const (&ret, constant);
+ return ret;
+ }
+
/* Pick the smallest representation of a constant, rather than just
defaulting to the LEB encoding. */
if (i >= 0)
@@ -13574,7 +13633,7 @@ address_of_int_loc_descriptor (int size,
/* Return a location descriptor that designates a base+offset location. */
static dw_loc_descr_ref
-based_loc_descr (rtx reg, HOST_WIDE_INT offset,
+based_loc_descr (rtx reg, poly_int64 offset,
enum var_init_status initialized)
{
unsigned int regno;
@@ -13593,11 +13652,7 @@ based_loc_descr (rtx reg, HOST_WIDE_INT
if (elim != reg)
{
- if (GET_CODE (elim) == PLUS)
- {
- offset += INTVAL (XEXP (elim, 1));
- elim = XEXP (elim, 0);
- }
+ elim = strip_offset_and_add (elim, &offset);
gcc_assert ((SUPPORTS_STACK_ALIGNMENT
&& (elim == hard_frame_pointer_rtx
|| elim == stack_pointer_rtx))
@@ -13621,7 +13676,15 @@ based_loc_descr (rtx reg, HOST_WIDE_INT
gcc_assert (frame_pointer_fb_offset_valid);
offset += frame_pointer_fb_offset;
- return new_loc_descr (DW_OP_fbreg, offset, 0);
+ HOST_WIDE_INT const_offset;
+ if (offset.is_constant (&const_offset))
+ return new_loc_descr (DW_OP_fbreg, const_offset, 0);
+ else
+ {
+ dw_loc_descr_ref ret = new_loc_descr (DW_OP_fbreg, 0, 0);
+ loc_descr_plus_const (&ret, offset);
+ return ret;
+ }
}
}
@@ -13636,8 +13699,10 @@ based_loc_descr (rtx reg, HOST_WIDE_INT
#endif
regno = DWARF_FRAME_REGNUM (regno);
+ HOST_WIDE_INT const_offset;
if (!optimize && fde
- && (fde->drap_reg == regno || fde->vdrap_reg == regno))
+ && (fde->drap_reg == regno || fde->vdrap_reg == regno)
+ && offset.is_constant (&const_offset))
{
/* Use cfa+offset to represent the location of arguments passed
on the stack when drap is used to align stack.
@@ -13645,14 +13710,10 @@ based_loc_descr (rtx reg, HOST_WIDE_INT
is supposed to track where the arguments live and the register
used as vdrap or drap in some spot might be used for something
else in other part of the routine. */
- return new_loc_descr (DW_OP_fbreg, offset, 0);
+ return new_loc_descr (DW_OP_fbreg, const_offset, 0);
}
- if (regno <= 31)
- result = new_loc_descr ((enum dwarf_location_atom) (DW_OP_breg0 + regno),
- offset, 0);
- else
- result = new_loc_descr (DW_OP_bregx, regno, offset);
+ result = new_reg_loc_descr (regno, offset);
if (initialized == VAR_INIT_STATUS_UNINITIALIZED)
add_loc_descr (&result, new_loc_descr (DW_OP_GNU_uninit, 0, 0));
@@ -14648,6 +14709,7 @@ mem_loc_descriptor (rtx rtl, machine_mod
enum dwarf_location_atom op;
dw_loc_descr_ref op0, op1;
rtx inner = NULL_RTX;
+ poly_int64 offset;
if (mode == VOIDmode)
mode = GET_MODE (rtl);
@@ -15328,6 +15390,10 @@ mem_loc_descriptor (rtx rtl, machine_mod
}
break;
+ case CONST_POLY_INT:
+ mem_loc_result = int_loc_descriptor (rtx_to_poly_int64 (rtl));
+ break;
+
case EQ:
mem_loc_result = scompare_loc_descriptor (DW_OP_eq, rtl, mem_mode);
break;
@@ -19637,7 +19703,7 @@ convert_cfa_to_fb_loc_list (HOST_WIDE_IN
before the latter is negated. */
static void
-compute_frame_pointer_to_fb_displacement (HOST_WIDE_INT offset)
+compute_frame_pointer_to_fb_displacement (poly_int64 offset)
{
rtx reg, elim;
@@ -19652,11 +19718,7 @@ compute_frame_pointer_to_fb_displacement
elim = (ira_use_lra_p
? lra_eliminate_regs (reg, VOIDmode, NULL_RTX)
: eliminate_regs (reg, VOIDmode, NULL_RTX));
- if (GET_CODE (elim) == PLUS)
- {
- offset += INTVAL (XEXP (elim, 1));
- elim = XEXP (elim, 0);
- }
+ elim = strip_offset_and_add (elim, &offset);
frame_pointer_fb_offset = -offset;
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [011/nnn] poly_int: DWARF locations
2017-10-23 17:05 ` [011/nnn] poly_int: DWARF locations Richard Sandiford
@ 2017-11-17 17:40 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-17 17:40 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:04 AM, Richard Sandiford wrote:
> This patch adds support for DWARF location expressions
> that involve polynomial offsets. It adds a target hook that
> says how the runtime invariants used in the offsets should be
> represented in DWARF. SVE vectors have to be a multiple of
> 128 bits in size, so the GCC port uses the number of 128-bit
> blocks minus one as the runtime invariant. However, in DWARF,
> the vector length is exposed via a pseudo "VG" register that
> holds the number of 64-bit elements in a vector. Thus:
>
> indeterminate 1 == (VG / 2) - 1
>
> The hook needs to be general enough to express this.
> Note that in most cases the division and subtraction fold
> away into surrounding expressions.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * target.def (dwarf_poly_indeterminate_value): New hook.
> * targhooks.h (default_dwarf_poly_indeterminate_value): Declare.
> * targhooks.c (default_dwarf_poly_indeterminate_value): New function.
> * doc/tm.texi.in (TARGET_DWARF_POLY_INDETERMINATE_VALUE): Document.
> * doc/tm.texi: Regenerate.
> * dwarf2out.h (build_cfa_loc, build_cfa_aligned_loc): Take the
> offset as a poly_int64.
> * dwarf2out.c (new_reg_loc_descr): Move later in file. Take the
> offset as a poly_int64.
> (loc_descr_plus_const, loc_list_plus_const, build_cfa_aligned_loc):
> Take the offset as a poly_int64.
> (build_cfa_loc): Likewise. Use loc_descr_plus_const.
> (frame_pointer_fb_offset): Change to a poly_int64.
> (int_loc_descriptor): Take the offset as a poly_int64. Use
> targetm.dwarf_poly_indeterminate_value for polynomial offsets.
> (based_loc_descr): Take the offset as a poly_int64.
> Use strip_offset_and_add to handle (plus X (const)).
> Use new_reg_loc_descr instead of an open-coded version of the
> previous implementation.
> (mem_loc_descriptor): Handle CONST_POLY_INT.
> (compute_frame_pointer_to_fb_displacement): Take the offset as a
> poly_int64. Use strip_offset_and_add to handle (plus X (const)).
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [012/nnn] poly_int: fold_ctor_reference
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (10 preceding siblings ...)
2017-10-23 17:05 ` [011/nnn] poly_int: DWARF locations Richard Sandiford
@ 2017-10-23 17:05 ` Richard Sandiford
2017-11-17 3:59 ` Jeff Law
2017-10-23 17:05 ` [013/nnn] poly_int: same_addr_size_stores_p Richard Sandiford
` (95 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:05 UTC (permalink / raw)
To: gcc-patches
This patch changes the offset and size arguments to
fold_ctor_reference from unsigned HOST_WIDE_INT to poly_uint64.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* gimple-fold.h (fold_ctor_reference): Take the offset and size
as poly_uint64 rather than unsigned HOST_WIDE_INT.
* gimple-fold.c (fold_ctor_reference): Likewise.
Index: gcc/gimple-fold.h
===================================================================
--- gcc/gimple-fold.h 2017-10-23 16:52:20.201487839 +0100
+++ gcc/gimple-fold.h 2017-10-23 17:01:48.165079780 +0100
@@ -44,8 +44,7 @@ extern tree follow_single_use_edges (tre
extern tree gimple_fold_stmt_to_constant_1 (gimple *, tree (*) (tree),
tree (*) (tree) = no_follow_ssa_edges);
extern tree gimple_fold_stmt_to_constant (gimple *, tree (*) (tree));
-extern tree fold_ctor_reference (tree, tree, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, tree);
+extern tree fold_ctor_reference (tree, tree, poly_uint64, poly_uint64, tree);
extern tree fold_const_aggregate_ref_1 (tree, tree (*) (tree));
extern tree fold_const_aggregate_ref (tree);
extern tree gimple_get_virt_method_for_binfo (HOST_WIDE_INT, tree,
Index: gcc/gimple-fold.c
===================================================================
--- gcc/gimple-fold.c 2017-10-23 16:52:20.201487839 +0100
+++ gcc/gimple-fold.c 2017-10-23 17:01:48.164081204 +0100
@@ -6365,20 +6365,25 @@ fold_nonarray_ctor_reference (tree type,
return build_zero_cst (type);
}
-/* CTOR is value initializing memory, fold reference of type TYPE and size SIZE
- to the memory at bit OFFSET. */
+/* CTOR is value initializing memory, fold reference of type TYPE and
+ size POLY_SIZE to the memory at bit POLY_OFFSET. */
tree
-fold_ctor_reference (tree type, tree ctor, unsigned HOST_WIDE_INT offset,
- unsigned HOST_WIDE_INT size, tree from_decl)
+fold_ctor_reference (tree type, tree ctor, poly_uint64 poly_offset,
+ poly_uint64 poly_size, tree from_decl)
{
tree ret;
/* We found the field with exact match. */
if (useless_type_conversion_p (type, TREE_TYPE (ctor))
- && !offset)
+ && known_zero (poly_offset))
return canonicalize_constructor_val (unshare_expr (ctor), from_decl);
+ /* The remaining optimizations need a constant size and offset. */
+ unsigned HOST_WIDE_INT size, offset;
+ if (!poly_size.is_constant (&size) || !poly_offset.is_constant (&offset))
+ return NULL_TREE;
+
/* We are at the end of walk, see if we can view convert the
result. */
if (!AGGREGATE_TYPE_P (TREE_TYPE (ctor)) && !offset
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [012/nnn] poly_int: fold_ctor_reference
2017-10-23 17:05 ` [012/nnn] poly_int: fold_ctor_reference Richard Sandiford
@ 2017-11-17 3:59 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-17 3:59 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:04 AM, Richard Sandiford wrote:
> This patch changes the offset and size arguments to
> fold_ctor_reference from unsigned HOST_WIDE_INT to poly_uint64.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * gimple-fold.h (fold_ctor_reference): Take the offset and size
> as poly_uint64 rather than unsigned HOST_WIDE_INT.
> * gimple-fold.c (fold_ctor_reference): Likewise.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [013/nnn] poly_int: same_addr_size_stores_p
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (11 preceding siblings ...)
2017-10-23 17:05 ` [012/nnn] poly_int: fold_ctor_reference Richard Sandiford
@ 2017-10-23 17:05 ` Richard Sandiford
2017-11-17 4:11 ` Jeff Law
2017-10-23 17:06 ` [015/nnn] poly_int: ao_ref and vn_reference_op_t Richard Sandiford
` (94 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:05 UTC (permalink / raw)
To: gcc-patches
This patch makes tree-ssa-alias.c:same_addr_size_stores_p handle
poly_int sizes and offsets.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-ssa-alias.c (same_addr_size_stores_p): Take the offsets and
sizes as poly_int64s rather than HOST_WIDE_INTs.
Index: gcc/tree-ssa-alias.c
===================================================================
--- gcc/tree-ssa-alias.c 2017-10-23 16:52:20.150440950 +0100
+++ gcc/tree-ssa-alias.c 2017-10-23 17:01:49.579064221 +0100
@@ -2322,14 +2322,14 @@ stmt_may_clobber_ref_p (gimple *stmt, tr
address. */
static bool
-same_addr_size_stores_p (tree base1, HOST_WIDE_INT offset1, HOST_WIDE_INT size1,
- HOST_WIDE_INT max_size1,
- tree base2, HOST_WIDE_INT offset2, HOST_WIDE_INT size2,
- HOST_WIDE_INT max_size2)
+same_addr_size_stores_p (tree base1, poly_int64 offset1, poly_int64 size1,
+ poly_int64 max_size1,
+ tree base2, poly_int64 offset2, poly_int64 size2,
+ poly_int64 max_size2)
{
/* Offsets need to be 0. */
- if (offset1 != 0
- || offset2 != 0)
+ if (maybe_nonzero (offset1)
+ || maybe_nonzero (offset2))
return false;
bool base1_obj_p = SSA_VAR_P (base1);
@@ -2348,17 +2348,19 @@ same_addr_size_stores_p (tree base1, HOS
tree memref = base1_memref_p ? base1 : base2;
/* Sizes need to be valid. */
- if (max_size1 == -1 || max_size2 == -1
- || size1 == -1 || size2 == -1)
+ if (!known_size_p (max_size1)
+ || !known_size_p (max_size2)
+ || !known_size_p (size1)
+ || !known_size_p (size2))
return false;
/* Max_size needs to match size. */
- if (max_size1 != size1
- || max_size2 != size2)
+ if (may_ne (max_size1, size1)
+ || may_ne (max_size2, size2))
return false;
/* Sizes need to match. */
- if (size1 != size2)
+ if (may_ne (size1, size2))
return false;
@@ -2386,10 +2388,9 @@ same_addr_size_stores_p (tree base1, HOS
/* Check that the object size is the same as the store size. That ensures us
that ptr points to the start of obj. */
- if (!tree_fits_shwi_p (DECL_SIZE (obj)))
- return false;
- HOST_WIDE_INT obj_size = tree_to_shwi (DECL_SIZE (obj));
- return obj_size == size1;
+ return (DECL_SIZE (obj)
+ && poly_int_tree_p (DECL_SIZE (obj))
+ && must_eq (wi::to_poly_offset (DECL_SIZE (obj)), size1));
}
/* If STMT kills the memory reference REF return true, otherwise
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [013/nnn] poly_int: same_addr_size_stores_p
2017-10-23 17:05 ` [013/nnn] poly_int: same_addr_size_stores_p Richard Sandiford
@ 2017-11-17 4:11 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-17 4:11 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:05 AM, Richard Sandiford wrote:
> This patch makes tree-ssa-alias.c:same_addr_size_stores_p handle
> poly_int sizes and offsets.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * tree-ssa-alias.c (same_addr_size_stores_p): Take the offsets and
> sizes as poly_int64s rather than HOST_WIDE_INTs.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [015/nnn] poly_int: ao_ref and vn_reference_op_t
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (12 preceding siblings ...)
2017-10-23 17:05 ` [013/nnn] poly_int: same_addr_size_stores_p Richard Sandiford
@ 2017-10-23 17:06 ` Richard Sandiford
2017-11-18 4:25 ` Jeff Law
2017-10-23 17:06 ` [014/nnn] poly_int: indirect_refs_may_alias_p Richard Sandiford
` (93 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:06 UTC (permalink / raw)
To: gcc-patches
This patch changes the offset, size and max_size fields
of ao_ref from HOST_WIDE_INT to poly_int64 and propagates
the change through the code that references it. This includes
changing the off field of vn_reference_op_struct in the same way.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* inchash.h (inchash::hash::add_poly_int): New function.
* tree-ssa-alias.h (ao_ref::offset, ao_ref::size, ao_ref::max_size):
Use poly_int64 rather than HOST_WIDE_INT.
(ao_ref::max_size_known_p): New function.
* tree-ssa-sccvn.h (vn_reference_op_struct::off): Use poly_int64_pod
rather than HOST_WIDE_INT.
* tree-ssa-alias.c (ao_ref_base): Apply get_ref_base_and_extent
to temporaries until its interface is adjusted to match.
(ao_ref_init_from_ptr_and_size): Handle polynomial offsets and sizes.
(aliasing_component_refs_p, decl_refs_may_alias_p)
(indirect_ref_may_alias_decl_p, indirect_refs_may_alias_p): Take
the offsets and max_sizes as poly_int64s instead of HOST_WIDE_INTs.
(refs_may_alias_p_1, stmt_kills_ref_p): Adjust for changes to
ao_ref fields.
* alias.c (ao_ref_from_mem): Likewise.
* tree-ssa-dce.c (mark_aliased_reaching_defs_necessary_1): Likewise.
* tree-ssa-dse.c (valid_ao_ref_for_dse, normalize_ref)
(clear_bytes_written_by, setup_live_bytes_from_ref, compute_trims)
(maybe_trim_complex_store, maybe_trim_constructor_store)
(live_bytes_read, dse_classify_store): Likewise.
* tree-ssa-sccvn.c (vn_reference_compute_hash, vn_reference_eq):
(copy_reference_ops_from_ref, ao_ref_init_from_vn_reference)
(fully_constant_vn_reference_p, valueize_refs_1): Likewise.
(vn_reference_lookup_3): Likewise.
* tree-ssa-uninit.c (warn_uninitialized_vars): Likewise.
Index: gcc/inchash.h
===================================================================
--- gcc/inchash.h 2017-10-23 17:01:43.314993320 +0100
+++ gcc/inchash.h 2017-10-23 17:01:52.303181137 +0100
@@ -57,6 +57,14 @@ hashval_t iterative_hash_hashval_t (hash
val = iterative_hash_hashval_t (v, val);
}
+ /* Add polynomial value V, treating each element as an unsigned int. */
+ template<unsigned int N, typename T>
+ void add_poly_int (const poly_int_pod<N, T> &v)
+ {
+ for (unsigned int i = 0; i < N; ++i)
+ add_int (v.coeffs[i]);
+ }
+
/* Add HOST_WIDE_INT value V. */
void add_hwi (HOST_WIDE_INT v)
{
Index: gcc/tree-ssa-alias.h
===================================================================
--- gcc/tree-ssa-alias.h 2017-10-23 16:52:20.058356365 +0100
+++ gcc/tree-ssa-alias.h 2017-10-23 17:01:52.304179714 +0100
@@ -80,11 +80,11 @@ struct ao_ref
the following fields are not yet computed. */
tree base;
/* The offset relative to the base. */
- HOST_WIDE_INT offset;
+ poly_int64 offset;
/* The size of the access. */
- HOST_WIDE_INT size;
+ poly_int64 size;
/* The maximum possible extent of the access or -1 if unconstrained. */
- HOST_WIDE_INT max_size;
+ poly_int64 max_size;
/* The alias set of the access or -1 if not yet computed. */
alias_set_type ref_alias_set;
@@ -94,8 +94,18 @@ struct ao_ref
/* Whether the memory is considered a volatile access. */
bool volatile_p;
+
+ bool max_size_known_p () const;
};
+/* Return true if the maximum size is known, rather than the special -1
+ marker. */
+
+inline bool
+ao_ref::max_size_known_p () const
+{
+ return known_size_p (max_size);
+}
/* In tree-ssa-alias.c */
extern void ao_ref_init (ao_ref *, tree);
Index: gcc/tree-ssa-sccvn.h
===================================================================
--- gcc/tree-ssa-sccvn.h 2017-10-23 16:52:20.058356365 +0100
+++ gcc/tree-ssa-sccvn.h 2017-10-23 17:01:52.305178291 +0100
@@ -93,7 +93,7 @@ typedef struct vn_reference_op_struct
/* For storing TYPE_ALIGN for array ref element size computation. */
unsigned align : 6;
/* Constant offset this op adds or -1 if it is variable. */
- HOST_WIDE_INT off;
+ poly_int64_pod off;
tree type;
tree op0;
tree op1;
Index: gcc/tree-ssa-alias.c
===================================================================
--- gcc/tree-ssa-alias.c 2017-10-23 17:01:51.044974644 +0100
+++ gcc/tree-ssa-alias.c 2017-10-23 17:01:52.304179714 +0100
@@ -635,11 +635,15 @@ ao_ref_init (ao_ref *r, tree ref)
ao_ref_base (ao_ref *ref)
{
bool reverse;
+ HOST_WIDE_INT offset, size, max_size;
if (ref->base)
return ref->base;
- ref->base = get_ref_base_and_extent (ref->ref, &ref->offset, &ref->size,
- &ref->max_size, &reverse);
+ ref->base = get_ref_base_and_extent (ref->ref, &offset, &size,
+ &max_size, &reverse);
+ ref->offset = offset;
+ ref->size = size;
+ ref->max_size = max_size;
return ref->base;
}
@@ -679,7 +683,8 @@ ao_ref_alias_set (ao_ref *ref)
void
ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
{
- HOST_WIDE_INT t, size_hwi, extra_offset = 0;
+ HOST_WIDE_INT t;
+ poly_int64 size_hwi, extra_offset = 0;
ref->ref = NULL_TREE;
if (TREE_CODE (ptr) == SSA_NAME)
{
@@ -689,11 +694,10 @@ ao_ref_init_from_ptr_and_size (ao_ref *r
ptr = gimple_assign_rhs1 (stmt);
else if (is_gimple_assign (stmt)
&& gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
- && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST)
+ && ptrdiff_tree_p (gimple_assign_rhs2 (stmt), &extra_offset))
{
ptr = gimple_assign_rhs1 (stmt);
- extra_offset = BITS_PER_UNIT
- * int_cst_value (gimple_assign_rhs2 (stmt));
+ extra_offset *= BITS_PER_UNIT;
}
}
@@ -717,8 +721,8 @@ ao_ref_init_from_ptr_and_size (ao_ref *r
}
ref->offset += extra_offset;
if (size
- && tree_fits_shwi_p (size)
- && (size_hwi = tree_to_shwi (size)) <= HOST_WIDE_INT_MAX / BITS_PER_UNIT)
+ && poly_int_tree_p (size, &size_hwi)
+ && coeffs_in_range_p (size_hwi, 0, HOST_WIDE_INT_MAX / BITS_PER_UNIT))
ref->max_size = ref->size = size_hwi * BITS_PER_UNIT;
else
ref->max_size = ref->size = -1;
@@ -779,11 +783,11 @@ same_type_for_tbaa (tree type1, tree typ
aliasing_component_refs_p (tree ref1,
alias_set_type ref1_alias_set,
alias_set_type base1_alias_set,
- HOST_WIDE_INT offset1, HOST_WIDE_INT max_size1,
+ poly_int64 offset1, poly_int64 max_size1,
tree ref2,
alias_set_type ref2_alias_set,
alias_set_type base2_alias_set,
- HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2,
+ poly_int64 offset2, poly_int64 max_size2,
bool ref2_is_decl)
{
/* If one reference is a component references through pointers try to find a
@@ -825,7 +829,7 @@ aliasing_component_refs_p (tree ref1,
offset2 -= offadj;
get_ref_base_and_extent (base1, &offadj, &sztmp, &msztmp, &reverse);
offset1 -= offadj;
- return ranges_overlap_p (offset1, max_size1, offset2, max_size2);
+ return ranges_may_overlap_p (offset1, max_size1, offset2, max_size2);
}
/* If we didn't find a common base, try the other way around. */
refp = &ref1;
@@ -844,7 +848,7 @@ aliasing_component_refs_p (tree ref1,
offset1 -= offadj;
get_ref_base_and_extent (base2, &offadj, &sztmp, &msztmp, &reverse);
offset2 -= offadj;
- return ranges_overlap_p (offset1, max_size1, offset2, max_size2);
+ return ranges_may_overlap_p (offset1, max_size1, offset2, max_size2);
}
/* If we have two type access paths B1.path1 and B2.path2 they may
@@ -1090,9 +1094,9 @@ nonoverlapping_component_refs_p (const_t
static bool
decl_refs_may_alias_p (tree ref1, tree base1,
- HOST_WIDE_INT offset1, HOST_WIDE_INT max_size1,
+ poly_int64 offset1, poly_int64 max_size1,
tree ref2, tree base2,
- HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2)
+ poly_int64 offset2, poly_int64 max_size2)
{
gcc_checking_assert (DECL_P (base1) && DECL_P (base2));
@@ -1102,7 +1106,7 @@ decl_refs_may_alias_p (tree ref1, tree b
/* If both references are based on the same variable, they cannot alias if
the accesses do not overlap. */
- if (!ranges_overlap_p (offset1, max_size1, offset2, max_size2))
+ if (!ranges_may_overlap_p (offset1, max_size1, offset2, max_size2))
return false;
/* For components with variable position, the above test isn't sufficient,
@@ -1124,12 +1128,11 @@ decl_refs_may_alias_p (tree ref1, tree b
static bool
indirect_ref_may_alias_decl_p (tree ref1 ATTRIBUTE_UNUSED, tree base1,
- HOST_WIDE_INT offset1,
- HOST_WIDE_INT max_size1 ATTRIBUTE_UNUSED,
+ poly_int64 offset1, poly_int64 max_size1,
alias_set_type ref1_alias_set,
alias_set_type base1_alias_set,
tree ref2 ATTRIBUTE_UNUSED, tree base2,
- HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2,
+ poly_int64 offset2, poly_int64 max_size2,
alias_set_type ref2_alias_set,
alias_set_type base2_alias_set, bool tbaa_p)
{
@@ -1185,14 +1188,15 @@ indirect_ref_may_alias_decl_p (tree ref1
is bigger than the size of the decl we can't possibly access the
decl via that pointer. */
if (DECL_SIZE (base2) && COMPLETE_TYPE_P (TREE_TYPE (ptrtype1))
- && TREE_CODE (DECL_SIZE (base2)) == INTEGER_CST
- && TREE_CODE (TYPE_SIZE (TREE_TYPE (ptrtype1))) == INTEGER_CST
+ && poly_int_tree_p (DECL_SIZE (base2))
+ && poly_int_tree_p (TYPE_SIZE (TREE_TYPE (ptrtype1)))
/* ??? This in turn may run afoul when a decl of type T which is
a member of union type U is accessed through a pointer to
type U and sizeof T is smaller than sizeof U. */
&& TREE_CODE (TREE_TYPE (ptrtype1)) != UNION_TYPE
&& TREE_CODE (TREE_TYPE (ptrtype1)) != QUAL_UNION_TYPE
- && tree_int_cst_lt (DECL_SIZE (base2), TYPE_SIZE (TREE_TYPE (ptrtype1))))
+ && must_lt (wi::to_poly_widest (DECL_SIZE (base2)),
+ wi::to_poly_widest (TYPE_SIZE (TREE_TYPE (ptrtype1)))))
return false;
if (!ref2)
@@ -1203,8 +1207,8 @@ indirect_ref_may_alias_decl_p (tree ref1
dbase2 = ref2;
while (handled_component_p (dbase2))
dbase2 = TREE_OPERAND (dbase2, 0);
- HOST_WIDE_INT doffset1 = offset1;
- offset_int doffset2 = offset2;
+ poly_int64 doffset1 = offset1;
+ poly_offset_int doffset2 = offset2;
if (TREE_CODE (dbase2) == MEM_REF
|| TREE_CODE (dbase2) == TARGET_MEM_REF)
doffset2 -= mem_ref_offset (dbase2) << LOG2_BITS_PER_UNIT;
@@ -1252,11 +1256,11 @@ indirect_ref_may_alias_decl_p (tree ref1
static bool
indirect_refs_may_alias_p (tree ref1 ATTRIBUTE_UNUSED, tree base1,
- HOST_WIDE_INT offset1, HOST_WIDE_INT max_size1,
+ poly_int64 offset1, poly_int64 max_size1,
alias_set_type ref1_alias_set,
alias_set_type base1_alias_set,
tree ref2 ATTRIBUTE_UNUSED, tree base2,
- HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2,
+ poly_int64 offset2, poly_int64 max_size2,
alias_set_type ref2_alias_set,
alias_set_type base2_alias_set, bool tbaa_p)
{
@@ -1330,7 +1334,7 @@ indirect_refs_may_alias_p (tree ref1 ATT
/* But avoid treating arrays as "objects", instead assume they
can overlap by an exact multiple of their element size. */
&& TREE_CODE (TREE_TYPE (ptrtype1)) != ARRAY_TYPE)
- return ranges_overlap_p (offset1, max_size1, offset2, max_size2);
+ return ranges_may_overlap_p (offset1, max_size1, offset2, max_size2);
/* Do type-based disambiguation. */
if (base1_alias_set != base2_alias_set
@@ -1365,8 +1369,8 @@ indirect_refs_may_alias_p (tree ref1 ATT
refs_may_alias_p_1 (ao_ref *ref1, ao_ref *ref2, bool tbaa_p)
{
tree base1, base2;
- HOST_WIDE_INT offset1 = 0, offset2 = 0;
- HOST_WIDE_INT max_size1 = -1, max_size2 = -1;
+ poly_int64 offset1 = 0, offset2 = 0;
+ poly_int64 max_size1 = -1, max_size2 = -1;
bool var1_p, var2_p, ind1_p, ind2_p;
gcc_checking_assert ((!ref1->ref
@@ -2444,14 +2448,17 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *
handling constant offset and size. */
/* For a must-alias check we need to be able to constrain
the access properly. */
- if (ref->max_size == -1)
+ if (!ref->max_size_known_p ())
return false;
- HOST_WIDE_INT size, offset, max_size, ref_offset = ref->offset;
+ HOST_WIDE_INT size, max_size, const_offset;
+ poly_int64 ref_offset = ref->offset;
bool reverse;
tree base
- = get_ref_base_and_extent (lhs, &offset, &size, &max_size, &reverse);
+ = get_ref_base_and_extent (lhs, &const_offset, &size, &max_size,
+ &reverse);
/* We can get MEM[symbol: sZ, index: D.8862_1] here,
so base == ref->base does not always hold. */
+ poly_int64 offset = const_offset;
if (base != ref->base)
{
/* Try using points-to info. */
@@ -2468,18 +2475,13 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *
if (!tree_int_cst_equal (TREE_OPERAND (base, 1),
TREE_OPERAND (ref->base, 1)))
{
- offset_int off1 = mem_ref_offset (base);
+ poly_offset_int off1 = mem_ref_offset (base);
off1 <<= LOG2_BITS_PER_UNIT;
off1 += offset;
- offset_int off2 = mem_ref_offset (ref->base);
+ poly_offset_int off2 = mem_ref_offset (ref->base);
off2 <<= LOG2_BITS_PER_UNIT;
off2 += ref_offset;
- if (wi::fits_shwi_p (off1) && wi::fits_shwi_p (off2))
- {
- offset = off1.to_shwi ();
- ref_offset = off2.to_shwi ();
- }
- else
+ if (!off1.to_shwi (&offset) || !off2.to_shwi (&ref_offset))
size = -1;
}
}
@@ -2488,12 +2490,9 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *
}
/* For a must-alias check we need to be able to constrain
the access properly. */
- if (size != -1 && size == max_size)
- {
- if (offset <= ref_offset
- && offset + size >= ref_offset + ref->max_size)
- return true;
- }
+ if (size == max_size
+ && known_subrange_p (ref_offset, ref->max_size, offset, size))
+ return true;
}
if (is_gimple_call (stmt))
@@ -2526,19 +2525,19 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *
{
/* For a must-alias check we need to be able to constrain
the access properly. */
- if (ref->max_size == -1)
+ if (!ref->max_size_known_p ())
return false;
tree dest = gimple_call_arg (stmt, 0);
tree len = gimple_call_arg (stmt, 2);
- if (!tree_fits_shwi_p (len))
+ if (!poly_int_tree_p (len))
return false;
tree rbase = ref->base;
- offset_int roffset = ref->offset;
+ poly_offset_int roffset = ref->offset;
ao_ref dref;
ao_ref_init_from_ptr_and_size (&dref, dest, len);
tree base = ao_ref_base (&dref);
- offset_int offset = dref.offset;
- if (!base || dref.size == -1)
+ poly_offset_int offset = dref.offset;
+ if (!base || !known_size_p (dref.size))
return false;
if (TREE_CODE (base) == MEM_REF)
{
@@ -2551,9 +2550,9 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *
rbase = TREE_OPERAND (rbase, 0);
}
if (base == rbase
- && offset <= roffset
- && (roffset + ref->max_size
- <= offset + (wi::to_offset (len) << LOG2_BITS_PER_UNIT)))
+ && known_subrange_p (roffset, ref->max_size, offset,
+ wi::to_poly_offset (len)
+ << LOG2_BITS_PER_UNIT))
return true;
break;
}
Index: gcc/alias.c
===================================================================
--- gcc/alias.c 2017-10-23 16:52:20.058356365 +0100
+++ gcc/alias.c 2017-10-23 17:01:52.303181137 +0100
@@ -331,9 +331,9 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
/* If MEM_OFFSET/MEM_SIZE get us outside of ref->offset/ref->max_size
drop ref->ref. */
if (MEM_OFFSET (mem) < 0
- || (ref->max_size != -1
- && ((MEM_OFFSET (mem) + MEM_SIZE (mem)) * BITS_PER_UNIT
- > ref->max_size)))
+ || (ref->max_size_known_p ()
+ && may_gt ((MEM_OFFSET (mem) + MEM_SIZE (mem)) * BITS_PER_UNIT,
+ ref->max_size)))
ref->ref = NULL_TREE;
/* Refine size and offset we got from analyzing MEM_EXPR by using
@@ -344,19 +344,18 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
/* The MEM may extend into adjacent fields, so adjust max_size if
necessary. */
- if (ref->max_size != -1
- && ref->size > ref->max_size)
- ref->max_size = ref->size;
+ if (ref->max_size_known_p ())
+ ref->max_size = upper_bound (ref->max_size, ref->size);
- /* If MEM_OFFSET and MEM_SIZE get us outside of the base object of
+ /* If MEM_OFFSET and MEM_SIZE might get us outside of the base object of
the MEM_EXPR punt. This happens for STRICT_ALIGNMENT targets a lot. */
if (MEM_EXPR (mem) != get_spill_slot_decl (false)
- && (ref->offset < 0
+ && (may_lt (ref->offset, 0)
|| (DECL_P (ref->base)
&& (DECL_SIZE (ref->base) == NULL_TREE
- || TREE_CODE (DECL_SIZE (ref->base)) != INTEGER_CST
- || wi::ltu_p (wi::to_offset (DECL_SIZE (ref->base)),
- ref->offset + ref->size)))))
+ || !poly_int_tree_p (DECL_SIZE (ref->base))
+ || may_lt (wi::to_poly_offset (DECL_SIZE (ref->base)),
+ ref->offset + ref->size)))))
return false;
return true;
Index: gcc/tree-ssa-dce.c
===================================================================
--- gcc/tree-ssa-dce.c 2017-10-23 16:52:20.058356365 +0100
+++ gcc/tree-ssa-dce.c 2017-10-23 17:01:52.304179714 +0100
@@ -488,13 +488,9 @@ mark_aliased_reaching_defs_necessary_1 (
{
/* For a must-alias check we need to be able to constrain
the accesses properly. */
- if (size != -1 && size == max_size
- && ref->max_size != -1)
- {
- if (offset <= ref->offset
- && offset + size >= ref->offset + ref->max_size)
- return true;
- }
+ if (size == max_size
+ && known_subrange_p (ref->offset, ref->max_size, offset, size))
+ return true;
/* Or they need to be exactly the same. */
else if (ref->ref
/* Make sure there is no induction variable involved
Index: gcc/tree-ssa-dse.c
===================================================================
--- gcc/tree-ssa-dse.c 2017-10-23 16:52:20.058356365 +0100
+++ gcc/tree-ssa-dse.c 2017-10-23 17:01:52.304179714 +0100
@@ -128,13 +128,12 @@ initialize_ao_ref_for_dse (gimple *stmt,
valid_ao_ref_for_dse (ao_ref *ref)
{
return (ao_ref_base (ref)
- && ref->max_size != -1
- && ref->size != 0
- && ref->max_size == ref->size
- && ref->offset >= 0
- && (ref->offset % BITS_PER_UNIT) == 0
- && (ref->size % BITS_PER_UNIT) == 0
- && (ref->size != -1));
+ && known_size_p (ref->max_size)
+ && maybe_nonzero (ref->size)
+ && must_eq (ref->max_size, ref->size)
+ && must_ge (ref->offset, 0)
+ && multiple_p (ref->offset, BITS_PER_UNIT)
+ && multiple_p (ref->size, BITS_PER_UNIT));
}
/* Try to normalize COPY (an ao_ref) relative to REF. Essentially when we are
@@ -144,25 +143,31 @@ valid_ao_ref_for_dse (ao_ref *ref)
static bool
normalize_ref (ao_ref *copy, ao_ref *ref)
{
+ if (!ordered_p (copy->offset, ref->offset))
+ return false;
+
/* If COPY starts before REF, then reset the beginning of
COPY to match REF and decrease the size of COPY by the
number of bytes removed from COPY. */
- if (copy->offset < ref->offset)
+ if (may_lt (copy->offset, ref->offset))
{
- HOST_WIDE_INT diff = ref->offset - copy->offset;
- if (copy->size <= diff)
+ poly_int64 diff = ref->offset - copy->offset;
+ if (may_le (copy->size, diff))
return false;
copy->size -= diff;
copy->offset = ref->offset;
}
- HOST_WIDE_INT diff = copy->offset - ref->offset;
- if (ref->size <= diff)
+ poly_int64 diff = copy->offset - ref->offset;
+ if (may_le (ref->size, diff))
return false;
/* If COPY extends beyond REF, chop off its size appropriately. */
- HOST_WIDE_INT limit = ref->size - diff;
- if (copy->size > limit)
+ poly_int64 limit = ref->size - diff;
+ if (!ordered_p (limit, copy->size))
+ return false;
+
+ if (may_gt (copy->size, limit))
copy->size = limit;
return true;
}
@@ -183,15 +188,15 @@ clear_bytes_written_by (sbitmap live_byt
/* Verify we have the same base memory address, the write
has a known size and overlaps with REF. */
+ HOST_WIDE_INT start, size;
if (valid_ao_ref_for_dse (&write)
&& operand_equal_p (write.base, ref->base, OEP_ADDRESS_OF)
- && write.size == write.max_size
- && normalize_ref (&write, ref))
- {
- HOST_WIDE_INT start = write.offset - ref->offset;
- bitmap_clear_range (live_bytes, start / BITS_PER_UNIT,
- write.size / BITS_PER_UNIT);
- }
+ && must_eq (write.size, write.max_size)
+ && normalize_ref (&write, ref)
+ && (write.offset - ref->offset).is_constant (&start)
+ && write.size.is_constant (&size))
+ bitmap_clear_range (live_bytes, start / BITS_PER_UNIT,
+ size / BITS_PER_UNIT);
}
/* REF is a memory write. Extract relevant information from it and
@@ -201,12 +206,14 @@ clear_bytes_written_by (sbitmap live_byt
static bool
setup_live_bytes_from_ref (ao_ref *ref, sbitmap live_bytes)
{
+ HOST_WIDE_INT const_size;
if (valid_ao_ref_for_dse (ref)
- && (ref->size / BITS_PER_UNIT
+ && ref->size.is_constant (&const_size)
+ && (const_size / BITS_PER_UNIT
<= PARAM_VALUE (PARAM_DSE_MAX_OBJECT_SIZE)))
{
bitmap_clear (live_bytes);
- bitmap_set_range (live_bytes, 0, ref->size / BITS_PER_UNIT);
+ bitmap_set_range (live_bytes, 0, const_size / BITS_PER_UNIT);
return true;
}
return false;
@@ -231,9 +238,15 @@ compute_trims (ao_ref *ref, sbitmap live
the REF to compute the trims. */
/* Now identify how much, if any of the tail we can chop off. */
- int last_orig = (ref->size / BITS_PER_UNIT) - 1;
- int last_live = bitmap_last_set_bit (live);
- *trim_tail = (last_orig - last_live) & ~0x1;
+ HOST_WIDE_INT const_size;
+ if (ref->size.is_constant (&const_size))
+ {
+ int last_orig = (const_size / BITS_PER_UNIT) - 1;
+ int last_live = bitmap_last_set_bit (live);
+ *trim_tail = (last_orig - last_live) & ~0x1;
+ }
+ else
+ *trim_tail = 0;
/* Identify how much, if any of the head we can chop off. */
int first_orig = 0;
@@ -267,7 +280,7 @@ maybe_trim_complex_store (ao_ref *ref, s
least half the size of the object to ensure we're trimming
the entire real or imaginary half. By writing things this
way we avoid more O(n) bitmap operations. */
- if (trim_tail * 2 >= ref->size / BITS_PER_UNIT)
+ if (must_ge (trim_tail * 2 * BITS_PER_UNIT, ref->size))
{
/* TREE_REALPART is live */
tree x = TREE_REALPART (gimple_assign_rhs1 (stmt));
@@ -276,7 +289,7 @@ maybe_trim_complex_store (ao_ref *ref, s
gimple_assign_set_lhs (stmt, y);
gimple_assign_set_rhs1 (stmt, x);
}
- else if (trim_head * 2 >= ref->size / BITS_PER_UNIT)
+ else if (must_ge (trim_head * 2 * BITS_PER_UNIT, ref->size))
{
/* TREE_IMAGPART is live */
tree x = TREE_IMAGPART (gimple_assign_rhs1 (stmt));
@@ -326,7 +339,8 @@ maybe_trim_constructor_store (ao_ref *re
return;
/* The number of bytes for the new constructor. */
- int count = (ref->size / BITS_PER_UNIT) - head_trim - tail_trim;
+ poly_int64 ref_bytes = exact_div (ref->size, BITS_PER_UNIT);
+ poly_int64 count = ref_bytes - head_trim - tail_trim;
/* And the new type for the CONSTRUCTOR. Essentially it's just
a char array large enough to cover the non-trimmed parts of
@@ -483,15 +497,15 @@ live_bytes_read (ao_ref use_ref, ao_ref
{
/* We have already verified that USE_REF and REF hit the same object.
Now verify that there's actually an overlap between USE_REF and REF. */
- if (normalize_ref (&use_ref, ref))
+ HOST_WIDE_INT start, size;
+ if (normalize_ref (&use_ref, ref)
+ && (use_ref.offset - ref->offset).is_constant (&start)
+ && use_ref.size.is_constant (&size))
{
- HOST_WIDE_INT start = use_ref.offset - ref->offset;
- HOST_WIDE_INT size = use_ref.size;
-
/* If USE_REF covers all of REF, then it will hit one or more
live bytes. This avoids useless iteration over the bitmap
below. */
- if (start == 0 && size == ref->size)
+ if (start == 0 && must_eq (size, ref->size))
return true;
/* Now check if any of the remaining bits in use_ref are set in LIVE. */
@@ -592,8 +606,8 @@ dse_classify_store (ao_ref *ref, gimple
ao_ref use_ref;
ao_ref_init (&use_ref, gimple_assign_rhs1 (use_stmt));
if (valid_ao_ref_for_dse (&use_ref)
- && use_ref.base == ref->base
- && use_ref.size == use_ref.max_size
+ && must_eq (use_ref.base, ref->base)
+ && must_eq (use_ref.size, use_ref.max_size)
&& !live_bytes_read (use_ref, ref, live_bytes))
{
/* If this statement has a VDEF, then it is the
Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c 2017-10-23 16:52:20.058356365 +0100
+++ gcc/tree-ssa-sccvn.c 2017-10-23 17:01:52.305178291 +0100
@@ -547,7 +547,7 @@ vn_reference_compute_hash (const vn_refe
hashval_t result;
int i;
vn_reference_op_t vro;
- HOST_WIDE_INT off = -1;
+ poly_int64 off = -1;
bool deref = false;
FOR_EACH_VEC_ELT (vr1->operands, i, vro)
@@ -556,17 +556,17 @@ vn_reference_compute_hash (const vn_refe
deref = true;
else if (vro->opcode != ADDR_EXPR)
deref = false;
- if (vro->off != -1)
+ if (may_ne (vro->off, -1))
{
- if (off == -1)
+ if (must_eq (off, -1))
off = 0;
off += vro->off;
}
else
{
- if (off != -1
- && off != 0)
- hstate.add_int (off);
+ if (may_ne (off, -1)
+ && may_ne (off, 0))
+ hstate.add_poly_int (off);
off = -1;
if (deref
&& vro->opcode == ADDR_EXPR)
@@ -632,7 +632,7 @@ vn_reference_eq (const_vn_reference_t co
j = 0;
do
{
- HOST_WIDE_INT off1 = 0, off2 = 0;
+ poly_int64 off1 = 0, off2 = 0;
vn_reference_op_t vro1, vro2;
vn_reference_op_s tem1, tem2;
bool deref1 = false, deref2 = false;
@@ -643,7 +643,7 @@ vn_reference_eq (const_vn_reference_t co
/* Do not look through a storage order barrier. */
else if (vro1->opcode == VIEW_CONVERT_EXPR && vro1->reverse)
return false;
- if (vro1->off == -1)
+ if (must_eq (vro1->off, -1))
break;
off1 += vro1->off;
}
@@ -654,11 +654,11 @@ vn_reference_eq (const_vn_reference_t co
/* Do not look through a storage order barrier. */
else if (vro2->opcode == VIEW_CONVERT_EXPR && vro2->reverse)
return false;
- if (vro2->off == -1)
+ if (must_eq (vro2->off, -1))
break;
off2 += vro2->off;
}
- if (off1 != off2)
+ if (may_ne (off1, off2))
return false;
if (deref1 && vro1->opcode == ADDR_EXPR)
{
@@ -784,24 +784,23 @@ copy_reference_ops_from_ref (tree ref, v
{
tree this_offset = component_ref_field_offset (ref);
if (this_offset
- && TREE_CODE (this_offset) == INTEGER_CST)
+ && poly_int_tree_p (this_offset))
{
tree bit_offset = DECL_FIELD_BIT_OFFSET (TREE_OPERAND (ref, 1));
if (TREE_INT_CST_LOW (bit_offset) % BITS_PER_UNIT == 0)
{
- offset_int off
- = (wi::to_offset (this_offset)
+ poly_offset_int off
+ = (wi::to_poly_offset (this_offset)
+ (wi::to_offset (bit_offset) >> LOG2_BITS_PER_UNIT));
- if (wi::fits_shwi_p (off)
- /* Probibit value-numbering zero offset components
- of addresses the same before the pass folding
- __builtin_object_size had a chance to run
- (checking cfun->after_inlining does the
- trick here). */
- && (TREE_CODE (orig) != ADDR_EXPR
- || off != 0
- || cfun->after_inlining))
- temp.off = off.to_shwi ();
+ /* Probibit value-numbering zero offset components
+ of addresses the same before the pass folding
+ __builtin_object_size had a chance to run
+ (checking cfun->after_inlining does the
+ trick here). */
+ if (TREE_CODE (orig) != ADDR_EXPR
+ || maybe_nonzero (off)
+ || cfun->after_inlining)
+ off.to_shwi (&temp.off);
}
}
}
@@ -820,16 +819,15 @@ copy_reference_ops_from_ref (tree ref, v
if (! temp.op2)
temp.op2 = size_binop (EXACT_DIV_EXPR, TYPE_SIZE_UNIT (eltype),
size_int (TYPE_ALIGN_UNIT (eltype)));
- if (TREE_CODE (temp.op0) == INTEGER_CST
- && TREE_CODE (temp.op1) == INTEGER_CST
+ if (poly_int_tree_p (temp.op0)
+ && poly_int_tree_p (temp.op1)
&& TREE_CODE (temp.op2) == INTEGER_CST)
{
- offset_int off = ((wi::to_offset (temp.op0)
- - wi::to_offset (temp.op1))
- * wi::to_offset (temp.op2)
- * vn_ref_op_align_unit (&temp));
- if (wi::fits_shwi_p (off))
- temp.off = off.to_shwi();
+ poly_offset_int off = ((wi::to_poly_offset (temp.op0)
+ - wi::to_poly_offset (temp.op1))
+ * wi::to_offset (temp.op2)
+ * vn_ref_op_align_unit (&temp));
+ off.to_shwi (&temp.off);
}
}
break;
@@ -918,9 +916,9 @@ ao_ref_init_from_vn_reference (ao_ref *r
unsigned i;
tree base = NULL_TREE;
tree *op0_p = &base;
- offset_int offset = 0;
- offset_int max_size;
- offset_int size = -1;
+ poly_offset_int offset = 0;
+ poly_offset_int max_size;
+ poly_offset_int size = -1;
tree size_tree = NULL_TREE;
alias_set_type base_alias_set = -1;
@@ -936,11 +934,11 @@ ao_ref_init_from_vn_reference (ao_ref *r
if (mode == BLKmode)
size_tree = TYPE_SIZE (type);
else
- size = int (GET_MODE_BITSIZE (mode));
+ size = GET_MODE_BITSIZE (mode);
}
if (size_tree != NULL_TREE
- && TREE_CODE (size_tree) == INTEGER_CST)
- size = wi::to_offset (size_tree);
+ && poly_int_tree_p (size_tree))
+ size = wi::to_poly_offset (size_tree);
/* Initially, maxsize is the same as the accessed element size.
In the following it will only grow (or become -1). */
@@ -963,7 +961,7 @@ ao_ref_init_from_vn_reference (ao_ref *r
{
vn_reference_op_t pop = &ops[i-1];
base = TREE_OPERAND (op->op0, 0);
- if (pop->off == -1)
+ if (must_eq (pop->off, -1))
{
max_size = -1;
offset = 0;
@@ -1008,12 +1006,12 @@ ao_ref_init_from_vn_reference (ao_ref *r
parts manually. */
tree this_offset = DECL_FIELD_OFFSET (field);
- if (op->op1 || TREE_CODE (this_offset) != INTEGER_CST)
+ if (op->op1 || !poly_int_tree_p (this_offset))
max_size = -1;
else
{
- offset_int woffset = (wi::to_offset (this_offset)
- << LOG2_BITS_PER_UNIT);
+ poly_offset_int woffset = (wi::to_poly_offset (this_offset)
+ << LOG2_BITS_PER_UNIT);
woffset += wi::to_offset (DECL_FIELD_BIT_OFFSET (field));
offset += woffset;
}
@@ -1023,14 +1021,15 @@ ao_ref_init_from_vn_reference (ao_ref *r
case ARRAY_RANGE_REF:
case ARRAY_REF:
/* We recorded the lower bound and the element size. */
- if (TREE_CODE (op->op0) != INTEGER_CST
- || TREE_CODE (op->op1) != INTEGER_CST
+ if (!poly_int_tree_p (op->op0)
+ || !poly_int_tree_p (op->op1)
|| TREE_CODE (op->op2) != INTEGER_CST)
max_size = -1;
else
{
- offset_int woffset
- = wi::sext (wi::to_offset (op->op0) - wi::to_offset (op->op1),
+ poly_offset_int woffset
+ = wi::sext (wi::to_poly_offset (op->op0)
+ - wi::to_poly_offset (op->op1),
TYPE_PRECISION (TREE_TYPE (op->op0)));
woffset *= wi::to_offset (op->op2) * vn_ref_op_align_unit (op);
woffset <<= LOG2_BITS_PER_UNIT;
@@ -1077,7 +1076,7 @@ ao_ref_init_from_vn_reference (ao_ref *r
/* We discount volatiles from value-numbering elsewhere. */
ref->volatile_p = false;
- if (!wi::fits_shwi_p (size) || wi::neg_p (size))
+ if (!size.to_shwi (&ref->size) || may_lt (ref->size, 0))
{
ref->offset = 0;
ref->size = -1;
@@ -1085,21 +1084,15 @@ ao_ref_init_from_vn_reference (ao_ref *r
return true;
}
- ref->size = size.to_shwi ();
-
- if (!wi::fits_shwi_p (offset))
+ if (!offset.to_shwi (&ref->offset))
{
ref->offset = 0;
ref->max_size = -1;
return true;
}
- ref->offset = offset.to_shwi ();
-
- if (!wi::fits_shwi_p (max_size) || wi::neg_p (max_size))
+ if (!max_size.to_shwi (&ref->max_size) || may_lt (ref->max_size, 0))
ref->max_size = -1;
- else
- ref->max_size = max_size.to_shwi ();
return true;
}
@@ -1344,7 +1337,7 @@ fully_constant_vn_reference_p (vn_refere
&& (!INTEGRAL_TYPE_P (ref->type)
|| TYPE_PRECISION (ref->type) % BITS_PER_UNIT == 0))
{
- HOST_WIDE_INT off = 0;
+ poly_int64 off = 0;
HOST_WIDE_INT size;
if (INTEGRAL_TYPE_P (ref->type))
size = TYPE_PRECISION (ref->type);
@@ -1362,7 +1355,7 @@ fully_constant_vn_reference_p (vn_refere
++i;
break;
}
- if (operands[i].off == -1)
+ if (must_eq (operands[i].off, -1))
return NULL_TREE;
off += operands[i].off;
if (operands[i].opcode == MEM_REF)
@@ -1388,6 +1381,7 @@ fully_constant_vn_reference_p (vn_refere
return build_zero_cst (ref->type);
else if (ctor != error_mark_node)
{
+ HOST_WIDE_INT const_off;
if (decl)
{
tree res = fold_ctor_reference (ref->type, ctor,
@@ -1400,10 +1394,10 @@ fully_constant_vn_reference_p (vn_refere
return res;
}
}
- else
+ else if (off.is_constant (&const_off))
{
unsigned char buf[MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT];
- int len = native_encode_expr (ctor, buf, size, off);
+ int len = native_encode_expr (ctor, buf, size, const_off);
if (len > 0)
return native_interpret_expr (ref->type, buf, len);
}
@@ -1495,17 +1489,16 @@ valueize_refs_1 (vec<vn_reference_op_s>
/* If it transforms a non-constant ARRAY_REF into a constant
one, adjust the constant offset. */
else if (vro->opcode == ARRAY_REF
- && vro->off == -1
- && TREE_CODE (vro->op0) == INTEGER_CST
- && TREE_CODE (vro->op1) == INTEGER_CST
+ && must_eq (vro->off, -1)
+ && poly_int_tree_p (vro->op0)
+ && poly_int_tree_p (vro->op1)
&& TREE_CODE (vro->op2) == INTEGER_CST)
{
- offset_int off = ((wi::to_offset (vro->op0)
- - wi::to_offset (vro->op1))
- * wi::to_offset (vro->op2)
- * vn_ref_op_align_unit (vro));
- if (wi::fits_shwi_p (off))
- vro->off = off.to_shwi ();
+ poly_offset_int off = ((wi::to_poly_offset (vro->op0)
+ - wi::to_poly_offset (vro->op1))
+ * wi::to_offset (vro->op2)
+ * vn_ref_op_align_unit (vro));
+ off.to_shwi (&vro->off);
}
}
@@ -1821,10 +1814,11 @@ vn_reference_lookup_3 (ao_ref *ref, tree
vn_reference_t vr = (vn_reference_t)vr_;
gimple *def_stmt = SSA_NAME_DEF_STMT (vuse);
tree base = ao_ref_base (ref);
- HOST_WIDE_INT offset, maxsize;
+ HOST_WIDE_INT offseti, maxsizei;
static vec<vn_reference_op_s> lhs_ops;
ao_ref lhs_ref;
bool lhs_ref_ok = false;
+ poly_int64 copy_size;
/* If the reference is based on a parameter that was determined as
pointing to readonly memory it doesn't change. */
@@ -1903,14 +1897,14 @@ vn_reference_lookup_3 (ao_ref *ref, tree
if (*disambiguate_only)
return (void *)-1;
- offset = ref->offset;
- maxsize = ref->max_size;
-
/* If we cannot constrain the size of the reference we cannot
test if anything kills it. */
- if (maxsize == -1)
+ if (!ref->max_size_known_p ())
return (void *)-1;
+ poly_int64 offset = ref->offset;
+ poly_int64 maxsize = ref->max_size;
+
/* We can't deduce anything useful from clobbers. */
if (gimple_clobber_p (def_stmt))
return (void *)-1;
@@ -1921,7 +1915,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree
if (is_gimple_reg_type (vr->type)
&& gimple_call_builtin_p (def_stmt, BUILT_IN_MEMSET)
&& integer_zerop (gimple_call_arg (def_stmt, 1))
- && tree_fits_uhwi_p (gimple_call_arg (def_stmt, 2))
+ && poly_int_tree_p (gimple_call_arg (def_stmt, 2))
&& TREE_CODE (gimple_call_arg (def_stmt, 0)) == ADDR_EXPR)
{
tree ref2 = TREE_OPERAND (gimple_call_arg (def_stmt, 0), 0);
@@ -1930,13 +1924,11 @@ vn_reference_lookup_3 (ao_ref *ref, tree
bool reverse;
base2 = get_ref_base_and_extent (ref2, &offset2, &size2, &maxsize2,
&reverse);
- size2 = tree_to_uhwi (gimple_call_arg (def_stmt, 2)) * 8;
- if ((unsigned HOST_WIDE_INT)size2 / 8
- == tree_to_uhwi (gimple_call_arg (def_stmt, 2))
- && maxsize2 != -1
+ tree len = gimple_call_arg (def_stmt, 2);
+ if (known_size_p (maxsize2)
&& operand_equal_p (base, base2, 0)
- && offset2 <= offset
- && offset2 + size2 >= offset + maxsize)
+ && known_subrange_p (offset, maxsize, offset2,
+ wi::to_poly_offset (len) << LOG2_BITS_PER_UNIT))
{
tree val = build_zero_cst (vr->type);
return vn_reference_lookup_or_insert_for_pieces
@@ -1955,10 +1947,9 @@ vn_reference_lookup_3 (ao_ref *ref, tree
bool reverse;
base2 = get_ref_base_and_extent (gimple_assign_lhs (def_stmt),
&offset2, &size2, &maxsize2, &reverse);
- if (maxsize2 != -1
+ if (known_size_p (maxsize2)
&& operand_equal_p (base, base2, 0)
- && offset2 <= offset
- && offset2 + size2 >= offset + maxsize)
+ && known_subrange_p (offset, maxsize, offset2, size2))
{
tree val = build_zero_cst (vr->type);
return vn_reference_lookup_or_insert_for_pieces
@@ -1968,13 +1959,17 @@ vn_reference_lookup_3 (ao_ref *ref, tree
/* 3) Assignment from a constant. We can use folds native encode/interpret
routines to extract the assigned bits. */
- else if (ref->size == maxsize
+ else if (must_eq (ref->size, maxsize)
&& is_gimple_reg_type (vr->type)
&& !contains_storage_order_barrier_p (vr->operands)
&& gimple_assign_single_p (def_stmt)
&& CHAR_BIT == 8 && BITS_PER_UNIT == 8
- && maxsize % BITS_PER_UNIT == 0
- && offset % BITS_PER_UNIT == 0
+ /* native_encode and native_decode operate on arrays of bytes
+ and so fundamentally need a compile-time size and offset. */
+ && maxsize.is_constant (&maxsizei)
+ && maxsizei % BITS_PER_UNIT == 0
+ && offset.is_constant (&offseti)
+ && offseti % BITS_PER_UNIT == 0
&& (is_gimple_min_invariant (gimple_assign_rhs1 (def_stmt))
|| (TREE_CODE (gimple_assign_rhs1 (def_stmt)) == SSA_NAME
&& is_gimple_min_invariant (SSA_VAL (gimple_assign_rhs1 (def_stmt))))))
@@ -1990,8 +1985,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree
&& size2 % BITS_PER_UNIT == 0
&& offset2 % BITS_PER_UNIT == 0
&& operand_equal_p (base, base2, 0)
- && offset2 <= offset
- && offset2 + size2 >= offset + maxsize)
+ && known_subrange_p (offseti, maxsizei, offset2, size2))
{
/* We support up to 512-bit values (for V8DFmode). */
unsigned char buffer[64];
@@ -2008,14 +2002,14 @@ vn_reference_lookup_3 (ao_ref *ref, tree
/* Make sure to interpret in a type that has a range
covering the whole access size. */
if (INTEGRAL_TYPE_P (vr->type)
- && ref->size != TYPE_PRECISION (vr->type))
- type = build_nonstandard_integer_type (ref->size,
+ && maxsizei != TYPE_PRECISION (vr->type))
+ type = build_nonstandard_integer_type (maxsizei,
TYPE_UNSIGNED (type));
tree val = native_interpret_expr (type,
buffer
- + ((offset - offset2)
+ + ((offseti - offset2)
/ BITS_PER_UNIT),
- ref->size / BITS_PER_UNIT);
+ maxsizei / BITS_PER_UNIT);
/* If we chop off bits because the types precision doesn't
match the memory access size this is ok when optimizing
reads but not when called from the DSE code during
@@ -2038,7 +2032,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree
/* 4) Assignment from an SSA name which definition we may be able
to access pieces from. */
- else if (ref->size == maxsize
+ else if (must_eq (ref->size, maxsize)
&& is_gimple_reg_type (vr->type)
&& !contains_storage_order_barrier_p (vr->operands)
&& gimple_assign_single_p (def_stmt)
@@ -2054,15 +2048,14 @@ vn_reference_lookup_3 (ao_ref *ref, tree
&& maxsize2 != -1
&& maxsize2 == size2
&& operand_equal_p (base, base2, 0)
- && offset2 <= offset
- && offset2 + size2 >= offset + maxsize
+ && known_subrange_p (offset, maxsize, offset2, size2)
/* ??? We can't handle bitfield precision extracts without
either using an alternate type for the BIT_FIELD_REF and
then doing a conversion or possibly adjusting the offset
according to endianness. */
&& (! INTEGRAL_TYPE_P (vr->type)
- || ref->size == TYPE_PRECISION (vr->type))
- && ref->size % BITS_PER_UNIT == 0)
+ || must_eq (ref->size, TYPE_PRECISION (vr->type)))
+ && multiple_p (ref->size, BITS_PER_UNIT))
{
code_helper rcode = BIT_FIELD_REF;
tree ops[3];
@@ -2090,7 +2083,6 @@ vn_reference_lookup_3 (ao_ref *ref, tree
|| handled_component_p (gimple_assign_rhs1 (def_stmt))))
{
tree base2;
- HOST_WIDE_INT maxsize2;
int i, j, k;
auto_vec<vn_reference_op_s> rhs;
vn_reference_op_t vro;
@@ -2101,8 +2093,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree
/* See if the assignment kills REF. */
base2 = ao_ref_base (&lhs_ref);
- maxsize2 = lhs_ref.max_size;
- if (maxsize2 == -1
+ if (!lhs_ref.max_size_known_p ()
|| (base != base2
&& (TREE_CODE (base) != MEM_REF
|| TREE_CODE (base2) != MEM_REF
@@ -2129,15 +2120,15 @@ vn_reference_lookup_3 (ao_ref *ref, tree
may fail when comparing types for compatibility. But we really
don't care here - further lookups with the rewritten operands
will simply fail if we messed up types too badly. */
- HOST_WIDE_INT extra_off = 0;
+ poly_int64 extra_off = 0;
if (j == 0 && i >= 0
&& lhs_ops[0].opcode == MEM_REF
- && lhs_ops[0].off != -1)
+ && may_ne (lhs_ops[0].off, -1))
{
- if (lhs_ops[0].off == vr->operands[i].off)
+ if (must_eq (lhs_ops[0].off, vr->operands[i].off))
i--, j--;
else if (vr->operands[i].opcode == MEM_REF
- && vr->operands[i].off != -1)
+ && may_ne (vr->operands[i].off, -1))
{
extra_off = vr->operands[i].off - lhs_ops[0].off;
i--, j--;
@@ -2163,11 +2154,11 @@ vn_reference_lookup_3 (ao_ref *ref, tree
copy_reference_ops_from_ref (gimple_assign_rhs1 (def_stmt), &rhs);
/* Apply an extra offset to the inner MEM_REF of the RHS. */
- if (extra_off != 0)
+ if (maybe_nonzero (extra_off))
{
if (rhs.length () < 2
|| rhs[0].opcode != MEM_REF
- || rhs[0].off == -1)
+ || must_eq (rhs[0].off, -1))
return (void *)-1;
rhs[0].off += extra_off;
rhs[0].op0 = int_const_binop (PLUS_EXPR, rhs[0].op0,
@@ -2198,7 +2189,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree
if (!ao_ref_init_from_vn_reference (&r, vr->set, vr->type, vr->operands))
return (void *)-1;
/* This can happen with bitfields. */
- if (ref->size != r.size)
+ if (may_ne (ref->size, r.size))
return (void *)-1;
*ref = r;
@@ -2221,20 +2212,20 @@ vn_reference_lookup_3 (ao_ref *ref, tree
|| TREE_CODE (gimple_call_arg (def_stmt, 0)) == SSA_NAME)
&& (TREE_CODE (gimple_call_arg (def_stmt, 1)) == ADDR_EXPR
|| TREE_CODE (gimple_call_arg (def_stmt, 1)) == SSA_NAME)
- && tree_fits_uhwi_p (gimple_call_arg (def_stmt, 2)))
+ && poly_int_tree_p (gimple_call_arg (def_stmt, 2), ©_size))
{
tree lhs, rhs;
ao_ref r;
- HOST_WIDE_INT rhs_offset, copy_size, lhs_offset;
+ poly_int64 rhs_offset, lhs_offset;
vn_reference_op_s op;
- HOST_WIDE_INT at;
+ poly_uint64 mem_offset;
+ poly_int64 at, byte_maxsize;
/* Only handle non-variable, addressable refs. */
- if (ref->size != maxsize
- || offset % BITS_PER_UNIT != 0
- || ref->size % BITS_PER_UNIT != 0)
+ if (may_ne (ref->size, maxsize)
+ || !multiple_p (offset, BITS_PER_UNIT, &at)
+ || !multiple_p (maxsize, BITS_PER_UNIT, &byte_maxsize))
return (void *)-1;
- at = offset / BITS_PER_UNIT;
/* Extract a pointer base and an offset for the destination. */
lhs = gimple_call_arg (def_stmt, 0);
@@ -2252,17 +2243,19 @@ vn_reference_lookup_3 (ao_ref *ref, tree
}
if (TREE_CODE (lhs) == ADDR_EXPR)
{
+ HOST_WIDE_INT tmp_lhs_offset;
tree tem = get_addr_base_and_unit_offset (TREE_OPERAND (lhs, 0),
- &lhs_offset);
+ &tmp_lhs_offset);
+ lhs_offset = tmp_lhs_offset;
if (!tem)
return (void *)-1;
if (TREE_CODE (tem) == MEM_REF
- && tree_fits_uhwi_p (TREE_OPERAND (tem, 1)))
+ && poly_int_tree_p (TREE_OPERAND (tem, 1), &mem_offset))
{
lhs = TREE_OPERAND (tem, 0);
if (TREE_CODE (lhs) == SSA_NAME)
lhs = SSA_VAL (lhs);
- lhs_offset += tree_to_uhwi (TREE_OPERAND (tem, 1));
+ lhs_offset += mem_offset;
}
else if (DECL_P (tem))
lhs = build_fold_addr_expr (tem);
@@ -2280,15 +2273,17 @@ vn_reference_lookup_3 (ao_ref *ref, tree
rhs = SSA_VAL (rhs);
if (TREE_CODE (rhs) == ADDR_EXPR)
{
+ HOST_WIDE_INT tmp_rhs_offset;
tree tem = get_addr_base_and_unit_offset (TREE_OPERAND (rhs, 0),
- &rhs_offset);
+ &tmp_rhs_offset);
+ rhs_offset = tmp_rhs_offset;
if (!tem)
return (void *)-1;
if (TREE_CODE (tem) == MEM_REF
- && tree_fits_uhwi_p (TREE_OPERAND (tem, 1)))
+ && poly_int_tree_p (TREE_OPERAND (tem, 1), &mem_offset))
{
rhs = TREE_OPERAND (tem, 0);
- rhs_offset += tree_to_uhwi (TREE_OPERAND (tem, 1));
+ rhs_offset += mem_offset;
}
else if (DECL_P (tem))
rhs = build_fold_addr_expr (tem);
@@ -2299,15 +2294,13 @@ vn_reference_lookup_3 (ao_ref *ref, tree
&& TREE_CODE (rhs) != ADDR_EXPR)
return (void *)-1;
- copy_size = tree_to_uhwi (gimple_call_arg (def_stmt, 2));
-
/* The bases of the destination and the references have to agree. */
if (TREE_CODE (base) == MEM_REF)
{
if (TREE_OPERAND (base, 0) != lhs
- || !tree_fits_uhwi_p (TREE_OPERAND (base, 1)))
+ || !poly_int_tree_p (TREE_OPERAND (base, 1), &mem_offset))
return (void *) -1;
- at += tree_to_uhwi (TREE_OPERAND (base, 1));
+ at += mem_offset;
}
else if (!DECL_P (base)
|| TREE_CODE (lhs) != ADDR_EXPR
@@ -2316,12 +2309,10 @@ vn_reference_lookup_3 (ao_ref *ref, tree
/* If the access is completely outside of the memcpy destination
area there is no aliasing. */
- if (lhs_offset >= at + maxsize / BITS_PER_UNIT
- || lhs_offset + copy_size <= at)
+ if (!ranges_may_overlap_p (lhs_offset, copy_size, at, byte_maxsize))
return NULL;
/* And the access has to be contained within the memcpy destination. */
- if (lhs_offset > at
- || lhs_offset + copy_size < at + maxsize / BITS_PER_UNIT)
+ if (!known_subrange_p (at, byte_maxsize, lhs_offset, copy_size))
return (void *)-1;
/* Make room for 2 operands in the new reference. */
@@ -2359,7 +2350,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree
if (!ao_ref_init_from_vn_reference (&r, vr->set, vr->type, vr->operands))
return (void *)-1;
/* This can happen with bitfields. */
- if (ref->size != r.size)
+ if (may_ne (ref->size, r.size))
return (void *)-1;
*ref = r;
Index: gcc/tree-ssa-uninit.c
===================================================================
--- gcc/tree-ssa-uninit.c 2017-10-23 16:52:20.058356365 +0100
+++ gcc/tree-ssa-uninit.c 2017-10-23 17:01:52.305178291 +0100
@@ -294,15 +294,15 @@ warn_uninitialized_vars (bool warn_possi
/* Do not warn if the access is fully outside of the
variable. */
+ poly_int64 decl_size;
if (DECL_P (base)
- && ref.size != -1
- && ref.max_size == ref.size
- && (ref.offset + ref.size <= 0
- || (ref.offset >= 0
+ && known_size_p (ref.size)
+ && must_eq (ref.max_size, ref.size)
+ && (must_le (ref.offset + ref.size, 0)
+ || (must_ge (ref.offset, 0)
&& DECL_SIZE (base)
- && TREE_CODE (DECL_SIZE (base)) == INTEGER_CST
- && compare_tree_int (DECL_SIZE (base),
- ref.offset) <= 0)))
+ && poly_int_tree_p (DECL_SIZE (base), &decl_size)
+ && must_le (decl_size, ref.offset))))
continue;
/* Do not warn if the access is then used for a BIT_INSERT_EXPR. */
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [015/nnn] poly_int: ao_ref and vn_reference_op_t
2017-10-23 17:06 ` [015/nnn] poly_int: ao_ref and vn_reference_op_t Richard Sandiford
@ 2017-11-18 4:25 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-18 4:25 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:06 AM, Richard Sandiford wrote:
> This patch changes the offset, size and max_size fields
> of ao_ref from HOST_WIDE_INT to poly_int64 and propagates
> the change through the code that references it. This includes
> changing the off field of vn_reference_op_struct in the same way.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * inchash.h (inchash::hash::add_poly_int): New function.
> * tree-ssa-alias.h (ao_ref::offset, ao_ref::size, ao_ref::max_size):
> Use poly_int64 rather than HOST_WIDE_INT.
> (ao_ref::max_size_known_p): New function.
> * tree-ssa-sccvn.h (vn_reference_op_struct::off): Use poly_int64_pod
> rather than HOST_WIDE_INT.
> * tree-ssa-alias.c (ao_ref_base): Apply get_ref_base_and_extent
> to temporaries until its interface is adjusted to match.
> (ao_ref_init_from_ptr_and_size): Handle polynomial offsets and sizes.
> (aliasing_component_refs_p, decl_refs_may_alias_p)
> (indirect_ref_may_alias_decl_p, indirect_refs_may_alias_p): Take
> the offsets and max_sizes as poly_int64s instead of HOST_WIDE_INTs.
> (refs_may_alias_p_1, stmt_kills_ref_p): Adjust for changes to
> ao_ref fields.
> * alias.c (ao_ref_from_mem): Likewise.
> * tree-ssa-dce.c (mark_aliased_reaching_defs_necessary_1): Likewise.
> * tree-ssa-dse.c (valid_ao_ref_for_dse, normalize_ref)
> (clear_bytes_written_by, setup_live_bytes_from_ref, compute_trims)
> (maybe_trim_complex_store, maybe_trim_constructor_store)
> (live_bytes_read, dse_classify_store): Likewise.
> * tree-ssa-sccvn.c (vn_reference_compute_hash, vn_reference_eq):
> (copy_reference_ops_from_ref, ao_ref_init_from_vn_reference)
> (fully_constant_vn_reference_p, valueize_refs_1): Likewise.
> (vn_reference_lookup_3): Likewise.
> * tree-ssa-uninit.c (warn_uninitialized_vars): Likewise.
It looks like this patch contains more changes from ranges_overlap_p to
ranges_may_overlap_p. They aren't noted in the ChangeLog.
As I look at these patches my worry is we're probably going to need some
guidance in our documentation for when to use the poly interfaces.
Certainly the type system helps here, so someone changing existing code
will likely get errors at compile time if they goof. But in larger
chunks of new code I won't be surprised if problems creep in until folks
adjust existing habits.
As is often the case, there's a certain amount of trust here that you
evaluated the may/must stuff correctly. It was fairly easy for me to
look at the tree-ssa-dse.c changes and see the intent as I'm reasonably
familiar with that code. tree-ssa-sccvn (as an example) is much
harder for me to evaluate.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [014/nnn] poly_int: indirect_refs_may_alias_p
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (13 preceding siblings ...)
2017-10-23 17:06 ` [015/nnn] poly_int: ao_ref and vn_reference_op_t Richard Sandiford
@ 2017-10-23 17:06 ` Richard Sandiford
2017-11-17 18:11 ` Jeff Law
2017-10-23 17:07 ` [017/nnn] poly_int: rtx_addr_can_trap_p_1 Richard Sandiford
` (92 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:06 UTC (permalink / raw)
To: gcc-patches
This patch makes indirect_refs_may_alias_p use ranges_may_overlap_p
rather than ranges_overlap_p. Unlike the former, the latter can handle
negative offsets, so the fix for PR44852 should no longer be necessary.
It can also handle offset_int, so avoids unchecked truncations to
HOST_WIDE_INT.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-ssa-alias.c (indirect_ref_may_alias_decl_p)
(indirect_refs_may_alias_p): Use ranges_may_overlap_p
instead of ranges_overlap_p.
Index: gcc/tree-ssa-alias.c
===================================================================
--- gcc/tree-ssa-alias.c 2017-10-23 17:01:49.579064221 +0100
+++ gcc/tree-ssa-alias.c 2017-10-23 17:01:51.044974644 +0100
@@ -1135,23 +1135,13 @@ indirect_ref_may_alias_decl_p (tree ref1
{
tree ptr1;
tree ptrtype1, dbase2;
- HOST_WIDE_INT offset1p = offset1, offset2p = offset2;
- HOST_WIDE_INT doffset1, doffset2;
gcc_checking_assert ((TREE_CODE (base1) == MEM_REF
|| TREE_CODE (base1) == TARGET_MEM_REF)
&& DECL_P (base2));
ptr1 = TREE_OPERAND (base1, 0);
-
- /* The offset embedded in MEM_REFs can be negative. Bias them
- so that the resulting offset adjustment is positive. */
- offset_int moff = mem_ref_offset (base1);
- moff <<= LOG2_BITS_PER_UNIT;
- if (wi::neg_p (moff))
- offset2p += (-moff).to_short_addr ();
- else
- offset1p += moff.to_short_addr ();
+ offset_int moff = mem_ref_offset (base1) << LOG2_BITS_PER_UNIT;
/* If only one reference is based on a variable, they cannot alias if
the pointer access is beyond the extent of the variable access.
@@ -1160,7 +1150,7 @@ indirect_ref_may_alias_decl_p (tree ref1
??? IVOPTs creates bases that do not honor this restriction,
so do not apply this optimization for TARGET_MEM_REFs. */
if (TREE_CODE (base1) != TARGET_MEM_REF
- && !ranges_overlap_p (MAX (0, offset1p), -1, offset2p, max_size2))
+ && !ranges_may_overlap_p (offset1 + moff, -1, offset2, max_size2))
return false;
/* They also cannot alias if the pointer may not point to the decl. */
if (!ptr_deref_may_alias_decl_p (ptr1, base2))
@@ -1213,18 +1203,11 @@ indirect_ref_may_alias_decl_p (tree ref1
dbase2 = ref2;
while (handled_component_p (dbase2))
dbase2 = TREE_OPERAND (dbase2, 0);
- doffset1 = offset1;
- doffset2 = offset2;
+ HOST_WIDE_INT doffset1 = offset1;
+ offset_int doffset2 = offset2;
if (TREE_CODE (dbase2) == MEM_REF
|| TREE_CODE (dbase2) == TARGET_MEM_REF)
- {
- offset_int moff = mem_ref_offset (dbase2);
- moff <<= LOG2_BITS_PER_UNIT;
- if (wi::neg_p (moff))
- doffset1 -= (-moff).to_short_addr ();
- else
- doffset2 -= moff.to_short_addr ();
- }
+ doffset2 -= mem_ref_offset (dbase2) << LOG2_BITS_PER_UNIT;
/* If either reference is view-converted, give up now. */
if (same_type_for_tbaa (TREE_TYPE (base1), TREE_TYPE (ptrtype1)) != 1
@@ -1241,7 +1224,7 @@ indirect_ref_may_alias_decl_p (tree ref1
if ((TREE_CODE (base1) != TARGET_MEM_REF
|| (!TMR_INDEX (base1) && !TMR_INDEX2 (base1)))
&& same_type_for_tbaa (TREE_TYPE (base1), TREE_TYPE (dbase2)) == 1)
- return ranges_overlap_p (doffset1, max_size1, doffset2, max_size2);
+ return ranges_may_overlap_p (doffset1, max_size1, doffset2, max_size2);
if (ref1 && ref2
&& nonoverlapping_component_refs_p (ref1, ref2))
@@ -1313,22 +1296,10 @@ indirect_refs_may_alias_p (tree ref1 ATT
&& operand_equal_p (TMR_INDEX2 (base1),
TMR_INDEX2 (base2), 0))))))
{
- offset_int moff;
- /* The offset embedded in MEM_REFs can be negative. Bias them
- so that the resulting offset adjustment is positive. */
- moff = mem_ref_offset (base1);
- moff <<= LOG2_BITS_PER_UNIT;
- if (wi::neg_p (moff))
- offset2 += (-moff).to_short_addr ();
- else
- offset1 += moff.to_shwi ();
- moff = mem_ref_offset (base2);
- moff <<= LOG2_BITS_PER_UNIT;
- if (wi::neg_p (moff))
- offset1 += (-moff).to_short_addr ();
- else
- offset2 += moff.to_short_addr ();
- return ranges_overlap_p (offset1, max_size1, offset2, max_size2);
+ offset_int moff1 = mem_ref_offset (base1) << LOG2_BITS_PER_UNIT;
+ offset_int moff2 = mem_ref_offset (base2) << LOG2_BITS_PER_UNIT;
+ return ranges_may_overlap_p (offset1 + moff1, max_size1,
+ offset2 + moff2, max_size2);
}
if (!ptr_derefs_may_alias_p (ptr1, ptr2))
return false;
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [014/nnn] poly_int: indirect_refs_may_alias_p
2017-10-23 17:06 ` [014/nnn] poly_int: indirect_refs_may_alias_p Richard Sandiford
@ 2017-11-17 18:11 ` Jeff Law
2017-11-20 13:31 ` Richard Sandiford
0 siblings, 1 reply; 302+ messages in thread
From: Jeff Law @ 2017-11-17 18:11 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:05 AM, Richard Sandiford wrote:
> This patch makes indirect_refs_may_alias_p use ranges_may_overlap_p
> rather than ranges_overlap_p. Unlike the former, the latter can handle
> negative offsets, so the fix for PR44852 should no longer be necessary.
> It can also handle offset_int, so avoids unchecked truncations to
> HOST_WIDE_INT.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * tree-ssa-alias.c (indirect_ref_may_alias_decl_p)
> (indirect_refs_may_alias_p): Use ranges_may_overlap_p
> instead of ranges_overlap_p.
OK.
Note that this highlighted a nit in patch 001 -- namely that there's new
function templates that aren't mentioned in the ChangeLog.
Jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [014/nnn] poly_int: indirect_refs_may_alias_p
2017-11-17 18:11 ` Jeff Law
@ 2017-11-20 13:31 ` Richard Sandiford
2017-11-21 0:49 ` Jeff Law
0 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-11-20 13:31 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches
Jeff Law <law@redhat.com> writes:
> On 10/23/2017 11:05 AM, Richard Sandiford wrote:
>> This patch makes indirect_refs_may_alias_p use ranges_may_overlap_p
>> rather than ranges_overlap_p. Unlike the former, the latter can handle
>> negative offsets, so the fix for PR44852 should no longer be necessary.
>> It can also handle offset_int, so avoids unchecked truncations to
>> HOST_WIDE_INT.
>>
>>
>> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>> Alan Hayward <alan.hayward@arm.com>
>> David Sherwood <david.sherwood@arm.com>
>>
>> gcc/
>> * tree-ssa-alias.c (indirect_ref_may_alias_decl_p)
>> (indirect_refs_may_alias_p): Use ranges_may_overlap_p
>> instead of ranges_overlap_p.
> OK.
>
> Note that this highlighted a nit in patch 001 -- namely that there's new
> function templates that aren't mentioned in the ChangeLog.
Do you mean ranges_may_overlap_p? I can add that and the other new
poly-int.h functions to the changelog if you think it's useful,
but I thought for new files it was more usual just to do:
* foo.h: New file.
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [014/nnn] poly_int: indirect_refs_may_alias_p
2017-11-20 13:31 ` Richard Sandiford
@ 2017-11-21 0:49 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-21 0:49 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 11/20/2017 06:00 AM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 10/23/2017 11:05 AM, Richard Sandiford wrote:
>>> This patch makes indirect_refs_may_alias_p use ranges_may_overlap_p
>>> rather than ranges_overlap_p. Unlike the former, the latter can handle
>>> negative offsets, so the fix for PR44852 should no longer be necessary.
>>> It can also handle offset_int, so avoids unchecked truncations to
>>> HOST_WIDE_INT.
>>>
>>>
>>> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>>> Alan Hayward <alan.hayward@arm.com>
>>> David Sherwood <david.sherwood@arm.com>
>>>
>>> gcc/
>>> * tree-ssa-alias.c (indirect_ref_may_alias_decl_p)
>>> (indirect_refs_may_alias_p): Use ranges_may_overlap_p
>>> instead of ranges_overlap_p.
>> OK.
>>
>> Note that this highlighted a nit in patch 001 -- namely that there's new
>> function templates that aren't mentioned in the ChangeLog.
>
> Do you mean ranges_may_overlap_p? I can add that and the other new
> poly-int.h functions to the changelog if you think it's useful,
> but I thought for new files it was more usual just to do:
>
> * foo.h: New file.
That's fine. I was just having trouble finding it when I wanted to look
at it. My mailer won't unwrap a compressed patch :-)
Jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [017/nnn] poly_int: rtx_addr_can_trap_p_1
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (14 preceding siblings ...)
2017-10-23 17:06 ` [014/nnn] poly_int: indirect_refs_may_alias_p Richard Sandiford
@ 2017-10-23 17:07 ` Richard Sandiford
2017-11-18 4:46 ` Jeff Law
2017-10-23 17:07 ` [016/nnn] poly_int: dse.c Richard Sandiford
` (91 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:07 UTC (permalink / raw)
To: gcc-patches
This patch changes the offset and size arguments of
rtx_addr_can_trap_p_1 from HOST_WIDE_INT to poly_int64. It also
uses a size of -1 rather than 0 to represent an unknown size and
BLKmode rather than VOIDmode to represent an unknown mode.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* rtlanal.c (rtx_addr_can_trap_p_1): Take the offset and size
as poly_int64s rather than HOST_WIDE_INTs. Use a size of -1
rather than 0 to represent an unknown size. Assert that the size
is known when the mode isn't BLKmode.
(may_trap_p_1): Use -1 for unknown sizes.
(rtx_addr_can_trap_p): Likewise. Pass BLKmode rather than VOIDmode.
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c 2017-10-23 17:00:54.444001238 +0100
+++ gcc/rtlanal.c 2017-10-23 17:01:55.453690255 +0100
@@ -457,16 +457,17 @@ get_initial_register_offset (int from, i
references on strict alignment machines. */
static int
-rtx_addr_can_trap_p_1 (const_rtx x, HOST_WIDE_INT offset, HOST_WIDE_INT size,
+rtx_addr_can_trap_p_1 (const_rtx x, poly_int64 offset, poly_int64 size,
machine_mode mode, bool unaligned_mems)
{
enum rtx_code code = GET_CODE (x);
+ gcc_checking_assert (mode == BLKmode || known_size_p (size));
/* The offset must be a multiple of the mode size if we are considering
unaligned memory references on strict alignment machines. */
- if (STRICT_ALIGNMENT && unaligned_mems && GET_MODE_SIZE (mode) != 0)
+ if (STRICT_ALIGNMENT && unaligned_mems && mode != BLKmode)
{
- HOST_WIDE_INT actual_offset = offset;
+ poly_int64 actual_offset = offset;
#ifdef SPARC_STACK_BOUNDARY_HACK
/* ??? The SPARC port may claim a STACK_BOUNDARY higher than
@@ -477,7 +478,7 @@ rtx_addr_can_trap_p_1 (const_rtx x, HOST
actual_offset -= STACK_POINTER_OFFSET;
#endif
- if (actual_offset % GET_MODE_SIZE (mode) != 0)
+ if (!multiple_p (actual_offset, GET_MODE_SIZE (mode)))
return 1;
}
@@ -489,14 +490,14 @@ rtx_addr_can_trap_p_1 (const_rtx x, HOST
if (!CONSTANT_POOL_ADDRESS_P (x) && !SYMBOL_REF_FUNCTION_P (x))
{
tree decl;
- HOST_WIDE_INT decl_size;
+ poly_int64 decl_size;
- if (offset < 0)
+ if (may_lt (offset, 0))
+ return 1;
+ if (known_zero (offset))
+ return 0;
+ if (!known_size_p (size))
return 1;
- if (size == 0)
- size = GET_MODE_SIZE (mode);
- if (size == 0)
- return offset != 0;
/* If the size of the access or of the symbol is unknown,
assume the worst. */
@@ -507,9 +508,10 @@ rtx_addr_can_trap_p_1 (const_rtx x, HOST
if (!decl)
decl_size = -1;
else if (DECL_P (decl) && DECL_SIZE_UNIT (decl))
- decl_size = (tree_fits_shwi_p (DECL_SIZE_UNIT (decl))
- ? tree_to_shwi (DECL_SIZE_UNIT (decl))
- : -1);
+ {
+ if (!poly_int_tree_p (DECL_SIZE_UNIT (decl), &decl_size))
+ decl_size = -1;
+ }
else if (TREE_CODE (decl) == STRING_CST)
decl_size = TREE_STRING_LENGTH (decl);
else if (TYPE_SIZE_UNIT (TREE_TYPE (decl)))
@@ -517,7 +519,7 @@ rtx_addr_can_trap_p_1 (const_rtx x, HOST
else
decl_size = -1;
- return (decl_size <= 0 ? offset != 0 : offset + size > decl_size);
+ return !known_subrange_p (offset, size, 0, decl_size);
}
return 0;
@@ -534,17 +536,14 @@ rtx_addr_can_trap_p_1 (const_rtx x, HOST
|| (x == arg_pointer_rtx && fixed_regs[ARG_POINTER_REGNUM]))
{
#ifdef RED_ZONE_SIZE
- HOST_WIDE_INT red_zone_size = RED_ZONE_SIZE;
+ poly_int64 red_zone_size = RED_ZONE_SIZE;
#else
- HOST_WIDE_INT red_zone_size = 0;
+ poly_int64 red_zone_size = 0;
#endif
- HOST_WIDE_INT stack_boundary = PREFERRED_STACK_BOUNDARY
- / BITS_PER_UNIT;
- HOST_WIDE_INT low_bound, high_bound;
-
- if (size == 0)
- size = GET_MODE_SIZE (mode);
- if (size == 0)
+ poly_int64 stack_boundary = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
+ poly_int64 low_bound, high_bound;
+
+ if (!known_size_p (size))
return 1;
if (x == frame_pointer_rtx)
@@ -562,10 +561,10 @@ rtx_addr_can_trap_p_1 (const_rtx x, HOST
}
else if (x == hard_frame_pointer_rtx)
{
- HOST_WIDE_INT sp_offset
+ poly_int64 sp_offset
= get_initial_register_offset (STACK_POINTER_REGNUM,
HARD_FRAME_POINTER_REGNUM);
- HOST_WIDE_INT ap_offset
+ poly_int64 ap_offset
= get_initial_register_offset (ARG_POINTER_REGNUM,
HARD_FRAME_POINTER_REGNUM);
@@ -589,7 +588,7 @@ rtx_addr_can_trap_p_1 (const_rtx x, HOST
}
else if (x == stack_pointer_rtx)
{
- HOST_WIDE_INT ap_offset
+ poly_int64 ap_offset
= get_initial_register_offset (ARG_POINTER_REGNUM,
STACK_POINTER_REGNUM);
@@ -629,7 +628,8 @@ rtx_addr_can_trap_p_1 (const_rtx x, HOST
#endif
}
- if (offset >= low_bound && offset <= high_bound - size)
+ if (must_ge (offset, low_bound)
+ && must_le (offset, high_bound - size))
return 0;
return 1;
}
@@ -649,7 +649,7 @@ rtx_addr_can_trap_p_1 (const_rtx x, HOST
if (XEXP (x, 0) == pic_offset_table_rtx
&& GET_CODE (XEXP (x, 1)) == CONST
&& GET_CODE (XEXP (XEXP (x, 1), 0)) == UNSPEC
- && offset == 0)
+ && known_zero (offset))
return 0;
/* - or it is an address that can't trap plus a constant integer. */
@@ -686,7 +686,7 @@ rtx_addr_can_trap_p_1 (const_rtx x, HOST
int
rtx_addr_can_trap_p (const_rtx x)
{
- return rtx_addr_can_trap_p_1 (x, 0, 0, VOIDmode, false);
+ return rtx_addr_can_trap_p_1 (x, 0, -1, BLKmode, false);
}
/* Return true if X contains a MEM subrtx. */
@@ -2796,7 +2796,7 @@ may_trap_p_1 (const_rtx x, unsigned flag
code_changed
|| !MEM_NOTRAP_P (x))
{
- HOST_WIDE_INT size = MEM_SIZE_KNOWN_P (x) ? MEM_SIZE (x) : 0;
+ HOST_WIDE_INT size = MEM_SIZE_KNOWN_P (x) ? MEM_SIZE (x) : -1;
return rtx_addr_can_trap_p_1 (XEXP (x, 0), 0, size,
GET_MODE (x), code_changed);
}
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [017/nnn] poly_int: rtx_addr_can_trap_p_1
2017-10-23 17:07 ` [017/nnn] poly_int: rtx_addr_can_trap_p_1 Richard Sandiford
@ 2017-11-18 4:46 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-18 4:46 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:07 AM, Richard Sandiford wrote:
> This patch changes the offset and size arguments of
> rtx_addr_can_trap_p_1 from HOST_WIDE_INT to poly_int64. It also
> uses a size of -1 rather than 0 to represent an unknown size and
> BLKmode rather than VOIDmode to represent an unknown mode.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * rtlanal.c (rtx_addr_can_trap_p_1): Take the offset and size
> as poly_int64s rather than HOST_WIDE_INTs. Use a size of -1
> rather than 0 to represent an unknown size. Assert that the size
> is known when the mode isn't BLKmode.
> (may_trap_p_1): Use -1 for unknown sizes.
> (rtx_addr_can_trap_p): Likewise. Pass BLKmode rather than VOIDmode.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [016/nnn] poly_int: dse.c
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (15 preceding siblings ...)
2017-10-23 17:07 ` [017/nnn] poly_int: rtx_addr_can_trap_p_1 Richard Sandiford
@ 2017-10-23 17:07 ` Richard Sandiford
2017-11-18 4:30 ` Jeff Law
2017-10-23 17:08 ` [020/nnn] poly_int: store_bit_field bitrange Richard Sandiford
` (90 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:07 UTC (permalink / raw)
To: gcc-patches
This patch makes RTL DSE use poly_int for offsets and sizes.
The local phase can optimise them normally but the global phase
treats them as wild accesses.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* dse.c (store_info): Change offset and width from HOST_WIDE_INT
to poly_int64. Update commentary for positions_needed.large.
(read_info_type): Change offset and width from HOST_WIDE_INT
to poly_int64.
(set_usage_bits): Likewise.
(canon_address): Return the offset as a poly_int64 rather than
a HOST_WIDE_INT. Use strip_offset_and_add.
(set_all_positions_unneeded, any_positions_needed_p): Use
positions_needed.large to track stores with non-constant widths.
(all_positions_needed_p): Likewise. Take the offset and width
as poly_int64s rather than ints. Assert that rhs is nonnull.
(record_store): Cope with non-constant offsets and widths.
Nullify the rhs of an earlier store if we can't tell which bytes
of it are needed.
(find_shift_sequence): Take the access_size and shift as poly_int64s
rather than ints.
(get_stored_val): Take the read_offset and read_width as poly_int64s
rather than HOST_WIDE_INTs.
(check_mem_read_rtx, scan_stores, scan_reads, dse_step5): Handle
non-constant offsets and widths.
Index: gcc/dse.c
===================================================================
--- gcc/dse.c 2017-10-23 16:52:20.003305798 +0100
+++ gcc/dse.c 2017-10-23 17:01:54.249406896 +0100
@@ -244,11 +244,11 @@ struct store_info
rtx mem_addr;
/* The offset of the first byte associated with the operation. */
- HOST_WIDE_INT offset;
+ poly_int64 offset;
/* The number of bytes covered by the operation. This is always exact
and known (rather than -1). */
- HOST_WIDE_INT width;
+ poly_int64 width;
union
{
@@ -259,12 +259,19 @@ struct store_info
struct
{
- /* A bitmap with one bit per byte. Cleared bit means the position
- is needed. Used if IS_LARGE is false. */
+ /* A bitmap with one bit per byte, or null if the number of
+ bytes isn't known at compile time. A cleared bit means
+ the position is needed. Used if IS_LARGE is true. */
bitmap bmap;
- /* Number of set bits (i.e. unneeded bytes) in BITMAP. If it is
- equal to WIDTH, the whole store is unused. */
+ /* When BITMAP is nonnull, this counts the number of set bits
+ (i.e. unneeded bytes) in the bitmap. If it is equal to
+ WIDTH, the whole store is unused.
+
+ When BITMAP is null:
+ - the store is definitely not needed when COUNT == 1
+ - all the store is needed when COUNT == 0 and RHS is nonnull
+ - otherwise we don't know which parts of the store are needed. */
int count;
} large;
} positions_needed;
@@ -308,10 +315,10 @@ struct read_info_type
int group_id;
/* The offset of the first byte associated with the operation. */
- HOST_WIDE_INT offset;
+ poly_int64 offset;
/* The number of bytes covered by the operation, or -1 if not known. */
- HOST_WIDE_INT width;
+ poly_int64 width;
/* The mem being read. */
rtx mem;
@@ -940,13 +947,18 @@ can_escape (tree expr)
OFFSET and WIDTH. */
static void
-set_usage_bits (group_info *group, HOST_WIDE_INT offset, HOST_WIDE_INT width,
+set_usage_bits (group_info *group, poly_int64 offset, poly_int64 width,
tree expr)
{
- HOST_WIDE_INT i;
+ /* Non-constant offsets and widths act as global kills, so there's no point
+ trying to use them to derive global DSE candidates. */
+ HOST_WIDE_INT i, const_offset, const_width;
bool expr_escapes = can_escape (expr);
- if (offset > -MAX_OFFSET && offset + width < MAX_OFFSET)
- for (i=offset; i<offset+width; i++)
+ if (offset.is_constant (&const_offset)
+ && width.is_constant (&const_width)
+ && const_offset > -MAX_OFFSET
+ && const_offset + const_width < MAX_OFFSET)
+ for (i = const_offset; i < const_offset + const_width; ++i)
{
bitmap store1;
bitmap store2;
@@ -1080,7 +1092,7 @@ const_or_frame_p (rtx x)
static bool
canon_address (rtx mem,
int *group_id,
- HOST_WIDE_INT *offset,
+ poly_int64 *offset,
cselib_val **base)
{
machine_mode address_mode = get_address_mode (mem);
@@ -1147,12 +1159,7 @@ canon_address (rtx mem,
if (GET_CODE (address) == CONST)
address = XEXP (address, 0);
- if (GET_CODE (address) == PLUS
- && CONST_INT_P (XEXP (address, 1)))
- {
- *offset = INTVAL (XEXP (address, 1));
- address = XEXP (address, 0);
- }
+ address = strip_offset_and_add (address, offset);
if (ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (mem))
&& const_or_frame_p (address))
@@ -1160,8 +1167,11 @@ canon_address (rtx mem,
group_info *group = get_group_info (address);
if (dump_file && (dump_flags & TDF_DETAILS))
- fprintf (dump_file, " gid=%d offset=%d \n",
- group->id, (int)*offset);
+ {
+ fprintf (dump_file, " gid=%d offset=", group->id);
+ print_dec (*offset, dump_file);
+ fprintf (dump_file, "\n");
+ }
*base = NULL;
*group_id = group->id;
return true;
@@ -1178,8 +1188,12 @@ canon_address (rtx mem,
return false;
}
if (dump_file && (dump_flags & TDF_DETAILS))
- fprintf (dump_file, " varying cselib base=%u:%u offset = %d\n",
- (*base)->uid, (*base)->hash, (int)*offset);
+ {
+ fprintf (dump_file, " varying cselib base=%u:%u offset = ",
+ (*base)->uid, (*base)->hash);
+ print_dec (*offset, dump_file);
+ fprintf (dump_file, "\n");
+ }
return true;
}
@@ -1228,9 +1242,17 @@ set_all_positions_unneeded (store_info *
{
if (__builtin_expect (s_info->is_large, false))
{
- bitmap_set_range (s_info->positions_needed.large.bmap,
- 0, s_info->width);
- s_info->positions_needed.large.count = s_info->width;
+ HOST_WIDE_INT width;
+ if (s_info->width.is_constant (&width))
+ {
+ bitmap_set_range (s_info->positions_needed.large.bmap, 0, width);
+ s_info->positions_needed.large.count = width;
+ }
+ else
+ {
+ gcc_checking_assert (!s_info->positions_needed.large.bmap);
+ s_info->positions_needed.large.count = 1;
+ }
}
else
s_info->positions_needed.small_bitmask = HOST_WIDE_INT_0U;
@@ -1242,35 +1264,64 @@ set_all_positions_unneeded (store_info *
any_positions_needed_p (store_info *s_info)
{
if (__builtin_expect (s_info->is_large, false))
- return s_info->positions_needed.large.count < s_info->width;
+ {
+ HOST_WIDE_INT width;
+ if (s_info->width.is_constant (&width))
+ {
+ gcc_checking_assert (s_info->positions_needed.large.bmap);
+ return s_info->positions_needed.large.count < width;
+ }
+ else
+ {
+ gcc_checking_assert (!s_info->positions_needed.large.bmap);
+ return s_info->positions_needed.large.count == 0;
+ }
+ }
else
return (s_info->positions_needed.small_bitmask != HOST_WIDE_INT_0U);
}
/* Return TRUE if all bytes START through START+WIDTH-1 from S_INFO
- store are needed. */
+ store are known to be needed. */
static inline bool
-all_positions_needed_p (store_info *s_info, int start, int width)
+all_positions_needed_p (store_info *s_info, poly_int64 start,
+ poly_int64 width)
{
+ gcc_assert (s_info->rhs);
+ if (!s_info->width.is_constant ())
+ {
+ gcc_assert (s_info->is_large
+ && !s_info->positions_needed.large.bmap);
+ return s_info->positions_needed.large.count == 0;
+ }
+
+ /* Otherwise, if START and WIDTH are non-constant, we're asking about
+ a non-constant region of a constant-sized store. We can't say for
+ sure that all positions are needed. */
+ HOST_WIDE_INT const_start, const_width;
+ if (!start.is_constant (&const_start)
+ || !width.is_constant (&const_width))
+ return false;
+
if (__builtin_expect (s_info->is_large, false))
{
- int end = start + width;
- while (start < end)
- if (bitmap_bit_p (s_info->positions_needed.large.bmap, start++))
+ for (HOST_WIDE_INT i = const_start; i < const_start + const_width; ++i)
+ if (bitmap_bit_p (s_info->positions_needed.large.bmap, i))
return false;
return true;
}
else
{
- unsigned HOST_WIDE_INT mask = lowpart_bitmask (width) << start;
+ unsigned HOST_WIDE_INT mask
+ = lowpart_bitmask (const_width) << const_start;
return (s_info->positions_needed.small_bitmask & mask) == mask;
}
}
-static rtx get_stored_val (store_info *, machine_mode, HOST_WIDE_INT,
- HOST_WIDE_INT, basic_block, bool);
+static rtx get_stored_val (store_info *, machine_mode, poly_int64,
+ poly_int64, basic_block, bool);
/* BODY is an instruction pattern that belongs to INSN. Return 1 if
@@ -1281,8 +1332,8 @@ static rtx get_stored_val (store_info *,
record_store (rtx body, bb_info_t bb_info)
{
rtx mem, rhs, const_rhs, mem_addr;
- HOST_WIDE_INT offset = 0;
- HOST_WIDE_INT width = 0;
+ poly_int64 offset = 0;
+ poly_int64 width = 0;
insn_info_t insn_info = bb_info->last_insn;
store_info *store_info = NULL;
int group_id;
@@ -1437,7 +1488,7 @@ record_store (rtx body, bb_info_t bb_inf
group_info *group = rtx_group_vec[group_id];
mem_addr = group->canon_base_addr;
}
- if (offset)
+ if (maybe_nonzero (offset))
mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);
while (ptr)
@@ -1497,18 +1548,27 @@ record_store (rtx body, bb_info_t bb_inf
}
}
+ HOST_WIDE_INT begin_unneeded, const_s_width, const_width;
if (known_subrange_p (s_info->offset, s_info->width, offset, width))
/* The new store touches every byte that S_INFO does. */
set_all_positions_unneeded (s_info);
- else
+ else if ((offset - s_info->offset).is_constant (&begin_unneeded)
+ && s_info->width.is_constant (&const_s_width)
+ && width.is_constant (&const_width))
{
- HOST_WIDE_INT begin_unneeded = offset - s_info->offset;
- HOST_WIDE_INT end_unneeded = begin_unneeded + width;
+ HOST_WIDE_INT end_unneeded = begin_unneeded + const_width;
begin_unneeded = MAX (begin_unneeded, 0);
- end_unneeded = MIN (end_unneeded, s_info->width);
+ end_unneeded = MIN (end_unneeded, const_s_width);
for (i = begin_unneeded; i < end_unneeded; ++i)
set_position_unneeded (s_info, i);
}
+ else
+ {
+ /* We don't know which parts of S_INFO are needed and
+ which aren't, so invalidate the RHS. */
+ s_info->rhs = NULL;
+ s_info->const_rhs = NULL;
+ }
}
else if (s_info->rhs)
/* Need to see if it is possible for this store to overwrite
@@ -1554,7 +1614,14 @@ record_store (rtx body, bb_info_t bb_inf
store_info->mem = mem;
store_info->mem_addr = mem_addr;
store_info->cse_base = base;
- if (width > HOST_BITS_PER_WIDE_INT)
+ HOST_WIDE_INT const_width;
+ if (!width.is_constant (&const_width))
+ {
+ store_info->is_large = true;
+ store_info->positions_needed.large.count = 0;
+ store_info->positions_needed.large.bmap = NULL;
+ }
+ else if (const_width > HOST_BITS_PER_WIDE_INT)
{
store_info->is_large = true;
store_info->positions_needed.large.count = 0;
@@ -1563,7 +1630,8 @@ record_store (rtx body, bb_info_t bb_inf
else
{
store_info->is_large = false;
- store_info->positions_needed.small_bitmask = lowpart_bitmask (width);
+ store_info->positions_needed.small_bitmask
+ = lowpart_bitmask (const_width);
}
store_info->group_id = group_id;
store_info->offset = offset;
@@ -1598,10 +1666,10 @@ dump_insn_info (const char * start, insn
shift. */
static rtx
-find_shift_sequence (int access_size,
+find_shift_sequence (poly_int64 access_size,
store_info *store_info,
machine_mode read_mode,
- int shift, bool speed, bool require_cst)
+ poly_int64 shift, bool speed, bool require_cst)
{
machine_mode store_mode = GET_MODE (store_info->mem);
scalar_int_mode new_mode;
@@ -1737,11 +1805,11 @@ look_for_hardregs (rtx x, const_rtx pat
static rtx
get_stored_val (store_info *store_info, machine_mode read_mode,
- HOST_WIDE_INT read_offset, HOST_WIDE_INT read_width,
+ poly_int64 read_offset, poly_int64 read_width,
basic_block bb, bool require_cst)
{
machine_mode store_mode = GET_MODE (store_info->mem);
- HOST_WIDE_INT gap;
+ poly_int64 gap;
rtx read_reg;
/* To get here the read is within the boundaries of the write so
@@ -1755,10 +1823,10 @@ get_stored_val (store_info *store_info,
else
gap = read_offset - store_info->offset;
- if (gap != 0)
+ if (maybe_nonzero (gap))
{
- HOST_WIDE_INT shift = gap * BITS_PER_UNIT;
- HOST_WIDE_INT access_size = GET_MODE_SIZE (read_mode) + gap;
+ poly_int64 shift = gap * BITS_PER_UNIT;
+ poly_int64 access_size = GET_MODE_SIZE (read_mode) + gap;
read_reg = find_shift_sequence (access_size, store_info, read_mode,
shift, optimize_bb_for_speed_p (bb),
require_cst);
@@ -1977,8 +2045,8 @@ check_mem_read_rtx (rtx *loc, bb_info_t
{
rtx mem = *loc, mem_addr;
insn_info_t insn_info;
- HOST_WIDE_INT offset = 0;
- HOST_WIDE_INT width = 0;
+ poly_int64 offset = 0;
+ poly_int64 width = 0;
cselib_val *base = NULL;
int group_id;
read_info_t read_info;
@@ -2027,7 +2095,7 @@ check_mem_read_rtx (rtx *loc, bb_info_t
group_info *group = rtx_group_vec[group_id];
mem_addr = group->canon_base_addr;
}
- if (offset)
+ if (maybe_nonzero (offset))
mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);
if (group_id >= 0)
@@ -2039,7 +2107,7 @@ check_mem_read_rtx (rtx *loc, bb_info_t
if (dump_file && (dump_flags & TDF_DETAILS))
{
- if (width == -1)
+ if (!known_size_p (width))
fprintf (dump_file, " processing const load gid=%d[BLK]\n",
group_id);
else
@@ -2073,7 +2141,7 @@ check_mem_read_rtx (rtx *loc, bb_info_t
{
/* This is a block mode load. We may get lucky and
canon_true_dependence may save the day. */
- if (width == -1)
+ if (!known_size_p (width))
remove
= canon_true_dependence (store_info->mem,
GET_MODE (store_info->mem),
@@ -2803,13 +2871,17 @@ scan_stores (store_info *store_info, bit
{
while (store_info)
{
- HOST_WIDE_INT i;
+ HOST_WIDE_INT i, offset, width;
group_info *group_info
= rtx_group_vec[store_info->group_id];
- if (group_info->process_globally)
+ /* We can (conservatively) ignore stores whose bounds aren't known;
+ they simply don't generate new global dse opportunities. */
+ if (group_info->process_globally
+ && store_info->offset.is_constant (&offset)
+ && store_info->width.is_constant (&width))
{
- HOST_WIDE_INT end = store_info->offset + store_info->width;
- for (i = store_info->offset; i < end; i++)
+ HOST_WIDE_INT end = offset + width;
+ for (i = offset; i < end; i++)
{
int index = get_bitmap_index (group_info, i);
if (index != 0)
@@ -2869,7 +2941,12 @@ scan_reads (insn_info_t insn_info, bitma
{
if (i == read_info->group_id)
{
- if (!known_size_p (read_info->width))
+ HOST_WIDE_INT offset, width;
+ /* Reads with non-constant size kill all DSE opportunities
+ in the group. */
+ if (!read_info->offset.is_constant (&offset)
+ || !read_info->width.is_constant (&width)
+ || !known_size_p (width))
{
/* Handle block mode reads. */
if (kill)
@@ -2881,8 +2958,8 @@ scan_reads (insn_info_t insn_info, bitma
/* The groups are the same, just process the
offsets. */
HOST_WIDE_INT j;
- HOST_WIDE_INT end = read_info->offset + read_info->width;
- for (j = read_info->offset; j < end; j++)
+ HOST_WIDE_INT end = offset + width;
+ for (j = offset; j < end; j++)
{
int index = get_bitmap_index (group, j);
if (index != 0)
@@ -3298,22 +3375,30 @@ dse_step5 (void)
while (!store_info->is_set)
store_info = store_info->next;
- HOST_WIDE_INT i;
+ HOST_WIDE_INT i, offset, width;
group_info *group_info = rtx_group_vec[store_info->group_id];
- HOST_WIDE_INT end = store_info->offset + store_info->width;
- for (i = store_info->offset; i < end; i++)
+ if (!store_info->offset.is_constant (&offset)
+ || !store_info->width.is_constant (&width))
+ deleted = false;
+ else
{
- int index = get_bitmap_index (group_info, i);
-
- if (dump_file && (dump_flags & TDF_DETAILS))
- fprintf (dump_file, "i = %d, index = %d\n", (int)i, index);
- if (index == 0 || !bitmap_bit_p (v, index))
+ HOST_WIDE_INT end = offset + width;
+ for (i = offset; i < end; i++)
{
+ int index = get_bitmap_index (group_info, i);
+
if (dump_file && (dump_flags & TDF_DETAILS))
- fprintf (dump_file, "failing at i = %d\n", (int)i);
- deleted = false;
- break;
+ fprintf (dump_file, "i = %d, index = %d\n",
+ (int) i, index);
+ if (index == 0 || !bitmap_bit_p (v, index))
+ {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, "failing at i = %d\n",
+ (int) i);
+ deleted = false;
+ break;
+ }
}
}
if (deleted)
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [016/nnn] poly_int: dse.c
2017-10-23 17:07 ` [016/nnn] poly_int: dse.c Richard Sandiford
@ 2017-11-18 4:30 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-18 4:30 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:06 AM, Richard Sandiford wrote:
> This patch makes RTL DSE use poly_int for offsets and sizes.
> The local phase can optimise them normally but the global phase
> treats them as wild accesses.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * dse.c (store_info): Change offset and width from HOST_WIDE_INT
> to poly_int64. Update commentary for positions_needed.large.
> (read_info_type): Change offset and width from HOST_WIDE_INT
> to poly_int64.
> (set_usage_bits): Likewise.
> (canon_address): Return the offset as a poly_int64 rather than
> a HOST_WIDE_INT. Use strip_offset_and_add.
> (set_all_positions_unneeded, any_positions_needed_p): Use
> positions_needed.large to track stores with non-constant widths.
> (all_positions_needed_p): Likewise. Take the offset and width
> as poly_int64s rather than ints. Assert that rhs is nonnull.
> (record_store): Cope with non-constant offsets and widths.
> Nullify the rhs of an earlier store if we can't tell which bytes
> of it are needed.
> (find_shift_sequence): Take the access_size and shift as poly_int64s
> rather than ints.
> (get_stored_val): Take the read_offset and read_width as poly_int64s
> rather than HOST_WIDE_INTs.
> (check_mem_read_rtx, scan_stores, scan_reads, dse_step5): Handle
> non-constant offsets and widths.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [020/nnn] poly_int: store_bit_field bitrange
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (16 preceding siblings ...)
2017-10-23 17:07 ` [016/nnn] poly_int: dse.c Richard Sandiford
@ 2017-10-23 17:08 ` Richard Sandiford
2017-12-05 23:43 ` Jeff Law
2017-10-23 17:08 ` [018/nnn] poly_int: MEM_OFFSET and MEM_SIZE Richard Sandiford
` (89 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:08 UTC (permalink / raw)
To: gcc-patches
This patch changes the bitnum and bitsize arguments to
store_bit_field from unsigned HOST_WIDE_INTs to poly_uint64s.
The later part of store_bit_field_1 still needs to operate
on constant bit positions and sizes, so the patch splits
it out into a subfunction (store_integral_bit_field).
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* expmed.h (store_bit_field): Take bitsize and bitnum as
poly_uint64s rather than unsigned HOST_WIDE_INTs.
* expmed.c (simple_mem_bitfield_p): Likewise. Add a parameter
that returns the byte size.
(store_bit_field_1): Take bitsize and bitnum as
poly_uint64s rather than unsigned HOST_WIDE_INTs. Update call
to simple_mem_bitfield_p. Split the part that can only handle
constant bitsize and bitnum out into...
(store_integral_bit_field): ...this new function.
(store_bit_field): Take bitsize and bitnum as poly_uint64s rather
than unsigned HOST_WIDE_INTs.
(extract_bit_field_1): Update call to simple_mem_bitfield_p.
Index: gcc/expmed.h
===================================================================
--- gcc/expmed.h 2017-10-23 17:00:54.441003964 +0100
+++ gcc/expmed.h 2017-10-23 17:02:01.542011677 +0100
@@ -718,8 +718,7 @@ extern rtx expand_divmod (int, enum tree
rtx, int);
#endif
-extern void store_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+extern void store_bit_field (rtx, poly_uint64, poly_uint64,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
machine_mode, rtx, bool);
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c 2017-10-23 17:00:57.771973825 +0100
+++ gcc/expmed.c 2017-10-23 17:02:01.542011677 +0100
@@ -46,6 +46,12 @@ struct target_expmed default_target_expm
struct target_expmed *this_target_expmed = &default_target_expmed;
#endif
+static bool store_integral_bit_field (rtx, opt_scalar_int_mode,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ machine_mode, rtx, bool, bool);
static void store_fixed_bit_field (rtx, opt_scalar_int_mode,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
@@ -562,17 +568,18 @@ strict_volatile_bitfield_p (rtx op0, uns
}
/* Return true if OP is a memory and if a bitfield of size BITSIZE at
- bit number BITNUM can be treated as a simple value of mode MODE. */
+ bit number BITNUM can be treated as a simple value of mode MODE.
+ Store the byte offset in *BYTENUM if so. */
static bool
-simple_mem_bitfield_p (rtx op0, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, machine_mode mode)
+simple_mem_bitfield_p (rtx op0, poly_uint64 bitsize, poly_uint64 bitnum,
+ machine_mode mode, poly_uint64 *bytenum)
{
return (MEM_P (op0)
- && bitnum % BITS_PER_UNIT == 0
- && bitsize == GET_MODE_BITSIZE (mode)
+ && multiple_p (bitnum, BITS_PER_UNIT, bytenum)
+ && must_eq (bitsize, GET_MODE_BITSIZE (mode))
&& (!targetm.slow_unaligned_access (mode, MEM_ALIGN (op0))
- || (bitnum % GET_MODE_ALIGNMENT (mode) == 0
+ || (multiple_p (bitnum, GET_MODE_ALIGNMENT (mode))
&& MEM_ALIGN (op0) >= GET_MODE_ALIGNMENT (mode))));
}
\f
@@ -717,15 +724,13 @@ store_bit_field_using_insv (const extrac
return false instead. */
static bool
-store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum,
+store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
unsigned HOST_WIDE_INT bitregion_start,
unsigned HOST_WIDE_INT bitregion_end,
machine_mode fieldmode,
rtx value, bool reverse, bool fallback_p)
{
rtx op0 = str_rtx;
- rtx orig_value;
while (GET_CODE (op0) == SUBREG)
{
@@ -736,23 +741,23 @@ store_bit_field_1 (rtx str_rtx, unsigned
/* No action is needed if the target is a register and if the field
lies completely outside that register. This can occur if the source
code contains an out-of-bounds access to a small array. */
- if (REG_P (op0) && bitnum >= GET_MODE_BITSIZE (GET_MODE (op0)))
+ if (REG_P (op0) && must_ge (bitnum, GET_MODE_BITSIZE (GET_MODE (op0))))
return true;
/* Use vec_set patterns for inserting parts of vectors whenever
available. */
machine_mode outermode = GET_MODE (op0);
scalar_mode innermode = GET_MODE_INNER (outermode);
+ poly_uint64 pos;
if (VECTOR_MODE_P (outermode)
&& !MEM_P (op0)
&& optab_handler (vec_set_optab, outermode) != CODE_FOR_nothing
&& fieldmode == innermode
- && bitsize == GET_MODE_BITSIZE (innermode)
- && !(bitnum % GET_MODE_BITSIZE (innermode)))
+ && must_eq (bitsize, GET_MODE_BITSIZE (innermode))
+ && multiple_p (bitnum, GET_MODE_BITSIZE (innermode), &pos))
{
struct expand_operand ops[3];
enum insn_code icode = optab_handler (vec_set_optab, outermode);
- int pos = bitnum / GET_MODE_BITSIZE (innermode);
create_fixed_operand (&ops[0], op0);
create_input_operand (&ops[1], value, innermode);
@@ -764,16 +769,16 @@ store_bit_field_1 (rtx str_rtx, unsigned
/* If the target is a register, overwriting the entire object, or storing
a full-word or multi-word field can be done with just a SUBREG. */
if (!MEM_P (op0)
- && bitsize == GET_MODE_BITSIZE (fieldmode)
- && ((bitsize == GET_MODE_BITSIZE (GET_MODE (op0)) && bitnum == 0)
- || (bitsize % BITS_PER_WORD == 0 && bitnum % BITS_PER_WORD == 0)))
+ && must_eq (bitsize, GET_MODE_BITSIZE (fieldmode)))
{
/* Use the subreg machinery either to narrow OP0 to the required
words or to cope with mode punning between equal-sized modes.
In the latter case, use subreg on the rhs side, not lhs. */
rtx sub;
-
- if (bitsize == GET_MODE_BITSIZE (GET_MODE (op0)))
+ HOST_WIDE_INT regnum;
+ HOST_WIDE_INT regsize = REGMODE_NATURAL_SIZE (GET_MODE (op0));
+ if (known_zero (bitnum)
+ && must_eq (bitsize, GET_MODE_BITSIZE (GET_MODE (op0))))
{
sub = simplify_gen_subreg (GET_MODE (op0), value, fieldmode, 0);
if (sub)
@@ -784,10 +789,11 @@ store_bit_field_1 (rtx str_rtx, unsigned
return true;
}
}
- else
+ else if (constant_multiple_p (bitnum, regsize * BITS_PER_UNIT, ®num)
+ && multiple_p (bitsize, regsize * BITS_PER_UNIT))
{
sub = simplify_gen_subreg (fieldmode, op0, GET_MODE (op0),
- bitnum / BITS_PER_UNIT);
+ regnum * regsize);
if (sub)
{
if (reverse)
@@ -801,15 +807,23 @@ store_bit_field_1 (rtx str_rtx, unsigned
/* If the target is memory, storing any naturally aligned field can be
done with a simple store. For targets that support fast unaligned
memory, any naturally sized, unit aligned field can be done directly. */
- if (simple_mem_bitfield_p (op0, bitsize, bitnum, fieldmode))
+ poly_uint64 bytenum;
+ if (simple_mem_bitfield_p (op0, bitsize, bitnum, fieldmode, &bytenum))
{
- op0 = adjust_bitfield_address (op0, fieldmode, bitnum / BITS_PER_UNIT);
+ op0 = adjust_bitfield_address (op0, fieldmode, bytenum);
if (reverse)
value = flip_storage_order (fieldmode, value);
emit_move_insn (op0, value);
return true;
}
+ /* It's possible we'll need to handle other cases here for
+ polynomial bitnum and bitsize. */
+
+ /* From here on we need to be looking at a fixed-size insertion. */
+ unsigned HOST_WIDE_INT ibitsize = bitsize.to_constant ();
+ unsigned HOST_WIDE_INT ibitnum = bitnum.to_constant ();
+
/* Make sure we are playing with integral modes. Pun with subregs
if we aren't. This must come after the entire register case above,
since that case is valid for any mode. The following cases are only
@@ -825,12 +839,31 @@ store_bit_field_1 (rtx str_rtx, unsigned
op0 = gen_lowpart (op0_mode.require (), op0);
}
+ return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum,
+ bitregion_start, bitregion_end,
+ fieldmode, value, reverse, fallback_p);
+}
+
+/* Subroutine of store_bit_field_1, with the same arguments, except
+ that BITSIZE and BITNUM are constant. Handle cases specific to
+ integral modes. If OP0_MODE is defined, it is the mode of OP0,
+ otherwise OP0 is a BLKmode MEM. */
+
+static bool
+store_integral_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
+ unsigned HOST_WIDE_INT bitsize,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ machine_mode fieldmode,
+ rtx value, bool reverse, bool fallback_p)
+{
/* Storing an lsb-aligned field in a register
can be done with a movstrict instruction. */
if (!MEM_P (op0)
&& !reverse
- && lowpart_bit_field_p (bitnum, bitsize, GET_MODE (op0))
+ && lowpart_bit_field_p (bitnum, bitsize, op0_mode.require ())
&& bitsize == GET_MODE_BITSIZE (fieldmode)
&& optab_handler (movstrict_optab, fieldmode) != CODE_FOR_nothing)
{
@@ -882,10 +915,13 @@ store_bit_field_1 (rtx str_rtx, unsigned
subwords to extract. Note that fieldmode will often (always?) be
VOIDmode, because that is what store_field uses to indicate that this
is a bit field, but passing VOIDmode to operand_subword_force
- is not allowed. */
- fieldmode = GET_MODE (value);
- if (fieldmode == VOIDmode)
- fieldmode = smallest_int_mode_for_size (nwords * BITS_PER_WORD);
+ is not allowed.
+
+ The mode must be fixed-size, since insertions into variable-sized
+ objects are meant to be handled before calling this function. */
+ fixed_size_mode value_mode = as_a <fixed_size_mode> (GET_MODE (value));
+ if (value_mode == VOIDmode)
+ value_mode = smallest_int_mode_for_size (nwords * BITS_PER_WORD);
last = get_last_insn ();
for (i = 0; i < nwords; i++)
@@ -893,7 +929,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
/* If I is 0, use the low-order word in both field and target;
if I is 1, use the next to lowest word; and so on. */
unsigned int wordnum = (backwards
- ? GET_MODE_SIZE (fieldmode) / UNITS_PER_WORD
+ ? GET_MODE_SIZE (value_mode) / UNITS_PER_WORD
- i - 1
: i);
unsigned int bit_offset = (backwards ^ reverse
@@ -901,7 +937,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
* BITS_PER_WORD,
0)
: (int) i * BITS_PER_WORD);
- rtx value_word = operand_subword_force (value, wordnum, fieldmode);
+ rtx value_word = operand_subword_force (value, wordnum, value_mode);
unsigned HOST_WIDE_INT new_bitsize =
MIN (BITS_PER_WORD, bitsize - i * BITS_PER_WORD);
@@ -935,7 +971,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
integer of the corresponding size. This can occur on a machine
with 64 bit registers that uses SFmode for float. It can also
occur for unaligned float or complex fields. */
- orig_value = value;
+ rtx orig_value = value;
scalar_int_mode value_mode;
if (GET_MODE (value) == VOIDmode)
/* By this point we've dealt with values that are bigger than a word,
@@ -1043,41 +1079,43 @@ store_bit_field_1 (rtx str_rtx, unsigned
If REVERSE is true, the store is to be done in reverse order. */
void
-store_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum,
+store_bit_field (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
unsigned HOST_WIDE_INT bitregion_start,
unsigned HOST_WIDE_INT bitregion_end,
machine_mode fieldmode,
rtx value, bool reverse)
{
/* Handle -fstrict-volatile-bitfields in the cases where it applies. */
+ unsigned HOST_WIDE_INT ibitsize = 0, ibitnum = 0;
scalar_int_mode int_mode;
- if (is_a <scalar_int_mode> (fieldmode, &int_mode)
- && strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, int_mode,
+ if (bitsize.is_constant (&ibitsize)
+ && bitnum.is_constant (&ibitnum)
+ && is_a <scalar_int_mode> (fieldmode, &int_mode)
+ && strict_volatile_bitfield_p (str_rtx, ibitsize, ibitnum, int_mode,
bitregion_start, bitregion_end))
{
/* Storing of a full word can be done with a simple store.
We know here that the field can be accessed with one single
instruction. For targets that support unaligned memory,
an unaligned access may be necessary. */
- if (bitsize == GET_MODE_BITSIZE (int_mode))
+ if (ibitsize == GET_MODE_BITSIZE (int_mode))
{
str_rtx = adjust_bitfield_address (str_rtx, int_mode,
- bitnum / BITS_PER_UNIT);
+ ibitnum / BITS_PER_UNIT);
if (reverse)
value = flip_storage_order (int_mode, value);
- gcc_assert (bitnum % BITS_PER_UNIT == 0);
+ gcc_assert (ibitnum % BITS_PER_UNIT == 0);
emit_move_insn (str_rtx, value);
}
else
{
rtx temp;
- str_rtx = narrow_bit_field_mem (str_rtx, int_mode, bitsize, bitnum,
- &bitnum);
- gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (int_mode));
+ str_rtx = narrow_bit_field_mem (str_rtx, int_mode, ibitsize,
+ ibitnum, &ibitnum);
+ gcc_assert (ibitnum + ibitsize <= GET_MODE_BITSIZE (int_mode));
temp = copy_to_reg (str_rtx);
- if (!store_bit_field_1 (temp, bitsize, bitnum, 0, 0,
+ if (!store_bit_field_1 (temp, ibitsize, ibitnum, 0, 0,
int_mode, value, reverse, true))
gcc_unreachable ();
@@ -1094,19 +1132,21 @@ store_bit_field (rtx str_rtx, unsigned H
{
scalar_int_mode best_mode;
machine_mode addr_mode = VOIDmode;
- HOST_WIDE_INT offset, size;
+ HOST_WIDE_INT offset;
gcc_assert ((bitregion_start % BITS_PER_UNIT) == 0);
offset = bitregion_start / BITS_PER_UNIT;
bitnum -= bitregion_start;
- size = (bitnum + bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
+ poly_int64 size = bits_to_bytes_round_up (bitnum + bitsize);
bitregion_end -= bitregion_start;
bitregion_start = 0;
- if (get_best_mode (bitsize, bitnum,
- bitregion_start, bitregion_end,
- MEM_ALIGN (str_rtx), INT_MAX,
- MEM_VOLATILE_P (str_rtx), &best_mode))
+ if (bitsize.is_constant (&ibitsize)
+ && bitnum.is_constant (&ibitnum)
+ && get_best_mode (ibitsize, ibitnum,
+ bitregion_start, bitregion_end,
+ MEM_ALIGN (str_rtx), INT_MAX,
+ MEM_VOLATILE_P (str_rtx), &best_mode))
addr_mode = best_mode;
str_rtx = adjust_bitfield_address_size (str_rtx, addr_mode,
offset, size);
@@ -1738,9 +1778,10 @@ extract_bit_field_1 (rtx str_rtx, unsign
/* Extraction of a full MODE1 value can be done with a load as long as
the field is on a byte boundary and is sufficiently aligned. */
- if (simple_mem_bitfield_p (op0, bitsize, bitnum, mode1))
+ poly_uint64 bytenum;
+ if (simple_mem_bitfield_p (op0, bitsize, bitnum, mode1, &bytenum))
{
- op0 = adjust_bitfield_address (op0, mode1, bitnum / BITS_PER_UNIT);
+ op0 = adjust_bitfield_address (op0, mode1, bytenum);
if (reverse)
op0 = flip_storage_order (mode1, op0);
return convert_extracted_bit_field (op0, mode, tmode, unsignedp);
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [020/nnn] poly_int: store_bit_field bitrange
2017-10-23 17:08 ` [020/nnn] poly_int: store_bit_field bitrange Richard Sandiford
@ 2017-12-05 23:43 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-12-05 23:43 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:08 AM, Richard Sandiford wrote:
> This patch changes the bitnum and bitsize arguments to
> store_bit_field from unsigned HOST_WIDE_INTs to poly_uint64s.
> The later part of store_bit_field_1 still needs to operate
> on constant bit positions and sizes, so the patch splits
> it out into a subfunction (store_integral_bit_field).
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * expmed.h (store_bit_field): Take bitsize and bitnum as
> poly_uint64s rather than unsigned HOST_WIDE_INTs.
> * expmed.c (simple_mem_bitfield_p): Likewise. Add a parameter
> that returns the byte size.
> (store_bit_field_1): Take bitsize and bitnum as
> poly_uint64s rather than unsigned HOST_WIDE_INTs. Update call
> to simple_mem_bitfield_p. Split the part that can only handle
> constant bitsize and bitnum out into...
> (store_integral_bit_field): ...this new function.
> (store_bit_field): Take bitsize and bitnum as poly_uint64s rather
> than unsigned HOST_WIDE_INTs.
> (extract_bit_field_1): Update call to simple_mem_bitfield_p.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [018/nnn] poly_int: MEM_OFFSET and MEM_SIZE
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (17 preceding siblings ...)
2017-10-23 17:08 ` [020/nnn] poly_int: store_bit_field bitrange Richard Sandiford
@ 2017-10-23 17:08 ` Richard Sandiford
2017-12-06 18:27 ` Jeff Law
2017-10-23 17:08 ` [019/nnn] poly_int: lra frame offsets Richard Sandiford
` (88 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:08 UTC (permalink / raw)
To: gcc-patches
This patch changes the MEM_OFFSET and MEM_SIZE memory attributes
from HOST_WIDE_INT to poly_int64. Most of it is mechanical,
but there is one nonbovious change in widen_memory_access.
Previously the main while loop broke with:
/* Similarly for the decl. */
else if (DECL_P (attrs.expr)
&& DECL_SIZE_UNIT (attrs.expr)
&& TREE_CODE (DECL_SIZE_UNIT (attrs.expr)) == INTEGER_CST
&& compare_tree_int (DECL_SIZE_UNIT (attrs.expr), size) >= 0
&& (! attrs.offset_known_p || attrs.offset >= 0))
break;
but it seemed wrong to optimistically assume the best case
when the offset isn't known (and thus might be negative).
As it happens, the "! attrs.offset_known_p" condition was
always false, because we'd already nullified attrs.expr in
that case:
/* If we don't know what offset we were at within the expression, then
we can't know if we've overstepped the bounds. */
if (! attrs.offset_known_p)
attrs.expr = NULL_TREE;
The patch therefore drops "! attrs.offset_known_p ||" when
converting the offset check to the may/must interface.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* rtl.h (mem_attrs): Add a default constructor. Change size and
offset from HOST_WIDE_INT to poly_int64.
* emit-rtl.h (set_mem_offset, set_mem_size, adjust_address_1)
(adjust_automodify_address_1, set_mem_attributes_minus_bitpos)
(widen_memory_access): Take the sizes and offsets as poly_int64s
rather than HOST_WIDE_INTs.
* alias.c (ao_ref_from_mem): Handle the new form of MEM_OFFSET.
(offset_overlap_p): Take poly_int64s rather than HOST_WIDE_INTs
and ints.
(adjust_offset_for_component_ref): Change the offset from a
HOST_WIDE_INT to a poly_int64.
(nonoverlapping_memrefs_p): Track polynomial offsets and sizes.
* cfgcleanup.c (merge_memattrs): Update after mem_attrs changes.
* dce.c (find_call_stack_args): Likewise.
* dse.c (record_store): Likewise.
* dwarf2out.c (tls_mem_loc_descriptor, dw_sra_loc_expr): Likewise.
* print-rtl.c (rtx_writer::print_rtx): Likewise.
* read-rtl-function.c (test_loading_mem): Likewise.
* rtlanal.c (may_trap_p_1): Likewise.
* simplify-rtx.c (delegitimize_mem_from_attrs): Likewise.
* var-tracking.c (int_mem_offset, track_expr_p): Likewise.
* emit-rtl.c (mem_attrs_eq_p, get_mem_align_offset): Likewise.
(mem_attrs::mem_attrs): New function.
(set_mem_attributes_minus_bitpos): Change bitpos from a
HOST_WIDE_INT to poly_int64.
(set_mem_alias_set, set_mem_addr_space, set_mem_align, set_mem_expr)
(clear_mem_offset, clear_mem_size, change_address)
(get_spill_slot_decl, set_mem_attrs_for_spill): Directly
initialize mem_attrs.
(set_mem_offset, set_mem_size, adjust_address_1)
(adjust_automodify_address_1, offset_address, widen_memory_access):
Likewise. Take poly_int64s rather than HOST_WIDE_INT.
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h 2017-10-23 17:01:43.314993320 +0100
+++ gcc/rtl.h 2017-10-23 17:01:56.777802803 +0100
@@ -147,6 +147,8 @@ struct addr_diff_vec_flags
they cannot be modified in place. */
struct GTY(()) mem_attrs
{
+ mem_attrs ();
+
/* The expression that the MEM accesses, or null if not known.
This expression might be larger than the memory reference itself.
(In other words, the MEM might access only part of the object.) */
@@ -154,11 +156,11 @@ struct GTY(()) mem_attrs
/* The offset of the memory reference from the start of EXPR.
Only valid if OFFSET_KNOWN_P. */
- HOST_WIDE_INT offset;
+ poly_int64 offset;
/* The size of the memory reference in bytes. Only valid if
SIZE_KNOWN_P. */
- HOST_WIDE_INT size;
+ poly_int64 size;
/* The alias set of the memory reference. */
alias_set_type alias;
Index: gcc/emit-rtl.h
===================================================================
--- gcc/emit-rtl.h 2017-10-23 17:00:54.440004873 +0100
+++ gcc/emit-rtl.h 2017-10-23 17:01:56.777802803 +0100
@@ -333,13 +333,13 @@ extern void set_mem_addr_space (rtx, add
extern void set_mem_expr (rtx, tree);
/* Set the offset for MEM to OFFSET. */
-extern void set_mem_offset (rtx, HOST_WIDE_INT);
+extern void set_mem_offset (rtx, poly_int64);
/* Clear the offset recorded for MEM. */
extern void clear_mem_offset (rtx);
/* Set the size for MEM to SIZE. */
-extern void set_mem_size (rtx, HOST_WIDE_INT);
+extern void set_mem_size (rtx, poly_int64);
/* Clear the size recorded for MEM. */
extern void clear_mem_size (rtx);
@@ -488,10 +488,10 @@ #define adjust_automodify_address(MEMREF
#define adjust_automodify_address_nv(MEMREF, MODE, ADDR, OFFSET) \
adjust_automodify_address_1 (MEMREF, MODE, ADDR, OFFSET, 0)
-extern rtx adjust_address_1 (rtx, machine_mode, HOST_WIDE_INT, int, int,
- int, HOST_WIDE_INT);
+extern rtx adjust_address_1 (rtx, machine_mode, poly_int64, int, int,
+ int, poly_int64);
extern rtx adjust_automodify_address_1 (rtx, machine_mode, rtx,
- HOST_WIDE_INT, int);
+ poly_int64, int);
/* Return a memory reference like MEMREF, but whose address is changed by
adding OFFSET, an RTX, to it. POW2 is the highest power of two factor
@@ -506,7 +506,7 @@ extern void set_mem_attributes (rtx, tre
/* Similar, except that BITPOS has not yet been applied to REF, so if
we alter MEM_OFFSET according to T then we should subtract BITPOS
expecting that it'll be added back in later. */
-extern void set_mem_attributes_minus_bitpos (rtx, tree, int, HOST_WIDE_INT);
+extern void set_mem_attributes_minus_bitpos (rtx, tree, int, poly_int64);
/* Return OFFSET if XEXP (MEM, 0) - OFFSET is known to be ALIGN
bits aligned for 0 <= OFFSET < ALIGN / BITS_PER_UNIT, or
@@ -515,7 +515,7 @@ extern int get_mem_align_offset (rtx, un
/* Return a memory reference like MEMREF, but with its mode widened to
MODE and adjusted by OFFSET. */
-extern rtx widen_memory_access (rtx, machine_mode, HOST_WIDE_INT);
+extern rtx widen_memory_access (rtx, machine_mode, poly_int64);
extern void maybe_set_max_label_num (rtx_code_label *x);
Index: gcc/alias.c
===================================================================
--- gcc/alias.c 2017-10-23 17:01:52.303181137 +0100
+++ gcc/alias.c 2017-10-23 17:01:56.772809920 +0100
@@ -330,7 +330,7 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
/* If MEM_OFFSET/MEM_SIZE get us outside of ref->offset/ref->max_size
drop ref->ref. */
- if (MEM_OFFSET (mem) < 0
+ if (may_lt (MEM_OFFSET (mem), 0)
|| (ref->max_size_known_p ()
&& may_gt ((MEM_OFFSET (mem) + MEM_SIZE (mem)) * BITS_PER_UNIT,
ref->max_size)))
@@ -2329,12 +2329,15 @@ addr_side_effect_eval (rtx addr, int siz
absolute value of the sizes as the actual sizes. */
static inline bool
-offset_overlap_p (HOST_WIDE_INT c, int xsize, int ysize)
+offset_overlap_p (poly_int64 c, poly_int64 xsize, poly_int64 ysize)
{
- return (xsize == 0 || ysize == 0
- || (c >= 0
- ? (abs (xsize) > c)
- : (abs (ysize) > -c)));
+ if (known_zero (xsize) || known_zero (ysize))
+ return true;
+
+ if (may_ge (c, 0))
+ return may_gt (may_lt (xsize, 0) ? -xsize : xsize, c);
+ else
+ return may_gt (may_lt (ysize, 0) ? -ysize : ysize, -c);
}
/* Return one if X and Y (memory addresses) reference the
@@ -2665,7 +2668,7 @@ decl_for_component_ref (tree x)
static void
adjust_offset_for_component_ref (tree x, bool *known_p,
- HOST_WIDE_INT *offset)
+ poly_int64 *offset)
{
if (!*known_p)
return;
@@ -2706,8 +2709,8 @@ nonoverlapping_memrefs_p (const_rtx x, c
rtx rtlx, rtly;
rtx basex, basey;
bool moffsetx_known_p, moffsety_known_p;
- HOST_WIDE_INT moffsetx = 0, moffsety = 0;
- HOST_WIDE_INT offsetx = 0, offsety = 0, sizex, sizey;
+ poly_int64 moffsetx = 0, moffsety = 0;
+ poly_int64 offsetx = 0, offsety = 0, sizex, sizey;
/* Unless both have exprs, we can't tell anything. */
if (exprx == 0 || expry == 0)
@@ -2809,12 +2812,10 @@ nonoverlapping_memrefs_p (const_rtx x, c
we can avoid overlap is if we can deduce that they are nonoverlapping
pieces of that decl, which is very rare. */
basex = MEM_P (rtlx) ? XEXP (rtlx, 0) : rtlx;
- if (GET_CODE (basex) == PLUS && CONST_INT_P (XEXP (basex, 1)))
- offsetx = INTVAL (XEXP (basex, 1)), basex = XEXP (basex, 0);
+ basex = strip_offset_and_add (basex, &offsetx);
basey = MEM_P (rtly) ? XEXP (rtly, 0) : rtly;
- if (GET_CODE (basey) == PLUS && CONST_INT_P (XEXP (basey, 1)))
- offsety = INTVAL (XEXP (basey, 1)), basey = XEXP (basey, 0);
+ basey = strip_offset_and_add (basey, &offsety);
/* If the bases are different, we know they do not overlap if both
are constants or if one is a constant and the other a pointer into the
@@ -2835,10 +2836,10 @@ nonoverlapping_memrefs_p (const_rtx x, c
declarations are necessarily different
(i.e. compare_base_decls (exprx, expry) == -1) */
- sizex = (!MEM_P (rtlx) ? (int) GET_MODE_SIZE (GET_MODE (rtlx))
+ sizex = (!MEM_P (rtlx) ? poly_int64 (GET_MODE_SIZE (GET_MODE (rtlx)))
: MEM_SIZE_KNOWN_P (rtlx) ? MEM_SIZE (rtlx)
: -1);
- sizey = (!MEM_P (rtly) ? (int) GET_MODE_SIZE (GET_MODE (rtly))
+ sizey = (!MEM_P (rtly) ? poly_int64 (GET_MODE_SIZE (GET_MODE (rtly)))
: MEM_SIZE_KNOWN_P (rtly) ? MEM_SIZE (rtly)
: -1);
@@ -2857,16 +2858,7 @@ nonoverlapping_memrefs_p (const_rtx x, c
if (MEM_SIZE_KNOWN_P (y) && moffsety_known_p)
sizey = MEM_SIZE (y);
- /* Put the values of the memref with the lower offset in X's values. */
- if (offsetx > offsety)
- {
- std::swap (offsetx, offsety);
- std::swap (sizex, sizey);
- }
-
- /* If we don't know the size of the lower-offset value, we can't tell
- if they conflict. Otherwise, we do the test. */
- return sizex >= 0 && offsety >= offsetx + sizex;
+ return !ranges_may_overlap_p (offsetx, sizex, offsety, sizey);
}
/* Helper for true_dependence and canon_true_dependence.
Index: gcc/cfgcleanup.c
===================================================================
--- gcc/cfgcleanup.c 2017-10-23 16:52:19.902212938 +0100
+++ gcc/cfgcleanup.c 2017-10-23 17:01:56.772809920 +0100
@@ -873,8 +873,6 @@ merge_memattrs (rtx x, rtx y)
MEM_ATTRS (x) = 0;
else
{
- HOST_WIDE_INT mem_size;
-
if (MEM_ALIAS_SET (x) != MEM_ALIAS_SET (y))
{
set_mem_alias_set (x, 0);
@@ -890,20 +888,23 @@ merge_memattrs (rtx x, rtx y)
}
else if (MEM_OFFSET_KNOWN_P (x) != MEM_OFFSET_KNOWN_P (y)
|| (MEM_OFFSET_KNOWN_P (x)
- && MEM_OFFSET (x) != MEM_OFFSET (y)))
+ && may_ne (MEM_OFFSET (x), MEM_OFFSET (y))))
{
clear_mem_offset (x);
clear_mem_offset (y);
}
- if (MEM_SIZE_KNOWN_P (x) && MEM_SIZE_KNOWN_P (y))
- {
- mem_size = MAX (MEM_SIZE (x), MEM_SIZE (y));
- set_mem_size (x, mem_size);
- set_mem_size (y, mem_size);
- }
+ if (!MEM_SIZE_KNOWN_P (x))
+ clear_mem_size (y);
+ else if (!MEM_SIZE_KNOWN_P (y))
+ clear_mem_size (x);
+ else if (must_le (MEM_SIZE (x), MEM_SIZE (y)))
+ set_mem_size (x, MEM_SIZE (y));
+ else if (must_le (MEM_SIZE (y), MEM_SIZE (x)))
+ set_mem_size (y, MEM_SIZE (x));
else
{
+ /* The sizes aren't ordered, so we can't merge them. */
clear_mem_size (x);
clear_mem_size (y);
}
Index: gcc/dce.c
===================================================================
--- gcc/dce.c 2017-10-23 16:52:19.902212938 +0100
+++ gcc/dce.c 2017-10-23 17:01:56.772809920 +0100
@@ -293,9 +293,8 @@ find_call_stack_args (rtx_call_insn *cal
{
rtx mem = XEXP (XEXP (p, 0), 0), addr;
HOST_WIDE_INT off = 0, size;
- if (!MEM_SIZE_KNOWN_P (mem))
+ if (!MEM_SIZE_KNOWN_P (mem) || !MEM_SIZE (mem).is_constant (&size))
return false;
- size = MEM_SIZE (mem);
addr = XEXP (mem, 0);
if (GET_CODE (addr) == PLUS
&& REG_P (XEXP (addr, 0))
@@ -360,7 +359,9 @@ find_call_stack_args (rtx_call_insn *cal
&& MEM_P (XEXP (XEXP (p, 0), 0)))
{
rtx mem = XEXP (XEXP (p, 0), 0), addr;
- HOST_WIDE_INT off = 0, byte;
+ HOST_WIDE_INT off = 0, byte, size;
+ /* Checked in the previous iteration. */
+ size = MEM_SIZE (mem).to_constant ();
addr = XEXP (mem, 0);
if (GET_CODE (addr) == PLUS
&& REG_P (XEXP (addr, 0))
@@ -386,7 +387,7 @@ find_call_stack_args (rtx_call_insn *cal
set = single_set (DF_REF_INSN (defs->ref));
off += INTVAL (XEXP (SET_SRC (set), 1));
}
- for (byte = off; byte < off + MEM_SIZE (mem); byte++)
+ for (byte = off; byte < off + size; byte++)
{
if (!bitmap_set_bit (sp_bytes, byte - min_sp_off))
gcc_unreachable ();
@@ -469,8 +470,10 @@ find_call_stack_args (rtx_call_insn *cal
break;
}
+ HOST_WIDE_INT size;
if (!MEM_SIZE_KNOWN_P (mem)
- || !check_argument_store (MEM_SIZE (mem), off, min_sp_off,
+ || !MEM_SIZE (mem).is_constant (&size)
+ || !check_argument_store (size, off, min_sp_off,
max_sp_off, sp_bytes))
break;
Index: gcc/dse.c
===================================================================
--- gcc/dse.c 2017-10-23 17:01:54.249406896 +0100
+++ gcc/dse.c 2017-10-23 17:01:56.773808497 +0100
@@ -1365,6 +1365,7 @@ record_store (rtx body, bb_info_t bb_inf
/* At this point we know mem is a mem. */
if (GET_MODE (mem) == BLKmode)
{
+ HOST_WIDE_INT const_size;
if (GET_CODE (XEXP (mem, 0)) == SCRATCH)
{
if (dump_file && (dump_flags & TDF_DETAILS))
@@ -1376,8 +1377,11 @@ record_store (rtx body, bb_info_t bb_inf
/* Handle (set (mem:BLK (addr) [... S36 ...]) (const_int 0))
as memset (addr, 0, 36); */
else if (!MEM_SIZE_KNOWN_P (mem)
- || MEM_SIZE (mem) <= 0
- || MEM_SIZE (mem) > MAX_OFFSET
+ || may_le (MEM_SIZE (mem), 0)
+ /* This is a limit on the bitmap size, which is only relevant
+ for constant-sized MEMs. */
+ || (MEM_SIZE (mem).is_constant (&const_size)
+ && const_size > MAX_OFFSET)
|| GET_CODE (body) != SET
|| !CONST_INT_P (SET_SRC (body)))
{
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c 2017-10-23 17:01:45.056510879 +0100
+++ gcc/dwarf2out.c 2017-10-23 17:01:56.775805650 +0100
@@ -13754,7 +13754,7 @@ tls_mem_loc_descriptor (rtx mem)
if (loc_result == NULL)
return NULL;
- if (MEM_OFFSET (mem))
+ if (maybe_nonzero (MEM_OFFSET (mem)))
loc_descr_plus_const (&loc_result, MEM_OFFSET (mem));
return loc_result;
@@ -16320,8 +16320,10 @@ dw_sra_loc_expr (tree decl, rtx loc)
adjustment. */
if (MEM_P (varloc))
{
- unsigned HOST_WIDE_INT memsize
- = MEM_SIZE (varloc) * BITS_PER_UNIT;
+ unsigned HOST_WIDE_INT memsize;
+ if (!poly_uint64 (MEM_SIZE (varloc)).is_constant (&memsize))
+ goto discard_descr;
+ memsize *= BITS_PER_UNIT;
if (memsize != bitsize)
{
if (BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN
Index: gcc/print-rtl.c
===================================================================
--- gcc/print-rtl.c 2017-10-23 17:01:43.314993320 +0100
+++ gcc/print-rtl.c 2017-10-23 17:01:56.777802803 +0100
@@ -884,10 +884,16 @@ rtx_writer::print_rtx (const_rtx in_rtx)
fputc (' ', m_outfile);
if (MEM_OFFSET_KNOWN_P (in_rtx))
- fprintf (m_outfile, "+" HOST_WIDE_INT_PRINT_DEC, MEM_OFFSET (in_rtx));
+ {
+ fprintf (m_outfile, "+");
+ print_poly_int (m_outfile, MEM_OFFSET (in_rtx));
+ }
if (MEM_SIZE_KNOWN_P (in_rtx))
- fprintf (m_outfile, " S" HOST_WIDE_INT_PRINT_DEC, MEM_SIZE (in_rtx));
+ {
+ fprintf (m_outfile, " S");
+ print_poly_int (m_outfile, MEM_SIZE (in_rtx));
+ }
if (MEM_ALIGN (in_rtx) != 1)
fprintf (m_outfile, " A%u", MEM_ALIGN (in_rtx));
Index: gcc/read-rtl-function.c
===================================================================
--- gcc/read-rtl-function.c 2017-10-23 16:52:19.902212938 +0100
+++ gcc/read-rtl-function.c 2017-10-23 17:01:56.777802803 +0100
@@ -2143,9 +2143,9 @@ test_loading_mem ()
ASSERT_EQ (42, MEM_ALIAS_SET (mem1));
/* "+17". */
ASSERT_TRUE (MEM_OFFSET_KNOWN_P (mem1));
- ASSERT_EQ (17, MEM_OFFSET (mem1));
+ ASSERT_MUST_EQ (17, MEM_OFFSET (mem1));
/* "S8". */
- ASSERT_EQ (8, MEM_SIZE (mem1));
+ ASSERT_MUST_EQ (8, MEM_SIZE (mem1));
/* "A128. */
ASSERT_EQ (128, MEM_ALIGN (mem1));
/* "AS5. */
@@ -2159,9 +2159,9 @@ test_loading_mem ()
ASSERT_EQ (43, MEM_ALIAS_SET (mem2));
/* "+18". */
ASSERT_TRUE (MEM_OFFSET_KNOWN_P (mem2));
- ASSERT_EQ (18, MEM_OFFSET (mem2));
+ ASSERT_MUST_EQ (18, MEM_OFFSET (mem2));
/* "S9". */
- ASSERT_EQ (9, MEM_SIZE (mem2));
+ ASSERT_MUST_EQ (9, MEM_SIZE (mem2));
/* "AS6. */
ASSERT_EQ (6, MEM_ADDR_SPACE (mem2));
}
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c 2017-10-23 17:01:55.453690255 +0100
+++ gcc/rtlanal.c 2017-10-23 17:01:56.778801380 +0100
@@ -2796,7 +2796,7 @@ may_trap_p_1 (const_rtx x, unsigned flag
code_changed
|| !MEM_NOTRAP_P (x))
{
- HOST_WIDE_INT size = MEM_SIZE_KNOWN_P (x) ? MEM_SIZE (x) : -1;
+ poly_int64 size = MEM_SIZE_KNOWN_P (x) ? MEM_SIZE (x) : -1;
return rtx_addr_can_trap_p_1 (XEXP (x, 0), 0, size,
GET_MODE (x), code_changed);
}
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c 2017-10-23 17:00:54.445000329 +0100
+++ gcc/simplify-rtx.c 2017-10-23 17:01:56.778801380 +0100
@@ -289,7 +289,7 @@ delegitimize_mem_from_attrs (rtx x)
{
tree decl = MEM_EXPR (x);
machine_mode mode = GET_MODE (x);
- HOST_WIDE_INT offset = 0;
+ poly_int64 offset = 0;
switch (TREE_CODE (decl))
{
@@ -346,6 +346,7 @@ delegitimize_mem_from_attrs (rtx x)
if (MEM_P (newx))
{
rtx n = XEXP (newx, 0), o = XEXP (x, 0);
+ poly_int64 n_offset, o_offset;
/* Avoid creating a new MEM needlessly if we already had
the same address. We do if there's no OFFSET and the
@@ -353,21 +354,14 @@ delegitimize_mem_from_attrs (rtx x)
form (plus NEWX OFFSET), or the NEWX is of the form
(plus Y (const_int Z)) and X is that with the offset
added: (plus Y (const_int Z+OFFSET)). */
- if (!((offset == 0
- || (GET_CODE (o) == PLUS
- && GET_CODE (XEXP (o, 1)) == CONST_INT
- && (offset == INTVAL (XEXP (o, 1))
- || (GET_CODE (n) == PLUS
- && GET_CODE (XEXP (n, 1)) == CONST_INT
- && (INTVAL (XEXP (n, 1)) + offset
- == INTVAL (XEXP (o, 1)))
- && (n = XEXP (n, 0))))
- && (o = XEXP (o, 0))))
+ n = strip_offset (n, &n_offset);
+ o = strip_offset (o, &o_offset);
+ if (!(must_eq (o_offset, n_offset + offset)
&& rtx_equal_p (o, n)))
x = adjust_address_nv (newx, mode, offset);
}
else if (GET_MODE (x) == GET_MODE (newx)
- && offset == 0)
+ && known_zero (offset))
x = newx;
}
}
Index: gcc/var-tracking.c
===================================================================
--- gcc/var-tracking.c 2017-10-23 17:01:43.315991896 +0100
+++ gcc/var-tracking.c 2017-10-23 17:01:56.779799956 +0100
@@ -395,8 +395,9 @@ #define VTI(BB) ((variable_tracking_info
static inline HOST_WIDE_INT
int_mem_offset (const_rtx mem)
{
- if (MEM_OFFSET_KNOWN_P (mem))
- return MEM_OFFSET (mem);
+ HOST_WIDE_INT offset;
+ if (MEM_OFFSET_KNOWN_P (mem) && MEM_OFFSET (mem).is_constant (&offset))
+ return offset;
return 0;
}
@@ -5256,7 +5257,7 @@ track_expr_p (tree expr, bool need_rtl)
&& !tracked_record_parameter_p (realdecl))
return 0;
if (MEM_SIZE_KNOWN_P (decl_rtl)
- && MEM_SIZE (decl_rtl) > MAX_VAR_PARTS)
+ && may_gt (MEM_SIZE (decl_rtl), MAX_VAR_PARTS))
return 0;
}
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c 2017-10-23 17:01:43.313994743 +0100
+++ gcc/emit-rtl.c 2017-10-23 17:01:56.776804226 +0100
@@ -386,9 +386,9 @@ mem_attrs_eq_p (const struct mem_attrs *
return false;
return (p->alias == q->alias
&& p->offset_known_p == q->offset_known_p
- && (!p->offset_known_p || p->offset == q->offset)
+ && (!p->offset_known_p || must_eq (p->offset, q->offset))
&& p->size_known_p == q->size_known_p
- && (!p->size_known_p || p->size == q->size)
+ && (!p->size_known_p || must_eq (p->size, q->size))
&& p->align == q->align
&& p->addrspace == q->addrspace
&& (p->expr == q->expr
@@ -1789,6 +1789,17 @@ operand_subword_force (rtx op, unsigned
return result;
}
\f
+mem_attrs::mem_attrs ()
+ : expr (NULL_TREE),
+ offset (0),
+ size (0),
+ alias (0),
+ align (0),
+ addrspace (ADDR_SPACE_GENERIC),
+ offset_known_p (false),
+ size_known_p (false)
+{}
+
/* Returns 1 if both MEM_EXPR can be considered equal
and 0 otherwise. */
@@ -1815,7 +1826,7 @@ mem_expr_equal_p (const_tree expr1, cons
get_mem_align_offset (rtx mem, unsigned int align)
{
tree expr;
- unsigned HOST_WIDE_INT offset;
+ poly_uint64 offset;
/* This function can't use
if (!MEM_EXPR (mem) || !MEM_OFFSET_KNOWN_P (mem)
@@ -1857,12 +1868,13 @@ get_mem_align_offset (rtx mem, unsigned
tree byte_offset = component_ref_field_offset (expr);
tree bit_offset = DECL_FIELD_BIT_OFFSET (field);
+ poly_uint64 suboffset;
if (!byte_offset
- || !tree_fits_uhwi_p (byte_offset)
+ || !poly_int_tree_p (byte_offset, &suboffset)
|| !tree_fits_uhwi_p (bit_offset))
return -1;
- offset += tree_to_uhwi (byte_offset);
+ offset += suboffset;
offset += tree_to_uhwi (bit_offset) / BITS_PER_UNIT;
if (inner == NULL_TREE)
@@ -1886,7 +1898,10 @@ get_mem_align_offset (rtx mem, unsigned
else
return -1;
- return offset & ((align / BITS_PER_UNIT) - 1);
+ HOST_WIDE_INT misalign;
+ if (!known_misalignment (offset, align / BITS_PER_UNIT, &misalign))
+ return -1;
+ return misalign;
}
/* Given REF (a MEM) and T, either the type of X or the expression
@@ -1896,9 +1911,9 @@ get_mem_align_offset (rtx mem, unsigned
void
set_mem_attributes_minus_bitpos (rtx ref, tree t, int objectp,
- HOST_WIDE_INT bitpos)
+ poly_int64 bitpos)
{
- HOST_WIDE_INT apply_bitpos = 0;
+ poly_int64 apply_bitpos = 0;
tree type;
struct mem_attrs attrs, *defattrs, *refattrs;
addr_space_t as;
@@ -1919,8 +1934,6 @@ set_mem_attributes_minus_bitpos (rtx ref
set_mem_attributes. */
gcc_assert (!DECL_P (t) || ref != DECL_RTL_IF_SET (t));
- memset (&attrs, 0, sizeof (attrs));
-
/* Get the alias set from the expression or type (perhaps using a
front-end routine) and use it. */
attrs.alias = get_alias_set (t);
@@ -2090,10 +2103,9 @@ set_mem_attributes_minus_bitpos (rtx ref
{
attrs.expr = t2;
attrs.offset_known_p = false;
- if (tree_fits_uhwi_p (off_tree))
+ if (poly_int_tree_p (off_tree, &attrs.offset))
{
attrs.offset_known_p = true;
- attrs.offset = tree_to_uhwi (off_tree);
apply_bitpos = bitpos;
}
}
@@ -2114,27 +2126,29 @@ set_mem_attributes_minus_bitpos (rtx ref
unsigned int obj_align;
unsigned HOST_WIDE_INT obj_bitpos;
get_object_alignment_1 (t, &obj_align, &obj_bitpos);
- obj_bitpos = (obj_bitpos - bitpos) & (obj_align - 1);
- if (obj_bitpos != 0)
- obj_align = least_bit_hwi (obj_bitpos);
+ unsigned int diff_align = known_alignment (obj_bitpos - bitpos);
+ if (diff_align != 0)
+ obj_align = MIN (obj_align, diff_align);
attrs.align = MAX (attrs.align, obj_align);
}
- if (tree_fits_uhwi_p (new_size))
+ poly_uint64 const_size;
+ if (poly_int_tree_p (new_size, &const_size))
{
attrs.size_known_p = true;
- attrs.size = tree_to_uhwi (new_size);
+ attrs.size = const_size;
}
/* If we modified OFFSET based on T, then subtract the outstanding
bit position offset. Similarly, increase the size of the accessed
object to contain the negative offset. */
- if (apply_bitpos)
+ if (maybe_nonzero (apply_bitpos))
{
gcc_assert (attrs.offset_known_p);
- attrs.offset -= apply_bitpos / BITS_PER_UNIT;
+ poly_int64 bytepos = bits_to_bytes_round_down (apply_bitpos);
+ attrs.offset -= bytepos;
if (attrs.size_known_p)
- attrs.size += apply_bitpos / BITS_PER_UNIT;
+ attrs.size += bytepos;
}
/* Now set the attributes we computed above. */
@@ -2153,11 +2167,9 @@ set_mem_attributes (rtx ref, tree t, int
void
set_mem_alias_set (rtx mem, alias_set_type set)
{
- struct mem_attrs attrs;
-
/* If the new and old alias sets don't conflict, something is wrong. */
gcc_checking_assert (alias_sets_conflict_p (set, MEM_ALIAS_SET (mem)));
- attrs = *get_mem_attrs (mem);
+ mem_attrs attrs (*get_mem_attrs (mem));
attrs.alias = set;
set_mem_attrs (mem, &attrs);
}
@@ -2167,9 +2179,7 @@ set_mem_alias_set (rtx mem, alias_set_ty
void
set_mem_addr_space (rtx mem, addr_space_t addrspace)
{
- struct mem_attrs attrs;
-
- attrs = *get_mem_attrs (mem);
+ mem_attrs attrs (*get_mem_attrs (mem));
attrs.addrspace = addrspace;
set_mem_attrs (mem, &attrs);
}
@@ -2179,9 +2189,7 @@ set_mem_addr_space (rtx mem, addr_space_
void
set_mem_align (rtx mem, unsigned int align)
{
- struct mem_attrs attrs;
-
- attrs = *get_mem_attrs (mem);
+ mem_attrs attrs (*get_mem_attrs (mem));
attrs.align = align;
set_mem_attrs (mem, &attrs);
}
@@ -2191,9 +2199,7 @@ set_mem_align (rtx mem, unsigned int ali
void
set_mem_expr (rtx mem, tree expr)
{
- struct mem_attrs attrs;
-
- attrs = *get_mem_attrs (mem);
+ mem_attrs attrs (*get_mem_attrs (mem));
attrs.expr = expr;
set_mem_attrs (mem, &attrs);
}
@@ -2201,11 +2207,9 @@ set_mem_expr (rtx mem, tree expr)
/* Set the offset of MEM to OFFSET. */
void
-set_mem_offset (rtx mem, HOST_WIDE_INT offset)
+set_mem_offset (rtx mem, poly_int64 offset)
{
- struct mem_attrs attrs;
-
- attrs = *get_mem_attrs (mem);
+ mem_attrs attrs (*get_mem_attrs (mem));
attrs.offset_known_p = true;
attrs.offset = offset;
set_mem_attrs (mem, &attrs);
@@ -2216,9 +2220,7 @@ set_mem_offset (rtx mem, HOST_WIDE_INT o
void
clear_mem_offset (rtx mem)
{
- struct mem_attrs attrs;
-
- attrs = *get_mem_attrs (mem);
+ mem_attrs attrs (*get_mem_attrs (mem));
attrs.offset_known_p = false;
set_mem_attrs (mem, &attrs);
}
@@ -2226,11 +2228,9 @@ clear_mem_offset (rtx mem)
/* Set the size of MEM to SIZE. */
void
-set_mem_size (rtx mem, HOST_WIDE_INT size)
+set_mem_size (rtx mem, poly_int64 size)
{
- struct mem_attrs attrs;
-
- attrs = *get_mem_attrs (mem);
+ mem_attrs attrs (*get_mem_attrs (mem));
attrs.size_known_p = true;
attrs.size = size;
set_mem_attrs (mem, &attrs);
@@ -2241,9 +2241,7 @@ set_mem_size (rtx mem, HOST_WIDE_INT siz
void
clear_mem_size (rtx mem)
{
- struct mem_attrs attrs;
-
- attrs = *get_mem_attrs (mem);
+ mem_attrs attrs (*get_mem_attrs (mem));
attrs.size_known_p = false;
set_mem_attrs (mem, &attrs);
}
@@ -2306,9 +2304,9 @@ change_address (rtx memref, machine_mode
{
rtx new_rtx = change_address_1 (memref, mode, addr, 1, false);
machine_mode mmode = GET_MODE (new_rtx);
- struct mem_attrs attrs, *defattrs;
+ struct mem_attrs *defattrs;
- attrs = *get_mem_attrs (memref);
+ mem_attrs attrs (*get_mem_attrs (memref));
defattrs = mode_mem_attrs[(int) mmode];
attrs.expr = NULL_TREE;
attrs.offset_known_p = false;
@@ -2343,15 +2341,14 @@ change_address (rtx memref, machine_mode
has no inherent size. */
rtx
-adjust_address_1 (rtx memref, machine_mode mode, HOST_WIDE_INT offset,
+adjust_address_1 (rtx memref, machine_mode mode, poly_int64 offset,
int validate, int adjust_address, int adjust_object,
- HOST_WIDE_INT size)
+ poly_int64 size)
{
rtx addr = XEXP (memref, 0);
rtx new_rtx;
scalar_int_mode address_mode;
- int pbits;
- struct mem_attrs attrs = *get_mem_attrs (memref), *defattrs;
+ struct mem_attrs attrs (*get_mem_attrs (memref)), *defattrs;
unsigned HOST_WIDE_INT max_align;
#ifdef POINTERS_EXTEND_UNSIGNED
scalar_int_mode pointer_mode
@@ -2368,8 +2365,10 @@ adjust_address_1 (rtx memref, machine_mo
size = defattrs->size;
/* If there are no changes, just return the original memory reference. */
- if (mode == GET_MODE (memref) && !offset
- && (size == 0 || (attrs.size_known_p && attrs.size == size))
+ if (mode == GET_MODE (memref)
+ && known_zero (offset)
+ && (known_zero (size)
+ || (attrs.size_known_p && must_eq (attrs.size, size)))
&& (!validate || memory_address_addr_space_p (mode, addr,
attrs.addrspace)))
return memref;
@@ -2382,22 +2381,17 @@ adjust_address_1 (rtx memref, machine_mo
/* Convert a possibly large offset to a signed value within the
range of the target address space. */
address_mode = get_address_mode (memref);
- pbits = GET_MODE_BITSIZE (address_mode);
- if (HOST_BITS_PER_WIDE_INT > pbits)
- {
- int shift = HOST_BITS_PER_WIDE_INT - pbits;
- offset = (((HOST_WIDE_INT) ((unsigned HOST_WIDE_INT) offset << shift))
- >> shift);
- }
+ offset = trunc_int_for_mode (offset, address_mode);
if (adjust_address)
{
/* If MEMREF is a LO_SUM and the offset is within the alignment of the
object, we can merge it into the LO_SUM. */
- if (GET_MODE (memref) != BLKmode && GET_CODE (addr) == LO_SUM
- && offset >= 0
- && (unsigned HOST_WIDE_INT) offset
- < GET_MODE_ALIGNMENT (GET_MODE (memref)) / BITS_PER_UNIT)
+ if (GET_MODE (memref) != BLKmode
+ && GET_CODE (addr) == LO_SUM
+ && known_in_range_p (offset,
+ 0, (GET_MODE_ALIGNMENT (GET_MODE (memref))
+ / BITS_PER_UNIT)))
addr = gen_rtx_LO_SUM (address_mode, XEXP (addr, 0),
plus_constant (address_mode,
XEXP (addr, 1), offset));
@@ -2408,7 +2402,7 @@ adjust_address_1 (rtx memref, machine_mo
else if (POINTERS_EXTEND_UNSIGNED > 0
&& GET_CODE (addr) == ZERO_EXTEND
&& GET_MODE (XEXP (addr, 0)) == pointer_mode
- && trunc_int_for_mode (offset, pointer_mode) == offset)
+ && must_eq (trunc_int_for_mode (offset, pointer_mode), offset))
addr = gen_rtx_ZERO_EXTEND (address_mode,
plus_constant (pointer_mode,
XEXP (addr, 0), offset));
@@ -2421,7 +2415,7 @@ adjust_address_1 (rtx memref, machine_mo
/* If the address is a REG, change_address_1 rightfully returns memref,
but this would destroy memref's MEM_ATTRS. */
- if (new_rtx == memref && offset != 0)
+ if (new_rtx == memref && maybe_nonzero (offset))
new_rtx = copy_rtx (new_rtx);
/* Conservatively drop the object if we don't know where we start from. */
@@ -2438,7 +2432,7 @@ adjust_address_1 (rtx memref, machine_mo
attrs.offset += offset;
/* Drop the object if the new left end is not within its bounds. */
- if (adjust_object && attrs.offset < 0)
+ if (adjust_object && may_lt (attrs.offset, 0))
{
attrs.expr = NULL_TREE;
attrs.alias = 0;
@@ -2448,16 +2442,16 @@ adjust_address_1 (rtx memref, machine_mo
/* Compute the new alignment by taking the MIN of the alignment and the
lowest-order set bit in OFFSET, but don't change the alignment if OFFSET
if zero. */
- if (offset != 0)
+ if (maybe_nonzero (offset))
{
- max_align = least_bit_hwi (offset) * BITS_PER_UNIT;
+ max_align = known_alignment (offset) * BITS_PER_UNIT;
attrs.align = MIN (attrs.align, max_align);
}
- if (size)
+ if (maybe_nonzero (size))
{
/* Drop the object if the new right end is not within its bounds. */
- if (adjust_object && (offset + size) > attrs.size)
+ if (adjust_object && may_gt (offset + size, attrs.size))
{
attrs.expr = NULL_TREE;
attrs.alias = 0;
@@ -2485,7 +2479,7 @@ adjust_address_1 (rtx memref, machine_mo
rtx
adjust_automodify_address_1 (rtx memref, machine_mode mode, rtx addr,
- HOST_WIDE_INT offset, int validate)
+ poly_int64 offset, int validate)
{
memref = change_address_1 (memref, VOIDmode, addr, validate, false);
return adjust_address_1 (memref, mode, offset, validate, 0, 0, 0);
@@ -2500,9 +2494,9 @@ offset_address (rtx memref, rtx offset,
{
rtx new_rtx, addr = XEXP (memref, 0);
machine_mode address_mode;
- struct mem_attrs attrs, *defattrs;
+ struct mem_attrs *defattrs;
- attrs = *get_mem_attrs (memref);
+ mem_attrs attrs (*get_mem_attrs (memref));
address_mode = get_address_mode (memref);
new_rtx = simplify_gen_binary (PLUS, address_mode, addr, offset);
@@ -2570,17 +2564,16 @@ replace_equiv_address_nv (rtx memref, rt
operations plus masking logic. */
rtx
-widen_memory_access (rtx memref, machine_mode mode, HOST_WIDE_INT offset)
+widen_memory_access (rtx memref, machine_mode mode, poly_int64 offset)
{
rtx new_rtx = adjust_address_1 (memref, mode, offset, 1, 1, 0, 0);
- struct mem_attrs attrs;
unsigned int size = GET_MODE_SIZE (mode);
/* If there are no changes, just return the original memory reference. */
if (new_rtx == memref)
return new_rtx;
- attrs = *get_mem_attrs (new_rtx);
+ mem_attrs attrs (*get_mem_attrs (new_rtx));
/* If we don't know what offset we were at within the expression, then
we can't know if we've overstepped the bounds. */
@@ -2602,28 +2595,30 @@ widen_memory_access (rtx memref, machine
/* Is the field at least as large as the access? If so, ok,
otherwise strip back to the containing structure. */
- if (TREE_CODE (DECL_SIZE_UNIT (field)) == INTEGER_CST
- && compare_tree_int (DECL_SIZE_UNIT (field), size) >= 0
- && attrs.offset >= 0)
+ if (poly_int_tree_p (DECL_SIZE_UNIT (field))
+ && must_ge (wi::to_poly_offset (DECL_SIZE_UNIT (field)), size)
+ && must_ge (attrs.offset, 0))
break;
- if (! tree_fits_uhwi_p (offset))
+ poly_uint64 suboffset;
+ if (!poly_int_tree_p (offset, &suboffset))
{
attrs.expr = NULL_TREE;
break;
}
attrs.expr = TREE_OPERAND (attrs.expr, 0);
- attrs.offset += tree_to_uhwi (offset);
+ attrs.offset += suboffset;
attrs.offset += (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field))
/ BITS_PER_UNIT);
}
/* Similarly for the decl. */
else if (DECL_P (attrs.expr)
&& DECL_SIZE_UNIT (attrs.expr)
- && TREE_CODE (DECL_SIZE_UNIT (attrs.expr)) == INTEGER_CST
- && compare_tree_int (DECL_SIZE_UNIT (attrs.expr), size) >= 0
- && (! attrs.offset_known_p || attrs.offset >= 0))
+ && poly_int_tree_p (DECL_SIZE_UNIT (attrs.expr))
+ && must_ge (wi::to_poly_offset (DECL_SIZE_UNIT (attrs.expr)),
+ size)
+ && must_ge (attrs.offset, 0))
break;
else
{
@@ -2654,7 +2649,6 @@ get_spill_slot_decl (bool force_build_p)
{
tree d = spill_slot_decl;
rtx rd;
- struct mem_attrs attrs;
if (d || !force_build_p)
return d;
@@ -2668,7 +2662,7 @@ get_spill_slot_decl (bool force_build_p)
rd = gen_rtx_MEM (BLKmode, frame_pointer_rtx);
MEM_NOTRAP_P (rd) = 1;
- attrs = *mode_mem_attrs[(int) BLKmode];
+ mem_attrs attrs (*mode_mem_attrs[(int) BLKmode]);
attrs.alias = new_alias_set ();
attrs.expr = d;
set_mem_attrs (rd, &attrs);
@@ -2686,10 +2680,9 @@ get_spill_slot_decl (bool force_build_p)
void
set_mem_attrs_for_spill (rtx mem)
{
- struct mem_attrs attrs;
rtx addr;
- attrs = *get_mem_attrs (mem);
+ mem_attrs attrs (*get_mem_attrs (mem));
attrs.expr = get_spill_slot_decl (true);
attrs.alias = MEM_ALIAS_SET (DECL_RTL (attrs.expr));
attrs.addrspace = ADDR_SPACE_GENERIC;
@@ -2699,10 +2692,7 @@ set_mem_attrs_for_spill (rtx mem)
with perhaps the plus missing for offset = 0. */
addr = XEXP (mem, 0);
attrs.offset_known_p = true;
- attrs.offset = 0;
- if (GET_CODE (addr) == PLUS
- && CONST_INT_P (XEXP (addr, 1)))
- attrs.offset = INTVAL (XEXP (addr, 1));
+ strip_offset (addr, &attrs.offset);
set_mem_attrs (mem, &attrs);
MEM_NOTRAP_P (mem) = 1;
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [018/nnn] poly_int: MEM_OFFSET and MEM_SIZE
2017-10-23 17:08 ` [018/nnn] poly_int: MEM_OFFSET and MEM_SIZE Richard Sandiford
@ 2017-12-06 18:27 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-12-06 18:27 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:07 AM, Richard Sandiford wrote:
> This patch changes the MEM_OFFSET and MEM_SIZE memory attributes
> from HOST_WIDE_INT to poly_int64. Most of it is mechanical,
> but there is one nonbovious change in widen_memory_access.
> Previously the main while loop broke with:
>
> /* Similarly for the decl. */
> else if (DECL_P (attrs.expr)
> && DECL_SIZE_UNIT (attrs.expr)
> && TREE_CODE (DECL_SIZE_UNIT (attrs.expr)) == INTEGER_CST
> && compare_tree_int (DECL_SIZE_UNIT (attrs.expr), size) >= 0
> && (! attrs.offset_known_p || attrs.offset >= 0))
> break;
>
> but it seemed wrong to optimistically assume the best case
> when the offset isn't known (and thus might be negative).
> As it happens, the "! attrs.offset_known_p" condition was
> always false, because we'd already nullified attrs.expr in
> that case:
>
> /* If we don't know what offset we were at within the expression, then
> we can't know if we've overstepped the bounds. */
> if (! attrs.offset_known_p)
> attrs.expr = NULL_TREE;
>
> The patch therefore drops "! attrs.offset_known_p ||" when
> converting the offset check to the may/must interface.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * rtl.h (mem_attrs): Add a default constructor. Change size and
> offset from HOST_WIDE_INT to poly_int64.
> * emit-rtl.h (set_mem_offset, set_mem_size, adjust_address_1)
> (adjust_automodify_address_1, set_mem_attributes_minus_bitpos)
> (widen_memory_access): Take the sizes and offsets as poly_int64s
> rather than HOST_WIDE_INTs.
> * alias.c (ao_ref_from_mem): Handle the new form of MEM_OFFSET.
> (offset_overlap_p): Take poly_int64s rather than HOST_WIDE_INTs
> and ints.
> (adjust_offset_for_component_ref): Change the offset from a
> HOST_WIDE_INT to a poly_int64.
> (nonoverlapping_memrefs_p): Track polynomial offsets and sizes.
> * cfgcleanup.c (merge_memattrs): Update after mem_attrs changes.
> * dce.c (find_call_stack_args): Likewise.
> * dse.c (record_store): Likewise.
> * dwarf2out.c (tls_mem_loc_descriptor, dw_sra_loc_expr): Likewise.
> * print-rtl.c (rtx_writer::print_rtx): Likewise.
> * read-rtl-function.c (test_loading_mem): Likewise.
> * rtlanal.c (may_trap_p_1): Likewise.
> * simplify-rtx.c (delegitimize_mem_from_attrs): Likewise.
> * var-tracking.c (int_mem_offset, track_expr_p): Likewise.
> * emit-rtl.c (mem_attrs_eq_p, get_mem_align_offset): Likewise.
> (mem_attrs::mem_attrs): New function.
> (set_mem_attributes_minus_bitpos): Change bitpos from a
> HOST_WIDE_INT to poly_int64.
> (set_mem_alias_set, set_mem_addr_space, set_mem_align, set_mem_expr)
> (clear_mem_offset, clear_mem_size, change_address)
> (get_spill_slot_decl, set_mem_attrs_for_spill): Directly
> initialize mem_attrs.
> (set_mem_offset, set_mem_size, adjust_address_1)
> (adjust_automodify_address_1, offset_address, widen_memory_access):
> Likewise. Take poly_int64s rather than HOST_WIDE_INT.
>
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [019/nnn] poly_int: lra frame offsets
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (18 preceding siblings ...)
2017-10-23 17:08 ` [018/nnn] poly_int: MEM_OFFSET and MEM_SIZE Richard Sandiford
@ 2017-10-23 17:08 ` Richard Sandiford
2017-12-06 0:16 ` Jeff Law
2017-10-23 17:09 ` [022/nnn] poly_int: C++ bitfield regions Richard Sandiford
` (87 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:08 UTC (permalink / raw)
To: gcc-patches
This patch makes LRA use poly_int64s rather than HOST_WIDE_INTs
to store a frame offset (including in things like eliminations).
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* lra-int.h (lra_reg): Change offset from int to poly_int64.
(lra_insn_recog_data): Change sp_offset from HOST_WIDE_INT
to poly_int64.
(lra_eliminate_regs_1, eliminate_regs_in_insn): Change
update_sp_offset from a HOST_WIDE_INT to a poly_int64.
(lra_update_reg_val_offset, lra_reg_val_equal_p): Take the
offset as a poly_int64 rather than an int.
* lra-assigns.c (find_hard_regno_for_1): Handle poly_int64 offsets.
(setup_live_pseudos_and_spill_after_risky_transforms): Likewise.
* lra-constraints.c (equiv_address_substitution): Track offsets
as poly_int64s.
(emit_inc): Check poly_int_rtx_p instead of CONST_INT_P.
(curr_insn_transform): Handle the new form of sp_offset.
* lra-eliminations.c (lra_elim_table): Change previous_offset
and offset from HOST_WIDE_INT to poly_int64.
(print_elim_table, update_reg_eliminate): Update accordingly.
(self_elim_offsets): Change from HOST_WIDE_INT to poly_int64_pod.
(get_elimination): Update accordingly.
(form_sum): Check poly_int_rtx_p instead of CONST_INT_P.
(lra_eliminate_regs_1, eliminate_regs_in_insn): Change
update_sp_offset from a HOST_WIDE_INT to a poly_int64. Handle
poly_int64 offsets generally.
(curr_sp_change): Change from HOST_WIDE_INT to poly_int64.
(mark_not_eliminable, init_elimination): Update accordingly.
(remove_reg_equal_offset_note): Return a bool and pass the new
offset back by pointer as a poly_int64.
* lra-remat.c (change_sp_offset): Take sp_offset as a poly_int64
rather than a HOST_WIDE_INT.
(do_remat): Track offsets poly_int64s.
* lra.c (lra_update_insn_recog_data, setup_sp_offset): Likewise.
Index: gcc/lra-int.h
===================================================================
--- gcc/lra-int.h 2017-10-23 16:52:19.836152258 +0100
+++ gcc/lra-int.h 2017-10-23 17:01:59.910337542 +0100
@@ -106,7 +106,7 @@ struct lra_reg
they do not conflict. */
int val;
/* Offset from relative eliminate register to pesudo reg. */
- int offset;
+ poly_int64 offset;
/* These members are set up in lra-lives.c and updated in
lra-coalesce.c. */
/* The biggest size mode in which each pseudo reg is referred in
@@ -213,7 +213,7 @@ struct lra_insn_recog_data
insn. */
int used_insn_alternative;
/* SP offset before the insn relative to one at the func start. */
- HOST_WIDE_INT sp_offset;
+ poly_int64 sp_offset;
/* The insn itself. */
rtx_insn *insn;
/* Common data for insns with the same ICODE. Asm insns (their
@@ -406,8 +406,8 @@ extern bool lra_remat (void);
extern void lra_debug_elim_table (void);
extern int lra_get_elimination_hard_regno (int);
extern rtx lra_eliminate_regs_1 (rtx_insn *, rtx, machine_mode,
- bool, bool, HOST_WIDE_INT, bool);
-extern void eliminate_regs_in_insn (rtx_insn *insn, bool, bool, HOST_WIDE_INT);
+ bool, bool, poly_int64, bool);
+extern void eliminate_regs_in_insn (rtx_insn *insn, bool, bool, poly_int64);
extern void lra_eliminate (bool, bool);
extern void lra_eliminate_reg_if_possible (rtx *);
@@ -493,7 +493,7 @@ lra_get_insn_recog_data (rtx_insn *insn)
/* Update offset from pseudos with VAL by INCR. */
static inline void
-lra_update_reg_val_offset (int val, int incr)
+lra_update_reg_val_offset (int val, poly_int64 incr)
{
int i;
@@ -506,10 +506,10 @@ lra_update_reg_val_offset (int val, int
/* Return true if register content is equal to VAL with OFFSET. */
static inline bool
-lra_reg_val_equal_p (int regno, int val, int offset)
+lra_reg_val_equal_p (int regno, int val, poly_int64 offset)
{
if (lra_reg_info[regno].val == val
- && lra_reg_info[regno].offset == offset)
+ && must_eq (lra_reg_info[regno].offset, offset))
return true;
return false;
Index: gcc/lra-assigns.c
===================================================================
--- gcc/lra-assigns.c 2017-10-23 16:52:19.836152258 +0100
+++ gcc/lra-assigns.c 2017-10-23 17:01:59.909338965 +0100
@@ -485,7 +485,8 @@ find_hard_regno_for_1 (int regno, int *c
int hr, conflict_hr, nregs;
machine_mode biggest_mode;
unsigned int k, conflict_regno;
- int offset, val, biggest_nregs, nregs_diff;
+ poly_int64 offset;
+ int val, biggest_nregs, nregs_diff;
enum reg_class rclass;
bitmap_iterator bi;
bool *rclass_intersect_p;
@@ -1147,7 +1148,8 @@ setup_live_pseudos_and_spill_after_risky
{
int p, i, j, n, regno, hard_regno;
unsigned int k, conflict_regno;
- int val, offset;
+ poly_int64 offset;
+ int val;
HARD_REG_SET conflict_set;
machine_mode mode;
lra_live_range_t r;
Index: gcc/lra-constraints.c
===================================================================
--- gcc/lra-constraints.c 2017-10-23 16:52:19.836152258 +0100
+++ gcc/lra-constraints.c 2017-10-23 17:01:59.910337542 +0100
@@ -3084,7 +3084,8 @@ can_add_disp_p (struct address_info *ad)
equiv_address_substitution (struct address_info *ad)
{
rtx base_reg, new_base_reg, index_reg, new_index_reg, *base_term, *index_term;
- HOST_WIDE_INT disp, scale;
+ poly_int64 disp;
+ HOST_WIDE_INT scale;
bool change_p;
base_term = strip_subreg (ad->base_term);
@@ -3115,6 +3116,7 @@ equiv_address_substitution (struct addre
}
if (base_reg != new_base_reg)
{
+ poly_int64 offset;
if (REG_P (new_base_reg))
{
*base_term = new_base_reg;
@@ -3122,10 +3124,10 @@ equiv_address_substitution (struct addre
}
else if (GET_CODE (new_base_reg) == PLUS
&& REG_P (XEXP (new_base_reg, 0))
- && CONST_INT_P (XEXP (new_base_reg, 1))
+ && poly_int_rtx_p (XEXP (new_base_reg, 1), &offset)
&& can_add_disp_p (ad))
{
- disp += INTVAL (XEXP (new_base_reg, 1));
+ disp += offset;
*base_term = XEXP (new_base_reg, 0);
change_p = true;
}
@@ -3134,6 +3136,7 @@ equiv_address_substitution (struct addre
}
if (index_reg != new_index_reg)
{
+ poly_int64 offset;
if (REG_P (new_index_reg))
{
*index_term = new_index_reg;
@@ -3141,16 +3144,16 @@ equiv_address_substitution (struct addre
}
else if (GET_CODE (new_index_reg) == PLUS
&& REG_P (XEXP (new_index_reg, 0))
- && CONST_INT_P (XEXP (new_index_reg, 1))
+ && poly_int_rtx_p (XEXP (new_index_reg, 1), &offset)
&& can_add_disp_p (ad)
&& (scale = get_index_scale (ad)))
{
- disp += INTVAL (XEXP (new_index_reg, 1)) * scale;
+ disp += offset * scale;
*index_term = XEXP (new_index_reg, 0);
change_p = true;
}
}
- if (disp != 0)
+ if (maybe_nonzero (disp))
{
if (ad->disp != NULL)
*ad->disp = plus_constant (GET_MODE (*ad->inner), *ad->disp, disp);
@@ -3629,9 +3632,10 @@ emit_inc (enum reg_class new_rclass, rtx
register. */
if (plus_p)
{
- if (CONST_INT_P (inc))
+ poly_int64 offset;
+ if (poly_int_rtx_p (inc, &offset))
emit_insn (gen_add2_insn (result,
- gen_int_mode (-INTVAL (inc),
+ gen_int_mode (-offset,
GET_MODE (result))));
else
emit_insn (gen_sub2_insn (result, inc));
@@ -3999,10 +4003,13 @@ curr_insn_transform (bool check_only_p)
if (INSN_CODE (curr_insn) >= 0
&& (p = get_insn_name (INSN_CODE (curr_insn))) != NULL)
fprintf (lra_dump_file, " {%s}", p);
- if (curr_id->sp_offset != 0)
- fprintf (lra_dump_file, " (sp_off=%" HOST_WIDE_INT_PRINT "d)",
- curr_id->sp_offset);
- fprintf (lra_dump_file, "\n");
+ if (maybe_nonzero (curr_id->sp_offset))
+ {
+ fprintf (lra_dump_file, " (sp_off=");
+ print_dec (curr_id->sp_offset, lra_dump_file);
+ fprintf (lra_dump_file, ")");
+ }
+ fprintf (lra_dump_file, "\n");
}
/* Right now, for any pair of operands I and J that are required to
Index: gcc/lra-eliminations.c
===================================================================
--- gcc/lra-eliminations.c 2017-10-23 16:52:19.836152258 +0100
+++ gcc/lra-eliminations.c 2017-10-23 17:01:59.910337542 +0100
@@ -79,9 +79,9 @@ struct lra_elim_table
int to;
/* Difference between values of the two hard registers above on
previous iteration. */
- HOST_WIDE_INT previous_offset;
+ poly_int64 previous_offset;
/* Difference between the values on the current iteration. */
- HOST_WIDE_INT offset;
+ poly_int64 offset;
/* Nonzero if this elimination can be done. */
bool can_eliminate;
/* CAN_ELIMINATE since the last check. */
@@ -120,10 +120,14 @@ print_elim_table (FILE *f)
struct lra_elim_table *ep;
for (ep = reg_eliminate; ep < ®_eliminate[NUM_ELIMINABLE_REGS]; ep++)
- fprintf (f, "%s eliminate %d to %d (offset=" HOST_WIDE_INT_PRINT_DEC
- ", prev_offset=" HOST_WIDE_INT_PRINT_DEC ")\n",
- ep->can_eliminate ? "Can" : "Can't",
- ep->from, ep->to, ep->offset, ep->previous_offset);
+ {
+ fprintf (f, "%s eliminate %d to %d (offset=",
+ ep->can_eliminate ? "Can" : "Can't", ep->from, ep->to);
+ print_dec (ep->offset, f);
+ fprintf (f, ", prev_offset=");
+ print_dec (ep->previous_offset, f);
+ fprintf (f, ")\n");
+ }
}
/* Print info about elimination table to stderr. */
@@ -161,7 +165,7 @@ setup_can_eliminate (struct lra_elim_tab
/* Offsets should be used to restore original offsets for eliminable
hard register which just became not eliminable. Zero,
otherwise. */
-static HOST_WIDE_INT self_elim_offsets[FIRST_PSEUDO_REGISTER];
+static poly_int64_pod self_elim_offsets[FIRST_PSEUDO_REGISTER];
/* Map: hard regno -> RTL presentation. RTL presentations of all
potentially eliminable hard registers are stored in the map. */
@@ -193,6 +197,7 @@ setup_elimination_map (void)
form_sum (rtx x, rtx y)
{
machine_mode mode = GET_MODE (x);
+ poly_int64 offset;
if (mode == VOIDmode)
mode = GET_MODE (y);
@@ -200,10 +205,10 @@ form_sum (rtx x, rtx y)
if (mode == VOIDmode)
mode = Pmode;
- if (CONST_INT_P (x))
- return plus_constant (mode, y, INTVAL (x));
- else if (CONST_INT_P (y))
- return plus_constant (mode, x, INTVAL (y));
+ if (poly_int_rtx_p (x, &offset))
+ return plus_constant (mode, y, offset);
+ else if (poly_int_rtx_p (y, &offset))
+ return plus_constant (mode, x, offset);
else if (CONSTANT_P (x))
std::swap (x, y);
@@ -252,14 +257,14 @@ get_elimination (rtx reg)
{
int hard_regno;
struct lra_elim_table *ep;
- HOST_WIDE_INT offset;
lra_assert (REG_P (reg));
if ((hard_regno = REGNO (reg)) < 0 || hard_regno >= FIRST_PSEUDO_REGISTER)
return NULL;
if ((ep = elimination_map[hard_regno]) != NULL)
return ep->from_rtx != reg ? NULL : ep;
- if ((offset = self_elim_offsets[hard_regno]) == 0)
+ poly_int64 offset = self_elim_offsets[hard_regno];
+ if (known_zero (offset))
return NULL;
/* This is an iteration to restore offsets just after HARD_REGNO
stopped to be eliminable. */
@@ -325,7 +330,7 @@ move_plus_up (rtx x)
rtx
lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
bool subst_p, bool update_p,
- HOST_WIDE_INT update_sp_offset, bool full_p)
+ poly_int64 update_sp_offset, bool full_p)
{
enum rtx_code code = GET_CODE (x);
struct lra_elim_table *ep;
@@ -335,7 +340,8 @@ lra_eliminate_regs_1 (rtx_insn *insn, rt
int copied = 0;
lra_assert (!update_p || !full_p);
- lra_assert (update_sp_offset == 0 || (!subst_p && update_p && !full_p));
+ lra_assert (known_zero (update_sp_offset)
+ || (!subst_p && update_p && !full_p));
if (! current_function_decl)
return x;
@@ -360,7 +366,7 @@ lra_eliminate_regs_1 (rtx_insn *insn, rt
{
rtx to = subst_p ? ep->to_rtx : ep->from_rtx;
- if (update_sp_offset != 0)
+ if (maybe_nonzero (update_sp_offset))
{
if (ep->to_rtx == stack_pointer_rtx)
return plus_constant (Pmode, to, update_sp_offset);
@@ -387,20 +393,21 @@ lra_eliminate_regs_1 (rtx_insn *insn, rt
{
if ((ep = get_elimination (XEXP (x, 0))) != NULL)
{
- HOST_WIDE_INT offset;
+ poly_int64 offset, curr_offset;
rtx to = subst_p ? ep->to_rtx : ep->from_rtx;
if (! update_p && ! full_p)
return gen_rtx_PLUS (Pmode, to, XEXP (x, 1));
- if (update_sp_offset != 0)
+ if (maybe_nonzero (update_sp_offset))
offset = ep->to_rtx == stack_pointer_rtx ? update_sp_offset : 0;
else
offset = (update_p
? ep->offset - ep->previous_offset : ep->offset);
if (full_p && insn != NULL_RTX && ep->to_rtx == stack_pointer_rtx)
offset -= lra_get_insn_recog_data (insn)->sp_offset;
- if (CONST_INT_P (XEXP (x, 1)) && INTVAL (XEXP (x, 1)) == -offset)
+ if (poly_int_rtx_p (XEXP (x, 1), &curr_offset)
+ && must_eq (curr_offset, -offset))
return to;
else
return gen_rtx_PLUS (Pmode, to,
@@ -449,7 +456,7 @@ lra_eliminate_regs_1 (rtx_insn *insn, rt
{
rtx to = subst_p ? ep->to_rtx : ep->from_rtx;
- if (update_sp_offset != 0)
+ if (maybe_nonzero (update_sp_offset))
{
if (ep->to_rtx == stack_pointer_rtx)
return plus_constant (Pmode,
@@ -464,7 +471,7 @@ lra_eliminate_regs_1 (rtx_insn *insn, rt
* INTVAL (XEXP (x, 1)));
else if (full_p)
{
- HOST_WIDE_INT offset = ep->offset;
+ poly_int64 offset = ep->offset;
if (insn != NULL_RTX && ep->to_rtx == stack_pointer_rtx)
offset -= lra_get_insn_recog_data (insn)->sp_offset;
@@ -711,7 +718,7 @@ lra_eliminate_regs (rtx x, machine_mode
/* Stack pointer offset before the current insn relative to one at the
func start. RTL insns can change SP explicitly. We keep the
changes from one insn to another through this variable. */
-static HOST_WIDE_INT curr_sp_change;
+static poly_int64 curr_sp_change;
/* Scan rtx X for references to elimination source or target registers
in contexts that would prevent the elimination from happening.
@@ -725,6 +732,7 @@ mark_not_eliminable (rtx x, machine_mode
struct lra_elim_table *ep;
int i, j;
const char *fmt;
+ poly_int64 offset = 0;
switch (code)
{
@@ -738,7 +746,7 @@ mark_not_eliminable (rtx x, machine_mode
&& ((code != PRE_MODIFY && code != POST_MODIFY)
|| (GET_CODE (XEXP (x, 1)) == PLUS
&& XEXP (x, 0) == XEXP (XEXP (x, 1), 0)
- && CONST_INT_P (XEXP (XEXP (x, 1), 1)))))
+ && poly_int_rtx_p (XEXP (XEXP (x, 1), 1), &offset))))
{
int size = GET_MODE_SIZE (mem_mode);
@@ -752,7 +760,7 @@ mark_not_eliminable (rtx x, machine_mode
else if (code == PRE_INC || code == POST_INC)
curr_sp_change += size;
else if (code == PRE_MODIFY || code == POST_MODIFY)
- curr_sp_change += INTVAL (XEXP (XEXP (x, 1), 1));
+ curr_sp_change += offset;
}
else if (REG_P (XEXP (x, 0))
&& REGNO (XEXP (x, 0)) >= FIRST_PSEUDO_REGISTER)
@@ -802,9 +810,9 @@ mark_not_eliminable (rtx x, machine_mode
if (SET_DEST (x) == stack_pointer_rtx
&& GET_CODE (SET_SRC (x)) == PLUS
&& XEXP (SET_SRC (x), 0) == SET_DEST (x)
- && CONST_INT_P (XEXP (SET_SRC (x), 1)))
+ && poly_int_rtx_p (XEXP (SET_SRC (x), 1), &offset))
{
- curr_sp_change += INTVAL (XEXP (SET_SRC (x), 1));
+ curr_sp_change += offset;
return;
}
if (! REG_P (SET_DEST (x))
@@ -859,11 +867,11 @@ mark_not_eliminable (rtx x, machine_mode
#ifdef HARD_FRAME_POINTER_REGNUM
-/* Find offset equivalence note for reg WHAT in INSN and return the
- found elmination offset. If the note is not found, return NULL.
- Remove the found note. */
-static rtx
-remove_reg_equal_offset_note (rtx_insn *insn, rtx what)
+/* Search INSN's reg notes to see whether the destination is equal to
+ WHAT + C for some constant C. Return true if so, storing C in
+ *OFFSET_OUT and removing the reg note. */
+static bool
+remove_reg_equal_offset_note (rtx_insn *insn, rtx what, poly_int64 *offset_out)
{
rtx link, *link_loc;
@@ -873,12 +881,12 @@ remove_reg_equal_offset_note (rtx_insn *
if (REG_NOTE_KIND (link) == REG_EQUAL
&& GET_CODE (XEXP (link, 0)) == PLUS
&& XEXP (XEXP (link, 0), 0) == what
- && CONST_INT_P (XEXP (XEXP (link, 0), 1)))
+ && poly_int_rtx_p (XEXP (XEXP (link, 0), 1), offset_out))
{
*link_loc = XEXP (link, 1);
- return XEXP (XEXP (link, 0), 1);
+ return true;
}
- return NULL_RTX;
+ return false;
}
#endif
@@ -899,7 +907,7 @@ remove_reg_equal_offset_note (rtx_insn *
void
eliminate_regs_in_insn (rtx_insn *insn, bool replace_p, bool first_p,
- HOST_WIDE_INT update_sp_offset)
+ poly_int64 update_sp_offset)
{
int icode = recog_memoized (insn);
rtx old_set = single_set (insn);
@@ -940,28 +948,21 @@ eliminate_regs_in_insn (rtx_insn *insn,
nonlocal goto. */
{
rtx src = SET_SRC (old_set);
- rtx off = remove_reg_equal_offset_note (insn, ep->to_rtx);
-
+ poly_int64 offset = 0;
+
/* We should never process such insn with non-zero
UPDATE_SP_OFFSET. */
- lra_assert (update_sp_offset == 0);
+ lra_assert (known_zero (update_sp_offset));
- if (off != NULL_RTX
- || src == ep->to_rtx
- || (GET_CODE (src) == PLUS
- && XEXP (src, 0) == ep->to_rtx
- && CONST_INT_P (XEXP (src, 1))))
+ if (remove_reg_equal_offset_note (insn, ep->to_rtx, &offset)
+ || strip_offset (src, &offset) == ep->to_rtx)
{
- HOST_WIDE_INT offset;
-
if (replace_p)
{
SET_DEST (old_set) = ep->to_rtx;
lra_update_insn_recog_data (insn);
return;
}
- offset = (off != NULL_RTX ? INTVAL (off)
- : src == ep->to_rtx ? 0 : INTVAL (XEXP (src, 1)));
offset -= (ep->offset - ep->previous_offset);
src = plus_constant (Pmode, ep->to_rtx, offset);
@@ -997,13 +998,13 @@ eliminate_regs_in_insn (rtx_insn *insn,
currently support: a single set with the source or a REG_EQUAL
note being a PLUS of an eliminable register and a constant. */
plus_src = plus_cst_src = 0;
+ poly_int64 offset = 0;
if (old_set && REG_P (SET_DEST (old_set)))
{
if (GET_CODE (SET_SRC (old_set)) == PLUS)
plus_src = SET_SRC (old_set);
/* First see if the source is of the form (plus (...) CST). */
- if (plus_src
- && CONST_INT_P (XEXP (plus_src, 1)))
+ if (plus_src && poly_int_rtx_p (XEXP (plus_src, 1), &offset))
plus_cst_src = plus_src;
/* Check that the first operand of the PLUS is a hard reg or
the lowpart subreg of one. */
@@ -1021,7 +1022,6 @@ eliminate_regs_in_insn (rtx_insn *insn,
if (plus_cst_src)
{
rtx reg = XEXP (plus_cst_src, 0);
- HOST_WIDE_INT offset = INTVAL (XEXP (plus_cst_src, 1));
if (GET_CODE (reg) == SUBREG)
reg = SUBREG_REG (reg);
@@ -1032,7 +1032,7 @@ eliminate_regs_in_insn (rtx_insn *insn,
if (! replace_p)
{
- if (update_sp_offset == 0)
+ if (known_zero (update_sp_offset))
offset += (ep->offset - ep->previous_offset);
if (ep->to_rtx == stack_pointer_rtx)
{
@@ -1051,7 +1051,7 @@ eliminate_regs_in_insn (rtx_insn *insn,
the cost of the insn by replacing a simple REG with (plus
(reg sp) CST). So try only when we already had a PLUS
before. */
- if (offset == 0 || plus_src)
+ if (known_zero (offset) || plus_src)
{
rtx new_src = plus_constant (GET_MODE (to_rtx), to_rtx, offset);
@@ -1239,7 +1239,7 @@ update_reg_eliminate (bitmap insns_with_
if (lra_dump_file != NULL)
fprintf (lra_dump_file, " Using elimination %d to %d now\n",
ep1->from, ep1->to);
- lra_assert (ep1->previous_offset == 0);
+ lra_assert (known_zero (ep1->previous_offset));
ep1->previous_offset = ep->offset;
}
else
@@ -1251,7 +1251,7 @@ update_reg_eliminate (bitmap insns_with_
fprintf (lra_dump_file, " %d is not eliminable at all\n",
ep->from);
self_elim_offsets[ep->from] = -ep->offset;
- if (ep->offset != 0)
+ if (maybe_nonzero (ep->offset))
bitmap_ior_into (insns_with_changed_offsets,
&lra_reg_info[ep->from].insn_bitmap);
}
@@ -1271,7 +1271,7 @@ update_reg_eliminate (bitmap insns_with_
the usage for pseudos. */
if (ep->from != ep->to)
SET_HARD_REG_BIT (temp_hard_reg_set, ep->to);
- if (ep->previous_offset != ep->offset)
+ if (may_ne (ep->previous_offset, ep->offset))
{
bitmap_ior_into (insns_with_changed_offsets,
&lra_reg_info[ep->from].insn_bitmap);
@@ -1357,13 +1357,13 @@ init_elimination (void)
if (NONDEBUG_INSN_P (insn))
{
mark_not_eliminable (PATTERN (insn), VOIDmode);
- if (curr_sp_change != 0
+ if (maybe_nonzero (curr_sp_change)
&& find_reg_note (insn, REG_LABEL_OPERAND, NULL_RTX))
stop_to_sp_elimination_p = true;
}
}
if (! frame_pointer_needed
- && (curr_sp_change != 0 || stop_to_sp_elimination_p)
+ && (maybe_nonzero (curr_sp_change) || stop_to_sp_elimination_p)
&& bb->succs && bb->succs->length () != 0)
for (ep = reg_eliminate; ep < ®_eliminate[NUM_ELIMINABLE_REGS]; ep++)
if (ep->to == STACK_POINTER_REGNUM)
Index: gcc/lra-remat.c
===================================================================
--- gcc/lra-remat.c 2017-10-23 16:52:19.836152258 +0100
+++ gcc/lra-remat.c 2017-10-23 17:01:59.910337542 +0100
@@ -994,7 +994,7 @@ calculate_global_remat_bb_data (void)
/* Setup sp offset attribute to SP_OFFSET for all INSNS. */
static void
-change_sp_offset (rtx_insn *insns, HOST_WIDE_INT sp_offset)
+change_sp_offset (rtx_insn *insns, poly_int64 sp_offset)
{
for (rtx_insn *insn = insns; insn != NULL; insn = NEXT_INSN (insn))
eliminate_regs_in_insn (insn, false, false, sp_offset);
@@ -1118,7 +1118,7 @@ do_remat (void)
int i, hard_regno, nregs;
int dst_hard_regno, dst_nregs;
rtx_insn *remat_insn = NULL;
- HOST_WIDE_INT cand_sp_offset = 0;
+ poly_int64 cand_sp_offset = 0;
if (cand != NULL)
{
lra_insn_recog_data_t cand_id
@@ -1241,8 +1241,8 @@ do_remat (void)
if (remat_insn != NULL)
{
- HOST_WIDE_INT sp_offset_change = cand_sp_offset - id->sp_offset;
- if (sp_offset_change != 0)
+ poly_int64 sp_offset_change = cand_sp_offset - id->sp_offset;
+ if (maybe_nonzero (sp_offset_change))
change_sp_offset (remat_insn, sp_offset_change);
update_scratch_ops (remat_insn);
lra_process_new_insns (insn, remat_insn, NULL,
Index: gcc/lra.c
===================================================================
--- gcc/lra.c 2017-10-23 16:52:19.836152258 +0100
+++ gcc/lra.c 2017-10-23 17:01:59.911336118 +0100
@@ -1163,7 +1163,7 @@ lra_update_insn_recog_data (rtx_insn *in
int n;
unsigned int uid = INSN_UID (insn);
struct lra_static_insn_data *insn_static_data;
- HOST_WIDE_INT sp_offset = 0;
+ poly_int64 sp_offset = 0;
check_and_expand_insn_recog_data (uid);
if ((data = lra_insn_recog_data[uid]) != NULL
@@ -1805,8 +1805,8 @@ push_insns (rtx_insn *from, rtx_insn *to
setup_sp_offset (rtx_insn *from, rtx_insn *last)
{
rtx_insn *before = next_nonnote_insn_bb (last);
- HOST_WIDE_INT offset = (before == NULL_RTX || ! INSN_P (before)
- ? 0 : lra_get_insn_recog_data (before)->sp_offset);
+ poly_int64 offset = (before == NULL_RTX || ! INSN_P (before)
+ ? 0 : lra_get_insn_recog_data (before)->sp_offset);
for (rtx_insn *insn = from; insn != NEXT_INSN (last); insn = NEXT_INSN (insn))
lra_get_insn_recog_data (insn)->sp_offset = offset;
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [019/nnn] poly_int: lra frame offsets
2017-10-23 17:08 ` [019/nnn] poly_int: lra frame offsets Richard Sandiford
@ 2017-12-06 0:16 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-12-06 0:16 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:07 AM, Richard Sandiford wrote:
> This patch makes LRA use poly_int64s rather than HOST_WIDE_INTs
> to store a frame offset (including in things like eliminations).
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * lra-int.h (lra_reg): Change offset from int to poly_int64.
> (lra_insn_recog_data): Change sp_offset from HOST_WIDE_INT
> to poly_int64.
> (lra_eliminate_regs_1, eliminate_regs_in_insn): Change
> update_sp_offset from a HOST_WIDE_INT to a poly_int64.
> (lra_update_reg_val_offset, lra_reg_val_equal_p): Take the
> offset as a poly_int64 rather than an int.
> * lra-assigns.c (find_hard_regno_for_1): Handle poly_int64 offsets.
> (setup_live_pseudos_and_spill_after_risky_transforms): Likewise.
> * lra-constraints.c (equiv_address_substitution): Track offsets
> as poly_int64s.
> (emit_inc): Check poly_int_rtx_p instead of CONST_INT_P.
> (curr_insn_transform): Handle the new form of sp_offset.
> * lra-eliminations.c (lra_elim_table): Change previous_offset
> and offset from HOST_WIDE_INT to poly_int64.
> (print_elim_table, update_reg_eliminate): Update accordingly.
> (self_elim_offsets): Change from HOST_WIDE_INT to poly_int64_pod.
> (get_elimination): Update accordingly.
> (form_sum): Check poly_int_rtx_p instead of CONST_INT_P.
> (lra_eliminate_regs_1, eliminate_regs_in_insn): Change
> update_sp_offset from a HOST_WIDE_INT to a poly_int64. Handle
> poly_int64 offsets generally.
> (curr_sp_change): Change from HOST_WIDE_INT to poly_int64.
> (mark_not_eliminable, init_elimination): Update accordingly.
> (remove_reg_equal_offset_note): Return a bool and pass the new
> offset back by pointer as a poly_int64.
> * lra-remat.c (change_sp_offset): Take sp_offset as a poly_int64
> rather than a HOST_WIDE_INT.
> (do_remat): Track offsets poly_int64s.
> * lra.c (lra_update_insn_recog_data, setup_sp_offset): Likewise.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [022/nnn] poly_int: C++ bitfield regions
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (19 preceding siblings ...)
2017-10-23 17:08 ` [019/nnn] poly_int: lra frame offsets Richard Sandiford
@ 2017-10-23 17:09 ` Richard Sandiford
2017-12-05 23:39 ` Jeff Law
2017-10-23 17:09 ` [021/nnn] poly_int: extract_bit_field bitrange Richard Sandiford
` (86 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:09 UTC (permalink / raw)
To: gcc-patches
This patch changes C++ bitregion_start/end values from constants to
poly_ints. Although it's unlikely that the size needs to be polynomial
in practice, the offset could be with future language extensions.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* expmed.h (store_bit_field): Change bitregion_start and
bitregion_end from unsigned HOST_WIDE_INT to poly_uint64.
* expmed.c (adjust_bit_field_mem_for_reg, strict_volatile_bitfield_p)
(store_bit_field_1, store_integral_bit_field, store_bit_field)
(store_fixed_bit_field, store_split_bit_field): Likewise.
* expr.c (store_constructor_field, store_field): Likewise.
(optimize_bitfield_assignment_op): Likewise. Make the same change
to bitsize and bitpos.
* machmode.h (bit_field_mode_iterator): Change m_bitregion_start
and m_bitregion_end from HOST_WIDE_INT to poly_int64. Make the
same change in the constructor arguments.
(get_best_mode): Change bitregion_start and bitregion_end from
unsigned HOST_WIDE_INT to poly_uint64.
* stor-layout.c (bit_field_mode_iterator::bit_field_mode_iterator):
Change bitregion_start and bitregion_end from HOST_WIDE_INT to
poly_int64.
(bit_field_mode_iterator::next_mode): Update for new types
of m_bitregion_start and m_bitregion_end.
(get_best_mode): Change bitregion_start and bitregion_end from
unsigned HOST_WIDE_INT to poly_uint64.
Index: gcc/expmed.h
===================================================================
--- gcc/expmed.h 2017-10-23 17:11:50.109574423 +0100
+++ gcc/expmed.h 2017-10-23 17:11:54.533863145 +0100
@@ -719,8 +719,7 @@ extern rtx expand_divmod (int, enum tree
#endif
extern void store_bit_field (rtx, poly_uint64, poly_uint64,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ poly_uint64, poly_uint64,
machine_mode, rtx, bool);
extern rtx extract_bit_field (rtx, poly_uint64, poly_uint64, int, rtx,
machine_mode, machine_mode, bool, rtx *);
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c 2017-10-23 17:11:50.109574423 +0100
+++ gcc/expmed.c 2017-10-23 17:11:54.533863145 +0100
@@ -49,14 +49,12 @@ struct target_expmed *this_target_expmed
static bool store_integral_bit_field (rtx, opt_scalar_int_mode,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ poly_uint64, poly_uint64,
machine_mode, rtx, bool, bool);
static void store_fixed_bit_field (rtx, opt_scalar_int_mode,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ poly_uint64, poly_uint64,
rtx, scalar_int_mode, bool);
static void store_fixed_bit_field_1 (rtx, scalar_int_mode,
unsigned HOST_WIDE_INT,
@@ -65,8 +63,7 @@ static void store_fixed_bit_field_1 (rtx
static void store_split_bit_field (rtx, opt_scalar_int_mode,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ poly_uint64, poly_uint64,
rtx, scalar_int_mode, bool);
static rtx extract_integral_bit_field (rtx, opt_scalar_int_mode,
unsigned HOST_WIDE_INT,
@@ -471,8 +468,8 @@ narrow_bit_field_mem (rtx mem, opt_scala
adjust_bit_field_mem_for_reg (enum extraction_pattern pattern,
rtx op0, HOST_WIDE_INT bitsize,
HOST_WIDE_INT bitnum,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ poly_uint64 bitregion_start,
+ poly_uint64 bitregion_end,
machine_mode fieldmode,
unsigned HOST_WIDE_INT *new_bitnum)
{
@@ -536,8 +533,8 @@ lowpart_bit_field_p (poly_uint64 bitnum,
strict_volatile_bitfield_p (rtx op0, unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitnum,
scalar_int_mode fieldmode,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end)
+ poly_uint64 bitregion_start,
+ poly_uint64 bitregion_end)
{
unsigned HOST_WIDE_INT modesize = GET_MODE_BITSIZE (fieldmode);
@@ -564,9 +561,10 @@ strict_volatile_bitfield_p (rtx op0, uns
return false;
/* Check for cases where the C++ memory model applies. */
- if (bitregion_end != 0
- && (bitnum - bitnum % modesize < bitregion_start
- || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
+ if (maybe_nonzero (bitregion_end)
+ && (may_lt (bitnum - bitnum % modesize, bitregion_start)
+ || may_gt (bitnum - bitnum % modesize + modesize - 1,
+ bitregion_end)))
return false;
return true;
@@ -730,8 +728,7 @@ store_bit_field_using_insv (const extrac
static bool
store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ poly_uint64 bitregion_start, poly_uint64 bitregion_end,
machine_mode fieldmode,
rtx value, bool reverse, bool fallback_p)
{
@@ -858,8 +855,8 @@ store_bit_field_1 (rtx str_rtx, poly_uin
store_integral_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitnum,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ poly_uint64 bitregion_start,
+ poly_uint64 bitregion_end,
machine_mode fieldmode,
rtx value, bool reverse, bool fallback_p)
{
@@ -1085,8 +1082,7 @@ store_integral_bit_field (rtx op0, opt_s
void
store_bit_field (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ poly_uint64 bitregion_start, poly_uint64 bitregion_end,
machine_mode fieldmode,
rtx value, bool reverse)
{
@@ -1133,15 +1129,12 @@ store_bit_field (rtx str_rtx, poly_uint6
/* Under the C++0x memory model, we must not touch bits outside the
bit region. Adjust the address to start at the beginning of the
bit region. */
- if (MEM_P (str_rtx) && bitregion_start > 0)
+ if (MEM_P (str_rtx) && maybe_nonzero (bitregion_start))
{
scalar_int_mode best_mode;
machine_mode addr_mode = VOIDmode;
- HOST_WIDE_INT offset;
-
- gcc_assert ((bitregion_start % BITS_PER_UNIT) == 0);
- offset = bitregion_start / BITS_PER_UNIT;
+ poly_uint64 offset = exact_div (bitregion_start, BITS_PER_UNIT);
bitnum -= bitregion_start;
poly_int64 size = bits_to_bytes_round_up (bitnum + bitsize);
bitregion_end -= bitregion_start;
@@ -1174,8 +1167,7 @@ store_bit_field (rtx str_rtx, poly_uint6
store_fixed_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitnum,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ poly_uint64 bitregion_start, poly_uint64 bitregion_end,
rtx value, scalar_int_mode value_mode, bool reverse)
{
/* There is a case not handled here:
@@ -1330,8 +1322,7 @@ store_fixed_bit_field_1 (rtx op0, scalar
store_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ poly_uint64 bitregion_start, poly_uint64 bitregion_end,
rtx value, scalar_int_mode value_mode, bool reverse)
{
unsigned int unit, total_bits, bitsdone = 0;
@@ -1379,9 +1370,9 @@ store_split_bit_field (rtx op0, opt_scal
UNIT close to the end of the region as needed. If op0 is a REG
or SUBREG of REG, don't do this, as there can't be data races
on a register and we can expand shorter code in some cases. */
- if (bitregion_end
+ if (maybe_nonzero (bitregion_end)
&& unit > BITS_PER_UNIT
- && bitpos + bitsdone - thispos + unit > bitregion_end + 1
+ && may_gt (bitpos + bitsdone - thispos + unit, bitregion_end + 1)
&& !REG_P (op0)
&& (GET_CODE (op0) != SUBREG || !REG_P (SUBREG_REG (op0))))
{
Index: gcc/expr.c
===================================================================
--- gcc/expr.c 2017-10-23 17:11:43.725043907 +0100
+++ gcc/expr.c 2017-10-23 17:11:54.535862371 +0100
@@ -79,13 +79,9 @@ static void emit_block_move_via_loop (rt
static void clear_by_pieces (rtx, unsigned HOST_WIDE_INT, unsigned int);
static rtx_insn *compress_float_constant (rtx, rtx);
static rtx get_subtarget (rtx);
-static void store_constructor_field (rtx, unsigned HOST_WIDE_INT,
- HOST_WIDE_INT, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, machine_mode,
- tree, int, alias_set_type, bool);
static void store_constructor (tree, rtx, int, HOST_WIDE_INT, bool);
static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
+ poly_uint64, poly_uint64,
machine_mode, tree, alias_set_type, bool, bool);
static unsigned HOST_WIDE_INT highest_pow2_factor_for_target (const_tree, const_tree);
@@ -4611,10 +4607,10 @@ get_subtarget (rtx x)
and there's nothing else to do. */
static bool
-optimize_bitfield_assignment_op (unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+optimize_bitfield_assignment_op (poly_uint64 pbitsize,
+ poly_uint64 pbitpos,
+ poly_uint64 pbitregion_start,
+ poly_uint64 pbitregion_end,
machine_mode mode1, rtx str_rtx,
tree to, tree src, bool reverse)
{
@@ -4626,7 +4622,12 @@ optimize_bitfield_assignment_op (unsigne
gimple *srcstmt;
enum tree_code code;
+ unsigned HOST_WIDE_INT bitsize, bitpos, bitregion_start, bitregion_end;
if (mode1 != VOIDmode
+ || !pbitsize.is_constant (&bitsize)
+ || !pbitpos.is_constant (&bitpos)
+ || !pbitregion_start.is_constant (&bitregion_start)
+ || !pbitregion_end.is_constant (&bitregion_end)
|| bitsize >= BITS_PER_WORD
|| str_bitsize > BITS_PER_WORD
|| TREE_SIDE_EFFECTS (to)
@@ -6082,8 +6083,8 @@ all_zeros_p (const_tree exp)
static void
store_constructor_field (rtx target, unsigned HOST_WIDE_INT bitsize,
HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ poly_uint64 bitregion_start,
+ poly_uint64 bitregion_end,
machine_mode mode,
tree exp, int cleared,
alias_set_type alias_set, bool reverse)
@@ -6762,8 +6763,7 @@ store_constructor (tree exp, rtx target,
static rtx
store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ poly_uint64 bitregion_start, poly_uint64 bitregion_end,
machine_mode mode, tree exp,
alias_set_type alias_set, bool nontemporal, bool reverse)
{
Index: gcc/machmode.h
===================================================================
--- gcc/machmode.h 2017-10-23 17:11:43.725043907 +0100
+++ gcc/machmode.h 2017-10-23 17:11:54.535862371 +0100
@@ -760,7 +760,7 @@ mode_for_int_vector (machine_mode mode)
{
public:
bit_field_mode_iterator (HOST_WIDE_INT, HOST_WIDE_INT,
- HOST_WIDE_INT, HOST_WIDE_INT,
+ poly_int64, poly_int64,
unsigned int, bool);
bool next_mode (scalar_int_mode *);
bool prefer_smaller_modes ();
@@ -771,8 +771,8 @@ mode_for_int_vector (machine_mode mode)
for invalid input such as gcc.dg/pr48335-8.c. */
HOST_WIDE_INT m_bitsize;
HOST_WIDE_INT m_bitpos;
- HOST_WIDE_INT m_bitregion_start;
- HOST_WIDE_INT m_bitregion_end;
+ poly_int64 m_bitregion_start;
+ poly_int64 m_bitregion_end;
unsigned int m_align;
bool m_volatilep;
int m_count;
@@ -780,8 +780,7 @@ mode_for_int_vector (machine_mode mode)
/* Find the best mode to use to access a bit field. */
-extern bool get_best_mode (int, int, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, unsigned int,
+extern bool get_best_mode (int, int, poly_uint64, poly_uint64, unsigned int,
unsigned HOST_WIDE_INT, bool, scalar_int_mode *);
/* Determine alignment, 1<=result<=BIGGEST_ALIGNMENT. */
Index: gcc/stor-layout.c
===================================================================
--- gcc/stor-layout.c 2017-10-23 17:11:43.725043907 +0100
+++ gcc/stor-layout.c 2017-10-23 17:11:54.535862371 +0100
@@ -2747,15 +2747,15 @@ fixup_unsigned_type (tree type)
bit_field_mode_iterator
::bit_field_mode_iterator (HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
- HOST_WIDE_INT bitregion_start,
- HOST_WIDE_INT bitregion_end,
+ poly_int64 bitregion_start,
+ poly_int64 bitregion_end,
unsigned int align, bool volatilep)
: m_mode (NARROWEST_INT_MODE), m_bitsize (bitsize),
m_bitpos (bitpos), m_bitregion_start (bitregion_start),
m_bitregion_end (bitregion_end), m_align (align),
m_volatilep (volatilep), m_count (0)
{
- if (!m_bitregion_end)
+ if (known_zero (m_bitregion_end))
{
/* We can assume that any aligned chunk of ALIGN bits that overlaps
the bitfield is mapped and won't trap, provided that ALIGN isn't
@@ -2765,8 +2765,8 @@ fixup_unsigned_type (tree type)
= MIN (align, MAX (BIGGEST_ALIGNMENT, BITS_PER_WORD));
if (bitsize <= 0)
bitsize = 1;
- m_bitregion_end = bitpos + bitsize + units - 1;
- m_bitregion_end -= m_bitregion_end % units + 1;
+ HOST_WIDE_INT end = bitpos + bitsize + units - 1;
+ m_bitregion_end = end - end % units - 1;
}
}
@@ -2803,10 +2803,11 @@ bit_field_mode_iterator::next_mode (scal
/* Stop if the mode goes outside the bitregion. */
HOST_WIDE_INT start = m_bitpos - substart;
- if (m_bitregion_start && start < m_bitregion_start)
+ if (maybe_nonzero (m_bitregion_start)
+ && may_lt (start, m_bitregion_start))
break;
HOST_WIDE_INT end = start + unit;
- if (end > m_bitregion_end + 1)
+ if (may_gt (end, m_bitregion_end + 1))
break;
/* Stop if the mode requires too much alignment. */
@@ -2862,8 +2863,7 @@ bit_field_mode_iterator::prefer_smaller_
bool
get_best_mode (int bitsize, int bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ poly_uint64 bitregion_start, poly_uint64 bitregion_end,
unsigned int align,
unsigned HOST_WIDE_INT largest_mode_bitsize, bool volatilep,
scalar_int_mode *best_mode)
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [022/nnn] poly_int: C++ bitfield regions
2017-10-23 17:09 ` [022/nnn] poly_int: C++ bitfield regions Richard Sandiford
@ 2017-12-05 23:39 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-12-05 23:39 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:08 AM, Richard Sandiford wrote:
> This patch changes C++ bitregion_start/end values from constants to
> poly_ints. Although it's unlikely that the size needs to be polynomial
> in practice, the offset could be with future language extensions.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * expmed.h (store_bit_field): Change bitregion_start and
> bitregion_end from unsigned HOST_WIDE_INT to poly_uint64.
> * expmed.c (adjust_bit_field_mem_for_reg, strict_volatile_bitfield_p)
> (store_bit_field_1, store_integral_bit_field, store_bit_field)
> (store_fixed_bit_field, store_split_bit_field): Likewise.
> * expr.c (store_constructor_field, store_field): Likewise.
> (optimize_bitfield_assignment_op): Likewise. Make the same change
> to bitsize and bitpos.
> * machmode.h (bit_field_mode_iterator): Change m_bitregion_start
> and m_bitregion_end from HOST_WIDE_INT to poly_int64. Make the
> same change in the constructor arguments.
> (get_best_mode): Change bitregion_start and bitregion_end from
> unsigned HOST_WIDE_INT to poly_uint64.
> * stor-layout.c (bit_field_mode_iterator::bit_field_mode_iterator):
> Change bitregion_start and bitregion_end from HOST_WIDE_INT to
> poly_int64.
> (bit_field_mode_iterator::next_mode): Update for new types
> of m_bitregion_start and m_bitregion_end.
> (get_best_mode): Change bitregion_start and bitregion_end from
> unsigned HOST_WIDE_INT to poly_uint64.
>
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [021/nnn] poly_int: extract_bit_field bitrange
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (20 preceding siblings ...)
2017-10-23 17:09 ` [022/nnn] poly_int: C++ bitfield regions Richard Sandiford
@ 2017-10-23 17:09 ` Richard Sandiford
2017-12-05 23:46 ` Jeff Law
2017-10-23 17:09 ` [023/nnn] poly_int: store_field & co Richard Sandiford
` (85 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:09 UTC (permalink / raw)
To: gcc-patches
Similar to the previous store_bit_field patch, but for extractions
rather than insertions. The patch splits out the extraction-as-subreg
handling into a new function (extract_bit_field_as_subreg), both for
ease of writing and because a later patch will add another caller.
The simplify_gen_subreg overload is temporary; it goes away
in a later patch.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* rtl.h (simplify_gen_subreg): Add a temporary overload that
accepts poly_uint64 offsets.
* expmed.h (extract_bit_field): Take bitsize and bitnum as
poly_uint64s rather than unsigned HOST_WIDE_INTs.
* expmed.c (lowpart_bit_field_p): Likewise.
(extract_bit_field_as_subreg): New function, split out from...
(extract_bit_field_1): ...here. Take bitsize and bitnum as
poly_uint64s rather than unsigned HOST_WIDE_INTs. For vector
extractions, check that BITSIZE matches the size of the extracted
value and that BITNUM is an exact multiple of that size.
If all else fails, try forcing the value into memory if
BITNUM is variable, and adjusting the address so that the
offset is constant. Split the part that can only handle constant
bitsize and bitnum out into...
(extract_integral_bit_field): ...this new function.
(extract_bit_field): Take bitsize and bitnum as poly_uint64s
rather than unsigned HOST_WIDE_INTs.
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h 2017-10-23 17:11:43.774024962 +0100
+++ gcc/rtl.h 2017-10-23 17:11:50.109574423 +0100
@@ -3267,6 +3267,12 @@ extern rtx simplify_subreg (machine_mode
unsigned int);
extern rtx simplify_gen_subreg (machine_mode, rtx, machine_mode,
unsigned int);
+inline rtx
+simplify_gen_subreg (machine_mode omode, rtx x, machine_mode imode,
+ poly_uint64 offset)
+{
+ return simplify_gen_subreg (omode, x, imode, offset.to_constant ());
+}
extern rtx lowpart_subreg (machine_mode, rtx, machine_mode);
extern rtx simplify_replace_fn_rtx (rtx, const_rtx,
rtx (*fn) (rtx, const_rtx, void *), void *);
Index: gcc/expmed.h
===================================================================
--- gcc/expmed.h 2017-10-23 17:11:43.774024962 +0100
+++ gcc/expmed.h 2017-10-23 17:11:50.109574423 +0100
@@ -722,8 +722,7 @@ extern void store_bit_field (rtx, poly_u
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
machine_mode, rtx, bool);
-extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, int, rtx,
+extern rtx extract_bit_field (rtx, poly_uint64, poly_uint64, int, rtx,
machine_mode, machine_mode, bool, rtx *);
extern rtx extract_low_bits (machine_mode, machine_mode, rtx);
extern rtx expand_mult (machine_mode, rtx, rtx, rtx, int);
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c 2017-10-23 17:11:43.774024962 +0100
+++ gcc/expmed.c 2017-10-23 17:11:50.109574423 +0100
@@ -68,6 +68,10 @@ static void store_split_bit_field (rtx,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
rtx, scalar_int_mode, bool);
+static rtx extract_integral_bit_field (rtx, opt_scalar_int_mode,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT, int, rtx,
+ machine_mode, machine_mode, bool, bool);
static rtx extract_fixed_bit_field (machine_mode, rtx, opt_scalar_int_mode,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, rtx, int, bool);
@@ -509,17 +513,17 @@ adjust_bit_field_mem_for_reg (enum extra
offset is then BITNUM / BITS_PER_UNIT. */
static bool
-lowpart_bit_field_p (unsigned HOST_WIDE_INT bitnum,
- unsigned HOST_WIDE_INT bitsize,
+lowpart_bit_field_p (poly_uint64 bitnum, poly_uint64 bitsize,
machine_mode struct_mode)
{
- unsigned HOST_WIDE_INT regsize = REGMODE_NATURAL_SIZE (struct_mode);
+ poly_uint64 regsize = REGMODE_NATURAL_SIZE (struct_mode);
if (BYTES_BIG_ENDIAN)
- return (bitnum % BITS_PER_UNIT == 0
- && (bitnum + bitsize == GET_MODE_BITSIZE (struct_mode)
- || (bitnum + bitsize) % (regsize * BITS_PER_UNIT) == 0));
+ return (multiple_p (bitnum, BITS_PER_UNIT)
+ && (must_eq (bitnum + bitsize, GET_MODE_BITSIZE (struct_mode))
+ || multiple_p (bitnum + bitsize,
+ regsize * BITS_PER_UNIT)));
else
- return bitnum % (regsize * BITS_PER_UNIT) == 0;
+ return multiple_p (bitnum, regsize * BITS_PER_UNIT);
}
/* Return true if -fstrict-volatile-bitfields applies to an access of OP0
@@ -1574,16 +1578,33 @@ extract_bit_field_using_extv (const extr
return NULL_RTX;
}
+/* See whether it would be valid to extract the part of OP0 described
+ by BITNUM and BITSIZE into a value of mode MODE using a subreg
+ operation. Return the subreg if so, otherwise return null. */
+
+static rtx
+extract_bit_field_as_subreg (machine_mode mode, rtx op0,
+ poly_uint64 bitsize, poly_uint64 bitnum)
+{
+ poly_uint64 bytenum;
+ if (multiple_p (bitnum, BITS_PER_UNIT, &bytenum)
+ && must_eq (bitsize, GET_MODE_BITSIZE (mode))
+ && lowpart_bit_field_p (bitnum, bitsize, GET_MODE (op0))
+ && TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op0)))
+ return simplify_gen_subreg (mode, op0, GET_MODE (op0), bytenum);
+ return NULL_RTX;
+}
+
/* A subroutine of extract_bit_field, with the same arguments.
If FALLBACK_P is true, fall back to extract_fixed_bit_field
if we can find no other means of implementing the operation.
if FALLBACK_P is false, return NULL instead. */
static rtx
-extract_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, int unsignedp, rtx target,
- machine_mode mode, machine_mode tmode,
- bool reverse, bool fallback_p, rtx *alt_rtl)
+extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
+ int unsignedp, rtx target, machine_mode mode,
+ machine_mode tmode, bool reverse, bool fallback_p,
+ rtx *alt_rtl)
{
rtx op0 = str_rtx;
machine_mode mode1;
@@ -1600,13 +1621,13 @@ extract_bit_field_1 (rtx str_rtx, unsign
/* If we have an out-of-bounds access to a register, just return an
uninitialized register of the required mode. This can occur if the
source code contains an out-of-bounds access to a small array. */
- if (REG_P (op0) && bitnum >= GET_MODE_BITSIZE (GET_MODE (op0)))
+ if (REG_P (op0) && must_ge (bitnum, GET_MODE_BITSIZE (GET_MODE (op0))))
return gen_reg_rtx (tmode);
if (REG_P (op0)
&& mode == GET_MODE (op0)
- && bitnum == 0
- && bitsize == GET_MODE_BITSIZE (GET_MODE (op0)))
+ && known_zero (bitnum)
+ && must_eq (bitsize, GET_MODE_BITSIZE (GET_MODE (op0))))
{
if (reverse)
op0 = flip_storage_order (mode, op0);
@@ -1618,6 +1639,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
if (VECTOR_MODE_P (GET_MODE (op0))
&& !MEM_P (op0)
&& VECTOR_MODE_P (tmode)
+ && must_eq (bitsize, GET_MODE_SIZE (tmode))
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (tmode))
{
machine_mode new_mode = GET_MODE (op0);
@@ -1633,18 +1655,17 @@ extract_bit_field_1 (rtx str_rtx, unsign
|| !targetm.vector_mode_supported_p (new_mode))
new_mode = VOIDmode;
}
+ poly_uint64 pos;
if (new_mode != VOIDmode
&& (convert_optab_handler (vec_extract_optab, new_mode, tmode)
!= CODE_FOR_nothing)
- && ((bitnum + bitsize - 1) / GET_MODE_BITSIZE (tmode)
- == bitnum / GET_MODE_BITSIZE (tmode)))
+ && multiple_p (bitnum, GET_MODE_BITSIZE (tmode), &pos))
{
struct expand_operand ops[3];
machine_mode outermode = new_mode;
machine_mode innermode = tmode;
enum insn_code icode
= convert_optab_handler (vec_extract_optab, outermode, innermode);
- unsigned HOST_WIDE_INT pos = bitnum / GET_MODE_BITSIZE (innermode);
if (new_mode != GET_MODE (op0))
op0 = gen_lowpart (new_mode, op0);
@@ -1697,17 +1718,17 @@ extract_bit_field_1 (rtx str_rtx, unsign
available. */
machine_mode outermode = GET_MODE (op0);
scalar_mode innermode = GET_MODE_INNER (outermode);
+ poly_uint64 pos;
if (VECTOR_MODE_P (outermode)
&& !MEM_P (op0)
&& (convert_optab_handler (vec_extract_optab, outermode, innermode)
!= CODE_FOR_nothing)
- && ((bitnum + bitsize - 1) / GET_MODE_BITSIZE (innermode)
- == bitnum / GET_MODE_BITSIZE (innermode)))
+ && must_eq (bitsize, GET_MODE_BITSIZE (innermode))
+ && multiple_p (bitnum, GET_MODE_BITSIZE (innermode), &pos))
{
struct expand_operand ops[3];
enum insn_code icode
= convert_optab_handler (vec_extract_optab, outermode, innermode);
- unsigned HOST_WIDE_INT pos = bitnum / GET_MODE_BITSIZE (innermode);
create_output_operand (&ops[0], target, innermode);
ops[0].target = 1;
@@ -1765,14 +1786,9 @@ extract_bit_field_1 (rtx str_rtx, unsign
/* Extraction of a full MODE1 value can be done with a subreg as long
as the least significant bit of the value is the least significant
bit of either OP0 or a word of OP0. */
- if (!MEM_P (op0)
- && !reverse
- && lowpart_bit_field_p (bitnum, bitsize, op0_mode.require ())
- && bitsize == GET_MODE_BITSIZE (mode1)
- && TRULY_NOOP_TRUNCATION_MODES_P (mode1, op0_mode.require ()))
+ if (!MEM_P (op0) && !reverse)
{
- rtx sub = simplify_gen_subreg (mode1, op0, op0_mode.require (),
- bitnum / BITS_PER_UNIT);
+ rtx sub = extract_bit_field_as_subreg (mode1, op0, bitsize, bitnum);
if (sub)
return convert_extracted_bit_field (sub, mode, tmode, unsignedp);
}
@@ -1788,6 +1804,39 @@ extract_bit_field_1 (rtx str_rtx, unsign
return convert_extracted_bit_field (op0, mode, tmode, unsignedp);
}
+ /* If we have a memory source and a non-constant bit offset, restrict
+ the memory to the referenced bytes. This is a worst-case fallback
+ but is useful for things like vector booleans. */
+ if (MEM_P (op0) && !bitnum.is_constant ())
+ {
+ bytenum = bits_to_bytes_round_down (bitnum);
+ bitnum = num_trailing_bits (bitnum);
+ poly_uint64 bytesize = bits_to_bytes_round_up (bitnum + bitsize);
+ op0 = adjust_bitfield_address_size (op0, BLKmode, bytenum, bytesize);
+ op0_mode = opt_scalar_int_mode ();
+ }
+
+ /* It's possible we'll need to handle other cases here for
+ polynomial bitnum and bitsize. */
+
+ /* From here on we need to be looking at a fixed-size insertion. */
+ return extract_integral_bit_field (op0, op0_mode, bitsize.to_constant (),
+ bitnum.to_constant (), unsignedp,
+ target, mode, tmode, reverse, fallback_p);
+}
+
+/* Subroutine of extract_bit_field_1, with the same arguments, except
+ that BITSIZE and BITNUM are constant. Handle cases specific to
+ integral modes. If OP0_MODE is defined, it is the mode of OP0,
+ otherwise OP0 is a BLKmode MEM. */
+
+static rtx
+extract_integral_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
+ unsigned HOST_WIDE_INT bitsize,
+ unsigned HOST_WIDE_INT bitnum, int unsignedp,
+ rtx target, machine_mode mode, machine_mode tmode,
+ bool reverse, bool fallback_p)
+{
/* Handle fields bigger than a word. */
if (bitsize > BITS_PER_WORD)
@@ -1807,12 +1856,16 @@ extract_bit_field_1 (rtx str_rtx, unsign
/* In case we're about to clobber a base register or something
(see gcc.c-torture/execute/20040625-1.c). */
- if (reg_mentioned_p (target, str_rtx))
+ if (reg_mentioned_p (target, op0))
target = gen_reg_rtx (mode);
/* Indicate for flow that the entire target reg is being set. */
emit_clobber (target);
+ /* The mode must be fixed-size, since extract_bit_field_1 handles
+ extractions from variable-sized objects before calling this
+ function. */
+ unsigned int target_size = GET_MODE_SIZE (GET_MODE (target));
last = get_last_insn ();
for (i = 0; i < nwords; i++)
{
@@ -1820,9 +1873,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
if I is 1, use the next to lowest word; and so on. */
/* Word number in TARGET to use. */
unsigned int wordnum
- = (backwards
- ? GET_MODE_SIZE (GET_MODE (target)) / UNITS_PER_WORD - i - 1
- : i);
+ = (backwards ? target_size / UNITS_PER_WORD - i - 1 : i);
/* Offset from start of field in OP0. */
unsigned int bit_offset = (backwards ^ reverse
? MAX ((int) bitsize - ((int) i + 1)
@@ -1851,11 +1902,11 @@ extract_bit_field_1 (rtx str_rtx, unsign
{
/* Unless we've filled TARGET, the upper regs in a multi-reg value
need to be zero'd out. */
- if (GET_MODE_SIZE (GET_MODE (target)) > nwords * UNITS_PER_WORD)
+ if (target_size > nwords * UNITS_PER_WORD)
{
unsigned int i, total_words;
- total_words = GET_MODE_SIZE (GET_MODE (target)) / UNITS_PER_WORD;
+ total_words = target_size / UNITS_PER_WORD;
for (i = nwords; i < total_words; i++)
emit_move_insn
(operand_subword (target,
@@ -1993,10 +2044,9 @@ extract_bit_field_1 (rtx str_rtx, unsign
if they are equally easy. */
rtx
-extract_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, int unsignedp, rtx target,
- machine_mode mode, machine_mode tmode, bool reverse,
- rtx *alt_rtl)
+extract_bit_field (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
+ int unsignedp, rtx target, machine_mode mode,
+ machine_mode tmode, bool reverse, rtx *alt_rtl)
{
machine_mode mode1;
@@ -2008,28 +2058,34 @@ extract_bit_field (rtx str_rtx, unsigned
else
mode1 = tmode;
+ unsigned HOST_WIDE_INT ibitsize, ibitnum;
scalar_int_mode int_mode;
- if (is_a <scalar_int_mode> (mode1, &int_mode)
- && strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, int_mode, 0, 0))
+ if (bitsize.is_constant (&ibitsize)
+ && bitnum.is_constant (&ibitnum)
+ && is_a <scalar_int_mode> (mode1, &int_mode)
+ && strict_volatile_bitfield_p (str_rtx, ibitsize, ibitnum,
+ int_mode, 0, 0))
{
/* Extraction of a full INT_MODE value can be done with a simple load.
We know here that the field can be accessed with one single
instruction. For targets that support unaligned memory,
an unaligned access may be necessary. */
- if (bitsize == GET_MODE_BITSIZE (int_mode))
+ if (ibitsize == GET_MODE_BITSIZE (int_mode))
{
rtx result = adjust_bitfield_address (str_rtx, int_mode,
- bitnum / BITS_PER_UNIT);
+ ibitnum / BITS_PER_UNIT);
if (reverse)
result = flip_storage_order (int_mode, result);
- gcc_assert (bitnum % BITS_PER_UNIT == 0);
+ gcc_assert (ibitnum % BITS_PER_UNIT == 0);
return convert_extracted_bit_field (result, mode, tmode, unsignedp);
}
- str_rtx = narrow_bit_field_mem (str_rtx, int_mode, bitsize, bitnum,
- &bitnum);
- gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (int_mode));
+ str_rtx = narrow_bit_field_mem (str_rtx, int_mode, ibitsize, ibitnum,
+ &ibitnum);
+ gcc_assert (ibitnum + ibitsize <= GET_MODE_BITSIZE (int_mode));
str_rtx = copy_to_reg (str_rtx);
+ return extract_bit_field_1 (str_rtx, ibitsize, ibitnum, unsignedp,
+ target, mode, tmode, reverse, true, alt_rtl);
}
return extract_bit_field_1 (str_rtx, bitsize, bitnum, unsignedp,
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [021/nnn] poly_int: extract_bit_field bitrange
2017-10-23 17:09 ` [021/nnn] poly_int: extract_bit_field bitrange Richard Sandiford
@ 2017-12-05 23:46 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-12-05 23:46 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:08 AM, Richard Sandiford wrote:
> Similar to the previous store_bit_field patch, but for extractions
> rather than insertions. The patch splits out the extraction-as-subreg
> handling into a new function (extract_bit_field_as_subreg), both for
> ease of writing and because a later patch will add another caller.
>
> The simplify_gen_subreg overload is temporary; it goes away
> in a later patch.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * rtl.h (simplify_gen_subreg): Add a temporary overload that
> accepts poly_uint64 offsets.
> * expmed.h (extract_bit_field): Take bitsize and bitnum as
> poly_uint64s rather than unsigned HOST_WIDE_INTs.
> * expmed.c (lowpart_bit_field_p): Likewise.
> (extract_bit_field_as_subreg): New function, split out from...
> (extract_bit_field_1): ...here. Take bitsize and bitnum as
> poly_uint64s rather than unsigned HOST_WIDE_INTs. For vector
> extractions, check that BITSIZE matches the size of the extracted
> value and that BITNUM is an exact multiple of that size.
> If all else fails, try forcing the value into memory if
> BITNUM is variable, and adjusting the address so that the
> offset is constant. Split the part that can only handle constant
> bitsize and bitnum out into...
> (extract_integral_bit_field): ...this new function.
> (extract_bit_field): Take bitsize and bitnum as poly_uint64s
> rather than unsigned HOST_WIDE_INTs.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [023/nnn] poly_int: store_field & co
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (21 preceding siblings ...)
2017-10-23 17:09 ` [021/nnn] poly_int: extract_bit_field bitrange Richard Sandiford
@ 2017-10-23 17:09 ` Richard Sandiford
2017-12-05 23:49 ` Jeff Law
2017-10-23 17:10 ` [024/nnn] poly_int: ira subreg liveness tracking Richard Sandiford
` (84 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:09 UTC (permalink / raw)
To: gcc-patches
This patch makes store_field and related routines use poly_ints
for bit positions and sizes. It keeps the existing choices
between signed and unsigned types (there are a mixture of both).
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* expr.c (store_constructor_field): Change bitsize from a
unsigned HOST_WIDE_INT to a poly_uint64 and bitpos from a
HOST_WIDE_INT to a poly_int64.
(store_constructor): Change size from a HOST_WIDE_INT to
a poly_int64.
(store_field): Likewise bitsize and bitpos.
Index: gcc/expr.c
===================================================================
--- gcc/expr.c 2017-10-23 17:11:54.535862371 +0100
+++ gcc/expr.c 2017-10-23 17:11:55.989300194 +0100
@@ -79,9 +79,8 @@ static void emit_block_move_via_loop (rt
static void clear_by_pieces (rtx, unsigned HOST_WIDE_INT, unsigned int);
static rtx_insn *compress_float_constant (rtx, rtx);
static rtx get_subtarget (rtx);
-static void store_constructor (tree, rtx, int, HOST_WIDE_INT, bool);
-static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT,
- poly_uint64, poly_uint64,
+static void store_constructor (tree, rtx, int, poly_int64, bool);
+static rtx store_field (rtx, poly_int64, poly_int64, poly_uint64, poly_uint64,
machine_mode, tree, alias_set_type, bool, bool);
static unsigned HOST_WIDE_INT highest_pow2_factor_for_target (const_tree, const_tree);
@@ -6081,31 +6080,34 @@ all_zeros_p (const_tree exp)
clear a substructure if the outer structure has already been cleared. */
static void
-store_constructor_field (rtx target, unsigned HOST_WIDE_INT bitsize,
- HOST_WIDE_INT bitpos,
+store_constructor_field (rtx target, poly_uint64 bitsize, poly_int64 bitpos,
poly_uint64 bitregion_start,
poly_uint64 bitregion_end,
machine_mode mode,
tree exp, int cleared,
alias_set_type alias_set, bool reverse)
{
+ poly_int64 bytepos;
+ poly_uint64 bytesize;
if (TREE_CODE (exp) == CONSTRUCTOR
/* We can only call store_constructor recursively if the size and
bit position are on a byte boundary. */
- && bitpos % BITS_PER_UNIT == 0
- && (bitsize > 0 && bitsize % BITS_PER_UNIT == 0)
+ && multiple_p (bitpos, BITS_PER_UNIT, &bytepos)
+ && maybe_nonzero (bitsize)
+ && multiple_p (bitsize, BITS_PER_UNIT, &bytesize)
/* If we have a nonzero bitpos for a register target, then we just
let store_field do the bitfield handling. This is unlikely to
generate unnecessary clear instructions anyways. */
- && (bitpos == 0 || MEM_P (target)))
+ && (known_zero (bitpos) || MEM_P (target)))
{
if (MEM_P (target))
- target
- = adjust_address (target,
- GET_MODE (target) == BLKmode
- || 0 != (bitpos
- % GET_MODE_ALIGNMENT (GET_MODE (target)))
- ? BLKmode : VOIDmode, bitpos / BITS_PER_UNIT);
+ {
+ machine_mode target_mode = GET_MODE (target);
+ if (target_mode != BLKmode
+ && !multiple_p (bitpos, GET_MODE_ALIGNMENT (target_mode)))
+ target_mode = BLKmode;
+ target = adjust_address (target, target_mode, bytepos);
+ }
/* Update the alias set, if required. */
@@ -6116,8 +6118,7 @@ store_constructor_field (rtx target, uns
set_mem_alias_set (target, alias_set);
}
- store_constructor (exp, target, cleared, bitsize / BITS_PER_UNIT,
- reverse);
+ store_constructor (exp, target, cleared, bytesize, reverse);
}
else
store_field (target, bitsize, bitpos, bitregion_start, bitregion_end, mode,
@@ -6151,12 +6152,12 @@ fields_length (const_tree type)
If REVERSE is true, the store is to be done in reverse order. */
static void
-store_constructor (tree exp, rtx target, int cleared, HOST_WIDE_INT size,
+store_constructor (tree exp, rtx target, int cleared, poly_int64 size,
bool reverse)
{
tree type = TREE_TYPE (exp);
HOST_WIDE_INT exp_size = int_size_in_bytes (type);
- HOST_WIDE_INT bitregion_end = size > 0 ? size * BITS_PER_UNIT - 1 : 0;
+ poly_int64 bitregion_end = must_gt (size, 0) ? size * BITS_PER_UNIT - 1 : 0;
switch (TREE_CODE (type))
{
@@ -6171,7 +6172,7 @@ store_constructor (tree exp, rtx target,
reverse = TYPE_REVERSE_STORAGE_ORDER (type);
/* If size is zero or the target is already cleared, do nothing. */
- if (size == 0 || cleared)
+ if (known_zero (size) || cleared)
cleared = 1;
/* We either clear the aggregate or indicate the value is dead. */
else if ((TREE_CODE (type) == UNION_TYPE
@@ -6200,14 +6201,14 @@ store_constructor (tree exp, rtx target,
the whole structure first. Don't do this if TARGET is a
register whose mode size isn't equal to SIZE since
clear_storage can't handle this case. */
- else if (size > 0
+ else if (known_size_p (size)
&& (((int) CONSTRUCTOR_NELTS (exp) != fields_length (type))
|| mostly_zeros_p (exp))
&& (!REG_P (target)
- || ((HOST_WIDE_INT) GET_MODE_SIZE (GET_MODE (target))
- == size)))
+ || must_eq (GET_MODE_SIZE (GET_MODE (target)), size)))
{
- clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);
+ clear_storage (target, gen_int_mode (size, Pmode),
+ BLOCK_OP_NORMAL);
cleared = 1;
}
@@ -6388,12 +6389,13 @@ store_constructor (tree exp, rtx target,
need_to_clear = 1;
}
- if (need_to_clear && size > 0)
+ if (need_to_clear && may_gt (size, 0))
{
if (REG_P (target))
- emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+ emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
else
- clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);
+ clear_storage (target, gen_int_mode (size, Pmode),
+ BLOCK_OP_NORMAL);
cleared = 1;
}
@@ -6407,7 +6409,7 @@ store_constructor (tree exp, rtx target,
FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (exp), i, index, value)
{
machine_mode mode;
- HOST_WIDE_INT bitsize;
+ poly_int64 bitsize;
HOST_WIDE_INT bitpos;
rtx xtarget = target;
@@ -6500,7 +6502,8 @@ store_constructor (tree exp, rtx target,
xtarget = adjust_address (xtarget, mode, 0);
if (TREE_CODE (value) == CONSTRUCTOR)
store_constructor (value, xtarget, cleared,
- bitsize / BITS_PER_UNIT, reverse);
+ exact_div (bitsize, BITS_PER_UNIT),
+ reverse);
else
store_expr (value, xtarget, 0, false, reverse);
@@ -6669,12 +6672,13 @@ store_constructor (tree exp, rtx target,
need_to_clear = (count < n_elts || 4 * zero_count >= 3 * count);
}
- if (need_to_clear && size > 0 && !vector)
+ if (need_to_clear && may_gt (size, 0) && !vector)
{
if (REG_P (target))
emit_move_insn (target, CONST0_RTX (mode));
else
- clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);
+ clear_storage (target, gen_int_mode (size, Pmode),
+ BLOCK_OP_NORMAL);
cleared = 1;
}
@@ -6762,7 +6766,7 @@ store_constructor (tree exp, rtx target,
If REVERSE is true, the store is to be done in reverse order. */
static rtx
-store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
+store_field (rtx target, poly_int64 bitsize, poly_int64 bitpos,
poly_uint64 bitregion_start, poly_uint64 bitregion_end,
machine_mode mode, tree exp,
alias_set_type alias_set, bool nontemporal, bool reverse)
@@ -6773,7 +6777,7 @@ store_field (rtx target, HOST_WIDE_INT b
/* If we have nothing to store, do nothing unless the expression has
side-effects. Don't do that for zero sized addressable lhs of
calls. */
- if (bitsize == 0
+ if (known_zero (bitsize)
&& (!TREE_ADDRESSABLE (TREE_TYPE (exp))
|| TREE_CODE (exp) != CALL_EXPR))
return expand_expr (exp, const0_rtx, VOIDmode, EXPAND_NORMAL);
@@ -6782,7 +6786,7 @@ store_field (rtx target, HOST_WIDE_INT b
{
/* We're storing into a struct containing a single __complex. */
- gcc_assert (!bitpos);
+ gcc_assert (known_zero (bitpos));
return store_expr (exp, target, 0, nontemporal, reverse);
}
@@ -6790,6 +6794,7 @@ store_field (rtx target, HOST_WIDE_INT b
is a bit field, we cannot use addressing to access it.
Use bit-field techniques or SUBREG to store in it. */
+ poly_int64 decl_bitsize;
if (mode == VOIDmode
|| (mode != BLKmode && ! direct_store[(int) mode]
&& GET_MODE_CLASS (mode) != MODE_COMPLEX_INT
@@ -6800,21 +6805,22 @@ store_field (rtx target, HOST_WIDE_INT b
store it as a bit field. */
|| (mode != BLKmode
&& ((((MEM_ALIGN (target) < GET_MODE_ALIGNMENT (mode))
- || bitpos % GET_MODE_ALIGNMENT (mode))
+ || !multiple_p (bitpos, GET_MODE_ALIGNMENT (mode)))
&& targetm.slow_unaligned_access (mode, MEM_ALIGN (target)))
- || (bitpos % BITS_PER_UNIT != 0)))
- || (bitsize >= 0 && mode != BLKmode
- && GET_MODE_BITSIZE (mode) > bitsize)
+ || !multiple_p (bitpos, BITS_PER_UNIT)))
+ || (known_size_p (bitsize)
+ && mode != BLKmode
+ && may_gt (GET_MODE_BITSIZE (mode), bitsize))
/* If the RHS and field are a constant size and the size of the
RHS isn't the same size as the bitfield, we must use bitfield
operations. */
- || (bitsize >= 0
- && TREE_CODE (TYPE_SIZE (TREE_TYPE (exp))) == INTEGER_CST
- && compare_tree_int (TYPE_SIZE (TREE_TYPE (exp)), bitsize) != 0
+ || (known_size_p (bitsize)
+ && poly_int_tree_p (TYPE_SIZE (TREE_TYPE (exp)))
+ && may_ne (wi::to_poly_offset (TYPE_SIZE (TREE_TYPE (exp))), bitsize)
/* Except for initialization of full bytes from a CONSTRUCTOR, which
we will handle specially below. */
&& !(TREE_CODE (exp) == CONSTRUCTOR
- && bitsize % BITS_PER_UNIT == 0)
+ && multiple_p (bitsize, BITS_PER_UNIT))
/* And except for bitwise copying of TREE_ADDRESSABLE types,
where the FIELD_DECL has the right bitsize, but TREE_TYPE (exp)
includes some extra padding. store_expr / expand_expr will in
@@ -6825,14 +6831,14 @@ store_field (rtx target, HOST_WIDE_INT b
get_base_address needs to live in memory. */
&& (!TREE_ADDRESSABLE (TREE_TYPE (exp))
|| TREE_CODE (exp) != COMPONENT_REF
- || TREE_CODE (DECL_SIZE (TREE_OPERAND (exp, 1))) != INTEGER_CST
- || (bitsize % BITS_PER_UNIT != 0)
- || (bitpos % BITS_PER_UNIT != 0)
- || (compare_tree_int (DECL_SIZE (TREE_OPERAND (exp, 1)), bitsize)
- != 0)))
+ || !multiple_p (bitsize, BITS_PER_UNIT)
+ || !multiple_p (bitpos, BITS_PER_UNIT)
+ || !poly_int_tree_p (DECL_SIZE (TREE_OPERAND (exp, 1)),
+ &decl_bitsize)
+ || may_ne (decl_bitsize, bitsize)))
/* If we are expanding a MEM_REF of a non-BLKmode non-addressable
decl we must use bitfield operations. */
- || (bitsize >= 0
+ || (known_size_p (bitsize)
&& TREE_CODE (exp) == MEM_REF
&& TREE_CODE (TREE_OPERAND (exp, 0)) == ADDR_EXPR
&& DECL_P (TREE_OPERAND (TREE_OPERAND (exp, 0), 0))
@@ -6853,17 +6859,23 @@ store_field (rtx target, HOST_WIDE_INT b
tree type = TREE_TYPE (exp);
if (INTEGRAL_TYPE_P (type)
&& TYPE_PRECISION (type) < GET_MODE_BITSIZE (TYPE_MODE (type))
- && bitsize == TYPE_PRECISION (type))
+ && must_eq (bitsize, TYPE_PRECISION (type)))
{
tree op = gimple_assign_rhs1 (nop_def);
type = TREE_TYPE (op);
- if (INTEGRAL_TYPE_P (type) && TYPE_PRECISION (type) >= bitsize)
+ if (INTEGRAL_TYPE_P (type)
+ && must_ge (TYPE_PRECISION (type), bitsize))
exp = op;
}
}
temp = expand_normal (exp);
+ /* We don't support variable-sized BLKmode bitfields, since our
+ handling of BLKmode is bound up with the ability to break
+ things into words. */
+ gcc_assert (mode != BLKmode || bitsize.is_constant ());
+
/* Handle calls that return values in multiple non-contiguous locations.
The Irix 6 ABI has examples of this. */
if (GET_CODE (temp) == PARALLEL)
@@ -6904,9 +6916,11 @@ store_field (rtx target, HOST_WIDE_INT b
if (reverse)
temp = flip_storage_order (temp_mode, temp);
- if (bitsize < size
+ gcc_checking_assert (must_le (bitsize, size));
+ if (may_lt (bitsize, size)
&& reverse ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN
- && !(mode == BLKmode && bitsize > BITS_PER_WORD))
+ /* Use of to_constant for BLKmode was checked above. */
+ && !(mode == BLKmode && bitsize.to_constant () > BITS_PER_WORD))
temp = expand_shift (RSHIFT_EXPR, temp_mode, temp,
size - bitsize, NULL_RTX, 1);
}
@@ -6923,16 +6937,16 @@ store_field (rtx target, HOST_WIDE_INT b
&& (GET_MODE (target) == BLKmode
|| (MEM_P (target)
&& GET_MODE_CLASS (GET_MODE (target)) == MODE_INT
- && (bitpos % BITS_PER_UNIT) == 0
- && (bitsize % BITS_PER_UNIT) == 0)))
+ && multiple_p (bitpos, BITS_PER_UNIT)
+ && multiple_p (bitsize, BITS_PER_UNIT))))
{
- gcc_assert (MEM_P (target) && MEM_P (temp)
- && (bitpos % BITS_PER_UNIT) == 0);
+ gcc_assert (MEM_P (target) && MEM_P (temp));
+ poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);
+ poly_int64 bytesize = bits_to_bytes_round_up (bitsize);
- target = adjust_address (target, VOIDmode, bitpos / BITS_PER_UNIT);
+ target = adjust_address (target, VOIDmode, bytepos);
emit_block_move (target, temp,
- GEN_INT ((bitsize + BITS_PER_UNIT - 1)
- / BITS_PER_UNIT),
+ gen_int_mode (bytesize, Pmode),
BLOCK_OP_NORMAL);
return const0_rtx;
@@ -6940,7 +6954,7 @@ store_field (rtx target, HOST_WIDE_INT b
/* If the mode of TEMP is still BLKmode and BITSIZE not larger than the
word size, we need to load the value (see again store_bit_field). */
- if (GET_MODE (temp) == BLKmode && bitsize <= BITS_PER_WORD)
+ if (GET_MODE (temp) == BLKmode && must_le (bitsize, BITS_PER_WORD))
{
scalar_int_mode temp_mode = smallest_int_mode_for_size (bitsize);
temp = extract_bit_field (temp, bitsize, 0, 1, NULL_RTX, temp_mode,
@@ -6957,7 +6971,8 @@ store_field (rtx target, HOST_WIDE_INT b
else
{
/* Now build a reference to just the desired component. */
- rtx to_rtx = adjust_address (target, mode, bitpos / BITS_PER_UNIT);
+ rtx to_rtx = adjust_address (target, mode,
+ exact_div (bitpos, BITS_PER_UNIT));
if (to_rtx == target)
to_rtx = copy_rtx (to_rtx);
@@ -6967,10 +6982,10 @@ store_field (rtx target, HOST_WIDE_INT b
/* Above we avoided using bitfield operations for storing a CONSTRUCTOR
into a target smaller than its type; handle that case now. */
- if (TREE_CODE (exp) == CONSTRUCTOR && bitsize >= 0)
+ if (TREE_CODE (exp) == CONSTRUCTOR && known_size_p (bitsize))
{
- gcc_assert (bitsize % BITS_PER_UNIT == 0);
- store_constructor (exp, to_rtx, 0, bitsize / BITS_PER_UNIT, reverse);
+ poly_int64 bytesize = exact_div (bitsize, BITS_PER_UNIT);
+ store_constructor (exp, to_rtx, 0, bytesize, reverse);
return to_rtx;
}
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [023/nnn] poly_int: store_field & co
2017-10-23 17:09 ` [023/nnn] poly_int: store_field & co Richard Sandiford
@ 2017-12-05 23:49 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-12-05 23:49 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:09 AM, Richard Sandiford wrote:
> This patch makes store_field and related routines use poly_ints
> for bit positions and sizes. It keeps the existing choices
> between signed and unsigned types (there are a mixture of both).
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * expr.c (store_constructor_field): Change bitsize from a
> unsigned HOST_WIDE_INT to a poly_uint64 and bitpos from a
> HOST_WIDE_INT to a poly_int64.
> (store_constructor): Change size from a HOST_WIDE_INT to
> a poly_int64.
> (store_field): Likewise bitsize and bitpos.
OK
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [024/nnn] poly_int: ira subreg liveness tracking
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (22 preceding siblings ...)
2017-10-23 17:09 ` [023/nnn] poly_int: store_field & co Richard Sandiford
@ 2017-10-23 17:10 ` Richard Sandiford
2017-11-28 21:10 ` Jeff Law
2017-10-23 17:10 ` [025/nnn] poly_int: SUBREG_BYTE Richard Sandiford
` (83 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:10 UTC (permalink / raw)
To: gcc-patches
Normmaly the IRA-reload interface tries to track the liveness of
individual bytes of an allocno if the allocno is sometimes written
to as a SUBREG. This isn't possible for variable-sized allocnos,
but it doesn't matter because targets with variable-sized registers
should use LRA instead.
This patch adds a get_subreg_tracking_sizes function for deciding
whether it is possible to model a partial read or write. Later
patches make it return false if anything is variable.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* ira.c (get_subreg_tracking_sizes): New function.
(init_live_subregs): Take an integer size rather than a register.
(build_insn_chain): Use get_subreg_tracking_sizes. Update calls
to init_live_subregs.
Index: gcc/ira.c
===================================================================
--- gcc/ira.c 2017-10-23 17:11:43.647074065 +0100
+++ gcc/ira.c 2017-10-23 17:11:59.074107016 +0100
@@ -4040,16 +4040,27 @@ pseudo_for_reload_consideration_p (int r
return (reg_renumber[regno] >= 0 || ira_conflicts_p);
}
-/* Init LIVE_SUBREGS[ALLOCNUM] and LIVE_SUBREGS_USED[ALLOCNUM] using
- REG to the number of nregs, and INIT_VALUE to get the
- initialization. ALLOCNUM need not be the regno of REG. */
+/* Return true if we can track the individual bytes of subreg X.
+ When returning true, set *OUTER_SIZE to the number of bytes in
+ X itself, *INNER_SIZE to the number of bytes in the inner register
+ and *START to the offset of the first byte. */
+static bool
+get_subreg_tracking_sizes (rtx x, HOST_WIDE_INT *outer_size,
+ HOST_WIDE_INT *inner_size, HOST_WIDE_INT *start)
+{
+ rtx reg = regno_reg_rtx[REGNO (SUBREG_REG (x))];
+ *outer_size = GET_MODE_SIZE (GET_MODE (x));
+ *inner_size = GET_MODE_SIZE (GET_MODE (reg));
+ *start = SUBREG_BYTE (x);
+ return true;
+}
+
+/* Init LIVE_SUBREGS[ALLOCNUM] and LIVE_SUBREGS_USED[ALLOCNUM] for
+ a register with SIZE bytes, making the register live if INIT_VALUE. */
static void
init_live_subregs (bool init_value, sbitmap *live_subregs,
- bitmap live_subregs_used, int allocnum, rtx reg)
+ bitmap live_subregs_used, int allocnum, int size)
{
- unsigned int regno = REGNO (SUBREG_REG (reg));
- int size = GET_MODE_SIZE (GET_MODE (regno_reg_rtx[regno]));
-
gcc_assert (size > 0);
/* Been there, done that. */
@@ -4158,19 +4169,26 @@ build_insn_chain (void)
&& (!DF_REF_FLAGS_IS_SET (def, DF_REF_CONDITIONAL)))
{
rtx reg = DF_REF_REG (def);
+ HOST_WIDE_INT outer_size, inner_size, start;
- /* We can model subregs, but not if they are
- wrapped in ZERO_EXTRACTS. */
+ /* We can usually track the liveness of individual
+ bytes within a subreg. The only exceptions are
+ subregs wrapped in ZERO_EXTRACTs and subregs whose
+ size is not known; in those cases we need to be
+ conservative and treat the definition as a partial
+ definition of the full register rather than a full
+ definition of a specific part of the register. */
if (GET_CODE (reg) == SUBREG
- && !DF_REF_FLAGS_IS_SET (def, DF_REF_ZERO_EXTRACT))
+ && !DF_REF_FLAGS_IS_SET (def, DF_REF_ZERO_EXTRACT)
+ && get_subreg_tracking_sizes (reg, &outer_size,
+ &inner_size, &start))
{
- unsigned int start = SUBREG_BYTE (reg);
- unsigned int last = start
- + GET_MODE_SIZE (GET_MODE (reg));
+ HOST_WIDE_INT last = start + outer_size;
init_live_subregs
(bitmap_bit_p (live_relevant_regs, regno),
- live_subregs, live_subregs_used, regno, reg);
+ live_subregs, live_subregs_used, regno,
+ inner_size);
if (!DF_REF_FLAGS_IS_SET
(def, DF_REF_STRICT_LOW_PART))
@@ -4255,18 +4273,20 @@ build_insn_chain (void)
if (regno < FIRST_PSEUDO_REGISTER
|| pseudo_for_reload_consideration_p (regno))
{
+ HOST_WIDE_INT outer_size, inner_size, start;
if (GET_CODE (reg) == SUBREG
&& !DF_REF_FLAGS_IS_SET (use,
DF_REF_SIGN_EXTRACT
- | DF_REF_ZERO_EXTRACT))
+ | DF_REF_ZERO_EXTRACT)
+ && get_subreg_tracking_sizes (reg, &outer_size,
+ &inner_size, &start))
{
- unsigned int start = SUBREG_BYTE (reg);
- unsigned int last = start
- + GET_MODE_SIZE (GET_MODE (reg));
+ HOST_WIDE_INT last = start + outer_size;
init_live_subregs
(bitmap_bit_p (live_relevant_regs, regno),
- live_subregs, live_subregs_used, regno, reg);
+ live_subregs, live_subregs_used, regno,
+ inner_size);
/* Ignore the paradoxical bits. */
if (last > SBITMAP_SIZE (live_subregs[regno]))
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [024/nnn] poly_int: ira subreg liveness tracking
2017-10-23 17:10 ` [024/nnn] poly_int: ira subreg liveness tracking Richard Sandiford
@ 2017-11-28 21:10 ` Jeff Law
2017-12-05 21:54 ` Richard Sandiford
0 siblings, 1 reply; 302+ messages in thread
From: Jeff Law @ 2017-11-28 21:10 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:09 AM, Richard Sandiford wrote:
> Normmaly the IRA-reload interface tries to track the liveness of
> individual bytes of an allocno if the allocno is sometimes written
> to as a SUBREG. This isn't possible for variable-sized allocnos,
> but it doesn't matter because targets with variable-sized registers
> should use LRA instead.
>
> This patch adds a get_subreg_tracking_sizes function for deciding
> whether it is possible to model a partial read or write. Later
> patches make it return false if anything is variable.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * ira.c (get_subreg_tracking_sizes): New function.
> (init_live_subregs): Take an integer size rather than a register.
> (build_insn_chain): Use get_subreg_tracking_sizes. Update calls
> to init_live_subregs.
OK.
Note this is starting to get close to the discussion around CLOBBER_HIGH
vs using a self set with a low subreg that we're having with Alan on
another thread in that liveness tracking of subregs of SVE regs could
potentially use some improvements.
When I quickly looked at the subreg handling in the df infrstructure my
first thought was that it might need some updating for SVE. I can't
immediately call bits for poly_int/SVE in the patches to-date. Have you
dug in there at all for the poly_int/SVE work?
Jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [024/nnn] poly_int: ira subreg liveness tracking
2017-11-28 21:10 ` Jeff Law
@ 2017-12-05 21:54 ` Richard Sandiford
0 siblings, 0 replies; 302+ messages in thread
From: Richard Sandiford @ 2017-12-05 21:54 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches
Jeff Law <law@redhat.com> writes:
> On 10/23/2017 11:09 AM, Richard Sandiford wrote:
>> Normmaly the IRA-reload interface tries to track the liveness of
>> individual bytes of an allocno if the allocno is sometimes written
>> to as a SUBREG. This isn't possible for variable-sized allocnos,
>> but it doesn't matter because targets with variable-sized registers
>> should use LRA instead.
>>
>> This patch adds a get_subreg_tracking_sizes function for deciding
>> whether it is possible to model a partial read or write. Later
>> patches make it return false if anything is variable.
>>
>>
>> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>> Alan Hayward <alan.hayward@arm.com>
>> David Sherwood <david.sherwood@arm.com>
>>
>> gcc/
>> * ira.c (get_subreg_tracking_sizes): New function.
>> (init_live_subregs): Take an integer size rather than a register.
>> (build_insn_chain): Use get_subreg_tracking_sizes. Update calls
>> to init_live_subregs.
> OK.
>
> Note this is starting to get close to the discussion around CLOBBER_HIGH
> vs using a self set with a low subreg that we're having with Alan on
> another thread in that liveness tracking of subregs of SVE regs could
> potentially use some improvements.
>
> When I quickly looked at the subreg handling in the df infrstructure my
> first thought was that it might need some updating for SVE. I can't
> immediately call bits for poly_int/SVE in the patches to-date. Have you
> dug in there at all for the poly_int/SVE work?
Yeah, although the subreg tracking in this patch is specific to reload,
I thought we had something similar for LRA. I couldn't find anything
though, and the static type checking of poly_ints would have forced
the issue.
There is the DF_WORD_LR code, which tracks the liveness of words in a
double-word pseudo. We didn't extend that to variable-length registers
for two reasons: (1) if we did need it, we'd want it for pseudos
that map to 3 or 4 registers, not just 2, so that LD[234] and ST[234]
are handled consistently; and (2) it's only used for DCE at the moment,
and it's rare for LD[234]/ST[234]s to be dead code.
Thanks,
Richard
^ permalink raw reply [flat|nested] 302+ messages in thread
* [025/nnn] poly_int: SUBREG_BYTE
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (23 preceding siblings ...)
2017-10-23 17:10 ` [024/nnn] poly_int: ira subreg liveness tracking Richard Sandiford
@ 2017-10-23 17:10 ` Richard Sandiford
2017-12-06 18:50 ` Jeff Law
2017-10-23 17:11 ` [027/nnn] poly_int: DWARF CFA offsets Richard Sandiford
` (82 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:10 UTC (permalink / raw)
To: gcc-patches
This patch changes SUBREG_BYTE from an int to a poly_int.
Since valid SUBREG_BYTEs must be contained within the mode of the
SUBREG_REG, the required range is the same as for GET_MODE_SIZE,
i.e. unsigned short. The patch therefore uses poly_uint16(_pod)
for the SUBREG_BYTE.
Using poly_uint16_pod rtx fields requires a new field code ('p').
Since there are no other uses of 'p' besides SUBREG_BYTE, the patch
doesn't add an XPOLY or whatever; all uses should go via SUBREG_BYTE
instead.
The patch doesn't bother implementing 'p' support for legacy
define_peepholes, since none of the remaining ones have subregs
in their patterns.
As it happened, the rtl documentation used SUBREG as an example of a
code with mixed field types, accessed via XEXP (x, 0) and XINT (x, 1).
Since there's no direct replacement for XINT, and since people should
never use it even if there were, the patch changes the example to use
INT_LIST instead.
The patch also changes subreg-related helper functions so that they too
take and return polynomial offsets. This makes the patch quite big, but
it's mostly mechanical. The patch generally sticks to existing choices
wrt signedness.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/rtl.texi: Update documentation of SUBREG_BYTE. Document the
'p' format code. Use INT_LIST rather than SUBREG as the example of
a code with an XINT and an XEXP. Remove the implication that
accessing an rtx field using XINT is expected to work.
* rtl.def (SUBREG): Change format from "ei" to "ep".
* rtl.h (rtunion::rt_subreg): New field.
(XCSUBREG): New macro.
(SUBREG_BYTE): Use it.
(subreg_shape): Change offset from an unsigned int to a poly_uint16.
Update constructor accordingly.
(subreg_shape::operator ==): Update accordingly.
(subreg_shape::unique_id): Return an unsigned HOST_WIDE_INT rather
than an unsigned int.
(subreg_lsb, subreg_lowpart_offset, subreg_highpart_offset): Return
a poly_uint64 rather than an unsigned int.
(subreg_lsb_1): Likewise. Take the offset as a poly_uint64 rather
than an unsigned int.
(subreg_size_offset_from_lsb, subreg_size_lowpart_offset)
(subreg_size_highpart_offset): Return a poly_uint64 rather than
an unsigned int. Take the sizes as poly_uint64s.
(subreg_offset_from_lsb): Return a poly_uint64 rather than
an unsigned int. Take the shift as a poly_uint64 rather than
an unsigned int.
(subreg_regno_offset, subreg_offset_representable_p): Take the offset
as a poly_uint64 rather than an unsigned int.
(simplify_subreg_regno): Likewise.
(byte_lowpart_offset): Return the memory offset as a poly_int64
rather than an int.
(subreg_memory_offset): Likewise. Take the subreg offset as a
poly_uint64 rather than an unsigned int.
(simplify_subreg, simplify_gen_subreg, subreg_get_info)
(gen_rtx_SUBREG, validate_subreg): Take the subreg offset as a
poly_uint64 rather than an unsigned int.
* rtl.c (rtx_format): Describe 'p' in comment.
(copy_rtx, rtx_equal_p_cb, rtx_equal_p): Handle 'p'.
* emit-rtl.c (validate_subreg, gen_rtx_SUBREG): Take the subreg
offset as a poly_uint64 rather than an unsigned int.
(byte_lowpart_offset): Return the memory offset as a poly_int64
rather than an int.
(subreg_memory_offset): Likewise. Take the subreg offset as a
poly_uint64 rather than an unsigned int.
(subreg_size_lowpart_offset, subreg_size_highpart_offset): Take the
mode sizes as poly_uint64s rather than unsigned ints. Return a
poly_uint64 rather than an unsigned int.
(subreg_lowpart_p): Treat subreg offsets as poly_ints.
(copy_insn_1): Handle 'p'.
* rtlanal.c (set_noop_p): Treat subregs offsets as poly_uint64s.
(subreg_lsb_1): Take the subreg offset as a poly_uint64 rather than
an unsigned int. Return the shift in the same way.
(subreg_lsb): Return the shift as a poly_uint64 rather than an
unsigned int.
(subreg_size_offset_from_lsb): Take the sizes and shift as
poly_uint64s rather than unsigned ints. Return the offset as
a poly_uint64.
(subreg_get_info, subreg_regno_offset, subreg_offset_representable_p)
(simplify_subreg_regno): Take the offset as a poly_uint64 rather than
an unsigned int.
* rtlhash.c (add_rtx): Handle 'p'.
* genemit.c (gen_exp): Likewise.
* gengenrtl.c (type_from_format, gendef): Likewise.
* gensupport.c (subst_pattern_match, get_alternatives_number)
(collect_insn_data, alter_predicate_for_insn, alter_constraints)
(subst_dup): Likewise.
* gengtype.c (adjust_field_rtx_def): Likewise.
* genrecog.c (find_operand, find_matching_operand, validate_pattern)
(match_pattern_2): Likewise.
(rtx_test::SUBREG_FIELD): New rtx_test::kind_enum.
(rtx_test::subreg_field): New function.
(operator ==, safe_to_hoist_p, transition_parameter_type)
(print_nonbool_test, print_test): Handle SUBREG_FIELD.
* genattrtab.c (attr_rtx_1): Say that 'p' is deliberately not handled.
* genpeep.c (match_rtx): Likewise.
* print-rtl.c (print_poly_int): Include if GENERATOR_FILE too.
(rtx_writer::print_rtx_operand): Handle 'p'.
(print_value): Handle SUBREG.
* read-rtl.c (apply_int_iterator): Likewise.
(rtx_reader::read_rtx_operand): Handle 'p'.
* alias.c (rtx_equal_for_memref_p): Likewise.
* cselib.c (rtx_equal_for_cselib_1, cselib_hash_rtx): Likewise.
* caller-save.c (replace_reg_with_saved_mem): Treat subreg offsets
as poly_ints.
* calls.c (expand_call): Likewise.
* combine.c (combine_simplify_rtx, expand_field_assignment): Likewise.
(make_extraction, gen_lowpart_for_combine): Likewise.
* loop-invariant.c (hash_invariant_expr_1, invariant_expr_equal_p):
Likewise.
* cse.c (remove_invalid_subreg_refs): Take the offset as a poly_uint64
rather than an unsigned int. Treat subreg offsets as poly_ints.
(exp_equiv_p): Handle 'p'.
(hash_rtx_cb): Likewise. Treat subreg offsets as poly_ints.
(equiv_constant, cse_insn): Treat subreg offsets as poly_ints.
* dse.c (find_shift_sequence): Likewise.
* dwarf2out.c (rtl_for_decl_location): Likewise.
* expmed.c (extract_low_bits): Likewise.
* expr.c (emit_group_store, undefined_operand_subword_p): Likewise.
(expand_expr_real_2): Likewise.
* final.c (alter_subreg): Likewise.
(leaf_renumber_regs_insn): Handle 'p'.
* function.c (assign_parm_find_stack_rtl, assign_parm_setup_stack):
Treat subreg offsets as poly_ints.
* fwprop.c (forward_propagate_and_simplify): Likewise.
* ifcvt.c (noce_emit_move_insn, noce_emit_cmove): Likewise.
* ira.c (get_subreg_tracking_sizes): Likewise.
* ira-conflicts.c (go_through_subreg): Likewise.
* ira-lives.c (process_single_reg_class_operands): Likewise.
* jump.c (rtx_renumbered_equal_p): Likewise. Handle 'p'.
* lower-subreg.c (simplify_subreg_concatn): Take the subreg offset
as a poly_uint64 rather than an unsigned int.
(simplify_gen_subreg_concatn, resolve_simple_move): Treat
subreg offsets as poly_ints.
* lra-constraints.c (operands_match_p): Handle 'p'.
(match_reload, curr_insn_transform): Treat subreg offsets as poly_ints.
* lra-spills.c (assign_mem_slot): Likewise.
* postreload.c (move2add_valid_value_p): Likewise.
* recog.c (general_operand, indirect_operand): Likewise.
* regcprop.c (copy_value, maybe_mode_change): Likewise.
(copyprop_hardreg_forward_1): Likewise.
* reginfo.c (simplifiable_subregs_hasher::hash, simplifiable_subregs)
(record_subregs_of_mode): Likewise.
* rtlhooks.c (gen_lowpart_general, gen_lowpart_if_possible): Likewise.
* reload.c (operands_match_p): Handle 'p'.
(find_reloads_subreg_address): Treat subreg offsets as poly_ints.
* reload1.c (alter_reg, choose_reload_regs): Likewise.
(compute_reload_subreg_offset): Likewise, and return an poly_int64.
* simplify-rtx.c (simplify_truncation, simplify_binary_operation_1):
(test_vector_ops_duplicate): Treat subreg offsets as poly_ints.
(simplify_const_poly_int_tests<N>::run): Likewise.
(simplify_subreg, simplify_gen_subreg): Take the subreg offset as
a poly_uint64 rather than an unsigned int.
* valtrack.c (debug_lowpart_subreg): Likewise.
* var-tracking.c (var_lowpart): Likewise.
(loc_cmp): Handle 'p'.
Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi 2017-10-23 17:16:35.057923923 +0100
+++ gcc/doc/rtl.texi 2017-10-23 17:16:50.360529627 +0100
@@ -109,10 +109,10 @@ and what kinds of objects they are. In
by looking at an operand what kind of object it is. Instead, you must know
from its context---from the expression code of the containing expression.
For example, in an expression of code @code{subreg}, the first operand is
-to be regarded as an expression and the second operand as an integer. In
-an expression of code @code{plus}, there are two operands, both of which
-are to be regarded as expressions. In a @code{symbol_ref} expression,
-there is one operand, which is to be regarded as a string.
+to be regarded as an expression and the second operand as a polynomial
+integer. In an expression of code @code{plus}, there are two operands,
+both of which are to be regarded as expressions. In a @code{symbol_ref}
+expression, there is one operand, which is to be regarded as a string.
Expressions are written as parentheses containing the name of the
expression type, its flags and machine mode if any, and then the operands
@@ -209,7 +209,7 @@ chain, such as @code{NOTE}, @code{BARRIE
For each expression code, @file{rtl.def} specifies the number of
contained objects and their kinds using a sequence of characters
called the @dfn{format} of the expression code. For example,
-the format of @code{subreg} is @samp{ei}.
+the format of @code{subreg} is @samp{ep}.
@cindex RTL format characters
These are the most commonly used format characters:
@@ -258,6 +258,9 @@ An omitted vector is effectively the sam
@item B
@samp{B} indicates a pointer to basic block structure.
+@item p
+A polynomial integer. At present this is used only for @code{SUBREG_BYTE}.
+
@item 0
@samp{0} means a slot whose contents do not fit any normal category.
@samp{0} slots are not printed at all in dumps, and are often used in
@@ -340,16 +343,13 @@ stored in the operand. You would do thi
the containing expression. That is also how you would know how many
operands there are.
-For example, if @var{x} is a @code{subreg} expression, you know that it has
-two operands which can be correctly accessed as @code{XEXP (@var{x}, 0)}
-and @code{XINT (@var{x}, 1)}. If you did @code{XINT (@var{x}, 0)}, you
-would get the address of the expression operand but cast as an integer;
-that might occasionally be useful, but it would be cleaner to write
-@code{(int) XEXP (@var{x}, 0)}. @code{XEXP (@var{x}, 1)} would also
-compile without error, and would return the second, integer operand cast as
-an expression pointer, which would probably result in a crash when
-accessed. Nothing stops you from writing @code{XEXP (@var{x}, 28)} either,
-but this will access memory past the end of the expression with
+For example, if @var{x} is an @code{int_list} expression, you know that it has
+two operands which can be correctly accessed as @code{XINT (@var{x}, 0)}
+and @code{XEXP (@var{x}, 1)}. Incorrect accesses like
+@code{XEXP (@var{x}, 0)} and @code{XINT (@var{x}, 1)} would compile,
+but would trigger an internal compiler error when rtl checking is enabled.
+Nothing stops you from writing @code{XEXP (@var{x}, 28)} either, but
+this will access memory past the end of the expression with
unpredictable results.
Access to operands which are vectors is more complicated. You can use the
@@ -2007,6 +2007,13 @@ on a @code{BYTES_BIG_ENDIAN}, @samp{UNIT
on a little-endian, @samp{UNITS_PER_WORD == 4} target. Both
@code{subreg}s access the lower two bytes of register @var{x}.
+Note that the byte offset is a polynomial integer; it may not be a
+compile-time constant on targets with variable-sized modes. However,
+the restrictions above mean that there are only a certain set of
+acceptable offsets for a given combination of @var{m1} and @var{m2}.
+The compiler can always tell which blocks a valid subreg occupies, and
+whether the subreg is a lowpart of a block.
+
@end table
A @code{MODE_PARTIAL_INT} mode behaves as if it were as wide as the
Index: gcc/rtl.def
===================================================================
--- gcc/rtl.def 2017-10-23 17:16:35.057923923 +0100
+++ gcc/rtl.def 2017-10-23 17:16:50.374527737 +0100
@@ -394,7 +394,7 @@ DEF_RTL_EXPR(SCRATCH, "scratch", "", RTX
/* A reference to a part of another value. The first operand is the
complete value and the second is the byte offset of the selected part. */
-DEF_RTL_EXPR(SUBREG, "subreg", "ei", RTX_EXTRA)
+DEF_RTL_EXPR(SUBREG, "subreg", "ep", RTX_EXTRA)
/* This one-argument rtx is used for move instructions
that are guaranteed to alter only the low part of a destination.
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h 2017-10-23 17:16:35.057923923 +0100
+++ gcc/rtl.h 2017-10-23 17:16:50.374527737 +0100
@@ -198,6 +198,7 @@ struct GTY((for_user)) reg_attrs {
{
int rt_int;
unsigned int rt_uint;
+ poly_uint16_pod rt_subreg;
const char *rt_str;
rtx rt_rtx;
rtvec rt_rtvec;
@@ -1330,6 +1331,7 @@ #define X0ANY(RTX, N) RTL_CHECK1 (RTX
#define XCINT(RTX, N, C) (RTL_CHECKC1 (RTX, N, C).rt_int)
#define XCUINT(RTX, N, C) (RTL_CHECKC1 (RTX, N, C).rt_uint)
+#define XCSUBREG(RTX, N, C) (RTL_CHECKC1 (RTX, N, C).rt_subreg)
#define XCSTR(RTX, N, C) (RTL_CHECKC1 (RTX, N, C).rt_str)
#define XCEXP(RTX, N, C) (RTL_CHECKC1 (RTX, N, C).rt_rtx)
#define XCVEC(RTX, N, C) (RTL_CHECKC1 (RTX, N, C).rt_rtvec)
@@ -1920,7 +1922,7 @@ #define CONST_VECTOR_NUNITS(RTX) XCVECLE
SUBREG_BYTE extracts the byte-number. */
#define SUBREG_REG(RTX) XCEXP (RTX, 0, SUBREG)
-#define SUBREG_BYTE(RTX) XCUINT (RTX, 1, SUBREG)
+#define SUBREG_BYTE(RTX) XCSUBREG (RTX, 1, SUBREG)
/* in rtlanal.c */
/* Return the right cost to give to an operation
@@ -1993,19 +1995,19 @@ costs_add_n_insns (struct full_rtx_costs
offset == the SUBREG_BYTE
outer_mode == the mode of the SUBREG itself. */
struct subreg_shape {
- subreg_shape (machine_mode, unsigned int, machine_mode);
+ subreg_shape (machine_mode, poly_uint16, machine_mode);
bool operator == (const subreg_shape &) const;
bool operator != (const subreg_shape &) const;
- unsigned int unique_id () const;
+ unsigned HOST_WIDE_INT unique_id () const;
machine_mode inner_mode;
- unsigned int offset;
+ poly_uint16 offset;
machine_mode outer_mode;
};
inline
subreg_shape::subreg_shape (machine_mode inner_mode_in,
- unsigned int offset_in,
+ poly_uint16 offset_in,
machine_mode outer_mode_in)
: inner_mode (inner_mode_in), offset (offset_in), outer_mode (outer_mode_in)
{}
@@ -2014,7 +2016,7 @@ subreg_shape::subreg_shape (machine_mode
subreg_shape::operator == (const subreg_shape &other) const
{
return (inner_mode == other.inner_mode
- && offset == other.offset
+ && must_eq (offset, other.offset)
&& outer_mode == other.outer_mode);
}
@@ -2029,11 +2031,16 @@ subreg_shape::operator != (const subreg_
current mode is anywhere near being 65536 bytes in size, so the
id comfortably fits in an int. */
-inline unsigned int
+inline unsigned HOST_WIDE_INT
subreg_shape::unique_id () const
{
- STATIC_ASSERT (MAX_MACHINE_MODE <= 256);
- return (int) inner_mode + ((int) outer_mode << 8) + (offset << 16);
+ { STATIC_ASSERT (MAX_MACHINE_MODE <= 256); }
+ { STATIC_ASSERT (NUM_POLY_INT_COEFFS <= 3); }
+ { STATIC_ASSERT (sizeof (offset.coeffs[0]) <= 2); }
+ int res = (int) inner_mode + ((int) outer_mode << 8);
+ for (int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
+ res += (HOST_WIDE_INT) offset.coeffs[i] << ((1 + i) * 16);
+ return res;
}
/* Return the shape of a SUBREG rtx. */
@@ -2287,11 +2294,10 @@ extern int rtx_cost (rtx, machine_mode,
extern int address_cost (rtx, machine_mode, addr_space_t, bool);
extern void get_full_rtx_cost (rtx, machine_mode, enum rtx_code, int,
struct full_rtx_costs *);
-extern unsigned int subreg_lsb (const_rtx);
-extern unsigned int subreg_lsb_1 (machine_mode, machine_mode,
- unsigned int);
-extern unsigned int subreg_size_offset_from_lsb (unsigned int, unsigned int,
- unsigned int);
+extern poly_uint64 subreg_lsb (const_rtx);
+extern poly_uint64 subreg_lsb_1 (machine_mode, machine_mode, poly_uint64);
+extern poly_uint64 subreg_size_offset_from_lsb (poly_uint64, poly_uint64,
+ poly_uint64);
extern bool read_modify_subreg_p (const_rtx);
/* Return the subreg byte offset for a subreg whose outer mode is
@@ -2300,22 +2306,22 @@ extern bool read_modify_subreg_p (const_
the inner value. This is the inverse of subreg_lsb_1 (which converts
byte offsets to bit shifts). */
-inline unsigned int
+inline poly_uint64
subreg_offset_from_lsb (machine_mode outer_mode,
machine_mode inner_mode,
- unsigned int lsb_shift)
+ poly_uint64 lsb_shift)
{
return subreg_size_offset_from_lsb (GET_MODE_SIZE (outer_mode),
GET_MODE_SIZE (inner_mode), lsb_shift);
}
-extern unsigned int subreg_regno_offset (unsigned int, machine_mode,
- unsigned int, machine_mode);
+extern unsigned int subreg_regno_offset (unsigned int, machine_mode,
+ poly_uint64, machine_mode);
extern bool subreg_offset_representable_p (unsigned int, machine_mode,
- unsigned int, machine_mode);
+ poly_uint64, machine_mode);
extern unsigned int subreg_regno (const_rtx);
extern int simplify_subreg_regno (unsigned int, machine_mode,
- unsigned int, machine_mode);
+ poly_uint64, machine_mode);
extern unsigned int subreg_nregs (const_rtx);
extern unsigned int subreg_nregs_with_regno (unsigned int, const_rtx);
extern unsigned HOST_WIDE_INT nonzero_bits (const_rtx, machine_mode);
@@ -3016,7 +3022,7 @@ extern rtx operand_subword (rtx, unsigne
/* In emit-rtl.c */
extern rtx operand_subword_force (rtx, unsigned int, machine_mode);
extern int subreg_lowpart_p (const_rtx);
-extern unsigned int subreg_size_lowpart_offset (unsigned int, unsigned int);
+extern poly_uint64 subreg_size_lowpart_offset (poly_uint64, poly_uint64);
/* Return true if a subreg of mode OUTERMODE would only access part of
an inner register with mode INNERMODE. The other bits of the inner
@@ -3063,7 +3069,7 @@ paradoxical_subreg_p (const_rtx x)
/* Return the SUBREG_BYTE for an OUTERMODE lowpart of an INNERMODE value. */
-inline unsigned int
+inline poly_uint64
subreg_lowpart_offset (machine_mode outermode, machine_mode innermode)
{
return subreg_size_lowpart_offset (GET_MODE_SIZE (outermode),
@@ -3098,20 +3104,21 @@ wider_subreg_mode (const_rtx x)
return wider_subreg_mode (GET_MODE (x), GET_MODE (SUBREG_REG (x)));
}
-extern unsigned int subreg_size_highpart_offset (unsigned int, unsigned int);
+extern poly_uint64 subreg_size_highpart_offset (poly_uint64, poly_uint64);
/* Return the SUBREG_BYTE for an OUTERMODE highpart of an INNERMODE value. */
-inline unsigned int
+inline poly_uint64
subreg_highpart_offset (machine_mode outermode, machine_mode innermode)
{
return subreg_size_highpart_offset (GET_MODE_SIZE (outermode),
GET_MODE_SIZE (innermode));
}
-extern int byte_lowpart_offset (machine_mode, machine_mode);
-extern int subreg_memory_offset (machine_mode, machine_mode, unsigned int);
-extern int subreg_memory_offset (const_rtx);
+extern poly_int64 byte_lowpart_offset (machine_mode, machine_mode);
+extern poly_int64 subreg_memory_offset (machine_mode, machine_mode,
+ poly_uint64);
+extern poly_int64 subreg_memory_offset (const_rtx);
extern rtx make_safe_from (rtx, rtx);
extern rtx convert_memory_address_addr_space_1 (scalar_int_mode, rtx,
addr_space_t, bool, bool);
@@ -3263,16 +3270,8 @@ extern rtx simplify_gen_ternary (enum rt
machine_mode, rtx, rtx, rtx);
extern rtx simplify_gen_relational (enum rtx_code, machine_mode,
machine_mode, rtx, rtx);
-extern rtx simplify_subreg (machine_mode, rtx, machine_mode,
- unsigned int);
-extern rtx simplify_gen_subreg (machine_mode, rtx, machine_mode,
- unsigned int);
-inline rtx
-simplify_gen_subreg (machine_mode omode, rtx x, machine_mode imode,
- poly_uint64 offset)
-{
- return simplify_gen_subreg (omode, x, imode, offset.to_constant ());
-}
+extern rtx simplify_subreg (machine_mode, rtx, machine_mode, poly_uint64);
+extern rtx simplify_gen_subreg (machine_mode, rtx, machine_mode, poly_uint64);
extern rtx lowpart_subreg (machine_mode, rtx, machine_mode);
extern rtx simplify_replace_fn_rtx (rtx, const_rtx,
rtx (*fn) (rtx, const_rtx, void *), void *);
@@ -3458,7 +3457,7 @@ struct subreg_info
};
extern void subreg_get_info (unsigned int, machine_mode,
- unsigned int, machine_mode,
+ poly_uint64, machine_mode,
struct subreg_info *);
/* lists.c */
@@ -3697,7 +3696,7 @@ extern rtx gen_rtx_CONST_VECTOR (machine
extern void set_mode_and_regno (rtx, machine_mode, unsigned int);
extern rtx gen_raw_REG (machine_mode, unsigned int);
extern rtx gen_rtx_REG (machine_mode, unsigned int);
-extern rtx gen_rtx_SUBREG (machine_mode, rtx, int);
+extern rtx gen_rtx_SUBREG (machine_mode, rtx, poly_uint64);
extern rtx gen_rtx_MEM (machine_mode, rtx);
extern rtx gen_rtx_VAR_LOCATION (machine_mode, tree, rtx,
enum var_init_status);
@@ -3914,7 +3913,7 @@ extern rtx gen_const_mem (machine_mode,
extern rtx gen_frame_mem (machine_mode, rtx);
extern rtx gen_tmp_stack_mem (machine_mode, rtx);
extern bool validate_subreg (machine_mode, machine_mode,
- const_rtx, unsigned int);
+ const_rtx, poly_uint64);
/* In combine.c */
extern unsigned int extended_count (const_rtx, machine_mode, int);
Index: gcc/rtl.c
===================================================================
--- gcc/rtl.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/rtl.c 2017-10-23 17:16:50.374527737 +0100
@@ -89,7 +89,8 @@ const char * const rtx_format[NUM_RTX_CO
"b" is a pointer to a bitmap header.
"B" is a basic block pointer.
"t" is a tree pointer.
- "r" a register. */
+ "r" a register.
+ "p" is a poly_uint16 offset. */
#define DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS) FORMAT ,
#include "rtl.def" /* rtl expressions are defined here */
@@ -349,6 +350,7 @@ copy_rtx (rtx orig)
case 't':
case 'w':
case 'i':
+ case 'p':
case 's':
case 'S':
case 'T':
@@ -503,6 +505,11 @@ rtx_equal_p_cb (const_rtx x, const_rtx y
}
break;
+ case 'p':
+ if (may_ne (SUBREG_BYTE (x), SUBREG_BYTE (y)))
+ return 0;
+ break;
+
case 'V':
case 'E':
/* Two vectors must have the same length. */
@@ -640,6 +647,11 @@ rtx_equal_p (const_rtx x, const_rtx y)
}
break;
+ case 'p':
+ if (may_ne (SUBREG_BYTE (x), SUBREG_BYTE (y)))
+ return 0;
+ break;
+
case 'V':
case 'E':
/* Two vectors must have the same length. */
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/emit-rtl.c 2017-10-23 17:16:50.363529222 +0100
@@ -922,17 +922,17 @@ gen_tmp_stack_mem (machine_mode mode, rt
bool
validate_subreg (machine_mode omode, machine_mode imode,
- const_rtx reg, unsigned int offset)
+ const_rtx reg, poly_uint64 offset)
{
unsigned int isize = GET_MODE_SIZE (imode);
unsigned int osize = GET_MODE_SIZE (omode);
/* All subregs must be aligned. */
- if (offset % osize != 0)
+ if (!multiple_p (offset, osize))
return false;
/* The subreg offset cannot be outside the inner object. */
- if (offset >= isize)
+ if (may_ge (offset, isize))
return false;
unsigned int regsize = REGMODE_NATURAL_SIZE (imode);
@@ -977,7 +977,7 @@ validate_subreg (machine_mode omode, mac
/* Paradoxical subregs must have offset zero. */
if (osize > isize)
- return offset == 0;
+ return known_zero (offset);
/* This is a normal subreg. Verify that the offset is representable. */
@@ -1009,18 +1009,20 @@ validate_subreg (machine_mode omode, mac
if (osize < regsize
&& ! (lra_in_progress && (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))))
{
- unsigned int block_size = MIN (isize, regsize);
- unsigned int offset_within_block = offset % block_size;
- if (BYTES_BIG_ENDIAN
- ? offset_within_block != block_size - osize
- : offset_within_block != 0)
+ poly_uint64 block_size = MIN (isize, regsize);
+ unsigned int start_reg;
+ poly_uint64 offset_within_reg;
+ if (!can_div_trunc_p (offset, block_size, &start_reg, &offset_within_reg)
+ || (BYTES_BIG_ENDIAN
+ ? may_ne (offset_within_reg, block_size - osize)
+ : maybe_nonzero (offset_within_reg)))
return false;
}
return true;
}
rtx
-gen_rtx_SUBREG (machine_mode mode, rtx reg, int offset)
+gen_rtx_SUBREG (machine_mode mode, rtx reg, poly_uint64 offset)
{
gcc_assert (validate_subreg (mode, GET_MODE (reg), reg, offset));
return gen_rtx_raw_SUBREG (mode, reg, offset);
@@ -1121,7 +1123,7 @@ gen_rtvec_v (int n, rtx_insn **argp)
paradoxical lowpart, in which case the offset will be negative
on big-endian targets. */
-int
+poly_int64
byte_lowpart_offset (machine_mode outer_mode,
machine_mode inner_mode)
{
@@ -1135,13 +1137,13 @@ byte_lowpart_offset (machine_mode outer_
from address X. For paradoxical big-endian subregs this is a
negative value, otherwise it's the same as OFFSET. */
-int
+poly_int64
subreg_memory_offset (machine_mode outer_mode, machine_mode inner_mode,
- unsigned int offset)
+ poly_uint64 offset)
{
if (paradoxical_subreg_p (outer_mode, inner_mode))
{
- gcc_assert (offset == 0);
+ gcc_assert (known_zero (offset));
return -subreg_lowpart_offset (inner_mode, outer_mode);
}
return offset;
@@ -1151,7 +1153,7 @@ subreg_memory_offset (machine_mode outer
if SUBREG_REG (X) were stored in memory. The only significant thing
about the current SUBREG_REG is its mode. */
-int
+poly_int64
subreg_memory_offset (const_rtx x)
{
return subreg_memory_offset (GET_MODE (x), GET_MODE (SUBREG_REG (x)),
@@ -1657,10 +1659,11 @@ gen_highpart_mode (machine_mode outermod
/* Return the SUBREG_BYTE for a lowpart subreg whose outer mode has
OUTER_BYTES bytes and whose inner mode has INNER_BYTES bytes. */
-unsigned int
-subreg_size_lowpart_offset (unsigned int outer_bytes, unsigned int inner_bytes)
+poly_uint64
+subreg_size_lowpart_offset (poly_uint64 outer_bytes, poly_uint64 inner_bytes)
{
- if (outer_bytes > inner_bytes)
+ gcc_checking_assert (ordered_p (outer_bytes, inner_bytes));
+ if (may_gt (outer_bytes, inner_bytes))
/* Paradoxical subregs always have a SUBREG_BYTE of 0. */
return 0;
@@ -1675,11 +1678,10 @@ subreg_size_lowpart_offset (unsigned int
/* Return the SUBREG_BYTE for a highpart subreg whose outer mode has
OUTER_BYTES bytes and whose inner mode has INNER_BYTES bytes. */
-unsigned int
-subreg_size_highpart_offset (unsigned int outer_bytes,
- unsigned int inner_bytes)
+poly_uint64
+subreg_size_highpart_offset (poly_uint64 outer_bytes, poly_uint64 inner_bytes)
{
- gcc_assert (inner_bytes >= outer_bytes);
+ gcc_assert (must_ge (inner_bytes, outer_bytes));
if (BYTES_BIG_ENDIAN && WORDS_BIG_ENDIAN)
return 0;
@@ -1703,8 +1705,9 @@ subreg_lowpart_p (const_rtx x)
else if (GET_MODE (SUBREG_REG (x)) == VOIDmode)
return 0;
- return (subreg_lowpart_offset (GET_MODE (x), GET_MODE (SUBREG_REG (x)))
- == SUBREG_BYTE (x));
+ return must_eq (subreg_lowpart_offset (GET_MODE (x),
+ GET_MODE (SUBREG_REG (x))),
+ SUBREG_BYTE (x));
}
\f
/* Return subword OFFSET of operand OP.
@@ -5755,6 +5758,7 @@ copy_insn_1 (rtx orig)
case 't':
case 'w':
case 'i':
+ case 'p':
case 's':
case 'S':
case 'u':
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/rtlanal.c 2017-10-23 17:16:50.375527601 +0100
@@ -1586,7 +1586,7 @@ set_noop_p (const_rtx set)
if (GET_CODE (src) == SUBREG && GET_CODE (dst) == SUBREG)
{
- if (SUBREG_BYTE (src) != SUBREG_BYTE (dst))
+ if (may_ne (SUBREG_BYTE (src), SUBREG_BYTE (dst)))
return 0;
src = SUBREG_REG (src);
dst = SUBREG_REG (dst);
@@ -3557,48 +3557,50 @@ loc_mentioned_in_p (rtx *loc, const_rtx
and SUBREG_BYTE, return the bit offset where the subreg begins
(counting from the least significant bit of the operand). */
-unsigned int
+poly_uint64
subreg_lsb_1 (machine_mode outer_mode,
machine_mode inner_mode,
- unsigned int subreg_byte)
+ poly_uint64 subreg_byte)
{
- unsigned int bitpos;
- unsigned int byte;
- unsigned int word;
+ poly_uint64 subreg_end, trailing_bytes, byte_pos;
/* A paradoxical subreg begins at bit position 0. */
if (paradoxical_subreg_p (outer_mode, inner_mode))
return 0;
- if (WORDS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
- /* If the subreg crosses a word boundary ensure that
- it also begins and ends on a word boundary. */
- gcc_assert (!((subreg_byte % UNITS_PER_WORD
- + GET_MODE_SIZE (outer_mode)) > UNITS_PER_WORD
- && (subreg_byte % UNITS_PER_WORD
- || GET_MODE_SIZE (outer_mode) % UNITS_PER_WORD)));
-
- if (WORDS_BIG_ENDIAN)
- word = (GET_MODE_SIZE (inner_mode)
- - (subreg_byte + GET_MODE_SIZE (outer_mode))) / UNITS_PER_WORD;
- else
- word = subreg_byte / UNITS_PER_WORD;
- bitpos = word * BITS_PER_WORD;
-
- if (BYTES_BIG_ENDIAN)
- byte = (GET_MODE_SIZE (inner_mode)
- - (subreg_byte + GET_MODE_SIZE (outer_mode))) % UNITS_PER_WORD;
+ subreg_end = subreg_byte + GET_MODE_SIZE (outer_mode);
+ trailing_bytes = GET_MODE_SIZE (inner_mode) - subreg_end;
+ if (WORDS_BIG_ENDIAN && BYTES_BIG_ENDIAN)
+ byte_pos = trailing_bytes;
+ else if (!WORDS_BIG_ENDIAN && !BYTES_BIG_ENDIAN)
+ byte_pos = subreg_byte;
else
- byte = subreg_byte % UNITS_PER_WORD;
- bitpos += byte * BITS_PER_UNIT;
+ {
+ /* When bytes and words have oppposite endianness, we must be able
+ to split offsets into words and bytes at compile time. */
+ poly_uint64 leading_word_part
+ = force_align_down (subreg_byte, UNITS_PER_WORD);
+ poly_uint64 trailing_word_part
+ = force_align_down (trailing_bytes, UNITS_PER_WORD);
+ /* If the subreg crosses a word boundary ensure that
+ it also begins and ends on a word boundary. */
+ gcc_assert (must_le (subreg_end - leading_word_part,
+ (unsigned int) UNITS_PER_WORD)
+ || (must_eq (leading_word_part, subreg_byte)
+ && must_eq (trailing_word_part, trailing_bytes)));
+ if (WORDS_BIG_ENDIAN)
+ byte_pos = trailing_word_part + (subreg_byte - leading_word_part);
+ else
+ byte_pos = leading_word_part + (trailing_bytes - trailing_word_part);
+ }
- return bitpos;
+ return byte_pos * BITS_PER_UNIT;
}
/* Given a subreg X, return the bit offset where the subreg begins
(counting from the least significant bit of the reg). */
-unsigned int
+poly_uint64
subreg_lsb (const_rtx x)
{
return subreg_lsb_1 (GET_MODE (x), GET_MODE (SUBREG_REG (x)),
@@ -3611,29 +3613,32 @@ subreg_lsb (const_rtx x)
lsb of the inner value. This is the inverse of the calculation
performed by subreg_lsb_1 (which converts byte offsets to bit shifts). */
-unsigned int
-subreg_size_offset_from_lsb (unsigned int outer_bytes,
- unsigned int inner_bytes,
- unsigned int lsb_shift)
+poly_uint64
+subreg_size_offset_from_lsb (poly_uint64 outer_bytes, poly_uint64 inner_bytes,
+ poly_uint64 lsb_shift)
{
/* A paradoxical subreg begins at bit position 0. */
- if (outer_bytes > inner_bytes)
+ gcc_checking_assert (ordered_p (outer_bytes, inner_bytes));
+ if (may_gt (outer_bytes, inner_bytes))
{
- gcc_checking_assert (lsb_shift == 0);
+ gcc_checking_assert (known_zero (lsb_shift));
return 0;
}
- gcc_assert (lsb_shift % BITS_PER_UNIT == 0);
- unsigned int lower_bytes = lsb_shift / BITS_PER_UNIT;
- unsigned int upper_bytes = inner_bytes - (lower_bytes + outer_bytes);
+ poly_uint64 lower_bytes = exact_div (lsb_shift, BITS_PER_UNIT);
+ poly_uint64 upper_bytes = inner_bytes - (lower_bytes + outer_bytes);
if (WORDS_BIG_ENDIAN && BYTES_BIG_ENDIAN)
return upper_bytes;
else if (!WORDS_BIG_ENDIAN && !BYTES_BIG_ENDIAN)
return lower_bytes;
else
{
- unsigned int lower_word_part = lower_bytes & -UNITS_PER_WORD;
- unsigned int upper_word_part = upper_bytes & -UNITS_PER_WORD;
+ /* When bytes and words have oppposite endianness, we must be able
+ to split offsets into words and bytes at compile time. */
+ poly_uint64 lower_word_part = force_align_down (lower_bytes,
+ UNITS_PER_WORD);
+ poly_uint64 upper_word_part = force_align_down (upper_bytes,
+ UNITS_PER_WORD);
if (WORDS_BIG_ENDIAN)
return upper_word_part + (lower_bytes - lower_word_part);
else
@@ -3662,7 +3667,7 @@ subreg_size_offset_from_lsb (unsigned in
void
subreg_get_info (unsigned int xregno, machine_mode xmode,
- unsigned int offset, machine_mode ymode,
+ poly_uint64 offset, machine_mode ymode,
struct subreg_info *info)
{
unsigned int nregs_xmode, nregs_ymode;
@@ -3679,6 +3684,9 @@ subreg_get_info (unsigned int xregno, ma
at least one register. */
if (HARD_REGNO_NREGS_HAS_PADDING (xregno, xmode))
{
+ /* As a consequence, we must be dealing with a constant number of
+ scalars, and thus a constant offset. */
+ HOST_WIDE_INT coffset = offset.to_constant ();
nregs_xmode = HARD_REGNO_NREGS_WITH_PADDING (xregno, xmode);
unsigned int nunits = GET_MODE_NUNITS (xmode);
scalar_mode xmode_unit = GET_MODE_INNER (xmode);
@@ -3697,9 +3705,9 @@ subreg_get_info (unsigned int xregno, ma
3 for each part, but in memory it's two 128-bit parts.
Padding is assumed to be at the end (not necessarily the 'high part')
of each unit. */
- if ((offset / GET_MODE_SIZE (xmode_unit) + 1 < nunits)
- && (offset / GET_MODE_SIZE (xmode_unit)
- != ((offset + ysize - 1) / GET_MODE_SIZE (xmode_unit))))
+ if ((coffset / GET_MODE_SIZE (xmode_unit) + 1 < nunits)
+ && (coffset / GET_MODE_SIZE (xmode_unit)
+ != ((coffset + ysize - 1) / GET_MODE_SIZE (xmode_unit))))
{
info->representable_p = false;
rknown = true;
@@ -3711,7 +3719,7 @@ subreg_get_info (unsigned int xregno, ma
nregs_ymode = hard_regno_nregs (xregno, ymode);
/* Paradoxical subregs are otherwise valid. */
- if (!rknown && offset == 0 && ysize > xsize)
+ if (!rknown && known_zero (offset) && ysize > xsize)
{
info->representable_p = true;
/* If this is a big endian paradoxical subreg, which uses more
@@ -3746,16 +3754,22 @@ subreg_get_info (unsigned int xregno, ma
{
info->representable_p = false;
info->nregs = CEIL (ysize, regsize_xmode);
- info->offset = offset / regsize_xmode;
+ if (!can_div_trunc_p (offset, regsize_xmode, &info->offset))
+ /* Checked by validate_subreg. We must know at compile time
+ which inner registers are being accessed. */
+ gcc_unreachable ();
return;
}
/* It's not valid to extract a subreg of mode YMODE at OFFSET that
would go outside of XMODE. */
- if (!rknown && ysize + offset > xsize)
+ if (!rknown && may_gt (ysize + offset, xsize))
{
info->representable_p = false;
info->nregs = nregs_ymode;
- info->offset = offset / regsize_xmode;
+ if (!can_div_trunc_p (offset, regsize_xmode, &info->offset))
+ /* Checked by validate_subreg. We must know at compile time
+ which inner registers are being accessed. */
+ gcc_unreachable ();
return;
}
/* Quick exit for the simple and common case of extracting whole
@@ -3763,26 +3777,27 @@ subreg_get_info (unsigned int xregno, ma
/* ??? It would be better to integrate this into the code below,
if we can generalize the concept enough and figure out how
odd-sized modes can coexist with the other weird cases we support. */
+ HOST_WIDE_INT count;
if (!rknown
&& WORDS_BIG_ENDIAN == REG_WORDS_BIG_ENDIAN
&& regsize_xmode == regsize_ymode
- && (offset % regsize_ymode) == 0)
+ && constant_multiple_p (offset, regsize_ymode, &count))
{
info->representable_p = true;
info->nregs = nregs_ymode;
- info->offset = offset / regsize_ymode;
+ info->offset = count;
gcc_assert (info->offset + info->nregs <= (int) nregs_xmode);
return;
}
}
/* Lowpart subregs are otherwise valid. */
- if (!rknown && offset == subreg_lowpart_offset (ymode, xmode))
+ if (!rknown && must_eq (offset, subreg_lowpart_offset (ymode, xmode)))
{
info->representable_p = true;
rknown = true;
- if (offset == 0 || nregs_xmode == nregs_ymode)
+ if (known_zero (offset) || nregs_xmode == nregs_ymode)
{
info->offset = 0;
info->nregs = nregs_ymode;
@@ -3803,19 +3818,24 @@ subreg_get_info (unsigned int xregno, ma
These conditions may be relaxed but subreg_regno_offset would
need to be redesigned. */
gcc_assert ((xsize % num_blocks) == 0);
- unsigned int bytes_per_block = xsize / num_blocks;
+ poly_uint64 bytes_per_block = xsize / num_blocks;
/* Get the number of the first block that contains the subreg and the byte
offset of the subreg from the start of that block. */
- unsigned int block_number = offset / bytes_per_block;
- unsigned int subblock_offset = offset % bytes_per_block;
+ unsigned int block_number;
+ poly_uint64 subblock_offset;
+ if (!can_div_trunc_p (offset, bytes_per_block, &block_number,
+ &subblock_offset))
+ /* Checked by validate_subreg. We must know at compile time which
+ inner registers are being accessed. */
+ gcc_unreachable ();
if (!rknown)
{
/* Only the lowpart of each block is representable. */
info->representable_p
- = (subblock_offset
- == subreg_size_lowpart_offset (ysize, bytes_per_block));
+ = must_eq (subblock_offset,
+ subreg_size_lowpart_offset (ysize, bytes_per_block));
rknown = true;
}
@@ -3842,7 +3862,7 @@ subreg_get_info (unsigned int xregno, ma
RETURN - The regno offset which would be used. */
unsigned int
subreg_regno_offset (unsigned int xregno, machine_mode xmode,
- unsigned int offset, machine_mode ymode)
+ poly_uint64 offset, machine_mode ymode)
{
struct subreg_info info;
subreg_get_info (xregno, xmode, offset, ymode, &info);
@@ -3858,7 +3878,7 @@ subreg_regno_offset (unsigned int xregno
RETURN - Whether the offset is representable. */
bool
subreg_offset_representable_p (unsigned int xregno, machine_mode xmode,
- unsigned int offset, machine_mode ymode)
+ poly_uint64 offset, machine_mode ymode)
{
struct subreg_info info;
subreg_get_info (xregno, xmode, offset, ymode, &info);
@@ -3875,7 +3895,7 @@ subreg_offset_representable_p (unsigned
int
simplify_subreg_regno (unsigned int xregno, machine_mode xmode,
- unsigned int offset, machine_mode ymode)
+ poly_uint64 offset, machine_mode ymode)
{
struct subreg_info info;
unsigned int yregno;
Index: gcc/rtlhash.c
===================================================================
--- gcc/rtlhash.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/rtlhash.c 2017-10-23 17:16:50.375527601 +0100
@@ -87,6 +87,9 @@ add_rtx (const_rtx x, hash &hstate)
case 'i':
hstate.add_int (XINT (x, i));
break;
+ case 'p':
+ hstate.add_poly_int (SUBREG_BYTE (x));
+ break;
case 'V':
case 'E':
j = XVECLEN (x, i);
Index: gcc/genemit.c
===================================================================
--- gcc/genemit.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/genemit.c 2017-10-23 17:16:50.366528817 +0100
@@ -235,6 +235,12 @@ gen_exp (rtx x, enum rtx_code subroutine
printf ("%u", REGNO (x));
break;
+ case 'p':
+ /* We don't have a way of parsing polynomial offsets yet,
+ and hopefully never will. */
+ printf ("%d", SUBREG_BYTE (x).to_constant ());
+ break;
+
case 's':
printf ("\"%s\"", XSTR (x, i));
break;
Index: gcc/gengenrtl.c
===================================================================
--- gcc/gengenrtl.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/gengenrtl.c 2017-10-23 17:16:50.366528817 +0100
@@ -54,6 +54,9 @@ type_from_format (int c)
case 'w':
return "HOST_WIDE_INT ";
+ case 'p':
+ return "poly_uint16 ";
+
case 's':
return "const char *";
@@ -257,10 +260,12 @@ gendef (const char *format)
puts (" PUT_MODE_RAW (rt, mode);");
for (p = format, i = j = 0; *p ; ++p, ++i)
- if (*p != '0')
- printf (" %s (rt, %d) = arg%d;\n", accessor_from_format (*p), i, j++);
- else
+ if (*p == '0')
printf (" X0EXP (rt, %d) = NULL_RTX;\n", i);
+ else if (*p == 'p')
+ printf (" SUBREG_BYTE (rt) = arg%d;\n", j++);
+ else
+ printf (" %s (rt, %d) = arg%d;\n", accessor_from_format (*p), i, j++);
puts ("\n return rt;\n}\n");
printf ("#define gen_rtx_fmt_%s(c, m", format);
Index: gcc/gensupport.c
===================================================================
--- gcc/gensupport.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/gensupport.c 2017-10-23 17:16:50.368528547 +0100
@@ -883,7 +883,7 @@ subst_pattern_match (rtx x, rtx pt, file
switch (fmt[i])
{
- case 'i': case 'r': case 'w': case 's':
+ case 'r': case 'p': case 'i': case 'w': case 's':
continue;
case 'e': case 'u':
@@ -1047,7 +1047,8 @@ get_alternatives_number (rtx pattern, in
return 0;
break;
- case 'i': case 'r': case 'w': case '0': case 's': case 'S': case 'T':
+ case 'r': case 'p': case 'i': case 'w':
+ case '0': case 's': case 'S': case 'T':
break;
default:
@@ -1106,7 +1107,8 @@ collect_insn_data (rtx pattern, int *pal
collect_insn_data (XVECEXP (pattern, i, j), palt, pmax);
break;
- case 'i': case 'r': case 'w': case '0': case 's': case 'S': case 'T':
+ case 'r': case 'p': case 'i': case 'w':
+ case '0': case 's': case 'S': case 'T':
break;
default:
@@ -1190,7 +1192,7 @@ alter_predicate_for_insn (rtx pattern, i
}
break;
- case 'i': case 'r': case 'w': case '0': case 's':
+ case 'r': case 'p': case 'i': case 'w': case '0': case 's':
break;
default:
@@ -1248,7 +1250,7 @@ alter_constraints (rtx pattern, int n_du
}
break;
- case 'i': case 'r': case 'w': case '0': case 's':
+ case 'r': case 'p': case 'i': case 'w': case '0': case 's':
break;
default:
@@ -2164,7 +2166,8 @@ subst_dup (rtx pattern, int n_alt, int n
n_alt, n_subst_alt);
break;
- case 'i': case 'r': case 'w': case '0': case 's': case 'S': case 'T':
+ case 'r': case 'p': case 'i': case 'w':
+ case '0': case 's': case 'S': case 'T':
break;
default:
Index: gcc/gengtype.c
===================================================================
--- gcc/gengtype.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/gengtype.c 2017-10-23 17:16:50.367528682 +0100
@@ -1241,6 +1241,11 @@ adjust_field_rtx_def (type_p t, options_
subname = "rt_int";
break;
+ case 'p':
+ t = scalar_tp;
+ subname = "rt_subreg";
+ break;
+
case '0':
if (i == MEM && aindex == 1)
t = mem_attrs_tp, subname = "rt_mem";
Index: gcc/genrecog.c
===================================================================
--- gcc/genrecog.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/genrecog.c 2017-10-23 17:16:50.367528682 +0100
@@ -388,7 +388,7 @@ find_operand (rtx pattern, int n, rtx st
return r;
break;
- case 'i': case 'r': case 'w': case '0': case 's':
+ case 'r': case 'p': case 'i': case 'w': case '0': case 's':
break;
default:
@@ -439,7 +439,7 @@ find_matching_operand (rtx pattern, int
return r;
break;
- case 'i': case 'r': case 'w': case '0': case 's':
+ case 'r': case 'p': case 'i': case 'w': case '0': case 's':
break;
default:
@@ -797,7 +797,7 @@ validate_pattern (rtx pattern, md_rtx_in
validate_pattern (XVECEXP (pattern, i, j), info, NULL_RTX, 0);
break;
- case 'i': case 'r': case 'w': case '0': case 's':
+ case 'r': case 'p': case 'i': case 'w': case '0': case 's':
break;
default:
@@ -1119,6 +1119,9 @@ struct rtx_test
/* Check REGNO (X) == LABEL. */
REGNO_FIELD,
+ /* Check must_eq (SUBREG_BYTE (X), LABEL). */
+ SUBREG_FIELD,
+
/* Check XINT (X, u.opno) == LABEL. */
INT_FIELD,
@@ -1199,6 +1202,7 @@ struct rtx_test
static rtx_test code (position *);
static rtx_test mode (position *);
static rtx_test regno_field (position *);
+ static rtx_test subreg_field (position *);
static rtx_test int_field (position *, int);
static rtx_test wide_int_field (position *, int);
static rtx_test veclen (position *);
@@ -1244,6 +1248,13 @@ rtx_test::regno_field (position *pos)
}
rtx_test
+rtx_test::subreg_field (position *pos)
+{
+ rtx_test res (pos, rtx_test::SUBREG_FIELD);
+ return res;
+}
+
+rtx_test
rtx_test::int_field (position *pos, int opno)
{
rtx_test res (pos, rtx_test::INT_FIELD);
@@ -1364,6 +1375,7 @@ operator == (const rtx_test &a, const rt
case rtx_test::CODE:
case rtx_test::MODE:
case rtx_test::REGNO_FIELD:
+ case rtx_test::SUBREG_FIELD:
case rtx_test::VECLEN:
case rtx_test::HAVE_NUM_CLOBBERS:
return true;
@@ -1821,6 +1833,7 @@ safe_to_hoist_p (decision *d, const rtx_
gcc_unreachable ();
case rtx_test::REGNO_FIELD:
+ case rtx_test::SUBREG_FIELD:
case rtx_test::INT_FIELD:
case rtx_test::WIDE_INT_FIELD:
case rtx_test::VECLEN:
@@ -2028,6 +2041,7 @@ transition_parameter_type (rtx_test::kin
return parameter::MODE;
case rtx_test::REGNO_FIELD:
+ case rtx_test::SUBREG_FIELD:
return parameter::UINT;
case rtx_test::INT_FIELD:
@@ -4039,6 +4053,14 @@ match_pattern_2 (state *s, md_rtx_info *
XWINT (pattern, 0), false);
break;
+ case 'p':
+ /* We don't have a way of parsing polynomial offsets yet,
+ and hopefully never will. */
+ s = add_decision (s, rtx_test::subreg_field (pos),
+ SUBREG_BYTE (pattern).to_constant (),
+ false);
+ break;
+
case '0':
break;
@@ -4571,6 +4593,12 @@ print_nonbool_test (output_state *os, co
printf (")");
break;
+ case rtx_test::SUBREG_FIELD:
+ printf ("SUBREG_BYTE (");
+ print_test_rtx (os, test);
+ printf (")");
+ break;
+
case rtx_test::WIDE_INT_FIELD:
printf ("XWINT (");
print_test_rtx (os, test);
@@ -4653,6 +4681,14 @@ print_test (output_state *os, const rtx_
print_label_value (test, is_param, value);
break;
+ case rtx_test::SUBREG_FIELD:
+ printf ("%s (", invert_p ? "may_ne" : "must_eq");
+ print_nonbool_test (os, test);
+ printf (", ");
+ print_label_value (test, is_param, value);
+ printf (")");
+ break;
+
case rtx_test::SAVED_CONST_INT:
gcc_assert (!is_param && value == 1);
print_test_rtx (os, test);
Index: gcc/genattrtab.c
===================================================================
--- gcc/genattrtab.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/genattrtab.c 2017-10-23 17:16:50.366528817 +0100
@@ -563,6 +563,7 @@ attr_rtx_1 (enum rtx_code code, va_list
break;
default:
+ /* Don't need to handle 'p' for attributes. */
gcc_unreachable ();
}
}
Index: gcc/genpeep.c
===================================================================
--- gcc/genpeep.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/genpeep.c 2017-10-23 17:16:50.367528682 +0100
@@ -306,6 +306,9 @@ match_rtx (rtx x, struct link *path, int
printf (" if (strcmp (XSTR (x, %d), \"%s\")) goto L%d;\n",
i, XSTR (x, i), fail_label);
}
+ else if (fmt[i] == 'p')
+ /* Not going to support subregs for legacy define_peeholes. */
+ gcc_unreachable ();
}
}
Index: gcc/print-rtl.c
===================================================================
--- gcc/print-rtl.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/print-rtl.c 2017-10-23 17:16:50.371528142 +0100
@@ -178,6 +178,7 @@ print_mem_expr (FILE *outfile, const_tre
fputc (' ', outfile);
print_generic_expr (outfile, CONST_CAST_TREE (expr), dump_flags);
}
+#endif
/* Print X to FILE. */
@@ -195,7 +196,6 @@ print_poly_int (FILE *file, poly_int64 x
fprintf (file, "]");
}
}
-#endif
/* Subroutine of print_rtx_operand for handling code '0'.
0 indicates a field for internal use that should not be printed.
@@ -628,6 +628,11 @@ rtx_writer::print_rtx_operand (const_rtx
print_rtx_operand_code_i (in_rtx, idx);
break;
+ case 'p':
+ fprintf (m_outfile, " ");
+ print_poly_int (m_outfile, SUBREG_BYTE (in_rtx));
+ break;
+
case 'r':
print_rtx_operand_code_r (in_rtx);
break;
@@ -1661,7 +1666,8 @@ print_value (pretty_printer *pp, const_r
break;
case SUBREG:
print_value (pp, SUBREG_REG (x), verbose);
- pp_printf (pp, "#%d", SUBREG_BYTE (x));
+ pp_printf (pp, "#");
+ pp_wide_integer (pp, SUBREG_BYTE (x));
break;
case SCRATCH:
case CC0:
Index: gcc/read-rtl.c
===================================================================
--- gcc/read-rtl.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/read-rtl.c 2017-10-23 17:16:50.371528142 +0100
@@ -222,7 +222,10 @@ find_int (const char *name)
static void
apply_int_iterator (rtx x, unsigned int index, int value)
{
- XINT (x, index) = value;
+ if (GET_CODE (x) == SUBREG)
+ SUBREG_BYTE (x) = value;
+ else
+ XINT (x, index) = value;
}
#ifdef GENERATOR_FILE
@@ -1608,6 +1611,7 @@ rtx_reader::read_rtx_operand (rtx return
case 'i':
case 'n':
+ case 'p':
/* Can be an iterator or an integer constant. */
read_name (&name);
record_potential_iterator_use (&ints, return_rtx, idx, name.string);
Index: gcc/alias.c
===================================================================
--- gcc/alias.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/alias.c 2017-10-23 17:16:50.356530167 +0100
@@ -1833,6 +1833,11 @@ rtx_equal_for_memref_p (const_rtx x, con
return 0;
break;
+ case 'p':
+ if (may_ne (SUBREG_BYTE (x), SUBREG_BYTE (y)))
+ return 0;
+ break;
+
case 'E':
/* Two vectors must have the same length. */
if (XVECLEN (x, i) != XVECLEN (y, i))
Index: gcc/cselib.c
===================================================================
--- gcc/cselib.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/cselib.c 2017-10-23 17:16:50.359529762 +0100
@@ -987,6 +987,11 @@ rtx_equal_for_cselib_1 (rtx x, rtx y, ma
return 0;
break;
+ case 'p':
+ if (may_ne (SUBREG_BYTE (x), SUBREG_BYTE (y)))
+ return 0;
+ break;
+
case 'V':
case 'E':
/* Two vectors must have the same length. */
@@ -1278,6 +1283,10 @@ cselib_hash_rtx (rtx x, int create, mach
hash += XINT (x, i);
break;
+ case 'p':
+ hash += constant_lower_bound (SUBREG_BYTE (x));
+ break;
+
case '0':
case 't':
/* unused */
Index: gcc/caller-save.c
===================================================================
--- gcc/caller-save.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/caller-save.c 2017-10-23 17:16:50.356530167 +0100
@@ -1129,7 +1129,7 @@ replace_reg_with_saved_mem (rtx *loc,
{
/* This is gen_lowpart_if_possible(), but without validating
the newly-formed address. */
- HOST_WIDE_INT offset = byte_lowpart_offset (mode, GET_MODE (mem));
+ poly_int64 offset = byte_lowpart_offset (mode, GET_MODE (mem));
mem = adjust_address_nv (mem, mode, offset);
}
}
Index: gcc/calls.c
===================================================================
--- gcc/calls.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/calls.c 2017-10-23 17:16:50.357530032 +0100
@@ -4126,8 +4126,8 @@ expand_call (tree exp, rtx target, int i
funtype, 1);
gcc_assert (GET_MODE (target) == pmode);
- unsigned int offset = subreg_lowpart_offset (TYPE_MODE (type),
- GET_MODE (target));
+ poly_uint64 offset = subreg_lowpart_offset (TYPE_MODE (type),
+ GET_MODE (target));
target = gen_rtx_SUBREG (TYPE_MODE (type), target, offset);
SUBREG_PROMOTED_VAR_P (target) = 1;
SUBREG_PROMOTED_SET (target, unsignedp);
Index: gcc/combine.c
===================================================================
--- gcc/combine.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/combine.c 2017-10-23 17:16:50.358529897 +0100
@@ -5826,7 +5826,7 @@ combine_simplify_rtx (rtx x, machine_mod
/* See if this can be moved to simplify_subreg. */
if (CONSTANT_P (SUBREG_REG (x))
- && subreg_lowpart_offset (mode, op0_mode) == SUBREG_BYTE (x)
+ && must_eq (subreg_lowpart_offset (mode, op0_mode), SUBREG_BYTE (x))
/* Don't call gen_lowpart if the inner mode
is VOIDmode and we cannot simplify it, as SUBREG without
inner mode is invalid. */
@@ -5850,8 +5850,8 @@ combine_simplify_rtx (rtx x, machine_mod
&& is_a <scalar_int_mode> (op0_mode, &int_op0_mode)
&& (GET_MODE_PRECISION (int_mode)
< GET_MODE_PRECISION (int_op0_mode))
- && (subreg_lowpart_offset (int_mode, int_op0_mode)
- == SUBREG_BYTE (x))
+ && must_eq (subreg_lowpart_offset (int_mode, int_op0_mode),
+ SUBREG_BYTE (x))
&& HWI_COMPUTABLE_MODE_P (int_op0_mode)
&& (nonzero_bits (SUBREG_REG (x), int_op0_mode)
& GET_MODE_MASK (int_mode)) == 0)
@@ -7320,7 +7320,8 @@ expand_field_assignment (const_rtx x)
{
inner = SUBREG_REG (XEXP (SET_DEST (x), 0));
len = GET_MODE_PRECISION (GET_MODE (XEXP (SET_DEST (x), 0)));
- pos = GEN_INT (subreg_lsb (XEXP (SET_DEST (x), 0)));
+ pos = gen_int_mode (subreg_lsb (XEXP (SET_DEST (x), 0)),
+ MAX_MODE_INT);
}
else if (GET_CODE (SET_DEST (x)) == ZERO_EXTRACT
&& CONST_INT_P (XEXP (SET_DEST (x), 1)))
@@ -7569,7 +7570,7 @@ make_extraction (machine_mode mode, rtx
return a new hard register. */
if (pos || in_dest)
{
- unsigned int offset
+ poly_uint64 offset
= subreg_offset_from_lsb (tmode, inner_mode, pos);
/* Avoid creating invalid subregs, for example when
@@ -11626,7 +11627,7 @@ gen_lowpart_for_combine (machine_mode om
if (paradoxical_subreg_p (omode, imode))
return gen_rtx_SUBREG (omode, x, 0);
- HOST_WIDE_INT offset = byte_lowpart_offset (omode, imode);
+ poly_int64 offset = byte_lowpart_offset (omode, imode);
return adjust_address_nv (x, omode, offset);
}
Index: gcc/loop-invariant.c
===================================================================
--- gcc/loop-invariant.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/loop-invariant.c 2017-10-23 17:16:50.370528277 +0100
@@ -335,6 +335,8 @@ hash_invariant_expr_1 (rtx_insn *insn, r
}
else if (fmt[i] == 'i' || fmt[i] == 'n')
val ^= XINT (x, i);
+ else if (fmt[i] == 'p')
+ val ^= constant_lower_bound (SUBREG_BYTE (x));
}
return val;
@@ -420,6 +422,11 @@ invariant_expr_equal_p (rtx_insn *insn1,
if (XINT (e1, i) != XINT (e2, i))
return false;
}
+ else if (fmt[i] == 'p')
+ {
+ if (may_ne (SUBREG_BYTE (e1), SUBREG_BYTE (e2)))
+ return false;
+ }
/* Unhandled type of subexpression, we fail conservatively. */
else
return false;
Index: gcc/cse.c
===================================================================
--- gcc/cse.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/cse.c 2017-10-23 17:16:50.359529762 +0100
@@ -561,7 +561,7 @@ static struct table_elt *insert (rtx, st
static void merge_equiv_classes (struct table_elt *, struct table_elt *);
static void invalidate (rtx, machine_mode);
static void remove_invalid_refs (unsigned int);
-static void remove_invalid_subreg_refs (unsigned int, unsigned int,
+static void remove_invalid_subreg_refs (unsigned int, poly_uint64,
machine_mode);
static void rehash_using_reg (rtx);
static void invalidate_memory (void);
@@ -1994,12 +1994,11 @@ remove_invalid_refs (unsigned int regno)
/* Likewise for a subreg with subreg_reg REGNO, subreg_byte OFFSET,
and mode MODE. */
static void
-remove_invalid_subreg_refs (unsigned int regno, unsigned int offset,
+remove_invalid_subreg_refs (unsigned int regno, poly_uint64 offset,
machine_mode mode)
{
unsigned int i;
struct table_elt *p, *next;
- unsigned int end = offset + (GET_MODE_SIZE (mode) - 1);
for (i = 0; i < HASH_SIZE; i++)
for (p = table[i]; p; p = next)
@@ -2011,9 +2010,9 @@ remove_invalid_subreg_refs (unsigned int
&& (GET_CODE (exp) != SUBREG
|| !REG_P (SUBREG_REG (exp))
|| REGNO (SUBREG_REG (exp)) != regno
- || (((SUBREG_BYTE (exp)
- + (GET_MODE_SIZE (GET_MODE (exp)) - 1)) >= offset)
- && SUBREG_BYTE (exp) <= end))
+ || ranges_may_overlap_p (SUBREG_BYTE (exp),
+ GET_MODE_SIZE (GET_MODE (exp)),
+ offset, GET_MODE_SIZE (mode)))
&& refers_to_regno_p (regno, p->exp))
remove_from_table (p, i);
}
@@ -2307,7 +2306,8 @@ hash_rtx_cb (const_rtx x, machine_mode m
{
hash += (((unsigned int) SUBREG << 7)
+ REGNO (SUBREG_REG (x))
- + (SUBREG_BYTE (x) / UNITS_PER_WORD));
+ + (constant_lower_bound (SUBREG_BYTE (x))
+ / UNITS_PER_WORD));
return hash;
}
break;
@@ -2526,6 +2526,10 @@ hash_rtx_cb (const_rtx x, machine_mode m
hash += (unsigned int) XINT (x, i);
break;
+ case 'p':
+ hash += constant_lower_bound (SUBREG_BYTE (x));
+ break;
+
case '0': case 't':
/* Unused. */
break;
@@ -2776,6 +2780,11 @@ exp_equiv_p (const_rtx x, const_rtx y, i
return 0;
break;
+ case 'p':
+ if (may_ne (SUBREG_BYTE (x), SUBREG_BYTE (y)))
+ return 0;
+ break;
+
case '0':
case 't':
break;
@@ -3801,8 +3810,9 @@ equiv_constant (rtx x)
if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (word_mode)
&& GET_MODE_SIZE (word_mode) < GET_MODE_SIZE (imode))
{
- int byte = SUBREG_BYTE (x) - subreg_lowpart_offset (mode, word_mode);
- if (byte >= 0 && (byte % UNITS_PER_WORD) == 0)
+ poly_int64 byte = (SUBREG_BYTE (x)
+ - subreg_lowpart_offset (mode, word_mode));
+ if (must_ge (byte, 0) && multiple_p (byte, UNITS_PER_WORD))
{
rtx y = gen_rtx_SUBREG (word_mode, SUBREG_REG (x), byte);
new_rtx = lookup_as_function (y, CONST_INT);
@@ -6002,7 +6012,7 @@ cse_insn (rtx_insn *insn)
new_src = elt->exp;
else
{
- unsigned int byte
+ poly_uint64 byte
= subreg_lowpart_offset (new_mode, GET_MODE (dest));
new_src = simplify_gen_subreg (new_mode, elt->exp,
GET_MODE (dest), byte);
Index: gcc/dse.c
===================================================================
--- gcc/dse.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/dse.c 2017-10-23 17:16:50.360529627 +0100
@@ -1703,7 +1703,7 @@ find_shift_sequence (poly_int64 access_s
e.g. at -Os, even when no actual shift will be needed. */
if (store_info->const_rhs)
{
- unsigned int byte = subreg_lowpart_offset (new_mode, store_mode);
+ poly_uint64 byte = subreg_lowpart_offset (new_mode, store_mode);
rtx ret = simplify_subreg (new_mode, store_info->const_rhs,
store_mode, byte);
if (ret && CONSTANT_P (ret))
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/dwarf2out.c 2017-10-23 17:16:50.362529357 +0100
@@ -19152,8 +19152,8 @@ rtl_for_decl_location (tree decl)
&& GET_MODE (rtl) != TYPE_MODE (TREE_TYPE (decl)))
{
machine_mode addr_mode = get_address_mode (rtl);
- HOST_WIDE_INT offset = byte_lowpart_offset (TYPE_MODE (TREE_TYPE (decl)),
- GET_MODE (rtl));
+ poly_int64 offset = byte_lowpart_offset (TYPE_MODE (TREE_TYPE (decl)),
+ GET_MODE (rtl));
/* If a variable is declared "register" yet is smaller than
a register, then if we store the variable to memory, it
@@ -19161,7 +19161,7 @@ rtl_for_decl_location (tree decl)
fact we are not. We need to adjust the offset of the
storage location to reflect the actual value's bytes,
else gdb will not be able to display it. */
- if (offset != 0)
+ if (maybe_nonzero (offset))
rtl = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (decl)),
plus_constant (addr_mode, XEXP (rtl, 0), offset));
}
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/expmed.c 2017-10-23 17:16:50.363529222 +0100
@@ -2344,7 +2344,7 @@ extract_low_bits (machine_mode mode, mac
/* simplify_gen_subreg can't be used here, as if simplify_subreg
fails, it will happily create (subreg (symbol_ref)) or similar
invalid SUBREGs. */
- unsigned int byte = subreg_lowpart_offset (mode, src_mode);
+ poly_uint64 byte = subreg_lowpart_offset (mode, src_mode);
rtx ret = simplify_subreg (mode, src, src_mode, byte);
if (ret)
return ret;
Index: gcc/expr.c
===================================================================
--- gcc/expr.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/expr.c 2017-10-23 17:16:50.364529087 +0100
@@ -2446,7 +2446,7 @@ emit_group_store (rtx orig_dst, rtx src,
{
machine_mode outer = GET_MODE (dst);
machine_mode inner;
- HOST_WIDE_INT bytepos;
+ poly_int64 bytepos;
bool done = false;
rtx temp;
@@ -2461,7 +2461,7 @@ emit_group_store (rtx orig_dst, rtx src,
{
inner = GET_MODE (tmps[start]);
bytepos = subreg_lowpart_offset (inner, outer);
- if (INTVAL (XEXP (XVECEXP (src, 0, start), 1)) == bytepos)
+ if (must_eq (INTVAL (XEXP (XVECEXP (src, 0, start), 1)), bytepos))
{
temp = simplify_gen_subreg (outer, tmps[start],
inner, 0);
@@ -2480,7 +2480,8 @@ emit_group_store (rtx orig_dst, rtx src,
{
inner = GET_MODE (tmps[finish - 1]);
bytepos = subreg_lowpart_offset (inner, outer);
- if (INTVAL (XEXP (XVECEXP (src, 0, finish - 1), 1)) == bytepos)
+ if (must_eq (INTVAL (XEXP (XVECEXP (src, 0, finish - 1), 1)),
+ bytepos))
{
temp = simplify_gen_subreg (outer, tmps[finish - 1],
inner, 0);
@@ -3543,9 +3544,9 @@ undefined_operand_subword_p (const_rtx o
if (GET_CODE (op) != SUBREG)
return false;
machine_mode innermostmode = GET_MODE (SUBREG_REG (op));
- HOST_WIDE_INT offset = i * UNITS_PER_WORD + subreg_memory_offset (op);
- return (offset >= GET_MODE_SIZE (innermostmode)
- || offset <= -UNITS_PER_WORD);
+ poly_int64 offset = i * UNITS_PER_WORD + subreg_memory_offset (op);
+ return (must_ge (offset, GET_MODE_SIZE (innermostmode))
+ || must_le (offset, -UNITS_PER_WORD));
}
/* A subroutine of emit_move_insn_1. Generate a move from Y into X.
@@ -9229,8 +9230,8 @@ #define REDUCE_BIT_FIELD(expr) (reduce_b
>= GET_MODE_BITSIZE (word_mode)))
{
rtx_insn *seq, *seq_old;
- unsigned int high_off = subreg_highpart_offset (word_mode,
- int_mode);
+ poly_uint64 high_off = subreg_highpart_offset (word_mode,
+ int_mode);
bool extend_unsigned
= TYPE_UNSIGNED (TREE_TYPE (gimple_assign_rhs1 (def)));
rtx low = lowpart_subreg (word_mode, op0, int_mode);
Index: gcc/final.c
===================================================================
--- gcc/final.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/final.c 2017-10-23 17:16:50.365528952 +0100
@@ -3194,7 +3194,7 @@ alter_subreg (rtx *xp, bool final_p)
We are required to. */
if (MEM_P (y))
{
- int offset = SUBREG_BYTE (x);
+ poly_int64 offset = SUBREG_BYTE (x);
/* For paradoxical subregs on big-endian machines, SUBREG_BYTE
contains 0 instead of the proper offset. See simplify_subreg. */
@@ -3217,7 +3217,7 @@ alter_subreg (rtx *xp, bool final_p)
{
/* Simplify_subreg can't handle some REG cases, but we have to. */
unsigned int regno;
- HOST_WIDE_INT offset;
+ poly_int64 offset;
regno = subreg_regno (x);
if (subreg_lowpart_p (x))
@@ -4460,6 +4460,7 @@ leaf_renumber_regs_insn (rtx in_rtx)
case '0':
case 'i':
case 'w':
+ case 'p':
case 'n':
case 'u':
break;
Index: gcc/function.c
===================================================================
--- gcc/function.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/function.c 2017-10-23 17:16:50.365528952 +0100
@@ -2698,9 +2698,9 @@ assign_parm_find_stack_rtl (tree parm, s
set_mem_size (stack_parm, GET_MODE_SIZE (data->promoted_mode));
if (MEM_EXPR (stack_parm) && MEM_OFFSET_KNOWN_P (stack_parm))
{
- int offset = subreg_lowpart_offset (DECL_MODE (parm),
- data->promoted_mode);
- if (offset)
+ poly_int64 offset = subreg_lowpart_offset (DECL_MODE (parm),
+ data->promoted_mode);
+ if (maybe_nonzero (offset))
set_mem_offset (stack_parm, MEM_OFFSET (stack_parm) - offset);
}
}
@@ -3424,12 +3424,13 @@ assign_parm_setup_stack (struct assign_p
if (data->stack_parm)
{
- int offset = subreg_lowpart_offset (data->nominal_mode,
- GET_MODE (data->stack_parm));
+ poly_int64 offset
+ = subreg_lowpart_offset (data->nominal_mode,
+ GET_MODE (data->stack_parm));
/* ??? This may need a big-endian conversion on sparc64. */
data->stack_parm
= adjust_address (data->stack_parm, data->nominal_mode, 0);
- if (offset && MEM_OFFSET_KNOWN_P (data->stack_parm))
+ if (maybe_nonzero (offset) && MEM_OFFSET_KNOWN_P (data->stack_parm))
set_mem_offset (data->stack_parm,
MEM_OFFSET (data->stack_parm) + offset);
}
Index: gcc/fwprop.c
===================================================================
--- gcc/fwprop.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/fwprop.c 2017-10-23 17:16:50.366528817 +0100
@@ -1263,7 +1263,7 @@ forward_propagate_and_simplify (df_ref u
reg = DF_REF_REG (use);
if (GET_CODE (reg) == SUBREG && GET_CODE (SET_DEST (def_set)) == SUBREG)
{
- if (SUBREG_BYTE (SET_DEST (def_set)) != SUBREG_BYTE (reg))
+ if (may_ne (SUBREG_BYTE (SET_DEST (def_set)), SUBREG_BYTE (reg)))
return false;
}
/* Check if the def had a subreg, but the use has the whole reg. */
Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/ifcvt.c 2017-10-23 17:16:50.368528547 +0100
@@ -894,7 +894,7 @@ noce_emit_move_insn (rtx x, rtx y)
{
machine_mode outmode;
rtx outer, inner;
- int bitpos;
+ poly_int64 bitpos;
if (GET_CODE (x) != STRICT_LOW_PART)
{
@@ -1724,12 +1724,12 @@ noce_emit_cmove (struct noce_if_info *if
{
rtx reg_vtrue = SUBREG_REG (vtrue);
rtx reg_vfalse = SUBREG_REG (vfalse);
- unsigned int byte_vtrue = SUBREG_BYTE (vtrue);
- unsigned int byte_vfalse = SUBREG_BYTE (vfalse);
+ poly_uint64 byte_vtrue = SUBREG_BYTE (vtrue);
+ poly_uint64 byte_vfalse = SUBREG_BYTE (vfalse);
rtx promoted_target;
if (GET_MODE (reg_vtrue) != GET_MODE (reg_vfalse)
- || byte_vtrue != byte_vfalse
+ || may_ne (byte_vtrue, byte_vfalse)
|| (SUBREG_PROMOTED_VAR_P (vtrue)
!= SUBREG_PROMOTED_VAR_P (vfalse))
|| (SUBREG_PROMOTED_GET (vtrue)
Index: gcc/ira.c
===================================================================
--- gcc/ira.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/ira.c 2017-10-23 17:16:50.369528412 +0100
@@ -4051,8 +4051,7 @@ get_subreg_tracking_sizes (rtx x, HOST_W
rtx reg = regno_reg_rtx[REGNO (SUBREG_REG (x))];
*outer_size = GET_MODE_SIZE (GET_MODE (x));
*inner_size = GET_MODE_SIZE (GET_MODE (reg));
- *start = SUBREG_BYTE (x);
- return true;
+ return SUBREG_BYTE (x).is_constant (start);
}
/* Init LIVE_SUBREGS[ALLOCNUM] and LIVE_SUBREGS_USED[ALLOCNUM] for
Index: gcc/ira-conflicts.c
===================================================================
--- gcc/ira-conflicts.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/ira-conflicts.c 2017-10-23 17:16:50.368528547 +0100
@@ -226,8 +226,11 @@ go_through_subreg (rtx x, int *offset)
if (REGNO (reg) < FIRST_PSEUDO_REGISTER)
*offset = subreg_regno_offset (REGNO (reg), GET_MODE (reg),
SUBREG_BYTE (x), GET_MODE (x));
- else
- *offset = (SUBREG_BYTE (x) / REGMODE_NATURAL_SIZE (GET_MODE (x)));
+ else if (!can_div_trunc_p (SUBREG_BYTE (x),
+ REGMODE_NATURAL_SIZE (GET_MODE (x)), offset))
+ /* Checked by validate_subreg. We must know at compile time which
+ inner hard registers are being accessed. */
+ gcc_unreachable ();
return reg;
}
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/ira-lives.c 2017-10-23 17:16:50.369528412 +0100
@@ -919,7 +919,7 @@ process_single_reg_class_operands (bool
(subreg:YMODE (reg:XMODE XREGNO) OFFSET). */
machine_mode ymode, xmode;
int xregno, yregno;
- HOST_WIDE_INT offset;
+ poly_int64 offset;
xmode = recog_data.operand_mode[i];
xregno = ira_class_singleton[cl][xmode];
Index: gcc/jump.c
===================================================================
--- gcc/jump.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/jump.c 2017-10-23 17:16:50.369528412 +0100
@@ -1724,7 +1724,7 @@ rtx_renumbered_equal_p (const_rtx x, con
&& REG_P (SUBREG_REG (y)))))
{
int reg_x = -1, reg_y = -1;
- int byte_x = 0, byte_y = 0;
+ poly_int64 byte_x = 0, byte_y = 0;
struct subreg_info info;
if (GET_MODE (x) != GET_MODE (y))
@@ -1781,7 +1781,7 @@ rtx_renumbered_equal_p (const_rtx x, con
reg_y = reg_renumber[reg_y];
}
- return reg_x >= 0 && reg_x == reg_y && byte_x == byte_y;
+ return reg_x >= 0 && reg_x == reg_y && must_eq (byte_x, byte_y);
}
/* Now we have disposed of all the cases
@@ -1873,6 +1873,11 @@ rtx_renumbered_equal_p (const_rtx x, con
}
break;
+ case 'p':
+ if (may_ne (SUBREG_BYTE (x), SUBREG_BYTE (y)))
+ return 0;
+ break;
+
case 't':
if (XTREE (x, i) != XTREE (y, i))
return 0;
Index: gcc/lower-subreg.c
===================================================================
--- gcc/lower-subreg.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/lower-subreg.c 2017-10-23 17:16:50.370528277 +0100
@@ -609,19 +609,21 @@ decompose_register (unsigned int regno)
/* Get a SUBREG of a CONCATN. */
static rtx
-simplify_subreg_concatn (machine_mode outermode, rtx op,
- unsigned int byte)
+simplify_subreg_concatn (machine_mode outermode, rtx op, poly_uint64 orig_byte)
{
unsigned int outer_size, outer_words, inner_size, inner_words;
machine_mode innermode, partmode;
rtx part;
unsigned int final_offset;
+ unsigned int byte;
innermode = GET_MODE (op);
if (!interesting_mode_p (outermode, &outer_size, &outer_words)
|| !interesting_mode_p (innermode, &inner_size, &inner_words))
gcc_unreachable ();
+ /* Must be constant if interesting_mode_p passes. */
+ byte = orig_byte.to_constant ();
gcc_assert (GET_CODE (op) == CONCATN);
gcc_assert (byte % outer_size == 0);
@@ -667,7 +669,7 @@ simplify_gen_subreg_concatn (machine_mod
if ((GET_MODE_SIZE (GET_MODE (op))
== GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))))
- && SUBREG_BYTE (op) == 0)
+ && known_zero (SUBREG_BYTE (op)))
return simplify_gen_subreg_concatn (outermode, SUBREG_REG (op),
GET_MODE (SUBREG_REG (op)), byte);
@@ -866,7 +868,7 @@ resolve_simple_move (rtx set, rtx_insn *
if (GET_CODE (src) == SUBREG
&& resolve_reg_p (SUBREG_REG (src))
- && (SUBREG_BYTE (src) != 0
+ && (maybe_nonzero (SUBREG_BYTE (src))
|| (GET_MODE_SIZE (orig_mode)
!= GET_MODE_SIZE (GET_MODE (SUBREG_REG (src))))))
{
@@ -881,7 +883,7 @@ resolve_simple_move (rtx set, rtx_insn *
if (GET_CODE (dest) == SUBREG
&& resolve_reg_p (SUBREG_REG (dest))
- && (SUBREG_BYTE (dest) != 0
+ && (maybe_nonzero (SUBREG_BYTE (dest))
|| (GET_MODE_SIZE (orig_mode)
!= GET_MODE_SIZE (GET_MODE (SUBREG_REG (dest))))))
{
Index: gcc/lra-constraints.c
===================================================================
--- gcc/lra-constraints.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/lra-constraints.c 2017-10-23 17:16:50.370528277 +0100
@@ -786,6 +786,11 @@ operands_match_p (rtx x, rtx y, int y_ha
return false;
break;
+ case 'p':
+ if (may_ne (SUBREG_BYTE (x), SUBREG_BYTE (y)))
+ return false;
+ break;
+
case 'e':
val = operands_match_p (XEXP (x, i), XEXP (y, i), -1);
if (val == 0)
@@ -974,7 +979,7 @@ match_reload (signed char out, signed ch
if (REG_P (subreg_reg)
&& (int) REGNO (subreg_reg) < lra_new_regno_start
&& GET_MODE (subreg_reg) == outmode
- && SUBREG_BYTE (in_rtx) == SUBREG_BYTE (new_in_reg)
+ && must_eq (SUBREG_BYTE (in_rtx), SUBREG_BYTE (new_in_reg))
&& find_regno_note (curr_insn, REG_DEAD, REGNO (subreg_reg))
&& (! early_clobber_p
|| check_conflict_input_operands (REGNO (subreg_reg),
@@ -4204,7 +4209,7 @@ curr_insn_transform (bool check_only_p)
{
machine_mode mode;
rtx reg, *loc;
- int hard_regno, byte;
+ int hard_regno;
enum op_type type = curr_static_id->operand[i].type;
loc = curr_id->operand_loc[i];
@@ -4212,7 +4217,7 @@ curr_insn_transform (bool check_only_p)
if (GET_CODE (*loc) == SUBREG)
{
reg = SUBREG_REG (*loc);
- byte = SUBREG_BYTE (*loc);
+ poly_int64 byte = SUBREG_BYTE (*loc);
if (REG_P (reg)
/* Strict_low_part requires reload the register not
the sub-register. */
Index: gcc/lra-spills.c
===================================================================
--- gcc/lra-spills.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/lra-spills.c 2017-10-23 17:16:50.371528142 +0100
@@ -136,7 +136,7 @@ assign_mem_slot (int i)
machine_mode wider_mode
= wider_subreg_mode (mode, lra_reg_info[i].biggest_mode);
HOST_WIDE_INT total_size = GET_MODE_SIZE (wider_mode);
- HOST_WIDE_INT adjust = 0;
+ poly_int64 adjust = 0;
lra_assert (regno_reg_rtx[i] != NULL_RTX && REG_P (regno_reg_rtx[i])
&& lra_reg_info[i].nrefs != 0 && reg_renumber[i] < 0);
Index: gcc/postreload.c
===================================================================
--- gcc/postreload.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/postreload.c 2017-10-23 17:16:50.371528142 +0100
@@ -1704,9 +1704,9 @@ move2add_valid_value_p (int regno, scala
mode after truncation only if (REG:mode regno) is the lowpart of
(REG:reg_mode[regno] regno). Now, for big endian, the starting
regno of the lowpart might be different. */
- int s_off = subreg_lowpart_offset (mode, old_mode);
+ poly_int64 s_off = subreg_lowpart_offset (mode, old_mode);
s_off = subreg_regno_offset (regno, old_mode, s_off, mode);
- if (s_off != 0)
+ if (maybe_nonzero (s_off))
/* We could in principle adjust regno, check reg_mode[regno] to be
BLKmode, and return s_off to the caller (vs. -1 for failure),
but we currently have no callers that could make use of this
Index: gcc/recog.c
===================================================================
--- gcc/recog.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/recog.c 2017-10-23 17:16:50.372528007 +0100
@@ -1006,7 +1006,8 @@ general_operand (rtx op, machine_mode mo
might be called from cleanup_subreg_operands.
??? This is a kludge. */
- if (!reload_completed && SUBREG_BYTE (op) != 0
+ if (!reload_completed
+ && maybe_nonzero (SUBREG_BYTE (op))
&& MEM_P (sub))
return 0;
@@ -1368,9 +1369,6 @@ indirect_operand (rtx op, machine_mode m
if (! reload_completed
&& GET_CODE (op) == SUBREG && MEM_P (SUBREG_REG (op)))
{
- int offset = SUBREG_BYTE (op);
- rtx inner = SUBREG_REG (op);
-
if (mode != VOIDmode && GET_MODE (op) != mode)
return 0;
@@ -1378,12 +1376,10 @@ indirect_operand (rtx op, machine_mode m
address is if OFFSET is zero and the address already is an operand
or if the address is (plus Y (const_int -OFFSET)) and Y is an
operand. */
-
- return ((offset == 0 && general_operand (XEXP (inner, 0), Pmode))
- || (GET_CODE (XEXP (inner, 0)) == PLUS
- && CONST_INT_P (XEXP (XEXP (inner, 0), 1))
- && INTVAL (XEXP (XEXP (inner, 0), 1)) == -offset
- && general_operand (XEXP (XEXP (inner, 0), 0), Pmode)));
+ poly_int64 offset;
+ rtx addr = strip_offset (XEXP (SUBREG_REG (op), 0), &offset);
+ return (known_zero (offset + SUBREG_BYTE (op))
+ && general_operand (addr, Pmode));
}
return (MEM_P (op)
Index: gcc/regcprop.c
===================================================================
--- gcc/regcprop.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/regcprop.c 2017-10-23 17:16:50.372528007 +0100
@@ -345,7 +345,8 @@ copy_value (rtx dest, rtx src, struct va
We can't properly represent the latter case in our tables, so don't
record anything then. */
else if (sn < hard_regno_nregs (sr, vd->e[sr].mode)
- && subreg_lowpart_offset (GET_MODE (dest), vd->e[sr].mode) != 0)
+ && maybe_nonzero (subreg_lowpart_offset (GET_MODE (dest),
+ vd->e[sr].mode)))
return;
/* If SRC had been assigned a mode narrower than the copy, we can't
@@ -407,7 +408,7 @@ maybe_mode_change (machine_mode orig_mod
int use_nregs = hard_regno_nregs (copy_regno, new_mode);
int copy_offset
= GET_MODE_SIZE (copy_mode) / copy_nregs * (copy_nregs - use_nregs);
- unsigned int offset
+ poly_uint64 offset
= subreg_size_lowpart_offset (GET_MODE_SIZE (new_mode) + copy_offset,
GET_MODE_SIZE (orig_mode));
regno += subreg_regno_offset (regno, orig_mode, offset, new_mode);
@@ -866,7 +867,8 @@ copyprop_hardreg_forward_1 (basic_block
/* And likewise, if we are narrowing on big endian the transformation
is also invalid. */
if (REG_NREGS (src) < hard_regno_nregs (regno, vd->e[regno].mode)
- && subreg_lowpart_offset (mode, vd->e[regno].mode) != 0)
+ && maybe_nonzero (subreg_lowpart_offset (mode,
+ vd->e[regno].mode)))
goto no_move_special_case;
}
Index: gcc/reginfo.c
===================================================================
--- gcc/reginfo.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/reginfo.c 2017-10-23 17:16:50.372528007 +0100
@@ -1206,7 +1206,9 @@ reg_classes_intersect_p (reg_class_t c1,
inline hashval_t
simplifiable_subregs_hasher::hash (const simplifiable_subreg *value)
{
- return value->shape.unique_id ();
+ inchash::hash h;
+ h.add_hwi (value->shape.unique_id ());
+ return h.end ();
}
inline bool
@@ -1231,9 +1233,11 @@ simplifiable_subregs (const subreg_shape
if (!this_target_hard_regs->x_simplifiable_subregs)
this_target_hard_regs->x_simplifiable_subregs
= new hash_table <simplifiable_subregs_hasher> (30);
+ inchash::hash h;
+ h.add_hwi (shape.unique_id ());
simplifiable_subreg **slot
= (this_target_hard_regs->x_simplifiable_subregs
- ->find_slot_with_hash (&shape, shape.unique_id (), INSERT));
+ ->find_slot_with_hash (&shape, h.end (), INSERT));
if (!*slot)
{
@@ -1294,7 +1298,7 @@ record_subregs_of_mode (rtx subreg, bool
unsigned int size = MAX (REGMODE_NATURAL_SIZE (shape.inner_mode),
GET_MODE_SIZE (shape.outer_mode));
gcc_checking_assert (size < GET_MODE_SIZE (shape.inner_mode));
- if (shape.offset >= size)
+ if (must_ge (shape.offset, size))
shape.offset -= size;
else
shape.offset += size;
Index: gcc/rtlhooks.c
===================================================================
--- gcc/rtlhooks.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/rtlhooks.c 2017-10-23 17:16:50.375527601 +0100
@@ -70,7 +70,7 @@ gen_lowpart_general (machine_mode mode,
&& !reload_completed)
return gen_lowpart_general (mode, force_reg (xmode, x));
- HOST_WIDE_INT offset = byte_lowpart_offset (mode, GET_MODE (x));
+ poly_int64 offset = byte_lowpart_offset (mode, GET_MODE (x));
return adjust_address (x, mode, offset);
}
}
@@ -115,7 +115,7 @@ gen_lowpart_if_possible (machine_mode mo
else if (MEM_P (x))
{
/* This is the only other case we handle. */
- HOST_WIDE_INT offset = byte_lowpart_offset (mode, GET_MODE (x));
+ poly_int64 offset = byte_lowpart_offset (mode, GET_MODE (x));
rtx new_rtx = adjust_address_nv (x, mode, offset);
if (! memory_address_addr_space_p (mode, XEXP (new_rtx, 0),
MEM_ADDR_SPACE (x)))
Index: gcc/reload.c
===================================================================
--- gcc/reload.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/reload.c 2017-10-23 17:16:50.373527872 +0100
@@ -2307,6 +2307,11 @@ operands_match_p (rtx x, rtx y)
return 0;
break;
+ case 'p':
+ if (may_ne (SUBREG_BYTE (x), SUBREG_BYTE (y)))
+ return 0;
+ break;
+
case 'e':
val = operands_match_p (XEXP (x, i), XEXP (y, i));
if (val == 0)
@@ -6095,7 +6100,7 @@ find_reloads_subreg_address (rtx x, int
int regno = REGNO (SUBREG_REG (x));
int reloaded = 0;
rtx tem, orig;
- int offset;
+ poly_int64 offset;
gcc_assert (reg_equiv_memory_loc (regno) != 0);
@@ -6142,7 +6147,7 @@ find_reloads_subreg_address (rtx x, int
XEXP (tem, 0), &XEXP (tem, 0),
opnum, type, ind_levels, insn);
/* ??? Do we need to handle nonzero offsets somehow? */
- if (!offset && !rtx_equal_p (tem, orig))
+ if (known_zero (offset) && !rtx_equal_p (tem, orig))
push_reg_equiv_alt_mem (regno, tem);
/* For some processors an address may be valid in the original mode but
Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/reload1.c 2017-10-23 17:16:50.373527872 +0100
@@ -2145,7 +2145,7 @@ alter_reg (int i, int from_reg, bool don
machine_mode wider_mode = wider_subreg_mode (mode, reg_max_ref_mode[i]);
unsigned int total_size = GET_MODE_SIZE (wider_mode);
unsigned int min_align = GET_MODE_BITSIZE (reg_max_ref_mode[i]);
- int adjust = 0;
+ poly_int64 adjust = 0;
something_was_spilled = true;
@@ -2185,7 +2185,7 @@ alter_reg (int i, int from_reg, bool don
if (BYTES_BIG_ENDIAN)
{
adjust = inherent_size - total_size;
- if (adjust)
+ if (maybe_nonzero (adjust))
{
unsigned int total_bits = total_size * BITS_PER_UNIT;
machine_mode mem_mode
@@ -2237,7 +2237,7 @@ alter_reg (int i, int from_reg, bool don
if (BYTES_BIG_ENDIAN)
{
adjust = GET_MODE_SIZE (mode) - total_size;
- if (adjust)
+ if (maybe_nonzero (adjust))
{
unsigned int total_bits = total_size * BITS_PER_UNIT;
machine_mode mem_mode
@@ -6347,12 +6347,12 @@ replaced_subreg (rtx x)
SUBREG is non-NULL if the pseudo is a subreg whose reg is a pseudo,
otherwise it is NULL. */
-static int
+static poly_int64
compute_reload_subreg_offset (machine_mode outermode,
rtx subreg,
machine_mode innermode)
{
- int outer_offset;
+ poly_int64 outer_offset;
machine_mode middlemode;
if (!subreg)
@@ -6506,7 +6506,7 @@ choose_reload_regs (struct insn_chain *c
if (inheritance)
{
- int byte = 0;
+ poly_int64 byte = 0;
int regno = -1;
machine_mode mode = VOIDmode;
rtx subreg = NULL_RTX;
@@ -6556,8 +6556,9 @@ choose_reload_regs (struct insn_chain *c
if (regno >= 0
&& reg_last_reload_reg[regno] != 0
- && (GET_MODE_SIZE (GET_MODE (reg_last_reload_reg[regno]))
- >= GET_MODE_SIZE (mode) + byte)
+ && (must_ge
+ (GET_MODE_SIZE (GET_MODE (reg_last_reload_reg[regno])),
+ GET_MODE_SIZE (mode) + byte))
/* Verify that the register it's in can be used in
mode MODE. */
&& (REG_CAN_CHANGE_MODE_P
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/simplify-rtx.c 2017-10-23 17:16:50.376527466 +0100
@@ -789,7 +789,7 @@ simplify_truncation (machine_mode mode,
&& (INTVAL (XEXP (op, 1)) & (precision - 1)) == 0
&& UINTVAL (XEXP (op, 1)) < op_precision)
{
- int byte = subreg_lowpart_offset (mode, op_mode);
+ poly_int64 byte = subreg_lowpart_offset (mode, op_mode);
int shifted_bytes = INTVAL (XEXP (op, 1)) / BITS_PER_UNIT;
return simplify_gen_subreg (mode, XEXP (op, 0), op_mode,
(WORDS_BIG_ENDIAN
@@ -815,7 +815,7 @@ simplify_truncation (machine_mode mode,
&& (GET_MODE_SIZE (int_mode) >= UNITS_PER_WORD
|| WORDS_BIG_ENDIAN == BYTES_BIG_ENDIAN))
{
- int byte = subreg_lowpart_offset (int_mode, int_op_mode);
+ poly_int64 byte = subreg_lowpart_offset (int_mode, int_op_mode);
int shifted_bytes = INTVAL (XEXP (op, 1)) / BITS_PER_UNIT;
return adjust_address_nv (XEXP (op, 0), int_mode,
(WORDS_BIG_ENDIAN
@@ -2826,7 +2826,7 @@ simplify_binary_operation_1 (enum rtx_co
&& GET_CODE (SUBREG_REG (opleft)) == ASHIFT
&& GET_CODE (opright) == LSHIFTRT
&& GET_CODE (XEXP (opright, 0)) == SUBREG
- && SUBREG_BYTE (opleft) == SUBREG_BYTE (XEXP (opright, 0))
+ && must_eq (SUBREG_BYTE (opleft), SUBREG_BYTE (XEXP (opright, 0)))
&& GET_MODE_SIZE (int_mode) < GET_MODE_SIZE (inner_mode)
&& rtx_equal_p (XEXP (SUBREG_REG (opleft), 0),
SUBREG_REG (XEXP (opright, 0)))
@@ -6183,7 +6183,7 @@ simplify_immed_subreg (fixed_size_mode o
Return 0 if no simplifications are possible. */
rtx
simplify_subreg (machine_mode outermode, rtx op,
- machine_mode innermode, unsigned int byte)
+ machine_mode innermode, poly_uint64 byte)
{
/* Little bit of sanity checking. */
gcc_assert (innermode != VOIDmode);
@@ -6194,16 +6194,16 @@ simplify_subreg (machine_mode outermode,
gcc_assert (GET_MODE (op) == innermode
|| GET_MODE (op) == VOIDmode);
- if ((byte % GET_MODE_SIZE (outermode)) != 0)
+ if (!multiple_p (byte, GET_MODE_SIZE (outermode)))
return NULL_RTX;
- if (byte >= GET_MODE_SIZE (innermode))
+ if (may_ge (byte, GET_MODE_SIZE (innermode)))
return NULL_RTX;
- if (outermode == innermode && !byte)
+ if (outermode == innermode && known_zero (byte))
return op;
- if (byte % GET_MODE_UNIT_SIZE (innermode) == 0)
+ if (multiple_p (byte, GET_MODE_UNIT_SIZE (innermode)))
{
rtx elt;
@@ -6224,12 +6224,15 @@ simplify_subreg (machine_mode outermode,
{
/* simplify_immed_subreg deconstructs OP into bytes and constructs
the result from bytes, so it only works if the sizes of the modes
- are known at compile time. Cases that apply to general modes
- should be handled here before calling simplify_immed_subreg. */
+ and the value of the offset are known at compile time. Cases that
+ that apply to general modes and offsets should be handled here
+ before calling simplify_immed_subreg. */
fixed_size_mode fs_outermode, fs_innermode;
+ unsigned HOST_WIDE_INT cbyte;
if (is_a <fixed_size_mode> (outermode, &fs_outermode)
- && is_a <fixed_size_mode> (innermode, &fs_innermode))
- return simplify_immed_subreg (fs_outermode, op, fs_innermode, byte);
+ && is_a <fixed_size_mode> (innermode, &fs_innermode)
+ && byte.is_constant (&cbyte))
+ return simplify_immed_subreg (fs_outermode, op, fs_innermode, cbyte);
return NULL_RTX;
}
@@ -6242,32 +6245,33 @@ simplify_subreg (machine_mode outermode,
rtx newx;
if (outermode == innermostmode
- && byte == 0 && SUBREG_BYTE (op) == 0)
+ && known_zero (byte)
+ && known_zero (SUBREG_BYTE (op)))
return SUBREG_REG (op);
/* Work out the memory offset of the final OUTERMODE value relative
to the inner value of OP. */
- HOST_WIDE_INT mem_offset = subreg_memory_offset (outermode,
- innermode, byte);
- HOST_WIDE_INT op_mem_offset = subreg_memory_offset (op);
- HOST_WIDE_INT final_offset = mem_offset + op_mem_offset;
+ poly_int64 mem_offset = subreg_memory_offset (outermode,
+ innermode, byte);
+ poly_int64 op_mem_offset = subreg_memory_offset (op);
+ poly_int64 final_offset = mem_offset + op_mem_offset;
/* See whether resulting subreg will be paradoxical. */
if (!paradoxical_subreg_p (outermode, innermostmode))
{
/* In nonparadoxical subregs we can't handle negative offsets. */
- if (final_offset < 0)
+ if (may_lt (final_offset, 0))
return NULL_RTX;
/* Bail out in case resulting subreg would be incorrect. */
- if (final_offset % GET_MODE_SIZE (outermode)
- || (unsigned) final_offset >= GET_MODE_SIZE (innermostmode))
+ if (!multiple_p (final_offset, GET_MODE_SIZE (outermode))
+ || may_ge (final_offset, GET_MODE_SIZE (innermostmode)))
return NULL_RTX;
}
else
{
- HOST_WIDE_INT required_offset
- = subreg_memory_offset (outermode, innermostmode, 0);
- if (final_offset != required_offset)
+ poly_int64 required_offset = subreg_memory_offset (outermode,
+ innermostmode, 0);
+ if (may_ne (final_offset, required_offset))
return NULL_RTX;
/* Paradoxical subregs always have byte offset 0. */
final_offset = 0;
@@ -6320,7 +6324,7 @@ simplify_subreg (machine_mode outermode,
The information is used only by alias analysis that can not
grog partial register anyway. */
- if (subreg_lowpart_offset (outermode, innermode) == byte)
+ if (must_eq (subreg_lowpart_offset (outermode, innermode), byte))
ORIGINAL_REGNO (x) = ORIGINAL_REGNO (op);
return x;
}
@@ -6345,25 +6349,28 @@ simplify_subreg (machine_mode outermode,
if (GET_CODE (op) == CONCAT
|| GET_CODE (op) == VEC_CONCAT)
{
- unsigned int part_size, final_offset;
+ unsigned int part_size;
+ poly_uint64 final_offset;
rtx part, res;
machine_mode part_mode = GET_MODE (XEXP (op, 0));
if (part_mode == VOIDmode)
part_mode = GET_MODE_INNER (GET_MODE (op));
part_size = GET_MODE_SIZE (part_mode);
- if (byte < part_size)
+ if (must_lt (byte, part_size))
{
part = XEXP (op, 0);
final_offset = byte;
}
- else
+ else if (must_ge (byte, part_size))
{
part = XEXP (op, 1);
final_offset = byte - part_size;
}
+ else
+ return NULL_RTX;
- if (final_offset + GET_MODE_SIZE (outermode) > part_size)
+ if (may_gt (final_offset + GET_MODE_SIZE (outermode), part_size))
return NULL_RTX;
part_mode = GET_MODE (part);
@@ -6381,15 +6388,15 @@ simplify_subreg (machine_mode outermode,
it extracts higher bits that the ZERO_EXTEND's source bits. */
if (GET_CODE (op) == ZERO_EXTEND && SCALAR_INT_MODE_P (innermode))
{
- unsigned int bitpos = subreg_lsb_1 (outermode, innermode, byte);
- if (bitpos >= GET_MODE_PRECISION (GET_MODE (XEXP (op, 0))))
+ poly_uint64 bitpos = subreg_lsb_1 (outermode, innermode, byte);
+ if (must_ge (bitpos, GET_MODE_PRECISION (GET_MODE (XEXP (op, 0)))))
return CONST0_RTX (outermode);
}
scalar_int_mode int_outermode, int_innermode;
if (is_a <scalar_int_mode> (outermode, &int_outermode)
&& is_a <scalar_int_mode> (innermode, &int_innermode)
- && byte == subreg_lowpart_offset (int_outermode, int_innermode))
+ && must_eq (byte, subreg_lowpart_offset (int_outermode, int_innermode)))
{
/* Handle polynomial integers. The upper bits of a paradoxical
subreg are undefined, so this is safe regardless of whether
@@ -6419,7 +6426,7 @@ simplify_subreg (machine_mode outermode,
rtx
simplify_gen_subreg (machine_mode outermode, rtx op,
- machine_mode innermode, unsigned int byte)
+ machine_mode innermode, poly_uint64 byte)
{
rtx newx;
@@ -6615,7 +6622,7 @@ test_vector_ops_duplicate (machine_mode
duplicate, last_par));
/* Test a scalar subreg of a VEC_DUPLICATE. */
- unsigned int offset = subreg_lowpart_offset (inner_mode, mode);
+ poly_uint64 offset = subreg_lowpart_offset (inner_mode, mode);
ASSERT_RTX_EQ (scalar_reg,
simplify_gen_subreg (inner_mode, duplicate,
mode, offset));
@@ -6635,7 +6642,7 @@ test_vector_ops_duplicate (machine_mode
duplicate, vec_par));
/* Test a vector subreg of a VEC_DUPLICATE. */
- unsigned int offset = subreg_lowpart_offset (narrower_mode, mode);
+ poly_uint64 offset = subreg_lowpart_offset (narrower_mode, mode);
ASSERT_RTX_EQ (narrower_duplicate,
simplify_gen_subreg (narrower_mode, duplicate,
mode, offset));
@@ -6745,7 +6752,7 @@ simplify_const_poly_int_tests<N>::run ()
rtx x10 = gen_int_mode (poly_int64 (-31, -24), HImode);
rtx two = GEN_INT (2);
rtx six = GEN_INT (6);
- HOST_WIDE_INT offset = subreg_lowpart_offset (QImode, HImode);
+ poly_uint64 offset = subreg_lowpart_offset (QImode, HImode);
/* These tests only try limited operation combinations. Fuller arithmetic
testing is done directly on poly_ints. */
Index: gcc/valtrack.c
===================================================================
--- gcc/valtrack.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/valtrack.c 2017-10-23 17:16:50.376527466 +0100
@@ -550,7 +550,7 @@ debug_lowpart_subreg (machine_mode outer
{
if (inner_mode == VOIDmode)
inner_mode = GET_MODE (expr);
- int offset = subreg_lowpart_offset (outer_mode, inner_mode);
+ poly_int64 offset = subreg_lowpart_offset (outer_mode, inner_mode);
rtx ret = simplify_gen_subreg (outer_mode, expr, inner_mode, offset);
if (ret)
return ret;
Index: gcc/var-tracking.c
===================================================================
--- gcc/var-tracking.c 2017-10-23 17:16:35.057923923 +0100
+++ gcc/var-tracking.c 2017-10-23 17:16:50.377527331 +0100
@@ -3522,6 +3522,12 @@ loc_cmp (rtx x, rtx y)
else
return 1;
+ case 'p':
+ r = compare_sizes_for_sort (SUBREG_BYTE (x), SUBREG_BYTE (y));
+ if (r != 0)
+ return r;
+ break;
+
case 'V':
case 'E':
/* Compare the vector length first. */
@@ -5369,7 +5375,7 @@ track_loc_p (rtx loc, tree expr, poly_in
static rtx
var_lowpart (machine_mode mode, rtx loc)
{
- unsigned int offset, reg_offset, regno;
+ unsigned int regno;
if (GET_MODE (loc) == mode)
return loc;
@@ -5377,12 +5383,12 @@ var_lowpart (machine_mode mode, rtx loc)
if (!REG_P (loc) && !MEM_P (loc))
return NULL;
- offset = byte_lowpart_offset (mode, GET_MODE (loc));
+ poly_uint64 offset = byte_lowpart_offset (mode, GET_MODE (loc));
if (MEM_P (loc))
return adjust_address_nv (loc, mode, offset);
- reg_offset = subreg_lowpart_offset (mode, GET_MODE (loc));
+ poly_uint64 reg_offset = subreg_lowpart_offset (mode, GET_MODE (loc));
regno = REGNO (loc) + subreg_regno_offset (REGNO (loc), GET_MODE (loc),
reg_offset, mode);
return gen_rtx_REG_offset (loc, mode, regno, offset);
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [025/nnn] poly_int: SUBREG_BYTE
2017-10-23 17:10 ` [025/nnn] poly_int: SUBREG_BYTE Richard Sandiford
@ 2017-12-06 18:50 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-12-06 18:50 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:10 AM, Richard Sandiford wrote:
> This patch changes SUBREG_BYTE from an int to a poly_int.
> Since valid SUBREG_BYTEs must be contained within the mode of the
> SUBREG_REG, the required range is the same as for GET_MODE_SIZE,
> i.e. unsigned short. The patch therefore uses poly_uint16(_pod)
> for the SUBREG_BYTE.
>
> Using poly_uint16_pod rtx fields requires a new field code ('p').
> Since there are no other uses of 'p' besides SUBREG_BYTE, the patch
> doesn't add an XPOLY or whatever; all uses should go via SUBREG_BYTE
> instead.
>
> The patch doesn't bother implementing 'p' support for legacy
> define_peepholes, since none of the remaining ones have subregs
> in their patterns.
>
> As it happened, the rtl documentation used SUBREG as an example of a
> code with mixed field types, accessed via XEXP (x, 0) and XINT (x, 1).
> Since there's no direct replacement for XINT, and since people should
> never use it even if there were, the patch changes the example to use
> INT_LIST instead.
>
> The patch also changes subreg-related helper functions so that they too
> take and return polynomial offsets. This makes the patch quite big, but
> it's mostly mechanical. The patch generally sticks to existing choices
> wrt signedness.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * doc/rtl.texi: Update documentation of SUBREG_BYTE. Document the
> 'p' format code. Use INT_LIST rather than SUBREG as the example of
> a code with an XINT and an XEXP. Remove the implication that
> accessing an rtx field using XINT is expected to work.
> * rtl.def (SUBREG): Change format from "ei" to "ep".
> * rtl.h (rtunion::rt_subreg): New field.
> (XCSUBREG): New macro.
> (SUBREG_BYTE): Use it.
> (subreg_shape): Change offset from an unsigned int to a poly_uint16.
> Update constructor accordingly.
> (subreg_shape::operator ==): Update accordingly.
> (subreg_shape::unique_id): Return an unsigned HOST_WIDE_INT rather
> than an unsigned int.
> (subreg_lsb, subreg_lowpart_offset, subreg_highpart_offset): Return
> a poly_uint64 rather than an unsigned int.
> (subreg_lsb_1): Likewise. Take the offset as a poly_uint64 rather
> than an unsigned int.
> (subreg_size_offset_from_lsb, subreg_size_lowpart_offset)
> (subreg_size_highpart_offset): Return a poly_uint64 rather than
> an unsigned int. Take the sizes as poly_uint64s.
> (subreg_offset_from_lsb): Return a poly_uint64 rather than
> an unsigned int. Take the shift as a poly_uint64 rather than
> an unsigned int.
> (subreg_regno_offset, subreg_offset_representable_p): Take the offset
> as a poly_uint64 rather than an unsigned int.
> (simplify_subreg_regno): Likewise.
> (byte_lowpart_offset): Return the memory offset as a poly_int64
> rather than an int.
> (subreg_memory_offset): Likewise. Take the subreg offset as a
> poly_uint64 rather than an unsigned int.
> (simplify_subreg, simplify_gen_subreg, subreg_get_info)
> (gen_rtx_SUBREG, validate_subreg): Take the subreg offset as a
> poly_uint64 rather than an unsigned int.
> * rtl.c (rtx_format): Describe 'p' in comment.
> (copy_rtx, rtx_equal_p_cb, rtx_equal_p): Handle 'p'.
> * emit-rtl.c (validate_subreg, gen_rtx_SUBREG): Take the subreg
> offset as a poly_uint64 rather than an unsigned int.
> (byte_lowpart_offset): Return the memory offset as a poly_int64
> rather than an int.
> (subreg_memory_offset): Likewise. Take the subreg offset as a
> poly_uint64 rather than an unsigned int.
> (subreg_size_lowpart_offset, subreg_size_highpart_offset): Take the
> mode sizes as poly_uint64s rather than unsigned ints. Return a
> poly_uint64 rather than an unsigned int.
> (subreg_lowpart_p): Treat subreg offsets as poly_ints.
> (copy_insn_1): Handle 'p'.
> * rtlanal.c (set_noop_p): Treat subregs offsets as poly_uint64s.
> (subreg_lsb_1): Take the subreg offset as a poly_uint64 rather than
> an unsigned int. Return the shift in the same way.
> (subreg_lsb): Return the shift as a poly_uint64 rather than an
> unsigned int.
> (subreg_size_offset_from_lsb): Take the sizes and shift as
> poly_uint64s rather than unsigned ints. Return the offset as
> a poly_uint64.
> (subreg_get_info, subreg_regno_offset, subreg_offset_representable_p)
> (simplify_subreg_regno): Take the offset as a poly_uint64 rather than
> an unsigned int.
> * rtlhash.c (add_rtx): Handle 'p'.
> * genemit.c (gen_exp): Likewise.
> * gengenrtl.c (type_from_format, gendef): Likewise.
> * gensupport.c (subst_pattern_match, get_alternatives_number)
> (collect_insn_data, alter_predicate_for_insn, alter_constraints)
> (subst_dup): Likewise.
> * gengtype.c (adjust_field_rtx_def): Likewise.
> * genrecog.c (find_operand, find_matching_operand, validate_pattern)
> (match_pattern_2): Likewise.
> (rtx_test::SUBREG_FIELD): New rtx_test::kind_enum.
> (rtx_test::subreg_field): New function.
> (operator ==, safe_to_hoist_p, transition_parameter_type)
> (print_nonbool_test, print_test): Handle SUBREG_FIELD.
> * genattrtab.c (attr_rtx_1): Say that 'p' is deliberately not handled.
> * genpeep.c (match_rtx): Likewise.
> * print-rtl.c (print_poly_int): Include if GENERATOR_FILE too.
> (rtx_writer::print_rtx_operand): Handle 'p'.
> (print_value): Handle SUBREG.
> * read-rtl.c (apply_int_iterator): Likewise.
> (rtx_reader::read_rtx_operand): Handle 'p'.
> * alias.c (rtx_equal_for_memref_p): Likewise.
> * cselib.c (rtx_equal_for_cselib_1, cselib_hash_rtx): Likewise.
> * caller-save.c (replace_reg_with_saved_mem): Treat subreg offsets
> as poly_ints.
> * calls.c (expand_call): Likewise.
> * combine.c (combine_simplify_rtx, expand_field_assignment): Likewise.
> (make_extraction, gen_lowpart_for_combine): Likewise.
> * loop-invariant.c (hash_invariant_expr_1, invariant_expr_equal_p):
> Likewise.
> * cse.c (remove_invalid_subreg_refs): Take the offset as a poly_uint64
> rather than an unsigned int. Treat subreg offsets as poly_ints.
> (exp_equiv_p): Handle 'p'.
> (hash_rtx_cb): Likewise. Treat subreg offsets as poly_ints.
> (equiv_constant, cse_insn): Treat subreg offsets as poly_ints.
> * dse.c (find_shift_sequence): Likewise.
> * dwarf2out.c (rtl_for_decl_location): Likewise.
> * expmed.c (extract_low_bits): Likewise.
> * expr.c (emit_group_store, undefined_operand_subword_p): Likewise.
> (expand_expr_real_2): Likewise.
> * final.c (alter_subreg): Likewise.
> (leaf_renumber_regs_insn): Handle 'p'.
> * function.c (assign_parm_find_stack_rtl, assign_parm_setup_stack):
> Treat subreg offsets as poly_ints.
> * fwprop.c (forward_propagate_and_simplify): Likewise.
> * ifcvt.c (noce_emit_move_insn, noce_emit_cmove): Likewise.
> * ira.c (get_subreg_tracking_sizes): Likewise.
> * ira-conflicts.c (go_through_subreg): Likewise.
> * ira-lives.c (process_single_reg_class_operands): Likewise.
> * jump.c (rtx_renumbered_equal_p): Likewise. Handle 'p'.
> * lower-subreg.c (simplify_subreg_concatn): Take the subreg offset
> as a poly_uint64 rather than an unsigned int.
> (simplify_gen_subreg_concatn, resolve_simple_move): Treat
> subreg offsets as poly_ints.
> * lra-constraints.c (operands_match_p): Handle 'p'.
> (match_reload, curr_insn_transform): Treat subreg offsets as poly_ints.
> * lra-spills.c (assign_mem_slot): Likewise.
> * postreload.c (move2add_valid_value_p): Likewise.
> * recog.c (general_operand, indirect_operand): Likewise.
> * regcprop.c (copy_value, maybe_mode_change): Likewise.
> (copyprop_hardreg_forward_1): Likewise.
> * reginfo.c (simplifiable_subregs_hasher::hash, simplifiable_subregs)
> (record_subregs_of_mode): Likewise.
> * rtlhooks.c (gen_lowpart_general, gen_lowpart_if_possible): Likewise.
> * reload.c (operands_match_p): Handle 'p'.
> (find_reloads_subreg_address): Treat subreg offsets as poly_ints.
> * reload1.c (alter_reg, choose_reload_regs): Likewise.
> (compute_reload_subreg_offset): Likewise, and return an poly_int64.
> * simplify-rtx.c (simplify_truncation, simplify_binary_operation_1):
> (test_vector_ops_duplicate): Treat subreg offsets as poly_ints.
> (simplify_const_poly_int_tests<N>::run): Likewise.
> (simplify_subreg, simplify_gen_subreg): Take the subreg offset as
> a poly_uint64 rather than an unsigned int.
> * valtrack.c (debug_lowpart_subreg): Likewise.
> * var-tracking.c (var_lowpart): Likewise.
> (loc_cmp): Handle 'p'.
>
> Index: gcc/rtlanal.c
[ ... Going to assume these bits are right WRT the endianness bits ...]
> - unsigned int lower_word_part = lower_bytes & -UNITS_PER_WORD;
> - unsigned int upper_word_part = upper_bytes & -UNITS_PER_WORD;
> + /* When bytes and words have oppposite endianness, we must be able
Nit. Oppposite
OK with nit fixed.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [027/nnn] poly_int: DWARF CFA offsets
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (24 preceding siblings ...)
2017-10-23 17:10 ` [025/nnn] poly_int: SUBREG_BYTE Richard Sandiford
@ 2017-10-23 17:11 ` Richard Sandiford
2017-12-06 0:40 ` Jeff Law
2017-10-23 17:11 ` [026/nnn] poly_int: operand_subword Richard Sandiford
` (81 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:11 UTC (permalink / raw)
To: gcc-patches
This patch makes the DWARF code use poly_int64 rather than
HOST_WIDE_INT for CFA offsets. The main changes are:
- to make reg_save use a DW_CFA_expression representation when
the offset isn't constant and
- to record the CFA information alongside a def_cfa_expression
if either offset is polynomial, since it's quite difficult
to reconstruct the CFA information otherwise.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* gengtype.c (main): Handle poly_int64_pod.
* dwarf2out.h (dw_cfi_oprnd_cfa_loc): New dw_cfi_oprnd_type.
(dw_cfi_oprnd::dw_cfi_cfa_loc): New field.
(dw_cfa_location::offset, dw_cfa_location::base_offset): Change
from HOST_WIDE_INT to poly_int64_pod.
* dwarf2cfi.c (queued_reg_save::cfa_offset): Likewise.
(copy_cfa): New function.
(lookup_cfa_1): Use the cached dw_cfi_cfa_loc, if it exists.
(cfi_oprnd_equal_p): Handle dw_cfi_oprnd_cfa_loc.
(cfa_equal_p, dwarf2out_frame_debug_adjust_cfa)
(dwarf2out_frame_debug_cfa_offset, dwarf2out_frame_debug_expr)
(initial_return_save): Treat offsets as poly_ints.
(def_cfa_0): Likewise. Cache the CFA in dw_cfi_cfa_loc if either
offset is nonconstant.
(reg_save): Take the offset as a poly_int64. Fall back to
DW_CFA_expression for nonconstant offsets.
(queue_reg_save): Take the offset as a poly_int64.
* dwarf2out.c (dw_cfi_oprnd2_desc): Handle DW_CFA_def_cfa_expression.
Index: gcc/gengtype.c
===================================================================
--- gcc/gengtype.c 2017-10-23 17:16:50.367528682 +0100
+++ gcc/gengtype.c 2017-10-23 17:16:57.211604434 +0100
@@ -5192,6 +5192,7 @@ #define POS_HERE(Call) do { pos.file = t
POS_HERE (do_scalar_typedef ("REAL_VALUE_TYPE", &pos));
POS_HERE (do_scalar_typedef ("FIXED_VALUE_TYPE", &pos));
POS_HERE (do_scalar_typedef ("double_int", &pos));
+ POS_HERE (do_scalar_typedef ("poly_int64_pod", &pos));
POS_HERE (do_scalar_typedef ("offset_int", &pos));
POS_HERE (do_scalar_typedef ("widest_int", &pos));
POS_HERE (do_scalar_typedef ("int64_t", &pos));
Index: gcc/dwarf2out.h
===================================================================
--- gcc/dwarf2out.h 2017-10-23 17:11:40.311071579 +0100
+++ gcc/dwarf2out.h 2017-10-23 17:16:57.210604569 +0100
@@ -43,7 +43,8 @@ enum dw_cfi_oprnd_type {
dw_cfi_oprnd_reg_num,
dw_cfi_oprnd_offset,
dw_cfi_oprnd_addr,
- dw_cfi_oprnd_loc
+ dw_cfi_oprnd_loc,
+ dw_cfi_oprnd_cfa_loc
};
typedef union GTY(()) {
@@ -51,6 +52,8 @@ typedef union GTY(()) {
HOST_WIDE_INT GTY ((tag ("dw_cfi_oprnd_offset"))) dw_cfi_offset;
const char * GTY ((tag ("dw_cfi_oprnd_addr"))) dw_cfi_addr;
struct dw_loc_descr_node * GTY ((tag ("dw_cfi_oprnd_loc"))) dw_cfi_loc;
+ struct dw_cfa_location * GTY ((tag ("dw_cfi_oprnd_cfa_loc")))
+ dw_cfi_cfa_loc;
} dw_cfi_oprnd;
struct GTY(()) dw_cfi_node {
@@ -114,8 +117,8 @@ struct GTY(()) dw_fde_node {
Instead of passing around REG and OFFSET, we pass a copy
of this structure. */
struct GTY(()) dw_cfa_location {
- HOST_WIDE_INT offset;
- HOST_WIDE_INT base_offset;
+ poly_int64_pod offset;
+ poly_int64_pod base_offset;
/* REG is in DWARF_FRAME_REGNUM space, *not* normal REGNO space. */
unsigned int reg;
BOOL_BITFIELD indirect : 1; /* 1 if CFA is accessed via a dereference. */
Index: gcc/dwarf2cfi.c
===================================================================
--- gcc/dwarf2cfi.c 2017-10-23 17:07:41.013611927 +0100
+++ gcc/dwarf2cfi.c 2017-10-23 17:16:57.208604839 +0100
@@ -206,7 +206,7 @@ static GTY(()) unsigned long dwarf2out_c
struct queued_reg_save {
rtx reg;
rtx saved_reg;
- HOST_WIDE_INT cfa_offset;
+ poly_int64_pod cfa_offset;
};
@@ -434,6 +434,16 @@ copy_cfi_row (dw_cfi_row *src)
return dst;
}
+/* Return a copy of an existing CFA location. */
+
+static dw_cfa_location *
+copy_cfa (dw_cfa_location *src)
+{
+ dw_cfa_location *dst = ggc_alloc<dw_cfa_location> ();
+ *dst = *src;
+ return dst;
+}
+
/* Generate a new label for the CFI info to refer to. */
static char *
@@ -629,7 +639,10 @@ lookup_cfa_1 (dw_cfi_ref cfi, dw_cfa_loc
loc->offset = cfi->dw_cfi_oprnd2.dw_cfi_offset;
break;
case DW_CFA_def_cfa_expression:
- get_cfa_from_loc_descr (loc, cfi->dw_cfi_oprnd1.dw_cfi_loc);
+ if (cfi->dw_cfi_oprnd2.dw_cfi_cfa_loc)
+ *loc = *cfi->dw_cfi_oprnd2.dw_cfi_cfa_loc;
+ else
+ get_cfa_from_loc_descr (loc, cfi->dw_cfi_oprnd1.dw_cfi_loc);
break;
case DW_CFA_remember_state:
@@ -654,10 +667,10 @@ lookup_cfa_1 (dw_cfi_ref cfi, dw_cfa_loc
cfa_equal_p (const dw_cfa_location *loc1, const dw_cfa_location *loc2)
{
return (loc1->reg == loc2->reg
- && loc1->offset == loc2->offset
+ && must_eq (loc1->offset, loc2->offset)
&& loc1->indirect == loc2->indirect
&& (loc1->indirect == 0
- || loc1->base_offset == loc2->base_offset));
+ || must_eq (loc1->base_offset, loc2->base_offset)));
}
/* Determine if two CFI operands are identical. */
@@ -678,6 +691,8 @@ cfi_oprnd_equal_p (enum dw_cfi_oprnd_typ
|| strcmp (a->dw_cfi_addr, b->dw_cfi_addr) == 0);
case dw_cfi_oprnd_loc:
return loc_descr_equal_p (a->dw_cfi_loc, b->dw_cfi_loc);
+ case dw_cfi_oprnd_cfa_loc:
+ return cfa_equal_p (a->dw_cfi_cfa_loc, b->dw_cfi_cfa_loc);
}
gcc_unreachable ();
}
@@ -758,19 +773,23 @@ def_cfa_0 (dw_cfa_location *old_cfa, dw_
cfi = new_cfi ();
- if (new_cfa->reg == old_cfa->reg && !new_cfa->indirect && !old_cfa->indirect)
+ HOST_WIDE_INT const_offset;
+ if (new_cfa->reg == old_cfa->reg
+ && !new_cfa->indirect
+ && !old_cfa->indirect
+ && new_cfa->offset.is_constant (&const_offset))
{
/* Construct a "DW_CFA_def_cfa_offset <offset>" instruction, indicating
the CFA register did not change but the offset did. The data
factoring for DW_CFA_def_cfa_offset_sf happens in output_cfi, or
in the assembler via the .cfi_def_cfa_offset directive. */
- if (new_cfa->offset < 0)
+ if (const_offset < 0)
cfi->dw_cfi_opc = DW_CFA_def_cfa_offset_sf;
else
cfi->dw_cfi_opc = DW_CFA_def_cfa_offset;
- cfi->dw_cfi_oprnd1.dw_cfi_offset = new_cfa->offset;
+ cfi->dw_cfi_oprnd1.dw_cfi_offset = const_offset;
}
- else if (new_cfa->offset == old_cfa->offset
+ else if (must_eq (new_cfa->offset, old_cfa->offset)
&& old_cfa->reg != INVALID_REGNUM
&& !new_cfa->indirect
&& !old_cfa->indirect)
@@ -781,19 +800,20 @@ def_cfa_0 (dw_cfa_location *old_cfa, dw_
cfi->dw_cfi_opc = DW_CFA_def_cfa_register;
cfi->dw_cfi_oprnd1.dw_cfi_reg_num = new_cfa->reg;
}
- else if (new_cfa->indirect == 0)
+ else if (new_cfa->indirect == 0
+ && new_cfa->offset.is_constant (&const_offset))
{
/* Construct a "DW_CFA_def_cfa <register> <offset>" instruction,
indicating the CFA register has changed to <register> with
the specified offset. The data factoring for DW_CFA_def_cfa_sf
happens in output_cfi, or in the assembler via the .cfi_def_cfa
directive. */
- if (new_cfa->offset < 0)
+ if (const_offset < 0)
cfi->dw_cfi_opc = DW_CFA_def_cfa_sf;
else
cfi->dw_cfi_opc = DW_CFA_def_cfa;
cfi->dw_cfi_oprnd1.dw_cfi_reg_num = new_cfa->reg;
- cfi->dw_cfi_oprnd2.dw_cfi_offset = new_cfa->offset;
+ cfi->dw_cfi_oprnd2.dw_cfi_offset = const_offset;
}
else
{
@@ -805,6 +825,13 @@ def_cfa_0 (dw_cfa_location *old_cfa, dw_
cfi->dw_cfi_opc = DW_CFA_def_cfa_expression;
loc_list = build_cfa_loc (new_cfa, 0);
cfi->dw_cfi_oprnd1.dw_cfi_loc = loc_list;
+ if (!new_cfa->offset.is_constant ()
+ || !new_cfa->base_offset.is_constant ())
+ /* It's hard to reconstruct the CFA location for a polynomial
+ expression, so just cache it instead. */
+ cfi->dw_cfi_oprnd2.dw_cfi_cfa_loc = copy_cfa (new_cfa);
+ else
+ cfi->dw_cfi_oprnd2.dw_cfi_cfa_loc = NULL;
}
return cfi;
@@ -836,33 +863,42 @@ def_cfa_1 (dw_cfa_location *new_cfa)
otherwise it is saved in SREG. */
static void
-reg_save (unsigned int reg, unsigned int sreg, HOST_WIDE_INT offset)
+reg_save (unsigned int reg, unsigned int sreg, poly_int64 offset)
{
dw_fde_ref fde = cfun ? cfun->fde : NULL;
dw_cfi_ref cfi = new_cfi ();
cfi->dw_cfi_oprnd1.dw_cfi_reg_num = reg;
- /* When stack is aligned, store REG using DW_CFA_expression with FP. */
- if (fde
- && fde->stack_realign
- && sreg == INVALID_REGNUM)
- {
- cfi->dw_cfi_opc = DW_CFA_expression;
- cfi->dw_cfi_oprnd1.dw_cfi_reg_num = reg;
- cfi->dw_cfi_oprnd2.dw_cfi_loc
- = build_cfa_aligned_loc (&cur_row->cfa, offset,
- fde->stack_realignment);
- }
- else if (sreg == INVALID_REGNUM)
- {
- if (need_data_align_sf_opcode (offset))
- cfi->dw_cfi_opc = DW_CFA_offset_extended_sf;
- else if (reg & ~0x3f)
- cfi->dw_cfi_opc = DW_CFA_offset_extended;
+ if (sreg == INVALID_REGNUM)
+ {
+ HOST_WIDE_INT const_offset;
+ /* When stack is aligned, store REG using DW_CFA_expression with FP. */
+ if (fde && fde->stack_realign)
+ {
+ cfi->dw_cfi_opc = DW_CFA_expression;
+ cfi->dw_cfi_oprnd1.dw_cfi_reg_num = reg;
+ cfi->dw_cfi_oprnd2.dw_cfi_loc
+ = build_cfa_aligned_loc (&cur_row->cfa, offset,
+ fde->stack_realignment);
+ }
+ else if (offset.is_constant (&const_offset))
+ {
+ if (need_data_align_sf_opcode (const_offset))
+ cfi->dw_cfi_opc = DW_CFA_offset_extended_sf;
+ else if (reg & ~0x3f)
+ cfi->dw_cfi_opc = DW_CFA_offset_extended;
+ else
+ cfi->dw_cfi_opc = DW_CFA_offset;
+ cfi->dw_cfi_oprnd2.dw_cfi_offset = const_offset;
+ }
else
- cfi->dw_cfi_opc = DW_CFA_offset;
- cfi->dw_cfi_oprnd2.dw_cfi_offset = offset;
+ {
+ cfi->dw_cfi_opc = DW_CFA_expression;
+ cfi->dw_cfi_oprnd1.dw_cfi_reg_num = reg;
+ cfi->dw_cfi_oprnd2.dw_cfi_loc
+ = build_cfa_loc (&cur_row->cfa, offset);
+ }
}
else if (sreg == reg)
{
@@ -995,7 +1031,7 @@ record_reg_saved_in_reg (rtx dest, rtx s
SREG, or if SREG is NULL then it is saved at OFFSET to the CFA. */
static void
-queue_reg_save (rtx reg, rtx sreg, HOST_WIDE_INT offset)
+queue_reg_save (rtx reg, rtx sreg, poly_int64 offset)
{
queued_reg_save *q;
queued_reg_save e = {reg, sreg, offset};
@@ -1097,20 +1133,11 @@ dwarf2out_frame_debug_def_cfa (rtx pat)
{
memset (cur_cfa, 0, sizeof (*cur_cfa));
- if (GET_CODE (pat) == PLUS)
- {
- cur_cfa->offset = INTVAL (XEXP (pat, 1));
- pat = XEXP (pat, 0);
- }
+ pat = strip_offset (pat, &cur_cfa->offset);
if (MEM_P (pat))
{
cur_cfa->indirect = 1;
- pat = XEXP (pat, 0);
- if (GET_CODE (pat) == PLUS)
- {
- cur_cfa->base_offset = INTVAL (XEXP (pat, 1));
- pat = XEXP (pat, 0);
- }
+ pat = strip_offset (XEXP (pat, 0), &cur_cfa->base_offset);
}
/* ??? If this fails, we could be calling into the _loc functions to
define a full expression. So far no port does that. */
@@ -1133,7 +1160,7 @@ dwarf2out_frame_debug_adjust_cfa (rtx pa
{
case PLUS:
gcc_assert (dwf_regno (XEXP (src, 0)) == cur_cfa->reg);
- cur_cfa->offset -= INTVAL (XEXP (src, 1));
+ cur_cfa->offset -= rtx_to_poly_int64 (XEXP (src, 1));
break;
case REG:
@@ -1152,7 +1179,7 @@ dwarf2out_frame_debug_adjust_cfa (rtx pa
static void
dwarf2out_frame_debug_cfa_offset (rtx set)
{
- HOST_WIDE_INT offset;
+ poly_int64 offset;
rtx src, addr, span;
unsigned int sregno;
@@ -1170,7 +1197,7 @@ dwarf2out_frame_debug_cfa_offset (rtx se
break;
case PLUS:
gcc_assert (dwf_regno (XEXP (addr, 0)) == cur_cfa->reg);
- offset = INTVAL (XEXP (addr, 1)) - cur_cfa->offset;
+ offset = rtx_to_poly_int64 (XEXP (addr, 1)) - cur_cfa->offset;
break;
default:
gcc_unreachable ();
@@ -1195,7 +1222,7 @@ dwarf2out_frame_debug_cfa_offset (rtx se
{
/* We have a PARALLEL describing where the contents of SRC live.
Adjust the offset for each piece of the PARALLEL. */
- HOST_WIDE_INT span_offset = offset;
+ poly_int64 span_offset = offset;
gcc_assert (GET_CODE (span) == PARALLEL);
@@ -1535,7 +1562,7 @@ dwarf2out_frame_debug_cfa_window_save (v
dwarf2out_frame_debug_expr (rtx expr)
{
rtx src, dest, span;
- HOST_WIDE_INT offset;
+ poly_int64 offset;
dw_fde_ref fde;
/* If RTX_FRAME_RELATED_P is set on a PARALLEL, process each member of
@@ -1639,19 +1666,14 @@ dwarf2out_frame_debug_expr (rtx expr)
{
/* Rule 2 */
/* Adjusting SP. */
- switch (GET_CODE (XEXP (src, 1)))
+ if (REG_P (XEXP (src, 1)))
{
- case CONST_INT:
- offset = INTVAL (XEXP (src, 1));
- break;
- case REG:
gcc_assert (dwf_regno (XEXP (src, 1))
== cur_trace->cfa_temp.reg);
offset = cur_trace->cfa_temp.offset;
- break;
- default:
- gcc_unreachable ();
}
+ else if (!poly_int_rtx_p (XEXP (src, 1), &offset))
+ gcc_unreachable ();
if (XEXP (src, 0) == hard_frame_pointer_rtx)
{
@@ -1680,9 +1702,8 @@ dwarf2out_frame_debug_expr (rtx expr)
gcc_assert (frame_pointer_needed);
gcc_assert (REG_P (XEXP (src, 0))
- && dwf_regno (XEXP (src, 0)) == cur_cfa->reg
- && CONST_INT_P (XEXP (src, 1)));
- offset = INTVAL (XEXP (src, 1));
+ && dwf_regno (XEXP (src, 0)) == cur_cfa->reg);
+ offset = rtx_to_poly_int64 (XEXP (src, 1));
if (GET_CODE (src) != MINUS)
offset = -offset;
cur_cfa->offset += offset;
@@ -1695,11 +1716,11 @@ dwarf2out_frame_debug_expr (rtx expr)
/* Rule 4 */
if (REG_P (XEXP (src, 0))
&& dwf_regno (XEXP (src, 0)) == cur_cfa->reg
- && CONST_INT_P (XEXP (src, 1)))
+ && poly_int_rtx_p (XEXP (src, 1), &offset))
{
/* Setting a temporary CFA register that will be copied
into the FP later on. */
- offset = - INTVAL (XEXP (src, 1));
+ offset = -offset;
cur_cfa->offset += offset;
cur_cfa->reg = dwf_regno (dest);
/* Or used to save regs to the stack. */
@@ -1722,11 +1743,9 @@ dwarf2out_frame_debug_expr (rtx expr)
/* Rule 9 */
else if (GET_CODE (src) == LO_SUM
- && CONST_INT_P (XEXP (src, 1)))
- {
- cur_trace->cfa_temp.reg = dwf_regno (dest);
- cur_trace->cfa_temp.offset = INTVAL (XEXP (src, 1));
- }
+ && poly_int_rtx_p (XEXP (src, 1),
+ &cur_trace->cfa_temp.offset))
+ cur_trace->cfa_temp.reg = dwf_regno (dest);
else
gcc_unreachable ();
}
@@ -1734,8 +1753,9 @@ dwarf2out_frame_debug_expr (rtx expr)
/* Rule 6 */
case CONST_INT:
+ case POLY_INT_CST:
cur_trace->cfa_temp.reg = dwf_regno (dest);
- cur_trace->cfa_temp.offset = INTVAL (src);
+ cur_trace->cfa_temp.offset = rtx_to_poly_int64 (src);
break;
/* Rule 7 */
@@ -1745,7 +1765,11 @@ dwarf2out_frame_debug_expr (rtx expr)
&& CONST_INT_P (XEXP (src, 1)));
cur_trace->cfa_temp.reg = dwf_regno (dest);
- cur_trace->cfa_temp.offset |= INTVAL (XEXP (src, 1));
+ if (!can_ior_p (cur_trace->cfa_temp.offset, INTVAL (XEXP (src, 1)),
+ &cur_trace->cfa_temp.offset))
+ /* The target shouldn't generate this kind of CFI note if we
+ can't represent it. */
+ gcc_unreachable ();
break;
/* Skip over HIGH, assuming it will be followed by a LO_SUM,
@@ -1800,9 +1824,7 @@ dwarf2out_frame_debug_expr (rtx expr)
case PRE_MODIFY:
case POST_MODIFY:
/* We can't handle variable size modifications. */
- gcc_assert (GET_CODE (XEXP (XEXP (XEXP (dest, 0), 1), 1))
- == CONST_INT);
- offset = -INTVAL (XEXP (XEXP (XEXP (dest, 0), 1), 1));
+ offset = -rtx_to_poly_int64 (XEXP (XEXP (XEXP (dest, 0), 1), 1));
gcc_assert (REGNO (XEXP (XEXP (dest, 0), 0)) == STACK_POINTER_REGNUM
&& cur_trace->cfa_store.reg == dw_stack_pointer_regnum);
@@ -1860,9 +1882,8 @@ dwarf2out_frame_debug_expr (rtx expr)
{
unsigned int regno;
- gcc_assert (CONST_INT_P (XEXP (XEXP (dest, 0), 1))
- && REG_P (XEXP (XEXP (dest, 0), 0)));
- offset = INTVAL (XEXP (XEXP (dest, 0), 1));
+ gcc_assert (REG_P (XEXP (XEXP (dest, 0), 0)));
+ offset = rtx_to_poly_int64 (XEXP (XEXP (dest, 0), 1));
if (GET_CODE (XEXP (dest, 0)) == MINUS)
offset = -offset;
@@ -1923,7 +1944,7 @@ dwarf2out_frame_debug_expr (rtx expr)
{
/* We're storing the current CFA reg into the stack. */
- if (cur_cfa->offset == 0)
+ if (known_zero (cur_cfa->offset))
{
/* Rule 19 */
/* If stack is aligned, putting CFA reg into stack means
@@ -1981,7 +2002,7 @@ dwarf2out_frame_debug_expr (rtx expr)
{
/* We have a PARALLEL describing where the contents of SRC live.
Queue register saves for each piece of the PARALLEL. */
- HOST_WIDE_INT span_offset = offset;
+ poly_int64 span_offset = offset;
gcc_assert (GET_CODE (span) == PARALLEL);
@@ -2884,7 +2905,7 @@ create_pseudo_cfg (void)
initial_return_save (rtx rtl)
{
unsigned int reg = INVALID_REGNUM;
- HOST_WIDE_INT offset = 0;
+ poly_int64 offset = 0;
switch (GET_CODE (rtl))
{
@@ -2905,12 +2926,12 @@ initial_return_save (rtx rtl)
case PLUS:
gcc_assert (REGNO (XEXP (rtl, 0)) == STACK_POINTER_REGNUM);
- offset = INTVAL (XEXP (rtl, 1));
+ offset = rtx_to_poly_int64 (XEXP (rtl, 1));
break;
case MINUS:
gcc_assert (REGNO (XEXP (rtl, 0)) == STACK_POINTER_REGNUM);
- offset = -INTVAL (XEXP (rtl, 1));
+ offset = -rtx_to_poly_int64 (XEXP (rtl, 1));
break;
default:
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c 2017-10-23 17:16:50.362529357 +0100
+++ gcc/dwarf2out.c 2017-10-23 17:16:57.210604569 +0100
@@ -570,6 +570,9 @@ dw_cfi_oprnd2_desc (enum dwarf_call_fram
case DW_CFA_val_expression:
return dw_cfi_oprnd_loc;
+ case DW_CFA_def_cfa_expression:
+ return dw_cfi_oprnd_cfa_loc;
+
default:
return dw_cfi_oprnd_unused;
}
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [027/nnn] poly_int: DWARF CFA offsets
2017-10-23 17:11 ` [027/nnn] poly_int: DWARF CFA offsets Richard Sandiford
@ 2017-12-06 0:40 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-12-06 0:40 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:10 AM, Richard Sandiford wrote:
> This patch makes the DWARF code use poly_int64 rather than
> HOST_WIDE_INT for CFA offsets. The main changes are:
>
> - to make reg_save use a DW_CFA_expression representation when
> the offset isn't constant and
>
> - to record the CFA information alongside a def_cfa_expression
> if either offset is polynomial, since it's quite difficult
> to reconstruct the CFA information otherwise.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * gengtype.c (main): Handle poly_int64_pod.
> * dwarf2out.h (dw_cfi_oprnd_cfa_loc): New dw_cfi_oprnd_type.
> (dw_cfi_oprnd::dw_cfi_cfa_loc): New field.
> (dw_cfa_location::offset, dw_cfa_location::base_offset): Change
> from HOST_WIDE_INT to poly_int64_pod.
> * dwarf2cfi.c (queued_reg_save::cfa_offset): Likewise.
> (copy_cfa): New function.
> (lookup_cfa_1): Use the cached dw_cfi_cfa_loc, if it exists.
> (cfi_oprnd_equal_p): Handle dw_cfi_oprnd_cfa_loc.
> (cfa_equal_p, dwarf2out_frame_debug_adjust_cfa)
> (dwarf2out_frame_debug_cfa_offset, dwarf2out_frame_debug_expr)
> (initial_return_save): Treat offsets as poly_ints.
> (def_cfa_0): Likewise. Cache the CFA in dw_cfi_cfa_loc if either
> offset is nonconstant.
> (reg_save): Take the offset as a poly_int64. Fall back to
> DW_CFA_expression for nonconstant offsets.
> (queue_reg_save): Take the offset as a poly_int64.
> * dwarf2out.c (dw_cfi_oprnd2_desc): Handle DW_CFA_def_cfa_expression.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [026/nnn] poly_int: operand_subword
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (25 preceding siblings ...)
2017-10-23 17:11 ` [027/nnn] poly_int: DWARF CFA offsets Richard Sandiford
@ 2017-10-23 17:11 ` Richard Sandiford
2017-11-28 17:51 ` Jeff Law
2017-10-23 17:12 ` [028/nnn] poly_int: ipa_parm_adjustment Richard Sandiford
` (80 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:11 UTC (permalink / raw)
To: gcc-patches
This patch makes operand_subword and operand_subword_force take
polynomial offsets. This is a fairly old-school interface and
these days should only be used when splitting multiword operations
into word operations. It still doesn't hurt to support polynomial
offsets and it helps make callers easier to write.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* rtl.h (operand_subword, operand_subword_force): Take the offset
as a poly_uint64 an unsigned int.
* emit-rtl.c (operand_subword, operand_subword_force): Likewise.
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h 2017-10-23 17:16:50.374527737 +0100
+++ gcc/rtl.h 2017-10-23 17:16:55.754801166 +0100
@@ -3017,10 +3017,10 @@ extern rtx gen_lowpart_if_possible (mach
/* In emit-rtl.c */
extern rtx gen_highpart (machine_mode, rtx);
extern rtx gen_highpart_mode (machine_mode, machine_mode, rtx);
-extern rtx operand_subword (rtx, unsigned int, int, machine_mode);
+extern rtx operand_subword (rtx, poly_uint64, int, machine_mode);
/* In emit-rtl.c */
-extern rtx operand_subword_force (rtx, unsigned int, machine_mode);
+extern rtx operand_subword_force (rtx, poly_uint64, machine_mode);
extern int subreg_lowpart_p (const_rtx);
extern poly_uint64 subreg_size_lowpart_offset (poly_uint64, poly_uint64);
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c 2017-10-23 17:16:50.363529222 +0100
+++ gcc/emit-rtl.c 2017-10-23 17:16:55.754801166 +0100
@@ -1736,7 +1736,8 @@ subreg_lowpart_p (const_rtx x)
*/
rtx
-operand_subword (rtx op, unsigned int offset, int validate_address, machine_mode mode)
+operand_subword (rtx op, poly_uint64 offset, int validate_address,
+ machine_mode mode)
{
if (mode == VOIDmode)
mode = GET_MODE (op);
@@ -1745,12 +1746,12 @@ operand_subword (rtx op, unsigned int of
/* If OP is narrower than a word, fail. */
if (mode != BLKmode
- && (GET_MODE_SIZE (mode) < UNITS_PER_WORD))
+ && may_lt (GET_MODE_SIZE (mode), UNITS_PER_WORD))
return 0;
/* If we want a word outside OP, return zero. */
if (mode != BLKmode
- && (offset + 1) * UNITS_PER_WORD > GET_MODE_SIZE (mode))
+ && may_gt ((offset + 1) * UNITS_PER_WORD, GET_MODE_SIZE (mode)))
return const0_rtx;
/* Form a new MEM at the requested address. */
@@ -1784,7 +1785,7 @@ operand_subword (rtx op, unsigned int of
MODE is the mode of OP, in case it is CONST_INT. */
rtx
-operand_subword_force (rtx op, unsigned int offset, machine_mode mode)
+operand_subword_force (rtx op, poly_uint64 offset, machine_mode mode)
{
rtx result = operand_subword (op, offset, 1, mode);
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [026/nnn] poly_int: operand_subword
2017-10-23 17:11 ` [026/nnn] poly_int: operand_subword Richard Sandiford
@ 2017-11-28 17:51 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-28 17:51 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:10 AM, Richard Sandiford wrote:
> This patch makes operand_subword and operand_subword_force take
> polynomial offsets. This is a fairly old-school interface and
> these days should only be used when splitting multiword operations
> into word operations. It still doesn't hurt to support polynomial
> offsets and it helps make callers easier to write.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * rtl.h (operand_subword, operand_subword_force): Take the offset
> as a poly_uint64 an unsigned int.
> * emit-rtl.c (operand_subword, operand_subword_force): Likewise.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [028/nnn] poly_int: ipa_parm_adjustment
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (26 preceding siblings ...)
2017-10-23 17:11 ` [026/nnn] poly_int: operand_subword Richard Sandiford
@ 2017-10-23 17:12 ` Richard Sandiford
2017-11-28 17:47 ` Jeff Law
2017-10-23 17:12 ` [030/nnn] poly_int: get_addr_unit_base_and_extent Richard Sandiford
` (79 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:12 UTC (permalink / raw)
To: gcc-patches
This patch changes the type of ipa_parm_adjustment::offset from
HOST_WIDE_INT to poly_int64 and updates uses accordingly.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* ipa-prop.h (ipa_parm_adjustment::offset): Change from
HOST_WIDE_INT to poly_int64_pod.
* ipa-prop.c (ipa_modify_call_arguments): Track polynomail
parameter offsets.
Index: gcc/ipa-prop.h
===================================================================
--- gcc/ipa-prop.h 2017-10-23 17:07:40.959671257 +0100
+++ gcc/ipa-prop.h 2017-10-23 17:16:58.508429306 +0100
@@ -828,7 +828,7 @@ struct ipa_parm_adjustment
/* Offset into the original parameter (for the cases when the new parameter
is a component of an original one). */
- HOST_WIDE_INT offset;
+ poly_int64_pod offset;
/* Zero based index of the original parameter this one is based on. */
int base_index;
Index: gcc/ipa-prop.c
===================================================================
--- gcc/ipa-prop.c 2017-10-23 17:07:40.959671257 +0100
+++ gcc/ipa-prop.c 2017-10-23 17:16:58.507429441 +0100
@@ -4302,15 +4302,14 @@ ipa_modify_call_arguments (struct cgraph
simply taking the address of a reference inside the original
aggregate. */
- gcc_checking_assert (adj->offset % BITS_PER_UNIT == 0);
+ poly_int64 byte_offset = exact_div (adj->offset, BITS_PER_UNIT);
base = gimple_call_arg (stmt, adj->base_index);
loc = DECL_P (base) ? DECL_SOURCE_LOCATION (base)
: EXPR_LOCATION (base);
if (TREE_CODE (base) != ADDR_EXPR
&& POINTER_TYPE_P (TREE_TYPE (base)))
- off = build_int_cst (adj->alias_ptr_type,
- adj->offset / BITS_PER_UNIT);
+ off = build_int_cst (adj->alias_ptr_type, byte_offset);
else
{
HOST_WIDE_INT base_offset;
@@ -4330,8 +4329,7 @@ ipa_modify_call_arguments (struct cgraph
if (!base)
{
base = build_fold_addr_expr (prev_base);
- off = build_int_cst (adj->alias_ptr_type,
- adj->offset / BITS_PER_UNIT);
+ off = build_int_cst (adj->alias_ptr_type, byte_offset);
}
else if (TREE_CODE (base) == MEM_REF)
{
@@ -4341,8 +4339,7 @@ ipa_modify_call_arguments (struct cgraph
deref_align = TYPE_ALIGN (TREE_TYPE (base));
}
off = build_int_cst (adj->alias_ptr_type,
- base_offset
- + adj->offset / BITS_PER_UNIT);
+ base_offset + byte_offset);
off = int_const_binop (PLUS_EXPR, TREE_OPERAND (base, 1),
off);
base = TREE_OPERAND (base, 0);
@@ -4350,8 +4347,7 @@ ipa_modify_call_arguments (struct cgraph
else
{
off = build_int_cst (adj->alias_ptr_type,
- base_offset
- + adj->offset / BITS_PER_UNIT);
+ base_offset + byte_offset);
base = build_fold_addr_expr (base);
}
}
@@ -4602,7 +4598,7 @@ ipa_get_adjustment_candidate (tree **exp
struct ipa_parm_adjustment *adj = &adjustments[i];
if (adj->base == base
- && (adj->offset == offset || adj->op == IPA_PARM_OP_REMOVE))
+ && (must_eq (adj->offset, offset) || adj->op == IPA_PARM_OP_REMOVE))
{
cand = adj;
break;
@@ -4766,7 +4762,10 @@ ipa_dump_param_adjustments (FILE *file,
else if (adj->op == IPA_PARM_OP_REMOVE)
fprintf (file, ", remove_param");
else
- fprintf (file, ", offset %li", (long) adj->offset);
+ {
+ fprintf (file, ", offset ");
+ print_dec (adj->offset, file);
+ }
if (adj->by_ref)
fprintf (file, ", by_ref");
print_node_brief (file, ", type: ", adj->type, 0);
^ permalink raw reply [flat|nested] 302+ messages in thread
* Re: [028/nnn] poly_int: ipa_parm_adjustment
2017-10-23 17:12 ` [028/nnn] poly_int: ipa_parm_adjustment Richard Sandiford
@ 2017-11-28 17:47 ` Jeff Law
0 siblings, 0 replies; 302+ messages in thread
From: Jeff Law @ 2017-11-28 17:47 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 10/23/2017 11:11 AM, Richard Sandiford wrote:
> This patch changes the type of ipa_parm_adjustment::offset from
> HOST_WIDE_INT to poly_int64 and updates uses accordingly.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * ipa-prop.h (ipa_parm_adjustment::offset): Change from
> HOST_WIDE_INT to poly_int64_pod.
> * ipa-prop.c (ipa_modify_call_arguments): Track polynomail
> parameter offsets.
OK.
jeff
^ permalink raw reply [flat|nested] 302+ messages in thread
* [030/nnn] poly_int: get_addr_unit_base_and_extent
2017-10-23 16:57 [000/nnn] poly_int: representation of runtime offsets and sizes Richard Sandiford
` (27 preceding siblings ...)
2017-10-23 17:12 ` [028/nnn] poly_int: ipa_parm_adjustment Richard Sandiford
@ 2017-10-23 17:12 ` Richard Sandiford
2017-12-06 0:26 ` Jeff Law
2017-10-23 17:12 ` [029/nnn] poly_int: get_ref_base_and_extent Richard Sandiford
` (78 subsequent siblings)
107 siblings, 1 reply; 302+ messages in thread
From: Richard Sandiford @ 2017-10-23 17:12 UTC (permalink / raw)
To: gcc-patches
This patch changes the values returned by
get_addr_unit_base_and_extent from HOST_WIDE_INT to poly_int64.
maxsize in gimple_fold_builtin_memory_op goes from HOST_WIDE_INT
to poly_uint64 (rather than poly_int) to match the previous use
of tree_fits_uhwi_p.
2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-dfa.h (get_addr_base_and_unit_offset_1): Return the offset
as a poly_int64_pod rather than a HOST_WIDE_INT.
(get_addr_base_and_unit_offset): Likewise.
* tree-dfa.c (get_addr_base_and_unit_offset_1): Likewise.
(get_addr_base_and_unit_offset): Likewise.
* doc/match-and-simplify.texi: Change off from HOST_WIDE_INT
to poly_int64 in example.
* fold-const.c (fold_binary_loc): Update call to
get_addr_base_and_unit_offset.
* gimple-fold.c (gimple_fold_builtin_memory_op): Likewise.
(maybe_canonicalize_mem_ref_addr): Likewise.
(gimple_fold_stmt_to_constant_1): Likewise.
* ipa-prop.c (ipa_modify_call_arguments): Likewise.
* match.pd: Likewise.
* omp-low.c (lower_omp_target): Likewise.
* tree-sra.c (build_ref_for_offset): Likewise.
(build_debug_ref_for_model): Likewise.
* tree-ssa-address.c (maybe_fold_tmr): Likewise.
* tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Likewise.
* tree-ssa-ccp.c (optimize_memcpy): Likewise.
* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Likewise.
(constant_pointer_difference): Likewise.
* tree-ssa-loop-niter.c (expand_simple_operations): Likewise.
* tree-ssa-phiopt.c (jump_function_from_stmt): Likewise.
* tree-ssa-pre.c (create_component_ref_by_pieces_1): Likewise.
* tree-ssa-sccvn.c (vn_reference_fold_indirect): Likewise.
(vn_reference_maybe_forwprop_address, vn_reference_lookup_3): Likewise.
(set_ssa_val_to): Likewise.
* tree-ssa-strlen.c (get_addr_stridx, addr_stridxptr): Likewise.
* tree.c (build_simple_mem_ref_loc): Likewise.
Index: gcc/tree-dfa.h
===================================================================
--- gcc/tree-dfa.h 2017-10-23 17:16:59.705267681 +0100
+++ gcc/tree-dfa.h 2017-10-23 17:17:01.432034493 +0100
@@ -33,9 +33,9 @@ extern tree get_ref_base_and_extent (tre
poly_int64_pod *, bool *);
extern tree get_ref_base_and_extent_hwi (tree, HOST_WIDE_INT *,
HOST_WIDE_INT *, bool *);
-extern tree get_addr_base_and_unit_offset_1 (tree, HOST_WIDE_INT *,
+extern tree get_addr_base_and_unit_offset_1 (tree, poly_int64_pod *,
tree (*) (tree));
-extern tree get_addr_base_and_unit_offset (tree, HOST_WIDE_INT *);
+extern tree get_addr_base_and_unit_offset (tree, poly_int64_pod *);
extern bool stmt_references_abnormal_ssa_name (gimple *);
extern void replace_abnormal_ssa_names (gimple *);
extern void dump_enumerated_decls (FILE *, dump_flags_t);
Index: gcc/tree-dfa.c
===================================================================
--- gcc/tree-dfa.c 2017-10-23 17:16:59.705267681 +0100
+++ gcc/tree-dfa.c 2017-10-23 17:17:01.432034493 +0100
@@ -705,10 +705,10 @@ get_ref_base_and_extent_hwi (tree exp, H
its argument or a constant if the argument is known to be constant. */
tree
-get_addr_base_and_unit_offset_1 (tree exp, HOST_WIDE_INT *poffset,
+get_addr_base_and_unit_offset_1 (tree exp, poly_int64_pod *poffset,
tree (*valueize) (tree))
{
- HOST_WIDE_INT byte_offset = 0;
+ poly_int64 byte_offset = 0;
/* Compute cumulative byte-offset for nested component-refs and array-refs,
and find the ultimate containing object. */
@@ -718,10 +718,13 @@ get_addr_base_and_unit_offset_1 (tree ex
{
case BIT_FIELD_REF:
{
- HOST_WIDE_INT this_off = TREE_INT_CST_LOW (TREE_OPERAND (exp, 2));
- if (this_off % BITS_PER_UNIT)
+ poly_int64 this_byte_offset;
+ poly_uint64 this_bit_offset;
+ if (!poly_int_tree_p (TREE_OPERAND (exp, 2), &this_bit_offset)
+ || !multiple_p (this_bit_offset, BITS_PER_UNIT,
+ &this_byte_offset))
return NULL_TREE;
- byte_offset += this_off / BITS_PER_UNIT;
+ byte_offset += this_byte_offset;
}
break;
@@ -729,15 +732,14 @@ get_addr_base_and_unit_offset_1 (tree ex
{
tree field = TREE_OPERAND (exp, 1);
tree this_offset = component_ref_field_offset (exp);
- HOST_WIDE_INT hthis_offset;
+ poly_int64 hthis_offset;
if (!this_offset
- || TREE_CODE (this_offset) != INTEGER_CST
+ || !poly_int_tree_p (this_offset, &hthis_offset)
|| (TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field))
% BITS_PER_UNIT))
return NULL_TREE;
- hthis_offset = TREE_INT_CST_LOW (this_offset);
hthis_offset += (TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field))
/ BITS_PER_UNIT);
byte_offset += hthis_offset;
@@ -755,17 +757,18 @@ get_addr_base_and_unit_offset_1 (tree ex
index = (*valueize) (index);
/* If the resulting bit-offset is constant, track it. */
- if (TREE_CODE (index) == INTEGER_CST
+ if (poly_int_tree_p (index)
&& (low_bound = array_ref_low_bound (exp),
- TREE_CODE (low_bound) == INTEGER_CST)
+ poly_int_tree_p (low_bound))
&& (unit_size = array_ref_element_size (exp),
TREE_CODE (unit_size) == INTEGER_CST))
{
- offset_int woffset
- = wi::sext (wi::to_offset (index) - wi::to_offset (low_bound),
+ poly_offset_int woffset
+ = wi::sext (wi::to_poly_offset (index)
+ - wi::to_poly_offset (low_bound),
TYPE_PRECISION (TREE_TYPE (index)));
woffset *= wi::to_offset (unit_size);
- byte_offset += woffset.to_shwi ();
+ byte_offset += woffset.force_shwi ();
}
else
return NULL_TREE;
@@ -842,7 +845,7 @@ get_addr_base_and_unit_offset_1 (tree ex
is not BITS_PER_UNIT-aligned. */
tree
-get_addr_base_and_unit_offset (tree exp, HOST_WIDE_INT *poffset)
+get_addr_base_and_unit_offset (tree exp, poly_int64_pod *poffset)
{
return get_addr_base_and_unit_offset_1 (exp, poffset, NULL);
}
Index: gcc/doc/match-and-simplify.texi
===================================================================
--- gcc/doc/match-and-simplify.texi 2017-10-23 17:07:40.843798706 +0100
+++ gcc/doc/match-and-simplify.texi 2017-10-23 17:17:01.428035033 +0100
@@ -205,7 +205,7 @@ Captures can also be used for capturing
(pointer_plus (addr@@2 @@0) INTEGER_CST_P@@1)
(if (is_gimple_min_invariant (@@2)))
@{
- HOST_WIDE_INT off;
+ poly_int64 off;
tree base = get_addr_base_and_unit_offset (@@0, &off);
off += tree_to_uhwi (@@1);
/* Now with that we should be able to simply write
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c 2017-10-23 17:11:40.244945208 +0100
+++ gcc/fold-const.c 2017-10-23 17:17:01.429034898 +0100
@@ -9455,7 +9455,7 @@ fold_binary_loc (location_t loc,
&& handled_component_p (TREE_OPERAND (arg0, 0)))
{
tree base;
- HOST_WIDE_INT coffset;
+ poly_int64 coffset;
base = get_addr_base_and_unit_offset (TREE_OPERAND (arg0, 0),
&coffset);
if (!base)
Index: gcc/gimple-fold.c
===================================================================
--- gcc/gimple-fold.c 2017-10-23 17:16:59.703267951 +0100
+++ gcc/gimple-fold.c 2017-10-23 17:17:01.430034763 +0100
@@ -838,8 +838,8 @@ gimple_fold_builtin_memory_op (gimple_st
&& TREE_CODE (dest) == ADDR_EXPR)
{
tree src_base, dest_base, fn;
- HOST_WIDE_INT src_offset = 0, dest_offset = 0;
- HOST_WIDE_INT maxsize;
+ poly_int64 src_offset = 0, dest_offset = 0;
+ poly_uint64 maxsize;
srcvar = TREE_OPERAND (src, 0);
src_base = get_addr_base_and_unit_offset (srcvar, &src_offset);
@@ -850,16 +850,14 @@ gimple_fold_builtin_memory_op (gimple_st
&dest_offset);
if (dest_base == NULL)
dest_base = destvar;
- if (tree_fits_uhwi_p (len))
- maxsize = tree_to_uhwi (len);
- else
+ if (!poly_int_tree_p (len, &maxsize))
maxsize = -1;
if (SSA_VAR_P (src_base)
&& SSA_VAR_P (dest_base))
{
if (operand_equal_p (src_base, dest_base, 0)
- && ranges_overlap_p (src_offset, maxsize,
- dest_offset, maxsize))
+ && ranges_may_overlap_p (src_offset, maxsize,
+ dest_offset, maxsize))
return false;
}
else if (TREE_CODE (src_base) == MEM_REF
@@ -868,17 +866,12 @@ gimple_fold_builtin_memory_op (gimple_st
if (! operand_equal_p (TREE_OPERAND (src_base, 0),
TREE_OPERAND (dest_base, 0), 0))
return false;
- off