Re: paradoxical subreg problem

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Re: paradoxical subreg problem
@ 2002-01-28 13:25 Richard Kenner
  2002-01-28 13:46 ` David Edelsohn
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Kenner @ 2002-01-28 13:25 UTC (permalink / raw)
  To: law; +Cc: gcc

Let me rephrase: why are we genering this if we don't know what those
bits are?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 13:25 paradoxical subreg problem Richard Kenner
@ 2002-01-28 13:46 ` David Edelsohn
  2002-01-28 15:20   ` law
  0 siblings, 1 reply; 27+ messages in thread
From: David Edelsohn @ 2002-01-28 13:46 UTC (permalink / raw)
  To: Richard Kenner; +Cc: law, gcc

>>>>> Richard Kenner writes:

Richard> Let me rephrase: why are we genering this if we don't know what those
Richard> bits are?

	I agree with Kenner.  There are two things we can discuss: should
this ever be generated and should this be generated in this particular
instance causing problems.  Regardless whether this compare result can be
determined at compile-time or run-time, if the bits are undefined, why is
GCC generating an expression that relies on undefined bits?

David

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 13:46 ` David Edelsohn
@ 2002-01-28 15:20   ` law
  2002-01-28 15:50     ` Geoff Keating
  0 siblings, 1 reply; 27+ messages in thread
From: law @ 2002-01-28 15:20 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Kenner, gcc

In message <200201282037.PAA24126@makai.watson.ibm.com>, David Edelsohn writes:
 > >>>>> Richard Kenner writes:
 > 
 > Richard> Let me rephrase: why are we genering this if we don't know what tho
 > se
 > Richard> bits are?
 > 
 > 	I agree with Kenner.  There are two things we can discuss: should
 > this ever be generated and should this be generated in this particular
 > instance causing problems.  Regardless whether this compare result can be
 > determined at compile-time or run-time, if the bits are undefined, why is
 > GCC generating an expression that relies on undefined bits?
Kenner's initial claim was not that they were undefined, but that they were
bits we could pretend had any value that was interesting to us (don't care).
If you read the section on paradoxical subregs, this is what it implies.

I think we really need to clarify if the bits are "don't care" or "undefined";
once that's settled it shouldn't be terribly difficult to deal with combine
to make it follow those semantics.

I think the semantics should be "undefined".

jeff



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 15:20   ` law
@ 2002-01-28 15:50     ` Geoff Keating
  0 siblings, 0 replies; 27+ messages in thread
From: Geoff Keating @ 2002-01-28 15:50 UTC (permalink / raw)
  To: law; +Cc: gcc

law@redhat.com writes:

> In message <200201282037.PAA24126@makai.watson.ibm.com>, David Edelsohn writes:
>  > >>>>> Richard Kenner writes:
>  > 
>  > Richard> Let me rephrase: why are we genering this if we don't know what tho
>  > se
>  > Richard> bits are?
>  > 
>  > 	I agree with Kenner.  There are two things we can discuss: should
>  > this ever be generated and should this be generated in this particular
>  > instance causing problems.  Regardless whether this compare result can be
>  > determined at compile-time or run-time, if the bits are undefined, why is
>  > GCC generating an expression that relies on undefined bits?
> Kenner's initial claim was not that they were undefined, but that they were
> bits we could pretend had any value that was interesting to us (don't care).
> If you read the section on paradoxical subregs, this is what it implies.
> 
> I think we really need to clarify if the bits are "don't care" or "undefined";
> once that's settled it shouldn't be terribly difficult to deal with combine
> to make it follow those semantics.
> 
> I think the semantics should be "undefined".

I think you're using the word "undefined" in a confusing way.  You
probably mean "opaque".  If the expression (subreg:SI (mem:QI ...))
had no defined value in the high bits, then the generator of the
expression should know this, and so presumably the value of the bits
doesn't affect the computation and the rest of the compiler can pick
any convenient value.

In this particular case, though, IMO either way the bug is in the
generator of the RTL.  We know that the bits will actually be zero,
and so either we rely on this and a ZERO_EXTEND should be used, or we
don't care what the bits will be, in which case the SUBREG is valid no
matter which definition of it we choose.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 15:41 ` Jim Wilson
  2002-01-29  1:57   ` Jim Wilson
@ 2002-01-29 10:55   ` Richard Earnshaw
  1 sibling, 0 replies; 27+ messages in thread
From: Richard Earnshaw @ 2002-01-29 10:55 UTC (permalink / raw)
  To: Jim Wilson; +Cc: law, gcc, Richard.Earnshaw

> In article <16350.1012242676@porcupine.cygnus.com> you write:
> >Think very very carefully about the semantics of a paradoxical subreg.
> 
> A paradoxical subreg can have one of two different meanings, depending on
> context.
> 
> 1) The extra bits are don't care bits.  This usage can occur anywhere, and
>    is primarily used when manipulating values larger than the word size.
>    This perhaps requires that the operand be a register, but I'm not sure.
>    For instance, on a 32-bit target, given (subreg:DI (reg:SI 100)), the
>    extra 32-bits are don't care bits.
> 2) The extra bits are known to be zero or one, depending on whether loads
>    zero or sign extend by default.  This is only used for values smaller than
>    or equal to the the word size, and I think this also requires that the
>    operand be a MEM, but I'm not positive about the last condition.  For
>    instance, on a 32-bit target with zero-extending loads, given
>    (subreg:SI (mem:QI ...)), the extra 24-bits are known to be zero.  This
>    usage occurs only between combine and reload.
> 
> It has been this way for over a decade I'd say.
> 
> The first meaning came first, and has obvious uses.  I believe the second
> meaning arose because of our primarily CISC oriented view at the time.
> Since predicates know about subregs already, using subregs for zero/sign
> extended loads means that they would be accepted automatically.  If we
> added the explicit zero/sign extension operators, then a lot of predicates
> and patterns would have to be modified to allow zero/sign extension operators.
> Nowadays, the prevailing wisdom is that you should never create an insn
> that will require a reload, and thus it may be that the second type of
> paradoxical subreg is no longer useful.  We already have the problem that
> combine.c and recog.c have special code checking INSN_SCHEDULING to get rid
> of the second class of paradoxical subregs.  This code is unclean though,
> as we never should have created these paradoxical subregs on a load/store
> (RISC) machine to begin with.  If the patterns in the md file don't allow
> MEM, then recog should not be accepting (subreg (mem)) as a register operand.
> 
> There might be a second justification that this feature allows for better
> optimization, by exposing zero/sign extensions that would otherwise be
> hidden.  However, I don't think this is convincing.  Combine has code to
> keep track of sign-bit copies now, and we can do this just as well by using
> explicit zero_extend/sign_extend operators instead of using subregs.

Yep, that summarizes my understanding as well.

Most of this rubbish seems to come about because register_operand accepts 
subreg(mem) as a valid register.  This has always seemed daft to me, and 
indeed the arm back-end hardly ever uses register_operand because of this 
(it seems to generate worse code by forcing spills during reload that 
would otherwise have been caught during general register allocation).

R.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 14:53 ` law
@ 2002-01-29 10:31   ` Richard Earnshaw
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Earnshaw @ 2002-01-29 10:31 UTC (permalink / raw)
  To: law; +Cc: Richard Kenner, gcc, Richard.Earnshaw

> In message <10201281935.AA25716@vlsi1.ultra.nyu.edu>, Richard Kenner writes:
>  >     Don't assume you can break it into two expressions.  Consider the
>  >     expression as it stands (and as combine creates it).
>  > 
>  > Sure, but I'm trying to define what it means by comparison with
>  > two expressions.
>  > 
>  >     So with your assertions in mind are these two expresions equivalent?
>  > 
>  >     (and:SI (subreg:SI (mem:QI) 0) (const_int 255))
>  > 
>  >     (subreg:SI (mem:QI X) 0)
> 
> Are you sure?
> 
> Are the bits outside the mode of SUBREG_REG undefined or "don't care"?  That
> is the crux of the issue.
> 
> If those bits are undefined, then those expressions are not equivalent as
> the result of the first expression has 24 zeros in its high bits whereas
> the second expression has 24 undefined bits.  If those bits are "don't care"
> then the compiler can treat those expressions as equivalent.
> 

I was under the impression that we would only create paradoxical subregs 
of a mem if LOAD_EXTEND_OP returns either ZERO_EXTEND or SIGN_EXTEND for 
the mode of the mem (ie, we won't create such paradoxical subregs if 
LOAD_EXTEND_OP returns NIL).  That is, the operation is an alternate way 
of expressing the extend operation, because that is what will happen in 
practice.

Given the above, then on such machines

	(subreg:SI (mem:QI X) 0)

is defined for all 32 bits.

R.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 15:41 ` Jim Wilson
@ 2002-01-29  1:57   ` Jim Wilson
  2002-01-29 10:55   ` Richard Earnshaw
  1 sibling, 0 replies; 27+ messages in thread
From: Jim Wilson @ 2002-01-29  1:57 UTC (permalink / raw)
  To: gcc; +Cc: law

In article <200201282147.NAA19688@rtl.cygnus.com> you write:
>In article <16350.1012242676@porcupine.cygnus.com> you write:
>2) The extra bits are known to be zero or one, depending on whether loads
>   zero or sign extend by default.  This is only used for values smaller than
>   or equal to the the word size, and I think this also requires that the
>   operand be a MEM, but I'm not positive about the last condition.  For
>   instance, on a 32-bit target with zero-extending loads, given
>   (subreg:SI (mem:QI ...)), the extra 24-bits are known to be zero.  This
>   usage occurs only between combine and reload.

I left out an important piece.  This is only true if WORD_REGISTER_OPERATIONS
is defined.  If WORD_REGISTER_OPERATIONS is not defined, then the upper
24 bits of (subreg:SI (mem:QI ...)) are undefined.  A QImode load on a CISC
tends to modify the low 8-bits of the register, and leave the upper 24-bit
unchanged, which means they have no useful value.  If WORD_REGISTER_OPERATIONS
is defined, then the upper 24 bits of (subreg:SI (mem:QI ...)) are defined,
and are known to be zero or one depending on the value of LOAD_EXTEND_OP.
A QImode load on a RISC tends to set the entire 32-bits of a register, so we
want to use that knowledge to optimize away unnecessary zero/sign extend
instructions.  Except that these paradoxical subregs prevent us from
properly scheduling memory operations, so we try to avoid recognizing them when
INSN_SCHEDULING, which partially defeats the purpose of creating them when
WORD_REGISTER_OPERATIONS is true, since most any target that has one of
these defined has the other one defined too.  They still are created and
used inside combine though, and help with some simplications inside combine,
but I think we would get the same effect by using explicit zero/sign extend
operators.

Searching combine.c for WORD_REGISTER_OPERATIONS shows that it does some
interesting things, some of them in conjunction with LOAD_EXTEND_OP.

Curiously, I just noticed that the INSN_SCHEDULING code in combine.c looks
wrong.  It has:
          /* If *SPLIT is a paradoxical SUBREG, when we split it, it should
             be written as a ZERO_EXTEND.  */
          if (split_code == SUBREG && GET_CODE (SUBREG_REG (*split)) == MEM)
            SUBST (*split, gen_rtx_ZERO_EXTEND  (split_mode,
                                                 SUBREG_REG (*split)));
I think it should be using LOAD_EXTEND_OP instead of assuming a ZERO_EXTEND,
but that code has been there for about 10 years, so if it was wrong, we
should have noticed long ago.

>(eq (subreg:SI (mem/s:QI (plus:SI (reg:SI 3 %r3)
>                (const_int 15 [0xf])) 1) 0)
>    (mem/s:SI (plus:SI (reg:SI 3 %r3)
>            (const_int 12 [0xc])) 1))

If the word size is 32-bits or more, and WORD_REGISTER_OPERATIONS is defined,
and LOAD_EXTEND_OP is defined, then this is a 32-bit compare.  It can not
be simplified at compile time unless we know the values in the memory locations
at offsets 12, 13, and 14, and know that these values match the value of
the byte at offset 15 extended as per LOAD_EXTEND_OP.

>Can someone explain to me clearly what the semantics of a paradoxical
>subreg really are?

No, because the use of subregs has gotten too complicated over time.
They are used for more than one purpose, and have more than one meaning,
depending on the context, and depending on the value of some target macros.

Jim

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 11:36 law
  2002-01-28 11:50 ` Richard Henderson
  2002-01-28 12:15 ` Michael Matz
@ 2002-01-28 15:41 ` Jim Wilson
  2002-01-29  1:57   ` Jim Wilson
  2002-01-29 10:55   ` Richard Earnshaw
  2 siblings, 2 replies; 27+ messages in thread
From: Jim Wilson @ 2002-01-28 15:41 UTC (permalink / raw)
  To: law; +Cc: gcc

In article <16350.1012242676@porcupine.cygnus.com> you write:
>Think very very carefully about the semantics of a paradoxical subreg.

A paradoxical subreg can have one of two different meanings, depending on
context.

1) The extra bits are don't care bits.  This usage can occur anywhere, and
   is primarily used when manipulating values larger than the word size.
   This perhaps requires that the operand be a register, but I'm not sure.
   For instance, on a 32-bit target, given (subreg:DI (reg:SI 100)), the
   extra 32-bits are don't care bits.
2) The extra bits are known to be zero or one, depending on whether loads
   zero or sign extend by default.  This is only used for values smaller than
   or equal to the the word size, and I think this also requires that the
   operand be a MEM, but I'm not positive about the last condition.  For
   instance, on a 32-bit target with zero-extending loads, given
   (subreg:SI (mem:QI ...)), the extra 24-bits are known to be zero.  This
   usage occurs only between combine and reload.

It has been this way for over a decade I'd say.

The first meaning came first, and has obvious uses.  I believe the second
meaning arose because of our primarily CISC oriented view at the time.
Since predicates know about subregs already, using subregs for zero/sign
extended loads means that they would be accepted automatically.  If we
added the explicit zero/sign extension operators, then a lot of predicates
and patterns would have to be modified to allow zero/sign extension operators.
Nowadays, the prevailing wisdom is that you should never create an insn
that will require a reload, and thus it may be that the second type of
paradoxical subreg is no longer useful.  We already have the problem that
combine.c and recog.c have special code checking INSN_SCHEDULING to get rid
of the second class of paradoxical subregs.  This code is unclean though,
as we never should have created these paradoxical subregs on a load/store
(RISC) machine to begin with.  If the patterns in the md file don't allow
MEM, then recog should not be accepting (subreg (mem)) as a register operand.

There might be a second justification that this feature allows for better
optimization, by exposing zero/sign extensions that would otherwise be
hidden.  However, I don't think this is convincing.  Combine has code to
keep track of sign-bit copies now, and we can do this just as well by using
explicit zero_extend/sign_extend operators instead of using subregs.

Jim

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 15:21 Richard Kenner
@ 2002-01-28 15:41 ` law
  0 siblings, 0 replies; 27+ messages in thread
From: law @ 2002-01-28 15:41 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

In message <10201282114.AA26279@vlsi1.ultra.nyu.edu>, Richard Kenner writes:
 >     Kenner's initial claim was not that they were undefined, but that they
 >     were bits we could pretend had any value that was interesting to us
 >     (don't care).  If you read the section on paradoxical subregs, this is
 >     what it implies.
 > 
 > I must say I don't understand the distinction between those two cases.
In the don't care case, you can assume they have any value you want and
make optimizations based on that.  Going back to the original comparison:

(eq (subreg:SI (mem/s:QI (plus:SI (reg:SI 3 %r3)
                (const_int 15 [0xf])) 1) 0)
    (mem/s:SI (plus:SI (reg:SI 3 %r3)
            (const_int 12 [0xc])) 1))

The ability to assume that the upper 24 bits in the first operand have any
convenient value (don't care) allows you do equate both arms, creating
an expression that optimizes into a compile-time constant.  This is what
combine does right now.  [ ie, you are allowed to assume that the upper
bits in the first operand match the upper bits in the second operand].

If the upper 24 bits in the first operand are undefined, then we can make
*no* assumptions about what value they might have.  Thus we can't optimize
away the comparison.

Or to look at the second example.  Given these two expressions:

(and:SI (subreg:SI (mem:QI) 0) (const_int 255))

(subreg:SI (mem:QI X) 0)

If the semantics are "don't care", then we can assume the upper bits of
the second expression are all zero and the two expressions are equivalent
(ie, they can be used interchangably).

However, if the semantics are "undefined", then the second expression's
upper 24bits are undefined whereas the first expressions upper bits are known
to have the value 0.  Which means the expressions can not be used 
interchangably.

jeff

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
@ 2002-01-28 15:21 Richard Kenner
  2002-01-28 15:41 ` law
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Kenner @ 2002-01-28 15:21 UTC (permalink / raw)
  To: law; +Cc: gcc

    Kenner's initial claim was not that they were undefined, but that they
    were bits we could pretend had any value that was interesting to us
    (don't care).  If you read the section on paradoxical subregs, this is
    what it implies.

I must say I don't understand the distinction between those two cases.

But, as I said, I think the real issue here is why this is being emitted when
we don't know for sure what those bits are.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 12:18 Richard Kenner
  2002-01-28 12:47 ` law
@ 2002-01-28 14:53 ` law
  2002-01-29 10:31   ` Richard Earnshaw
  1 sibling, 1 reply; 27+ messages in thread
From: law @ 2002-01-28 14:53 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

In message <10201281935.AA25716@vlsi1.ultra.nyu.edu>, Richard Kenner writes:
 >     Don't assume you can break it into two expressions.  Consider the
 >     expression as it stands (and as combine creates it).
 > 
 > Sure, but I'm trying to define what it means by comparison with
 > two expressions.
 > 
 >     So with your assertions in mind are these two expresions equivalent?
 > 
 >     (and:SI (subreg:SI (mem:QI) 0) (const_int 255))
 > 
 >     (subreg:SI (mem:QI X) 0)

Are you sure?

Are the bits outside the mode of SUBREG_REG undefined or "don't care"?  That
is the crux of the issue.

If those bits are undefined, then those expressions are not equivalent as
the result of the first expression has 24 zeros in its high bits whereas
the second expression has 24 undefined bits.  If those bits are "don't care"
then the compiler can treat those expressions as equivalent.

jeff

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 13:03           ` law
  2002-01-28 13:18             ` Richard Henderson
@ 2002-01-28 14:25             ` Michael Matz
  1 sibling, 0 replies; 27+ messages in thread
From: Michael Matz @ 2002-01-28 14:25 UTC (permalink / raw)
  To: law; +Cc: Richard Henderson, gcc

Ho,

On Mon, 28 Jan 2002 law@redhat.com wrote:

> However, even if we do that I think we need to clarify precisely what
> the semantics of a paradoxical subreg really are.    Clearly there is
> some confusion.

Well, para. subregs introduce undefined bits.  From that everything
follows.  They can only be introduced if the code doesn't care about that
undefinedness (IMHO that is what the docu talks about).
If they are introduced there exists no knowledge about the undefined bits.
Under this rule alone para. subregs of mem's are not disallowed.  In fact
all are handled the same.  (mem subregs might be (and are) disallowed for
other reasons)

Btw. (and:SI (subreg:SI (reg:QI X) 0) 255) and (reg:QI X) are in that
definition equal.  The subreg first introduces undefinedness just to throw
it away again with the 'and'.  Problems only arise if that subreg is
copied to a (SImode) pseudo.  That pseudo still carries that partial
undefinedness property, which is what is problematic about them, because
we have no real way to note that.

Ciao,
Michael.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 13:03           ` law
@ 2002-01-28 13:18             ` Richard Henderson
  2002-01-28 14:25             ` Michael Matz
  1 sibling, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2002-01-28 13:18 UTC (permalink / raw)
  To: law; +Cc: gcc

On Mon, Jan 28, 2002 at 01:03:20PM -0700, law@redhat.com wrote:
> However, even if we do that I think we need to clarify precisely what
> the semantics of a paradoxical subreg really are.    Clearly there is
> some confusion.

With plain registers, I don't think there is any confusion -- the
upper bits are simply undefined.  With memories, there is confusion.
But I think we should "clarify" by disallowing them.

> [ Actually my preference would be to zap paradoxical subregs, but I
>   don't think we're in a position to do that right now. ]

Nope.


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 12:47 Richard Kenner
@ 2002-01-28 13:10 ` law
  0 siblings, 0 replies; 27+ messages in thread
From: law @ 2002-01-28 13:10 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

In message <10201281959.AA25855@vlsi1.ultra.nyu.edu>, Richard Kenner writes:
 > I think the code issue here is that if the bits are undefined, why is
 > this being generated?
Err, if the bits were undefined, then we can't make any assumptions about
their value.  Contrast that to how the compiler treats those bits -- it
treats them as "don't care, assume any useful value".

If the bits are undefined in the traditional sense, then this expression is
not a compile-time constant:

(eq (subreg:SI (mem/s:QI (plus:SI (reg:SI 3 %r3)
                (const_int 15 [0xf])) 1) 0)
    (mem/s:SI (plus:SI (reg:SI 3 %r3)
            (const_int 12 [0xc])) 1))


If the bits are undefined in the traditional sense, then these two expressions
are _NOT_ equivalent:

(and:SI (subreg:SI (mem:QI) 0) (const_int 255))

(subreg:SI (mem:QI X) 0)



jeff
jeff










^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 12:39         ` Richard Henderson
@ 2002-01-28 13:03           ` law
  2002-01-28 13:18             ` Richard Henderson
  2002-01-28 14:25             ` Michael Matz
  0 siblings, 2 replies; 27+ messages in thread
From: law @ 2002-01-28 13:03 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc

In message <20020128115045.A26034@redhat.com>, Richard Henderson writes:
 > On Mon, Jan 28, 2002 at 12:26:13PM -0700, law@redhat.com wrote:
 > >  > That would be a bug.  A failure to call simplify_subreg or
 > >  > simplify_gen_subreg.
 > > I disagree given the above docs.
 > 
 > Given the problem you're seeing, wouldn't it be better to 
 > disallow it anyway and fix the docs?  I can't imagine what
 > good you could do by allowing such a thing.
Possibly -- I don't even known why it's allowed in this case.  There
may be a good reason for it, then again, there may not.  My gut
instinct is to disallow it, given the semantic issues that arise.

However, even if we do that I think we need to clarify precisely what
the semantics of a paradoxical subreg really are.    Clearly there is
some confusion.

[ Actually my preference would be to zap paradoxical subregs, but I
  don't think we're in a position to do that right now. ]

jeff

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 12:18 Richard Kenner
@ 2002-01-28 12:47 ` law
  2002-01-28 14:53 ` law
  1 sibling, 0 replies; 27+ messages in thread
From: law @ 2002-01-28 12:47 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

In message <10201281935.AA25716@vlsi1.ultra.nyu.edu>, Richard Kenner writes:
 >     Don't assume you can break it into two expressions.  Consider the
 >     expression as it stands (and as combine creates it).
 > 
 > Sure, but I'm trying to define what it means by comparison with
 > two expressions.
 > 
 >     So with your assertions in mind are these two expresions equivalent?
 > 
 >     (and:SI (subreg:SI (mem:QI) 0) (const_int 255))
 > 
 >     (subreg:SI (mem:QI X) 0)
OK.  So let's go back to this expression:

(eq (subreg:SI (mem/s:QI (plus:SI (reg:SI 3 %r3)
                (const_int 15 [0xf])) 1) 0)
    (mem/s:SI (plus:SI (reg:SI 3 %r3)
            (const_int 12 [0xc])) 1))

Your claim is that the paradoxical subreg allows us to pretend the bits
are anything that is convenient for the optimizer.  I claim we can't make
that assumption.  Let's pretend the memory in question has the value
0x12345678.  Plug it in (remembering byte loads zero extend, big endian)
and the hardware actually performs the following comparison:

(eq (0x00000078) (0x12345678)

Which is false.

However, combine treats this as

(eq (0x12345678) (0x12345678)

Which is true and combine optimizes away the comparison.

So, something is inconsistent.

jeff

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
@ 2002-01-28 12:47 Richard Kenner
  2002-01-28 13:10 ` law
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Kenner @ 2002-01-28 12:47 UTC (permalink / raw)
  To: law; +Cc: gcc

I think the code issue here is that if the bits are undefined, why is
this being generated?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 12:09       ` law
@ 2002-01-28 12:39         ` Richard Henderson
  2002-01-28 13:03           ` law
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2002-01-28 12:39 UTC (permalink / raw)
  To: law; +Cc: gcc

On Mon, Jan 28, 2002 at 12:26:13PM -0700, law@redhat.com wrote:
>  > That would be a bug.  A failure to call simplify_subreg or
>  > simplify_gen_subreg.
> I disagree given the above docs.

Given the problem you're seeing, wouldn't it be better to 
disallow it anyway and fix the docs?  I can't imagine what
good you could do by allowing such a thing.


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
@ 2002-01-28 12:18 Richard Kenner
  2002-01-28 12:47 ` law
  2002-01-28 14:53 ` law
  0 siblings, 2 replies; 27+ messages in thread
From: Richard Kenner @ 2002-01-28 12:18 UTC (permalink / raw)
  To: law; +Cc: gcc

    Don't assume you can break it into two expressions.  Consider the
    expression as it stands (and as combine creates it).

Sure, but I'm trying to define what it means by comparison with
two expressions.

    So with your assertions in mind are these two expresions equivalent?

    (and:SI (subreg:SI (mem:QI) 0) (const_int 255))

    (subreg:SI (mem:QI X) 0)

I'd be inclined to say "yes".

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 11:36 law
  2002-01-28 11:50 ` Richard Henderson
@ 2002-01-28 12:15 ` Michael Matz
  2002-01-28 15:41 ` Jim Wilson
  2 siblings, 0 replies; 27+ messages in thread
From: Michael Matz @ 2002-01-28 12:15 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: gcc

Hi,

On Mon, 28 Jan 2002 law@redhat.com wrote:

> I'll start simple.  On a big endian machine, can this expression be optimized
> into true/false at compile time, or must it be run-time computed?
>
> (eq (subreg:SI (mem/s:QI (plus:SI (reg:SI 3 %r3)
>                 (const_int 15 [0xf])) 1) 0)
>     (mem/s:SI (plus:SI (reg:SI 3 %r3)
>             (const_int 12 [0xc])) 1))
>
> Think very very carefully about the semantics of a paradoxical subreg.
>
> According to my reading, the compiler is allowed to optimize the expression
> into (true) because the bits outside of QImode on the subreg are "don't care
> bits" -- meaning they can have any value that is convenient to us.
>
> Agree/Disagree?

Disagree.  Richard H. pointed out that paradoxical subreg of mem are
normally not allowed, so this shouldn't happen, and Richard K. later
transformed it into what reload would do to this, and concludes that this
optimization is valid.  I disagree to the latter for deeper reasons:

The non-touched bit's of paradoxical subregs (be they from mem or from
pseudos) are _not_ "don't care".  They are "we don't know", i.e.
undefined.  Any optimizations making use of knowledge of the content of
those bits must be invalid.  In some very constrained circumstances we
might relax that (e.g. when we know that byte-loading will zero-extend).
In those cases we of course shouldn't simply rely on
  (use (subreg:SI (reg:QI x) 0) ...) doing the right thing, but instead
make that knowledge explicit, e.g. by
  (use (zero_extend:SI (reg:QI x))

I.e. replace paradoxical subregs by known operations, and if we can't, due
to lack of knowledge or other reasons, inhibit any optimizations referring
to the undefined parts.  The documentation seems to be misleading: it
talks about forming subregs, when the code doesn't care about the other
bits.  If that is correct combine shouldn't have produced that subreg of
mem, because clearly we _do_ care about those other bits in the compare.
OTOH once such a subreg is produced we can't deduce anything anymore.

So this is twofold: may be it was too difficult to prove that we don't
care about those bits (cf. combine), so such subregs are produced in the
hope to get better code later.  Then we should deactivate optimizations
making use of knowledge about undefined bits (i.e. when we later see, that
we do care nontheless).

Or we could get rid of paradoxical subregs ;-)  They are so paradoxical.

Ciao,
Michael.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 12:03 Richard Kenner
@ 2002-01-28 12:15 ` law
  0 siblings, 0 replies; 27+ messages in thread
From: law @ 2002-01-28 12:15 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

In message <10201281902.AA25419@vlsi1.ultra.nyu.edu>, Richard Kenner writes:
 >     I'll start simple.  On a big endian machine, can this expression be
 >     optimized into true/false at compile time, or must it be run-time
 >     computed?
 > 
 >     (eq (subreg:SI (mem/s:QI (plus:SI (reg:SI 3 %r3)
 >                     (const_int 15 [0xf])) 1) 0)
 >         (mem/s:SI (plus:SI (reg:SI 3 %r3)
 >               (const_int 12 [0xc])) 1))
 > 
 > This is equivalent to:
 > 
 >     (set (reg:QI xx) (mem/s:QI (plus:SI (reg:SI 3 %r3)
 > 				        (const_int 15 [0xf])) 0))
 >     (eq (subreg:SI (reg:QI xx) 0)
 >         (mem/s:SI (plus:SI (reg:SI 3 %r3)
 > 	                   (const_int 12 [0xc])) 1))
 > 
 > 
 > on all machines.
Don't assume you can break it into two expressions.  Consider the
expression as it stands (and as combine creates it).

 >     According to my reading, the compiler is allowed to optimize the
 >     expression into (true) because the bits outside of QImode on the
 >     subreg are "don't care bits" -- meaning they can have any value that
 >     is convenient to us.
 > 
 >     Agree/Disagree?
 > 
 > I agree.
OK.

 > 
 >     Now consider if byte loads zero extend.  Does your answer change?  In
 >     the subreg arm, those "don't care" bits, have a well defined meaning --
 >     ie, we can't pretend they have whatever value is convenient for us.
 >     So, unless we have some more specific knowledge about the other arm,
 >     then this expression must be evaluated at runtime.
 > 
 > I disagree.  We "know" what they will be, but the undefined semantics still
 > holds.  So this can also be true.
OK.

So with your assertions in mind are these two expresions equivalent?

(and:SI (subreg:SI (mem:QI) 0) (const_int 255))

(subreg:SI (mem:QI X) 0)


jeff


















^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 12:05     ` Richard Henderson
@ 2002-01-28 12:09       ` law
  2002-01-28 12:39         ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: law @ 2002-01-28 12:09 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc

In message <20020128112330.A25997@redhat.com>, Richard Henderson writes:
 > On Mon, Jan 28, 2002 at 11:59:37AM -0700, law@redhat.com wrote:
 > > More correctly, they're allowed between combine and reload inclusively.
 > 
 > I don't think so.
The documentation has allowed this for years.  From rtl.texi:

@cindex combiner pass
@cindex reload pass
@cindex @code{subreg}, special reload handling
Between the combiner pass and the reload pass, it is possible to have a
paradoxical @code{subreg} which contains a @code{mem} instead of a
@code{reg} as its first operand.  After the reload pass, it is also
possible to have a non-paradoxical @code{subreg} which contains a
@code{mem}; this usually occurs when the @code{mem} is a stack slot
which replaced a pseudo register.

 > > And it's the combiner that is creating the paradoxical,
 > 
 > That would be a bug.  A failure to call simplify_subreg or
 > simplify_gen_subreg.
I disagree given the above docs.

jeff
 > 
 > 
 > r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 11:58   ` law
@ 2002-01-28 12:05     ` Richard Henderson
  2002-01-28 12:09       ` law
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2002-01-28 12:05 UTC (permalink / raw)
  To: law; +Cc: gcc

On Mon, Jan 28, 2002 at 11:59:37AM -0700, law@redhat.com wrote:
> More correctly, they're allowed between combine and reload inclusively.

I don't think so.

> And it's the combiner that is creating the paradoxical,

That would be a bug.  A failure to call simplify_subreg or
simplify_gen_subreg.


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re:  paradoxical subreg problem
@ 2002-01-28 12:03 Richard Kenner
  2002-01-28 12:15 ` law
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Kenner @ 2002-01-28 12:03 UTC (permalink / raw)
  To: law; +Cc: gcc

    I'll start simple.  On a big endian machine, can this expression be
    optimized into true/false at compile time, or must it be run-time
    computed?

    (eq (subreg:SI (mem/s:QI (plus:SI (reg:SI 3 %r3)
                    (const_int 15 [0xf])) 1) 0)
        (mem/s:SI (plus:SI (reg:SI 3 %r3)
              (const_int 12 [0xc])) 1))

This is equivalent to:

    (set (reg:QI xx) (mem/s:QI (plus:SI (reg:SI 3 %r3)
				        (const_int 15 [0xf])) 0))
    (eq (subreg:SI (reg:QI xx) 0)
        (mem/s:SI (plus:SI (reg:SI 3 %r3)
	                   (const_int 12 [0xc])) 1))


on all machines.

    According to my reading, the compiler is allowed to optimize the
    expression into (true) because the bits outside of QImode on the
    subreg are "don't care bits" -- meaning they can have any value that
    is convenient to us.

    Agree/Disagree?

I agree.

    Now consider if byte loads zero extend.  Does your answer change?  In the
    subreg arm, those "don't care" bits, have a well defined meaning -- ie, we
    can't pretend they have whatever value is convenient for us.  So, unless
    we have some more specific knowledge about the other arm, then this
    expression must be evaluated at runtime.

I disagree.  We "know" what they will be, but the undefined semantics still
holds.  So this can also be true.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 11:50 ` Richard Henderson
@ 2002-01-28 11:58   ` law
  2002-01-28 12:05     ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: law @ 2002-01-28 11:58 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc

In message <20020128105031.D25844@redhat.com>, Richard Henderson writes:
 > On Mon, Jan 28, 2002 at 11:31:16AM -0700, law@redhat.com wrote:
 > > I'll start simple.  On a big endian machine, can this expression be optimi
 > zed
 > > into true/false at compile time, or must it be run-time computed?
 > > 
 > > (eq (subreg:SI (mem/s:QI (plus:SI (reg:SI 3 %r3)
 > >                 (const_int 15 [0xf])) 1) 0)
 > >     (mem/s:SI (plus:SI (reg:SI 3 %r3)
 > >             (const_int 12 [0xc])) 1))
 > > 
 > > 
 > > Think very very carefully about the semantics of a paradoxical subreg.
 > 
 > I'll remind you that we should only have subregs of memory during
 > reload (while pseudos are being replaced), and even then we don't
 > create paradoxical subregs of memory.
 > 
 > So this is a "shouldn't happen"
More correctly, they're allowed between combine and reload inclusively.  And
it's the combiner that is creating the paradoxical, then incorrectly
optimizing away a necessary comparison.

jeff

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: paradoxical subreg problem
  2002-01-28 11:36 law
@ 2002-01-28 11:50 ` Richard Henderson
  2002-01-28 11:58   ` law
  2002-01-28 12:15 ` Michael Matz
  2002-01-28 15:41 ` Jim Wilson
  2 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2002-01-28 11:50 UTC (permalink / raw)
  To: law; +Cc: gcc

On Mon, Jan 28, 2002 at 11:31:16AM -0700, law@redhat.com wrote:
> I'll start simple.  On a big endian machine, can this expression be optimized
> into true/false at compile time, or must it be run-time computed?
> 
> (eq (subreg:SI (mem/s:QI (plus:SI (reg:SI 3 %r3)
>                 (const_int 15 [0xf])) 1) 0)
>     (mem/s:SI (plus:SI (reg:SI 3 %r3)
>             (const_int 12 [0xc])) 1))
> 
> 
> Think very very carefully about the semantics of a paradoxical subreg.

I'll remind you that we should only have subregs of memory during
reload (while pseudos are being replaced), and even then we don't
create paradoxical subregs of memory.

So this is a "shouldn't happen".


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* paradoxical subreg problem
@ 2002-01-28 11:36 law
  2002-01-28 11:50 ` Richard Henderson
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: law @ 2002-01-28 11:36 UTC (permalink / raw)
  To: gcc

What a rats nest.  This relates of PRs 5169, 5185 and 5264.

I'll start simple.  On a big endian machine, can this expression be optimized
into true/false at compile time, or must it be run-time computed?

(eq (subreg:SI (mem/s:QI (plus:SI (reg:SI 3 %r3)
                (const_int 15 [0xf])) 1) 0)
    (mem/s:SI (plus:SI (reg:SI 3 %r3)
            (const_int 12 [0xc])) 1))

Think very very carefully about the semantics of a paradoxical subreg.

According to my reading, the compiler is allowed to optimize the expression
into (true) because the bits outside of QImode on the subreg are "don't care
bits" -- meaning they can have any value that is convenient to us.

Agree/Disagree?

Now consider if byte loads zero extend.  Does your answer change?  In the
subreg arm, those "don't care" bits, have a well defined meaning -- ie, we
can't pretend they have whatever value is convenient for us.  So, unless
we have some more specific knowledge about the other arm, then this
expression must be evaluated at runtime.

As you can probably guess, the compiler evaluates the expression at compile
time, which is wrong.

Can someone explain to me clearly what the semantics of a paradoxical
subreg really are?

jeff

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2002-01-29 13:13 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-01-28 13:25 paradoxical subreg problem Richard Kenner
2002-01-28 13:46 ` David Edelsohn
2002-01-28 15:20   ` law
2002-01-28 15:50     ` Geoff Keating
  -- strict thread matches above, loose matches on Subject: below --
2002-01-28 15:21 Richard Kenner
2002-01-28 15:41 ` law
2002-01-28 12:47 Richard Kenner
2002-01-28 13:10 ` law
2002-01-28 12:18 Richard Kenner
2002-01-28 12:47 ` law
2002-01-28 14:53 ` law
2002-01-29 10:31   ` Richard Earnshaw
2002-01-28 12:03 Richard Kenner
2002-01-28 12:15 ` law
2002-01-28 11:36 law
2002-01-28 11:50 ` Richard Henderson
2002-01-28 11:58   ` law
2002-01-28 12:05     ` Richard Henderson
2002-01-28 12:09       ` law
2002-01-28 12:39         ` Richard Henderson
2002-01-28 13:03           ` law
2002-01-28 13:18             ` Richard Henderson
2002-01-28 14:25             ` Michael Matz
2002-01-28 12:15 ` Michael Matz
2002-01-28 15:41 ` Jim Wilson
2002-01-29  1:57   ` Jim Wilson
2002-01-29 10:55   ` Richard Earnshaw

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).