public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Fix PR rtl-optimization/65067
@ 2015-03-05  7:10 Bernd Edlinger
  2015-03-05  8:53 ` Richard Biener
  0 siblings, 1 reply; 8+ messages in thread
From: Bernd Edlinger @ 2015-03-05  7:10 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 806 bytes --]

Hi,

on ARM we have a code quality regression because of the strict volatile
bitfield handling.  The reason is that the current implementation jumps
directly to store_fixed_bit_field_1, which emits a sequence of and/or/shift
expressions.  That sequence turns out to be too complex for combine to
recognize the opportunity to use a "bfi" instruction.

But if -fno-strict-volatile-bitfields is used, store_bit_field can use the
EP_insv code pattern, which results in "bfi" instructions.
The only problem is that store_bit_field is free to use _any_ possible
access mode.  But if we load the value into a register first, we can safely
use store_bit_field on the register and move the result back.
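
To make the intended code shape concrete, here is a hand-written C-level
sketch (illustration only, not compiler output; the function name and the
field layout are made up to mirror the attached test case):

#include <stdint.h>

/* Hypothetical example: store a 3-bit field at bit offset 1 inside a
   volatile 32-bit device register.  The patched strict-volatile path
   expands this as one aligned SImode load, a bit insertion on the
   register copy (which the ARM backend can match to "bfi"), and one
   aligned SImode store.  */
static void
set_mode_rmw (volatile uint32_t *word, uint32_t mode)
{
  uint32_t tmp = *word;               /* single SImode load       */
  tmp &= ~(UINT32_C (7) << 1);        /* clear the 3-bit field    */
  tmp |= (mode & 7U) << 1;            /* insert the new value     */
  *word = tmp;                        /* single SImode store back */
}

This is the shape the attached gcc.target/arm/pr65067.c test expects; it
scans the assembler output for "bfi".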


Bootstrapped and regression-tested on Cortex-M3.

OK for trunk?


Thanks
Bernd.

[-- Attachment #2: changelog-volatile-bitfields-2.txt --]
[-- Type: text/plain, Size: 296 bytes --]

gcc:
2015-03-05  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	PR rtl-optimization/65067
	* expmed.c (store_bit_field, extract_bit_field): Reworked the
	strict volatile bitfield handling.

testsuite:
2015-03-05  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* gcc.target/arm/pr65067.c: New test.

[-- Attachment #3: patch-volatile-bitfields-2.diff --]
[-- Type: application/octet-stream, Size: 3196 bytes --]

Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c	(revision 221161)
+++ gcc/expmed.c	(working copy)
@@ -976,7 +976,7 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I
       /* Storing any naturally aligned field can be done with a simple
 	 store.  For targets that support fast unaligned memory, any
 	 naturally sized, unit aligned field can be done directly.  */
-      if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, fieldmode))
+      if (bitsize == GET_MODE_BITSIZE (fieldmode))
 	{
 	  str_rtx = adjust_bitfield_address (str_rtx, fieldmode,
 					     bitnum / BITS_PER_UNIT);
@@ -984,12 +984,16 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I
 	}
       else
 	{
+	  rtx temp;
+
 	  str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
 					  &bitnum);
-	  /* Explicitly override the C/C++ memory model; ignore the
-	     bit range so that we can do the access in the mode mandated
-	     by -fstrict-volatile-bitfields instead.  */
-	  store_fixed_bit_field_1 (str_rtx, bitsize, bitnum, value);
+	  temp = copy_to_reg (str_rtx);
+	  if (!store_bit_field_1 (temp, bitsize, bitnum, 0, 0,
+				  fieldmode, value, true))
+	    gcc_unreachable ();
+
+	  emit_move_insn (str_rtx, temp);
 	}
 
       return;
@@ -1786,24 +1790,20 @@ extract_bit_field (rtx str_rtx, unsigned HOST_WIDE
 
   if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, mode1, 0, 0))
     {
-      rtx result;
-
       /* Extraction of a full MODE1 value can be done with a load as long as
 	 the field is on a byte boundary and is sufficiently aligned.  */
-      if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, mode1))
-	result = adjust_bitfield_address (str_rtx, mode1,
-					  bitnum / BITS_PER_UNIT);
-      else
+      if (bitsize == GET_MODE_BITSIZE(mode1))
 	{
-	  str_rtx = narrow_bit_field_mem (str_rtx, mode1, bitsize, bitnum,
-					  &bitnum);
-	  result = extract_fixed_bit_field_1 (mode, str_rtx, bitsize, bitnum,
-					      target, unsignedp);
+	  rtx result = adjust_bitfield_address (str_rtx, mode1,
+						bitnum / BITS_PER_UNIT);
+	  return convert_extracted_bit_field (result, mode, tmode, unsignedp);
 	}
 
-      return convert_extracted_bit_field (result, mode, tmode, unsignedp);
+      str_rtx = narrow_bit_field_mem (str_rtx, mode1, bitsize, bitnum,
+				      &bitnum);
+      str_rtx = copy_to_reg (str_rtx);
     }
-  
+
   return extract_bit_field_1 (str_rtx, bitsize, bitnum, unsignedp,
 			      target, mode, tmode, true);
 }
Index: gcc/testsuite/gcc.target/arm/pr65067.c
===================================================================
--- gcc/testsuite/gcc.target/arm/pr65067.c	(revision 0)
+++ gcc/testsuite/gcc.target/arm/pr65067.c	(working copy)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-mthumb -mcpu=cortex-m3 -O2" } */
+
+struct tmp {
+ unsigned int dummy;
+ union {
+  struct {
+   unsigned int xyz : 1;
+   unsigned int mode: 3;
+   unsigned int res : 28;
+  } bf;
+  unsigned int wordval;
+ } reg;
+};
+
+void set_mode(int mode)
+{
+ volatile struct tmp *t = (struct tmp *) 0x1000;
+ t->reg.bf.mode = mode;
+}
+
+/* { dg-final { scan-assembler "bfi" } } */


* Re: [PATCH] Fix PR rtl-optimization/65067
  2015-03-05  7:10 [PATCH] Fix PR rtl-optimization/65067 Bernd Edlinger
@ 2015-03-05  8:53 ` Richard Biener
  2015-03-05  9:03   ` Bernd Edlinger
  2015-03-06  9:12   ` Eric Botcazou
  0 siblings, 2 replies; 8+ messages in thread
From: Richard Biener @ 2015-03-05  8:53 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: gcc-patches, Eric Botcazou

On Thu, Mar 5, 2015 at 8:10 AM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
> Hi,
>
> on ARM we have a code quality regression because of the strict volatile
> bitfield handling.  The reason is that the current implementation jumps
> directly to store_fixed_bit_field_1, which emits a sequence of and/or/shift
> expressions.  That sequence turns out to be too complex for combine to
> recognize the opportunity to use a "bfi" instruction.
>
> But if -fno-strict-volatile-bitfields is used, store_bit_field can use the
> EP_insv code pattern, which results in "bfi" instructions.
> The only problem is that store_bit_field is free to use _any_ possible
> access mode.  But if we load the value into a register first, we can safely
> use store_bit_field on the register and move the result back.
>
>
> Bootstrapped and regression-tested on Cortex-M3.
>
> OK for trunk?

Hmm.  As you also modify the no-strict-volatile-bitfield path I'm not sure
you don't regress the case where EP_insv can work on memory.  I agree
that simplifying the strict-volatile-bitfield path to extract the memory
within strict-volatile-bitfield constraints to a reg and then using the regular
path is a good thing.

Eric?

Thanks,
Richard.

>
> Thanks
> Bernd.
>


* RE: [PATCH] Fix PR rtl-optimization/65067
  2015-03-05  8:53 ` Richard Biener
@ 2015-03-05  9:03   ` Bernd Edlinger
  2015-03-05  9:51     ` Richard Biener
  2015-03-06  9:12   ` Eric Botcazou
  1 sibling, 1 reply; 8+ messages in thread
From: Bernd Edlinger @ 2015-03-05  9:03 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Eric Botcazou

Hi,

On Thu, 5 Mar 2015 09:52:54, Richard Biener wrote:
>
> On Thu, Mar 5, 2015 at 8:10 AM, Bernd Edlinger
> <bernd.edlinger@hotmail.de> wrote:
>> Hi,
>>
>> on ARM we have a code quality regression because of the strict volatile
>> bitfield handling. The reason is that the current implementation jumps
>> directly to store_fixed_bit_field_1, which emits a sequence of and/or/shift
>> expressions. That sequence turns out to be too complex for combine to
>> recognize the opportunity to use a "bfi" instruction.
>>
>> But if -fno-strict-volatile-bitfields is used, store_bit_field can use the
>> EP_insv code pattern, which results in "bfi" instructions.
>> The only problem is that store_bit_field is free to use _any_ possible
>> access mode. But if we load the value into a register first, we can safely
>> use store_bit_field on the register and move the result back.
>>
>>
>> Bootstrapped and regression-tested on Cortex-M3.
>>
>> OK for trunk?
>
> Hmm. As you also modify the no-strict-volatile-bitfield path I'm not sure
> you don't regress the case where EP_insv can work on memory. I agree
> that simplifying the strict-volatile-bitfield path to extract the memory
> within strict-volatile-bitfield constraints to a reg and then using the regular
> path is a good thing.
>

I tried _not_ to touch the no-strict-volatile-bitfield path.
Where did you see that?

Thanks
Bernd.

> Eric?
>
> Thanks,
> Richard.
>
>>
>> Thanks
>> Bernd.
>>


* Re: [PATCH] Fix PR rtl-optimization/65067
  2015-03-05  9:03   ` Bernd Edlinger
@ 2015-03-05  9:51     ` Richard Biener
  0 siblings, 0 replies; 8+ messages in thread
From: Richard Biener @ 2015-03-05  9:51 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: gcc-patches, Eric Botcazou

On Thu, Mar 5, 2015 at 10:03 AM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
> Hi,
>
> On Thu, 5 Mar 2015 09:52:54, Richard Biener wrote:
>>
>> On Thu, Mar 5, 2015 at 8:10 AM, Bernd Edlinger
>> <bernd.edlinger@hotmail.de> wrote:
>>> Hi,
>>>
>>> on ARM we have a code quality regression because of the strict volatile
>>> bitfield handling. The reason is that the current implementation jumps
>>> directly to store_fixed_bit_field_1, which emits a sequence of and/or/shift
>>> expressions. That sequence turns out to be too complex for combine to
>>> recognize the opportunity to use a "bfi" instruction.
>>>
>>> But if -fno-strict-volatile-bitfields is used, store_bit_field can use the
>>> EP_insv code pattern, which results in "bfi" instructions.
>>> The only problem is that store_bit_field is free to use _any_ possible
>>> access mode. But if we load the value into a register first, we can safely
>>> use store_bit_field on the register and move the result back.
>>>
>>>
>>> Bootstrapped and regression-tested on Cortex-M3.
>>>
>>> OK for trunk?
>>
>> Hmm. As you also modify the no-strict-volatile-bitfield path I'm not sure
>> you don't regress the case where EP_insv can work on memory. I agree
>> that simplifying the strict-volatile-bitfield path to extract the memory
>> within strict-volatile-bitfield constraints to a reg and then using the regular
>> path is a good thing.
>>
>
> I tried _not_ to touch the no-strict-volatile-bitfield path.
> Where did you see that?

You changed store_bit_field - ah, sorry, context missing in the patch.  So yes,
the patch is ok.

Thanks,
Richard.

> Thanks
> Bernd.
>
>> Eric?
>>
>> Thanks,
>> Richard.
>>
>>>
>>> Thanks
>>> Bernd.
>>>
>


* Re: [PATCH] Fix PR rtl-optimization/65067
  2015-03-05  8:53 ` Richard Biener
  2015-03-05  9:03   ` Bernd Edlinger
@ 2015-03-06  9:12   ` Eric Botcazou
  2015-03-06  9:42     ` Bernd Edlinger
  1 sibling, 1 reply; 8+ messages in thread
From: Eric Botcazou @ 2015-03-06  9:12 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Bernd Edlinger

> Hmm.  As you also modify the no-strict-volatile-bitfield path I'm not sure
> you don't regress the case where EP_insv can work on memory.  I agree
> that simplifying the strict-volatile-bitfield path to extract the memory
> within strict-volatile-bitfield constraints to a reg and then using the
> regular path is a good thing.
> 
> Eric?

Even if the no-strict-volatile-bitfield path isn't touched, I don't understand

@@ -976,7 +976,7 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I
       /* Storing any naturally aligned field can be done with a simple
 	 store.  For targets that support fast unaligned memory, any
 	 naturally sized, unit aligned field can be done directly.  */
-      if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, fieldmode))
+      if (bitsize == GET_MODE_BITSIZE (fieldmode))
 	{
 	  str_rtx = adjust_bitfield_address (str_rtx, fieldmode,
 					     bitnum / BITS_PER_UNIT);


     {
-      rtx result;
-
       /* Extraction of a full MODE1 value can be done with a load as long as
 	 the field is on a byte boundary and is sufficiently aligned.  */
-      if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, mode1))
-	result = adjust_bitfield_address (str_rtx, mode1,
-					  bitnum / BITS_PER_UNIT);
-      else
+      if (bitsize == GET_MODE_BITSIZE(mode1))
 	{
-	  str_rtx = narrow_bit_field_mem (str_rtx, mode1, bitsize, bitnum,
-					  &bitnum);
-	  result = extract_fixed_bit_field_1 (mode, str_rtx, bitsize, bitnum,
-					      target, unsignedp);
+	  rtx result = adjust_bitfield_address (str_rtx, mode1,
+						bitnum / BITS_PER_UNIT);


adjust_bitfield_address takes (bitnum / BITS_PER_UNIT) so don't you need to 
make sure that bitnum % BITS_PER_UNIT == 0?

In any case, the comments are now totally out of sync.

-- 
Eric Botcazou


* RE: [PATCH] Fix PR rtl-optimization/65067
  2015-03-06  9:12   ` Eric Botcazou
@ 2015-03-06  9:42     ` Bernd Edlinger
  2015-03-06  9:54       ` Eric Botcazou
  0 siblings, 1 reply; 8+ messages in thread
From: Bernd Edlinger @ 2015-03-06  9:42 UTC (permalink / raw)
  To: Eric Botcazou, Richard Biener; +Cc: gcc-patches

Hi,

On Fri, 6 Mar 2015 10:09:30, Eric Botcazou wrote:
>
>> Hmm. As you also modify the no-strict-volatile-bitfield path I'm not sure
>> you don't regress the case where EP_insv can work on memory. I agree
>> that simplifying the strict-volatile-bitfield path to extract the memory
>> within strict-volatile-bitfield constraints to a reg and then using the
>> regular path is a good thing.
>>
>> Eric?
>
> Even if the no-strict-volatile-bitfield path isn't touched, I don't understand
>
> @@ -976,7 +976,7 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I
> /* Storing any naturally aligned field can be done with a simple
> store. For targets that support fast unaligned memory, any
> naturally sized, unit aligned field can be done directly. */
> - if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, fieldmode))
> + if (bitsize == GET_MODE_BITSIZE (fieldmode))
> {
> str_rtx = adjust_bitfield_address (str_rtx, fieldmode,
> bitnum / BITS_PER_UNIT);
>
>
> {
> - rtx result;
> -
> /* Extraction of a full MODE1 value can be done with a load as long as
> the field is on a byte boundary and is sufficiently aligned. */
> - if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, mode1))
> - result = adjust_bitfield_address (str_rtx, mode1,
> - bitnum / BITS_PER_UNIT);
> - else
> + if (bitsize == GET_MODE_BITSIZE(mode1))
> {
> - str_rtx = narrow_bit_field_mem (str_rtx, mode1, bitsize, bitnum,
> - &bitnum);
> - result = extract_fixed_bit_field_1 (mode, str_rtx, bitsize, bitnum,
> - target, unsignedp);
> + rtx result = adjust_bitfield_address (str_rtx, mode1,
> + bitnum / BITS_PER_UNIT);
>
>
> adjust_bitfield_address takes (bitnum / BITS_PER_UNIT) so don't you need to
> make sure that bitnum % BITS_PER_UNIT == 0?
>

I know it because strict_volatile_bitfield_p checks this:
   /* Check for cases of unaligned fields that must be split.  */
   if (bitnum % BITS_PER_UNIT + bitsize > modesize

Therefore, if bitsize == modesize, we know that bitnum % BITS_PER_UNIT
must be zero.
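
Spelled out as a tiny standalone check (illustration only, not code from
the patch; a byte-addressable target with BITS_PER_UNIT == 8 is assumed):

#include <assert.h>

/* strict_volatile_bitfield_p only accepts fields with
   bitnum % BITS_PER_UNIT + bitsize <= modesize, so once
   bitsize == modesize the byte remainder can only be zero.  */
static void
check_byte_alignment (unsigned int bitnum, unsigned int bitsize,
                      unsigned int modesize)
{
  const unsigned int bits_per_unit = 8;  /* assumed BITS_PER_UNIT */

  /* Precondition established by strict_volatile_bitfield_p.  */
  assert (bitnum % bits_per_unit + bitsize <= modesize);

  if (bitsize == modesize)
    /* bitnum % bits_per_unit + modesize <= modesize
       forces the remainder to be zero.  */
    assert (bitnum % bits_per_unit == 0);
}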


> In any case, the comments are now totally out of sync.
>
> --
> Eric Botcazou


* Re: [PATCH] Fix PR rtl-optimization/65067
  2015-03-06  9:42     ` Bernd Edlinger
@ 2015-03-06  9:54       ` Eric Botcazou
  2015-03-06 10:04         ` Bernd Edlinger
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Botcazou @ 2015-03-06  9:54 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: gcc-patches, Richard Biener

> I know it because strict_volatile_bitfield_p checks this:
>    /* Check for cases of unaligned fields that must be split.  */
>    if (bitnum % BITS_PER_UNIT + bitsize > modesize
> 
> Therefore, if bitsize == modesize, we know that bitnum % BITS_PER_UNIT
> must be zero.

Isn't that precisely the condition you're trying to relax in the other change?

-- 
Eric Botcazou


* RE: [PATCH] Fix PR rtl-optimization/65067
  2015-03-06  9:54       ` Eric Botcazou
@ 2015-03-06 10:04         ` Bernd Edlinger
  0 siblings, 0 replies; 8+ messages in thread
From: Bernd Edlinger @ 2015-03-06 10:04 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Richard Biener

On Fri, 6 Mar 2015 10:52:53, Eric Botcazou wrote:
>
>> I know it because strict_volatile_bitfield_p checks this:
>> /* Check for cases of unaligned fields that must be split. */
>> if (bitnum % BITS_PER_UNIT + bitsize > modesize
>>
>> Therefore, if bitsize == modesize, we know that bitnum % BITS_PER_UNIT
>> must be zero.
>
> Isn't that precisely the condition you're trying to relax in the other change?
>


Yes, but modesize is a multiple of BITS_PER_UNIT,
therefore:

bitnum % modesize + bitsize > modesize

implies

bitnum % BITS_PER_UNIT + bitsize > modesize


I agree that the comments are still out of sync.


Thanks
Bernd.


end of thread, other threads:[~2015-03-06 10:04 UTC | newest]

Thread overview: 8+ messages
2015-03-05  7:10 [PATCH] Fix PR rtl-optimization/65067 Bernd Edlinger
2015-03-05  8:53 ` Richard Biener
2015-03-05  9:03   ` Bernd Edlinger
2015-03-05  9:51     ` Richard Biener
2015-03-06  9:12   ` Eric Botcazou
2015-03-06  9:42     ` Bernd Edlinger
2015-03-06  9:54       ` Eric Botcazou
2015-03-06 10:04         ` Bernd Edlinger
