public inbox for gcc@gcc.gnu.org
* Optimization of bit field assignments
@ 2024-02-12 16:47 Hugh Gleaves
  2024-02-12 18:33 ` David Brown
  2024-02-13 11:48 ` Richard Biener
  0 siblings, 2 replies; 3+ messages in thread
From: Hugh Gleaves @ 2024-02-12 16:47 UTC
  To: gcc

I’m interested in whether it would be feasible to add an optimization that compacted assignments to multiple bit fields.

Today, suppose I have a 32-bit struct composed of several narrow bit fields and assign constants to three of them like this:

                ahb1_ptr->RCC.CFGR.MCO1_PRE = 7;
                ahb1_ptr->RCC.CFGR.I2SSC = 0;
                ahb1_ptr->RCC.CFGR.MCO1 = 3;

This generates code (on Arm) like this:

                ahb1_ptr->RCC.CFGR.MCO1_PRE = 7;
0x08000230  ldr.w r1, [r3, #2056]              @ 0x808
0x08000234  orr.w r1, r1, #117440512     @ 0x7000000
0x08000238  str.w r1, [r3, #2056]              @ 0x808
                ahb1_ptr->RCC.CFGR.I2SSC = 0;
0x0800023c  ldr.w r1, [r3, #2056]              @ 0x808
0x08000240  bfc r1, #23, #1
0x08000244  str.w r1, [r3, #2056]              @ 0x808
                ahb1_ptr->RCC.CFGR.MCO1 = 3;
0x08000248  ldr.w r1, [r3, #2056]              @ 0x808
0x0800024c  orr.w r1, r1, #6291456          @ 0x600000
0x08000250  str.w r1, [r3, #2056]              @ 0x808

It would be an improvement if the compiler analyzed these assignments, recognized that they all modify the same 32-bit datum, generated appropriate AND and OR bitmasks, applied them to the register value, and did just a single store at the end.

In other words, infer the equivalent of this:

                RCC->CFGR &= ~0x07E00000;
                RCC->CFGR |=    0x07600000;

This strikes me as very feasible: the compiler knows the offset and bit length of each field, so all of the information needed seems to be present.
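
For reference, the arithmetic behind those two constants can be sketched in plain C. The field positions and widths below are an assumption read off the masks in the disassembly above (MCO1 at bits 21..22, I2SSC at bit 23, MCO1_PRE at bits 24..26), and the helper names are hypothetical, not part of any vendor header.

```c
#include <stdint.h>

/* Assumed field positions, inferred from the masks in the listing above. */
enum {
    MCO1_POS     = 21, MCO1_WIDTH     = 2,
    I2SSC_POS    = 23, I2SSC_WIDTH    = 1,
    MCO1_PRE_POS = 24, MCO1_PRE_WIDTH = 3
};

/* Mask covering `width` bits starting at bit `pos` (width < 32). */
static uint32_t field_mask(unsigned pos, unsigned width)
{
    return ((UINT32_C(1) << width) - 1u) << pos;
}

/* All bits touched by the three assignments. */
static uint32_t cfgr_clear_mask(void)
{
    return field_mask(MCO1_POS, MCO1_WIDTH)
         | field_mask(I2SSC_POS, I2SSC_WIDTH)
         | field_mask(MCO1_PRE_POS, MCO1_PRE_WIDTH);
}

/* New field values shifted into place: MCO1_PRE = 7, I2SSC = 0, MCO1 = 3. */
static uint32_t cfgr_set_bits(void)
{
    return (UINT32_C(7) << MCO1_PRE_POS)
         | (UINT32_C(0) << I2SSC_POS)
         | (UINT32_C(3) << MCO1_POS);
}

/* One combined update: clear the touched fields, then OR in the new values. */
static uint32_t cfgr_update(uint32_t old)
{
    return (old & ~cfgr_clear_mask()) | cfgr_set_bits();
}
```

Under these assumptions the clear mask works out to 0x07E00000 and the set bits to 0x07600000, matching the two statements above.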

Thoughts…


* Re: Optimization of bit field assignments
From: David Brown @ 2024-02-12 18:33 UTC
  To: Hugh Gleaves, gcc

On 12/02/2024 17:47, Hugh Gleaves via Gcc wrote:
> It would be an improvement, if the compiler analyzed these assignments and realized they are all modifications to the same 32 bit datum, generate an appropriate OR and AND bitmask and then apply those to the register and do just a single store at the end.
> 
> In other words, infer the equivalent of this:
> 
>                  RCC->CFGR &= ~0x07E00000;
>                  RCC->CFGR |=    0x07600000;
> 
> This strikes me as very feasible, the compiler knows the offset and bit length of the sub fields so all of the information needed seems to be present.
> 
> Thoughts…
> 

In most such cases, the underlying definition of the structure (or the 
pointer to the structure) is volatile, because it is a hardware 
register.  The compiler cannot combine the register field settings, 
because volatile accesses must not be combined - precisely so that 
programmers can reliably control hardware.  It is normal to want to be 
sure that a particular bitfield is changed, and only after that will the 
next bitfield be changed, and so on.  Sometimes that means the result is 
slower than it would have to be - but this is much better than giving 
wrong results when the programmer needs the changes to be handled 
separately.

It is not uncommon for the bytes underlying a hardware register bitfield 
struct to be available directly as well, letting you do the bit 
manipulation in a local copy which you then write out in a single operation.
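
A minimal sketch of that local-copy approach follows. The struct layout, the union, and the function name are hypothetical; they are not taken from any vendor header. Bit-field ordering is implementation-defined, so the field order here assumes GCC's usual little-endian allocation starting at bit 0.

```c
#include <stdint.h>

/* Hypothetical register layout (assumes allocation from bit 0 upward). */
struct cfgr_bits {
    uint32_t low      : 21;  /* bits 0..20, not touched here */
    uint32_t MCO1     : 2;   /* bits 21..22 */
    uint32_t I2SSC    : 1;   /* bit 23 */
    uint32_t MCO1_PRE : 3;   /* bits 24..26 */
    uint32_t high     : 5;   /* bits 27..31, not touched here */
};

/* The same 32 bits, reachable as fields or as one raw word. */
union cfgr_reg {
    struct cfgr_bits bits;
    uint32_t raw;
};

static void cfgr_configure(volatile union cfgr_reg *reg)
{
    union cfgr_reg tmp;        /* ordinary, non-volatile local copy */

    tmp.raw = reg->raw;        /* one volatile read of the register */
    tmp.bits.MCO1_PRE = 7;     /* field updates happen in the local copy, */
    tmp.bits.I2SSC = 0;        /* so the compiler is free to combine them */
    tmp.bits.MCO1 = 3;
    reg->raw = tmp.raw;        /* one volatile write back */
}
```

The register itself is accessed exactly twice, and the programmer, not the compiler, has chosen to update all three fields in a single write.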


* Re: Optimization of bit field assignments
From: Richard Biener @ 2024-02-13 11:48 UTC
  To: Hugh Gleaves; +Cc: gcc

On Mon, Feb 12, 2024 at 5:49 PM Hugh Gleaves via Gcc <gcc@gcc.gnu.org> wrote:
> It would be an improvement, if the compiler analyzed these assignments and realized they are all modifications to the same 32 bit datum, generate an appropriate OR and AND bitmask and then apply those to the register and do just a single store at the end.
>
> In other words, infer the equivalent of this:
>
>                 RCC->CFGR &= ~0x07E00000;
>                 RCC->CFGR |=    0x07600000;
>
> This strikes me as very feasible, the compiler knows the offset and bit length of the sub fields so all of the information needed seems to be present.

There is the store-merging pass which should already do this when
constraints allow.

Richard.
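
One way to see the effect of the constraints mentioned in this thread is to compile the same three assignments against a non-volatile and a volatile object and compare the assembly, for example with `gcc -O2 -S` (GCC's `-fstore-merging` option is enabled at `-O2`). The struct and function names below are hypothetical, and the layout assumes bit-field allocation from bit 0 upward. Whether the non-volatile version collapses into a single read-modify-write depends on the GCC version and target, so this is only a test case to inspect, not a statement of what the compiler will emit.

```c
#include <stdint.h>

/* Hypothetical register layout (assumes allocation from bit 0 upward). */
struct cfgr_bits {
    uint32_t low      : 21;  /* bits 0..20 */
    uint32_t MCO1     : 2;   /* bits 21..22 */
    uint32_t I2SSC    : 1;   /* bit 23 */
    uint32_t MCO1_PRE : 3;   /* bits 24..26 */
    uint32_t high     : 5;   /* bits 27..31 */
};

/* Non-volatile object: the compiler is permitted to combine these stores. */
void cfgr_set_plain(struct cfgr_bits *p)
{
    p->MCO1_PRE = 7;
    p->I2SSC = 0;
    p->MCO1 = 3;
}

/* Volatile object: each assignment stays a separate access,
   as in the disassembly quoted earlier in the thread. */
void cfgr_set_volatile(volatile struct cfgr_bits *p)
{
    p->MCO1_PRE = 7;
    p->I2SSC = 0;
    p->MCO1 = 3;
}
```

Both functions leave the object in the same final state; only the number and order of memory accesses may differ.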

