public inbox for gcc-help@gcc.gnu.org
* Re: Fw: Possible missed optimization opportunity with const?
       [not found] <848378447.3130483.1472699334626.ref@mail.yahoo.com>
@ 2016-09-01  3:10 ` Toshi Morita
  2016-09-01  6:14   ` Kalle Olavi Niemitalo
                     ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Toshi Morita @ 2016-09-01  3:10 UTC (permalink / raw)
  To: fweimer; +Cc: gcc-help



Florian Weimer wrote:

>> On 08/31/2016 10:57 AM, Toshi Morita wrote: 
>> 
>> However, if the definition of pfoo is changed to: const int * const pfoo = (const int * const 0x1234);
>> the optimization seems to fail:
> 
> The optimization is not valid in this case because the compiler cannot know that the object was declared const.
> It could well be mutable.

Sorry, that should be:

const int * const pfoo = (const int * const)0x1234;

So assuming this is still wrong, what is the correct way to define a pointer to a hardware register at 0x1234 which contains immutable data? I'm missing something here.

Toshi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Fw: Possible missed optimization opportunity with const?
  2016-09-01  3:10 ` Fw: Possible missed optimization opportunity with const? Toshi Morita
@ 2016-09-01  6:14   ` Kalle Olavi Niemitalo
  2016-09-01  7:19     ` Toshi Morita
  2016-09-01  9:10   ` Florian Weimer
  2016-09-01  9:38   ` David Brown
  2 siblings, 1 reply; 13+ messages in thread
From: Kalle Olavi Niemitalo @ 2016-09-01  6:14 UTC (permalink / raw)
  To: Toshi Morita; +Cc: gcc-help

Toshi Morita <tm314159@yahoo.com> writes:

> what is the correct way to define a pointer to a hardware
> register at 0x1234 which contains immutable data?

It seems "extern const int foo;" with "-Wl,--defsym,foo=0x1234"
almost works, except GCC then uses a RIP-relative reference on
amd64 and that can cause a relocation overflow.

I also tried wrapping the read in an __attribute__((const))
function, which means the return value does not depend on
global memory:

int __attribute__((const)) getfoo() 
{
  return *(const int *) 0x1234;
}

but GCC emitted two reads anyhow.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Fw: Possible missed optimization opportunity with const?
  2016-09-01  6:14   ` Kalle Olavi Niemitalo
@ 2016-09-01  7:19     ` Toshi Morita
  0 siblings, 0 replies; 13+ messages in thread
From: Toshi Morita @ 2016-09-01  7:19 UTC (permalink / raw)
  To: Kalle Olavi Niemitalo; +Cc: gcc-help

Kalle wrote:

> It seems "extern const int foo;" with "-Wl,--defsym,foo=0x1234" almost works, except GCC then uses a
> RIP-relative reference on amd64 and that can cause a relocation overflow.
> I also tried wrapping the read in an __attribute__((const))
> function, which means the return value does not depend on
> global memory:
> 
> int __attribute__((const)) getfoo() 
> {
>   return *(const int *) 0x1234;
> }
>
> but GCC emitted two reads anyhow.

Yep. It should not be this hard to do this.

Toshi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Fw: Possible missed optimization opportunity with const?
  2016-09-01  3:10 ` Fw: Possible missed optimization opportunity with const? Toshi Morita
  2016-09-01  6:14   ` Kalle Olavi Niemitalo
@ 2016-09-01  9:10   ` Florian Weimer
  2016-09-01  9:38   ` David Brown
  2 siblings, 0 replies; 13+ messages in thread
From: Florian Weimer @ 2016-09-01  9:10 UTC (permalink / raw)
  To: Toshi Morita; +Cc: gcc-help

On 09/01/2016 05:08 AM, Toshi Morita wrote:
>
>
> Florian Weimer wrote:
>
>>> On 08/31/2016 10:57 AM, Toshi Morita wrote:
>>>
>>> However, if the definition of pfoo is changed to: const int * const pfoo = (const int * const 0x1234);
>>> the optimization seems to fail:
>>
>> The optimization is not valid in this case because the compiler cannot know that the object was declared const.
>> It could well be mutable.
>
> Sorry, that should be:
>
> const int * const pfoo = (const int * const)0x1234;

Yes, I assumed so, it does not make a difference.

> So assuming this is still wrong, what is the correct way to define a pointer to a hardware register at 0x1234 which contains immutable data? I'm missing something here.

You need to use the original code, with the declaration of foo, and tell 
the assembler or linker to place foo at an absolute address.  This is 
rather target-dependent.
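
A minimal sketch of that arrangement, assuming the `-Wl,--defsym,foo=0x1234` option mentioned earlier in this thread is used on the target. The `TARGET_BUILD` macro, the stand-in value 7, and the `read_twice` helper are hypothetical and exist only so the sketch also builds and runs on a host:

```c
#ifdef TARGET_BUILD
/* Declaration only: the linker supplies the address, for example with
   gcc -O2 file.c -Wl,--defsym,foo=0x1234 (option taken from this thread). */
extern const int foo;
#else
/* Host stand-in (an assumption for this sketch) so it links and runs. */
const int foo = 7;
#endif

static int bar_calls;

/* Placeholder for the opaque call between the two reads. */
void bar(void)
{
    bar_calls++;
}

/* foo is declared as a const object, so the compiler is allowed to
   reuse the first read instead of loading it again after bar(). */
int read_twice(void)
{
    int a = foo;
    bar();
    int b = foo;
    return a + b;
}
```

Whether the symbol can actually be resolved at that address depends on the target and code model, as noted for amd64 elsewhere in the thread.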

Florian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Fw: Possible missed optimization opportunity with const?
  2016-09-01  3:10 ` Fw: Possible missed optimization opportunity with const? Toshi Morita
  2016-09-01  6:14   ` Kalle Olavi Niemitalo
  2016-09-01  9:10   ` Florian Weimer
@ 2016-09-01  9:38   ` David Brown
  2016-09-01 18:15     ` Toshi Morita
  2 siblings, 1 reply; 13+ messages in thread
From: David Brown @ 2016-09-01  9:38 UTC (permalink / raw)
  To: Toshi Morita, fweimer; +Cc: gcc-help

On 01/09/16 05:08, Toshi Morita wrote:
> 
> 
> Florian Weimer wrote:
> 
>>> On 08/31/2016 10:57 AM, Toshi Morita wrote:
>>> 
>>> However, if the definition of pfoo is changed to: const int *
>>> const pfoo = (const int * const 0x1234); the optimization seems
>>> to fail:
>> 
>> The optimization is not valid in this case because the compiler
>> cannot know that the object was declared const. It could well be
>> mutable.
> 
> Sorry, that should be:
> 
> const int * const pfoo = (const int * const)0x1234;
> 
> So assuming this is still wrong, what is the correct way to define a
> pointer to a hardware register at 0x1234 which contains immutable
> data? I'm missing something here.
> 
> Toshi
> 

If it is a hardware register, you normally want to put a "volatile"
there so that you get exactly the number of reads you want.  And if you
only want to read it once, then read it only once.  Sometimes it is
easier to write things in clear and simple code, rather than hoping the
compiler will optimise the unnecessary source code.
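
As a rough sketch of "read it only once": the register is accessed through a volatile pointer and the value is kept in a local. The `fake_reg` variable is a hypothetical stand-in so this runs on a host; on real hardware the pointer would come from a cast such as `(volatile const uint32_t *)0x1234`.

```c
#include <stdint.h>

/* Host stand-in for a memory-mapped register (assumption for this sketch). */
static volatile const uint32_t fake_reg = 42u;

static unsigned calls;

/* Placeholder for the opaque call between the two uses. */
static void bar(void)
{
    calls++;
}

/* Reads the register exactly once and reuses the value after bar(). */
uint32_t read_once_sum(volatile const uint32_t *reg)
{
    uint32_t a = *reg;  /* the only access to the register */
    bar();
    uint32_t b = a;     /* reuse the local copy, no second access */
    return a + b;
}
```

Because the access is volatile, the compiler performs exactly the one read written in the source, regardless of optimisation level.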

It would be nice if there was a good way to tell the compiler that a
particular object exists at a particular address.  Expressions like the
one you gave, or placement new, give the compiler a reference or pointer
to such an object, but you can't define characteristics of the object
(such as telling the compiler that the object itself is const).

The AVR port of gcc lets you write:

const int foo __attribute__((address(0x1234)));

Maybe it would be possible to make this a common __attribute__ in gcc?

(As an embedded programmer who works with gcc on a number of different
targets, it is always annoying to see a useful attribute, builtin, or
extension on one port but not have it available on the other ports!)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Fw: Possible missed optimization opportunity with const?
  2016-09-01  9:38   ` David Brown
@ 2016-09-01 18:15     ` Toshi Morita
  2016-09-01 19:05       ` David Brown
  0 siblings, 1 reply; 13+ messages in thread
From: Toshi Morita @ 2016-09-01 18:15 UTC (permalink / raw)
  To: David Brown, fweimer; +Cc: gcc-help

David Brown wrote:

> If it is a hardware register, you normally want to put a "volatile"
> there so that you get exactly the number of reads you want. And if you
> only want to read it once, then read it only once. Sometimes it is
> easier to write things in clear and simple code, rather than hoping the
> compiler will optimise the unnecessary source code.

There are many cases where there are values at hardware addresses which are const.

One example is a hardware configuration register. Many modern microcontrollers
are fabricated with more interfaces than can be physically bonded out in a particular
packaging. For example, a microcontroller die may have 4 USB interfaces, 4 SPI,
and 4 I2C ports. However, this die may be placed in a packaging where there is only
space for some of the signals to be exposed. So a particular chip might have
2 USB interfaces, 2 SPI interfaces, and 2 I2C ports. Another chip in the same family
might have 1 USB interface, 3 SPI interfaces, and 2 I2C ports. So there may be a static
hardware configuration register which describes the number of interfaces of each type
for that particular chip.


Another example is mask-programmed ROM or EPROM. These devices are usually fairly slow
with access times ranging from 125 ns to 450 ns. Since these devices are so slow,
it is preferable to avoid unnecessary reads to the device. The values
in a mask-programmed ROM cannot be changed, and the values in an EPROM can only
be changed when the device is powered off and exposed to UV light for several minutes,
so the values are effectively const and do not change at runtime.

Toshi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Fw: Possible missed optimization opportunity with const?
  2016-09-01 18:15     ` Toshi Morita
@ 2016-09-01 19:05       ` David Brown
  2016-09-01 19:14         ` Toshi Morita
  0 siblings, 1 reply; 13+ messages in thread
From: David Brown @ 2016-09-01 19:05 UTC (permalink / raw)
  To: Toshi Morita, fweimer; +Cc: gcc-help

On 01/09/16 20:11, Toshi Morita wrote:
> David Brown wrote:
>
>> If it is a hardware register, you normally want to put a
>> "volatile" there so that you get exactly the number of reads you
>> want. And if you only want to read it once, then read it only once.
>> Sometimes it is easier to write things in clear and simple code,
>> rather than hoping the compiler will optimise the unnecessary
>> source code.
>
> There are many cases where there are values at hardware addresses
> which are const.
>
> One example is a hardware configuration register. Many modern
> microcontrollers are fabricated with more interfaces than can be
> physically bonded out in a particular packaging. For example, a
> microcontroller die may have 4 USB interfaces, 4 SPI, and 4 I2C
> ports. However, this die may be placed in a packaging where there is
> only space for some of the signals to be exposed. So a particular
> chip might have 2 USB interfaces, 2 SPI interfaces, and 2 I2C ports.
> Another chip in the same family might have 1 USB interface, 3 SPI
> interfaces, and 2 I2C ports. So there may be a static hardware
> configuration register which describes the number of interfaces of
> each type for that particular chip.

Yes, I know - I work with such devices myself.  But I'm having trouble 
imagining why you would be likely to read those hardware registers more 
than once (typically during system setup), and also why it would be a 
problem if you did read them twice.  Registers like this are usually no 
more than a couple of clock cycles to access - similar to reading 
on-chip SRAM.  At worst, your system startup might be 50 nanoseconds 
longer than necessary.

It would be a different matter if this were in a tight critical loop, or 
if reading the register had some special effect (such as is sometimes 
the case with hardware registers).

>
>
> Another example is mask-programmed ROM or EPROM. These devices are
> usually fairly slow with access times ranging from 125 ns to 450 ns.
> Since these devices are so slow, it is preferable to avoid unnecessary
> reads to the device. The values in a mask-programmed ROM cannot be
> changed, and the values in an EPROM can only be changed when the
> device is powered off and exposed to UV light for several minutes, so
> the values are effectively const and do not change at runtime.
>

Usually you can treat such data as const, that's true.  But again, I am 
having a hard time understanding why it might matter if you read the 
data twice (it's not /that/ slow - especially in comparison to calling a 
function in between reads), and why you can't simply store the first 
read in a local variable in cases where it /does/ matter.  And in 
microcontrollers that are fast enough for this to matter, you've 
probably got a cache that can be used.

I'm all in favour of better optimisations, but this looks like a lot of 
effort (for either the gcc developers, or for the programmer) for a 
negligible gain.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Fw: Possible missed optimization opportunity with const?
  2016-09-01 19:05       ` David Brown
@ 2016-09-01 19:14         ` Toshi Morita
  2016-09-01 19:19           ` Florian Weimer
  2016-09-01 19:33           ` David Brown
  0 siblings, 2 replies; 13+ messages in thread
From: Toshi Morita @ 2016-09-01 19:14 UTC (permalink / raw)
  To: David Brown, fweimer; +Cc: gcc-help



David Brown wrote:
>> Another example is mask-programmed ROM or EPROM. These devices are
>> usually fairly slow with access times ranging from 125 ns to 450 ns.
>> Since these devices are so slow, it is preferable to avoid unnecessary
>> reads to the device. The values in a mask-programmed ROM cannot be
>> changed, and the values in an EPROM can only be changed when the
>> device is powered off and exposed to UV light for several minutes, so
>> the values are effectively const and do not change at runtime.

> Usually you can treat such data as const, that's true.  But again, I am 
> having a hard time understanding why it might matter if you read the 
> data twice (it's not /that/ slow - especially in comparison to calling a 
> function in between reads), and why you can't simply store the first 
> read in a local variable in cases where it /does/ matter.  And in 
> microcontrollers that are fast enough for this to matter, you've
> probably got a cache that can be used.

I've worked on one platform with a dual ARM7 running at 100 MHz and no cache.
ROMs had a cycle time of 450 ns, and the processor had a cycle time of 10 ns.
So it's 45 clock cycles to access ROM.

> I'm all in favour of better optimisations, but this looks like a lot of
> effort (for either the gcc developers, or for the programmer) for a
> negligible gain.

If you look at my original message, you will see that this optimization is
already implemented and works for the case where pfoo = &foo.

It fails when foo is replaced with a (const int * const) address.

I suspect when the C front-end parses (const int * const)0x1234, it fails
to apply the const_tree attribute to the subtree, and this is why the
optimization is failing.

Toshi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Fw: Possible missed optimization opportunity with const?
  2016-09-01 19:14         ` Toshi Morita
@ 2016-09-01 19:19           ` Florian Weimer
  2016-09-01 19:33           ` David Brown
  1 sibling, 0 replies; 13+ messages in thread
From: Florian Weimer @ 2016-09-01 19:19 UTC (permalink / raw)
  To: Toshi Morita, David Brown; +Cc: gcc-help

On 09/01/2016 09:11 PM, Toshi Morita wrote:

> I suspect when the C front-end parses (const int * const)0x1234, it fails
> to apply the const_tree attribute to the subtree, and this is why the
> optimization is failing.

The optimization is not failing—it results in *invalid code* if the 
compiler cannot observe that the pointer points to a const object.  For 
pointers, const does not mean immutable.

Florian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Fw: Possible missed optimization opportunity with const?
  2016-09-01 19:14         ` Toshi Morita
  2016-09-01 19:19           ` Florian Weimer
@ 2016-09-01 19:33           ` David Brown
  2016-09-01 20:50             ` Toshi Morita
  1 sibling, 1 reply; 13+ messages in thread
From: David Brown @ 2016-09-01 19:33 UTC (permalink / raw)
  To: Toshi Morita, fweimer; +Cc: gcc-help

On 01/09/16 21:11, Toshi Morita wrote:
>
>
> David Brown wrote:
>>> Another example is mask-programmed ROM or EPROM. These devices are
>>> usually fairly slow with access times ranging from 125 ns to 450 ns.
>>> Since these devices are so slow, it is preferable to avoid unnecessary
>>> reads to the device. The values in a mask-programmed ROM cannot be
>>> changed, and the values in an EPROM can only be changed when the
>>> device is powered off and exposed to UV light for several minutes, so
>>> the values are effectively const and do not change at runtime.
>
>> Usually you can treat such data as const, that's true.  But again, I am
>> having a hard time understanding why it might matter if you read the
>> data twice (it's not /that/ slow - especially in comparison to calling a
>> function in between reads), and why you can't simply store the first
>> read in a local variable in cases where it /does/ matter.  And in
>> microcontrollers that are fast enough for this to matter, you've
>> probably got a cache that can be used.
>
> I've worked on one platform with a dual ARM7 running at 100 MHz and no cache.
> ROMs had a cycle time of 450 ns, and the processor has a cycle time of 10 ns.
> So it's 45 clock cycles to access ROM.

In cases like that, you can do your caching manually.  Just read the rom 
value once, and keep the value in a local variable - then it will stay 
in a local register.  If you have a lot to read and re-use, copy a block 
to SRAM.  This is something that the compiler could never do on its own 
- you have to write it in the code.
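
A small sketch of the block-copy idea, assuming a table of 16-bit values. The names `rom_table` and `ram_table` and the table contents are hypothetical; here the "ROM" is an ordinary const array so the sketch runs on a host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_LEN 4u

/* Stand-in for data that would live in slow ROM on the target. */
static const uint16_t rom_table[TABLE_LEN] = { 10u, 20u, 30u, 40u };

/* Copy held in fast SRAM. */
static uint16_t ram_table[TABLE_LEN];

/* Pay the slow ROM access cost once, at startup. */
void cache_rom_table(void)
{
    memcpy(ram_table, rom_table, sizeof ram_table);
}

/* Later code reads only the SRAM copy. */
uint32_t sum_cached_table(void)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < TABLE_LEN; i++) {
        sum += ram_table[i];
    }
    return sum;
}
```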

>> I'm all in favour of better optimisations, but this looks like a lot of
>> effort (for either the gcc developers, or for the programmer) for a
>> negligible gain.
>
> If you look at my original message, you will see that this optimization is
> already implemented and works for the case where pfoo = &foo.
>
> It fails when foo is replaced with a (const int * const) address.
>
> I suspect when the C front-end parses (const int * const)0x1234, it fails
> to apply the const_tree attribute to the subtree, and this is why the
> optimization is failing.
>

This was already explained to you by Florian - the compiler cannot make 
the optimisation you want, because it does not know that the object at 
address 0x1234 is truly constant.  The "const" in the pointer here can 
only tell the compiler that /you/ will not try to change its value. 
This is different from the case where you define pfoo from the address 
of foo, because you have defined foo to be constant - now the compiler 
knows it cannot change, and can optimise accordingly.
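
To illustrate the distinction, here is a small sketch in which a pointer-to-const refers to an object that is not itself const. The names `storage` and `observe` are made up for this example. The program is valid C, and the second read must see the new value, so the compiler cannot merge the two reads:

```c
/* The object itself is NOT const, only the view through pfoo is. */
static int storage = 1;
static const int *const pfoo = &storage;

/* Legally modifies the object behind the pointer-to-const. */
static void bar(void)
{
    storage = 2;
}

/* Returns the first read in the tens digit and the second in the units. */
int observe(void)
{
    storage = 1;
    int a = *pfoo;  /* reads 1 */
    bar();
    int b = *pfoo;  /* must be re-read: reads 2 */
    return a * 10 + b;
}
```

With a cast from an integer address, the compiler has no more information about the pointed-to object than it has about `storage` here.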

To be able to get the kind of optimisation you are asking for, you would 
need to be able to tell the compiler that the thing pfoo points to is 
really constant - that would need a new __attribute__ or similar 
extension.  Alternatively, you would need a new __attribute__ to let you 
declare a const foo at a specific address (this might be easier, since 
the relevant __attribute__ already exists for avr-gcc).


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Fw: Possible missed optimization opportunity with const?
  2016-09-01 19:33           ` David Brown
@ 2016-09-01 20:50             ` Toshi Morita
  0 siblings, 0 replies; 13+ messages in thread
From: Toshi Morita @ 2016-09-01 20:50 UTC (permalink / raw)
  To: David Brown, fweimer; +Cc: gcc-help


> In cases like that, you can do your caching manually.  Just read the rom
> value once, and keep the value in a local variable - then it will stay
> in a local register.  If you have a lot to read and re-use, copy a block
> to SRAM.  This is something that the compiler could never do on its own
> - you have to write it in the code.

Yes, this is obvious, but in some cases this is not feasible, such as porting
a large codebase to a new platform with time constraints and/or porting
code where the license prohibits changing the code.

> This was already explained to you by Florian - the compiler cannot make
> the optimisation you want, because it does not know that the object at
> address 0x1234 is truly constant.  The "const" in the pointer here can
> only tell the compiler that /you/ will not try to change its value.
> This is different from the case where you define pfoo from the address
> of foo, because you have defined foo to be constant - now the compiler
> knows it cannot change, and can optimise accordingly.
>
> To be able to get the kind of optimisation you are asking for, you would
> need to be able to tell the compiler that the thing pfoo points to is
> really constant - that would need a new __attribute__ or similar
> extension.  Alternatively, you would need a new __attribute__ to let you
> declare a const foo at a specific address (this might be easier, since
> the relevant __attribute__ already exists for avr-gcc).

The --defsym approach seems easier.

Toshi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Fw: Possible missed optimization opportunity with const?
  2016-08-31  9:01         ` Toshi Morita
@ 2016-08-31 11:45           ` Florian Weimer
  0 siblings, 0 replies; 13+ messages in thread
From: Florian Weimer @ 2016-08-31 11:45 UTC (permalink / raw)
  To: gcc-help

On 08/31/2016 10:57 AM, Toshi Morita wrote:
> However, if the definition of pfoo is changed to:
>
> const int * const pfoo = (const int * const 0x1234);
>
>
> the optimization seems to fail:

The optimization is not valid in this case because the compiler cannot 
know that the object was declared const.  It could well be mutable.

Florian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Fw: Possible missed optimization opportunity with const?
       [not found]       ` <839193122.1815820.1472544155134@mail.yahoo.com>
@ 2016-08-31  9:01         ` Toshi Morita
  2016-08-31 11:45           ` Florian Weimer
  0 siblings, 1 reply; 13+ messages in thread
From: Toshi Morita @ 2016-08-31  9:01 UTC (permalink / raw)
  To: gcc-help

Resending to the list, because I didn't see this show up in the archives.

When this code is compiled with gcc-4.8.2 using -O2:

#include <stdio.h>
extern const int foo;
const int * const pfoo = &foo;
extern void bar(void);
int main(void)
{
    int a, b;
    a = *pfoo;
    bar();
    b = *pfoo;
    printf("a: %d, b: %d\n", a, b);
}


The two reads of foo are optimized down to one read, as expected:

test.c.025t.esra:

...
a_2 = foo;
bar ();
b_4 = foo;
...

test.c.026t.fre1

...
a_2 = foo;
bar ();
b_4 = a_2;
...

However, if the definition of pfoo is changed to:

const int * const pfoo = (const int * const 0x1234);


the optimization seems to fail:

test.c.025t.esra:

a_3 = MEM[(const int *)4660B];
bar ();
b_6 = MEM[(const int *)4660B];
...

test.c.026t.fre1

(no change)

The output code seems to have two reads of 0x1234 as well:

test:
    pushq   %rbx
    movl    4660, %ebx
    call    bar
    movl    %ebx, %edx
    movl    4660, %ecx
    movl    $.LC0, %esi
    popq    %rbx
    movl    $1, %edi
    xorl    %eax, %eax
    jmp     __printf_chk

I'm assuming "movl 4660, %ebx" is an indirect reference in GNU syntax and not an immediate reference.

This seems odd. Is this correct?

Toshi

^ permalink raw reply	[flat|nested] 13+ messages in thread
