public inbox for gcc@gcc.gnu.org
From: Martin Uecker <muecker@gwdg.de>
To: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>,
	"Rafał Pietrak" <embedded@ztk-rp.eu>,
	"gcc@gcc.gnu.org" <gcc@gcc.gnu.org>
Subject: Re: wishlist: support for shorter pointers
Date: Wed, 28 Jun 2023 19:00:30 +0200
Message-ID: <27899ec298c36e70211df528612a4ef1e4911717.camel@gwdg.de>
In-Reply-To: <7d5c5921-634c-7fdd-b219-e8034a5a0a9c@arm.com>

On Wednesday, 28.06.2023 at 17:49 +0100, Richard Earnshaw (lists) wrote:
> On 28/06/2023 17:07, Martin Uecker wrote:
> > On Wednesday, 28.06.2023 at 16:44 +0100, Richard Earnshaw (lists) wrote:
> > > On 28/06/2023 15:51, Rafał Pietrak via Gcc wrote:
> > > > Hi Martin,
> > > > 
> > > > On 28.06.2023 at 15:00, Martin Uecker wrote:
> > > > > 
> > > > > Sounds like named address spaces to me:
> > > > > https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html
> > > > 
> > > > Only to some extent, and only in the x86 case.
> > > > 
> > > > The goal of the wish-list item I've described is to shorten pointers. I may
> > > > be wrong and have misread the specs, but the "address spaces"
> > > > implementation you've pointed to doesn't appear to do that. In
> > > > particular, the AVR variant applies to devices that have a "native int"
> > > > of 16 bits, and most of those devices have an address space no
> > > > larger. So there is no gain. Their pointers cover their whole address
> > > > space, and if one wanted shorter pointers ... like 12 bits -
> > > > those wouldn't "nicely fit into a register", or 8 bits - those would
> > > > reduce the "addressable" space to 256 bytes, which is VERY tight for any
> > > > practical application.
> > > > 
> > > > Additionally, the AVR case is explained as "only for rodata" - this
> > > > completely rules it out for my use.
> > > > 
> > > > To explain a little more: the functionality I'm looking for is something
> > > > like the x86 implementation of those "address spaces". The key functionality
> > > > here is the additional register like fs/gs (an address offset register).
> > > > IMHO the feature/implementation in question would HAVE TO use an additional
> > > > register instead of letting the linker adjust them at link time, because
> > > > those "short" pointers would need to be loaded and stored dynamically and
> > > > changed dynamically at runtime. That's why I've given an example of an ARM
> > > > instruction that does this. Again IMHO, the only "syntactic" feature that
> > > > is required for the compiler to do "the right thing" is to make the compiler
> > > > consider the segment (segment name, ordinary linker segment name) where a
> > > > particular pointer's target resides. Then, if the segment where that
> > > > pointer's data resides is declared "short pointers", the compiler loads
> > > > and uses an additional register pointing to the base of that segment. Quite
> > > > like Intel segments work in hardware.
> > > > 
> > > > Naturally, although I have a rough idea of how such a mechanism should
> > > > behave, I have no skills to even imagine where to tweak the sources to
> > > > achieve that.
> > > 
> > > 
> > > I think I understand what you're asking for but:
> > > 1) You'd need a new ABI specification to handle this, probably involving
> > > register assignments (for the 'segment' addresses), the initialization
> > > of those at startup, assembler and linker extensions to allow for
> > > relocations describing the symbols, etc.
> > > 2) Implementations for all of the above (it would be a lot of work -
> > > weeks to months, not days).  Little existing code, including most of the
> > > hand-written assembly routines is likely to be compatible with the
> > > register conventions you'd need to define, so all that code would need
> > > auditing and alternatives developed.
> > > 3) I doubt it would be an overall win in the end.
> > > 
> > > I base the last assertion on the fact that you'd now have three values
> > > in many addresses, the base (segment), the pointer and then a final
> > > offset.  This means quite a bit more code being generated, so you trade
> > > smaller pointers in your data section for more code in your code
> > > section.  For example,
> > > 
> > > struct f
> > > {
> > >     int a;
> > >     int b;
> > > };
> > > 
> > > int func (struct f *p)
> > > {
> > >     return p->b;
> > > }
> > > 
> > > would currently compile to something like
> > > 
> > > 	ldr r0, [r0, #4]
> > > 	bx lr
> > > 
> > > but with the new, shorter, pointer you'd end up with
> > > 
> > > 	add r0, r_seg, r0
> > > 	ldr r0, [r0, #4]
> > > 	bx lr
> > > 
> > > In some cases it might be even worse as you'd end up with
> > > zero-extensions of the pointer values as well.
> > > 
> > 
> > I do not quite understand why this wouldn't work with
> > named address spaces?
> > 
> > struct f {
> >    int a;
> >    int b;
> > };
> > 
> > int func (__near struct f *p)
> > {
> >     return p->b;
> > }
> > 
> > could produce exactly such code?   If you need multiple
> > such segments one could have __near0, ..., __near9.
> > 
> > Such a pointer could also be converted to a regular
> > pointer, which could reduce code overhead.
> > 
> > Martin
> 
> Named address spaces, as they exist today, don't really do anything (at 
> least, in the Arm port).  A pointer is still 32-bits in size, so they 
> become just syntactic sugar.
> 
Sorry, I didn't mean to imply that this works today. But it
seems one could use this mechanism to implement this feature.

> If you're going to use them as 'bases', then you still have to define 
> how the base address is accessed - it doesn't just happen by magic.

The address space could correspond to a specific linker section.
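
As a minimal sketch of what that could mean, here is the idea emulated
by hand in plain C, with no compiler support assumed. The names `pool`,
`near_ptr`, `near_to_ptr`, `ptr_to_near` and `demo_sum` are made up for
illustration; the static array stands in for a dedicated linker section,
and its start address plays the role of the segment base:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-bit "near pointer": a byte offset from the pool base.
   Offset 0 is reserved to mean "null". */
typedef uint16_t near_ptr;

struct node {
    near_ptr next;    /* 2 bytes instead of a 4- or 8-byte pointer */
    uint16_t value;
};

/* Stand-in for a small linker section; slot 0 is never handed out so
   that offset 0 can serve as the null value. */
#define POOL_NODES 64
static struct node pool[POOL_NODES + 1];

/* Widen a near pointer: base + offset (the extra "add" per dereference). */
static struct node *near_to_ptr(near_ptr n)
{
    return n ? (struct node *)((unsigned char *)pool + n) : NULL;
}

/* Narrow a regular pointer that is known to point into the pool. */
static near_ptr ptr_to_near(const struct node *p)
{
    if (p == NULL)
        return 0;
    ptrdiff_t off = (const unsigned char *)p - (const unsigned char *)pool;
    assert(off > 0 && off <= UINT16_MAX);
    return (near_ptr)off;
}

/* Walk a list linked through 16-bit offsets, i.e. "x = x->next". */
static int sum_list(near_ptr head)
{
    int sum = 0;
    for (struct node *p = near_to_ptr(head); p; p = near_to_ptr(p->next))
        sum += p->value;
    return sum;
}

/* Build a three-element list in the pool and sum it. */
static int demo_sum(void)
{
    pool[1].value = 10; pool[1].next = ptr_to_near(&pool[2]);
    pool[2].value = 20; pool[2].next = ptr_to_near(&pool[3]);
    pool[3].value = 12; pool[3].next = 0;
    return sum_list(ptr_to_near(&pool[1]));
}
```

Each link costs 2 bytes here rather than 4 on a 32-bit target, at the
price of the base addition on every dereference. A named address space
tied to a section would let the compiler generate these conversions
itself instead of the programmer writing them out.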

Martin


> R.
> 
> > 
> > 
> > 
> > > R.
> > > 
> > > > -R
> > > > 
> > > > > 
> > > > > Best,
> > > > > Martin
> > > > > 
> > > > > On Tuesday, 27.06.2023 at 14:26 +0200, Rafał Pietrak via Gcc wrote:
> > > > > > Hello everybody,
> > > > > > 
> > > > > > I'm not quite sure if this is the right mailing list for this suggestion
> > > > > > ("embedded" might be better), but let me present it first (and while
> > > > > > the example is from the ARM stm32 environment, the issue would equally
> > > > > > apply to i386 or even amd64). So:
> > > > > > 
> > > > > > 1. A small MCU (like the stm32f103) would normally have a small amount
> > > > > > of RAM, and even somewhat larger variants have their memory "partitioned/
> > > > > > dedicated" to various subsystems (like CloseCoupledMemory, Ethernet
> > > > > > buffers, USB buffers, etc).
> > > > > > 
> > > > > > 2. To address any location within those sections of that memory (or
> > > > > > their entire RAM) it would suffice to use 16-bit pointers.
> > > > > > 
> > > > > > 3. Still, declaring a pointer in GCC always allocates the "natural" size
> > > > > > of a pointer on the given architecture. In the case of ARM stm32 that
> > > > > > is 32 bits.
> > > > > > 
> > > > > > 4. Programs using pointers do keep them around in structures. So
> > > > > > programs with heavy use of pointers have those structures up to 2 times
> > > > > > larger than necessary .... if only pointers were 16-bit. And memory in
> > > > > > those devices is scarce.
> > > > > > 
> > > > > > 5. The same thing applies to the 64-bit world. Programs that don't
> > > > > > require huge memories but do use pointers heavily MUST spend 64 bits
> > > > > > on each pointer no matter what.
> > > > > > 
> > > > > > So I was wondering if it would be feasible for GCC to allow a SEGMENT to
> > > > > > be declared as "small" (like 16-bit addressable on a 32-bit CPU, or 32-bit
> > > > > > addressable on a 64-bit CPU), so that ANY pointer declared to reference
> > > > > > a location within it would then be appropriately reduced.
> > > > > > 
> > > > > > In the ARM world, the use of such pointers would require the use of an
> > > > > > additional register (functionally a "segment base address") to
> > > > > > allow for data access using instructions like "LDR Rx, [Ry, Rz]",
> > > > > > i.e. a register-indexed reference. Here Ry is the base of the SEGMENT in
> > > > > > question. Or if (like inside a loop) the structure "pointed to" by Rz
> > > > > > must be used often, just one operation "ADD Rz, Ry" will prep Rz for
> > > > > > subsequent "ordinary" offset operations like "LDR Ra, [Rz, #member]" ...
> > > > > > and reentering the loop with "LDRH Rz, [Rz, #next]" does what's required by
> > > > > > "x = x->next".
> > > > > > 
> > > > > > Not having any experience in compiler implementations I have no idea if
> > > > > > this is a big or a small change to compiler design.
> > > > > > 
> > > > > > -R
> > > > > 
> > > > > 
> > > 
> > 
> > 
> 


