* wishlist: support for shorter pointers
From: Rafał Pietrak @ 2023-06-27 12:26 UTC
To: gcc
(2 replies; 54+ messages in thread)

Hello everybody,

I'm not quite sure if this is the correct mailbox for this suggestion (maybe "embedded" would be better), but let me present it first (and while the examples are from the ARM stm32 environment, the issue would equally apply to i386 or even amd64). So:

1. A small MCU (like the stm32f103) would normally have a small amount of RAM, and even somewhat larger variants have their memory "partitioned/dedicated" to various subsystems (like CloseCoupledMemory, Ethernet buffers, USB buffers, etc).

2. To address any location within those sections of memory (or the entire RAM) it would suffice to use 16-bit pointers.

3. Still, declaring a pointer in GCC always allocates the "natural" pointer size of the given architecture. In the case of ARM stm32 that is 32 bits.

4. Programs using pointers keep them around in structures. So programs with heavy use of pointers have those structures about 2 times larger than necessary ... if only pointers were 16-bit. And memory in those devices is scarce.

5. The same thing applies to the 64-bit world. Programs that don't require huge memories but do use pointers extensively MUST take up 64 bits for a pointer no matter what.

So I was wondering if it would be feasible for GCC to allow a SEGMENT to be declared as "small" (like 16-bit addressable on a 32-bit CPU, or 32-bit addressable on a 64-bit CPU), so that ANY pointer declared to reference a location within it would then be appropriately reduced.

In ARM world, the use of such pointers would require the use of an additional register (functionally being a "segment base address") to allow for data access using instructions like: "LD Rx, [Ry, Rz]" - meaning register index reference. Here Ry is the base of the SEGMENT in question. Or if (like inside a loop) the structure "pointed to" by Rz must be used often, just one operation "ADD Rz, Ry" will prepare Rz for subsequent "ordinary" offset operations like: "LD Ra, [Rz, #member]" ... and reentering the loop with "LDH Rz, [Rz, #next]" does what's required by "x = x->next".

Not having any experience in compiler implementations I have no idea if this is a big or a small change to compiler design.

-R
* Re: wishlist: support for shorter pointers
From: waffl3x @ 2023-06-28 1:54 UTC
To: Rafał Pietrak; +Cc: gcc

I want to preface this stating that I have little to no experience in compiler development, I am only merely just getting into it. With that said, I have messed around with library design a fair amount, and this seems like something that could be implemented in a library. It might be slightly comfier implemented on the compiler side, but I question how generally it could be implemented.

> In ARM world, the use of such pointers would require the use of an
> additional register (functionally being a "segment base address") to
> allow for data access using instructions like: "LD Rx, [Ry, Rz]" -
> meaning register index reference.

What you say here makes me feel like you should just be implementing this in library. With how you're describing it, it seems like the compiler would have no idea what the "segment base address" would actually be without additional annotation. Since you would need that annotation anyway, it seems best implemented in library.

I think what you want to do (for 16 bit pointers) is have a struct that internally is a fixed width 16 bit uint, and have an operator* that sets up the registers for that particular segment. It would be a bit of an implementation task since you have to do some inline ASM, but that's just the reality of implementing low level libraries.

Like I said before, and unless I'm mistaken, since the segments would need annotations anyway, it probably makes the most sense to implement this in library as I'm describing. I believe these types would be referred to as "fancy pointers." Hopefully I'm not too mistaken, as I don't have any experience with this field.

In general though, I believe that if something can be implemented in a reasonable way in library, then it belongs in library, not in language or other extensions.

-Alex
* Re: wishlist: support for shorter pointers
From: Rafał Pietrak @ 2023-06-28 7:13 UTC
To: waffl3x; +Cc: gcc

Hi Alex!

On 28.06.2023 at 03:54, waffl3x wrote:
> I want to preface this stating that I have little to no experience in compiler
> development, I am only merely just getting into it. With that said, I have messed around
> with library design a fair amount, and this seems like something that could be
> implemented in a library. It might be slightly comfier implemented on the compiler side,
> but I question how generally it could be implemented.

I thought of it a lot, and a library implementation is something I'd rather avoid. Before I elaborate, let me put some excerpts of the code in question. GREP-ing the sources for "->next" (list processing), this is what it looks like. I have a lot of code like this scattered around:

-------------------
y->next = NULL;
if (our) { out->next = a;
for (y = t->HD; y && y->next; y = y->next)
if (y) y->next = a;
fit->HD = a->next;
fit->win = a->next;
b = a->next;
--------------------

This is from just one source file, which otherwise is "plain C". If I were to put it into a library that uses "asm tweaked fancy pointers", a portable fragment of code becomes "target dedicated" - this is undesired.

To elaborate: even in (or maybe "particularly" in) the embedded world, one prefers to have code as portable (to other targets) as possible. And I must say that such code fragments as quoted are scattered around my sources practically everywhere. If I "convert" them to a library, the entire project becomes "target locked", and in consequence "unmaintainable". Such a move practically kills it.

Now, let me put the case into numbers: if I use stm32f030 instead of stm32f103, I may end up with just 4kB of RAM. The struct I heavily use here is 32 bytes long, but with 16-bit "pointers" it collapses to 16 bytes. This structure pops up in many places and within a running system it may reach 100 instances. It's a LOT of space to fight for.

-R

PS: please CC the response to my address (as you have done) - I'm not sure if I get mails from the list.
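The arithmetic in the message above can be checked with a small sketch. The real struct is not shown in the thread, so `node_wide` and `node_narrow` below are purely hypothetical layouts chosen to give the quoted 32-byte and 16-byte sizes; `uint32_t` stands in for a 32-bit pointer so that the numbers come out the same on a 64-bit build host. The helper names `pool_bytes` and `pool_percent` are likewise made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node layouts; the real struct is not shown in the thread.
   uint32_t models a 32-bit pointer so sizes do not depend on the host. */
struct node_wide   { uint32_t link[8]; };  /* eight 32-bit "pointers" */
struct node_narrow { uint16_t link[8]; };  /* eight 16-bit offsets    */

/* RAM needed for `count` instances of a node of `node_size` bytes. */
static size_t pool_bytes(size_t node_size, size_t count) {
    return node_size * count;
}

/* Share of a RAM budget taken by the pool, in whole percent (rounded down). */
static unsigned pool_percent(size_t node_size, size_t count, size_t ram_bytes) {
    return (unsigned)(pool_bytes(node_size, count) * 100u / ram_bytes);
}
```

With the figures from the message (100 instances, 4 kB of RAM), `pool_bytes(sizeof(struct node_wide), 100)` is 3200 bytes, roughly 78% of the RAM, while the narrow layout needs 1600 bytes, roughly 39%.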
* Re: wishlist: support for shorter pointers
From: Jonathan Wakely @ 2023-06-28 7:31 UTC
To: Rafał Pietrak; +Cc: waffl3x, gcc

On Wed, 28 Jun 2023, 08:14 Rafał Pietrak via Gcc, <gcc@gcc.gnu.org> wrote:

> Hi Alex!
>
> On 28.06.2023 at 03:54, waffl3x wrote:
> > I want to preface this stating that I have little to no experience in compiler
> > development, I am only merely just getting into it. With that said, I have messed around
> > with library design a fair amount, and this seems like something that could be
> > implemented in a library. It might be slightly comfier implemented on the compiler side,
> > but I question how generally it could be implemented.
>
> I thought of it a lot, and a library implementation is something I'd
> rather avoid. Before I elaborate, let me put some excerpts of the code
> in question. GREP-ing the sources for "->next" (list processing), this is
> what it looks like. I have a lot of code like this scattered around:
> -------------------
> y->next = NULL;
> if (our) { out->next = a;
> for (y = t->HD; y && y->next; y = y->next)
> if (y) y->next = a;
> fit->HD = a->next;
> fit->win = a->next;
> b = a->next;
> --------------------
> This is from just one source file, which otherwise is "plain C". If I
> were to put it into a library that uses "asm tweaked fancy pointers", a
> portable fragment of code becomes "target dedicated" - this is undesired.

If you use a C++ library type for your pointers the syntax above doesn't need to change, and the fancy pointer type can be implemented portably, with customisation for targets where you could use 16 bits for the pointers.

> To elaborate: even in (or maybe "particularly" in) the embedded world, one
> prefers to have code as portable (to other targets) as possible. And I
> must say that such code fragments as quoted are scattered around my
> sources practically everywhere. If I "convert" them to a library, the
> entire project becomes "target locked", and in consequence
> "unmaintainable".

Not if you use a C++ class type.

> Such a move practically kills it.
>
> Now, let me put the case into numbers: if I use stm32f030 instead of
> stm32f103, I may end up with just 4kB of RAM. The struct I heavily use
> here is 32 bytes long, but with 16-bit "pointers" it collapses to
> 16 bytes. This structure pops up in many places and within a running
> system it may reach 100 instances. It's a LOT of space to fight for.
>
> -R
> PS: please CC the response to my address (as you have done) - I'm not sure
> if I get mails from the list.
* Re: wishlist: support for shorter pointers
From: Rafał Pietrak @ 2023-06-28 8:35 UTC
To: Jonathan Wakely; +Cc: waffl3x, gcc

Hi Jonathan,

On 28.06.2023 at 09:31, Jonathan Wakely wrote:
> On Wed, 28 Jun 2023, 08:14 Rafał Pietrak via Gcc, <gcc@gcc.gnu.org
[---------]
>     what it looks like. I have a lot of code like this scattered around:
>     -------------------
>     y->next = NULL;
>     if (our) { out->next = a;
>     for (y = t->HD; y && y->next; y = y->next)
>     if (y) y->next = a;
>     fit->HD = a->next;
>     fit->win = a->next;
>     b = a->next;
>     --------------------
>     This is from just one source file, which otherwise is "plain C". If I
>     were to put it into a library that uses "asm tweaked fancy pointers", a
>     portable fragment of code becomes "target dedicated" - this is
>     undesired.
>
> If you use a C++ library type for your pointers the syntax above doesn't
> need to change, and the fancy pointer type can be implemented portably,
> with customisation for targets where you could use 16 bits for the pointers.

As you can expect from the problem I've stated - I don't know C++, so I'll need some more advice there.

But, before I dive into learning C++ (forgive the naive question).... isn't it so, that C++ comes with a heavy runtime? One that will bloat my tiny project? Or does the bloat come only when one uses particular elaborate class/inheritance scenarios, so that this particular case ( for (...; ...; x = x->next) {} ) will not draw any of that into this project?

Not knowing C++ and wanting to check your suggestion (before I start putting time into learning it), can you please provide me a sample of code that would replace the following:

----------------
struct test_s {
    struct test_s *next;
    char buff[1];
};

int test_funct(struct test_s *head, char *opt) {
    struct test_s *x = head;
    for (; x; x = x->next) {
        if (!*x->buff) { *x->buff = *opt; break; }
    }
    return x;
}
-----------------

.... and help me compile it into a variant with "normal/natural to architecture" pointers, and a variant with "fancy 16-bit" pointers?

Thanks in advance,

-R

PS: as before - I don't get mails from the list, please CC responses to me.
* Re: wishlist: support for shorter pointers
From: waffl3x @ 2023-06-28 9:56 UTC
To: Rafał Pietrak; +Cc: Jonathan Wakely, gcc

Here's a quick and dirty example of how this function could be rewritten with modern C++. I omitted some necessary details, particularly the implementation of the linked list iterator. I also wrote it out quickly so I can't be certain it's 100% correct, but it should give you an idea of what's possible.

    // I assume you meant to return a pointer
    template<typename Iter>
    auto test_funct(Iter iter, Iter end, char opt) {
        for (; iter != end; ++iter) {
            // dereferencing iter would get buff
            if (!*iter) { *iter = opt; break; }
        }
        return iter;
    }

I also made an example using the C++ algorithms library.

    template<typename Iter>
    auto test_funct(Iter begin, Iter end, char opt) {
        auto iter = std::find_if(begin, end, [](auto buff){return !buff;});
        // find_if returns end when nothing matches
        if (iter != end) {
            *iter = opt;
        }
        return iter;
    }

As I said, there's quite a bit omitted here. To be blunt, implementing both the fancy pointers (especially when I don't know anything about the hardware) and the iterators required would be more of a task than I am willing to do. I'm happy to help but I don't think I should be doing unpaid labor :).

These examples would work with anything implementing the C++ iterator interface. As long as you conform to that interface on both sides, most code will be reusable where it is possible.

Regarding the C++ runtime, I can't speak authoritatively, but I believe that the C++ runtime is fairly hefty, yes. Luckily, plenty of the standard library does not require it. I believe you'll want to look into GCC's freestanding support to get a full picture of what is and is not available. Again, I can't speak authoritatively on the matter, but I think you would be correct to avoid the C++ runtime.

There are other pitfalls to beware of. One of my concerns is that templates can cause some degree of bloat to executable size, though I imagine one can get around it if they try hard enough. The most real bottleneck you'll encounter in very large projects is compile time, but that all depends on what you're using and how much you're using it, and even then there are mitigations for that.

I'm happy to answer more questions and help, however I'm concerned this is getting fairly unrelated to GCC.

-Alex
* Re: wishlist: support for shorter pointers
From: Rafał Pietrak @ 2023-06-28 10:43 UTC
To: waffl3x; +Cc: Jonathan Wakely, gcc

Hi Alex,

On 28.06.2023 at 11:56, waffl3x wrote:
> Here's a quick and dirty example of how this function could be rewritten with
> modern C++. I omitted some necessary details, particularly the implementation of the
> linked list iterator. I also wrote it out quickly so I can't be certain it's 100%
> correct, but it should give you an idea of what's possible.

trying....

>
> // I assume you meant to return a pointer
> template<typename Iter>
> auto test_funct(Iter iter, Iter end, char opt) {
>     for (; iter != end; ++iter) {
>         // dereferencing iter would get buff
>         if (!*iter) { *iter = opt; break; }
>     }
>     return iter;
> }

-------------------------- TEST.CPP is the above code
$ g++ -fpermissive -c test.cpp
>>no error, GOOD :)
$ g++ -fpermissive -S test.cpp
$ cat test.s
	.file	"test.cpp"
	.text
	.ident	"GCC: (Debian 12.2.0-14) 12.2.0"
	.section	.note.GNU-stack,"",@progbits
---------------end-of-file----------

Hmm... that's disappointing :( nothing was generated.

Then again, I've noticed that you've changed pointers to indices. I've pondered that for my implementation too but discarded the idea, for it will require adjustments by struct-size (array element size) on every access.... Or maybe C++ does a different thing with [object++] than what plain-c does with [variable++]?

It's hard to analyze code without basic knowledge of the language :(

>
> I also made an example using the C++ algorithms library.
>
> template<typename Iter>
> auto test_funct(Iter begin, Iter end, char opt) {
>     auto iter = std::find_if(begin, end, [](auto buff){return !buff;});
>     if (iter != end) {
>         *iter = opt;
>     }
>     return iter;
> }

here I got:
test2.cpp:3:22: error: ‘find_if’ is not a member of ‘std’
so, it's a no-go for me too.

> As I said, there's quite a bit omitted here. To be blunt, implementing both
> the fancy pointers (especially when I don't know anything about the hardware) and
> the iterators required would be more of a task than I am willing to do. I'm happy
> to help but I don't think I should be doing unpaid labor :).

Fair enough.

[---------]
>
> I'm happy to answer more questions and help, however I'm concerned this is
> getting fairly unrelated to GCC.

From my perspective it is related to GCC (well... ok, to CC in general - it "smells" like an extension to the "C-standard" providing additional "funny" semantics to CC. But GCC is a "front-runner" for CC evolution, right? :).)

Then again, I'm not into drawing anybody into unfruitful and pointless support (for my little project). I only hoped that the problem could be recognized and maybe would inspire some developers out there (as it would be silly of me to think its implementation in GCC could happen before my small project ends, right?).

Anyway, thanks for the hints and suggestions.

-R
* Re: wishlist: support for shorter pointers
From: waffl3x @ 2023-06-28 12:12 UTC
To: Rafał Pietrak; +Cc: Jonathan Wakely, gcc

> Hmm... that's disappointing :( nothing was generated.

Function templates are not functions, they are templates of functions. They will not generate any code unless they are instantiated.

> Then again, I've noticed that you've changed pointers to indices.

No, I changed pointers to a template type parameter named Iter, which is meant to correspond to the C++ iterator interface. Pointers satisfy all of the iterator requirements, and classes that satisfy those requirements (by implementing similar semantics to pointers) are also iterators.

> Or maybe C++ does a different thing with [object++] than
> what plain-c does with [variable++]?

That's correct, C++ has operator overloading, which allows you to define member functions for classes that are called when the corresponding operator is used. In this case, operator++ (in the imaginary implementation) is overloaded to go to the next element of the linked list. The iterator interface requires operator++ to be overloaded, and it should implement similar semantics to using operator++ on a pointer.

> It's hard to analyze code without basic knowledge of the language :(

Yes, I personally recommend learncpp as a resource for learning C++, that would aid you greatly. C++ is a large language, and you would need to invest some time into it to become proficient. In my opinion that investment is hugely worth it though.

> I only hoped that the problem could be
> recognized and maybe would inspire some developers out there

Unfortunately, I strongly agree with JWakely that what you requested belongs in library rather than in language additions. If implementing it is too much of a burden (which is understandable since you have no prior experience with C++) then I would suggest checking out Boost to see if they have what you need. I seem to recall them having some sort of fancy pointers in there somewhere. Realistically though, it will take some time to get used to all the C++isms before you would be able to be proficient with anything Boost would provide. I don't mean to be discouraging, I just want to keep your expectations realistic: the learning curve for C++ can be rather high, especially when you're used to C.

Good luck!
-Alex
* Re: wishlist: support for shorter pointers
From: Rafał Pietrak @ 2023-06-28 12:23 UTC
To: waffl3x; +Cc: Jonathan Wakely, gcc

Hi Alex,

On 28.06.2023 at 14:12, waffl3x wrote:
[----------]
> them having some sort of fancy pointers in there somewhere. Realistically though,
> it will take some time to get used to all the C++isms before you would be able to
> be proficient with anything Boost would provide. I don't mean to be discouraging,
> I just want to keep your expectations realistic: the learning curve for C++ can
> be rather high, especially when you're used to C.
>
> Good luck!
> -Alex

OK, thanks again.

Bye.

-R
* Re: wishlist: support for shorter pointers
From: David Brown @ 2023-07-03 14:52 UTC
To: Rafał Pietrak, Jonathan Wakely; +Cc: waffl3x, gcc

On 28/06/2023 10:35, Rafał Pietrak via Gcc wrote:
> Hi Jonathan,
>
> On 28.06.2023 at 09:31, Jonathan Wakely wrote:
>>
>> If you use a C++ library type for your pointers the syntax above
>> doesn't need to change, and the fancy pointer type can be implemented
>> portably, with customisation for targets where you could use 16 bits
>> for the pointers.
>
> As you can expect from the problem I've stated - I don't know C++, so
> I'll need some more advice there.
>
> But, before I dive into learning C++ (forgive the naive question)....
> isn't it so, that C++ comes with a heavy runtime? One that will bloat my
> tiny project? Or does the bloat come only when one uses particular
> elaborate class/inheritance scenarios, so that this particular case ( for
> (...; ...; x = x->next) {} ) will not draw any of that into this project?
>

Let me make a few points (in no particular order) :

1. For some RISC targets, such as PowerPC, it is common to have a section of memory called the "small data section". One of the registers is dedicated as an anchor to this section, and data within it is addressed as Rx + 16-bit offset. But this is primarily for data at fixed (statically allocated) addresses, since reads and writes using this address mode are smaller and faster than full 32-bit addresses. Normal pointers are still 32-bit. It also requires a dedicated register - not a big cost when you have 31 GPRs, but much more costly when you have only 13.

2. C++ is only costly if you use costly features. On small embedded systems, you want "-fno-exceptions -fno-rtti", and you will get as good (or bad!) results for C++ as for C. Many standard library features will, however, result in a great deal of code - it is usually fairly obvious which classes and functions are appropriate.

3. In C, you could make a type such as :

    #include <stdint.h>

    typedef struct {
        uint16_t p;
    } small_pointer_t;

and conversion functions :

    static const uintptr_t ram_base = 0x20000000;

    static inline void * sp_to_voidp(small_pointer_t sp) {
        return (void *)(ram_base + sp.p);
    }

    static inline small_pointer_t voidp_to_sp(void * p) {
        small_pointer_t sp;
        sp.p = (uintptr_t) p - ram_base;
        return sp;
    }

Then you would use these access functions to turn your "small pointers" into normal pointers. The source code would become significantly harder to read and write, and less type-safe, but could be quite efficient.

In C++, you'd use the same kinds of functions. But they would now be methods in a class template, and tied to overloaded operators and/or conversion functions. The result would be type-safe and let you continue to use a normal pointer-like syntax, and with equally efficient generated code. You could also equally conveniently have small pointers to ram and to peripheral groups. This mailing list is not really the place to work through an implementation of such class templates - but it certainly could be done.

4. It is worth taking a step back, and thinking about how you would like to use these pointers. It is likely that you would be better thinking in terms of an array, rather than pointers - after all, you don't want to be using dynamically allocated memory here if you can avoid it, and certainly not generic malloc(). If you can use an array, then your index type can be as small as you like - maybe uint8_t is enough.

David
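As a rough illustration of point 3 above, here is a minimal C sketch of the list-walking loop from earlier in the thread following 16-bit links. Everything beyond the thread's own `test_s` and `test_funct` is an assumption made for the example: the names `small_ptr_t`, `sp_to_node`, `node_to_sp`, `node_push` and the `pool` array are hypothetical, and a static pool stands in for the fixed `ram_base` so that the code can also run on a desktop host. Offset 0 is reserved to mean "null", which is one possible convention, not something the thread prescribes.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit link: byte offset from the start of the pool; 0 means null. */
typedef struct { uint16_t off; } small_ptr_t;

struct test_s {
    small_ptr_t next;
    char buff[1];
};

/* Hypothetical node pool standing in for the small RAM segment.
   pool[0] is never handed out, so offset 0 can represent null. */
#define POOL_NODES 16
static struct test_s pool[POOL_NODES];
static size_t pool_used = 1;

/* Small pointer -> ordinary pointer (base + offset). */
static inline struct test_s *sp_to_node(small_ptr_t sp) {
    return sp.off ? (struct test_s *)((unsigned char *)pool + sp.off) : NULL;
}

/* Ordinary pointer into the pool -> small pointer. */
static inline small_ptr_t node_to_sp(const struct test_s *p) {
    small_ptr_t sp = { 0 };
    if (p)
        sp.off = (uint16_t)((const unsigned char *)p - (const unsigned char *)pool);
    return sp;
}

/* Take the next free node from the pool and link it in front of `head`. */
static struct test_s *node_push(struct test_s *head, char c) {
    if (pool_used >= POOL_NODES)
        return NULL;
    struct test_s *n = &pool[pool_used++];
    n->buff[0] = c;
    n->next = node_to_sp(head);
    return n;
}

/* Same loop as the thread's test_funct(), but following 16-bit links. */
static struct test_s *test_funct(struct test_s *head, const char *opt) {
    struct test_s *x = head;
    for (; x; x = sp_to_node(x->next)) {
        if (!*x->buff) { *x->buff = *opt; break; }
    }
    return x;
}
```

The only change to the loop body is `x = sp_to_node(x->next)` in place of `x = x->next`, which is the readability cost described above; on this layout `sizeof(small_ptr_t)` is 2 bytes instead of the native pointer size.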
* Re: wishlist: support for shorter pointers
From: Rafał Pietrak @ 2023-07-03 16:29 UTC
To: David Brown, Jonathan Wakely; +Cc: waffl3x, gcc

Hi David,

On 3.07.2023 at 16:52, David Brown wrote:
[------------]
>>
>> But, before I dive into learning C++ (forgive the naive question)....
>> isn't it so, that C++ comes with a heavy runtime? One that will bloat
>> my tiny project? Or does the bloat come only when one uses particular
>> elaborate class/inheritance scenarios, so that this particular case ( for
>> (...; ...; x = x->next) {} ) will not draw any of that into this project?
>>
>
> Let me make a few points (in no particular order) :
>
> 1. For some RISC targets, such as PowerPC, it is common to have a
> section of memory called the "small data section". One of the registers
> is dedicated as an anchor to this section, and data within it is
> addressed as Rx + 16-bit offset. But this is primarily for data at
> fixed (statically allocated) addresses, since reads and writes using
> this address mode are smaller and faster than full 32-bit addresses.
> Normal pointers are still 32-bit. It also requires a dedicated register
> - not a big cost when you have 31 GPRs, but much more costly when you
> have only 13.

I don't have any experience with PowerPC, so all you say here is new to me. And the PPC architecture today is "kind of exotic", but I appreciate the info and I may look it up for insight into how "short pointers" influence performance. Thanks.

> 2. C++ is only costly if you use costly features. On small embedded
> systems, you want "-fno-exceptions -fno-rtti", and you will get as good
> (or bad!) results for C++ as for C. Many standard library features
> will, however, result in a great deal of code - it is usually fairly
> obvious which classes and functions are appropriate.

OK. I'm becoming aware that I will no longer be able to turn a blind eye to C++. :(

>
> 3. In C, you could make a type such as :
>
>     typedef struct {
>         uint16_t p;
>     } small_pointer_t;
>
> and conversion functions :
>
>     static const uintptr_t ram_base = 0x20000000;
>
>     static inline void * sp_to_voidp(small_pointer_t sp) {
>         return (void *)(ram_base + sp.p);
>     }
>
>     static inline small_pointer_t voidp_to_sp(void * p) {
>         small_pointer_t sp;
>         sp.p = (uintptr_t) p - ram_base;
>         return sp;
>     }
>
> Then you would use these access functions to turn your "small pointers"
> into normal pointers. The source code would become significantly harder
> to read and write, and less type-safe, but could be quite efficient.

That actually is a problem. I really can turn a lot of the code in question into assembler, and have it behave precisely as I desire, but that'll make the project not portable - that's why I thought of casting the use case onto this list here. This way (I hoped) it may inspire "the world" and have it supported at compiler level some time in the future. Should that not be the case, I'd rather stay with "plain C" and keep the code portable and readable (rather than obfuscate it ... even by merely too "talkative sources").

[--------]
> to ram and to peripheral groups. This mailing list is not really the
> place to work through an implementation of such class templates - but it
> certainly could be done.

OK. I fully agree.

FYI: it was never my intention to ask for advice on how to cook such "short/funny" pointers by special constructs / techniques in C programming. Actually I was a little taken aback reading such advice as the first responses to my email. It was nice, but surprising.

I hoped to get a discussion more towards "how to let the compiler know" that a particular segment/section of program data will be emitted into an executable in a "constrained output section", so that the compiler could "automagically" know that using "short" pointers for that data would suffice, and in consequence would generate such instructions.... without any change to the source code.

It's sort of obvious that this would also require support from libc (like a specific "malloc()" and friends), but application sources could stay untouched, and that's IMHO the key point here.

> 4. It is worth taking a step back, and thinking about how you would like
> to use these pointers. It is likely that you would be better thinking
> in terms of an array, rather than pointers - after all, you don't want
> to be using dynamically allocated memory here if you can avoid it, and
> certainly not generic malloc(). If you can use an array, then your
> index type can be as small as you like - maybe uint8_t is enough.

I did that trip ... some time ago. Maybe I discarded the idea prematurely, but I dropped it because I was afraid of the cost of multiplication (index calculation) in micros. That "assumption" of mine may actually not be true, since today even the mini-minis often have integer multiplication units, so my reasoning may no longer hold.

But. Even if I turn pointers into indices for tiny micros ... that'd make the code not portable. I'm not too eager to do that.

Still, thank you very much for sharing those concepts.

With best regards,

-R
* Re: wishlist: support for shorter pointers
  2023-07-03 16:29 ` Rafał Pietrak
@ 2023-07-04 14:20 ` Rafał Pietrak
  2023-07-04 15:13 ` David Brown
  0 siblings, 1 reply; 54+ messages in thread
From: Rafał Pietrak @ 2023-07-04 14:20 UTC
To: David Brown, Jonathan Wakely; +Cc: waffl3x, gcc

On 3.07.2023 at 18:29, Rafał Pietrak wrote:
> Hi David,
>
[--------------]
>> 4. It is worth taking a step back, and thinking about how you would
>> like to use these pointers. It is likely that you would be better
>> thinking in terms of an array, rather than pointers - after all, you
>> don't want to be using dynamically allocated memory here if you can
>> avoid it, and certainly not generic malloc(). If you can use an
>> array, then your index type can be as small as you like - maybe
>> uint8_t is enough.
>
> I did that trip ... some time ago. May be I discarded the idea
> prematurely, but I dropped it because I was afraid of cost of

I remember now what my main problem with an index-based implementation was: the inability to express/write chained "references" with indexes. The table/index semantics of:

    t[a][b][c][d]

is a "multidimensional table", which is completely different from the "pointer semantics" of:

    *t->a->b->c->d

It is quite legitimate to do a full circle around a circular list this way, while table semantics doesn't allow that.

Indexes are off the table.

-R
* Re: wishlist: support for shorter pointers
  2023-07-04 14:20 ` Rafał Pietrak
@ 2023-07-04 15:13 ` David Brown
  2023-07-04 16:15 ` Rafał Pietrak
  0 siblings, 1 reply; 54+ messages in thread
From: David Brown @ 2023-07-04 15:13 UTC
To: Rafał Pietrak, Jonathan Wakely; +Cc: waffl3x, gcc

On 04/07/2023 16:20, Rafał Pietrak wrote:
> [--------------]
>
> I remember now what was my main problem with indexes implementation:
> inability to express/write chain "references" with them. Table/index
> semantic of:
>     t[a][b][c][d].
> is a "multidimentional table" which is completely different from
> "pointer semantic" of:
>     *t->a->b->c->d
>
> It is quite legit to do a full circle around a circular list this way,
> while table semantics doesn't allow that.
>
> Indexes are off the table.
>
> -R

If you have a circular buffer, it is vastly more efficient to have an array whose elements hold no pointers or indices, and to use head and tail indices to track the current position. But I'm not sure if that is what you are looking for. And you can use indices in fields for chaining, but the syntax will be different. (For some microcontrollers, the multiplications involved in array index calculations can be an issue, but not for ARM devices.)
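Chaining through index fields, as mentioned above, might look like the following sketch. All names (`pool`, `POOL_SIZE`, `node_idx`, `make_ring`, `sum_ring`, `third_successor`) are hypothetical and do not come from the thread, and the pool size of 8 is an arbitrary assumption. The point it tries to show is that an index stored inside a node can be followed repeatedly, including all the way around a circular list, although the syntax differs from `t->a->b`.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_SIZE 8u            /* assumption: small enough for uint8_t */

typedef uint8_t node_idx;       /* one byte instead of a full pointer */

struct node {
    node_idx next;              /* index of the successor in pool[] */
    int      value;
};

static struct node pool[POOL_SIZE];

/* Link nodes 0..n-1 into a ring: 0 -> 1 -> ... -> n-1 -> 0. */
static void make_ring(unsigned n)
{
    assert(n >= 1 && n <= POOL_SIZE);
    for (unsigned i = 0; i < n; i++) {
        pool[i].value = (int)(i * 10);
        pool[i].next  = (node_idx)((i + 1) % n);
    }
}

/* Walk one full circle starting at 'start' and sum the values. */
static int sum_ring(node_idx start)
{
    int sum = 0;
    node_idx i = start;

    do {
        sum += pool[i].value;
        i = pool[i].next;
    } while (i != start);
    return sum;
}

/* Index-based counterpart of t->next->next->next. */
static node_idx third_successor(node_idx t)
{
    return pool[pool[pool[t].next].next].next;
}
```

The chained expression in `third_successor` is noisier than the pointer form, which is the syntax difference noted above, but it is a chain of references and not a multidimensional table lookup.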
* Re: wishlist: support for shorter pointers
  2023-07-04 15:13 ` David Brown
@ 2023-07-04 16:15 ` Rafał Pietrak
  0 siblings, 0 replies; 54+ messages in thread
From: Rafał Pietrak @ 2023-07-04 16:15 UTC
To: David Brown, Jonathan Wakely; +Cc: waffl3x, gcc

On 4.07.2023 at 17:13, David Brown wrote:
[------------]
>
> If you have a circular buffer, it is vastly more efficient to have an
> array with no pointers or indices, and use head and tail indices to
> track the current position. But I'm not sure if that is what you are
> looking for. And you can use indices in fields for chaining, but the
> syntax will be different. (For some microcontrollers, the
> multiplications involved in array index calculations can be an issue,
> but not for ARM devices.)

Ring buffers, yes and no. They have their uses, but in this particular case (my current project) using them is pointless.

A little explanation: at this point I have an "object" (a structure, or rather a union of structures) with 6 pointers and some additional data. Those 6 pointers are entangled in something that looks like a "neural network" (although it's NOT one). This structure is sort of a demo, a template. It's expected to grow somewhat for the real thing ... to something like 3-5 times the current structure. This translates to 100-150 bytes each (from the current 32 bytes), with "big several" expected as the total size of the system. And my targets are devices with 2 KB or 4 KB of RAM.

I can't imagine turning this web into any number of ring buffers. For me, that would make it unmanageable.

But this is just me. I thought people doing embedded work out there face similar problems, and a nice compiler "pragma" in the direction of named spaces/segments could really help here.

-R
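The size argument in the message above can be checked with a little arithmetic. The sketch below is an illustration only: the layouts are hypothetical, `uint32_t` stands in for a 32-bit pointer so that the numbers are the same on a 64-bit host, and the 8 bytes of payload are inferred from "6 pointers and some additional data" adding up to 32 bytes, not stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINKS 6

/* Six full-width links: uint32_t models a 32-bit pointer. */
struct node_wide {
    uint32_t link[LINKS];
    uint8_t  payload[8];        /* assumed size of the "additional data" */
};

/* The same node with 16-bit links (offsets into a small segment). */
struct node_short {
    uint16_t link[LINKS];
    uint8_t  payload[8];
};

/* How many nodes of a given size fit into a RAM budget. */
static size_t nodes_that_fit(size_t ram_bytes, size_t node_bytes)
{
    assert(node_bytes > 0);
    return ram_bytes / node_bytes;
}
```

On common ABIs this gives 32 bytes against 20 bytes per node, so a 2 KiB region would hold 64 nodes in the first layout and 102 in the second if nothing else lived there. The saving shrinks as the payload grows relative to the links.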
* Re: wishlist: support for shorter pointers
  2023-06-28 7:13 ` Rafał Pietrak
  2023-06-28 7:31 ` Jonathan Wakely
@ 2023-06-28 7:34 ` waffl3x
  2023-06-28 8:41 ` Rafał Pietrak
  1 sibling, 1 reply; 54+ messages in thread
From: waffl3x @ 2023-06-28 7:34 UTC
To: Rafał Pietrak; +Cc: gcc

>This is from just one source file, which otherwise is "plain C". If I
>was to put it into a library that use "asm tweaked fancy pointers", a
>portable fragment of code becomes "target dedicated" - this is undesired.

I sympathize with your desire to not lock your codebase to a particular target, I agree, it's important to keep it generic. I would definitely design the library to allow customization of the utilities for a given target. I imagine this gets a little difficult if you're setting up registers a certain way, but wrapping some ASM in a function object, and then forcing the call to that object to inline should do the trick there. From there, any code that you want to remain portable would have to take the pointer type by template parameter. Unfortunately, I can imagine the secondary part of this creating problems in an embedded project if you had to instantiate too many different functions from the templates.

>-------------------
>     y->next = NULL;
>     if (our) { out->next = a;
>     for (y = t->HD; y && y->next; y = y->next)
>     if (y) y->next = a;
>     fit->HD = a->next;
>     fit->win = a->next;
>     b = a->next;
>--------------------

I suspect that this snippet that you shared might not be quite as portable as you think. It looks to me like it relies on type punning. Type punning can indeed be implemented in a well defined manner, in my experience though it rarely is.

With that said, strict aliasing is very difficult to understand, so I would not be surprised if I was mistaken here, especially since there's not enough code in the snippet to be certain.

-Alex
* Re: wishlist: support for shorter pointers
  2023-06-28 7:34 ` waffl3x
@ 2023-06-28 8:41 ` Rafał Pietrak
  0 siblings, 0 replies; 54+ messages in thread
From: Rafał Pietrak @ 2023-06-28 8:41 UTC
To: waffl3x; +Cc: gcc

Hi Alex,

On 28.06.2023 at 09:34, waffl3x wrote:
[------]
>
>> -------------------
>>     y->next = NULL;
>>     if (our) { out->next = a;
>>     for (y = t->HD; y && y->next; y = y->next)
>>     if (y) y->next = a;
>>     fit->HD = a->next;
>>     fit->win = a->next;
>>     b = a->next;
>> --------------------
[-----------]
> With that said, strict aliasing is very difficult to understand so I would not be
> surprised if I was mistaken here, especially since there's not enough code in the
> snippet to be certain.

Sure thing. The snippet is a grep of the sources - there is no valid continuity between those lines :) I just wanted to point out the number of constructs scattered around, and thus the necessity of replacing the ENTIRE source with a non-portable variant if I were to put it into a C library. But maybe Jonathan's suggestion would work - I'll check it if I get some help with that.

-R
* Re: wishlist: support for shorter pointers
  2023-06-27 12:26 wishlist: support for shorter pointers Rafał Pietrak
  2023-06-28 1:54 ` waffl3x
@ 2023-06-28 13:00 ` Martin Uecker
  2023-06-28 14:51 ` Rafał Pietrak
  1 sibling, 1 reply; 54+ messages in thread
From: Martin Uecker @ 2023-06-28 13:00 UTC
To: Rafał Pietrak, gcc

Sounds like named address spaces to me:
https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html

Best,
Martin

On Tuesday, 27.06.2023 at 14:26 +0200, Rafał Pietrak via Gcc wrote:
> Hello everybody,
>
[...]
* Re: wishlist: support for shorter pointers
  2023-06-28 13:00 ` Martin Uecker
@ 2023-06-28 14:51 ` Rafał Pietrak
  2023-06-28 15:44 ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 54+ messages in thread
From: Rafał Pietrak @ 2023-06-28 14:51 UTC
To: Martin Uecker, gcc

Hi Martin,

On 28.06.2023 at 15:00, Martin Uecker wrote:
>
> Sounds like named address spaces to me:
> https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html

Only to some extent, and only in the x86 case.

The goal of the wish-item I've described is to shorten pointers. I may be wrong and have misread the specs, but the "address spaces" implementation you've pointed out doesn't look like it does that. In particular, the AVR variant applies to devices that have a "native int" of 16 bits, and those devices (most of them) have an address space no larger than that. So there is no gain. Their pointers cover their whole address space, and if one wanted shorter pointers ... like 12 bits - those wouldn't "nicely fit into a register", or 8 bits - those would reduce the "addressable" space to 256 bytes, which is VERY tight for any practical application.

Additionally, the AVR case is explained as "only for rodata" - this completely rules it out for my use.

To explain a little more: the functionality I'm looking for is something like the x86 implementation of those "address spaces". The key functionality here is the additional register like fs/gs (an address offset register). IMHO the feature/implementation in question would HAVE TO use an additional register instead of letting the linker adjust addresses at link time, because those "short" pointers would need to be loaded and stored dynamically, and changed dynamically at runtime. That's why I gave an example of an ARM instruction that does this. Again IMHO, the only "syntactic" feature that is required for the compiler to do "the right thing" is to make the compiler consider the segment (segment name, ordinary linker segment name) where a particular pointer's target resides. Then, if the segment where the data (of that pointer) resides is declared "short pointers", the compiler loads and uses an additional register pointing to the base of that segment. Quite like Intel segments work in hardware.

Naturally, although I have hints on how such a mechanism behaves, I have no skills to even imagine where to tweak the sources to achieve that.

-R

>
> Best,
> Martin
>
> On Tuesday, 27.06.2023 at 14:26 +0200, Rafał Pietrak via Gcc wrote:
>> Hello everybody,
>>
[...]
* Re: wishlist: support for shorter pointers
  2023-06-28 14:51 ` Rafał Pietrak
@ 2023-06-28 15:44 ` Richard Earnshaw (lists)
  2023-06-28 16:07 ` Martin Uecker
  ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Richard Earnshaw (lists) @ 2023-06-28 15:44 UTC
To: Rafał Pietrak, Martin Uecker, gcc

On 28/06/2023 15:51, Rafał Pietrak via Gcc wrote:
> Hi Martin,
>
[...]
> Again IMHO the only "syntactic" feature,that
> is required for a compiler to do "the right thing" is to make compiler
> consider segment (segment name, ordinary linker segment name) where a
> particular pointer target resides. Then if that segment where data (of
> that pointer) reside is declared "short pointers", then compiler loads
> and uses additional register pointing to the base of that segment. Quite
> like intel segments work in hardware.
>
> Naturally, although I have hints on such mechanism behavior, I have no
> skills to even imagine where to tweak the sources to achieve that.

I think I understand what you're asking for but:

1) You'd need a new ABI specification to handle this, probably involving register assignments (for the 'segment' addresses), the initialization of those at startup, assembler and linker extensions to allow for relocations describing the symbols, etc.
2) Implementations for all of the above (it would be a lot of work - weeks to months, not days). Little existing code, including most of the hand-written assembly routines, is likely to be compatible with the register conventions you'd need to define, so all that code would need auditing and alternatives developed.
3) I doubt it would be an overall win in the end.

I base the last assertion on the fact that you'd now have three values in many addresses: the base (segment), the pointer and then a final offset. This means quite a bit more code being generated, so you trade smaller pointers in your data section for more code in your code section. For example,

    struct f
    {
      int a;
      int b;
    };

    int func (struct f *p)
    {
      return p->b;
    }

would currently compile to something like

    ldr r0, [r0, #4]
    bx lr

but with the new, shorter, pointer you'd end up with

    add r0, r_seg, r0
    ldr r0, [r0, #4]
    bx lr

In some cases it might be even worse as you'd end up with zero-extensions of the pointer values as well.

R.
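The extra step in the three-instruction sequence above can be mimicked in portable C. This is a sketch under stated assumptions, not what a modified compiler would emit: `segment`, `SEG_COUNT`, `short_ptr`, `to_short`, `func_full` and `func_short` are hypothetical names, and the array `segment` plays the role of the region whose base would live in `r_seg`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct f {
    int a;
    int b;
};

/* Hypothetical "segment": every struct f reachable through a short
   pointer lives in this array (assumption: smaller than 64 KiB). */
#define SEG_COUNT 32u
static struct f segment[SEG_COUNT];

/* A short pointer is a 16-bit byte offset from the segment base. */
typedef uint16_t short_ptr;

static short_ptr to_short(const struct f *p)
{
    ptrdiff_t idx = p - segment;

    assert(idx >= 0 && (size_t)idx < SEG_COUNT);
    return (short_ptr)((size_t)idx * sizeof(struct f));
}

/* Ordinary pointer: one load from a fixed offset. */
static int func_full(const struct f *p)
{
    return p->b;
}

/* Short pointer: the base is added first (the extra "add"), then the
   same load follows. */
static int func_short(short_ptr sp)
{
    const unsigned char *base = (const unsigned char *)segment; /* "r_seg" */
    const struct f *p = (const struct f *)(const void *)(base + sp);

    return p->b;
}
```

Both functions return the same field; the short form saves bytes wherever the pointer is stored and pays for it with the base addition at each dereference, which is the trade-off described in point 3.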
* Re: wishlist: support for shorter pointers
  2023-06-28 15:44 ` Richard Earnshaw (lists)
@ 2023-06-28 16:07 ` Martin Uecker
  2023-06-28 16:49 ` Richard Earnshaw (lists)
  2023-06-28 16:48 ` Rafał Pietrak
  2023-06-29 6:19 ` Rafał Pietrak
  2 siblings, 1 reply; 54+ messages in thread
From: Martin Uecker @ 2023-06-28 16:07 UTC
To: Richard Earnshaw (lists), Rafał Pietrak, gcc

On Wednesday, 28.06.2023 at 16:44 +0100, Richard Earnshaw (lists) wrote:
[...]
> but with the new, shorter, pointer you'd end up with
>
>     add r0, r_seg, r0
>     ldr r0, [r0, #4]
>     bx lr
>
> In some cases it might be even worse as you'd end up with
> zero-extensions of the pointer values as well.
>

I do not quite understand why this wouldn't work with named address spaces?

    struct f {
      int a;
      int b;
    };

    int func (__near struct f *p)
    {
      return p->b;
    }

could produce exactly such code? If you need multiple such segments one could have __near0, ..., __near9.

Such a pointer could also be converted to a regular pointer, which could reduce code overhead.

Martin
* Re: wishlist: support for shorter pointers
  2023-06-28 16:07 ` Martin Uecker
@ 2023-06-28 16:49 ` Richard Earnshaw (lists)
  2023-06-28 17:00 ` Martin Uecker
  0 siblings, 1 reply; 54+ messages in thread
From: Richard Earnshaw (lists) @ 2023-06-28 16:49 UTC
To: Martin Uecker, Rafał Pietrak, gcc

On 28/06/2023 17:07, Martin Uecker wrote:
[...]
> I do not quite understand why this wouldn't work with
> named address spaces?
>
>     __near struct {
>       int x;
>       int y;
>     };
>
>     int func (__near struct f *p)
>     {
>       return p->b;
>     }
>
> could produce exactly such code? If you need multiple
> such segments one could have __near0, ..., __near9.
>
> Such a pointer could also be converted to a regular
> pointer, which could reduce code overhead.
>
> Martin

Named address spaces, as they exist today, don't really do anything (at least, in the Arm port). A pointer is still 32 bits in size, so they become just syntactic sugar. If you're going to use them as 'bases', then you still have to define how the base address is accessed - it doesn't just happen by magic.

R.
* Re: wishlist: support for shorter pointers 2023-06-28 16:49 ` Richard Earnshaw (lists) @ 2023-06-28 17:00 ` Martin Uecker 0 siblings, 0 replies; 54+ messages in thread From: Martin Uecker @ 2023-06-28 17:00 UTC (permalink / raw) To: Richard Earnshaw (lists), Rafał Pietrak, gcc

On Wednesday, 28.06.2023 at 17:49 +0100, Richard Earnshaw (lists) wrote:
> On 28/06/2023 17:07, Martin Uecker wrote:
> > I do not quite understand why this wouldn't work with
> > named address spaces?
> >
> > struct f {
> >   int a;
> >   int b;
> > };
> >
> > int func (__near struct f *p)
> > {
> >   return p->b;
> > }
> >
> > could produce exactly such code? If you need multiple
> > such segments one could have __near0, ..., __near9.
> >
> > Such a pointer could also be converted to a regular
> > pointer, which could reduce code overhead.
> >
> > Martin
>
> Named address spaces, as they exist today, don't really do anything (at
> least, in the Arm port). A pointer is still 32 bits in size, so they
> become just syntactic sugar.

Sorry, I didn't mean to imply that this works today. But it seems
one could use this mechanism to implement this feature.

> If you're going to use them as 'bases', then you still have to define
> how the base address is accessed - it doesn't just happen by magic.

The address space could correspond to a specific linker section.

Martin
* Re: wishlist: support for shorter pointers 2023-06-28 15:44 ` Richard Earnshaw (lists) 2023-06-28 16:07 ` Martin Uecker @ 2023-06-28 16:48 ` Rafał Pietrak 2023-06-29 6:19 ` Rafał Pietrak 2 siblings, 0 replies; 54+ messages in thread From: Rafał Pietrak @ 2023-06-28 16:48 UTC (permalink / raw) To: Richard Earnshaw (lists), Martin Uecker, gcc

Hi Richard,

On 28.06.2023 at 17:44, Richard Earnshaw (lists) wrote:
[--------------]
> I think I understand what you're asking for but:

From what I can see below, you do. The case is exactly this.

> 1) You'd need a new ABI specification to handle this, probably involving
> register assignments (for the 'segment' addresses), the initialization
> of those at startup, assembler and linker extensions to allow for
> relocations describing the symbols, etc.
> 2) Implementations for all of the above (it would be a lot of work -
> weeks to months, not days). Little existing code, including most of the
> hand-written assembly routines, is likely to be compatible with the
> register conventions you'd need to define, so all that code would need
> auditing and alternatives developed.

I was afraid of that ... admittedly, not to the extent you explain here.

> 3) I doubt it would be an overall win in the end.

IMHO achieving the goal is worthwhile, although my estimate of the
effort is surely too low. I only hope that it may spark some change in
how programming is done, and thus have a significant influence on the
future of programming as such. But maybe not.

> I base the last assertion on the fact that you'd now have three values
> in many addresses, the base (segment), the pointer and then a final
> offset. This means quite a bit more code being generated, so you trade
> smaller pointers in your data section for more code in your code

Yes. And this is the actual reality with embedded systems. An ATmega can
have 128k of flash and only 4k of RAM; the stm32f070 has 32k of flash and
4k of RAM. Having 1M of flash with just 128k of RAM is common in the
"bigger" devices. Most of the time, when I have to choose, I'm better
off moving things into flash instead of keeping them in RAM.

> section. For example,
>
> struct f
> {
>   int a;
>   int b;
> };
>
> int func (struct f *p)
> {
>   return p->b;
> }
>
> would currently compile to something like
>
>     ldr r0, [r0, #4]
>     bx lr
>
> but with the new, shorter, pointer you'd end up with
>
>     add r0, r_seg, r0
>     ldr r0, [r0, #4]
>     bx lr
>
> In some cases it might be even worse as you'd end up with
> zero-extensions of the pointer values as well.

Yes, it would be like this. Only I don't think that this case would be
a real problem, since CPUs usually have those zero-extending register
loads already within their instruction sets. So in reality the cost will
be a single ADD instruction before the dereference, because the segment
register would be initialized outside the loops.

-R
* Re: wishlist: support for shorter pointers 2023-06-28 15:44 ` Richard Earnshaw (lists) 2023-06-28 16:07 ` Martin Uecker 2023-06-28 16:48 ` Rafał Pietrak @ 2023-06-29 6:19 ` Rafał Pietrak 2023-07-03 15:07 ` Ian Lance Taylor 2 siblings, 1 reply; 54+ messages in thread From: Rafał Pietrak @ 2023-06-29 6:19 UTC (permalink / raw) To: Richard Earnshaw (lists), Martin Uecker, gcc

Hi Richard,

On 28.06.2023 at 17:44, Richard Earnshaw (lists) wrote:
[-----------]
> I think I understand what you're asking for but:
> 1) You'd need a new ABI specification to handle this, probably involving
> register assignments (for the 'segment' addresses), the initialization
> of those at startup, assembler and linker extensions to allow for
> relocations describing the symbols, etc.

I was thinking about that, and it doesn't look like it requires such deep
rewrites. An ABI spec that could accommodate the functionality could be as
little as one additional attribute on linker segments. Please consider:

1. having that additional attribute (say "funny-ptr") on a segment.

2. ... from the linker one would require only:
   2.a) raising an error (fail) if one object has a segment name WITH
        that attribute, and another object has the same segment WITHOUT it.
   2.b) raising an error (fail) if the resulting output segment would be
        larger than "max" (normally, max=64kB).

3. the assembler would only need to be able to declare a segment with the
   new attribute.

4. almost all the implementation changes are within the CC. Those changes
   can be broken down into a couple of scenarios:
   4.a) for the following explanation, instead of
        __attribute__(section()), I will use the <FP> shortcut.
   4.b) assignment of "normal" to "funny" (char* <FP> x; char* y; x = y);
        here the compiler would have to subtract the segment base address
        before depositing the value at "&x", but for subsequent use of
        "x", the compiler does NOT need to do anything.
   4.c) reverse assignment (y = x); here the compiler does nothing
        special, it just uses the "current/adjusted" value of "x".
   4.d) comparison (x == y); the compiler does nothing special - at this
        point, some register will already have the "current/adjusted"
        value of "x".
   4.e) comparison to NULL (x; !x; x == NULL); those tests have to be done
        on the "unadjusted" value of "x", so if the register containing
        "x" is already adjusted, the base address will have to be
        subtracted from "x" before the test. And it may be good to take
        special care with loops (like "for()"). In case the loop looks
        like this: "for(;x; x = x->next)" the test is to be done before
        adjusting the pointer "x" by the segment base address for the next
        loop cycle, so there is no penalty for that test.

I hope I didn't omit any important cases. If so, it doesn't look huge.

Now, I'm NOT trying to persuade anybody that it's "simple" and thus
should be worth doing. I'm just doing some "intellectual exercise" in
analyzing the challenge.

-R
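Scenarios 4.b to 4.e above can be emulated by hand in plain C, which may help to see what the compiler would be asked to generate. This is only a sketch under stated assumptions, not a proposal for the GCC implementation: the names `segment`, `fp_t`, `fp_store`, `fp_load` and `fp_is_null` are hypothetical, a static array stands in for the "funny-ptr" linker segment, and offset 0 is reserved to represent NULL (the message does not say how NULL would be encoded).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a linker segment marked "funny-ptr" (max 64 kB). */
#define SEG_SIZE 1024u
static char segment[SEG_SIZE];

/* A "funny" pointer: what is kept in memory is 16 bits, not a full pointer. */
typedef uint16_t fp_t;

/* Assumption of this sketch: offset 0 means NULL, so the first byte of the
   segment is never handed out as a valid target. */
#define FP_NULL ((fp_t)0)

/* 4.b: "normal" to "funny" - subtract the segment base before storing. */
static inline fp_t fp_store(const char *p)
{
    if (p == NULL)
        return FP_NULL;
    uintptr_t off = (uintptr_t)p - (uintptr_t)segment;
    assert(off != 0 && off < SEG_SIZE); /* target must lie inside the segment */
    return (fp_t)off;
}

/* 4.c: "funny" to "normal" - add the segment base after loading. */
static inline char *fp_load(fp_t x)
{
    return x == FP_NULL ? NULL : segment + x;
}

/* 4.e: the NULL test is done on the stored (unadjusted) value. */
static inline int fp_is_null(fp_t x)
{
    return x == FP_NULL;
}
```

With these helpers, the comparison of 4.d is simply `fp_load(x) == y`, and a loop such as `for (; !fp_is_null(x); x = ...)` tests the stored value before any base address is added.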
* Re: wishlist: support for shorter pointers 2023-06-29 6:19 ` Rafał Pietrak @ 2023-07-03 15:07 ` Ian Lance Taylor 2023-07-03 16:42 ` Rafał Pietrak 0 siblings, 1 reply; 54+ messages in thread From: Ian Lance Taylor @ 2023-07-03 15:07 UTC (permalink / raw) To: Rafał Pietrak; +Cc: Richard Earnshaw (lists), Martin Uecker, gcc

On Wed, Jun 28, 2023 at 11:21 PM Rafał Pietrak via Gcc <gcc@gcc.gnu.org> wrote:
>
> On 28.06.2023 at 17:44, Richard Earnshaw (lists) wrote:
> [-----------]
> > I think I understand what you're asking for but:
> > 1) You'd need a new ABI specification to handle this, probably involving
> > register assignments (for the 'segment' addresses), the initialization
> > of those at startup, assembler and linker extensions to allow for
> > relocations describing the symbols, etc.
>
> I was thinking about that, and it doesn't look like it requires such deep
> rewrites. An ABI spec that could accommodate the functionality could be as
> little as one additional attribute on linker segments.

If I understand correctly, you are looking for something like the x32
mode that was available for a while on x86_64 processors:
https://en.wikipedia.org/wiki/X32_ABI . That was a substantial amount
of work including changes to the compiler, assembler, linker, standard
library, and kernel. And at least to me it's never seemed
particularly popular.

Ian
* Re: wishlist: support for shorter pointers 2023-07-03 15:07 ` Ian Lance Taylor @ 2023-07-03 16:42 ` Rafał Pietrak 2023-07-03 16:57 ` Richard Earnshaw (lists) 2023-07-04 12:38 ` David Brown 0 siblings, 2 replies; 54+ messages in thread From: Rafał Pietrak @ 2023-07-03 16:42 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Richard Earnshaw (lists), Martin Uecker, gcc

Hi Ian,

On 3.07.2023 at 17:07, Ian Lance Taylor wrote:
> On Wed, Jun 28, 2023 at 11:21 PM Rafał Pietrak via Gcc <gcc@gcc.gnu.org> wrote:
[--------]
>> I was thinking about that, and it doesn't look like it requires such deep
>> rewrites. An ABI spec that could accommodate the functionality could be as
>> little as one additional attribute on linker segments.
>
> If I understand correctly, you are looking for something like the x32
> mode that was available for a while on x86_64 processors:
> https://en.wikipedia.org/wiki/X32_ABI . That was a substantial amount
> of work including changes to the compiler, assembler, linker, standard
> library, and kernel. And at least to me it's never seemed
> particularly popular.

Yes.

And Wikipedia reporting up to 40% performance improvements in some corner
cases is impressive and encouraging. I believe that the reported average
improvement of 5-8% would be significantly better within the tiny-resource
MCU environment. In the MCU world, such an improvement could decide
whether a project fits into a particular device or not.

-R
* Re: wishlist: support for shorter pointers 2023-07-03 16:42 ` Rafał Pietrak @ 2023-07-03 16:57 ` Richard Earnshaw (lists) 2023-07-03 17:34 ` Rafał Pietrak 2023-07-04 12:38 ` David Brown 1 sibling, 1 reply; 54+ messages in thread From: Richard Earnshaw (lists) @ 2023-07-03 16:57 UTC (permalink / raw) To: Rafał Pietrak, Ian Lance Taylor; +Cc: Martin Uecker, gcc

On 03/07/2023 17:42, Rafał Pietrak via Gcc wrote:
> Hi Ian,
>
> On 3.07.2023 at 17:07, Ian Lance Taylor wrote:
>> If I understand correctly, you are looking for something like the x32
>> mode that was available for a while on x86_64 processors:
>> https://en.wikipedia.org/wiki/X32_ABI . That was a substantial amount
>> of work including changes to the compiler, assembler, linker, standard
>> library, and kernel. And at least to me it's never seemed
>> particularly popular.
>
> Yes.
>
> And Wikipedia reporting up to 40% performance improvements in some corner
> cases is impressive and encouraging. I believe that the reported average
> improvement of 5-8% would be significantly better within the tiny-resource
> MCU environment. In the MCU world, such an improvement could decide
> whether a project fits into a particular device or not.
>
> -R

I think you need to be very careful when reading benchmarketing (sic)
numbers like this. Firstly, this is a 32-bit vs 64-bit measurement;
secondly, the benchmark (SPEC 2000) is very old now and IIRC was not
fully optimized for 64-bit processors (it predates the 64-bit version of
the x86 instruction set); thirdly, there are benchmarks in SPEC which
are very sensitive to cache size and the 32-bit ABI just happened to
allow them to fit enough data in the caches to make the numbers leap.

R.
* Re: wishlist: support for shorter pointers 2023-07-03 16:57 ` Richard Earnshaw (lists) @ 2023-07-03 17:34 ` Rafał Pietrak 0 siblings, 0 replies; 54+ messages in thread From: Rafał Pietrak @ 2023-07-03 17:34 UTC (permalink / raw) To: Richard Earnshaw (lists), Ian Lance Taylor; +Cc: Martin Uecker, gcc

On 3.07.2023 at 18:57, Richard Earnshaw (lists) wrote:
> On 03/07/2023 17:42, Rafał Pietrak via Gcc wrote:
>> Hi Ian,
[---------]
>> And Wikipedia reporting up to 40% performance improvements in some corner
>> cases is impressive and encouraging. I believe that the reported average
>> improvement of 5-8% would be significantly better within the tiny-resource
>> MCU environment. In the MCU world, such an improvement could decide
>> whether a project fits into a particular device or not.
>>
>> -R
>
> I think you need to be very careful when reading benchmarketing (sic)
> numbers like this. Firstly, this is a 32-bit vs 64-bit measurement;
> secondly, the benchmark (SPEC 2000) is very old now and IIRC was not
> fully optimized for 64-bit processors (it predates the 64-bit version of
> the x86 instruction set); thirdly, there are benchmarks in SPEC which
> are very sensitive to cache size and the 32-bit ABI just happened to
> allow them to fit enough data in the caches to make the numbers leap.

Yes. Sure. I am. I thought I had expressed clearly that I regard the
"fantastic 40%" as just a "corner case" - those don't usually reflect
ordinary usage. I was only highlighting the fact that a mere 5-8%
improvement can decide whether a particular design fits into a particular
device ... and consequently whether a 4k-RAM device is needed instead of
a 2k-RAM one. Tiny improvements in the performance of x64 workhorses can
become relatively huge in micros like the stm32. That's all.

-R
* Re: wishlist: support for shorter pointers 2023-07-03 16:42 ` Rafał Pietrak 2023-07-03 16:57 ` Richard Earnshaw (lists) @ 2023-07-04 12:38 ` David Brown 2023-07-04 12:57 ` Oleg Endo 2023-07-04 14:46 ` Rafał Pietrak 1 sibling, 2 replies; 54+ messages in thread From: David Brown @ 2023-07-04 12:38 UTC (permalink / raw) To: Rafał Pietrak, Ian Lance Taylor Cc: Richard Earnshaw (lists), Martin Uecker, gcc

On 03/07/2023 18:42, Rafał Pietrak via Gcc wrote:
> Hi Ian,
>
> On 3.07.2023 at 17:07, Ian Lance Taylor wrote:
>> If I understand correctly, you are looking for something like the x32
>> mode that was available for a while on x86_64 processors:
>> https://en.wikipedia.org/wiki/X32_ABI . That was a substantial amount
>> of work including changes to the compiler, assembler, linker, standard
>> library, and kernel. And at least to me it's never seemed
>> particularly popular.
>
> Yes.
>
> And Wikipedia reporting up to 40% performance improvements in some corner
> cases is impressive and encouraging. I believe that the reported average
> improvement of 5-8% would be significantly better within the tiny-resource
> MCU environment. In the MCU world, such an improvement could decide
> whether a project fits into a particular device or not.
>
> -R
>

A key difference is that using 32-bit pointers on an x86 is enough
address space for a large majority of use-cases, while even on the
smallest ARM microcontroller, 16-bit is not enough. (It's not
even enough to access all memory on larger AVR microcontrollers - the
only 8-bit device supported by mainline gcc.) So while 16 bits would
cover the address space of the RAM on a small ARM microcontroller, it
would not cover access to code/flash space (including read-only data),
IO registers, or other areas of memory-mapped memory and peripherals.
Generic low-level pointers really have to be able to access everything.

So an equivalent of x32 mode would not work at all. Really, what you
want is a 16-bit "small pointer" that is added to 0x20000000 (the base
address for RAM in small ARM devices, in case anyone following this
thread is unfamiliar with the details) to get a real data pointer. And
you'd like these small pointers to have convenient syntax and efficient
use.

I think a C++ class (or rather, class template) with inline functions is
the way to go here. gcc's optimiser will give good code, and the C++
class will let you get nice syntax to hide the messy details.

There is no good way to do this in C. Named address spaces would be a
possibility, but require quite a bit of effort and change to the
compiler to implement, and they don't give you anything that you would
not get from a C++ class.

(That's not quite true - named address spaces can, I believe, also
influence the section name used for allocation of data defined in these
spaces, which cannot be done by a C++ class.)

David
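The address-range argument above can be checked with plain integer arithmetic. The following is a minimal sketch in C, using only the 0x20000000 RAM base named in the message; the helper names `small_from_addr` and `addr_from_small` are hypothetical, and the flash and peripheral addresses mentioned afterwards (0x08000000 and 0x40000000) are typical STM32 values assumed here for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define RAM_BASE   0x20000000u /* RAM base on small ARM devices, per the message */
#define SMALL_SPAN 0x10000u    /* a 16-bit offset reaches 64 KiB */

/* Try to encode a full 32-bit address as a 16-bit offset from RAM_BASE.
   Returns false when the address is outside the reachable 64 KiB window. */
static inline bool small_from_addr(uint32_t addr, uint16_t *out)
{
    uint32_t off = addr - RAM_BASE; /* wraps to a huge value if addr < RAM_BASE */
    if (off >= SMALL_SPAN)
        return false;
    *out = (uint16_t)off;
    return true;
}

/* Expand a 16-bit "small pointer" back to the full address. */
static inline uint32_t addr_from_small(uint16_t small)
{
    return RAM_BASE + small;
}
```

An address such as 0x20001234 round-trips through the 16-bit form, while a flash address (0x08000000) or a peripheral register address (0x40000000) is rejected, which is the reason a small pointer cannot serve as the generic pointer type.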
* Re: wishlist: support for shorter pointers 2023-07-04 12:38 ` David Brown @ 2023-07-04 12:57 ` Oleg Endo 2023-07-04 14:46 ` Rafał Pietrak 1 sibling, 0 replies; 54+ messages in thread From: Oleg Endo @ 2023-07-04 12:57 UTC (permalink / raw) To: David Brown, Rafał Pietrak, Ian Lance Taylor Cc: Richard Earnshaw (lists), Martin Uecker, gcc

> I think a C++ class (or rather, class template) with inline functions is
> the way to go here. gcc's optimiser will give good code, and the C++
> class will let you get nice syntax to hide the messy details.
>
> There is no good way to do this in C. Named address spaces would be a
> possibility, but require quite a bit of effort and change to the
> compiler to implement, and they don't give you anything that you would
> not get from a C++ class.
>
> (That's not quite true - named address spaces can, I believe, also
> influence the section name used for allocation of data defined in these
> spaces, which cannot be done by a C++ class.)
>

Does the C++ template class shebang work for storing "short code pointers"
for things like compile-time/link-time generated function tables? Haven't
tried it myself, but somehow I doubt it.

Cheers,
Oleg
* Re: wishlist: support for shorter pointers 2023-07-04 12:38 ` David Brown 2023-07-04 12:57 ` Oleg Endo @ 2023-07-04 14:46 ` Rafał Pietrak 2023-07-04 15:55 ` David Brown 2023-07-04 22:57 ` Martin Uecker 1 sibling, 2 replies; 54+ messages in thread From: Rafał Pietrak @ 2023-07-04 14:46 UTC (permalink / raw) To: David Brown, Ian Lance Taylor Cc: Richard Earnshaw (lists), Martin Uecker, gcc Hi, W dniu 4.07.2023 o 14:38, David Brown pisze: [---------] > A key difference is that using 32-bit pointers on an x86 is enough > address space for a large majority of use-cases, while even on the > smallest small ARM microcontroller, 16-bit is not enough. (It's not > even enough to access all memory on larger AVR microcontrollers - the > only 8-bit device supported by mainline gcc.) So while 16 bits would > cover the address space of the RAM on a small ARM microcontroller, it > would not cover access to code/flash space (including read-only data), > IO registers, or other areas of memory-mapped memory and peripherals. > Generic low-level pointers really have to be able to access everything. Naturaly 16-bit is "most of the time" not enough to cover the entire workspace on even the smallest MCU (AVR being the only close to an exception here), but in my little experience, that is not really necessary. Meaning "generic low-level pointers really have to...", I don't think so. I really don't. Programs often manipulate quite "localized" data, and compiler is capable enough to distinguish and keep separate pointers of different "domains". What makes it currently impossible is tools (semantic constructs like pragma or named sections) that would let it happen. > > So an equivalent of x32 mode would not work at all. Really, what you > want is a 16-bit "small pointer" that is added to 0x20000000 (the base > address for RAM in small ARM devices, in case anyone following this > thread is unfamiliar with the details) to get a real data pointer. 
And > you'd like these small pointers to have convenient syntax and efficient > use. More or less, yes. But "with a twist". A "compiler construct" sufficient to get the RAM savings I'm aiming at could be "reduced" to the ability to create a "medium-size" array of "some objects" and have them reference each other, all WITHIN that "array". In my earlier emails I referred to that array as a segment or section. So whenever a programmer writes a construct like: struct test_s attribute((small-and-funny)) { struct test_s attribute((small-and-funny)) *next, *prev, *head; struct test_s attribute((small-and-funny)) *user, *group; } repository[1000]; struct test_s attribute((small-and-funny)) *master, *trash; the compiler puts that data into that small array (dedicated section), so no "generic low-level pointers" referring to that data would need to exist within the program. And if one does appear, an error is thrown (or an automatic conversion happens). > > I think a C++ class (or rather, class template) with inline functions is > the way to go here. gcc's optimiser will give good code, and the C++ > class will let you get nice syntax to hide the messy details. OK. Thanks for the advice, but moving to C++ is a major step for me and (at least for the time being) I'll stay with ordinary "big" pointers in plain C instead. > There is no good way to do this in C. Named address spaces would be a > possibility, but require quite a bit of effort and change to the > compiler to implement, and they don't give you anything that you would > not get from a C++ class. Yes, named address spaces would be great. And for code, too. > (That's not quite true - named address spaces can, I believe, also > influence the section name used for allocation of data defined in these > spaces, which cannot be done by a C++ class.) OK. -R ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: wishlist: support for shorter pointers 2023-07-04 14:46 ` Rafał Pietrak @ 2023-07-04 15:55 ` David Brown 2023-07-04 16:20 ` Rafał Pietrak 2023-07-04 22:57 ` Martin Uecker 1 sibling, 1 reply; 54+ messages in thread From: David Brown @ 2023-07-04 15:55 UTC (permalink / raw) To: Rafał Pietrak, Ian Lance Taylor Cc: Richard Earnshaw (lists), Martin Uecker, gcc On 04/07/2023 16:46, Rafał Pietrak wrote: > Hi, > > On 4.07.2023 at 14:38, David Brown wrote: > [---------] >> A key difference is that using 32-bit pointers on an x86 is enough >> address space for a large majority of use-cases, while even on the >> smallest small ARM microcontroller, 16-bit is not enough. (It's not >> even enough to access all memory on larger AVR microcontrollers - the >> only 8-bit device supported by mainline gcc.) So while 16 bits would >> cover the address space of the RAM on a small ARM microcontroller, it >> would not cover access to code/flash space (including read-only data), >> IO registers, or other areas of memory-mapped memory and peripherals. >> Generic low-level pointers really have to be able to access everything. > > Naturaly 16-bit is "most of the time" not enough to cover the entire > workspace on even the smallest MCU (AVR being the only close to an > exception here), but in my little experience, that is not really > necessary. (Most MSP430 devices, also supported by GCC, are also covered by a 16-bit address space.) > Meaning "generic low-level pointers really have to...", I > don't think so. I really don't. Programs often manipulate quite > "localized" data, and compiler is capable enough to distinguish and keep > separate pointers of different "domains". What makes it currently > impossible is tools (semantic constructs like pragma or named sections) > that would let it happen. > No, generic low-level pointers /do/ have to work with all reasonable address spaces on the device.
A generic pointer has to support pointing to modifiable ram, to constant data (flash on small microcontrollers), to IO registers, etc. If you want something that can access a specific, restricted area, then it is a specialised pointer - not a generic one. C has no support for making your own pointer types, but C++ does. >> >> So an equivalent of x32 mode would not work at all. Really, what you >> want is a 16-bit "small pointer" that is added to 0x20000000 (the base >> address for RAM in small ARM devices, in case anyone following this >> thread is unfamiliar with the details) to get a real data pointer. >> And you'd like these small pointers to have convenient syntax and >> efficient use. > > more or less yes. But "with a twist". A "compiler construct" that would > be (say) sufficient to get the RAM-savings/optimization I'm aiming at > could be "reduced" to the ability to create "medium-size" array of "some > objects" and have them reference each other all WITHIN that "array". > That array was in my earlier emails referred to as segment or section. > So whenever a programmer writes a construct like: > > struct test_s attribute((small-and-funny)) { > struct test_s attribute((small-and-funny)) *next, *prev, *head; > struct test_s attribute((small-and-funny)) *user, *group; > } repository[1000]; > struct test_s attribute((small-and-funny)) *master, *trash; > > compiler puts that data into that small array (dedicated section), so no > "generic low-level pointers" referring that data would need to exist > within the program. And if it happens, error is thrown (or > autoconversion happen). > GCC attributes for sections already exist. And again - indices will give you what you need here more efficiently than pointers. All of your pointers can be converted to "repository[i]" format. (And if your repository has no more than 256 entries, 8-bit indices will be sufficient.) 
It can be efficient to store pointers to the entries in local variables if you are using them a lot, though GCC will do a fair amount of that automatically. >> >> I think a C++ class (or rather, class template) with inline functions >> is the way to go here. gcc's optimiser will give good code, and the >> C++ class will let you get nice syntax to hide the messy details. > > OK. Thenx for the advice, but going into c++ is a major thing for me and > (at least for the time being) I'll stay with ordinary "big" pointers in > plain C instead. > >> There is no good way to do this in C. Named address spaces would be a >> possibility, but require quite a bit of effort and change to the >> compiler to implement, and they don't give you anything that you would >> not get from a C++ class. > > Yes. named address spaces would be great. And for code, too. > It is good to have a wishlist (and you can file a wishlist "bug" in the gcc bugzilla, so that it won't be forgotten). But it is also good to be realistic. Indices will give you what you need in terms of space efficiency, but will be messier in the syntax. A small pointer class will give you efficient code and neat syntax, but require C++. These two solutions will, however, work today. (And they are both target independent.) David >> (That's not quite true - named address spaces can, I believe, also >> influence the section name used for allocation of data defined in >> these spaces, which cannot be done by a C++ class.) > > OK. > > -R ^ permalink raw reply [flat|nested] 54+ messages in thread
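The index approach described above can be sketched in plain C roughly as follows. This is only an illustration under assumptions: the names `idx_t`, `NIL`, `node()`, `make_single()`, `insert_after()` and `list_length()` are made up for the example and do not come from the thread; only `struct test_s` and `repository[1000]` mirror the earlier message.

```c
#include <assert.h>
#include <stdint.h>

#define REPO_SIZE 1000u
#define NIL UINT16_MAX              /* hypothetical sentinel playing the role of NULL */

typedef uint16_t idx_t;             /* 16-bit index instead of a 32-bit pointer */

struct test_s {
    idx_t next, prev, head;
    idx_t user, group;
};

static_assert(REPO_SIZE < NIL, "every valid index must differ from NIL");

static struct test_s repository[REPO_SIZE];

/* Turn an index into an ordinary pointer for local, short-lived use. */
static inline struct test_s *node(idx_t i)
{
    assert(i < REPO_SIZE);
    return &repository[i];
}

/* Start a one-element list at index i. */
static void make_single(idx_t i)
{
    struct test_s *n = node(i);
    n->next = n->prev = NIL;
    n->head = i;
    n->user = n->group = NIL;
}

/* Insert element n directly after element pos in a doubly linked list. */
static void insert_after(idx_t pos, idx_t n)
{
    idx_t old_next = node(pos)->next;

    node(n)->prev = pos;
    node(n)->next = old_next;
    node(n)->head = node(pos)->head;
    node(pos)->next = n;
    if (old_next != NIL)
        node(old_next)->prev = n;
}

/* Count the elements reachable through next, starting at first. */
static unsigned list_length(idx_t first)
{
    unsigned len = 0;

    for (idx_t i = first; i != NIL; i = node(i)->next)
        len++;
    return len;
}
```

With five 16-bit fields each element takes 10 bytes, where five 32-bit pointers would take 20. The cost is the `node(i)` call wherever the original code would have written a plain pointer dereference.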
* Re: wishlist: support for shorter pointers 2023-07-04 15:55 ` David Brown @ 2023-07-04 16:20 ` Rafał Pietrak 0 siblings, 0 replies; 54+ messages in thread From: Rafał Pietrak @ 2023-07-04 16:20 UTC (permalink / raw) To: David Brown, Ian Lance Taylor Cc: Richard Earnshaw (lists), Martin Uecker, gcc On 4.07.2023 at 17:55, David Brown wrote: > On 04/07/2023 16:46, Rafał Pietrak wrote: [----------] >> >> Yes. named address spaces would be great. And for code, too. >> > > It is good to have a wishlist (and you can file a wishlist "bug" in the > gcc bugzilla, so that it won't be forgotten). But it is also good to be > realistic. Indices will give you what you need in terms of space > efficiency, but will be messier in the syntax. A small pointer class > will give you efficient code and neat syntax, but require C++. These > two solutions will, however, work today. (And they are both target > independent.) OK. Eventually I may invest in C++. For now, thanks for the discussion and for pointing me in the most promising directions. See you. -R ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: wishlist: support for shorter pointers 2023-07-04 14:46 ` Rafał Pietrak 2023-07-04 15:55 ` David Brown @ 2023-07-04 22:57 ` Martin Uecker 2023-07-05 5:26 ` Rafał Pietrak 1 sibling, 1 reply; 54+ messages in thread From: Martin Uecker @ 2023-07-04 22:57 UTC (permalink / raw) To: Rafał Pietrak, David Brown, Ian Lance Taylor Cc: Richard Earnshaw (lists), gcc On Tuesday, 04.07.2023 at 16:46 +0200, Rafał Pietrak wrote:... > > > > I think a C++ class (or rather, class template) with inline functions is > > the way to go here. gcc's optimiser will give good code, and the C++ > > class will let you get nice syntax to hide the messy details. > > OK. Thenx for the advice, but going into c++ is a major thing for me and > (at least for the time being) I'll stay with ordinary "big" pointers in > plain C instead. Depending on what you are doing, "nice syntax" may not be worth dealing with C++ issues. But this depends a lot on circumstances. If the space savings are really valuable, I would personally just wrap accesses with a macro. > > There is no good way to do this in C. Named address spaces would be a > > possibility, but require quite a bit of effort and change to the > > compiler to implement, and they don't give you anything that you would > > not get from a C++ class. > > Yes. named address spaces would be great. And for code, too. > While certainly some work, implementation effort for new kinds of named address spaces does not seem to be terrible at first glance: https://gcc.gnu.org/onlinedocs/gccint/target-macros/adding-support-for-named-address-spaces.html > Martin ^ permalink raw reply [flat|nested] 54+ messages in thread
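One possible reading of "wrap accesses with a macro", sketched in plain C. Everything named here (`sptr_t`, `SP_BASE`, `SP_NULL`, `SP_ENCODE`, `SP_DECODE`, `struct item`, `push_front`, `sum_list`) is an assumption made for illustration and is not taken from the thread. The example uses the address of a static array as the base so that it also runs on a host; on a small ARM part the base could instead be the constant RAM start 0x20000000 mentioned earlier in the thread. The integer/pointer round trip is implementation-defined, which is fine for GCC on a flat address space.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t sptr_t;            /* 16-bit "short pointer": byte offset from SP_BASE */

struct item {
    sptr_t  next;                   /* 2 bytes instead of a 4- or 8-byte pointer */
    int16_t value;
};

static struct item items[100];      /* all objects reachable through short pointers */

/* The value 0xFFFF is reserved as the null short pointer. */
static_assert(sizeof items < 0xFFFF, "region must be addressable with 16 bits");

#define SP_BASE             ((uintptr_t)items)
#define SP_NULL             ((sptr_t)0xFFFF)
#define SP_ENCODE(p)        ((sptr_t)((uintptr_t)(p) - SP_BASE))
#define SP_DECODE(type, sp) ((type *)(SP_BASE + (sp)))

/* Push it onto the front of the singly linked list headed by *head. */
static void push_front(sptr_t *head, struct item *it)
{
    it->next = *head;
    *head = SP_ENCODE(it);
}

/* Walk the list and add up the values. */
static int sum_list(sptr_t head)
{
    int sum = 0;

    for (sptr_t sp = head; sp != SP_NULL; sp = SP_DECODE(struct item, sp)->next)
        sum += SP_DECODE(struct item, sp)->value;
    return sum;
}
```

The macros keep the encoding in one place, so switching the base or the width later touches only those definitions; the price is that every access has to be spelled through `SP_DECODE` by hand.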
* Re: wishlist: support for shorter pointers 2023-07-04 22:57 ` Martin Uecker @ 2023-07-05 5:26 ` Rafał Pietrak 2023-07-05 7:29 ` Martin Uecker 0 siblings, 1 reply; 54+ messages in thread From: Rafał Pietrak @ 2023-07-05 5:26 UTC (permalink / raw) To: Martin Uecker, David Brown, Ian Lance Taylor Cc: Richard Earnshaw (lists), gcc Hi, On 5.07.2023 at 00:57, Martin Uecker wrote: > On Tuesday, 04.07.2023 at 16:46 +0200, Rafał Pietrak wrote:... [--------] >> >> Yes. named address spaces would be great. And for code, too. >> > > While certainly some work, implementation effort for > new kinds of named address spaces does not seem to be > terrible at first glance: > > https://gcc.gnu.org/onlinedocs/gccint/target-macros/adding-support-for-named-address-spaces.html Oh! I see. This is good news. Although that internals documentation is complete black magic to me and I cannot make heads or tails of it, the surrounding comments sound promising... as if GCC 13 already has the internal "machinery" supporting named address spaces and only the CPU-specific code is missing (for all but the "SPU port"). Is that right? And if so ... there is no mention of how it shows up for a "simple user" of GCC (as opposed to the use of that "machinery" by the creators of a particular GCC port). In other words: what should the sources look like for the compiler to do "the thing"? -R ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: wishlist: support for shorter pointers 2023-07-05 5:26 ` Rafał Pietrak @ 2023-07-05 7:29 ` Martin Uecker 2023-07-05 8:05 ` Rafał Pietrak 0 siblings, 1 reply; 54+ messages in thread From: Martin Uecker @ 2023-07-05 7:29 UTC (permalink / raw) To: Rafał Pietrak, David Brown, Ian Lance Taylor Cc: Richard Earnshaw (lists), gcc On Wednesday, 05.07.2023 at 07:26 +0200, Rafał Pietrak wrote: > Hi, > > On 5.07.2023 at 00:57, Martin Uecker wrote: > > On Tuesday, 04.07.2023 at 16:46 +0200, Rafał Pietrak wrote:... > [--------] > > > > > > Yes. named address spaces would be great. And for code, too. > > > > > > > While certainly some work, implementation effort for > > new kinds of named address spaces does not seem to be > > terrible at first glance: > > > > https://gcc.gnu.org/onlinedocs/gccint/target-macros/adding-support-for-named-address-spaces.html > > Oh! I see. this is good news. Although that internals documentation is > complete black magic to me and I cannot tell heads from tails in it, the > surrounding comments sound promising... like GCC-13 actually had the > internal "machinery" supporting named address spaces and just > cpu-platform specific code is missing (for all but "SPU port"). Is that > right? It seems so. I would need to do more research. > > And if it's so ... there is no mention of how does it show up for > "simple user" of the GCC (instead of the use of that "machinery" by > creators of particular GCC port). In other words: how the sources should > look like for the compiler to do "the thing"? > Not sure I understand the question. You would add an address space to an object as a qualifier and then the object would be allocated in a special (small) region of memory. Pointers known to point into that special region of memory (which is encoded into the type) would then be smaller. At least, this is my understanding of how it could work. Martin ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: wishlist: support for shorter pointers 2023-07-05 7:29 ` Martin Uecker @ 2023-07-05 8:05 ` Rafał Pietrak 2023-07-05 9:11 ` David Brown 2023-07-05 9:29 ` Martin Uecker 0 siblings, 2 replies; 54+ messages in thread From: Rafał Pietrak @ 2023-07-05 8:05 UTC (permalink / raw) To: Martin Uecker, David Brown, Ian Lance Taylor Cc: Richard Earnshaw (lists), gcc Hi, On 5.07.2023 at 09:29, Martin Uecker wrote: > On Wednesday, 05.07.2023 at 07:26 +0200, Rafał Pietrak wrote: [-------] >> And if it's so ... there is no mention of how does it show up for >> "simple user" of the GCC (instead of the use of that "machinery" by >> creators of particular GCC port). In other words: how the sources should >> look like for the compiler to do "the thing"? >> > > Not sure I understand the question. You would add a name space > to an object as a qualifier and then the object would be allocated > in a special (small) region of memory. Pointers known to point > into that special region of memory (which is encoded into the > type) would then be smaller. At least, this is my understanding > of how it could work. Apparently you do understand my question. Then again ... apparently you are guessing the answer. Incidentally, that would be my guess, too. And while such "syntax" is not really desirable (since such attribution at every declaration of every "short pointer" variable would significantly obfuscate the sources and a thing like "#pragma" at the top of a file would do a better job), better something than nothing. Then again, should you happen to come across actual documentation of the syntax for using this feature, I'd appreciate you sharing it :) -R ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: wishlist: support for shorter pointers 2023-07-05 8:05 ` Rafał Pietrak @ 2023-07-05 9:11 ` David Brown 2023-07-05 9:25 ` Martin Uecker 2023-07-05 9:42 ` Rafał Pietrak 2023-07-05 9:29 ` Martin Uecker 1 sibling, 2 replies; 54+ messages in thread From: David Brown @ 2023-07-05 9:11 UTC (permalink / raw) To: Rafał Pietrak, Martin Uecker, Ian Lance Taylor Cc: Richard Earnshaw (lists), gcc On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote: > Hi, > > On 5.07.2023 at 09:29, Martin Uecker wrote: >> On Wednesday, 05.07.2023 at 07:26 +0200, Rafał Pietrak wrote: > [-------] >>> And if it's so ... there is no mention of how does it show up for >>> "simple user" of the GCC (instead of the use of that "machinery" by >>> creators of particular GCC port). In other words: how the sources should >>> look like for the compiler to do "the thing"? >>> >> >> Not sure I understand the question. You would add a name space >> to an object as a qualifier and then the object would be allocated >> in a special (small) region of memory. Pointers known to point >> into that special region of memory (which is encoded into the >> type) would then be smaller. At least, this is my understanding >> of how it could work. Note that this only applies to pointers declared to be of the address space specific type. If you have "__smalldata int x;" using a hypothetical new address space, then "&x" is of type "__smalldata int *" and you need to specify the address space specific pointer type to get the size advantages. (Since the __smalldata address space is a subset of the generic space, conversions between pointer types are required to work correctly.) > > Apparently you do understand my question. > > Then again ... apparently you are guessing the answer. Incidentally, > that would be my guess, too.
And while such "syntax" is not really > desirable (since such attribution at every declaration of every "short > pointer" variable would significantly obfuscate the sources and a thing > like "#pragma" at the top of a file would do a better job), better > something then nothing. Then again, should you happen to fall onto an > actual documentation of syntax to use this feature with, I'd appreciate > you sharing it :) > I am not sure if you are clear about this, but the address space definition macros here are for use in the source code for the compiler, not in user code. There is (AFAIK) no way for user code to create address spaces - you need to check out the source code for GCC, modify it to support your new address space, and build your own compiler. This is perfectly possible (it's all free and open source, after all), but it is not a minor undertaking - especially if you don't like C++! In my personal opinion (which you are all free to disregard), named address spaces were an interesting idea that failed. I was enthusiastic about a number of the extensions in TR 18037 "C Extensions to support embedded processors" when the paper was first published. As I learned more, however, I saw it was a dead-end. The features are too under-specified to be useful or portable, gave very little of use to embedded programmers, and fit badly with C. It was an attempt to standardise and generalise some of the mess of different extensions that proprietary toolchain developers had for a variety of 8-bit CISC microcontrollers that could not use standard C very effectively. But it was all too little, too late - and AFAIK none of these proprietary toolchains support it. GCC supports some of the features to some extent - a few named address spaces on a few devices, for "gnuc" only (not standard C, and not C++), and has some fixed point support for some targets (with inefficient generated code - it appears to be little more than an initial "proof of concept" implementation).
I do not think named address spaces have a future - in GCC or anywhere else. The only real use of them at the moment is for the AVR for accessing data in flash, and even then it is of limited success since it does not work in C++. I realise that learning at least some C++ is a significant step beyond learning C - but /using/ C++ classes or templates is no harder than C coding. And it is far easier, faster and less disruptive to make a C++ header library implementing such features than adding new named address spaces into the compiler itself. The one key feature that is missing is that named address spaces can affect the allocation details of data, which cannot be done with C++ classes. You could make a "small_data" class template, but variables would still need to be marked __attribute__((section(".smalldata"))) when used. I think this could be handled very neatly with one single additional feature in GCC - allow arbitrary GCC variable attributes to be specified for types, which would then be applied to any variables declared for that type. David ^ permalink raw reply [flat|nested] 54+ messages in thread
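For reference, marking variables with the existing GCC `section` attribute, as described above, looks roughly like this in C. The macro name `SMALLDATA`, the helper `repo_alloc()` and the counter `used` are hypothetical names chosen for this sketch. It assumes an ELF target; on a real microcontroller the linker script would also have to place `.smalldata` in the intended RAM region, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical convenience macro around GCC's existing section attribute. */
#define SMALLDATA __attribute__((section(".smalldata")))

struct test_s {
    uint16_t next, prev;            /* links stored as 16-bit values */
};

/* Each variable has to carry the attribute individually; it cannot be
   attached to the type, which is the limitation discussed above. */
SMALLDATA static struct test_s repository[1000];
SMALLDATA static uint16_t used;

static_assert(sizeof repository <= 65536, "region must fit 16-bit offsets");

/* Hand out the next unused element, or NULL when the array is full. */
static struct test_s *repo_alloc(void)
{
    if (used >= sizeof repository / sizeof repository[0])
        return NULL;
    return &repository[used++];
}
```

The attribute only controls placement. Pointers to these objects are still full-width, so the size saving has to come from indices or offsets stored inside the structures.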
* Re: wishlist: support for shorter pointers 2023-07-05 9:11 ` David Brown @ 2023-07-05 9:25 ` Martin Uecker 2023-07-05 11:34 ` David Brown 2023-07-05 9:42 ` Rafał Pietrak 1 sibling, 1 reply; 54+ messages in thread From: Martin Uecker @ 2023-07-05 9:25 UTC (permalink / raw) To: David Brown, Rafał Pietrak, Ian Lance Taylor Cc: Richard Earnshaw (lists), gcc On Wednesday, 05.07.2023 at 11:11 +0200, David Brown wrote: > On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote: ... > > In my personal opinion (which you are all free to disregard), named > address spaces were an interesting idea that failed. I was > enthusiastic > about a number of the extensions in TR 18037 "C Extensions to support > embedded processors" when the paper was first published. As I > learned > more, however, I saw it was a dead-end. The features are too > under-specified to be useful or portable, gave very little of use to > embedded programmers, and fit badly with C. It was an attempt to > standardise and generalise some of the mess of different extensions > that > proprietary toolchain developers had for a variety of 8-bit CISC > microcontrollers that could not use standard C very effectively. But > it > was all too little, too late - and AFAIK none of these proprietary > toolchains support it. GCC supports some of the features to some > extent > - a few named address spaces on a few devices, for "gnuc" only (not > standard C, and not C++), and has some fixed point support for some > targets (with inefficient generated code - it appears to be little > more > than an initial "proof of concept" implementation). > > I do not think named address spaces have a future - in GCC or > anywhere > else. The only real use of them at the moment is for the AVR for > accessing data in flash, and even then it is of limited success since > it > does not work in C++. Can you explain a little bit why you think it is a dead-end? It seems an elegant solution to a range of problems to me.
I have no idea how much the GCC features are actually used, but other compilers for embedded systems such as SDCC also support named address spaces. Martin ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: wishlist: support for shorter pointers 2023-07-05 9:25 ` Martin Uecker @ 2023-07-05 11:34 ` David Brown 2023-07-05 12:01 ` Martin Uecker 0 siblings, 1 reply; 54+ messages in thread From: David Brown @ 2023-07-05 11:34 UTC (permalink / raw) To: Martin Uecker, Rafał Pietrak, Ian Lance Taylor Cc: Richard Earnshaw (lists), gcc On 05/07/2023 11:25, Martin Uecker wrote: > On Wednesday, 05.07.2023 at 11:11 +0200, David Brown wrote: >> On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote: > > ... >> >> In my personal opinion (which you are all free to disregard), named >> address spaces were an interesting idea that failed. I was >> enthusiastic >> about a number of the extensions in TR 18037 "C Extensions to support >> embedded processors" when the paper was first published. As I >> learned >> more, however, I saw it was a dead-end. The features are too >> under-specified to be useful or portable, gave very little of use to >> embedded programmers, and fit badly with C. It was an attempt to >> standardise and generalise some of the mess of different extensions >> that >> proprietary toolchain developers had for a variety of 8-bit CISC >> microcontrollers that could not use standard C very effectively. But >> it >> was all too little, too late - and AFAIK none of these proprietary >> toolchains support it. GCC supports some of the features to some >> extent >> - a few named address spaces on a few devices, for "gnuc" only (not >> standard C, and not C++), and has some fixed point support for some >> targets (with inefficient generated code - it appears to be little >> more >> than an initial "proof of concept" implementation). >> >> I do not think named address spaces have a future - in GCC or >> anywhere >> else. The only real use of them at the moment is for the AVR for >> accessing data in flash, and even then it is of limited success since >> it >> does not work in C++. > > Can you explain a little bit why you think it is a dead-end?
It > seems an elegant solution to a range of problems to me. Named address spaces are not standardised in C, and I do not expect they ever will be. The TR 18037 document is not anywhere close to being of a quality that could be integrated with the C standards, even as optional features, and much of it makes no sense in practice (I have never heard of the IO stuff being implemented or used). The few compilers that implement any of it do so in different ways - the "__flash" address space in AVR GCC is slightly different from the same extension in IAR's AVR compiler. For existing compilers, there is a strong inconsistency as to whether such things are "named address spaces", "extension keywords", "type qualifiers", "attributes", or other terms, all with subtly (or not so subtly) different effects on how they are used, what restrictions exist, conversions between types, and how errors can be diagnosed. Sometimes these features are considered part of the data type, sometimes of pointer types, sometimes they are just about data placement. Since every compiler targeting these small awkward microcontrollers has a different idea of what something like "const __flash int x = 123;" means, and has been implementing their own ideas for a decade or two before TR 18037 ever proposed "named address spaces", the TR hasn't a hope of being a real standard. Named address spaces are not implemented at all, anywhere (AFAIK), for C++. (Some embedded toolchains have limited support for C++ on such microcontrollers, but these are again not really named address spaces.) Since C++ usage is heavily increasing in the small embedded system world, this is important. (GCC has much of the honour for that - as ARM took a bigger share of the market and GCC for ARM improved, the toolchain market was no longer at the mercy of big commercial vendors who charged absurd amounts for their C++ toolchains.) A feature which is only for C, and not supported by C++, is almost guaranteed to be dead-end.
And of course the type of processor for which named address spaces or other related extensions are essential, are a dying breed. The AVR is probably the only one with a significant future. Part of the appeal of ARM in the embedded world is it frees you from the pains of target-specific coding with some of your data in "near" memory, some in "extended" memory, some in "flash" address spaces or "IO" address spaces. It all works with standard C or C++. The same applies to challengers like RISC-V, MIPS, PPC, and any other core - you have a single flat address space for normal data. > > I have no idea how much the GCC features are actually used, > but other compilers for embedded systems such as SDCC also > support named address spaces. > And the targets supported by SDCC are also dead-end devices - there is not a single one of them that I would consider for a new project. These microcontrollers are now used almost exclusively for legacy projects - updates to existing hardware or software, and rely on compatibility with existing C extensions (whether they are called "named address spaces", "extension keywords", or anything else). Now, there are things that I would like to be able to write in my code that could appear to be candidates for some kind of named address space. For example, I might want data that is placed in an external eeprom - it could be nice to be able to define, declare, read and write it like normal data in the code. But key to this would be the ability to define the way this works in /user/ code - named address spaces require changing the toolchain, which is out of the question in almost all use-cases. And it would spoil one of C's key advantages over alternative languages - to a fair extent (though not as completely as many people believe), you can guess what is happening from the code. You assume that "x = eeprom_var;" is small and efficient, while "x = read_eeprom(eeprom_var)" might take significant time to execute. 
People don't like hidden things in their C code. It would be conceivable for GCC to add extensions that could be used to define your own named address spaces in user code. But doing so would require a lot of syntax and features that already exist in C++. I would rather see better ways of controlling placement of data added to the compiler. In another post, I suggested allowing variable attributes - such as "section" - to be attached to types. I'd also like to see a way to specify the section name by compile-time evaluated code, not just a string literal. I'd like to be able to give sub-sections, and section flags in a convenient way. And - perhaps most importantly - I'd like to be able to use #pragma's to give sections for code or data for a block of code and data at a time, rather than having to specify it individually on each function. (That could perhaps also be done by allowing section attributes on namespaces in C++.) David ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: wishlist: support for shorter pointers 2023-07-05 11:34 ` David Brown @ 2023-07-05 12:01 ` Martin Uecker 0 siblings, 0 replies; 54+ messages in thread From: Martin Uecker @ 2023-07-05 12:01 UTC (permalink / raw) To: David Brown, Rafał Pietrak, Ian Lance Taylor Cc: Richard Earnshaw (lists), gcc Thanks David! I do not think I agree with all of it (e.g. SDCC is actively developed with regular releases and supports tiny devices which are used for extreme low-power applications) and I do not personally think that only C++ counts nowadays, especially in the embedded world, but we do not need to discuss this now. I still very much appreciate your input! Note that I am involved with C standardization, but TR 18037 precedes this. Martin On Wednesday, 05.07.2023 at 13:34 +0200, David Brown wrote: > > > On 05/07/2023 11:25, Martin Uecker wrote: > > On Wednesday, 05.07.2023 at 11:11 +0200, David Brown wrote: > > > On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote: > > > > ... > > > > > > In my personal opinion (which you are all free to disregard), > > > named > > > address spaces were an interesting idea that failed. I was > > > enthusiastic > > > about a number of the extensions in TR 18037 "C Extensions to > > > support > > > embedded processors" when the paper was first published. As I > > > learned > > > more, however, I saw it was a dead-end. The features are too > > > under-specified to be useful or portable, gave very little of use > > > to > > > embedded programmers, and fit badly with C. It was an attempt to > > > standardise and generalise some of the mess of different > > > extensions > > > that > > > proprietary toolchain developers had for a variety of 8-bit CISC > > > microcontrollers that could not use standard C very effectively. > > > But > > > it > > > was all too little, too late - and AFAIK none of these > > > proprietary > > > toolchains support it.
GCC supports some of the features to some > > > extent > > > - a few named address spaces on a few devices, for "gnuc" only > > > (not > > > standard C, and not C++), and has some fixed point support for > > > some > > > targets (with inefficient generated code - it appears to be > > > little > > > more > > > than an initial "proof of concept" implementation). > > > > > > I do not think named address spaces have a future - in GCC or > > > anywhere > > > else. The only real use of them at the moment is for the AVR for > > > accessing data in flash, and even then it is of limited success > > > since > > > it > > > does not work in C++. > > > > Can you explain a little bit why you think it is a dead-end? It > > seems an elegant solution to a range of problems to me. > > Named address spaces are not standardised in C, and I do not expect > they > ever will be. The TR 18037 document is not anywhere close to being of > a > quality that could be integrated with the C standards, even as > optional > features, and much of it makes no sense in practice (I have never > heard > of the IO stuff being implemented or used). > > The few compilers that implement any of it do so in different ways - > the > "__flash" address space in AVR GCC is slightly different from the > same > extension in IAR's AVR compiler. For existing compilers, there is a > strong inconsistency as to whether such things are "named address > spaces", "extension keywords", "type qualifiers", "attributes", or > other > terms, all with subtly (or not so subtly) different effects on how > they > are used, what restrictions exist, conversions between types, and how > errors can be diagnosed. Sometimes these features are considered > part > of the data type, sometimes of pointer types, sometimes they are just > about data placement.
> > Since every compiler targeting these small awkward microcontrollers > has > a different idea of what something like "const __flash int x = 123;" > means, and has been implementing their own ideas for a decade or two > before TR18307 ever proposed "named address spaces", the TR hasn't a > hope of being a real standard. > > Named address spaces are not implemented at all, anywhere (AFAIK), > for > C++. (Some embedded toolchains have limited support for C++ on such > microcontrollers, but these are again not really named address > spaces.) > Since C++ usage is heavily increasing in the small embedded system > world, this is important. (GCC has much of the honour for that - as > ARM > took a bigger share of the market and GCC for ARM improved, the > toolchain market was no longer at the mercy of big commercial vendors > who charged absurd amounts for their C++ toolchains.) A feature > which > is only for C, and not supported by C++, is almost guaranteed to be > dead-end. > > And of course the type of processor for which named address spaces or > other related extensions are essential, are a dying breed. The AVR > is > probably the only one with a significant future. Part of the appeal > of > ARM in the embedded world is it frees you from the pains of > target-specific coding with some of your data in "near" memory, some > in > "extended" memory, some in "flash" address spaces or "IO" address > spaces. It all works with standard C or C++. The same applies to > challengers like RISC-V, MIPS, PPC, and any other core - you have a > single flat address space for normal data. > > > > > I have no idea how much the GCC features are actually used, > > but other compilers for embedded systems such as SDCC also > > support named address spaces. > > > > And the targets supported by SDCC are also dead-end devices - there > is > not a single one of them that I would consider for a new project. 
> These > microcontrollers are now used almost exclusively for legacy projects > - > updates to existing hardware or software, and rely on compatibility > with > existing C extensions (whether they are called "named address > spaces", > "extension keywords", or anything else). > > > Now, there are things that I would like to be able to write in my > code > that could appear to be candidates for some kind of named address > space. > For example, I might want data that is placed in an external eeprom > - > it could be nice to be able to define, declare, read and write it > like > normal data in the code. But key to this would be the ability to > define > the way this works in /user/ code - named address spaces require > changing the toolchain, which is out of the question in almost all > use-cases. And it would spoil one of C's key advantages over > alternative languages - to a fair extent (though not as completely as > many people believe), you can guess what is happening from the code. > You assume that "x = eeprom_var;" is small and efficient, while "x = > read_eeprom(eeprom_var)" might take significant time to execute. > People > don't like hidden things in their C code. > > It would be conceivable for GCC to add extensions that could be used > to > define your own named address spaces in user code. But doing so > would > require a lot of syntax and features that already exist in C++. > > I would rather see better ways of controlling placement of data added > to > the compiler. In another post, I suggested allowing variable > attributes > - such as "section" - to be attached to types. I'd also like to see > a > way to specify the section name by compile-time evaluated code, not > just > a string literal. I'd like to be able to give sub-sections, and > section > flags in a convenient way. 
And - perhaps most importantly - I'd like > to > be able to use #pragma's to give sections for code or data for a > block > of code and data at a time, rather than having to specify it > individually on each function. (That could perhaps also be done by > allowing section attributes on namespaces in C++.) > > > David > ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: wishlist: support for shorter pointers
  2023-07-05 9:11 ` David Brown
  2023-07-05 9:25 ` Martin Uecker
@ 2023-07-05 9:42 ` Rafał Pietrak
  2023-07-05 11:55 ` David Brown
  1 sibling, 1 reply; 54+ messages in thread
From: Rafał Pietrak @ 2023-07-05 9:42 UTC (permalink / raw)
  To: David Brown, Martin Uecker, Ian Lance Taylor
  Cc: Richard Earnshaw (lists), gcc

Hi,

On 5.07.2023 at 11:11, David Brown wrote:
> On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote:
[-----------]
>>> type) would then be smaller. At least, this is my understanding
>>> of how it could work.
>
> Note that this only applies to pointers declared to be of the address
> space specific type. If you have "__smalldata int x;" using a
> hypothetical new address space, then "&x" is of type "__smalldata int *"
> and you need to specify the address space specific pointer type to get
> the size advantages. (Since the __smalldata address space is a subset
> of the generic space, conversions between pointer types are required to
> work correctly.)

I see.

[--------]
>> thing like "#pragma" at the top of a file would do a better job),
>> better something than nothing. Then again, should you happen to fall
>> onto an actual documentation of syntax to use this feature with, I'd
>> appreciate you sharing it :)
>>
>
> I am not sure if you are clear about this, but the address space
> definition macros here are for use in the source code for the compiler,
> not in user code. There is (AFAIK) no way for user code to create
> address spaces - you need to check out the source code for GCC, modify
> it to support your new address space, and build your own compiler. This
> is perfectly possible (it's all free and open source, after all), but it
> is not a minor undertaking - especially if you don't like C++ !

Hmmm.

Wouldn't it be easier and more natural to make the "named spaces" a
synonym for specific linker sections (like section names, or a section
name prefix, so that instead of ".data.array.*" one gets
".mynamespace.array.*")?

[------]
> I realise that learning at least some C++ is a significant step beyond
> learning C - but /using/ C++ classes or templates is no harder than C
> coding. And it is far easier, faster and less disruptive to make a C++
> header library implementing such features than adding new named address
> spaces into the compiler itself.
>
> The one key feature that is missing is that named address spaces can
> affect the allocation details of data, which cannot be done with C++
> classes. You could make a "small_data" class template, but variables
> would still need to be marked __attribute__((section(".smalldata")))
> when used. I think this could be handled very neatly with one single
> additional feature in GCC - allow arbitrary GCC variable attributes to
> be specified for types, which would then be applied to any variables
> declared for that type.

OK. I see your point.

But let's have a look at it. You say that "named spaces affect
allocation details, which cannot be done with C++". Please consider:
1. for small embedded devices C++ is not particularly a "seller". We
even turn to assembler occasionally.
2. affecting allocation details is usually the whole point of
engineering skills when dealing with small embedded devices - the whole
point is to have tools to do that.

So your current objections to named spaces ... are in fact in favor of
them. Isn't it so?

-R
* Re: wishlist: support for shorter pointers
  2023-07-05 9:42 ` Rafał Pietrak
@ 2023-07-05 11:55 ` David Brown
  2023-07-05 12:25 ` Rafał Pietrak
  0 siblings, 1 reply; 54+ messages in thread
From: David Brown @ 2023-07-05 11:55 UTC (permalink / raw)
  To: Rafał Pietrak, Martin Uecker, Ian Lance Taylor
  Cc: Richard Earnshaw (lists), gcc

On 05/07/2023 11:42, Rafał Pietrak via Gcc wrote:
> Hi,
>
> On 5.07.2023 at 11:11, David Brown wrote:
>> On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote:
> [-----------]
>> I am not sure if you are clear about this, but the address space
>> definition macros here are for use in the source code for the
>> compiler, not in user code. There is (AFAIK) no way for user code to
>> create address spaces - you need to check out the source code for GCC,
>> modify it to support your new address space, and build your own
>> compiler. This is perfectly possible (it's all free and open source,
>> after all), but it is not a minor undertaking - especially if you
>> don't like C++ !
>
> Hmmm.
>
> Wouldn't it be easier and more natural to make the "named spaces" a
> synonym for specific linker sections (like section names, or a section
> name prefix, so that instead of ".data.array.*" one gets
> ".mynamespace.array.*")?

You can, of course, write :

    #define __smalldata __attribute__((section(".smalldata")))

I'd rather see the "section" attribute extended to allow it to specify a
prefix or suffix (to make subsections) than more named address spaces.

I'm a big fan of only putting things in the compiler if they have to be
there - if a feature can be expressed in code (whether it be C, C++, or
preprocessor macros), then I see that as the best choice.

>
> [------]
>> I realise that learning at least some C++ is a significant step beyond
>> learning C - but /using/ C++ classes or templates is no harder than C
>> coding. And it is far easier, faster and less disruptive to make a
>> C++ header library implementing such features than adding new named
>> address spaces into the compiler itself.
>>
>> The one key feature that is missing is that named address spaces can
>> affect the allocation details of data, which cannot be done with C++
>> classes. You could make a "small_data" class template, but variables
>> would still need to be marked __attribute__((section(".smalldata")))
>> when used. I think this could be handled very neatly with one single
>> additional feature in GCC - allow arbitrary GCC variable attributes to
>> be specified for types, which would then be applied to any variables
>> declared for that type.
>
> OK. I see your point.
>
> But let's have a look at it. You say that "named spaces affect
> allocation details, which cannot be done with C++". Please consider:
> 1. for small embedded devices C++ is not particularly a "seller". We
> even turn to assembler occasionally.

I have been writing code for small embedded systems for about 30 years.
I used to write a lot in assembly, but it is very rare now. Almost all
of the assembly I write these days is inline assembly in gcc format -
and a lot of that actually contains no assembly at all, but is for
careful control of dependencies or code re-arrangements. The smallest
device I have ever used was an AVR Tiny with no ram at all - just 2K
flash, a 3-level return stack and its 32 8-bit registers. I programmed
that in C (with gcc).

C++ /is/ a big "seller" in this market. It is definitely growing, just
as the market for commercial toolchains with non-portable extensions is
dropping and 8-bit CISC devices are being replaced by Cortex-M0 cores.
There is certainly plenty of C-only coding going on, but C++ is growing.

> 2. affecting allocation details is usually the whole point of
> engineering skills when dealing with small embedded devices - the whole
> point is to have tools to do that.
>

When you are dealing with 8-bit CISC devices like the 8051 or the COP8,
then allocation strategies are critical, and good tools are essential.
But for current microcontrollers, they are not nearly as important
because you have a single flat address space - pointers to read-only
data in flash and pointers to data in ram are fully compatible.

You do sometimes need to place particular bits of data in particular
places, but that is usually for individual large data blocks such as
putting certain buffers in non-cached memory, or a large array in
external memory. Section attributes suffice for that.

Allocation control is certainly important at times, but it's far from
being as commonly needed as you suggest.

(Dynamic allocation is a different matter, but I don't believe we are
talking about that here.)

> So your current objections to named spaces ... are in fact in favor of
> them. Isn't it so?
>

Not really, no - I would rather see better ways to handle allocation and
section control than more named address spaces.

David
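The section-attribute macro mentioned in this message needs its closing
quote after the section name. A minimal sketch of how such a macro could
be used, assuming GCC (or Clang) on an ELF target; the macro and variable
names are made up for illustration, and an upper-case name is used here
because identifiers starting with a double underscore are reserved. Note
that this only controls placement: pointers to these objects keep their
normal size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro name; the section name ".smalldata" is from the
   thread, and the closing quote after it is required. */
#define SMALLDATA __attribute__((section(".smalldata")))

/* Both objects are emitted into the ".smalldata" section. */
SMALLDATA static uint16_t small_counter = 41;
SMALLDATA static uint8_t  small_buffer[4] = { 1, 2, 3, 4 };

/* Access is ordinary C: the attribute affects placement only. */
static uint16_t bump_small_counter(void)
{
    return ++small_counter;
}

static unsigned sum_small_buffer(void)
{
    unsigned sum = 0;
    for (unsigned i = 0; i < sizeof small_buffer; i++)
        sum += small_buffer[i];
    return sum;
}
```

On a real microcontroller the linker script would still have to map
".smalldata" to the intended memory region; on a hosted ELF system the
linker simply places it alongside other data.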
* Re: wishlist: support for shorter pointers
  2023-07-05 11:55 ` David Brown
@ 2023-07-05 12:25 ` Rafał Pietrak
  2023-07-05 12:57 ` David Brown
  0 siblings, 1 reply; 54+ messages in thread
From: Rafał Pietrak @ 2023-07-05 12:25 UTC (permalink / raw)
  To: David Brown, Martin Uecker, Ian Lance Taylor
  Cc: Richard Earnshaw (lists), gcc

Hi,

On 5.07.2023 at 13:55, David Brown wrote:
> On 05/07/2023 11:42, Rafał Pietrak via Gcc wrote:
[--------------]
>> Wouldn't it be easier and more natural to make the "named spaces" a
>> synonym for specific linker sections (like section names, or a
>> section name prefix, so that instead of ".data.array.*" one gets
>> ".mynamespace.array.*")?
>
> You can, of course, write :
>
>     #define __smalldata __attribute__((section(".smalldata")))
>
> I'd rather see the "section" attribute extended to allow it to specify a
> prefix or suffix (to make subsections) than more named address spaces.

Me too. (pun not intended :)

> I'm a big fan of only putting things in the compiler if they have to be
> there - if a feature can be expressed in code (whether it be C, C++, or
> preprocessor macros), then I see that as the best choice.

Fully agree.

... almost fully. I'd rather say: "I'm a big fan of being able to tell
the compiler what I mean, but leave functionality/algorithms in libs".

And I think this has some resonance with our discussion. I think that,
as of today, C sources (through the compiler) don't have enough
"vocabulary" to tell the linker "what we mean", and so the ability to
"select flash vs RAM" or "one RAM vs another" is limited. The
compiler/linker language is pretty much limited to resolved/unresolved
names and their binding to linker segments.... like you've pointed out
above.

[--------]
>> But let's have a look at it. You say that "named spaces affect
>> allocation details, which cannot be done with C++". Please consider:
>> 1. for small embedded devices C++ is not particularly a "seller". We
>> even turn to assembler occasionally.
>
> I have been writing code for small embedded systems for about 30 years.
> I used to write a lot in assembly, but it is very rare now. Almost all
> of the assembly I write these days is inline assembly in gcc format -
> and a lot of that actually contains no assembly at all, but is for
> careful control of dependencies or code re-arrangements. The smallest
> device I have ever used was an AVR Tiny with no ram at all - just 2K
> flash, a 3-level return stack and its 32 8-bit registers. I programmed
> that in C (with gcc).

Well, similar here, although not that intensive (hobby, not profession).
Still, I've never touched the Tiny, as I wasn't able to wrap my head
around those limited resources for anything more than a blinker.

[--------------]
> Allocation control is certainly important at times, but it's far from
> being as commonly needed as you suggest.
>
> (Dynamic allocation is a different matter, but I don't believe we are
> talking about that here.)

Yes, not about that.

>
>> So your current objections to named spaces ... are in fact in favor of
>> them. Isn't it so?
>>
>
> Not really, no - I would rather see better ways to handle allocation and
> section control than more named address spaces.

Doesn't it call for "something" with which a C source (through the
compiler) can express the programmer's intention to the linker?

-R
* Re: wishlist: support for shorter pointers
  2023-07-05 12:25 ` Rafał Pietrak
@ 2023-07-05 12:57 ` David Brown
  2023-07-05 13:29 ` Rafał Pietrak
  0 siblings, 1 reply; 54+ messages in thread
From: David Brown @ 2023-07-05 12:57 UTC (permalink / raw)
  To: Rafał Pietrak, Martin Uecker, Ian Lance Taylor
  Cc: Richard Earnshaw (lists), gcc

On 05/07/2023 14:25, Rafał Pietrak wrote:
> Hi,
>
> On 5.07.2023 at 13:55, David Brown wrote:
>> On 05/07/2023 11:42, Rafał Pietrak via Gcc wrote:
[--------------]
>>> So your current objections to named spaces ... are in fact in favor
>>> of them. Isn't it so?
>>>
>>
>> Not really, no - I would rather see better ways to handle allocation
>> and section control than more named address spaces.
>
> Doesn't it call for "something" with which a C source (through the
> compiler) can express the programmer's intention to the linker?
>

Yes, I think that is fair to say. And that "something" should be more
advanced and flexible than the limited "section" attribute we have
today. But I don't think it should be "named address spaces".

My objection to named address spaces stems from two points:

1. They are compiler implementations, not user code (or library code),
which means development is inevitably much slower and less flexible.

2. They mix two concepts that are actually quite separate - how objects
are allocated, and how they are accessed.

Access to different types of object in different sorts of memory can be
done today. In C, you can use inline functions or macros. For
target-specific stuff you can use inline assembly, and GCC might have
builtins for some target-specific features. In C++, you can also wrap
things in classes if that makes more sense.

Allocation is currently controlled by "section" attributes. This is
where I believe GCC could do better, and give the user more control.
(It may be possible to develop a compiler-independent syntax here that
could become part of future C and C++ standards, but I think it will
unavoidably be heavily implementation dependent.)

All we really need is a way to combine these with types to improve user
convenience and reduce the risk of mistakes. And I believe that
allowing allocation control attributes to be attached to types would
give us that in GCC. Then it would all be user code - typedefs, macros,
functions, classes, whatever suits.

David
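The split between allocation and access described in this message can be
illustrated with the thread's original wish, shorter pointers, done
purely in user code. This is only a sketch under stated assumptions: all
names (node_ref, pool, node_ptr, and so on) are hypothetical, the "short
pointer" is a 16-bit index into one statically allocated pool, and access
goes through plain inline functions rather than any compiler feature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-bit reference used in place of a full pointer. */
typedef uint16_t node_ref;
#define NODE_NULL ((node_ref)0xFFFFu)

struct node {
    int32_t  value;
    node_ref next;      /* 2 bytes instead of sizeof(struct node *) */
};

/* Allocation: one static pool (it could also carry a section attribute). */
#define POOL_SIZE 64u
static struct node pool[POOL_SIZE];
static node_ref pool_used = 0;

/* Access: inline functions convert references to ordinary pointers. */
static inline struct node *node_ptr(node_ref r)
{
    return (r == NODE_NULL) ? NULL : &pool[r];
}

/* Take the next free slot and link it in front of "next". */
static inline node_ref node_alloc(int32_t value, node_ref next)
{
    if (pool_used >= POOL_SIZE)
        return NODE_NULL;           /* pool exhausted */
    node_ref r = pool_used++;
    pool[r].value = value;
    pool[r].next = next;
    return r;
}

/* Walk the list: the equivalent of "x = x->next" with short references. */
static inline int32_t list_sum(node_ref head)
{
    int32_t sum = 0;
    for (struct node *n = node_ptr(head); n != NULL; n = node_ptr(n->next))
        sum += n->value;
    return sum;
}
```

The cost is that every dereference goes through node_ptr(), which is
roughly the base-plus-offset addressing the original wishlist message
described, just written out by hand.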
* Re: wishlist: support for shorter pointers
  2023-07-05 12:57 ` David Brown
@ 2023-07-05 13:29 ` Rafał Pietrak
  2023-07-05 14:45 ` David Brown
  0 siblings, 1 reply; 54+ messages in thread
From: Rafał Pietrak @ 2023-07-05 13:29 UTC (permalink / raw)
  To: David Brown, Martin Uecker, Ian Lance Taylor
  Cc: Richard Earnshaw (lists), gcc

Hi,


On 5.07.2023 at 14:57, David Brown wrote:
[------------]
>
> My objection to named address spaces stems from two points:
>
> 1. They are compiler implementations, not user code (or library code),
> which means development is inevitably much slower and less flexible.
>
> 2. They mix two concepts that are actually quite separate - how objects
> are allocated, and how they are accessed.

OK. I don't see a problem here, but I admit that mixing semantics often
leads to problems.

> Access to different types of object in different sorts of memory can be
> done today. In C, you can use inline functions or macros. For
> target-specific stuff you can use inline assembly, and GCC might have
> builtins for some target-specific features. In C++, you can also wrap
> things in classes if that makes more sense.

Personally, I'd avoid inline assembly whenever possible. It does a very
good job of obfuscating programmers' intentions. From my experience, I'd
rather put entire functions into assembler if the compiler puts up
obstacles.

But that's not an issue here.

> Allocation is currently controlled by "section" attributes. This is
> where I believe GCC could do better, and give the user more control.
> (It may be possible to develop a compiler-independent syntax here that
> could become part of future C and C++ standards, but I think it will
> unavoidably be heavily implementation dependent.)

I agree.

>
> All we really need is a way to combine these with types to improve user
> convenience and reduce the risk of mistakes. And I believe that
> allowing allocation control attributes to be attached to types would
> give us that in GCC. Then it would all be user code - typedefs, macros,
> functions, classes, whatever suits.

OK. Sounds good.

Naturally I have my "wishlist": the "small pointers" segment/attribute :)

But how (and to what extent) would you do that? I mean, a convenient
syntax is desirable, but IMHO at this point there is also a question of
semantics: what exactly is the compiler supposed to tell the linker? I
think it would be good to list here the use scenarios that we know of.
Scenarios that would benefit from the compiler communicating to the
linker more than names@sections. (Even if such a list wouldn't evolve
into any implementation effort at this point, I think that would nicely
conclude this thread.)

-R
* Re: wishlist: support for shorter pointers
  2023-07-05 13:29 ` Rafał Pietrak
@ 2023-07-05 14:45 ` David Brown
  2023-07-05 16:13 ` Rafał Pietrak
  0 siblings, 1 reply; 54+ messages in thread
From: David Brown @ 2023-07-05 14:45 UTC (permalink / raw)
  To: Rafał Pietrak, Martin Uecker, Ian Lance Taylor
  Cc: Richard Earnshaw (lists), gcc

On 05/07/2023 15:29, Rafał Pietrak wrote:
> Hi,
>
>
> On 5.07.2023 at 14:57, David Brown wrote:
> [------------]
>>
>> My objection to named address spaces stems from two points:
>>
>> 1. They are compiler implementations, not user code (or library code),
>> which means development is inevitably much slower and less flexible.
>>
>> 2. They mix two concepts that are actually quite separate - how
>> objects are allocated, and how they are accessed.
>
> OK. I don't see a problem here, but I admit that mixing semantics often
> leads to problems.
>

I think it also allows better generalisation and flexibility if they are
separate. You might want careful control over where something is
allocated, but the access would be using normal instructions.
Conversely, you might not be bothered about where the data is allocated,
but want control of access (maybe you want interrupts disabled around
accesses to make it atomic).

>> Access to different types of object in different sorts of memory can
>> be done today. In C, you can use inline functions or macros. For
>> target-specific stuff you can use inline assembly, and GCC might have
>> builtins for some target-specific features. In C++, you can also wrap
>> things in classes if that makes more sense.
>
> Personally, I'd avoid inline assembly whenever possible. It does a very
> good job of obfuscating programmers' intentions. From my experience, I'd
> rather put entire functions into assembler if the compiler puts up
> obstacles.
>

I'd rather keep the assembly to a minimum, and let the compiler do what
it is good at - such as register allocation. That means extended syntax
inline assembly (but typically wrapped inside a small inline function).

> But that's not an issue here.

Agreed.

>
>> Allocation is currently controlled by "section" attributes. This is
>> where I believe GCC could do better, and give the user more
>> control. (It may be possible to develop a compiler-independent syntax
>> here that could become part of future C and C++ standards, but I think
>> it will unavoidably be heavily implementation dependent.)
>
> I agree.
>
>>
>> All we really need is a way to combine these with types to improve
>> user convenience and reduce the risk of mistakes. And I believe that
>> allowing allocation control attributes to be attached to types would
>> give us that in GCC. Then it would all be user code - typedefs,
>> macros, functions, classes, whatever suits.
>
> OK. Sounds good.
>
> Naturally I have my "wishlist": the "small pointers" segment/attribute :)
>
> But how (and to what extent) would you do that? I mean, a convenient
> syntax is desirable, but IMHO at this point there is also a question of
> semantics: what exactly is the compiler supposed to tell the linker? I
> think it would be good to list here the use scenarios that we know of.
> Scenarios that would benefit from the compiler communicating to the
> linker more than names@sections. (Even if such a list wouldn't evolve
> into any implementation effort at this point, I think that would nicely
> conclude this thread.)
>

Let me try to list some things I think might be useful (there may be
some overlap). I am not giving any particular order here.

1. Adding a prefix to section names rather than replacing them.

2. Adding a suffix to section names.

3. Constructing section names at compile time, rather than just using a
string literal. (String literals can be constructed using the
pre-processor, but that has its limitations.)

4. Pragmas to apply section names (or prefixes or suffixes) to a block
of definitions, changing the defaults.

5. Control of section flags (such as read-only, executable, etc.). At
the moment, flags are added automatically depending on what you put into
the section (code, data, read-only data). So if you want to override
these, such as to make a data section in ram that is executable (for
your JIT compiler :-) ), you need something like :

    __attribute__((section("jit_buffer,\"ax\"\n@")))

to add the flags manually, then a newline, then a line comment character
(@ for ARM, but this varies according to target.)

6. Convenient support for non-initialised non-zeroed data sections in a
standardised way, without having to specify sections manually in the
source and linker setup.

7. Convenient support for sections (or variables) placed at specific
addresses, in a standardised way.

8. Convenient support for sections that are not allocated space by the
linker in the target memory, but where the contents are still included
in the elf file and map files, where they can be read by other tools.
(This could be used for external analysis tools.)

9. Support for getting data from the linker to the code, such as section
sizes and start addresses, without having to manually add the symbols to
the linker file and declare extern symbols in the C or C++ code.

10. Support for structs (or C++ classes) where different parts of the
struct are in different sections. This would mean the struct could only
be statically allocated (no stack allocation - just global or static),
and it would have no accessible address or size (you could have pointers
to fields, but not to the struct objects themselves). This would let
you tie together objects made of multiple parts such as constant data in
flash and writeable data in ram.

11. Convenient support for building up tables where the contents are
scattered across different source files, without having to manually edit
the linker files.


Much of this can be done today, but involves manual (and therefore
error-prone) effort and inconvenience.
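As a partial illustration of items 9 and 11 above: on ELF targets, GNU ld
already provides __start_<name> and __stop_<name> symbols for any section
whose name is a valid C identifier, so a table can be collected from
several definitions without editing the linker script. The sketch below
assumes GCC with GNU ld (or a compatible linker); the section name
"cmd_table", the struct, and the macro are hypothetical, and the entries
are shown in one file only for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry; each source file could add its own. */
struct cmd_entry {
    const char *name;
    int         id;
};

/* The section name must be a valid C identifier for the linker to
   provide __start_/__stop_ symbols. The explicit alignment stops the
   compiler from over-aligning entries, which would leave gaps. */
#define CMD_ENTRY(ident, text, num)                                    \
    static const struct cmd_entry ident                                \
        __attribute__((used, section("cmd_table"),                     \
                       aligned(__alignof__(struct cmd_entry))))        \
        = { text, num }

CMD_ENTRY(cmd_help,  "help",  1);
CMD_ENTRY(cmd_reset, "reset", 2);
CMD_ENTRY(cmd_dump,  "dump",  3);

/* Supplied by the linker, not defined in any source file. */
extern const struct cmd_entry __start_cmd_table[];
extern const struct cmd_entry __stop_cmd_table[];

/* Number of entries the linker gathered into the section. */
static size_t cmd_count(void)
{
    return (size_t)(__stop_cmd_table - __start_cmd_table);
}

/* Iterate over the gathered table like an ordinary array. */
static int cmd_id_sum(void)
{
    int sum = 0;
    for (const struct cmd_entry *e = __start_cmd_table;
         e < __stop_cmd_table; e++)
        sum += e->id;
    return sum;
}
```

This covers only sections with identifier-like names and gives no control
over entry order across files, which is part of why the list asks for
more convenient, standardised support.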
* Re: wishlist: support for shorter pointers
  2023-07-05 14:45 ` David Brown
@ 2023-07-05 16:13 ` Rafał Pietrak
  2023-07-05 17:39 ` David Brown
  0 siblings, 1 reply; 54+ messages in thread
From: Rafał Pietrak @ 2023-07-05 16:13 UTC (permalink / raw)
  To: David Brown, Martin Uecker, Ian Lance Taylor
  Cc: Richard Earnshaw (lists), gcc

Hi,

On 5.07.2023 at 16:45, David Brown wrote:
> On 05/07/2023 15:29, Rafał Pietrak wrote:
[---------------]
>> OK. I don't see a problem here, but I admit that mixing semantics
>> often leads to problems.
>>
>
> I think it also allows better generalisation and flexibility if they are
> separate. You might want careful control over where something is
> allocated, but the access would be using normal instructions.
> Conversely, you might not be bothered about where the data is allocated,
> but want control of access (maybe you want interrupts disabled around
> accesses to make it atomic).

That would require the compiler to know the "semantics" of such a
section. I don't think you've listed it below; it is worth adding. If I
understand you correctly, that means the generated code varies depending
on the target section selected. This is the linker "talking" to the
compiler, if I'm not mistaken.

But OK, I see your point.

[----------]
> Let me try to list some things I think might be useful (there may be
> some overlap). I am not giving any particular order here.
>
> 1. Adding a prefix to section names rather than replacing them.

OK. +1

> 2. Adding a suffix to section names.

+1

> 3. Constructing section names at compile time, rather than just using a
> string literal. (String literals can be constructed using the
> pre-processor, but that has its limitations.)

I'm not sure what this means? At compile time, you only have literals,
so what's missing?

> 4. Pragmas to apply section names (or prefixes or suffixes) to a block
> of definitions, changing the defaults.

+1

> 5. Control of section flags (such as read-only, executable, etc.). At
> the moment, flags are added automatically depending on what you put into
> the section (code, data, read-only data). So if you want to override
> these, such as to make a data section in ram that is executable (for
> your JIT compiler :-) ), you need something like :
>
>     __attribute__((section("jit_buffer,\"ax\"\n@")))

I assume that adding an attribute should split a particular section into
"an old one" and "the new one with the new attribute", right? One would
need to have the linker logic (and linker script definitions) altered to
follow that (the other features so far wouldn't require any changes to
linkers, I think).

> to add the flags manually, then a newline, then a line comment character
> (@ for ARM, but this varies according to target.)
>
> 6. Convenient support for non-initialised non-zeroed data sections in a
> standardised way, without having to specify sections manually in the
> source and linker setup.

What do you gain with this, and under which circumstances? I mean, why
insist on keeping a memory fragment uninitialized, when initializing it
is just a one-shot action at load time?

> 7. Convenient support for sections (or variables) placed at specific
> addresses, in a standardised way.

Hmm... Frankly, I'm quite comfortable with the current features of
linker scripts, and I do it like this:

    SECTIONS
    {
      sfr_devices 0x40000000 (NOLOAD): {
        . = ALIGN(1K);
        PROVIDE(TIM2 = .);
        . = 0x00400;
        PROVIDE(TIM3 = .);
        . = 0x00800;
        PROVIDE(TIM4 = .);
      }
    }

The only problem is that so far I'm not aware of command line options to
"supplement" the default linker script with such a fragment. Option "-T"
replaces it, which is a nuisance.

> 8. Convenient support for sections that are not allocated space by the
> linker in the target memory, but where the contents are still included
> in the elf file and map files, where they can be read by other tools.
> (This could be used for external analysis tools.)

Aren't the current debugger sections just that? Extrapolating from your
words: are you thinking of sections whose content you would fully
control at compile time, where it isn't sufficient to do it like this:

    char private[] __attribute__((section("something"))) = {
        0xFF, 0x01, 0x02, ....
    };

> 9. Support for getting data from the linker to the code, such as section
> sizes and start addresses, without having to manually add the symbols to
> the linker file and declare extern symbols in the C or C++ code.

+1

> 10. Support for structs (or C++ classes) where different parts of the
> struct are in different sections. This would mean the struct could only
> be statically allocated (no stack allocation - just global or static),
> and it would have no accessible address or size (you could have pointers
> to fields, but not to the struct objects themselves). This would let
> you tie together objects made of multiple parts such as constant data in
> flash and writeable data in ram.

+1

> 11. Convenient support for building up tables where the contents are
> scattered across different source files, without having to manually edit
> the linker files.

Do you have an example where that is useful?

12. I'd supplement the list with better control over code section names,
towards something like code namespaces.

13. And data sections "growing downwards" like the stack does. This is
for more flexible planning of memory layouts on micros (limited RAM).
One class of sections you put into RAM sequentially bottom to top, the
other category from top to bottom.

-R
* Re: wishlist: support for shorter pointers
  2023-07-05 16:13 ` Rafał Pietrak
@ 2023-07-05 17:39 ` David Brown
  2023-07-06  7:00   ` Rafał Pietrak
  0 siblings, 1 reply; 54+ messages in thread
From: David Brown @ 2023-07-05 17:39 UTC
To: Rafał Pietrak, Martin Uecker, Ian Lance Taylor
Cc: Richard Earnshaw (lists), gcc

On 05/07/2023 18:13, Rafał Pietrak via Gcc wrote:
> Hi,
>
> On 5.07.2023 at 16:45, David Brown wrote:
>> On 05/07/2023 15:29, Rafał Pietrak wrote:
> [---------------]
>>> OK. I don't see a problem here, but I admit that mixing semantics
>>> often leads to problems.
>>>
>>
>> I think it also allows better generalisation and flexibility if they
>> are separate. You might want careful control over where something is
>> allocated, but the access would be using normal instructions.
>> Conversely, you might not be bothered about where the data is
>> allocated, but want control of access (maybe you want interrupts
>> disabled around accesses to make it atomic).
>
> That would require the compiler to know the "semantics" of such a
> section. I don't think you've listed it below; worth adding. If I
> understand you correctly, that means the generated code varies depending
> on the target section selected. This is the linker "talking" to the
> compiler, if I'm not mistaken.
>

No, it's about the access - not the allocation (or section). Access boils
down to a "read" function and a "write" function (or possibly several,
optimised for different sizes - C11 _Generic can make this neater, though
C++ handles it better).

>
> [----------]
>> Let me try to list some things I think might be useful (there may be
>> some overlap). I am not giving any particular order here.
>>
>> 1. Adding a prefix to section names rather than replacing them.
>
> OK. +1
>
>> 2. Adding a suffix to section names.
>
> +1
>
>> 3. Constructing section names at compile time, rather than just using
>> a string literal.
>> (String literals can be constructed using the pre-processor, but that
>> has its limitations.)
>
> I'm not sure what this means? At compile time, you only have literals,
> so what's missing?

The compiler knows a lot more than just literal values at compile time -
lots of things are "compile-time constants" without being literals that
can be used in string literals. That includes the value of static "const"
variables, and the results of calculations or "pure" function calls using
compile-time constant data. You can do a great deal more of this in C++
than in C ("static const int N = 10; int arr[N];" is valid in C++, but not
in C). Calculated section names might be useful for sections that later
need to be sorted.

To be fair, you can construct string literals by the preprocessor that
would cover many cases.

I can also add that generating linker symbols from compile-time
constructed names could be useful, to use (abuse?) the linker to find
issues across different source files. Imagine you have a microcontroller
with multiple timers, and several sources that all need to use timers. A
module that uses timer 1 could define a "using_timer_1" symbol for link
time (but with no allocation to real memory). Another module might use
timer 2 and define "using_timer_2". If a third module uses timer 1 again,
then you'd get a link-time error with two conflicting definitions of
"using_timer_1" and you'd know you have to change one of the modules.

>
>> 4. Pragmas to apply section names (or prefixes or suffixes) to a block
>> of definitions, changing the defaults.
>
> +1
>
>> 5. Control of section flags (such as read-only, executable, etc.). At
>> the moment, flags are added automatically depending on what you put
>> into the section (code, data, read-only data).
>> So if you want to override these, such as to make a data section in
>> ram that is executable (for your JIT compiler :-) ), you need
>> something like:
>>
>> __attribute__((section("jit_buffer,\"ax\"\n@")))
>
> I assume that adding an attribute should split a particular section
> into "an old one" and "the new one with the new attribute", right?

You can't have the same section name and multiple flags. But you sometimes
want to have unusual flag combinations, such as executable ram sections
for "run from ram" functions.

>
> One would need to have the linker logic (and linker script definitions)
> altered to follow that (the other features so far wouldn't require any
> changes to linkers, I think).
>
>> to add the flags manually, then a newline, then a line comment
>> character (@ for ARM, but this varies according to target.)
>>
>> 6. Convenient support for non-initialised non-zeroed data sections in
>> a standardised way, without having to specify sections manually in the
>> source and linker setup.
>
> What do you gain with this, and under which circumstances? I mean, why
> insist on keeping a memory fragment uninitialized, when initializing it
> is just a one-shot action at load time?
>

Very often you have buffers in your programs, which you want to have
statically allocated in ram (so they have a fixed address, perhaps
specially aligned, and so you have a full overview of your memory usage in
your map files), but you don't care about the contents at startup.
Clearing these to 0 is just a waste of processor time.

>> 7. Convenient support for sections (or variables) placed at specific
>> addresses, in a standardised way.
>
> Hmm... Frankly, I'm quite comfortable with the current features of the
> linker script, and I do it like this:
> SECTIONS
> {
>     sfr_devices 0x40000000 (NOLOAD): {
>         . = ALIGN(1K); PROVIDE(TIM2 = .);
>         . = 0x00400; PROVIDE(TIM3 = .);
>         .
>           = 0x00800; PROVIDE(TIM4 = .);
>     }
> }
>
> The only problem is that so far I'm not aware of any command-line option
> to "supplement" the default linker script with such a fragment. Option
> "-T" replaces it, which is a nuisance.

These are ugly and hard to maintain in practice - the most common way to
give fixed addresses is to use macros that cast the fixed address to
pointers to volatile objects and structs.

But sometimes it is nice to have sections at specific addresses, and it
would be a significant gain for most people if these could be defined
entirely in C (or C++), without editing linker files. Many embedded
toolchains support such features - "int reg @ 0x1234;", or similar syntax.
gcc has an "address" attribute for the AVR, but not as a common attribute.
(It is always annoying when one target has an attribute that would be
useful on other ports, but only exists on the one target.)

>
>> 8. Convenient support for sections that are not allocated space by the
>> linker in the target memory, but where the contents are still included
>> in the elf file and map files, where they can be read by other tools.
>> (This could be used for external analysis tools.)
>
> Isn't it so that the current debugger sections are just that?

They are, yes. But it would be useful to have non-debug user sections that
act in a similar manner - it would not confuse debuggers, and be easier
for user-written parsers to pull out of the elf or map files.

>
> Extrapolating from your words: are you thinking of sections whose
> content you would fully control at compilation, where it isn't
> sufficient to do it like this:
> char private[] __attribute__((section("something"))) = {
>     0xFF, 0x01, 0x02, ....
> };
>

You also need control of the allocation (or lack thereof). This can be
done using sections with flags and/or linker file setup, but again it
would be good to have a standardised GCC extension for it.
It is far easier for people to use a GCC attribute than to learn about the
messy details of section flags and linker files.

>> 9. Support for getting data from the linker to the code, such as
>> section sizes and start addresses, without having to manually add the
>> symbols to the linker file and declare extern symbols in the C or C++
>> code.
>
> +1
>
>> 10. Support for structs (or C++ classes) where different parts of the
>> struct are in different sections. This would mean the struct could
>> only be statically allocated (no stack allocation - just global or
>> static), and it would have no accessible address or size (you could
>> have pointers to fields, but not to the struct objects themselves).
>> This would let you tie together objects made of multiple parts such as
>> constant data in flash and writeable data in ram.
>
> +1
>
>> 11. Convenient support for building up tables where the contents are
>> scattered across different source files, without having to manually
>> edit the linker files.
>
> Do you have an example where that is useful?

You might like to have a code organisation where source files could define
structures for, say, threads. Each of these would need an entry in a
thread table holding priorities, run function pointer, etc. If this table
were built up as a single section where each thread declaration
contributed its part of it, then the global thread table would be built at
link time rather than by traditional run-time setup. The advantages
include a clear static measure of the number of threads (see point 9),
clear memory usage, and smaller initialisation code. (Obviously we are
talking about statically defined threads here, not dynamically defined
threads.)

>
> 12. I'd supplement the list with better control over code section names,
> towards something like code namespaces.
>
> 13. And data sections "growing downwards" like the stack does. This is
> for more flexible planning of memory layouts on micros (limited RAM).
> One class of sections you put into RAM sequentially bottom to top, the
> other category from top to bottom.
>

Fair enough.
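For readers comparing the two styles discussed under point 7, the macro approach mentioned in this message usually looks something like the sketch below. The register layout `timer_regs_t`, the macro `SFR_BASE` and the helper functions are hypothetical (they are not taken from any vendor header); only the three base addresses come from the linker script fragment quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block; a real timer has many more registers. */
typedef struct {
    volatile uint32_t control;   /* offset 0x00 */
    volatile uint32_t status;    /* offset 0x04 */
    volatile uint32_t counter;   /* offset 0x08 */
} timer_regs_t;

/* Base addresses as in the quoted linker script (sfr_devices at 0x40000000). */
#define SFR_BASE ((uintptr_t)0x40000000u)
#define TIM2     ((timer_regs_t *)(SFR_BASE + 0x0000u))
#define TIM3     ((timer_regs_t *)(SFR_BASE + 0x0400u))
#define TIM4     ((timer_regs_t *)(SFR_BASE + 0x0800u))

/* Address of the counter register, computed without touching the hardware. */
static inline uintptr_t timer_counter_address(const timer_regs_t *timer)
{
    return (uintptr_t)timer + offsetof(timer_regs_t, counter);
}

/* Host-side sanity check of the memory map; it never dereferences anything. */
static inline void check_timer_map(void)
{
    assert((uintptr_t)TIM3 - (uintptr_t)TIM2 == 0x400u);
    assert(timer_counter_address(TIM4) == 0x40000808u);
}

/* On the target, a driver would simply write: TIM3->control = 1u; */
```

With the linker-script style, the C side would instead declare something like `extern timer_regs_t TIM2;` and let `PROVIDE(TIM2 = .)` supply the address, which is why the two camps in this thread disagree mainly about where the addresses should live, not about the generated code.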
* Re: wishlist: support for shorter pointers
  2023-07-05 17:39 ` David Brown
@ 2023-07-06  7:00 ` Rafał Pietrak
  2023-07-06 12:53   ` David Brown
  0 siblings, 1 reply; 54+ messages in thread
From: Rafał Pietrak @ 2023-07-06 7:00 UTC
To: David Brown, Martin Uecker, Ian Lance Taylor
Cc: Richard Earnshaw (lists), gcc

Hi,

On 5.07.2023 at 19:39, David Brown wrote:
[------------------]
>> I'm not sure what this means? At compile time, you only have literals,
>> so what's missing?
>
> The compiler knows a lot more than just literal values at compile time -
> lots of things are "compile-time constants" without being literals that
> can be used in string literals. That includes the value of static
> "const" variables, and the results of calculations or "pure" function

const --> created by a literal.

> calls using compile-time constant data. You can do a great deal more of

"compile time constant data" -> literal

> this in C++ than in C ("static const int N = 10; int arr[N];" is valid
> in C++, but not in C). Calculated section names might be useful for
> sections that later need to be sorted.
>
> To be fair, you can construct string literals by the preprocessor that
> would cover many cases.

OK. We are talking about convenience syntax that allows any "name" in the
C sources to be used as a "const-literal", provided it is rooted in
literals only. That's useful.

+2. :)

>
> I can also add that generating linker symbols from compile-time
> constructed names could be useful, to use (abuse?) the linker to find
> issues across different source files. Imagine you have a

+1

> microcontroller with multiple timers, and several sources that all need
> to use timers. A module that uses timer 1 could define a
[----------------------]
>>>
>>> __attribute__((section("jit_buffer,\"ax\"\n@")))
>>
>> I assume that adding an attribute should split a particular section
>> into "an old one" and "the new one with the new attribute", right?
>
> You can't have the same section name and multiple flags.
> But you sometimes want to have unusual flag combinations, such as
> executable ram sections for "run from ram" functions.

Section flags reflect the "semantics" of the section (ro vs. rw is
different semantics at that level). So how do you "merge" RAM (a section
called ".data"), one part with the "!x" flag and the other with the "x"
flag?

Conflicting flags of sections with the same name have to be taken into
consideration.

>
>>
>> One would need to have the linker logic (and linker script definitions)
>> altered to follow that (the other features so far wouldn't require any
>> changes to linkers, I think).
>>
>>> to add the flags manually, then a newline, then a line comment
>>> character (@ for ARM, but this varies according to target.)
>>>
>>> 6. Convenient support for non-initialised non-zeroed data sections in
>>> a standardised way, without having to specify sections manually in
>>> the source and linker setup.
>>
>> What do you gain with this, and under which circumstances? I mean, why
>> insist on keeping a memory fragment uninitialized, when initializing it
>> is just a one-shot action at load time?
>>
>
> Very often you have buffers in your programs, which you want to have
> statically allocated in ram (so they have a fixed address, perhaps
> specially aligned, and so you have a full overview of your memory usage
> in your map files), but you don't care about the contents at startup.
> Clearing these to 0 is just a waste of processor time.

At startup? Really? Personally I wouldn't care if I waste those cycles.
And having that explicitly "vocalized" in the sources will, I think, just
make them harder for a maintainer to read.

Otherwise, from my personal experience, it may or may not be desirable.

>
>
>>> 7. Convenient support for sections (or variables) placed at specific
>>> addresses, in a standardised way.
>>
>> Hmm... Frankly, I'm quite comfortable with the current features of the
>> linker script, and I do it like this:
>> SECTIONS
>> {
>>     sfr_devices 0x40000000 (NOLOAD): {
>>         .
>>           = ALIGN(1K); PROVIDE(TIM2 = .);
>>         . = 0x00400; PROVIDE(TIM3 = .);
>>         . = 0x00800; PROVIDE(TIM4 = .);
>>     }
>> }
>>
>> The only problem is that so far I'm not aware of any command-line
>> option to "supplement" the default linker script with such a fragment.
>> Option "-T" replaces it, which is a nuisance.
>
> These are ugly and hard to maintain in practice - the most common way to
> give fixed addresses is to use macros that cast the fixed address to
> pointers to volatile objects and structs.

Yes, I know that macros are traditionally used here, but personally I
think using them is just hideous. I've been using the above section
definitions for years and they keep my C sources nice and clean. And (in
particular with stm32) if I change the target device, I just change the
linker script and usually don't have to change the sources. That's really
nice. It's like effortless porting.

Having said that, I'm open to suggestions on how to make this better -
like having the compiler "talk to the linker" about those locations.

>
> But sometimes it is nice to have sections at specific addresses, and it
> would be a significant gain for most people if these could be defined
> entirely in C (or C++), without editing linker files. Many embedded
> toolchains support such features - "int reg @ 0x1234;", or similar
> syntax. gcc has an "address" attribute for the AVR, but not as a common
> attribute. (It is always annoying when one target has an attribute that
> would be useful on other ports, but only exists on the one target.)

Yes, I know that. Then again, (personally) I prefer to be able to tell
the compiler "-mcpu=atmega128" ... and have it select the appropriate
linker script, while NOT changing my sources, rather than doing it the
other way around.
[----------------]
>>
>> Extrapolating from your words: are you thinking of sections whose
>> content you would fully control at compilation, where it isn't
>> sufficient to do it like this:
>> char private[] __attribute__((section("something"))) = {
>>     0xFF, 0x01, 0x02, ....
>> };
>>
>
> You also need control of the allocation (or lack thereof). This can be
> done using sections with flags and/or linker file setup, but again it
> would be good to have a standardised GCC extension for it. It is far
> easier for people to use a GCC attribute than to learn about the messy
> details of section flags and linker files.

OK. But IMHO, should you move the functionality from the linker to GCC,
then all the "mess" just gets transferred upstairs. And knowing the linker
is a must if you do bare-metal programming anyway.

Still, standardization is good, good, good. But how do you standardize
something "private" by definition?

[------------]
>>> 11. Convenient support for building up tables where the contents are
>>> scattered across different source files, without having to manually
>>> edit the linker files.
>>
>> Do you have an example where that is useful?
>
> You might like to have a code organisation where source files could
> define structures for, say, threads. Each of these would need an entry
> in a thread table holding priorities, run function pointer, etc. If
> this table were built up as a single section where each thread
> declaration contributed its part of it, then the global thread table
> would be built at link time rather than by traditional run-time setup.
> The advantages include a clear static measure of the number of
> threads (see point 9), clear memory usage, and smaller initialisation
> code. (Obviously we are talking about statically defined threads here,
> not dynamically defined threads.)

I still don't get it. (Pt. 9 - sizes/locations of sections available to
the compiler? Is that relevant to this?)

Then again, I wouldn't aspire to understand everything.
If that's useful, let it be.

But I'd object to calling these constructs "a table". A programmer should
have control over how the compiler interprets his/her words. "Table" has
very well defined semantics, and to have it the way you propose ... it'd
be better to have a different name/syntax for those other objects.

-R
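For what point 11 is asking for, the closest thing available today is probably the "linker set" idiom. The following is only a minimal sketch, assuming an ELF target and a GNU ld compatible linker (which defines `__start_<name>` and `__stop_<name>` for sections whose names are valid C identifiers). The `thread_desc_t` type, the `DEFINE_THREAD` macro, the section name `thread_table` and the two example threads are made up for illustration; in a real project each `DEFINE_THREAD` would sit in a different source file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one statically defined thread. */
typedef struct {
    const char *name;
    int priority;
    void (*run)(void);
} thread_desc_t;

/* Each use emits a descriptor plus one pointer to it in section "thread_table".
   Storing pointers (as .ctors does) keeps the table entries tightly packed. */
#define DEFINE_THREAD(ident, prio, fn)                                          \
    static const thread_desc_t thread_desc_##ident = { #ident, (prio), (fn) }; \
    static const thread_desc_t *const thread_slot_##ident                      \
        __attribute__((section("thread_table"), used)) = &thread_desc_##ident

/* Provided by the linker, not defined in any source file. */
extern const thread_desc_t *const __start_thread_table[];
extern const thread_desc_t *const __stop_thread_table[];

static void blink_main(void) { }
static void uart_main(void) { }

DEFINE_THREAD(blink, 1, blink_main);
DEFINE_THREAD(uart, 3, uart_main);

/* Number of threads, known from the section bounds alone (compare point 9). */
size_t thread_count(void)
{
    return (size_t)(__stop_thread_table - __start_thread_table);
}

/* Walk the link-time table; entry order is not guaranteed. */
int highest_priority(void)
{
    int best = 0;
    for (const thread_desc_t *const *slot = __start_thread_table;
         slot < __stop_thread_table; ++slot) {
        assert(*slot != NULL);
        if ((*slot)->priority > best)
            best = (*slot)->priority;
    }
    return best;
}
```

No linker script edit is needed on a hosted ELF system, but on a bare-metal target the script would typically still need a `KEEP(*(thread_table))` line, which is exactly the manual step the wishlist item wants to get rid of.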
* Re: wishlist: support for shorter pointers
  2023-07-06  7:00 ` Rafał Pietrak
@ 2023-07-06 12:53 ` David Brown
  0 siblings, 0 replies; 54+ messages in thread
From: David Brown @ 2023-07-06 12:53 UTC
To: gcc

On 06/07/2023 09:00, Rafał Pietrak via Gcc wrote:
> Hi,
>
> On 5.07.2023 at 19:39, David Brown wrote:
> [------------------]
>>> I'm not sure what this means? At compile time, you only have
>>> literals, so what's missing?
>>
>> The compiler knows a lot more than just literal values at compile time
>> - lots of things are "compile-time constants" without being literals
>> that can be used in string literals. That includes the value of
>> static "const" variables, and the results of calculations or "pure"
>> function
>
> const --> created by a literal.

Technically in C, the only "literals" are "string literals". Something
like 1234 is an integer constant, not a literal. But I don't want to get
too deep into such standardese - especially not for C++!

Even in C, there are lots of things that are known at compile time without
being literals (or explicit constants). In many situations you can use
"constant expressions", which includes basic arithmetic on constants,
enumeration constants, etc. The restrictions on what can be used in
different circumstances are not always obvious (if you have "static const
N = 10;", then "static const M = N + 1;" is valid but "int xs[N];" is
not).

C++ has a very much wider concept of constant expressions at compile time
- many more ways to make constant expressions, and many more ways to use
them.

But even there, the compiler will know things at compile time that are not
syntactically constant in the language. (If you have code in a function
"if (x < 0) return; bool b = (x >= 0);" then the compiler can optimise in
the knowledge that "b" is a compile-time constant of "true".)

>
>> calls using compile-time constant data.
>> You can do a great deal more of
>
> "compile time constant data" -> literal
>
>> this in C++ than in C ("static const int N = 10; int arr[N];" is valid
>> in C++, but not in C). Calculated section names might be useful for
>> sections that later need to be sorted.
>>
>> To be fair, you can construct string literals by the preprocessor that
>> would cover many cases.
>
> OK. We are talking about convenience syntax that allows any "name" in
> the C sources to be used as a "const-literal", provided it is rooted in
> literals only. That's useful.
>
> +2. :)
>
>>
>> I can also add that generating linker symbols from compile-time
>> constructed names could be useful, to use (abuse?) the linker to find
>> issues across different source files. Imagine you have a
>
> +1
>
>> microcontroller with multiple timers, and several sources that all
>> need to use timers. A module that uses timer 1 could define a
> [----------------------]
>>>>
>>>> __attribute__((section("jit_buffer,\"ax\"\n@")))
>>>
>>> I assume that adding an attribute should split a particular section
>>> into "an old one" and "the new one with the new attribute", right?
>>
>> You can't have the same section name and multiple flags. But you
>> sometimes want to have unusual flag combinations, such as executable
>> ram sections for "run from ram" functions.
>
> Section flags reflect the "semantics" of the section (ro vs. rw is
> different semantics at that level). So how do you "merge" RAM (a section
> called ".data"), one part with the "!x" flag and the other with the "x"
> flag?
>
> Conflicting flags of sections with the same name have to be taken into
> consideration.
>

It doesn't make sense to merge linker input sections with conflicting
flags - this is (and should be) an error at link time. So I am not asking
for a way to make a piece of ".data" section with different flags from the
standard ".data" section - I am asking about nicer ways to make different
sections with different selections of flags.
(Input sections with different flags can be merged into one output
section, as the semantic information is lost there.)

>>
>>>
>>> One would need to have the linker logic (and linker script
>>> definitions) altered to follow that (the other features so far
>>> wouldn't require any changes to linkers, I think).
>>>
>>>> to add the flags manually, then a newline, then a line comment
>>>> character (@ for ARM, but this varies according to target.)
>>>>
>>>> 6. Convenient support for non-initialised non-zeroed data sections
>>>> in a standardised way, without having to specify sections manually
>>>> in the source and linker setup.
>>>
>>> What do you gain with this, and under which circumstances? I mean,
>>> why insist on keeping a memory fragment uninitialized, when
>>> initializing it is just a one-shot action at load time?
>>>
>>
>> Very often you have buffers in your programs, which you want to have
>> statically allocated in ram (so they have a fixed address, perhaps
>> specially aligned, and so you have a full overview of your memory
>> usage in your map files), but you don't care about the contents at
>> startup. Clearing these to 0 is just a waste of processor time.
>
> At startup? Really? Personally I wouldn't care if I waste those cycles.
>

Usually it is not an issue, but it can be for some systems. I've seen
systems where a hardware watchdog has timed out while the startup code is
clearing large buffers unnecessarily. There are also some low-power
systems that are halted until some external event triggers their reset -
you want to get to the code that checks the reset source (reset pin or
power on) as fast as possible, and you want much of your data to remain
preserved over soft resets. And maybe your buffers are allocated in
external dynamic ram which is not accessible until you have configured the
ram controller - and thereafter it is accessible as normal ram.
For one project I have at the moment, the chip's on-chip ram blocks can be
allocated individually to data tightly coupled memory, instruction tightly
coupled memory, or general-purpose ram - all at different addresses in the
memory map. You do not want anything cleared until the blocks have been
re-mapped from their default settings to their final settings.

> And having that explicitly "vocalized" in the sources will, I think,
> just make them harder for a maintainer to read.
>

It is even harder to read if it is not explicit in the C sources, but only
in the linker files!

> Otherwise, from my personal experience, it may or may not be desirable.
>
>>
>>
>>>> 7. Convenient support for sections (or variables) placed at specific
>>>> addresses, in a standardised way.
>>>
>>> Hmm... Frankly, I'm quite comfortable with the current features of
>>> the linker script, and I do it like this:
>>> SECTIONS
>>> {
>>>     sfr_devices 0x40000000 (NOLOAD): {
>>>         . = ALIGN(1K); PROVIDE(TIM2 = .);
>>>         . = 0x00400; PROVIDE(TIM3 = .);
>>>         . = 0x00800; PROVIDE(TIM4 = .);
>>>     }
>>> }
>>>
>>> The only problem is that so far I'm not aware of any command-line
>>> option to "supplement" the default linker script with such a
>>> fragment. Option "-T" replaces it, which is a nuisance.
>>
>> These are ugly and hard to maintain in practice - the most common way
>> to give fixed addresses is to use macros that cast the fixed address
>> to pointers to volatile objects and structs.
>
> Yes, I know that macros are traditionally used here, but personally I
> think using them is just hideous. I've been using the above section
> definitions for years and they keep my C sources nice and clean. And (in
> particular with stm32) if I change the target device, I just change the
> linker script and usually don't have to change the sources. That's
> really nice. It's like effortless porting.
>
> Having said that, I'm open to suggestions on how to make this better -
> like having the compiler "talk to the linker" about those locations.
>

There is always more than one way to do these things. But I believe most
programmers prefer to stick to the C (and/or C++) source files, and avoid
anything involving linker files or assembly files. We are looking for
ideas that could suit a wide range of people, not just you or me
personally :-)

>>
>> But sometimes it is nice to have sections at specific addresses, and
>> it would be a significant gain for most people if these could be
>> defined entirely in C (or C++), without editing linker files. Many
>> embedded toolchains support such features - "int reg @ 0x1234;", or
>> similar syntax. gcc has an "address" attribute for the AVR, but not
>> as a common attribute. (It is always annoying when one target has an
>> attribute that would be useful on other ports, but only exists on the
>> one target.)
>
> Yes, I know that. Then again, (personally) I prefer to be able to tell
> the compiler "-mcpu=atmega128" ... and have it select the appropriate
> linker script, while NOT changing my sources, rather than doing it the
> other way around.
>
> [----------------]
>>>
>>> Extrapolating from your words: are you thinking of sections whose
>>> content you would fully control at compilation, where it isn't
>>> sufficient to do it like this:
>>> char private[] __attribute__((section("something"))) = {
>>>     0xFF, 0x01, 0x02, ....
>>> };
>>>
>>
>> You also need control of the allocation (or lack thereof). This can
>> be done using sections with flags and/or linker file setup, but again
>> it would be good to have a standardised GCC extension for it. It is
>> far easier for people to use a GCC attribute than to learn about the
>> messy details of section flags and linker files.
>
> OK. But IMHO, should you move the functionality from the linker to GCC,
> then all the "mess" just gets transferred upstairs. And knowing the
> linker is a must if you do bare-metal programming anyway.
>

I like having my messes in one place, rather than scattered around :-)

> Still, standardization is good, good, good. But how do you standardize
> something "private" by definition?

You have to pick the right level of standardisation. I don't believe any
of this should be at the level of the C standards, for example. But I
think it should be possible to get a generalisation within GCC, so that it
is "standard" across all targets rather than having target-specific
attributes or extensions like named address spaces. It's fine for GCC to
say that this feature is only guaranteed to work for binutils gas and ld,
or compatible assemblers and linkers, with elf outputs. That gives you a
"standard" for most use-cases.

>
> [------------]
>>>> 11. Convenient support for building up tables where the contents are
>>>> scattered across different source files, without having to manually
>>>> edit the linker files.
>>>
>>> Do you have an example where that is useful?
>>
>> You might like to have a code organisation where source files could
>> define structures for, say, threads. Each of these would need an
>> entry in a thread table holding priorities, run function pointer,
>> etc. If this table were built up as a single section where each
>> thread declaration contributed its part of it, then the global
>> thread table would be built at link time rather than by traditional
>> run-time setup. The advantages include a clear static measure of the
>> number of threads (see point 9), clear memory usage, and smaller
>> initialisation code. (Obviously we are talking about statically
>> defined threads here, not dynamically defined threads.)
>
> I still don't get it. (Pt. 9 - sizes/locations of sections available to
> the compiler? Is that relevant to this?)
>
> Then again, I wouldn't aspire to understand everything. If that's
> useful, let it be.
>
> But I'd object to calling these constructs "a table". A programmer
> should have control over how the compiler interprets his/her words.
> "Table" has very well defined semantics, and to have it the way you
> propose ... it'd be better to have a different name/syntax for those
> other objects.
>

I don't think "table" /does/ have well defined semantics. But I do think
this would be a table!

When you use C++, you already get a table like this for global
constructors and other initialisation code. Sometimes the initialisation
for a variable - especially class objects where there is a non-trivial
constructor - requires some code to be run. When compiling a C++ file,
every time the compiler needs to run some initialisation code, it
generates a little function, and then makes a ".ctors.xxx" section
containing a pointer to that function. In the linker script, there is a
section like this:

    . = ALIGN(4);
    KEEP (*crtbegin.o(.ctors))
    KEEP (*(EXCLUDE_FILE (*crtend.o) .ctors))
    KEEP (*(SORT(.ctors.*)))
    KEEP (*crtend.o(.ctors))

The ".ctors" section in crtbegin.o defines a "start of constructors table"
symbol, and the matching section in crtend.o has the end symbol. Linking
collects all these constructor pointers into a table, and the C++ start-up
code can run through the table calling all the functions in order.

I want to be able to do something similar, with a convenient syntax, but
with my own choice of tables and contents.
* Re: wishlist: support for shorter pointers
  2023-07-05  8:05 ` Rafał Pietrak
  2023-07-05  9:11   ` David Brown
@ 2023-07-05  9:29 ` Martin Uecker
  2023-07-05 10:17   ` Rafał Pietrak
  1 sibling, 1 reply; 54+ messages in thread
From: Martin Uecker @ 2023-07-05 9:29 UTC
To: Rafał Pietrak, David Brown, Ian Lance Taylor
Cc: Richard Earnshaw (lists), gcc

On Wednesday, 05.07.2023 at 10:05 +0200, Rafał Pietrak wrote:
> Hi,
>
> On 5.07.2023 at 09:29, Martin Uecker wrote:
> > On Wednesday, 05.07.2023 at 07:26 +0200, Rafał Pietrak wrote:
> [-------]
> > > And if it's so ... there is no mention of how it shows up for a
> > > "simple user" of GCC (as opposed to the use of that "machinery" by
> > > the creators of a particular GCC port). In other words: what should
> > > the sources look like for the compiler to do "the thing"?
> > >
> >
> > Not sure I understand the question. You would add a name space
> > to an object as a qualifier and then the object would be allocated
> > in a special (small) region of memory. Pointers known to point
> > into that special region of memory (which is encoded into the
> > type) would then be smaller. At least, this is my understanding
> > of how it could work.
>
> Apparently you do understand my question.
>
> Then again ... apparently you are guessing the answer. Incidentally,
> that would be my guess, too. And while such "syntax" is not really
> desirable (since such attribution at every declaration of every "short
> pointer" variable would significantly obfuscate the sources, and a
> thing like "#pragma" at the top of a file would do a better job),
> better something than nothing.

If you want to mix pointers I think it would make the code clearer if the
name space is explicit. But yes, you would need to add those annotations.

But maybe one could also consider a pragma that sets a default name space
mode for some region of code in the source.
> Then again, should you happen to fall onto an > actual documentation of syntax to use this feature with, I'd > appreciate > you sharing it :) Sorry, I thought I shared this before: https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html The draft specification mentioned there can be found herE: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1275.pdf Martin ^ permalink raw reply [flat|nested] 54+ messages in thread
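The idea discussed in this message (objects living in a small region, with pointers into that region being narrower) can be emulated by hand in portable C today. The sketch below is not a GCC feature and not anyone's proposal from the thread: it stores a 16-bit byte offset from the start of a fixed pool and adds the base back on every access, which is roughly what the compiler would have to generate. The names "short_ptr", "pool", "to_short" and "from_short" are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_NODES 64
#define SHORT_NULL UINT16_MAX      /* reserved offset meaning "no object" */

typedef uint16_t short_ptr;        /* byte offset from the start of pool */

struct node {
    short_ptr next;                /* 2 bytes instead of a full pointer */
    int16_t   value;
};

/* The "small segment": every node reachable through a short_ptr
   must live inside this array. */
static struct node pool[POOL_NODES];

/* Convert a real pointer into the pool to a 16-bit offset. */
short_ptr to_short(const struct node *p)
{
    if (p == NULL)
        return SHORT_NULL;
    ptrdiff_t off = (const char *)p - (const char *)pool;
    assert(off >= 0 && off < (ptrdiff_t)sizeof pool);
    return (short_ptr)off;
}

/* Convert a 16-bit offset back to a real pointer (base + offset). */
struct node *from_short(short_ptr s)
{
    if (s == SHORT_NULL)
        return NULL;
    return (struct node *)((char *)pool + s);
}

/* Build a three-element list 1 -> 2 -> 3 inside the pool. */
short_ptr build_demo_list(void)
{
    pool[0].value = 1; pool[0].next = to_short(&pool[1]);
    pool[1].value = 2; pool[1].next = to_short(&pool[2]);
    pool[2].value = 3; pool[2].next = SHORT_NULL;
    return to_short(&pool[0]);
}

/* The equivalent of "x = x->next" with short pointers. */
int sum_list(short_ptr head)
{
    int sum = 0;
    for (struct node *n = from_short(head); n != NULL; n = from_short(n->next))
        sum += n->value;
    return sum;
}
```

The manual conversions are exactly the clutter a named address space would hide: with compiler support, the base addition would be implied by the pointer's type rather than written at every use.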
* Re: wishlist: support for shorter pointers
From: Rafał Pietrak @ 2023-07-05 10:17 UTC
To: Martin Uecker, David Brown, Ian Lance Taylor
Cc: Richard Earnshaw (lists), gcc

Hi

On 5.07.2023 at 11:29, Martin Uecker wrote:
> On Wednesday, 05.07.2023 at 10:05 +0200, Rafał Pietrak wrote:
[--------------]
>> Then again ... apparently you are guessing the answer. Incidentally,
>> that would be my guess, too. And while such "syntax" is not really
>> desirable (since such attribution at every declaration of every
>> "short pointer" variable would significantly obfuscate the sources,
>> and a thing like "#pragma" at the top of a file would do a better
>> job), better something than nothing.
>
> If you want to mix pointers I think it would make the code clearer
> if the name space is explicit. But yes, you would need to add
> those annotations.
>
> But maybe one could also consider a pragma that sets a default
> name space mode for some region of code in the source.

Yes. When there is a pragma to set a default for a section of the
sources, one also needs another pragma to "restore" the compiler default.

>
>> Then again, should you happen to come across actual documentation
>> of the syntax to use this feature with, I'd appreciate you
>> sharing it :)
>
> Sorry, I thought I shared this before:
>
> https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html

Thanks ... I've only skimmed it (so I may be wrong in what follows),
but the example for the AVR target looks ... strange. The first example
reads:

    char my_read (const __flash char ** p)
    {
        /* p is a pointer to RAM that points to a pointer to flash.
           The first indirection of p reads that flash pointer
           from RAM and the second indirection reads a char from this
           flash address. */

        return **p;
    }

Now, how can a programmer (or a compiler) possibly know that it's not
the other way around (meaning: first flash, then RAM)? ...

I know that this is probably pointless here, but if the "named
spaces" are to be fully generic, then "flash" does not necessarily
mean "read only". There may be other "types" of memory, like "Close
Coupled Memory", or some buffers dedicated to an embedded device.

So something like:

    const __flash struct test_s {
        const __flash struct test_s *m,*n;
        int a,b,c;
    };
    struct test_s *Z;  // struct is already known to be in __flash,
                       // no need to repeat that info. Still, Z itself is
                       // naturally in the default namespace - in RAM

    struct test_s * my_read (struct test_s ** p)
    {
        /* p, being passed as an argument in a register, is a pointer
           in RAM that points to a structure in __flash ... because
           that's how struct test_s is originally declared */

        return *p;
    }

is more readable to me :) and should I need to port the code to other
devices, I just take out the __flash from a single place in the
sources. Easy and painless.

Then again, to understand what the code does, I really don't need the
__flash annotation every time the structures in question appear.

>
> The draft specification mentioned there can be found here:
>
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1275.pdf

OK. Thanks,

-R
* Re: wishlist: support for shorter pointers
From: Martin Uecker @ 2023-07-05 10:48 UTC
To: Rafał Pietrak, David Brown, Ian Lance Taylor
Cc: Richard Earnshaw (lists), gcc

On Wednesday, 05.07.2023 at 12:17 +0200, Rafał Pietrak wrote:
> Hi
>
> On 5.07.2023 at 11:29, Martin Uecker wrote:
> > On Wednesday, 05.07.2023 at 10:05 +0200, Rafał Pietrak wrote:
> ...
> >
> > > Then again, should you happen to come across actual documentation
> > > of the syntax to use this feature with, I'd appreciate you
> > > sharing it :)
> >
> > Sorry, I thought I shared this before:
> >
> > https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html
>
> Thanks ... I've only skimmed it (so I may be wrong in what follows),
> but the example for the AVR target looks ... strange. The first example
> reads:
>     char my_read (const __flash char ** p)
>     {
>         /* p is a pointer to RAM that points to a pointer to flash.
>            The first indirection of p reads that flash pointer
>            from RAM and the second indirection reads a char from this
>            flash address. */
>
>         return **p;
>     }
>
> Now, how can a programmer (or a compiler) possibly know that it's
> not the other way around (meaning: first flash, then RAM)? ...

It should work exactly like qualifiers:

    const __flash char **p;   // pointer to pointer to char in flash
    const char * __flash *p;  // pointer to pointer in flash to char
    const char ** __flash p;  // pointer in flash to pointer in RAM to char

So the same rules as for 'const'.

> I know that this is probably pointless here, but if the "named
> spaces" are to be fully generic, then "flash" does not necessarily
> mean "read only". There may be other "types" of memory, like "Close
> Coupled Memory", or some buffers dedicated to an embedded device.

Yes. This would be device-specific. Although one could consider
more generic user-defined name spaces as well. This was discussed
before for security boundaries, e.g. __kernel and __user.

> So something like:
>     const __flash struct test_s {
>         const __flash struct test_s *m,*n;
>         int a,b,c;
>     };
>     struct test_s *Z;  // struct is already known to be in __flash,
>     // no need to repeat that info. Still, Z itself is naturally in
>     // the default namespace - in RAM

The name space would not automatically become part of the
struct test_s type, similar to how const would not become part
of it. But one should be able to use a typedef:

    typedef const __flash struct test_s { } test_s;

>     struct test_s * my_read (struct test_s ** p)
>     {
>         /* p, being passed as an argument in a register, is a pointer
>            in RAM that points to a structure in __flash ... because
>            that's how struct test_s is originally declared */
>
>         return *p;
>     }
> is more readable to me :) and should I need to port the code to other
> devices, I just take out the __flash from a single place in the
> sources. Easy and painless.

You could do this with a typedef as above or also with a macro:

    #ifdef ..
    #define my_namespace __flash
    #else
    #define my_namespace
    #endif

So I think portability is not a problem.

>
> Then again, to understand what the code does, I really don't need the
> __flash annotation every time the structures in question appear.

In general, I think one does: the flash pointers cannot store
pointers to arbitrary objects in RAM, so one needs to keep them
apart to avoid mistakes. If one has types which are only used
for objects in flash, one can use a typedef and then one does not
need the annotation every time.

Martin
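The typedef and macro suggestions in this message can be combined into one small portable sketch. It is an illustration under stated assumptions, not code from the thread: it relies on avr-gcc predefining the "__FLASH" macro when the "__flash" address space is available (as GCC's AVR documentation describes), and it expands to nothing on other targets so that it also builds on a host compiler. The names "MY_NAMESPACE", "flash_test_s", "leaf" and "root" are hypothetical.

```c
#include <assert.h>

/* Assumption: avr-gcc defines __FLASH when __flash is supported.
   Everywhere else the qualifier simply disappears. */
#ifdef __FLASH
#define MY_NAMESPACE __flash
#else
#define MY_NAMESPACE
#endif

/* The struct type itself carries no address space, just as it
   carries no const; only objects and pointer targets do. */
struct test_s {
    const MY_NAMESPACE struct test_s *m, *n;
    int a, b, c;
};

/* The typedef bundles the qualifiers so that they are written once. */
typedef const MY_NAMESPACE struct test_s flash_test_s;

/* Qualifiers affect where an object lives, not how large it is. */
static_assert(sizeof(flash_test_s) == sizeof(struct test_s),
              "qualifiers do not change the object size");

/* Two objects placed in the chosen name space. */
static flash_test_s leaf = { 0, 0, 1, 2, 3 };
static flash_test_s root = { &leaf, 0, 10, 20, 30 };

/* Z itself lives in RAM; what it points to lives in MY_NAMESPACE. */
flash_test_s *Z = &root;

/* p is a pointer in RAM to a pointer in RAM to a struct in MY_NAMESPACE. */
flash_test_s *my_read(flash_test_s **p)
{
    return *p;
}

/* Follow one link: both dereferences read from MY_NAMESPACE. */
int sum_via_m(flash_test_s *s)
{
    return s->a + s->m->a;
}
```

Porting to a device without a separate flash address space then means changing only the macro definition, while every declaration still states which name space it refers to.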