From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from spam02.hesby.net (spam01.hesby.net [81.29.32.152]) by sourceware.org (Postfix) with ESMTP id C4DDE3858C2C for ; Wed, 15 Sep 2021 07:20:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C4DDE3858C2C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=hesbynett.no Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=hesbynett.no Received: from [192.168.0.77] (unknown [79.161.10.130]) by spam02.hesby.net (Halon) with ESMTPSA id 5d559e70-15f5-11ec-9b5d-506b8dfa0e58; Wed, 15 Sep 2021 09:20:14 +0200 (CEST) Subject: Re: a feature to the wishlist To: =?UTF-8?Q?Rafa=c5=82_Pietrak?= , Jonathan Wakely Cc: "gcc@gcc.gnu.org" References: <34090ee9-8554-9c35-ed2a-3ded1369a4dc@ztk-rp.eu> <531eff88-cbd7-c283-dfa6-f2633b750f02@hesbynett.no> From: David Brown Message-ID: <4f019918-3c9c-0722-091f-f346efe2b92d@hesbynett.no> Date: Wed, 15 Sep 2021 09:20:10 +0200 User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.14.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-GB Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3033.5 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS, NICE_REPLY_A, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Sep 2021 07:20:20 -0000 On 14/09/2021 20:48, Rafał Pietrak wrote: > > > W dniu 13.09.2021 o 13:41, David Brown pisze: >> On 13/09/2021 11:51, Rafał Pietrak via Gcc wrote: >>> Hi, >>> >>> Thenk you for very prompt reply. >> >> >> (I'm not sure how much this is a "gcc development list" matter, rather >> than perhaps a "gcc help list" or perhaps comp.arch.embedded Usenet >> group discussion, but I'll continue here until Jonathan or other gcc >> developers say stop.) > > Thank you. I appreciate it. > > And yet, I do think, that it's about the "core" gcc - it's about a > programmer is able to "talk to gcc" without much "missunderstanding". > > But I'm not going to push it much more. I accept, that "talking to gcc" > with d->cr1 &= ~ { ... } syntax is not what majority of programmers > would like to be able to do. The gcc developers are always interested in new ideas. But they are not keen on new syntaxes or extensions - there has to be a /very/ good reason for them. The days of gcc being a maverick that makes up its own improved C are long, long gone - they'd much rather work with the C and C++ committees towards improving the languages in general, instead of having more of their own non-standard additions. So an extension like this should ideally be a proposal for adding to C23, though gcc could always implement it earlier. And while I appreciate what you are trying to do here, it is simply not general enough or important enough to justify such changes. To get a new feature implemented, it has to do something you could not do before, or do it /far/ more simply, clearly, safely or efficiently. > >> >>> >>> W dniu 13.09.2021 o 10:44, Jonathan Wakely pisze: >>>> On Mon, 13 Sept 2021 at 07:55, Rafał Pietrak via Gcc wrote: >>> [-----------------] >>>>> #elif VARIANT_WORKING >>>>> struct cr_s a = (struct cr_s) { >>>>> .re = 1, >>>>> .te = 1, >>>>> .rxneie = 1, >>>>> .txeie = 1, >>>>> .ue = 1 }; >>>>> int *b = (int *) &a; >>>>> d->cr1 &= ~(*b); >>>> >>>> This is a strict aliasing violation. You should either use a union or >>>> memcpy to get the value as an int. >>> >>> Yes, I know. I know, this is a "trick" I use (I had to use to missleed >>> gcc). >>> >> >> Don't think of it as a "trick" - think of it as a mistake. A completely >> unnecessary mistake, that will likely give you unexpected results at times : >> >> union { >> struct cr_s s; >> uint32_t raw; >> } a = {(struct cr_s) { >> .re = 1, >> .te = 1, >> .rxneie = 1, >> .txeie = 1, >> .ue = 1 } >> }; >> d->cr1 &= ~(a.raw); > > Ha! This is very nice. > > But pls note, that if contrary to my VARIANT_WORKING this actually is > kosher (and not an error, like you've said about the my "WORKING"), and > it actually looks very similar to the VARIANT_THE_BEST ... may be there > is a way to implement VARIANT_THE_BEST as "syntactic trick" leading > compiler into this semantics you've outlined above? > The issue I noticed with your "WORKING" is the bit order - that is orthogonal to the code structure, and easy to fix by re-arranging the fields in the initial bit-field. It is independent from the structure of the code. > I'm raising this questions, since CR1 as int (or better as uint32_t) is > already declared in my code. Compiler shouldn't have too hard time > weeding out struct cr_s from union embedding it. The code to convert between a bit-field of size 32-bit and a uint32_t is nothing at all, and the compiler can handle that fine :-) But you have to write the source code in a way that the conversion is well defined behaviour. When you write something that is not defined behaviour, the compiler can generate code in ways you don't expect because technically you haven't told it what you actually want. > >> >> I hope you also realise that the "VARIANT_TRADITIONAL" and >> "VARIANT_WORKING" versions of your code do different things. The ARM >> Cortex-M devices (like the STM-32) are little-endian, and the bitfields >> are ordered from the LSB. So either your list of #define's is wrong, or >> your struct cr_s bitfiled is wrong. (I haven't used that particular >> device myself, so I don't know which is wrong.) > > I'm sory. this is my mistake. I've taken a shortcut and quickly written > an at hoc '#defines' for the email. In my code I have: > enum usart_cr1_e { > USART_SBK, USART_RWU, USART_RE, USART_TE, USART_IDLEIE, > USART_RXNEIE, USART_TCIE, USART_TXEIE, USART_PEIE, USART_PS, > USART_PCE, USART_WAKE, USART_M, USART_UE, > }; > > And gcc produces exactly the same code in both variants: > 8000c56: 68d3 ldr r3, [r2, #12] > 8000c58: f423 5302 bic.w r3, r3, #8320 ; 0x2080 > 8000c5c: f023 032c bic.w r3, r3, #44 ; 0x2c > 8000c60: 60d3 str r3, [r2, #12] > > Great. >> >> Also - perhaps beside the point, but good advice anyway - for this kind >> of work you should always use the fixed-size types such as >> "uint32_t", not home-made types like "uint". And you always want >> unsigned types when doing bit operations such as bitwise complement. > > Yes. That's true. No problem here. > >> >> Using a union of the bitfield struct with a "raw" combined field is a >> common idiom, and gives you exactly what you need here. It is >> guaranteed to work in C, unlike your code (which has undefined >> behaviour). If it is important to you, you should note that it is not >> defined behaviour in C++ (though maybe gcc guarantees it will work - the >> documentation of "-fstrict-aliasing" is not clear on the matter). > > Somehow I've missed it until now, always falling back to things like > (int *) or even (void *). That's a good lesson for me. > I always view any use of "void *" with suspicion - it is telling the compiler you don't know your real type. It has its uses, of course, but I prefer to be more explicit when I can. I would never use a cast to "int *" for accessing anything other than an int, and even with "uint32_t *" I'd normally have a "volatile" in there, and probably have an access function or wrapper too. (With "volatile", the compiler will follow your instructions strictly - it lets you do messy casts safely, but can lead to inefficiencies.) >> >> As Jonathan says, a small inline function or macro (using gcc's >> extension of declarations and statements inside an expression) can be >> used for a wrapper that will work simply and efficiently while being >> safe for both C and C++ : >> >> #define raw32(x) \ >> ({ uint32_t raw; memcpy(&raw, &x, sizeof(uint32_t)); raw;}) >> >> >> struct cr_s a = (struct cr_s) { >> .re = 1, >> .te = 1, >> .rxneie = 1, >> .txeie = 1, >> .ue = 1 } >> }; >> d->cr1 &= ~raw32(a); > > May be I'll test it in some future. For the time being I'm very > satisfied with the assignment to a temporary union variable. > > but having said that, I strongly believe (particularly if union variable > is kosher), that gcc recognising VARIANT_THE_BEST as a syntax shortcut > to such union temporary variable could significantly improve code > readibility. Type-punning via unions is guaranteed to work in C, but undefined behaviour in C++. (gcc may choose to define it, however.) For the safest and most portable way of accessing raw memory, memcpy() always works. gcc (and many other compilers) fully optimise memcpy() calls that have fixed sizes like this - the code generated here is as optimal as any other solution. Some compilers, however, will give you a library call which is obviously not efficient. gcc has another extension that can also be useful for this kind of thing: typedef uint32_t __attribute__((may_alias)) uint32_a; Now you can safely use a cast to "uint32_a *", rather than "uint32_t *", and access data - gcc knows it may alias the struct : struct cr_s a = .... d->cr &= ~ *((uint32_a*) &a); > > So thank you for extended explanation, and I'm currently staying off the > list hoping that my perspective will eventually catch some traction :) > This is still copied to the list. If you want to look a bit at the list, without following everything, I recommend using the gmane.org Usenet gateway that provides a lot of mailing lists in convenient Usenet format. There is plenty of interesting discussions on the gcc mailing lists. I personally haven't done any gcc development - I'd love to do so, but haven't had the time to dig into it - but I sometimes get to help others and save the real developers some time. mvh., David > > Thank you again, > > Rafał >