From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 16621 invoked by alias); 9 Nov 2002 04:46:03 -0000
Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-prs-owner@gcc.gnu.org
Received: (qmail 16607 invoked by uid 71); 9 Nov 2002 04:46:02 -0000
Date: Fri, 08 Nov 2002 20:46:00 -0000
Message-ID: <20021109044602.16606.qmail@sources.redhat.com>
To: neil@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org
From: Byron Stanoszek
Subject: Re: c/5593: GCC miscompiles bitshifts on unsigned struct members when creating a 64-bit value
Reply-To: Byron Stanoszek
X-SW-Source: 2002-11/txt/msg00446.txt.bz2

The following reply was made to PR c/5593; it has been noted by GNATS.

From: Byron Stanoszek
To: Christian Ehrhardt
Cc: bangerth@dealii.org
Subject: Re: c/5593: GCC miscompiles bitshifts on unsigned struct members when creating a 64-bit value
Date: Fri, 8 Nov 2002 23:44:04 -0500 (EST)

On Fri, 8 Nov 2002, Christian Ehrhardt wrote:

> On Tue, Nov 05, 2002 at 04:15:03PM -0000, bangerth@dealii.org wrote:
> > Synopsis: GCC miscompiles bitshifts on unsigned struct members when creating a 64-bit value
> >
> > I can confirm this. However, I'm not sure whether what you
> > do is specified at all: it all boils down to this function:
> >   long long equation4(struct field *data)
> >   {
> >     return ((long long)data->num << 32)|
> >            (data->flags << 16)|
> >            (data->container << 8)|
> >            data->quantity;
> >   }
> > and that data->flags is a 16-bit integer. I think, shifting
> > it by 16 is invoking undefined behavior, and you should
> > not be surprised. But then, I'm not a language lawyer and
> > leave this to someone else.
>
> b) A type is promoted to int if the whole range can be represented in
>    an int no matter what the sign of the original type was (6.3.1.1[#2])
>    This is the culprit!

This does appear to be the culprit.
Modifying the function so that we have 'unsigned short flags=0x8000' or
'unsigned char flags=0x80' has the same effect in 'equation 1': the shift is
promoted to a signed int. Both pieces of code function similarly in a 64-bit
environment (e.g. Alpha), so I'm pretty much declaring this to be not a bug at
all.

Thanks for pointing out the C spec.

 -Byron

> c) According to 6.5.7[#2] it is actually unspecified what happens
>    if an overflow occurs in a left shift of a signed integer. This is
>    the undefined behaviour invoked here but I think it is clear what
>    the _right_ behaviour is.
>
> This means (assuming a 16 Bit short and a 32 Bit int) the standard
> says that the equation below always holds. Actually all the casts on
> the rhs aren't necessary.
>
>    unsigned short a;
>    a << 16 == (int)(((int)a)<<16);
>
> The actual value of the rhs is still unspecified according to the standard
> if the value of a is greater than 0x7fff. But again I don't think it is
> unspecified what gcc does in this case.
>
> Now looking at the bitwise or with a 64 Bit operand:
> a) 6.5.11[#3] states that the usual arithmetic conversions are performed
>    on the operands of ``|'' before the operator is applied.
> b) The usual arithmetic conversions defined in 6.3.1.8 state in [#1]
>    for this case:
>       Otherwise, if both operands have signed integer types or both
>       have unsigned integer types, the operand with the type of lesser
>       integer conversion rank is converted to the type of the operand
>       with greater rank [which is long long in the case of int and
>       long long].
>    and converting a signed integer to a signed integer of another type
>    is defined in 6.3.1.3[#1]:
>       When a value with integer type is converted to another integer
>       type other than _Bool, if the value can be represented by the new
>       type, it is unchanged.
>
> De facto this means that the operand of the smaller type is sign extended.
>
> This means that we get these implicit casts assuming 64 Bit long long:
>
>    long long ll; unsigned short a;
>    ll | (a << 16) == ll | (long long)(int)((int)a << (int) 16)
>
> Now using your values (ll = 0x123400000000, a = 0x8000) in this
> expression yields:
>    ll | (long long)(int)((int)a << (int) 16)
>    == (long long)0x123400000000 | (long long)(int)((int)0x8000 << 16)  (*)
>    == (long long)0x123400000000 | (long long)(int)0x80000000           (**)
>    == (long long)0x123400000000 | (long long)(-2147483648)
>    == (long long)0x123400000000 | (long long)0xffffffff80000000
>    == (long long)0xffffffff80000000
>
> The step from (*) to (**) is the only place where undefined behaviour
> is invoked and I think we all agree that we'd consider it a bug if this
> calculation did anything else than what I did above.
>
>     regards   Christian
>
> http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=5593
>
>

--
Byron Stanoszek                         Ph: (330) 644-3059
Systems Programmer                      Fax: (330) 644-8110
Commercial Timesharing Inc.             Email: byron@comtime.com