From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x632.google.com (mail-ej1-x632.google.com [IPv6:2a00:1450:4864:20::632]) by sourceware.org (Postfix) with ESMTPS id 03D043858D32 for ; Mon, 18 Sep 2023 15:22:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 03D043858D32 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-ej1-x632.google.com with SMTP id a640c23a62f3a-9adcb89b48bso416977566b.2 for ; Mon, 18 Sep 2023 08:22:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1695050550; x=1695655350; darn=gcc.gnu.org; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:from:to:cc:subject :date:message-id:reply-to; bh=PoU8JUuSAueLGCxOgQcWcQRM53DKunRGaYoiJ7iL1Ko=; b=WDdR+L0dGf6hWHHaV13243VsezijKWL+I2siOUUQ0HJS2S24XhvKdEbiTRu3MPw17G N7AZOU52jhE4IqpM4AuI9DCcdumAFaH4uGtyZCcF5f3TEQrMMY6DIHrXR9Ky/BusF6Vf znEi5DO92ctBtjjK9jZhGcPZSDTrDu/4C2Cum80K3kQlBivoy0kJtUZYw9IqO6mpE88H qizmiB2SOFAGTcCTZeYUG45k/ssphZgUYP+DcSvVsa2YS+jQi5hairV3b6BqazJj2oKy LxdHh5UyLX1vKLkcFA2MgqSpRZP0zB8d6BQA4laCD+Dn/Z7Ner30JmIUxV+6eC/wafCx mNoQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695050550; x=1695655350; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=PoU8JUuSAueLGCxOgQcWcQRM53DKunRGaYoiJ7iL1Ko=; b=JSoC/2W8czxDJtO7sDrlYZC2hz0KNzI84qKIo6K/I8UNsokFICpntp1wEJ3GvqzTl2 1FPCLn6SXVzYnyuifTOZr4AWAZrV6Ofdkqp+ETec/e+A3j5AsIzlyxYvv+GjK9UBs3Z6 Qttw9CuFdZnul6TZUUKkPgoD71PMvKPTjQz63eMnx5fxvzI+VjMHLDCHWwz6P7cs16nW aL7v3WO+6KZUv0qs534LBEcHNIa8gJuwWePMAvs5xhP9GYOrFlcVCGSXAxyP+/6FQ+D0 ++3FYMsx95tTUU4JPkhhbDS3aKWMCmR+ngStZSAQIX9xqaF50ago/OjEIXcnKpp5Cf6o O77w== X-Gm-Message-State: AOJu0YwzzIHiqFJubjLzTDhVR564mv6+GCJnsjJ5kr4SF/hnZ6wcyVfT NpHzYQiyPq4+2PkVBhRuiDs= X-Google-Smtp-Source: AGHT+IELZYVb5R4uQNVQ2LVZPMbnLu0txNDlB5dEC/PDMphmuTkNlhSWnVv8Js3BWWZZ1S3JBRdV1Q== X-Received: by 2002:a17:907:270c:b0:9a2:15d3:2583 with SMTP id w12-20020a170907270c00b009a215d32583mr7300012ejk.2.1695050549868; Mon, 18 Sep 2023 08:22:29 -0700 (PDT) Received: from 2a02-8388-e203-9700-eddb-fb4f-5189-911d.cable.dynamic.v6.surfer.at (2a02-8388-e203-9700-eddb-fb4f-5189-911d.cable.dynamic.v6.surfer.at. [2a02:8388:e203:9700:eddb:fb4f:5189:911d]) by smtp.gmail.com with ESMTPSA id lw28-20020a170906bcdc00b00992f309cfe8sm6710341ejb.178.2023.09.18.08.22.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 18 Sep 2023 08:22:29 -0700 (PDT) Message-ID: <0be15964fcf7963b3b1c57a49e45dca305012bf3.camel@gmail.com> Subject: Re: Question on -fwrapv and -fwrapv-pointer From: Martin Uecker To: Richard Biener Cc: keescook@chromium.org, gcc@gcc.gnu.org Date: Mon, 18 Sep 2023 17:22:27 +0200 In-Reply-To: References: <202309151107.C2F7D9A01@keescook> <228be10b14dabe4e8216f9f93da99f629328b3e0.camel@gmail.com> <9959df868a3aadee6034e564cf575083e76d5833.camel@gmail.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.46.4-2 MIME-Version: 1.0 X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Am Montag, dem 18.09.2023 um 10:47 +0200 schrieb Richard Biener via Gcc: > On Mon, Sep 18, 2023 at 10:17=E2=80=AFAM Martin Uecker wrote: > >=20 > > Am Montag, dem 18.09.2023 um 09:31 +0200 schrieb Richard Biener via Gcc= : > > > On Sat, Sep 16, 2023 at 10:38=E2=80=AFAM Martin Uecker via Gcc wrote: > > > >=20 > > > >=20 > > > >=20 > > > > (moved to gcc@) > > > >=20 > > > > > On Fri, Sep 15, 2023 at 08:18:28AM -0700, Andrew Pinski wrote: > > > > > > On Fri, Sep 15, 2023 at 8:12=E2=80=AFAM Qing Zhao wrote: > > > > > > >=20 > > > > > > >=20 > > > > > > >=20 > > > > > > > > On Sep 15, 2023, at 3:43 AM, Xi Ruoyao = wrote: > > > > > > > >=20 > > > > > > > > On Thu, 2023-09-14 at 21:41 +0000, Qing Zhao wrote: > > > > > > > > > > > CLANG already provided -fsanitize=3Dunsigned-integer-= overflow. GCC > > > > > > > > > > > might need to do the same. > > > > > > > > > >=20 > > > > > > > > > > NO. There is no such thing as unsigned integer overflow= . That option > > > > > > > > > > is badly designed and the GCC community has rejected a = few times now > > > > > > > > > > having that sanitizer before. It is bad form to have a = sanitizer for > > > > > > > > > > well defined code. > > > > > > > > >=20 > > > > > > > > > Even though unsigned integer overflow is well defined, it= might be > > > > > > > > > unintentional, shall we warn user about this? > > > > > > > >=20 > > > > > > > > *Everything* could be unintentional and should be warned th= en. GCC is a > > > > > > > > compiler, not an advanced AI educating the programmers. > > > > > > >=20 > > > > > > > Well, you are right in some sense. -:) > > > > > > >=20 > > > > > > > However, overflow is one important source for security flaws,= it=E2=80=99s important for compilers to detect > > > > > > > overflows in the programs in general. > > > > > >=20 > > > > > > Except it is NOT an overflow. Rather it is wrapping. That is a = big > > > > > > point here. unsigned wraps and does NOT overflow. Yes there is = a major > > > > > > difference. > > > > >=20 > > > > > Right, yes. I will try to pick my language very carefully. :) > > > > >=20 > > > > > The practical problem I am trying to solve in the 30 million line= s of > > > > > Linux kernel code is that of catching arithmetic wrap-around. The > > > > > problem is one of evolving the code -- I can't just drop -fwrapv = and > > > > > -fwrapv-pointer because it's not possible to fix all the cases at= once. > > > > > (And we really don't want to reintroduce undefined behavior.) > > > > >=20 > > > > > So, for signed, pointer, and unsigned types, we need: > > > > >=20 > > > > > a) No arithmetic UB -- everything needs to have deterministic beh= avior. > > > > > The current solution here is "-fno-strict-overflow", which eli= minates > > > > > the UB and makes sure everything wraps. > > > > >=20 > > > > > b) A way to run-time warn/trap on overflow/underflow/wrap-around.= This > > > > > would work with -fsanitize=3D[signed-integer|pointer]-overflow= except > > > > > due to "a)" we always wrap. And there isn't currently coverage= like > > > > > this for unsigned (in GCC). > > > > >=20 > > > > > Our problem is that the kernel is filled with a mix of places whe= re there > > > > > is intended wrap-around and unintended wrap-around. We can chip a= way at > > > > > fixing the intended wrap-around that we can find with static anal= yzers, > > > > > etc, but at the end of the day there is a long tail of finding th= e places > > > > > where intended wrap-around is hiding. But when the refactoring is > > > > > sufficiently completely, we can move the wrap-around warning to a= trap, > > > > > and the kernel will not longer have this class of security flaw. > > > > >=20 > > > > > As a real-world example, here is a bug where a u8 wraps around ca= using > > > > > an under-allocation that allowed for a heap overwrite: > > > > >=20 > > > > > https://git.kernel.org/linus/6311071a0562 > > > > > https://elixir.bootlin.com/linux/v6.5/source/net/wireless/nl80211= .c#L5422 > > > > >=20 > > > > > If there were more than 255 elements in a linked list, the alloca= tion > > > > > would be too small, and the second loop would write past the end = of the > > > > > allocation. This is a pretty classic allocation underflow and lin= ear > > > > > heap write overflow security flaw. (And it would be trivially sto= pped by > > > > > trapping on the u8 wrap around.) > > > > >=20 > > > > > So, I want to be able to catch that at run-time. But we also have= code > > > > > doing things like "if (ulong + offset < ulong) { ... }": > > > > >=20 > > > > > https://elixir.bootlin.com/linux/v6.5/source/drivers/crypto/axis/= artpec6_crypto.c#L1187 > > > > >=20 > > > > > This is easy for a static analyzer to find and we can replace it = with a > > > > > non-wrapping test (e.g. __builtin_add_overflow()), but we'll not = find > > > > > them all immediately, especially for the signed and pointer cases= . > > > > >=20 > > > > > So, I need to retain the "everything wraps" behavior while still = being > > > > > able to detect when it happens. > > > >=20 > > > >=20 > > > > Hi Kees, > > > >=20 > > > > I have a couple of questions: > > > >=20 > > > > Currently, my thinking was that you would use signed integers > > > > if you want the usual integer arithmetic rules we know from > > > > elementary school and if you overflow this is clearly a bug > > > > you can diagnose with UBsan. > > > >=20 > > > > There are people who think that signed overflow should be > > > > defined to wrap, but I think this would be a severe > > > > mistake because then code would start to rely on it, which > > > > makes it then difficult to differentiate between bugs and > > > > intended uses (e.g. the unfortunate situation you have > > > > with the kernel). > > > >=20 > > > > I assume you want to combine UBSan plus wrapping for > > > > production use? Or only for testing? Or in other words: > > > > why would testing UBSan and production with wrapping > > > > not be sufficient to find and fix all bugs? > > > >=20 > > > > Wrapping would not be correct because it may lead to > > > > logic errors or use-after-free etc. I assume it is still > > > > preferred because it more deterministic than whatever comes > > > > out of the optimizer assuming that overflow has UB. Is this > > > > the reasoning applied here? > > > >=20 > > > >=20 > > > > For unsigned the intended use case is modulo arithmetic > > > > where wrapping is the correct behavior. At least, this > > > > is what I thought so far.. This seems also to be the > > > > position of the overall GCC community rejecting > > > > -fsanitize=3Dunsigned-integer-overflow. > > > >=20 > > > > But then there are also people like Richard Seacord that > > > > have the position that one should use "unsigned" for > > > > every quantity which can not be negative, which implies > > > > that then the modulo use case becomes the exception. > > > >=20 > > > > Related to this I have the following question: In the > > > > bug you refer to above, an unsigned variable was used > > > > for something that is not meant for modulo arithmetic. > > > > Is this done in the kernel to save space? I.e. because > > > > 127 would not be enough as maximum but going to i16 takes > > > > to much space? or is this for compatibility with some > > > > on-the-wire protocol? > > > >=20 > > > > Would an attribute for variables help that tells the > > > > compiler that if stores to the variable wrap around > > > > then this is not intended and this is an error? > > > >=20 > > > > u8 x [[nowrap]]; > > > >=20 > > > > x =3D 256; // can be diagnosed ? > > >=20 > > > There's a problem with representing this, it's a long-standing thing = standing > > > in the way of handling -fwrapv in the IL. Maybe C wants to have > > > 'nonnegative int' in addition to 'unsigned int', or 'bit int' for the > > > "just some two-complement bits". > > >=20 > > > For GCC we think of -fwrapv to perform signed integer arithmetic as i= f > > > promoted to a large enough type first and then truncated to the origi= nal > > > type in the implementation defined manner (modulo reducing). But we > > > end up doing what the actual hardware does (I'm for example unsure > > > how that handles INT_MIN / -1 - x86 documents it as raising #DE > > > which it indeed does, even with -fwrapv). > > >=20 > > > volatile int a, b; > > > int main () > > > { > > > a =3D -__INT_MAX__ - 1; > > > b =3D -1; > > > int z =3D a / b; > > > __builtin_printf ("%d\n", z); > > > } > > >=20 > > > raises SIGFPE with and without -fwrapv. I'll note that we document > > > -fwrapv to only affect addition, subtraction and multiplication > > > (it also affects negation), matching the libgcc *v functions we have > > > for -ftrapv. -fsanitize=3Dundefined instruments division but > > > -fsanitize=3Drecover doesn't help. Maybe there's a bug about this > > > recorded. The documentation of -fwrapv should likely be clarified. Thank you for the explanation. But note that I suggested the=C2=A0 opposite: "nowrap" . A compiler would not need to be changed at all and can simply ignore it this attribute. But if it sees=C2=A0 that a too large value is stored in an unsigned type that is too small, it could warn in the frontend. And we could add=C2=A0 a sanitizer mode to diagnose wraparound at run-time. (which=C2=A0 would then not have the problem with false positives as=C2=A0 -funsigned-integer-overflow would have) > > >=20 > > > For recent work I failed to spot the C standards language on > > > singed overflow behavior for negation, addition, subtraction > > > and multiplication. I can find it for division though. Can someone > > > point me to where that is (moved?)? > >=20 > > I don't think it moved. It should follow from the general rule 6.5p5: > >=20 > > "If an exceptional condition occurs during the evaluation of an > > expression (that is, if the result is not mathematically defined > > or not in the range of representable values for its type), the > > behavior is undefined." >=20 > Ah, and the unsigned integer type exception is then 6.2.5/9 > but using "can never overflow" wording rather than "in the range > of representable values for its type". >=20 The "can never overflow" is consequence of modulo arithmetic, because then the result is in the representable range. > Integer "overflow" isn't defined anywhere and=C2=A0 It might be defined in ISO/IEC 2382. > I suppose the unsigned exception also > applies to underflow for expressions like 0 - 1. Yes, because the operation is performed modulo 2^N. >=20 > It would have been useful to repeat some of this in the sections > documenting the relevant operators. I agree. Martin