From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lj1-x229.google.com (mail-lj1-x229.google.com [IPv6:2a00:1450:4864:20::229]) by sourceware.org (Postfix) with ESMTPS id 7C7243858D1E for ; Wed, 20 Sep 2023 21:16:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7C7243858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-lj1-x229.google.com with SMTP id 38308e7fff4ca-2c108e106f0so4232571fa.1 for ; Wed, 20 Sep 2023 14:16:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1695244584; x=1695849384; darn=gcc.gnu.org; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:from:to:cc:subject :date:message-id:reply-to; bh=1UjOjjImThDi2LtVA5pSGmqHmOhVCPc1pN7Vj+v9ThY=; b=gobVOZGnsRFoBa3MD/5YMayd4cwXpl6OMNg1ZTlZmzqtlagjFrbSN0M8s+wG3XJyr+ uz1o6YXwR2J/F26qgAqxXDoYN0u5SSd6SPc8ji1oS/OXGfJkwQfQOeC/nLPwTsDviUEy ge5B2vHAfdt84hw8dmvJClmAa2ubjqIvPCa/SuwXwYvbvBr0KeEJR2toHorivIjFaJXo In/SDzjUkFRhTCvYg2i2tY8vfXnQGpLgfetPLY6ipXO8tOqWRlT5vOP8c/jT432EuDlo P7Is4ETZ0rotFDBLrP3KCRoCLXDNP6UOxYmBaBbRTp4n9rLDQ+mQfaY1Qj/hvVRG/dVP 06gg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695244584; x=1695849384; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=1UjOjjImThDi2LtVA5pSGmqHmOhVCPc1pN7Vj+v9ThY=; b=MkXdZysVp8yAWO30oScCUspCxCvKQU8I95xPtrPDeE9tMLdvfxoyz6+LrpRmkUO5L4 5G8TypSct0gYUUwTlgL3AZULArFpzSI/KRv0ayGlMB1axRh5D4bkyIcN/JhPBb5O/YZY 6jCtsmLD8hjT6YFZzVC1kbwnaLThi/imAQ0V5u84wPRhaP8hxOIst7azIjqj8HeUIwmB fIdGHDVCzGb/Bhxhgx+Ja8xspp9i9cXO/WqP3k9F2YsBM5Y6ZT0qySRM24gO3I1QYn3c qipUBEt+Zh51+rmFxu1KvP8B2lzSNc0ZRe82MmWozND0dvZZybi/8nyij8t1EKiVRzwk vV8w== X-Gm-Message-State: AOJu0YxXiMeIyLaM6Ked/C7AG6uvTAd5J/zDqgKM7nj98yc5NFl8VoqJ wWvKjj3tgDLQ/sHaDjepuX8= X-Google-Smtp-Source: AGHT+IEN01YGZ3n/NAepUixheyZunJhmH7ppSGtcuT5qVOZlMwRsyjrUgQ3Q7UMxwwVcwCPnSG3Z2w== X-Received: by 2002:a2e:870a:0:b0:2c0:7e0:2a1 with SMTP id m10-20020a2e870a000000b002c007e002a1mr3189358lji.41.1695244583489; Wed, 20 Sep 2023 14:16:23 -0700 (PDT) Received: from 2a02-8388-e203-9700-eddb-fb4f-5189-911d.cable.dynamic.v6.surfer.at (2a02-8388-e203-9700-eddb-fb4f-5189-911d.cable.dynamic.v6.surfer.at. [2a02:8388:e203:9700:eddb:fb4f:5189:911d]) by smtp.gmail.com with ESMTPSA id j17-20020a170906051100b0098e2969ed44sm12114eja.45.2023.09.20.14.16.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Sep 2023 14:16:22 -0700 (PDT) Message-ID: <902dc33093f80f0ad6200a3aa2c69982cfb880a8.camel@gmail.com> Subject: Re: Question on -fwrapv and -fwrapv-pointer From: Martin Uecker To: Kees Cook Cc: gcc@gcc.gnu.org Date: Wed, 20 Sep 2023 23:16:21 +0200 In-Reply-To: <202309201328.5732A76E@keescook> References: <202309151107.C2F7D9A01@keescook> <228be10b14dabe4e8216f9f93da99f629328b3e0.camel@gmail.com> <202309201328.5732A76E@keescook> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.46.4-2 MIME-Version: 1.0 X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Am Mittwoch, dem 20.09.2023 um 13:40 -0700 schrieb Kees Cook via Gcc: > On Sat, Sep 16, 2023 at 10:36:52AM +0200, Martin Uecker wrote: > > > On Fri, Sep 15, 2023 at 08:18:28AM -0700, Andrew Pinski wrote: > > > > On Fri, Sep 15, 2023 at 8:12=E2=80=AFAM Qing Zhao wrote: > > > > >=20 > > > > >=20 > > > > >=20 > > > > > > On Sep 15, 2023, at 3:43 AM, Xi Ruoyao wro= te: > > > > > >=20 > > > > > > On Thu, 2023-09-14 at 21:41 +0000, Qing Zhao wrote: > > > > > > > > > CLANG already provided -fsanitize=3Dunsigned-integer-over= flow. GCC > > > > > > > > > might need to do the same. > > > > > > > >=20 > > > > > > > > NO. There is no such thing as unsigned integer overflow. Th= at option > > > > > > > > is badly designed and the GCC community has rejected a few = times now > > > > > > > > having that sanitizer before. It is bad form to have a sani= tizer for > > > > > > > > well defined code. > > > > > > >=20 > > > > > > > Even though unsigned integer overflow is well defined, it mig= ht be > > > > > > > unintentional, shall we warn user about this? > > > > > >=20 > > > > > > *Everything* could be unintentional and should be warned then. = GCC is a > > > > > > compiler, not an advanced AI educating the programmers. > > > > >=20 > > > > > Well, you are right in some sense. -:) > > > > >=20 > > > > > However, overflow is one important source for security flaws, it= =E2=80=99s important for compilers to detect > > > > > overflows in the programs in general. > > > >=20 > > > > Except it is NOT an overflow. Rather it is wrapping. That is a big > > > > point here. unsigned wraps and does NOT overflow. Yes there is a ma= jor > > > > difference. > > >=20 > > > Right, yes. I will try to pick my language very carefully. :) > > >=20 > > > The practical problem I am trying to solve in the 30 million lines of > > > Linux kernel code is that of catching arithmetic wrap-around. The > > > problem is one of evolving the code -- I can't just drop -fwrapv and > > > -fwrapv-pointer because it's not possible to fix all the cases at onc= e. > > > (And we really don't want to reintroduce undefined behavior.) > > >=20 > > > So, for signed, pointer, and unsigned types, we need: > > >=20 > > > a) No arithmetic UB -- everything needs to have deterministic behavio= r. > > > The current solution here is "-fno-strict-overflow", which elimina= tes > > > the UB and makes sure everything wraps. > > >=20 > > > b) A way to run-time warn/trap on overflow/underflow/wrap-around. Thi= s > > > would work with -fsanitize=3D[signed-integer|pointer]-overflow exc= ept > > > due to "a)" we always wrap. And there isn't currently coverage lik= e > > > this for unsigned (in GCC). > > >=20 > > > Our problem is that the kernel is filled with a mix of places where t= here > > > is intended wrap-around and unintended wrap-around. We can chip away = at > > > fixing the intended wrap-around that we can find with static analyzer= s, > > > etc, but at the end of the day there is a long tail of finding the pl= aces > > > where intended wrap-around is hiding. But when the refactoring is > > > sufficiently completely, we can move the wrap-around warning to a tra= p, > > > and the kernel will not longer have this class of security flaw. > > >=20 > > > As a real-world example, here is a bug where a u8 wraps around causin= g > > > an under-allocation that allowed for a heap overwrite: > > >=20 > > > https://git.kernel.org/linus/6311071a0562 > > > https://elixir.bootlin.com/linux/v6.5/source/net/wireless/nl80211.c#L= 5422 > > >=20 > > > If there were more than 255 elements in a linked list, the allocation > > > would be too small, and the second loop would write past the end of t= he > > > allocation. This is a pretty classic allocation underflow and linear > > > heap write overflow security flaw. (And it would be trivially stopped= by > > > trapping on the u8 wrap around.) > > >=20 > > > So, I want to be able to catch that at run-time. But we also have cod= e > > > doing things like "if (ulong + offset < ulong) { ... }": > > >=20 > > > https://elixir.bootlin.com/linux/v6.5/source/drivers/crypto/axis/artp= ec6_crypto.c#L1187 > > >=20 > > > This is easy for a static analyzer to find and we can replace it with= a > > > non-wrapping test (e.g. __builtin_add_overflow()), but we'll not find > > > them all immediately, especially for the signed and pointer cases. > > >=20 > > > So, I need to retain the "everything wraps" behavior while still bein= g > > > able to detect when it happens. > >=20 > >=20 > > Hi Kees, > >=20 > > I have a couple of questions: > >=20 > > Currently, my thinking was that you would use signed integers > > if you want the usual integer arithmetic rules we know from > > elementary school and if you overflow this is clearly a bug=C2=A0 > > you can diagnose with UBsan. > >=20 > > There are people who think that signed overflow should be > > defined to wrap, but I think this would be a severe > > mistake because then code would start to rely on it, which > > makes it then difficult to differentiate between bugs and > > intended uses (e.g. the unfortunate situation you have=C2=A0 > > with the kernel). >=20 > Right -- my goal is to migrate the kernel codebase into using unambiguous > arithmetic. Doing that evolution, though, is the hard part. :) >=20 > At present, the kernel treats all signed and pointer arithmetic as > wrapping, as that makes sure that there is no UB, which causes must more > unexpected problems than universally wrapping does. >=20 > > I assume you want to combine UBSan plus wrapping for > > production use? Or only for testing? Or in other words: > > why would testing UBSan and production with wrapping > > not be sufficient to find and fix all bugs?=C2=A0 >=20 > I want UBSan to catch wrapping in production -- there is going to be > a long tail of code that may be vulnerable to having it be manipulated > into wrapping a calculation. >=20 > The stuff that _intentionally_ wraps will stand out very quickly and we > can fix those rapidly. (Many we can find today with static analyzers, > but not all from what I've seen.) >=20 > > Wrapping would not be correct because it may lead to > > logic errors or use-after-free etc. I assume it is still > > preferred because it more deterministic than whatever comes > > out of the optimizer assuming that overflow has UB. Is this > > the reasoning applied here? >=20 > Exactly correct. >=20 > > For unsigned the intended use case is modulo arithmetic > > where wrapping is the correct behavior. At least, this > > is what I thought so far.. This seems also to be the=C2=A0 > > position of the overall GCC community rejecting=C2=A0 > > -fsanitize=3Dunsigned-integer-overflow. >=20 > That is my understanding as well: unsigned wrap is "intended behavior". > But for the Linux kernel this behavior is still "unexpected" in the > majority of places where it may be reachable. >=20 > > But then there are also people like Richard Seacord that > > have the position that one should use=C2=A0 "unsigned" for=C2=A0 > > every quantity which can not be negative, which implies > > that then the modulo use case becomes the exception. >=20 > Sure -- this is the preferred coding style, but we both have tons of > old code that uses "int", and we still almost always would interpret > wrapping as an unwanted state. >=20 > > Related to this I have the following question: In the=C2=A0 > > bug you refer to above, an unsigned variable was used > > for something that is not meant for modulo arithmetic. > > Is this done in the kernel to save space? I.e. because > > 127 would not be enough as maximum but going to i16 takes > > to much space? or is this for compatibility with some > > on-the-wire protocol?=20 >=20 > I don't know the original rationale -- I expect the author wanted an > unsigned variable but didn't expect to ever encounter > 255 entries. I meant something else: Why wasn't an unsigned type used in the first place? If all "counter" variables were signed and all "modulo" variables unsigned, one could already=C2=A0 diagnose overflow reliably.=20 I was trying to understand if there are generally valid reasons for using unsigned integers for "counters". or whether this is just a historical mistake in the kernel. (and thank you for your explanations) Martin > These kinds of expectations aren't uncommon in the kernel, and sometimes > past expectations get violated when refactoring, etc. Basically I can't > trust the sanity of the codebase, so I have to depend on the compiler to > provide the runtime coverage to avoid these kinds of security flaws. >=20 > > Would an attribute for variables help that tells the > > compiler that if stores to the variable wrap around > > then this is not intended and this is an error?=20 > >=20 > > u8 x [[nowrap]]; > >=20 > > x =3D 256; // can be diagnosed ? >=20 > I would need the reverse: I want to assume all math to not wrap, > excepting a handful that are designed to. >=20 > -fsanitize=3Dnowrap >=20 > u8 modulo [[wrapping]]; > u8 counter; >=20 > modulo =3D 256; // no diagnose > counter =3D 256; // diagnose >=20