From: Yair Lenga
Date: Tue, 12 Jul 2022 10:02:38 +0300
Subject: Improved Buffer overflow for scanf* functions
To: libc-alpha@sourceware.org

Hi,

I am posting to libc-alpha based on feedback I got from libc-help members on a proposal to improve buffer checks for the scanf* functions.

Short summary - the scanf* functions are a constant source of buffer overflow issues. There is no good mechanism to address this very common use case:

    char foo[FOOSIZE];
    scanf("%s", foo);    // How to limit the actual size?

None of the proposed solutions scales well, especially when the code base is large. A few of them:

* Use an explicit width - scanf("%20s", foo) - which has to be updated by hand whenever FOOSIZE changes.

* Use dynamic allocation:

    char *foobuff = NULL;
    scanf("%ms", &foobuff);
    strncpy(foo, foobuff, sizeof(foo)-1);
    foo[sizeof(foo)-1] = '\0';
    free(foobuff);

* Create a dynamic format:

    char fmt[1000];
    sprintf(fmt, "%%%ds", FOOSIZE-1);
    scanf(fmt, foo);

One alternative is to allow a "dynamic" width, following the '*' style used by printf, where the width field can be replaced by '*' to indicate that the argument list contains an integer specifying the width. One unfortunate issue: in scanf, '*' is already assigned the meaning of the 'suppress assignment' flag.

The final suggestion that came out of the discussion is:

* Use a special character (my favorite: '@') in the scanf format. It would be allowed in the sequences '%@s', '%@[', '%@S', etc. This indicates that the width is taken from the next integer argument, which is expected to be the sizeof of the buffer, or its dynamic size (if malloc was used). The number of characters stored will be one less than the size, to ensure the string is always null terminated.
The implementation would look like:

    scanf("%@s", sizeof(foo), foo);    // fixed-size string: char foo[FOOSIZE];

My question:

* I would like to get feedback from the glibc maintainers - is there a reasonable chance that this would be accepted as a GLIBC extension to scanf?

I understand that this is not a light decision, but considering the security risks associated with the current state, anything is better than the current alternatives.

As for implementation, I've reviewed the code for glibc scanf (and a few other implementations), and it looks like a very light effort. Since '@' is currently not allowed, it is not going to break any existing code.

Looking for feedback, comments, ideas,

Yair