Date: Wed, 18 Aug 2021 10:59:29 +0200
From: Corinna Vinschen
To: newlib@sourceware.org
Reply-To: newlib@sourceware.org
Subject: Re: [PATCH] svfwscanf: Simplify _sungetwc_r to eliminate apparent buffer overflow
In-Reply-To: <874kbnevjm.fsf@keithp.com>
References: <874kbnevjm.fsf@keithp.com>

Hi Keith,

On Aug 17 12:11, Keith Packard wrote:
> svfwscanf replaces getwc and ungetwc_r. The comments in the code talk
> about avoiding file operations, but they also need to bypass the
> mbtowc calls as svfwscanf operates on wchar_t, not multibyte data,
> which is a more important reason here; they would not work correctly
> otherwise.
>
> The ungetwc replacement has code which uses the 3 byte FILE _ubuf
> field, but if wchar_t is 32-bits, this field is not large enough to
> hold even one wchar_t value. Building in this mode generates warnings
> about array overflow:
>
> In file included from ../../newlib/libc/stdio/svfiwscanf.c:35:
> ../../newlib/libc/stdio/vfwscanf.c: In function '_sungetwc_r.isra':
> ../../newlib/libc/stdio/vfwscanf.c:316:12: warning: array subscript 4294967295 is above array bounds of 'unsigned char[3]' [-Warray-bounds]
>   316 |   fp->_p = &fp->_ubuf[sizeof (fp->_ubuf) - sizeof (wchar_t)];
>       |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> In file included from ../../newlib/libc/stdio/stdio.h:46,
>                  from ../../newlib/libc/stdio/vfwscanf.c:82,
>                  from ../../newlib/libc/stdio/svfiwscanf.c:35:
> ../../newlib/libc/include/sys/reent.h:216:17: note: while referencing '_ubuf'
>   216 |   unsigned char _ubuf[3]; /* guarantee an ungetc() buffer */
>       |                 ^~~~~
>
> However, the vfwscanf code *never* ungets data before the start of the
> scanning operation, and *always* ungets data which matches the input
> at that point, so the code always hits the block which backs up over
> the input data and never hits the block which uses the _ubuf field.

LGTM.  Under the unlikely assumption that wscanf gets extended in future
and has to ungetc a char different from the input char, how do we catch
that?  Do we need a hint, somehow, somewhere?

Corinna
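
For illustration only, here is a minimal sketch of the two points discussed
above. It is not the newlib code and not the patch itself. The struct
mini_file and the functions old_ubuf_index() and sketch_sungetwc() are
hypothetical stand-ins invented for this example. The first function shows
why the old index expression wraps around when wchar_t is wider than the
3-byte buffer; the second shows one possible "hint": an unget that only backs
up over matching input and returns WEOF otherwise, rather than falling back
to _ubuf.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for the few FILE fields involved (not newlib's struct). */
struct mini_file {
    unsigned char *base;   /* start of the input buffer */
    unsigned char *p;      /* current read position */
    size_t r;              /* bytes still available to read */
};

/* The old code computed sizeof(_ubuf) - sizeof(wchar_t) in unsigned
 * arithmetic.  With a 3-byte buffer and a 4-byte wchar_t the result wraps
 * around to SIZE_MAX, which is 4294967295 on a 32-bit target. */
static size_t old_ubuf_index(size_t ubuf_size, size_t wchar_size)
{
    return ubuf_size - wchar_size;
}

/* Assumed simplification: push back wc only by stepping the read pointer
 * back over the wide character that was just consumed.  Returns wc on
 * success and WEOF if there is nothing to back up over or if wc does not
 * match the consumed input. */
static wint_t sketch_sungetwc(wint_t wc, struct mini_file *f)
{
    wchar_t prev;

    if (wc == WEOF)
        return WEOF;
    /* A full wchar_t of consumed input must sit behind the read position. */
    if (f->p == NULL || (size_t)(f->p - f->base) < sizeof(wchar_t))
        return WEOF;
    /* memcpy avoids assuming the buffer is aligned for wchar_t. */
    memcpy(&prev, f->p - sizeof(wchar_t), sizeof prev);
    if ((wint_t)prev != wc)
        return WEOF;       /* mismatch: report it instead of using _ubuf */
    f->p -= sizeof(wchar_t);
    f->r += sizeof(wchar_t);
    return wc;
}
```

Whether a mismatch should return WEOF, trip an assert in debug builds, or
just be documented in a comment is a design choice this sketch does not
settle. Returning WEOF at least makes a future caller that ungets a
different character fail visibly in its own error path.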