From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) by sourceware.org (Postfix) with ESMTPS id 8AD6B3858C31 for ; Tue, 30 Apr 2024 09:01:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8AD6B3858C31 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 8AD6B3858C31 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1714467687; cv=none; b=vs2MPoF7Ve7O5JwS0H4Ys1Bx6+91FQp3/S8wxKWRCq0qVWUfx2Ebfe5hcBbTP4Qbw3C9J1ue7M+4g3PNZmTJfYqYK9NOkhOqqkF9U57+CjDwC/UCAiaRlsznRRd/iAnBkXjWwaXWiEFxM4h6EUpXIzXIlhUg2AWnyu3X3tUF6BM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1714467687; c=relaxed/simple; bh=6pZ4FEz24nxp0lzVlzbzzAtyJdGBtOiQY/bHBwoQRnM=; h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:Date: From:To:Subject:Message-ID:MIME-Version; b=XAtXNECNHOr6FPSZKaxYxgyssF4HHND0o2CW+OrhRz/P5mqVOuD2Abp8mFJG4ftK8QSLMtqPu58fx7SS+JBZwkkJG6+dT9xA2/mXNts0sqCOwCdfSSY4ct1Y2alRVeRbG3GoDcJICjgimkYn5QHPx404uweiU4cmpv5lbc4VSKg= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from [10.168.5.241] (unknown [10.168.5.241]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 60BA61F788; Tue, 30 Apr 2024 09:01:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1714467684; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=04RlZj6cpnRHYmwfC7Yj6uPyuRfAqylQ432KywG+N1Y=; b=K3voPzo6V8i1WRmQuFzfUKfrtLBeZyZUyimtJCGr0rFLhHESHdyNP8T5awZLztPxhhd2W+ m3PIpI83jXwX/x3c+eHAQUwFSdaX+yHieVaNZf9qf0ZyAwgt5wz1Q5PnrT6U66OALUFwMv 27ZkuXVxHM6giwLJ+6vBrAYdZUeZtDo= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1714467684; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=04RlZj6cpnRHYmwfC7Yj6uPyuRfAqylQ432KywG+N1Y=; b=obcHt+eTRDCMAKDCzVmcK3FVU5fAIlJoYIt6VukcsqvnqiL8pOU1Mh4Zb7HHm6dPbQLLIZ EWqHoSuA3qruBKCQ== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1714467682; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=04RlZj6cpnRHYmwfC7Yj6uPyuRfAqylQ432KywG+N1Y=; b=xoeUY/MaQI9AhPddSs1Y+86ch9c1yVhutqpwTlzMoOUzdg21xyWCtj2yl5Ul4hwl1yhg5Y 4956IL/HhPB95pLX3TxNM7U/fD4Mg87S4iszCCvB9fM42SyNaoBW8NTyiWih1a6zoZIurQ vEtJ0b+IK0XjWExFcFGDl5BO7hW6cek= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1714467682; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=04RlZj6cpnRHYmwfC7Yj6uPyuRfAqylQ432KywG+N1Y=; b=Qc3ejAfZ0a2FDYX07i79pbymG+9Xhanti2KBNvPthrgEzJcolS5z0cnce5rbMju+M2W5TI 7bNe70KK//RCETCQ== Date: Tue, 30 Apr 2024 11:01:22 +0200 (CEST) From: Richard Biener To: Jakub Jelinek cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] gimple-ssa-sprintf: Use [0, 1] range for %lc with (wint_t) 0 argument [PR114876] In-Reply-To: Message-ID: <153r604r-q703-7o89-3696-8466q8s0oo45@fhfr.qr> References: MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Score: -4.30 X-Spam-Level: X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.990]; MIME_GOOD(-0.10)[text/plain]; RCVD_COUNT_ZERO(0.00)[0]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; MISSING_XM_UA(0.00)[]; RCPT_COUNT_TWO(0.00)[2]; FROM_HAS_DN(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FROM_EQ_ENVFROM(0.00)[]; TO_DN_SOME(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; DBL_BLOCKED_OPENRESOLVER(0.00)[gimple-ssa-sprintf.cc:url,austingroupbugs.net:url,suse.de:email] X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Tue, 30 Apr 2024, Jakub Jelinek wrote: > Hi! > > Seems when Martin S. implemented this, he coded there strict reading > of the standard, which said that %lc with (wint_t) 0 argument is handled > as wchar_t[2] temp = { arg, 0 }; %ls with temp arg and so shouldn't print > any values. But, most of the libc implementations actually handled that > case like %c with '\0' argument, adding a single NUL character, the only > known exception is musl. > Recently, C23 changed this in response to GB-141 and POSIX in > https://austingroupbugs.net/view.php?id=1647 > so that it should have the same behavior as %c with '\0'. > > Because there is implementation divergence, the following patch uses > a range rather than hardcoding it to all 1s (i.e. the %c behavior), > though the likely case is still 1 (forward looking plus most of > implementations). > The res.knownrange = true; assignment removed is redundant due to > the same assignment done unconditionally before the if statement, > rest is formatting fixes. > > I don't think the min >= 0 && min < 128 case is right either, I'd think > it should be min >= 0 && max < 128, otherwise it is just some possible > inputs are (maybe) ASCII and there can be others, but this code is a total > mess anyway, with the min, max, likely (somewhere in [min, max]?) and then > unlikely possibly larger than max, dunno, perhaps for at least some chars > in the ASCII range the likely case could be for the ascii case; so perhaps > just the one_2_one_ascii shouldn't set max to 1 and mayfail should be true > for max >= 128. Anyway, didn't feel I should touch that right now. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK for trunk. > Shall it go to 14.1, or wait for 14.2? I think this can wait for 14.2. Richard. > 2024-04-30 Jakub Jelinek > > PR tree-optimization/114876 > * gimple-ssa-sprintf.cc (format_character): For min == 0 && max == 0, > set max, likely and unlikely members to 1 rather than 0. Remove > useless res.knownrange = true;. Formatting fixes. > > * gcc.dg/pr114876.c: New test. > * gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust expected > diagnostics. > > --- gcc/gimple-ssa-sprintf.cc.jj 2024-01-03 11:51:22.225860346 +0100 > +++ gcc/gimple-ssa-sprintf.cc 2024-04-29 12:52:59.760668894 +0200 > @@ -2177,8 +2177,7 @@ format_character (const directive &dir, > > res.knownrange = true; > > - if (dir.specifier == 'C' > - || dir.modifier == FMT_LEN_l) > + if (dir.specifier == 'C' || dir.modifier == FMT_LEN_l) > { > /* A wide character can result in as few as zero bytes. */ > res.range.min = 0; > @@ -2189,10 +2188,13 @@ format_character (const directive &dir, > { > if (min == 0 && max == 0) > { > - /* The NUL wide character results in no bytes. */ > - res.range.max = 0; > - res.range.likely = 0; > - res.range.unlikely = 0; > + /* In strict reading of older ISO C or POSIX, this required > + no characters to be emitted. ISO C23 changes that, so > + does POSIX, to match what has been implemented in most of the > + implementations, namely emitting a single NUL character. > + Let's use 0 for minimum and 1 for all the other values. */ > + res.range.max = 1; > + res.range.likely = res.range.unlikely = 1; > } > else if (min >= 0 && min < 128) > { > @@ -2200,11 +2202,12 @@ format_character (const directive &dir, > is not a 1-to-1 mapping to the source character set or > if the source set is not ASCII. */ > bool one_2_one_ascii > - = (target_to_host_charmap[0] == 1 && target_to_host ('a') == 97); > + = (target_to_host_charmap[0] == 1 > + && target_to_host ('a') == 97); > > /* A wide character in the ASCII range most likely results > in a single byte, and only unlikely in up to MB_LEN_MAX. */ > - res.range.max = one_2_one_ascii ? 1 : target_mb_len_max ();; > + res.range.max = one_2_one_ascii ? 1 : target_mb_len_max (); > res.range.likely = 1; > res.range.unlikely = target_mb_len_max (); > res.mayfail = !one_2_one_ascii; > @@ -2235,7 +2238,6 @@ format_character (const directive &dir, > /* A plain '%c' directive. Its output is exactly 1. */ > res.range.min = res.range.max = 1; > res.range.likely = res.range.unlikely = 1; > - res.knownrange = true; > } > > /* Bump up the byte counters if WIDTH is greater. */ > --- gcc/testsuite/gcc.dg/pr114876.c.jj 2024-04-29 12:26:45.774965158 +0200 > +++ gcc/testsuite/gcc.dg/pr114876.c 2024-04-29 12:51:37.863777055 +0200 > @@ -0,0 +1,34 @@ > +/* PR tree-optimization/114876 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > +/* { dg-final { scan-tree-dump-not "return \[01\];" "optimized" } } */ > +/* { dg-final { scan-tree-dump "return 3;" "optimized" } } */ > +/* { dg-final { scan-tree-dump "return 4;" "optimized" } } */ > + > +int > +foo (void) > +{ > + char buf[64]; > + return __builtin_sprintf (buf, "%lc%lc%lc", (__WINT_TYPE__) 0, (__WINT_TYPE__) 0, (__WINT_TYPE__) 0); > +} > + > +int > +bar (void) > +{ > + char buf[64]; > + return __builtin_sprintf (buf, "%c%c%c", 0, 0, 0); > +} > + > +int > +baz (void) > +{ > + char buf[64]; > + return __builtin_sprintf (buf, "%lc%lc%lca", (__WINT_TYPE__) 0, (__WINT_TYPE__) 0, (__WINT_TYPE__) 0); > +} > + > +int > +qux (void) > +{ > + char buf[64]; > + return __builtin_sprintf (buf, "%c%c%ca", 0, 0, 0); > +} > --- gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c.jj 2020-12-03 10:04:35.888092988 +0100 > +++ gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c 2024-04-29 12:49:14.452717581 +0200 > @@ -200,11 +200,11 @@ void test_sprintf_chk_c_const (void) > T (3, "%c%c", '1', '2'); > > /* Wide characters. */ > - T (0, "%lc", (wint_t)0); /* { dg-warning "nul past the end" } */ > - T (1, "%lc", (wint_t)0); > - T (1, "%lc%lc", (wint_t)0, (wint_t)0); > + T (0, "%lc", (wint_t)0); /* { dg-warning ".%lc. directive writing up to 1 bytes into a region of size 0" } */ > + T (1, "%lc", (wint_t)0); /* { dg-warning "nul past the end" } */ > + T (1, "%lc%lc", (wint_t)0, (wint_t)0); /* { dg-warning ".%lc. directive writing up to 1 bytes into a region of size between 0 and 1" } */ > T (2, "%lc", (wint_t)0); > - T (2, "%lc%lc", (wint_t)0, (wint_t)0); > + T (2, "%lc%lc", (wint_t)0, (wint_t)0); /* { dg-warning "nul past the end" } */ > > /* The following could result in as few as no bytes and in as many as > MB_CUR_MAX, but since the MB_CUR_MAX value is a runtime property > @@ -1550,7 +1550,7 @@ void test_snprintf_c_const (char *d) > > /* Wide characters. */ > T (0, "%lc", (wint_t)0); > - T (1, "%lc", (wint_t)0); > + T (1, "%lc", (wint_t)0); /* { dg-warning "output may be truncated before the last format character" } */ > T (2, "%lc", (wint_t)0); > > /* The following could result in as few as a single byte and in as many > @@ -1603,7 +1603,7 @@ void test_snprintf_chk_c_const (void) > > /* Wide characters. */ > T (0, "%lc", (wint_t)0); > - T (1, "%lc", (wint_t)0); > + T (1, "%lc", (wint_t)0); /* { dg-warning "output may be truncated before the last format character" } */ > T (2, "%lc", (wint_t)0); > > /* The following could result in as few as a single byte and in as many > > Jakub > > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)