From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-483138-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 115362 invoked by alias); 3 Aug 2018 19:59:15 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 115346 invoked by uid 89); 3 Aug 2018 19:59:15 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=tone, shock
X-HELO: mail-qt0-f196.google.com
Received: from mail-qt0-f196.google.com (HELO mail-qt0-f196.google.com) (209.85.216.196) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 03 Aug 2018 19:59:13 +0000
Received: by mail-qt0-f196.google.com with SMTP id z8-v6so7445176qto.9        for <gcc-patches@gcc.gnu.org>; Fri, 03 Aug 2018 12:59:13 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;        d=gmail.com; s=20161025;        h=subject:to:references:from:message-id:date:user-agent:mime-version         :in-reply-to:content-transfer-encoding;        bh=/aQrqt0z0zQIuzXxrx8bEM2z0AxCMW8RkeInLNYoIrw=;        b=NxDT9ArQnsT+J3mvBVfFc8r0EvPTul2bzsm3KGeh0yM9vUD6Nc7Ry3EZi+S4qyOgsw         gPZASe4M5+hjJ/7ry2HOtSus6ntfFjKVWSxpRScRaWHpSM41tptws9Ng2E7ugXdH2qU7         Ap7E3KcfE/+gzhDfpy4AEk8oZinK93uMkzQMfcnpUbag2960eqissPmWfVb4vdmXoyPk         SFG6A+nhFn8ppgIysrx/BKcQJeidnQpNFvMp02MoaS7ce39IckuKGAhAYMeDGBIlX7iO         yPAymScTTi8vT8sZKwNF8OcUwF6hOoUoyb4MqIPt2QoxbMtzlvTpO//AjKYJIjt2NlV+         OlNA==
Return-Path: <msebor@gmail.com>
Received: from localhost.localdomain (97-118-124-30.hlrn.qwest.net. [97.118.124.30])        by smtp.gmail.com with ESMTPSA id a6-v6sm3264309qth.8.2018.08.03.12.59.09        (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);        Fri, 03 Aug 2018 12:59:10 -0700 (PDT)
Subject: Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552, 86711, 86714) )
To: Bernd Edlinger <bernd.edlinger@hotmail.de>, Gcc Patch List <gcc-patches@gcc.gnu.org>
References: <a0d277c4-41c5-61a8-0284-5b1b245e2c74@gmail.com> <ff946355-efec-e286-d7b7-1505a8acc55b@gmail.com> <805ad9c9-e6e4-a159-7fe1-6bb5562b8fe0@gmail.com> <3136e337-a551-3853-c536-b3b9ff0acdb9@gmail.com> <AM5PR0701MB2657944F8E5FDA3E3EA7BA7CE42C0@AM5PR0701MB2657.eurprd07.prod.outlook.com> <AM5PR0701MB2657FD7CA43337837EBA02C7E42C0@AM5PR0701MB2657.eurprd07.prod.outlook.com> <39ae51d8-0807-7604-d1f5-78d31d6fece1@gmail.com> <AM5PR0701MB2657D6F0EF1039D22253D9C8E4230@AM5PR0701MB2657.eurprd07.prod.outlook.com>
From: Martin Sebor <msebor@gmail.com>
Message-ID: <8d668d1a-45f8-d595-b048-d0472ed3efc5@gmail.com>
Date: Fri, 03 Aug 2018 19:59:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <AM5PR0701MB2657D6F0EF1039D22253D9C8E4230@AM5PR0701MB2657.eurprd07.prod.outlook.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2018-08/txt/msg00311.txt.bz2

On 08/03/2018 07:00 AM, Bernd Edlinger wrote:
> On 08/02/18 22:34, Martin Sebor wrote:
>> On 08/02/2018 12:56 PM, Bernd Edlinger wrote:
>>> On 08/02/18 15:26, Bernd Edlinger wrote:
>>>>>
>>>>>    /* If the length can be computed at compile-time, return it.  */
>>>>> -  len = c_strlen (src, 0);
>>>>> +  tree array;
>>>>> +  tree len = c_strlen (src, 0, &array);
>>>>
>>>> You know the c_strlen tries to compute wide character sizes,
>>>> but strlen does not do that, strlen (L"abc") should give 1
>>>> (or 0 on a BE machine)
>>>> I wonder if that is correct.
>>>>
>>> [snip]
>>>>>
>>>>>  static tree
>>>>> -fold_builtin_strlen (location_t loc, tree type, tree arg)
>>>>> +fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg)
>>>>>  {
>>>>>    if (!validate_arg (arg, POINTER_TYPE))
>>>>>      return NULL_TREE;
>>>>>    else
>>>>>      {
>>>>> -      tree len = c_strlen (arg, 0);
>>>>> -
>>>>> +      tree arr = NULL_TREE;
>>>>> +      tree len = c_strlen (arg, 0, &arr);
>>>>
>>>> Is it possible to write a test case where strlen(L"test") reaches this point?
>>>> what will c_strlen return then?
>>>>
>>>
>>> Yes, of course it is:
>>>
>>> $ cat y.c
>>> int f(char *x)
>>> {
>>>    return __builtin_strlen(x);
>>> }
>>>
>>> int main ()
>>> {
>>>    return f((char*)&L"abcdef"[0]);
>>> }
>>> $ gcc -O3 -S y.c
>>> $ cat y.s
>>> main:
>>> .LFB1:
>>>     .cfi_startproc
>>>     movl    $6, %eax
>>>     ret
>>>     .cfi_endproc
>>>
>>> The reason is that c_strlen tries to fold wide chars at all.
>>> I do not know when that was introduced, was that already before your last patches?
>>
>> The function itself was introduced in 1992 if not earlier,
>> before wide strings even existed.  AFAICS, it has always
>> accepted strings of all widths.  Until r241489 (in GCC 7)
>> it computed their length in bytes, not characters.  I don't
>> know if that was on purpose or if it was just never changed
>> to compute the length in characters when wide strings were
>> first introduced.  From the name I assume it's the latter.
>> The difference wasn't detected until sprintf tests were added
>> for wide string directives.  The ChangeLog description for
>> the change reads: Correctly handle wide strings.  I didn't
>> consider pathological cases like strlen (L"abc").  It
>> shouldn't be difficult to change to fix this case.
>>
>
> Oh, oh, oh....
>
> $ cat y3.c
> int main ()
> {
>    char c[100];
>    int x = __builtin_sprintf (c, "%S", L"\uFFFF");
>
>    __builtin_printf("%d %ld\n", x,__builtin_strlen(c));
> }
>
> $ gcc-4.8 -O3 -std=c99 y3.c
> $ ./a.out
> -1 0
> $ gcc -O3 y3.c
> $ ./a.out
> 1 0
> $ echo $LANG
> de_DE.UTF-8
>
> I would have expected L"\uFFFF" to converted to UTF-8
> or another encoding, so the return value if sprintf is
> far from obvious, and probably language dependent.
>
> Why do you think it is a good idea to use really every
> opportunity to optimize totally unnecessary things like
> using the return value from the sprintf function as it is?
>
> Did you never think this adds a significant maintenance
> burden on the rest of us as well?

Your condescending tone is uncalled for, and you clearly speak
out of ignorance.  I don't owe you an explanation but as I have
said multiple times: most of my work, including the sprintf pass,
is primarily motivated by detecting bugs like buffer overflow.
Optimization is only a secondary goal (but bug detection depends
on it).  It may come as a shock to you but mistakes happen.
That's why it's important to make an effort to detect them.
This is one is a simple typo (handling %S the same way as %s
instead of %ls.

If you are incapable of a professional tone I would suggest
you go harass someone else.

Martin