Message-ID: <71e17874-f0b5-411b-9d74-19f27c54a3db@linaro.org>
Date: Mon, 21 Mar 2022 11:10:11 -0300
Subject: Re: [PATCH v3 0/7] Refactor syslog implementation
To: Paul Eggert
Cc: libc-alpha@sourceware.org
References: <20220318165214.2291065-1-adhemerval.zanella@linaro.org> <4f9f9209-2998-7b39-ffd8-1dc73a11b093@cs.ucla.edu>
From: Adhemerval Zanella
In-Reply-To: <4f9f9209-2998-7b39-ffd8-1dc73a11b093@cs.ucla.edu>

On 18/03/2022 18:11, Paul Eggert wrote:
> Thanks for looking into this. I'm reviewing the patches all in one diff rather than one patch at a time, as that's more convenient for me:
>
>
>> -#ifdef SYSLOG_NAMES
>> +#if defined(SYSLOG_NAMES) && defined(__USE_MISC)
>
> Need spaces before parens. Better yet, omit the parens. Please do this systematically in #if.
>

Ack.

>
>> +  enum
>>      {
>> +      timestamp_size = sizeof "MMM DD hh:mm:ss ",
>> +      bufs_size = 1024
>> +    };
>
> As these enums are used only once it might be more readable to eliminate them and replace their uses with their definiens, e.g.,
>
>    char timestamp[sizeof "MMM DD hh:mm:ss "];
>    ...
>    char bufs[1024];
>
> since the later code uses "sizeof timestamp" and "sizeof bufs" anyway (as that's less error-prone).

Yeah, it seems it would be better in this case.

>
>
>> +  /* "%h %e %H:%M:%S "  */
>
> Please prefer "%b" to "%h" here and elsewhere, as they're equivalent and "%b" is more mnemonic (it's short for "%B").
>

Ack, although in this case I am keeping the old code as-is.

>
>> +  /* We deviate from RFC3164 which states timestamp should be in localtime
>
> Please use imperative instead of plural form: "Deviate from" instead of "We deviate from". None of the new comments should need to use "we" or "us" or "our" or "ours".
>

Ack.

>
>> +  bool buf_malloced = false;
>
> This local var isn't needed. You can remove it, and replace its use with "buf != bufs", which is like what the old code did; this is a bit more efficient, I expect.
>

Ack.

>
>> +  bool has_ts = __gmtime64_r (&now, &now_tm) != NULL;
>
> It'll be slightly more efficient to replace this with:
>
>    struct tm *now_tmp = __gmtime64_r (&now, &now_tm);
>    bool has_ts = now_tmp != NULL;
>
> and replace the "&now_tm" with "now_tmp" in the next __strftime_l call.
>

Ack.

>
>> +  /* In the highly unlike case of gmtime_r failure (the clock being
>> +     INT_MIN + 1900 or follow INT_MAX + 1900) we skip the hostname so the
>> +     message is handl as valid PRI but without TIMESTAMP or invalid TIMESTAMP
>> +     (which should force the relay to add the timestamp itself).  */
>
> Some English fixups. "unlike" -> "unlikely". No need for "highly". "the clock being INT_MIN + 1900 or follow INT_MAX + 1900" -> "tm_year out of int range". "we skip" -> "skip". "handl" -> "handled".

Ack.

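For reference, the now_tmp / has_ts pattern from the review above can be sketched outside of glibc with the public gmtime_r and strftime instead of the internal __gmtime64_r and __strftime_l. This is only an illustration under those assumptions: format_timestamp is a hypothetical helper name, not something from the patch, and the "%b %e %H:%M:%S " format follows the "%b" suggestion rather than the "%h" that the patch keeps.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper (not part of the patch): format NOW as a UTC
   "Mmm dd hh:mm:ss " timestamp into OUT.  Return false, leaving OUT
   empty, if gmtime_r fails (tm_year out of int range) or if the
   result does not fit.  */
static bool
format_timestamp (time_t now, char *out, size_t outsize)
{
  struct tm now_tm;
  /* Keep the returned pointer so it can be reused for strftime.  */
  struct tm *now_tmp = gmtime_r (&now, &now_tm);
  bool has_ts = now_tmp != NULL;

  if (has_ts)
    has_ts = strftime (out, outsize, "%b %e %H:%M:%S ", now_tmp) != 0;
  if (!has_ts && outsize > 0)
    out[0] = '\0';
  return has_ts;
}
```

A caller would declare the buffer as suggested earlier in the review, "char timestamp[sizeof "MMM DD hh:mm:ss "];", and pass "sizeof timestamp" as the size.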
>
> I don't understand the bit about "without TIMESTAMP or invalid TIMESTAMP
> (which should force the relay to add the timestamp itself)". Since we're already departing from RFC 3164, aren't we already generating an invalid TIMESTAMP? And if so, why can't we output our own representation of the out-of-range timestamp, e.g., '@67768037170140800' to represent a timestamp that is 67768037170140800 seconds after the Epoch?

I meant RFC 3164 section 4.3.2, 'Valid PRI but no TIMESTAMP or invalid TIMESTAMP', which states that in such a case the relay is responsible for generating the timestamp itself. And if the clock is in such a bogus state, I don't see a gain in generating a timestamp.

>
> Better yet, we could output the correct year by dividing the __time64_t value by 12622780800 (60 * 60 * 24 * the number of days in 400 Gregorian years), running __gmtime64_r on the remainder, and adding 400 times the quotient to the tm_year that __gmtime64_r gives us; this computation will always succeed and so we won't need to worry about __gmtime64_r failure. On platforms with leap seconds this approach would go very slightly wrong on timestamps millions of years in the future but those timestamps are wrong anyway (due to leap seconds we don't know about yet, plus we'll switch to some approach other than leap seconds by then anyway).

I really don't think we should bother with such corner cases, especially since, from the systemd discussion [1], the local relay will usually use a different timestamp mechanism than the one generated by the client (such as the socket timestamp or a better resolution obtained by the server itself).

[1] https://github.com/systemd/systemd/issues/19251

>
>
>> +  pid != 0 ? "[" : "", pid, pid != 0 ? "]" : ""
>
> Is GCC smart enough to optimize this to be branch-free? If not, you can hand-optimize it as follows:
>
>    "[" + (pid == 0), pid, "]" + (pid == 0)

I would guess so, but I did not bother to check. Your suggestion is slightly simpler though.

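To make the pointer-arithmetic suggestion concrete, here is a minimal standalone sketch. It assumes a "%s%s%.0d%s: " style format, where "%.0d" prints nothing for a zero pid; format_tag is a hypothetical helper name and not a function from the patch.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the patch): write "ident[pid]: "
   into OUT, or just "ident: " when PID is 0.

   "[" + (pid == 0) points at the '[' when pid != 0 and at the
   terminating NUL (an empty string) when pid == 0, so no conditional
   expression is needed.  "%.0d" prints no digits for the value 0.  */
static int
format_tag (char *out, size_t outsize, const char *ident, int pid)
{
  return snprintf (out, outsize, "%s%s%.0d%s: ",
                   ident, "[" + (pid == 0), pid, "]" + (pid == 0));
}
```

With ident "prog", a pid of 1234 gives "prog[1234]: " and a pid of 0 gives "prog: ". Some compilers may warn about adding an integer to a string literal; writing &"["[pid == 0] is an equivalent spelling.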
>
>> +               buf[bufsize - 1] != '\n' ? "\n" : "");
>
> Similarly, this can be "\n" + (buf[bufsize - 1] == '\n').
>
>

Ack.

>> +  if (l < sizeof (bufs))
>
> Omit the unnecessary parentheses (for consistency with the other code).
> Also, this comparison isn't safe on admittedly-theoretical platforms where size_t is narrower than int. So I suggest:
>
>    if (0 <= l && l < sizeof bufs)
>
> which is clearer and should be equally efficient.

Ack.

>
> +      if (l + vl < sizeof bufs)
>
> l + vl could have signed integer overflow, leading to undefined behavior. Also, this doesn't work if vl == -1. Also, we have the same theoretical problem as before. So change this to "if (0 <= vl && vl < sizeof bufs - l)".

Ack.

>
>
>> +      FILE *f = __open_memstream (&buf, &bufsize);
>> +      if (f != NULL)
>
> I'm not seeing what the memstream buys you here, compared to a simple malloc. You can't generate anything longer than INT_MAX bytes, since fprintf won't let you. And you already know how many bytes to allocate, from the returned value of the call to snprintf on the too-small stack buffer. So just call malloc and then call snprintf again; there's no need for a memstream. (The existing code already has this problem of course.)

Yeah, I tried to keep this part as is, but it seems better indeed.
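As a rough sketch of the malloc-then-snprintf approach described above (not the code that ended up in the series): format_message is a hypothetical helper, and it uses the public vsnprintf rather than glibc internals. It tries the caller's stack buffer first, using the "0 <= l && l < size" style check, and falls back to an exact-size malloc when the message does not fit.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the patch): format into BUFS if
   the result fits, otherwise into a malloc'ed buffer of exactly the
   needed size.  Return NULL on formatting or allocation failure.
   The caller frees the result only when it differs from BUFS, which
   is the "buf != bufs" test mentioned earlier in the review.  */
static char *
format_message (char *bufs, size_t bufs_size, const char *fmt, ...)
{
  va_list ap, ap2;
  char *buf = NULL;

  va_start (ap, fmt);
  va_copy (ap2, ap);            /* Second copy for the retry.  */

  /* First pass: also tells us how many bytes are needed.  */
  int l = vsnprintf (bufs, bufs_size, fmt, ap);
  va_end (ap);

  if (0 <= l && (size_t) l < bufs_size)
    buf = bufs;                 /* Fits in the stack buffer.  */
  else if (0 <= l)
    {
      buf = malloc ((size_t) l + 1);
      if (buf != NULL)
        vsnprintf (buf, (size_t) l + 1, fmt, ap2);
    }

  va_end (ap2);
  return buf;
}
```

A caller would pass a local "char bufs[1024];" with "sizeof bufs", then do "if (buf != bufs) free (buf);" once the message has been sent.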