public inbox for libc-help@sourceware.org
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Yair Lenga <yair.lenga@gmail.com>
Cc: libc-help@sourceware.org
Subject: Re: Buffer size checking for scanf* functions
Date: Mon, 11 Jul 2022 09:38:50 -0300
Message-ID: <E5051608-858D-4008-A64C-E501A282F1AC@linaro.org>
In-Reply-To: <CAK3_KpPaLa7yzDg1WeV1bu2APjQURpEU0r49KQcXhM8+ushbDg@mail.gmail.com>




> On 6 Jul 2022, at 12:04, Yair Lenga <yair.lenga@gmail.com> wrote:
> 
> Thanks for elaborating. Agree that when possible '%m' should be used, but it's not always trivial. A lot of code is already written in such a way that those changes are less than trivial. Not to mention that each %m will require adding a strcpy (or equivalent) to copy the dynamically allocated strings into the fixed-length storage usually defined in a struct, etc.
> 
> struct { ... } s ;
> char *sx = NULL;
> scanf("%ms %d", &sx, &s.i) ;
> strcpy_fix(s.output, sizeof(s.output), &sx) ;    // Copy from *sx to s.output, up to the limit, free *sx, and set sx =  NULL
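
For reference, the quoted approach can be fleshed out roughly as follows. This is only a sketch: strcpy_fix is the helper named in the quote, and its body here is an assumed implementation based on the quoted comment. struct record, its 16-byte field, and parse_record are made up for illustration. It relies on the POSIX.1-2008 '%m' modifier, which glibc supports.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative record with fixed-length storage (assumed layout).  */
struct record {
  char output[16];
  int i;
};

/* Assumed implementation of the helper named in the quote: copy *src
   into dst (at most size-1 characters, always NUL-terminated), then
   free *src and reset it to NULL.  */
static void
strcpy_fix (char *dst, size_t size, char **src)
{
  if (size > 0)
    {
      size_t len = *src != NULL ? strlen (*src) : 0;
      if (len >= size)
        len = size - 1;         /* Truncate silently to fit.  */
      if (len > 0)
        memcpy (dst, *src, len);
      dst[len] = '\0';
    }
  free (*src);                  /* free (NULL) is a no-op.  */
  *src = NULL;
}

/* Parse "<word> <int>" from input; returns sscanf's conversion count.  */
static int
parse_record (const char *input, struct record *s)
{
  char *sx = NULL;
  int n = sscanf (input, "%ms %d", &sx, &s->i);
  strcpy_fix (s->output, sizeof s->output, &sx);
  return n;
}
```

With this sketch, parse_record ("hello 42", &r) fills r.output with "hello" and r.i with 42, and an over-long word is cut to 15 characters instead of overflowing the field.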

Well, either this or use a simpler solution like:

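  #define NAMELEN 32
  /* +1: "%32s" stores up to 32 characters plus the terminating NUL.  */
  struct { char name[NAMELEN + 1]; } x;
  /* Two levels so NAMELEN is expanded to 32 before being stringified.  */
  #define STRFY(n)  XSTRFY(n)
  #define XSTRFY(n) #n
  scanf ("%"STRFY(NAMELEN)"s", x.name);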
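
A self-contained version of the stringified-width idea might look like the sketch below. struct person and read_name are illustrative names, and sscanf stands in for scanf so the input can be passed as a string. Note that a field width of N lets scanf store N characters plus a terminating NUL, so the buffer is declared with NAMELEN + 1 bytes. The trick also assumes NAMELEN expands to a plain decimal literal (not an expression such as 16*2).

```c
#include <stdio.h>

#define NAMELEN 32
/* Two-level expansion: STRFY (NAMELEN) first expands NAMELEN to 32,
   then XSTRFY turns it into the string literal "32".  */
#define XSTRFY(n) #n
#define STRFY(n)  XSTRFY(n)

struct person {
  char name[NAMELEN + 1];       /* Room for NAMELEN chars plus NUL.  */
};

/* Read one whitespace-delimited word of at most NAMELEN characters.
   Returns sscanf's conversion count.  */
static int
read_name (const char *input, struct person *p)
{
  /* The format string is "%32s" after literal concatenation.  */
  return sscanf (input, "%" STRFY (NAMELEN) "s", p->name);
}
```

Input longer than NAMELEN characters is not an error here: the conversion simply stops after NAMELEN characters and the rest stays unread.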

> 
> Your point about backward compatibility is also very valid - if possible, new features should try to avoid collision with future improvements. The C standard is getting updated every 10 years (C99, C11, and the expected C23). I could not find any reason why the C standard committee chose to use '%m' instead of using the already established '%a' that existed for many years in glibc, and assign new meaning to the '%a'. I hope that those are exceptions that prove the rule.
> 
> You raised a good point with 'scanf_s' - the fact that they chose to modify the behavior of '%s'. To my understanding, scanf_s '%s' requires 2 arguments (char *, size_t), vs. scanf, which expects only the 'char *'. It would have been a much better solution to keep '%s' compatible and introduce another formatting sequence for the dynamic fixed-length string.
> 
> Going back to the question - what would be a good way to integrate the type safety provided by scanf_s without creating problems? A few ideas that I have:
> * Use '%S' (upper S) to indicate that a pair (char *, size_t) is expected, OR
> * Use '%@s' ('@' can be any unused letter or special character, e.g. '%!s', '%:s', ...). The logical choice should have been '*' - symmetry with printf("%*s"). Unfortunately, '*' is already used as the "ignore assignment" flag.
> * Use '%n %s', where 'n' will indicate that a size parameter will be provided, and will apply to the next '%s' or '%[', or even '%ms' - dynamic width limit, instead of static width limit.
> 
> Personally, I prefer the second option, '%@s': it matches the style of '*' for printf and is easy for existing developers to grasp. Interestingly enough, it might be possible to implement scanf_s as a wrapper around scanf, with some manipulation of the argument list.

I don't have a strong preference, and although I think scanf is still a
bad interface [1], I think you might try to raise this on libc-alpha.

I think the '@' modifier would make more sense, since ideally it would
extend to the wscanf family as well (which already defines '%S').
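
Since no such modifier exists today, a run-time width limit can be approximated by building the format string on the fly. The sketch below is one possible way to do it, not a glibc interface: scan_word is a hypothetical helper that takes the (char *, size_t) pair discussed above and turns the size into a field width.

```c
#include <stdio.h>

/* Hypothetical helper, not part of glibc: read one whitespace-delimited
   word from input into buf, storing at most size-1 characters plus the
   terminating NUL.  Returns sscanf's conversion count, or 0 if the
   buffer cannot hold even one character.  */
static int
scan_word (const char *input, char *buf, size_t size)
{
  char fmt[32];                 /* Enough for '%', the digits, 's', NUL.  */

  if (size < 2)
    return 0;
  /* Build e.g. "%15s" for a 16-byte buffer.  */
  snprintf (fmt, sizeof fmt, "%%%zus", size - 1);
  return sscanf (input, fmt, buf);
}
```

The cost is that the compiler can no longer check the format string against the arguments, which is part of what a dedicated modifier would avoid.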


[1] https://github.com/biojppm/rapidyaml/issues/40
