public inbox for glibc-bugs@sourceware.org
From: "bettini at dsi dot unifi dot it" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sources.redhat.com
Subject: [Bug libc/2679] New: getopt and optind (when called with different arguments)
Date: Sun, 21 May 2006 10:42:00 -0000	[thread overview]
Message-ID: <20060521104128.2679.bettini@dsi.unifi.it> (raw)

I found strange behavior in getopt that arises when getopt (or
getopt_long) is called with an argc and argv different from the ones
used in a previous invocation (in the same process).

This might seem an unusual situation, since you usually pass along the
argc and argv received by the main function.  However, I'm using
getopt_long to parse options that are not always the ones passed on
the command line: they might come from a configuration file, or might
be stored somewhere else.

I'm the maintainer of GNU gengetopt, which generates command line
parsers (and option parsers in general) that use getopt_long.  Thus a
program can parse the command line and then a configuration file, so
getopt_long is called with different arguments (string vectors).
optind is set to 1 each time new arguments are used (as requested by
the documentation).
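
A minimal sketch of the calling pattern described above, assuming a
single -v/--verbose flag.  The helper name count_verbose and the option
table are hypothetical and are not taken from gengetopt.  The sketch
resets optind to 1 before each scan, lets every scan run to completion,
and expects the caller to keep each vector alive, so it does not
exercise the failing cases discussed in the rest of this report.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical helper: counts -v / --verbose occurrences in one vector. */
static int count_verbose(int argc, char **argv)
{
    static const struct option longopts[] = {
        { "verbose", no_argument, NULL, 'v' },
        { NULL, 0, NULL, 0 }
    };
    int count = 0;
    int c;

    optind = 1;  /* reset before scanning a new vector, as described above */
    opterr = 0;  /* keep getopt_long from printing diagnostics */

    /* Scan the whole vector; getopt_long returns -1 when it is done. */
    while ((c = getopt_long(argc, argv, "v", longopts, NULL)) != -1) {
        if (c == 'v')
            count++;
    }
    return count;
}
```

A caller would invoke count_verbose once with the argc and argv of
main, and again with a vector built from the configuration file whose
element 0 is a placeholder program name.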

However, in this context the parser sometimes behaves strangely, and
most of the time it also makes illegal memory accesses (reported by
valgrind, or showing up as segfaults).

Taking a look at getopt.c, I see the following code (part of the
_getopt_internal_r function):

  if (d->optind == 0 || !d->__initialized)
    {
      if (d->optind == 0)
	d->optind = 1;	/* Don't scan ARGV[0], the program name.  */
      optstring = _getopt_initialize (argc, argv, optstring, d);
      d->__initialized = 1;
    }

where d is the _getopt_data struct, which also contains pointers such
as __nextchar and argv indexes such as __first_nonopt and
__last_nonopt.

Now, these elements are initialized only on the first call or when
optind == 0.

That's basically the problem: when getopt_long is called with a new
argv, since optind is set to 1, the internal structure is not
initialized again, and so it still contains pointers into the previous
vector.  This results in strange behavior, or in illegal memory
accesses if the previous vector has already been deallocated or was
larger than the current one.
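
The effect can be illustrated with a toy model of the guard quoted
above.  This is only a sketch: struct toy_state and toy_begin_call are
made-up stand-ins, not glibc's struct _getopt_data or its
initialization routine, and the model assumes that initialization
clears the cached scan pointer.

```c
#include <stddef.h>

/* Toy stand-in for the scanner state; NOT glibc's struct _getopt_data. */
struct toy_state {
    int optind;            /* index of the next argv element to scan */
    int initialized;       /* set once the state has been initialized */
    const char *nextchar;  /* position inside the element being scanned */
};

/* Mirrors the quoted guard: the cached state is reset only on first use
   or when optind == 0.  Returns 1 if it was reset, 0 if it was kept. */
static int toy_begin_call(struct toy_state *d)
{
    if (d->optind == 0 || !d->initialized) {
        if (d->optind == 0)
            d->optind = 1;      /* skip element 0, the program name */
        d->nextchar = NULL;     /* assumption of this model: init clears it */
        d->initialized = 1;
        return 1;
    }
    return 0;                   /* state kept, including any old pointer */
}
```

In this model, a caller that sets optind to 1 after a previous scan gets
back a state whose nextchar still points into the old vector; only
optind == 0 clears it.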

As far as I can tell, the solution is to set optind to 0 before any
new use of getopt_long, but this is not documented anywhere except in
the source:

"On entry to `getopt', zero means this is the first call; initialize."

and this does not seem to be standard, since optind should be 1
before any call, as also noted in getopt.c itself:

/* 1003.2 says this must be 1 before any call.  */

I think that the above check should actually be

  if (d->optind == 0 || d->optind == 1 || !d->__initialized)
    {
      if (d->optind == 0)
	d->optind = 1;	/* Don't scan ARGV[0], the program name.  */
      optstring = _getopt_initialize (argc, argv, optstring, d);
      d->__initialized = 1;
    }

i.e., the initialization should be performed even when optind == 1
since "optind must be 1 before any call".
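
For comparison, the two conditions can be written as plain predicates.
This is a sketch only: guard_current and guard_proposed are hypothetical
names, and the functions model just the if condition, not the rest of
_getopt_internal_r.

```c
/* Condition quoted from getopt.c: reinitialize on first use or when
   optind == 0. */
static int guard_current(int optind_value, int initialized)
{
    return optind_value == 0 || !initialized;
}

/* Condition proposed in this report: also reinitialize when optind == 1. */
static int guard_proposed(int optind_value, int initialized)
{
    return optind_value == 0 || optind_value == 1 || !initialized;
}
```

The two predicates differ only for optind == 1 with the state already
initialized, which is exactly the reset-to-1 case.  Note that in this
model the proposed guard fires on every call made while optind is still
1, so whether optind can remain 1 in the middle of a scan (for example
while stepping through grouped short options in argv[1]) is worth
checking against the real source.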

Users of gengetopt have reported to me that with other implementations
of getopt_long, setting optind = 0 causes the program name to be
interpreted as an option too (since it is at position 0 in argv).
Although odd, this is the more obvious behavior, since optind "is the
index of the next element of the ARGV array to be processed".

Thus setting optind to 0 before the initial invocation makes GNU
gengetopt generate code that works only with the GNU implementation
of getopt, because it relies on a feature that, as far as I can tell,
is neither standard nor documented (and is thus allowed to change in
the future, breaking existing code that relies on it)...

or am I missing something?

Otherwise I guess the above proposed modification is correct.

-- 
           Summary: getopt and optind (when called with different arguments)
           Product: glibc
           Version: 2.4
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: drepper at redhat dot com
        ReportedBy: bettini at dsi dot unifi dot it
                CC: glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=2679

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


