public inbox for libc-alpha@sourceware.org
* Implementation of strtok
@ 2023-06-02  5:22 Jayakrishna Vadayath
  2023-06-02  5:40 ` Florian Weimer
  0 siblings, 1 reply; 5+ messages in thread
From: Jayakrishna Vadayath @ 2023-06-02  5:22 UTC (permalink / raw)
  To: libc-alpha

Hi,

I had a few questions regarding the implementation and documentation of the
strtok function.
I've noticed the following snippet of code in different applications that
parse HTTP packets.

```
#include <stdio.h>
#include <string.h>

char delim[2] = "\r";
void foo(char *p) {
    char *tmp = p;
    if (strchr(p, '\r'))
        tmp = strtok(tmp, delim);
    if (tmp == NULL)
        puts("Y"); // (1) This shouldn't happen
    else
        puts("N"); // (2) This is the expected case
}
```
According to the documentation of the strchr and strtok functions, it would
appear that (1) will never be executed.

However, I've found one situation that leads to strtok returning a NULL
value.
If the function foo is invoked as follows, (1) will be executed and the
string "Y" will be printed out.
```
char buf[2] = "\r";
foo(buf);
```
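
To make the case easy to check in isolation, here is a minimal sketch of the same logic as a function that reports whether (1) would be reached. The helper name `reaches_case_1` is hypothetical and exists only for this illustration; it uses nothing beyond the standard `strchr` and `strtok`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only: returns 1 when s contains
   '\r' and strtok(s, "\r") still returns NULL, i.e. when case (1) in
   foo would be reached. Returns 0 otherwise. Note that strtok may
   modify s. */
static int reaches_case_1(char *s)
{
    if (strchr(s, '\r') == NULL)
        return 0;               /* foo never calls strtok in this case */
    return strtok(s, "\r") == NULL;
}
```

Under this sketch, any buffer made up only of '\r' bytes reaches (1), while a buffer such as "GET /\r" does not, because it contains at least one byte that is not a delimiter.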

After looking at the implementation, I found that this line of code is
responsible for it:
https://elixir.bootlin.com/glibc/glibc-2.7/source/string/strtok.c#L51

I was wondering why the implementation of strtok tries to skip the leading
delimiters and why this edge case is not mentioned in the documentation.

I'm looking forward to your answers on this topic.
Additionally, I'd be happy to help with fixing the documentation or code if
I can.

Thank you.
-- 
Regards
Jayakrishna Menon

* Re: Implementation of strtok
  2023-06-02  5:22 Implementation of strtok Jayakrishna Vadayath
@ 2023-06-02  5:40 ` Florian Weimer
  2023-06-02  5:52   ` Jayakrishna Vadayath
  0 siblings, 1 reply; 5+ messages in thread
From: Florian Weimer @ 2023-06-02  5:40 UTC (permalink / raw)
  To: Jayakrishna Vadayath via Libc-alpha; +Cc: Jayakrishna Vadayath

* Jayakrishna Vadayath via Libc-alpha:

> According to the documentation of the strchr and strtok functions, it would
> appear that (1) will never be executed.

I looked at various descriptions of strtok, and they are pretty clear
that the implemented behavior is required.  Where did you find
conflicting information?

Thanks,
Florian


* Re: Implementation of strtok
  2023-06-02  5:40 ` Florian Weimer
@ 2023-06-02  5:52   ` Jayakrishna Vadayath
  2023-06-02  6:05     ` Florian Weimer
  0 siblings, 1 reply; 5+ messages in thread
From: Jayakrishna Vadayath @ 2023-06-02  5:52 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Jayakrishna Vadayath via Libc-alpha

Hi,

The man page of strtok mentions that strtok returns a pointer to the next
token or NULL if there are no more tokens:
https://man7.org/linux/man-pages/man3/strtok.3.html
I can see how the "no more tokens" would apply in this case, but it seems
like not many people are aware of this case.

Can you list the descriptions of strtok that explain this behavior?
I am very interested in reading about it.

Thank you.

-- 
Regards
Jayakrishna Menon

* Re: Implementation of strtok
  2023-06-02  5:52   ` Jayakrishna Vadayath
@ 2023-06-02  6:05     ` Florian Weimer
  2023-06-02  6:17       ` Jayakrishna Vadayath
  0 siblings, 1 reply; 5+ messages in thread
From: Florian Weimer @ 2023-06-02  6:05 UTC (permalink / raw)
  To: Jayakrishna Vadayath; +Cc: Jayakrishna Vadayath via Libc-alpha

* Jayakrishna Vadayath:

> The man page of strtok mention that strtok returns a pointer to the
> next token or NULL if there are no more tokens :
> https://man7.org/linux/man-pages/man3/strtok.3.html I can see how the
> "no more tokens" would apply in this case, but it seems like not many
> people are aware of this case.
>
> Can you list the descriptions of strtok that explain this behavior ?

This part is quite clear to me:

| The first call to strtok() sets this pointer to point to the first
| byte of the string.  The start of the next token is determined by
| scanning forward for the next nondelimiter byte in str.  If such a
| byte is found, it is taken as the start of the next token.  If no such
| byte is found, then there are no more tokens, and strtok() returns
| NULL.

The last sentence is really unambiguous.  Maybe it's the double
negation?

Thanks,
Florian
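
The scanning rule in the quoted man page text can be sketched with `strspn` and `strcspn`. This is only an illustration of the described behavior, not glibc's implementation; the helper name `first_token` and its `len` out-parameter are assumptions made for this example. Unlike strtok, it neither modifies the string nor keeps state between calls.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not glibc's code: returns the start of the first
   token in s, or NULL if s holds no nondelimiter byte. On success, *len
   receives the length of the token. */
static const char *first_token(const char *s, const char *delim, size_t *len)
{
    /* Scan forward for the next nondelimiter byte. */
    s += strspn(s, delim);
    if (*s == '\0')
        return NULL;            /* no such byte: there are no more tokens */

    /* The token runs up to the next delimiter byte or the end of s. */
    *len = strcspn(s, delim);
    return s;
}
```

With the string "\r" and the delimiter set "\r", the `strspn` step consumes the whole string, so the sketch returns NULL, which matches the strtok result reported at the start of the thread.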


* Re: Implementation of strtok
  2023-06-02  6:05     ` Florian Weimer
@ 2023-06-02  6:17       ` Jayakrishna Vadayath
  0 siblings, 0 replies; 5+ messages in thread
From: Jayakrishna Vadayath @ 2023-06-02  6:17 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Jayakrishna Vadayath via Libc-alpha

Hi,

Thank you for your reply.
I was only looking at the return value section of the documentation.
However, it makes sense to me now.

Thank you.

-- 
Regards
Jayakrishna Menon
