Hi наб!

On 4/21/23 03:15, наб wrote:
> On Fri, Apr 21, 2023 at 03:07:00AM +0200, Alejandro Colomar wrote:
>> On 4/21/23 02:45, Alejandro Colomar wrote:
>>> Is the following call valid, or is it UB?
>>>
>>>     regmatch_t pmatch = {
>>>         .rm_so = string,
>>>         .rm_eo = string + 42,  // Assume this offset is valid
>>>     };
>>>
>>>     regexec(preg, string, 0, pmatch, REG_NOSUB | REG_STARTEND);
>>>
>>> How about this?
>>>
>>>     regexec(preg, string, 999, pmatch, REG_NOSUB | REG_STARTEND);
> (If you make that "&pmatch",
> and put the REG_NOSUB into a preceding regcomp(), my bet is on "valid".)

D'oh!  I should check what I write before putting it in a bottle.
Yeah, I meant that, or at least should have meant that :)

>
>>> Current implementations will work, because nmatch is effectively
>>> ignored.  But is it intended to be this way, or just an implementation
>>> detail?
> My bet is on "intended", quoth 4.4BSD-Lite regex(3):
>   REG_STARTEND  The string is considered to start at string +
>                 pmatch[0].rm_so and to have a terminating NUL located at
>                 string + pmatch[0].rm_eo (there need not actually be a
>                 NUL at that location), regardless of the value of nmatch.
>                 See below for the definition of pmatch and nmatch.  This
>                 is an extension, compatible with but not specified by
>                 POSIX 1003.2, and should be used with caution in software
>                 intended to be portable to other systems.  Note that a
>                 non-zero rm_so does not imply REG_NOTBOL; REG_STARTEND
>                 affects only the location of the string, not how it is
>                 matched.

While this paragraph is not crystal clear to me, the one you quoted
below pretty much is.

>
>> Here's a related question:
>>
>>     regmatch_t pmatch = {
>>         .rm_so = string,
>>         .rm_eo = string + 42,  // Assume this offset is valid
>>     };
>>
>>     regexec(preg, string, 0, pmatch, REG_STARTEND);
>>
>> Should regexec(3) write to the 1st element in pmatch[] because it knows
>> it exists (otherwise the call would be UB because it needs to read it)?
> (Which would run counter to how POSIX defines the API.)
>
>> Or is passing 0 in nmatch effectively another way of performing
>> REG_NOSUB behavior without actually using the flag?
> Hilariously enough, quoth 4.4BSD-Lite regex(3) again,
> which phrases it exactly like you do:
>   If REG_NOSUB was specified in the compilation of the RE, or if nmatch
>   is 0, regexec ignores the pmatch argument (but see below for the case
>   where REG_STARTEND is specified).

Touché; it looks like you're right.  That sentence is unambiguous.
BTW, is the reference to some other text about REG_STARTEND the one
quoted first (above)?

Cheers,
Alex

>   Otherwise, pmatch points to an array
>   of nmatch structures of type regmatch_t.  Such a structure has at least
>   the members rm_so and rm_eo, both of type regoff_t (a signed arithmetic
>   type at least as large as an off_t and a ssize_t), containing
>   respectively the offset of the first character of a substring and the
>   offset of the first character after the end of the substring.  Offsets
>   are measured from the beginning of the string argument given to regexec.
>   An empty substring is denoted by equal offsets, both indicating the
>   character following the empty substring.
> (you know how I'm betting here).

:)

>
> наб

-- 
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
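
P.S.  For reference, here is a minimal C sketch of the corrected call
discussed in this thread: REG_NOSUB goes to regcomp(3), pmatch is passed
by address, and nmatch is 0.  The helper name match_within() and its
return convention are hypothetical, not something from the thread.  It
sets rm_so and rm_eo as offsets from the start of string, as the quoted
manual page text describes, and it assumes an implementation that
provides the REG_STARTEND extension as a macro (the BSDs and glibc do,
as far as I know); elsewhere it simply returns -1.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the thread): report whether `pattern`
 * matches somewhere within the first `len` bytes of `string`.
 * Returns 1 on a match, 0 on no match, and -1 on error or when the
 * REG_STARTEND extension is not available.
 */
int
match_within(const char *pattern, const char *string, size_t len)
{
#ifdef REG_STARTEND
    regex_t     preg;
    regmatch_t  pmatch;  /* a single element, passed below as &pmatch */
    int         ret;

    assert(pattern != NULL && string != NULL);

    /* REG_NOSUB is a regcomp(3) flag, not a regexec(3) flag. */
    if (regcomp(&preg, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* These are offsets from the start of string, not pointers. */
    pmatch.rm_so = 0;
    pmatch.rm_eo = (regoff_t) len;  /* position of the virtual NUL */

    /* nmatch is 0, yet pmatch[0] is still read as input. */
    ret = regexec(&preg, string, 0, &pmatch, REG_STARTEND);
    regfree(&preg);

    if (ret == 0)
        return 1;
    return (ret == REG_NOMATCH) ? 0 : -1;
#else
    /* Assumption: without REG_STARTEND there is nothing to demonstrate. */
    (void) pattern;
    (void) string;
    (void) len;
    return -1;
#endif
}
```

Where REG_STARTEND is available, I would expect
match_within("bar", "foobarbaz", 6) to report a match and
match_within("bar", "foobarbaz", 5) to report none, since the virtual
NUL then cuts off the final 'r'.  Nothing here depends on regexec(3)
writing back to pmatch, which matches the reading that nmatch == 0
only suppresses the output side.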