public inbox for systemtap@sourceware.org
* MAXSTRINGLEN applied to printf()?
@ 2012-08-09  4:02 halcyonic
  2012-08-09 16:56 ` Josh Stone
  0 siblings, 1 reply; 6+ messages in thread
From: halcyonic @ 2012-08-09  4:02 UTC
  To: systemtap

If I try to printf() more than MAXSTRINGLEN characters without outputting a newline, does that mean I'm exceeding the string length limit?  I.e., is just a normal string serving as the printf buffer, and all the normal rules about strings apply to it?  (I was trying to get around the string length limits by issuing multiple printf()'s, but that seems to be backfiring...)

Thanks,
Nick


* Re: MAXSTRINGLEN applied to printf()?
  2012-08-09  4:02 MAXSTRINGLEN applied to printf()? halcyonic
@ 2012-08-09 16:56 ` Josh Stone
  2012-08-09 19:51   ` halcyonic
  0 siblings, 1 reply; 6+ messages in thread
From: Josh Stone @ 2012-08-09 16:56 UTC
  To: halcyonic; +Cc: systemtap

On 08/08/2012 09:01 PM, halcyonic@gmail.com wrote:
> If I try to printf() more than MAXSTRINGLEN characters without
> outputting a newline, does that mean I'm exceeding the string length
> limit?  I.e., is just a normal string serving as the printf buffer,
> and all the normal rules about strings apply to it?  (I was trying to
> get around the string length limits by issuing multiple printf()'s,
> but that seems to be backfiring...)

As long as you're not using intermediate strings, MAXSTRINGLEN should
not limit you.  There is a different limit STP_BUFFER_SIZE which is 8192
bytes, but I believe even that is for each individual printf call.

As before, examples of what you're trying and the result would be helpful.


Josh
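The distinction between a bounded intermediate string and direct printf output can be sketched with a small model. This is a Python simulation, not SystemTap code: `bounded` and `concat` are hypothetical helpers, MAXSTRINGLEN is set to an artificially small 16, and the separate STP_BUFFER_SIZE limit is not modeled.

```python
MAXSTRINGLEN = 16  # illustrative value only; the limit includes the trailing \0


def bounded(s: str, limit: int = MAXSTRINGLEN) -> str:
    """Model a string variable: keep at most limit - 1 characters."""
    return s[: limit - 1]


def concat(a: str, b: str) -> str:
    """Model the '.' operator: the result is itself a bounded string."""
    return bounded(a + b)


names = ['"alpha"', '"beta"', '"gamma"']

# Path 1: accumulate everything in one string variable, then print it once.
acc = ""
for n in names:
    acc = concat(acc, n)

# Path 2: print each piece separately, so no intermediate string grows.
direct = "".join(bounded(n) for n in names)

print(acc)     # silently cut off at 15 characters
print(direct)  # all three items intact
```

In this model the accumulated string loses the tail of its last item, including the closing quote, while the piece-by-piece output is complete.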


* Re: MAXSTRINGLEN applied to printf()?
  2012-08-09 16:56 ` Josh Stone
@ 2012-08-09 19:51   ` halcyonic
  2012-08-09 20:31     ` Josh Stone
  0 siblings, 1 reply; 6+ messages in thread
From: halcyonic @ 2012-08-09 19:51 UTC
  To: Josh Stone; +Cc: systemtap

Sorry...didn't want to spam a lot of code and output.

So, for example, I get a long output that contains this fragment:

... ,"localhost.localdomain"localhost.localdomain_2012-8-8-23-47-30.bz2", ...

...generated by the following probe (I've set string lengths to be 2048).  The thing to notice is that any string concatenation I do should always involve the addition of a pair of quotes, so I should never be able to get an output that involves an odd number of quotes (as per above).  So either I'm somehow getting intermingled output (I don't think I am...the rest of the output looks perfectly fine), or somehow a string is getting silently truncated somewhere (fyi, all the filenames it's listing are of the form localhost.localdomain_2012-8-8-23-47-30.bz2).

Any theories appreciated.

Thanks,
Nick

-----
probe syscall.getdents.return {
    if ((execname() != "stap") && !is_fd_blacklisted(pid(), $fd)) {
    printf("{ \"arglist\":")
    arglist = "[ "
    if ($return > 0) {
        total_entries = 0
        dirent = $dirent
        total_bytes = 0
        current_bytes = 0
        while (total_bytes < $return) {
          if (dirent == 0) {
              break
          }
          nextarg = clean_string(user_string_warn(@cast(dirent, "struct linux_dirent")->d_name))
          len = @cast(dirent, "struct linux_dirent")->d_reclen
          formatlen = strlen(nextarg) + 2
          dirent += len
          total_bytes += len
          total_entries += 1

          if (total_entries < 256) {
              if (current_bytes + formatlen > 2048) {
                printf("%s", arglist)
                arglist = ""
                current_bytes = 0
              }
              
              arglist .= "\"".nextarg."\""
              current_bytes += formatlen 
              if (total_bytes < $return) {
                arglist .= ","
              }
          }          
        }
    }
    printf("%s],", arglist)
#    arglist = substr(arglist, 0, strlen(arglist)-1)."]"
#    outstr = "{ "
#    outstr .= "\"arglist\": "
#    outstr .= arglist.","
    outstr .= "\"count\": "
    outstr .= sprintf("%u", $count).","
    outstr .= "\"execname\": \""
    outstr .= clean_string(execname())."\","
    outstr .= "\"fd\": "
    outstr .= sprintf("%d", $fd).","
    outstr .= "\"op\": \""
    outstr .= clean_string("GETDENTS")."\","
    outstr .= "\"pid\": "
    outstr .= sprintf("%d", pid()).","
    outstr .= "\"ppid\": "
    outstr .= sprintf("%d", ppid()).","
    outstr .= "\"return\": "
    outstr .= sprintf("%d", $return).","
    outstr .= "\"timestamp\": "
    outstr .= sprintf("%d", gettimeofday_ms()).","
    outstr .= "\"total_bytes\": "
    outstr .= sprintf("%u", total_bytes).","
    outstr .= "\"total_entries\": "
    outstr .= sprintf("%u", total_entries).","
    outstr .= "\"uid\": "
    outstr .= sprintf("%d", uid())."}\n"
    printf("%s", outstr)
  }
}
-----


* Re: MAXSTRINGLEN applied to printf()?
  2012-08-09 19:51   ` halcyonic
@ 2012-08-09 20:31     ` Josh Stone
  2012-08-09 22:25       ` halcyonic
  0 siblings, 1 reply; 6+ messages in thread
From: Josh Stone @ 2012-08-09 20:31 UTC
  To: halcyonic; +Cc: systemtap

On 08/09/2012 12:51 PM, halcyonic@gmail.com wrote:
> Sorry...didn't want to spam a lot of code and output.

Thanks - what you have here is not nearly large enough that I'd call it
spam, and it's much better than us trying to guess what's going on.

> So, for example, I get a long output that contains this fragment:
> 
> ... ,"localhost.localdomain"localhost.localdomain_2012-8-8-23-47-30.bz2", ...
> 
> ...generated by the following probe (I've set string lengths to be
> 2048).  The thing to notice is that any string concatenation I do
> should always involve the addition of a pair of quotes, so I should
> never be able to get an output that involves an odd number of quotes
> (as per above).

But if your concatenation goes beyond MAXSTRINGLEN, then one of the pair
of quotes may be silently truncated.

> So either I'm somehow getting intermingled output (I
> don't think I am...the rest of the output looks perfectly fine), or
> somehow a string is getting silently truncated somewhere (fyi, all
> the filenames it's listing are of the form
> localhost.localdomain_2012-8-8-23-47-30.bz2).
> 
> Any theories appreciated.
> 
> Thanks, Nick
> 
> -----
> probe syscall.getdents.return {
>     if ((execname() != "stap") && !is_fd_blacklisted(pid(), $fd)) {
>     printf("{ \"arglist\":")
>     arglist = "[ "
>     if ($return > 0) {
>         total_entries = 0
>         dirent = $dirent
>         total_bytes = 0
>         current_bytes = 0
>         while (total_bytes < $return) {
>           if (dirent == 0) {
>               break
>           }
>           nextarg = clean_string(user_string_warn(@cast(dirent, "struct linux_dirent")->d_name))

What is clean_string(), something to sanitize to printable characters?
You might like user_string_quoted(), or "text_strn(str, 0, 1)" to quote
it after the fact.

>           len = @cast(dirent, "struct linux_dirent")->d_reclen
>           formatlen = strlen(nextarg) + 2
>           dirent += len
>           total_bytes += len
>           total_entries += 1
> 
>           if (total_entries < 256) {
>               if (current_bytes + formatlen > 2048) {
>                 printf("%s", arglist)
>                 arglist = ""
>                 current_bytes = 0
>               }
>               
>               arglist .= "\"".nextarg."\""
>               current_bytes += formatlen 
>               if (total_bytes < $return) {
>                 arglist .= ","
>               }
>           }          

Ok, I see you're trying to avoid MAXSTRINGLEN here.  I think your bug
may be simply when you add "," to arglist without also incrementing
current_bytes, so arglist is longer than you think when you check if a
printf is due.

Also, don't forget that a \0 terminator has to be present within
MAXSTRINGLEN too, so you really only have 2047 bytes to play with.
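The accounting problem can be reproduced in a small model. This is a Python simulation rather than SystemTap: `bounded_append` and `emit` are hypothetical names, MAXSTRINGLEN is shrunk to 32 so the effect shows with short names, and `chunk` / `used` stand in for arglist / current_bytes from the probe.

```python
MAXSTRINGLEN = 32  # illustrative; the thread uses 2048


def bounded_append(buf, piece, limit=MAXSTRINGLEN):
    """Model '.=': the result keeps at most limit - 1 characters."""
    return (buf + piece)[: limit - 1]


def emit(names, count_commas, budget):
    """Build a quoted, comma-separated list in flushed chunks."""
    out = []      # pieces handed to printf
    chunk = ""    # models arglist
    used = 0      # models current_bytes
    for i, name in enumerate(names):
        item = '"' + name + '"'
        last = i == len(names) - 1
        # The comma is only budgeted for when count_commas is set.
        need = len(item) + (1 if count_commas and not last else 0)
        if used + need > budget:
            out.append(chunk)      # flush before the chunk can overflow
            chunk, used = "", 0
        chunk = bounded_append(chunk, item)
        if not last:
            chunk = bounded_append(chunk, ",")
        used += need
    out.append(chunk)
    return "".join(out)


names = ["aaaaaaaa"] * 6
expected = ",".join('"' + n + '"' for n in names)

# Commas not counted, budget equal to MAXSTRINGLEN (as in the probe).
buggy = emit(names, count_commas=False, budget=MAXSTRINGLEN)
# Commas counted, one byte reserved for the \0 terminator.
fixed = emit(names, count_commas=True, budget=MAXSTRINGLEN - 1)

print(buggy)
print(fixed)
```

With the commas left out of the count, a chunk grows past the limit before a flush is triggered, so an item loses its closing quote and comma and runs into the next item, which is the same shape as the fragment reported above. Counting the comma and budgeting MAXSTRINGLEN - 1 keeps every chunk within bounds in this model.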

>         }
>     }
>     printf("%s],", arglist)
> #    arglist = substr(arglist, 0, strlen(arglist)-1)."]"
> #    outstr = "{ "
> #    outstr .= "\"arglist\": "
> #    outstr .= arglist.","
>     outstr .= "\"count\": "
>     outstr .= sprintf("%u", $count).","
>     outstr .= "\"execname\": \""
>     outstr .= clean_string(execname())."\","
>     outstr .= "\"fd\": "
>     outstr .= sprintf("%d", $fd).","
>     outstr .= "\"op\": \""
>     outstr .= clean_string("GETDENTS")."\","
>     outstr .= "\"pid\": "
>     outstr .= sprintf("%d", pid()).","
>     outstr .= "\"ppid\": "
>     outstr .= sprintf("%d", ppid()).","
>     outstr .= "\"return\": "
>     outstr .= sprintf("%d", $return).","
>     outstr .= "\"timestamp\": "
>     outstr .= sprintf("%d", gettimeofday_ms()).","
>     outstr .= "\"total_bytes\": "
>     outstr .= sprintf("%u", total_bytes).","
>     outstr .= "\"total_entries\": "
>     outstr .= sprintf("%u", total_entries).","
>     outstr .= "\"uid\": "
>     outstr .= sprintf("%d", uid())."}\n"
>     printf("%s", outstr)
>   }
> }
> -----

With your MAXSTRINGLEN=2048, you're probably not hitting limits here at
the end, but every single one of these ".", ".=", and "sprintf" creates
string temporaries.  The code we generate for this will be pretty
inefficient, with lots of string copies.  So I'd recommend building as
much as you can directly into that final output format, e.g.

  printf("\"count\":%u,\"execname\":%s, ...\n",
         $count, clean_string(execname()), ...)


Josh


* Re: MAXSTRINGLEN applied to printf()?
  2012-08-09 20:31     ` Josh Stone
@ 2012-08-09 22:25       ` halcyonic
  2012-08-09 23:43         ` Josh Stone
  0 siblings, 1 reply; 6+ messages in thread
From: halcyonic @ 2012-08-09 22:25 UTC
  To: Josh Stone; +Cc: systemtap

Sigh...that was the other reason I was afraid of posting code: it would turn out to be a stupid bug.  I think you may be right that it was simply not accounting for the commas.  Sorry for the distraction. :-/

Thanks,
Nick


* Re: MAXSTRINGLEN applied to printf()?
  2012-08-09 22:25       ` halcyonic
@ 2012-08-09 23:43         ` Josh Stone
  0 siblings, 0 replies; 6+ messages in thread
From: Josh Stone @ 2012-08-09 23:43 UTC
  To: halcyonic, SystemTap

On 08/09/2012 03:25 PM, halcyonic@gmail.com wrote:
> Sigh...that was the other reason I was afraid of posting code: it
> would turn out to be a stupid bug.  I think you may be right that it
> was simply not accounting for the commas.  Sorry for the distraction.
> :-/

Well, sharing code lets us track down bugs more quickly, whether the
problem lies in your own code or elsewhere, so I don't think you need to
feel bad.  Everyone makes mistakes, and this one was fairly subtle for
me to identify. :)  Do let us know if you still find issues.

Josh
