From: Josh Stone
Date: Thu, 09 Aug 2012 20:31:00 -0000
To: halcyonic@gmail.com
CC: systemtap@sourceware.org
Subject: Re: MAXSTRINGLEN applied to printf()?
Message-ID: <50241E1F.9010002@redhat.com>
References: <1F868470-E345-4262-B6C1-1C8DD5FD32F9@eecs.harvard.edu> <5023EBB5.1040206@redhat.com>

On 08/09/2012 12:51 PM, halcyonic@gmail.com wrote:
> Sorry...didn't want to spam a lot of code and output.

Thanks - what you have here is not nearly large enough that I'd call it
spam, and it's much better than us trying to guess what's going on.

> So, for example, I get a long output that contains this fragment:
>
> ... ,"localhost.localdomain"localhost.localdomain_2012-8-8-23-47-30.bz2", ...
>
> ...generated by the following probe (I've set string lengths to be
> 2048). The thing to notice is that any string concatenation I do
> should always involve the addition of a pair of quotes, so I should
> never be able to get an output that involves an odd number of quotes
> (as per above).

But if your concatenation goes beyond MAXSTRINGLEN, then one of the pair
of quotes may be silently truncated.

> So either I'm somehow getting intermingled output (I
> don't think I am...the rest of the output looks perfectly fine), or
> somehow a string is getting silently truncated somewhere (fyi, all
> the filenames it's listing are of the form
> localhost.localdomain_2012-8-8-23-47-30.bz2).
>
> Any theories appreciated.
>
> Thanks,
> Nick
>
> -----
> probe syscall.getdents.return {
>   if ((execname() != "stap") && !is_fd_blacklisted(pid(), $fd)) {
>     printf("{ \"arglist\":")
>     arglist = "[ "
>     if ($return > 0) {
>       total_entries = 0
>       dirent = $dirent
>       total_bytes = 0
>       current_bytes = 0
>       while (total_bytes < $return) {
>         if (dirent == 0) {
>           break
>         }
>         nextarg = clean_string(user_string_warn(@cast(dirent, "struct linux_dirent")->d_name))

What is clean_string(), something to sanitize to printable characters?
You might like user_string_quoted(), or "text_strn(str, 0, 1)" to quote
it after the fact.

>         len = @cast(dirent, "struct linux_dirent")->d_reclen
>         formatlen = strlen(nextarg) + 2
>         dirent += len
>         total_bytes += len
>         total_entries += 1
>
>         if (total_entries < 256) {
>           if (current_bytes + formatlen > 2048) {
>             printf("%s", arglist)
>             arglist = ""
>             current_bytes = 0
>           }
>
>           arglist .= "\"".nextarg."\""
>           current_bytes += formatlen
>           if (total_bytes < $return) {
>             arglist .= ","
>           }
>         }

Ok, I see you're trying to avoid MAXSTRINGLEN here.
I think your bug may be simply that you add "," to arglist without also
incrementing current_bytes, so arglist is longer than you think when you
check whether a printf is due. Also, don't forget that a \0 terminator
has to be present within MAXSTRINGLEN too, so you really only have 2047
bytes to play with.

>       }
>     }
>     printf("%s],", arglist)
>     # arglist = substr(arglist, 0, strlen(arglist)-1)."]"
>     # outstr = "{ "
>     # outstr .= "\"arglist\": "
>     # outstr .= arglist.","
>     outstr .= "\"count\": "
>     outstr .= sprintf("%u", $count).","
>     outstr .= "\"execname\": \""
>     outstr .= clean_string(execname())."\","
>     outstr .= "\"fd\": "
>     outstr .= sprintf("%d", $fd).","
>     outstr .= "\"op\": \""
>     outstr .= clean_string("GETDENTS")."\","
>     outstr .= "\"pid\": "
>     outstr .= sprintf("%d", pid()).","
>     outstr .= "\"ppid\": "
>     outstr .= sprintf("%d", ppid()).","
>     outstr .= "\"return\": "
>     outstr .= sprintf("%d", $return).","
>     outstr .= "\"timestamp\": "
>     outstr .= sprintf("%d", gettimeofday_ms()).","
>     outstr .= "\"total_bytes\": "
>     outstr .= sprintf("%u", total_bytes).","
>     outstr .= "\"total_entries\": "
>     outstr .= sprintf("%u", total_entries).","
>     outstr .= "\"uid\": "
>     outstr .= sprintf("%d", uid())."}\n"
>     printf("%s", outstr)
>   }
> }
> -----

With your MAXSTRINGLEN=2048, you're probably not hitting limits here at
the end, but every single one of these ".", ".=", and "sprintf" creates
a string temporary. The code we generate for this will be pretty
inefficient, with lots of string copies. So I'd recommend building as
much as you can directly into that final output format, e.g.

  printf("\"count\":%u,\"execname\":\"%s\", ...\n",
         $count, clean_string(execname()), ...)

Josh
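
For reference, here is a minimal, untested sketch of the buffer accounting
discussed in this message: each quoted entry is built with its comma
before it is measured, and the buffer is flushed at 2047 rather than 2048
so that the \0 terminator still fits. The begin probe, the synthetic
name_%d entries, and the variable name "piece" are hypothetical stand-ins
for the getdents loop, and the sketch assumes the script is run with
-DMAXSTRINGLEN=2048 as in the original report.

```systemtap
# Sketch only: a stand-alone model of the flush logic, not the real probe.
probe begin
{
    arglist = ""
    current_bytes = 0

    for (i = 0; i < 300; i++) {
        # Build the complete piece first, so the comma is counted too.
        piece = sprintf("\"name_%d\"", i)
        if (i < 299)
            piece .= ","
        formatlen = strlen(piece)

        # 2047 = MAXSTRINGLEN (assumed 2048) minus the \0 terminator.
        if (current_bytes + formatlen > 2047) {
            printf("%s", arglist)
            arglist = ""
            current_bytes = 0
        }

        arglist .= piece
        current_bytes += formatlen
    }

    # Flush whatever is left in the buffer.
    printf("%s\n", arglist)
    exit()
}
```

Because current_bytes now tracks every byte appended to arglist, the
flush check should fire before a concatenation can overflow, so a closing
quote is not expected to be dropped.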