public inbox for systemtap@sourceware.org
* [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
@ 2022-03-11 19:06 bill at broadley dot org
  2022-03-11 19:08 ` [Bug runtime/28958] " bill at broadley dot org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: bill at broadley dot org @ 2022-03-11 19:06 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

            Bug ID: 28958
           Summary: Current git + RHEL (4.18.0) has 3 working NFSd
                    examples, and one broken.
           Product: systemtap
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: runtime
          Assignee: systemtap at sourceware dot org
          Reporter: bill at broadley dot org
  Target Milestone: ---

I did a git clone today (Mar 11th, 2022) and nfsd-recent.stp, nfsdtop.stp, and
nfsd_unlink.stp work (in testsuite/systemtap.examples/network).

However nfsd-trace.stp results in:
# /nopt/apps/systemtap-git-2022-03-11/bin/stap nfsd-trace.stp
Fri Mar 11 12:03:39 2022 MST 192.168.12.20:34051 nfsd.proc4.write 932,0 ERROR:
read fault [man error::fault] at 0x10 near operator '->' at
/nopt/nrel/apps/systemtap-2022-03-11/share/systemtap/tapset/linux/dentry.stp:267:19
WARNING: Number of errors: 1, skipped probes: 1
WARNING: /nopt/nrel/apps/systemtap-2022-03-11/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

Here's the version:
[root@nas-1-0 network]# /nopt/nrel/apps/systemtap-2022-03-11/bin/stap -V
Systemtap translator/driver (version 4.7/0.185, commit
release-4.6-58-gaa27023c941f + changes)
Copyright (C) 2005-2021 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
tested kernel versions: 2.6.32 ... 5.17.0-rc4
enabled features: BPF NLS

-- 
You are receiving this mail because:
You are the assignee for the bug.


* [Bug runtime/28958] Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
  2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
@ 2022-03-11 19:08 ` bill at broadley dot org
  2022-03-11 19:11 ` [Bug testsuite/28958] " bill at broadley dot org
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bill at broadley dot org @ 2022-03-11 19:08 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

--- Comment #1 from BillB <bill at broadley dot org> ---
Forgot the kernel:

# uname -a
Linux nas-1-0.swift.hpc.foo.gov 4.18.0-348.12.2.el8_5.x86_64 #1 SMP Wed Jan 19
17:53:40 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux


* [Bug testsuite/28958] Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
  2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
  2022-03-11 19:08 ` [Bug runtime/28958] " bill at broadley dot org
@ 2022-03-11 19:11 ` bill at broadley dot org
  2022-03-17 20:18 ` wcohen at redhat dot com
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bill at broadley dot org @ 2022-03-11 19:11 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

BillB <bill at broadley dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|runtime                     |testsuite
           Keywords|                            |testsuite


* [Bug testsuite/28958] Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
  2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
  2022-03-11 19:08 ` [Bug runtime/28958] " bill at broadley dot org
  2022-03-11 19:11 ` [Bug testsuite/28958] " bill at broadley dot org
@ 2022-03-17 20:18 ` wcohen at redhat dot com
  2022-03-17 21:49 ` bill at broadley dot org
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: wcohen at redhat dot com @ 2022-03-17 20:18 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

William Cohen <wcohen at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wcohen at redhat dot com


* [Bug testsuite/28958] Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
  2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
                   ` (2 preceding siblings ...)
  2022-03-17 20:18 ` wcohen at redhat dot com
@ 2022-03-17 21:49 ` bill at broadley dot org
  2022-03-18 14:40 ` wcohen at redhat dot com
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bill at broadley dot org @ 2022-03-17 21:49 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

BillB <bill at broadley dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bill at broadley dot org


* [Bug testsuite/28958] Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
  2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
                   ` (3 preceding siblings ...)
  2022-03-17 21:49 ` bill at broadley dot org
@ 2022-03-18 14:40 ` wcohen at redhat dot com
  2022-03-18 15:18 ` wcohen at redhat dot com
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: wcohen at redhat dot com @ 2022-03-18 14:40 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

William Cohen <wcohen at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2022-03-18
     Ever confirmed|0                           |1

--- Comment #2 from William Cohen <wcohen at redhat dot com> ---
I am able to replicate the error on a RHEL8 VM exporting an NFS file system:

$ sudo ../install/bin/stap -v
testsuite/systemtap.examples/network/nfsd-trace.stp 
Pass 1: parsed user script and 483 library scripts using
294924virt/95612res/17928shr/79820data kb, in 180usr/160sys/485real ms.
Pass 2: analyzed script: 22 probes, 64 functions, 7 embeds, 1 global using
553136virt/351840res/19852shr/338032data kb, in 5550usr/1570sys/7502real ms.
Pass 3: translated to C into
"/tmp/stap13rEn4/stap_e2015f4beef9bcfacf31be2a7eccb277_61031_src.c" using
553136virt/351968res/19980shr/338032data kb, in 660usr/100sys/773real ms.
Pass 4: compiled C into "stap_e2015f4beef9bcfacf31be2a7eccb277_61031.ko" in
19610usr/3730sys/24872real ms.
Pass 5: starting run.
Fri Mar 18 10:30:25 2022 EDT 192.168.122.253:32771 nfsd.proc4.lookup testing
ERROR: read fault [man error::fault] at 0x10 near operator '->' at
/home/wcohen/systemtap_write/install/share/systemtap/tapset/linux/dentry.stp:267:19
WARNING: Number of errors: 1, skipped probes: 1
WARNING: /home/wcohen/systemtap_write/install/bin/staprun exited with status: 1
Pass 5: run completed in 20usr/250sys/52136real ms.
Pass 5: run failed.  [man error::pass5]

nfsd-trace.stp is the only one of the four NFSd examples that uses
task_dentry_path
(https://sourceware.org/systemtap/tapsets/API-task-dentry-path.html).


* [Bug testsuite/28958] Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
  2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
                   ` (4 preceding siblings ...)
  2022-03-18 14:40 ` wcohen at redhat dot com
@ 2022-03-18 15:18 ` wcohen at redhat dot com
  2022-03-22 17:55 ` wcohen at redhat dot com
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: wcohen at redhat dot com @ 2022-03-18 15:18 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

--- Comment #3 from William Cohen <wcohen at redhat dot com> ---
I found other testsuite scripts that end up using task_dentry_path, directly or
indirectly:

systemtap.examples/network/autofs4.stp
systemtap.examples/process/pfiles.stp
systemtap.base/task_paths.stp
systemtap.base/task_fd_lookup.stp

Of these the task_paths.stp is the simplest and looks to be exhibiting the same
behavior.  Note that the last two ERROR messages are expected and correct, but
the first two should print out strings:

$ sudo ../../install/bin/stap  systemtap.base/task_paths.stp -T 1
ERROR: read fault [man error::fault] at 0x10
ERROR: read fault [man error::fault] at 0x10
ERROR: read fault [man error::fault] at 0x0
ERROR: read fault [man error::fault] at 0xffffffffffffffff


* [Bug testsuite/28958] Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
  2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
                   ` (5 preceding siblings ...)
  2022-03-18 15:18 ` wcohen at redhat dot com
@ 2022-03-22 17:55 ` wcohen at redhat dot com
  2022-03-25 18:22 ` bill at broadley dot org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: wcohen at redhat dot com @ 2022-03-22 17:55 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

William Cohen <wcohen at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from William Cohen <wcohen at redhat dot com> ---
This has been fixed by commit 3739d47c4cc427ce4818d884f429a3efa85c38ae

An example of nfsd-trace.stp running correctly on a RHEL8 VM:

$ sudo ../install/bin/stap  testsuite/systemtap.examples/network/nfsd-trace.stp 
Tue Mar 22 13:51:51 2022 EDT 192.168.122.253:64003 nfsd.proc4.read 13,0
/mnt/myshareddir/testing
Tue Mar 22 13:51:51 2022 EDT 192.168.122.253:64003 nfsd.proc4.lookup .#testing
/mnt/myshareddir/.#testing
Tue Mar 22 13:51:51 2022 EDT 192.168.122.253:64003 nfsd.proc4.lookup #testing#
/mnt/myshareddir/#testing#
...


I also get the expected result for task_paths.stp now:

$ sudo ../install/bin/stap testsuite/systemtap.base/task_paths.stp -T 1
current cwd: /home/wcohen/systemtap_write/systemtap
current exe: /home/wcohen/systemtap_write/install/libexec/systemtap/stapio
ERROR: read fault [man error::fault] at 0x0
ERROR: read fault [man error::fault] at 0xffffffffffffffff


* [Bug testsuite/28958] Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
  2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
                   ` (6 preceding siblings ...)
  2022-03-22 17:55 ` wcohen at redhat dot com
@ 2022-03-25 18:22 ` bill at broadley dot org
  2022-03-25 20:02 ` wcohen at redhat dot com
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: bill at broadley dot org @ 2022-03-25 18:22 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

--- Comment #5 from BillB <bill at broadley dot org> ---
Much appreciated. I felt helpless trying to debug what a busy NFS file server
was doing without this functionality.

I confirmed the fix works for me with kernel 4.18.0-348.12.2.el8_5.x86_64.

I have some scripts that process the logs into useful reports. I'll mention
this on the IRC channel with an example to see whether it's of use to anyone,
and whether a link should be added to an nfsd-trace.README, a contrib/nfsd
directory, or similar.


* [Bug testsuite/28958] Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
  2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
                   ` (7 preceding siblings ...)
  2022-03-25 18:22 ` bill at broadley dot org
@ 2022-03-25 20:02 ` wcohen at redhat dot com
  2022-03-25 22:27 ` bill at broadley dot org
  2022-03-26 20:07 ` fche at redhat dot com
  10 siblings, 0 replies; 12+ messages in thread
From: wcohen at redhat dot com @ 2022-03-25 20:02 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

--- Comment #6 from William Cohen <wcohen at redhat dot com> ---
Glad that the nfsd-trace.stp is working for you.

Do you have some repository for the scripts processing the output of the
systemtap scripts?


I'm not sure what the best way would be to make the supplemental scripts
available to others. Some possibilities:
-Add a directory in testsuite/systemtap.examples to hold the processing
scripts, maybe something like the lwtools directory.
-Add an entry to the wiki at https://sourceware.org/systemtap/wiki describing
the scripts, where to get them, and how to use them.
-Update the existing systemtap script to include the functionality.  The
examples allow using shell scripts.


* [Bug testsuite/28958] Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
  2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
                   ` (8 preceding siblings ...)
  2022-03-25 20:02 ` wcohen at redhat dot com
@ 2022-03-25 22:27 ` bill at broadley dot org
  2022-03-26 20:07 ` fche at redhat dot com
  10 siblings, 0 replies; 12+ messages in thread
From: bill at broadley dot org @ 2022-03-25 22:27 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

--- Comment #7 from BillB <bill at broadley dot org> ---
No, but I could create one.

One of my summary scripts produces this:

begin=1648245843
end=1648246023

Total ops for 180 seconds

base dir                                   Operations  IOPs   IOPS%  Bandwidth  bandwidth%
/nas-1-0/abcdefg/home/user99/.schrodinger        7358  40.88  2.30%          0       0.00%
/scratch/carsXY.Z9_rsm#z1-14-0-3e/fort.50        4312  23.96  1.35%    1832394       1.68%
/scratch/carsXY.Z9_rsm#z8-77-0-99/fort.50        4103  22.79  1.28%    2376893       2.18%

hosts:               Operations
   192.168.6.55:31235     37529     11.74%
   192.168.5.51:17155     10623      3.32%
   192.168.13.28:9219      6981      2.18%
   192.168.6.15:57603      6973      2.18%

Operations
  nfsd.proc4.write    146374     45.77%
   nfsd.proc4.read    123553     38.64%
 nfsd.proc4.lookup     20517      6.42%
 nfsd.proc4.commit     19071      5.96%
 nfsd.proc4.remove      7566      2.37%
 nfsd.proc4.rename      2486      0.78%
 nfsd.proc4.create       223      0.07%
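
A report along these lines could be sketched in Python as follows. This is not
the reporter's script, which does not appear in this thread. It assumes each
nfsd-trace.stp record is a single line shaped like the output in comment #4
(timestamp fields, client ip:port, operation name, then operation-specific
details); the names summarize and format_report are hypothetical.

```python
from collections import Counter


def summarize(lines):
    """Count operations and client endpoints in nfsd-trace.stp style records."""
    ops, hosts = Counter(), Counter()
    for line in lines:
        fields = line.split()
        # Assumed layout: the operation is the first token starting with
        # "nfsd.", and the token just before it is the client ip:port.
        for i, tok in enumerate(fields):
            if tok.startswith("nfsd.") and i > 0:
                ops[tok] += 1
                hosts[fields[i - 1]] += 1
                break
        # Lines without an "nfsd." token (errors, warnings) are skipped.
    return ops, hosts


def format_report(counter, seconds):
    """Render name, count, per-second rate, and share of the total."""
    total = sum(counter.values())
    rows = []
    for name, n in counter.most_common():
        rows.append(
            f"{name:>24} {n:>8} {n / seconds:>8.2f} {100.0 * n / total:>6.2f}%"
        )
    return "\n".join(rows)


if __name__ == "__main__":
    sample = [
        "Tue Mar 22 13:51:51 2022 EDT 192.168.122.253:64003 "
        "nfsd.proc4.read 13,0 /mnt/myshareddir/testing",
        "Tue Mar 22 13:51:51 2022 EDT 192.168.122.253:64003 "
        "nfsd.proc4.lookup .#testing /mnt/myshareddir/.#testing",
    ]
    ops, hosts = summarize(sample)
    print(format_report(ops, seconds=180))
    print(format_report(hosts, seconds=180))
```

The per-second column is simply the count divided by the length of the capture
window, which matches the table above: 7358 operations over 180 seconds is
40.88 IOPS.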

I find it great for finding weird behavior, such as the first entry above,
which uses zero bandwidth but about 40 IOPS; it turns out nearly all of those
operations are nfsd.proc4.lookup.  Another common problem is a weird pattern
where C++ programs seek to X, write 4 bytes, seek to X, write 8 bytes, seek to
X, write 12 bytes ... up to 4096 bytes, then seek to X+4096.  So only 1 in 1024
writes is actually kept.
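
The arithmetic behind the 1-in-1024 figure can be checked with a small
simulation. It assumes, as described above, that every write starts at the same
offset X and grows by 4 bytes until the 4096-byte block is complete; the
function name and its parameters are illustrative only.

```python
def rewrite_overhead(block_size=4096, step=4):
    """Simulate writing step, 2*step, ..., block_size bytes at one offset.

    Each write overwrites the previous one, so only the final write of
    block_size bytes contributes data that is actually kept.
    """
    writes = 0
    bytes_sent = 0
    for length in range(step, block_size + 1, step):
        writes += 1
        bytes_sent += length
    # Ratio of bytes sent to the server versus bytes that survive.
    return writes, bytes_sent, bytes_sent / block_size


writes, bytes_sent, amplification = rewrite_overhead()
print(writes, bytes_sent, amplification)  # 1024 2099200 512.5
```

Under these assumptions each 4096-byte block costs 1024 write operations and
about 2 MB of traffic, roughly 512 times the data that ends up on disk.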

Do scripts like this seem generally useful?


* [Bug testsuite/28958] Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken.
  2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
                   ` (9 preceding siblings ...)
  2022-03-25 22:27 ` bill at broadley dot org
@ 2022-03-26 20:07 ` fche at redhat dot com
  10 siblings, 0 replies; 12+ messages in thread
From: fche at redhat dot com @ 2022-03-26 20:07 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=28958

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fche at redhat dot com

--- Comment #8 from Frank Ch. Eigler <fche at redhat dot com> ---
Definitely useful, whether for use as is, or as an educational example to
study.
We'd be glad to include it in the samples library.


end of thread, other threads:[~2022-03-26 20:07 UTC | newest]

Thread overview: 12+ messages
2022-03-11 19:06 [Bug runtime/28958] New: Current git + RHEL (4.18.0) has 3 working NFSd examples, and one broken bill at broadley dot org
2022-03-11 19:08 ` [Bug runtime/28958] " bill at broadley dot org
2022-03-11 19:11 ` [Bug testsuite/28958] " bill at broadley dot org
2022-03-17 20:18 ` wcohen at redhat dot com
2022-03-17 21:49 ` bill at broadley dot org
2022-03-18 14:40 ` wcohen at redhat dot com
2022-03-18 15:18 ` wcohen at redhat dot com
2022-03-22 17:55 ` wcohen at redhat dot com
2022-03-25 18:22 ` bill at broadley dot org
2022-03-25 20:02 ` wcohen at redhat dot com
2022-03-25 22:27 ` bill at broadley dot org
2022-03-26 20:07 ` fche at redhat dot com
