* Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64 @ 2012-03-30 21:15 William Cohen 2012-04-02 19:30 ` David Smith 0 siblings, 1 reply; 8+ messages in thread From: William Cohen @ 2012-03-30 21:15 UTC (permalink / raw) To: systemtap; +Cc: David Smith When running tests I found the following patch causing problems on ARM and IA64 machines http://sourceware.org/git/gitweb.cgi?p=systemtap.git;a=commitdiff;h=c0456b6fda16307070e0896d3cca86f523214a5e;hp=ca73974d84f94065dbabce2a2044d17a83bade57 On ARM with a really simple example end up with: $ ../install/bin/stap -k ../systemtap/testsuite/systemtap.base/add.stp WARNING: "copy_to_user_page" [/tmp/stapgL32Wo/stap_7644.ko] undefined! Error inserting module '/tmp/stapgL32Wo/stap_7644.ko': Unknown symbol in module WARNING: /media/greatplains/wcohen/systemtap_write/install/bin/staprun exited with status: 1 Pass 5: run failed. Try again with another '--vp 00001' option. Keeping temporary directory "/tmp/stapgL32Wo" On IA64 same example ends up with: $ ../install/bin/stap -k ../systemtap/testsuite/systemtap.base/add.stp WARNING: "flush_icache_range" [/tmp/staprfDqn5/stap_29610.ko] undefined! Error inserting module '/tmp/staprfDqn5/stap_29610.ko': Unknown symbol in module WARNING: /home/wcohen/systemtap_write/install/bin/staprun exited with status: 1 Pass 5: run failed. Try again with another '--vp 00001' option. Keeping temporary directory "/tmp/staprfDqn5" It looks like the reordering of the includes is causing a problem on these machines. -Will ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64 2012-03-30 21:15 Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64 William Cohen @ 2012-04-02 19:30 ` David Smith 2012-04-02 20:17 ` William Cohen 0 siblings, 1 reply; 8+ messages in thread From: David Smith @ 2012-04-02 19:30 UTC (permalink / raw) To: William Cohen; +Cc: systemtap On 03/30/2012 04:15 PM, William Cohen wrote: > When running tests I found the following patch causing problems on ARM and IA64 machines > > http://sourceware.org/git/gitweb.cgi?p=systemtap.git;a=commitdiff;h=c0456b6fda16307070e0896d3cca86f523214a5e;hp=ca73974d84f94065dbabce2a2044d17a83bade57 > > On ARM with a really simple example end up with: > > $ ../install/bin/stap -k ../systemtap/testsuite/systemtap.base/add.stp > WARNING: "copy_to_user_page" [/tmp/stapgL32Wo/stap_7644.ko] undefined! > Error inserting module '/tmp/stapgL32Wo/stap_7644.ko': Unknown symbol in module > WARNING: /media/greatplains/wcohen/systemtap_write/install/bin/staprun exited with status: 1 > Pass 5: run failed. Try again with another '--vp 00001' option. > Keeping temporary directory "/tmp/stapgL32Wo" > > > On IA64 same example ends up with: > > $ ../install/bin/stap -k ../systemtap/testsuite/systemtap.base/add.stp > WARNING: "flush_icache_range" [/tmp/staprfDqn5/stap_29610.ko] undefined! > Error inserting module '/tmp/staprfDqn5/stap_29610.ko': Unknown symbol in module > WARNING: /home/wcohen/systemtap_write/install/bin/staprun exited with status: 1 > Pass 5: run failed. Try again with another '--vp 00001' option. > Keeping temporary directory "/tmp/staprfDqn5" > > > It looks like the reordering of the includes is causing a problem on these machines. > > -Will This is fixed for IA64 in commit 414b89b. It might fix ARM also, but I don't have access to an ARM machine, so you'll have to test that one. Long story short, it wasn't a problem with includes, but with some new code. New code in sym.c/vma.c was calling __access_process_vm() in certain situations. The __access_process_vm() function ends up calling flush_icache_range() (__access_process_vm() calls copy_to_user_page(), which calls flush_icache_user_range(), which calls flush_icache_range()), which isn't exported. I fixed the problem by ifdef'ing out the __access_process_vm() call when we've got in-kernel utrace. -- David Smith dsmith@redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64 2012-04-02 19:30 ` David Smith @ 2012-04-02 20:17 ` William Cohen 2012-04-02 22:10 ` David Smith 0 siblings, 1 reply; 8+ messages in thread From: William Cohen @ 2012-04-02 20:17 UTC (permalink / raw) To: David Smith; +Cc: systemtap On 04/02/2012 03:30 PM, David Smith wrote: > ../install/bin/stap -k ../systemtap/testsuite/systemtap.base/add.stp Hi David, Thanks for taking a look at this problem. The git checkin fixes the problem for ia64. The arm machine is using a kernel from the linus torvald's git repo. This particular kernel doesn't have utrace support in it, so it doesn't have CONFIG_UTRACE set. Thus, things still fail in the same way on the ARM machine. Does this code work with stock x86_64 kernel? I am wondering why this problem wasn't seen on the x86 machine. Does the x86 not need to do the explicit flushes unlike the ia64 and ARM? -Will ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64 2012-04-02 20:17 ` William Cohen @ 2012-04-02 22:10 ` David Smith 2012-04-03 14:18 ` William Cohen ` (2 more replies) 0 siblings, 3 replies; 8+ messages in thread From: David Smith @ 2012-04-02 22:10 UTC (permalink / raw) To: William Cohen; +Cc: systemtap On 04/02/2012 03:16 PM, William Cohen wrote: > On 04/02/2012 03:30 PM, David Smith wrote: >> ../install/bin/stap -k ../systemtap/testsuite/systemtap.base/add.stp > > Hi David, > > Thanks for taking a look at this problem. > > The git checkin fixes the problem for ia64. > > The arm machine is using a kernel from the linus torvald's git repo. This particular kernel doesn't > have utrace support in it, so it doesn't have CONFIG_UTRACE set. > Thus, things still fail in the same way on the ARM machine. > > Does this code work with stock x86_64 kernel? I am wondering why this problem wasn't seen on the x86 > machine. Does the x86 not need to do the explicit flushes unlike the > ia64 and ARM? Right. From what I've been looking at on the stock x86_64 kernel, copy_to_user_page() boils down to a memcpy() call. Since x86_64 doesn't have an arch-specific cacheflush.h file, it inherits the following from asm-generic/cacheflush.h: ==== .... #define flush_icache_user_range(vma,pg,adr,len) do { } while (0) .... #define copy_to_user_page(vma, page, vaddr, dst, src, len) \ do { \ memcpy(dst, src, len); \ flush_icache_user_range(vma, page, vaddr, len); \ } while (0) ==== Arm has a arch-specific copy_to_user_page() (defined in arch/arm/include/cacheflush.h) which isn't exported. Sigh. It looks like on current kernels access_process_vm() is exported, which means we could use the real function instead of our copy (can you check and make sure this is exported on arm?). However, we've added the __access_process_vm_noflush() variant which isn't present upstream. I'm not sure there are easy answers here. -- David Smith dsmith@redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64 2012-04-02 22:10 ` David Smith @ 2012-04-03 14:18 ` William Cohen 2012-04-03 14:35 ` David Smith 2012-04-04 20:57 ` Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64) David Smith 2 siblings, 0 replies; 8+ messages in thread From: William Cohen @ 2012-04-03 14:18 UTC (permalink / raw) To: David Smith; +Cc: systemtap On 04/02/2012 06:10 PM, David Smith wrote: > On 04/02/2012 03:16 PM, William Cohen wrote: > >> On 04/02/2012 03:30 PM, David Smith wrote: >>> ../install/bin/stap -k ../systemtap/testsuite/systemtap.base/add.stp >> >> Hi David, >> >> Thanks for taking a look at this problem. >> >> The git checkin fixes the problem for ia64. >> >> The arm machine is using a kernel from the linus torvald's git repo. This particular kernel doesn't > >> have utrace support in it, so it doesn't have CONFIG_UTRACE set. >> Thus, things still fail in the same way on the ARM machine. > >> >> Does this code work with stock x86_64 kernel? I am wondering why this problem wasn't seen on the x86 > >> machine. Does the x86 not need to do the explicit flushes unlike the >> ia64 and ARM? > > Right. From what I've been looking at on the stock x86_64 kernel, > copy_to_user_page() boils down to a memcpy() call. > > Since x86_64 doesn't have an arch-specific cacheflush.h file, it > inherits the following from asm-generic/cacheflush.h: > > ==== > .... > #define flush_icache_user_range(vma,pg,adr,len) do { } while (0) > .... > #define copy_to_user_page(vma, page, vaddr, dst, src, len) \ > do { \ > memcpy(dst, src, len); \ > flush_icache_user_range(vma, page, vaddr, len); \ > } while (0) > ==== > > Arm has a arch-specific copy_to_user_page() (defined in > arch/arm/include/cacheflush.h) which isn't exported. Sigh. > > It looks like on current kernels access_process_vm() is exported, which > means we could use the real function instead of our copy (can you check > and make sure this is exported on arm?). However, we've added the > __access_process_vm_noflush() variant which isn't present upstream. > > I'm not sure there are easy answers here. > I looked through Module.symvers for the currently running kernel on the ARM machine and didn't see access_process_vm listed. Also seems that all the uses of access_process_vm() were for things that are built into the kernel rather than modules. -Will ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64 2012-04-02 22:10 ` David Smith 2012-04-03 14:18 ` William Cohen @ 2012-04-03 14:35 ` David Smith 2012-04-04 20:57 ` Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64) David Smith 2 siblings, 0 replies; 8+ messages in thread From: David Smith @ 2012-04-03 14:35 UTC (permalink / raw) To: William Cohen; +Cc: systemtap On 04/02/2012 05:10 PM, David Smith wrote: > It looks like on current kernels access_process_vm() is exported... Which isn't true. On 3.4.0-0.rc0.git1.2.tip0.uprobes.fc18.x86_64, 'stap -L' gets confused about which symbols are exported. I keep getting excited about the new exports, then my hopes get cruelly dashed. I'm still not sure how to solve this problem. -- David Smith dsmith@redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64) 2012-04-02 22:10 ` David Smith 2012-04-03 14:18 ` William Cohen 2012-04-03 14:35 ` David Smith @ 2012-04-04 20:57 ` David Smith 2012-04-05 15:29 ` David Smith 2 siblings, 1 reply; 8+ messages in thread From: David Smith @ 2012-04-04 20:57 UTC (permalink / raw) To: William Cohen; +Cc: systemtap On 04/02/2012 05:10 PM, David Smith wrote: > On 04/02/2012 03:16 PM, William Cohen wrote: (stuff from Will deleted that describes a problem on IA64/ARM where some new code I added that calls our __access_process_vm() comes up with an undefined symbol) Here's some more background and context here. The kernel has _acesss_process_vm() but it isn't exported. So, we have an internal copy called __access_process_vm(). We also have a variant, called __access_process_vm_noflush(), that doesn't flush the instruction cache. This variant isn't present upstream. Our __access_process_vm() calls copy_from_user_page() and copy_to_use_page() to do the actual reading/writing. There is a default definition of those functions present in include/asm-generic/cacheflush.h that looks like this: ==== .... #define flush_icache_user_range(vma,pg,adr,len) do { } while (0) .... #define copy_to_user_page(vma, page, vaddr, dst, src, len) \ do { \ memcpy(dst, src, len); \ flush_icache_user_range(vma, page, vaddr, len); \ } while (0) #define copy_from_user_page(vma, page, vaddr, dst, src, len) \ memcpy(dst, src, len) ==== So, copy_to_user_page()/copy_from_user_page() boil down to just memcpy(). Here's how each arch we're interested in handles copy_to_user_page()/copy_from_user_page() in the current upstream kernel source. - x86, x86_64: Just uses the default. - s390x: Just uses the default. - ia64: flush_icache_user_range() is a define that calls flush_icache_range(), which isn't exported - mips: has unexported copy_to_user_page()/copy_from_user_page() functions - arm: copy_from_user_page() is just #defined to be memcpy(). copy_to_user_page() is an unexported function So, we've got problems on ia64, mips, and arm. Note that I'm a bit worried that __access_process_vm_noflush() isn't quite right on mips, since it doesn't call its arch specific copy_from_user_page() (all other arches just use memcpy to read). But, the mips arch-specific copy_from_user_page() isn't exported. We'd someone who understands mips details to look at the internals of copy_from_user_page() to know if the differences are important. So, that's the status of __access_process_vm() on all the arches that systemtap runs on. However... I realized while looking at this (prompted by Will) that the build-id checking code is only reading user memory, not writing it. When reading user memory, __access_process_vm() and __access_process_vm_noflush() are fairly equivalent. So, I've switch the build-id checking code to use __access_process_vm_noflush() (commit f1410a8). This should hopefully fix ia64, mips, and arm on current kernels. Will, if this doesn't fix ARM, let me know. -- David Smith dsmith@redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64) 2012-04-04 20:57 ` Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64) David Smith @ 2012-04-05 15:29 ` David Smith 0 siblings, 0 replies; 8+ messages in thread From: David Smith @ 2012-04-05 15:29 UTC (permalink / raw) To: William Cohen; +Cc: systemtap OK, obviously I needed more caffeine yesterday, because... On 04/04/2012 03:56 PM, David Smith wrote: ... > Here's how each arch we're interested in handles > copy_to_user_page()/copy_from_user_page() in the current upstream kernel > source. > > - x86, x86_64: Just uses the default. > - s390x: Just uses the default. > - ia64: flush_icache_user_range() is a define that calls > flush_icache_range(), which isn't exported > - mips: has unexported copy_to_user_page()/copy_from_user_page() functions > - arm: copy_from_user_page() is just #defined to be memcpy(). > copy_to_user_page() is an unexported function > > So, we've got problems on ia64, mips, and arm. > > Note that I'm a bit worried that __access_process_vm_noflush() isn't > quite right on mips, since it doesn't call its arch specific > copy_from_user_page() (all other arches just use memcpy to read). But, > the mips arch-specific copy_from_user_page() isn't exported. We'd > someone who understands mips details to look at the internals of > copy_from_user_page() to know if the differences are important. > > So, that's the status of __access_process_vm() on all the arches that > systemtap runs on. Systemtap has never run on mips. I don't know how ppc got confused with mips in my head. Here's how ppc handles copy_to_user_page()/copy_from_user_page(): flush_icache_user_range() is an exported function The good news here is that memcpy() can be used just fine (in __access_process_vm_noflush()) to read user memory on all the (real) arches we care about. Now I'm off to drink a diet coke... -- David Smith dsmith@redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2012-04-05 15:29 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-03-30 21:15 Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64 William Cohen 2012-04-02 19:30 ` David Smith 2012-04-02 20:17 ` William Cohen 2012-04-02 22:10 ` David Smith 2012-04-03 14:18 ` William Cohen 2012-04-03 14:35 ` David Smith 2012-04-04 20:57 ` Improve build-id checking when the task we're interested in isn't 'current'. git commit causing problems on ARM and IA64) David Smith 2012-04-05 15:29 ` David Smith
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).