* use of %fs segment register in x86_64 with -fstack-check
From: Maxim Blinov @ 2020-03-03 14:53 UTC
To: gdb

Hi all,

I'm looking at some -fstack-check'ed code, and would appreciate it if
some gdb x86_64 gurus could double-check my understanding of a trivial
example.

Here is the source:

big-access.c:
```
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

extern void foo(char *);

int main()
{
    char ch[8000];
    foo (ch);

    return 0;
}
```

foo.c:
```
void foo(char *ch) { }
```

And the compilation line:

$ gcc -O2 -fstack-check -o big-access big-access.c foo.c -fdump-rtl-final

And here is the gdb view (ignore the breakpoint and current insn caret):
```
B+ │0x555555554560 <main>       sub    $0x2f78,%rsp
   │0x555555554567 <main+7>     orq    $0x0,0xf58(%rsp)
   │0x555555554570 <main+16>    orq    $0x0,(%rsp)
   │0x555555554575 <main+21>    add    $0x1020,%rsp
   │0x55555555457c <main+28>    mov    %rsp,%rdi
   │0x55555555457f <main+31>    mov    %fs:0x28,%rax
  >│0x555555554588 <main+40>    mov    %rax,0x1f48(%rsp)
   │0x555555554590 <main+48>    xor    %eax,%eax
   │0x555555554592 <main+50>    callq  0x5555555546d0 <foo>
   │0x555555554597 <main+55>    mov    0x1f48(%rsp),%rdx
   │0x55555555459f <main+63>    xor    %fs:0x28,%rdx
   │0x5555555545a8 <main+72>    jne    0x5555555545b4 <main+84>
   │0x5555555545aa <main+74>    xor    %eax,%eax
   │0x5555555545ac <main+76>    add    $0x1f58,%rsp
   │0x5555555545b3 <main+83>    retq
   │0x5555555545b4 <main+84>    callq  0x555555554540 <__stack_chk_fail@plt>
   │0x5555555545b9              nopl   0x0(%rax)
```

I would just like someone who knows their stuff to double-check my
understanding:

The "orq" instructions at the start are purposefully causing a "dummy"
load/store event so the VMM can decide whether or not it is sane for us
to have used those pages for the stack, right?

Another question is about the instruction at address 0x55555555457f.
I presume that %fs:0x28 is a memory address that points to a sentinel
value. We load it into %rax, and then we store it in strategic
locations in our stack to serve as sentinel values. Before we leave, we
check that the memory location hasn't changed at 0x55555555459f. That
implies that the memory location %fs:0x28 is pointing to a
globally-used sentinel value?

But who sets %fs? Indeed, what is the ABI usage of %fs in the context
of Linux x86_64? And why the 0x28 offset?

Thank you for reading,
Maxim
* Re: use of %fs segment register in x86_64 with -fstack-check
From: Ruslan Kabatsayev @ 2020-03-03 16:51 UTC
To: Maxim Blinov; +Cc: gdb

Hi,

On Tue, 3 Mar 2020 at 17:53, Maxim Blinov <maxim.blinov@embecosm.com> wrote:
>
> Hi all,
>
> I'm looking at some -fstack-check'ed code, and would appreciate it if
> some gdb x86_64 gurus could double-check my understanding of a trivial
> example.
>
> Here is the source:
>
> big-access.c:
> ```
> #include <stdio.h>
> #include <stdlib.h>
> #include <stdint.h>
>
> extern void foo(char *);
>
> int main()
> {
>     char ch[8000];
>     foo (ch);
>
>     return 0;
> }
> ```
>
> foo.c:
> ```
> void foo(char *ch) { }
> ```
>
> And the compilation line:
>
> $ gcc -O2 -fstack-check -o big-access big-access.c foo.c -fdump-rtl-final
>
> And here is the gdb view (ignore the breakpoint and current insn caret):
> ```
> B+ │0x555555554560 <main>       sub    $0x2f78,%rsp
>    │0x555555554567 <main+7>     orq    $0x0,0xf58(%rsp)
>    │0x555555554570 <main+16>    orq    $0x0,(%rsp)
>    │0x555555554575 <main+21>    add    $0x1020,%rsp
>    │0x55555555457c <main+28>    mov    %rsp,%rdi
>    │0x55555555457f <main+31>    mov    %fs:0x28,%rax
>   >│0x555555554588 <main+40>    mov    %rax,0x1f48(%rsp)
>    │0x555555554590 <main+48>    xor    %eax,%eax
>    │0x555555554592 <main+50>    callq  0x5555555546d0 <foo>
>    │0x555555554597 <main+55>    mov    0x1f48(%rsp),%rdx
>    │0x55555555459f <main+63>    xor    %fs:0x28,%rdx
>    │0x5555555545a8 <main+72>    jne    0x5555555545b4 <main+84>
>    │0x5555555545aa <main+74>    xor    %eax,%eax
>    │0x5555555545ac <main+76>    add    $0x1f58,%rsp
>    │0x5555555545b3 <main+83>    retq
>    │0x5555555545b4 <main+84>    callq  0x555555554540 <__stack_chk_fail@plt>
>    │0x5555555545b9              nopl   0x0(%rax)
> ```
>
> I would just like someone who knows their stuff to double-check my
> understanding:
>
> The "orq" instructions at the start are purposefully causing a "dummy"
> load/store event so the VMM can decide whether or not it is sane for us
> to have used those pages for the stack, right?

Not quite. As noted at [1], this OR is to ensure that the stack hasn't
overflowed. This is the part added by -fstack-check (you can see it go
away when you remove this option). See [2] for documentation.

>
> Another question is about the instruction at address 0x55555555457f.
> I presume that %fs:0x28 is a memory address that points to a sentinel
> value. We load it into %rax, and then we store it in strategic
> locations in our stack to serve as sentinel values. Before we leave, we
> check that the memory location hasn't changed at 0x55555555459f. That
> implies that the memory location %fs:0x28 is pointing to a
> globally-used sentinel value?

Right. But note that this is enabled not by -fstack-check, but rather
by one of the -fstack-protector* options that are on by default on
modern Linux distributions. You can confirm this by explicitly passing
-fno-stack-protector and seeing this sentinel checking gone.

>
> But who sets %fs? Indeed, what is the ABI usage of %fs in the context
> of Linux x86_64?

The FS segment base points to the TLS. See [3] and links therein.

> And why the 0x28 offset?

It's the offset of the stack_guard member of tcbhead_t. See the
corresponding glibc source [4].

>
> Thank you for reading,
> Maxim

[1]: https://stackoverflow.com/a/44670648/673852
[2]: https://gcc.gnu.org/onlinedocs/gccint/Stack-Checking.html
[3]: https://chao-tic.github.io/blog/2018/12/25/tls
[4]: https://code.woboq.org/userspace/glibc/sysdeps/x86_64/nptl/tls.h.html#42

Regards,
Ruslan
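To make the 0x28 figure concrete, here is a minimal sketch that mirrors the leading members of glibc's x86_64 tcbhead_t and computes where stack_guard lands. The struct name tcbhead_mirror and the helper stack_guard_offset are made up for this illustration, and pointer members are written as uint64_t on the assumption of 8-byte pointers; the authoritative layout is the glibc header linked as [4].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified mirror of the first members of glibc's
   x86_64 tcbhead_t. Pointers are spelled as uint64_t (assumption:
   8-byte pointers, as on x86_64) so the layout is the same on any host. */
struct tcbhead_mirror {
    uint64_t tcb;              /* void *tcb   */
    uint64_t dtv;              /* dtv_t *dtv  */
    uint64_t self;             /* void *self  */
    int32_t  multiple_threads;
    int32_t  gscope_flag;
    uint64_t sysinfo;
    uint64_t stack_guard;      /* the value read by mov %fs:0x28,%rax */
    uint64_t pointer_guard;
};

/* Three 8-byte members, two 4-byte ints, one more 8-byte member:
   24 + 8 + 8 = 40 = 0x28. */
static_assert(offsetof(struct tcbhead_mirror, stack_guard) == 0x28,
              "stack_guard is expected at offset 0x28");

/* Illustrative helper: returns the byte offset of stack_guard. */
size_t stack_guard_offset(void)
{
    return offsetof(struct tcbhead_mirror, stack_guard);
}
```

Since the FS base points at this structure for the current thread, %fs:0x28 addresses the thread's copy of the guard value; stack_guard_offset() simply reports that displacement.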
* Fwd: use of %fs segment register in x86_64 with -fstack-check
From: Maxim Blinov @ 2020-03-03 18:43 UTC
To: gdb

(Sorry, forgot to CC gdb ml)

---------- Forwarded message ---------
From: Maxim Blinov <maxim.blinov@embecosm.com>
Date: Tue, 3 Mar 2020 at 18:37
Subject: Re: use of %fs segment register in x86_64 with -fstack-check
To: Ruslan Kabatsayev <b7.10110111@gmail.com>

Hi Ruslan, thank you for your explanations. Unfortunately, I still
can't see the whole picture.

On Tue, 3 Mar 2020 at 16:51, Ruslan Kabatsayev <b7.10110111@gmail.com> wrote:
> Not quite. As noted at [1], this OR is to ensure that the stack hasn't
> overflowed. This is the part added by -fstack-check (you can see it go
> away when you remove this option). See [2] for documentation.

I don't understand how the OR insns check that the stack hasn't overflowed.

From [1], the author writes "it just inserts a NULL byte". What is
*it* in this context? I don't see anyone writing anything to the stack
in the assembly. Does Linux do it on our behalf, and then the OR insns
check that those bytes are indeed NULL?

Furthermore, I can't see who uses the result of the OR operation. I'm
under the impression that there is some page fault magic happening
under the hood, but what is that magic? No insns after the ORs perform
any conditional jumps based on the ORs' results that I can see
(although I am not very knowledgeable about x86_64 asm). So I am still
confused.

I did read [2] before posting, but unfortunately I didn't find it very helpful.

I tried to step through each insn in my head to demonstrate where I don't get it:

0x555555554560 <main> sub $0x2f78,%rsp
Ok, whatever %rsp was, it's now %rsp - 12152. That's a lot more than
8000, but fine.
Let's call %rsp before we subtracted it "%original".

0x555555554567 <main+7> orq $0x0,0xf58(%rsp)
Ok, we OR with memory location %rsp + 3928. Taking into account the
previous offset, we're accessing %original + (3928 - 12152) which is
%original - 8224. So this is about 200 bytes after the stack array
ends. The instruction doesn't change the value at 0xf58(%rsp). My
understanding is that this instruction will fetch the quadword at
0xf58(%rsp), OR it with $0x0, and then store the result of that
computation back to the same address. How does this check that no
stack overflow has occurred?

0x555555554570 <main+16> orq $0x0,(%rsp)
We do it again, this time at %original - 12152 (the bottom of the
stack). Is this because we might span over two pages?

0x555555554575 <main+21> add $0x1020,%rsp
Now we set %rsp to be %original - 8024. So now we are actually
pointing to the stack byte just after the large array.

0x55555555457c <main+28> mov %rsp,%rdi
Now we save %rsp to %rdi, despite %rdi not being used anywhere... not
sure about this one.

0x55555555457f <main+31> mov %fs:0x28,%rax
Load the magic sentinel pattern, OK.

0x555555554588 <main+40> mov %rax,0x1f48(%rsp)
0x1f48 corresponds to %original - 16. So we are writing a sentinel
value to almost the start of the stack for this func.

0x555555554590 <main+48> xor %eax,%eax
0x555555554592 <main+50> callq 0x5555555546d0 <foo>

Clear %eax for foo's return value and call foo.

0x555555554597 <main+55> mov 0x1f48(%rsp),%rdx
0x55555555459f <main+63> xor %fs:0x28,%rdx
0x5555555545a8 <main+72> jne 0x5555555545b4 <main+84>

Now we double-check that the sentinel value at %original - 16 is
exactly the same as it was before we called foo, and if it isn't, we
go to __stack_chk_fail. So, this protects us against the case where
foo trashed the start of our stack?

0x5555555545aa <main+74> xor %eax,%eax
0x5555555545ac <main+76> add $0x1f58,%rsp
0x5555555545b3 <main+83> retq

Clear our own return value, clean up the stack, and exit.

I just don't understand how the ORs are ensuring the stack hasn't overflowed.

> Right. But note that this is enabled not by -fstack-check, but rather
> by one of the -fstack-protector* options that are on by default on
> modern Linux distributions. You can confirm this by explicitly passing
> -fno-stack-protector and seeing this sentinel checking gone.

Ok, I see.

> The FS segment base points to the TLS. See [3] and links therein.
...
> It's the offset of the stack_guard member of tcbhead_t. See the
> corresponding glibc source [4].

Got it, thank you.

> [1]: https://stackoverflow.com/a/44670648/673852
> [2]: https://gcc.gnu.org/onlinedocs/gccint/Stack-Checking.html
> [3]: https://chao-tic.github.io/blog/2018/12/25/tls
> [4]: https://code.woboq.org/userspace/glibc/sysdeps/x86_64/nptl/tls.h.html#42
>
> Regards,
> Ruslan
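The offset arithmetic in the walk-through above can be checked mechanically. The following is only a sketch: the immediates are copied from the disassembly in this thread, while the enum and function names are invented for the illustration. Every result is a byte offset relative to "%original", the value of %rsp on entry to main.

```c
#include <assert.h>

/* Immediates taken from the disassembly quoted in this thread. */
enum {
    FRAME_ALLOC  = 0x2f78, /* sub $0x2f78,%rsp       */
    PROBE1_DISP  = 0xf58,  /* orq $0x0,0xf58(%rsp)   */
    PROBE2_DISP  = 0x0,    /* orq $0x0,(%rsp)        */
    FRAME_GIVEBK = 0x1020, /* add $0x1020,%rsp       */
    CANARY_DISP  = 0x1f48, /* mov %rax,0x1f48(%rsp)  */
    EPILOGUE_ADD = 0x1f58  /* add $0x1f58,%rsp       */
};

/* The prologue's net adjustment must be undone exactly by the epilogue. */
static_assert(FRAME_ALLOC - FRAME_GIVEBK == EPILOGUE_ADD,
              "prologue and epilogue adjustments should balance");

/* All results are offsets from %original (hypothetical helper names). */
long first_probe(void)        { return -FRAME_ALLOC + PROBE1_DISP; }
long second_probe(void)       { return -FRAME_ALLOC + PROBE2_DISP; }
long settled_rsp(void)        { return -FRAME_ALLOC + FRAME_GIVEBK; }
long canary_slot(void)        { return settled_rsp() + CANARY_DISP; }
long rsp_after_epilogue(void) { return settled_rsp() + EPILOGUE_ADD; }
```

With these inputs the two probes land at %original - 8224 and %original - 12152, %rsp settles at %original - 8024, the sentinel is stored at %original - 16, and the epilogue returns %rsp to %original, which matches the figures worked out by hand.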
* Re: use of %fs segment register in x86_64 with -fstack-check
From: Ruslan Kabatsayev @ 2020-03-03 20:03 UTC
To: Maxim Blinov; +Cc: gdb

On Tue, 3 Mar 2020 at 21:37, Maxim Blinov <maxim.blinov@embecosm.com> wrote:
>
> Hi Ruslan, thank you for your explanations. Unfortunately, I still
> can't see the whole picture.
>
> On Tue, 3 Mar 2020 at 16:51, Ruslan Kabatsayev <b7.10110111@gmail.com> wrote:
> > Not quite. As noted at [1], this OR is to ensure that the stack hasn't
> > overflowed. This is the part added by -fstack-check (you can see it go
> > away when you remove this option). See [2] for documentation.
>
> I don't understand how the OR insns check that the stack hasn't overflowed.
>
> From [1], the author writes "it just inserts a NULL byte". What is
> *it* in this context? I don't see anyone writing anything to the stack
> in the assembly. Does Linux do it on our behalf, and then the OR insns
> check that those bytes are indeed NULL?
>
> Furthermore, I can't see who uses the result of the OR operation. I'm
> under the impression that there is some page fault magic happening
> under the hood, but what is that magic? No insns after the ORs perform
> any conditional jumps based on the ORs' results that I can see
> (although I am not very knowledgeable about x86_64 asm). So I am still
> confused.
>
> I did read [2] before posting, but unfortunately I didn't find it very helpful.
>
> I tried to step through each insn in my head to demonstrate where I don't get it:
>
> 0x555555554560 <main> sub $0x2f78,%rsp
> Ok, whatever %rsp was, it's now %rsp - 12152. That's a lot more than
> 8000, but fine.
> Let's call %rsp before we subtracted it "%original".
>
> 0x555555554567 <main+7> orq $0x0,0xf58(%rsp)
> Ok, we OR with memory location %rsp + 3928. Taking into account the
> previous offset, we're accessing %original + (3928 - 12152) which is
> %original - 8224. So this is about 200 bytes after the stack array
> ends. The instruction doesn't change the value at 0xf58(%rsp). My
> understanding is that this instruction will fetch the quadword at
> 0xf58(%rsp), OR it with $0x0, and then store the result of that
> computation back to the same address. How does this check that no
> stack overflow has occurred?
>
> 0x555555554570 <main+16> orq $0x0,(%rsp)
> We do it again, this time at %original - 12152 (the bottom of the
> stack). Is this because we might span over two pages?

Not merely "might", we _do_ span two pages. Pages are 4096 bytes in
size, so an 8000-byte array cannot fit in one.

>
> 0x555555554575 <main+21> add $0x1020,%rsp
> Now we set %rsp to be %original - 8024. So now we are actually
> pointing to the stack byte just after the large array.
>
> 0x55555555457c <main+28> mov %rsp,%rdi
> Now we save %rsp to %rdi, despite %rdi not being used anywhere... not
> sure about this one.

Actually it _is_ used, namely in the callee. That's how the first
integer or pointer argument is passed; see the System V x86-64 psABI
for more details. So RSP (and RDI) now contains the address of the
first byte of the array.

>
> 0x55555555457f <main+31> mov %fs:0x28,%rax
> Load the magic sentinel pattern, OK.
>
> 0x555555554588 <main+40> mov %rax,0x1f48(%rsp)
> 0x1f48 corresponds to %original - 16. So we are writing a sentinel
> value to almost the start of the stack for this func.
>
> 0x555555554590 <main+48> xor %eax,%eax
> 0x555555554592 <main+50> callq 0x5555555546d0 <foo>
>
> Clear %eax for foo's return value and call foo.

No, it's not clearing for the return value. The return type of foo is
void, so this must be something else. I'd guess it's clearing the
sentinel value so that foo doesn't have easy access to it. Otherwise
it could somehow (e.g. due to an uninitialized variable) be written by
foo into the area being protected, which would defeat the protector's
efforts, since stack smashing would then not be detected.

>
> 0x555555554597 <main+55> mov 0x1f48(%rsp),%rdx
> 0x55555555459f <main+63> xor %fs:0x28,%rdx
> 0x5555555545a8 <main+72> jne 0x5555555545b4 <main+84>
>
> Now we double-check that the sentinel value at %original - 16 is
> exactly the same as it was before we called foo, and if it isn't, we
> go to __stack_chk_fail. So, this protects us against the case where
> foo trashed the start of our stack?

Yes, this protects us from the case where a buffer overrun overwrites
the return address, so that the return would land us in the wrong
place, possibly in malicious code if the overrun is being exploited.

>
> 0x5555555545aa <main+74> xor %eax,%eax
> 0x5555555545ac <main+76> add $0x1f58,%rsp
> 0x5555555545b3 <main+83> retq
>
> Clear our own return value, clean up the stack, and exit.
>
> I just don't understand how the ORs are ensuring the stack hasn't overflowed.

I think this is supposed to ensure that, once you've grown the stack
to some large size (by subtracting from RSP), the whole allocated
space actually belongs to the stack. Otherwise, you could e.g. grow it
by 2GiB, write to the newly allocated space and clobber the heap,
without noticing the gap under the lowest stack location. These ORs
ensure that this gap is noticed: touching memory that doesn't belong
to the stack gets you a SIGSEGV.

>
> > Right. But note that this is enabled not by -fstack-check, but rather
> > by one of the -fstack-protector* options that are on by default on
> > modern Linux distributions. You can confirm this by explicitly passing
> > -fno-stack-protector and seeing this sentinel checking gone.
>
> Ok, I see.
>
> > The FS segment base points to the TLS. See [3] and links therein.
> ...
> > It's the offset of the stack_guard member of tcbhead_t. See the
> > corresponding glibc source [4].
>
> Got it, thank you.
>
> > [1]: https://stackoverflow.com/a/44670648/673852
> > [2]: https://gcc.gnu.org/onlinedocs/gccint/Stack-Checking.html
> > [3]: https://chao-tic.github.io/blog/2018/12/25/tls
> > [4]: https://code.woboq.org/userspace/glibc/sysdeps/x86_64/nptl/tls.h.html#42
> >
> > Regards,
> > Ruslan
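The point that a probe needs no conditional jump, because the fault itself is the check, can be tried out with a rough sketch like the one below. It is not what GCC emits and it does not touch the real stack: as a stand-in for memory lying beyond the stack, it maps one page with PROT_NONE in a child process and performs the same kind of read-modify-write as `orq $0x0,(%rsp)`. The function name probe_inaccessible_page is invented for the example, and it assumes a Linux-like system with fork, mmap and MAP_ANONYMOUS.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the number of the signal that killed a
   child which probed an inaccessible page, 0 if the child survived,
   or -1 if fork/waitpid failed. */
int probe_inaccessible_page(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: map one page with no access rights, standing in for
           the region a stack probe must not be allowed to reach. */
        long page = sysconf(_SC_PAGESIZE);
        void *p = mmap(NULL, (size_t)page, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            _exit(2);

        /* Read-modify-write that leaves the value unchanged, in the
           spirit of orq $0x0,(%rsp). The access itself faults. */
        volatile unsigned long *probe = p;
        *probe |= 0;

        _exit(0); /* only reached if the probe did not fault */
    }

    /* Parent: report how the child ended. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the expectation is that probe_inaccessible_page() returns SIGSEGV: nothing inspects the result of the OR, the kernel simply kills the process when the probed address is not accessible.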
* Re: use of %fs segment register in x86_64 with -fstack-check
From: Florian Weimer @ 2020-03-04 9:36 UTC
To: Maxim Blinov; +Cc: gdb

* Maxim Blinov:

> I'm looking at some -fstack-check'ed code, and would appreciate it if
> some gdb x86_64 gurus could double check my understanding of a trivial
> example

What's your motivation for this? -fstack-check is mostly there to
support certain Ada uses, yet you post a C snippet. The more generally
useful stack overflow detection switch is called
-fstack-clash-protection.

Thanks,
Florian