From: Carlo Wood <carlo@alinoe.com>
To: nobody@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org
Subject: c++/3211: Incorrect, substitution related mangling.
Date: Sun, 17 Jun 2001 10:06:00 -0000	[thread overview]
Message-ID: <20010617170600.7560.qmail@sourceware.cygnus.com> (raw)

The following reply was made to PR c++/3211; it has been noted by GNATS.

From: Carlo Wood <carlo@alinoe.com>
To: gcc-bugs@gcc.gnu.org
Cc: gcc-gnats@gcc.gnu.org
Subject: c++/3211: Incorrect, substitution related mangling.
Date: Sun, 17 Jun 2001 18:59:39 +0200

 I am afraid this is going to be a long one.  The background is that
 I've been working on a demangler for several weeks now, using g++-3.0 (pre)
 and the demangler in libiberty for comparison (without looking at the source).
 
 Now I've come to a point where I start to find bugs in g++ instead of in
 my own demangler ;).
 
 PS If you don't have time to read it all, there is a CONCLUSION halfway through.
 
 I've used http://reality.sgi.com/dehnert_engr/cxx/abi.html#mangling-type
 as reference for writing my demangler.
 
 ---
 
 This reference defines
 
 <pointer-to-member-type> ::= M <class type> <member type>
 
 Being a <type> itself, it is subject to substitution.
 
 Moreover, "<CV-qualifiers> <some type>" being a <type>, both
 "<CV-qualifiers> <some type>" and "<some type>" are subject to
 substitution.
 
 For clarity, let <type> be an unqualified type below.  So we
 have to write:
 
 <pointer-to-member-type> ::= [<CV-qualifiers>] M [<CV-qualifiers>] <type1> [<CV-qualifiers>] <type2>
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   pointer-to-member-type ____/                   ^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^
                                   class type ____/         member type ____/
 
 and the substitution candidates are:
 
 <type1>
 <CV-qualifiers> <type1>
 <type2>
 <CV-qualifiers> <type2>
 M <CV-qualifiers> <type1> <CV-qualifiers> <type2>
 <CV-qualifiers> M <CV-qualifiers> <type1> <CV-qualifiers> <type2>
 
 As an example, <type2> has qualifiers in this case:
 
 --------------------
 struct A {
   int volatile member;
 };
    
 int volatile (A::*member_pointer) = &A::member;
 --------------------
 
 We can use the following test program to find what g++-3.0 makes of it:
 
 --------------------
 #include <iostream>
 #include <typeinfo>
  
 struct A {
   int volatile member;
 };
  
 int volatile (A::*member_pointer) = &A::member;
  
 int main(void)
 {
   std::cout << typeid(member_pointer).name() << '\n';
   return 0;
 }
 --------------------
 
 which gives:
 
 M1AVi
 
 
 Now note that "A const::*" is incorrect C++.
 
 The following test case:
 
 --------------------
 #include <iostream>
 #include <typeinfo>
  
 struct A {
   int volatile member;
 };
  
 template<typename T>
   struct const_test {
     static int volatile (T::*member_pointer);
   };
  
 template<typename T>
   int volatile (T::*const_test<T>::member_pointer) = &T::member;
  
 int main(void)
 {
   std::cout << typeid(const_test<A const>::member_pointer).name() << '\n';
   return 0;
 }
 --------------------
 
 Still returns "M1AVi", because the 'const' is simply dropped.
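 
 The same observation can be checked at compile time instead of by
 reading typeid output.  This is only a sketch, assuming a compiler
 with C++11 <type_traits> (so not g++-3.0 itself); the helper name
 const_is_dropped is hypothetical and not part of the original test case.

```cpp
#include <type_traits>

struct A {
  int volatile member;
};

// Same shape as the const_test template in the test case above,
// but exposing the pointer-to-member type as a typedef.
template<typename T>
struct const_test {
  typedef int volatile (T::*member_pointer_type);
};

// Hypothetical helper: true when the pointer-to-member type built from
// "A const" is the very same type as the one built from plain "A".
constexpr bool const_is_dropped()
{
  return std::is_same<const_test<A const>::member_pointer_type,
                      int volatile A::*>::value;
}

static_assert(const_is_dropped(),
              "cv-qualifier on the class type should be dropped");
```

 If the static_assert holds, both spellings name one type, so there is
 nothing for the mangler to encode in front of <type1>.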
 
 The only way to get a qualifier before <type1> is when <type2> is a member *function*
 that is `const'.  Please note that even:
 
 --------------------
 #include <iostream>
 #include <typeinfo>
  
 struct A {
   int member(void) { }
 };
  
 template<typename T>
   struct const_test {
     typedef int (member_type)(void);
     member_type (T::*member_pointer);
   };
  
 int main(void)
 {
   std::cout << typeid(const_test<A const>::member_pointer).name() << '\n';
   return 0;
 }
 --------------------
 
 returns "MK1AFivE", where "1A" is <type1> and "FivE" is <type2>.  There is no
 'const' qualifier for "1A".
 
 This seems to indicate that qualifiers directly after the "M" really
 belong to the member *function*.  For example:
 
 --------------------
 #include <iostream>
 #include <typeinfo>
  
 struct A {
   int member(void) const { }
 };
  
 int (A::*member_pointer)(void) const = &A::member;
  
 int main(void)
 {
   std::cout << typeid(member_pointer).name() << '\n';
   return 0;
 }
 --------------------
 
 returns "MK1AFivE", with a 'K' in front of the <type1> -- which as we saw before
 really can't BE there *unless* <type2> is a function.
 
 [ Note that this is closely analogous to how qualifiers work with nested functions,
   for example:
 
   --------------------
     struct A {
       int f1(float);
       int f2(float) const volatile;
     };
      
     int A::f1(float) { }
     int A::f2(float) const volatile { }
   --------------------
 
   results in the mangled names:
 
   00000000 T _ZN1A2f1Ef
   00000010 T _ZNVK1A2f2Ef
 
   very similar to how
 
   --------------------
   struct A;
   void f1p(int (A::*)(float)) { }
   void f2p(int (A::*)(float) const volatile) { }
   --------------------
 
   results in the mangled names
 
   00000000 T _Z3f1pM1AFifE
   00000020 T _Z3f2pMVK1AFifE
 ]
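 
 The four names in the note above can be fed back through a demangler
 to see where it places the qualifiers.  The sketch below assumes a
 GCC-compatible runtime that provides abi::__cxa_demangle in <cxxabi.h>;
 that is a present-day demangler, not the libiberty one from the 3.0
 branch, and the helper name demangle is hypothetical.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (assumes a GCC-compatible runtime)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: demangle one symbol, or return an empty string
// when the runtime demangler rejects the input.
std::string demangle(const char* mangled)
{
  int status = 0;
  char* result = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || result == nullptr)
    return std::string();
  std::string text(result);
  std::free(result);   // the buffer was allocated with malloc()
  return text;
}
```

 Calling demangle("_ZNVK1A2f2Ef") and demangle("_Z3f2pMVK1AFifE") and
 comparing the two results shows whether the "VK" is reported as a
 property of the function in both cases.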
 
 Imho, the draft is wrong and should have defined <pointer-to-member> like
 it defines <nested-name>:
 
 <pointer-to-member-type> ::= M [CV-qualifiers] <class type> <member type>
 
 stripping off qualifiers that otherwise *erroneously* seem to belong to the
 class type (which is not possible).
 
 Another argument for this is found by looking at what happens when
 <class type> is a <nested-name> itself.  For example:
 
 --------------------
 #include <iostream>
 #include <typeinfo>
  
 struct A {
   struct B {
     int member(void) const { }
   };
 };
  
 int (A::B::*member_pointer)(void) const = &A::B::member;
  
 int main(void)
 {
   std::cout << typeid(member_pointer).name() << '\n';
   return 0;
 }
 --------------------
 
 prints "MKN1A1BEFivE", as it should (imho) and NOT "MN1AK1BEFivE", or
 "MNK1A1BEFivE" for that matter.
 
 The demangler in libiberty is already *heavily* broken concerning qualifiers
 (as I reported before), but how does the above affect the mangling of g++?
 
 It does, because it affects which substitutions are used.  After all, when using
 
 <pointer-to-member-type> ::= M [CV-qualifiers] <class type> <member type>
 
 the resulting substitutions would be:
 
 <type1>			<-- unqualified, even when being a <nested-name>
 <type2>
 <CV-qualifiers> <type2>
 M <CV-qualifiers> <type1> <CV-qualifiers> <type2>
 <CV-qualifiers> M <CV-qualifiers> <type1> <CV-qualifiers> <type2>
 
 And "<CV-qualifiers> <type1>" is gone from the list.
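 
 To make the difference between the two lists concrete, here is a small
 sketch that enumerates the candidates for one pointer-to-member type
 under either reading.  It is an illustration only: the function name
 and parameters are hypothetical, it works on the mangled fragments as
 plain strings, and it ignores everything a real mangler must also do
 (de-duplication, skipping builtin types, numbering across a whole name).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: list the substitution candidates contributed by
// one pointer-to-member type, in the order used in the lists above.
//   class_type  : unqualified <type1>, e.g. "1A"
//   class_cv    : qualifiers written directly after the 'M', e.g. "K"
//   member_type : unqualified <type2>, e.g. "FivE"
//   member_cv   : qualifiers on <type2>, e.g. "V" (may be empty)
//   outer_cv    : qualifiers on the pointer itself (may be empty)
//   literal     : true  = literal reading of the draft,
//                 false = the reading proposed in this report
std::vector<std::string> candidates(const std::string& class_type,
                                    const std::string& class_cv,
                                    const std::string& member_type,
                                    const std::string& member_cv,
                                    const std::string& outer_cv,
                                    bool literal)
{
  std::vector<std::string> out;
  out.push_back(class_type);
  // Only the literal reading treats "<CV-qualifiers> <type1>" as a type.
  if (literal && !class_cv.empty())
    out.push_back(class_cv + class_type);
  out.push_back(member_type);
  if (!member_cv.empty())
    out.push_back(member_cv + member_type);
  const std::string ptr = "M" + class_cv + class_type + member_cv + member_type;
  out.push_back(ptr);
  if (!outer_cv.empty())
    out.push_back(outer_cv + ptr);
  return out;
}
```

 For "MK1AFivE", candidates("1A", "K", "FivE", "", "", true) contains
 "K1A", while the same call with false as the last argument does not.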
 
 Now before you say 'yes, but this is at most a bug in the reference and not in g++',
 let me point out an inconsistency in the reference related to this.  It states:
 "substitutable components are the represented symbolic constructs, not their
  associated mangling character strings".  This would mean that a substitution
 for "<CV-qualifiers> <type1>" would result in storing and re-using a string like
 "A::B const", but it is possible to get:
 
 MS3_FviE
 
 for example, where S3_ represents the mentioned qualified <type1>.  Following the
 reference literally we'd HAVE to demangle that as:
 
 "void (A::B const::*)(int)"
 
 where the string "A::B const" (S3_) appears literally.
 
 ==========
 CONCLUSION
 ==========
 
 This is why I am convinced that this substitution is wrong; it shouldn't exist.
 The correct way to mangle the above is as follows.
 
 With, for example:
 
 S2_ == "A::B"
 S3_ == "A::B const"
 
 "void (A::B::*)(int) const"
 
 should mangle as
 
 "MKS2_FviE" and not as "MS3_FviE".
 
 Here is what g++-3.0 makes of this:
 
 ----------------------
 struct A {
   struct B {
     int member(void) const { }
     static int (A::B::* const member_pointer)(void) const = &A::B::member;
     void foo(A::B const*, typeof(member_pointer));
   };
 };
  
 void A::B::foo(A::B const*, typeof(A::B::member_pointer)) { }
 ----------------------
 
 where 'A::B::foo' is mangled as "_ZN1A1B3fooEPKS0_MS1_FivE".
 
                  this should be "_ZN1A1B3fooEPKS0_MKS0_FivE" imho.
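 
 When comparing the two names it helps to remember how the substitution
 references are numbered: as I read the ABI document, "S_" is the first
 entry of the list, "S0_" the second, "S1_" the third, and the sequence
 id is written in base 36 using digits and upper-case letters.  A
 minimal sketch of that decoding (the function name is hypothetical and
 the letter arithmetic assumes an ASCII-like character set):

```cpp
#include <string>

// Hypothetical helper: map "S_", "S0_", "S1_", ... to a zero-based index
// into the substitution list; returns -1 if the text is not of that form.
int substitution_index(const std::string& s)
{
  if (s.size() < 2 || s[0] != 'S' || s[s.size() - 1] != '_')
    return -1;
  if (s.size() == 2)
    return 0;                        // "S_" is the first entry
  int value = 0;
  for (std::string::size_type i = 1; i + 1 < s.size(); ++i) {
    const char c = s[i];
    int digit;
    if (c >= '0' && c <= '9')
      digit = c - '0';
    else if (c >= 'A' && c <= 'Z')
      digit = c - 'A' + 10;          // assumes contiguous upper-case letters
    else
      return -1;
    value = value * 36 + digit;
  }
  return value + 1;                  // "S0_" is the second entry
}
```

 So in the names above "S0_" refers to the second entry ("A::B") and
 "S1_" to the third ("A::B const").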
 
 
 -- 
 Carlo Wood <carlo@alinoe.com>
 
 =====================================================================================
 PS Note that the demangler in libiberty chokes on this as well:
 
 /usr/src/gcc/gcc-cvs-3.0/libiberty>c++filt _ZN1A1B3fooEPKS0_MKS0_FivE
  -> mangled-name             at position   0
  -> encoding                 at position   2
  -> name                     at position   2
  -> nested-name              at position   2
  -> prefix                   at position   3
  -> unqualified-name         at position   3
  -> source-name              at position   3
  -> number                   at position   3
  -> number*                  at position   3
  -> identifier               at position   4
 SUBSTITUTIONS:
  S_   : A
  -> unqualified-name         at position   5
  -> source-name              at position   5
  -> number                   at position   5
  -> number*                  at position   5
  -> identifier               at position   6
 SUBSTITUTIONS:
  S_   : A
  S0_  : A::B
  -> unqualified-name         at position   7
  -> source-name              at position   7
  -> number                   at position   7
  -> number*                  at position   7
  -> identifier               at position   8
  -> bare-function-type       at position  12
  -> type                     at position  12
  -> type*                    at position  12
  -> type*                    at position  13
  -> type                     at position  13
  -> CV-qualifiers            at position  13
  -> type                     at position  14
  -> substitution             at position  14
  -> number                   at position  15
  -> number*                  at position  15
 SUBSTITUTIONS:
  S_   : A
  S0_  : A::B
  S1_  : A::B const
 SUBSTITUTIONS:
  S_   : A
  S0_  : A::B
  S1_  : A::B const
  S2_  : A::B const*
  -> type                     at position  17
  -> type*                    at position  17
  -> type                     at position  18
  -> CV-qualifiers            at position  18
  -> type                     at position  19
  -> substitution             at position  19
  -> number                   at position  20
  -> number*                  at position  20
 SUBSTITUTIONS:
  S_   : A
  S0_  : A::B
  S1_  : A::B const
  S2_  : A::B const*
  S3_  : A::B const
  -> type*                    at position  22
  -> function-type            at position  22
  -> bare-function-type       at position  23
  -> type                     at position  23
  -> builtin-type             at position  23
 SUBSTITUTIONS:
  S_   : A
  S0_  : A::B
  S1_  : A::B const
  S2_  : A::B const*
  S3_  : A::B const
  S4_  : int ()()
 SUBSTITUTIONS:
  S_   : A
  S0_  : A::B
  S1_  : A::B const
  S2_  : A::B const*
  S3_  : A::B const
  S4_  : int ()()
  S5_  : int (A::B const::*)()
 A::B::foo(A::B const*, int (A::B const::*)())
 
 
 Here "A::B const" appears TWICE in the substitution list, which is a definite bug.
 Moreover, I am not convinced that "int ()()" should be there (it cannot be
 used as a literal replacement either; same argument).  Finally, the "A::B const::*"
 is syntactic nonsense.
 

