public inbox for gcc-help@gcc.gnu.org
* quick question
@ 2006-02-16 14:57 Jim Stapleton
  2006-02-16 15:23 ` John Love-Jensen
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Jim Stapleton @ 2006-02-16 14:57 UTC (permalink / raw)
  To: gcc-help

I remember reading that there are systems where they don't like basic
variables to be put on offsets that are not an integer multiple of the
variable size, up to variables the size of a system word.

example, if this applied to the 32 bit x86 architecture where a word
is defined as 4 bytes (I'm talking about the actual arch here, and not
the bastardized usage from 16-bit ASM):
char [1 byte]: can be anywhere
short [2 bytes]: any 2N address, where N is an integer, and N > 0.
int/long [4 bytes]: any 4N address, where N is an integer, and N > 0
long long [8 bytes]: any 4N address, where N is an integer, and N > 0
(8 bytes > 1 word)


Now, this set of code works on the x86 platform, but I'm worried it
may not work on other platforms, am I correct in this worry?

=======================================================
#include <stdio.h>

int main()
{
  char test[] =  {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0a};
  char *tptr = test;
  int *ptr;

  ptr = (int*)tptr;
  printf("tptr[0]: %x\n", *ptr);

  tptr++;
  ptr = (int*)tptr;
  printf("tptr[1]: %x\n", *ptr);

  tptr++;
  ptr = (int*)tptr;
  printf("tptr[2]: %x\n", *ptr);

  tptr++;
  ptr = (int*)tptr;
  printf("tptr[3]: %x\n", *ptr);

  return 0;
}
=======================================================

output (note: all my machines are 32 bit x86, so this output is
correct for them; on reasonable (big-endian) machines, the bytes in
the output would be reversed):
tptr[0]: 3020100
tptr[1]: 4030201
tptr[2]: 5040302
tptr[3]: 6050403



Thanks
-Jim

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
  2006-02-16 14:57 quick question Jim Stapleton
@ 2006-02-16 15:23 ` John Love-Jensen
  2006-02-16 15:28 ` Perry Smith
  2006-02-16 15:35 ` Brian Dessent
  2 siblings, 0 replies; 13+ messages in thread
From: John Love-Jensen @ 2006-02-16 15:23 UTC (permalink / raw)
  To: Jim Stapleton, MSX to GCC

Hi Jim,

> Now, this set of code works on the x86 platform, but I'm worried it
> may not work on other platforms, am I correct in this worry?

You are correct: that code will not necessarily work on all platforms,
such as DEC Alpha, SPARC, 680[346]0 or PowerPC architectures (and
probably HPPA as well).  Depending on the OS, it may cause a SIGBUS
error.
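
A minimal sketch of a portable alternative, assuming the goal is simply
to read an int's worth of bytes starting at an arbitrary offset: copy
the bytes into a real int with memcpy instead of dereferencing a cast
pointer. The helper name load_int_unaligned is made up for illustration
and is not from the original post.

```c
#include <string.h>

/* Hypothetical helper (not from the original post): read sizeof(int)
 * bytes starting at p, which may be misaligned, into an aligned int. */
int load_int_unaligned(const void *p)
{
    int value;
    memcpy(&value, p, sizeof value);  /* byte-wise copy, no alignment needed */
    return value;
}
```

With the test array from the original post, load_int_unaligned(test + 1)
would stand in for *(int*)(test + 1). The value read still depends on
the machine's byte order.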

Some compilers, such as the DEC one for the DEC Alpha on Tru64, had a
facility to automagically catch SIGBUS errors, do-the-right-thing (albeit
horribly inefficiently), and continue running.

HTH,
--Eljay

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
  2006-02-16 14:57 quick question Jim Stapleton
  2006-02-16 15:23 ` John Love-Jensen
@ 2006-02-16 15:28 ` Perry Smith
       [not found]   ` <80f4f2b20602160737q5684d6c3l5df46c8a64fe47ad@mail.gmail.com>
  2006-02-16 15:35 ` Brian Dessent
  2 siblings, 1 reply; 13+ messages in thread
From: Perry Smith @ 2006-02-16 15:28 UTC (permalink / raw)
  To: Jim Stapleton; +Cc: gcc-help

I am 99% sure that if you did this test on an old 68000 type platform  
before 68020(?), it would not work.  Somewhere in the 68000  
genealogy, they added unaligned access.

I believe if you tried this on the old IBM RT it would not work.

I was recently consulted on a problem the client could not solve.  It  
turned out that it was a modern day controller chip (single chip  
computer type thing) that was still based on the 68010 engine.  The  
problem was the code was getting an "odd address exception" because  
it was accessing an unaligned word.

BUT... this is 2006... most machines today have the ability to access
unaligned data -- but it does cost you in performance.  Also, I know of
no platform that can do atomic operations (fetch_and_set,
fetch_and_add, etc.) on unaligned data.
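
Whether any of this matters depends on whether a given address really
is misaligned. A rough sketch of such a check, assuming a C11 compiler
(for _Alignof) and the common case where converting a pointer to
uintptr_t yields its numeric address; is_aligned_for_int is a
hypothetical name:

```c
#include <stdint.h>  /* uintptr_t */

/* Hypothetical helper: nonzero if p is a multiple of int's alignment.
 * The pointer-to-integer mapping is implementation-defined; this
 * assumes it is the plain address, as on mainstream platforms. */
int is_aligned_for_int(const void *p)
{
    return (uintptr_t)p % _Alignof(int) == 0;
}
```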

So, it just depends upon where this is going to be used.  It also  
depends upon why you are doing this in the first place.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
  2006-02-16 14:57 quick question Jim Stapleton
  2006-02-16 15:23 ` John Love-Jensen
  2006-02-16 15:28 ` Perry Smith
@ 2006-02-16 15:35 ` Brian Dessent
  2006-02-16 15:39   ` Jim Stapleton
  2 siblings, 1 reply; 13+ messages in thread
From: Brian Dessent @ 2006-02-16 15:35 UTC (permalink / raw)
  To: gcc-help

Jim Stapleton wrote:

>   char test[] =  {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
> 0x08, 0x09, 0x0a};
>   char *tptr = test;
>   int *ptr;
> 
>   ptr = (int*)tptr;

Doesn't this violate the C aliasing rules?  If so the whole thing is
undefined behavior.
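
For context, a rough sketch of the rule as I read the C standard: an
object may be accessed through an lvalue of its own (compatible) type
or of a character type. So inspecting an int through unsigned char* is
allowed, while reading a char array through int*, as in the quoted
code, is not covered. The function name below is made up for
illustration.

```c
#include <stddef.h>

/* Hypothetical example of the allowed direction: walk over the bytes
 * of an int through an unsigned char pointer and add them up. */
unsigned sum_bytes_of_int(const int *ip)
{
    const unsigned char *bytes = (const unsigned char *)ip;
    unsigned sum = 0;
    size_t i;

    for (i = 0; i < sizeof *ip; i++)
        sum += bytes[i];  /* character-type access to an int object */
    return sum;
}
```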

Brian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
  2006-02-16 15:35 ` Brian Dessent
@ 2006-02-16 15:39   ` Jim Stapleton
  2006-02-16 15:46     ` Brian Dessent
  2006-02-16 15:49     ` Jim Stapleton
  0 siblings, 2 replies; 13+ messages in thread
From: Jim Stapleton @ 2006-02-16 15:39 UTC (permalink / raw)
  To: gcc-help

which rules are you referring to? I've not seen that one.

On 2/16/06, Brian Dessent <brian@dessent.net> wrote:
> Jim Stapleton wrote:
>
> >   char test[] =  {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
> > 0x08, 0x09, 0x0a};
> >   char *tptr = test;
> >   int *ptr;
> >
> >   ptr = (int*)tptr;
>
> Doesn't this violate the C aliasing rules?  If so the whole thing is
> undefined behavior.
>
> Brian
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
       [not found]   ` <80f4f2b20602160737q5684d6c3l5df46c8a64fe47ad@mail.gmail.com>
@ 2006-02-16 15:39     ` Jim Stapleton
  2006-02-16 20:20       ` Ben Rudiak-Gould
       [not found]     ` <D94028E4-C732-4FB6-8C07-D38676404EEB@easesoftware.net>
  1 sibling, 1 reply; 13+ messages in thread
From: Jim Stapleton @ 2006-02-16 15:39 UTC (permalink / raw)
  To: gcc-help

I wanted to make an efficient "string class" in C that, except for
allocation and deallocation, is mostly indistinguishable from a normal
char* array. I am a fan of what I call Pseudo Object Oriented
programming: everything can be used through abstraction functions or
directly, without the overhead of doing it in full OO.

in this case:
char *superstring = (char*)malloc(sizeof(int) * 2 + sizeof(char) * string_size) + sizeof(int) * 2;

string length: *(int*)(superstring - sizeof(int))
string allocated: *(int*)(superstring - sizeof(int) * 2)

with the proper macros, this could actually be made more efficient
than the above looks, but the above is more readable.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
  2006-02-16 15:39   ` Jim Stapleton
@ 2006-02-16 15:46     ` Brian Dessent
  2006-02-16 15:56       ` Jim Stapleton
  2006-02-16 15:49     ` Jim Stapleton
  1 sibling, 1 reply; 13+ messages in thread
From: Brian Dessent @ 2006-02-16 15:46 UTC (permalink / raw)
  To: gcc-help

Jim Stapleton wrote:
> 
> which rules are you referring to? I've not seen that one.

http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
  2006-02-16 15:39   ` Jim Stapleton
  2006-02-16 15:46     ` Brian Dessent
@ 2006-02-16 15:49     ` Jim Stapleton
  1 sibling, 0 replies; 13+ messages in thread
From: Jim Stapleton @ 2006-02-16 15:49 UTC (permalink / raw)
  To: gcc-help

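hmm, I just thought, how about an int pointer instead

char *make_superstring(int size)
{
  /* room for two ints (allocated, length) followed by the characters */
  int *superstring = (int*)malloc(2 * sizeof(int) + size * sizeof(char));
  if (superstring == NULL)
    return NULL;
  /* int* arithmetic counts in ints, so + 2 skips the two-int header */
  return (char*)(superstring + 2);
}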

-Jim

On 2/16/06, Jim Stapleton <stapleton.41@gmail.com> wrote:
> which rules are you referring to? I've not seen that one.
>
> On 2/16/06, Brian Dessent <brian@dessent.net> wrote:
> > Jim Stapleton wrote:
> >
> > >   char test[] =  {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
> > > 0x08, 0x09, 0x0a};
> > >   char *tptr = test;
> > >   int *ptr;
> > >
> > >   ptr = (int*)tptr;
> >
> > Doesn't this violate the C aliasing rules?  If so the whole thing is
> > undefined behavior.
> >
> > Brian
> >
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
  2006-02-16 15:46     ` Brian Dessent
@ 2006-02-16 15:56       ` Jim Stapleton
  0 siblings, 0 replies; 13+ messages in thread
From: Jim Stapleton @ 2006-02-16 15:56 UTC (permalink / raw)
  To: gcc-help

hmm, looking at that makes me think further. It does say my updated
idea would work, but then I'm stuck on another thought:

really, the initial pointers are always (void*) as they come from
malloc. Will malloc always use the safest alignment for the memory it
creates, since it can be used for anything, or does it base the
alignment on the creation size?

And since the initial pointers are void, shouldn't the final product,
whatever type, be ok to point to them, as long as the final product
points to a properly aligned space?
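
For what it's worth, the C standard says a successful malloc returns
memory suitably aligned for any object type, and allocated memory has
no declared type of its own. Under that reading, a sketch of the
header-before-the-characters layout could look like the following. The
function names are hypothetical, and the layout (allocated, then
length, then the characters) is assumed from the earlier messages.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical: allocate [allocated][length][chars...] in one block
 * and return a pointer to the chars. */
char *superstring_new(const char *src)
{
    size_t len = strlen(src);
    int *hdr = malloc(2 * sizeof *hdr + len + 1);

    if (hdr == NULL)
        return NULL;
    hdr[0] = (int)(len + 1);        /* bytes allocated for the characters */
    hdr[1] = (int)len;              /* string length */
    memcpy(hdr + 2, src, len + 1);  /* characters plus terminating NUL */
    return (char *)(hdr + 2);
}

/* s lies exactly two ints past a malloc result, so stepping back one
 * int from it lands on the properly aligned length field. */
int superstring_length(const char *s)
{
    return ((const int *)(const void *)s)[-1];
}

/* Step back over the header to recover the pointer malloc returned. */
void superstring_free(char *s)
{
    if (s != NULL)
        free((int *)(void *)s - 2);
}
```

The header is written and read as int and the characters are only
touched as char, so each piece of the block is accessed through one
type only.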

Thanks,
-Jim


On 2/16/06, Brian Dessent <brian@dessent.net> wrote:
> Jim Stapleton wrote:
> >
> > which rules are you referring to? I've not seen that one.
>
> http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
       [not found]     ` <D94028E4-C732-4FB6-8C07-D38676404EEB@easesoftware.net>
@ 2006-02-16 16:17       ` Jim Stapleton
  2006-02-16 16:22         ` random
  0 siblings, 1 reply; 13+ messages in thread
From: Jim Stapleton @ 2006-02-16 16:17 UTC (permalink / raw)
  To: gcc-help

Yes it does, but it makes things easier for people who use C.

Also, I've seen a few tests where the same app compiled in both C &
C++ ran faster in C, even though the source was identical. While this
will lose some of that speed in that it has to calculate the offsets,
it will make programming easier for C users. Also, it can be
used to reduce the need for mallocs and strlen() calls (linear as
opposed to constant, isn't it?), which may even show a speed
improvement in many cases.

-Jim

On 2/16/06, Perry Smith <pedz@easesoftware.net> wrote:
> But... all of this flies in the face of C++.  You are doing the work
> that the C++ compiler would gladly do for you .

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
  2006-02-16 16:17       ` Jim Stapleton
@ 2006-02-16 16:22         ` random
  2006-02-16 16:37           ` Perry Smith
  0 siblings, 1 reply; 13+ messages in thread
From: random @ 2006-02-16 16:22 UTC (permalink / raw)
  To: Jim Stapleton; +Cc: gcc-help

Jim Stapleton wrote:
> Yes it does, but it makes things easier for people who use C.
>
> Also, I've seen a few tests where the same app compiled in both C &
> C++ ran faster in C, even though the source was identical.
Can you produce any? I've seen lots of people see these tests, and never
been able to find any of them myself.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
  2006-02-16 16:22         ` random
@ 2006-02-16 16:37           ` Perry Smith
  0 siblings, 0 replies; 13+ messages in thread
From: Perry Smith @ 2006-02-16 16:37 UTC (permalink / raw)
  To: random; +Cc: Jim Stapleton, gcc-help

I too find it hard to believe.  After all, gcc and g++ are  
essentially the same compiler.  I would find it odd that there are  
optimizations in gcc that are not in g++.

I've used C for 20 years.  I've used C++ for about 20 days.  The
template syntax in particular I used to find terribly confusing.  But
having immersed myself in it, I find that I am very quickly
beginning to read the template code as naturally as if it were C code.

On Feb 16, 2006, at 10:22 AM, random@bubblescope.net wrote:

> Jim Stapleton wrote:
>> Yes it does, but it makes things easier for people who use C.
>>
>> Also, I've seen a few tests where the same app compiled in both C &
>> C++ ran faster in C, even though the source was identical.
> Can you produce any? I've seen lots of people see these tests, and  
> never
> been able to find any of them myself.
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: quick question
  2006-02-16 15:39     ` Jim Stapleton
@ 2006-02-16 20:20       ` Ben Rudiak-Gould
  0 siblings, 0 replies; 13+ messages in thread
From: Ben Rudiak-Gould @ 2006-02-16 20:20 UTC (permalink / raw)
  To: gcc-help

Jim Stapleton wrote:
> char *superstring = (char*)malloc(sizeof(int) * 2 + sizeof(char) * string_size) + sizeof(int) * 2;
> 
> string length: *(int*)(superstring - sizeof(int))
> string allocated: *(int*)(superstring - sizeof(int) * 2)

Suggestion:

     typedef struct {
         size_t allocated, length;
         char contents[1];
     } SuperString;

     SuperString *superstring =
         malloc(offsetof(SuperString,contents) + string_size);

Incidentally, C and C++ both guarantee sizeof(char)==1, and in C it's 
considered good practice to omit explicit casts from (void*) in this 
situation, since the redundancy can be a source of error. (But in C++ you 
can't omit them.)
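
A possible fleshed-out version of the suggestion above, as a sketch
only. It swaps contents[1] for a C99 flexible array member (my
assumption, not part of the suggestion), and superstring_create is a
hypothetical name.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t allocated, length;
    char contents[];  /* C99 flexible array member in place of contents[1] */
} SuperString;

/* Hypothetical constructor: one malloc holds the header and the text. */
SuperString *superstring_create(const char *src)
{
    size_t len = strlen(src);
    SuperString *s = malloc(offsetof(SuperString, contents) + len + 1);

    if (s == NULL)
        return NULL;
    s->allocated = len + 1;              /* bytes reserved for contents */
    s->length = len;
    memcpy(s->contents, src, len + 1);   /* includes the terminating NUL */
    return s;
}
```

The compiler lays out the header and handles its alignment, so no
pointer arithmetic or casts are needed to reach length or allocated,
and s->contents can be passed wherever a char* is expected.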

-- Ben

^ permalink raw reply	[flat|nested] 13+ messages in thread
