public inbox for gcc-help@gcc.gnu.org
* odd behavior with Character Arrays
@ 2008-08-08  6:55 Rohit Arul Raj
  2008-08-08  7:07 ` Mateusz Loskot
  0 siblings, 1 reply; 12+ messages in thread
From: Rohit Arul Raj @ 2008-08-08  6:55 UTC (permalink / raw)
  To: gcc-help

Hi All,

Compiler Version: gcc 4.1.2 and gcc 3.4.6

Test Case:

#include <stdio.h>

unsigned int g = 0;
unsigned int slen(const char* c)
{
        int l = 0;
        while(*c != '\0') {
                ++l;
                ++c;
        }
        return l;
}

int main()
{
        unsigned int t;
        unsigned char n[] = {'a', 'b', 'c', 'd'};
        t = slen((const char*)n);
        g = slen((const char*)n);

        printf("\n t = %d, g = %d\n", t, g);
        return 0;
}

The test case above computes the string length of a character array.
Both 't' and 'g' are assigned the result of the same function called
on the same array, yet the two values differ and both are wrong.

With GCC 4.1.2,  t = 7, g = 5
With GCC 3.4.6,  t = 15, g = 5

This happens only when I don't provide the size of the array 'n'. If
the size is given, as in "unsigned char n[15] = {'a', 'b', 'c',
'd'};", then the values are correct:
t = 4 and g = 4.

1. Is this the expected behavior with GCC?
2. Can I get more details on why the compiler does not insert a string
terminator at the end of the array when the size of the array is not
provided?
    This happens with both character and integer arrays.

Regards,
Rohit


* Re: odd behavior with Character Arrays
  2008-08-08  6:55 odd behavior with Character Arrays Rohit Arul Raj
@ 2008-08-08  7:07 ` Mateusz Loskot
  2008-08-08  9:38   ` Rohit Arul Raj
  0 siblings, 1 reply; 12+ messages in thread
From: Mateusz Loskot @ 2008-08-08  7:07 UTC (permalink / raw)
  To: Rohit Arul Raj; +Cc: gcc-help

Rohit Arul Raj wrote:
> 2. Can i get more details as to why if the size of the array is not
> provided the compiler does not insert an string terminator at the end
> of the array.

How could that be?
The initializer is a list of array elements, not a string literal, so
the compiler does not append \0 or any other extra elements to it.

Best regards
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org


* Re: odd behavior with Character Arrays
  2008-08-08  7:07 ` Mateusz Loskot
@ 2008-08-08  9:38   ` Rohit Arul Raj
  2008-08-08 12:10     ` Jędrzej Dudkiewicz
                       ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Rohit Arul Raj @ 2008-08-08  9:38 UTC (permalink / raw)
  To: Mateusz Loskot; +Cc: gcc-help

On Fri, Aug 8, 2008 at 12:24 PM, Mateusz Loskot <mateusz@loskot.net> wrote:
> Rohit Arul Raj wrote:
>>
>> 2. Can i get more details as to why if the size of the array is not
>> provided the compiler does not insert an string terminator at the end
>> of the array.
>
> How could that be?
> It is an array but not a string literal, so compiler does not append \0 or
> any other extra elements to it.
>
> Best regards
> --
> Mateusz Loskot, http://mateusz.loskot.net
> Charter Member of OSGeo, http://osgeo.org
>


Hi,

If I give the size of the array as 15, like "unsigned char n[15] =
{'a', 'b', 'c','d'};", then it appends '\0'.
But if the size of the array is not given, "unsigned char n[]", then
it does not append '\0'.

Does that mean that if the size of the array is specified, it appends
'\0', and if it is not specified, it does not?
Can anyone clarify this point?

Regards,
Rohit


* Re: odd behavior with Character Arrays
  2008-08-08  9:38   ` Rohit Arul Raj
@ 2008-08-08 12:10     ` Jędrzej Dudkiewicz
  2008-08-08 13:00     ` Mateusz Loskot
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Jędrzej Dudkiewicz @ 2008-08-08 12:10 UTC (permalink / raw)
  To: gcc-help

On Fri, Aug 8, 2008 at 9:06 AM, Rohit Arul Raj <rohitarulraj@gmail.com> wrote:
> On Fri, Aug 8, 2008 at 12:24 PM, Mateusz Loskot <mateusz@loskot.net> wrote:
> Hi,
>
> If i give the size of the array as 15, like "unsigned char n[15] =
> {'a', 'b', 'c','d'};" , then it is appending '\0'.

Yes, because you provide values for only the first four chars; the
remaining eleven elements are zero-initialized.

> But if the size of the array is not given "unsigned char n[] ", then
> it is not appending '\0'.

Why should it? If you don't specify the size of an array, the compiler
calculates it from the provided initializer (I'm not sure if this is the
proper name?), in this case an explicit list of characters. Note that
this is NOT a C string, so it does not contain an implicit '\0'
character. Change { 'a',.... } to "abcd" and the '\0' will be there.

> Does that mean, that if the size of the array is specified, it appends
> '\0' and if it is not specified then it does not append '\0'?

Yes and no. Yes, because that is what happens with the values you
provided above. No, because if you specify the size as 4 and use the
same initializer, that is { 'a', 'b', 'c', 'd' }, all values are
provided and there is neither need nor room for default initialization.

-- 
Jędrzej Dudkiewicz

I really hate this damn machine, I wish that they would sell it.
It never does just what I want, but only what I tell it.


* Re: odd behavior with Character Arrays
  2008-08-08  9:38   ` Rohit Arul Raj
  2008-08-08 12:10     ` Jędrzej Dudkiewicz
@ 2008-08-08 13:00     ` Mateusz Loskot
  2008-08-08 13:40       ` Mateusz Loskot
  2008-08-08 15:01     ` John Fine
  2008-08-08 15:38     ` odd behavior with Character Arrays Bob Plantz
  3 siblings, 1 reply; 12+ messages in thread
From: Mateusz Loskot @ 2008-08-08 13:00 UTC (permalink / raw)
  To: Rohit Arul Raj; +Cc: gcc-help

Rohit Arul Raj wrote:
> On Fri, Aug 8, 2008 at 12:24 PM, Mateusz Loskot <mateusz@loskot.net> wrote:
>> Rohit Arul Raj wrote:
>>> 2. Can i get more details as to why if the size of the array is not
>>> provided the compiler does not insert an string terminator at the end
>>> of the array.
>> How could that be?
>> It is an array but not a string literal, so compiler does not append \0 or
>> any other extra elements to it.
>>
>> Best regards
>> --
>> Mateusz Loskot, http://mateusz.loskot.net
>> Charter Member of OSGeo, http://osgeo.org
>>
> 
> 
> Hi,
> 
> If i give the size of the array as 15, like "unsigned char n[15] =
> {'a', 'b', 'c','d'};" , then it is appending '\0'.

Rohit,

Yes, this is perfectly correct.

 > But if the size of the array is not given "unsigned char n[] ", then
 > it is not appending '\0'.

And this is correct behavior; I was referring to the n[] case only.
Sorry for the lack of precision.

> Does that mean, that if the size of the array is specified, it appends
> '\0' and if it is not specified then it does not append '\0'?
> Can you/anyone clarify this point?

I believe Jerzy explained it in detail.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org


* Re: odd behavior with Character Arrays
  2008-08-08 13:00     ` Mateusz Loskot
@ 2008-08-08 13:40       ` Mateusz Loskot
  0 siblings, 0 replies; 12+ messages in thread
From: Mateusz Loskot @ 2008-08-08 13:40 UTC (permalink / raw)
  To: Rohit Arul Raj; +Cc: gcc-help

Mateusz Loskot wrote:
>> Does that mean, that if the size of the array is specified, it appends
>> '\0' and if it is not specified then it does not append '\0'?
>> Can you/anyone clarify this point?
> 
> I believe Jerzy explained it in details.

I should have written Jędrzej.
Apologies for the mistake.

Ciao,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org


* Re: odd behavior with Character Arrays
  2008-08-08  9:38   ` Rohit Arul Raj
  2008-08-08 12:10     ` Jędrzej Dudkiewicz
  2008-08-08 13:00     ` Mateusz Loskot
@ 2008-08-08 15:01     ` John Fine
  2008-08-09  1:09       ` Is this code wrong? John Fine
  2008-08-08 15:38     ` odd behavior with Character Arrays Bob Plantz
  3 siblings, 1 reply; 12+ messages in thread
From: John Fine @ 2008-08-08 15:01 UTC (permalink / raw)
  To: Rohit Arul Raj; +Cc: Mateusz Loskot, gcc-help

Maybe this is already clear from the answer Jędrzej gave. But in case it 
isn't:

Rohit Arul Raj wrote:
> Does that mean, that if the size of the array is specified, it appends
> '\0' and if it is not specified then it does not append '\0'?
> Can you/anyone clarify this point?
>
>   
The '\0' bytes are appended only because the specified size is larger
than the number of initial values.

If you don't specify a size, then the size is exactly the number of
initial values, not some amount larger than that.


* Re: odd behavior with Character Arrays
  2008-08-08  9:38   ` Rohit Arul Raj
                       ` (2 preceding siblings ...)
  2008-08-08 15:01     ` John Fine
@ 2008-08-08 15:38     ` Bob Plantz
  3 siblings, 0 replies; 12+ messages in thread
From: Bob Plantz @ 2008-08-08 15:38 UTC (permalink / raw)
  To: Rohit Arul Raj; +Cc: gcc-help


> If i give the size of the array as 15, like "unsigned char n[15] =
> {'a', 'b', 'c','d'};" , then it is appending '\0'.
> But if the size of the array is not given "unsigned char n[] ", then
> it is not appending '\0'.
> 
> Does that mean, that if the size of the array is specified, it appends
> '\0' and if it is not specified then it does not append '\0'?
> Can you/anyone clarify this point?

It does more than just append a '\0'. It first zeroes the entire array,
then stores your characters there, one at a time.

Here is the (64-bit) assembly language for your main function with the
array declared as n[15]. I have annotated it to show what's happening
to the array.
main:
.LFB3:
	pushq	%rbp                     # save caller's base pointer
.LCFI2:
	movq	%rsp, %rbp         # establish our base pointer
.LCFI3:
	subq	$48, %rsp           # get memory for local variables
.LCFI4:
	movq	%fs:40, %rax      # these three instructions are used to
	movq	%rax, -8(%rbp) # check for stack boundary violation
	xorl	%eax, %eax
# The array is in the stack frame, starting -32 from the base pointer
	movq	$0, -32(%rbp)    # zero first 8 bytes of array
	movl	$0, -24(%rbp)    # zero next 4 bytes of array
	movw	$0, -20(%rbp)    # zero next 2 bytes of array
	movb	$0, -18(%rbp)    # zero next byte of array
# Now all 15 bytes of the array have been zeroed.
	movb	$97, -32(%rbp)   # n[0] = 'a';
	movb	$98, -31(%rbp)   # n[1] = 'b';
	movb	$99, -30(%rbp)   # n[2] = 'c';
	movb	$100, -29(%rbp) # n[3] = 'd';
	leaq	-32(%rbp), %rdi # load address of array
	call	slen
	movl	%eax, -36(%rbp) # t = slen(n);
	leaq	-32(%rbp), %rdi # load address of array
	call	slen
	movl	%eax, g(%rip)      # g = slen(n);
	movl	g(%rip), %edx       # load g
	movl	-36(%rbp), %esi   # load t
	movl	$.LC0, %edi           # address of "\n t = %d, g = %d\n"
	movl	$0, %eax                 # no SSE arguments
	call	printf
	movl	$0, %eax                  # return 0;
	movq	-8(%rbp), %rdx      # these three instructions
	xorq	%fs:40, %rdx          # check for stack boundary
	je	.L8
	call	__stack_chk_fail # violation
.L8:
	leave                                        # undo stack set up
	ret                                            # return to caller

Bob



* Is this code wrong?
  2008-08-08 15:01     ` John Fine
@ 2008-08-09  1:09       ` John Fine
  2008-08-09  2:44         ` Eljay Love-Jensen
  0 siblings, 1 reply; 12+ messages in thread
From: John Fine @ 2008-08-09  1:09 UTC (permalink / raw)
  Cc: gcc-help

While investigating an unrelated (really) problem, I came across a bunch 
of examples of a construct that I'm pretty sure shouldn't work, yet it 
seems to be working.  Am I misunderstanding this (is there some reason 
this code should work)?  This is all in code belonging to my employer, 
written by another employee, so I can't quote a large enough chunk for 
you to test. But I think the concepts are simple enough for someone to 
answer based on the info I can provide.

An inline function declares a std::vector, initializes it, then returns 
it.  The calling function assigns that return value to a const&.

My understanding is that the return of the local object makes a 
temporary copy of that object, which exists only during the calling 
statement.  So the reference is to that temporary object, which no 
longer exists.

inline std::vector<int> get_vector()
{
   std::vector<int> result;
   ... code to put contents into result ...
   return result;
}

in some other function

std::vector<int> const& local_vector = get_vector();
unrelated code
read the contents from local_vector

What is the scope of the temporary vector that catches the return value 
from the function?  I thought that scope should be just that statement.  
Could the scope be the rest of the {} containing the statement?

Is this code working just because of a lazy destructor (frees the 
memory, but leaves the pointer and contents intact and nothing happens 
to reallocate that memory soon enough to trash it)?  Or is the 
destructor really not called until later?


* Re: Is this code wrong?
  2008-08-09  1:09       ` Is this code wrong? John Fine
@ 2008-08-09  2:44         ` Eljay Love-Jensen
  2008-08-09 11:23           ` corey taylor
  0 siblings, 1 reply; 12+ messages in thread
From: Eljay Love-Jensen @ 2008-08-09  2:44 UTC (permalink / raw)
  To: John Fine; +Cc: GCC-help

Hi John,

The style that you indicated...

Foo ReturnAFoo()
{
  Foo foo;
  return foo;
}

void DoSomething()
{
  Foo const& foo = ReturnAFoo();
  foo.AreYouStillHere(); // Yep, still here.
}

...is good.

The temporary returned by ReturnAFoo in DoSomething is bound to the
reference.

It won't be destructed until the reference goes out of scope.  Really.

I use that style often.

(Although I can't cite chapter+verse of ISO 14882.)

HTH,
--Eljay


* Re: Is this code wrong?
  2008-08-09  2:44         ` Eljay Love-Jensen
@ 2008-08-09 11:23           ` corey taylor
  2008-08-09 14:57             ` John Fine
  0 siblings, 1 reply; 12+ messages in thread
From: corey taylor @ 2008-08-09 11:23 UTC (permalink / raw)
  To: Eljay Love-Jensen; +Cc: John Fine, GCC-help

On Fri, Aug 8, 2008 at 10:24 PM, Eljay Love-Jensen <eljay@adobe.com> wrote:
> The temporary returned by ReturnAFoo in DoSomething is bound to the
> reference.
>
> It won't be destructed until the reference goes out of scope.  Really.
>
> I use that style often.
>
> (Although I can't cite chapter+verse of ISO 14882.)

12.2, paragraph 5 ([class.temporary]) - it is a long paragraph
explaining the lifetime of the temporary bound to a reference.

corey


* Re: Is this code wrong?
  2008-08-09 11:23           ` corey taylor
@ 2008-08-09 14:57             ` John Fine
  0 siblings, 0 replies; 12+ messages in thread
From: John Fine @ 2008-08-09 14:57 UTC (permalink / raw)
  To: corey taylor; +Cc: Eljay Love-Jensen, GCC-help

Thank you.  Now I understand it.

corey taylor wrote:
> 12.2.5 - it is a long paragraph explaining the lifetime of the
> temporary bound to a reference.