From mboxrd@z Thu Jan 1 00:00:00 1970
From: "gemenge@hotmail.com"
To: glibc-bugs@sources.redhat.com
Subject: [Bug libc/567] New: resolv/base64.c does not correctly decode, does not check parameters & overwrites bytes after target
Date: Mon, 22 Nov 2004 14:51:00 -0000
Message-id: <20041122145010.567.gemenge@hotmail.com>
X-SW-Source: 2004-11/msg00184.html
List-Id:

Hello,

I was looking for a decent base64 implementation and turned to glibc.
However, I found that it does not decode correctly. It was also hard for
me to comprehend, so I rewrote it and give you the new implementation here.
You may use it under the GPL if you want to.

----base64.h begin----
/*
 * Header file for base64
 */

#ifndef _BASE64_

/*
 * Useful data types
 */
#ifndef u_char
#define u_char unsigned char
#endif

#ifndef u_int
#define u_int unsigned int
#endif

#include <stddef.h>   /* size_t */

int nBase64Encode(u_char const *pszSource,
                  size_t nSource,
                  char *pszTarget,
                  size_t nMaxTarget);

int nBase64Decode(char const *pszSource,
                  u_char *pszTarget,
                  size_t nMaxTarget);

#define _BASE64_
#endif
----base64.h end----

----base64.c begin----
/*
+--------------------------------------------------------------------------
| File                | base64.c
|---------------------+----------------------------------------------------
| Description         | Encode and decode data in base64 format.
|---------------------+----------------------------------------------------
| Author              | Frank Schwab
|---------------------+----------------------------------------------------
| Version             | 1.0.0
|---------------------+----------------------------------------------------
| Changes             | 2004-11-22 V1.0.0: Created. fhs
|---------------------+----------------------------------------------------
| Note                | This base64 encoding/decoding module is based on
|                     | the glibc V2.3.3 resolv/base64.c module.
|                     | However the glibc module does not decode correctly
|                     | and is hard to comprehend. It also overwrites the
|                     | byte following the last byte of target if the input
|                     | ends in "=" or "==".
|                     | This module can be used under
|                     | the GPL. There is no warranty of any kind.
|-------------------------------------------------------------------------
*/

/*                   */
/*  I N C L U D E S  */
/*                   */
#include <ctype.h>    /* isspace */
#include <stddef.h>   /* size_t, NULL */

#include "base64.h"

/*                 */
/*  D E F I N E S  */
/*                 */

/*
 * These are the values for the boolean data type
 */
#define F_TRUE  (-1)
#define F_FALSE (0)

/*
 * Some constants
 */
#define I_MAX_DECODE (255)   /* The max. index of the decode helper table */
#define CH_PAD64     ('=')   /* The base64 padding character */
#define N_INVALID_BASE64_CHAR -1   /* Value for invalid characters in the decode helper table */

/*                         */
/*  G L O B A L   D A T A  */
/*                         */
static const char gachBase64Code[] =
   "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
static int ganBase64Decode[I_MAX_DECODE + 1];
static int gfDecodeInit = 0;

/* (From RFC1521 and draft-ietf-dnssec-secext-03.txt)
   The following encoding technique is taken from RFC 1521 by Borenstein
   and Freed. It is reproduced here in a slightly edited form for
   convenience.

   A 65-character subset of US-ASCII is used, enabling 6 bits to be
   represented per printable character. (The extra 65th character, "=",
   is used to signify a special processing function.)

   The encoding process represents 24-bit groups of input bits as output
   strings of 4 encoded characters. Proceeding from left to right, a
   24-bit input group is formed by concatenating 3 8-bit input groups.
   These 24 bits are then treated as 4 concatenated 6-bit groups, each
   of which is translated into a single digit in the base64 alphabet.

   Each 6-bit group is used as an index into an array of 64 printable
   characters. The character referenced by the index is placed in the
   output string.
                         Table 1: The Base64 Alphabet

      Value Encoding  Value Encoding  Value Encoding  Value Encoding
          0 A            17 R            34 i            51 z
          1 B            18 S            35 j            52 0
          2 C            19 T            36 k            53 1
          3 D            20 U            37 l            54 2
          4 E            21 V            38 m            55 3
          5 F            22 W            39 n            56 4
          6 G            23 X            40 o            57 5
          7 H            24 Y            41 p            58 6
          8 I            25 Z            42 q            59 7
          9 J            26 a            43 r            60 8
         10 K            27 b            44 s            61 9
         11 L            28 c            45 t            62 +
         12 M            29 d            46 u            63 /
         13 N            30 e            47 v
         14 O            31 f            48 w         (pad) =
         15 P            32 g            49 x
         16 Q            33 h            50 y

   Special processing is performed if fewer than 24 bits are available
   at the end of the data being encoded. A full encoding quantum is
   always completed at the end of a quantity. When fewer than 24 input
   bits are available in an input group, zero bits are added (on the
   right) to form an integral number of 6-bit groups. Padding at the
   end of the data is performed using the '=' character.

   Since all base64 input is an integral number of octets, only the
         -------------------------------------------------
   following cases can arise:

       (1) the final quantum of encoding input is an integral
           multiple of 24 bits; here, the final unit of encoded
           output will be an integral multiple of 4 characters
           with no "=" padding,
       (2) the final quantum of encoding input is exactly 8 bits;
           here, the final unit of encoded output will be two
           characters followed by two "=" padding characters, or
       (3) the final quantum of encoding input is exactly 16 bits;
           here, the final unit of encoded output will be three
           characters followed by one "=" padding character.
*/

/*
[FUNCTIONHEADER ON]
+--------------------------------------------------------------------------
| Function         | nBase64Encode
|------------------+-------------------------------------------------------
| Description      | Encode a data stream in base64 format.
|------------------+-------------------------------------------------------
| Parameter        | pszSource  : Pointer to source data stream
|                  | nSource    : Number of bytes in source data stream
|                  | pszTarget  : Pointer to target area
|                  | nMaxTarget : Maximum number of characters that the
|                  |              target area can hold
|------------------+-------------------------------------------------------
| Return values    | >=0: Number of characters in the target area
|                  | -1 : An error occurred. (Target area too small,
|                  |      pointers were null)
|------------------+-------------------------------------------------------
| Author           | Frank Schwab
|------------------+-------------------------------------------------------
| Created          | 2004-11-22
|------------------+-------------------------------------------------------
| Changes          | ./.
|------------------+-------------------------------------------------------
| Note             | ./.
+--------------------------------------------------------------------------
[FUNCTIONHEADER OFF]
*/
int nBase64Encode(u_char const *pszSource,
                  size_t nSource,
                  char *pszTarget,
                  size_t nMaxTarget)
{
   size_t  nData = 0;
   u_char  chInput[3];
   u_char  chOutput[4];
   size_t  i;
   size_t  nActSource = nSource;
   char   *pchActTarget = pszTarget;
   int     nReturn = -1;

   /*
    * Check parameters first
    */
   if ( (pszSource != (u_char *)NULL) &&
        (pszTarget != (char *)NULL) &&   /* pszTarget is a char pointer */
        (nSource > 0) &&
        (nMaxTarget > 0) )
   {
      /*
       * Now loop through the source 3 bytes at a time and convert it to
       * 4 base64 output characters.
       */
      while ( 2 < nActSource )
      {
         chInput[0] = *pszSource++;
         chInput[1] = *pszSource++;
         chInput[2] = *pszSource++;
         nActSource -= 3;

         chOutput[0] = chInput[0] >> 2;
         chOutput[1] = ((chInput[0] & 0x03) << 4) + (chInput[1] >> 4);
         chOutput[2] = ((chInput[1] & 0x0f) << 2) + (chInput[2] >> 6);
         chOutput[3] = chInput[2] & 0x3f;

         if ( nData + 4 > nMaxTarget )   /* There is not enough room in Target */
         {
            nActSource = 0;              /* Set variables so we just fall through */
            nData = nMaxTarget;          /* to the end of the routine with rc=-1  */
         } /* end if ( nData + 4 > nMaxTarget ) */
         else
         {
            *pchActTarget++ = gachBase64Code[chOutput[0]];
            *pchActTarget++ = gachBase64Code[chOutput[1]];
            *pchActTarget++ = gachBase64Code[chOutput[2]];
            *pchActTarget++ = gachBase64Code[chOutput[3]];
            nData += 4;
         } /* end else ( nData + 4 > nMaxTarget ) */
      } /* end while ( 2 < nSource ) */

      /*
       * Now there are less than 3 bytes left. We now care about the padding.
       */
      if ( 0 != nActSource )
      {
         /* Get what's left. */
         chInput[0] = chInput[1] = chInput[2] = '\0';
         for ( i = 0; i < nActSource; i++ )
         {
            chInput[i] = *pszSource++;
         } /* end for ( i = 0; i < nActSource; i++ ) */

         chOutput[0] = chInput[0] >> 2;
         chOutput[1] = ((chInput[0] & 0x03) << 4) + (chInput[1] >> 4);
         chOutput[2] = ((chInput[1] & 0x0f) << 2) + (chInput[2] >> 6);

         if ( nData + 4 > nMaxTarget )   /* There is not enough room in Target */
         {                               /* Set variable so we just fall through */
            nData = nMaxTarget;          /* to the end of the routine with rc=-1 */
         } /* end if (nData + 4 > nMaxTarget) */
         else
         {
            *pchActTarget++ = gachBase64Code[chOutput[0]];
            *pchActTarget++ = gachBase64Code[chOutput[1]];
            nData += 2;

            if ( 1 == nActSource )
            {
               *pchActTarget++ = CH_PAD64;
               nData++;
            } /* end if ( 1 == nActSource ) */
            else
            {
               *pchActTarget++ = gachBase64Code[chOutput[2]];
               nData++;
            } /* end else ( 1 == nActSource ) */

            *pchActTarget++ = CH_PAD64;
            nData++;
         } /* end else (nData + 4 > nMaxTarget) */
      } /* end if (0 != nActSource) */

      if ( nData < nMaxTarget )
      {
         *pchActTarget = '\0';   /* Returned value doesn't count
                                    \0. */
         nReturn = nData;
      } /* end if ( nData < nMaxTarget ) */
   } /* end if ( Parameter ok ) */

   return (nReturn);
}

/*
[FUNCTIONHEADER ON]
+--------------------------------------------------------------------------
| Function         | vInitDecodeTable
|------------------+-------------------------------------------------------
| Description      | Initialize base64 decode table
|------------------+-------------------------------------------------------
| Parameter        | ./.
|------------------+-------------------------------------------------------
| Return values    | ./.
|                  | gfDecodeInit is set to F_TRUE
|------------------+-------------------------------------------------------
| Author           | Frank Schwab
|------------------+-------------------------------------------------------
| Created          | 2004-11-22
|------------------+-------------------------------------------------------
| Changes          | ./.
|------------------+-------------------------------------------------------
| Note             | This is an internal helper routine and not exposed
|                  | through the header file.
|                  | It is not efficient to have this table built for just
|                  | one call, but it is very efficient if you call the
|                  | base64 functions more than once.
+--------------------------------------------------------------------------
[FUNCTIONHEADER OFF]
*/
static void vInitDecodeTable(void)   /* static: internal helper, not in base64.h */
{
   u_int i;

   for ( i=0; i<=I_MAX_DECODE; i++ )
   {
      ganBase64Decode[i] = N_INVALID_BASE64_CHAR;
   }

   for ( i=(u_int)'A'; i<=(u_int)'Z'; i++ )
   {
      ganBase64Decode[i] = i - (u_int)'A';
   }

   for ( i=(u_int)'a'; i<=(u_int)'z'; i++ )
   {
      ganBase64Decode[i] = i - (u_int)'a' + 26;
   }

   for ( i=(u_int)'0'; i<=(u_int)'9'; i++ )
   {
      ganBase64Decode[i] = i - (u_int)'0' + 52;
   }

   ganBase64Decode[(u_int)'+'] = 62;
   ganBase64Decode[(u_int)'/'] = 63;

   gfDecodeInit = F_TRUE;

   return;
}

/*
[FUNCTIONHEADER ON]
+--------------------------------------------------------------------------
| Function         | nBase64Decode
|------------------+-------------------------------------------------------
| Description      | Decode a base64 string into a byte data stream
|------------------+-------------------------------------------------------
| Parameter        | pszSource  : Pointer to base64 string
|                  | pszTarget  : Pointer to target area
|                  | nMaxTarget : Maximum number of bytes that the
|                  |              target area can hold
|------------------+-------------------------------------------------------
| Return values    | >=0: Number of characters in the target area
|                  | -1 : An error occurred. (Target area too small,
|                  |      pointers were null, invalid base64 data stream)
|------------------+-------------------------------------------------------
| Author           | Frank Schwab
|------------------+-------------------------------------------------------
| Created          | 2004-11-22
|------------------+-------------------------------------------------------
| Changes          | ./.
|------------------+-------------------------------------------------------
| Note             | ./.
+--------------------------------------------------------------------------
[FUNCTIONHEADER OFF]
*/
int nBase64Decode(char const *pszSource,
                  u_char *pszTarget,
                  size_t nMaxTarget)
{
   size_t      nTarget;
   int         nState;
   u_int       nActChar;
   int         nReturn = -1;
   u_int       nSourceValue;
   u_int       nTargetValue;
   int         fOK = F_TRUE;
   char const *pchActSource = pszSource;
   u_char     *pchActTarget = pszTarget;

   if ( (pszSource != (char const *)NULL) &&
        (pszTarget != (u_char *)NULL) &&
        (nMaxTarget > 0) )
   {
      nState = 0;
      nTarget = 1;
      nTargetValue = 0;

      if ( F_FALSE == gfDecodeInit )
      {
         vInitDecodeTable();
      } /* end if ( 0 == gfDecodeInit ) */

      /*
       * Read every character as u_char so that bytes >= 0x80 stay inside
       * the decode table even where plain char is signed.
       */
      while ( '\0' != (nActChar = (u_char)*pchActSource++) )
      {
         if ( isspace(nActChar) )   /* Skip whitespace anywhere. */
            continue;

         if ( CH_PAD64 == nActChar )
            break;

         nSourceValue = ganBase64Decode[nActChar];
         if ( N_INVALID_BASE64_CHAR != nSourceValue )
         {
            switch ( nState )
            {
               case 0:
                  if ( nTarget <= nMaxTarget )
                  {
                     nTargetValue = nSourceValue << 2;
                     nState++;
                  } /* end if ( nTarget <= nMaxTarget ) */
                  else
                  {
                     fOK = F_FALSE;
                  } /* end else ( nTarget <= nMaxTarget ) */
                  break;

               case 1:
                  if ( nTarget <= nMaxTarget )
                  {
                     *pchActTarget = (nSourceValue >> 4) | nTargetValue;
                     nTargetValue = (nSourceValue & 0x0f) << 4;
                     pchActTarget++;
                     nTarget++;
                     nState++;
                  } /* end if ( nTarget <= nMaxTarget ) */
                  else
                  {
                     fOK = F_FALSE;
                  } /* end else ( nTarget <= nMaxTarget ) */
                  break;

               case 2:
                  if ( nTarget <= nMaxTarget )
                  {
                     *pchActTarget = (nSourceValue >> 2) | nTargetValue;
                     nTargetValue = (nSourceValue & 0x03) << 6;
                     pchActTarget++;
                     nTarget++;
                     nState++;
                  } /* end if ( nTarget <= nMaxTarget ) */
                  else
                  {
                     fOK = F_FALSE;
                  } /* end else ( nTarget <= nMaxTarget ) */
                  break;

               case 3:
                  if ( nTarget <= nMaxTarget )
                  {
                     *pchActTarget = nSourceValue | nTargetValue;
                     pchActTarget++;
                     nTarget++;
                     nState = 0;
                  } /* end if ( nTarget <= nMaxTarget ) */
                  else
                  {
                     fOK = F_FALSE;
                  } /* end else ( nTarget <= nMaxTarget ) */
                  break;
            } /* end switch ( nState ) */
         } /* end if ( N_INVALID_BASE64_CHAR != nSourceValue ) */

         if ( F_FALSE == fOK )
         {
            break;
         } /* end if ( F_FALSE ==
            fOK ) */
      } /* end while ((nActChar = *pchActSource++) != '\0') */

      /*
       * We are done decoding Base-64 chars. Let's see if we ended
       * on a byte boundary, and/or with erroneous trailing characters.
       */
      if ( F_FALSE != fOK )
      {
         if ( CH_PAD64 == nActChar )   /* We got a pad char. */
         {
            switch (nState)
            {
               case 0:   /* '=' is invalid in the first position */
               case 1:   /* '=' is invalid in the second position */
                  fOK = F_FALSE;
                  break;

               case 2:   /* Valid, means one byte of info */
                  /* Make sure there is another trailing = sign. */
                  /* Characters are read as u_char so isspace() gets a valid value. */
                  for ( nActChar = (u_char)*pchActSource++;
                        nActChar != '\0';
                        nActChar = (u_char)*pchActSource++ )
                  {
                     if ( !isspace(nActChar) )
                     {
                        break;
                     } /* end if ( !isspace(nActChar) ) */
                  } /* end for ( nActChar = *pchActSource++; nActChar != '\0'; nActChar = *pchActSource++ ) */

                  if ( CH_PAD64 != nActChar )
                  {
                     fOK = F_FALSE;
                     break;
                  } /* end if ( CH_PAD64 != nActChar ) */
                  /* Fall through to "single trailing =" case. */
                  /* FALLTHROUGH */

               case 3:   /* Valid, means two bytes of info */
                  /*
                   * We know this char is an =. Is there anything but
                   * whitespace after it?
                   */
                  for ( nActChar = (u_char)*pchActSource++;
                        nActChar != '\0';
                        nActChar = (u_char)*pchActSource++ )
                  {
                     if ( !isspace(nActChar) )
                     {
                        fOK = F_FALSE;
                        break;
                     } /* end if ( !isspace(nActChar) ) */
                  } /* end for ( nActChar = *pchActSource++; nActChar != '\0'; nActChar = *pchActSource++ ) */

                  /*
                   * Now make sure for cases 2 and 3 that the "extra"
                   * bits that slopped past the last full byte were
                   * zeros. If we don't check them, they become a
                   * subliminal channel.
                   */
                  if ( F_FALSE != fOK )
                  {
                     if ( 0 != nTargetValue )
                     {
                        fOK = F_FALSE;
                     } /* end if ( 0 != nTargetValue ) */
                  } /* end if ( F_FALSE != fOK ) */
            } /* end switch ( nState ) */
         } /* end if ( CH_PAD64 == nActChar ) */
         else
         {
            /*
             * We ended by seeing the end of the string. Make sure we
             * have no partial bytes lying around.
             */
            if ( 0 != nState )
            {
               fOK = F_FALSE;
            } /* end if ( 0 != nState ) */
         } /* end else ( CH_PAD64 == nActChar ) */

         if ( F_FALSE != fOK )
         {
            nReturn = nTarget - 1;
         } /* end if ( F_FALSE != fOK ) */
      } /* end if ( F_FALSE != fOK ) */
   } /* end if ( parameters ok ) */

   return (nReturn);
}
----base64.c end----

Here is a test drive:

----b64test.c begin----
#include <stdio.h>    /* printf, puts */
#include <string.h>   /* strlen */

#include "base64.h"

int main(void)
{
   unsigned char pszInput[256];
   char          pszOutput[1024];   /* char, to match the nBase64Encode/nBase64Decode prototypes */
   unsigned char pszTest[256];
   int           i;
   int           j;

   for (i=0; i<256; i++)
   {
      pszInput[i] = i;
   }

   for ( j=256; j>0; j-- )
   {
      printf ("--------\nmaxlen=%d\n", j);
      i = nBase64Encode(pszInput, j, pszOutput, 1024);
      printf ("Encode-rc=%d, len(Encode)=%d\n", i, (int)strlen(pszOutput));
      printf ("Encoded = %s\n", pszOutput);

      if (i >= 0)
      {
         i = nBase64Decode(pszOutput, pszTest, j);
         printf ("Decode-rc=%d\n", i);
         if ( i != j )
            puts("Wrong decode length");

         for (i=0; i