From mboxrd@z Thu Jan  1 00:00:00 1970
From: "gemenge@hotmail.com" <sourceware-bugzilla@sources.redhat.com>
To: glibc-bugs@sources.redhat.com
Subject: [Bug libc/567] New: resolv/base64.c does not correctly decode, does not check parameters & overwrites bytes after target
Date: Mon, 22 Nov 2004 14:51:00 -0000
Message-id: <20041122145010.567.gemenge@hotmail.com>
X-SW-Source: 2004-11/msg00184.html
List-Id: <glibc-bugs.sourceware.org>

Hello,

   I was looking for a decent base64 implementation and turned to glibc.
However, I found that resolv/base64.c does not decode correctly, does not
check its parameters, and overwrites the byte after the target when the input
ends in "=" or "==". It was also hard for me to comprehend. So I rewrote it
and give you the new implementation here. You may use it under the GPL, if
you want to.

----base64.h begin----
/*
 * Header file for base64
 */

#ifndef _BASE64_

/*
 * Useful data types
 */
#ifndef u_char
#define u_char unsigned char
#endif

#ifndef u_int
#define u_int unsigned int
#endif

#include <stddef.h>

int
nBase64Encode(u_char const *pszSource, size_t nSource, char *pszTarget,
              size_t nMaxTarget);

int
nBase64Decode(char const *pszSource, u_char *pszTarget, size_t nMaxTarget);

#define _BASE64_
#endif
----base64.h end----
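
Since nBase64Encode reports a too-small target only by returning -1, a caller
may want to size the buffer up front. The following is a minimal sketch, not
part of the posted header: nBase64EncodeBufferSize is a hypothetical helper
name. It assumes what the implementation below does, namely that every started
group of 3 source bytes yields 4 characters and that one more byte is needed
for the terminating '\0'.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the posted base64.h.
 * Returns the smallest nMaxTarget with which nBase64Encode() can
 * succeed for nSource >= 1 bytes: 4 characters per started 3-byte
 * group plus 1 byte for the terminating '\0'.
 * Returns 0 if the result would not fit into a size_t.
 */
static size_t
nBase64EncodeBufferSize(size_t nSource)
{
	/* Number of 3-byte groups, rounding a partial group up. */
	size_t nGroups = nSource / 3 + (nSource % 3 != 0);

	if ( nGroups > ((size_t)-1 - 1) / 4 )
	{
		return 0;	/* nGroups * 4 + 1 would overflow */
	}

	return nGroups * 4 + 1;
}
```

For the 256-byte input used in the test driver this gives 345, so the
1024-byte output buffer used there is more than large enough.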

----base64.c begin----
/*
<FILEHEADER ON>
+--------------------------------------------------------------------------
| File                | base64.c
|---------------------+----------------------------------------------------
| Description         | Encode and decode data in base64 format.
|---------------------+---------------------------------------------------
| Author              | Frank Schwab
|---------------------+---------------------------------------------------
| Version             | 1.0.0
|---------------------+---------------------------------------------------
| Changes             | 2004-11-22  V1.0.0: Created. fhs
|---------------------+---------------------------------------------------
| Note                | This base64 encoding/decoding module is based on
|                     | the glibc V2.3.3 resolv/base64.c module.
|                     | However the glibc module does not decode correctly
|                     | and is hard to comprehend. It also overwrites the
|                     | byte following the last byte of target if the input
|                     | ends in "=" or "==". This module can be used under
|                     | the GPL. There is no warranty of any kind.
|-------------------------------------------------------------------------
<FILEHEADER OFF>
*/


/*								 */
/* I N C L U D E S						 */
/*							         */

#include <stddef.h>
#include <ctype.h>
#include <string.h>
#include "base64.h"

/*								 */
/* D E F I N E S						 */
/*								 */

/*
 * These are the values for the boolean data type
 */
#define F_TRUE  (-1)
#define F_FALSE  (0)

/*
 * Some constants
 */

#define I_MAX_DECODE (255)	/* The max. index of the decode helper table */
#define CH_PAD64     ('=')	/* The base64 padding character              */

/* Value for invalid characters in the decode helper table */
#define N_INVALID_BASE64_CHAR (-1)

/*							 */
/*	 G L O B A L   D A T A				 */
/*							 */

static const char gachBase64Code[] =
	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

static int ganBase64Decode[I_MAX_DECODE + 1];
static int gfDecodeInit = 0;

/* (From RFC1521 and draft-ietf-dnssec-secext-03.txt)
   The following encoding technique is taken from RFC 1521 by Borenstein
   and Freed.  It is reproduced here in a slightly edited form for
   convenience.

   A 65-character subset of US-ASCII is used, enabling 6 bits to be
   represented per printable character. (The extra 65th character, "=",
   is used to signify a special processing function.)

   The encoding process represents 24-bit groups of chInput bits as chOutput
   strings of 4 encoded characters. Proceeding from left to right, a
   24-bit chInput group is formed by concatenating 3 8-bit chInput groups.
   These 24 bits are then treated as 4 concatenated 6-bit groups, each
   of which is translated into a single digit in the base64 alphabet.

   Each 6-bit group is used as an index into an array of 64 printable
   characters. The character referenced by the index is placed in the
   chOutput string.

                         Table 1: The Base64 Alphabet

      Value Encoding  Value Encoding  Value Encoding  Value Encoding
          0 A            17 R            34 i            51 z
          1 B            18 S            35 j            52 0
          2 C            19 T            36 k            53 1
          3 D            20 U            37 l            54 2
          4 E            21 V            38 m            55 3
          5 F            22 W            39 n            56 4
          6 G            23 X            40 o            57 5
          7 H            24 Y            41 p            58 6
          8 I            25 Z            42 q            59 7
          9 J            26 a            43 r            60 8
         10 K            27 b            44 s            61 9
         11 L            28 c            45 t            62 +
         12 M            29 d            46 u            63 /
         13 N            30 e            47 v
         14 O            31 f            48 w         (pad) =
         15 P            32 g            49 x
         16 Q            33 h            50 y

   Special processing is performed if fewer than 24 bits are available
   at the end of the data being encoded.  A full encoding quantum is
   always completed at the end of a quantity.  When fewer than 24 chInput
   bits are available in an chInput group, zero bits are added (on the
   right) to form an integral number of 6-bit groups.  Padding at the
   end of the data is performed using the '=' character.

   Since all base64 chInput is an integral number of octets, only the
         -------------------------------------------------                       
   following cases can arise:
   
       (1) the final quantum of encoding chInput is an integral
           multiple of 24 bits; here, the final unit of encoded
	   chOutput will be an integral multiple of 4 characters
	   with no "=" padding,
       (2) the final quantum of encoding chInput is exactly 8 bits;
           here, the final unit of encoded chOutput will be two
	   characters followed by two "=" padding characters, or
       (3) the final quantum of encoding chInput is exactly 16 bits;
           here, the final unit of encoded chOutput will be three
	   characters followed by one "=" padding character.
   */

/*
[FUNCTIONHEADER ON]
+--------------------------------------------------------------------------
| Function         | nBase64Encode
|------------------+-------------------------------------------------------
| Description      | Encode a data stream in base64 format.
|------------------+-------------------------------------------------------
| Parameter        | pszSource  : Pointer to source data stream
|                  | nSource    : Number of bytes in source data stream
|                  | pszTarget  : Pointer to target area
|                  | nMaxTarget : Maximum number of characters that the
|                  |              target area can hold
|------------------+-------------------------------------------------------
| Return values    | >=0: Number of characters in the target area
|                  | -1 : An error occurred. (Target area too small,
|                  |      pointers were null)
|------------------+-------------------------------------------------------
| Author           | Frank Schwab
|------------------+-------------------------------------------------------
| Created          | 2004-11-22
|------------------+-------------------------------------------------------
| Changes          | ./.
|------------------+-------------------------------------------------------
| Note             | ./.
+--------------------------------------------------------------------------
[FUNCTIONHEADER OFF]
*/
int
nBase64Encode(u_char const *pszSource, size_t nSource, char *pszTarget,
              size_t nMaxTarget)
{
	size_t  nData = 0;
	u_char  chInput[3];
	u_char  chOutput[4];
	size_t  i;
	size_t  nActSource   = nSource;
	char   *pchActTarget = pszTarget;
	int     nReturn = -1;

/*
 * Check parameters first
 */
	if ( (pszSource != NULL) &&
		  (pszTarget != NULL) &&
		  (nSource    > 0)              &&
		  (nMaxTarget > 0) )
	{

/*
 * Now loop through the source 3 bytes at a time and convert it to
 * 4 base64 output characters.
 */
		while ( 2 < nActSource )
		{
			chInput[0] = *pszSource++;
			chInput[1] = *pszSource++;
			chInput[2] = *pszSource++;
			nActSource -= 3;

			chOutput[0] = chInput[0] >> 2;
			chOutput[1] = ((chInput[0] & 0x03) << 4) + (chInput[1] >> 4);
			chOutput[2] = ((chInput[1] & 0x0f) << 2) + (chInput[2] >> 6);
			chOutput[3] = chInput[2] & 0x3f;

			if ( nData + 4 > nMaxTarget )	/* There is not enough room in Target    */
			{
				nActSource = 0;				/* Set variables so we just fall through */
				nData      = nMaxTarget;	/* to the end of the routine with rc=-1  */
			}	/* end if ( nData + 4 > nMaxTarget ) */
			else
			{
				*pchActTarget++ = gachBase64Code[chOutput[0]];
				*pchActTarget++ = gachBase64Code[chOutput[1]];
				*pchActTarget++ = gachBase64Code[chOutput[2]];
				*pchActTarget++ = gachBase64Code[chOutput[3]];
				nData += 4;
			}	/* end else ( nData + 4 > nMaxTarget ) */
		}		/* end while ( 2 < nActSource )        */
   
/*
 * Now there are less than 3 bytes left. We now care about the padding.
 */
		if ( 0 != nActSource )
		{
			/* Get what's left. */
			chInput[0] = chInput[1] = chInput[2] = '\0';

			for ( i = 0; i < nActSource; i++ )
			{
				chInput[i] = *pszSource++;
			}	/* end for ( i = 0; i < nActSource; i++ ) */
	
			chOutput[0] = chInput[0] >> 2;
			chOutput[1] = ((chInput[0] & 0x03) << 4) + (chInput[1] >> 4);
			chOutput[2] = ((chInput[1] & 0x0f) << 2) + (chInput[2] >> 6);

			if ( nData + 4 > nMaxTarget )	/* There is not enough room in Target    */
			{										/* Set variable so we just fall through  */
				nData = nMaxTarget;			/* to the end of the routine with rc=-1  */
			}	/* end if (nData + 4 > nMaxTarget) */
			else
			{
				*pchActTarget++ = gachBase64Code[chOutput[0]];
				*pchActTarget++ = gachBase64Code[chOutput[1]];
				nData += 2;

				if ( 1 == nActSource )
				{
					*pchActTarget++ = CH_PAD64;
					nData++;
				}	/* end if ( 1 == nActSource ) */
				else
				{
					*pchActTarget++ = gachBase64Code[chOutput[2]];
					nData++;
				}	/* end else ( 1 == nActSource ) */

				*pchActTarget++ = CH_PAD64;
				nData++;
			}	/* end else (nData + 4 > nMaxTarget) */
		}		/* end if (0 != nActSource)          */

		if ( nData < nMaxTarget )
		{
			*pchActTarget = '\0';	/* Returned value doesn't count \0. */

			nReturn = nData;
		}	/* end if ( nData < nMaxTarget ) */
	}		/* end if ( Parameter ok )       */

	return (nReturn);
}

/*
[FUNCTIONHEADER ON]
+--------------------------------------------------------------------------
| Function         | vInitDecodeTable
|------------------+-------------------------------------------------------
| Description      | Initialize base64 decode table
|------------------+-------------------------------------------------------
| Parameter        | ./.
|------------------+-------------------------------------------------------
| Return values    | ./.
|                  | gfDecodeInit is set to F_TRUE
|------------------+-------------------------------------------------------
| Author           | Frank Schwab
|------------------+-------------------------------------------------------
| Created          | 2004-11-22
|------------------+-------------------------------------------------------
| Changes          | ./.
|------------------+-------------------------------------------------------
| Note             | This is an internal helper routine and not exposed
|                  | through the header file.
|                  | It is not efficient to have this table build for just
|                  | one call, but it is very efficient if you call the
|                  | base64 functions more than once.
+--------------------------------------------------------------------------
[FUNCTIONHEADER OFF]
*/

static void
vInitDecodeTable(void)
{
	u_int i;

	for ( i=0; i<=I_MAX_DECODE; i++ )
	{
		ganBase64Decode[i] = N_INVALID_BASE64_CHAR;
	}

	for ( i=(u_int)'A'; i<=(u_int)'Z'; i++ )
	{
		ganBase64Decode[i] = i - (u_int)'A';
	}

	for ( i=(u_int)'a'; i<=(u_int)'z'; i++ )
	{
		ganBase64Decode[i] = i - (u_int)'a' + 26;
	}

	for ( i=(u_int)'0'; i<=(u_int)'9'; i++ )
	{
		ganBase64Decode[i] = i - (u_int)'0' + 52;
	}

	ganBase64Decode[(u_int)'+'] = 62;
	ganBase64Decode[(u_int)'/'] = 63;

	gfDecodeInit = F_TRUE;

	return;
}

/*
[FUNCTIONHEADER ON]
+--------------------------------------------------------------------------
| Function         | nBase64Decode
|------------------+-------------------------------------------------------
| Description      | Decode a base64 string into a byte data stream
|------------------+-------------------------------------------------------
| Parameter        | pszSource  : Pointer to base64 string
|                  | pszTarget  : Pointer to target area
|                  | nMaxTarget : Maximum number of bytes that the
|                  |              target area can hold
|------------------+-------------------------------------------------------
| Return values    | >=0: Number of characters in the target area
|                  | -1 : An error occurred. (Target area too small,
|                  |      pointers were null, invalid base64 data stream)
|------------------+-------------------------------------------------------
| Author           | Frank Schwab
|------------------+-------------------------------------------------------
| Created          | 2004-11-22
|------------------+-------------------------------------------------------
| Changes          | ./.
|------------------+-------------------------------------------------------
| Note             | ./.
+--------------------------------------------------------------------------
[FUNCTIONHEADER OFF]
*/
int
nBase64Decode(char const *pszSource, u_char *pszTarget, size_t nMaxTarget)
{
	size_t      nTarget;
	int         nState;
	u_int       nActChar;
	int         nReturn = -1;
	u_int       nSourceValue;
	u_int       nTargetValue;
	int         fOK = F_TRUE;
	char const *pchActSource = pszSource;
	u_char     *pchActTarget = pszTarget;

	if ( (pszSource != NULL) &&
		  (pszTarget != NULL) &&
		  (nMaxTarget > 0) )
	{
		nState        = 0;
		nTarget       = 1;
		nTargetValue  = 0;

		if ( F_FALSE == gfDecodeInit )
		{
			vInitDecodeTable();
		}	/* end if ( 0 == gfDecodeInit ) */

		/* Read each character as unsigned so it is a valid table index. */
		while ( '\0' != (nActChar = (u_char)*pchActSource++) )
		{
			if ( isspace(nActChar) )	/* Skip whitespace anywhere. */
				continue;

			if ( CH_PAD64 == nActChar )
				break;

			nSourceValue = ganBase64Decode[nActChar];

			if ( N_INVALID_BASE64_CHAR != nSourceValue )
			{
				switch ( nState )
				{
				case 0:
					if ( nTarget <= nMaxTarget )
					{
						nTargetValue = nSourceValue << 2;

						nState++;
					}	/* end if ( nTarget <= nMaxTarget ) */
					else
					{
						fOK = F_FALSE;
					}	/* end else ( nTarget <= nMaxTarget ) */
				break;

				case 1:
					if ( nTarget <= nMaxTarget )
					{
						*pchActTarget = (nSourceValue >> 4) | nTargetValue;
						nTargetValue  = (nSourceValue & 0x0f) << 4;

						pchActTarget++;
						nTarget++;

						nState++;
					}	/* end if ( nTarget <= nMaxTarget ) */
					else
					{
						fOK = F_FALSE;
					}	/* end else ( nTarget <= nMaxTarget ) */
				break;

				case 2:
					if ( nTarget <= nMaxTarget )
					{
						*pchActTarget = (nSourceValue >> 2) | nTargetValue;
						nTargetValue  = (nSourceValue & 0x03) << 6;

						pchActTarget++;
						nTarget++;

						nState++;
					}	/* end if ( nTarget <= nMaxTarget ) */
					else
					{
						fOK = F_FALSE;
					}	/* end else ( nTarget <= nMaxTarget ) */
				break;

				case 3:
					if ( nTarget <= nMaxTarget )
					{
						*pchActTarget = nSourceValue | nTargetValue;

						pchActTarget++;
						nTarget++;

						nState = 0;
					}	/* end if ( nTarget <= nMaxTarget ) */
					else
					{
						fOK = F_FALSE;
					}	/* end else ( nTarget <= nMaxTarget ) */
				break;
				}	/* end switch ( nState ) */
			}		/* end if ( N_INVALID_BASE64_CHAR != nSourceValue ) */

			if ( F_FALSE == fOK )
			{
				break;
			}	/* end if ( F_FALSE == fOK ) */
		}	/* end while ((nActChar = *pchActSource++) != '\0') */

	/*
	 * We are done decoding Base-64 chars. Let's see if we ended
	 * on a byte boundary, and/or with erroneous trailing characters.
	 */

		if ( F_FALSE != fOK )
		{
			if ( CH_PAD64 == nActChar )	/* We got a pad char. */
			{
				switch (nState)
				{
				case 0:		/* '=' is invalid in the first  position */
				case 1:		/* '=' is invalid in the second position */
					fOK = F_FALSE;
				break;

				case 2:		/* Valid, means one byte of info */
								/* Make sure there is another trailing = sign. */
					for ( nActChar = (u_char)*pchActSource++; nActChar != '\0';
					      nActChar = (u_char)*pchActSource++ )
					{
						if ( !isspace(nActChar) )
						{
							break;
						}	/* end if ( !isspace(nActChar) ) */
					}		/* end for ( skip whitespace after the first '=' ) */

					if ( CH_PAD64 != nActChar )
					{
						fOK = F_FALSE;
						break;
					}	/* end if ( CH_PAD64 != nActChar ) */

					/* Fall through to "single trailing =" case. */
					/* FALLTHROUGH */

				case 3:		/* Valid, means two bytes of info */
					/*
					 * We know this char is an =.  Is there anything but
					 * whitespace after it?
					 */
					for ( nActChar = (u_char)*pchActSource++; nActChar != '\0';
					      nActChar = (u_char)*pchActSource++ )
					{
						if ( !isspace(nActChar) )
						{
							fOK = F_FALSE;
							break;
						}	/* end if ( !isspace(nActChar) ) */
					}		/* end for ( only whitespace may follow the padding ) */

			/*
			 * Now make sure for cases 2 and 3 that the "extra"
			 * bits that slopped past the last full byte were
			 * zeros.  If we don't check them, they become a
			 * subliminal channel.
			 */
					if ( F_FALSE != fOK )
					{
						if ( 0 != nTargetValue )
						{
							fOK = F_FALSE;
						}	/* end if ( 0 != nTargetValue )    */
					}		/* end if ( F_FALSE != fOK )       */
				}			/* end switch ( nState )           */
			}				/* end if ( CH_PAD64 == nActChar ) */
			else
			{
			/*
			 * We ended by seeing the end of the string.  Make sure we
			 * have no partial bytes lying around.
			 */
				if ( 0 != nState )
				{
					fOK = F_FALSE;
				}	/* end if ( 0 != nState ) */
			}		/* end else ( CH_PAD64 == nActChar ) */

			if ( F_FALSE != fOK )
			{
				nReturn = nTarget - 1;
			}	/* end if ( F_FALSE != fOK ) */
		}		/* end if ( F_FALSE != fOK ) */
	}			/* end if ( parameters ok ) */

	return (nReturn);
}
----base64.c end----
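
The check at the end of nBase64Decode rejects padded input whose unused low
bits are set. For instance, "QQ==" and "QR==" would otherwise both decode to
the single byte 0x41, because the low four bits of the second digit never
reach the output. A minimal sketch of that test in isolation follows; the
helper name fSlackBitsAreZero and its interface are assumptions made for
illustration and do not appear in the module above.

```c
#include <stddef.h>
#include <string.h>

/* The base64 alphabet, in the same order as gachBase64Code in base64.c. */
static const char achAlphabet[] =
	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Hypothetical helper for illustration only.
 * chLast : last base64 digit before the padding
 * nPads  : number of '=' characters that follow (1 or 2)
 * Returns 1 if the bits of chLast that do not end up in any decoded
 * byte are all zero, 0 otherwise (including invalid arguments).
 */
static int
fSlackBitsAreZero(char chLast, int nPads)
{
	const char  *pchPos;
	unsigned int nValue;

	if ( chLast == '\0' || (nPads != 1 && nPads != 2) )
	{
		return 0;
	}

	pchPos = strchr(achAlphabet, chLast);
	if ( pchPos == NULL )
	{
		return 0;	/* not a base64 digit */
	}

	nValue = (unsigned int)(pchPos - achAlphabet);

	/* "xx==" leaves 4 unused low bits, "xxx=" leaves 2. */
	return ( (nValue & (nPads == 2 ? 0x0fu : 0x03u)) == 0 );
}
```

With this sketch, fSlackBitsAreZero('Q', 2) accepts the final digit of "QQ=="
while fSlackBitsAreZero('R', 2) rejects the final digit of "QR==".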

Here is a test driver:

----b64test.c begin----
#include <stdio.h>
#include <string.h>
#include "base64.h"

int
main(void)
{
	unsigned char pszInput[256];
	char          pszOutput[1024];
	unsigned char pszTest[256];

	int i;
	int j;

	for (i=0; i<256; i++)
	{
	   pszInput[i] = i;
	}

	for ( j=256; j>0; j-- )
	{
   	printf ("--------\nmaxlen=%d\n", j);

	   i = nBase64Encode(pszInput, j, pszOutput, 1024);

		printf ("Encode-rc=%d, len(Encode)=%lu\n", i, (unsigned long)strlen((char *)pszOutput));
		printf ("Encoded = %s\n", pszOutput);

		if (i >= 0)
		{
			i = nBase64Decode(pszOutput, pszTest, j);

	 		printf ("Decode-rc=%d\n", i);

			if ( i != j )
			puts("Wrong decode length");

			for (i=0; i<j; i++)
			{
				if ( pszTest[i] != i )
				{
				   printf ("wrong decode on pos %d\n", i);
					break;
				}
			}
		}
	}

	return 0;
}
----b64test.c end----
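
A round-trip test like the one above cannot detect an error that the encoder
and the decoder share, so a few known-answer checks may be a useful
complement. The sketch below encodes a single final quantum of 1 to 3 bytes
and so covers the three padding cases listed in the RFC excerpt inside
base64.c. The helper name nEncodeFinalQuantum is hypothetical and not part of
the posted module; the expected strings are the RFC 4648 test vectors
("f" -> "Zg==", "fo" -> "Zm8=", "foo" -> "Zm9v").

```c
#include <stddef.h>

static const char achCode[] =
	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Hypothetical helper for illustration only: encode one final quantum
 * of 1 to 3 bytes into achOut (4 characters plus '\0').
 * Returns 0 on success, -1 if a pointer is null or nBytes is not 1..3.
 */
static int
nEncodeFinalQuantum(const unsigned char *pbIn, size_t nBytes, char achOut[5])
{
	unsigned char ab[3] = { 0, 0, 0 };	/* missing bytes count as zero bits */
	size_t        i;

	if ( pbIn == NULL || achOut == NULL || nBytes < 1 || nBytes > 3 )
	{
		return -1;
	}

	for ( i = 0; i < nBytes; i++ )
	{
		ab[i] = pbIn[i];
	}

	/* The first two digits are always present. */
	achOut[0] = achCode[ab[0] >> 2];
	achOut[1] = achCode[((ab[0] & 0x03) << 4) | (ab[1] >> 4)];

	/* 8 bits of input: two '='; 16 bits: one '='; 24 bits: none. */
	achOut[2] = (nBytes > 1) ? achCode[((ab[1] & 0x0f) << 2) | (ab[2] >> 6)] : '=';
	achOut[3] = (nBytes > 2) ? achCode[ab[2] & 0x3f] : '=';
	achOut[4] = '\0';

	return 0;
}
```

The same three inputs can be passed to nBase64Encode with nSource set to 1, 2
and 3 and compared against the same expected strings.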

-- 
           Summary: resolv/base64.c does not correctly decode, does not
                    check parameters & overwrites bytes after target
           Product: glibc
           Version: 2.3.3
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: gotom at debian dot or dot jp
        ReportedBy: gemenge at hotmail dot com
                CC: glibc-bugs at sources dot redhat dot com


http://sources.redhat.com/bugzilla/show_bug.cgi?id=567
