002 Internet Networking Technology: Base64 Encoding and Decoding Implemented in C

Introduction

Base64 encoding converts data of any type into printable ASCII characters; the receiving end then decodes it in reverse to recover the original data. Base64 was first used for sending email content.

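Data converted to Base64 grows to about 4/3 of its original size, but it becomes easier to transmit. For example, because the Base64 alphabet contains no special characters such as < and >, encoded data can be placed directly in XML or MIME, or even stored in a database, without an escaping pass. Because of these conveniences, Base64 is still widely used even though it increases the data volume.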

Encoding principle

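Each byte has 8 bits. The encoder takes 3 bytes at a time, that is, 3 x 8 = 24 bits, then takes 6 bits at a time from those 24 bits and prepends two 0 bits to form a new 8-bit value, that is, one byte. This turns 3 bytes into 4 bytes. Because the top two bits are always 0, the largest number each converted byte can hold is 63, so each converted byte is always a number in the range 0-63.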
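The bit regrouping described above can be sketched as a small standalone helper. The name `split_triplet` is hypothetical and exists only for this illustration; it mirrors the shifts and masks used in `b64_ntop` but is not part of that code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (illustration only): split 3 input bytes into
 * four 6-bit values, each in the range 0..63. */
static void split_triplet(const unsigned char in[3], unsigned char out[4])
{
    assert(in != NULL);
    assert(out != NULL);

    /* Top 6 bits of byte 0. */
    out[0] = (unsigned char)(in[0] >> 2);
    /* Low 2 bits of byte 0, then top 4 bits of byte 1. */
    out[1] = (unsigned char)(((in[0] & 0x03) << 4) | (in[1] >> 4));
    /* Low 4 bits of byte 1, then top 2 bits of byte 2. */
    out[2] = (unsigned char)(((in[1] & 0x0f) << 2) | (in[2] >> 6));
    /* Low 6 bits of byte 2. */
    out[3] = (unsigned char)(in[2] & 0x3f);
}
```

For the input bytes 'M', 'a', 'n' (0x4D, 0x61, 0x6E), the 24 bits 010011 010110 000101 101110 split into the values 19, 22, 5 and 46.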

Then, according to the Base64 index table given in the specification, each of the 64 numbers 0-63 is converted into one character of "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/".
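The table lookup is a plain array index. A minimal sketch, assuming the 6-bit values have already been computed; `b64_alphabet` and `indices_to_chars` are hypothetical names used only here:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The 64-character Base64 alphabet; position = 6-bit value. */
static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper (illustration only): map n 6-bit values to their
 * Base64 characters and NUL-terminate. out must hold n + 1 bytes. */
static void indices_to_chars(const unsigned char *idx, size_t n, char *out)
{
    size_t i;

    assert(idx != NULL);
    assert(out != NULL);

    for (i = 0; i < n; i++) {
        assert(idx[i] < 64);          /* only 0..63 are valid indexes */
        out[i] = b64_alphabet[idx[i]];
    }
    out[n] = '\0';
}
```

With the values 19, 22, 5, 46 this produces "TWFu"; the value 0 maps to 'A' and 63 maps to '/'.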

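When fewer than 3 bytes remain at the end, the missing positions are filled with 0 bits. If the final group is one byte short, one "=" is appended to the encoded output; if it is two bytes short, two "=" are appended.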
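The padding rule can be sketched for the final group alone. `encode_tail` and `tail_alphabet` are hypothetical names for this illustration, which assumes exactly 1 or 2 bytes are left over:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char tail_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper (illustration only): encode a final group of
 * n = 1 or n = 2 leftover bytes into 4 characters plus a NUL. */
static void encode_tail(const unsigned char *src, size_t n, char out[5])
{
    unsigned char in[3] = { 0, 0, 0 };   /* missing bytes stay zero */
    size_t i;

    assert(src != NULL);
    assert(out != NULL);
    assert(n == 1 || n == 2);

    for (i = 0; i < n; i++)
        in[i] = src[i];

    out[0] = tail_alphabet[in[0] >> 2];
    out[1] = tail_alphabet[((in[0] & 0x03) << 4) | (in[1] >> 4)];
    /* Two bytes missing: the third character is padding as well. */
    out[2] = (n == 1) ? '='
                      : tail_alphabet[((in[1] & 0x0f) << 2) | (in[2] >> 6)];
    out[3] = '=';                        /* last character is always padding */
    out[4] = '\0';
}
```

A single leftover byte 'M' encodes to "TQ==", and the two leftover bytes 'M', 'a' encode to "TWE=".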

Decoding principle

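Decoding is the reverse of encoding. The decoder takes 4 bytes at a time and converts each character back to its index in the Base64 index table, which is the result of the 3-byte to 4-byte conversion done during encoding. It then uses bit operations to drop the top 2 bits of each byte and repacks the remainder into 3 bytes. The trailing "=" characters at the end need special care.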
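The reverse step for one 4-character group might look like the sketch below. `decode_quad` and `quad_alphabet` are hypothetical names; unlike `b64_pton`, this simplified version does not skip whitespace and does not check that the leftover bits before the padding are zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char quad_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper (illustration only): decode one 4-character group.
 * Returns the number of bytes written to out (1..3), or -1 on bad input. */
static int decode_quad(const char in[4], unsigned char out[3])
{
    unsigned v[4] = { 0, 0, 0, 0 };
    int i, npad = 0;

    assert(in != NULL);
    assert(out != NULL);

    for (i = 0; i < 4; i++) {
        const char *p;

        if (in[i] == '=') {
            if (i < 2)                /* '=' is only valid in the last two slots */
                return -1;
            npad++;
            continue;
        }
        if (npad > 0 || in[i] == '\0')  /* data after padding, or short input */
            return -1;
        p = strchr(quad_alphabet, in[i]);
        if (p == NULL)                /* not a Base64 character */
            return -1;
        v[i] = (unsigned)(p - quad_alphabet);
    }

    /* Repack four 6-bit values into three 8-bit bytes. */
    out[0] = (unsigned char)((v[0] << 2) | (v[1] >> 4));
    out[1] = (unsigned char)(((v[1] & 0x0f) << 4) | (v[2] >> 2));
    out[2] = (unsigned char)(((v[2] & 0x03) << 6) | v[3]);
    return 3 - npad;
}
```

Decoding "TWFu" yields the 3 bytes "Man", "TWE=" yields 2 bytes, and "TQ==" yields 1 byte; only the first `3 - npad` bytes of `out` are meaningful.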

Code implementation

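#include <sys/types.h>  /* u_char, u_int32_t (BSD/glibc typedefs) */
#include <assert.h>
#include <ctype.h>      /* isspace */
#include <stdio.h>
#include <stdlib.h>     /* abort */
#include <string.h>     /* strchr, memset */

static const char Base64[] =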
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
static const char Pad64 = '=';

/* (From RFC1521 and draft-ietf-dnssec-secext-03.txt)
   The following encoding technique is taken from RFC 1521 by Borenstein
   and Freed.  It is reproduced here in a slightly edited form for
   convenience.

   A 65-character subset of US-ASCII is used, enabling 6 bits to be
   represented per printable character. (The extra 65th character, "=",
   is used to signify a special processing function.)

   The encoding process represents 24-bit groups of input bits as output
   strings of 4 encoded characters. Proceeding from left to right, a
   24-bit input group is formed by concatenating 3 8-bit input groups.
   These 24 bits are then treated as 4 concatenated 6-bit groups, each
   of which is translated into a single digit in the base64 alphabet.

   Each 6-bit group is used as an index into an array of 64 printable
   characters. The character referenced by the index is placed in the
   output string.

                         Table 1: The Base64 Alphabet

      Value Encoding  Value Encoding  Value Encoding  Value Encoding
          0 A            17 R            34 i            51 z
          1 B            18 S            35 j            52 0
          2 C            19 T            36 k            53 1
          3 D            20 U            37 l            54 2
          4 E            21 V            38 m            55 3
          5 F            22 W            39 n            56 4
          6 G            23 X            40 o            57 5
          7 H            24 Y            41 p            58 6
          8 I            25 Z            42 q            59 7
          9 J            26 a            43 r            60 8
         10 K            27 b            44 s            61 9
         11 L            28 c            45 t            62 +
         12 M            29 d            46 u            63 /
         13 N            30 e            47 v
         14 O            31 f            48 w         (pad) =
         15 P            32 g            49 x
         16 Q            33 h            50 y

   Special processing is performed if fewer than 24 bits are available
   at the end of the data being encoded.  A full encoding quantum is
   always completed at the end of a quantity.  When fewer than 24 input
   bits are available in an input group, zero bits are added (on the
   right) to form an integral number of 6-bit groups.  Padding at the
   end of the data is performed using the '=' character.

   Since all base64 input is an integral number of octets, only the
         -------------------------------------------------
   following cases can arise:

       (1) the final quantum of encoding input is an integral
           multiple of 24 bits; here, the final unit of encoded
           output will be an integral multiple of 4 characters
           with no "=" padding,
       (2) the final quantum of encoding input is exactly 8 bits;
           here, the final unit of encoded output will be two
           characters followed by two "=" padding characters, or
       (3) the final quantum of encoding input is exactly 16 bits;
           here, the final unit of encoded output will be three
           characters followed by one "=" padding character.
   */

int b64_ntop(u_char const *src, size_t srclength, char *target, size_t targsize)
{
    size_t datalength = 0;
    u_char input[3] = { 0, 0, 0 };  /* make compiler happy */
    u_char output[4];
    size_t i;

    assert(src != NULL);
    assert(target != NULL);

    while (2 < srclength) {
        input[0] = *src++;
        input[1] = *src++;
        input[2] = *src++;
        srclength -= 3;

        output[0] = (u_int32_t)input[0] >> 2;
        output[1] = ((u_int32_t)(input[0] & 0x03) << 4) +
            ((u_int32_t)input[1] >> 4);
        output[2] = ((u_int32_t)(input[1] & 0x0f) << 2) +
            ((u_int32_t)input[2] >> 6);
        output[3] = input[2] & 0x3f;
        assert(output[0] < 64);
        assert(output[1] < 64);
        assert(output[2] < 64);
        assert(output[3] < 64);

        if (datalength + 4 > targsize)
            return (-1);
        target[datalength++] = Base64[output[0]];
        target[datalength++] = Base64[output[1]];
        target[datalength++] = Base64[output[2]];
        target[datalength++] = Base64[output[3]];
    }

    /* Now we worry about padding. */
    if (0 != srclength) {
        /* Get what's left. */
        input[0] = input[1] = input[2] = '\0';
        for (i = 0; i < srclength; i++)
            input[i] = *src++;

        output[0] = (u_int32_t)input[0] >> 2;
        output[1] = ((u_int32_t)(input[0] & 0x03) << 4) +
            ((u_int32_t)input[1] >> 4);
        output[2] = ((u_int32_t)(input[1] & 0x0f) << 2) +
            ((u_int32_t)input[2] >> 6);
        assert(output[0] < 64);
        assert(output[1] < 64);
        assert(output[2] < 64);

        if (datalength + 4 > targsize)
            return (-1);
        target[datalength++] = Base64[output[0]];
        target[datalength++] = Base64[output[1]];
        if (srclength == 1)
            target[datalength++] = Pad64;
        else
            target[datalength++] = Base64[output[2]];
        target[datalength++] = Pad64;
    }
    if (datalength >= targsize)
        return (-1);
    target[datalength] = '\0';  /* Returned value doesn't count \0. */
    return (datalength);
}

/* skips all whitespace anywhere.
   converts characters, four at a time, starting at (or after)
   src from base - 64 numbers into three 8 bit bytes in the target area.
   it returns the number of data bytes stored at the target, or -1 on error.
 */

int b64_pton(char const *src, u_char *target, size_t targsize)
{
    size_t tarindex;
    int state, ch;
    char *pos;

    assert(src != NULL);
    assert(target != NULL);

    state = 0;
    tarindex = 0;

    while ((ch = (u_char) *src++) != '\0') {
        if (isspace(ch))    /* Skip whitespace anywhere. */
            continue;

        if (ch == Pad64)
            break;

        pos = strchr(Base64, ch);
        if (pos == 0)       /* A non-base64 character. */
            return (-1);

        switch (state) {
        case 0:
            if (target) {
                if (tarindex >= targsize)
                    return (-1);
                target[tarindex] = (pos - Base64) << 2;
            }
            state = 1;
            break;
        case 1:
            if (target) {
                if (tarindex + 1 >= targsize)
                    return (-1);
                target[tarindex] |=
                    (u_int32_t)(pos - Base64) >> 4;
                target[tarindex+1]  = ((pos - Base64) & 0x0f)
                            << 4 ;
            }
            tarindex++;
            state = 2;
            break;
        case 2:
            if (target) {
                if (tarindex + 1 >= targsize)
                    return (-1);
                target[tarindex] |=
                    (u_int32_t)(pos - Base64) >> 2;
                target[tarindex+1] = ((pos - Base64) & 0x03)
                            << 6;
            }
            tarindex++;
            state = 3;
            break;
        case 3:
            if (target) {
                if (tarindex >= targsize)
                    return (-1);
                target[tarindex] |= (pos - Base64);
            }
            tarindex++;
            state = 0;
            break;
        default:
            abort();
        }
    }

    /*
     * We are done decoding Base-64 chars.  Let's see if we ended
     * on a byte boundary, and/or with erroneous trailing characters.
     */

    if (ch == Pad64) {      /* We got a pad char. */
        ch = (u_char) *src++;   /* Skip it, get next. */
        switch (state) {
        case 0:     /* Invalid = in first position */
        case 1:     /* Invalid = in second position */
            return (-1);

        case 2:     /* Valid, means one byte of info */
            /* Skip any number of spaces. */
            for (; ch != '\0'; ch = (u_char) *src++)
                if (!isspace(ch))
                    break;
            /* Make sure there is another trailing = sign. */
            if (ch != Pad64)
                return (-1);
            ch = (u_char) *src++;   /* Skip the = */
            /* Fall through to "single trailing =" case. */
            /* FALLTHROUGH */

        case 3:     /* Valid, means two bytes of info */
            /*
             * We know this char is an =.  Is there anything but
             * whitespace after it?
             */
            for (; ch != '\0'; ch = (u_char) *src++)
                if (!isspace(ch))
                    return (-1);

            /*
             * Now make sure for cases 2 and 3 that the "extra"
             * bits that slopped past the last full byte were
             * zeros.  If we don't check them, they become a
             * subliminal channel.
             */
            if (target && target[tarindex] != 0)
                return (-1);
        }
    } else {
        /*
         * We ended by seeing the end of the string.  Make sure we
         * have no partial bytes lying around.
         */
        if (state != 0)
            return (-1);
    }

    return (tarindex);
}

Test code

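int main(void)
{
    unsigned char data[200];
    unsigned char data2[400];
    char enstr[1024];
    int i;
    int ret;

    /* Fill the source buffer with the byte values 0..199. */
    for (i = 0; i < 200; i++)
        data[i] = (unsigned char)i;

    printf("src:\n");
    for (i = 0; i < 200; i++) {
        printf("%02x,", data[i]);
        if ((i + 1) % 16 == 0)
            printf("\n");
    }

    memset(enstr, 0, sizeof(enstr));

    printf("\nexe:\n");

    /* Encode the 200 source bytes into enstr. */
    ret = b64_ntop(data, sizeof(data), enstr, sizeof(enstr));
    if (ret < 0) {
        printf("b64_ntop failed\n");
        return 1;
    }
    printf("ret=%d\n%s\n", ret, enstr);

    /* Decode back into data2; ret is the number of decoded bytes. */
    ret = b64_pton(enstr, data2, sizeof(data2));
    if (ret < 0) {
        printf("b64_pton failed\n");
        return 1;
    }
    printf("ret=%d\nresult:\n", ret);

    /* Print only the bytes that were actually decoded. */
    for (i = 0; i < ret; i++) {
        printf("%02x,", data2[i]);
        if ((i + 1) % 16 == 0)
            printf("\n");
    }

    printf("\n");
    return 0;
}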

Other notes

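Base64 also differs slightly between contexts: some encoders insert a line break after every 76 characters of output. That is also correct. The example code above does not do this; to add it, simply count the output characters and insert a line break at regular intervals.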
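One way to add the line breaks is as a separate pass over the already encoded string. The sketch below uses a hypothetical helper named `wrap_lines` and assumes a bare '\n' as the line break (MIME itself uses CRLF). Because `b64_pton` skips whitespace anywhere, the wrapped output can still be decoded by it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (illustration only): copy src into dst, inserting
 * '\n' after every `width` characters. Returns the length written (not
 * counting the NUL), or -1 if dst is too small. */
static int wrap_lines(const char *src, size_t width, char *dst, size_t dstsize)
{
    size_t in = 0, out = 0;

    assert(src != NULL);
    assert(dst != NULL);
    assert(width > 0);

    while (src[in] != '\0') {
        /* A full line has been written and more data follows. */
        if (in > 0 && in % width == 0) {
            if (out + 1 >= dstsize)
                return -1;
            dst[out++] = '\n';
        }
        if (out + 1 >= dstsize)   /* keep room for the terminating NUL */
            return -1;
        dst[out++] = src[in++];
    }
    if (out >= dstsize)
        return -1;
    dst[out] = '\0';
    return (int)out;
}
```

Calling it with `width` set to 76 gives the 76-character layout mentioned above; no line break is added after the final line.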

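Without line breaks, the encoded result is about 4/3 the length of the original. When allocating memory precisely with malloc, however, note that the output is always a whole number of 4-character groups: (org_len + 3) * 4 / 3 bytes is a safe upper bound, and one more byte is needed if a terminating '\0' is appended. The exact size is (org_len + 2) / 3 * 4 using integer division, plus 1 for the '\0'.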
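The exact buffer size (4 output characters for every started 3-byte group, plus one byte for the terminator) can be computed with integer arithmetic. `b64_encoded_size` is a hypothetical helper name, and the overflow guard is a deliberately conservative assumption rather than a tight bound:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): number of bytes needed to hold
 * the Base64 encoding of org_len bytes, including the terminating '\0',
 * when no line breaks are inserted. */
static size_t b64_encoded_size(size_t org_len)
{
    /* Conservative guard against size_t overflow in the arithmetic. */
    assert(org_len <= SIZE_MAX / 2);

    /* Round up to whole 3-byte groups, 4 characters each, plus the NUL. */
    return ((org_len + 2) / 3) * 4 + 1;
}
```

For the 200-byte buffer used in the test code this gives 269: 268 Base64 characters plus the terminator, which also satisfies the `targsize` checks in `b64_ntop`.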
