002 Internet Networking Technology: Implementing Base64 Encoding and Decoding in C

Introduction

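Base64 encoding converts data of any type into printable ASCII characters; the receiving end then decodes it in reverse to recover the original data. Base64 was originally used for sending email content.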

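Data converted to Base64 grows to roughly 4/3 of its original size, but it becomes easier to transport. For example, because the Base64 alphabet contains no special characters such as < and >, the encoded text can be placed directly in XML or MIME, or even stored in a database, without scanning for characters that need escaping. Thanks to these conveniences, Base64 is widely used even though it increases the amount of data.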

Encoding principle

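Each byte has 8 bits. The encoder takes 3 bytes at a time, that is 3 x 8 = 24 bits. It then takes 6 bits at a time from those 24 bits and pads two 0 bits in front, forming a new 8-bit value, that is one byte. In this way 3 bytes are converted into 4 bytes. Because the two leading bits are always 0, the largest number each converted byte can represent is 63, so each converted byte can only be a number from 0 to 63.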
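The bit regrouping described above can be sketched as follows. This is a minimal illustration only; `split_sextets` is a hypothetical helper name that does not appear in the implementation later in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split three input bytes (24 bits) into four 6-bit values, each 0..63.
   split_sextets is a hypothetical name used only for this sketch. */
static void split_sextets(const uint8_t in[3], uint8_t out[4])
{
    assert(in != NULL && out != NULL);

    out[0] = in[0] >> 2;                                       /* top 6 bits of byte 0 */
    out[1] = (uint8_t)(((in[0] & 0x03) << 4) | (in[1] >> 4));  /* 2 bits + 4 bits */
    out[2] = (uint8_t)(((in[1] & 0x0f) << 2) | (in[2] >> 6));  /* 4 bits + 2 bits */
    out[3] = in[2] & 0x3f;                                     /* low 6 bits of byte 2 */
}
```

For the three bytes of "Man" (0x4d, 0x61, 0x6e), the four resulting values are 19, 22, 5 and 46.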

Then, using the Base64 index table given in the specification, each of these 64 values (0 to 63) is converted to one character of "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/".
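The table lookup is a plain array index. The sketch below assumes the standard alphabet quoted above; `kAlphabet` and `sextet_to_char` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The 64-character Base64 alphabet; the array index is the 6-bit value. */
static const char kAlphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Map a 6-bit value (0..63) to its Base64 character.
   sextet_to_char is a hypothetical name used only for this sketch. */
static char sextet_to_char(unsigned value)
{
    assert(value < 64);
    return kAlphabet[value];
}
```

With this mapping, the values 19, 22, 5 and 46 become 'T', 'W', 'F' and 'u', so "Man" encodes to "TWFu".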

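When fewer than 3 bytes remain at the end, the missing positions are filled with 0 bits. If the final group is one byte short, one "=" is appended to the encoded output; if it is two bytes short, two "=" are appended.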
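The padding rule can be illustrated in isolation with a small sketch that encodes only the final, incomplete group. `encode_tail` and `kAlphabet` are hypothetical names; the full implementation below handles this case inside its encoder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char kAlphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode a final group of n = 1 or 2 bytes into 4 characters plus a NUL.
   encode_tail is a hypothetical name used only for this sketch. */
static void encode_tail(const unsigned char *in, size_t n, char out[5])
{
    unsigned char b[3] = { 0, 0, 0 };   /* missing bytes stay zero */

    assert(in != NULL && out != NULL);
    assert(n == 1 || n == 2);
    memcpy(b, in, n);

    out[0] = kAlphabet[b[0] >> 2];
    out[1] = kAlphabet[((b[0] & 0x03) << 4) | (b[1] >> 4)];
    /* With one input byte the third character carries no data, so it is '='. */
    out[2] = (n == 2) ? kAlphabet[((b[1] & 0x0f) << 2) | (b[2] >> 6)] : '=';
    out[3] = '=';                       /* the fourth character is always padding here */
    out[4] = '\0';
}
```

Encoding the single byte "M" this way gives "TQ==", and the two bytes "Ma" give "TWE=".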

Decoding principle

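Decoding is the reverse of encoding. The decoder takes 4 bytes at a time and converts each character to its index in the Base64 index table, which is exactly the result of the 3-byte to 4-byte conversion performed during encoding. It then uses bit operations to drop the two leading bits of each byte and reassembles the remaining bits into 3 bytes. The part that needs care is the handling of the trailing "=" characters.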
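For a full group of four characters without padding, the reverse step can be sketched as below. `decode_quad` and `kAlphabet` are hypothetical names; the sketch deliberately ignores whitespace and "=" handling, which the full decoder later in this article takes care of.

```c
#include <assert.h>
#include <string.h>

static const char kAlphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Decode four Base64 characters (no padding) into three bytes.
   Returns 0 on success, -1 if a character is outside the alphabet.
   decode_quad is a hypothetical name used only for this sketch. */
static int decode_quad(const char in[4], unsigned char out[3])
{
    unsigned v[4];
    int i;

    assert(in != NULL && out != NULL);

    for (i = 0; i < 4; i++) {
        /* strchr would match the terminating NUL, so reject '\0' explicitly. */
        const char *pos = (in[i] != '\0') ? strchr(kAlphabet, in[i]) : NULL;
        if (pos == NULL)
            return -1;
        v[i] = (unsigned)(pos - kAlphabet);    /* character -> 6-bit index */
    }

    out[0] = (unsigned char)((v[0] << 2) | (v[1] >> 4));
    out[1] = (unsigned char)(((v[1] & 0x0f) << 4) | (v[2] >> 2));
    out[2] = (unsigned char)(((v[2] & 0x03) << 6) | v[3]);
    return 0;
}
```

Decoding "TWFu" with this helper yields the bytes 'M', 'a', 'n'.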

Code implementation

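#include <sys/types.h>  /* u_char, u_int32_t (BSD-style typedefs, e.g. on glibc/BSD) */
#include <assert.h>     /* assert */
#include <ctype.h>      /* isspace */
#include <stdio.h>      /* printf (test code) */
#include <stdlib.h>     /* abort */
#include <string.h>     /* strchr, memset */

static const char Base64[] =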
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
static const char Pad64 = '=';

/* (From RFC1521 and draft-ietf-dnssec-secext-03.txt)
   The following encoding technique is taken from RFC 1521 by Borenstein
   and Freed.  It is reproduced here in a slightly edited form for
   convenience.

   A 65-character subset of US-ASCII is used, enabling 6 bits to be
   represented per printable character. (The extra 65th character, "=",
   is used to signify a special processing function.)

   The encoding process represents 24-bit groups of input bits as output
   strings of 4 encoded characters. Proceeding from left to right, a
   24-bit input group is formed by concatenating 3 8-bit input groups.
   These 24 bits are then treated as 4 concatenated 6-bit groups, each
   of which is translated into a single digit in the base64 alphabet.

   Each 6-bit group is used as an index into an array of 64 printable
   characters. The character referenced by the index is placed in the
   output string.

                         Table 1: The Base64 Alphabet

      Value Encoding  Value Encoding  Value Encoding  Value Encoding
          0 A            17 R            34 i            51 z
          1 B            18 S            35 j            52 0
          2 C            19 T            36 k            53 1
          3 D            20 U            37 l            54 2
          4 E            21 V            38 m            55 3
          5 F            22 W            39 n            56 4
          6 G            23 X            40 o            57 5
          7 H            24 Y            41 p            58 6
          8 I            25 Z            42 q            59 7
          9 J            26 a            43 r            60 8
         10 K            27 b            44 s            61 9
         11 L            28 c            45 t            62 +
         12 M            29 d            46 u            63 /
         13 N            30 e            47 v
         14 O            31 f            48 w         (pad) =
         15 P            32 g            49 x
         16 Q            33 h            50 y

   Special processing is performed if fewer than 24 bits are available
   at the end of the data being encoded.  A full encoding quantum is
   always completed at the end of a quantity.  When fewer than 24 input
   bits are available in an input group, zero bits are added (on the
   right) to form an integral number of 6-bit groups.  Padding at the
   end of the data is performed using the '=' character.

   Since all base64 input is an integral number of octets, only the
         -------------------------------------------------
   following cases can arise:

       (1) the final quantum of encoding input is an integral
           multiple of 24 bits; here, the final unit of encoded
           output will be an integral multiple of 4 characters
           with no "=" padding,
       (2) the final quantum of encoding input is exactly 8 bits;
           here, the final unit of encoded output will be two
           characters followed by two "=" padding characters, or
       (3) the final quantum of encoding input is exactly 16 bits;
           here, the final unit of encoded output will be three
           characters followed by one "=" padding character.
   */

int b64_ntop(u_char const *src, size_t srclength, char *target, size_t targsize)
{
    size_t datalength = 0;
    u_char input[3] = { 0, 0, 0 };  /* make compiler happy */
    u_char output[4];
    size_t i;

    assert(src != NULL);
    assert(target != NULL);

    while (2 < srclength) {
        input[0] = *src++;
        input[1] = *src++;
        input[2] = *src++;
        srclength -= 3;

        output[0] = (u_int32_t)input[0] >> 2;
        output[1] = ((u_int32_t)(input[0] & 0x03) << 4) +
            ((u_int32_t)input[1] >> 4);
        output[2] = ((u_int32_t)(input[1] & 0x0f) << 2) +
            ((u_int32_t)input[2] >> 6);
        output[3] = input[2] & 0x3f;
        assert(output[0] < 64);
        assert(output[1] < 64);
        assert(output[2] < 64);
        assert(output[3] < 64);

        if (datalength + 4 > targsize)
            return (-1);
        target[datalength++] = Base64[output[0]];
        target[datalength++] = Base64[output[1]];
        target[datalength++] = Base64[output[2]];
        target[datalength++] = Base64[output[3]];
    }

    /* Now we worry about padding. */
    if (0 != srclength) {
        /* Get what's left. */
        input[0] = input[1] = input[2] = '\0';
        for (i = 0; i < srclength; i++)
            input[i] = *src++;

        output[0] = (u_int32_t)input[0] >> 2;
        output[1] = ((u_int32_t)(input[0] & 0x03) << 4) +
            ((u_int32_t)input[1] >> 4);
        output[2] = ((u_int32_t)(input[1] & 0x0f) << 2) +
            ((u_int32_t)input[2] >> 6);
        assert(output[0] < 64);
        assert(output[1] < 64);
        assert(output[2] < 64);

        if (datalength + 4 > targsize)
            return (-1);
        target[datalength++] = Base64[output[0]];
        target[datalength++] = Base64[output[1]];
        if (srclength == 1)
            target[datalength++] = Pad64;
        else
            target[datalength++] = Base64[output[2]];
        target[datalength++] = Pad64;
    }
    if (datalength >= targsize)
        return (-1);
    target[datalength] = '\0';  /* Returned value doesn't count \0. */
    return (datalength);
}

/* skips all whitespace anywhere.
   converts characters, four at a time, starting at (or after)
   src from base - 64 numbers into three 8 bit bytes in the target area.
   it returns the number of data bytes stored at the target, or -1 on error.
 */

int b64_pton(char const *src, u_char *target, size_t targsize)
{
    size_t tarindex;
    int state, ch;
    char *pos;

    assert(src != NULL);
    assert(target != NULL);

    state = 0;
    tarindex = 0;

    while ((ch = (u_char) *src++) != '\0') {
        if (isspace(ch))    /* Skip whitespace anywhere. */
            continue;

        if (ch == Pad64)
            break;

        pos = strchr(Base64, ch);
        if (pos == 0)       /* A non-base64 character. */
            return (-1);

        switch (state) {
        case 0:
            if (target) {
                if (tarindex >= targsize)
                    return (-1);
                target[tarindex] = (pos - Base64) << 2;
            }
            state = 1;
            break;
        case 1:
            if (target) {
                if (tarindex + 1 >= targsize)
                    return (-1);
                target[tarindex] |=
                    (u_int32_t)(pos - Base64) >> 4;
                target[tarindex+1]  = ((pos - Base64) & 0x0f)
                            << 4 ;
            }
            tarindex++;
            state = 2;
            break;
        case 2:
            if (target) {
                if (tarindex + 1 >= targsize)
                    return (-1);
                target[tarindex] |=
                    (u_int32_t)(pos - Base64) >> 2;
                target[tarindex+1] = ((pos - Base64) & 0x03)
                            << 6;
            }
            tarindex++;
            state = 3;
            break;
        case 3:
            if (target) {
                if (tarindex >= targsize)
                    return (-1);
                target[tarindex] |= (pos - Base64);
            }
            tarindex++;
            state = 0;
            break;
        default:
            abort();
        }
    }

    /*
     * We are done decoding Base-64 chars.  Let's see if we ended
     * on a byte boundary, and/or with erroneous trailing characters.
     */

    if (ch == Pad64) {      /* We got a pad char. */
        ch = *src++;        /* Skip it, get next. */
        switch (state) {
        case 0:     /* Invalid = in first position */
        case 1:     /* Invalid = in second position */
            return (-1);

        case 2:     /* Valid, means one byte of info */
            /* Skip any number of spaces. */
            for (; ch != '\0'; ch = (u_char) *src++)
                if (!isspace(ch))
                    break;
            /* Make sure there is another trailing = sign. */
            if (ch != Pad64)
                return (-1);
            ch = *src++;        /* Skip the = */
            /* Fall through to "single trailing =" case. */
            /* FALLTHROUGH */

        case 3:     /* Valid, means two bytes of info */
            /*
             * We know this char is an =.  Is there anything but
             * whitespace after it?
             */
            for (; ch != '\0'; ch = (u_char) *src++)
                if (!isspace(ch))
                    return (-1);

            /*
             * Now make sure for cases 2 and 3 that the "extra"
             * bits that slopped past the last full byte were
             * zeros.  If we don't check them, they become a
             * subliminal channel.
             */
            if (target && target[tarindex] != 0)
                return (-1);
        }
    } else {
        /*
         * We ended by seeing the end of the string.  Make sure we
         * have no partial bytes lying around.
         */
        if (state != 0)
            return (-1);
    }

    return (tarindex);
}

Test code

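int main(void)
{
    unsigned char data[200];
    unsigned char data2[400];
    char enstr[1024];       /* char, to match the b64_ntop/b64_pton prototypes */
    int i;
    int ret;

    /* Fill the source buffer with the byte values 0..199. */
    for (i = 0; i < 200; i++)
        data[i] = (unsigned char)i;

    printf("src:\n");
    for (i = 0; i < 200; i++) {
        printf("%02x,", data[i]);
        if ((i + 1) % 16 == 0)
            printf("\n");
    }

    memset(enstr, 0, sizeof(enstr));

    printf("\nexe:\n");

    /* Encode: ret is the encoded length, or -1 if the buffer is too small. */
    ret = b64_ntop(data, sizeof(data), enstr, sizeof(enstr));
    if (ret < 0) {
        printf("encode failed\n");
        return 1;
    }
    printf("ret=%d\n%s\n", ret, enstr);

    /* Decode: ret is the number of decoded bytes, or -1 on error. */
    ret = b64_pton(enstr, data2, sizeof(data2));
    if (ret < 0) {
        printf("decode failed\n");
        return 1;
    }
    printf("ret=%d\nresult:\n", ret);

    /* Print only the bytes that were actually decoded. */
    for (i = 0; i < ret; i++) {
        printf("%02x,", data2[i]);
        if ((i + 1) % 16 == 0)
            printf("\n");
    }

    printf("\n");
    return 0;
}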

Other notes

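Base64 also varies slightly between contexts: some encoders insert a line break after every 76 characters of output. This is also valid. The sample code above does not do this; to add it, simply keep a count and insert a line break at regular intervals.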
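One way to add the line breaks is as a separate pass over the already encoded string, sketched below. `wrap_lines` is a hypothetical helper, not part of the code above. It inserts a single '\n' for simplicity, whereas MIME uses CRLF line endings, so the separator would need adjusting for that use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, inserting '\n' after every `width` characters
   (but not after the last character). Returns the output length without
   the NUL, or -1 if dst is too small.
   wrap_lines is a hypothetical name used only for this sketch. */
static int wrap_lines(const char *src, size_t width, char *dst, size_t dstsize)
{
    size_t len, out = 0, i;

    assert(src != NULL && dst != NULL && width > 0);

    len = strlen(src);
    for (i = 0; i < len; i++) {
        if (i > 0 && i % width == 0) {      /* a full line has been written */
            if (out + 1 >= dstsize)
                return -1;
            dst[out++] = '\n';
        }
        if (out + 1 >= dstsize)             /* keep room for the final NUL */
            return -1;
        dst[out++] = src[i];
    }
    if (out >= dstsize)
        return -1;
    dst[out] = '\0';
    return (int)out;
}
```

Calling `wrap_lines(enstr, 76, wrapped, sizeof(wrapped))` on the output of `b64_ntop` would produce lines of at most 76 characters, assuming `wrapped` is a caller-provided buffer large enough for the extra line breaks.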

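Without line breaks, the encoded output is about 4/3 the length of the input. When allocating memory precisely, keep in mind that the output always consists of whole 4-character groups: the exact length is ((org_len + 2) / 3) * 4 with integer division, plus 1 byte if a terminating '\0' is needed. The simpler (org_len + 3) * 4 / 3 is a slight overestimate that also leaves room for the '\0'. The shortcut org_len * 4 / 3 + 2 is not safe for every length: a 1-byte input needs 4 characters plus the '\0', that is 5 bytes, but the shortcut yields only 3.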
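The size calculation can be wrapped in two small helpers, sketched here under the assumption that no line breaks are inserted. `b64_encoded_len` and `b64_buffer_size` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of Base64 characters produced for org_len input bytes,
   not counting the terminating NUL and assuming no line breaks.
   b64_encoded_len is a hypothetical name used only for this sketch. */
static size_t b64_encoded_len(size_t org_len)
{
    assert(org_len <= SIZE_MAX / 2);    /* conservative overflow guard */
    return ((org_len + 2) / 3) * 4;     /* round up to whole 3-byte groups */
}

/* Buffer size to allocate for the encoded text, including the NUL. */
static size_t b64_buffer_size(size_t org_len)
{
    return b64_encoded_len(org_len) + 1;
}
```

For the 200-byte buffer in the test code, this gives 268 characters and a 269-byte buffer, so a call such as `malloc(b64_buffer_size(200))` would be large enough for `b64_ntop`.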


Last edited: 2017.12.03 06:18:50
