一、業(yè)務(wù)背景:
將長(zhǎng)鏈接轉(zhuǎn)化成“指定的短域名+字母數(shù)字組合”不重復(fù)的URL短地址,用于業(yè)務(wù)線的常規(guī)業(yè)務(wù)及日常推廣使用,基于此需求 需要實(shí)現(xiàn)一套簡(jiǎn)單高效的全局唯一的5位字母數(shù)字組合的短地址
在實(shí)現(xiàn)出上述需求的短地址之前 我們先討論下 有幾種辦法可以生成一個(gè)全局唯一ID供后續(xù)生成5位字母組合使用
先說(shuō)有幾種唯一id的實(shí)現(xiàn)方法:
1、數(shù)據(jù)庫(kù)自增長(zhǎng)ID
優(yōu)勢(shì):無(wú)需編碼
缺陷:直接自增吟逝,不夠隨機(jī),對(duì)外暴露信息過(guò)多赦肋。高并發(fā)下插入數(shù)據(jù)需要加入事務(wù)機(jī)制
2块攒、時(shí)間戳+隨機(jī)數(shù)
優(yōu)勢(shì):編碼簡(jiǎn)單
缺陷:隨機(jī)數(shù)存在重復(fù)問(wèn)題,即使在相同的時(shí)間戳下佃乘。每次插入數(shù)據(jù)庫(kù)前需要校驗(yàn)下是否已經(jīng)存在相同的數(shù)值
3囱井、UUID
優(yōu)勢(shì):簡(jiǎn)單
劣勢(shì):用戶不友好,索引關(guān)聯(lián)效率較低趣避、影響性能庞呕。
4、今天想聊的是twitter開(kāi)源的一款分布式自增ID算法snowflake
snowflake算法是一款本地生成的(ID生成過(guò)程不依賴(lài)任何中間件程帕,無(wú)網(wǎng)絡(luò)通信)住练,保證ID全局唯一,并且ID總體有序遞增愁拭,性能每秒生成10w+讲逛、完全可以滿足大部分的業(yè)務(wù)要求。
二岭埠、snowflake算法原理
snowflake生產(chǎn)的ID是一個(gè)18位的long型數(shù)字盏混,二進(jìn)制結(jié)構(gòu)表示如下(每部分用-分開(kāi)):
0 - 00000000 00000000 00000000 00000000 00000000 0 - 00000 - 00000 - 00000000 0000
第一位未使用蔚鸥,接下來(lái)的41位為毫秒級(jí)時(shí)間(41位的長(zhǎng)度可以使用69年,從1970-01-01
08:00:00)许赃,然后是5位datacenterId(最大支持25=32個(gè)止喷,二進(jìn)制表示從00000-11111,也即是十進(jìn)制0-31)图焰,和5位workerId(最大支持25=32個(gè)启盛,原理同datacenterId),所以datacenterId*workerId最多支持部署1024個(gè)節(jié)點(diǎn)技羔,最后12位是毫秒內(nèi)的計(jì)數(shù)(12位的計(jì)數(shù)順序號(hào)支持每個(gè)節(jié)點(diǎn)每毫秒產(chǎn)生2^12=4096個(gè)ID序號(hào)).
所有位數(shù)加起來(lái)共64位,恰好是一個(gè)Long型(轉(zhuǎn)換為字符串長(zhǎng)度為18).
單臺(tái)機(jī)器實(shí)例卧抗,通過(guò)時(shí)間戳保證前41位是唯一的藤滥,分布式系統(tǒng)多臺(tái)機(jī)器實(shí)例下,通過(guò)對(duì)每個(gè)機(jī)器實(shí)例分配不同的datacenterId和workerId避免中間的10位碰撞社裆。最后12位每毫秒從0遞增生產(chǎn)ID拙绊,再提一次:每毫秒最多生成4096個(gè)ID,每秒可達(dá)4096000個(gè)泳秀。理論上标沪,只要CPU計(jì)算能力足夠,單機(jī)每秒實(shí)測(cè)10w+嗜傅,效率之高由此可見(jiàn)金句。
三、snowflake算法源碼(PHP版)
/**
* Description of SnowFlake
* 經(jīng)典的分布式發(fā)號(hào)器 雪花算法
* @date 2018-4-19 16:23:21
*/
class SnowFlake
{
//開(kāi)始時(shí)間,固定一個(gè)小于當(dāng)前時(shí)間的毫秒數(shù)即可
const twepoch = 1482394743339;//2016/12/22 16:19:03
//機(jī)器標(biāo)識(shí)占的位數(shù)
const workerIdBits = 5;
//數(shù)據(jù)中心標(biāo)識(shí)占的位數(shù)
const datacenterIdBits = 5;
//毫秒內(nèi)自增數(shù)點(diǎn)的位數(shù)
const sequenceBits = 12;
protected $workId = 0; //當(dāng)前workid
protected $datacenterId = 0; //數(shù)據(jù)中心id
protected $maxWorkerId = 0; //最大workid
static $lastTimestamp = -1; //最新時(shí)間戳
static $sequence = 0; //發(fā)號(hào)頻率
public function __construct($workId, $datacenterId){
//機(jī)器ID范圍判斷
$maxWorkerId = -1 ^ (-1 << self::workerIdBits); //31
$this->maxWorkerId = $maxWorkerId;
if($workId > $maxWorkerId || $workId< 0){
throw new Exception("workerId can't be greater than ".$this->maxWorkerId." or less than 0");
}
//數(shù)據(jù)中心ID范圍判斷
$maxDatacenterId = -1 ^ (-1 << self::datacenterIdBits); //31
if ($datacenterId > $maxDatacenterId || $datacenterId < 0) {
throw new Exception("datacenter Id can't be greater than ".$this->maxDatacenterId." or less than 0");
}
//賦值
$this->workId = $workId;
$this->datacenterId = $datacenterId;
}
//生成一個(gè)ID
public function nextId(){
$timestamp = self::timeGen();
$lastTimestamp = self::$lastTimestamp;
//判斷時(shí)鐘是否正常
if ($timestamp < $lastTimestamp) {
throw new Exception("Clock moved backwards. Refusing to generate id for %d milliseconds", ($lastTimestamp - $timestamp));
}
//生成唯一序列
if ($lastTimestamp == $timestamp) {
$sequenceMask = -1 ^ (-1 << self::sequenceBits);
self::$sequence = (self::$sequence + 1) & $sequenceMask;
if (self::$sequence == 0) {
$timestamp = self::tilNextMillis($lastTimestamp);
}
} else {
self::$sequence = 0;
}
self::$lastTimestamp = $timestamp;
//
//時(shí)間毫秒/數(shù)據(jù)中心ID/機(jī)器ID,要左移的位數(shù)
$timestampLeftShift = self::sequenceBits + self::workerIdBits + self::datacenterIdBits; //22
$datacenterIdShift = self::sequenceBits + self::workerIdBits; //17
$workerIdShift = self::sequenceBits; //12
//組合4段數(shù)據(jù)返回: 時(shí)間戳.數(shù)據(jù)標(biāo)識(shí).工作機(jī)器.序列
$nextId = (($timestamp - (self::twepoch)) << $timestampLeftShift) |
($this->datacenterId << $datacenterIdShift) |
($this->workId << $workerIdShift) | self::$sequence;
return $nextId;
}
//取當(dāng)前時(shí)間毫秒
protected static function timeGen(){
$timestramp = (float)sprintf("%.0f", microtime(true) * 1000);
return $timestramp;
}
//取下一毫秒
protected static function tilNextMillis($lastTimestamp) {
$timestamp = self::timeGen();
while ($timestamp <= $lastTimestamp) {
$timestamp = self::timeGen();
}
return $timestamp;
}
}
四吕嘀、snowflake算法推導(dǎo)和演算過(guò)程
說(shuō)明:
演算使用的對(duì)象實(shí)例:$snowflake = new SnowFlake(1, 1);
運(yùn)行時(shí)數(shù)據(jù)workerId=1违寞,datacenterId=1,分別表示機(jī)器實(shí)例的生產(chǎn)者編號(hào)偶房,數(shù)據(jù)中心編號(hào)趁曼;
sequence=0表示每毫秒生產(chǎn)ID從0開(kāi)始計(jì)數(shù)遞增;
以下演算基于時(shí)間戳=1482394743339時(shí)刻進(jìn)行推導(dǎo)棕洋。
一句話描述:以下演算模擬了1482394743339這一毫秒時(shí)刻挡闰,workerId=1,datacenterId=1的id生成器掰盘,生產(chǎn)第一個(gè)id的過(guò)程摄悯。
(圖片轉(zhuǎn)載自:https://blog.csdn.net/li396864285/article/details/54668031)
五、派號(hào)器根據(jù)snowflake去生成唯一的5位字母數(shù)字組合的短地址
/*
* 多進(jìn)程 短地址發(fā)號(hào)器
* @date 2018-04-20 14:09
*/
public function ticketAction(){
//初始化成員屬性
$this->initializeAttribute();
//檢測(cè)隊(duì)列剩余量 大于10萬(wàn)的話不生成
$queueNum = $this->redisService->llen($this->config->redisCommonShortUrlQueue);
if($queueNum > self::lastQueueNums){
echo "Date: " . date("Y-m-d H:i:s", time()) . "the data ".$queueNum." is full".PHP_EOL;
exit();
}
for ($i = 0; $i < 10; $i ++) {
$pid = pcntl_fork();
if ($pid == -1) {
echo 'could not fork';
continue;
} else if ($pid) {
pcntl_wait($status);
} else {
//通過(guò)pipeline批量生成
$pipe = $this->redisService->multi(Redis::PIPELINE);
for($j = 0; $j < self::perForkNums;$j ++) {
$newid = $this->patchTicketAction();
//壓入隊(duì)列
$pipe->lpush($this->config->redisCommonShortUrlQueue,$newid);
}
$result = $pipe->exec();
print_r($result);
unset($result);
echo "Date: " . date("Y-m-d H:i:s", time()) . "the ".$i." child process is end".PHP_EOL;
exit();
}
}
}
/*
* 發(fā)號(hào)器生成派發(fā)
* 壓入隊(duì)列 進(jìn)入去重庫(kù)
* @date 2018-04-20 14:09
* @return string
*/
public function patchTicketAction(){
//初始化成員屬性
$this->initializeAttribute();
//先通過(guò)雪花算法 計(jì)算出一個(gè)key
$id = $this->snowflake->nextId();
//echo $id.PHP_EOL;
$secretKey = "a①d shor€t?";
$hex = md5($id.$secretKey);
$hexLen = strlen($hex);
$subHexLen = $hexLen / 8; //將長(zhǎng)網(wǎng)址md5生成32位簽名串庆杜,分為4段射众,每段8個(gè)字節(jié);
$output = array();
for ($i = 0; $i < $subHexLen; $i++) {
$subHex = substr ($hex, $i * 8, 8); //對(duì)這四段循環(huán)處理晃财,取8個(gè)字節(jié)
$int = 0x3FFFFFFF & (1 * ('0x'.$subHex)); //將他看成16進(jìn)制串與0x3fffffff(30位1)與操作叨橱,即超過(guò)30位的忽略處理
$out = '';
//將這30位分成5段
for ($j = 0; $j < 5; $j++) {
$val = 0x0000001F & $int; //每5位的數(shù)字作為字母表的索引取得特定字符典蜕,依次進(jìn)行獲得5位字符串;
$out .= $this->base32[$val];
$int = $int >> 5;
}
$output[] = $out;
}
//總的md5串可以獲得4個(gè)5位串罗洗;取里面的任意一個(gè)就可作為這個(gè)長(zhǎng)url的短url地址愉舔;
$newid = $output[mt_rand(0, 3)];
unset($output,$out);
return $newid;
}
/**
* 初始化成員屬性
* @return null
*/
public function initializeAttribute($totalProcess = 1, $pid = 1)
{
$this->workId = $pid + 1;
$this->datacenterId = 1;
$this->totalProcess = $totalProcess >= 1 ? $totalProcess : 1;
$this->pid = $pid;
//實(shí)例化redis
$this->redisService = new RedisService();
//實(shí)例化雪花算法
$this->snowflake = new SnowFlake($this->workId, $this->datacenterId);
//參與短地址計(jì)算的數(shù)組
$this->base32 = [
"a" , "b" , "c" , "d" , "e" , "f" , "g" , "h" ,
"i" , "j" , "k" , "l" , "m" , "n" , "o" , "p" ,
"q" , "r" , "s" , "t" , "u" , "v" , "w" , "x" ,
"y" , "z" , "1" , "2" , "3" , "4" , "5" , "6" ,
"7" , "8" , "9" , "A" , "B" , "C" , "D" , "E" ,
"F" , "G" , "H" , "I" , "J" , "K" , "L" , "M" ,
"N" , "O" , "P" , "Q" , "R" , "S" , "T" , "U" ,
"V" , "W" , "X" , "Y" , "Z" ];
return;
}
經(jīng)多進(jìn)程測(cè)試 上億條 沒(méi)出現(xiàn)重復(fù)
注意
多進(jìn)程下 執(zhí)行防止重復(fù) 將進(jìn)程id 傳入賦值給 workid 就可以保證單機(jī)下多進(jìn)程不重復(fù)