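Ceph RGW uploads large files using multipart upload.
Beforehand, the configuration parameter for the minimum part size was changed to 12 bytes (in RGW this is presumably the rgw_multipart_min_part_size option).
To make the upload process transparent, we use curl to implement the three logical stages of an RGW multipart upload.
Stage one, preparation: the goal is to obtain the uploadId.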
#!/bin/bash
ACCESS_KEY="test1"  # access key
SECRET_KEY="test1"  # secret key
HOST="192.168.1.29:7480"  # S3 endpoint address
BUCKET="curlbucket"  # bucket name
CONTENT_TYPE="text/plain"  # MIME type (not sent in this request)
FILENAME=/root/curltest/24  # local file path
OBJECTNAME="24"
ACL="x-amz-acl:public-read"  # object ACL (not sent in this request)
META_DATA="x-amz-meta-ukey:value"  # custom metadata (not sent in this request)
FILESIZE=$(stat -c%s "$FILENAME")
echo "$FILESIZE"
FILEMD5=$(cat "${FILENAME}" | openssl dgst -md5 -binary | openssl enc -base64)
# "?uploads" is the subresource that initiates a multipart upload
AUTH_PATH="/${BUCKET}/${OBJECTNAME}?uploads"
CURRENT_TIME=$(TZ=GMT LANG=en_US date "+%a, %d %b %Y %H:%M:%S GMT")
# String to sign: method, Content-MD5 (empty), Content-Type (empty), date, resource
stringToSign="POST\n\n\n${CURRENT_TIME}\n${AUTH_PATH}"
echo "$stringToSign"
signature=$(echo -en "${stringToSign}" | openssl sha1 -hmac "${SECRET_KEY}" -binary | base64)
echo "$signature"
curl -s -v -X POST "http://${HOST}${AUTH_PATH}" \
    -H "Authorization: AWS ${ACCESS_KEY}:${signature}" \
    -H "Date: ${CURRENT_TIME}"
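The uploadId needed by the later stages is returned in the XML body of this response. A minimal sketch of pulling it out with sed follows; the helper name extract_upload_id and the sample body are illustrative assumptions, not part of the original scripts, and the sample is only shaped like an S3 InitiateMultipartUploadResult.

```shell
#!/bin/bash
# Hypothetical helper: read an XML response body on stdin and print the
# text inside <UploadId>...</UploadId>.
extract_upload_id() {
    sed -n 's:.*<UploadId>\([^<]*\)</UploadId>.*:\1:p'
}

# Illustrative response body (assumed shape; values copied from this post).
sample='<InitiateMultipartUploadResult><Bucket>curlbucket</Bucket><Key>24</Key><UploadId>2~cKXKXvrqwcYFya_UpQqGq4ltxUAYNGV</UploadId></InitiateMultipartUploadResult>'

UPLOAD_ID=$(echo "$sample" | extract_upload_id)
echo "$UPLOAD_ID"
```

In practice the curl command above could be run without -v and piped into extract_upload_id, so the uploadId does not have to be copied by hand into the stage-two scripts.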
Stage two: upload each part of the file using the uploadId, and keep the ETag value from each response.
#!/bin/bash
ACCESS_KEY="test1"  # access key
SECRET_KEY="test1"  # secret key
HOST="192.168.1.29:7480"  # S3 endpoint address
BUCKET="curlbucket"  # bucket name
CONTENT_TYPE="text/plain"  # MIME type (not sent in this request)
FILENAME=/root/curltest/121  # local path of part 1
OBJECTNAME="24"
FILESIZE=$(stat -c%s "$FILENAME")
echo "$FILESIZE"
FILEMD5=$(cat "${FILENAME}" | openssl dgst -md5 -binary | openssl enc -base64)
# partNumber identifies this part; uploadId is the value returned in stage one
AUTH_PATH="/${BUCKET}/${OBJECTNAME}?partNumber=1&uploadId=2~cKXKXvrqwcYFya_UpQqGq4ltxUAYNGV"
CURRENT_TIME=$(TZ=GMT LANG=en_US date "+%a, %d %b %Y %H:%M:%S GMT")
stringToSign="PUT\n\n\n${CURRENT_TIME}\n${AUTH_PATH}"
echo "$stringToSign"
signature=$(echo -en "${stringToSign}" | openssl sha1 -hmac "${SECRET_KEY}" -binary | base64)
echo "$signature"
# Content-Length is the size of the part file (12 bytes in this test)
curl -s -v -X PUT "http://${HOST}${AUTH_PATH}" \
    -H "Authorization: AWS ${ACCESS_KEY}:${signature}" \
    -H "Date: ${CURRENT_TIME}" \
    -H "Content-Length: ${FILESIZE}" \
    -T "${FILENAME}"
#!/bin/bash
ACCESS_KEY="test1"  # access key
SECRET_KEY="test1"  # secret key
HOST="192.168.1.29:7480"  # S3 endpoint address
BUCKET="curlbucket"  # bucket name
CONTENT_TYPE="text/plain"  # MIME type (not sent in this request)
FILENAME=/root/curltest/122  # local path of part 2
OBJECTNAME="24"
FILESIZE=$(stat -c%s "$FILENAME")
echo "$FILESIZE"
FILEMD5=$(cat "${FILENAME}" | openssl dgst -md5 -binary | openssl enc -base64)
# Same uploadId as part 1, with partNumber=2
AUTH_PATH="/${BUCKET}/${OBJECTNAME}?partNumber=2&uploadId=2~cKXKXvrqwcYFya_UpQqGq4ltxUAYNGV"
CURRENT_TIME=$(TZ=GMT LANG=en_US date "+%a, %d %b %Y %H:%M:%S GMT")
stringToSign="PUT\n\n\n${CURRENT_TIME}\n${AUTH_PATH}"
echo "$stringToSign"
signature=$(echo -en "${stringToSign}" | openssl sha1 -hmac "${SECRET_KEY}" -binary | base64)
echo "$signature"
# Content-Length is the size of the part file (12 bytes in this test)
curl -s -v -X PUT "http://${HOST}${AUTH_PATH}" \
    -H "Authorization: AWS ${ACCESS_KEY}:${signature}" \
    -H "Date: ${CURRENT_TIME}" \
    -H "Content-Length: ${FILESIZE}" \
    -T "${FILENAME}"
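The ETag that has to be kept for each part arrives in the response headers of the PUT. One possible way to capture it, sketched below, is to feed the headers to a small filter; the helper name extract_etag and the sample headers are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read HTTP response headers on stdin and print the
# ETag value with the surrounding quotes and the trailing CR removed.
extract_etag() {
    tr -d '\r' | sed -n 's/^[Ee][Tt][Aa][Gg]: *"\{0,1\}\([^"]*\)"\{0,1\}$/\1/p'
}

# Illustrative headers (assumed shape; the ETag is the part 1 value from this post).
headers=$'HTTP/1.1 200 OK\r\nETag: "14812c00f44e41ef5233694083171b26"\r\nContent-Length: 0\r\n'

ETAG=$(printf '%s' "$headers" | extract_etag)
echo "$ETAG"
```

With curl, replacing -v by "-D - -o /dev/null" writes the response headers to stdout, so the part upload command can be piped straight into extract_etag.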
Stage three: complete the upload.
#!/bin/bash
ACCESS_KEY="test1"  # access key
SECRET_KEY="test1"  # secret key
HOST="192.168.1.29:7480"  # S3 endpoint address
BUCKET="curlbucket"  # bucket name
CONTENT_TYPE="text/plain"  # MIME type (not sent in this request)
FILENAME=/root/curltest/1.xml  # local path of the XML part list
OBJECTNAME="24"
FILESIZE=$(stat -c%s "$FILENAME")
echo "$FILESIZE"
FILEMD5=$(cat "${FILENAME}" | openssl dgst -md5 -binary | openssl enc -base64)
# Only uploadId here (no partNumber): this request completes the upload
AUTH_PATH="/${BUCKET}/${OBJECTNAME}?uploadId=2~cKXKXvrqwcYFya_UpQqGq4ltxUAYNGV"
CURRENT_TIME=$(TZ=GMT LANG=en_US date "+%a, %d %b %Y %H:%M:%S GMT")
stringToSign="POST\n\n\n${CURRENT_TIME}\n${AUTH_PATH}"
echo "$stringToSign"
signature=$(echo -en "${stringToSign}" | openssl sha1 -hmac "${SECRET_KEY}" -binary | base64)
echo "$signature"
# The XML part list is sent as the request body
curl -s -v -X POST "http://${HOST}${AUTH_PATH}" \
    -H "Authorization: AWS ${ACCESS_KEY}:${signature}" \
    -H "Date: ${CURRENT_TIME}" \
    -T "${FILENAME}"
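All of the scripts above repeat the same signing step. As a sketch, it could be factored into one function; the name sign_v2 is hypothetical, and it assumes the same conditions as the scripts: no Content-MD5, no Content-Type, and no x-amz-* headers in the request.

```shell
#!/bin/bash
# Hypothetical helper: sign_v2 METHOD DATE RESOURCE SECRET
# Builds the string "METHOD\n\n\nDATE\nRESOURCE" (empty Content-MD5 and
# Content-Type, as in the scripts above) and prints its base64 HMAC-SHA1.
sign_v2() {
    local method=$1 date=$2 resource=$3 secret=$4
    printf '%s\n\n\n%s\n%s' "$method" "$date" "$resource" \
        | openssl dgst -sha1 -hmac "$secret" -binary | base64
}

# Example call with an illustrative fixed date.
sign_v2 POST "Mon, 01 Jan 2024 00:00:00 GMT" "/curlbucket/24?uploads" "test1"
```

Using printf with %s avoids interpreting backslash sequences that might appear in the resource path, which echo -en would expand.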
Contents of the XML file
<CompleteMultipartUpload>
  <Part>
    <!-- Parts may appear in any order in this file, but each PartNumber must be paired with its own ETag. -->
    <PartNumber>1</PartNumber>
    <ETag>"14812c00f44e41ef5233694083171b26"</ETag>
  </Part>
  <Part>
    <PartNumber>2</PartNumber>
    <ETag>"610fcea2f7195f652dbef3bfc77ea30a"</ETag>
  </Part>
</CompleteMultipartUpload>
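Writing this file by hand becomes tedious with many parts. The sketch below generates the same structure from "partNumber:etag" pairs; the function name build_complete_xml and the argument format are assumptions chosen for illustration.

```shell
#!/bin/bash
# Hypothetical helper: each argument is "partNumber:etag" (etag without quotes).
# Prints a CompleteMultipartUpload document on stdout.
build_complete_xml() {
    local pair
    echo '<CompleteMultipartUpload>'
    for pair in "$@"; do
        # ${pair%%:*} is the text before the first colon, ${pair#*:} the text after it
        printf '  <Part>\n    <PartNumber>%s</PartNumber>\n    <ETag>"%s"</ETag>\n  </Part>\n' \
            "${pair%%:*}" "${pair#*:}"
    done
    echo '</CompleteMultipartUpload>'
}

build_complete_xml \
    1:14812c00f44e41ef5233694083171b26 \
    2:610fcea2f7195f652dbef3bfc77ea30a
```

Redirecting the output to a file such as 1.xml gives the request body used in stage three.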
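1. Ceph RGW assembles the object by sorting the uploaded parts by part number, from smallest to largest. The part number is a uint32 with a valid range of 1 and above. If you pass -1 or 0 as the part number, no error is returned during the upload stage; the InvalidPart error is only reported at the completion stage. Part numbers must therefore be greater than 0 (RGW has code that enforces this).
2. Part numbers do not have to be contiguous. The code contains logic for this case: if the next part in sequence cannot be found, the function simply returns.
3. Resumable upload works by fetching the parts that have already been uploaded and then uploading the remaining ones. The client has to split the file itself; s3cmd has this feature and implements it this way. Parallel upload works on the same principle: several parts can be uploaded asynchronously at the same time, and once all ETags have been recorded the completion request is sent.
4. Large-file download has not been investigated here. The Ceph L release (Luminous) now supports AWS-style torrent download, which reduces bandwidth (Baidu Baike: https://baike.baidu.com/item/BitTorrent/142795?fr=aladdin). See https://ceph.com/releases/v12-0-1-luminous-dev-released/ and https://docs.aws.amazon.com/zh_cn/AmazonS3/latest/dev/S3Torrent.html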
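For the resumable upload described in point 3, the client needs to work out which parts are still missing. A minimal sketch of that bookkeeping follows; the function name missing_parts is hypothetical, and it assumes the list of already-uploaded part numbers has been obtained separately (for example from the server's part listing for the uploadId).

```shell
#!/bin/bash
# Hypothetical helper: missing_parts TOTAL UPLOADED...
# Prints, one per line, every part number in 1..TOTAL that is not among
# the already-uploaded part numbers given as the remaining arguments.
missing_parts() {
    local total=$1
    shift
    local uploaded=" $* "   # pad with spaces so whole numbers can be matched
    local n
    for ((n = 1; n <= total; n++)); do
        [[ $uploaded == *" $n "* ]] || echo "$n"
    done
}

# Parts 1, 2 and 4 of 5 are already uploaded, so 3 and 5 remain.
missing_parts 5 1 2 4
```

Each remaining part number can then be uploaded with the stage-two script, in sequence or in parallel background jobs, before the completion request is sent.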