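Background
Recently, customers on several running projects have asked whether the file upload size limit can be raised. Users often need to upload files of several gigabytes, such as drawings and videos, and they want a real-time progress bar while a large file uploads. After a technical evaluation, we extended the framework's file upload. The extended interface supports chunked upload of large files, which reduces peak memory pressure on the server. If an upload fails, the same file can resume from the last successfully uploaded chunk (resumable upload). Uploading a file again after it has already been uploaded successfully completes without waiting (instant upload). Together these changes improve the user experience. The implementation flow is shown in the figure below.
File MD5 calculation
We compute the file MD5 with the third-party spark-md5 library. A large file can be hashed chunk by chunk and the results combined to save time, but in our tests computing the MD5 of a 1 GB file still took about 20 s. As an optimization we therefore hash only part of the file's characteristics (the first chunk + the last chunk + the file's last-modified time). This keeps the fingerprint relatively unique and takes only about 2 s, which greatly improves front-end performance. To read blocks of file content on the front end we use the HTML5 FileReader.readAsArrayBuffer method. Because it completes asynchronously, the wrapper method takes a callback function.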
createSimpleFileMD5(file, chunkSize, finishCalculate) {
    var fileReader = new FileReader();
    var blobSlice = File.prototype.slice || File.prototype.mozSlice || File.prototype.webkitSlice;
    var chunks = Math.ceil(file.size / chunkSize);
    var currentChunk = 0;
    var spark = new SparkMD5.ArrayBuffer();
    // Register the handler before the first read is started.
    fileReader.onload = function() {
        spark.append(this.result);
        if (currentChunk < chunks - 1) {
            // The first chunk is hashed; jump straight to the last chunk.
            // Comparing against chunks - 1 also avoids an endless loop for single-chunk files.
            currentChunk = chunks - 1;
            loadNext();
        } else {
            // file.lastModified (a millisecond timestamp) replaces the deprecated file.lastModifiedDate.
            var fileMD5 = hpMD5(spark.end() + file.lastModified);
            finishCalculate(fileMD5);
        }
    };
    function loadNext() {
        var start = currentChunk * chunkSize;
        var end = Math.min(start + chunkSize, file.size);
        fileReader.readAsArrayBuffer(blobSlice.call(file, start, end));
    }
    loadNext();
}
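The same sampling idea can be sketched in Java, for example if the back end ever needs to compute a comparable fingerprint. This is an illustration under assumptions, not the framework's implementation: the class `SampledFingerprint` and its method are hypothetical, it works on an in-memory byte array for brevity, and it does not produce the same value as the spark-md5 plus `hpMD5` combination above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: not part of the framework described in this article.
final class SampledFingerprint {

    // Hashes the first chunk, the last chunk and the last-modified time.
    static String fingerprint(byte[] content, int chunkSize, long lastModifiedMillis) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
        // Ceiling division: number of chunks needed to cover the content.
        int chunks = (int) ((content.length + (long) chunkSize - 1) / chunkSize);
        // First chunk (the whole content when it fits into one chunk).
        md5.update(content, 0, Math.min(chunkSize, content.length));
        if (chunks > 1) {
            int lastStart = (chunks - 1) * chunkSize;
            md5.update(content, lastStart, content.length - lastStart);
        }
        md5.update(Long.toString(lastModifiedMillis).getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        byte[] content = new byte[10];
        System.out.println(fingerprint(content, 4, 1000L));
    }
}
```

Bytes between the first and the last chunk do not influence the result. That is the trade-off behind "relatively unique": two files that differ only in the middle and share a modification time get the same fingerprint.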
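File chunk slicing
We define a chunk size and cut the file with the file.slice method supported by the Blob object. Chunk upload requests have to be sent synchronously and in order. Because synchronous requests block the front-end UI and make it unclickable, the work has to run in a Worker thread. After each chunk completes, the worker uses postMessage to notify the main page so that it can update the progress bar. Note that a Worker has no access to the window object, so avoid third-party libraries where possible and issue the requests with the native XMLHttpRequest object. The parameters the worker needs are passed in through its onmessage handler.
The page's upload method is as follows: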
upload() {
    var file = document.getElementById("file").files[0];
    if (!file) {
        alert("Please select a file to upload");
        return;
    }
    if (file.size < pageData.chunkSize) {
        alert("Please select a file larger than " + pageData.chunkSize / 1024 / 1024 + " MB");
        return;
    }
    var filesize = file.size;
    var filename = file.name;
    pageData.chunkCount = Math.ceil(filesize / pageData.chunkSize);
    this.createSimpleFileMD5(file, pageData.chunkSize, function(fileMD5) {
        console.log("Computed file MD5: " + fileMD5);
        pageData.showProgress = true;
        var worker = new Worker('worker.js');
        var param = {
            token: GetTokenID(),
            uploadUrl: uploadUrl,
            filename: filename,
            filesize: filesize,
            fileMD5: fileMD5,
            groupguid: pageData.groupguid1,
            grouptype: pageData.grouptype1,
            chunkCount: pageData.chunkCount,
            chunkSize: pageData.chunkSize,
            file: file
        };
        worker.onmessage = function(event) {
            var workresult = event.data;
            if (workresult.code == 0) {
                pageData.percent = workresult.percent;
                if (workresult.percent == 100) {
                    // All chunks uploaded: hide the progress bar and stop the worker.
                    pageData.showProgress = false;
                    worker.terminate();
                }
            } else {
                // A chunk failed: hide the progress bar and stop the worker.
                pageData.showProgress = false;
                worker.terminate();
            }
        };
        worker.postMessage(param);
    });
}
The code in worker.js is as follows:
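function FormAjax_Sync(token, data, url, callback) {
    var xmlHttp = new XMLHttpRequest();
    // false = synchronous request; acceptable only because this runs inside a worker
    xmlHttp.open("post", url, false);
    xmlHttp.setRequestHeader("token", token);
    try {
        xmlHttp.send(data);
    } catch (e) {
        // A synchronous send throws on network errors; report it as a failed chunk.
        callback({ code: -1 }, 0);
        return;
    }
    // send() has returned, so the response is complete.
    if (xmlHttp.status == 200) {
        callback(JSON.parse(xmlHttp.responseText), xmlHttp.status);
    } else {
        // Report HTTP errors too, so the caller can stop uploading.
        callback({ code: -1 }, xmlHttp.status);
    }
}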
onmessage = function(evt) {
    var data = evt.data;
    // Parameters passed in from the page
    var token = data.token;
    var uploadUrl = data.uploadUrl;
    var filename = data.filename;
    var fileMD5 = data.fileMD5;
    var groupguid = data.groupguid;
    var grouptype = data.grouptype;
    var chunkCount = data.chunkCount;
    var chunkSize = data.chunkSize;
    var filesize = data.filesize;
    var file = data.file;
    var start = 0;
    var end;
    var index = 0;
    var failed = false;
    var startTime = new Date().getTime();
    while (start < filesize && !failed) {
        end = Math.min(start + chunkSize, filesize);
        var chunk = file.slice(start, end); // slice the current chunk out of the file
        var formData = new FormData();
        formData.append("file", chunk, filename);
        formData.append("fileMD5", fileMD5);
        formData.append("chunkCount", chunkCount);
        formData.append("chunkIndex", index);
        formData.append("chunkSize", end - start);
        formData.append("groupguid", groupguid);
        formData.append("grouptype", grouptype);
        // Upload the chunk; the request is synchronous, so the callback runs before the loop continues
        FormAjax_Sync(token, formData, uploadUrl, function(result, status) {
            var code = 0;
            var percent = 0;
            if (result.code == 0) {
                console.log("Uploaded chunk " + (index + 1) + " of " + chunkCount);
                percent = Math.floor((index + 1) * 100 / chunkCount);
            } else {
                failed = true; // stop sending the remaining chunks
                code = -1;
                console.log("Chunk " + (index + 1) + " failed to upload");
            }
            self.postMessage({ code: code, percent: percent });
        });
        start = end;
        index++;
    }
    console.log("Total chunk upload time (ms): " + (new Date().getTime() - startTime));
    console.log(failed ? "Upload aborted" : "All chunks uploaded");
};
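The slicing arithmetic used by the worker can be written down on its own, which is also handy when the back end wants to validate the chunk sizes it receives. The sketch below is a minimal illustration in Java; `ChunkPlan`, `Range`, `split` and `percent` are hypothetical names and do not belong to the framework.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper mirroring the worker's slicing loop.
final class ChunkPlan {

    // Half-open byte range [start, end) of one chunk.
    static final class Range {
        final int index;
        final long start;
        final long end;

        Range(int index, long start, long end) {
            this.index = index;
            this.start = start;
            this.end = end;
        }

        long length() {
            return end - start;
        }
    }

    // Splits a file of fileSize bytes into consecutive ranges of at most chunkSize bytes.
    static List<Range> split(long fileSize, long chunkSize) {
        if (fileSize < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and chunkSize > 0");
        }
        List<Range> ranges = new ArrayList<>();
        int index = 0;
        for (long start = 0; start < fileSize; start += chunkSize) {
            long end = Math.min(start + chunkSize, fileSize);
            ranges.add(new Range(index++, start, end));
        }
        return ranges;
    }

    // Progress in percent once the chunk with this zero-based index has been stored.
    static int percent(int chunkIndex, int chunkCount) {
        return (int) ((chunkIndex + 1) * 100L / chunkCount);
    }

    public static void main(String[] args) {
        for (Range r : split(10, 4)) {
            System.out.println(r.index + ": [" + r.start + ", " + r.end + ")");
        }
    }
}
```

Only the last range can be shorter than the chunk size, which is why the worker sends `end - start` as the chunkSize form field rather than the configured chunk size.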
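Receiving file chunks
With front-end chunking covered, we now look in detail at how the back end receives and processes the chunks. Chunk handling has to support users interrupting an upload at any time as well as uploading the same file repeatedly, so we create a new table, f_attachchunk, to record the details of every chunk. The table structure is as follows: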
CREATE TABLE `f_attachchunk` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`ChunkGuid` varchar(50) NOT NULL,
`FileMD5` varchar(100) DEFAULT NULL,
`FileName` varchar(200) DEFAULT NULL,
`ChunkSize` int(11) DEFAULT NULL,
`ChunkCount` int(11) DEFAULT NULL,
`ChunkIndex` int(11) DEFAULT NULL,
`ChunkFilePath` varchar(500) DEFAULT NULL,
`UploadUserGuid` varchar(50) DEFAULT NULL,
`UploadUserName` varchar(100) DEFAULT NULL,
`UploadDate` datetime DEFAULT NULL,
`UploadOSSID` varchar(200) DEFAULT NULL,
`UploadOSSChunkInfo` varchar(1000) DEFAULT NULL,
`ChunkType` varchar(50) DEFAULT NULL,
`MergeStatus` int(11) DEFAULT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=237 DEFAULT CHARSET=utf8mb4;
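- FileMD5: file MD5, uniquely identifies the file
- FileName: file name
- ChunkSize: chunk size
- ChunkCount: total number of chunks
- ChunkIndex: index of this chunk
- ChunkFilePath: storage path of the chunk (used by the local storage scheme)
- UploadUserGuid: primary key of the uploading user
- UploadUserName: name of the uploading user
- UploadDate: upload date
- UploadOSSID: batch ID of the chunked upload (used by the cloud storage schemes)
- UploadOSSChunkInfo: information about the single uploaded chunk (used by the cloud storage schemes)
- ChunkType: how the chunk is stored (identifier for local storage, Alibaba Cloud, Huawei Cloud or Minio)
- MergeStatus: merge status of the chunk (not merged, merged)
On the back end, storing file chunks takes three steps: check the chunk => save the chunk => merge the chunks. We use local file storage as the example here. Cloud storage follows the same approach, and the corresponding API methods are given later.
Check the chunk
A chunk is uniquely identified by the combination of FileMD5 and ChunkIndex in the chunk records in the database. Because the local chunk .temp files are stored as temporary files, they may be deleted by hand to free up disk space. So even when a database record exists, we still need to check the actual file on disk.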
boolean existChunk = false;
AttachChunkDO dbChunk = attachChunkService.checkExistChunk(fileMD5, chunkIndex, "Local");
if (dbChunk != null) {
    File chunkFile = new File(dbChunk.getChunkFilePath());
    if (chunkFile.exists() && chunkFile.length() == chunkSize) {
        existChunk = true;
    } else {
        // The temp file is missing or incomplete, so the database record is stale: delete it
        attachChunkService.delete(dbChunk.getChunkGuid());
    }
}
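The chunk records also make it possible to tell, before any data is sent, whether an upload can finish instantly, resume, or has to start from scratch. The sketch below illustrates that decision in Java. It is a hypothetical pre-check, not the framework's code: `ResumePlan`, `Action`, `missingChunks` and `decide` are made-up names, and the set of stored indexes is assumed to come from the f_attachchunk rows for one FileMD5.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical pre-check built on the chunk records of one file.
final class ResumePlan {

    enum Action { INSTANT, RESUME, FRESH }

    // Zero-based indexes of the chunks that still have to be uploaded.
    static List<Integer> missingChunks(int chunkCount, Set<Integer> storedIndexes) {
        List<Integer> missing = new ArrayList<>();
        for (int i = 0; i < chunkCount; i++) {
            if (!storedIndexes.contains(i)) {
                missing.add(i);
            }
        }
        return missing;
    }

    // merged corresponds to MergeStatus = merged for this FileMD5.
    static Action decide(Set<Integer> storedIndexes, boolean merged) {
        if (merged) {
            return Action.INSTANT; // the merged file can simply be copied
        }
        return storedIndexes.isEmpty() ? Action.FRESH : Action.RESUME;
    }

    public static void main(String[] args) {
        Set<Integer> stored = new HashSet<>(Arrays.asList(0, 1, 3));
        System.out.println(decide(stored, false) + " " + missingChunks(5, stored));
    }
}
```

In the flow described here the front end still sends every chunk and the back end skips the ones it already holds. A pre-check like this would only be an optional shortcut on top of that.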
Save the chunk
Saving a chunk has two parts: the file is stored locally, and once that succeeds the chunk information is inserted into the database.
// Get the chunk upload folder from the configuration
String filePath = frameConfig.getAttachChunkPath() + "/" + fileMD5 + "/";
// Create one folder per file, named after the file MD5
File targetFile = new File(filePath);
if (!targetFile.exists()) {
    targetFile.mkdirs();
}
if (!existChunk) {
    // Save the chunk file into the folder
    String chunkFileName = fileMD5 + "-" + chunkIndex + ".temp";
    FileUtil.uploadFile(FileUtil.convertStreamToByte(fileContent), filePath, chunkFileName);
    // Insert the record into the chunk table
    AttachChunkDO attachChunkDO = new AttachChunkDO(fileMD5, fileName, chunkSize, chunkCount, chunkIndex, filePath + chunkFileName, "Local");
    attachChunkService.insert(attachChunkDO);
}
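Merge the chunks
In the chunk upload method, if the current chunk is the last one, the file is merged once that chunk has been uploaded, and the merge status in the database is updated at the same time. The next time the same file is uploaded, we can copy the previously merged file directly as the new attachment, which saves the I/O of the merge step. To merge the file we use BufferedOutputStream and BufferedInputStream with a fixed buffer size.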
if (chunkIndex == chunkCount - 1) {
    // Set the attachment's unique guid first, so the target folder and the attachment record agree
    attachGuid = CommonUtil.getNewGuid();
    attachmentDO.setAttguid(attachGuid);
    // Merge the file
    String merageFileFolder = frameConfig.getAttachPath() + groupType + "/" + attachGuid;
    File attachFolder = new File(merageFileFolder);
    if (!attachFolder.exists()) {
        attachFolder.mkdirs();
    }
    String merageFilePath = merageFileFolder + "/" + fileName;
    merageFile(fileMD5, merageFilePath);
    attachChunkService.updateMergeStatusToFinish(fileMD5);
    // Insert into the attachment library
    attachmentService.insert(attachmentDO);
}
public void merageFile(String fileMD5, String targetFilePath) throws Exception {
    String merageFilePath = frameConfig.getAttachChunkPath() + "/" + fileMD5 + "/" + fileMD5 + ".temp";
    File merageFile = new File(merageFilePath);
    if (!merageFile.exists()) {
        // The chunk list is assumed to be returned in ChunkIndex order
        List<AttachChunkDO> attachChunkDOList = attachChunkService.selectListByFileMD5(fileMD5, "Local");
        byte[] fileBuffer = new byte[1024 * 1024 * 5]; // read/write buffer, reused for every chunk
        // try-with-resources closes the streams even when a read or write fails
        try (BufferedOutputStream destOutputStream = new BufferedOutputStream(new FileOutputStream(merageFilePath))) {
            for (AttachChunkDO attachChunkDO : attachChunkDOList) {
                File file = new File(attachChunkDO.getChunkFilePath());
                try (BufferedInputStream sourceInputStream = new BufferedInputStream(new FileInputStream(file))) {
                    int readBytesLength; // number of bytes read in each pass
                    while ((readBytesLength = sourceInputStream.read(fileBuffer)) != -1) {
                        destOutputStream.write(fileBuffer, 0, readBytesLength);
                    }
                }
            }
        }
    }
    FileUtil.copyFile(merageFilePath, targetFilePath);
}
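Merging is only correct if every chunk is present and the chunks are written in index order. The sketch below isolates that part in Java and works on plain streams so it can be tried without the framework. `ChunkMerger` and its `merge` method are hypothetical names, and checking for gaps before writing is an assumption of this example, not something the method above does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: concatenates chunk streams in index order.
final class ChunkMerger {

    // Returns the number of bytes written to dest.
    static long merge(Map<Integer, ? extends InputStream> chunksByIndex, int chunkCount, OutputStream dest) {
        // Verify completeness first, so a gap never produces a truncated file.
        for (int i = 0; i < chunkCount; i++) {
            if (chunksByIndex.get(i) == null) {
                throw new IllegalStateException("missing chunk " + i);
            }
        }
        long total = 0;
        byte[] buffer = new byte[8192]; // one buffer reused for all chunks
        try {
            for (int i = 0; i < chunkCount; i++) {
                try (InputStream in = chunksByIndex.get(i)) {
                    int read;
                    while ((read = in.read(buffer)) != -1) {
                        dest.write(buffer, 0, read);
                        total += read;
                    }
                }
            }
            dest.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, InputStream> chunks = new HashMap<>();
        chunks.put(1, new ByteArrayInputStream(new byte[] {3, 4}));
        chunks.put(0, new ByteArrayInputStream(new byte[] {1, 2}));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(merge(chunks, 2, out) + " bytes merged");
    }
}
```

Iterating from 0 to chunkCount - 1 makes the order independent of how the records happen to come back from the database.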
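Chunked upload to cloud storage
The difference between uploading to cloud storage and uploading to local storage is that the chunk files are uploaded directly to the cloud, and the cloud storage API is then called to merge and copy the file. The database records and checks are largely the same.
Alibaba Cloud OSS
Before uploading chunks, generate the multipart upload identifier (uploadId) for the file.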
public String getUplaodOSSID(String key) {
    key = "chunk/" + key + "/" + key;
    TenantParams.attach appConfig = getAttach();
    OSSClient ossClient = InitOSS(appConfig);
    String bucketName = appConfig.getBucketname_auth();
    InitiateMultipartUploadRequest request = new InitiateMultipartUploadRequest(bucketName, key);
    InitiateMultipartUploadResult upresult = ossClient.initiateMultipartUpload(request);
    String uploadId = upresult.getUploadId();
    ossClient.shutdown();
    return uploadId;
}
When uploading a chunk, the uploadId must be specified. We also serialize the returned part information (PartETag) and save it to the database for the later merge.
public String uploadChunk(InputStream stream, String key, int chunkIndex, int chunkSize, String uploadId) {
    key = "chunk/" + key + "/" + key;
    String result = "";
    try {
        TenantParams.attach appConfig = getAttach();
        OSSClient ossClient = InitOSS(appConfig);
        String bucketName = appConfig.getBucketname_auth();
        UploadPartRequest uploadPartRequest = new UploadPartRequest();
        uploadPartRequest.setBucketName(bucketName);
        uploadPartRequest.setKey(key);
        uploadPartRequest.setUploadId(uploadId);
        uploadPartRequest.setInputStream(stream);
        // Set the part size. Every part except the last must be at least 100 KB.
        uploadPartRequest.setPartSize(chunkSize);
        // Set the part number. Each uploaded part has a number from 1 to 10000; outside this range OSS returns the InvalidArgument error code.
        uploadPartRequest.setPartNumber(chunkIndex + 1);
        // Parts need not be uploaded in order and may even come from different clients; OSS assembles the complete file by part number.
        UploadPartResult uploadPartResult = ossClient.uploadPart(uploadPartRequest);
        PartETag partETag = uploadPartResult.getPartETag();
        result = JSON.toJSONString(partETag);
        ossClient.shutdown();
    } catch (Exception e) {
        logger.error("OSS chunk upload failed: " + e.getMessage());
    }
    return result;
}
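Merging is done by passing in the array of saved PartETag objects. To keep each attachment independent and unique, we do not use the merged file directly; instead we create a copy of it through the API and use that.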
public boolean merageFile(String uploadId, List<PartETag> chunkInfoList, String key, AttachmentDO attachmentDO, boolean checkMerge) {
    key = "chunk/" + key + "/" + key;
    boolean result = true;
    try {
        TenantParams.attach appConfig = getAttach();
        OSSClient ossClient = InitOSS(appConfig);
        String bucketName = appConfig.getBucketname_auth();
        if (!checkMerge) {
            // Not merged yet: complete the multipart upload
            CompleteMultipartUploadRequest completeMultipartUploadRequest = new CompleteMultipartUploadRequest(bucketName, key, uploadId, chunkInfoList);
            ossClient.completeMultipartUpload(completeMultipartUploadRequest);
        }
        // Copy the merged object to the attachment's own key
        String attachKey = getKey(attachmentDO);
        ossClient.copyObject(bucketName, key, bucketName, attachKey);
        ossClient.shutdown();
    } catch (Exception e) {
        e.printStackTrace();
        logger.error("OSS file merge failed: " + e.getMessage());
        result = false;
    }
    return result;
}
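The two limits mentioned in the upload comments (part numbers from 1 to 10000, and at least 100 KB for every part except the last) constrain the chunk size the front end may choose. The sketch below turns them into a small Java check. `MultipartLimits` and its methods are hypothetical names, and the constants are taken from the comments above, not read from the SDK.

```java
// Hypothetical helper: limits are copied from the comments in the upload code.
final class MultipartLimits {

    static final int MAX_PARTS = 10_000;
    static final long MIN_PART_SIZE = 100L * 1024;

    // Maps a zero-based chunk index to a one-based part number and validates it.
    static int partNumber(int chunkIndex) {
        int number = chunkIndex + 1;
        if (number < 1 || number > MAX_PARTS) {
            throw new IllegalArgumentException("part number out of range: " + number);
        }
        return number;
    }

    // Smallest chunk size that keeps a file of fileSize bytes within MAX_PARTS parts.
    static long minChunkSize(long fileSize) {
        long needed = (fileSize + MAX_PARTS - 1) / MAX_PARTS; // ceiling division
        return Math.max(needed, MIN_PART_SIZE);
    }

    public static void main(String[] args) {
        long tenGiB = 10L * 1024 * 1024 * 1024;
        System.out.println("minimum chunk size for 10 GiB: " + minChunkSize(tenGiB) + " bytes");
    }
}
```

With chunk sizes of a few megabytes the part limit only matters for files in the tens of gigabytes, but checking it up front avoids a failure late in a long upload.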
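Huawei Cloud OBS
The Huawei Cloud API is broadly the same as the Alibaba Cloud API; only a few parameter names differ, so here is the code directly.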
public String getUplaodOSSID(String key) throws Exception {
    key = "chunk/" + key + "/" + key;
    TenantParams.attach appConfig = getAttach();
    ObsClient obsClient = InitOBS(appConfig);
    String bucketName = appConfig.getBucketname_auth();
    InitiateMultipartUploadRequest request = new InitiateMultipartUploadRequest(bucketName, key);
    InitiateMultipartUploadResult result = obsClient.initiateMultipartUpload(request);
    String uploadId = result.getUploadId();
    obsClient.close();
    return uploadId;
}
public String uploadChunk(InputStream stream, String key, int chunkIndex, int chunkSize, String uploadId) {
    key = "chunk/" + key + "/" + key;
    String result = "";
    try {
        TenantParams.attach appConfig = getAttach();
        ObsClient obsClient = InitOBS(appConfig);
        String bucketName = appConfig.getBucketname_auth();
        UploadPartRequest uploadPartRequest = new UploadPartRequest();
        uploadPartRequest.setBucketName(bucketName);
        uploadPartRequest.setUploadId(uploadId);
        uploadPartRequest.setObjectKey(key);
        uploadPartRequest.setInput(stream);
        // Widen to long before multiplying, so offsets beyond 2 GB do not overflow int
        uploadPartRequest.setOffset((long) chunkIndex * chunkSize);
        // Set the part size. Every part except the last must be at least 100 KB.
        uploadPartRequest.setPartSize((long) chunkSize);
        // Set the part number. Each uploaded part has a number from 1 to 10000; numbers outside this range are rejected.
        uploadPartRequest.setPartNumber(chunkIndex + 1);
        // Parts need not be uploaded in order and may even come from different clients; the complete file is assembled by part number.
        UploadPartResult uploadPartResult = obsClient.uploadPart(uploadPartRequest);
        PartEtag partETag = new PartEtag(uploadPartResult.getEtag(), uploadPartResult.getPartNumber());
        result = JSON.toJSONString(partETag);
        obsClient.close();
    } catch (Exception e) {
        e.printStackTrace();
        logger.error("OBS chunk upload failed: " + e.getMessage());
    }
    return result;
}
public boolean merageFile(String uploadId, List<PartEtag> chunkInfoList, String key, AttachmentDO attachmentDO, boolean checkMerge) {
    key = "chunk/" + key + "/" + key;
    boolean result = true;
    try {
        TenantParams.attach appConfig = getAttach();
        ObsClient obsClient = InitOBS(appConfig);
        String bucketName = appConfig.getBucketname_auth();
        if (!checkMerge) {
            // Not merged yet: complete the multipart upload
            CompleteMultipartUploadRequest request = new CompleteMultipartUploadRequest(bucketName, key, uploadId, chunkInfoList);
            obsClient.completeMultipartUpload(request);
        }
        // Copy the merged object to the attachment's own key
        String attachKey = getKey(attachmentDO);
        obsClient.copyObject(bucketName, key, bucketName, attachKey);
        obsClient.close();
    } catch (Exception e) {
        e.printStackTrace();
        logger.error("OBS file merge failed: " + e.getMessage());
        result = false;
    }
    return result;
}
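The byte offset of a part is the chunk index multiplied by the chunk size. For files of several gigabytes that product no longer fits into an int, so it is worth computing it in 64-bit arithmetic. The following Java sketch shows the difference; `PartOffset` and the example numbers (chunk 500 at 5 MiB per chunk) are purely illustrative.

```java
// Hypothetical demonstration of int overflow in offset arithmetic.
final class PartOffset {

    // Correct: the first operand is widened to long before the multiplication.
    static long offset(int chunkIndex, int chunkSize) {
        return (long) chunkIndex * chunkSize;
    }

    // Incorrect for large files: the multiplication happens in int and wraps around.
    static int overflowingOffset(int chunkIndex, int chunkSize) {
        return chunkIndex * chunkSize;
    }

    public static void main(String[] args) {
        int chunkSize = 5 * 1024 * 1024;
        System.out.println("long arithmetic: " + offset(500, chunkSize));
        System.out.println("int arithmetic:  " + overflowingOffset(500, chunkSize));
    }
}
```

With these numbers the long result is 2621440000, while the int version wraps to a negative value.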
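Minio
Minio is widely used for file storage, and the framework also supports a self-hosted Minio storage system. The Minio client used here offers no chunk upload API corresponding to the ones above, so after uploading the chunk files as separate objects we merge them with the composeObject method.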
public boolean uploadChunk(InputStream stream, String key, int chunkIndex) {
    boolean result = true;
    try {
        MinioClient minioClient = InitMinio();
        String bucketName = frameConfig.getMinio_bucknetname();
        // stream.available() is used as the object size; this assumes the whole chunk is already buffered
        PutObjectOptions option = new PutObjectOptions(stream.available(), -1);
        key = "chunk/" + key + "/" + key;
        // Each chunk is stored as its own object: <key>-<chunkIndex>
        minioClient.putObject(bucketName, key + "-" + chunkIndex, stream, option);
    } catch (Exception e) {
        logger.error("Minio chunk upload failed: " + e.getMessage());
        result = false;
    }
    return result;
}
public boolean merageFile(String key, int chunkCount, AttachmentDO attachmentDO, boolean checkMerge) {
    boolean result = true;
    try {
        MinioClient minioClient = InitMinio();
        String bucketName = frameConfig.getMinio_bucknetname();
        key = "chunk/" + key + "/" + key;
        if (!checkMerge) {
            // Not merged yet: compose the chunk objects, in index order, into one object
            List<ComposeSource> sourceObjectList = new ArrayList<ComposeSource>();
            for (int i = 0; i < chunkCount; i++) {
                ComposeSource composeSource = ComposeSource.builder().bucket(bucketName).object(key + "-" + i).build();
                sourceObjectList.add(composeSource);
            }
            minioClient.composeObject(ComposeObjectArgs.builder().bucket(bucketName).object(key).sources(sourceObjectList).build());
        }
        // Copy the merged object to the attachment's own key
        String attachKey = getKey(attachmentDO);
        minioClient.copyObject(
            CopyObjectArgs.builder()
                .bucket(bucketName)
                .object(attachKey)
                .source(
                    CopySource.builder()
                        .bucket(bucketName)
                        .object(key)
                        .build())
                .build());
    } catch (Exception e) {
        logger.error("Minio file merge failed: " + e.getMessage());
        result = false;
    }
    return result;
}
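The Minio variant depends on a naming convention: chunk i is stored as `chunk/<md5>/<md5>-i`, and the merge lists those names in index order. The sketch below reproduces that convention in plain Java. `ChunkObjectNames` is a hypothetical helper, and the 5 MiB lower bound for every source except the last is stated here as an assumption about composeObject that should be checked against the client version in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper mirroring the object naming used for Minio chunks.
final class ChunkObjectNames {

    // Assumed minimum size of every compose source except the last one.
    static final long MIN_COMPOSE_SOURCE = 5L * 1024 * 1024;

    // Key of the merged object for a given file MD5.
    static String baseKey(String fileMD5) {
        return "chunk/" + fileMD5 + "/" + fileMD5;
    }

    // Names of the chunk objects in the order they must be composed.
    static List<String> sources(String fileMD5, int chunkCount) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < chunkCount; i++) {
            names.add(baseKey(fileMD5) + "-" + i);
        }
        return names;
    }

    // True when a file split at chunkSize satisfies the assumed compose limit.
    static boolean composable(long fileSize, long chunkSize) {
        long chunkCount = (fileSize + chunkSize - 1) / chunkSize; // ceiling division
        return chunkCount <= 1 || chunkSize >= MIN_COMPOSE_SOURCE;
    }

    public static void main(String[] args) {
        System.out.println(sources("abc", 3));
    }
}
```

If the limit applies, choosing a chunk size of at least 5 MiB on the front end keeps every full chunk large enough to be composed.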