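Recently I have been preparing to rewrite an old project that monitors log files for changes. It was built on WatchService, which was introduced in JDK 1.7, but after using it for nearly a year I ran into several problems:
1. It cannot watch one specific file. It can only watch every file in a directory, and you have to filter the events yourself.
2. Once any metadata of the directory or of a file has been modified, no further change events are delivered to the listener.
3. The monitoring frequency cannot be controlled well.
WatchService cannot solve these three problems well, so I went looking for another solution.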
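To make the first problem concrete, here is a minimal sketch of watching a single file with WatchService. The class and method names (SingleFileWatchSketch, isTarget, pollTarget) are hypothetical and only for illustration. It shows that registration happens on the directory, and that events for the one file of interest have to be picked out by hand.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: WatchService can only register a directory,
// so filtering down to a single file is left to the caller.
class SingleFileWatchSketch {

    /** True when the event's relative path names the file we care about. */
    static boolean isTarget(Path eventContext, String targetFileName) {
        return eventContext != null
                && eventContext.getFileName().toString().equals(targetFileName);
    }

    /** Polls the directory once and returns the kinds of events that hit the target file. */
    static List<WatchEvent.Kind<?>> pollTarget(Path dir, String targetFileName, long timeoutMillis)
            throws IOException, InterruptedException {
        List<WatchEvent.Kind<?>> kinds = new ArrayList<>();
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            // Registration is per directory; there is no per-file registration.
            dir.register(watcher,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_MODIFY,
                    StandardWatchEventKinds.ENTRY_DELETE);
            WatchKey key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (key == null) {
                return kinds; // nothing happened within the timeout
            }
            for (WatchEvent<?> event : key.pollEvents()) {
                Object context = event.context(); // null for OVERFLOW events
                if (context instanceof Path && isTarget((Path) context, targetFileName)) {
                    kinds.add(event.kind());
                }
            }
            key.reset();
        }
        return kinds;
    }
}
```

A caller would typically run pollTarget in a loop. The timeout only bounds how long one poll waits, which is not the same as controlling how often changes are reported.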
I use Flume, version 1.6. When I saw the TaildirSource that ships with Flume 1.7, my eyes lit up: this was the solution I needed. Before adopting it I wanted to check for pitfalls, so I read its source code.
The TaildirSource code is simple. It consists mainly of five classes:
ReliableTaildirEventReader
TaildirMatcher
TaildirSource
TaildirSourceConfigurationConstants
TailFile
The relationships among these five classes are as follows:
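TaildirSource needs no explanation: it is the entry point that Flume starts.
In its process method,
TaildirSource calls the updateTailFiles method of ReliableTaildirEventReader to obtain all the files that need to be followed, then reads and parses them.
To follow only the relevant files, the updateTailFiles method of ReliableTaildirEventReader uses TaildirMatcher to filter out the files that are needed.
TaildirMatcher corresponds one-to-one with directories in the file system: one directory is abstracted as one TaildirMatcher, and each followed file is abstracted as a TailFile object.
TaildirSourceConfigurationConstants needs little explanation either: it only holds configuration constants.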
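The "one directory, one matcher" idea described above can be sketched roughly as follows. This is not Flume's actual TaildirMatcher code. DirectoryMatcherSketch and matchingFiles are hypothetical names, and matching file names against a regular expression is an assumption made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: one instance per directory, filtering that directory's
// files down to the ones whose names match a pattern.
class DirectoryMatcherSketch {
    private final Path dir;
    private final Pattern fileNamePattern;

    DirectoryMatcherSketch(Path dir, String fileNameRegex) {
        this.dir = dir;
        this.fileNamePattern = Pattern.compile(fileNameRegex);
    }

    /** Lists regular files directly inside the directory whose names match the pattern. */
    List<Path> matchingFiles() throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path candidate : stream) {
                String name = candidate.getFileName().toString();
                if (Files.isRegularFile(candidate) && fileNamePattern.matcher(name).matches()) {
                    result.add(candidate);
                }
            }
        }
        Collections.sort(result); // deterministic order for callers
        return result;
    }
}
```

Each path returned here would then be wrapped in whatever object tracks read progress for that file, which is the role TailFile plays in the description above.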
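With the classes and code understood, the technical basis of TaildirSource's dynamic file monitoring becomes clear: it obtains each file's inode, builds a one-to-one mapping between inode and file, reads the file with RandomAccessFile, and persists the inode, the read position, and the file path to a JSON file so that tracking can resume later.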
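The read-and-remember-position part of that mechanism can be sketched as below. This is a simplified illustration, not Flume's code. TailPositionSketch, Chunk, readFrom, and toPositionJson are hypothetical names, the 8 KB chunk size is arbitrary, and the JSON field names (inode, pos, file) are assumed from the description above.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical sketch: resume reading a file from a saved byte offset and
// produce the JSON entry that would be persisted for it.
class TailPositionSketch {
    private static final int MAX_CHUNK = 8192; // arbitrary cap per read

    /** The bytes read in one pass plus the offset to resume from next time. */
    static final class Chunk {
        final String text;
        final long nextPos;

        Chunk(String text, long nextPos) {
            this.text = text;
            this.nextPos = nextPos;
        }
    }

    static Chunk readFrom(Path file, long pos) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            // Assumption: a file shorter than the saved offset was truncated, so start over.
            long start = pos > length ? 0 : pos;
            int toRead = (int) Math.min(length - start, MAX_CHUNK);
            byte[] buffer = new byte[toRead];
            raf.seek(start);
            raf.readFully(buffer);
            return new Chunk(new String(buffer, StandardCharsets.UTF_8), start + toRead);
        }
    }

    /** Formats one position entry; only backslashes in the path are escaped here. */
    static String toPositionJson(long inode, long pos, Path file) {
        String escapedPath = file.toString().replace("\\", "\\\\");
        return "{\"inode\":" + inode + ",\"pos\":" + pos + ",\"file\":\"" + escapedPath + "\"}";
    }
}
```

Because the entry is keyed by inode rather than by path, a renamed (rotated) file can still be matched to its saved position. That is why a stable file identifier matters so much in the rest of this post.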
Note, however, that the inode is a Linux file concept. The inode is obtained in the getInode method of ReliableTaildirEventReader, and this is where we take a critical hit: (long) Files.getAttribute(file.toPath(), "unix:ino");
This line shamelessly ignores the existence of Windows.
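The platform dependency can be made explicit with a small guard. The sketch below is hypothetical (InodeLookupSketch and both method names are made up for illustration). It checks whether the file system exposes the "unix" attribute view before asking for "unix:ino", instead of failing deep inside the reader.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: look up the inode only where the "unix" attribute view exists.
class InodeLookupSketch {

    /** True when the file's file system supports the "unix" attribute view. */
    static boolean unixAttributesSupported(Path file) {
        return file.getFileSystem().supportedFileAttributeViews().contains("unix");
    }

    /** Returns the inode number, or fails fast with a clear message on other platforms. */
    static long getInode(Path file) throws IOException {
        if (!unixAttributesSupported(file)) {
            throw new UnsupportedOperationException(
                    "unix:ino is not available on this file system: " + file);
        }
        return ((Long) Files.getAttribute(file, "unix:ino")).longValue();
    }
}
```

On a typical Windows JDK the "unix" view is not in the supported set, so getInode here throws instead of returning a value. That is the gap the rest of this post tries to fill.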
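Having read the TaildirSource code, I found that it does not support Windows. What to do? Let's first think through an approach:
The overall idea of TaildirSource is to obtain an identifier for a file and record the file path that goes with it. (On Linux the inode can serve as that identifier: when the system reads a file, it actually resolves the file path to the corresponding inode and operates on that.) So the first thing to confirm is whether Windows has anything similar to an inode. This answer settles the question: Windows does have a file ID, which plays a role similar to the inode.
Next, does this file ID come with any restrictions or special characteristics?
According to this answer, the file ID depends on the file system. On FAT, the file ID may change if a file is renamed to a name longer than the old one. On NTFS, the file ID stays stable until the file is deleted.
OK, the plan can be settled. If the operating system is Windows and the file system is NTFS, we use the file ID to identify the file, and the rest of the logic is almost identical to Linux.
If the file system is FAT (which cannot happen in our working environment), we do not support it for now.
(As an aside, Windows Server 2012 added a new file system, ReFS. We basically never use it, so it was not considered. See this reference.)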
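The "Windows plus NTFS" condition in this plan could be checked with the standard FileStore API, roughly as sketched below. FileSystemTypeSketch and its methods are hypothetical names. The sketch assumes that FileStore.type() reports "NTFS" for NTFS volumes, which is implementation specific and worth verifying on the target machines.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

// Hypothetical sketch: decide whether the file-ID approach applies to a given file.
class FileSystemTypeSketch {

    static boolean isWindows() {
        return System.getProperty("os.name", "").toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Assumption: the JDK reports NTFS volumes with the type string "NTFS". */
    static boolean isNtfsType(String fileStoreType) {
        return fileStoreType != null && fileStoreType.equalsIgnoreCase("NTFS");
    }

    /** True only for files on an NTFS volume under Windows; FAT and others are rejected. */
    static boolean fileIdIsStable(Path file) throws IOException {
        // The file store is only queried on Windows, thanks to short-circuit evaluation.
        return isWindows() && isNtfsType(Files.getFileStore(file).type());
    }
}
```

A source could call fileIdIsStable once per watched directory at startup and refuse to start, or log a warning, when it returns false on Windows.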
The implementation is as follows:
It uses JNA for Java.
The code of Kernel32:
package com.creditease.ns.jna;
/* Copyright (c) 2007, 2013 Timothy Wall, Markus Karg, All Rights Reserved
*
* The contents of this file is dual-licensed under 2
* alternative Open Source/Free licenses: LGPL 2.1 or later and
* Apache License 2.0. (starting with JNA version 4.0.0).
*
* You can freely decide which license you want to apply to
* the project.
*
* You may obtain a copy of the LGPL License at:
*
* http://www.gnu.org/licenses/licenses.html
*
* A copy is also included in the downloadable source code package
* containing JNA, in file "LGPL2.1".
*
* You may obtain a copy of the Apache License at:
*
* http://www.apache.org/licenses/
*
* A copy is also included in the downloadable source code package
* containing JNA, in file "AL2.0".
*/
import com.sun.jna.Native;
import com.sun.jna.Structure;
import com.sun.jna.platform.win32.WinBase;
import com.sun.jna.platform.win32.WinNT;
import com.sun.jna.platform.win32.Wincon;
import com.sun.jna.win32.StdCallLibrary;
import com.sun.jna.win32.W32APIOptions;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
/**
* Interface definitions for <code>kernel32.dll</code>. Includes additional
* alternate mappings from {@link WinNT} which make use of NIO buffers,
* {@link Wincon} for console API.
*/
public interface Kernel32 extends StdCallLibrary, WinNT, Wincon {
/**
* The instance.
*/
Kernel32 INSTANCE = (Kernel32) Native.loadLibrary("kernel32", Kernel32.class, W32APIOptions.DEFAULT_OPTIONS);
/**
* The GetLastError function retrieves the calling thread's last-error code
* value. The last-error code is maintained on a per-thread basis. Multiple
* threads do not overwrite each other's last-error code.
*
* @return The return value is the calling thread's last-error code value.
*/
int GetLastError();
/**
* The CreateFile function creates or opens a file, file stream, directory,
* physical disk, volume, console buffer, tape drive, communications
* resource, mailslot, or named pipe. The function returns a handle that can
* be used to access an object.
*
* @param lpFileName A pointer to a null-terminated string that specifies the name
* of an object to create or open.
* @param dwDesiredAccess The access to the object, which can be read, write, or both.
* @param dwShareMode The sharing mode of an object, which can be read, write, both,
* or none.
* @param lpSecurityAttributes A pointer to a SECURITY_ATTRIBUTES structure that determines
* whether or not the returned handle can be inherited by child
* processes. If lpSecurityAttributes is NULL, the handle cannot
* be inherited.
* @param dwCreationDisposition An action to take on files that exist and do not exist.
* @param dwFlagsAndAttributes The file attributes and flags.
* @param hTemplateFile Handle to a template file with the GENERIC_READ access right.
* The template file supplies file attributes and extended
* attributes for the file that is being created. This parameter
* can be NULL.
* @return If the function succeeds, the return value is an open handle to a
* specified file. If a specified file exists before the function
* call and dwCreationDisposition is CREATE_ALWAYS or OPEN_ALWAYS, a
* call to GetLastError returns ERROR_ALREADY_EXISTS, even when the
* function succeeds. If a file does not exist before the call,
* GetLastError returns 0 (zero). If the function fails, the return
* value is INVALID_HANDLE_VALUE. To get extended error information,
* call GetLastError.
*/
HANDLE CreateFile(String lpFileName, int dwDesiredAccess, int dwShareMode,
WinBase.SECURITY_ATTRIBUTES lpSecurityAttributes,
int dwCreationDisposition, int dwFlagsAndAttributes,
HANDLE hTemplateFile);
/**
* Closes an open object handle.
*
* @param hObject Handle to an open object. This parameter can be a pseudo
* handle or INVALID_HANDLE_VALUE.
* @return If the function succeeds, the return value is nonzero. If the
* function fails, the return value is zero. To get extended error
* information, call {@code GetLastError}.
*/
boolean CloseHandle(HANDLE hObject);
/**
 * Mirrors the Win32 BY_HANDLE_FILE_INFORMATION structure. nFileIndexHigh and
 * nFileIndexLow together form the file ID; dwVolumeSerialNumber identifies
 * the volume the file lives on.
 */
class BY_HANDLE_FILE_INFORMATION extends Structure {
    public DWORD dwFileAttributes;
    public FILETIME ftCreationTime;
    public FILETIME ftLastAccessTime;
    public FILETIME ftLastWriteTime;
    public DWORD dwVolumeSerialNumber;
    public DWORD nFileSizeHigh;
    public DWORD nFileSizeLow;
    public DWORD nNumberOfLinks;
    public DWORD nFileIndexHigh;
    public DWORD nFileIndexLow;

    public static class ByReference extends BY_HANDLE_FILE_INFORMATION
            implements Structure.ByReference {
    }

    public static class ByValue extends BY_HANDLE_FILE_INFORMATION
            implements Structure.ByValue {
    }

    // The order must match the field declaration order above.
    @Override
    protected List<String> getFieldOrder() {
        return Arrays.asList("dwFileAttributes",
                "ftCreationTime", "ftLastAccessTime", "ftLastWriteTime",
                "dwVolumeSerialNumber", "nFileSizeHigh", "nFileSizeLow",
                "nNumberOfLinks", "nFileIndexHigh", "nFileIndexLow");
    }
}

/**
 * Retrieves file information for the file behind the given open handle.
 *
 * @return true if the call succeeds, false otherwise.
 */
boolean GetFileInformationByHandle(HANDLE hFile,
        BY_HANDLE_FILE_INFORMATION lpFileInformation);
}
The code of WinFileUtils:
package com.creditease.ns.jna;

import com.sun.jna.platform.win32.WinBase;
import com.sun.jna.platform.win32.WinNT;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.nio.file.Path;

public class WinFileUtils {
    private static final Logger logger = LoggerFactory.getLogger(WinFileUtils.class);
    private static final int GENERIC_READ = 0x80000000;
    private static final int FILE_SHARE_READ = 0x00000001;
    private static final WinBase.SECURITY_ATTRIBUTES SECURITY_ATTRIBUTES = null;
    private static final int OPEN_EXISTING = 3;
    // Required for CreateFile to be able to open a directory as well as a file.
    private static final int FILE_FLAG_BACKUP_SEMANTICS = 0x02000000;
    public static final String IO_ERROR = "_ERROR_";

    /**
     * Returns an identifier built from the file index and the volume serial
     * number, or IO_ERROR if the file cannot be opened or queried.
     */
    public static String getUniqueFileId(Path file) {
        WinNT.HANDLE handle = Kernel32.INSTANCE.CreateFile(file.toString(), GENERIC_READ, FILE_SHARE_READ,
                SECURITY_ATTRIBUTES, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, null);
        // CreateFile reports failure through INVALID_HANDLE_VALUE. The last-error
        // code is only meaningful after a failed call, so test the handle first.
        if (WinBase.INVALID_HANDLE_VALUE.equals(handle)) {
            logger.error("Error creating/opening file, error code: {} file path: {}",
                    Kernel32.INSTANCE.GetLastError(), file);
            return IO_ERROR;
        }
        try {
            Kernel32.BY_HANDLE_FILE_INFORMATION nfo = new Kernel32.BY_HANDLE_FILE_INFORMATION();
            // The boolean result, not the last-error code, tells whether the call succeeded.
            if (!Kernel32.INSTANCE.GetFileInformationByHandle(handle, nfo)) {
                logger.error("Error getting file information, error code: {} file path: {}",
                        Kernel32.INSTANCE.GetLastError(), file);
                return IO_ERROR;
            }
            // File index (high part, then low part) followed by the volume serial number in hex.
            return nfo.nFileIndexHigh + nfo.nFileIndexLow.toString()
                    + Integer.toHexString(nfo.dwVolumeSerialNumber.intValue());
        } finally {
            // Only a valid handle reaches this point, so it is always safe to close.
            Kernel32.INSTANCE.CloseHandle(handle);
        }
    }
}
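The code above was put together with reference to here and here.
To stress it once more: the file ID code above does not work on ReFS. For more complete support, use JNA 4.4.0. The kernel32 mapping in that version has a method called GetFileInformationByHandleEx, which supports the newer ReFS file system.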
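One last note on the identifier that getUniqueFileId builds. It joins the high and low halves of the file index as decimal strings with no separator, so in principle two different pairs (for example 1 and 23, or 12 and 3) can produce the same text. Below is a minimal sketch of an alternative that first combines the two 32-bit halves into one 64-bit value. FileIdComposeSketch and its methods are hypothetical names, and the int arguments are assumed to come from DWORD.intValue().

```java
// Hypothetical sketch: build an unambiguous identifier from the three
// BY_HANDLE_FILE_INFORMATION values used by getUniqueFileId.
class FileIdComposeSketch {

    /** Combines the high and low 32-bit halves into the 64-bit file index. */
    static long fileIndex(int high, int low) {
        // Mask the low half so a set top bit is not sign-extended into the high half.
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    /** Volume serial and file index, both in hex, joined by a separator. */
    static String uniqueId(int volumeSerial, int high, int low) {
        return Integer.toHexString(volumeSerial) + "-" + Long.toHexString(fileIndex(high, low));
    }
}
```

For instance, uniqueId(0xABCD, 1, 2) yields "abcd-100000002". Switching to a format like this would invalidate identifiers that were already persisted in the old format, so it is best done before any position files exist.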