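1. About diff
diff analyzes two files and prints the lines that differ.
In essence, it outputs a set of instructions for changing the first file so that it becomes identical to the second file.
It does not actually change the files; however, with the -e option it can generate a script for the program ed (or ex, which can also apply the changes) that carries those changes out.
2. How diff works
2.1. Example 1
We have two files, file1.txt and file2.txt.
file1.txt contains: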
I need to buy apples.
I need to run the laundry.
I need to wash the dog.
I need to get the car detailed.
file2.txt contains:
I need to buy apples.
I need to do the laundry.
I need to wash the car.
I need to get the dog detailed.
We can use the diff command to show automatically which lines differ between the two files:
diff file1.txt file2.txt
Output:
2,4c2,4
< I need to run the laundry.
< I need to wash the dog.
< I need to get the car detailed.
---
> I need to do the laundry.
> I need to wash the car.
> I need to get the dog detailed.
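Let's look at what this output means. The important thing to remember is that diff describes the differences prescriptively: it tells you how to change the first file so that it matches the second file.
The first line of the output contains:
- the line numbers in the first file,
- a letter (a: add, c: change, d: delete),
- the corresponding line numbers in the second file.
In the output above, "2,4c2,4" means: "lines 2 through 4 of the first file need to be changed to match lines 2 through 4 of the second file." diff then shows those lines from each file:
- lines starting with < come from the first file
- lines starting with > come from the second file
- the three dashes ("---") merely separate the lines of file 1 from the lines of file 2.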
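As a rough sketch of the reading rules above, the snippet below recreates the two files from this example in a scratch directory and keeps only the change-command line. The names `workdir` and `header` are illustrative assumptions, not anything diff requires.

```shell
# Recreate the two example files in a throwaway directory (names are illustrative).
workdir=$(mktemp -d)
printf '%s\n' 'I need to buy apples.' 'I need to run the laundry.' \
  'I need to wash the dog.' 'I need to get the car detailed.' > "$workdir/file1.txt"
printf '%s\n' 'I need to buy apples.' 'I need to do the laundry.' \
  'I need to wash the car.' 'I need to get the dog detailed.' > "$workdir/file2.txt"

# Change-command lines start with a digit, while content lines
# start with "<", ">" or "-", so a simple grep isolates them.
header=$(diff "$workdir/file1.txt" "$workdir/file2.txt" | grep '^[0-9]')
echo "$header"    # prints: 2,4c2,4

rm -rf "$workdir"
```

Filtering on the leading digit is only a convenience for this small example; it would also match content lines that happen to begin with a digit.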
2.2. Example 2
Let's look at another example:
file1.txt:
I need to go to the store.
I need to buy some apples.
When I get home, I'll wash the dog.
file2.txt:
I need to go to the store.
I need to buy some apples.
Oh yeah, I also need to buy grated cheese.
When I get home, I'll wash the dog.
diff file1.txt file2.txt
Output:
2a3
> Oh yeah, I also need to buy grated cheese.
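Here the output means "after line 2 of the first file, a line needs to be added: line 3 of the second file," and then shows the line to add.
Now let's see how diff reports a line that needs to be deleted.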
file1:
I need to go to the store.
I need to buy some apples.
When I get home, I'll wash the dog.
I promise.
file2:
I need to go to the store.
I need to buy some apples.
When I get home, I'll wash the dog.
Our command:
diff file1.txt file2.txt
Output:
4d3
< I promise.
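The output means "line 4 of the first file needs to be deleted so that the two files line up at line 3," and then shows the line to delete (line 4 of the first file).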
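One way to see that "a" and "d" are mirror images is to compare the same pair of files in both directions. The sketch below reuses the grated-cheese files from this example; the variable names `forward` and `reverse` are made up for illustration.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'I need to go to the store.' 'I need to buy some apples.' \
  "When I get home, I'll wash the dog." > "$workdir/file1.txt"
printf '%s\n' 'I need to go to the store.' 'I need to buy some apples.' \
  'Oh yeah, I also need to buy grated cheese.' \
  "When I get home, I'll wash the dog." > "$workdir/file2.txt"

# Forward: the second file has an extra line, so diff reports an "a" (add) command.
forward=$(diff "$workdir/file1.txt" "$workdir/file2.txt" | head -n 1)
# Reverse: the same line must now be removed, so diff reports a "d" (delete) command.
reverse=$(diff "$workdir/file2.txt" "$workdir/file1.txt" | head -n 1)
echo "$forward"   # prints: 2a3
echo "$reverse"   # prints: 3d2

rm -rf "$workdir"
```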
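2.3. Viewing diff output in context mode
The examples above show diff's default output, which is intended to be read by a computer, not by a person. For human readers, diff can also show the changes in context.
GNU diff, the version most Linux users have, offers two ways to do this: context mode and unified mode.
To view differences in context mode, use the -c option. For example: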
file1.txt:
apples
oranges
kiwis
carrots
file2.txt:
apples
kiwis
carrots
grapefruits
Let's look at the context-format output. The command is:
diff -c file1.txt file2.txt
Output:
*** file1.txt 2014-08-21 17:58:29.764656635 -0400
--- file2.txt 2014-08-21 17:58:50.768989841 -0400
***************
*** 1,4 ****
  apples
- oranges
  kiwis
  carrots
--- 1,4 ----
  apples
  kiwis
  carrots
+ grapefruits
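The first two lines of the output describe the "from" file (file1) and the "to" file (file2): the file name, the modification date, and the modification time. The "from" file is marked with "***" and the "to" file with "---".
The line "***************" is just a separator.
The next line has three asterisks ("***"), followed by a line range in the first file (here lines 1 through 4, separated by a comma), followed by four asterisks ("****").
Then diff shows the contents of those lines. If a line is unchanged, it is prefixed with two spaces; if it has changed, it is prefixed with an indicator character and a space.
The indicator characters mean:
character | meaning |
---|---|
! | Indicates that this line is part of a group of one or more lines that need to change. The other file's context contains a corresponding group of lines also prefixed with "!". |
+ | Indicates a line in the second file that needs to be added to the first file. |
- | Indicates a line in the first file that needs to be deleted. |
After the lines from the first file, there are three dashes ("---"), then a line range, then four dashes ("----"). This indicates the line range in the second file that will sync up with our changes in the first file.
If more than one section of the file needs to change, diff shows these sections one after another. Lines from the first file are still marked with "***", and lines from the second file with "---".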
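The fruit example only exercises the "-" and "+" indicators. As a small sketch of the "!" indicator, the snippet below replaces one line in place; the file names `old.txt` and `new.txt` are hypothetical.

```shell
workdir=$(mktemp -d)
printf '%s\n' apples oranges kiwis > "$workdir/old.txt"
printf '%s\n' apples pears kiwis > "$workdir/new.txt"

# diff exits with status 1 when the files differ; "|| true" keeps
# stricter shells (for example under "set -e") from stopping here.
context_out=$(diff -c "$workdir/old.txt" "$workdir/new.txt" || true)
echo "$context_out"

# A replaced line shows up in both halves of the hunk, each time prefixed with "! ".
changed_lines=$(printf '%s\n' "$context_out" | grep -c '^! ')
echo "$changed_lines"   # prints: 2

rm -rf "$workdir"
```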
2.4. Unified mode
Unified mode (the -u option) is similar to context mode, but it does not display any redundant information. Here is an example using the same input files as the previous example.
file1.txt:
apples
oranges
kiwis
carrots
file2.txt:
apples
kiwis
carrots
grapefruits
Run the command:
diff -u file1.txt file2.txt
Output:
--- file1.txt 2014-08-21 17:58:29.764656635 -0400
+++ file2.txt 2014-08-21 17:58:50.768989841 -0400
@@ -1,4 +1,4 @@
 apples
-oranges
 kiwis
 carrots
+grapefruits
The output is similar to the previous one, but the differences are "unified" into a single set.
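In the hunk header "@@ -1,4 +1,4 @@", the "-" part gives the starting line and line count in the first file and the "+" part gives the same for the second file. A quick way to see how context lines merge changes into one hunk is to ask for zero lines of context with -U 0, as in this sketch (the variable names are illustrative):

```shell
workdir=$(mktemp -d)
printf '%s\n' apples oranges kiwis carrots > "$workdir/file1.txt"
printf '%s\n' apples kiwis carrots grapefruits > "$workdir/file2.txt"

# -U 0 asks for zero lines of context, so each change gets its own hunk.
# "|| true" is there because diff exits with status 1 when the files differ.
unified_out=$(diff -U 0 "$workdir/file1.txt" "$workdir/file2.txt" || true)
echo "$unified_out"

# Count the hunk headers, which start with "@@".
hunks=$(printf '%s\n' "$unified_out" | grep -c '^@@')
echo "$hunks"   # prints: 2

rm -rf "$workdir"
```

With the default three lines of context the two changes sit close enough together to share a single hunk, which is what the output above shows.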
2.5. Comparing directories
diff can compare two directories when given their names:
diff dir1 dir2
Output:
Only in dir1: tab2.gif
Only in dir1: tab3.gif
Only in dir1: tab4.gif
Only in dir1: tape.htm
Only in dir1: tbernoul.htm
Only in dir1: tconner.htm
Only in dir1: tempbus.psd
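The listing above only contains files missing from one side. The sketch below builds two small directories that also share a changed file, and uses -q so that diff reports only which files differ. All file names here are invented for illustration.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/dir1" "$workdir/dir2"
echo 'same'  > "$workdir/dir1/common.txt"
echo 'same'  > "$workdir/dir2/common.txt"
echo 'left'  > "$workdir/dir1/changed.txt"
echo 'right' > "$workdir/dir2/changed.txt"
echo 'extra' > "$workdir/dir1/only-here.txt"

# -q reports only whether files differ; LC_ALL=C keeps the messages in English.
# "|| true" is there because diff exits with status 1 when it finds differences.
dir_out=$(cd "$workdir" && LC_ALL=C diff -q dir1 dir2 || true)
echo "$dir_out"

rm -rf "$workdir"
```

With GNU diff this should print one "Files ... differ" line for changed.txt and one "Only in dir1" line for only-here.txt; identical files are not mentioned.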
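2.6. Using diff to create an editing script
The -e option makes diff output a script: a sequence of commands that the editing programs ed or ex can use. The commands are a combination of c (change), a (add), and d (delete); when executed by the editor, they modify the contents of file1 to match the contents of file2.
Suppose we have two files: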
file1.txt:
Once upon a time, there was a girl named Persephone.
She had black hair.
She loved her mother more than anything.
She liked to sit outside in the sunshine with her cat, Daisy.
She dreamed of being a painter when she grew up.
file2.txt:
Once upon a time, there was a girl named Persephone.
She had red hair.
She loved chocolate chip cookies more than anything.
She liked to sit outside in the sunshine with her cat, Daisy.
She would look up into the clouds and dream of being a world-famous baker.
We can run the following command to analyze the two files and produce a script that turns the contents of file1 into a file with the same contents as file2.
diff -e file1.txt file2.txt
The output looks like this:
5c
She would look up into the clouds and dream of being a world-famous baker.
.
2,3c
She had red hair.
She loved chocolate chip cookies more than anything.
.
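Notice that the changes are listed in reverse order: changes closer to the end of the file come first, and changes closer to the beginning come last. This ordering preserves the line numbers; if we made the changes at the beginning of the file first, they could shift the line numbers later in the file. So the script starts at the end and works backward.
The script tells the editing program: "change line 5 to the following line, and change lines 2 through 3 to the following two lines."
Next, we should save the script to a file, using the ">" operator to redirect the output: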
diff -e file1.txt file2.txt > my-ed-script.txt
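This command displays nothing on the screen (unless an error occurs); instead, the output is redirected to a file named my-ed-script.txt. If the file does not exist, it is created; if it already exists, it is overwritten.
Then we check the contents of this file: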
cat my-ed-script.txt
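and find that it matches the output shown above.
One step is still missing, though: the script needs to tell ed to actually write the file. All that is missing is the w command, which writes the changes. We can add it to the file by echoing the letter "w" and using the append operator ">>". (">>" is like ">": it redirects output to a file, but appends to the end of the file instead of overwriting it.) The command is: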
echo "w" >> my-ed-script.txt
Now we can check the script's contents:
cat my-ed-script.txt
5c
She would look up into the clouds and dream of being a world-famous baker.
.
2,3c
She had red hair.
She loved chocolate chip cookies more than anything.
.
w
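Now it is time for ed to make the changes and write them to disk. How do we get ed to do that?
We can have ed execute the script with the command below, which overwrites the original file. The dash ("-"), an older spelling of the -s option, tells ed to suppress its diagnostic messages, and the "<" operator redirects the script into ed's standard input. In essence, everything in the script is fed to the editing program ed as its input. The command is: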
ed - file1.txt < my-ed-script.txt
This command displays nothing. Let's look at the contents of the original file:
cat file1.txt
Once upon a time, there was a girl named Persephone.
She had red hair.
She loved chocolate chip cookies more than anything.
She liked to sit outside in the sunshine with her cat, Daisy.
She would look up into the clouds and dream of being a world-famous baker.
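We can see that file1 now matches file2 exactly.
Warning: in this example, ed overwrote the original file1. After the script runs, the original contents of file1 are gone, so make sure you understand what you are doing before running these commands!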
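One cautious variation on the steps above, offered only as a sketch, is to apply the script to a copy so that the original survives. The two-line files below are shortened stand-ins for the story files, and `patched.txt`, `script.ed` and `result` are hypothetical names; the snippet skips the ed step when ed is not installed.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'She had black hair.' 'She liked her cat, Daisy.' > "$workdir/file1.txt"
printf '%s\n' 'She had red hair.' 'She liked her cat, Daisy.' > "$workdir/file2.txt"

if command -v ed >/dev/null 2>&1; then
  # Work on a copy so that the original file1.txt is left untouched.
  cp "$workdir/file1.txt" "$workdir/patched.txt"
  # Generate the ed script (diff exits with status 1 when the files differ),
  # then append the "w" command so that ed writes the result.
  diff -e "$workdir/file1.txt" "$workdir/file2.txt" > "$workdir/script.ed" || true
  echo w >> "$workdir/script.ed"
  # -s suppresses ed's byte-count messages.
  ed -s "$workdir/patched.txt" < "$workdir/script.ed"
  # cmp -s is silent and exits 0 only when the files are byte-for-byte identical.
  if cmp -s "$workdir/patched.txt" "$workdir/file2.txt"; then
    result=identical
  else
    result=different
  fi
else
  result=skipped   # ed is not installed on this system
fi
echo "$result"

rm -rf "$workdir"
```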
2.7. Side-by-side display with diff -y
Here is an example that uses diff with the -y option to show the differences between two files side by side:
file1.txt:
apples
oranges
kiwis
carrots
file2.txt:
apples
kiwis
carrots
grapefruits
diff -y file1.txt file2.txt
Output:
apples                          apples
oranges                       <
kiwis                           kiwis
carrots                         carrots
                              > grapefruits
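For longer files the identical rows can drown out the interesting ones. As a sketch, the -W and --suppress-common-lines options from the option list can be combined with -y to narrow the output and keep only the rows that differ (the width of 40 is an arbitrary choice):

```shell
workdir=$(mktemp -d)
printf '%s\n' apples oranges kiwis carrots > "$workdir/file1.txt"
printf '%s\n' apples kiwis carrots grapefruits > "$workdir/file2.txt"

# -W 40 limits the output to 40 columns; --suppress-common-lines hides identical rows.
# "|| true" is there because diff exits with status 1 when the files differ.
side_out=$(diff -y -W 40 --suppress-common-lines \
  "$workdir/file1.txt" "$workdir/file2.txt" || true)
echo "$side_out"

# Only the "oranges <" row and the "> grapefruits" row remain.
rows=$(printf '%s\n' "$side_out" | wc -l | tr -d ' ')
echo "$rows"   # prints: 2

rm -rf "$workdir"
```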
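3. Common diff options
Here are some useful diff options to take note of:
Option | Meaning |
---|---|
-b | Ignore changes that only alter the amount of whitespace (spaces or tabs) |
-w | Ignore whitespace entirely |
-B | Ignore blank lines when calculating differences |
-y | Display the output in columns |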
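The difference between -b and -w is easy to miss, so here is a minimal sketch that compares three one-line files and records diff's exit status for each pair (0 means no differences were reported). The file and variable names are hypothetical.

```shell
workdir=$(mktemp -d)
printf 'one two\n'   > "$workdir/single.txt"   # one space between the words
printf 'one   two\n' > "$workdir/wide.txt"     # three spaces between the words
printf 'onetwo\n'    > "$workdir/joined.txt"   # no space at all

# -b treats runs of whitespace as equivalent, so these two compare equal.
b_amount=0
diff -b "$workdir/single.txt" "$workdir/wide.txt" >/dev/null || b_amount=$?
# -b still reports a difference when the whitespace disappears completely.
b_removed=0
diff -b "$workdir/single.txt" "$workdir/joined.txt" >/dev/null || b_removed=$?
# -w ignores all whitespace, so even this pair compares equal.
w_removed=0
diff -w "$workdir/single.txt" "$workdir/joined.txt" >/dev/null || w_removed=$?

echo "$b_amount $b_removed $w_removed"   # prints: 0 1 0

rm -rf "$workdir"
```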
These are only some of the most commonly used diff options. Below is the complete list of diff's options and functions:
4. diff option list
diff [OPTION]... FILES
Options | Meaning |
---|---|
--normal | Output a "normal" diff, which is the default. |
-q, --brief | Produce output only when the files differ. If there are no differences, output nothing. |
-s, --report-identical-files | Report when two files are identical. |
-c, -C NUM, --context[=NUM] | Provide NUM (default 3) lines of context. |
-u, -U NUM, --unified[=NUM] | Provide NUM (default 3) lines of unified context. |
-e, --ed | Output an ed script. |
-n, --rcs | Output an RCS-format diff. |
-y, --side-by-side | Output in two columns. |
-W, --width=NUM | Output at most NUM (default 130) print columns. |
--left-column | Output only the left column of common lines. |
--suppress-common-lines | Do not output lines common between the two files. |
-p, --show-c-function | For files that contain C code, also show each C function change. |
-F, --show-function-line=RE | Show the most recent line matching regular expression RE. |
--label LABEL | When displaying output, use the label LABEL instead of the file name. This option can be issued more than once for multiple labels. |
-t, --expand-tabs | Expand tabs to spaces in output. |
-T, --initial-tab | Make tabs line up by prepending a tab if necessary. |
--tabsize=NUM | Define a tab stop as NUM (default 8) columns. |
--suppress-blank-empty | Suppress spaces or tabs before empty output lines. |
-l, --paginate | Pass output through pr to paginate. |
-r, --recursive | Recursively compare any subdirectories found. |
-N, --new-file | If a specified file does not exist, perform the diff as if it is an empty file. |
--unidirectional-new-file | Same as -N, but only applies to the first file. |
--ignore-file-name-case | Ignore case when comparing file names. |
--no-ignore-file-name-case | Consider case when comparing file names. |
-x, --exclude=PAT | Exclude files that match file name pattern PAT. |
-X, --exclude-from=FILE | Exclude files that match any file name pattern in file FILE. |
-S, --starting-file=FILE | Start with file FILE when comparing directories. |
--from-file=FILE1 | Compare FILE1 to all operands; FILE1 can be a directory. |
--to-file=FILE2 | Compare all operands to FILE2; FILE2 can be a directory. |
-i, --ignore-case | Ignore case differences in file contents. |
-E, --ignore-tab-expansion | Ignore changes due to tab expansion. |
-b, --ignore-space-change | Ignore changes in the amount of white space. |
-w, --ignore-all-space | Ignore all white space. |
-B, --ignore-blank-lines | Ignore changes whose lines are all blank. |
-I, --ignore-matching-lines=RE | Ignore changes whose lines all match regular expression RE. |
-a, --text | Treat all files as text. |
--strip-trailing-cr | Strip trailing carriage return on input. |
-D, --ifdef=NAME | Output merged file with "#ifdef NAME" diffs. |
--GTYPE-group-format=GFMT | Format GTYPE input groups with GFMT. |
--line-format=LFMT | Format all input lines with LFMT. |
--LTYPE-line-format=LFMT | Format LTYPE input lines with LFMT. |
These format options provide fine-grained control over the output of diff, generalizing -D/--ifdef. LTYPE is old, new, or unchanged. GTYPE can be any of the LTYPE values, or the value changed.
GFMT (but not LFMT) may contain:
Symbol | Meaning |
---|---|
%< | Lines from FILE1. |
%> | Lines from FILE2. |
%= | Lines common to FILE1 and FILE2. |
%[-][WIDTH][.[PREC]]{doxX}LETTER | printf-style spec for LETTER. |
LETTERs are as follows for new group, lower case for old group:
Letter | Meaning |
---|---|
F | First line number. |
L | Last line number. |
N | Number of lines = L - F + 1. |
E | F - 1 |
M | L + 1 |
%(A=B?T:E) | If A equals B then T else E. |
LFMT (only) may contain:
Symbol | Meaning |
---|---|
%L | Contents of line. |
%l | Contents of line, excluding any trailing newline. |
%[-][WIDTH][.[PREC]]{doxX}n | printf-style spec for input line number. |
Both GFMT and LFMT may contain:
Symbol | Meaning |
---|---|
%% | A literal %. |
%c'C' | The single character C. |
%c'\OOO' | The character with octal code OOO. |
C | The character C (other characters represent themselves). |
The remaining options are:
Options | Meaning |
---|---|
-d, --minimal | Try hard to find a smaller set of changes. |
--horizon-lines=NUM | Keep NUM lines of the common prefix and suffix. |
--speed-large-files | Assume large files and many scattered small changes. |
--help | Display a help message and exit. |
-v, --version | Output version information and exit. |
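The LTYPE line formats described earlier in this list are easier to grasp with a small example. The sketch below assumes GNU diff and uses the three LTYPE variants (old, new, unchanged) to print a bare list of removed and added lines; the output layout is an arbitrary choice for illustration.

```shell
workdir=$(mktemp -d)
printf '%s\n' apples oranges kiwis carrots > "$workdir/file1.txt"
printf '%s\n' apples kiwis carrots grapefruits > "$workdir/file2.txt"

# Print nothing for unchanged lines, "-" before old lines and "+" before new lines.
# %L stands for the contents of the line, including its trailing newline.
# "|| true" is there because diff exits with status 1 when the files differ.
custom_out=$(diff --unchanged-line-format='' --old-line-format='-%L' \
  --new-line-format='+%L' "$workdir/file1.txt" "$workdir/file2.txt" || true)
echo "$custom_out"

rm -rf "$workdir"
```

With GNU diff this should print "-oranges" followed by "+grapefruits".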
FILES takes the form "FILE1 FILE2" or "DIR1 DIR2" or "DIR FILE..." or "FILE... DIR".
If the --from-file or --to-file option is given, there are no restrictions on FILES. If a FILE is a dash ("-"), diff reads from standard input.
The exit status is 0 if the inputs are the same, 1 if they differ, and 2 if diff runs into any trouble.
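These exit statuses make diff convenient to use in scripts. The following sketch wraps the check in a hypothetical helper, `describe_diff`, which is not part of diff; it simply maps the three statuses to words.

```shell
workdir=$(mktemp -d)
printf 'alpha\n' > "$workdir/a.txt"
printf 'alpha\n' > "$workdir/same.txt"
printf 'beta\n'  > "$workdir/b.txt"

# describe_diff is a hypothetical helper that maps diff's exit status to a word.
describe_diff() {
  rc=0
  diff -q "$1" "$2" >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo identical ;;
    1) echo different ;;
    *) echo trouble ;;
  esac
}

same=$(describe_diff "$workdir/a.txt" "$workdir/same.txt")
changed=$(describe_diff "$workdir/a.txt" "$workdir/b.txt")
# A missing operand counts as trouble, so diff exits with status 2 here.
missing=$(describe_diff "$workdir/a.txt" "$workdir/no-such-file.txt")
echo "$same $changed $missing"   # prints: identical different trouble

rm -rf "$workdir"
```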
5. Related commands
- bdiff — Identify the differences between two very big files.
- cmp — Compare two files byte by byte.
- comm — Compare two sorted files line by line.
- dircmp — Compare the contents of two directories, listing unique files.
- ed — A simple text editor.
- pr — Format a text file for printing.
- ls — List the contents of a directory or directories.
- sdiff — Compare two files, side-by-side.
6. References
Translated from:
https://www.computerhope.com/unix/udiff.htm
For details on specific parameters and options, see also:
http://www.runoob.com/linux/linux-comm-diff.html