翻譯自:Trieste tutorial: Analyzing trajectories using PLUMED
目標(biāo)
主要是熟悉PLUMED語句绢要,并用來寫簡單的CV(collective variable)進(jìn)行分析已有的軌跡
流程
- 寫一個簡單的PLUMED輸入文件,使用PLUMED
driver
來分析軌跡 - 使用
GROUP
關(guān)鍵字進(jìn)行壓縮和快速的構(gòu)建復(fù)雜的原子組 - 打印
PRINT
CV變量例如距離(DISTANCE)拗小,轉(zhuǎn)角(TORSION),回轉(zhuǎn)半徑(GYRATION)和坐標(biāo)數(shù)(?COORDINATION) - 計算原子組幾何中心CENTER
- 了解如何處理周期性條件重罪,使用WHOLEMOLECULES 和WRAPAROUND,能夠確認(rèn)結(jié)果DUMPATOMS.
- 釋放滿足特別條件的快照,使用關(guān)鍵字UPDATE_IF.
資源包
在線軌跡包:
wget https://plumed.github.io/doc-v2.4/user-doc/html/tutorial-resources/trieste-1.tar.gz
文件包含如下內(nèi)容:
- ref.pdb: RNA雙鏈在含有MG離子的水盒子內(nèi)
- traj-whole.xtc: 軌跡文件,周期性處理過后
- traj-broken.xtc: 原始GROMACS文件
RNA如下(chimera做圖):
[圖片上傳中...(p-2.png-d02ac1-1553349235876-0)]
PLUMED輸入文件例子
一個簡單的輸入例子:
# 告知VIM這是一個plumed格式的文件剿配,若用vim編輯器
# vim: ft=plumed
# 計算原子1和10之間的距離.
# 設(shè)置了一個"d"的變量.
d: DISTANCE ATOMS=1,10
# 創(chuàng)建原子20到30之間的中心的虛擬原子位點(diǎn).
# 設(shè)置了一個"center"的變量.
center: CENTER ATOMS=20,30
# 計算 1, 10, 20, 和 center之間的轉(zhuǎn)角.
# 注意的是虛擬位點(diǎn)可以用于真實(shí)的原子
# 設(shè)置了一個"phi"的變量.
phi: TORSION ATOMS=1,10,20,center
# 計算一些功能函數(shù)
# 設(shè)置了一個"d2"的變量搅幅,計算的為cos phi
d2: MATHEVAL ...
ARG=phi FUNC=cos(x)
PERIODIC=NO
...
# 上面的多行可以寫成一行.
# d2: MATHEVAL ARG=phi FUNC=cos(x) PERIODIC=NO
# 打印d,d2,每10步到一個文件命名為 "COLVAR1".
PRINT ARG=d,d2 STRIDE=10 FILE=COLVAR1
# 打印 phi 到另外一個稱之為"COLVAR2" 的文件呼胚,每 100 步.
PRINT ARG=phi STRIDE=100 FILE=COLVAR2
PS:給的名稱例如:d: DISTANCE ATOMS=1,10
等價于DISTANCE ATOMS=1,10
然后我們就可以運(yùn)行進(jìn)行分析了:
- 拷貝上面的PLUMED輸入文件保存盏筐,例如
plumed.dat
- 運(yùn)行plumed driver
--mf_xtc traj.xtc --plumed plumed.dat
運(yùn)行后會得到COLVAR1和COLVAR2兩個文件。當(dāng)然還不需要進(jìn)行計算砸讳,練習(xí)從下面才正式開始:
練習(xí)1:計算和打印CV
一些原子選擇可以通過MOLINFO關(guān)鍵字來進(jìn)行
,我們來看給的例子,其中_FILL_
是需要自己進(jìn)行修改的:
要求如下:該段就不翻譯了:
The gyration radius of the solute RNA molecule (GYRATION). Look in the ref.pdb file which are the atoms that are part of RNA (search for the first occurrence of a water molecule, residue name SOL). Remember that you don't need to list all the atoms: instead of ATOMS=1,2,3,4,5 you can write ATOMS=1-5.
The torsional angle (TORSION) corresponding to the glycosidic chi angle χ of the first nucleotide. Since this nucleotide is a purine (guanine), the proper atoms to compute the torsion are O4' C1 N9 C4. Find their serial number in the ref.pdb file or learn how to select a special angle reading the MOLINFO documentation.
The total number of contacts (COORDINATION) between all RNA atoms and all water oxygens. For COORDINATION, set reference distance R_0 to 2.5 A (be careful with units!!). Try to be smart in selecting the water oxygens without listing all of them explicitly.
Same as before but against water hydrogen. Also in this case you should be smart to select water hydrogens. Documentation of GROUP might help.
Distance between the Mg ion and the geometric center of the RNA duplex (use CENTER and DISTANCE).
需要完整制作的目標(biāo)如下:
# First load information about the molecule.
MOLINFO __FILL__
# Notice that this is special kind of "action" ("setup action")
# that is only used during setup. It will not be re-executed at each step.
# Define some group that will make the rest of the input more readable
# Here are the atoms belonging to RNA.
rna: GROUP ATOMS=1-258
# This is the Mg ion. A group with atom is also useful!
mg: GROUP ATOMS=6580
# This group should contain all the atoms belonging to water molecules.
wat: GROUP ATOMS=__FILL__
# Select water oxygens only:
owat: GROUP __FILL__
# Select water hydrogens only:
hwat: GROUP __FILL__
# Compute gyration radius:
r: GYRATION ATOMS=__FILL__
# Compute the Chi torsional angle:
c: TORSION ATOMS=__FILL__
# Compute coordination of RNA with water oxygens
co: COORDINATION GROUPA=rna GROUPB=owat R_0=__FILL__
# Compute coordination of RNA with water hydrogens
ch: COORDINATION GROUPA=rna GROUPB=hwat __FILL__
# Compute the geometric center of the RNA molecule:
ce: CENTER ATOMS=__FILL__
# Compute the distance between the Mg ion and the RNA center:
d: DISTANCE ATOMS=__FILL__
# Print the collective variables on COLVAR file
# No STRIDE means "print for every step"
PRINT ARG=r,c,co,ch,d FILE=COLVAR
注意界牡,以上的.dat
文件中的FILL需要補(bǔ)全后再運(yùn)行
plumed driver --plumed plumed.dat --mf_xtc whole.xtc
創(chuàng)建的COLVAR文件大致這個樣子:
#! FIELDS time r c co ch d
#! SET min_c -pi
#! SET max_c pi
0.000000 0.788694 -2.963150 207.795793 502.027244 0.595611
1.000000 0.804101 -2.717302 208.021688 499.792595 0.951945
2.000000 0.788769 -2.939333 208.347867 500.552127 1.014850
3.000000 0.790232 -2.940726 211.274315 514.749124 1.249502
4.000000 0.796395 3.050949 212.352810 507.892198 2.270682
這個文件可以使用gnuplot查看:
gnuplot> p "COLVAR" u 1:2, "" u 1:3
[圖片上傳中...(p-3.png-78b106-1553349198098-0)]
以下是我的設(shè)置簿寂,供參考:
# First load information about the molecule.
MOLINFO STRUCTURE=ref.pdb MOLTYPE=rna
# Notice that this is special kind of "action" ("setup action")
# that is only used during setup. It will not be re-executed at each step.
# Define some group that will make the rest of the input more readable
# Here are the atoms belonging to RNA.
rna: GROUP ATOMS=1-258
# This is the Mg ion. A group with atom is also useful!
mg: GROUP ATOMS=6580
# This group should contain all the atoms belonging to water molecules.
wat: GROUP ATOMS=259-6579
# Select water oxygens only:
owat: GROUP ATOMS=wat
# Select water hydrogens only:
hwat: GROUP ATOMS=wat REMOVE=owat
# Compute gyration radius:
r: GYRATION ATOMS=rna
# Compute the Chi torsional angle:
c: TORSION ATOMS=@chi-1
# Compute coordination of RNA with water oxygens
co: COORDINATION GROUPA=rna GROUPB=owat R_0=0.25
# Compute coordination of RNA with water hydrogens
ch: COORDINATION GROUPA=rna GROUPB=hwat R_0=0.25
# Compute the geometric center of the RNA molecule:
ce: CENTER ATOMS=rna
# Compute the distance between the Mg ion and the RNA center:
d: DISTANCE ATOMS=ce,mg
# Print the collective variables on COLVAR file
# No STRIDE means "print for every step"
PRINT ARG=r,c,co,ch,d FILE=COLVAR
我的結(jié)果如下:
#! FIELDS time r c co ch d
#! SET min_c -pi
#! SET max_c pi
0.000000 0.788694 -2.963150 206.199511 498.826274 0.595611
1.000000 0.804101 -2.717302 206.427138 496.592883 0.951945
2.000000 0.788769 -2.939333 206.744720 497.333112 1.014850
3.000000 0.790232 -2.940726 209.694868 511.577231 1.249502
4.000000 0.796395 3.050949 210.755463 504.675747 2.270682
兩個COORDINATION有點(diǎn)不一樣,原因不詳
附加:結(jié)合多個CV
# Distance between atoms 1 and 2:
d1: DISTANCE ATOMS=1,2
# Distance between atoms 1 and 3:
d2: DISTANCE ATOMS=1,3
# Distance between atoms 1 and 4:
d3: DISTANCE ATOMS=1,4
# Compute the sum of the squares of those three distances:
c: COMBINE ARG=d1,d2,d3 POWERS=2 PERIODIC=NO
# Sort the three distances:
s: SORT ARG=d1,d2,d3
# Notice that SORT creates a compund object with three components:
# s.1: the smallest distance
# s.2: the middle distance
# s.3: the largest distance
p: MATHEVAL ARG=d1,d2,d3 FUNC=x*y*z PERIODIC=NO
# Print the sum of the squares and the largest among the three distances:
PRINT FILE=COLVAR ARG=c,s.3
當(dāng)有多個距離的時候可以使用部分正則表達(dá)式宿亡,比如
s: SORT ARG=d1,d2,d3
可以修改為:
s: SORT ARG=(d.)
MATHEVAL是一個函數(shù)功能的實(shí)現(xiàn)常遂,非常好用。挽荠!
練習(xí)1b:結(jié)合CV
目的:
- The sum of the distances between Mg and each of the phosphorous atoms.
- The distance between Mg and the closest phosphorous atom.
要查看亞磷酸原子可以簡單的使用shell:
grep ATOM ref.pdb | grep " P " | awk '{print $2}'
以上腳本主要是用到了管道,grep和awk
其中g(shù)rep先搜尋ATOM克胳,然后得到的結(jié)果交付給搜尋P
(注意空格),最后的行去第二行
結(jié)果如下:
33
67
101
165
196
227
練習(xí)如下:
# First load information about the molecule.
MOLINFO __FILL__
# Define some group that will make the rest of the input more readable
mg: GROUP ATOMS=6580 # a group with one atom is also useful!
# Distances between Mg and phosphorous atoms:
d1: DISTANCE ATOMS=mg,33
d2: DISTANCE __FILL__
__FILL__
d6: DISTANCE __FILL__
# You can use serial numbers, but you might also use MOLINFO strings
# Compute the sum of these distances
c: COMBINE __FILL__
# Compute the distance between Mg and the closest phosphorous atom
s: SORT __FILL__
# Print the requested variables
PRINT FILE=COLVAR __FILL__
結(jié)果類似如下:
#! FIELDS time c s.1
0.000000 6.655622 0.768704
1.000000 7.264049 0.379416
2.000000 7.876489 0.817820
3.000000 8.230621 0.380191
4.000000 13.708759 2.046935
以下是我的設(shè)置圈匆,供參考:
# First load information about the molecule.
MOLINFO STRUCTURE=ref.pdb MOLTYPE=rna
# Define some group that will make the rest of the input more readable
mg: GROUP ATOMS=6580 # a group with one atom is also useful!
# Distances between Mg and phosphorous atoms:
d1: DISTANCE ATOMS=mg,33
d2: DISTANCE ATOMS=mg,67
d3: DISTANCE ATOMS=mg,101
d4: DISTANCE ATOMS=mg,165
d5: DISTANCE ATOMS=mg,196
d6: DISTANCE ATOMS=mg,227
# You can use serial numbers, but you might also use MOLINFO strings
# Compute the sum of these distances
c: COMBINE ARG=(d.) PERIODIC=NO
# Compute the distance between Mg and the closest phosphorous atom
s: SORT ARG=(d.)
# Print the requested variables
PRINT FILE=COLVAR ARG=c,s.1
我的結(jié)果如下:
#! FIELDS time c s.1
0.000000 6.655622 0.768704
1.000000 7.264049 0.379416
2.000000 7.876489 0.817820
3.000000 8.230621 0.380191
4.000000 13.708759 2.046935
解決周期性問題
PLUMED可以使用DUMPATOMS
來轉(zhuǎn)存(dump)內(nèi)部儲存的原子坐標(biāo)漠另,主要可能用到的用途如下:
- To dump coordinates of virtual atoms that only exist within PLUMED (e.g. a CENTER).
- To dump snapshots of our molecule conditioned to some value of some collective variable (see UPDATE_IF).
- To dump coordinates of atoms that have been moved by PLUMED.
我們可以用VMD來查看GROMACS未處理周期性的軌跡(其實(shí)我感覺處理周期性完全可以在GROMACS中處理后交由PLUMED完成后續(xù)分析)
vmd ref.pdb traj-broken.xtc
[圖片上傳失敗...(image-f36160-1553349180464)]
練習(xí)2:解決周期性問題并且轉(zhuǎn)存原子坐標(biāo)
要求如下:
- 保證RNA完整,詳細(xì)參考WHOLEMOLECULES
- 提供一個模板進(jìn)行比對(structure reference.pdb).具體查看FIT_TO_TEMPLATE,使用
TYPE=OPTIMAL
.如若使用ref.pdb
跃赚,需要移除非RNA的原子 - 完整的RNA和Mg離子笆搓,但是包括的快照為Mg和殘基8的亞磷酸(P)小于4A的距離. 使用UPDATE_IF選擇執(zhí)行框架
- 整個RNA雙鏈體加上水分子和mg離子纏繞在雙鏈體的中心。 先用CENTER計算雙螺旋的中心纬傲,然后用WRAPAROUND包裹分子满败。 確保移動后個別水分子不會破碎!
練習(xí)需要填充的配置文件如下:
# First load information about the molecule.
MOLINFO __FILL__
# Define here the groups that you need.
# Same as in the previous exercise.
rna: GROUP ATOMS=__FILL__
mg: GROUP ATOMS=__FILL__
wat: GROUP ATOMS=__FILL__
# Make RNA duplex whole.
WHOLEMOLECULES __FILL__
# Dump first trajectory in gro format.
# Notice that PLUMED understands the format based on the file extension
DUMPATOMS ATOMS=rna FILE=rna-whole.gro
# Align RNA duplex to a reference structure
# This should not be the ref.pdb file but a new file with only RNA atoms.
FIT_TO_TEMPLATE REFERENCE=__FILL__ TYPE=OPTIMAL
# Notice that before using FIT_TO_TEMPLATE we used WHOLEMOLECULES to make RNA whole
# This is necessary otherwise you would align a broken molecule!
# Dump the aligned RNA on a separate file
DUMPATOMS ATOMS=rna FILE=rna-aligned.gro
# Compute the distance between the Mg and the Phosphorous from residue 8
d: DISTANCE ATOMS=mg,__FILL__ ## put the serial number of the correct phosphorous here
# here we only dump frames conditioned to the value of d
UPDATE_IF ARG=d __FILL__
DUMPATOMS ATOMS=rna,mg FILE=rna-select.gro
UPDATE_IF ARG=d __FILL__ # this command is required to close the UPDATE_IF above
# compute the center of the RNA molecule
center: CENTER ATOMS=rna
# Wrap atoms correctly
WRAPAROUND ATOMS=mg AROUND=__FILL__
WRAPAROUND ATOMS=wat AROUND=center __FILL__ # anything missing here?
# Dump the last trajectory
DUMPATOMS ATOMS=rna,wat,mg FILE=rna-wrap.gro
我的設(shè)置與結(jié)果
首先創(chuàng)建一個僅含RNA的pdb文件
grep -v "SOL" ref.pdb | grep -v "MG" > refrna.pdb
我的配置如下:
# First load information about the molecule.
MOLINFO STRUCTURE=ref.pdb MOLTYPE=rna
# Define here the groups that you need.
# Same as in the previous exercise.
rna: GROUP ATOMS=1-258
mg: GROUP ATOMS=6580
wat: GROUP ATOMS=259-6579
# Make RNA duplex whole.
WHOLEMOLECULES ENTITY0=1-258
# Dump first trajectory in gro format.
# Notice that PLUMED understands the format based on the file extension
DUMPATOMS ATOMS=rna FILE=rna-whole.gro
# Align RNA duplex to a reference structure
# This should not be the ref.pdb file but a new file with only RNA atoms.
FIT_TO_TEMPLATE REFERENCE=refrna.pdb TYPE=SIMPLE
# Notice that before using FIT_TO_TEMPLATE we used WHOLEMOLECULES to make RNA whole
# This is necessary otherwise you would align a broken molecule!
# Dump the aligned RNA on a separate file
DUMPATOMS ATOMS=rna FILE=rna-aligned.gro
# Compute the distance between the Mg and the Phosphorous from residue 8
d: DISTANCE ATOMS=mg,227 ## put the serial number of the correct phosphorous here
# here we only dump frames conditioned to the value of d
UPDATE_IF ARG=d LESS_THAN=0.5
DUMPATOMS ATOMS=rna,mg FILE=rna-select.gro
UPDATE_IF ARG=d END # this command is required to close the UPDATE_IF above
# compute the center of the RNA molecule
center: CENTER ATOMS=rna
# Wrap atoms correctly
WRAPAROUND ATOMS=mg AROUND=rna
WRAPAROUND ATOMS=wat AROUND=center GROUPBY=3 # anything missing here?
# Dump the last trajectory
DUMPATOMS ATOMS=rna,wat,mg FILE=rna-wrap.gro
執(zhí)行命令:
plumed driver --plumed plumed.dat --mf_xtc broken.xtc
我使用的是**TYPE=SIMPLE **叹括,但是教程上建議的是 OPTIMAL算墨,但是我一使用就報錯,暫時不知道原因汁雷,也不知道自己哪里錯了净嘀,不知道是不是bug
以下為我的結(jié)果:
[圖片上傳中...(p-5.png-8cf0a5-1553349256150-0)]