- If you want to practice at your own pace, you can download related materials from the following website: Biostar data site.
- I have only skimmed through the whole book so far. It is a good book for newcomers: it explains what bioinformatics is and teaches some basic commands. If you want to learn more, the authors also provide useful resources and links that you can refer to. The book guides you through bioinformatics in a systematic way, and the best way to learn is by practicing.
- What is Bioinformatics?
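Bioinformatics makes sense of biological data by using computational methods. Most bioinformatics work falls into the following four categories:
- Assembly: genome assembly, building a new genome
- Resequencing: aligning sequences against a known genome to identify mutations and variants
- Classification: determining the population composition of a group of organisms
- Quantification: using DNA sequencing to measure functional characteristics within cells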
- "pwd": show the current working directory. If you want to reuse the returned value, store it in a variable, e.g. DATA_PATH=${PWD}
- "|": the pipe sign. Very useful when you want to achieve a simple goal in several steps, with the output of each step feeding the next.
- Keep your folders well organized so that they are easy to remember and use.
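To make the three points above concrete, here is a small sketch. The folder names (data, results, scripts) and the file chroms.txt are my own made-up examples, not something the book prescribes, and everything happens in a throwaway temporary directory.

```shell
# Hypothetical project layout inside a temporary directory
PROJECT_DIR=$(mktemp -d)
mkdir -p "$PROJECT_DIR/data" "$PROJECT_DIR/results" "$PROJECT_DIR/scripts"

# Remember where the data lives so later commands can refer to it
cd "$PROJECT_DIR/data"
DATA_PATH=${PWD}

# Made-up input: one chromosome name per line
printf 'chr1\nchr2\nchr1\nchr3\n' > "$DATA_PATH/chroms.txt"

# Three small steps joined by pipes: sort, drop duplicates, count lines
N_UNIQUE=$(sort "$DATA_PATH/chroms.txt" | uniq | wc -l | tr -d ' ')
echo "$N_UNIQUE unique chromosome names in $DATA_PATH"
```

Each command in the pipeline does one simple thing, which is why pipes are so handy for quick checks.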
- parallel: use multiple processes to finish similar tasks, e.g.:
mkdir -p ~/tmp/fastq && cd ~/tmp/fastq
touch GSE89245.txt
# seq -w pads the numbers with leading zeros to equal width (compare "seq -w 1 10" and "seq 1 10")
for i in $(seq -w 86 95); do echo "SRP0921""$i" >> GSE89245.txt; done
# run one fastq-dump job per accession listed in GSE89245.txt
cat GSE89245.txt | parallel fastq-dump --outdir sra --split-files {}
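If you want to try the idea without downloading anything, the following is a dry-run sketch. It swaps in `xargs -P` for GNU parallel (a different tool, used here only because it is more likely to be installed), the RUN IDs are made up, and `echo` stands in for the real download command.

```shell
# Build a list of made-up run IDs; seq -w pads 8 and 9 to 08 and 09
WORK_DIR=$(mktemp -d)
for i in $(seq -w 8 11); do echo "RUN$i"; done > "$WORK_DIR/ids.txt"

# Run up to 4 jobs at a time, one per input line.
# echo is a placeholder for the real command (e.g. fastq-dump).
xargs -P 4 -I {} echo "would download {}" < "$WORK_DIR/ids.txt" \
  | sort > "$WORK_DIR/jobs.txt"

cat "$WORK_DIR/jobs.txt"
```

The output is sorted because jobs running side by side can finish in any order.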
- view and combine files
# for regular files
cat file1 file2 file... >> bigfile
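# for gzipped files: gzip streams can be concatenated directly
cat file1.gz file2.gz filen.gz >> bigfile.gz
# or decompress them and combine into a plain-text file
zcat file1.gz file2.gz filen.gz >> bigfile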
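A quick way to convince yourself that gzipped files can be joined with plain `cat` is the sketch below. The file names are made up, and I use `gzip -dc` instead of `zcat` because `zcat` behaves differently on some systems (macOS, for example).

```shell
WORK_DIR=$(mktemp -d)
cd "$WORK_DIR"

# Two tiny made-up text files, each compressed separately
printf 'line1\n' > part1.txt
printf 'line2\n' > part2.txt
gzip -c part1.txt > part1.gz
gzip -c part2.txt > part2.gz

# Concatenate the compressed files without decompressing them
cat part1.gz part2.gz > combined.gz

# Decompress to stdout: both parts come back in order
COMBINED=$(gzip -dc combined.gz)
echo "$COMBINED"
```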
- The $PATH environment variable
echo $PATH
echo 'export PATH=/file/path/of/real/programs:$PATH' >> ~/.bashrc
source ~/.bashrc
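To see what putting a directory on $PATH actually does, here is a small sketch that only changes the current shell session and does not touch ~/.bashrc. The tool name hello_tool and its directory are made up for the example.

```shell
# A temporary directory holding a made-up executable
TOOLS_DIR=$(mktemp -d)
printf '#!/bin/sh\necho "hello from my tool"\n' > "$TOOLS_DIR/hello_tool"
chmod +x "$TOOLS_DIR/hello_tool"

# Prepend the directory so the shell searches it first
export PATH="$TOOLS_DIR:$PATH"

# The tool can now be found and run by name alone
FOUND=$(command -v hello_tool)
OUTPUT=$(hello_tool)
echo "$FOUND -> $OUTPUT"
```

Once this works in a session, appending the same export line to ~/.bashrc makes it permanent.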
- "grep" command: usually combined with "cat" or "zcat", "|", and "cut -f" to extract certain columns, filter the lines, and pass the result to downstream analysis.
man grep
cat SGD_features.tab | cut -f 2,3,4 | grep ORF | grep -v Dubious | wc -l # example: count ORF lines not marked Dubious
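If you do not have SGD_features.tab at hand, the same pipeline can be tried on a tiny mock file. The four rows below are invented and only imitate a tab-separated layout with a feature type in column 2 and a qualifier in column 3, so treat the layout as an assumption.

```shell
WORK_DIR=$(mktemp -d)
FEATURES="$WORK_DIR/mock_features.tab"

# Invented rows: id, feature type, qualifier, name (tab-separated)
printf 'S1\tORF\tVerified\tGENE_A\n'         >  "$FEATURES"
printf 'S2\tORF\tDubious\tGENE_B\n'          >> "$FEATURES"
printf 'S3\ttRNA_gene\t\tGENE_C\n'           >> "$FEATURES"
printf 'S4\tORF\tUncharacterized\tGENE_D\n'  >> "$FEATURES"

# Keep columns 2-4, keep lines mentioning ORF, drop Dubious ones, count
ORF_COUNT=$(cut -f 2,3,4 "$FEATURES" | grep ORF | grep -v Dubious | wc -l | tr -d ' ')
echo "non-dubious ORFs: $ORF_COUNT"
```

Note that `grep ORF` matches the text anywhere on the line, not just in the feature-type column, which is fine for a quick count but worth remembering.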
- "sed": replace strings with new values. Very useful when renaming multiple files with similar patterns.
man sed
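As a small illustration of the renaming use case, here is a sketch that renames made-up FASTQ files in a temporary directory. The file names and the _R1 to _read1 pattern are my own examples.

```shell
WORK_DIR=$(mktemp -d)
cd "$WORK_DIR"

# Made-up files that share a naming pattern
touch sample1_R1.fastq sample2_R1.fastq

# sed computes the new name; mv does the actual renaming
for f in *_R1.fastq; do
  new=$(echo "$f" | sed 's/_R1\.fastq$/_read1.fq/')
  mv "$f" "$new"
done

ls "$WORK_DIR"
```

The `s/old/new/` expression is the part worth memorizing. The `$` anchors the match to the end of the name, and `\.` matches a literal dot.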
- "awk" command: this one is a little complicated, so use online resources to learn more.
man awk
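To give a first taste of awk, here is a minimal sketch on a made-up table of gene names with start and end positions. The file and its three-column layout are assumptions for the example only.

```shell
WORK_DIR=$(mktemp -d)
GENES="$WORK_DIR/genes.tab"

# Invented rows: name, start, end (tab-separated)
printf 'geneA\t100\t400\n'   >  "$GENES"
printf 'geneB\t500\t600\n'   >> "$GENES"
printf 'geneC\t1000\t1600\n' >> "$GENES"

# Pattern { action }: print name and length for genes longer than 250
LONG_GENES=$(awk -F '\t' '$3 - $2 > 250 { print $1, $3 - $2 }' "$GENES")
echo "$LONG_GENES"

# Accumulate a running sum over all rows and print it at the end
TOTAL=$(awk -F '\t' '{ sum += $3 - $2 } END { print sum }' "$GENES")
echo "total length: $TOTAL"
```

awk splits every line into fields ($1, $2, ...), which makes it a natural fit for the column-based files that are common in bioinformatics.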
Since I spent most of my time skimming through this book, I will write more about the grep/sed/awk commands in the future. I hope you find this useful.