Tidyverse course: ggplot Code
clp
29 June, 2020
前言
本教程包含tidyverse
課程的ggplot
部分幻燈片中顯示的所有代碼。并提供了課程中使用的練習(xí)的答案着降。
加載必要的包及內(nèi)置數(shù)據(jù)集
library("tidyverse")
#> Warning: package 'ggplot2' was built under R version 3.6.2
#> Warning: package 'tibble' was built under R version 3.6.2
#> Warning: package 'tidyr' was built under R version 3.6.2
#> Warning: package 'purrr' was built under R version 3.6.2
#> Warning: package 'dplyr' was built under R version 3.6.2
library("ggplot2")
msleep
#> # A tibble: 83 x 11
#> name genus vore order conservation sleep_total sleep_rem sleep_cycle awake
#> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Chee… Acin… carni Carn… lc 12.1 NA NA 11.9
#> 2 Owl … Aotus omni Prim… <NA> 17 1.8 NA 7
#> 3 Moun… Aplo… herbi Rode… nt 14.4 2.4 NA 9.6
#> 4 Grea… Blar… omni Sori… lc 14.9 2.3 0.133 9.1
#> 5 Cow Bos herbi Arti… domesticated 4 0.7 0.667 20
#> 6 Thre… Brad… herbi Pilo… <NA> 14.4 2.2 0.767 9.6
#> 7 Nort… Call… carni Carn… vu 8.7 1.4 0.383 15.3
#> 8 Vesp… Calo… <NA> Rode… <NA> 7 NA NA 17
#> 9 Dog Canis carni Carn… domesticated 10.1 2.9 0.333 13.9
#> 10 Roe … Capr… herbi Arti… lc 3 NA NA 21
#> # … with 73 more rows, and 2 more variables: brainwt <dbl>, bodywt <dbl>
class(msleep)
#> [1] "tbl_df" "tbl" "data.frame"
清洗數(shù)據(jù)
從msleep
中刪除NA
值
msleep %>% filter(!is.na(vore)) -> msleep.clean
msleep.clean
#> # A tibble: 76 x 11
#> name genus vore order conservation sleep_total sleep_rem sleep_cycle awake
#> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Chee… Acin… carni Carn… lc 12.1 NA NA 11.9
#> 2 Owl … Aotus omni Prim… <NA> 17 1.8 NA 7
#> 3 Moun… Aplo… herbi Rode… nt 14.4 2.4 NA 9.6
#> 4 Grea… Blar… omni Sori… lc 14.9 2.3 0.133 9.1
#> 5 Cow Bos herbi Arti… domesticated 4 0.7 0.667 20
#> 6 Thre… Brad… herbi Pilo… <NA> 14.4 2.2 0.767 9.6
#> 7 Nort… Call… carni Carn… vu 8.7 1.4 0.383 15.3
#> 8 Dog Canis carni Carn… domesticated 10.1 2.9 0.333 13.9
#> 9 Roe … Capr… herbi Arti… lc 3 NA NA 21
#> 10 Goat Capri herbi Arti… lc 5.3 0.6 NA 18.7
#> # … with 66 more rows, and 2 more variables: brainwt <dbl>, bodywt <dbl>
繪制散點(diǎn)圖(Scatterplot)
基礎(chǔ)圖形
ggplot(
msleep.clean,
aes(x=bodywt, y=sleep_total)
)+geom_point() -> scatterplot
scatterplot
其實(shí),我們可以直接從過(guò)濾通過(guò)管道符%>%
進(jìn)入ggplot
蓄喇,而不需要保存中間數(shù)據(jù)直接出圖(這一波操作很秀刃鳄!)。
msleep %>%
filter(!is.na(vore)) %>%
ggplot(
aes(x=bodywt, y=sleep_total)
)+geom_point()
然后是各種美化策略
加點(diǎn)顏色
ggplot(
msleep.clean,
aes(x=bodywt, y=sleep_total, colour=vore)
)+geom_point()
另一種加顏色的方法
ggplot(
msleep.clean,
aes(x=bodywt, y=sleep_total)
)+geom_point(aes(colour=vore))
看上去這個(gè)數(shù)據(jù)比較集中在X軸的2000以內(nèi)钱骂,現(xiàn)在這個(gè)顯示不出更更多信息叔锐,要做到一圖勝千言,那就要對(duì)X軸進(jìn)行數(shù)據(jù)轉(zhuǎn)換见秽,常用的方法有取對(duì)數(shù)值(with log: log axis)
對(duì)X軸取取對(duì)數(shù)展示
ggplot(
msleep.clean,
aes(x=bodywt, y=sleep_total, colour=vore)
)+geom_point() -> scatterplot
scatterplot+scale_x_log10()
一個(gè)字:秀愉烙!
另一個(gè)展示方法
ggplot(
msleep.clean,
aes(x=log(bodywt), y=sleep_total,colour=vore)
)+geom_point()
展示更大的點(diǎn)、軸和圖形標(biāo)題
ggplot(
msleep.clean,
aes(x=log(bodywt), y=sleep_total,colour=vore)
) +
geom_point(size=4) +
xlab("Log Body Weight") +
ylab("Total Hours Sleep") +
ggtitle("Some Sleep Data") -> scatterplot
scatterplot
更換顯示主題
theme_set(theme_bw(base_size=18))
scatterplot+theme(plot.title = element_text(hjust = 0.5)) -> scatterplot
scatterplot
更改x軸和y軸上的配色方案和刻度解取,并改進(jìn)圖例 這將添加到先前的圖形中步责,而不是重新創(chuàng)建它。
scatterplot +
scale_colour_brewer(
palette="Set1",
name="Trophic levels",
labels=c("Carnivore", "Herbivore", "Insectivore", "Omnivore")
) +
scale_x_continuous(breaks=-5:10) +
scale_y_continuous(breaks=seq(0,20, 2)) -> scatterplot
scatterplot
手動(dòng)更改顏色
scatterplot +
scale_color_manual(
values=c("chocolate3", "chartreuse3", "darkorchid2","cyan3"),
name="Trophic levels",
labels=c("Carnivore", "Herbivore", "Insectivore", "Omnivore")
) -> scatterplot
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
scatterplot
超喜歡這個(gè)“Insectivore”顏色
接下來(lái)就是練習(xí)題了
練習(xí)1
文件up_down_expression sion.txt
包含一個(gè)表達(dá)矩陣禀苦,該數(shù)據(jù)集帶有一個(gè)額外的列勺择,該列將行分類為3組(上調(diào)、下調(diào)或不變)伦忠。 加載up_down_expression sion.txt
檢查下文件的結(jié)構(gòu)并繪制散點(diǎn)圖geom_point()
:紅色表示上調(diào)省核,藍(lán)色下調(diào)灰色不變, - 主標(biāo)題:表達(dá)數(shù)據(jù)
- 顏色圖例:下調(diào)昆码,不變气忠,上調(diào)
- 軸標(biāo)簽:條件1
和條件2
expression <- read_tsv("up_down_expression.txt")
expression
#> # A tibble: 5,196 x 4
#> Gene Condition1 Condition2 State
#> <chr> <dbl> <dbl> <chr>
#> 1 A4GNT -3.68 -3.44 unchanging
#> 2 AAAS 4.55 4.39 unchanging
#> 3 AASDH 3.72 3.48 unchanging
#> 4 AATF 5.08 5.02 unchanging
#> 5 AATK 0.471 0.560 unchanging
#> 6 AB015752.4 -3.68 -3.59 unchanging
#> 7 ABCA7 3.45 3.83 unchanging
#> 8 ABCA9-AS1 -3.68 -3.59 unchanging
#> 9 ABCC11 -3.53 -1.86 unchanging
#> 10 ABCC3 0.931 3.26 up
#> # … with 5,186 more rows
expression.scatter<-ggplot(expression, aes(Condition1, Condition2, colour=State))+
geom_point()+
scale_colour_manual(values=c("blue", "grey", "red"),
name="State",
labels=c("Down", "Unchanging", "Up"))+
xlab("Condition 1") +
ylab("Condition 2") +
ggtitle("Expression data")+
theme(plot.title = element_text(hjust = 0.5))
expression.scatter
現(xiàn)在,讓我們嘗試另一種類型的圖:條形圖赋咽。它類似于散點(diǎn)圖旧噪,但x變量本質(zhì)上是定性的或絕對(duì)的。
Stripchart
ggplot(
msleep.clean,
aes(vore, sleep_total)
)+geom_point()
抖動(dòng)脓匿,變大淘钟,上色 jitter, bigger points and colours
ggplot(
msleep.clean,
aes(vore,sleep_total, colour=vore)
) + geom_point(size=4,position="jitter")
調(diào)節(jié)抖動(dòng)的范圍
ggplot(
msleep.clean,
aes(vore, sleep_total, colour=vore)
) +
geom_jitter(
width = .2,
size=4
) -> stripchart
stripchart
為平均值添加一條線,并為y軸添加標(biāo)題
stripchart +
stat_summary(
fun.y="mean",
geom='errorbar',
aes(ymin=..y.., ymax=..y..),
width=0.6,
size=1.5,
colour="grey25"
) -> stripchart
#> Warning: `fun.y` is deprecated. Use `fun` instead.
stripchart
箱式圖的雛形
一小段計(jì)算平均值(mean)和標(biāo)準(zhǔn)誤(SEm)的tidyverse
式的代碼
msleep.clean %>%
group_by(vore) %>%
summarise(sleep=mean(sleep_total), sem=sd(sleep_total)/sqrt(n()))
#> # A tibble: 4 x 3
#> vore sleep sem
#> <chr> <dbl> <dbl>
#> 1 carni 10.4 1.07
#> 2 herbi 9.51 0.862
#> 3 insecti 14.9 2.65
#> 4 omni 10.9 0.659
繼續(xù)美化
stripchart +
ylab("Total Hours Sleep") +
xlab("Trophic Levels") +
ggtitle("Some Sleep Data") +
scale_y_continuous(breaks=seq(0, 20, 2)) +
scale_x_discrete(labels=c("Carnivore", "Herbivore", "Insectivore", "Omnivore")) +
theme(legend.position = "none") -> stripchart
stripchart
同前
stripchart +
scale_colour_brewer(palette="Dark2")+
scale_x_discrete(
limit=c("insecti","omni","carni", "herbi"),
labels=c("Insectivore", "Herbivore", "Carnivore", "Omnivore"))+
theme(plot.title = element_text(hjust = 0.5)
) -> stripchart
## Scale for 'x' is already present. Adding another scale for 'x', which
## will replace the existing scale.
stripchart
library("ggthemes")
## Warning: package 'ggthemes' was built under R version 3.5.3
stripchart+
theme_wsj()+
scale_colour_wsj("colors6")+
theme(legend.position = "none")+
theme(plot.title = element_text(hjust = 0.5))
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
現(xiàn)在陪毡,讓我們嘗試一些其他的數(shù)據(jù)米母。DownloadFestival數(shù)據(jù)記錄了為期三天的音樂(lè)節(jié)期間810名音樂(lè)會(huì)觀眾的hygiene scores(0-5)。
讀入數(shù)據(jù)進(jìn)行預(yù)處理
read_csv("DownloadFestival.csv") -> festival.data
festival.data
#> # A tibble: 810 x 5
#> ticknumb gender day1 day2 day3
#> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 2111 Male 2.64 1.35 1.61
#> 2 2229 Female 0.97 1.41 0.290
#> 3 2338 Male 0.84 NA NA
#> 4 2384 Female 3.03 NA NA
#> 5 2401 Female 0.88 0.08 NA
#> 6 2405 Male 0.85 NA NA
#> 7 2467 Female 1.56 NA NA
#> 8 2478 Female 3.02 NA NA
#> 9 2490 Male 2.29 NA NA
#> 10 2504 Female 1.11 0.44 0.55
#> # … with 800 more rows
max(festival.data$day1)
#> [1] 3.69
Histogram
ggplot(
festival.data,
aes(day1)
)+geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
稍美化一下
ggplot(
festival.data,
aes(day1)
)+geom_histogram(binwidth=0.3)
進(jìn)一步美化
ggplot(
festival.data,
aes(day1)
)+geom_histogram(binwidth=0.3, color="black", fill="yellow")+
labs(x="Score", y="Counts")+
theme(plot.title = element_text(hjust = 0.5))+
ggtitle("Hygiene at Day 1") -> Day1Histogram
Day1Histogram
現(xiàn)在我們想要畫出所有3天每個(gè)性別情況毡琉。所以對(duì)數(shù)據(jù)進(jìn)行清洗(reshape)铁瞒。我們還將刪除NAs
。
festival.data %>%
gather(day,score,-ticknumb,-gender) -> festival.data.stack
festival.data.stack %>% filter(!is.na(score)) -> festival.data.stack
festival.data.stack
#> # A tibble: 1,197 x 4
#> ticknumb gender day score
#> <dbl> <chr> <chr> <dbl>
#> 1 2111 Male day1 2.64
#> 2 2229 Female day1 0.97
#> 3 2338 Male day1 0.84
#> 4 2384 Female day1 3.03
#> 5 2401 Female day1 0.88
#> 6 2405 Male day1 0.85
#> 7 2467 Female day1 1.56
#> 8 2478 Female day1 3.02
#> 9 2490 Male day1 2.29
#> 10 2504 Female day1 1.11
#> # … with 1,187 more rows
ggplot(festival.data.stack,aes(score))+
geom_histogram(binwidth=0.3, color="black", fill="yellow")+
labs(x="Hygiene score", y="Counts")+
facet_grid(gender~day) -> histogram.3days
histogram.3days
可以修改小分面(facets)的標(biāo)簽桅滋。下面是一些示例慧耍。
histogram.3days<-ggplot(festival.data.stack,aes(score))+
geom_histogram(binwidth=0.3, color="black", fill="yellow")+
labs(x="Hygiene score", y="Counts")+
facet_grid(gender~day)+
theme(strip.text.x = element_text(size = 16, colour = "purple", face="bold"),
strip.text.y = element_text(size=12, face="bold"))
histogram.3days
密度圖
density.3days<-ggplot(festival.data.stack, aes(score))+
geom_density(aes(group=day, fill=day), alpha=0.5)+
facet_grid(~gender)
density.3days
練習(xí) 2: Plot a stripchart representing all 3 days and each gender
stripchart <-ggplot(festival.data.stack, aes(gender, score, colour=gender))+
facet_grid(~day)+
geom_point(position="jitter")+
scale_colour_manual(values=c("darkorange", "darkorchid4"))+
stat_summary(geom='errorbar',fun.y=mean, aes(ymin=..y.., ymax=..y..),
colour="black", width=0.8, size=1.5)+
labs(x="Gender", y="Score")+
theme(legend.position = "none")
#> Warning: `fun.y` is deprecated. Use `fun` instead.
stripchart
從條形圖中,我們可以為平均值或任何其他描述性地理位置添加一條線作為統(tǒng)計(jì)匯總。
具體操作分2步:
stripchart<-ggplot(festival.data.stack, aes(gender, score,colour=gender))+facet_grid(~day)+
geom_point(position="jitter")+
scale_colour_manual(values=c("darkorange", "darkorchid4"))+
labs(x="Gender", y="Score")+
theme(legend.position = "none")
stripchart
加均值線
stripchart+
stat_summary(fun.y="mean",geom="errorbar", aes(ymin=..y.., ymax=..y..), width=0.8, colour="black", size = 1.3)
#> Warning: `fun.y` is deprecated. Use `fun` instead.
加個(gè)框框
stripchart+
geom_boxplot(alpha=0, colour="black")
進(jìn)一步美化
stripchart+
geom_boxplot(aes(gender, score, fill=gender), alpha=0.5, colour="black")+
scale_fill_manual(values=c("darkorange", "darkorchid4"))
說(shuō)到把圖表做得更漂亮芍碧,我們可以改進(jìn)箱式圖(boxplot)煌珊。
如果需要,我們可以更改x軸上的順序:boxplot+scale_x_discrete(limits=c(“Male”,“Female”))
boxplot<-ggplot(festival.data.stack, aes(gender,score))+
geom_boxplot()+
facet_grid(~day)
boxplot
boxplot <-ggplot(festival.data.stack, aes(gender,score, fill=gender))+
facet_grid(~day)+
stat_boxplot(geom="errorbar", width=0.5)+
geom_boxplot(outlier.shape=8)+
theme(legend.position = "none")+
scale_fill_manual(values=c("sienna1","darkorchid3 "))+
labs(x="Gender", y="Score")
boxplot
Violinplot (beanplot) 小提琴圖
stripchart+
geom_violin(alpha=0, colour="black")
基礎(chǔ)款
violinplot<-ggplot(festival.data.stack, aes(gender,score))+geom_violin()+facet_grid(~day)
violinplot
美化版
violinplot<-ggplot(festival.data.stack, aes(gender,score,fill=gender))+
facet_grid(~day)+
geom_violin(trim = FALSE)+
scale_fill_manual(values=c("goldenrod2","darkgrey"))+
theme(legend.position="none")+
stat_summary(fun.y=median, geom="point", size=2, color="black")+
labs(x="Gender", y="Hygiene scores")
#> Warning: `fun.y` is deprecated. Use `fun` instead.
violinplot
疊加箱式圖
violinplot+geom_boxplot(width=0.3)
violinplot+geom_jitter(width=0.1,size=1, shape=1)
Barchart 柱狀圖:首先泌豆,我們要計(jì)算平均值和sem怪瓶,并將這些值存儲(chǔ)在一個(gè)文件中。
festival.data.stack %>%
group_by(gender,day) %>%
summarise(mean=mean(score), sem=sd(score)/sqrt(n())) -> score.sem
score.sem
#> # A tibble: 6 x 4
#> # Groups: gender [2]
#> gender day mean sem
#> <chr> <chr> <dbl> <dbl>
#> 1 Female day1 1.88 0.0316
#> 2 Female day2 1.08 0.0608
#> 3 Female day3 1.10 0.0990
#> 4 Male day1 1.60 0.0362
#> 5 Male day2 0.773 0.0585
#> 6 Male day3 0.829 0.0721
barchart<-ggplot(score.sem, aes(day,mean, fill=gender))+
geom_bar(stat="identity")
barchart
加誤差線
barchart<-ggplot(score.sem, aes(day,mean, fill=gender))+
geom_bar(stat="identity", position="dodge")+
geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem), position="dodge")
barchart
barchart<-ggplot(score.sem, aes(day,mean, fill=gender))+
geom_bar(position="dodge", stat="identity")+
geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem), position="dodge")
barchart
美化
barchart<-ggplot(score.sem, aes(day,mean, fill=gender))+
geom_bar(position="dodge", colour="black",stat="identity",size=1)+
geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem), width=.5, position=position_dodge(0.9), size=1)+
ylab("Mean scores")+
ggtitle("Levels of hygiene over 3 days of concert")+
theme(plot.title = element_text(hjust = 0.5))+
theme(plot.title = element_text(size = 19))+
theme(axis.title.x=element_blank())+
scale_fill_manual(values=c("darkorange3", "darkorchid4"), name="Gender")
barchart
Linegraph 折線圖
linegraph<-ggplot(score.sem, aes(day, mean, group=gender))+
geom_line()+
geom_point()+
geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem))
linegraph
美化
linegraph<-ggplot(score.sem, aes(day,mean, colour=gender, group=gender))+
geom_line(size=1.5)+
geom_point(size=4)+
geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem), width=.2, size=1.5)
linegraph
進(jìn)一步美化
linegraph<-ggplot(score.sem, aes(day,mean, colour=gender, group=gender))+
geom_line(size=1.5)+
geom_point(size=5)+
geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem), width=.2, size=1.5)+
labs(x="", y="Mean scores")+
scale_y_continuous(breaks=seq(0, 2, 0.2))+
ggtitle("Levels of hygiene over 3 days of concert")+
theme(plot.title = element_text(hjust = 0.5))+
scale_colour_manual(values=c("purple","darkorange3"), name="")+
theme(legend.position = c(0.85, 0.9))+
theme(legend.text=element_text(size=14))+
theme(legend.background = element_rect(fill = "transparent"))
linegraph
練習(xí) 3 該文件包含3個(gè)不同數(shù)據(jù)集(一個(gè)WT和兩個(gè)mutants)的positional count數(shù)據(jù)践美。 繪制一個(gè)圖,顯示同一圖下的所有3個(gè)數(shù)據(jù)集 先讀Chrometic_position_data.txt
檢查文件的結(jié)構(gòu), 用gather()
將文件從寬格式重新構(gòu)造為長(zhǎng)格式, 重命名列:Gentype
和Value
繪制基本折線圖.
chromosome<-read_tsv("chromosome_position_data.txt")
chromosome
#> # A tibble: 184 x 4
#> Position Mut1 Mut2 WT
#> <dbl> <dbl> <dbl> <dbl>
#> 1 91757273 2.71 1.34 1.25
#> 2 91757323 2.71 1.3 1.25
#> 3 91757373 5.41 1.14 1.25
#> 4 91757423 2.71 1.58 1.88
#> 5 91757473 2.71 1.19 1.25
#> 6 91757523 2.71 2.82 2.5
#> 7 91757573 2.71 3.15 3.75
#> 8 91757623 0 4.05 3.75
#> 9 91757673 0 2.94 3.12
#> 10 91757723 0 2.69 3.12
#> # … with 174 more rows
chromosome %>%
gather(Genotype, Value,-Position) -> chromosome.long
chromosome.long
#> # A tibble: 552 x 3
#> Position Genotype Value
#> <dbl> <chr> <dbl>
#> 1 91757273 Mut1 2.71
#> 2 91757323 Mut1 2.71
#> 3 91757373 Mut1 5.41
#> 4 91757423 Mut1 2.71
#> 5 91757473 Mut1 2.71
#> 6 91757523 Mut1 2.71
#> 7 91757573 Mut1 2.71
#> 8 91757623 Mut1 0
#> 9 91757673 Mut1 0
#> 10 91757723 Mut1 0
#> # … with 542 more rows
chromosome.linegraph<-ggplot(chromosome.long, aes(x=Position, y=Value, group=Genotype, colour=Genotype))+
geom_line(size=2)
chromosome.linegraph
畫一張圖表,顯示一個(gè)典型嬰兒在出生后9個(gè)月的年齡和體重之間的關(guān)系找岖。 讀入數(shù)據(jù)weight_chart.txt
檢查文件結(jié)構(gòu) 繪制基本折線圖繪制更漂亮的版本:更改點(diǎn)的大小和顏色更改線的粗細(xì)和顏色 更改y軸:比例從2 kg更改為10 kg 更改x軸:比例從0 t 10個(gè)月更改兩個(gè)軸上的標(biāo)簽 向圖表添加標(biāo)題
weight<-read_tsv("weight_chart.txt")
weight
#> # A tibble: 10 x 2
#> Age Weight
#> <dbl> <dbl>
#> 1 0 3.6
#> 2 1 4.4
#> 3 2 5.2
#> 4 3 6
#> 5 4 6.6
#> 6 5 7.2
#> 7 6 7.8
#> 8 7 8.4
#> 9 8 8.8
#> 10 9 9.2
#基礎(chǔ)
weight.linegraph<-ggplot(weight, aes(Age, Weight))+
geom_line()+
geom_point()
weight.linegraph
#美化
weight.linegraph<-ggplot(weight, aes(Age, Weight))+
geom_line(size=1, colour="lightblue2")+
geom_point(shape=16, size=3, colour="darkorchid1")+
scale_y_continuous(breaks=2:10, limits = c(2, 10))+
scale_x_continuous(breaks=0:10, limits = c(0, 10))+
labs(x="Age (months)", y="Weight (kg)")+
ggtitle("Relation between age and weight")+
theme(plot.title = element_text(hjust = 0.5))
weight.linegraph
練習(xí) 5 文件brain_bodyweight.txt
包含一系列物種的log10大腦和體重?cái)?shù)據(jù)陨倡,以及每個(gè)點(diǎn)的SEM測(cè)量結(jié)果。 將這些數(shù)據(jù)繪制在帶有誤差條的散點(diǎn)圖上许布,顯示每個(gè)點(diǎn)下的平均值+/-SEM和數(shù)據(jù)集的名稱兴革。 讀入brain_bodyweitt.txt
檢查文件的結(jié)構(gòu),繪制一個(gè)基本的圖形蜜唾。繪制一個(gè)更漂亮的版本杂曲,對(duì)于水平誤差線需要更改:geom_barh()
,對(duì)于標(biāo)簽需要更改:geom_text()
袁余。
由于從教程中獲取的.txt文檔有點(diǎn)問(wèn)題擎勘,所以費(fèi)了點(diǎn)時(shí)間
options(stringsAsFactors = F)
brain.bodyweight<- read.csv("brain_bodyweight.txt",sep=',',header = T)
brain.bodyweight
#> Species Bodyweight Brainweight Bodyweight.SEM Brainweight.SEM X
#> 1 Cow 2.670 2.630 0.12500 0.12700 NA
#> 2 Goat 1.440 2.060 0.01930 0.14200 NA
#> 3 Guinea Pig 0.017 0.740 0.00163 0.03590 NA
#> 4 Diplodocus 4.070 1.700 0.10100 0.13400 NA
#> 5 Horse 2.720 2.820 0.00367 0.16900 NA
#> 6 Cat 0.519 1.410 0.02920 0.10200 NA
#> 7 Gorilla 2.320 2.610 0.02680 0.24300 NA
#> 8 Human 1.790 3.120 0.03610 0.23100 NA
#> 9 African Elephant 3.820 3.760 0.11200 0.32900 NA
#> 10 Rhesus Monkey 0.833 2.250 0.04650 0.14600 NA
#> 11 Kangaroo 1.540 1.750 0.01750 0.05870 NA
#> 12 Hamster -0.921 0.100 0.02740 0.00981 NA
#> 13 Mouse -1.640 -0.398 0.06860 0.02340 NA
#> 14 Rabbit 0.398 1.080 0.01150 0.06690 NA
#> 15 Sheep 1.740 2.240 0.04100 0.13600 NA
#> 16 Chimpanzee 1.720 2.640 0.01640 0.23800 NA
#> 17 Brachiosaurus 4.940 2.190 0.26900 0.11200 NA
#> 18 Rat -0.553 0.279 0.03370 0.00188 NA
#> 19 Mole -0.914 0.477 0.06170 0.04770 NA
#> 20 Pig 2.280 2.260 0.01330 0.16000 NA
brain.bodyweight=brain.bodyweight[,-6]
brain.bodyweight
#> Species Bodyweight Brainweight Bodyweight.SEM Brainweight.SEM
#> 1 Cow 2.670 2.630 0.12500 0.12700
#> 2 Goat 1.440 2.060 0.01930 0.14200
#> 3 Guinea Pig 0.017 0.740 0.00163 0.03590
#> 4 Diplodocus 4.070 1.700 0.10100 0.13400
#> 5 Horse 2.720 2.820 0.00367 0.16900
#> 6 Cat 0.519 1.410 0.02920 0.10200
#> 7 Gorilla 2.320 2.610 0.02680 0.24300
#> 8 Human 1.790 3.120 0.03610 0.23100
#> 9 African Elephant 3.820 3.760 0.11200 0.32900
#> 10 Rhesus Monkey 0.833 2.250 0.04650 0.14600
#> 11 Kangaroo 1.540 1.750 0.01750 0.05870
#> 12 Hamster -0.921 0.100 0.02740 0.00981
#> 13 Mouse -1.640 -0.398 0.06860 0.02340
#> 14 Rabbit 0.398 1.080 0.01150 0.06690
#> 15 Sheep 1.740 2.240 0.04100 0.13600
#> 16 Chimpanzee 1.720 2.640 0.01640 0.23800
#> 17 Brachiosaurus 4.940 2.190 0.26900 0.11200
#> 18 Rat -0.553 0.279 0.03370 0.00188
#> 19 Mole -0.914 0.477 0.06170 0.04770
#> 20 Pig 2.280 2.260 0.01330 0.16000
brain.bodyweight.graph<-ggplot(brain.bodyweight, aes(x=Bodyweight, y=Brainweight))+
geom_point()+
geom_errorbar(aes(ymin=Brainweight-Brainweight.SEM, ymax=Brainweight+Brainweight.SEM))+
geom_errorbarh(aes(xmin=Bodyweight-Bodyweight.SEM, xmax=Bodyweight+Bodyweight.SEM))+
geom_text(aes(label=Species), hjust = 1.05, vjust = -0.6, size=2.7)
brain.bodyweight.graph
brain.bodyweight.graph<-ggplot(brain.bodyweight, aes(x=Bodyweight, y=Brainweight))+
geom_point()+
geom_errorbar(aes(ymin=Brainweight-Brainweight.SEM, ymax=Brainweight+Brainweight.SEM), width=.1, size=1, colour="tomato3")+
geom_errorbarh(aes(xmin=Bodyweight-Bodyweight.SEM, xmax=Bodyweight+Bodyweight.SEM), height=.1, size=1, colour="tomato3")+
geom_point(size=2)+
geom_text(aes(label=Species), hjust = 1.1, vjust = -0.6, size=2.7)
brain.bodyweight.graph
進(jìn)一步美化
library("ggrepel")
## Warning: package 'ggrepel' was built under R version 3.5.3
ggplot(brain.bodyweight, aes(x=Bodyweight, y=Brainweight))+
geom_errorbar(aes(ymin=Brainweight-Brainweight.SEM, ymax=Brainweight+Brainweight.SEM),
width=.1, size=0.5, colour="grey28")+
geom_errorbarh(aes(xmin=Bodyweight-Bodyweight.SEM, xmax=Bodyweight+Bodyweight.SEM),
height=.1, size=0.5, colour="grey28")+
geom_point(shape=21, size=3, colour="black", fill="maroon3")+
geom_label_repel(aes(label = Species), box.padding=0.6, point.padding =0.5,
fill="mintcream", segment.colour="grey", size=3) -> brain.bodyweight.graph
brain.bodyweight.graph
Stacked bar: categorical data 堆疊柱狀圖
Changing<-read_csv("Changing.csv")
Changing
#> # A tibble: 60 x 3
#> Type.of.Behaviour Sample.Size Stage.of.Change
#> <chr> <dbl> <chr>
#> 1 Smoking cessation 108 Precontemplation
#> 2 Smoking cessation 187 Contemplation
#> 3 Smoking cessation 0 Preparation
#> 4 Smoking cessation 134 Action
#> 5 Smoking cessation 247 Maintenance
#> 6 Quitting cocaine 8 Precontemplation
#> 7 Quitting cocaine 15 Contemplation
#> 8 Quitting cocaine 0 Preparation
#> 9 Quitting cocaine 71 Action
#> 10 Quitting cocaine 62 Maintenance
#> # … with 50 more rows
stackedBar<-ggplot(Changing, aes(Type.of.Behaviour, Sample.Size, fill=Stage.of.Change))+
geom_bar(stat="identity")
stackedBar
更改比較的順序:factor(variable name, levels = c(“”, “” .)) 。 旋轉(zhuǎn)圖表以讀取x軸標(biāo)簽:coord_flip()
Changing$Stage.of.Change <- factor(Changing$Stage.of.Change, levels = c("Maintenance","Action","Preparation","Contemplation","Precontemplation"))
stackedBar<-ggplot(Changing, aes(Type.of.Behaviour, Sample.Size, fill = Stage.of.Change))+
geom_bar(stat="identity", colour="black")+
coord_flip()
stackedBar
進(jìn)一步美化
stackedBar<-stackedBar+
labs(title="Stages for Each of the 12 Problem Behaviours", y="Sample Size", fill="Stages of Change")+
theme(plot.title = element_text(hjust = 0.5, size=12, face="bold"))+
theme(axis.title.y=element_blank())+
scale_fill_brewer(palette = 4)+
theme(axis.text.x = element_text(size=10), axis.text.y = element_text(size=9))+
theme(legend.text=element_text(size=8), legend.title=element_text(size=10, face="bold"))+
theme(axis.title.x = element_text(size=10))
stackedBar
提高對(duì)比度
stackedBar+scale_fill_brewer(palette="RdYlGn", direction=-1)
練習(xí) 6 讓我們將相同的數(shù)據(jù)繪制為百分比颖榜,將變化的數(shù)據(jù)繪制為百分比棚饵。 將文件格式更改為應(yīng)急xtabs()
計(jì)算百分比prop.table()
將格式更改為dataframe as.data.frame()
檢查前幾行head()
像以前一樣使用不同的調(diào)色板繪制數(shù)據(jù)
contingency.table100<-prop.table(xtabs(Sample.Size~Type.of.Behaviour+Stage.of.Change, Changing),1)*100
contingency.table100
#> Stage.of.Change
#> Type.of.Behaviour Maintenance Action Preparation Contemplation
#> Adolescent delinquency 25.786164 27.044025 0.000000 28.930818
#> Condom use 35.294118 6.191950 0.000000 17.956656
#> Exercise acquisition 19.386332 14.086471 25.383543 33.751743
#> High fat diet 56.666667 2.777778 0.000000 17.777778
#> Mammography screening 42.553191 18.439716 0.000000 17.021277
#> Physicians'practices 49.629630 1.481481 2.222222 14.814815
#> Quitting cocaine 39.743590 45.512821 0.000000 9.615385
#> Radon gas exposure 0.000000 8.166189 0.000000 17.335244
#> Safer sex 0.000000 47.887324 0.000000 7.981221
#> Smoking cessation 36.538462 19.822485 0.000000 27.662722
#> Sunscreen use 35.242291 4.405286 0.000000 7.929515
#> Weight control 14.634146 17.886179 0.000000 52.845528
#> Stage.of.Change
#> Type.of.Behaviour Precontemplation
#> Adolescent delinquency 18.238994
#> Condom use 40.557276
#> Exercise acquisition 7.391911
#> High fat diet 22.777778
#> Mammography screening 21.985816
#> Physicians'practices 31.851852
#> Quitting cocaine 5.128205
#> Radon gas exposure 74.498567
#> Safer sex 44.131455
#> Smoking cessation 15.976331
#> Sunscreen use 52.422907
#> Weight control 14.634146
Changing.percent<-as.data.frame(contingency.table100)
Changing.percent
#> Type.of.Behaviour Stage.of.Change Freq
#> 1 Adolescent delinquency Maintenance 25.786164
#> 2 Condom use Maintenance 35.294118
#> 3 Exercise acquisition Maintenance 19.386332
#> 4 High fat diet Maintenance 56.666667
#> 5 Mammography screening Maintenance 42.553191
#> 6 Physicians'practices Maintenance 49.629630
#> 7 Quitting cocaine Maintenance 39.743590
#> 8 Radon gas exposure Maintenance 0.000000
#> 9 Safer sex Maintenance 0.000000
#> 10 Smoking cessation Maintenance 36.538462
#> 11 Sunscreen use Maintenance 35.242291
#> 12 Weight control Maintenance 14.634146
#> 13 Adolescent delinquency Action 27.044025
#> 14 Condom use Action 6.191950
#> 15 Exercise acquisition Action 14.086471
#> 16 High fat diet Action 2.777778
#> 17 Mammography screening Action 18.439716
#> 18 Physicians'practices Action 1.481481
#> 19 Quitting cocaine Action 45.512821
#> 20 Radon gas exposure Action 8.166189
#> 21 Safer sex Action 47.887324
#> 22 Smoking cessation Action 19.822485
#> 23 Sunscreen use Action 4.405286
#> 24 Weight control Action 17.886179
#> 25 Adolescent delinquency Preparation 0.000000
#> 26 Condom use Preparation 0.000000
#> 27 Exercise acquisition Preparation 25.383543
#> 28 High fat diet Preparation 0.000000
#> 29 Mammography screening Preparation 0.000000
#> 30 Physicians'practices Preparation 2.222222
#> 31 Quitting cocaine Preparation 0.000000
#> 32 Radon gas exposure Preparation 0.000000
#> 33 Safer sex Preparation 0.000000
#> 34 Smoking cessation Preparation 0.000000
#> 35 Sunscreen use Preparation 0.000000
#> 36 Weight control Preparation 0.000000
#> 37 Adolescent delinquency Contemplation 28.930818
#> 38 Condom use Contemplation 17.956656
#> 39 Exercise acquisition Contemplation 33.751743
#> 40 High fat diet Contemplation 17.777778
#> 41 Mammography screening Contemplation 17.021277
#> 42 Physicians'practices Contemplation 14.814815
#> 43 Quitting cocaine Contemplation 9.615385
#> 44 Radon gas exposure Contemplation 17.335244
#> 45 Safer sex Contemplation 7.981221
#> 46 Smoking cessation Contemplation 27.662722
#> 47 Sunscreen use Contemplation 7.929515
#> 48 Weight control Contemplation 52.845528
#> 49 Adolescent delinquency Precontemplation 18.238994
#> 50 Condom use Precontemplation 40.557276
#> 51 Exercise acquisition Precontemplation 7.391911
#> 52 High fat diet Precontemplation 22.777778
#> 53 Mammography screening Precontemplation 21.985816
#> 54 Physicians'practices Precontemplation 31.851852
#> 55 Quitting cocaine Precontemplation 5.128205
#> 56 Radon gas exposure Precontemplation 74.498567
#> 57 Safer sex Precontemplation 44.131455
#> 58 Smoking cessation Precontemplation 15.976331
#> 59 Sunscreen use Precontemplation 52.422907
#> 60 Weight control Precontemplation 14.634146
繪制百分比圖
stackedBar.percent<-ggplot(Changing.percent,aes(Type.of.Behaviour, Freq, fill = Stage.of.Change))+
geom_bar(stat="identity",colour="black")+
coord_flip()+
scale_fill_brewer(palette = "Spectral", direction=-1)+
labs(title="Stages for Each of the 12 Problem Behaviours", y="Frequencies")+
theme(axis.title.y=element_blank())+
theme(plot.title = element_text(hjust = 0.5, size=12, face="bold"))+
theme(axis.text.x = element_text(size=10), axis.text.y = element_text(size=9))+
theme(legend.text=element_text(size=8), legend.title=element_text(size=10, face="bold"))+
theme(axis.title.x = element_text(size=10))
stackedBar.percent
保存圖片
stackedBar_percent <-ggsave(stackedBar.percent, file="stackedBar_percent.png")
本教程所用數(shù)據(jù)部分來(lái)自http://www.bioinformatics.babraham.ac.uk/training.html