ggplot2進(jìn)階：統(tǒng)計(jì)+可視化

Tidyverse course: ggplot Code

clp

29 June, 2020

前言

本教程包含tidyverse課程的ggplot部分幻燈片中顯示的所有代碼。并提供了課程中使用的練習(xí)的答案着降。

加載必要的包及內(nèi)置數(shù)據(jù)集

library("tidyverse")
#> Warning: package 'ggplot2' was built under R version 3.6.2
#> Warning: package 'tibble' was built under R version 3.6.2
#> Warning: package 'tidyr' was built under R version 3.6.2
#> Warning: package 'purrr' was built under R version 3.6.2
#> Warning: package 'dplyr' was built under R version 3.6.2
library("ggplot2")
msleep
#> # A tibble: 83 x 11
#>    name  genus vore  order conservation sleep_total sleep_rem sleep_cycle awake
#>    <chr> <chr> <chr> <chr> <chr>              <dbl>     <dbl>       <dbl> <dbl>
#>  1 Chee… Acin… carni Carn… lc                  12.1      NA        NA      11.9
#>  2 Owl … Aotus omni  Prim… <NA>                17         1.8      NA       7  
#>  3 Moun… Aplo… herbi Rode… nt                  14.4       2.4      NA       9.6
#>  4 Grea… Blar… omni  Sori… lc                  14.9       2.3       0.133   9.1
#>  5 Cow   Bos   herbi Arti… domesticated         4         0.7       0.667  20  
#>  6 Thre… Brad… herbi Pilo… <NA>                14.4       2.2       0.767   9.6
#>  7 Nort… Call… carni Carn… vu                   8.7       1.4       0.383  15.3
#>  8 Vesp… Calo… <NA>  Rode… <NA>                 7        NA        NA      17  
#>  9 Dog   Canis carni Carn… domesticated        10.1       2.9       0.333  13.9
#> 10 Roe … Capr… herbi Arti… lc                   3        NA        NA      21  
#> # … with 73 more rows, and 2 more variables: brainwt <dbl>, bodywt <dbl>
class(msleep)
#> [1] "tbl_df"     "tbl"        "data.frame"

清洗數(shù)據(jù)

從msleep中刪除NA值

msleep %>% filter(!is.na(vore)) -> msleep.clean
msleep.clean
#> # A tibble: 76 x 11
#>    name  genus vore  order conservation sleep_total sleep_rem sleep_cycle awake
#>    <chr> <chr> <chr> <chr> <chr>              <dbl>     <dbl>       <dbl> <dbl>
#>  1 Chee… Acin… carni Carn… lc                  12.1      NA        NA      11.9
#>  2 Owl … Aotus omni  Prim… <NA>                17         1.8      NA       7  
#>  3 Moun… Aplo… herbi Rode… nt                  14.4       2.4      NA       9.6
#>  4 Grea… Blar… omni  Sori… lc                  14.9       2.3       0.133   9.1
#>  5 Cow   Bos   herbi Arti… domesticated         4         0.7       0.667  20  
#>  6 Thre… Brad… herbi Pilo… <NA>                14.4       2.2       0.767   9.6
#>  7 Nort… Call… carni Carn… vu                   8.7       1.4       0.383  15.3
#>  8 Dog   Canis carni Carn… domesticated        10.1       2.9       0.333  13.9
#>  9 Roe … Capr… herbi Arti… lc                   3        NA        NA      21  
#> 10 Goat  Capri herbi Arti… lc                   5.3       0.6      NA      18.7
#> # … with 66 more rows, and 2 more variables: brainwt <dbl>, bodywt <dbl>

繪制散點(diǎn)圖(Scatterplot)

基礎(chǔ)圖形

ggplot(
  msleep.clean, 
  aes(x=bodywt, y=sleep_total)
)+geom_point() -> scatterplot

scatterplot

image.png

其實(shí)，我們可以直接從過(guò)濾通過(guò)管道符%>%進(jìn)入ggplot蓄喇，而不需要保存中間數(shù)據(jù)直接出圖(這一波操作很秀刃鳄！)。

msleep %>% 
  filter(!is.na(vore)) %>%
    ggplot(
      aes(x=bodywt, y=sleep_total)
  )+geom_point()

image.png

然后是各種美化策略

加點(diǎn)顏色

ggplot(
  msleep.clean, 
  aes(x=bodywt, y=sleep_total, colour=vore)
)+geom_point()

image.png

另一種加顏色的方法

ggplot(
  msleep.clean, 
  aes(x=bodywt, y=sleep_total)
)+geom_point(aes(colour=vore))

image.png

看上去這個(gè)數(shù)據(jù)比較集中在X軸的2000以內(nèi)钱骂，現(xiàn)在這個(gè)顯示不出更更多信息叔锐，要做到一圖勝千言，那就要對(duì)X軸進(jìn)行數(shù)據(jù)轉(zhuǎn)換见秽，常用的方法有取對(duì)數(shù)值(with log: log axis)

對(duì)X軸取取對(duì)數(shù)展示

ggplot(
  msleep.clean, 
  aes(x=bodywt, y=sleep_total, colour=vore)
)+geom_point() -> scatterplot

scatterplot+scale_x_log10()

image.png

一個(gè)字：秀愉烙！

另一個(gè)展示方法

ggplot(
  msleep.clean, 
  aes(x=log(bodywt), y=sleep_total,colour=vore)
)+geom_point()

image.png

展示更大的點(diǎn)、軸和圖形標(biāo)題

ggplot(
  msleep.clean, 
  aes(x=log(bodywt), y=sleep_total,colour=vore)
) +
  geom_point(size=4) +
  xlab("Log Body Weight") + 
  ylab("Total Hours Sleep") + 
  ggtitle("Some Sleep Data")  -> scatterplot

scatterplot

image.png

更換顯示主題

theme_set(theme_bw(base_size=18))

scatterplot+theme(plot.title = element_text(hjust = 0.5)) -> scatterplot
scatterplot

image.png

更改x軸和y軸上的配色方案和刻度解取，并改進(jìn)圖例這將添加到先前的圖形中步责，而不是重新創(chuàng)建它。

scatterplot +
  scale_colour_brewer(
    palette="Set1", 
    name="Trophic levels", 
    labels=c("Carnivore", "Herbivore", "Insectivore", "Omnivore")
  ) +
  scale_x_continuous(breaks=-5:10) +
  scale_y_continuous(breaks=seq(0,20, 2)) -> scatterplot
scatterplot

image.png

手動(dòng)更改顏色

scatterplot +
  scale_color_manual(
    values=c("chocolate3", "chartreuse3", "darkorchid2","cyan3"),
    name="Trophic levels", 
    labels=c("Carnivore", "Herbivore", "Insectivore", "Omnivore")
  ) -> scatterplot
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
scatterplot

image.png

超喜歡這個(gè)“Insectivore”顏色

接下來(lái)就是練習(xí)題了

練習(xí)1

文件up_down_expression sion.txt包含一個(gè)表達(dá)矩陣禀苦，該數(shù)據(jù)集帶有一個(gè)額外的列勺择，該列將行分類為3組(上調(diào)、下調(diào)或不變)伦忠。加載up_down_expression sion.txt檢查下文件的結(jié)構(gòu)并繪制散點(diǎn)圖geom_point()：紅色表示上調(diào)省核，藍(lán)色下調(diào)灰色不變， - 主標(biāo)題：表達(dá)數(shù)據(jù) - 顏色圖例：下調(diào)昆码，不變气忠，上調(diào) - 軸標(biāo)簽：條件1和條件2

expression <- read_tsv("up_down_expression.txt")
expression
#> # A tibble: 5,196 x 4
#>    Gene       Condition1 Condition2 State     
#>    <chr>           <dbl>      <dbl> <chr>     
#>  1 A4GNT          -3.68      -3.44  unchanging
#>  2 AAAS            4.55       4.39  unchanging
#>  3 AASDH           3.72       3.48  unchanging
#>  4 AATF            5.08       5.02  unchanging
#>  5 AATK            0.471      0.560 unchanging
#>  6 AB015752.4     -3.68      -3.59  unchanging
#>  7 ABCA7           3.45       3.83  unchanging
#>  8 ABCA9-AS1      -3.68      -3.59  unchanging
#>  9 ABCC11         -3.53      -1.86  unchanging
#> 10 ABCC3           0.931      3.26  up        
#> # … with 5,186 more rows

expression.scatter<-ggplot(expression, aes(Condition1, Condition2, colour=State))+
  geom_point()+
  scale_colour_manual(values=c("blue", "grey", "red"),
                      name="State", 
    labels=c("Down", "Unchanging", "Up"))+
  xlab("Condition 1") + 
  ylab("Condition 2") + 
  ggtitle("Expression data")+
  theme(plot.title = element_text(hjust = 0.5))

expression.scatter

image.png

現(xiàn)在，讓我們嘗試另一種類型的圖：條形圖赋咽。它類似于散點(diǎn)圖旧噪，但x變量本質(zhì)上是定性的或絕對(duì)的。

Stripchart

ggplot(
    msleep.clean, 
    aes(vore, sleep_total)
  )+geom_point()

image.png

抖動(dòng)脓匿，變大淘钟，上色 jitter, bigger points and colours

ggplot(
  msleep.clean,
  aes(vore,sleep_total, colour=vore)
) + geom_point(size=4,position="jitter")

image.png

調(diào)節(jié)抖動(dòng)的范圍

ggplot(
  msleep.clean, 
  aes(vore, sleep_total, colour=vore)
) +
  geom_jitter(
    width = .2,
    size=4
  ) -> stripchart

stripchart

image.png

為平均值添加一條線，并為y軸添加標(biāo)題

stripchart +
  stat_summary(
    fun.y="mean",
    geom='errorbar', 
    aes(ymin=..y.., ymax=..y..), 
    width=0.6, 
    size=1.5,
    colour="grey25"
  ) -> stripchart
#> Warning: `fun.y` is deprecated. Use `fun` instead.

stripchart

image.png

箱式圖的雛形

一小段計(jì)算平均值(mean)和標(biāo)準(zhǔn)誤(SEm)的tidyverse式的代碼

msleep.clean %>%
  group_by(vore) %>%
    summarise(sleep=mean(sleep_total), sem=sd(sleep_total)/sqrt(n()))
#> # A tibble: 4 x 3
#>   vore    sleep   sem
#>   <chr>   <dbl> <dbl>
#> 1 carni   10.4  1.07 
#> 2 herbi    9.51 0.862
#> 3 insecti 14.9  2.65 
#> 4 omni    10.9  0.659

繼續(xù)美化

stripchart +
  ylab("Total Hours Sleep") +
  xlab("Trophic Levels") +
  ggtitle("Some Sleep Data") +
  scale_y_continuous(breaks=seq(0, 20, 2)) +
  scale_x_discrete(labels=c("Carnivore", "Herbivore", "Insectivore", "Omnivore")) +
  theme(legend.position = "none") -> stripchart

stripchart

image.png

同前

stripchart +
  scale_colour_brewer(palette="Dark2")+
  scale_x_discrete(
    limit=c("insecti","omni","carni", "herbi"),
    labels=c("Insectivore", "Herbivore", "Carnivore", "Omnivore"))+
      theme(plot.title = element_text(hjust = 0.5)
  ) -> stripchart
## Scale for 'x' is already present. Adding another scale for 'x', which
## will replace the existing scale.
stripchart

image.png

library("ggthemes")
## Warning: package 'ggthemes' was built under R version 3.5.3
stripchart+
  theme_wsj()+
  scale_colour_wsj("colors6")+
  theme(legend.position = "none")+
  theme(plot.title = element_text(hjust = 0.5))

image.png

## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.

現(xiàn)在陪毡，讓我們嘗試一些其他的數(shù)據(jù)米母。DownloadFestival數(shù)據(jù)記錄了為期三天的音樂(lè)節(jié)期間810名音樂(lè)會(huì)觀眾的hygiene scores(0-5)。

讀入數(shù)據(jù)進(jìn)行預(yù)處理

read_csv("DownloadFestival.csv") -> festival.data
festival.data
#> # A tibble: 810 x 5
#>    ticknumb gender  day1  day2   day3
#>       <dbl> <chr>  <dbl> <dbl>  <dbl>
#>  1     2111 Male    2.64  1.35  1.61 
#>  2     2229 Female  0.97  1.41  0.290
#>  3     2338 Male    0.84 NA    NA    
#>  4     2384 Female  3.03 NA    NA    
#>  5     2401 Female  0.88  0.08 NA    
#>  6     2405 Male    0.85 NA    NA    
#>  7     2467 Female  1.56 NA    NA    
#>  8     2478 Female  3.02 NA    NA    
#>  9     2490 Male    2.29 NA    NA    
#> 10     2504 Female  1.11  0.44  0.55 
#> # … with 800 more rows
max(festival.data$day1)
#> [1] 3.69

Histogram

ggplot(
  festival.data, 
  aes(day1)
)+geom_histogram()

image.png

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

稍美化一下

ggplot(
  festival.data, 
  aes(day1)
)+geom_histogram(binwidth=0.3)

image.png

進(jìn)一步美化

ggplot(
  festival.data, 
  aes(day1)
)+geom_histogram(binwidth=0.3, color="black", fill="yellow")+
  labs(x="Score", y="Counts")+
  theme(plot.title = element_text(hjust = 0.5))+
  ggtitle("Hygiene at Day 1") -> Day1Histogram
Day1Histogram

image.png

現(xiàn)在我們想要畫出所有3天每個(gè)性別情況毡琉。所以對(duì)數(shù)據(jù)進(jìn)行清洗(reshape)铁瞒。我們還將刪除NAs。

festival.data %>%
  gather(day,score,-ticknumb,-gender) -> festival.data.stack

festival.data.stack %>% filter(!is.na(score)) -> festival.data.stack
festival.data.stack
#> # A tibble: 1,197 x 4
#>    ticknumb gender day   score
#>       <dbl> <chr>  <chr> <dbl>
#>  1     2111 Male   day1   2.64
#>  2     2229 Female day1   0.97
#>  3     2338 Male   day1   0.84
#>  4     2384 Female day1   3.03
#>  5     2401 Female day1   0.88
#>  6     2405 Male   day1   0.85
#>  7     2467 Female day1   1.56
#>  8     2478 Female day1   3.02
#>  9     2490 Male   day1   2.29
#> 10     2504 Female day1   1.11
#> # … with 1,187 more rows

ggplot(festival.data.stack,aes(score))+
  geom_histogram(binwidth=0.3, color="black", fill="yellow")+
  labs(x="Hygiene score", y="Counts")+
  facet_grid(gender~day) -> histogram.3days
histogram.3days

image.png

可以修改小分面(facets)的標(biāo)簽桅滋。下面是一些示例慧耍。

histogram.3days<-ggplot(festival.data.stack,aes(score))+
  geom_histogram(binwidth=0.3, color="black", fill="yellow")+
  labs(x="Hygiene score", y="Counts")+
  facet_grid(gender~day)+
  theme(strip.text.x = element_text(size = 16, colour = "purple", face="bold"),
        strip.text.y = element_text(size=12, face="bold"))
histogram.3days

image.png

密度圖

density.3days<-ggplot(festival.data.stack, aes(score))+
  geom_density(aes(group=day, fill=day), alpha=0.5)+
  facet_grid(~gender)
density.3days

image.png

練習(xí) 2: Plot a stripchart representing all 3 days and each gender

stripchart <-ggplot(festival.data.stack, aes(gender, score, colour=gender))+ 
    facet_grid(~day)+
    geom_point(position="jitter")+
    scale_colour_manual(values=c("darkorange", "darkorchid4"))+ 
    stat_summary(geom='errorbar',fun.y=mean, aes(ymin=..y.., ymax=..y..), 
        colour="black", width=0.8, size=1.5)+
            labs(x="Gender", y="Score")+
            theme(legend.position = "none")
#> Warning: `fun.y` is deprecated. Use `fun` instead.
stripchart

image.png

從條形圖中，我們可以為平均值或任何其他描述性地理位置添加一條線作為統(tǒng)計(jì)匯總。

具體操作分2步：

stripchart<-ggplot(festival.data.stack, aes(gender, score,colour=gender))+facet_grid(~day)+
  geom_point(position="jitter")+
  scale_colour_manual(values=c("darkorange", "darkorchid4"))+
  labs(x="Gender", y="Score")+
  theme(legend.position = "none")

stripchart

image.png

加均值線

stripchart+
  stat_summary(fun.y="mean",geom="errorbar", aes(ymin=..y.., ymax=..y..), width=0.8, colour="black", size = 1.3)
#> Warning: `fun.y` is deprecated. Use `fun` instead.

image.png

加個(gè)框框

stripchart+
  geom_boxplot(alpha=0, colour="black")

image.png

進(jìn)一步美化

stripchart+
  geom_boxplot(aes(gender, score, fill=gender), alpha=0.5, colour="black")+
    scale_fill_manual(values=c("darkorange", "darkorchid4"))

image.png

說(shuō)到把圖表做得更漂亮芍碧，我們可以改進(jìn)箱式圖(boxplot)煌珊。

如果需要，我們可以更改x軸上的順序：boxplot+scale_x_discrete(limits=c(“Male”,“Female”))

boxplot<-ggplot(festival.data.stack, aes(gender,score))+
  geom_boxplot()+
  facet_grid(~day)

boxplot

image.png

boxplot <-ggplot(festival.data.stack, aes(gender,score, fill=gender))+
    facet_grid(~day)+
    stat_boxplot(geom="errorbar", width=0.5)+   
    geom_boxplot(outlier.shape=8)+
    theme(legend.position = "none")+
    scale_fill_manual(values=c("sienna1","darkorchid3 "))+
    labs(x="Gender", y="Score")
boxplot

image.png

Violinplot (beanplot) 小提琴圖

stripchart+
  geom_violin(alpha=0, colour="black")

image.png

基礎(chǔ)款

violinplot<-ggplot(festival.data.stack, aes(gender,score))+geom_violin()+facet_grid(~day)
violinplot

image.png

美化版

violinplot<-ggplot(festival.data.stack, aes(gender,score,fill=gender))+
        facet_grid(~day)+
        geom_violin(trim = FALSE)+
        scale_fill_manual(values=c("goldenrod2","darkgrey"))+
        theme(legend.position="none")+
        stat_summary(fun.y=median, geom="point", size=2, color="black")+
        labs(x="Gender", y="Hygiene scores")
#> Warning: `fun.y` is deprecated. Use `fun` instead.
violinplot

image.png

疊加箱式圖

violinplot+geom_boxplot(width=0.3)

image.png

violinplot+geom_jitter(width=0.1,size=1, shape=1)

image.png

Barchart 柱狀圖：首先泌豆，我們要計(jì)算平均值和sem怪瓶，并將這些值存儲(chǔ)在一個(gè)文件中。

festival.data.stack %>%
    group_by(gender,day) %>%
      summarise(mean=mean(score), sem=sd(score)/sqrt(n())) -> score.sem

score.sem
#> # A tibble: 6 x 4
#> # Groups:   gender [2]
#>   gender day    mean    sem
#>   <chr>  <chr> <dbl>  <dbl>
#> 1 Female day1  1.88  0.0316
#> 2 Female day2  1.08  0.0608
#> 3 Female day3  1.10  0.0990
#> 4 Male   day1  1.60  0.0362
#> 5 Male   day2  0.773 0.0585
#> 6 Male   day3  0.829 0.0721

barchart<-ggplot(score.sem, aes(day,mean, fill=gender))+
  geom_bar(stat="identity")
barchart

image.png

加誤差線

barchart<-ggplot(score.sem, aes(day,mean, fill=gender))+
  geom_bar(stat="identity", position="dodge")+
  geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem), position="dodge")
barchart

image.png

barchart<-ggplot(score.sem, aes(day,mean, fill=gender))+
  geom_bar(position="dodge", stat="identity")+
  geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem), position="dodge")
barchart

image.png

美化

barchart<-ggplot(score.sem, aes(day,mean, fill=gender))+
  geom_bar(position="dodge", colour="black",stat="identity",size=1)+
  geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem), width=.5, position=position_dodge(0.9), size=1)+
  ylab("Mean scores")+ 
  ggtitle("Levels of hygiene over 3 days of concert")+
  theme(plot.title = element_text(hjust = 0.5))+
  theme(plot.title = element_text(size = 19))+
  theme(axis.title.x=element_blank())+
  scale_fill_manual(values=c("darkorange3", "darkorchid4"), name="Gender")
barchart

image.png

Linegraph 折線圖

linegraph<-ggplot(score.sem, aes(day, mean, group=gender))+
    geom_line()+
    geom_point()+
    geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem))

linegraph

image.png

美化

linegraph<-ggplot(score.sem, aes(day,mean, colour=gender, group=gender))+
  geom_line(size=1.5)+
  geom_point(size=4)+
  geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem), width=.2, size=1.5)
linegraph

image.png

進(jìn)一步美化

linegraph<-ggplot(score.sem, aes(day,mean, colour=gender, group=gender))+
  geom_line(size=1.5)+
  geom_point(size=5)+
  geom_errorbar(aes(ymin=mean-sem, ymax=mean+sem), width=.2, size=1.5)+
  labs(x="", y="Mean scores")+
  scale_y_continuous(breaks=seq(0, 2, 0.2))+
  ggtitle("Levels of hygiene over 3 days of concert")+
  theme(plot.title = element_text(hjust = 0.5))+
  scale_colour_manual(values=c("purple","darkorange3"), name="")+
  theme(legend.position = c(0.85, 0.9))+
  theme(legend.text=element_text(size=14))+
  theme(legend.background = element_rect(fill = "transparent"))

linegraph

image.png

練習(xí) 3 該文件包含3個(gè)不同數(shù)據(jù)集(一個(gè)WT和兩個(gè)mutants)的positional count數(shù)據(jù)践美。繪制一個(gè)圖,顯示同一圖下的所有3個(gè)數(shù)據(jù)集先讀Chrometic_position_data.txt檢查文件的結(jié)構(gòu), 用gather()將文件從寬格式重新構(gòu)造為長(zhǎng)格式, 重命名列：Gentype和Value繪制基本折線圖.

chromosome<-read_tsv("chromosome_position_data.txt")
chromosome
#> # A tibble: 184 x 4
#>    Position  Mut1  Mut2    WT
#>       <dbl> <dbl> <dbl> <dbl>
#>  1 91757273  2.71  1.34  1.25
#>  2 91757323  2.71  1.3   1.25
#>  3 91757373  5.41  1.14  1.25
#>  4 91757423  2.71  1.58  1.88
#>  5 91757473  2.71  1.19  1.25
#>  6 91757523  2.71  2.82  2.5 
#>  7 91757573  2.71  3.15  3.75
#>  8 91757623  0     4.05  3.75
#>  9 91757673  0     2.94  3.12
#> 10 91757723  0     2.69  3.12
#> # … with 174 more rows
chromosome %>%
  gather(Genotype, Value,-Position) -> chromosome.long
chromosome.long
#> # A tibble: 552 x 3
#>    Position Genotype Value
#>       <dbl> <chr>    <dbl>
#>  1 91757273 Mut1      2.71
#>  2 91757323 Mut1      2.71
#>  3 91757373 Mut1      5.41
#>  4 91757423 Mut1      2.71
#>  5 91757473 Mut1      2.71
#>  6 91757523 Mut1      2.71
#>  7 91757573 Mut1      2.71
#>  8 91757623 Mut1      0   
#>  9 91757673 Mut1      0   
#> 10 91757723 Mut1      0   
#> # … with 542 more rows
chromosome.linegraph<-ggplot(chromosome.long, aes(x=Position, y=Value, group=Genotype, colour=Genotype))+
geom_line(size=2)
chromosome.linegraph

image.png

畫一張圖表，顯示一個(gè)典型嬰兒在出生后9個(gè)月的年齡和體重之間的關(guān)系找岖。讀入數(shù)據(jù)weight_chart.txt檢查文件結(jié)構(gòu) 繪制基本折線圖繪制更漂亮的版本：更改點(diǎn)的大小和顏色更改線的粗細(xì)和顏色更改y軸：比例從2 kg更改為10 kg 更改x軸：比例從0 t 10個(gè)月更改兩個(gè)軸上的標(biāo)簽向圖表添加標(biāo)題

weight<-read_tsv("weight_chart.txt")
weight
#> # A tibble: 10 x 2
#>      Age Weight
#>    <dbl>  <dbl>
#>  1     0    3.6
#>  2     1    4.4
#>  3     2    5.2
#>  4     3    6  
#>  5     4    6.6
#>  6     5    7.2
#>  7     6    7.8
#>  8     7    8.4
#>  9     8    8.8
#> 10     9    9.2
#基礎(chǔ)
weight.linegraph<-ggplot(weight, aes(Age, Weight))+
  geom_line()+
  geom_point()
weight.linegraph

image.png

#美化
weight.linegraph<-ggplot(weight, aes(Age, Weight))+
 geom_line(size=1, colour="lightblue2")+
 geom_point(shape=16, size=3, colour="darkorchid1")+
  scale_y_continuous(breaks=2:10, limits = c(2, 10))+
  scale_x_continuous(breaks=0:10, limits = c(0, 10))+
  labs(x="Age (months)", y="Weight (kg)")+
  ggtitle("Relation between age and weight")+
  theme(plot.title = element_text(hjust = 0.5))
weight.linegraph

image.png

練習(xí) 5 文件brain_bodyweight.txt包含一系列物種的log10大腦和體重?cái)?shù)據(jù)陨倡，以及每個(gè)點(diǎn)的SEM測(cè)量結(jié)果。將這些數(shù)據(jù)繪制在帶有誤差條的散點(diǎn)圖上许布，顯示每個(gè)點(diǎn)下的平均值+/-SEM和數(shù)據(jù)集的名稱兴革。讀入brain_bodyweitt.txt檢查文件的結(jié)構(gòu)，繪制一個(gè)基本的圖形蜜唾。繪制一個(gè)更漂亮的版本杂曲，對(duì)于水平誤差線需要更改：geom_barh()，對(duì)于標(biāo)簽需要更改：geom_text()袁余。

由于從教程中獲取的.txt文檔有點(diǎn)問(wèn)題擎勘，所以費(fèi)了點(diǎn)時(shí)間

options(stringsAsFactors = F)
brain.bodyweight<- read.csv("brain_bodyweight.txt",sep=',',header = T)
brain.bodyweight
#>             Species Bodyweight Brainweight Bodyweight.SEM Brainweight.SEM  X
#> 1               Cow      2.670       2.630        0.12500         0.12700 NA
#> 2              Goat      1.440       2.060        0.01930         0.14200 NA
#> 3        Guinea Pig      0.017       0.740        0.00163         0.03590 NA
#> 4        Diplodocus      4.070       1.700        0.10100         0.13400 NA
#> 5             Horse      2.720       2.820        0.00367         0.16900 NA
#> 6               Cat      0.519       1.410        0.02920         0.10200 NA
#> 7           Gorilla      2.320       2.610        0.02680         0.24300 NA
#> 8             Human      1.790       3.120        0.03610         0.23100 NA
#> 9  African Elephant      3.820       3.760        0.11200         0.32900 NA
#> 10    Rhesus Monkey      0.833       2.250        0.04650         0.14600 NA
#> 11         Kangaroo      1.540       1.750        0.01750         0.05870 NA
#> 12          Hamster     -0.921       0.100        0.02740         0.00981 NA
#> 13            Mouse     -1.640      -0.398        0.06860         0.02340 NA
#> 14           Rabbit      0.398       1.080        0.01150         0.06690 NA
#> 15            Sheep      1.740       2.240        0.04100         0.13600 NA
#> 16       Chimpanzee      1.720       2.640        0.01640         0.23800 NA
#> 17    Brachiosaurus      4.940       2.190        0.26900         0.11200 NA
#> 18              Rat     -0.553       0.279        0.03370         0.00188 NA
#> 19             Mole     -0.914       0.477        0.06170         0.04770 NA
#> 20              Pig      2.280       2.260        0.01330         0.16000 NA
brain.bodyweight=brain.bodyweight[,-6]
brain.bodyweight
#>             Species Bodyweight Brainweight Bodyweight.SEM Brainweight.SEM
#> 1               Cow      2.670       2.630        0.12500         0.12700
#> 2              Goat      1.440       2.060        0.01930         0.14200
#> 3        Guinea Pig      0.017       0.740        0.00163         0.03590
#> 4        Diplodocus      4.070       1.700        0.10100         0.13400
#> 5             Horse      2.720       2.820        0.00367         0.16900
#> 6               Cat      0.519       1.410        0.02920         0.10200
#> 7           Gorilla      2.320       2.610        0.02680         0.24300
#> 8             Human      1.790       3.120        0.03610         0.23100
#> 9  African Elephant      3.820       3.760        0.11200         0.32900
#> 10    Rhesus Monkey      0.833       2.250        0.04650         0.14600
#> 11         Kangaroo      1.540       1.750        0.01750         0.05870
#> 12          Hamster     -0.921       0.100        0.02740         0.00981
#> 13            Mouse     -1.640      -0.398        0.06860         0.02340
#> 14           Rabbit      0.398       1.080        0.01150         0.06690
#> 15            Sheep      1.740       2.240        0.04100         0.13600
#> 16       Chimpanzee      1.720       2.640        0.01640         0.23800
#> 17    Brachiosaurus      4.940       2.190        0.26900         0.11200
#> 18              Rat     -0.553       0.279        0.03370         0.00188
#> 19             Mole     -0.914       0.477        0.06170         0.04770
#> 20              Pig      2.280       2.260        0.01330         0.16000

brain.bodyweight.graph<-ggplot(brain.bodyweight, aes(x=Bodyweight, y=Brainweight))+
  geom_point()+
  geom_errorbar(aes(ymin=Brainweight-Brainweight.SEM, ymax=Brainweight+Brainweight.SEM))+
  geom_errorbarh(aes(xmin=Bodyweight-Bodyweight.SEM, xmax=Bodyweight+Bodyweight.SEM))+
  geom_text(aes(label=Species), hjust = 1.05, vjust = -0.6, size=2.7)
brain.bodyweight.graph

image.png


brain.bodyweight.graph<-ggplot(brain.bodyweight, aes(x=Bodyweight, y=Brainweight))+
  geom_point()+
  geom_errorbar(aes(ymin=Brainweight-Brainweight.SEM, ymax=Brainweight+Brainweight.SEM), width=.1, size=1, colour="tomato3")+
  geom_errorbarh(aes(xmin=Bodyweight-Bodyweight.SEM, xmax=Bodyweight+Bodyweight.SEM), height=.1, size=1, colour="tomato3")+
  geom_point(size=2)+
  geom_text(aes(label=Species), hjust = 1.1, vjust = -0.6, size=2.7)
brain.bodyweight.graph

image.png

進(jìn)一步美化

library("ggrepel")
## Warning: package 'ggrepel' was built under R version 3.5.3
ggplot(brain.bodyweight, aes(x=Bodyweight, y=Brainweight))+
  geom_errorbar(aes(ymin=Brainweight-Brainweight.SEM, ymax=Brainweight+Brainweight.SEM), 
width=.1, size=0.5, colour="grey28")+
  geom_errorbarh(aes(xmin=Bodyweight-Bodyweight.SEM, xmax=Bodyweight+Bodyweight.SEM), 
height=.1, size=0.5, colour="grey28")+
  geom_point(shape=21, size=3, colour="black", fill="maroon3")+
  geom_label_repel(aes(label = Species), box.padding=0.6, point.padding =0.5, 
fill="mintcream", segment.colour="grey", size=3) -> brain.bodyweight.graph
brain.bodyweight.graph

image.png

Stacked bar: categorical data 堆疊柱狀圖

Changing<-read_csv("Changing.csv")
Changing
#> # A tibble: 60 x 3
#>    Type.of.Behaviour Sample.Size Stage.of.Change 
#>    <chr>                   <dbl> <chr>           
#>  1 Smoking cessation         108 Precontemplation
#>  2 Smoking cessation         187 Contemplation   
#>  3 Smoking cessation           0 Preparation     
#>  4 Smoking cessation         134 Action          
#>  5 Smoking cessation         247 Maintenance     
#>  6 Quitting cocaine            8 Precontemplation
#>  7 Quitting cocaine           15 Contemplation   
#>  8 Quitting cocaine            0 Preparation     
#>  9 Quitting cocaine           71 Action          
#> 10 Quitting cocaine           62 Maintenance     
#> # … with 50 more rows

stackedBar<-ggplot(Changing, aes(Type.of.Behaviour, Sample.Size, fill=Stage.of.Change))+
geom_bar(stat="identity")
stackedBar

image.png

更改比較的順序：factor(variable name, levels = c(“”, “” .)) 。旋轉(zhuǎn)圖表以讀取x軸標(biāo)簽：coord_flip()

Changing$Stage.of.Change <- factor(Changing$Stage.of.Change, levels = c("Maintenance","Action","Preparation","Contemplation","Precontemplation"))

stackedBar<-ggplot(Changing, aes(Type.of.Behaviour, Sample.Size, fill = Stage.of.Change))+
  geom_bar(stat="identity", colour="black")+
  coord_flip()
stackedBar

image.png

進(jìn)一步美化

stackedBar<-stackedBar+
  labs(title="Stages for Each of the 12 Problem Behaviours", y="Sample Size", fill="Stages of Change")+
  theme(plot.title = element_text(hjust = 0.5, size=12, face="bold"))+
  theme(axis.title.y=element_blank())+
  scale_fill_brewer(palette = 4)+
  theme(axis.text.x = element_text(size=10), axis.text.y = element_text(size=9))+
   theme(legend.text=element_text(size=8), legend.title=element_text(size=10, face="bold"))+
  theme(axis.title.x = element_text(size=10))
stackedBar

image.png

提高對(duì)比度

stackedBar+scale_fill_brewer(palette="RdYlGn", direction=-1)

image.png

練習(xí) 6 讓我們將相同的數(shù)據(jù)繪制為百分比颖榜，將變化的數(shù)據(jù)繪制為百分比棚饵。將文件格式更改為應(yīng)急xtabs() 計(jì)算百分比prop.table() 將格式更改為dataframe as.data.frame() 檢查前幾行head() 像以前一樣使用不同的調(diào)色板繪制數(shù)據(jù)

contingency.table100<-prop.table(xtabs(Sample.Size~Type.of.Behaviour+Stage.of.Change, Changing),1)*100
contingency.table100
#>                         Stage.of.Change
#> Type.of.Behaviour        Maintenance    Action Preparation Contemplation
#>   Adolescent delinquency   25.786164 27.044025    0.000000     28.930818
#>   Condom use               35.294118  6.191950    0.000000     17.956656
#>   Exercise acquisition     19.386332 14.086471   25.383543     33.751743
#>   High fat diet            56.666667  2.777778    0.000000     17.777778
#>   Mammography screening    42.553191 18.439716    0.000000     17.021277
#>   Physicians'practices     49.629630  1.481481    2.222222     14.814815
#>   Quitting cocaine         39.743590 45.512821    0.000000      9.615385
#>   Radon gas exposure        0.000000  8.166189    0.000000     17.335244
#>   Safer sex                 0.000000 47.887324    0.000000      7.981221
#>   Smoking cessation        36.538462 19.822485    0.000000     27.662722
#>   Sunscreen use            35.242291  4.405286    0.000000      7.929515
#>   Weight control           14.634146 17.886179    0.000000     52.845528
#>                         Stage.of.Change
#> Type.of.Behaviour        Precontemplation
#>   Adolescent delinquency        18.238994
#>   Condom use                    40.557276
#>   Exercise acquisition           7.391911
#>   High fat diet                 22.777778
#>   Mammography screening         21.985816
#>   Physicians'practices          31.851852
#>   Quitting cocaine               5.128205
#>   Radon gas exposure            74.498567
#>   Safer sex                     44.131455
#>   Smoking cessation             15.976331
#>   Sunscreen use                 52.422907
#>   Weight control                14.634146

Changing.percent<-as.data.frame(contingency.table100)
Changing.percent
#>         Type.of.Behaviour  Stage.of.Change      Freq
#> 1  Adolescent delinquency      Maintenance 25.786164
#> 2              Condom use      Maintenance 35.294118
#> 3    Exercise acquisition      Maintenance 19.386332
#> 4           High fat diet      Maintenance 56.666667
#> 5   Mammography screening      Maintenance 42.553191
#> 6    Physicians'practices      Maintenance 49.629630
#> 7        Quitting cocaine      Maintenance 39.743590
#> 8      Radon gas exposure      Maintenance  0.000000
#> 9               Safer sex      Maintenance  0.000000
#> 10      Smoking cessation      Maintenance 36.538462
#> 11          Sunscreen use      Maintenance 35.242291
#> 12         Weight control      Maintenance 14.634146
#> 13 Adolescent delinquency           Action 27.044025
#> 14             Condom use           Action  6.191950
#> 15   Exercise acquisition           Action 14.086471
#> 16          High fat diet           Action  2.777778
#> 17  Mammography screening           Action 18.439716
#> 18   Physicians'practices           Action  1.481481
#> 19       Quitting cocaine           Action 45.512821
#> 20     Radon gas exposure           Action  8.166189
#> 21              Safer sex           Action 47.887324
#> 22      Smoking cessation           Action 19.822485
#> 23          Sunscreen use           Action  4.405286
#> 24         Weight control           Action 17.886179
#> 25 Adolescent delinquency      Preparation  0.000000
#> 26             Condom use      Preparation  0.000000
#> 27   Exercise acquisition      Preparation 25.383543
#> 28          High fat diet      Preparation  0.000000
#> 29  Mammography screening      Preparation  0.000000
#> 30   Physicians'practices      Preparation  2.222222
#> 31       Quitting cocaine      Preparation  0.000000
#> 32     Radon gas exposure      Preparation  0.000000
#> 33              Safer sex      Preparation  0.000000
#> 34      Smoking cessation      Preparation  0.000000
#> 35          Sunscreen use      Preparation  0.000000
#> 36         Weight control      Preparation  0.000000
#> 37 Adolescent delinquency    Contemplation 28.930818
#> 38             Condom use    Contemplation 17.956656
#> 39   Exercise acquisition    Contemplation 33.751743
#> 40          High fat diet    Contemplation 17.777778
#> 41  Mammography screening    Contemplation 17.021277
#> 42   Physicians'practices    Contemplation 14.814815
#> 43       Quitting cocaine    Contemplation  9.615385
#> 44     Radon gas exposure    Contemplation 17.335244
#> 45              Safer sex    Contemplation  7.981221
#> 46      Smoking cessation    Contemplation 27.662722
#> 47          Sunscreen use    Contemplation  7.929515
#> 48         Weight control    Contemplation 52.845528
#> 49 Adolescent delinquency Precontemplation 18.238994
#> 50             Condom use Precontemplation 40.557276
#> 51   Exercise acquisition Precontemplation  7.391911
#> 52          High fat diet Precontemplation 22.777778
#> 53  Mammography screening Precontemplation 21.985816
#> 54   Physicians'practices Precontemplation 31.851852
#> 55       Quitting cocaine Precontemplation  5.128205
#> 56     Radon gas exposure Precontemplation 74.498567
#> 57              Safer sex Precontemplation 44.131455
#> 58      Smoking cessation Precontemplation 15.976331
#> 59          Sunscreen use Precontemplation 52.422907
#> 60         Weight control Precontemplation 14.634146

繪制百分比圖

stackedBar.percent<-ggplot(Changing.percent,aes(Type.of.Behaviour, Freq, fill = Stage.of.Change))+
        geom_bar(stat="identity",colour="black")+
    coord_flip()+
        scale_fill_brewer(palette = "Spectral", direction=-1)+
        labs(title="Stages for Each of the 12 Problem Behaviours", y="Frequencies")+
    theme(axis.title.y=element_blank())+
        theme(plot.title = element_text(hjust = 0.5, size=12, face="bold"))+
    theme(axis.text.x = element_text(size=10), axis.text.y = element_text(size=9))+
    theme(legend.text=element_text(size=8), legend.title=element_text(size=10, face="bold"))+
    theme(axis.title.x = element_text(size=10))
stackedBar.percent

image.png

保存圖片

stackedBar_percent <-ggsave(stackedBar.percent, file="stackedBar_percent.png")

本教程所用數(shù)據(jù)部分來(lái)自http://www.bioinformatics.babraham.ac.uk/training.html

最后編輯于：2020.06.30 09:20:22

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者

人面猴
序言：七十年代末，一起剝皮案震驚了整個(gè)濱河市掩完，隨后出現(xiàn)的幾起案子噪漾，更是在濱河造成了極大的恐慌，老刑警劉巖且蓬，帶你破解...
沈念sama閱讀 217,826評(píng)論 6贊 506
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件欣硼，死亡現(xiàn)場(chǎng)離奇詭異，居然都是意外死亡恶阴，警方通過(guò)查閱死者的電腦和手機(jī)诈胜，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 92,968評(píng)論 3贊 395
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進(jìn)店門，熙熙樓的掌柜王于貴愁眉苦臉地迎上來(lái)冯事，“玉大人耘斩，你說(shuō)我怎么就攤上這事∥ε兀” “怎么了括授？”我有些...
開(kāi)封第一講書(shū)人閱讀 164,234評(píng)論 0贊 354
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵，是天一觀的道長(zhǎng)。經(jīng)常有香客問(wèn)我荚虚，道長(zhǎng)薛夜，這世上最難降的妖魔是什么？我笑而不...
開(kāi)封第一講書(shū)人閱讀 58,562評(píng)論 1贊 293
?港島之戀（遺憾婚禮）
正文為了忘掉前任版述，我火速辦了婚禮梯澜，結(jié)果婚禮上，老公的妹妹穿的比我還像新娘渴析。我一直安慰自己晚伙，他們只是感情好，可當(dāng)我...
茶點(diǎn)故事閱讀 67,611評(píng)論 6贊 392
惡毒庶女頂嫁案：這布局不是一般人想出來(lái)的
文/花漫我一把揭開(kāi)白布俭茧。她就那樣靜靜地躺著咆疗，像睡著了一般。火紅的嫁衣襯著肌膚如雪母债。梳的紋絲不亂的頭發(fā)上午磁，一...
開(kāi)封第一講書(shū)人閱讀 51,482評(píng)論 1贊 302
城市分裂傳說(shuō)
那天，我揣著相機(jī)與錄音毡们，去河邊找鬼迅皇。笑死，一個(gè)胖子當(dāng)著我的面吹牛衙熔，可吹牛的內(nèi)容都是我干的登颓。我是一名探鬼主播，決...
沈念sama閱讀 40,271評(píng)論 3贊 418
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開(kāi)眼红氯，長(zhǎng)吁一口氣：“原來(lái)是場(chǎng)噩夢(mèng)啊……” “哼挺据！你這毒婦竟也來(lái)了？” 一聲冷哼從身側(cè)響起脖隶，我...
開(kāi)封第一講書(shū)人閱讀 39,166評(píng)論 0贊 276
萬(wàn)榮殺人案實(shí)錄
序言：老撾萬(wàn)榮一對(duì)情侶失蹤扁耐，失蹤者是張志新（化名）和其女友劉穎，沒(méi)想到半個(gè)月后产阱，有當(dāng)?shù)厝嗽跇?shù)林里發(fā)現(xiàn)了一具尸體婉称，經(jīng)...
沈念sama閱讀 45,608評(píng)論 1贊 314
?護(hù)林員之死
正文獨(dú)居荒郊野嶺守林人離奇死亡，尸身上長(zhǎng)有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點(diǎn)故事閱讀 37,814評(píng)論 3贊 336
?白月光啟示錄
正文我和宋清朗相戀三年构蹬，在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了王暗。大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
茶點(diǎn)故事閱讀 39,926評(píng)論 1贊 348
活死人
序言：一個(gè)原本活蹦亂跳的男人離奇死亡庄敛，死狀恐怖俗壹，靈堂內(nèi)的尸體忽然破棺而出，到底是詐尸還是另有隱情藻烤，我是刑警寧澤绷雏，帶...
沈念sama閱讀 35,644評(píng)論 5贊 346
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布头滔，位于F島的核電站，受9級(jí)特大地震影響涎显，放射性物質(zhì)發(fā)生泄漏坤检。R本人自食惡果不足惜，卻給世界環(huán)境...
茶點(diǎn)故事閱讀 41,249評(píng)論 3贊 329
男人毒藥：我在死后第九天來(lái)索命
文/蒙蒙一期吓、第九天我趴在偏房一處隱蔽的房頂上張望早歇。院中可真熱鬧，春花似錦讨勤、人聲如沸箭跳。這莊子的主人今日做“春日...
開(kāi)封第一講書(shū)人閱讀 31,866評(píng)論 0贊 22
一樁弒父案潭千，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽(yáng)谱姓。三九已至，卻和暖如春脊岳，著一層夾襖步出監(jiān)牢的瞬間，已是汗流浹背垛玻。一陣腳步聲響...
開(kāi)封第一講書(shū)人閱讀 32,991評(píng)論 1贊 269
情欲美人皮
我被黑心中介騙來(lái)泰國(guó)打工割捅，沒(méi)想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留，地道東北人帚桩。一個(gè)月前我還...
沈念sama閱讀 48,063評(píng)論 3贊 370
代替公主和親
正文我出身青樓亿驾，卻偏偏與公主長(zhǎng)得像，于是被迫代替她去往敵國(guó)和親账嚎。傳聞我的和親對(duì)象是個(gè)殘疾皇子莫瞬，可洞房花燭夜當(dāng)晚...
茶點(diǎn)故事閱讀 44,871評(píng)論 2贊 354