library(tidyverse)
library(ggplot2)
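Inspect the structure of the mpg data
## Use the mpg dataset that ships with ggplot2 to explore the relationship between engine size and fuel efficiency
## Variables: displ = engine displacement in litres; hwy = highway fuel efficiency in miles per gallon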
View(mpg)
A simple visualization
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
The ggplot2 plotting template
ggplot(data = <DATA>) +
<GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
Exercises
- Run ggplot(data = mpg). What do you see?
- How many rows are in mtcars? How many columns?
- What does the drv variable describe? Read the help for ?mpg to find out.
- Make a scatterplot of hwy versus cyl.
- What happens if you make a scatterplot of class versus drv? Why is the plot not useful?
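1. Running it produces only an empty gray canvas, because no variables have been specified and no geom layer has been added.
2. How many rows and columns does the data have? (checked here on mpg)
> dim(mpg) ## returns the dimensions of the data
[1] 234 11 ## 234 rows, 11 columns
> ncol(mpg) # columns
[1] 11
> nrow(mpg) # rows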
[1] 234
3. What does the drv variable represent?
> ?mpg
The help pane on the right shows the following:
drv
f = front-wheel drive, r = rear wheel drive, 4 = 4wd
4. Draw a scatterplot of hwy versus cyl (hwy on the y axis, cyl on the x axis)
ggplot(mpg) +
geom_point(aes(x = cyl, y = hwy))
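5. A scatterplot of class versus drv is not useful: both variables are categorical, so many observations fall on the same few positions and overlap.
ggplot(mpg) +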
geom_point(aes(x = class, y = drv))
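One way to make the overlap visible, sketched here as a suggestion rather than something the exercise asks for, is to count the rows behind each position. dplyr's count() tabulates the class/drv combinations and geom_count() sizes each point by that count. The name combos is introduced only for this illustration.

```r
library(ggplot2)
library(dplyr)

# Number of cars behind each (class, drv) position
combos <- count(mpg, class, drv)
combos

# geom_count() draws one point per combination, sized by the number of rows
ggplot(mpg) +
  geom_count(aes(x = class, y = drv))
```

The n column sums to the 234 rows of mpg, which shows how many observations a plain scatterplot hides behind a handful of points.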
Aesthetic Mappings
Map color to class
> unique(mpg$class)
[1] "compact" "midsize" "suv" "2seater" "minivan" "pickup" "subcompact"
> ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, color = class))
Use point size to represent each class (ggplot2 warns that using size for a discrete variable is not advised)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, size = class))
Use transparency (alpha) to represent the classes
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, alpha = class))
Use different shapes to represent the classes
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, shape = class))
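By default ggplot2 supplies only six shapes for a discrete variable, so with seven classes one of them is left unplotted and a warning is raised. A minimal sketch of one workaround, assuming the shape codes 0 to 6 are acceptable, is to supply the shapes manually with scale_shape_manual(). The names n_classes and shape_plot are introduced here for illustration.

```r
library(ggplot2)

# mpg has 7 classes, one more than the default shape palette provides
n_classes <- length(unique(mpg$class))

shape_plot <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = class)) +
  scale_shape_manual(values = 0:(n_classes - 1))  # one shape code per class (assumed choice)
shape_plot
```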
To change the color of all points in the scatterplot, set color outside aes():
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
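Where color is placed matters. The sketch below compares the two placements and uses layer_data() to inspect the colors ggplot2 actually computes: inside aes(), "blue" is treated as a data value and receives a palette color plus a legend; outside aes(), it is used as the literal color. The names mapped and fixed are introduced here for illustration.

```r
library(ggplot2)

# Inside aes(): the string is treated as a one-level variable, not as a color name
mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

# Outside aes(): every point is drawn in blue
fixed <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

unique(layer_data(mapped)$colour)  # a default palette color, not "blue"
unique(layer_data(fixed)$colour)   # "blue"
```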
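Summary
- shape sets the point shape; color sets the color, either as a fixed color or assigned automatically from the levels of a variable; size sets the point size; alpha sets the transparency.
- Overview of the values available for the shape parameter [image.png]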
Exercises
- What happens if you map an aesthetic to something other than
a variable name, like aes(color = displ < 5)?
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, color = displ < 5))
## Colors the points by the condition: one color where displ < 5 is TRUE, another where it is FALSE
Facets
## Facet by the distinct levels of the class variable
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_wrap(~ class, nrow = 2) ## arrange the panels in two rows
## Facet on the combinations of drv and cyl: 3 drv levels x 4 cyl values = 12 panels
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(drv ~ cyl)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(drv ~ .) ## facets laid out as rows
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(. ~ drv) ## facets laid out as columns
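facet_grid() also accepts rows and cols arguments built with vars(), which can be easier to read than the formula. The sketch below rebuilds the drv by cyl grid that way and counts the panels it implies. The names panel_count and grid_plot are introduced here for illustration.

```r
library(ggplot2)

# 3 drive types x 4 cylinder counts = 12 panels (some of them may be empty)
panel_count <- length(unique(mpg$drv)) * length(unique(mpg$cyl))
panel_count

grid_plot <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_grid(rows = vars(drv), cols = vars(cyl))  # same layout as facet_grid(drv ~ cyl)
grid_plot
```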
Geometric Objects
ggplot(data = mpg) +
geom_smooth(mapping = aes(x = displ, y = hwy, linetype = drv))
ggplot(data = mpg) +
geom_smooth(mapping = aes(x = displ, y = hwy, linetype = drv, color = drv))
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point(mapping = aes(color = class)) +
geom_smooth()
Use filter() to fit the smooth line to a single class only
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point(mapping = aes(color = class)) +
geom_smooth(
data = filter(mpg, class == "subcompact"),
se = FALSE
)
Haha, I bought the Chinese edition of the book, so from here on I am following that edition.
1.7 Statistical transformations (P19)
- How geom_bar() applies its statistical transformation [image.png]
ggplot(data = diamonds) + geom_bar(mapping = aes(x = cut))
ggplot(data = diamonds) + stat_count(mapping = aes(x = cut))
These two lines of code are equivalent: geom_bar() uses stat_count() to perform its statistical transformation.
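To see what the count transformation produces, the counts can be computed by hand and drawn as they are. This is a sketch of the equivalence, not a description of how ggplot2 implements it internally; cut_counts is a name introduced here.

```r
library(ggplot2)
library(dplyr)

# One row per cut, with the number of diamonds in column n
cut_counts <- count(diamonds, cut)
cut_counts

# geom_col() plots the precomputed heights without any statistical transformation
ggplot(data = cut_counts) +
  geom_col(mapping = aes(x = cut, y = n))
```

The resulting chart has the same bar heights as the geom_bar() version, because the counting has simply been moved out of the plot and into the data.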
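To show proportions instead of counts, map y to the computed prop variable. after_stat(prop) is the current spelling of the older ..prop.. notation, which ggplot2 3.4.0 deprecated; group = 1 makes each proportion relative to the whole data set:
ggplot(data = diamonds) + geom_bar(mapping = aes(x = cut, y = after_stat(prop), group = 1))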
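The group = 1 part is easy to forget. Without it, each bar forms its own group, so every proportion is computed within a single cut and equals 1. The sketch below checks this with layer_data(); it uses after_stat(prop), the newer spelling of ..prop.. available from ggplot2 3.3.0, and the names no_group and with_group are introduced here for illustration.

```r
library(ggplot2)

no_group <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, y = after_stat(prop)))
with_group <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, y = after_stat(prop), group = 1))

layer_data(no_group)$prop         # every bar has proportion 1
sum(layer_data(with_group)$prop)  # proportions add up to 1 across the bars
```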
You may want to draw attention to the statistical transformation in your code. For example, stat_summary() highlights the summary statistics you compute: it summarizes the y values for each unique value of x:
ggplot(data = diamonds) + stat_summary(mapping = aes(x = cut, y = depth),
fun.min = min, ## fun.ymin, fun.ymax and fun.y were renamed to fun.min, fun.max and fun in ggplot2 3.3.0
fun.max = max,
fun = median
)
## For each cut, shows the spread of the depth values through their minimum, maximum, and median, similar to a boxplot
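The same summary can be computed explicitly to see the numbers behind the plot. This is a hand-rolled sketch with dplyr, not what stat_summary() does internally; depth_summary and its column names are introduced here for illustration.

```r
library(ggplot2)
library(dplyr)

# Minimum, median, and maximum depth for each cut
depth_summary <- summarise(
  group_by(diamonds, cut),
  depth_min = min(depth),
  depth_median = median(depth),
  depth_max = max(depth)
)
depth_summary

# A point at the median with a line from the minimum to the maximum
ggplot(data = depth_summary) +
  geom_pointrange(mapping = aes(x = cut, y = depth_median,
                                ymin = depth_min, ymax = depth_max))
```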
1.8 Position adjustments
Color the outline of the bars
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, color = cut))
To set the fill color of the bars, use the fill aesthetic instead
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = cut))
If you map the fill aesthetic to another variable, such as clarity, the bars are stacked automatically. Each colored rectangle represents one combination of cut and clarity.
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = clarity))
The stacking is performed automatically by the position adjustment specified by the position argument. If you do not want a stacked bar chart, you can use one of three other options: "identity", "fill", or "dodge".
- position = "identity" places each object exactly where it falls in the graph. This is not very useful for bars, because the bars overlap. To make the overlap visible, either set alpha to a small value so the bars are slightly transparent, or set fill = NA so they are completely transparent:
ggplot(data = diamonds, mapping = aes(x = cut, fill = clarity)) +
geom_bar(alpha = 1/5, position = "identity")
ggplot(data = diamonds, mapping = aes(x = cut, color = clarity)) +
geom_bar(fill = NA, position = "identity")
- position = "fill" works like stacking, but makes each set of stacked bars the same height, which makes it easy to compare proportions across groups:
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = clarity), position = "fill")
- position = "dodge" places the bars of each group side by side, which makes it easy to compare the individual values:
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = clarity), position = "dodge")
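What position = "fill" displays can be reproduced by hand, which may help when the actual proportions are needed as numbers. The sketch below, assuming dplyr is available, counts each cut/clarity combination and converts the counts to shares within each cut; combo_counts, clarity_share, and share are names introduced here.

```r
library(ggplot2)
library(dplyr)

# Count each cut/clarity combination, then convert to a share within its cut
combo_counts <- count(diamonds, cut, clarity)
clarity_share <- ungroup(mutate(group_by(combo_counts, cut), share = n / sum(n)))
clarity_share

# Stacking the precomputed shares gives bars of equal height, as position = "fill" does
ggplot(data = clarity_share) +
  geom_col(mapping = aes(x = cut, y = share, fill = clarity))
```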
1.9 Coordinate systems
Coordinate systems are probably the most complicated part of ggplot2. The default is the Cartesian coordinate system, in which the x and y positions act independently to locate each point.
- coord_flip() switches the x and y axes.
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) +
geom_boxplot()
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) +
geom_boxplot() +
coord_flip()
- coord_polar() uses polar coordinates. Polar coordinates reveal an interesting connection between a bar chart and a Coxcomb chart:
bar <- ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = cut), show.legend = FALSE, width = 1) +
theme(aspect.ratio = 1) + labs(x = NULL, y = NULL)
bar
bar + coord_flip()
bar + coord_polar()
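A related use of coord_polar(), sketched here as an extra illustration that is not part of the notes above: a single stacked bar becomes a pie chart when the y axis is mapped to the angle with theta = "y". The names stacked and pie are introduced here.

```r
library(ggplot2)

# A single stacked bar: every diamond sits at the same x position
stacked <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = factor(1), fill = cut), width = 1) +
  labs(x = NULL, y = NULL)

# Mapping the y axis to the angle turns the stack into a pie chart
pie <- stacked + coord_polar(theta = "y")
pie
```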