Week 1 - Day 2
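Earlier we did some simple data analysis. Next, let's look at how to turn that data into charts, which are easier to read at a glance. After all, one of a data scientist's jobs is to communicate insights from data analysis, and charts are a very effective way to do that.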
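Boxplot
First, install and load the ggplot2 package. (Note that boxplot(), hist(), abline(), barplot() and text(), used below, all come from base R graphics, so they work even without ggplot2.)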
install.packages("ggplot2")
library("ggplot2")
boxplot(z$new_column, col="blue")
The Bulls' best winning rate is 0.88 and their worst is 0.18. The median winning rate is 0.5.
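The numbers read off the boxplot can also be checked numerically. Here is a minimal sketch, assuming a hypothetical vector `win_rate` of made-up values standing in for `z$new_column` (the real data is not reproduced in these notes):

```r
# Hypothetical winning rates; a stand-in for z$new_column, not real data
win_rate <- c(0.18, 0.35, 0.45, 0.50, 0.55, 0.62, 0.88)

# plot = FALSE returns the boxplot statistics without drawing anything
bp_stats <- boxplot(win_rate, plot = FALSE)

# Rows of $stats: lower whisker, lower hinge, median, upper hinge, upper whisker.
# Whisker ends exclude points flagged as outliers, so they may differ from min/max.
five_num <- bp_stats$stats[, 1]
print(five_num)

# Minimum, maximum and median computed directly from the data
rate_range <- range(win_rate)
rate_median <- median(win_rate)
print(rate_range)
print(rate_median)
```

Any points the boxplot treats as outliers are listed in `bp_stats$out`, which is worth checking when the whisker ends and `range()` disagree.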
Histogram
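From the histogram, the winning rates are mostly concentrated between 0.4 and 0.6. If we want finer granularity, we can break the bars into smaller intervals with the breaks argument: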
hist(z$new_column, col="green", breaks = 15)
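To see what `breaks` actually changes, the bins can be inspected without drawing. This is a rough sketch, again using a hypothetical `win_rate` vector of made-up values in place of `z$new_column`. Keep in mind that a single number passed to `breaks` is only a suggestion: R rounds the breakpoints to "pretty" values, so the resulting bin count may not be exactly 15.

```r
# Hypothetical winning rates; a stand-in for z$new_column, not real data
win_rate <- c(0.18, 0.35, 0.45, 0.50, 0.55, 0.62, 0.88)

# plot = FALSE returns the histogram object instead of drawing it
h_default <- hist(win_rate, plot = FALSE)
h_fine <- hist(win_rate, breaks = 15, plot = FALSE)

# Compare how many bins each call produced
print(length(h_default$counts))
print(length(h_fine$counts))

# Breakpoints and per-bin counts of the finer histogram
print(h_fine$breaks)
print(h_fine$counts)
```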
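You can also draw your own reference lines on a plot. For example, a horizontal line (h) on the boxplot:
boxplot(z$new_column, col="blue")
abline(h=0.78)
Or a vertical line (v) on the histogram:
hist(z$new_column, col="green", breaks = 15)
abline(v = 0.5, lwd = 2)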
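Instead of hard-coding the position, the reference line can be computed from the data. A small sketch under the same assumption as before (`win_rate` is a hypothetical vector of made-up values, not the real `z$new_column`):

```r
# Hypothetical winning rates; a stand-in for z$new_column, not real data
win_rate <- c(0.18, 0.35, 0.45, 0.50, 0.55, 0.62, 0.88)

# Compute the position of the reference line from the data
ref_line <- median(win_rate)

# Draw the histogram, then add a dashed vertical line at the median
hist(win_rate, col = "green", breaks = 15)
abline(v = ref_line, lwd = 2, lty = 2)
```

`abline()` adds to whatever plot is currently open, so it must be called after `hist()` or `boxplot()`.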
What if we want the output by year, that is, as a time series?
Use barplot.
First, draw the plot:
bp <- barplot(mydata$new_column, col="wheat", main ="Chicago Bulls Historical Winning Rate", names.arg =mydata$year)
Then label each bar with its value:
text(bp, mydata$new_column, labels = round(mydata$new_column, 2), col="black", cex = 0.8, pos = 3, offset = 0.1)
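The two steps work together because `barplot()` returns the x-coordinates of the bar midpoints, which `text()` then uses to position the labels. Below is a self-contained sketch of the same pattern, assuming a hypothetical data frame `mydata_demo` with made-up years and values (not the real Bulls data). The `ylim` padding is an addition of mine, so the labels above the tallest bar are not clipped.

```r
# Hypothetical data frame; a stand-in for mydata, values are made up
mydata_demo <- data.frame(
  year = 2001:2005,
  new_column = c(0.25, 0.40, 0.55, 0.62, 0.48)
)

# barplot() draws the bars and returns the x-coordinate of each bar's midpoint
bp <- barplot(
  mydata_demo$new_column,
  col = "wheat",
  main = "Demo Winning Rate by Year",
  names.arg = mydata_demo$year,
  ylim = c(0, max(mydata_demo$new_column) * 1.15)  # headroom for the labels
)

# pos = 3 places each label just above its (x, y) point, i.e. on top of the bar
text(bp, mydata_demo$new_column,
     labels = round(mydata_demo$new_column, 2),
     col = "black", cex = 0.8, pos = 3, offset = 0.1)
```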
This is a very helpful article on labeling bars:
http://www.talkstats.com/archive/index.php/t-24754.html?s=59e14b595125d3fb8910e0e6ee80a585
yourgraph<-barplot( blah blah blah....)
# use text to add freq on the top of the bars
text(yourgraph, d.f$freq, labels=d.f$freq)
http://www.ats.ucla.edu/stat/r/faq/barplotplus.htm