Personal template, updated whenever I get around to it
cheat sheet
Reshape Data
- Easier to use than reshape
pivot_longer()
- Wide to long
# Simplest case where column names are character data
relig_income
#> # A tibble: 18 × 11
#> religion `<$10k` `$10-20k` `$20-30k` `$30-40k` `$40-50k` `$50-75k`
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Agnostic 27 34 60 81 76 137
#> 2 Atheist 12 27 37 52 35 70
#> 3 Buddhist 27 21 30 34 33 58
#> 4 Catholic 418 617 732 670 638 1116
#> 5 Don’t know/r… 15 14 15 11 10 35
#> 6 Evangelical … 575 869 1064 982 881 1486
#> 7 Hindu 1 9 7 9 11 34
#> 8 Historically… 228 244 236 238 197 223
#> 9 Jehovah's Wi… 20 27 24 24 21 30
#> 10 Jewish 19 19 25 25 30 95
#> 11 Mainline Prot 289 495 619 655 651 1107
#> 12 Mormon 29 40 48 51 56 112
#> 13 Muslim 6 7 9 10 9 23
#> 14 Orthodox 13 17 23 32 32 47
#> 15 Other Christ… 9 7 11 13 13 14
#> 16 Other Faiths 20 33 40 46 49 63
#> 17 Other World … 5 2 3 4 2 7
#> 18 Unaffiliated 217 299 374 365 341 528
#> # … with 4 more variables: `$75-100k` <dbl>, `$100-150k` <dbl>,
#> # `>150k` <dbl>, `Don't know/refused` <dbl>
relig_income %>%
pivot_longer(!religion, names_to = "income", values_to = "count")
#> # A tibble: 180 × 3
#> religion income count
#> <chr> <chr> <dbl>
#> 1 Agnostic <$10k 27
#> 2 Agnostic $10-20k 34
#> 3 Agnostic $20-30k 60
#> 4 Agnostic $30-40k 81
#> 5 Agnostic $40-50k 76
#> 6 Agnostic $50-75k 137
#> 7 Agnostic $75-100k 122
#> 8 Agnostic $100-150k 109
#> 9 Agnostic >150k 84
#> 10 Agnostic Don't know/refused 96
#> # … with 170 more rows
pivot_wider()
- Long to wide
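A minimal sketch of the long-to-wide direction. The tibble `long` and its column names are made up for illustration; only `pivot_wider()` and its `names_from` / `values_from` arguments come from tidyr.

```r
library(tidyr)

# Hypothetical long-format data: one row per (id, key) pair
long <- tibble(
  id    = c(1, 1, 2, 2),
  key   = c("height", "weight", "height", "weight"),
  value = c(170, 65, 180, 80)
)

# Each distinct value in `key` becomes its own column, filled from `value`
wide <- long %>%
  pivot_wider(names_from = key, values_from = value)

wide  # 2 rows, columns: id, height, weight
```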
Split Cell
unite()
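unite() goes the other way from separate(): it pastes several columns into one. A small sketch, assuming hypothetical columns `year` and `month`:

```r
library(tidyr)

# Hypothetical data with the date split over two character columns
dates <- tibble(year = c("2020", "2021"), month = c("01", "12"))

# Combine `year` and `month` into a single column `ym`, joined by "-"
united <- dates %>% unite("ym", year, month, sep = "-")

united$ym  # "2020-01" "2021-12"
```

By default the input columns are dropped; pass `remove = FALSE` to keep them.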
separate()
- Lets you choose the separator used to split a column
sep = "\\\\"  # a literal backslash; sep is a regex, so the backslash is escaped twice
- Names for the columns produced by the split
into = c("", "")
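Putting `sep` and `into` together, a minimal sketch. The data and the column names `letter` and `number` are assumptions chosen for illustration:

```r
library(tidyr)

# Hypothetical column holding two values joined by "-"
df_sep <- tibble(x = c("a-1", "b-2", "c-3"))

# Split `x` at "-" and name the two resulting columns
parts <- df_sep %>%
  separate(x, into = c("letter", "number"), sep = "-")

parts  # columns: letter ("a", "b", "c") and number ("1", "2", "3")
```

The new columns stay character unless `convert = TRUE` is supplied.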
separate_rows()
- Returns the split values in long form, one row per value
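A short sketch of the long-form result, assuming a hypothetical `tags` column with comma-separated values:

```r
library(tidyr)

# Hypothetical data: several values packed into one cell
df_tags <- tibble(id = c(1, 2), tags = c("a,b", "c"))

# Each value gets its own row; `id` is repeated as needed
tag_rows <- df_tags %>% separate_rows(tags, sep = ",")

tag_rows  # 3 rows: (1, "a"), (1, "b"), (2, "c")
```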
Nested Data Frame
Definition
Nesting is an alternative representation of grouped data in which each group becomes a single row containing a nested data frame. See vignette("nest") for more details and examples.
- In other words, a list is placed inside a single cell, producing a table within a table
df <- tibble(x = c(1, 1, 1, 2, 2, 3), y = 1:6, z = 6:1)
# Note that we get one row of output for each unique combination of
# non-nested variables
df %>% nest(data = c(y, z))
#> # A tibble: 3 × 2
#> x data
#> <dbl> <list>
#> 1 1 <tibble [3 × 2]>
#> 2 2 <tibble [2 × 2]>
#> 3 3 <tibble [1 × 2]>
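The nesting above can be undone with unnest(). A quick round-trip sketch that rebuilds the same `df` so the block runs on its own; the names `nested` and `roundtrip` are just for illustration:

```r
library(tidyr)

df <- tibble(x = c(1, 1, 1, 2, 2, 3), y = 1:6, z = 6:1)

# Pack y and z into a list-column, one row per value of x
nested <- df %>% nest(data = c(y, z))

# Flatten the list-column back out into regular columns
roundtrip <- nested %>% unnest(data)

nrow(nested)     # 3
nrow(roundtrip)  # 6
```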
Expand Data Frame
extract()
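- Supports regular expressions
- The number of column names you supply sets how many columns are extracted (one per capture group in the regex)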
df <- data.frame(x = c(NA, "a-b", "a-d", "b-c", "d-e"))
df %>% extract(x, "A")
#> A
#> 1 <NA>
#> 2 a
#> 3 a
#> 4 b
#> 5 d
df %>% extract(x, c("A", "B"), "([[:alnum:]]+)-([[:alnum:]]+)")
#> A B
#> 1 <NA> <NA>
#> 2 a b
#> 3 a d
#> 4 b c
#> 5 d e
# If no match, NA:
df %>% extract(x, c("A", "B"), "([a-d]+)-([a-d]+)")
#> A B
#> 1 <NA> <NA>
#> 2 a b
#> 3 a d
#> 4 b c
#> 5 <NA> <NA>
To generate column names that share a prefix, consider using paste()
e.g. paste("prefix", -100:99, sep = "-")
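As a rough sketch of how such generated names might be used, here they are passed as `into` for separate(). The data and the `1:3` range are assumptions for illustration:

```r
library(tidyr)

# Build three column names that share a prefix
cols <- paste("prefix", 1:3, sep = "-")
cols  # "prefix-1" "prefix-2" "prefix-3"

# Hypothetical data with three values per cell
df_pre <- tibble(x = c("a_b_c", "d_e_f"))

# Use the generated names for the split columns
split_pre <- df_pre %>% separate(x, into = cols, sep = "_")
```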
unnest
unnest_longer()
- Expands each element of a list-column into its own row
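A minimal sketch, assuming a hypothetical list-column `vals` whose elements have different lengths:

```r
library(tidyr)

# Hypothetical list-column: first cell holds two values, second holds one
df_list <- tibble(id = c(1, 2), vals = list(c(10, 20), 30))

# One output row per element of `vals`
longer <- df_list %>% unnest_longer(vals)

longer  # 3 rows: (1, 10), (1, 20), (2, 30)
```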
unnest_wider()
- The number of new columns depends on the number of elements
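A minimal sketch, assuming a hypothetical list-column `info` whose elements are named lists with the same two fields:

```r
library(tidyr)

# Hypothetical list-column: each cell is a named list with fields a and b
df_info <- tibble(
  id   = c(1, 2),
  info = list(list(a = 1, b = "x"), list(a = 2, b = "y"))
)

# Each named element becomes its own column
wider <- df_info %>% unnest_wider(info)

wider  # columns: id, a, b
```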