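Introduction
PCoA, or principal co-ordinates analysis, is an unconstrained dimensionality-reduction method used to study the similarities and differences among samples. It resembles PCA, but PCoA works from the sample-to-sample distances as a whole, which suits the characteristics of ecological data better, and it is more widely used there.
PCoA first sorts the eigenvalues and their eigenvectors, then keeps the few largest eigenvalues and projects the samples onto the corresponding axes of a coordinate system. The result amounts to a rotation of the distance matrix that preserves the original between-sample distances as far as possible in a low-dimensional space: similar samples lie closer together in the plot, and dissimilar samples lie farther apart.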
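The sort-select-project procedure can be sketched by hand in base R. This is a minimal illustration on four hypothetical samples (`coords` and the names `S1` to `S4` are made up for the example), assuming plain Euclidean distances. It follows the classical scaling recipe and compares the result with `cmdscale`.

```r
# Toy Euclidean distances between four hypothetical samples on a plane.
coords <- matrix(c(0, 0,
                   3, 0,
                   3, 4,
                   0, 4), ncol = 2, byrow = TRUE,
                 dimnames = list(paste0("S", 1:4), NULL))
d <- dist(coords)

# Step 1: double-centre the squared distance matrix.
D2 <- as.matrix(d)^2
n <- nrow(D2)
J <- diag(n) - matrix(1 / n, n, n)
B <- -0.5 * J %*% D2 %*% J

# Step 2: eigen-decompose; for a symmetric matrix eigen() returns the
# eigenvalues in decreasing order, so they are already sorted.
e <- eigen(B, symmetric = TRUE)

# Step 3: keep the leading axes and scale each eigenvector by the square
# root of its eigenvalue to obtain sample coordinates.
k <- 2
manual_points <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))

# The low-dimensional coordinates reproduce the original distances.
max_error <- max(abs(as.matrix(dist(manual_points)) - as.matrix(d)))

# Built-in implementation for comparison (axes may differ in sign).
builtin <- cmdscale(d, k = k, eig = TRUE)
```

Because the toy samples really do lie in two dimensions, two axes recover the distances essentially exactly. With real community data the leading axes only approximate them.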
Example
PCoA plot showing the difference between sample groups, stratified to only samples originating from participants not receiving any topical treatment. Pvalue corresponds to Adonis PERMANOVA test. Ellipses delineate the 75% prediction areas of samples from each group.
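Script
Data format:
- The OTU abundance data is an ordinary OTU table or an annotated OTU abundance table, with one row per OTU and one column per sample.
- The grouping data assigns a group to each sample, with one entry per sample.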
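The two inputs can be pictured with a small made-up example. All values, OTU names and group labels below are hypothetical; only the column names `Taxonomy`, `Sample_name` and `Group3` are taken from the script. The sketch also shows why the table is transposed before computing distances, and spells out the Bray-Curtis formula that `vegdist(method = 'bray')` applies to each pair of samples.

```r
# Hypothetical OTU table in the layout described above:
# one row per OTU, one column per sample.
otu_table <- data.frame(
  Taxonomy = c("OTU_a", "OTU_b", "OTU_c"),
  S1 = c(10, 0, 5),
  S2 = c(8, 2, 5),
  S3 = c(0, 12, 3)
)

# Hypothetical grouping table: one row per sample.
groups <- data.frame(
  Sample_name = c("S1", "S2", "S3"),
  Group3 = c("control", "control", "treated")
)

# Distance functions expect samples in rows, so transpose the abundances.
otu <- t(as.matrix(otu_table[, -1]))
colnames(otu) <- otu_table$Taxonomy

# Every sample in the abundance table should have a group label.
stopifnot(setequal(rownames(otu), groups$Sample_name))

# Bray-Curtis dissimilarity between two samples: sum(|x - y|) / sum(x + y).
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)
bc_s1_s2 <- bray_curtis(otu["S1", ], otu["S2", ])
bc_s1_s3 <- bray_curtis(otu["S1", ], otu["S3", ])
```

Here `S1` and `S2` share most of their abundance and get a small dissimilarity, while `S1` and `S3` are dominated by different OTUs and get a large one.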
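Interpreting the analysis result (distances come from vegan's vegdist; the ordination object is returned by cmdscale): eig holds the eigenvalues of the ordination axes, and dividing each by the sum of the eigenvalues gives the proportion explained by that axis; points holds the coordinates of each sample on each ordination axis.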
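A small sketch of reading those two components, using made-up Euclidean coordinates for five hypothetical samples so that it runs without any input files. With non-Euclidean dissimilarities such as Bray-Curtis some eigenvalues can be negative, and how to treat them in the denominator is a separate choice not covered here.

```r
# Hypothetical coordinates for five samples; dist() gives Euclidean distances.
coords <- matrix(c(0, 0, 0,
                   4, 0, 0,
                   4, 3, 0,
                   0, 3, 1,
                   2, 1, 2), ncol = 3, byrow = TRUE,
                 dimnames = list(paste0("S", 1:5), NULL))
pcoa <- cmdscale(dist(coords), k = 2, eig = TRUE)

# eig: one eigenvalue per axis, largest first.
eig <- pcoa$eig

# Percentage explained by each axis, as in the axis labels of the script.
explained <- 100 * eig / sum(eig)

# points: one row per sample, one column per retained axis.
sample_coords <- pcoa$points
print(round(explained[1:2], 1))
print(sample_coords)
```

The first two entries of `explained` are the numbers that end up in the axis labels of the plot.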
library(readxl)
library(ggplot2)
library(ggpubr)  # assumed source of stat_chull(), which the plot below calls
library(patchwork)
library(tidyverse)
rm(list = ls())
file <- "C:\\Users\\...total_data\\"
genes_abundance <- read.table(file = paste0(file, "otu_table_g_relative.xls"),
header = TRUE, stringsAsFactors = FALSE)
genes_abundance <- genes_abundance[-ncol(genes_abundance)]
str(genes_abundance)
which(duplicated(genes_abundance$Taxonomy))  # indices of duplicated taxa; should be empty before using them as row names
groups <- read_xls(path = paste0(file, "the_information_of_sample_site.xls"),
sheet = 3)
row.names(genes_abundance) <- genes_abundance$Taxonomy
otu <- genes_abundance[-1]
otu <- data.frame(t(otu))
head(otu)
# Ordination (based on the OTU abundance table)
library(vegan)
distance <- vegdist(otu, method = 'bray')
pcoa <- cmdscale(distance, k = (nrow(otu) - 1), eig = TRUE)
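# Extract data for plotting ------------------------------------------------
# Sample coordinates (points holds each sample's coordinates on each ordination axis)
# First two axes
plot_data <- data.frame(pcoa$points)[1:2]
# Store the sample names (row names) as a column for the merge below.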
plot_data$Sample_name <- rownames(plot_data)
names(plot_data)[1:2] <- c('PCoA1', 'PCoA2')
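# eig holds the eigenvalues of the ordination axes (dividing by their sum gives the proportion explained by each axis)
eig <- pcoa$eig
# Add the group information to the sample coordinates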
plot_data <- merge(plot_data, groups, by = 'Sample_name', all.x = TRUE)
# Plot principal coordinate axes 1 and 2
ggplot(data = plot_data, aes(x=PCoA1, y=PCoA2, color=Group3)) +
geom_point(alpha=.7, size=2) +
stat_chull(fill =NA) +
labs(x=paste("PCoA 1 (", format(100 * eig[1] / sum(eig), digits=4), "%)", sep=""),
y=paste("PCoA 2 (", format(100 * eig[2] / sum(eig), digits=4), "%)", sep=""))
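The outlines drawn around each group can be understood with base R alone. The sketch below assumes that `stat_chull` is the convex-hull layer from ggpubr and mimics it with `chull()` on made-up coordinates; the data frame values and the group labels `A` and `B` are hypothetical.

```r
# Hypothetical ordination coordinates with a grouping column.
plot_data <- data.frame(
  PCoA1 = c(0, 1, 1, 0, 0.5, 3, 4, 3.5),
  PCoA2 = c(0, 0, 1, 1, 0.5, 3, 3, 4),
  Group3 = c(rep("A", 5), rep("B", 3))
)

# chull() returns the row indices of the points lying on the convex hull;
# applying it per group gives one outline per group.
hull_rows <- lapply(split(plot_data, plot_data$Group3), function(g) {
  g[chull(g$PCoA1, g$PCoA2), ]
})

# The interior point (0.5, 0.5) of group A is not part of its hull.
print(hull_rows$A)
print(hull_rows$B)
```

A hull encloses every point of its group, so a single outlying sample stretches the outline. Prediction ellipses, as in the example figure, are less sensitive to that.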
Reference
Ring HC, Thorsen J, Saunte DM, Lilje B, Bay L, Riis PT, Larsen N, Andersen LO, Nielsen HV, Miller IM, Bjarnsholt T, Fuursted K, Jemec GB. The Follicular Skin Microbiome in Patients With Hidradenitis Suppurativa and Healthy Controls. JAMA Dermatol. 2017 Sep 1;153(9):897-905. doi: 10.1001/jamadermatol.2017.0904.