Data in relational database systems is stored in a normalized format, so statistical computation directly in the database requires advanced and complex SQL queries. R, however, can easily connect to many relational databases, such as MySQL, Oracle, and SQL Server, and fetch records from them as data frames. Once the data has been read from the database into the R environment, it becomes a normal R data set that can be manipulated or analyzed with any R package or function.
In this tutorial we will use R to connect to a MySQL database.
RMySQL Package
R has a package named RMySQL, available from CRAN, that provides native connectivity to MySQL databases. You can install it in the R environment with the following command.
install.packages("RMySQL")
Connecting R to MySQL
Once the RMySQL package is installed, we create a connection object in R to connect to the database. It takes the user name, password, database name, and host name as input.
library("RMySQL")
# Create a connection object to the MySQL database.
# We connect to a sample database named "testdb"; replace the credentials with your own.
mysqlconnection = dbConnect(MySQL(), user = 'root', password = '123456', dbname = 'testdb',
   host = 'localhost')
# List the tables available in this database.
dbListTables(mysqlconnection)
When we execute the above code, it produces the following result (all tables in the current database):
[1] "articles" "contacts" "demos" "divisions"
[5] "items" "luxuryitems" "order" "persons"
[9] "posts" "revenues" "special_isnull" "t"
[13] "tbl" "tmp" "v1" "vparts"
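Hard-coding credentials in every script is easy to get wrong. As a minimal sketch, the connection settings could be read from environment variables and the connection closed automatically when the work is done. The helper names `mysql_config` and `with_mysql` and the `MYSQL_*` variable names are assumptions for illustration, not part of RMySQL.

```r
# Hypothetical helper: read connection settings from environment variables,
# falling back to the values used in this tutorial.
mysql_config <- function() {
  list(
    user     = Sys.getenv("MYSQL_USER", unset = "root"),
    password = Sys.getenv("MYSQL_PASSWORD", unset = ""),
    dbname   = Sys.getenv("MYSQL_DBNAME", unset = "testdb"),
    host     = Sys.getenv("MYSQL_HOST", unset = "localhost")
  )
}

# Hypothetical helper: open a connection, run work(con), and always
# disconnect afterwards, even if work() raises an error.
with_mysql <- function(work) {
  cfg <- mysql_config()
  con <- DBI::dbConnect(RMySQL::MySQL(), user = cfg$user, password = cfg$password,
                        dbname = cfg$dbname, host = cfg$host)
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  work(con)
}
```

For example, `with_mysql(function(con) DBI::dbListTables(con))` would list the tables and then disconnect.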
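Querying the Tables
We can query database tables in MySQL using the dbSendQuery() function. The query is executed in MySQL, the result set is retrieved with R's fetch() function, and the result is finally stored as a data frame in R.
Suppose the table to query is persons, with the following creation statement and data: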
/*
Navicat MySQL Data Transfer
Source Server : localhost-57
Source Server Version : 50709
Source Host : localhost:3306
Source Database : testdb
Target Server Type : MYSQL
Target Server Version : 50709
File Encoding : 65001
Date: 2017-08-24 00:35:17
*/
SET FOREIGN_KEY_CHECKS=0;
-- ----------------------------
-- Table structure for `persons`
-- ----------------------------
DROP TABLE IF EXISTS `persons`;
CREATE TABLE `persons` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`full_name` varchar(255) NOT NULL,
`date_of_birth` date NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8;
-- ----------------------------
-- Records of persons
-- ----------------------------
INSERT INTO `persons` VALUES ('1', 'John Doe', '1990-01-01');
INSERT INTO `persons` VALUES ('2', 'David Taylor', '1989-06-06');
INSERT INTO `persons` VALUES ('3', 'Peter Drucker', '1988-03-02');
INSERT INTO `persons` VALUES ('4', 'Lily Minsu', '1992-05-05');
INSERT INTO `persons` VALUES ('5', 'Mary William', '1995-12-01');
Import the above table into the database, then use the following R code to query data from it:
library("RMySQL")
# Create a connection object to the MySQL database.
# We connect to a sample database named "testdb"; replace the credentials with your own.
mysqlconnection = dbConnect(MySQL(), user = 'root', password = '123456', dbname = 'testdb',
   host = 'localhost')
# Query the "persons" table to get all the rows.
result = dbSendQuery(mysqlconnection, "select * from persons")
# Store the result in an R data frame object. n = 5 fetches the first 5 rows.
# The variable is named persons_df so it does not shadow the base function data.frame().
persons_df = fetch(result, n = 5)
print(persons_df)
Executing the example code above gives the following result:
id full_name date_of_birth
1 1 John Doe 1990-01-01
2 2 David Taylor 1989-06-06
3 3 Peter Drucker 1988-03-02
4 4 Lily Minsu 1992-05-05
5 5 Mary William 1995-12-01
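A result set returned by dbSendQuery() stays open until it is cleared. The following sketch wraps the query above in a hypothetical helper, `read_persons`, that always releases the result with dbClearResult(); it uses dbFetch(), the DBI counterpart of fetch(). When the whole result fits in memory, dbGetQuery() performs the send, fetch, and clear steps in one call. Both helper names are assumptions for illustration.

```r
# Hypothetical helper: fetch up to n rows from persons (n = -1 means all rows)
# and release the result set even if fetching fails.
read_persons <- function(con, n = -1) {
  result <- DBI::dbSendQuery(con, "select * from persons")
  on.exit(DBI::dbClearResult(result), add = TRUE)
  DBI::dbFetch(result, n = n)
}

# One-call alternative for results that fit comfortably in memory.
read_persons_simple <- function(con) {
  DBI::dbGetQuery(con, "select * from persons")
}
```

Both helpers expect an open connection such as mysqlconnection from the example above.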
Querying with a Filter Clause
We can pass any valid SELECT query to get the result, as shown in the following code:
library("RMySQL")
# Create a connection object to the MySQL database.
# We connect to a sample database named "testdb"; replace the credentials with your own.
mysqlconnection = dbConnect(MySQL(), user = 'root', password = '123456', dbname = 'testdb',
   host = 'localhost')
result = dbSendQuery(mysqlconnection, "select * from persons where date_of_birth = '1990-01-01'")
# Fetch all the records (with n = -1) and store them as a data frame.
persons_df = fetch(result, n = -1)
print(persons_df)
When we execute the above code, it produces the following result:
id full_name date_of_birth
1 1 John Doe 1990-01-01
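When the filter value comes from user input, pasting it into the SQL string invites quoting bugs and SQL injection. As a sketch, DBI's sqlInterpolate() fills a named placeholder with a safely quoted value. Here DBI::ANSI() stands in for a real connection so that the snippet runs without a server (an assumption for illustration); in practice, pass mysqlconnection so the driver's own quoting rules apply.

```r
library(DBI)

# Build the filtered query with a named placeholder instead of paste().
template <- "select * from persons where date_of_birth = ?dob"
query <- sqlInterpolate(ANSI(), template, dob = "1990-01-01")
print(query)

# A value containing a quote is escaped rather than terminating the string.
tricky <- sqlInterpolate(ANSI(),
                         "select * from persons where full_name = ?name",
                         name = "Conan O'Brien")
print(tricky)
```

The resulting object can be passed to dbSendQuery() or dbGetQuery() in place of the literal query string.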
Updating Rows in the Tables
We can update rows in a MySQL table by passing an update query to the dbSendQuery() function.
dbSendQuery(mysqlconnection, "update persons set date_of_birth = '1999-01-01' where id=3")
After executing the above code, we can see that the corresponding record in the persons table has been updated in MySQL.
Inserting Data into the Tables
The following code inserts a row:
library("RMySQL")
# Create a connection object to the MySQL database.
# We connect to a sample database named "testdb"; replace the credentials with your own.
mysqlconnection = dbConnect(MySQL(), user = 'root', password = '123456', dbname = 'testdb',
   host = 'localhost')
# The id column is AUTO_INCREMENT, so only the other two columns are supplied.
dbSendQuery(mysqlconnection,
   "insert into persons(full_name, date_of_birth) values ('Maxsu', '1992-01-01')"
)
After executing the above code, we can see a new row inserted into the persons table in MySQL.
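For statements that modify data, such as the update and insert shown above, DBI also provides dbExecute(), which runs the statement, releases the result, and returns the number of affected rows. The sketch below assumes that the installed DBI and RMySQL versions support dbExecute() and sqlInterpolate() on a MySQL connection; `add_person` is a hypothetical helper name.

```r
# Hypothetical helper: insert one person and return the number of rows added.
# Values are quoted by sqlInterpolate() rather than pasted into the SQL text.
add_person <- function(con, full_name, date_of_birth) {
  statement <- DBI::sqlInterpolate(
    con,
    "insert into persons(full_name, date_of_birth) values (?name, ?dob)",
    name = full_name, dob = date_of_birth
  )
  DBI::dbExecute(con, statement)
}
```

A call such as `add_person(mysqlconnection, "Maxsu", "1992-01-01")` would be expected to report one affected row on success.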
Creating Tables in MySQL
We can create tables in MySQL using the dbWriteTable() function. It takes a data frame as input and, with overwrite = TRUE, replaces the table if it already exists.
library("RMySQL")
# Create the connection object to the testdb database where we want to create the table.
mysqlconnection = dbConnect(MySQL(), user = 'root', password = '123456', dbname = 'testdb',
   host = 'localhost')
# Use the R data frame "mtcars" to create the table in MySQL.
# All the rows of mtcars are written into MySQL.
dbWriteTable(mysqlconnection, "mtcars", mtcars[, ], overwrite = TRUE)
After executing the above code, we can see a table named mtcars created in the MySQL database and populated with the data.
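To confirm the write from R rather than from a MySQL client, the table can be checked and read back. This is a sketch using the DBI functions dbExistsTable() and dbReadTable(); the helper name `check_table` is hypothetical.

```r
# Hypothetical helper: stop if the table is missing,
# otherwise return its first n rows as a data frame.
check_table <- function(con, table, n = 6) {
  if (!DBI::dbExistsTable(con, table)) {
    stop("table not found: ", table)
  }
  head(DBI::dbReadTable(con, table), n)
}
```

With the connection from the example above, this would be called as `check_table(mysqlconnection, "mtcars")`.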
Dropping Tables in MySQL
We can drop a table in the MySQL database by passing a drop table statement to the dbSendQuery() function, in the same way we used it to query data from tables.
dbSendQuery(mysqlconnection, 'drop table if exists mtcars')
After executing the above code, we can see that the mtcars table has been dropped from the MySQL database.
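DBI also offers dbRemoveTable() as a function-level alternative to sending the drop statement, and dbDisconnect() to close the connection once the work is done. A minimal sketch, with `drop_and_close` as a hypothetical helper name:

```r
# Hypothetical helper: remove the table if it exists, then close the connection.
# The connection is closed even if removing the table fails.
drop_and_close <- function(con, table) {
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  if (DBI::dbExistsTable(con, table)) {
    DBI::dbRemoveTable(con, table)
  }
  invisible(NULL)
}
```

Calling `drop_and_close(mysqlconnection, "mtcars")` at the end of a session would leave neither the temporary table nor an open connection behind.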