Tables and Schema
Tables
Data is organized into named tables, or relations
Table is used to store the data
Every column has its data type:
- INTEGER, SMALLINT, TINYINT, BIGINT
- DECIMAL(n,m) : float type
- VARCHAR: string type
- BOOLEAN: logic
Schemas
The names and types of the data in a table's columns are called the table's schema.
Log in shell
mysql -u root -p psw
SQL
The most common features of SQL are:
- Projection - selecting some columns from table
- Function application - applying a function to a value
- Filtering - selecting rows based on some criterion
- Grouping - aggregating rows based on the values in a column
- Joins - combining two tables
(回想在big data中學(xué)習(xí)的實(shí)現(xiàn)各類功能的完備數(shù)學(xué)運(yùn)算集)
Query syntax
SQL queries have the form:
SELECT expression[, expression, ...]
FROM table
[... additional clauses ...]
Projection
Projection is when you ask for some(all) of the columns of a table.
也就是抽取相應(yīng)的列礼烈,keyword 負(fù)責(zé)對(duì)抽取的列中的行進(jìn)行過濾操作弧满,
比如DISTINCT
就是抽取出某一列的不重復(fù)的所有元素(類似于數(shù)學(xué)里面的集合)婆跑。
SELECT col1, col2
FROM table1
SELECT has many keywords like:
- DISTINCT - return unique results
- BETWEEN a AND b - limit the range, the values can be numbers, text, or dates
- LIKE - pattern search within the column text
- IN (a, b, c) - check if the value is contained among given.
SELECT DISTINCT Facility
FROM Admissions;
SELECT *
FROM table1
Function application
在列上施加函數(shù),可以構(gòu)造新的列名
SELECT Facility,
Lengthofstay * 24
FROM Admissions;
SELECT UPPER(Facility),
Lengthofstay * IllnessCode
FROM Admissions;
SELECT Facility,
Lengthofstay * 24 AS stayinhours
FROM Admissions;
Filtering
相當(dāng)于對(duì)行施加函數(shù)犀忱,抽取出部分的行, 關(guān)鍵詞為WHERE
.多個(gè)條件語(yǔ)句之間用邏輯符OR AND
隔開
SELECT *
FROM Admissions
WHERE Lengthofstay > 4;
SELECT Facility, IllnessCode
FROM Admissions
WHERE IllnessCode = 141 OR
IllnessCode = 94;
SELECT Facility, Lengthofstay
FROM Admissions
WHERE Lengthofstay > 2 AND
Lengthofstay < 6;
Aggregation function
Aggregate the rows according to a column, and perform operarions on the aggregated rows. 使用keyword :GROUP BY
Grouping 操作也就是匯總通常與聚合函數(shù)(共有40多種聚合函數(shù))一起使用扶关,常見的聚合函數(shù):
- COUNT - return the number of rows
- SUM - cumulate the values
- AVG - return the average for the group
- MIN / MAX - smallest / largest value
SELECT Facility,
MAX(Lengthofstay)
FROM Admissions
GROUP BY Facility;
SELECT Facility,
MIN(Lengthofstay),
MAX(Lengthofstay),
AVG(Lengthofstay)
FROM Admissions
GROUP BY Facility;
SELECT COUNT(*) FROM Admissions;
可以將過濾操作和GROUP BY 操作放到一起同時(shí)執(zhí)行,但是注意過濾操作要放到前面搀庶。
Sort the result
對(duì)結(jié)果排序,關(guān)鍵詞 :ORDER BY
SELECT Facility, Lengthofstay
FROM Admissions
ORDER BY Lengthofstay;
SELECT Facility, Lengthofstay
FROM Admissions
ORDER BY Facility, Lengthofstay
Joins
鏈接兩個(gè)或者多個(gè)表
SELECT * FROM Admissions JOIN Illnesses
ON Admissions.IllnessCode = Illnesses.Code;
Join 操作可以寫的非常的復(fù)雜,同時(shí)join分為 Left Outer Join
, Inner join
and right Outer join
UPDATE t1 SET a = 1
FROM table1 t1 JOIN table2 t2 ON t1.id = t2.t1_id WHERE t1.col1 = 0 AND t2.col2 IS NULL;
LIMIT clause
從返回的結(jié)果中顯示有限的元素
SELECT Facility, IllnessCode
FROM Admissions
LIMIT 2
Database and tables
Basic operations
CREATE DATABASE database name;
DROP DATABASE database name;
SHOW DATABASES;
USE database name;
SHOW TABLES;
DESCRIBE table-name;
tables
DROP TABLE IF EXISTS plate_types;
CREATE TABLE plate_types(
code CHAR(3) NOT NULL,
description VARCHAR(27),
PRIMARY KEY(code)
);
You may notice that the return of a SELECT statement is actually another table. So you may use it as a subqueries anywhere you need a table
-- create a table from another select-statement
CREATE TABLE longstays
AS SELECT facility, MAX(lengthofstay)
FROM hospital
GROUP BY facility;
Deleting rows and tables
DROP TABLE table-name;
# just delete the data but leave the table(as a schema)
TRUNCATE TABLE table-name;
# Delete rows
DELETE FROM table-name
WHERE conditions;
add or update tables
-- update speci c data with the WHERE clause
UPDATE table1 SET col1 = 1
WHERE col2 = 2
-- insert values manually
INSERT INTO table1 (ID, FIRST_NAME, LAST_NAME)
VALUES (1, ‘Rebel’, ‘Labs’);
-- or by using the results of a query
INSERT INTO table1 (ID, FIRST_NAME, LAST_NAME)
SELECT id, last_name, rst_name FROM table2
-- load data
LOAD DATA LOCAL INFILE '/Users/xiaodiu/Google/Big Data/assignment2/data/plate-types.csv'
INTO TABLE plate_types
COLUMNS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY '\n';