Chapter 7: Databases and AWS
- B. Amazon RDS is best suited for traditional OLTP transactions. Amazon Redshift, on the other hand, is designed for OLAP workloads. Amazon Glacier is designed for cold archival storage.
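- Traditional OLTP workloads generally use Amazon RDS as the database. RDS supports the following engines:
- Aurora engine: compatible with MySQL and PostgreSQL; up to 5x the throughput of MySQL and up to 3x the throughput of PostgreSQL; 64 TB of storage; six-way replication across three AZs; up to 15 read replicas with replica lag of no more than about 10 ms; failure monitoring, with failover completed within 30 seconds.
- MySQL engine: supports cross-region read replicas, up to 32 vCPUs and 244 GB of memory, 16 TB of storage, automated backups, and point-in-time recovery.
- MariaDB engine: supports cross-region read replicas, up to 32 vCPUs and 244 GB of memory, 16 TB of storage, automated backups, and point-in-time recovery, plus global transaction IDs and thread pooling.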
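- PostgreSQL engine: known for stability and reliability; offers many Oracle-like features, which makes it a common target for Oracle migrations.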
- SQL Server engine: four editions are supported: Express (10 GB of storage), Web, Standard, and Enterprise.
- Oracle engine: four editions are supported: Enterprise, Standard (32 vCPUs), Standard One (16 vCPUs), and Standard Two (16 vCPUs).
- D. Amazon DynamoDB is best suited for non-relational databases. Amazon RDS and Amazon Redshift are both structured relational databases.
- DynamoDB is a non-relational database and a typical example of a NoSQL database.
- RDS and Redshift are both structured (relational) databases.
- DynamoDB's closest open-source counterpart is Cassandra.
- To create a global table, make sure the table is empty and that DynamoDB Streams is enabled.
- C. In this scenario, the best idea is to use read replicas to scale out the database and thus maximize read performance. When using Multi-AZ, the secondary database is not accessible and all reads and writes must go to the primary or any read replicas.
- Separating reads from writes is the best way to improve performance here.
- A. Amazon Redshift is best suited for traditional OLAP transactions. While Amazon RDS can also be used for OLAP, Amazon Redshift is purpose-built as an OLAP data warehouse.
- Redshift is the AWS solution for OLAP; it handles online analytical processing workloads.
- B. DB Snapshots can be used to restore a complete copy of the database at a specific point in time. Individual tables cannot be extracted from a snapshot.
- Because RDS automated backups are enabled, the data can be recovered by restoring a snapshot to a specific point in time.
- A restore cannot extract individual tables; the whole database is restored.
- A. All Amazon RDS database engines support Multi-AZ deployment.
- All RDS database engines support Multi-AZ deployment: Aurora, MySQL, SQL Server, Oracle, MariaDB, and PostgreSQL.
- B. Read replicas are supported by MySQL, MariaDB, PostgreSQL, and Aurora.
- The engines that support read replicas are MySQL, MariaDB, PostgreSQL, and Aurora.
- A. You can force a failover from one Availability Zone to another by rebooting the primary instance in the AWS Management Console. This is often how people test a failover in the real world. There is no need to create a support case.
- To test RDS Multi-AZ failover, enable Multi-AZ deployment and then reboot the primary database.
- D. Monitor the environment while Amazon RDS attempts to recover automatically. AWS will update the DB endpoint to point to the secondary instance automatically.
- With Multi-AZ deployment enabled, the secondary database automatically takes over all traffic when the primary goes down; no manual intervention is needed.
- A. Amazon RDS supports Microsoft SQL Server Enterprise edition and the license is available only under the BYOL model.
- SQL Server on RDS supports the BYOL model, that is, launching the instance with a license you bring yourself.
- B. General Purpose (SSD) volumes are generally the right choice for databases that have bursts of activity.
- General Purpose (SSD) is sufficient because it accrues burst credits, which cover short, sudden spikes in I/O demand.
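The burst-credit idea is easier to remember with numbers attached. The sketch below assumes the classic gp2 figures (a baseline of 3 IOPS per GiB with a floor of 100 IOPS, bursts up to 3,000 IOPS, and a full bucket of 5.4 million I/O credits). These constants are assumptions for illustration and should be checked against current EBS documentation before sizing a real volume.

```python
# Assumed gp2 figures; verify against current EBS documentation.
BASELINE_IOPS_PER_GIB = 3
MIN_BASELINE_IOPS = 100
BURST_IOPS = 3000
CREDIT_BUCKET = 5_400_000  # I/O credits in a full bucket


def baseline_iops(size_gib: int) -> int:
    """Baseline IOPS a volume of this size earns continuously."""
    return max(MIN_BASELINE_IOPS, BASELINE_IOPS_PER_GIB * size_gib)


def burst_seconds(size_gib: int) -> float:
    """How long a full credit bucket can sustain BURST_IOPS.

    Credits drain at (burst - baseline) per second because the volume
    keeps earning credits at its baseline rate while it bursts.
    A volume whose baseline already reaches BURST_IOPS never drains.
    """
    base = baseline_iops(size_gib)
    if base >= BURST_IOPS:
        return float("inf")
    return CREDIT_BUCKET / (BURST_IOPS - base)


if __name__ == "__main__":
    for size in (20, 100, 500):
        minutes = burst_seconds(size) / 60
        print(f"{size} GiB: baseline {baseline_iops(size)} IOPS, "
              f"about {minutes:.1f} min of burst")
```

Under these assumptions a small volume can burst for roughly half an hour, which is why General Purpose (SSD) suits spiky workloads but not sustained heavy I/O.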
- B. NoSQL databases like Amazon DynamoDB excel at scaling to hundreds of thousands of requests with key/value access to user profile and session.
- Note that the requirement is to store session data. A NoSQL database is the best fit for that, so DynamoDB matches well.
- A, C, D. DB snapshots allow you to back up and recover your data, while read replicas and a Multi-AZ deployment allow you to replicate your data and reduce the time to failover.
- Snapshots let you back up and restore the database data.
- Read replicas and Multi-AZ deployment replicate the data, so service can be restored quickly with minimal data loss.
- C, D. Amazon RDS allows for the creation of one or more read-replicas for many engines that can be used to handle reads. Another common pattern is to create a cache using Memcached and Amazon ElastiCache to store frequently used queries. The secondary slave DB Instance is not accessible and cannot be used to offload queries.
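- The goal is to reduce read load on the primary. The standby in a Multi-AZ deployment is not accessible, so it cannot serve queries.
- The only workable strategies are to create read replicas or to add a cache with ElastiCache.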
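The caching option mentioned above is commonly implemented as a cache-aside read path. The following is a minimal sketch of that pattern, not ElastiCache client code: a plain dictionary stands in for Memcached, and `load_from_db` is a hypothetical callable representing the database query.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self._load_from_db = load_from_db   # hypothetical slow DB lookup
        self._cache: Dict[str, str] = {}    # stand-in for Memcached
        self.db_reads = 0                   # counts queries that hit the DB

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:              # cache hit: no DB round trip
            return self._cache[key]
        self.db_reads += 1                  # cache miss: query the database
        value = self._load_from_db(key)
        if value is not None:
            self._cache[key] = value        # populate for later readers
        return value

    def invalidate(self, key: str) -> None:
        """Call after a write so the next read fetches fresh data."""
        self._cache.pop(key, None)


if __name__ == "__main__":
    table = {"user:1": "alice"}
    store = CacheAside(table.get)
    store.get("user:1")
    store.get("user:1")
    print("database reads:", store.db_reads)
```

Repeated reads of the same key reach the database only once, which is the load reduction the answer is after. A real deployment would also need an expiry policy, which this sketch leaves out.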
- A, B, C. Protecting your database requires a multilayered approach that secures the infrastructure, the network, and the database itself. Amazon RDS is a managed service and direct access to the OS is not available.
- With Amazon RDS you cannot directly access the operating system of the instance that hosts the database.
- A, B, C. Vertically scaling up is one of the simpler options that can give you additional processing power without making any architectural changes. Read replicas require some application changes but let you scale processing power horizontally. Finally, busy databases are often I/O-bound, so upgrading storage to General Purpose (SSD) or Provisioned IOPS (SSD) can often allow for additional request processing.
- To boost RDS performance quickly: choose a higher-performance instance class, add read replicas, and use Provisioned IOPS (SSD) storage.
- C. Query is the most efficient operation to find a single item in a large table.
- Query is the most efficient way to look up a single item.
- A. Using the Username as a partition key will evenly spread your users across the partitions. Messages are often filtered down by time range, so Timestamp makes sense as a sort key.
- Using the username as the partition key and the timestamp as the sort key is the combination that makes sense here.
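Why this key design makes Query cheap can be shown with a toy in-memory model. This is a sketch of the idea only, not DynamoDB's implementation or API: `MessageTable` and its methods are hypothetical names, and integer timestamps are assumed.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Tuple


class MessageTable:
    """Toy table keyed by (username, timestamp)."""

    def __init__(self) -> None:
        # partition key -> list of (sort key, payload), kept sorted
        self._partitions: Dict[str, List[Tuple[int, str]]] = defaultdict(list)

    def put(self, username: str, timestamp: int, body: str) -> None:
        bisect.insort(self._partitions[username], (timestamp, body))

    def query(self, username: str, start: int, end: int) -> List[str]:
        """One user's messages with start <= timestamp <= end.

        Only a single partition is touched, and the sorted order lets
        binary search find the range without reading every item.
        """
        items = self._partitions.get(username, [])
        lo = bisect.bisect_left(items, (start, ""))
        hi = bisect.bisect_left(items, (end + 1, ""))
        return [body for _, body in items[lo:hi]]

    def scan(self, start: int, end: int) -> List[str]:
        """Same time filter without a partition key: reads every item."""
        return [body
                for items in self._partitions.values()
                for ts, body in items
                if start <= ts <= end]


if __name__ == "__main__":
    table = MessageTable()
    table.put("alice", 100, "hello")
    table.put("alice", 200, "lunch?")
    table.put("bob", 150, "status update")
    print(table.query("alice", 150, 300))
```

The contrast between `query` and `scan` mirrors the previous answer: a query narrows the work to one partition and one key range, while a scan walks the whole table.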
- B, D. You can only have a single local secondary index, and it must be created at the same time the table is created. You can create many global secondary indexes after the table has been created.
- Per this answer, there can be only one local secondary index, and it can only be created together with the table.
- B, C. Amazon Redshift is an Online Analytical Processing (OLAP) data warehouse designed for analytics, Extract, Transform, Load (ETL), and high-speed querying. It is not well suited for running transactional applications that require high volumes of small inserts or updates.
- Redshift targets OLAP use cases and is well suited to data warehousing and data analytics.
Summary of Key Points
Know what a relational database is. A relational database consists of one or more tables. Communication to and from relational databases usually involves simple SQL queries, such as “Add a new record,” or “What is the cost of product x?” These simple queries are often referred to as OLTP.
Understand what a relational database is. A relational database consists of one or more tables, and you interact with it through SQL queries, for example adding a new record or looking up the price of a product. Simple queries like these are typical of OLTP.
Understand which databases are supported by Amazon RDS. Amazon RDS currently supports six relational database engines: Microsoft SQL Server, MySQL, Oracle, PostgreSQL, MariaDB, and Amazon Aurora.
Know the RDS database engines. Six are currently supported: SQL Server, MySQL, Oracle, PostgreSQL, MariaDB, and Aurora.
Understand the operational benefits of using Amazon RDS. Amazon RDS is a managed service provided by AWS. AWS is responsible for patching, antivirus, and management of the underlying guest OS for Amazon RDS. Amazon RDS greatly simplifies the process of setting a secondary slave with replication for failover and setting up read replicas to offload queries. Remember that you cannot access the underlying OS for Amazon RDS DB instances. You cannot use Remote Desktop Protocol (RDP) or SSH to connect to the underlying OS. If you need to access the OS, install custom software or agents, or want to use a database engine not supported by Amazon RDS, consider running your database on Amazon EC2 instead.
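- Understand the benefits of using Amazon RDS. RDS is a managed service: AWS handles patching, antivirus, and maintenance of the guest OS, and RDS greatly simplifies setting up a secondary for failover and read replicas. Remember that you cannot access the instance that hosts RDS, and you cannot use RDP or SSH to connect to its OS. If you need OS access, need to install custom software or agents, or want an engine that RDS does not support, run the database yourself on EC2.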
Know that you can increase availability using Amazon RDS Multi-AZ deployment. Add fault tolerance to your Amazon RDS database using Multi-AZ deployment. You can quickly set up a secondary DB Instance in another Availability Zone with Multi-AZ for rapid failover.
- Know how to use Multi-AZ to increase RDS availability. Multi-AZ deployment adds fault tolerance: a secondary DB instance can be set up quickly in another AZ for rapid failover.
Understand the importance of RPO and RTO. Each application should set RPO and RTO targets to define the amount of acceptable data loss and also the amount of time required to recover from an incident. Amazon RDS can be used to meet a wide range of RPO and RTO requirements.
Understand the importance of RPO and RTO. Every application should set RPO and RTO targets that define the acceptable data loss and the acceptable recovery time after an incident. RDS can meet a wide range of RPO and RTO requirements.
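A quick way to reason about these targets is to compare them with the worst case a given recovery setup can deliver. The sketch below is a simplified model under stated assumptions: `RecoveryPlan` is a hypothetical helper, the worst-case data loss is taken to be one full interval between recovery points, and the example numbers are illustrative rather than measured.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    backup_interval_min: float  # time between recovery points
    restore_time_min: float     # time needed to bring the database back

    @property
    def worst_case_rpo_min(self) -> float:
        # A failure just before the next recovery point loses a full interval.
        return self.backup_interval_min

    @property
    def worst_case_rto_min(self) -> float:
        return self.restore_time_min

    def meets(self, rpo_target_min: float, rto_target_min: float) -> bool:
        """True when both worst cases fit inside the targets."""
        return (self.worst_case_rpo_min <= rpo_target_min
                and self.worst_case_rto_min <= rto_target_min)


if __name__ == "__main__":
    # Illustrative numbers only.
    daily_snapshot = RecoveryPlan(backup_interval_min=24 * 60, restore_time_min=60)
    frequent_logs = RecoveryPlan(backup_interval_min=5, restore_time_min=30)
    for name, plan in (("daily snapshot", daily_snapshot),
                       ("frequent log backups", frequent_logs)):
        print(name, "meets RPO 15 min / RTO 60 min:", plan.meets(15, 60))
```

The point of the exercise is that RPO is driven by how often recovery points are taken, while RTO is driven by how long a restore or failover takes, so the two must be checked separately.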
Understand that Amazon RDS handles Multi-AZ failover for you. If your primary Amazon RDS Instance becomes unavailable, AWS fails over to your secondary instance in another Availability Zone automatically. This failover is done by pointing your existing database endpoint to a new IP address. You do not have to change the connection string manually; AWS handles the DNS change automatically.
Understand how RDS handles Multi-AZ failover. When the primary database becomes unavailable, AWS automatically fails over to the secondary in another Availability Zone by pointing the existing database endpoint at a new IP address. No connection string needs to be changed by hand; the switch happens through DNS resolution.
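From the application side, the practical consequence is that a client should connect by endpoint name and resolve it again when it reconnects. The following sketch simulates that behavior without any network calls: the endpoint name, the IP addresses, and the `resolve` and `is_up` callables are all hypothetical stand-ins.

```python
from typing import Callable, Dict, Set


def connect_with_retry(endpoint: str,
                       resolve: Callable[[str], str],
                       is_up: Callable[[str], bool],
                       on_retry: Callable[[], None] = lambda: None,
                       attempts: int = 3) -> str:
    """Return the IP address that accepted the connection.

    The endpoint name is resolved again on every attempt, so a DNS
    change made during failover is picked up without touching the
    connection string.
    """
    for _ in range(attempts):
        ip = resolve(endpoint)
        if is_up(ip):
            return ip
        on_retry()  # a real client would sleep or back off here
    raise ConnectionError(f"{endpoint} unreachable after {attempts} attempts")


if __name__ == "__main__":
    # Hypothetical endpoint and addresses, for illustration only.
    endpoint = "mydb.example.internal"
    dns: Dict[str, str] = {endpoint: "10.0.1.10"}  # points at the primary
    healthy: Set[str] = {"10.0.2.20"}              # only the standby is up

    def failover() -> None:
        dns[endpoint] = "10.0.2.20"                # simulated DNS update

    print(connect_with_retry(endpoint, dns.__getitem__,
                             healthy.__contains__, on_retry=failover))
```

A client that cached the old IP address instead of re-resolving the name would keep failing after the switch, which is why hard-coded addresses should be avoided.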
Remember that Amazon RDS read replicas are used for scaling out and increased performance. This replication feature makes it easy to scale out your read-intensive databases. Read replicas are currently supported in Amazon RDS for MySQL, MariaDB, PostgreSQL, and Amazon Aurora. You can create one or more replicas of a database within a single AWS Region or across multiple AWS Regions. Amazon RDS uses native replication to propagate changes made to a source DB Instance to any associated read replicas. Amazon RDS also supports cross-region read replicas to replicate changes asynchronously to another geography or AWS Region.
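Remember that RDS read replicas are for scaling out and improving performance. Replication makes it easy to scale read-intensive databases. The engines that currently support read replicas are MySQL, MariaDB, PostgreSQL, and Aurora. You can create one or more read replicas within a single Region or across Regions. RDS uses the engine's native replication to propagate changes from the source DB to the replicas, and cross-region replicas are updated asynchronously.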
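Read replicas only help if the application actually sends reads to them. A minimal routing sketch follows, assuming a hypothetical `ReadWriteRouter` class and placeholder endpoint names; it sends `SELECT` statements to replicas in round-robin order and everything else to the primary.

```python
import itertools
from typing import Iterator, List


class ReadWriteRouter:
    """Send SELECTs to replicas in turn; everything else goes to the primary."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._readers: Iterator[str] = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql: str) -> str:
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT":
            return next(self._readers)
        return self.primary


if __name__ == "__main__":
    # Placeholder endpoint names, for illustration only.
    router = ReadWriteRouter("primary-endpoint", ["replica-1", "replica-2"])
    for statement in ("INSERT INTO t VALUES (1)",
                      "SELECT * FROM t",
                      "SELECT count(*) FROM t"):
        print(router.endpoint_for(statement), "<-", statement)
```

Because replication is asynchronous, a read that must see a write made moments earlier may need to go to the primary instead; this sketch does not handle that case.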
Know what a NoSQL database is. NoSQL databases are non-relational databases, meaning that you do not have to have an existing table created in which to store your data. NoSQL databases come in the following formats:
Document databases
Graph stores
Key/value stores
Wide-column stores
Understand what a NoSQL database is. NoSQL databases are non-relational, meaning you do not have to create a table first to store your data. They come in these forms: document databases, graph stores, key/value stores, and wide-column stores.
Remember that Amazon DynamoDB is AWS NoSQL service. You should remember that for NoSQL databases, AWS provides a fully managed service called Amazon DynamoDB. Amazon DynamoDB is an extremely fast NoSQL database with predictable performance and high scalability. You can use Amazon DynamoDB to create a table that can store and retrieve any amount of data and serve any level of request traffic. Amazon DynamoDB automatically spreads the data and traffic for the table over a sufficient number of partitions to handle the request capacity specified by the customer and the amount of data stored, while maintaining consistent and fast performance.
Know that DynamoDB is the AWS NoSQL service. It is a fully managed, very fast NoSQL database with predictable performance and high scalability. You can create tables that store any amount of data and serve any level of request traffic. DynamoDB automatically spreads data and traffic across enough partitions to handle the load while keeping performance consistent and fast.
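How spreading data over partitions works can be illustrated with a simple hash-partitioning sketch. DynamoDB's real hash function and partition management are internal to the service; here SHA-256 and a fixed partition count are assumptions chosen only to show the principle.

```python
import hashlib
from collections import Counter
from typing import Iterable


def partition_for(partition_key: str, num_partitions: int) -> int:
    """Map a key to a partition using a stable hash (SHA-256 is an arbitrary choice)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def distribution(keys: Iterable[str], num_partitions: int) -> Counter:
    """Count how many keys land on each partition."""
    return Counter(partition_for(k, num_partitions) for k in keys)


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    print("many distinct keys:", sorted(distribution(users, 4).items()))
    # A low-cardinality key sends every item to the same partition.
    print("one repeated key:  ", sorted(distribution(["active"] * 10_000, 4).items()))
```

Many distinct key values spread roughly evenly, while a single repeated value creates a hot partition, which is why a high-cardinality attribute such as a username makes a good partition key.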
Know what a data warehouse is. A data warehouse is a central repository for data that can come from one or more sources. This data repository would be used for query and analysis using OLAP. An organization’s management typically uses a data warehouse to compile reports on specific data. Data warehouses are usually queried with highly complex queries.
Understand what a data warehouse is. A data warehouse stores data from one or more sources and is used for querying and analysis. The most typical use in an organization is generating reports, and the queries run against it tend to be highly complex.
Remember that Amazon Redshift is AWS data warehouse service. You should remember that Amazon Redshift is Amazon’s data warehouse service. Amazon Redshift organizes the data by column instead of storing data as a series of rows. Because only the columns involved in the queries are processed and columnar data is stored sequentially on the storage media, column-based systems require far fewer I/Os, which greatly improves query performance. Another advantage of columnar data storage is the increased compression, which can further reduce overall I/O.
Redshift is the AWS data warehouse service. It stores data by column, which makes queries faster and requires less I/O. Columnar storage also compresses better, which reduces I/O further.
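Both advantages of columnar storage can be seen in a few lines. The sketch below is a toy model, not Redshift's storage engine: it pivots hypothetical order rows into columns, aggregates a single column, and uses run-length encoding as one simple illustration of why columns of similar values compress well.

```python
from itertools import groupby
from typing import Dict, List, Tuple


def to_columns(rows: List[Dict[str, object]]) -> Dict[str, List[object]]:
    """Pivot row-oriented records into one list per column."""
    columns: Dict[str, List[object]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values: List[object]) -> List[Tuple[object, int]]:
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


if __name__ == "__main__":
    # Hypothetical sample data.
    rows = [
        {"order_id": 1, "region": "us-east", "amount": 30},
        {"order_id": 2, "region": "us-east", "amount": 45},
        {"order_id": 3, "region": "us-east", "amount": 10},
        {"order_id": 4, "region": "eu-west", "amount": 25},
    ]
    cols = to_columns(rows)
    # Summing amounts reads one column: len(rows) values, rather than
    # len(rows) * number_of_columns values in a row-oriented layout.
    print("total amount:", sum(cols["amount"]))
    print("encoded region column:", run_length_encode(cols["region"]))
```

An analytic query that touches one column out of many reads a fraction of the data, and the repeated region values shrink to a couple of pairs, which matches the two I/O savings described above.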