Oracle Database Sharding
Sharding is a data tier architecture in which data is horizontally partitioned across independent databases.
Sharding with Oracle Database 12c Release 2 (12.2) is an architecture for suitable online transaction processing (OLTP) applications where data is horizontally partitioned across multiple discrete Oracle databases, called shards, which share no hardware or software. The collection of shards is presented to an application as a single logical Oracle database.
Oracle sharding supports automated deployment, high performance routing, and complete life-cycle management. High availability for individual shards is enabled by automatic deployment of either Oracle Data Guard or Oracle GoldenGate replication, at the discretion of the administrator. Each shard is an Oracle Database that has the same capabilities, with very few exceptions, as an Oracle Database in a non-sharded deployment.
Oracle sharding is intended for custom OLTP applications that are explicitly designed for a sharded database architecture. Unlike an architecture based on Oracle Real Application Clusters (Oracle RAC), applications that use sharding must have a well-defined data model and data distribution strategy (consistent hash, range, list, or composite) that primarily accesses data using a sharding key. Examples of keys include customer_id, account_no, country_id, and so on. Oracle sharding also supports data placement policies (rack and geo awareness) and all deployment models (for example, on-premises and public or hybrid clouds).
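As a rough illustration of the distribution strategies named above, the following Python sketch maps a sharding key to a shard by hash, by range, and by list. It is a simplified model and not Oracle's implementation: the shard names, the range boundaries, the country mapping, and every function name are hypothetical, and the hash variant only splits a 32-bit hash space into equal ranges as a stand-in for consistent hashing. Composite sharding (a combination of two methods) is not shown.

```python
import bisect
import hashlib

SHARDS = ["shard1", "shard2", "shard3"]  # hypothetical shard names


def stable_hash(key):
    """Deterministic 32-bit hash of a key (built-in hash() is salted per process)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def shard_by_hash(key, shards=SHARDS):
    """Hash distribution: cut the 32-bit hash space into one equal range per shard."""
    width = 2**32 // len(shards)
    index = min(stable_hash(key) // width, len(shards) - 1)
    return shards[index]


# Hypothetical exclusive upper bounds: account_no < 1000 -> shard1,
# 1000 <= account_no < 2000 -> shard2, everything else -> shard3.
RANGE_UPPER_BOUNDS = [1000, 2000]


def shard_by_range(account_no, shards=SHARDS):
    """Range distribution: find the first range whose upper bound exceeds the key."""
    return shards[bisect.bisect_right(RANGE_UPPER_BOUNDS, account_no)]


# Hypothetical explicit mapping of key values to shards.
COUNTRY_TO_SHARD = {"US": "shard1", "CA": "shard1", "DE": "shard2", "JP": "shard3"}


def shard_by_list(country_id):
    """List distribution: each listed key value is pinned to one shard."""
    return COUNTRY_TO_SHARD[country_id]


if __name__ == "__main__":
    print(shard_by_hash("customer-42"))
    print(shard_by_range(1500))
    print(shard_by_list("DE"))
```

Hash distribution spreads keys evenly without the designer having to pick boundaries, while range and list distribution give explicit control over where a given key value is stored.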
Sharding with Oracle Database 12c Release 2 (12.2) provides a number of benefits:
1. Linear scalability with complete fault isolation. OLTP applications designed for Oracle sharding can elastically scale (data, transactions, and users) to any level, on any platform, simply by deploying new shards on additional stand-alone servers. The unavailability of a shard due to either an unplanned outage or planned maintenance affects only the users of that shard; it does not affect the availability or performance of the application for users of other shards. Each shard can run a different release of the Oracle Database as long as the application is backward compatible with the oldest running version, which makes it simple to maintain availability of an application while performing database maintenance.
2. Simplicity using automation of many lifecycle management tasks, including system-managed partitioning, single-command deployment, and fine-grained rebalancing.
3. Superior runtime performance using intelligent, data-dependent routing.
4. Enterprise quality. Each shard is an Oracle Database that provides strict consistency, the full power of SQL, developer agility with JSON, and proven enterprise qualities for security, availability, backup and recovery, and lifecycle management.
1. Each database is hosted on a dedicated server with its own local resources: CPU, memory, flash, or disk. Each database in such a configuration is called a shard. All of the shards together make up a single logical database, which is referred to as a sharded database (SDB).
2. Horizontal partitioning involves splitting a database table across shards so that each shard contains the table with the same columns but a different subset of rows. A table split up in this manner is also known as a sharded table.
3. Sharding is based on a shared-nothing hardware infrastructure, and it eliminates single points of failure because shards do not share physical resources such as CPU, memory, or storage devices. Shards are also loosely coupled in terms of software; they do not run clusterware.
4. Shards are typically hosted on dedicated servers. These servers can be commodity hardware or engineered systems. The shards can run on single instance or Oracle RAC databases. They can be placed on-premises, in a cloud, or in a hybrid on-premises and cloud configuration.
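The idea of a sharded table from the list above can be sketched in a few lines of Python: every shard holds rows with the same columns, but each row lives on exactly one shard. The table, its rows, and the simple hash-modulo placement are all hypothetical and chosen only to keep the example short; they are not how Oracle places rows.

```python
import hashlib

COLUMNS = ("customer_id", "name", "country_id")  # same columns on every shard

# Hypothetical rows of an unsharded customers table.
CUSTOMERS = [
    (101, "Ana", "BR"),
    (102, "Ben", "US"),
    (103, "Chen", "CN"),
    (104, "Dana", "DE"),
    (105, "Emi", "JP"),
    (106, "Finn", "NO"),
]


def shard_index(key, num_shards):
    """Pick a shard from a deterministic hash of the sharding key."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def partition_rows(rows, num_shards):
    """Split rows horizontally: whole rows move, columns are never split."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        sharding_key = row[0]  # customer_id
        shards[shard_index(sharding_key, num_shards)].append(row)
    return shards


if __name__ == "__main__":
    for number, rows in enumerate(partition_rows(CUSTOMERS, 3), start=1):
        print(f"shard{number}: {rows}")
```

Taken together, the per-shard row sets reproduce the original table, which is what lets the collection of shards behave as one logical database.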
From the perspective of a database administrator, an SDB consists of multiple databases that can be managed either collectively or individually. However, from the perspective of the application, an SDB looks like a single database: the number of shards and distribution of data across those shards are completely transparent to database applications.
Sharding is intended for custom OLTP applications that are suitable for a sharded database architecture. Applications that use sharding must have a well-defined data model and data distribution strategy (consistent hash, range, list, or composite) that primarily accesses data using a sharding key. Examples of a sharding key include customer_id, account_no, or country_id.
1. Linear Scalability. Sharding eliminates performance bottlenecks and makes it possible to linearly scale performance and capacity by adding shards.
2. Fault Containment. Sharding is a shared-nothing hardware infrastructure that eliminates single points of failure, such as shared disk, SAN, and clusterware, and provides strong fault isolation: the failure or slow-down of one shard does not affect the performance or availability of other shards.
3. Geographical Distribution of Data. Sharding makes it possible to store particular data close to its consumers and to satisfy regulatory requirements when data must be located in a particular jurisdiction.
4. Rolling Upgrades. Applying configuration changes on one shard at a time does not affect other shards, and allows administrators to first test the changes on a small subset of data.
5. Simplicity of Cloud Deployment. Sharding is well suited to deployment in the cloud. Shards may be sized as required to accommodate whatever cloud infrastructure is available and still achieve required service levels. Oracle Sharding supports on-premises, cloud, and hybrid deployment models.
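Fault containment, as described in the list above, can be pictured with a small in-memory model in which each shard owns a key range and an outage on one shard leaves requests for the other shards untouched. This is only a sketch: the class names, the range boundary, and the `down` set used to simulate an outage are all hypothetical.

```python
import bisect


class ShardUnavailable(Exception):
    """Raised when the shard that owns a key is down."""


class ShardedStore:
    def __init__(self, upper_bounds, shard_names):
        # upper_bounds[i] is the exclusive upper key bound of shard_names[i];
        # the last shard takes every key above the final bound.
        self.upper_bounds = upper_bounds
        self.shard_names = shard_names
        self.data = {name: {} for name in shard_names}
        self.down = set()  # names of shards that are currently unavailable

    def shard_for(self, key):
        return self.shard_names[bisect.bisect_right(self.upper_bounds, key)]

    def put(self, key, value):
        shard = self.shard_for(key)
        if shard in self.down:
            raise ShardUnavailable(shard)
        self.data[shard][key] = value

    def get(self, key):
        shard = self.shard_for(key)
        if shard in self.down:
            raise ShardUnavailable(shard)
        return self.data[shard][key]


if __name__ == "__main__":
    store = ShardedStore([100], ["shardA", "shardB"])
    store.put(50, "alice")   # lands on shardA
    store.put(150, "bob")    # lands on shardB
    store.down.add("shardA")  # simulate an outage of shardA
    print(store.get(150))     # users of shardB are unaffected
    try:
        store.get(50)
    except ShardUnavailable as exc:
        print(f"only keys on {exc} are affected")
```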
Unlike NoSQL data stores that implement sharding, Oracle Sharding provides the benefits of sharding without sacrificing the capabilities of an enterprise RDBMS.
Oracle Sharding is a scalability and availability feature for suitable OLTP applications. It enables distribution and replication of data across a pool of Oracle databases that share no hardware or software.
Applications perceive the pool of databases as a single logical database. Applications can elastically scale data, transactions, and users to any level, on any platform, by adding databases (shards) to the pool. Oracle Database 12c Release 2 (12.2.0.1) supports scaling up to 1000 shards.
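One way to picture elastic scaling is to treat the data as a fixed number of chunks that are assigned to shards: adding a shard then moves only a fair share of chunks to it and leaves everything else in place. The Python sketch below models that idea under stated assumptions; the chunk count, the shard names, and the greedy choice of which chunks to move are hypothetical and are not taken from Oracle's rebalancing algorithm.

```python
from collections import Counter


def initial_layout(num_chunks, shards):
    """Assign chunks round-robin to the starting shards."""
    return {chunk: shards[chunk % len(shards)] for chunk in range(num_chunks)}


def add_shard(layout, new_shard):
    """Return (new_layout, moved_chunks) after giving new_shard a fair share.

    Assumes new_shard is not already present in layout.
    """
    layout = dict(layout)  # leave the caller's layout untouched
    shard_count = len(set(layout.values())) + 1
    target = len(layout) // shard_count  # chunks the new shard should hold
    moved = []
    while len(moved) < target:
        load = Counter(layout.values())
        donors = sorted(name for name in load if name != new_shard)
        donor = max(donors, key=lambda name: load[name])  # most loaded shard
        chunk = min(c for c, owner in layout.items() if owner == donor)
        layout[chunk] = new_shard
        moved.append(chunk)
    return layout, moved


if __name__ == "__main__":
    before = initial_layout(12, ["shard1", "shard2", "shard3"])
    after, moved = add_shard(before, "shard4")
    print("chunks moved:", moved)
    print("chunks per shard:", dict(Counter(after.values())))
```

Because only the moved chunks change owner, most keys keep resolving to the same shard while the pool grows.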
The following figure illustrates the major architectural components of Oracle Sharding:
The major components of Oracle Sharding are:
1. Sharded database (SDB) - a single logical Oracle Database that is horizontally partitioned across a pool of physical Oracle Databases (shards) that share no hardware or software
2. Shards - independent physical Oracle databases that host a subset of the sharded database
3. Global service - database services that provide access to data in an SDB
4. Shard catalog - an Oracle Database that supports automated shard deployment, centralized management of a sharded database, and multi-shard queries
5. Shard directors - network listeners that enable high performance connection routing based on a sharding key
6. Connection pools - at runtime, act as shard directors by routing database requests across pooled connections
7. Management interfaces - GDSCTL (command-line utility) and Oracle Enterprise Manager (GUI)
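The relationship between shard directors and connection pools in the list above can be sketched as follows: the pool asks the director where a sharding key lives the first time it sees a key range, remembers the answer, and routes later requests for that range on its own. This is a toy model under stated assumptions; `ShardDirector`, `RoutingPool`, the range boundaries, and the `lookups` counter are hypothetical names and do not correspond to any Oracle API.

```python
import bisect


class ShardDirector:
    """Knows which key range is stored on which shard."""

    def __init__(self, upper_bounds, shards):
        # upper_bounds[i] is the exclusive upper key bound of shards[i];
        # the last shard takes every key above the final bound.
        self.upper_bounds = upper_bounds
        self.shards = shards
        self.lookups = 0  # how many times the director was consulted

    def locate(self, key):
        """Return (low, high, shard) for the key range that contains key."""
        self.lookups += 1
        i = bisect.bisect_right(self.upper_bounds, key)
        low = self.upper_bounds[i - 1] if i > 0 else float("-inf")
        high = self.upper_bounds[i] if i < len(self.upper_bounds) else float("inf")
        return low, high, self.shards[i]


class RoutingPool:
    """Connection pool that caches key ranges learned from the director."""

    def __init__(self, director):
        self.director = director
        self.cached_ranges = []  # (low, high, shard) tuples seen so far

    def route(self, key):
        for low, high, shard in self.cached_ranges:
            if low <= key < high:
                return shard  # answered locally, no director round trip
        low, high, shard = self.director.locate(key)
        self.cached_ranges.append((low, high, shard))
        return shard


if __name__ == "__main__":
    director = ShardDirector([1000, 2000], ["shard1", "shard2", "shard3"])
    pool = RoutingPool(director)
    for key in (10, 20, 1500, 30):
        print(key, "->", pool.route(key))
    print("director lookups:", director.lookups)
```

Caching the mapping on the client side is what keeps the extra routing hop off the critical path for most requests in this model.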