11.1 DISTRIBUTED DATABASE SYSTEMS



11.1.1 Relational database systems

Information is stored in two-dimensional tables

Rows in a table are records

Columns in a record are fields

A key consists of one or more fields and uniquely identifies a record

A dictionary is a table that describes all the tables

Fully-partitioned: if each table is stored at exactly one physical site

Fully-replicated: if each table is stored at all physical sites

Natural distribution: data are kept at the local site
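The two definitions above can be checked mechanically. This is a minimal sketch, assuming a placement is represented as a mapping from table name to the set of sites that store it; the function names, table names, and site names are hypothetical and not taken from the notes.

```python
from typing import Dict, Set


def is_fully_partitioned(placement: Dict[str, Set[str]]) -> bool:
    """True if every table is stored at exactly one physical site."""
    return all(len(sites) == 1 for sites in placement.values())


def is_fully_replicated(placement: Dict[str, Set[str]], all_sites: Set[str]) -> bool:
    """True if every table is stored at every physical site."""
    return all(sites == all_sites for sites in placement.values())


# Hypothetical example placements over two sites.
sites = {"A", "B"}
partitioned = {"orders": {"A"}, "customers": {"B"}}
replicated = {"orders": {"A", "B"}, "customers": {"A", "B"}}
```

A placement that is neither (some tables at one site, others at several) fails both checks.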

Query operations:

- Select - picking records
- Project - picking fields
- Join - merging of tables

Predicate: a condition on fields that determines which records a query operation selects or merges
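The three query operations and the role of a predicate can be illustrated with a small in-memory sketch. It assumes a table is a list of records and a record is a dictionary of fields; the sample tables and field names are hypothetical. Unlike a strict relational project, this version does not remove duplicate records.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]   # one row: field name -> value
Table = List[Record]         # a table is a list of records


def select(table: Table, predicate: Callable[[Record], bool]) -> Table:
    """Select: keep the records for which the predicate holds."""
    return [rec for rec in table if predicate(rec)]


def project(table: Table, fields: List[str]) -> Table:
    """Project: keep only the named fields of every record."""
    return [{f: rec[f] for f in fields} for rec in table]


def join(left: Table, right: Table, field: str) -> Table:
    """Join: merge pairs of records that agree on a shared field."""
    return [{**l, **r} for l in left for r in right if l[field] == r[field]]


# Hypothetical sample data, not taken from the notes.
employees = [
    {"emp_id": 1, "name": "Ada", "dept_id": 10},
    {"emp_id": 2, "name": "Grace", "dept_id": 20},
]
departments = [
    {"dept_id": 10, "dept_name": "Research"},
    {"dept_id": 20, "dept_name": "Systems"},
]

in_dept_10 = select(employees, lambda rec: rec["dept_id"] == 10)
names_only = project(employees, ["name"])
merged = join(employees, departments, "dept_id")
```

Here the lambda passed to select plays the role of the predicate, and the equality test inside join is a predicate between fields of two tables.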

11.1.2 Issues for distributed database systems

Distribution of tables to sites

Natural distribution of data at various sites

Fully partitioned systems

Fully replicated systems

Important factors

Whether the dictionary is replicated at every site

Frequency of requests to each table from each site

Storage capacity at each site

Communication costs between sites
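One way to see how these factors interact is a simple placement rule: store a table at the site that has room for it and minimizes the total cost of serving all requests. This is only a sketch under assumed inputs; the function name, the cost model (request frequency times a per-request communication cost), and the sample numbers are all hypothetical.

```python
from typing import Dict, Optional


def best_site(
    table_size: int,
    requests: Dict[str, int],                 # site -> requests per unit time
    comm_cost: Dict[str, Dict[str, float]],   # comm_cost[a][b]: cost of one request from a to b
    free_storage: Dict[str, int],             # site -> remaining storage capacity
) -> Optional[str]:
    """Return the site with enough storage that minimizes total access cost."""
    best, best_cost = None, float("inf")
    for candidate in sorted(free_storage):
        if free_storage[candidate] < table_size:
            continue  # the table does not fit at this site
        cost = sum(freq * comm_cost[src][candidate] for src, freq in requests.items())
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best


# Hypothetical three-site system; local access is assumed to cost nothing.
requests = {"A": 100, "B": 10, "C": 5}
comm_cost = {
    "A": {"A": 0.0, "B": 1.0, "C": 2.0},
    "B": {"A": 1.0, "B": 0.0, "C": 1.0},
    "C": {"A": 2.0, "B": 1.0, "C": 0.0},
}
free_storage = {"A": 50, "B": 500, "C": 500}

small_table_site = best_site(40, requests, comm_cost, free_storage)
large_table_site = best_site(100, requests, comm_cost, free_storage)
```

The small table goes to site A, where most requests originate. The large table does not fit at A, so the rule falls back to the cheapest remaining site.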

11.1.2.1 - Query processing

Goal for interactive applications: minimize query response time

Goal for batch applications: minimize total bandwidth consumed

Approach

- Optimize each specific query based on the statistical conditions that apply to it
- Before deciding on an actual query sequence, the query site either estimates the time and cost of moving the data itself or asks the sites involved to report them
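The approach above can be sketched for a single join between two tables stored at different sites. The sketch assumes the query site already has size estimates for both tables and for the join result, and compares three candidate sequences by the cost of the data they move. The function name, plan labels, and cost figures are hypothetical.

```python
from typing import Dict, Tuple


def plan_join(
    size_r: int, site_r: str,
    size_s: int, site_s: str,
    result_size: int, query_site: str,
    cost_per_byte: Dict[Tuple[str, str], float],
) -> Tuple[str, float]:
    """Pick the join location that moves data at the lowest estimated cost."""

    def move(nbytes: int, src: str, dst: str) -> float:
        # Moving data within one site is assumed to be free.
        return 0.0 if src == dst else nbytes * cost_per_byte[(src, dst)]

    plans = {
        "join at query site":
            move(size_r, site_r, query_site) + move(size_s, site_s, query_site),
        "join at site of R":
            move(size_s, site_s, site_r) + move(result_size, site_r, query_site),
        "join at site of S":
            move(size_r, site_r, site_s) + move(result_size, site_s, query_site),
    }
    name = min(plans, key=plans.get)
    return name, plans[name]


# Hypothetical estimates: R is large, S is small, the result is smaller still.
links = {(a, b): 1.0 for a in "XYQ" for b in "XYQ" if a != b}
choice, cost = plan_join(1000, "X", 50, "Y", 20, "Q", links)
```

With these numbers, shipping the small table S to the site of R and returning only the result is cheapest. Minimizing response time rather than bandwidth would need a different cost function, for example one that accounts for transfers that can run in parallel.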

11.1.2.2 - Concurrency control

Maximize the amount of parallel activity while maintaining the semantic integrity of the data

Approach

- A transaction: a set of reads, followed by some processing, and then a set of writes
- A log is the time-ordered sequence of reads and writes performed on the database
- A log is serial if each transaction's reads are immediately followed by that transaction's writes, so transactions never interleave
- No known algorithm allows exactly the logs that leave the database consistent: all serial logs, all serializable logs, and all other consistent logs
- Most algorithms achieve serializable logs by allowing a transaction to lock part of the database
- A lock can cover the whole database, or individual tables, records, fields, or physical sectors
- Deadlock occurs when two transactions each want to lock a resource that the other has already locked
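Two of the ideas above lend themselves to a short sketch: testing whether a log is serial, and detecting a deadlock among waiting transactions. The sketch assumes a log is a list of (transaction, operation, item) tuples and that each waiting transaction waits for exactly one locked resource; both representations and all names are hypothetical.

```python
from typing import Dict, List, Tuple

LogEntry = Tuple[str, str, str]  # (transaction, "r" or "w", data item)


def is_serial(log: List[LogEntry]) -> bool:
    """True if transactions never interleave in the log."""
    finished = set()
    current = None
    for txn, _op, _item in log:
        if txn != current:
            if txn in finished:
                return False  # txn resumed after another transaction ran
            if current is not None:
                finished.add(current)
            current = txn
    return True


def find_deadlock(holder: Dict[str, str], wants: Dict[str, str]) -> List[str]:
    """Return the transactions forming a wait cycle, or [] if there is none.

    holder maps a resource to the transaction that holds its lock;
    wants maps a transaction to the resource it is waiting to lock.
    """
    for start in wants:
        seen: List[str] = []
        txn = start
        while txn in wants and wants[txn] in holder:
            if txn in seen:
                return seen[seen.index(txn):]
            seen.append(txn)
            txn = holder[wants[txn]]  # follow the wait to the lock holder
    return []


serial_log = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
interleaved_log = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]

# T1 holds a and wants b; T2 holds b and wants a.
cycle = find_deadlock({"a": "T1", "b": "T2"}, {"T1": "b", "T2": "a"})
```

The interleaved log is not serial. Whether it is serializable is a separate question that this sketch does not answer.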
