• Information is stored in 2D-tables
• Columns in a record are fields
• A key consists of one or more fields and uniquely identifies a record
• A dictionary is a table that describes all the tables
• Fully-partitioned: if each table is stored at exactly one physical site
• Fully-replicated: if each table is stored at all physical sites
• Natural distribution: data are kept at the local site
• Predicate: a condition on fields that a query uses to select records
• Distribution of tables to sites
• Natural distribution of data at various sites
• Replication of the dictionary at every site
• Frequency of requests to a table from each site
• Storage capacity at each site
• Communication costs between sites
• Query response time (for interactive applications)
• Total bandwidth consumed (for batch applications)
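The factors listed above can be put into a small sketch that compares a fully-partitioned placement with a fully-replicated one. Everything here is an assumption made for illustration: the site names, table names, sizes, link costs, and the simple read-only cost model (request frequency times the cheapest link to a copy). It ignores the cost of keeping replicas in sync on writes.

```python
SITES = {"A", "B", "C"}  # hypothetical site names


def is_fully_partitioned(placement):
    """Every table is stored at exactly one site."""
    return all(len(sites) == 1 for sites in placement.values())


def is_fully_replicated(placement, all_sites):
    """Every table is stored at every site."""
    return all(sites == all_sites for sites in placement.values())


def fits_capacity(placement, table_size, capacity):
    """Check that the tables placed at each site fit in that site's storage."""
    used = {}
    for table, sites in placement.items():
        for site in sites:
            used[site] = used.get(site, 0) + table_size[table]
    return all(used[site] <= capacity[site] for site in used)


def total_comm_cost(placement, request_freq, link_cost):
    """Sum of frequency * cheapest link cost to a copy; a local copy costs 0."""
    total = 0
    for (site, table), freq in request_freq.items():
        cheapest = min(
            0 if holder == site else link_cost[(site, holder)]
            for holder in placement[table]
        )
        total += freq * cheapest
    return total


# Illustrative numbers only (assumed, not from the source).
_raw = {("A", "B"): 1, ("A", "C"): 3, ("B", "C"): 2}
link_cost = {**_raw, **{(b, a): c for (a, b), c in _raw.items()}}
request_freq = {("A", "orders"): 10, ("B", "orders"): 2, ("C", "customers"): 5}
table_size = {"orders": 40, "customers": 10}
capacity = {"A": 50, "B": 50, "C": 20}

partitioned = {"orders": {"A"}, "customers": {"B"}}
replicated = {"orders": set(SITES), "customers": set(SITES)}

partitioned_cost = total_comm_cost(partitioned, request_freq, link_cost)
replicated_cost = total_comm_cost(replicated, request_freq, link_cost)
```

With these made-up numbers, full replication removes all communication cost for reads but no longer fits site C's storage, which is the kind of trade-off the factors above are meant to weigh.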
Optimizing a specific query based on specific statistical conditions
The query site either estimates the time and cost of moving the data, or asks the sites involved to report them, before deciding on an actual query sequence
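As a rough sketch of this decision, suppose a query combines two tables held at different sites. The query site can compare, for each candidate site, the cost of moving both tables there and shipping the result back. The names `TableInfo`, `transfer_cost`, and `cheapest_join_site`, the per-byte cost table, and all sizes are hypothetical; a real optimizer would also account for response time and processing cost.

```python
from dataclasses import dataclass


@dataclass
class TableInfo:
    site: str        # site that stores the table
    size_bytes: int  # estimated or reported amount of data to move


def transfer_cost(nbytes, src, dst, cost_per_byte):
    """Cost of shipping nbytes from src to dst; free if both are the same site."""
    if src == dst:
        return 0.0
    return nbytes * cost_per_byte[(src, dst)]


def cheapest_join_site(left, right, query_site, result_bytes, cost_per_byte):
    """Evaluate each candidate site and return (site, cost) of the cheapest plan."""
    candidates = {left.site, right.site, query_site}
    best = None
    for site in sorted(candidates):
        cost = (
            transfer_cost(left.size_bytes, left.site, site, cost_per_byte)
            + transfer_cost(right.size_bytes, right.site, site, cost_per_byte)
            + transfer_cost(result_bytes, site, query_site, cost_per_byte)
        )
        if best is None or cost < best[1]:
            best = (site, cost)
    return best


# Assumed uniform cost of 1.0 per byte between any two distinct sites.
cost_per_byte = {(s, d): 1.0 for s in "ABC" for d in "ABC" if s != d}

big = TableInfo(site="A", size_bytes=1_000_000)
small = TableInfo(site="B", size_bytes=10_000)

plan = cheapest_join_site(big, small, query_site="C",
                          result_bytes=5_000, cost_per_byte=cost_per_byte)
```

Under these assumptions the cheapest plan moves the small table to the site holding the large one and ships only the result to the query site.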
• Maximize the amount of parallel activity while maintaining the semantic integrity of the data
A transaction: a set of reads, followed by some processing, and then a set of writes
A log is the time-ordered sequence of reads and writes performed on the database
A log is serial if each transaction's read is immediately followed by that same transaction's write, so that transactions execute one at a time
No known algorithm admits every serial log, every serializable log (a log equivalent to some serial log), and every other log that leaves the database consistent
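The definitions of a transaction and a serial log can be illustrated with a minimal sketch. It assumes a simplified log format in which each transaction contributes one read step `"R"` and one write step `"W"`, and every transaction reads a single value `x`, adds 1, and writes it back. The function names and the log encoding are invented for this example.

```python
def is_serial(log):
    """True if the log is a sequence of (read, write) pairs, each from one transaction."""
    if len(log) % 2 != 0:
        return False
    for i in range(0, len(log), 2):
        (t_read, op_read), (t_write, op_write) = log[i], log[i + 1]
        if op_read != "R" or op_write != "W" or t_read != t_write:
            return False
    return True


def run_increments(log, start=0):
    """Replay a log in which every transaction reads x, adds 1, and writes x back."""
    x = start
    seen = {}  # value each transaction observed when it read x
    for txn, op in log:
        if op == "R":
            seen[txn] = x
        else:
            x = seen[txn] + 1
    return x


serial_log = [("T1", "R"), ("T1", "W"), ("T2", "R"), ("T2", "W")]
interleaved_log = [("T1", "R"), ("T2", "R"), ("T1", "W"), ("T2", "W")]

serial_result = run_increments(serial_log)
interleaved_result = run_increments(interleaved_log)
```

In the interleaved log both transactions read the same starting value, so one increment is lost. This is the kind of inconsistency that concurrency control has to rule out while still allowing parallel activity.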
Most algorithms achieve serializable logs by allowing transactions to lock parts of the database
A lock can be applied to the full database, or to individual tables, records, fields, or physical sectors
Deadlock occurs when two queries each want to lock a resource that the other has already locked, so neither can proceed
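A minimal sketch of how such a deadlock arises, and one way it could be noticed, is given below. The notes do not say how deadlock is detected or resolved; following the chain of "who is waiting for whom" is an assumption chosen for illustration, and the `LockTable` class and its methods are hypothetical.

```python
class LockTable:
    """Toy exclusive-lock table; resources can be tables, records, or any other unit."""

    def __init__(self):
        self.owner = {}    # resource -> transaction holding its lock
        self.waiting = {}  # transaction -> resource it is blocked on

    def request(self, txn, resource):
        """Grant the lock if it is free or already held by txn; otherwise record the wait."""
        holder = self.owner.get(resource)
        if holder is None or holder == txn:
            self.owner[resource] = txn
            self.waiting.pop(txn, None)
            return True
        self.waiting[txn] = resource
        return False

    def deadlocked(self, txn):
        """Follow txn -> holder of the resource it waits for; returning to txn means deadlock."""
        visited = set()
        current = txn
        while current in self.waiting:
            current = self.owner[self.waiting[current]]
            if current == txn:
                return True
            if current in visited:
                return False  # a cycle exists but does not include txn
            visited.add(current)
        return False


locks = LockTable()
locks.request("T1", "accounts")   # T1 locks accounts
locks.request("T2", "orders")     # T2 locks orders
locks.request("T1", "orders")     # T1 must wait for T2
before = locks.deadlocked("T1")   # T2 is not waiting yet, so no deadlock
locks.request("T2", "accounts")   # T2 must wait for T1
after = locks.deadlocked("T1")    # each now waits for the other
```

Once a cycle like this is found, a system would typically have to abort one of the transactions and release its locks; that step is left out here.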