Advanced Database - Old Questions

3. What are the benefits of using distributed databases? Discuss different types of distributed database systems. (3+3)

6 marks | Asked in 2075

Distributed database management has been proposed for various reasons ranging from organizational decentralization and economical processing to greater autonomy. We highlight some of these advantages here.

1. Management of distributed data with different levels of transparency: A DBMS should be distribution transparent in the sense of hiding the details of where each file (table, relation) is physically stored within the system. Consider a company database in which the EMPLOYEE, PROJECT, and WORKS_ON tables are fragmented horizontally (that is, into sets of rows) and stored with possible replication, as shown in the figure. The following types of transparencies are possible:

  • Distribution or network transparency: This refers to freedom for the user from the operational details of the network. It may be divided into location transparency and naming transparency. Location transparency refers to the fact that the command used to perform a task is independent of the location of data and the location of the system where the command was issued. Naming transparency implies that once a name is specified, the named objects can be accessed unambiguously without additional specification.
  • Replication transparency: As shown in the figure, copies of data may be stored at multiple sites for better availability, performance, and reliability. Replication transparency makes the user unaware of the existence of copies.
  • Fragmentation transparency: Two types of fragmentation are possible. Horizontal fragmentation distributes a relation into sets of tuples (rows). Vertical fragmentation distributes a relation into subrelations, where each subrelation is defined by a subset of the columns of the original relation. A global query by the user must be transformed into several fragment queries. Fragmentation transparency makes the user unaware of the existence of fragments.
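
The two kinds of fragmentation can be illustrated with a small sketch. This is only an in-memory illustration, not how any particular DDBMS implements fragments: the sample rows, the column names (ssn, name, dno, salary), the two "sites", and every function name are assumptions made up for the example. It shows a global query being turned into one query per horizontal fragment, so the caller never sees the fragments.

```python
# Hypothetical EMPLOYEE rows; names and columns are illustrative only.
EMPLOYEE = [
    {"ssn": 1, "name": "Asha", "dno": 5, "salary": 40000},
    {"ssn": 2, "name": "Bikram", "dno": 4, "salary": 35000},
    {"ssn": 3, "name": "Chandra", "dno": 5, "salary": 52000},
]


def horizontal_fragment(rows, predicate):
    """Keep whole rows (tuples) that satisfy the predicate."""
    return [dict(r) for r in rows if predicate(r)]


def vertical_fragment(rows, columns, key="ssn"):
    """Keep a subset of columns; the key is repeated so rows can be rejoined."""
    keep = [key] + [c for c in columns if c != key]
    return [{c: r[c] for c in keep} for r in rows]


# Horizontal fragments held at two hypothetical sites.
site_a = horizontal_fragment(EMPLOYEE, lambda r: r["dno"] == 5)
site_b = horizontal_fragment(EMPLOYEE, lambda r: r["dno"] != 5)

# Vertical fragments: personal data in one, pay data in the other.
names = vertical_fragment(EMPLOYEE, ["name", "dno"])
pay = vertical_fragment(EMPLOYEE, ["salary"])


def rebuild_vertical(left, right, key="ssn"):
    """Rejoin two vertical fragments on the shared key."""
    by_key = {r[key]: r for r in right}
    return [{**row, **by_key[row[key]]} for row in left]


def global_query(predicate):
    """Run one fragment query per site and union the results.

    The caller writes the query against EMPLOYEE as a whole and is
    unaware that the rows live in site_a and site_b.
    """
    result = []
    for fragment in (site_a, site_b):
        result.extend(r for r in fragment if predicate(r))
    return sorted(result, key=lambda r: r["ssn"])
```

For example, global_query(lambda r: r["salary"] > 38000) returns the matching rows from both sites as a single result, and rebuild_vertical(names, pay) reconstructs the original rows from the two column subsets.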

2. Increased reliability and availability: These are two of the most common potential advantages cited for distributed databases. Reliability is broadly defined as the probability that a system is running (not down) at a certain time point, whereas availability is the probability that the system is continuously available during a time interval. When the data and DBMS software are distributed over several sites, one site may fail while other sites continue to operate. Only the data and software that exist at the failed site cannot be accessed. This improves both reliability and availability. Further improvement is achieved by judiciously replicating data and software at more than one site. In a centralized system, failure at a single site makes the whole system unavailable to all users. In a distributed database, some of the data may be unreachable, but users may still be able to access other parts of the database.
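
The effect of replication can be made concrete with a short calculation. The sketch below assumes that each site is up with the same probability p at a given moment and that sites fail independently; both are simplifying assumptions, not claims from the text, and the function name is made up for the example. Under those assumptions, a data item stored at n sites is unreachable only when all n copies are down, so the probability that at least one copy is reachable is 1 - (1 - p)^n.

```python
def prob_data_reachable(p_site_up, copies):
    """Probability that at least one of `copies` replicas is reachable.

    Assumes every site is up with probability `p_site_up` and that
    site failures are independent (a simplification).
    """
    if not 0.0 <= p_site_up <= 1.0:
        raise ValueError("p_site_up must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - p_site_up) ** copies


if __name__ == "__main__":
    # Compare one copy (the centralized case) with two and three replicas.
    for n in (1, 2, 3):
        print(n, round(prob_data_reachable(0.9, n), 4))
```

With p = 0.9, a single copy is reachable 90% of the time, while three independent copies raise this to about 99.9%. Real failures are often correlated (for example, a shared network outage), so the actual gain may be smaller.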

3. Improved performance: A distributed DBMS fragments the database and keeps the data close to where it is needed most. Data localization reduces the contention for CPU and I/O services and simultaneously reduces the access delays involved in wide area networks. When a large database is distributed over multiple sites, smaller databases exist at each site. As a result, local queries and transactions accessing data at a single site have better performance because of the smaller local databases. In addition, each site has a smaller number of transactions executing than if all transactions were submitted to a single centralized database. Moreover, interquery and intraquery parallelism can be achieved by executing multiple queries at different sites, or by breaking up a query into a number of subqueries that execute in parallel. This contributes to improved performance.
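
Intraquery parallelism, as described above, can be sketched as a scatter-gather: one query is split into a subquery per site, the subqueries run concurrently, and the partial results are combined. In this sketch the three "sites" are just in-memory lists and worker threads stand in for remote servers; the site names, rows, and function names are all assumptions for illustration. In a real system the gain comes from separate machines working at the same time, which Python threads only imitate.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical salary rows held at three sites.
SITES = {
    "site_1": [{"ssn": 1, "salary": 40000}, {"ssn": 3, "salary": 52000}],
    "site_2": [{"ssn": 2, "salary": 35000}],
    "site_3": [{"ssn": 4, "salary": 61000}, {"ssn": 5, "salary": 30000}],
}


def local_subquery(rows, min_salary):
    """Subquery run at one site: partial sum and count of matching salaries."""
    matching = [r["salary"] for r in rows if r["salary"] >= min_salary]
    return sum(matching), len(matching)


def parallel_average(sites, min_salary):
    """Average salary over all sites, computed from per-site partial results."""
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        partials = list(
            pool.map(lambda rows: local_subquery(rows, min_salary), sites.values())
        )
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None
```

Each site returns only a (sum, count) pair rather than its rows, so little data crosses the network; the average is computed from the combined pairs, since averaging the per-site averages would give a wrong answer when sites hold different numbers of rows.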

4. Easier expansion: In a distributed environment, expansion of the system in terms of adding more data, increasing database sizes, or adding more processors is much easier.


TYPES OF DISTRIBUTED DATABASE SYSTEMS

The term distributed database management system (DDBMS) can describe various systems that differ from one another in many respects. The main criteria that distinguish different types of DDBMSs are as follows:

According to degree of homogeneity of the DDBMS software:

  • Homogeneous DDBMS: If all servers (or individual local DBMSs) use identical software and all users (clients) use identical software, the DDBMS is called homogeneous.
  • Heterogeneous DDBMS: If servers (or individual local DBMSs) use different software and users (clients) use different software, the DDBMS is called heterogeneous.


According to degree of local autonomy of the DDBMS software:

  • Local autonomy: If direct access by local transactions to a server is permitted, the system has some degree of local autonomy.
  • No local autonomy: If there is no provision for the local site to function as a stand-alone DBMS, then the system has no local autonomy.