Advanced Database 2075

Tribhuvan University
Institute of Science and Technology
2075
Bachelor Level / Eighth Semester / Science
Computer Science and Information Technology ( CSC461 )
( Advanced Database )
Full Marks: 60
Pass Marks: 24
Time: 3 hours
Candidates are required to give their answers in their own words as far as practicable.
The figures in the margin indicate full marks.

Attempt all questions. (10 x 6=60)

1. How do you increase performance of the database? Explain any one database performance tuning technique with example. (2+4)

2. What is query processing? How is it different from query optimization? Discuss heuristic query optimization. (1+2+3)

Query Processing

Query processing refers to the range of activities involved in extracting data from a database. The activities include translation of queries in high-level database languages into expressions that can be used at the physical level of the file system, a variety of query-optimizing transformations, and actual evaluation of queries.

The steps involved in processing a query appear in the figure. The basic steps are:

  1. Parsing and translation
  2. Optimization
  3. Evaluation

Before query processing can begin, the system must translate the query into a usable form. A language such as SQL is suitable for human use, but is ill-suited to be the system’s internal representation of a query. A more useful internal representation is one based on the extended relational algebra.

Thus, the first action the system must take in query processing is to translate a given query into its internal form. This translation process is similar to the work performed by the parser of a compiler. In generating the internal form of the query, the parser checks the syntax of the user’s query, verifies that the relation names appearing in the query are names of the relations in the database, and so on. The system constructs a parse-tree representation of the query, which it then translates into a relational-algebra expression.
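
The translation and evaluation steps described above can be illustrated with a minimal Python sketch. It is not how any particular DBMS is implemented: the catalog contents, the class names (Relation, Select, Project), and the helper functions are all hypothetical, and the relational-algebra tree is built by hand rather than produced by a real SQL parser. It shows the name check the parser performs and the evaluation of a small expression.

```python
from dataclasses import dataclass
from typing import Callable, Union

# Hypothetical catalog: relation name -> list of rows (each row is a dict).
CATALOG = {
    "employee": [
        {"name": "Asha", "dept": "CS", "salary": 60000},
        {"name": "Bikram", "dept": "IT", "salary": 40000},
        {"name": "Chandra", "dept": "CS", "salary": 75000},
    ]
}

@dataclass
class Relation:          # leaf node: a stored relation
    name: str

@dataclass
class Select:            # sigma: keep rows that satisfy the predicate
    predicate: Callable[[dict], bool]
    child: "Expr"

@dataclass
class Project:           # pi: keep only the listed attributes
    attrs: list
    child: "Expr"

Expr = Union[Relation, Select, Project]

def check_names(expr, catalog):
    """Verify that every relation named in the tree exists in the catalog."""
    if isinstance(expr, Relation):
        if expr.name not in catalog:
            raise ValueError(f"unknown relation: {expr.name}")
    else:
        check_names(expr.child, catalog)

def evaluate(expr, catalog):
    """Evaluate the tree bottom-up (duplicates are not eliminated here)."""
    if isinstance(expr, Relation):
        return list(catalog[expr.name])
    if isinstance(expr, Select):
        return [r for r in evaluate(expr.child, catalog) if expr.predicate(r)]
    if isinstance(expr, Project):
        return [{a: r[a] for a in expr.attrs} for r in evaluate(expr.child, catalog)]
    raise TypeError(f"unsupported node: {expr!r}")

# Internal form of: SELECT name FROM employee WHERE salary > 50000
plan = Project(["name"], Select(lambda r: r["salary"] > 50000, Relation("employee")))
check_names(plan, CATALOG)
result = evaluate(plan, CATALOG)
print(result)
```

An optimizer would sit between check_names and evaluate, rewriting the tree into an equivalent but cheaper form before it is evaluated.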


3. What are the benefits of using distributed databases? Discuss different types of distributed database systems. (3+3)

Distributed database management has been proposed for various reasons ranging from organizational decentralization and economical processing to greater autonomy. We highlight some of these advantages here.

1. Management of distributed data with different levels of transparency: A DBMS should be distribution transparent in the sense of hiding the details of where each file (table, relation) is physically stored within the system. Consider the company database: the EMPLOYEE, PROJECT, and WORKS_ON tables may be fragmented horizontally (that is, into sets of rows) and stored with possible replication, as shown in the figure. The following types of transparencies are possible:

  • Distribution or network transparency: This refers to freedom for the user from the operational details of the network. It may be divided into location transparency and naming transparency. Location transparency refers to the fact that the command used to perform a task is independent of the location of data and the location of the system where the command was issued. Naming transparency implies that once a name is specified, the named objects can be accessed unambiguously without additional specification.
  • Replication transparency: As shown in the figure, copies of data may be stored at multiple sites for better availability, performance, and reliability. Replication transparency makes the user unaware of the existence of copies.
  • Fragmentation transparency: Two types of fragmentation are possible. Horizontal fragmentation distributes a relation into sets of tuples (rows). Vertical fragmentation distributes a relation into subrelations where each subrelation is defined by a subset of the columns of the original relation. A global query by the user must be transformed into several fragment queries. Fragmentation transparency makes the user unaware of the existence of fragments.

2. Increased reliability and availability: These are two of the most common potential advantages cited for distributed databases. Reliability is broadly defined as the probability that a system is running (not down) at a certain time point, whereas availability is the probability that the system is continuously available during a time interval. When the data and DBMS software are distributed over several sites, one site may fail while other sites continue to operate. Only the data and software that exist at the failed site cannot be accessed. This improves both reliability and availability. Further improvement is achieved by judiciously replicating data and software at more than one site. In a centralized system, failure at a single site makes the whole system unavailable to all users. In a distributed database, some of the data may be unreachable, but users may still be able to access other parts of the database.

3. Improved performance: A distributed DBMS fragments the database by keeping the data closer to where it is needed most. Data localization reduces the contention for CPU and I/O services and simultaneously reduces access delays involved in wide area networks. When a large database is distributed over multiple sites, smaller databases exist at each site. As a result, local queries and transactions accessing data at a single site have better performance because of the smaller local databases. In addition, each site has a smaller number of transactions executing than if all transactions are submitted to a single centralized database. Moreover, interquery and intraquery parallelism can be achieved by executing multiple queries at different sites, or by breaking up a query into a number of subqueries that execute in parallel. This contributes to improved performance.

4. Easier expansion: In a distributed environment, expansion of the system in terms of adding more data, increasing database sizes, or adding more processors is much easier.
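
Fragmentation and intraquery parallelism, as described in the list above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the EMPLOYEE rows, the dept_no attribute used to split the relation, and all function names are hypothetical, and threads stand in for separate sites. A real DDBMS would ship the fragment queries over a network.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical EMPLOYEE relation; "dept_no" decides which site stores a row.
EMPLOYEE = [
    {"ssn": 1, "name": "Asha", "dept_no": 4, "salary": 60000},
    {"ssn": 2, "name": "Bikram", "dept_no": 5, "salary": 40000},
    {"ssn": 3, "name": "Chandra", "dept_no": 5, "salary": 75000},
    {"ssn": 4, "name": "Dipa", "dept_no": 4, "salary": 30000},
]

def horizontal_fragments(relation, attr):
    """Split a relation into sets of rows, one fragment per value of attr."""
    fragments = {}
    for row in relation:
        fragments.setdefault(row[attr], []).append(row)
    return fragments

def vertical_fragment(relation, key, attrs):
    """Keep a subset of columns; the key is kept so fragments can be rejoined."""
    return [{a: row[a] for a in [key, *attrs]} for row in relation]

def fragment_query(fragment, min_salary):
    """The subquery each site runs against its local fragment."""
    return [row["name"] for row in fragment if row["salary"] > min_salary]

def global_query(fragments, min_salary):
    """Run the fragment queries in parallel and union the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda f: fragment_query(f, min_salary),
                            fragments.values())
        return sorted(name for part in partials for name in part)

sites = horizontal_fragments(EMPLOYEE, "dept_no")
names = global_query(sites, 50000)
print(names)
```

The caller of global_query never sees which fragment a row came from, which is the idea behind fragmentation transparency: one global query is turned into several fragment queries whose results are merged.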


TYPES OF DISTRIBUTED DATABASE SYSTEMS

The term distributed database management system (DDBMS) can describe various systems that differ from one another in many respects. The types of DDBMSs, and the criteria and factors that distinguish them, are as follows:

According to degree of homogeneity of the DDBMS software:

  • Homogeneous DDBMS : If all servers (or individual local DBMSs) use identical software and all users (clients) use identical software, the DDBMS is called homogeneous.
  • Heterogeneous DDBMS : If servers (or individual local DBMSs) or users (clients) use different software, the DDBMS is called heterogeneous.


According to degree of local autonomy of the DDBMS software:

  • Local autonomy: If direct access by local transactions to a server is permitted, the system has some degree of local autonomy.
  • No local autonomy: If there is no provision for the local site to function as a stand-alone DBMS, then the system has no local autonomy.



4. What are the benefits of using object oriented databases over relational databases? Discuss different types of constructors used in object oriented databases. (2+4)

5. Discuss different implementation issues related with object relational database. (6)

6. Define GIS. Discuss different data modeling and representation for GIS data. (1+5)

7. Define multimedia database. Discuss benefits of multimedia databases. How do you query image database? (1+2+3)

8. Discuss data warehouse and its functionality. Discuss association rule mining with example. (3+3)

9. What is web service? Discuss SOAP in detail. (2+4)

10. Write short notes on: (2x3)

a) Integrity constraint

b) Mobile database
