Data Warehousing and Data Mining - Unit Wise Questions

Questions Organized by Units
Unit 1: Introduction to Data Warehousing
52 Questions

1. What are the key steps in knowledge discovery in databases? Explain.

10 marks

1. Differentiate between data warehouse and data mining. Explain the stages of knowledge discovery in databases with an example.

10 marks

1. Differentiate between data warehouse and data mining.

10 marks

1. Write down any one advantage and one disadvantage of MOLAP over ROLAP. Define a signed network. How do you check whether it is balanced or not? How does beam search reduce the space complexity? Illustrate with an example. [2+4+4]

10 marks

2.  Why do we need to preprocess the data before running the algorithm? What are the processes for this? Explain. Give some examples of noise that must be removed in data while extracting the pattern.

10 marks

2. Explain the functionalities and classification of data mining system with example.

10 marks

2. Explain the various data mining task primitives in detail.

10 marks

2. How is a concept hierarchy used in extracting information? Generate the frequent patterns from the following data set using FP-growth, where minimum support = 3. [2+8]

10 marks

3. Explain the architecture and implementation of data warehouse with example.

10 marks

3. How do you compare two classifiers? Given the points A(3,7), B(4,6), C(5,5), D(6,4), E(7,3), F(6,2), G(7, 2), and H(8,4), find the core points and outliers using DBSCAN. Take Eps = 2.5 and MinPts = 3. [2+8]

10 marks
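
The DBSCAN part of the question above can be checked with a short script. This is only a sketch under stated assumptions: Euclidean distance, and the common convention that a point counts as a member of its own Eps-neighborhood. The names `points`, `neighbors`, `core`, `border`, and `noise` are illustrative and do not come from the question paper.

```python
from math import dist

# Points from the question; EPS and MIN_PTS as given there.
points = {"A": (3, 7), "B": (4, 6), "C": (5, 5), "D": (6, 4),
          "E": (7, 3), "F": (6, 2), "G": (7, 2), "H": (8, 4)}
EPS = 2.5
MIN_PTS = 3


def neighbors(name):
    """Names of all points within EPS of `name`, including `name` itself
    (assumed convention; some texts exclude the point itself)."""
    return [other for other, p in points.items()
            if dist(points[name], p) <= EPS]


# A core point has at least MIN_PTS points in its Eps-neighborhood.
core = sorted(n for n in points if len(neighbors(n)) >= MIN_PTS)
# A border point is not core but lies in the neighborhood of a core point.
border = sorted(n for n in points
                if n not in core and any(m in core for m in neighbors(n)))
# Everything else is noise, i.e. an outlier.
noise = sorted(n for n in points if n not in core and n not in border)

print("core:", core)
print("border:", border)
print("noise (outliers):", noise)
```

Under these assumptions the script reports B to H as core points, A as a border point, and no outliers. If the neighborhood is taken to exclude the point itself, the counts drop by one and the classification can change.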

3. Explain the architecture of data mining system with schematic diagram.

10 marks

3. What kind of data preprocessing do we need before applying a data mining algorithm to any data set? Explain the binning method to handle noisy data with an example.

10 marks

4. What are the stages of knowledge discovery in database (KDD)?

5 marks

4. What are the basic stages of KDD?

5 marks

4. What is the purpose of cluster analysis in data mining? Explain.

5 marks

5. How does KDD differ from data mining? Describe the stages of data mining.

5 marks

2. "Data mining is a part of KDD." Do you agree or disagree? Justify. Explain the different stages in KDD. [3+7]

10 marks

5.  Describe the types of data used in data mining.

5 marks

4. How does classification play a significant role in data mining? Explain.

5 marks

4. When is a pattern said to be interesting? List the issues of data mining. [1+4]

5 marks

5. Is the information given by data mining always useful? What are the issues in data warehousing and data mining?

5 marks

6. Differentiate between OLAP and OLTP.

5 marks

3. How can data be modeled in a multidimensional data model? Explain the conceptual modeling of a data warehouse. [4+6]

10 marks

5. Define data discretization. Describe the tasks for data preprocessing. [1+4]

5 marks

7. Differentiate between KDD and Data Mining.

5 marks

6. Explain the four characteristics of a data warehouse.

5 marks

7. Differentiate between KDD and Data Mining.

5 marks

6. Define spatial data mining. What are the challenges of multimedia mining? Describe with an example. [2+3]

5 marks

7. Consider the following data set.

Find out whether the object with attribute Confident = Yes, Sick = No will Fail or Pass using Bayesian classification.[5]

5 marks

4. In real-world data, tuples with missing values for some attributes are a common occurrence. Describe various methods for handling this problem. [5]

5 marks

8. What are the choices for data cube materialization? Explain the strategies for cube computation. [2+3]

5 marks

5. Can we use an operational database instead of a data warehouse? List the nature of a data warehouse. [1+4]

5 marks

9. Show the conflict between the theories of balance and status. How do you improve Apriori? [2+3]

5 marks

6. Why is it necessary to pre-compute the data cube? What are the possible issues in performing data cube computation? [3+2]

5 marks

10. Differentiate between star schema and snowflake schema. List any two methods for data normalization. [2+3]

5 marks

7. Describe any three methods to normalize the group of data.[5]

5 marks

11. Differentiate between KDD and data mining.

5 marks

10. Describe the use of a genetic algorithm as a problem-solving technique in data mining.

5 marks

11. How do you evaluate the accuracy of a classifier? Discuss the advantages of using k-fold cross-validation. [2+3]

5 marks

13. Write short notes (Any Two)

     a) MOLAP

     b) Data cubes

     c) Snowflakes

     d) Regression

5 marks

8. What is the significance of association rules in data mining? List the types of association rules with examples. [2+3]

5 marks

13. Write short notes (Any Two)

     a) Stars

     b) HOLAP

     c) Data Specification

     d) Mining and world wide web (WWW)

5 marks

13. Write short notes (Any Two)

    a) HOLAP

    b) Hierarchy specification

    c) Spatial Database

5 marks

13. Write short notes (Any Two)

     a) Data cubes

     b) HOLAP

     c) Spatial Database

5 marks

12. Apply the K-Means algorithm (K = 2) to the data (185, 72), (170, 56), (168, 60), (179, 68), (182, 72), (188, 77) for up to two iterations and show the clusters. Initially choose the first two objects as the initial centroids. [5]

5 marks
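
The two K-Means iterations asked for above can be traced with a short script. This is a minimal sketch assuming Euclidean distance and the plain assign-then-recompute loop; the helper names `assign` and `mean` are illustrative, and the code does not handle an empty cluster (none occurs for this data).

```python
from math import dist

# Data and K = 2 from the question; the first two objects are the initial centroids.
data = [(185, 72), (170, 56), (168, 60), (179, 68), (182, 72), (188, 77)]
centroids = [data[0], data[1]]


def assign(points, centers):
    """Put each point into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def mean(cluster):
    """Coordinate-wise mean of a non-empty cluster."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


for iteration in (1, 2):
    clusters = assign(data, centroids)          # assignment step
    centroids = [mean(c) for c in clusters]     # update step
    print(f"iteration {iteration}: clusters={clusters} centroids={centroids}")
```

With these assumptions, both iterations give the same grouping: {(185, 72), (179, 68), (182, 72), (188, 77)} and {(170, 56), (168, 60)}, with centroids (183.5, 72.25) and (169, 58).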

9. How do you index OLAP data? Give examples.[5]

5 marks

13. Write short notes (Any Two)

a) Text Database Mining

b) Back propagation Algorithm

c) Regression

d) HOLAP

5 marks

13. Write short notes on (Any Two)

a. Evolution analysis

b. Decision trees

c. Text mining

d. Classification using Regression

5 marks

10. Apriori needs to scan the dataset many times, which reduces its efficiency. Explain some mechanisms to improve its efficiency. [5]

5 marks

11. Differentiate between OLTP and OLAP. [5]

5 marks

12. Which approach is better for clustering, hierarchical or partitioning? Justify. List some drawbacks of k-means. [2+3]

5 marks

13. Write short notes.(Any Two)

a. Outlier Analysis

b. Web Mining

c. Query Manager

d. Pros and Cons of Association rules

5 marks

Unit 2: Introduction to Data Mining
24 Questions

1. Explain the architecture of Data mining system with block diagram.

10 marks

2. Explain the DBMS vs. Data Warehouse.

10 marks

2. Do pattern and information refer to the same aspect? Justify. Differentiate between a data warehouse and an operational database.

10 marks

1. Suppose that a data warehouse for Big University consists of the following four dimensions: student, course, semester, and instructor, and two measures count and avg-grade. When at the lowest conceptual level (e.g., for a given student, course, semester, and instructor combination), the avg-grade measure stores the actual course grade of the student. At higher conceptual levels, avg-grade stores the average grade for the given combination.

a) Draw a snowflake schema diagram for the data warehouse.

b) Starting with the base cuboid [student, course, semester, instructor], what specific OLAP operations (e.g., roll-up from semester to year) should one perform in order to list the average grade of CS courses for each Big University student?

c) If each dimension has five levels (including all), such as “student < major < status < university < all”, how many cuboids will this cube contain (including the base and apex cuboids)?

10 marks
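
For part (c), the usual counting rule is that a cube with n dimensions has (L_1 + 1) × ... × (L_n + 1) cuboids, where L_i is the number of hierarchy levels of dimension i excluding the top level "all". The sketch below applies that rule to the four dimensions in the question; the dictionary name `levels_excluding_all` is illustrative.

```python
from math import prod

# Five levels per dimension including "all", so four levels excluding it.
levels_excluding_all = {"student": 4, "course": 4, "semester": 4, "instructor": 4}

# Each dimension contributes (L_i + 1) choices: one of its levels, or "all".
total_cuboids = prod(levels + 1 for levels in levels_excluding_all.values())

print(total_cuboids)  # 5 * 5 * 5 * 5
```

Under this rule the count is 5^4 = 625 cuboids, including the base and apex cuboids.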

3. Explain the data warehouse architecture. Differentiate between distributed and virtual data warehouse.

10 marks

3. Explain the architecture and implementation of a data warehouse with an example.

10 marks

4. What do you mean by knowledge discovery in database (KDD)?

5 marks

4. Explain the multidimensional data model with example.

5 marks

4. Differentiate between data marts and metadata.

5 marks

5. Explain the application of data warehouse and data mining.

5 marks

5. What do you mean by a virtual data warehouse?

5 marks

5. Differentiate between DBMS and Data Warehouse.

5 marks

5. List the functions of metadata.

5 marks

6. Explain the distributed and virtual data warehouse.

5 marks

6.  Explain the similarities and dissimilarities between operational database and data warehouse.

5 marks

5. Differentiate between data marts and data cubes.

5 marks

7. Explain the multidimensional data model.

5 marks

7. Explain the data mining techniques.

5 marks

8. How are different schemas used to model a data warehouse? Explain.

5 marks

9. Why is data cube computation an essential task in data mining? Describe the general strategy in data cube computation.

5 marks

8. How does the multidimensional data model help in retrieving information? Explain with a suitable example.

5 marks

10.  Describe the different components of a data warehouse.

5 marks

11. Define dimension table and fact table. Why is a multidimensional data model necessary?

5 marks

12. What is DMQL? How do you define Star Schema using DMQL?

5 marks

Unit 3: Data Preprocessing
10 Questions

2. Describe how bitmap and join indexing are used to represent OLAP data. Explain the different components of data warehouse.

10 marks

5. Differentiate between OLTP and OLAP.

5 marks

6. Differentiate between OLAP and OLTP.

5 marks

6. Explain OLAP operations with examples.

5 marks

7.  List the types of OLAP operations with example.

5 marks

6. Explain OLAP operations with examples.

5 marks

9. Compare the OLAP servers, ROLAP, MOLAP and HOLAP.

5 marks

11. Differentiate between OLTP and OLAP.

5 marks

12. Explain the data mining languages.

5 marks

13.  Write short notes on (any two):

            a)  Concept hierarchy

            b)  Data mining Query Language

            c)  Text mining

            d)  ROLAP vs MOLAP

5 marks

Unit 4: Data Cube Technology
8 Questions

6. Explain the tuning and testing of Data Warehouse.

5 marks

6. Explain the tuning and testing of Data Warehouse.

5 marks

8. List down the data mining tools.

5 marks

8. What are the data warehouse back end tools? Explain.

5 marks

7. Explain the optimization techniques in data cube computation.

5 marks

9. Describe the significance of pre-computing the data cube.

5 marks

11. What is data cube? Explain with example.

5 marks

12. What does data warehouse tuning mean? Describe the parameters.

5 marks

Unit 5: Mining Frequent Patterns
2 Questions

7. Explain the data cube with example.

5 marks

8. Explain the Apriori Algorithm.

5 marks

Unit 6: Classification and Prediction
6 Questions

1. Consider the following training data set of 14 examples, which assigns a credit risk of high, moderate, or low to people based on the following properties of their credit rating:

a. Collateral with possible values { Adequate, None}

b. Income with possible values {"Rs 0K to Rs 15K","Rs 15 K to Rs 35K","Over Rs 35 K"}

c. Debt with possible values{ High, Low}

d. Credit history with possible values {Good, Bad, Unknown}

Classify the individual with credit history=unknown, debt  = low, collateral = adequate and income = Rs 15K to Rs 35K using decision tree algorithm. Use ID3 algorithm for building the decision tree.[10]


10 marks

4. List and describe the five primitives for specifying a data mining task.

5 marks

7. Explain the primitives of data mining query language.

5 marks

8. Explain the data mining query language with example.

5 marks

8. Explain the data mining query language.

5 marks

10. Give the syntax and an example of data mining query language.

5 marks

Unit 7: Cluster Analysis
15 Questions

1. What do you mean by representative object based clustering technique? Explain in detail with example.

10 marks

1. Discuss the types of web mining. Explain why K-means is sensitive to outliers and how K-medoids minimizes this issue.

10 marks

2. Define clustering. Explain the partitioning and hierarchical clustering methods with examples.

10 marks

2. What do you mean by clustering? Explain the K-means and K-medoids algorithms with examples.

10 marks

3. List the two steps used in the classification approach, with their issues. Is it always the right decision to use a neural network as a classifier? Give your opinion. Discuss the working mechanism of the back propagation classification algorithm.

10 marks

3. Explain the K-means and K-medoids algorithms with examples.

10 marks

8. Illustrate the strengths and weaknesses of k-means in comparison with the k-medoids algorithm.

5 marks

9. Explain the K-medoids algorithm.

5 marks

7. List the drawbacks of the ID3 algorithm, including over-fitting, and the techniques to remedy it.

5 marks

8. Write the algorithm for K-means clustering. Compare it with k-nearest neighbor algorithm.

5 marks

10. What are the types of Regression? Explain.

5 marks

10. Explain the types of Regression.

5 marks

10. What is the objective of K-means algorithm?

5 marks

12. Discuss the approach behind Bayesian classification. Why is a smoothing technique necessary in Bayesian classification?

5 marks

13. Write short notes (Any Two)

     a) OLAP queries

     b) Snow flakes

     c) K-mean

     d) Mining text databases

5 marks

Unit 8: Graph Mining and Social Network Analysis
11 Questions

1. You are given the transaction data shown below from a fast food restaurant. There are 9 distinct transactions (order 1 to order 9) and a total of 5 meal items (M1 to M5) involved in the transactions.

Meal Items          List of item IDs

order 1             M1, M2, M5

order 2             M2, M4

order 3             M2, M3

order 4             M1, M2, M4

order 5             M1, M3

order 6             M2, M3

order 7             M1, M3

order 8             M1, M2, M3, M5

order 9             M1, M2, M3

Minimum support = 2, minimum confidence = 0.7

Apply the Apriori algorithm to the database to identify the frequent k-itemsets and find all strong association rules.

10 marks
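
A hand-worked answer to the question above can be cross-checked with a small level-wise Apriori sketch. It assumes support is an absolute transaction count, and it reads the order-to-items pairing from the table in the order the rows appear. The function names `apriori` and `strong_rules` are illustrative, not taken from any library.

```python
from itertools import combinations

# Orders 1 to 9 from the question.
orders = [
    {"M1", "M2", "M5"}, {"M2", "M4"}, {"M2", "M3"},
    {"M1", "M2", "M4"}, {"M1", "M3"}, {"M2", "M3"},
    {"M1", "M3"}, {"M1", "M2", "M3", "M5"}, {"M1", "M2", "M3"},
]
MIN_SUPPORT = 2        # absolute count (assumption)
MIN_CONFIDENCE = 0.7


def apriori(transactions, min_support):
    """Return {frequent itemset: support count}, built level by level."""
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # One scan of the data per level to count the candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                confidence = count / frequent[frozenset(lhs)]
                if confidence >= min_confidence:
                    rhs = tuple(sorted(itemset - frozenset(lhs)))
                    rules.append((lhs, rhs, confidence))
    return rules


frequent = apriori(orders, MIN_SUPPORT)
rules = strong_rules(frequent, MIN_CONFIDENCE)
for lhs, rhs, confidence in rules:
    print(f"{lhs} -> {rhs}  confidence={confidence:.2f}")
```

The pruning in `apriori` is a simplified form of the textbook join-and-prune step, chosen for brevity rather than speed. The confidence of a rule X -> Y is computed as support(X ∪ Y) / support(X).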

2. A= {A1, A2, A3, A4, A5, A6}, Assume σ = 35%. Use Apriori algorithm to get the desired solution.


A1   A2   A3   A4   A5   A6
0    0    0    1    1    1
0    1    1    1    0    0
1    0    0    1    1    1
1    1    0    1    0    0
1    0    1    0    1    1
0    1    1    1    0    1
0    0    0    1    1    0
0    1    0    1    0    1
1    0    0    1    0    0
1    1    1    1    1    1


10 marks

3. List the problems of Apriori algorithm with its possible solutions. Consider the following transaction dataset.

Transaction_ID          Item_List

T1                                 {K, A, D, B}

T2                                 {D,A,C,E,B}

T3                                 {C,A,B,E}

T4                                 {B,A,D}

What association rules can be found in this set if the minimum support is 3 and the minimum confidence is 80%?

10 marks

3. Give any two types of association rules with example. Trace the results of using the Apriori algorithm on the grocery store example with support threshold 2 and confidence threshold 60 %. Show the candidate and frequent itemsets for each database scan. Enumerate all the final frequent itemsets. Also indicate the association rules that are generated.

Transaction_ID          Items

T1                      HotDogs, Buns, Ketchup

T2                      HotDogs, Buns

T3                      HotDogs, Coke, Chips

T4                      Chips, Coke

T5                      Chips, Ketchup

T6                      HotDogs, Coke, Chips

10 marks

4. Explain the use of frequent item set generation process.

5 marks

9. Explain the Apriori Algorithm.

5 marks

9. What are the advantages and disadvantages of association rules?

5 marks

9. Write down the two measures of association rule.

5 marks

11. Explain the association rules with advantages and disadvantages.

5 marks

11. Explain the Apriori Algorithm.

5 marks

12. Explain the Apriori Algorithm.

5 marks

Unit 9: Mining Spatial, Multimedia, Text and Web Data
10 Questions

1. List some issues of multimedia mining. Describe how back propagation is used in classification.

10 marks

9. Explain the data mining tasks performed on a text database.

5 marks

10. Define the spatial database and its features.

5 marks

10. Define the spatial database and its features.

5 marks

11. Explain the application of spatial databases.

5 marks

9. What is text mining? Explain the text indexing techniques.

5 marks

12. Explain mining text databases.

5 marks

12. Explain the application of mining used in WWW.

5 marks

12. Explain the methods of mining multimedia database.

5 marks

11. What do you mean by WWW mining? Explain WWW mining techniques.

5 marks