Data Warehousing and Data Mining 2078

Question Paper Details
Tribhuvan University
Institute of Science and Technology
2078
Bachelor Level / Seventh Semester / Science
Computer Science and Information Technology (CSC410)
(Data Warehousing and Data Mining)
Full Marks: 60
Pass Marks: 24
Time: 3 hours
Candidates are required to give their answers in their own words as far as practicable.
The figures in the margin indicate full marks.

Group A

Attempt any two questions: (2*10=20)

1. Consider the following training dataset of 14 examples, in which people are assigned a credit risk of high, moderate or low based on the following properties of their credit rating:

a. Collateral with possible values {Adequate, None}

b. Income with possible values {"Rs 0K to Rs 15K", "Rs 15K to Rs 35K", "Over Rs 35K"}

c. Debt with possible values {High, Low}

d. Credit history with possible values {Good, Bad, Unknown}

Classify the individual with credit history = Unknown, debt = Low, collateral = Adequate and income = Rs 15K to Rs 35K using a decision tree. Use the ID3 algorithm to build the decision tree. [10]

2. "Data mining is a part of KDD." Do you agree or disagree? Justify. Explain the different stages in KDD. [3+7]

3. How can data be modeled in a multidimensional data model? Explain the conceptual modeling of a data warehouse. [4+6]

Group B

Short Answer Questions. [5*8 = 40]

4. In real-world data, tuples with missing values for some attributes are a common occurrence. Describe various methods for handling this problem. [5]

5. Can we use an operational database instead of a data warehouse? List the characteristics of a data warehouse. [1+4]

6. Why is it necessary to pre-compute the data cube? What are the possible issues in performing data cube computation? [3+2]

7. Describe any three methods to normalize a group of data. [5]

8. What is the significance of association rules in data mining? List the types of association rules with examples. [2+3]

9. How do you index OLAP data? Give examples.[5]

10. Apriori needs to scan the dataset many times, which reduces its efficiency. Explain some mechanisms to improve its efficiency. [5]

11. Differentiate between OLTP and OLAP. [5]

12. Which approach is better for clustering, hierarchical or partitioning? Justify. List some drawbacks of k-means. [2+3]

13. Write short notes on any two: [5]

a. Outlier Analysis

b. Web Mining

c. Query Manager

d. Pros and Cons of Association Rules