Data Warehousing and Data Mining - Syllabus

Course Overview and Structure

The Data Warehousing and Data Mining course (CSC-451) is offered by the CSIT department of Tribhuvan University and follows the 2065 Syllabus. It combines theoretical frameworks with practical sessions. Assessment uses a 60+20+20 marks system, and students must meet a minimum passing threshold.

The course carries 3 credit hours. Alongside the theory, students take part in practical sessions that build skills applicable to real-world scenarios.


Course Synopsis: Analysis of advanced aspects of data warehousing and data mining.
Goal: This course introduces advanced aspects of data warehousing and data mining, encompassing the principles, research results, and commercial applications of current technologies.

Units

Key Topics

  • Compiler Structure
    UN-1.1

    Analysis and Synthesis Model of Compilation, including different sub-phases within analysis and synthesis phases.

  • Compiler Concepts
    UN-1.2

    Basic concepts related to Compiler, including interpreter, simple One-Pass Compiler, preprocessor, macros, symbol table, and error handler.

  • Institutional Infrastructural Preparedness
    UN-1.3

    Institutional infrastructural preparedness refers to the readiness of government agencies and institutions to adopt and implement e-governance systems.

  • Human Infrastructural Preparedness
    UN-1.4

    Human infrastructural preparedness involves the development of skills and capacities of public officials and citizens to effectively use e-governance systems.

  • Technological Infrastructural Preparedness
    UN-1.5

    Technological infrastructural preparedness refers to the availability and quality of technology infrastructure, including computers, internet connectivity, and other digital tools.

Key Topics

  • E-readiness
    UN-1

    E-readiness refers to the state of preparedness of a country or organization to participate in the digital economy. It involves assessing the availability and quality of digital system infrastructure, legal frameworks, institutional arrangements, human resources, and technological capabilities.

  • Evolutionary Stages in E-Governance
    UN-2

    The evolutionary stages in e-governance refer to the different phases of development and implementation of e-governance initiatives, from basic online presence to integrated and transformative e-governance systems.

  • Internetworking
    UN-3

    Bridges and routers in distributed networking, enabling communication between different networks.

  • Internet Design and Evolution
    UN-4

    History and development of the internet, including its design principles and evolution over time.

  • Data Cubes
    UN-5

    A multidimensional representation of data, where each dimension represents a different aspect of the data, used for fast querying and data analysis.

  • Schemes for Multidimensional Database
    UN-6

    Different schemes used to design and implement multidimensional databases, including Stars, Snowflakes, and Fact Constellations.

  • Stars
    UN-7

    A type of multidimensional database scheme, characterized by a central fact table surrounded by dimension tables.

  • Snowflakes
    UN-8

    A type of multidimensional database scheme, characterized by a central fact table surrounded by multiple levels of dimension tables.

  • Fact Constellations
    UN-9

    A type of multidimensional database scheme, characterized by multiple fact tables connected by dimension tables.
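To make the data cube and star scheme entries above concrete, here is a minimal Python sketch. It assumes a hypothetical sales fact table with two small dimension tables (`dim_product`, `dim_store`); the table names, values, and the `roll_up` helper are illustrative only and do not come from the syllabus. Each call to `roll_up` computes one aggregated view (one cuboid) of the cube.

```python
from collections import defaultdict

# Hypothetical star scheme: one central fact table plus two dimension tables.
dim_product = {
    1: {"name": "Laptop", "category": "Electronics"},
    2: {"name": "Phone", "category": "Electronics"},
    3: {"name": "Desk", "category": "Furniture"},
}
dim_store = {10: {"city": "Kathmandu"}, 20: {"city": "Pokhara"}}

# Fact rows: (product_id, store_id, quarter, units_sold)
fact_sales = [
    (1, 10, "Q1", 5),
    (2, 10, "Q1", 8),
    (3, 20, "Q1", 2),
    (1, 20, "Q2", 4),
    (3, 10, "Q2", 7),
]


def roll_up(facts, dimensions):
    """Sum units_sold grouped by the chosen dimensions (one cuboid of the cube)."""
    totals = defaultdict(int)
    for product_id, store_id, quarter, units in facts:
        # Join the fact row with its dimension tables.
        row = {
            "category": dim_product[product_id]["category"],
            "city": dim_store[store_id]["city"],
            "quarter": quarter,
        }
        key = tuple(row[d] for d in dimensions)
        totals[key] += units
    return dict(totals)


by_category_quarter = roll_up(fact_sales, ["category", "quarter"])
by_city = roll_up(fact_sales, ["city"])
grand_total = roll_up(fact_sales, [])  # the apex cuboid: no grouping at all
```

In a snowflake scheme the `category` attribute would typically live in a further table referenced from `dim_product`, and a fact constellation would add a second fact table sharing these dimension tables.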

Key Topics

  • Symbol Table Design
    UN-3.1

    Function of the symbol table, information provided by the symbol table, and attributes and data structures for the symbol table.

  • Run-time Storage Management
    UN-3.2

    Managing storage during runtime.

  • Database Recovery
    UN-3.3

    Failure Classification, The Storage Hierarchy, Transaction Model, Log-Based recovery, Buffer Management, Checkpoints, Shadow Paging, Failure with Loss of Non-volatile Storage.

  • Querying Role Information
    UN-3.4

    Querying role information involves retrieving information about roles and their associated privileges. This topic covers the different methods for querying role information.

  • Database Security and Auditing
    UN-3.5

    Database security and auditing involve ensuring the confidentiality, integrity, and availability of database data. This topic covers the different security measures and auditing techniques.

  • Creating and Managing Databases
    UN-3.6

    Creating and managing databases involves designing, creating, and modifying database structures. This topic covers the basics of database creation and management.

  • Creating and Managing Tables
    UN-3.7

    Creating and managing tables involves designing, creating, and modifying table structures. This topic covers the basics of table creation and management.

Key Topics

  • Intermediate Code Generation
    UN-4.1

    This topic covers the generation of intermediate code, including high-level and low-level representations, syntax trees, and three-address code. It also discusses the generation of intermediate code for declarations, assignments, control flow, boolean expressions, and procedure calls.

  • Code Generation
    UN-4.2

    This topic explores the factors affecting code generation, including target language, basic blocks, and flow graphs. It also covers dynamic programming code-generation algorithms.

  • Code Optimization
    UN-4.3

    This topic discusses the need and criteria for code optimization, as well as basic optimization techniques to improve code efficiency.

  • Compiler Case Studies
    UN-4.4

    This topic presents case studies of compilers, including C and C++ compilers, to illustrate the application of compiler design principles.

  • Testing the Backup and Recovery Plan
    UN-4.5

    Validating the effectiveness of a backup and recovery strategy through regular testing and simulation exercises.

Key Topics

  • Introduction to Virtual Reality
    UN-5.1

    This topic covers the fundamental concepts and principles of Virtual Reality (VR), including its history, applications, and key technologies.

  • Introduction to Animation
    UN-5.2

    This topic provides an overview of the basics of animation, including its history, types, and key concepts, as well as its applications in computer graphics.

  • Automatic Storage Management
    UN-5.3

    Automatic storage management is a feature that automates the management of database storage, including disk space allocation and deallocation. This topic covers the concepts and best practices of automatic storage management.

  • RMAN (Recovery Manager)
    UN-5.4

    RMAN is a utility provided by Oracle for backing up, restoring, and recovering databases. This topic covers the features, benefits, and usage of RMAN in database administration.

  • Data Mining Applications
    UN-5.5

    Examining the various applications of Data Mining in different industries, including marketing, finance, and healthcare. Understanding the benefits and challenges of Data Mining in real-world scenarios.

Data mining query languages, data specification, specifying knowledge, hierarchy specification, pattern presentation & visualization specification, data languages and standardization of data mining.

Mining Association Rules in Large Databases: Association Rule Mining, why Association Mining is necessary, Pros and Cons of Association Rules, Apriori Algorithm.
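The Apriori algorithm named above can be sketched in a few lines of Python. This is a minimal illustration, not an optimized implementation: the `apriori` function, the `baskets` data, and the absolute `min_support` count are assumptions made for the example. It relies on the Apriori property that every subset of a frequent itemset must itself be frequent.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count the transactions containing each candidate; keep the frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    items = {item for t in transactions for item in t}
    frequent = count({frozenset([i]) for i in items})
    result = dict(frequent)
    k = 2
    while frequent:
        prev = list(frequent)
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        frequent = count(candidates)
        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]
frequent_itemsets = apriori(baskets, min_support=3)

# Confidence of the rule {beer} -> {diaper} = support(beer, diaper) / support(beer).
confidence = (
    frequent_itemsets[frozenset({"beer", "diaper"})]
    / frequent_itemsets[frozenset({"beer"})]
)
```

With these baskets and a minimum support count of 3, four single items and four pairs are frequent, and no three-item set reaches the threshold.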

Key Topics

  • Classification and Prediction Issues
    UN-8.1

    Discussion of common issues and challenges in classification and prediction, including data quality, class imbalance, and overfitting.

  • Classification by Decision Tree Induction
    UN-8.2

    Introduction to decision tree induction as a method for classification, including how to construct and prune decision trees.

  • Introduction to Regression
    UN-8.3

    Overview of regression analysis, including simple and multiple regression, and its applications in data mining.

  • Types of Regression
    UN-8.4

    Exploration of different types of regression, including linear, logistic, and nonlinear regression.

  • Introduction to Clustering
    UN-8.5

    Fundamentals of clustering, including types of clustering, clustering algorithms, and applications in data mining.

  • K-Means and K-Medoids Algorithms
    UN-8.6

    In-depth look at the K-Means and K-Medoids algorithms, including how they work, their advantages, and their limitations.
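As an illustration of the clustering topics above, the following is a minimal Python sketch of K-Means on 2-D points. It assumes the caller supplies the initial centroids; the `k_means` function and the sample points are hypothetical. It alternates the two standard steps, assignment and centroid update, until the centroids stop moving.

```python
def k_means(points, centroids, max_iter=100):
    """Iterate assignment and update steps from the given initial centroids."""
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [
                (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids
            ]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data: two visually separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, [(1, 1), (8, 8)])
```

K-Medoids follows the same overall loop but restricts each cluster center to an actual data point (the medoid), which makes it less sensitive to outliers at the cost of more computation per iteration.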

Mining Complex Types of Data: Mining Text Databases, Mining the World Wide Web, Mining Multimedia and Spatial Databases.