Data Warehousing and Data Mining - Syllabus
Data Warehousing and Data Mining (DWDM, course code CSC410) is offered by Tribhuvan University's CSIT department under the 2074 Syllabus. The course combines theoretical frameworks with practical sessions to build a thorough understanding of the subject. Assessment follows a 60 + 20 + 20 marks system with a minimum passing threshold.
This 3 credit-hour course links theory to application: alongside the lectures, students take part in practical sessions that build skills for real-world scenarios. The sections below give the course description, objective, unit-wise key topics, and laboratory work.
Course Description:
This course introduces advanced aspects of data warehousing and data mining, encompassing the principles, research results and commercial application of the current technologies.
Course Objective:
The main objective of this course is to provide knowledge of different data mining techniques and data warehousing.
Units
Key Topics
-
Introduction to E-commerce
IN-1: Overview of E-commerce and its significance in the digital age.
-
E-business vs E-commerce
IN-2: Understanding the differences between E-business and E-commerce.
-
Features of E-commerce
IN-3: Key characteristics and benefits of E-commerce.
-
Pure vs Partial E-commerce
IN-4: Types of E-commerce models and their applications.
-
History of E-commerce
IN-5: Evolution and development of E-commerce over time.
-
E-commerce Framework
IN-6: Understanding the components of E-commerce framework including People, Public Policy, Marketing and Advertisement, Support Services, and Business Partnerships.
-
Types of E-commerce
IN-7: Overview of different types of E-commerce including B2C, B2B, C2B, C2C, M-Commerce, U-commerce, Social-Ecommerce, and Local E-commerce.
-
Challenges in E-commerce
IN-8: Common obstacles and difficulties faced in E-commerce.
-
Status of E-commerce in Nepal
IN-9: Current state and trends of E-commerce in Nepal.
-
Overview of Electronic Transaction Act of Nepal
IN-10: Understanding the legal framework governing E-commerce in Nepal.
-
Software Engineering Ethics
IN-11: Ethical considerations and principles in software engineering, including accountability, privacy, and intellectual property.
-
Distributed Computing in Grid and Cloud
IN-12: Exploring the role of distributed computing in grid and cloud environments, including its applications and benefits.
-
Trends in Data Warehousing
IN-13: Current and emerging trends in data warehousing, including big data, cloud computing, and real-time analytics.
Key Topics
-
Introduction to Computers
IN-01: An overview of computers and their significance in today's world. This topic sets the stage for understanding the basics of computers.
-
Digital and Analog Computers
IN-02: Understanding the difference between digital and analog computers, their characteristics, and applications.
-
Characteristics of Computers
IN-03: Exploring the key characteristics of computers, including input, processing, storage, and output.
-
History of Computers
IN-04: A brief history of computers, from their inception to the present day, highlighting key milestones and developments.
-
Generations of Computers
IN-05: Understanding the different generations of computers, including their features, advantages, and limitations.
-
Classification of Computers
IN-06: Categorizing computers based on their size, functionality, and application, including desktops, laptops, and mobile devices.
-
The Computer System
IN-07: An in-depth look at the components of a computer system, including hardware and software.
Key Topics
-
Introduction to Databases
DA-1: Introduction to databases, including examples and basic concepts.
-
Database Management System
DA-2: Introduction to Database Management Systems (DBMS), including advantages and examples.
-
Database Users
DA-3: Types of database users, including actors on the scene and workers behind the scene.
-
Benefits of Databases
DA-4: Advantages and benefits of using databases.
-
Data Models
DA-5: Types of data models, including hierarchical, network, ER, relational, and object models.
-
Three-Schema Architecture
DA-6: Three-schema architecture, including internal, conceptual, and external views.
Key Topics
-
Control Word and Microprogram
MI-1: This topic covers the concept of control words and microprograms in microprogrammed control, including their roles in controlling the flow of data and instructions in a computer system.
-
Address Sequencing and Conditional Branch
MI-2: This topic explains how address sequencing and conditional branching are used to control the flow of instructions in a microprogrammed control unit, including the use of conditional branch instructions and subroutines.
-
Microinstruction Format and Symbolic Microinstructions
MI-3: This topic covers the format of microinstructions and the use of symbolic microinstructions to represent complex control sequences in a microprogrammed control unit.
-
Design of Control Unit
MI-4: This topic covers the design principles and considerations for building a control unit using microprogrammed control, including the organization of control memory and the role of the sequencer.
-
Association Rules
MI-5: Association rules are statements of the form X → Y that describe how the presence of some items in a transaction relates to the presence of others. They are used to identify patterns and co-occurrences between items, and are usually evaluated by support and confidence.
-
Types of Association Rules
MI-6: There are different types of association rules, including single-dimensional, multidimensional, multilevel, and quantitative rules. Each type has its own characteristics and applications.
-
Finding Frequent Itemsets
MI-7: Finding frequent itemsets means identifying the sets of items whose support meets a minimum threshold, using algorithms such as Apriori and FP-growth.
-
Generating Association Rules
MI-8: Generating association rules involves splitting each frequent itemset into an antecedent and a consequent and keeping the rules whose confidence meets a minimum threshold.
-
Limitations and Improvements of Apriori
MI-9: The Apriori algorithm is computationally expensive because it scans the database repeatedly and can generate very large candidate sets. Improvements include techniques such as sampling and parallel processing.
-
From Association Mining to Correlation Analysis
MI-10: Association mining can be extended to correlation analysis, which checks whether the items in a rule are statistically dependent rather than merely frequent together.
-
Lift
MI-11: Lift measures the strength of an association rule by comparing its confidence with the confidence expected if the antecedent and consequent were independent. A lift above 1 indicates positive correlation, a lift of 1 indicates independence, and a lift below 1 indicates negative correlation.
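The association-mining topics above (frequent itemsets, rule generation, and lift) can be illustrated with a minimal Python sketch. It is an Apriori-style level-wise search, not an optimized implementation; the function names `frequent_itemsets` and `association_rules`, the sample transactions, and the thresholds are all assumptions chosen for illustration.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)
    # Level 1 candidates: every single item that occurs anywhere.
    candidates = {frozenset([item]) for t in tx for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Scan the transactions to count the support of each candidate.
        level = {}
        for c in candidates:
            support = sum(1 for t in tx if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates, then prune
        # any candidate that has an infrequent k-item subset.
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                consequent = itemset - antecedent
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    lift = confidence / frequent[consequent]
                    rules.append((antecedent, consequent, support, confidence, lift))
    return rules

# Hypothetical market-basket data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

for ante, cons, sup, conf, lift in association_rules(
    frequent_itemsets(transactions, 0.6), 0.9
):
    print(sorted(ante), "->", sorted(cons), sup, conf, lift)
```

The prune step is what distinguishes Apriori from brute-force counting: a candidate is only counted if all of its smaller subsets were already found frequent.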
Key Topics
-
Common Client-side Web Technologies
CL-1: This topic covers the fundamental technologies used on the client-side of web development, including HTML, CSS, and JavaScript.
-
jQuery
CL-2: This topic explores the use of jQuery, a popular JavaScript library, for client-side scripting and DOM manipulation.
-
Forms and Validation
CL-3: This topic discusses the importance of form validation and how to implement it using ASP.NET Core, including client-side and server-side validation techniques.
-
Single Page Application (SPA) Frameworks
CL-4: This topic introduces Single Page Application (SPA) frameworks, including Angular and React, and their role in building dynamic and interactive client-side applications.
-
Software-as-a-Service (SaaS)
CL-5: SaaS implementation issues, key characteristics of SaaS, benefits of the SaaS model.
-
Jericho Cloud Cube Model
CL-6: A cloud service model framework.
-
User Defined Objects
CL-10: Creating custom objects with properties and methods.
-
Event Handling and Form Validation
CL-11: Handling events and validating form data with JavaScript.
-
Error Handling
CL-12: Catching and handling errors in JavaScript code.
-
Handling Cookies
CL-13: Storing and retrieving data with cookies in JavaScript.
-
Graphic Presentation
CL-7: Graphic presentation involves using graphs such as histograms, frequency polygons, and frequency curves to present data. It is a visual way of presenting data, making it easy to understand and analyze.
-
Histogram
CL-8: A histogram is a type of graph that uses bars to represent the frequency of different ranges of values. It is commonly used to display continuous data.
-
Frequency Polygon
CL-9: A frequency polygon is a type of graph that uses lines to connect the points representing the frequency of different ranges of values. It is commonly used to display continuous data.
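The histogram and frequency polygon described above both start from the same frequency table. The following is a minimal Python sketch of that computation, assuming equal-width class intervals; the function names, the `marks` data, and the chosen class width are hypothetical.

```python
def histogram_counts(values, low, width, bins):
    """Count values per class interval [low + i*width, low + (i+1)*width)."""
    counts = [0] * bins
    for v in values:
        i = int((v - low) // width)
        if 0 <= i < bins:  # values outside the covered range are ignored
            counts[i] += 1
    return counts

def frequency_polygon_points(counts, low, width):
    """Return (class midpoint, frequency) pairs; joining them with line
    segments gives the frequency polygon."""
    return [(low + (i + 0.5) * width, c) for i, c in enumerate(counts)]

# Hypothetical exam marks, for illustration only.
marks = [12, 25, 37, 41, 44, 48, 52, 55, 58, 63, 67, 71, 85]
counts = histogram_counts(marks, low=0, width=20, bins=5)

# A text rendering of the histogram: one bar per class interval.
for midpoint, frequency in frequency_polygon_points(counts, low=0, width=20):
    print(f"{midpoint:5.1f} | {'#' * frequency}")
```

The bars of the histogram use the counts directly, while the polygon plots each count at the midpoint of its class interval.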
Key Topics
-
Optimization Problems and Greedy Algorithms
GR-1: Introduction to optimization problems and the concept of optimal solutions, with an overview of greedy algorithms and their elements.
-
Greedy Algorithm Applications
GR-2: Exploration of various applications of greedy algorithms, including fractional knapsack, job sequencing with deadlines, Kruskal's algorithm, Prim's algorithm, and Dijkstra's algorithm.
-
Huffman Coding
GR-3: Introduction to Huffman coding, including its purpose, prefix codes, and the Huffman coding algorithm, along with its analysis.
-
Social Network Analysis
GR-4: Social network analysis is the process of examining social structures, relationships, and interactions within a network. It involves using graph theory and statistical methods to understand social behavior and patterns.
-
Link Mining
GR-5: Link mining is a subfield of graph mining that focuses on the analysis of links between nodes in a graph. It involves discovering patterns and relationships between entities in a network.
-
Friends of Friends
GR-6: Friends of friends is a concept in social network analysis that refers to the friends of an individual's friends, that is, the people two hops away in the network. It is used to study social relationships and network structures.
-
Degree Assortativity
GR-7: Degree assortativity is a measure of the tendency of nodes in a network to be connected to other nodes with similar degrees. It is used to study network structures and patterns.
-
Signed Networks
GR-8: Signed networks are graphs that contain both positive and negative edges, representing friendships and antagonisms between nodes. They are analyzed using theories such as structural balance and status.
-
Trust in a Network
GR-9: Trust in a network refers to the level of confidence or reliability between nodes. Algorithms such as atomic propagation and iterative propagation are used to predict trust and distrust in a network.
-
Predicting Positive and Negative Links
GR-10: This topic involves using machine learning and graph mining techniques to predict the formation of positive and negative links in a network, such as friendships and antagonisms.
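Three of the graph-mining ideas above (friends of friends, degree assortativity, and structural balance in signed networks) can be sketched in a few lines of Python. This is a minimal illustration on a toy undirected graph stored as an adjacency dictionary; the function names, the `graph` and `signs` data, and the choice of a Pearson correlation over edge endpoints for assortativity are assumptions made for the example.

```python
def friends_of_friends(graph, person):
    """People exactly two hops away: friends of friends who are not
    already direct friends (and not the person themselves)."""
    direct = graph[person]
    reachable = set()
    for friend in direct:
        reachable |= graph[friend]
    return reachable - direct - {person}

def degree_assortativity(graph):
    """Pearson correlation between the degrees at the two ends of each
    edge, counting every undirected edge in both directions."""
    pairs = [(len(graph[u]), len(graph[v])) for u in graph for v in graph[u]]
    n = len(pairs)
    mean = sum(x for x, _ in pairs) / n
    cov = sum((x - mean) * (y - mean) for x, y in pairs) / n
    var = sum((x - mean) ** 2 for x, _ in pairs) / n
    return cov / var if var else 0.0

def triangle_is_balanced(signs, a, b, c):
    """Structural balance: a triangle is balanced when the product of
    its three edge signs (+1 or -1) is positive."""
    def sign(u, v):
        return signs[frozenset((u, v))]
    return sign(a, b) * sign(b, c) * sign(a, c) > 0

# Hypothetical friendship graph (undirected, so each edge appears twice).
graph = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D", "E"},
    "D": {"B", "C"},
    "E": {"C"},
}

# Hypothetical signed edges: +1 is friendship, -1 is antagonism.
signs = {
    frozenset(("X", "Y")): +1,
    frozenset(("Y", "Z")): -1,
    frozenset(("X", "Z")): -1,
    frozenset(("X", "W")): +1,
    frozenset(("Y", "W")): -1,
}

print(sorted(friends_of_friends(graph, "A")))
print(degree_assortativity(graph))
print(triangle_is_balanced(signs, "X", "Y", "Z"))
print(triangle_is_balanced(signs, "X", "Y", "W"))
```

A negative assortativity value means that high-degree nodes tend to link to low-degree nodes, while a positive value means nodes of similar degree tend to connect.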
Key Topics
-
Spatial Data Mining
MI-01: Spatial data mining involves discovering patterns and relationships in spatial data, such as geographic information. It includes techniques for mining spatial association and spatial data cubes.
-
Spatial Data Cube
MI-02: A spatial data cube is a multidimensional representation of spatial data, allowing for efficient querying and analysis of spatial relationships.
-
Mining Spatial Association
MI-03: Mining spatial association involves discovering relationships between spatial objects, such as proximity, distance, and orientation.
-
Multimedia Data Mining
MI-04: Multimedia data mining involves discovering patterns and relationships in multimedia data, such as images, videos, and audio files.
-
Similarity Search in Multimedia Data
MI-05: Similarity search in multimedia data involves finding similar multimedia objects based on their features and attributes.
-
Mining Association in Multimedia Data
MI-06: Mining association in multimedia data involves discovering relationships between multimedia objects, such as co-occurrence and correlation.
-
Text Mining
MI-07: Text mining involves discovering patterns and relationships in unstructured text data, using techniques from natural language processing and information extraction.
-
Web Mining
MI-08: Web mining involves discovering patterns and relationships in web data, including web content, structure, and usage.
-
Web Content Mining
MI-09: Web content mining involves extracting useful information from web pages, such as text, images, and links.
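Text mining and similarity search, as described above, both rely on turning objects into feature vectors and comparing them. The sketch below shows one common way to do this for text, assuming whitespace tokenization, TF-IDF weighting, and cosine similarity; the source does not prescribe these techniques, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Turn each document into a sparse {term: weight} TF-IDF vector."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is zero)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0

# Hypothetical documents, for illustration only.
docs = [
    "data mining finds patterns in data",
    "text mining finds patterns in text",
    "the warehouse stores historical sales records",
]
vectors = tf_idf_vectors(docs)
print(cosine(vectors[0], vectors[1]))
print(cosine(vectors[0], vectors[2]))
```

The first two documents share several terms and so score above zero, while the third shares none with the first and scores zero. The same vector-and-distance pattern applies to multimedia similarity search once features have been extracted.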
Lab works
Laboratory Works:
The laboratory work should cover all the features mentioned in the course. This includes data preprocessing and cleaning; implementing classification, clustering, and association algorithms in any programming language; and data visualization through data mining tools.
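As a starting point for the preprocessing and clustering parts of the laboratory work, the following Python sketch fills missing values, rescales a column, and clusters the result. It is a minimal illustration under stated assumptions: median imputation, min-max scaling, and one-dimensional k-means are choices made here, not requirements of the syllabus, and the function names and `ages` data are hypothetical.

```python
import statistics

def clean_and_scale(column):
    """Replace missing values (None) with the median of the known values,
    then min-max scale the column to the range [0, 1]."""
    known = [v for v in column if v is not None]
    median = statistics.median(known)
    filled = [median if v is None else v for v in column]
    lo, hi = min(filled), max(filled)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in filled]
    return [(v - lo) / (hi - lo) for v in filled]

def k_means_1d(values, centroids, iterations=10):
    """Lloyd's k-means on one-dimensional data from given start centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical attribute with one missing value, for illustration only.
ages = [22, 25, None, 24, 61, 58, 64]
scaled = clean_and_scale(ages)
centroids, clusters = k_means_1d(scaled, centroids=[0.0, 1.0])
print(scaled)
print(centroids)
print(clusters)
```

In a real lab exercise the starting centroids would usually be chosen at random and the data would have several attributes, but the assignment and update steps stay the same.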