Statistics II - Syllabus
Statistics II (STA210) is a course in the CSIT department of Tribhuvan University and follows the 2074 syllabus. It combines theoretical frameworks with practical sessions. Assessment follows a 60 + 20 + 20 marks system, and students must meet a minimum passing threshold.
This 3 credit-hour course bridges theory and application: alongside the lectures, students work through practical sessions that build skills for real-world problems.
Course objectives:
To impart theoretical as well as practical knowledge of estimation, testing of hypotheses,
application of parametric and non-parametric statistical tests, design of experiments, multiple
regression analysis, and the basic concepts of stochastic processes, with special focus on data and
problems related to computer science and information technology.
Units
Key Topics
- Sampling Distribution (SA-1): The probability distribution of a statistic computed from repeated samples of the same size drawn from a population. It is a fundamental concept in inferential statistics.
- Sampling Distribution of Mean and Proportion (SA-2): The distributions of the sample mean and the sample proportion, which are used to make inferences about the population mean and proportion.
- Central Limit Theorem (SA-3): A fundamental theorem stating that, for a sufficiently large sample size, the sampling distribution of the mean is approximately normal even if the population distribution is not normal, provided the population variance is finite.
- Concept of Inferential Statistics (SA-4): The branch of statistics that deals with making inferences about a population based on a sample of data.
- Estimation (SA-5): The process of using sample data to approximate an unknown population parameter, either as a single value (point estimate) or as a range of values (interval estimate).
- Methods of Estimation (SA-6): Techniques used to estimate population parameters, such as maximum likelihood estimation and the method of moments.
- Properties of Good Estimator (SA-7): The characteristics of a good estimator, including unbiasedness, consistency, and efficiency.
- Determination of Sample Size (SA-8): The process of determining the sample size required to achieve a desired level of precision in estimation.
- Relationship of Sample Size with Desired Level of Error (SA-9): How the sample size relates to the desired level of error in estimation, including the concept of margin of error. When estimating a mean at a fixed confidence level, the margin of error is proportional to 1/√n, so halving the error requires roughly four times the sample size.
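The ideas of sampling distribution, the Central Limit Theorem, and sample size determination can be sketched with a short simulation. The example below is only an illustration in Python (the syllabus itself suggests SPSS or STATA): the function names, the Exponential(1) population, and the values sigma = 15 and margin = 3 are all assumptions chosen for the demonstration, and the sample size formula assumes a known population standard deviation.

```python
import math
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from an Exponential(1) population
    (mean 1, standard deviation 1) and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n such that z * sigma / sqrt(n) <= margin.

    Assumes the population standard deviation `sigma` is known.
    """
    z = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


# The population is skewed, yet the sample means cluster around the
# population mean with spread close to sigma / sqrt(n).
means = simulate_sample_means(n=30, reps=5000)
print("mean of sample means:  ", round(statistics.fmean(means), 3))
print("std. error (simulated):", round(statistics.stdev(means), 3))
print("std. error (theory):   ", round(1 / math.sqrt(30), 3))

# Halving the margin of error roughly quadruples the required sample size.
print("n for margin 3:  ", required_sample_size(sigma=15, margin=3))
print("n for margin 1.5:", required_sample_size(sigma=15, margin=1.5))
```

Plotting a histogram of `means` would show the approximately normal shape predicted by the Central Limit Theorem, even though the underlying population is strongly right-skewed.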
Key Topics
- Types of Statistical Hypotheses (TE-1): The different types of statistical hypotheses, including null and alternative hypotheses, and their roles in hypothesis testing.
- Power of the Test and P-Value (TE-2): The power of a test (the probability of rejecting a false null hypothesis) and the p-value, and how each is used in decision making during hypothesis testing.
- Steps in Testing of Hypothesis (TE-3): The steps involved in testing a hypothesis, from formulating the hypotheses to making a decision based on the test results.
- One Sample Tests for Mean of Normal Population (TE-4): One sample tests for the mean of a normal population, for both known and unknown variance.
- Test for Single Proportion (TE-5): How to conduct a test for a single proportion, including the test statistic and p-value calculation.
- Test for Difference between Two Means (TE-6): The test for the difference between two means, including the test statistic and p-value calculation.
- Test for Difference between Two Proportions (TE-7): How to conduct a test for the difference between two proportions, including the test statistic and p-value calculation.
- Paired Sample T-Test (TE-8): The paired sample t-test, including its application and interpretation.
- Linkage between Confidence Interval and Testing of Hypothesis (TE-9): The relationship between confidence intervals and hypothesis testing: a two-sided test at significance level α rejects the null value exactly when that value lies outside the corresponding (1 − α) confidence interval.
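As a minimal sketch of a one sample test for the mean with known variance, and of its link to the confidence interval, consider the following Python example. The function name `one_sample_z_test`, the sample values, and the choices mu0 = 100 and sigma = 4 are hypothetical and serve only as an illustration.

```python
import math
from statistics import NormalDist, fmean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 for a normal population
    with known standard deviation `sigma`."""
    n = len(sample)
    xbar = fmean(sample)
    se = sigma / math.sqrt(n)                      # standard error of the mean
    z = (xbar - mu0) / se                          # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (xbar - z_crit * se, xbar + z_crit * se)  # (1 - alpha) confidence interval
    return {"z": z, "p_value": p_value, "ci": ci, "reject": p_value < alpha}


# Hypothetical data: nine measured response times (ms); H0: mu = 100.
sample = [102, 98, 105, 110, 99, 104, 101, 107, 103]
result = one_sample_z_test(sample, mu0=100, sigma=4)
print(result)

# Linkage with the confidence interval: H0 is rejected at level alpha
# exactly when mu0 falls outside the (1 - alpha) interval.
lower, upper = result["ci"]
print("mu0 inside CI:", lower <= 100 <= upper, "| reject H0:", result["reject"])
```

When the variance is unknown, the same steps apply with the sample standard deviation and the t distribution in place of the normal distribution.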
Key Topics
- Chi-Square Test (NO-1): A test used to determine whether there is a significant association between two categorical variables (test of independence) or whether the observed frequencies of a categorical variable match the expected frequencies (goodness of fit).
- Order Statistics (NO-2): The values of a sample arranged in ascending order of magnitude, such as the minimum, median, and maximum. They are used to describe the distribution of data and underlie many non-parametric methods.
- Run Test (NO-3): A non-parametric test used to determine whether a sequence of observations is random, based on the number of runs in a two-category (binary) sequence.
- Sign Test (NO-4): A non-parametric test for the median of a single sample or for the median difference of two related samples. It uses only the signs of the differences, not their magnitudes.
- Wilcoxon Matched Pairs Signed Ranks Test (NO-5): A non-parametric test for two related samples that uses both the signs and the ranks of the absolute paired differences, making it more sensitive than the sign test.
- Mann-Whitney U Test (NO-6): A non-parametric test used to compare two independent samples. It tests whether values from one population tend to be larger than those from the other, and is often interpreted as a comparison of medians.
- Median Test (NO-7): A non-parametric test used to determine whether two or more independent samples come from populations with the same median.
- Kolmogorov Smirnov Test (One Sample Case) (NO-8): A non-parametric test that compares the empirical distribution function of a sample with a fully specified theoretical distribution, to decide whether the sample could have come from that distribution.
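One of the tests above, the sign test for two related samples, is simple enough to sketch with the standard library alone. The example below is an illustration in Python under stated assumptions: the function name `sign_test` and the before/after data are hypothetical, ties (zero differences) are dropped, and the p-value is the exact two-sided binomial probability under a 50/50 chance of each sign.

```python
from math import comb


def sign_test(before, after):
    """Exact two-sided sign test for paired observations.

    Returns (number of positive differences, number of negative
    differences, p-value). Pairs with zero difference are discarded.
    """
    diffs = [b - a for b, a in zip(before, after) if b != a]
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    negatives = n - positives
    k = min(positives, negatives)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, negatives, min(1.0, 2 * tail)


# Hypothetical data: page load times (s) before and after an optimisation.
before = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
after = [10, 13, 11, 12, 11, 13, 13, 12, 11, 12]

pos, neg, p_value = sign_test(before, after)
print(f"positive: {pos}, negative: {neg}, p-value: {p_value:.4f}")
```

Because only the signs are used, the test needs no assumption about the shape of the distribution of differences; the Wilcoxon matched pairs signed ranks test additionally uses the ranks of the differences.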
Key Topics
- Multiple Correlation (MU-1): The multiple correlation coefficient, which measures the strength of the linear relationship between one variable and a set of other variables taken together, and its application in statistics.
- Partial Correlation (MU-2): The partial correlation coefficient, which measures the relationship between two variables while controlling for the effect of other variables.
- Introduction to Multiple Linear Regression (MU-3): Basic concepts and principles of multiple linear regression, including model formulation and estimation. Describes the relationship between several independent variables and one dependent variable.
- Hypothesis Testing of Multiple Regression (MU-4): Testing hypotheses in multiple regression, including significance tests and confidence intervals for the model and its coefficients.
- Test of Significance of Regression (MU-5): Testing the overall significance of the regression model using the F-test and interpreting its p-value. The test determines whether at least one independent variable has a non-zero coefficient.
- Test of Individual Regression Coefficient (MU-6): Testing the significance of individual regression coefficients using the t-test and interpreting its p-value. Evaluates the contribution of each independent variable to the regression model.
- Model Adequacy Tests (MU-7): Evaluating the goodness of fit and adequacy of the multiple regression model, including residual analysis and diagnostic plots, to identify potential issues and limitations of the model.
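For three variables, the partial and multiple correlation coefficients can be computed directly from the pairwise correlations. The sketch below uses the standard formulas r12.3 = (r12 − r13·r23) / √((1 − r13²)(1 − r23²)) and R²1.23 = (r12² + r13² − 2·r12·r13·r23) / (1 − r23²). The function names and the small data set (exam score, study hours, lab hours) are hypothetical and only illustrate the calculation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(r12, r13, r23):
    """Correlation between variables 1 and 2, controlling for variable 3."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


def multiple_correlation(r12, r13, r23):
    """Multiple correlation of variable 1 on variables 2 and 3 jointly."""
    r_squared = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)
    return math.sqrt(r_squared)


# Hypothetical data: exam score (1), study hours (2), lab hours (3).
score = [52, 60, 65, 70, 78, 85]
study = [2, 3, 4, 5, 6, 8]
lab = [1, 1, 2, 3, 2, 4]

r12, r13, r23 = pearson(score, study), pearson(score, lab), pearson(study, lab)
print("r12.3 :", round(partial_correlation(r12, r13, r23), 4))
print("R1.23 :", round(multiple_correlation(r12, r13, r23), 4))
```

The multiple correlation is never smaller in magnitude than either simple correlation r12 or r13, which offers a quick sanity check on hand calculations.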
Experimental design; Basic principles of experimental designs; Completely Randomized Design
(CRD); Randomized Block Design (RBD): ANOVA table, Efficiency of RBD relative to CRD,
Estimation of missing value (one observation only), Advantages and disadvantages; Latin
Square Design (LSD): Statistical analysis of m × m LSD for one observation per experimental
unit, ANOVA table, Estimation of missing value in LSD (one observation only), Efficiency of
LSD relative to RBD, Advantages and disadvantages.
Problems and illustrative examples related to computer science and IT
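The ANOVA table for a Completely Randomized Design can be sketched as follows. This is a minimal illustration in Python, not a full analysis: the function name `crd_anova` and the run-time data for three algorithms are hypothetical, and the code stops at the F statistic, which would then be compared with the tabulated F value for the given degrees of freedom.

```python
def crd_anova(groups):
    """One-way ANOVA for a Completely Randomized Design.

    `groups` is a list of lists, one list of observations per treatment.
    Returns the sums of squares, degrees of freedom, mean squares and F.
    """
    all_obs = [x for g in groups for x in g]
    n, k = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n

    # Between-treatment and total variation; the error term is the remainder.
    ss_treat = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in all_obs)
    ss_error = ss_total - ss_treat

    df_treat, df_error = k - 1, n - k
    ms_treat, ms_error = ss_treat / df_treat, ss_error / df_error
    return {
        "ss_treat": ss_treat, "ss_error": ss_error, "ss_total": ss_total,
        "df_treat": df_treat, "df_error": df_error,
        "ms_treat": ms_treat, "ms_error": ms_error,
        "F": ms_treat / ms_error,
    }


# Hypothetical data: run times (s) of three sorting algorithms,
# each assigned at random to three test machines.
run_times = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
table = crd_anova(run_times)
for name, value in table.items():
    print(f"{name:9s} {value:.3f}")
```

A Randomized Block Design extends this table with a block sum of squares, which is removed from the error term; that reduction in error is what the efficiency of RBD relative to CRD measures.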
Stochastic process: Definition and classification; Markov process: Markov chain, Matrix approach,
Steady-state distribution; Counting process: Binomial process, Poisson process; Simulation of
stochastic process; Queuing system: Main components of queuing system, Little’s law; Bernoulli
single server queuing process: system with limited capacity; M/M/1 system: Evaluating the system
performance.
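Two of the topics above, the steady-state distribution of a Markov chain and the performance of an M/M/1 queue, can be sketched in a few lines. The Python example below is illustrative only: the function names, the two-state transition matrix, and the rates λ = 2 and μ = 5 are assumptions. The steady state is found by repeated multiplication with the transition matrix (the matrix approach), and the M/M/1 measures use the standard formulas ρ = λ/μ, L = ρ/(1 − ρ), and W = 1/(μ − λ).

```python
def steady_state(P, tol=1e-12, max_iter=10_000):
    """Steady-state distribution of a Markov chain with transition
    matrix P (rows sum to 1), found by iterating pi <- pi * P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("iteration did not converge")


def mm1_metrics(lam, mu):
    """Performance measures of an M/M/1 queue with arrival rate `lam`
    and service rate `mu` (requires lam < mu for stability)."""
    if lam >= mu:
        raise ValueError("the queue is unstable unless lam < mu")
    rho = lam / mu                       # server utilisation
    return {
        "rho": rho,
        "L": rho / (1 - rho),            # mean number in system
        "Lq": rho ** 2 / (1 - rho),      # mean number waiting in queue
        "W": 1 / (mu - lam),             # mean time in system
        "Wq": rho / (mu - lam),          # mean waiting time in queue
    }


# Hypothetical two-state chain: a server that is "up" or "down".
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = steady_state(P)
print("steady state:", [round(p, 4) for p in pi])

# Hypothetical M/M/1 system: 2 jobs/s arrive, 5 jobs/s can be served.
m = mm1_metrics(lam=2, mu=5)
print({k: round(v, 4) for k, v in m.items()})
print("Little's law holds (L = lam * W):", abs(m["L"] - 2 * m["W"]) < 1e-12)
```

Little's law, L = λW, holds for the whole system and for the queue alone (Lq = λWq), which gives a convenient check on any set of computed performance measures.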
Lab works
| S. No. | Title of the practical problems (using any statistical software such as SPSS, STATA etc., whichever is convenient) | No. of practical problems |
| --- | --- | --- |
| 1 | Sampling distribution, random number generation, and computation of sample size | 1 |
| 2 | Methods of estimation (including interval estimation) | 1 |
| 3 | Parametric tests (covering most of the tests) | 3 |
| 4 | Non-parametric tests (covering most of the tests) | 3 |
| 5 | Partial correlation | 1 |
| 6 | Multiple regression | 1 |
| 7 | Design of Experiments | 3 |
| 9 | Stochastic process | 2 |
| | Total number of practical problems | 15 |