Statistics I Model Question
Group A
Attempt any Two questions. (2 x 10 = 20)
1. A new computer program consists of two modules. The first module contains an error with probability 0.2. The second module is more complex; it has a probability of 0.4 to contain an error, independently of the first module. An error in the first module alone causes the program to crash with probability 0.5. For the second module, this probability is 0.8. If there are errors in both modules, the program crashes with probability 0.9. Suppose the program crashed. What is the probability of errors in both modules?
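One way to check an answer to Question 1 is to apply Bayes' rule over the four error configurations. The sketch below assumes the program never crashes when neither module has an error, which the question does not state; the variable names are illustrative only.

```python
# Error probabilities for the two modules (independent, as stated).
p_e1, p_e2 = 0.2, 0.4

# Probabilities of the error configurations that can lead to a crash.
p_only_first = p_e1 * (1 - p_e2)     # error in module 1 only
p_only_second = (1 - p_e1) * p_e2    # error in module 2 only
p_both = p_e1 * p_e2                 # errors in both modules

# Total probability of a crash.
# Assumption: crash probability is 0 when there are no errors.
p_crash = p_only_first * 0.5 + p_only_second * 0.8 + p_both * 0.9

# Bayes' rule: P(both modules have errors | crash).
p_both_given_crash = p_both * 0.9 / p_crash

print(round(p_crash, 4))             # 0.388
print(round(p_both_given_crash, 4))  # 0.1856
```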
2. Explain how a box plot helps to reveal the shape of a data distribution. The following data set represents the number of new computer accounts registered during ten consecutive days.
a) Compute the mean, median, quartiles, and sample standard deviation.
b) Check whether there are outliers or not.
c) If outliers are present, then delete the detected outliers and compute the mean, median, quartiles, and sample standard deviation again.
d) Make your conclusion about the effect of outliers on descriptive statistical analysis.
3. A computer manager wants to know how the efficiency of his/her new computer program depends on the size of the incoming data. Efficiency is measured by the number of processed requests per hour. In general, larger data sets require more computer time, and therefore fewer requests are processed within 1 hour. Applying the program to data sets of different sizes, the following data were gathered.
a) Identify the response variable and fit a simple regression line, assuming that the relationship between the two variables is linear.
b) Interpret the regression coefficient with reference to your problem.
c) Obtain the coefficient of determination and interpret it.
d) Based on the model fitted in (a), predict the efficiency of the new computer program for a data size of 12 gigabytes. Is it possible to predict the efficiency for a data size of 30 gigabytes? Discuss.
Group B
Attempt any Eight questions. (8 x 5 = 40)
4. Explain the role of statistics in computer science and information technology.
5. The following table presents some descriptive statistics computed from three different independent sample datasets (X).
a) Compare the sample mean and median, and explain the shape of the data distribution for each dataset. Compare the variability of the three datasets.
Box plots have been generated through SPSS for each dataset as follows.
b) Do these box plots support your findings in (a) about the shape of the distribution? Explain.
6. A large chain retailer purchases a certain kind of electronic device from a manufacturer. The manufacturer indicates that the defective rate of the device is 3%.
a) The inspector randomly picks 20 items from a shipment. What is the probability that there will be at least one defective item among these 20?
b) Suppose that the retailer receives 10 shipments in a month and the inspector randomly tests 20 devices per shipment. What is the probability that there will be exactly 3 shipments each containing at least one defective device among the 20 that are selected and tested from the shipment?
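Question 6 can be checked with two nested binomial calculations. The sketch below assumes items are defective independently of each other and that shipments are independent; the variable names are illustrative only.

```python
from math import comb

p_defective = 0.03   # stated defective rate
n_items = 20         # items inspected per shipment

# a) P(at least one defective among 20) = 1 - P(no defectives).
p_at_least_one = 1 - (1 - p_defective) ** n_items

# b) Each shipment is a Bernoulli trial with success probability
#    p_at_least_one; count exactly 3 successes in 10 shipments.
n_shipments, k = 10, 3
p_exactly_three = (
    comb(n_shipments, k)
    * p_at_least_one ** k
    * (1 - p_at_least_one) ** (n_shipments - k)
)

print(round(p_at_least_one, 4))   # 0.4562
print(round(p_exactly_three, 4))  # 0.1602
```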
7. Messages arrive at an electronic message center at random times, with an average of 9 messages per hour.
a) What is the probability of receiving at least five messages during the next hour?
b) What is the probability of receiving exactly seven messages during the next hour?
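For Question 7, a minimal sketch assuming the number of messages per hour follows a Poisson distribution with mean 9 (the usual model for arrivals at random times; the question does not name it). The helper `poisson_pmf` is a hypothetical name introduced here.

```python
from math import exp, factorial

rate = 9  # average messages per hour


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam ** k / factorial(k)


# a) P(X >= 5) = 1 - P(X <= 4).
p_at_least_five = 1 - sum(poisson_pmf(k, rate) for k in range(5))

# b) P(X = 7).
p_exactly_seven = poisson_pmf(7, rate)

print(round(p_at_least_five, 4))  # 0.945
print(round(p_exactly_seven, 4))  # 0.1171
```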
8. The time, in minutes, it takes to reboot a certain system is a continuous variable with the density function:
Compute C, and then compute the probability that it takes between 1 and 2 minutes to reboot.
9. The following data represent the preferences of 10 students studying B.Sc. (CSIT) for two brands of computers, namely DELL and HP.
Apply an appropriate statistical tool to measure whether the brand preferences are correlated. Also interpret your result.
10. Define the exponential distribution with parameter λ. Jobs are sent to a printer at an average rate of 3 jobs per hour, and the time between jobs follows an exponential distribution.
a) What is the expected time between jobs?
b) What is the probability that the next job is sent within 5 minutes?
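Question 10 can be checked directly from the exponential mean 1/λ and cumulative distribution function 1 - e^(-λt). The sketch below takes λ = 3 per hour and converts 5 minutes to hours; the variable names are illustrative only.

```python
from math import exp

rate_per_hour = 3  # lambda: average number of jobs per hour

# a) Expected time between jobs is 1 / lambda hours, shown in minutes.
expected_minutes = 60 / rate_per_hour

# b) P(T <= 5 minutes) = 1 - exp(-lambda * t), with t in hours.
t_hours = 5 / 60
p_within_5_minutes = 1 - exp(-rate_per_hour * t_hours)

print(expected_minutes)              # 20.0
print(round(p_within_5_minutes, 4))  # 0.2212
```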
11. The lifetime of a certain electronic component is a normal random variable with a mean of 5000 hours and a standard deviation of 100 hours. Compute the probability of each of the following events.
a) The lifetime of a component is less than 5012 hours.
b) The lifetime of a component is between 4000 and 6000 hours.
c) The lifetime of a component is more than 7000 hours.
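The three probabilities in Question 11 can be checked with the normal cumulative distribution function. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative only.

```python
from statistics import NormalDist

# Lifetime ~ Normal(mean = 5000 hours, standard deviation = 100 hours).
lifetime = NormalDist(mu=5000, sigma=100)

# a) P(X < 5012), i.e. z = 0.12.
p_less_5012 = lifetime.cdf(5012)

# b) P(4000 < X < 6000), i.e. within 10 standard deviations of the mean.
p_between = lifetime.cdf(6000) - lifetime.cdf(4000)

# c) P(X > 7000), i.e. z = 20; effectively zero.
p_more_7000 = 1 - lifetime.cdf(7000)

print(round(p_less_5012, 4))  # 0.5478
print(p_between, p_more_7000)
```

Parts (b) and (c) lie so far into the tails that the answers are 1 and 0 to any practical precision.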
12. Write short notes on the following.
a) Sampling error and non-sampling error
b) Conditional probability