
Exploratory Data Analysis Question Paper

Course: Statistical Analysis

Institution: University of Nairobi

Exam Year: 2012



UNIVERSITY OF NAIROBI
SECOND SEMESTER EXAMINATIONS 2011/2012
FIRST YEAR EXAMINATIONS FOR THE DEGREE OF BACHELOR OF STATISTICS
STA 102: EXPLORATORY DATA ANALYSIS
DATE: MAY 31ST, 2012 TIME: 2.00 P.M. - 4.00 P.M.
INSTRUCTIONS:
Answer Question One and any other two questions.
QUESTION ONE: [30 MARKS]
a) The probability density function of a random variable y is given by:

f(y) = { k,  y = 1, 2, 3, 4
       { 0,  otherwise

i. Find the value of the constant k.
ii. Find E(y) and Var(y)
iii. Determine the coefficient of kurtosis for y and comment on it. [7 marks]


b) What is Exploratory Data Analysis (EDA)?
How does EDA differ from classical data analysis?

c) (i) Explain the meaning of the terms fixed location and fixed variation.

(ii) Give the possible consequences of failure of fixed location in EDA.

(iii) Give one EDA technique used to detect departure from the assumptions mentioned in part (i).

(iv) Give any three measures of location and any one measure of scale. [7 marks]

d) Use a quantile plot to test whether the following data are normally distributed:
12, 25, 11, 16, 7, 20, 45
Is there evidence to suggest the presence of any anomalies in the data? [5 marks]

e) Consider the following data:
Y   150   220   305   120   170
X     2     4     5     1     3
i. Does a scatter diagram reveal any other anomalies or outliers?
ii. Fit a simple linear regression to the data and check if the residuals satisfy the underlying model assumptions.
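
For readers who want to check a hand calculation for part (e)(ii), the following is a minimal sketch of an ordinary least-squares fit in plain Python. The function name `fit_line` and the variable names are illustrative assumptions, not part of the paper, and the sketch only computes the fitted line and residuals; judging whether the residuals satisfy the model assumptions still requires the plots the question asks for.

```python
# Data from Question One, part (e); names are illustrative only.
x = [2, 4, 5, 1, 3]
y = [150, 220, 305, 120, 170]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sums of cross-products and squares.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(x, y)

# Residual = observed value minus fitted value.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(f"fitted line: y = {intercept:.1f} + {slope:.1f} x")
for xi, ri in sorted(zip(x, residuals)):
    print(f"x = {xi}: residual = {ri:+.1f}")
```

Plotting the residuals against X, and against their own lagged values, is one common way to look for non-constant spread or dependence.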



QUESTION TWO: [20 MARKS]

A. Distinguish between a quantile plot and a quantile-quantile plot. [4 marks]
B. A random variable y has the following probability density function:

f(y) = { cy,  1 < y < 10
       { 0,   otherwise
I. Find the value of the constant c.
II. Determine the cumulative distribution function of the random variable y.
III. Explain the meaning of the terms leptokurtic, platykurtic and mesokurtic distributions. [8 marks]


C. Consider the following sample data:
y: 7.03, 1.44, 4.06, 3.80, 9.86, 3.59

Draw a quantile plot and a box-plot of y and comment on them. Are there any anomalies or outliers in the data? [8 marks]
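
As a rough aid for part C, the sketch below computes the coordinates of a quantile plot and the numbers behind a box-plot using only the Python standard library. Two conventions here are assumptions rather than anything stated in the paper: the plotting fraction f_i = (i - 0.5)/n and the "inclusive" quartile method of `statistics.quantiles`. Other textbooks use slightly different conventions, so the quartiles and fences may differ from a hand calculation.

```python
from statistics import median, quantiles

# Sample data from Question Two, part C.
y = [7.03, 1.44, 4.06, 3.80, 9.86, 3.59]


def quantile_plot_points(data):
    """Pair each ordered value with the fraction f_i = (i - 0.5) / n."""
    ordered = sorted(data)
    n = len(ordered)
    return [((i - 0.5) / n, value) for i, value in enumerate(ordered, start=1)]


def boxplot_summary(data):
    """Five-number summary plus 1.5 * IQR fences and flagged values."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "min": min(data),
        "q1": q1,
        "median": median(data),
        "q3": q3,
        "max": max(data),
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


for fraction, value in quantile_plot_points(y):
    print(f"f = {fraction:.3f}  y = {value}")

print(boxplot_summary(y))
```

The quantile plot is then drawn with the fractions on the horizontal axis and the ordered values on the vertical axis.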



QUESTION THREE: [20 MARKS]
A. (i) What is the difference between qualitative (categorical) and quantitative (numeric) data?
(ii) What difference is there between ordinal and nominal data? [6 marks]

B. The wages (in hundreds of shillings) of ten casuals are given below:
20, 18, 8, 20, 12, 19, 2, 16, 14
(i) Draw a normality plot and a histogram for the wages and comment on them. [8 marks]

(ii) Does the data appear to be symmetric? If not, suggest a transformation that will make the data more symmetric. [2 marks]

(iii) Transform the data using the transform you suggested in part (ii) and use a box-plot to verify your assertion. [4 marks]
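
One numerical way to support the visual checks in part B is to compare a skewness coefficient before and after a transformation. The sketch below uses the nine wage values as listed and, purely as an illustrative assumption, a squaring transformation; the paper does not specify which transformation is intended, and `moment_skewness` is a helper name introduced here.

```python
# Wage values exactly as listed in Question Three, part B.
wages = [20, 18, 8, 20, 12, 19, 2, 16, 14]


def moment_skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((v - mean) ** 2 for v in data) / n
    m3 = sum((v - mean) ** 3 for v in data) / n
    return m3 / m2 ** 1.5


# Illustrative choice only: squaring stretches the upper end of the scale,
# which tends to reduce left (negative) skew.
squared = [w ** 2 for w in wages]

print(f"skewness of raw wages:     {moment_skewness(wages):.3f}")
print(f"skewness of squared wages: {moment_skewness(squared):.3f}")
```

A coefficient closer to zero after transformation is consistent with a more symmetric box-plot, though the plot itself remains the check the question asks for.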












QUESTION FOUR: [20 MARKS]

A. How does EDA differ from Summary Analysis? [3 marks]

B. Fixed distribution is one of the four EDA assumptions:
(i) List the other three assumptions.
(ii) What are the consequences of non-fixed distribution?
(iii) Which EDA technique is used to detect departure from fixed distribution? [5 marks]



C. Consider the following data:

31, 25, 37, 45, 70, 55, 43, 133

Use a lag plot, a run-sequence plot and a box-plot to check whether the data above satisfy the EDA assumptions. [12 marks]
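
The coordinates needed for the lag plot and run-sequence plot in part C can be tabulated as in the sketch below. It assumes a lag of 1 and plots each observation against its predecessor; the helper names `lag_pairs` and `lag1_autocorrelation` are hypothetical, and the autocorrelation coefficient is an optional numerical companion to the lag plot rather than something the question requires.

```python
# Data from Question Four, part C, in the order given.
data = [31, 25, 37, 45, 70, 55, 43, 133]


def lag_pairs(values, lag=1):
    """Return (y[i - lag], y[i]) pairs: the points of a lag plot."""
    return list(zip(values[:-lag], values[lag:]))


def lag1_autocorrelation(values):
    """Lag-1 sample autocorrelation coefficient r1."""
    n = len(values)
    mean = sum(values) / n
    numerator = sum(
        (values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1)
    )
    denominator = sum((v - mean) ** 2 for v in values)
    return numerator / denominator


# Run-sequence plot: observation index on the horizontal axis.
for index, value in enumerate(data, start=1):
    print(f"i = {index}: y = {value}")

# Lag plot: previous value on the horizontal axis, current value on the vertical.
for previous, current in lag_pairs(data):
    print(f"y[i-1] = {previous:>3}  y[i] = {current:>3}")

print(f"lag-1 autocorrelation: {lag1_autocorrelation(data):.3f}")
```

A structureless scatter in the lag plot is consistent with randomness, while a drift or change of spread in the run-sequence plot points to a failure of fixed location or fixed variation.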








