Digital English Studies

Data analysis: Self-test

Questions about the key concepts in data analysis.

What is a variable whose value is a numerical measurement along a particular dimension (e.g. age, income, length of words)?

What do we call the outcome variable, the one that changes as a function of some other parameters of interest? It is the variable that is being measured or tested in a study; e.g. in a study looking at how tutoring impacts test scores, the dependent variable would be the participants' test scores, since that is what is being measured.

What do we call the variable that influences the outcome? Researchers look at how changes in the independent variable cause changes in the dependent variable; e.g. in a study looking at how tutoring impacts test scores, the independent variable would be the type of tutoring.

What is a variable whose values are labels for categories that have no intrinsic order with respect to each other (e.g. gender, nationality, native language)?

What is a variable whose values are labels for categories that have an intrinsic order with respect to each other but that cannot be expressed in terms of natural numbers (e.g. education, school grades, ratings in a questionnaire)?

What is the approach in statistics when we use different ways to describe our data (e.g. measure of central tendency, measure of spread)? We use this type of statistics simply to describe what's going on in our data.

What is the approach in statistics when we are trying to reach conclusions that extend beyond the immediate data alone, i.e. to infer from the sample data what the population might think? We use this type of statistics to make inferences from our data to more general conditions.

What do we call some property of the objects that can vary and that can be measured and described?

Question 1 of 2

Type the correct answer. Choose from the following key concepts: population, sample, variance, mean, standard deviation, range, mode, median
is a measure of central tendency; to calculate it, all the values of a variable are added and then the sum is divided by the number of values
is a measure of central tendency; the value that lies in the middle of an ordered set of values: 50% of the values lie above the median, and 50% lie below the median
is a measure of central tendency; the value that occurs most frequently in the data
is the entire set of a clearly defined group of people or objects; samples may be drawn from the population, with the aim of generalizing from the sample to the whole population
is a measure of dispersion of data; it is calculated by subtracting the value of the lowest data point from that of the highest data point
is a group of cases (e.g. occurrences, texts, participants) taken from a population that will, hopefully, represent that population such that findings from the sample can be generalised to the population
is the average amount by which each score differs from the mean
in statistics, is the extent to which a set of numbers is spread out from the mean of the set of numbers
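
To check your answers against actual numbers, the measures described above can be computed with Python's standard `statistics` module. This is only a minimal sketch: the list `word_lengths` is made-up illustrative data, not data from the course, and the sketch assumes the population formulas (dividing by the number of values) for variance and standard deviation. For a sample, `statistics.variance` and `statistics.stdev` divide by one less than the number of values instead.

```python
import statistics

# Hypothetical data: lengths (in letters) of eight words.
word_lengths = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(word_lengths)      # sum of values / number of values: 5
median = statistics.median(word_lengths)  # middle of the ordered values: 4.5
mode = statistics.mode(word_lengths)      # most frequent value: 4

# Measures of dispersion
value_range = max(word_lengths) - min(word_lengths)  # highest - lowest: 7
variance = statistics.pvariance(word_lengths)        # mean squared deviation: 4
std_dev = statistics.pstdev(word_lengths)            # square root of variance: 2.0

print(mean, median, mode, value_range, variance, std_dev)
```

Note that the median here falls between the two middle values (4 and 5), so it is their average and does not itself occur in the data.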

Question 2 of 2