Research Institute for Linguistics, Hungarian Academy of Sciences
Quantifying linguistic data
This course offers an introduction to statistical methods that are relevant for quantitative linguistic research. The goal is to understand the basic concepts of these methods and to be able to set up experimental designs whose results are amenable to statistical testing. Although the lecture format does not allow for hands-on training in performing statistical tests, additional material will be provided online so that you can practice on your own.
The course schedule is as follows:
Lecture 1: How to set up and test a hypothesis?
Types of data and samples. Scale types and relevant measures.
Lecture 2: What is probable and what isn’t? What is different and what isn’t?
Probability, normal distribution, significance level. F-test and Student’s t-test.
Lecture 3: How dependent is one measure on another, and how can this dependence be modeled?
Correlation and its coefficients (Kendall’s tau, Spearman’s rho, Pearson’s r). Linear regression.
Lecture 4: What if you want to test different variables within one experiment? How to deal with repetitions?
Analysis of variance (ANOVA), repeated-measures ANOVA.
Lecture 5: What to do if your data do not rely on accurate measurements?
Non-parametric tests: chi-square, Wilcoxon, Mann-Whitney, and Kruskal-Wallis tests.
A cookbook for the choice of the correct analysis method.