Statistics 635

Generalized Linear Models

Fall 2013

Instructor: Xuewen Lu
Course Web:
Classes: MWF 13:00-13:50, MS 427
Office Hours: MW 14:00-15:00, MS 540, or by appointment


Note: For course information not on this page, such as lecture notes, sample programs and assignment solutions, please visit Blackboard. You will need to log on using your University of Calgary IT username and password. A student who does not have a valid IT username can obtain one through the online registration system.

Course Description

This course introduces statistical modeling tools developed for situations where least squares regression and standard ANOVA techniques may not naturally apply. The statistical methods studied are general linear models for quantitative responses (including multiple regression, ANOVA and ANCOVA), binomial regression models for binary data (including logistic regression and probit models), and Poisson regression models for count data (including log-linear models for contingency tables and hazard models for survival data). All of these techniques are covered as special cases of generalized linear models (GLMs) for regression (or ANOVA or ANCOVA) with Gaussian or non-Gaussian responses. Much recent statistical research has focused on generalized additive models (GAMs) and on generalized linear mixed models (GLMMs) and their applications in longitudinal studies. We will survey some of this more recent methodology. As the course develops, we will make extensive use of the statistical modelling and analysis packages R and SAS. Data examples will be used throughout the course to illustrate the methodologies and the related software tools.


Prerequisites

Working knowledge of basic statistical inference and modeling is assumed, including the theory of point estimation and statistical hypothesis testing, ANOVA, ANCOVA and the standard linear models.


Textbook

·         An Introduction to Generalized Linear Models (3rd ed., 2008), by Dobson and Barnett (DB).

·         This website contains some SAS textbook examples hosted at UCLA.

·         In particular, this website contains some SAS examples for the GLM textbook.



References

·        Introduction to Statistical Modelling in R, by P.M.E. Altham.

·        Categorical Data Analysis, by Alan Agresti, 2002.

·        Modern Applied Statistics with S-Plus, by Venables & Ripley.

·        Generalized Linear Models, by McCullagh & Nelder, 1989, London: Chapman and Hall.

·        Survival Analysis: Techniques for Censored and Truncated Data, by Klein and Moeschberger, 1997, New York: Springer-Verlag.

·        Modelling Binary Data, by D. Collett, 1991, London: Chapman and Hall.

·        Multivariate Statistical Modelling Based on Generalized Linear Models, by Fahrmeir and Tutz, 1994, New York: Springer-Verlag.

·        Categorical Data Analysis Using the SAS System, by Stokes, Davis & Koch, 1995, SAS Institute Inc., Cary, NC, USA.



Computing

We will use R, an open-source implementation of the S language (S/Splus), or SAS for computing, programming, data analysis and graphics. R resources can be found at CRAN, the Comprehensive R Archive Network. The S/Splus Archive at Statlib contains contributed code for S/Splus, which may or may not work under R.

Note: SAS is available only in room MS 571.

The following tutorial documents should be helpful, especially if you have had little previous exposure to R/S/Splus and SAS.

·        An Introduction to R, by Venables, Smith, and the R Development Core Team.

·        Using R for Data Analysis and Graphics: An Introduction, by John Maindonald.

·        Introduction to SAS.

·        Introduction to Categorical Data Analysis Procedures, SAS/STAT User's Guide.

·        SAS online documentation: SAS/IML, SAS/GLM, SAS/REG, SAS/GENMOD, SAS/MIXED, etc. User's Guides.

·        SAS online samples for Categorical Data Analysis Using the SAS System, by Stokes, Davis & Koch, 1995, SAS Institute Inc., Cary, NC, USA.


Course Work

There will be three homework assignments, a midterm, and a project with an oral presentation. The assignments will contribute about 45% to the course grade, the midterm 30%, and the project and oral presentation 25%. Some worksheets designed by Altham (see the reference above) will be assigned as non-credit homework for practising R and GLM. You are encouraged to discuss the homework assignments with each other, but you are expected to do your own independent work. Occasionally you will also need to submit your computer programs electronically for me to test. For the project, you need to find a data set from a real application and analyze it using the methods you learned in this course. You should then write a report of 8-10 pages (double-spaced) and present your findings to the class in 15 minutes. Evaluation of your work will be based on the novelty of the approach, correct interpretation of the results, and the oral presentation of the findings.

Homework Assignment and Project Due Dates and Midterm Time







Chapters Covered and Road Map

1 -> 2 -> 3 -> 4 -> 5



  Nov. 18, 2013