
NLSCY - Data Quality (10.0)

The estimates derived from this survey are based on a sample of children. Somewhat different figures might have been obtained if a complete census had been taken using the same questionnaire, interviewers, supervisors, processing methods, etc. as those actually used. The difference between the estimates obtained from the sample and the results from a complete count taken under similar conditions is called the sampling error of the estimate.

Errors which are not related to sampling may occur at almost every phase of a survey operation. Interviewers may misunderstand instructions, respondents may make errors in answering questions, the answers may be incorrectly entered on the questionnaire and errors may be introduced in the processing and tabulation of the data. These are all examples of non-sampling errors.

In this section some of the non-sampling errors that occurred in the NLSCY are discussed. Non-response to the various components of the NLSCY is discussed in detail. It should be noted that further information regarding data quality in the various sections of the NLSCY questionnaire can be found in Section 9.



10.1 Overall Response Rates and Response Bias

In total, 15,579 households were selected to participate in the NLSCY. Out of these selected households a response was obtained for 13,439 which results in an overall response rate of 86.3%.

The table that follows presents the number of households selected in each province, the number of responding households and the response rate. It is followed by a table giving the response rates for the Main and Integrated Components 32. It should be noted that for the Main Component only households which were respondents to the Labour Force Survey (LFS) were included in the NLSCY sample. The impact of not including LFS non-respondents is discussed later in this section.

NLSCY RESPONSE RATES BY PROVINCE

PROVINCE                SAMPLED HOUSEHOLDS 33   RESPONDING HOUSEHOLDS   RESPONSE RATE
Newfoundland                              889                     803           90.3%
Prince Edward Island                      481                     422           87.7%
Nova Scotia                             1,059                     926           87.4%
New Brunswick                             980                     857           87.4%
Québec                                  2,867                   2,514           87.7%
Ontario                                 4,268                   3,519           82.5%
Manitoba                                1,133                   1,001           88.3%
Saskatchewan                            1,166                   1,039           89.1%
Alberta                                 1,355                   1,213           89.5%
British Columbia                        1,381                   1,145           82.9%
TOTAL 34                               15,579                  13,439           86.3%

NLSCY RESPONSE RATES FOR
MAIN AND INTEGRATED COMPONENTS

COMPONENT                  SAMPLED HOUSEHOLDS   RESPONDING HOUSEHOLDS   RESPONSE RATE
Main Component                         12,904                  11,150           86.4%
Integrated Component                    2,675                   2,289           85.6%
Overall                                15,579                  13,439           86.3%
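
The response rates in the two tables above are the number of responding households divided by the number of sampled households. The following is a minimal Python sketch that recomputes them from the published counts; the names `COUNTS` and `response_rate` are illustrative and are not part of the NLSCY documentation.

```python
# Sampled and responding household counts, copied from the provincial table.
COUNTS = {
    "Newfoundland": (889, 803),
    "Prince Edward Island": (481, 422),
    "Nova Scotia": (1059, 926),
    "New Brunswick": (980, 857),
    "Québec": (2867, 2514),
    "Ontario": (4268, 3519),
    "Manitoba": (1133, 1001),
    "Saskatchewan": (1166, 1039),
    "Alberta": (1355, 1213),
    "British Columbia": (1381, 1145),
}


def response_rate(sampled, responding):
    """Return the response rate as a percentage rounded to one decimal."""
    return round(100.0 * responding / sampled, 1)


# Provincial rates, then the overall rate from the column totals.
rates = {prov: response_rate(s, r) for prov, (s, r) in COUNTS.items()}
total_sampled = sum(s for s, _ in COUNTS.values())
total_responding = sum(r for _, r in COUNTS.values())
overall = response_rate(total_sampled, total_responding)

for prov, rate in rates.items():
    print(f"{prov:<22}{rate:>6.1f}%")
print(f"{'TOTAL':<22}{overall:>6.1f}%")
```

The same function applied to the component counts (12,904 and 11,150 for the Main Component; 2,675 and 2,289 for the Integrated Component) gives the rates in the second table.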

As mentioned in Section 5, the initial response rates to the NLSCY were not as high as expected. Therefore, in June of 1995 a major follow-up of all non-responding households was conducted. In this follow-up 787 households were converted to respondents, thereby increasing the response rate by about 5 percentage points.
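
As a rough check on the size of the follow-up effect, the sketch below recomputes the response rate with and without the converted households. It assumes that the 787 converted households are counted among the 13,439 final respondents and that the denominator stayed at 15,579 selected households; neither assumption is stated explicitly in the text.

```python
SAMPLED = 15_579                # households selected (from the text)
RESPONDING_FINAL = 13_439       # final responding households (from the text)
CONVERTED_IN_FOLLOW_UP = 787    # households converted in the June follow-up

# Final rate, and the rate that would have applied without the follow-up
# (assumption: the converted households are part of the final count).
rate_final = 100.0 * RESPONDING_FINAL / SAMPLED
rate_before = 100.0 * (RESPONDING_FINAL - CONVERTED_IN_FOLLOW_UP) / SAMPLED
gain_points = rate_final - rate_before

print(f"Before follow-up: {rate_before:.1f}%")
print(f"After follow-up:  {rate_final:.1f}%")
print(f"Gain:             {gain_points:.1f} percentage points")
```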

There were many possible reasons why a household did not respond to the NLSCY. In some cases an interviewer was unable to make contact with a selected household for the entire collection period. In other cases the household refused to participate in the survey. Special circumstances such as an illness or death in a family or extreme weather conditions sometimes prevented an interview from taking place.

Sometimes it was possible to carry out some of the interview but a complete interview was not obtained for a variety of reasons. Some respondents were willing to give only a certain amount of time to the completion of the survey. In some cases an interviewer completed a portion of the survey with the respondent and made an appointment to continue at another time but was unable to recontact the respondent.

It was necessary to establish criteria for classifying these "partial" interviews as respondent or non-respondent households. If the majority of the survey had been completed, the preference was to keep the case and label it as a responding household. However, if only very minimal information was collected, the decision was made to drop the household and treat it as a non-responding household. In order to make this assessment the data collected for each selected child in the household were examined. This was done by analysing certain key questions across the Child Questionnaire. An assessment was made as to whether or not there was an adequate amount of information collected for at least one child in each household. If there was, the household was maintained in the responding sample and all missing variables for the household were set to not stated or imputed. If there was not adequate information for at least one child, the household was dropped from the responding sample and treated as a non-response. A more thorough discussion of the procedure for assigning response codes to partial interviews can be found in Section 6.3.

In the table that follows the disposition of the non-responding sample is presented.

THE NLSCY NON-RESPONDING SAMPLE
BY REASON FOR NON-RESPONSE

REASON                                                                  NON-RESPONDING HOUSEHOLDS      %
Refusal                                                                                     1,437   67.1
No One at Home                                                                                117    5.5
Temporarily Absent                                                                             31    1.4
Language Barrier                                                                               62    2.9
Special Circumstances (sickness in family, weather conditions, etc.)                          173    8.1
Partial Response (rejected due to an inadequate amount of information)                        283   13.2
Other or Reason Unknown                                                                        37    1.7
Total                                                                                       2,140
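
The percentages in the table above are each reason's share of the 2,140 non-responding households. A short Python sketch, with illustrative variable names, that recomputes the shares from the counts:

```python
# Non-responding households by reason, copied from the table above.
REASONS = {
    "Refusal": 1437,
    "No One at Home": 117,
    "Temporarily Absent": 31,
    "Language Barrier": 62,
    "Special Circumstances": 173,
    "Partial Response": 283,
    "Other or Reason Unknown": 37,
}

total = sum(REASONS.values())

# Share of each reason, as a percentage rounded to one decimal.
shares = {reason: round(100.0 * n / total, 1) for reason, n in REASONS.items()}

for reason, share in shares.items():
    print(f"{reason:<26}{REASONS[reason]:>6}{share:>7.1f}")
print(f"{'Total':<26}{total:>6}")
```

Because each share is rounded separately, the published percentages need not add up to exactly 100.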

32. For the Integrated Component, it was often not even known if non-responding households had children or not. If there were children, the household should have been considered to be a non-respondent to the NLSCY. If there were no children it should have been considered to be out of scope and not included in the response rate calculation. In order to estimate a response rate for the Integrated Component a certain proportion of these households were estimated to have children and therefore labelled as non-responding households and the remainder were considered as out-of-scope.

33. This includes households with at least one child 0 to 11 years of age at the time of the NLSCY interview.

34. Excludes Yukon and Northwest Territories.

Non-response Bias

Non-response is a type of error that can result in bias in survey estimates. Biased estimates can result if the non-respondents to a survey differ significantly from the respondents.

In order to study the effect of non-response bias, a study was carried out for households included in the sample for the Main Component. Since these households were at one point LFS respondents, certain information was available on both respondents and non-respondents to the Main Component. For the integrated sample there is no prior information available about non-responding households, so a non-response bias study was not possible.

There were 12,904 households selected for the Main Component; 11,150 were respondents and 1,754 were non-respondents. Information collected for the LFS was compared for these responding and non-responding groups. The analysis was carried out using weighted data with a correction for the complex sampling design.

The LFS characteristics that were considered in this comparison included:

Out of this list there was a significant difference 36 between responding and non-responding households for only four variables.

  • Non-responding households were more likely to be in CMAs.

  • The parent in non-responding households was more likely to be in the 40+ age group.

  • The parent in non-responding households was more likely to have a lower level of education (0 to 8 years of education).

  • Households where the parent was a student were more likely to be responding households.

It should be noted that problems associated with the first two variables (CMA and age) are at least partially corrected in the weight adjustments that were carried out. There was a weight adjustment made within CMAs and there was an adjustment made by single years of age for children (see Section 7). Older children are more likely to have older parents so at least some adjustment has been made to compensate for the higher non-response rate for older parents.

    The impact of non-responding households where the parent had a lower level of education remains minimal due to the fact that a relatively small proportion of the sample falls into this category.

    35. In this analysis the parent referred to the female parent except for families where there was a lone parent and that parent was male.

    36. Only differences that were significant at the 95% confidence level are reported.

    Other Sources of Bias

    For the NLSCY Main Component there is another potential source of bias due to the method by which the sample was selected for this component. As discussed in Section 4, the sample for this component was selected from households that had participated in the Canadian Labour Force Survey. Households which had at least one child 0 to 11 years of age at the time of the LFS interview were selected for the NLSCY. This sampling methodology results in two problems.

    One problem is that only respondents to the LFS were considered for the NLSCY sample for the Main Component. It could be that some of the LFS non-respondents had children in the appropriate target age group; but these households were not included in the NLSCY sample. The response rate to the LFS was approximately 95% in the time period in which the NLSCY sample was selected. It is estimated that approximately 700 households with children were missed due to the fact that no attempt was made to make contact with non-responding LFS households.

    A second complication arose because only households with children at the time of the LFS were included in the NLSCY sample. Some households were excluded because they were vacant or contained only members 12 years of age or older at the time of the LFS. Some of these households may have had children (0 to 11) living with them a few months later when the NLSCY interview took place. It is estimated that approximately 240 households with children were missed in the NLSCY sample for this reason. It is likely that a large portion of these 240 households had a newborn who came into the family after the time of the LFS. The weighting procedures carried out (see Section 7) compensate for the under-representation of 0 year-olds at the global level, but there is likely an under-representation of children 0 to 3 months old.

    In total 3,080 households were missed, either due to non-response to the NLSCY interview (2,140) or due to the two problems discussed above (940). A complete interview was obtained for 13,439 households, which represents 81.4% of the total households estimated to have children in the 0 to 11 age group.
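
The 81.4% figure follows from the counts quoted in this section. A minimal Python sketch of the arithmetic (the variable names are illustrative; the 700 and 240 are the approximate estimates given above):

```python
responding = 13_439                 # responding households
nlscy_non_response = 2_140          # non-respondents to the NLSCY interview
missed_lfs_non_respondents = 700    # estimate: LFS non-respondents with children
missed_no_children_at_lfs = 240     # estimate: no children 0 to 11 at LFS time

# Households missed for any of the three reasons.
missed_total = (nlscy_non_response
                + missed_lfs_non_respondents
                + missed_no_children_at_lfs)

# Responding households as a share of all households estimated to have
# children in the 0 to 11 age group.
coverage = 100.0 * responding / (responding + missed_total)

print(f"Households missed: {missed_total}")
print(f"Coverage:          {coverage:.1f}%")
```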


    10.1.1 Component Non-response

    As discussed in Section 5, there were several respondents or components to the NLSCY interview. The PMK provided detailed information about each selected child. In the parent and the general interview, the PMK provided information about herself and her spouse/partner. The PPVT-R test was administered to children in the 4 to 5 age group. Children in the 10 to 11 age group completed a questionnaire on their own. For school-aged children the teacher completed a questionnaire about the child and if the child was in grade 2 or above a Math Test was administered. There was a potential for non-response for each of these individual components.

    It should be noted however, that if a household was deemed to be a responding household, then all required components were created for that household; even if there were no data provided for a particular component. For example, if there was a 10 year-old in a responding household who would not complete the 10 to 11 Questionnaire, then this component still exists for the child, with all variables set to not-stated. Likewise if a parent completed a Child Questionnaire for one child in the household but refused to provide information for a second child then there is a record for this second child with not-stated values for all variables.

    The following sections provide a summary of the degree to which there are complete data for the various NLSCY components. A brief summary of the content of each of these components can be found in Section 5. As can be seen in the sections that follow, the impact of partial non-response on data quality is minimal. The one exception to this is the Mathematics Computation Test (Section 10.7).


    10.2 Child Questionnaire Response Rates

    In order to assess the completeness of the child data (i.e., the information provided by the PMK about the child) several key questions were identified across the Child Questionnaire. One item was selected from most of the sections on this questionnaire in a somewhat random fashion to assess data quality. Of the responding sample of 22,831 children:


    10.3 Parent Questionnaire Response Rates

    This questionnaire was administered for both the PMK and the spouse/partner. Again key questions were identified to assess completeness. Out of the 24,692 PMKs and spouse/partners:

    37. The high number of partials for the parent questionnaire was due to one of the questions that was chosen as a key item. This item was question 6A from the Neighbourhood Section - "If there is a problem around here, the neighbours get together to deal with it." A fairly high number of parents answered "don't know" to this question.


    10.4 General Questionnaire Response Rates

    This questionnaire was also administered to the PMK and spouse/partner. Out of the 24,692 PMKs and spouse/partners:


    10.5 PPVT-R Response Rates and Bias

    For the 3,728 children in the 4 to 5 age group:

    In order to assess non-response bias for the PPVT-R, characteristics of the children who completed the test (88.8%) were compared with those who did not (11.2%).

    The following table presents the variables included in this non-response bias study and the results. An explanation is given for differences significant at the 95% confidence level.

    NON-RESPONSE BIAS FOR PPVT-R

    VARIABLE: Sex of the child (AMMCQ02)
    RESULT: Girls were more likely to respond to the PPVT-R than boys. The response rate for girls was 90.5% and the response rate for boys was 87.3%.

    VARIABLE: Parent Status (ADMCD04): child lives with two parents, one parent or no parents
    RESULT: No effect

    VARIABLE: Score on the hyperactivity factor on the behaviour scale on the Child's Questionnaire (ABECS06)
    RESULT: Children who were more hyperactive were more likely to be respondents. Average hyperactivity score: respondents 5.0, non-respondents 4.6.

    VARIABLE: Score on the prosocial factor on the behaviour scale on the Child's Questionnaire (ABECS07)
    RESULT: Children who were more prosocial were more likely to be respondents. Average prosocial score: respondents 11.5, non-respondents 10.5.

    VARIABLE: Score on the emotional disorder factor on the behaviour scale on the Child's Questionnaire (ABECS08)
    RESULT: No effect

    VARIABLE: Score on the conduct disorder factor on the behaviour scale on the Child's Questionnaire (ABECS09)
    RESULT: Children who had higher conduct disorder scores were more likely to be respondents. Average conduct disorder score: respondents 1.7, non-respondents 1.3.

    VARIABLE: Score on the indirect aggression factor on the behaviour scale on the Child's Questionnaire (ABECS10)
    RESULT: Children who scored higher on the indirect aggression scale were more likely to be respondents. Average indirect aggression score: respondents 0.8, non-respondents 0.6.

    VARIABLE: Household income (AINHD01)
    RESULT: No effect

    VARIABLE: Current working status of PMK (ALFPD28): full-time, part-time or not working
    RESULT: No effect

    VARIABLE: Highest level of education of PMK
    RESULT: No effect

    VARIABLE: Province
    RESULT: The response rate was lower in Manitoba, Saskatchewan and Alberta (84.1%, 80.6% and 83.1%) 38

    38. One reason for this is that when the June follow-up for non-respondents was carried out, the response rate for these provinces was already quite high. Therefore it was agreed that for these provinces only, the follow-up could be done completely by telephone. This precluded administering the PPVT-R since it had to be administered in person.


    10.6 10 to 11 Questionnaire Response Rates and Bias

    Again key questions (nine in total) were identified on the 10 to 11 Questionnaire in order to assess completeness. Out of the 3,434 children in the 10 to 11 age group selected in responding households:

    In order to assess non-response bias for the 10 to 11 Questionnaire, characteristics of the children who answered at least half of the key questions (86.7%) were compared with those who did not (13.3%).

    The following table presents the variables included in this non-response bias study and the results. Only results significant at the 95% confidence level are presented. Children who answered at least half of the key questions are labelled as respondents in this table.

    NON-RESPONSE BIAS FOR 10 TO 11 QUESTIONNAIRE

    VARIABLE: Sex of the child (AMMCQ02)
    RESULT: Girls were more likely to respond to the 10 to 11 Questionnaire than boys. The response rate for girls was 87.8% and the response rate for boys was 85.5%.

    VARIABLE: Parent Status (ADMCD04): child lives with two parents, one parent or no parents
    RESULT: No effect

    VARIABLE: Score on the hyperactivity factor on the behaviour scale on the Child's Questionnaire (ABECS06)
    RESULT: No effect

    VARIABLE: Score on the prosocial factor on the behaviour scale on the Child's Questionnaire (ABECS07)
    RESULT: Children who were more prosocial were more likely to be respondents. Average prosocial score: respondents 13.0, non-respondents 12.4.

    VARIABLE: Score on the emotional disorder factor on the behaviour scale on the Child's Questionnaire (ABECS08)
    RESULT: No effect

    VARIABLE: Score on the conduct disorder factor on the behaviour scale on the Child's Questionnaire (ABECS09)
    RESULT: No effect

    VARIABLE: Score on the indirect aggression factor on the behaviour scale on the Child's Questionnaire (ABECS10)
    RESULT: No effect

    VARIABLE: How well the child is doing at school in reading based on information from the parent on the Child's Questionnaire (AEDCQ14A)
    RESULT: Children who were doing poorly or very poorly in reading were more likely to be non-respondents. For children who had poor or very poor reading skills the response rate was 67.3%. For children who were reported to have average or above average skills the response rate was 88.4%.

    VARIABLE: How well the child is doing at school "overall" based on information from the parent on the Child's Questionnaire (AEDCQ14D)
    RESULT: Children who were doing poorly or very poorly in school were more likely to be non-respondents. For children who were doing poorly or very poorly in school the response rate was 59.3%. For children who were reported to have average or above average skills the response rate was 88.2%.

    VARIABLE: Household income (AINHD01)
    RESULT: Children living in households with lower incomes were more likely to be non-respondents. Average household income: respondents $50,466, non-respondents $43,633.

    VARIABLE: Highest level of education of PMK
    RESULT: Children whose PMK had a higher level of education were more likely to be respondents.

    VARIABLE: Province
    RESULT: The response rate was lower in Manitoba, Saskatchewan and Alberta (82.5%, 76.8% and 84.7%) 39

    39. One reason for this is that when the June follow-up for non-respondents was carried out, the response rate for these provinces was already quite high. Therefore it was agreed that for these provinces only, the follow-up could be done completely by telephone. This precluded administering the 10 to 11 Self-complete Questionnaire.

    10.7 Mathematics Test Response Rate and Bias

    As mentioned earlier, one component of the NLSCY was administered at school. Wherever parents and school boards consented, the child's teacher and principal were contacted and asked a number of questions about the child and his/her school environment. Children in grade 2 and above were also given a short mathematical skills test. Mathematics tests were completed for only 50.5% of the children in grade 2 and over who were part of the NLSCY responding sample. The table that follows shows the distribution by province.

    DISTRIBUTION OF MATHEMATICS TESTS BY PROVINCE

    Province                Number of children 'eligible'   Number of mathematics
                            for the mathematics test        tests completed
    Newfoundland                   541                             378
    Prince Edward Island           281                             153
    Nova Scotia                    549                             326
    New Brunswick                  534                             316
    Québec                       1,372                             505
    Ontario                      2,208                           1,160
    Manitoba                       646                             327
    Saskatchewan                   699                             304
    Alberta                        835                             436
    British Columbia               702                             334
    Total                        8,367                           4,239

    The number of math tests completed divided by the number of children "eligible" for the test represents the response rate for the math test. A lower response rate has two potential consequences. First, it reduces the actual sample size for which users will have data. Second, non-respondents may have different characteristics from respondents, which would produce a bias in the results.
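
The ratio described above can be computed directly from the provincial table. The Python sketch below uses the unweighted counts, with illustrative names. Note that the unweighted overall ratio works out to about 50.7%, close to but not identical to the 50.5% quoted earlier; the source does not say how the quoted figure was derived.

```python
# (eligible children, completed tests) by province, copied from the table.
MATH_TEST = {
    "Newfoundland": (541, 378),
    "Prince Edward Island": (281, 153),
    "Nova Scotia": (549, 326),
    "New Brunswick": (534, 316),
    "Québec": (1372, 505),
    "Ontario": (2208, 1160),
    "Manitoba": (646, 327),
    "Saskatchewan": (699, 304),
    "Alberta": (835, 436),
    "British Columbia": (702, 334),
}

# Unweighted response rate (%) per province.
math_rates = {p: 100.0 * done / elig for p, (elig, done) in MATH_TEST.items()}

eligible_total = sum(elig for elig, _ in MATH_TEST.values())
completed_total = sum(done for _, done in MATH_TEST.values())
overall_rate = 100.0 * completed_total / eligible_total

lowest = min(math_rates, key=math_rates.get)
highest = max(math_rates, key=math_rates.get)

for province, rate in sorted(math_rates.items(), key=lambda kv: kv[1]):
    print(f"{province:<22}{rate:>6.1f}%")
print(f"{'Total':<22}{overall_rate:>6.1f}%")
```

Sorting by rate makes the provincial spread visible: in these counts Québec has the lowest ratio and Newfoundland the highest.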

    The math test response rate was lower than originally hoped. Various factors affected the response rate. Although no one factor was particularly detrimental, a combination of factors had an impact on the response rate. Nevertheless, it is unlikely that all these factors had the same effect on potential bias. For example, to boost the response rate in households, follow-up collection was carried out in June of 1995. For operational reasons, no math tests were administered at that time. In addition, a number of consent forms in Québec were processed too late for the test to be administered. While these factors did contribute to non-response to the math test, they probably had less effect on potential bias than cases where parents refused permission for their children to take the test. The various components of test non-response are shown in the table below.

    NON-RESPONSE FACTORS

    Component of non-response   Portion of non-response (%)
    June follow-up                     5.9
    Consent form not received         13.7
    Parent refusal                     3.4
    School Board refusal               9.6
    Teacher non-response              17.4
    Other                              1.8
    Total                             51.8

    A study was done to assess the impact that the low response rate had on the results. It is difficult to quantify the actual impact; however it is possible to examine some of the characteristics observed in household interviews and compare distributions for cases where there was a response to the mathematics test vs. a non-response. If those characteristics are related to the test results, and if a difference in behaviour is noted between respondents and non-respondents, it can be assumed that there may be some bias in the data.

    The table that follows provides an example of this phenomenon for grade 2 students.

    DISTRIBUTION OF MATHEMATICS TEST RESPONDENTS AND
    NON-RESPONDENTS BY HOUSEHOLD INCOME CATEGORY

                                     0 - $14k  $15k - $29k  $30k - $49k  $50k - $59k       $60k +        Total
    Respondents Math. (%)                14.3         28.5         29.6         15.2         12.4          100
    Non-respondents Math. (%)            16.9         28.6         26.6         16.3         11.5          100

    As the table shows, there is a difference between the distributions. The next table shows the average score on the math test by household income class.

    AVERAGE RAW SCORES OF MATHEMATICS TEST RESPONDENTS
    (GRADE 2) BY HOUSEHOLD INCOME CATEGORY

                   0 - $14k  $15k - $29k  $30k - $44k  $45k - $59k       $60k +
    Score              5.24         5.68         6.08         6.32         6.55

    There is a higher percentage of children that are non-respondents to the math test for the lower income class and a lower percentage in the higher income class. If it is assumed that the average math score is the same for respondents and non-respondents within an income class, the results from the responding sample could be expected to be slightly higher than the results that would have been obtained in the whole population.
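
The argument above can be illustrated by applying the respondents' average score in each income class to the two income distributions. The Python sketch below does this under two loudly stated assumptions: that the average score within an income class is the same for respondents and non-respondents (the assumption made in the text), and that the five income classes in the two tables can be matched one to one even though the middle classes are labelled slightly differently ($30k - $49k and $50k - $59k in one table, $30k - $44k and $45k - $59k in the other). It is an illustration only, not the method used in the study.

```python
# Household income distributions (%) for grade 2, from the first table.
RESPONDENT_SHARE = [14.3, 28.5, 29.6, 15.2, 12.4]
NON_RESPONDENT_SHARE = [16.9, 28.6, 26.6, 16.3, 11.5]

# Average raw scores of grade 2 respondents by income class (second table).
# Assumption: the five classes line up with the five classes above.
MEAN_SCORE = [5.24, 5.68, 6.08, 6.32, 6.55]


def expected_mean(shares, scores):
    """Average score implied by an income distribution.

    Shares are normalised by their sum because the published percentages
    are rounded and need not add up to exactly 100.
    """
    return sum(s * m for s, m in zip(shares, scores)) / sum(shares)


respondent_mean = expected_mean(RESPONDENT_SHARE, MEAN_SCORE)
non_respondent_mean = expected_mean(NON_RESPONDENT_SHARE, MEAN_SCORE)

print(f"Implied mean, respondents:     {respondent_mean:.2f}")
print(f"Implied mean, non-respondents: {non_respondent_mean:.2f}")
```

Under these assumptions the implied mean for non-respondents is slightly below that of respondents, which is the direction of bias described in the text; income alone accounts for only a small difference.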

    The following table presents the list of variables that were compared and shows the ones that had significant differences between respondents and non-respondents to the mathematics test.

    COMPARISONS BETWEEN RESPONDENTS AND
    NON-RESPONDENTS TO THE MATH TEST FOR VARIOUS
    CHARACTERISTICS OF THE NLSCY

    VARIABLE                                                                  DIFFERENCES   GROUP FOR WHICH RESPONSE RATE IS LOWER
    Failed a grade                                                            x             failed a grade
    How well child is doing in reading (according to PMK)                     x             poorly, very poorly
    How well the child is doing in Math (according to PMK)                    x             average, poorly, very poorly
    How well the child is doing in composition (according to PMK)             x             poorly, very poorly, not-applicable
    How well the child is doing in general (according to PMK)                 x             poorly, very poorly
    Received help outside the school                                          x             received help
    Contacted by school regarding behaviour                                   x             twice or more
    Looks forward to go to school (according to PMK)                          no
    Important that child has good grades (according to PMK)                   x             not important / refusal
    How far it is hoped that the child will go in school (according to PMK)   x             primary / secondary / other / CEGEP / trade
    Progress important at the school                                          x             refusal
    Child enjoys being at school (according to PMK)                           x             refusal / disagree
    Parents welcome at school                                                 x             refusal / disagree
    School spirit high                                                        x             refusal
    Child receives special education                                          x             yes / refusal
    PMK has a high school diploma                                             x             no
    PMK went to a post-secondary establishment                                x             no
    Household income 40                                                       x             0 - $15k, refusal
    PMK income                                                                x
    Number of children in the household                                       no
    CMA                                                                       x             certain CMAs in Québec, as well as in Kitchener and Regina
    Child's health                                                            no
    Reads well with glasses                                                   no
    Reads well without glasses                                                no
    Needs a hearing aid                                                       no

    The observed differences in the various characteristics appear to give evidence that there may be a tendency to overestimate the average scores of the mathematics test of the responding sample compared to the results that would have been obtained if everybody in the sample had been a respondent.

    40. Household income has been imputed. For this reason the previous tables with income data did not show missing values. However there was more non-response to the math test for the children where the PMK refused to give a household income.


    10.8 Ceiling Effect for Mathematics Test

    The mathematics tests administered in Cycle 1 were fairly short. There were 10 questions in the test for grade 2 and 3 students, and 15 questions in the test given to students in higher grades. Furthermore, in order to streamline administrative procedures, tests with the same level of difficulty were used for two grades (e.g. second and third graders took the same test, as did grade 4 and 5 students and grade 6 and 7 students). Although the problem did not show up in the initial testing, a ceiling effect was noted, especially among third and fifth graders (the ceiling is the highest possible mark on the test, and a high ceiling effect indicates that "too many" children had perfect marks).

    DISTRIBUTION OF CHILDREN WITH PERFECT MARKS BY GRADE

    Grade      Percentage of children with perfect marks
    Grade 2    10.6%
    Grade 3    38.0%
    Grade 4     3.2%
    Grade 5    14.7%
    Grade 6     4.5%

    Comparisons at the provincial level reveal even more pronounced differences. Québec in particular had a more serious ceiling effect. Consequently, even though the mathematics test scores are available for all grades, it is recommended that the data for grades 3 and 5 not be used, or that they be used with extreme caution!
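
The ceiling measure used in the table above is the share of children who obtained the highest possible mark. A minimal Python sketch of that calculation follows; the function name and the example scores are hypothetical and are not NLSCY data.

```python
def perfect_mark_share(scores, max_score):
    """Return the percentage of scores equal to the highest possible mark."""
    if not scores:
        raise ValueError("scores must not be empty")
    perfect = sum(1 for s in scores if s == max_score)
    return 100.0 * perfect / len(scores)


# Hypothetical raw scores on a 10-question test (illustration only).
example_scores = [10, 7, 10, 9, 6, 10, 8, 10, 5, 9]
share = perfect_mark_share(example_scores, max_score=10)

print(f"{share:.1f}% of children had perfect marks")
```

For the Cycle 1 tests, `max_score` would be 10 for the grade 2 and 3 test and 15 for the tests given in higher grades, as described above.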

    For the next cycle, a number of steps have been taken to improve the results by increasing the response rate and reducing the ceiling effect. First, there will be a different test for each grade. All mathematics tests will consist of 15 questions. An aptitude indicator will be administered during the home interview to help identify the child's grade and to assist in the imputation of missing responses where necessary. In addition, in order to improve response rates, more effort has been put in encouraging school boards to co-operate and a better tracking system for consent forms has been implemented. These measures should help eliminate most of the problems encountered in Cycle 1.