The Effect of Parental Educational Attainment on Child Enrollment in School in South Africa

An Examination of Data: 1994-2004

by Laura Shawn Roberts

 

Background

Research Question

Notes

Key Variables

1994

Descriptive Statistics

-Var: maxparented
-Var: enrolled
-Graph Codes

Summary

Regression Analysis

2004

Descriptive Statistics

-Var: maxparented04
-Var: attended
-Graph Codes

Regression Analysis

Summary/Conclusion


Background

Formalized segregation of schools and other institutions is an all too recent memory in the United States. It is an even clearer memory in South Africa. In a way, the massive desegregation of the United States and South African education systems may serve as useful comparisons. Though the United States has had mandatory desegregation of its schools for over fifty years, de facto segregation pervades the education system to this day. Policymakers, educators, and citizens alike struggle to diminish and hopefully eliminate the prevalent gaps in educational achievement, attendance, and access. 

South Africa’s education system has also seen massive desegregation, though more recently. The country is in the midst of an opportunity to create a truly equal education system. Doing so requires careful collection and analysis of data as well as innovative thinking about the design and implementation of education policy. The purpose of this paper is to examine some of the effects of desegregation against the backdrop of that recent history of segregated schooling. Specifically, the paper addresses the question: Does the educational attainment of a child's parents affect that child's likelihood of being enrolled in or attending school?

Notes
There are two data sources for this study. All 1994 data comes from the SALDRU survey of that year. The comparative study is from Statistics South Africa's 2004 General Household and Labor Surveys. In the text, I will refer to these as 1994 SALDRU and 2004 SSA, respectively. I will examine trends in education using descriptive statistics and basic regression analysis. There are notable limitations to these approaches, but I continue in the spirit that this lesson module is designed to be an educational tool.

South Africa has a history of tense racial interaction. The country has experienced profound change in the past decade, change that has spanned nearly every aspect of daily life. This researcher honors personal opinions about proper titles for different population groups. Indeed, all residents of Africa are Africans. For the purposes of this paper, however, I will use the labels assigned in the survey instruments: African (Black), Coloured, Indian, and White. Additionally, in the 2004 dataset, I exclude the self-assigned race category for ease of comparison with the 1994 data.

Key Variables
It is important to begin any research endeavor by determining which key variables factor into the research question. I will be working with two main variables. The key independent variable is maximum parental education [maxparented]. The key dependent variable is enrollment in school [enrolled in 1994 and attended in 2004]. Ideally, I would examine the effect of parental educational attainment on child educational attainment, but for the purpose of this learning tool, I will substitute child enrollment/attendance at school for child educational attainment.

I also need to consider what other factors may influence enrollment in school. Socio-economic status and race quickly come to mind. Additionally, the gender of the child, the location of the household relative to the school, and whether or not the child lives with both parents may all be considered.

To see how I coded/cleaned these variables, please click here.

1994 SALDRU Dataset

Descriptive Statistics
This paper intends to examine the causal relationship between parental educational attainment and child enrollment/attendance at school. I will begin my analysis with a variety of descriptive statistics to set the stage for, and hopefully demonstrate the value of, the eventual regression analysis. I start the examination of my key variables and the various controls with a simple look at the distribution of maximum parent educational attainment.

Maximum Parent Education

. tab maxparented

        Max |
   parental |
  education |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     10,097       24.14       24.14
          1 |      2,571        6.15       30.29
          2 |      2,304        5.51       35.80
          3 |      2,605        6.23       42.03
          4 |      3,070        7.34       49.37
          5 |      3,654        8.74       58.10
          6 |      5,003       11.96       70.07
          7 |      1,912        4.57       74.64
          8 |      3,283        7.85       82.49
          9 |      1,590        3.80       86.29
         10 |      2,994        7.16       93.45
         12 |      1,908        4.56       98.01
         16 |        832        1.99      100.00
------------+-----------------------------------
      Total |     41,823      100.00

 

I see that, of the households that responded, the largest single group (about 24%) has a household head or spouse of the household head who reported no educational attainment. A clear way to look at this is with a graph:

Slowly, I add more detail to my examination. Next, I will look at the relationship between maximum parent education and race.

. tab maxparented race, col

 
       Max |
  parental |            19 :population group
 education |   African   Coloured     Indian      White |     Total
-----------+--------------------------------------------+----------
         0 |     9,712        175         39         83 |    10,009
           |     29.44       5.45       3.52       2.17 |     24.34
-----------+--------------------------------------------+----------
         1 |     2,409        104         16          9 |     2,538
           |      7.30       3.24       1.45       0.24 |      6.17
-----------+--------------------------------------------+----------
         2 |     2,184         73         12          1 |     2,270
           |      6.62       2.27       1.08       0.03 |      5.52
-----------+--------------------------------------------+----------
         3 |     2,366        190         12          0 |     2,568
           |      7.17       5.92       1.08       0.00 |      6.24
-----------+--------------------------------------------+----------
         4 |     2,734        248         20         28 |     3,030
           |      8.29       7.72       1.81       0.73 |      7.37
-----------+--------------------------------------------+----------
         5 |     3,151        372         54         21 |     3,598
           |      9.55      11.58       4.88       0.55 |      8.75
-----------+--------------------------------------------+----------
         6 |     4,079        605        141        107 |     4,932
           |     12.37      18.84      12.74       2.80 |     11.99
-----------+--------------------------------------------+----------
         7 |     1,422        303         68         93 |     1,886
           |      4.31       9.43       6.14       2.44 |      4.59
-----------+--------------------------------------------+----------
         8 |     1,836        599        185        565 |     3,185
           |      5.57      18.65      16.71      14.79 |      7.74
-----------+--------------------------------------------+----------
         9 |     1,004        197        122        229 |     1,552
           |      3.04       6.13      11.02       6.00 |      3.77
-----------+--------------------------------------------+----------
        10 |     1,225        214        312      1,148 |     2,899
           |      3.71       6.66      28.18      30.06 |      7.05
-----------+--------------------------------------------+----------
        12 |       754        109         66        922 |     1,851
           |      2.29       3.39       5.96      24.14 |      4.50
-----------+--------------------------------------------+----------
        16 |       111         23         60        613 |       807
           |      0.34       0.72       5.42      16.05 |      1.96
-----------+--------------------------------------------+----------
     Total |    32,987      3,212      1,107      3,819 |    41,125
           |    100.00     100.00     100.00     100.00 |    100.00

According to this table, the most marked contrast is between the maximum educational attainment of African and White respondents. Examining this with a graph provides a clearer description.

 

Grouping the grades together as above produces even starker results. We see clearly from these graphs that the White population group has a much larger proportion of parents who have passed the twelfth grade. In contrast, the African population group has a very large proportion of parents who claim no educational attainment.
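The grouped comparison can be reproduced with a short recode. The following is a minimal sketch, assuming the cleaned 1994 SALDRU data are in memory. The variable name parentedgrp and the cut points between groups are illustrative assumptions, not necessarily the grouping used for the graphs above.

```stata
* Hypothetical grouping of maxparented; cut points are assumptions
recode maxparented (0 = 0 "None") (1/7 = 1 "Primary") ///
    (8/10 = 2 "Some secondary") (12 = 3 "Matric") ///
    (16 = 4 "Higher education"), generate(parentedgrp)

* Column percentages by population group, as in the ungrouped table
tab parentedgrp race, col
```

Because the recode writes to a new variable, the original maxparented values remain available for the regressions later in the module.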

. tab maxparented hhpcibin, col

       Max |
  parental |                    HH PCI Breakdown
 education |    <98.75  98.76-243  243.479-6  698.096-1    >1611.5 |     Total
-----------+-------------------------------------------------------+----------
         0 |       695        572        282         62         25 |     1,636
           |     34.09      27.81      13.99       5.41       3.25 |     20.38
-----------+-------------------------------------------------------+----------
         1 |       176        129         85         23          2 |       415
           |      8.63       6.27       4.22       2.01       0.26 |      5.17
-----------+-------------------------------------------------------+----------
         2 |       157        109         92         31          7 |       396
           |      7.70       5.30       4.56       2.71       0.91 |      4.93
-----------+-------------------------------------------------------+----------
         3 |       164        148         97         33          6 |       448
           |      8.04       7.19       4.81       2.88       0.78 |      5.58
-----------+-------------------------------------------------------+----------
         4 |       178        189        123         68          9 |       567
           |      8.73       9.19       6.10       5.94       1.17 |      7.06
-----------+-------------------------------------------------------+----------
         5 |       194        239        182         78          9 |       702
           |      9.51      11.62       9.03       6.81       1.17 |      8.75
-----------+-------------------------------------------------------+----------
         6 |       225        256        308        106         25 |       920
           |     11.03      12.45      15.28       9.26       3.25 |     11.46
-----------+-------------------------------------------------------+----------
         7 |        88        103        154         43          9 |       397
           |      4.32       5.01       7.64       3.76       1.17 |      4.95
-----------+-------------------------------------------------------+----------
         8 |        72        157        272        159         56 |       716
           |      3.53       7.63      13.49      13.89       7.28 |      8.92
-----------+-------------------------------------------------------+----------
         9 |        46         75        129         81         30 |       361
           |      2.26       3.65       6.40       7.07       3.90 |      4.50
-----------+-------------------------------------------------------+----------
        10 |        33         66        187        248        234 |       768
           |      1.62       3.21       9.28      21.66      30.43 |      9.57
-----------+-------------------------------------------------------+----------
        12 |         6         14         90        164        200 |       474
           |      0.29       0.68       4.46      14.32      26.01 |      5.91
-----------+-------------------------------------------------------+----------
        16 |         5          0         15         49        157 |       226
           |      0.25       0.00       0.74       4.28      20.42 |      2.82
-----------+-------------------------------------------------------+----------
     Total |     2,039      2,057      2,016      1,145        769 |     8,026
           |    100.00     100.00     100.00     100.00     100.00 |    100.00

Again, as expected, the most dramatic contrast in educational attainment is between the lowest and highest income groups. Specifically, of the people in the lowest income group, over 34% claimed no education, while only 3.25% of people in the highest income group have no education. Examination of the highest educational attainment shows a similar disparity: over 20% of parents in the highest income group had reached the higher education level, while only one-quarter of one percent of reporting parents in the lowest income group had any schooling after matriculation from high school.

The graph:

As with the graphs by population group, income group also appears to be closely associated with maximum parental educational attainment. The percentages of parents with no education and with more than high school, respectively, are almost inverted between the lowest and highest income groupings.

The picture created thus far with the data is one of dramatic inequality in the level of education attainment of heads of household and their spouses. To have such a large portion of the population with so little education is alone cause for concern. If there is in fact a positive causal relationship between parental education and child enrollment in school (if we think of enrollment as a proxy for attainment) as I suspect there is, efforts should be made to improve school-age child enrollment in school in order to ensure at least a basic level of education attainment.

Next I take a look at how enrollment is affected by different factors.

Enrollment

. tab enrolled if age>=5 & age<=25

  Enrollment |      Freq.     Percent        Cum.
-------------+-----------------------------------
Not enrolled |      3,649       21.73       21.73
    Enrolled |     13,146       78.27      100.00
-------------+-----------------------------------
       Total |     16,795      100.00

We see here that the majority of school-age respondents are enrolled in school.

. tab hhpcibin enrolled if age>=5 & age<=25, row col

         HH PCI |      Enrollment
      Breakdown | Not enrol   Enrolled |     Total
----------------+----------------------+----------
         <98.75 |        18         14 |        32
                |     56.25      43.75 |    100.00
                |      9.89      20.90 |     12.85
----------------+----------------------+----------
  98.76-243.478 |        28         21 |        49
                |     57.14      42.86 |    100.00
                |     15.38      31.34 |     19.68
----------------+----------------------+----------
243.479-698.095 |        67          7 |        74
                |     90.54       9.46 |    100.00
                |     36.81      10.45 |     29.72
----------------+----------------------+----------
 698.096-1611.5 |        49         13 |        62
                |     79.03      20.97 |    100.00
                |     26.92      19.40 |     24.90
----------------+----------------------+----------
        >1611.5 |        20         12 |        32
                |     62.50      37.50 |    100.00
                |     10.99      17.91 |     12.85
----------------+----------------------+----------
          Total |       182         67 |       249
                |     73.09      26.91 |    100.00
                |    100.00     100.00 |    100.00

 

When we examine enrollment and per capita income, there is a surprising pattern: the groups with the highest percentages of students not enrolled in school are the group with per capita incomes between 243 and 698 Rand (90.54%) and the group between 698 and 1611.5 Rand (79.03%), and even in the highest income group 62.5% are not enrolled. For all income groupings, a higher percentage of students are not enrolled than enrolled in school. Note, however, that this table covers only 249 school-age respondents, compared with 16,795 in the overall enrollment table, so these percentages should be read with caution. In 1996, the South African government passed legislation requiring enrollment in school; this data is particularly interesting when compared to figures from the 2004 survey.

This is how the above table looks in graph form:

I can see from the pie charts that the distribution of the percentage of students enrolled in school is different within different income groups. To get a better picture of the distribution, I will use a histogram:


Indeed, household per capita income has a seemingly surprising relationship with enrollment.  I expected that enrollment would have increased as PCI increased, but that was not the case. Instead, the income group with the lowest percentage of enrolled students is the "middle" income group, with average household per capita incomes between 243 and 698 Rand.

. tab enrolled race if age>=5 & age<=25, row col

             |            19 :population group
  Enrollment |   African   Coloured     Indian      White |     Total
-------------+--------------------------------------------+----------
Not enrolled |     2,967        361         94        227 |     3,649
             |     81.31       9.89       2.58       6.22 |    100.00
             |     21.39      27.60      22.38      18.98 |     21.73
-------------+--------------------------------------------+----------
    Enrolled |    10,904        947        326        969 |    13,146
             |     82.95       7.20       2.48       7.37 |    100.00
             |     78.61      72.40      77.62      81.02 |     78.27
-------------+--------------------------------------------+----------
       Total |    13,871      1,308        420      1,196 |    16,795
             |     82.59       7.79       2.50       7.12 |    100.00
             |    100.00     100.00     100.00     100.00 |    100.00

Of those not enrolled in school, the majority (81%) are African. Of those enrolled in school, the majority is also overwhelmingly African (almost 83%). Examination of the frequencies shows that for both the enrolled and not enrolled categories, the distribution of responses is heavily skewed towards the African population group. There are over ten times as many African responses as White responses.
For each population group, the majority of respondents are enrolled in school.  We can see this in the graph below.  
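One way to check whether the enrollment differences across population groups are larger than chance alone would produce is a chi-squared test. The sketch below uses Stata's immediate form of tabulate, so it needs only the cell counts printed in the enrollment-by-population-group table rather than the dataset itself. It is an illustrative check and not part of the original analysis.

```stata
* Cell counts typed in from the table: first row = not enrolled,
* second row = enrolled; columns = African, Coloured, Indian, White
tabi 2967 361 94 227 \ 10904 947 326 969, chi2 column

* With the data in memory, the equivalent command would be:
* tab enrolled race if age>=5 & age<=25, chi2 col
```

A small p-value would indicate that enrollment rates differ across groups, but it says nothing about why they differ.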

. tab maxparented enrolled if age>=5 & age<=25, row col

       Max |
  parental |      Enrollment
 education | Not enrol   Enrolled |     Total
-----------+----------------------+----------
         0 |     1,017      3,001 |     4,018
           |     25.31      74.69 |    100.00
           |     29.18      23.92 |     25.07
-----------+----------------------+----------
         1 |       225        810 |     1,035
           |     21.74      78.26 |    100.00
           |      6.46       6.46 |      6.46
-----------+----------------------+----------
         2 |       210        713 |       923
           |     22.75      77.25 |    100.00
           |      6.03       5.68 |      5.76
-----------+----------------------+----------
         3 |       230        817 |     1,047
           |     21.97      78.03 |    100.00
           |      6.60       6.51 |      6.53
-----------+----------------------+----------
         4 |       270        952 |     1,222
           |     22.09      77.91 |    100.00
           |      7.75       7.59 |      7.62
-----------+----------------------+----------
         5 |       307      1,117 |     1,424
           |     21.56      78.44 |    100.00
           |      8.81       8.90 |      8.88
-----------+----------------------+----------
         6 |       390      1,614 |     2,004
           |     19.46      80.54 |    100.00
           |     11.19      12.87 |     12.50
-----------+----------------------+----------
         7 |       160        563 |       723
           |     22.13      77.87 |    100.00
           |      4.59       4.49 |      4.51
-----------+----------------------+----------
         8 |       267        979 |     1,246
           |     21.43      78.57 |    100.00
           |      7.66       7.80 |      7.77
-----------+----------------------+----------
         9 |       133        467 |       600
           |     22.17      77.83 |    100.00
           |      3.82       3.72 |      3.74
-----------+----------------------+----------
        10 |       174        768 |       942
           |     18.47      81.53 |    100.00
           |      4.99       6.12 |      5.88
-----------+----------------------+----------
        12 |        88        522 |       610
           |     14.43      85.57 |    100.00
           |      2.53       4.16 |      3.81
-----------+----------------------+----------
        16 |        14        222 |       236
           |      5.93      94.07 |    100.00
           |      0.40       1.77 |      1.47
-----------+----------------------+----------
     Total |     3,485     12,545 |    16,030
           |     21.74      78.26 |    100.00
           |    100.00     100.00 |    100.00

We can see the distribution of enrollment across levels of parental educational attainment especially clearly in a bar chart:

Given that, for some people, completing some high school may provide similar value to matriculating from high school, and some middle school may be as valuable as completing eighth grade, grouping the grade levels may provide a different perspective on student enrollment and parental educational attainment.

Though more students who are not enrolled in school have parents who claim no education, and more students enrolled in school have parents who have matriculated from high school, there are no other drastic differences in parental educational attainment between children who are enrolled and children who are not enrolled in school.

As suspected, the difference in enrollment between students whose parents claim no education and students whose parents completed at least some higher education is more apparent. A markedly higher percentage of students whose parents claim no education are not enrolled in school (25.31%) than of students whose parents had some higher education (5.93%).
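The gap between the two extreme groups can be tested directly from the published table. The following sketch uses Stata's immediate two-sample test of proportions, with the group sizes and enrolled shares copied from the rows for maxparented equal to 0 and 16. It is offered as a supplementary check under those inputs, not as a step the original analysis performed.

```stata
* Group 1: parents with no education (n = 4,018; 74.69% enrolled)
* Group 2: parents with higher education (n = 236; 94.07% enrolled)
prtesti 4018 .7469 236 .9407
```

The test compares only these two groups and ignores the controls discussed later, so it describes the raw difference rather than the effect of parental education.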

Above, we also mentioned the possible effect of location of a household with relation to school, as well as the possible effect of living in a single or two-parent household.  The descriptive statistics relating to these controls are below.

. tab enrolled timetoschool if age>=5 & age<=25, row col

             |     Time to and from school
             |            (min/day)
  Enrollment |   0 to 15   15 to 30       > 30 |     Total
-------------+---------------------------------+----------
Not enrolled |     2,725          0        924 |     3,649
             |     74.68       0.00      25.32 |    100.00
             |     23.42       0.00      42.70 |     21.73
-------------+---------------------------------+----------
    Enrolled |     8,908      2,998      1,240 |    13,146
             |     67.76      22.81       9.43 |    100.00
             |     76.58     100.00      57.30 |     78.27
-------------+---------------------------------+----------
       Total |    11,633      2,998      2,164 |    16,795
             |     69.26      17.85      12.88 |    100.00
             |    100.00     100.00     100.00 |    100.00

Of the students who live zero to fifteen minutes from school, 76.58% are enrolled in school. That number drops to 57.30% for students who live more than a thirty-minute commute from school. Fewer students are not enrolled in school than are enrolled, but the fact that over 74% of students not enrolled in school live within fifteen minutes of school may be valuable information for policymakers seeking to increase student attendance.

. tab enrolled twoparents if age>=5 & age<=25, row col

   Currently |
 enrolled in | Two parent household
      school | Not two-p  Two-paren |     Total
-------------+----------------------+----------
Not enrolled |     2,149      1,495 |     3,644
             |     58.97      41.03 |    100.00
             |     26.80      17.10 |     21.74
-------------+----------------------+----------
    Enrolled |     5,869      7,248 |    13,117
             |     44.74      55.26 |    100.00
             |     73.20      82.90 |     78.26
-------------+----------------------+----------
       Total |     8,018      8,743 |    16,761
             |     47.84      52.16 |    100.00
             |    100.00     100.00 |    100.00

According to this data, students in two-parent households are more likely to be enrolled in school: 82.90% of students in two-parent households are enrolled, compared with 73.20% of students in other households. Looking at the rows, 58.97% of students who are not enrolled in school do not live in a two-parent household, while 41.03% do. Correspondingly, 55.26% of students enrolled in school live in two-parent households, while 44.74% do not.

Summary of Descriptive Statistics

Our descriptive statistics demonstrate that enrollment does in fact seem to have some interesting trends when considered with variables such as household per capita income and race, trends indicative of the effects of population group and socio-economic status on student enrollment in school. We saw in our most recent graph that student enrollment looked broadly similar across most levels of parental educational attainment. This graph, however, shows the state of enrollment given parental education, which we must be careful not to interpret as a probability. As I am interested specifically in the effect of parental educational attainment on student enrollment in school, the descriptive statistics suggest the importance of including race and income as controls in order to produce valid regression outcomes.

There are obviously multiple other possible relationships to examine to enhance our picture of the state of education in 1994. Additionally, it would be statistically valuable to run t-tests to check the significance of the relationships we're examining.
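As a sketch of the t-test suggested here, the command below compares mean parental education between enrolled and not-enrolled school-age respondents. It assumes the cleaned 1994 data are in memory with the variable names used in this module; the unequal option is a cautious choice on my part, not something the module specifies.

```stata
* Two-sample t-test of mean maxparented by enrollment status,
* without assuming equal variances in the two groups
ttest maxparented if age>=5 & age<=25, by(enrolled) unequal
```

Since maxparented takes only a handful of grade values, the result is best read as a rough indication rather than a precise measure.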

The descriptive statistics also point to a possibly noteworthy relationship between enrollment and living in a two-parent household, which I will also examine below.

Regressions

A regression estimates the relationship between a dependent variable and one or more independent variables; on its own, it shows association and supports a causal reading only when confounding factors are adequately controlled for. It asks, "How much of a (the dependent variable) is explained by b (the independent variable)?" In the context of my research, the question reads "How much of enrollment of school-age children is explained by the education attainment of those children's parents?" Or, "How does the probability of being enrolled in school change when parent education attainment changes?"

A basic regression including the variables maxparented and enrolled assumes a linear relationship between the two variables. That is, every one-unit increase in parental education level is associated with the same change in the predicted probability of enrollment. The regression is below.

. reg enrolled maxparented if age>=5 & age<=25

      Source |       SS       df       MS              Number of obs =   16030
-------------+------------------------------           F(  1, 16028) =   70.80
       Model |  11.9950354     1  11.9950354           Prob > F      =  0.0000
    Residual |  2715.34901 16028   .16941284           R-squared     =  0.0044
-------------+------------------------------           Adj R-squared =  0.0043
       Total |  2727.34404 16029  .170150605           Root MSE      =   .4116

------------------------------------------------------------------------------
    enrolled |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 maxparented |   .0071718   .0008523     8.41   0.000     .0055012    .0088424
       _cons |   .7508303   .0049819   150.71   0.000     .7410652    .7605954
------------------------------------------------------------------------------
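To make the coefficients concrete, the fitted line can be evaluated by hand at a few education levels. This is a worked illustration using only the estimates printed in the output; the choice of levels 0, 12, and 16 is mine.

```stata
* Predicted probability of enrollment = _cons + coefficient * maxparented
display .7508303 + .0071718*0     // no parental education: about 0.75
display .7508303 + .0071718*12    // parent passed grade 12: about 0.84
display .7508303 + .0071718*16    // parent with higher education: about 0.87
```

These fitted values can be set against the observed enrollment rates in the cross-tabulation (74.69%, 85.57%, and 94.07% for the same three levels). The straight line matches the low end well and understates enrollment at the top.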

Because enrollment is a binary outcome, a logistic regression would be a better-suited model, but for the purpose of this lesson module, I will continue with basic regressions.
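For readers who want to try the logistic alternative, a minimal sketch follows. It assumes the same data and sample restriction as the basic regression, and it assumes Stata 11 or later for the margins command, which may be newer than the version used to produce this module.

```stata
* Logistic regression of enrollment on maximum parental education
logit enrolled maxparented if age>=5 & age<=25

* Average marginal effect of one more grade of parental education
* on the probability of enrollment, comparable to the OLS coefficient
margins, dydx(maxparented)
```

The average marginal effect is reported on the probability scale, which makes it easier to compare with the linear coefficient than the raw logit coefficient.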

Whenever we run regressions we have to consider the likelihood of confounding factors that may also affect our dependent variable. The regression above fails to consider a number of possibly important controls. I mentioned several controls at the beginning of the module, and include those in the regression below. There are of course many other controls I could factor into my regression, but I feel I have included the controls that are likely to be the most influential.

First, as I will be including income [hhpci], I need to create a log version of this variable for my regression.

. gen loghhpci=log(hhpci)
(35563 missing values generated)
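The large number of missing values reported here deserves a quick look, because any observation with a missing loghhpci drops out of a regression that includes it. The sketch below separates the two possible sources; it assumes hhpci is the income variable already in memory.

```stata
* Observations where income itself is missing
count if missing(hhpci)

* Observations where income is zero or negative, for which log() is undefined
* (missing values sort above all numbers, so they are not counted here)
count if hhpci <= 0
```

Together the two counts should account for the missing values generated by the log transformation.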

The rest of the control variables I defined and coded above should be sufficient as they are coded. My next step is then to run the regression with all my controls.

. xi: reg enrolled maxparented loghhpci i.race i.son timetoschool i.twoparents if age>=5 & age<=25

i.race            _Irace_1-4          (naturally coded; _Irace_1 omitted)
i.son             _Ison_0-1           (naturally coded; _Ison_0 omitted)
i.twoparents      _Itwoparent_0-1     (naturally coded; _Itwoparent_0 omitted)

      Source |       SS       df       MS              Number of obs =      11
-------------+------------------------------           F(  7,     3) =    0.82
       Model |  1.07629982     7  .153757117           Prob > F      =  0.6270
    Residual |  .560063821     3   .18668794           R-squared     =  0.6577
-------------+------------------------------           Adj R-squared = -0.1409
       Total |  1.63636364    10  .163636364           Root MSE      =  .43207

------------------------------------------------------------------------------
    enrolled |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 maxparented |   .0818116   .0563417     1.45   0.242    -.0974929     .261116
    loghhpci |  -.7487846   .6604914    -1.13   0.339    -2.850763    1.353194
    _Irace_2 |   .4520746   .8233126     0.55   0.621    -2.168074    3.072223
    _Irace_3 |    1.19726   1.271466     0.94   0.416    -2.849113    5.243632
    _Irace_4 |  (dropped)
     _Ison_1 |   .1225385   .2952479     0.42   0.706    -.8170722    1.062149
timetoschool |   .3040913   .3432478     0.89   0.441    -.7882764    1.396459
_Itwoparen~1 |   -.085065   .5854808    -0.15   0.894    -1.948326    1.778196
       _cons |   3.970811   2.817164     1.41   0.253    -4.994661    12.93628
------------------------------------------------------------------------------

 

Unfortunately, none of my regression variables are significant. Note that the regression uses only 11 observations, most likely because loghhpci is missing for most respondents, so with 3 residual degrees of freedom the estimates are very imprecise. I will therefore focus my analysis on my main independent variable, maxparented. The coefficient of .0818 signifies that every one-unit increase in maxparented, or one grade level increase, is associated with an increase of about eight percentage points in the probability of being enrolled in school. Though this is not statistically significant, the direction of the relationship makes intuitive sense: parents with higher levels of education are more likely to enrol their child(ren) in school. The lack of statistical significance may reflect the very small sample, or a confounding factor that I have not considered.
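The output reports Number of obs = 11, far fewer than the 16,030 observations in the basic regression. A short loop can show which variables are responsible for the lost observations. This is a diagnostic sketch, assuming the variable names used in the regression command above.

```stata
* Count missing values of each regression variable among ages 5 to 25
foreach v of varlist enrolled maxparented loghhpci race son timetoschool twoparents {
    quietly count if missing(`v') & age>=5 & age<=25
    display "`v': " r(N) " missing"
}
```

A variable with a very large count is a candidate to drop or replace, for example by using the income bins in hhpcibin if they have better coverage, though whether that is appropriate depends on how the bins were constructed.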

 

Summary

 

To conclude the analysis of the 1994 dataset, let's consider some of the important findings of this section. First, we saw that the educational attainment of parents is strongly associated with both race and household per capita income. Members of the White population group and the higher income classes have much higher percentages of parents who have graduated from high school, while members of the lower income classes and the remaining population groups, and in particular the African population group, have dramatically lower percentages. Second, enrollment differs less sharply: the White population group has the highest percentage of school-age members enrolled (81.02%, compared with 78.61% for African and 72.40% for Coloured respondents), and the small income table did not show enrollment rising with income.

 

This data allows us to paint a picture of the state of education in 1993/1994. An examination of the 2004 data may provide a picture of how that state has changed--if at all--over the years from 1994 to 2004. To continue to the 2004 analysis, please click here.
