EXERCISE 2 - ANSWER

Do people with income levels at or above the median level report a higher level of satisfaction than those who report an income below the median level, and if so, by how much?

The variables we need to answer this question have already been used in this module, so there is probably no need to look them up. We will use incmon and satisfie. Answering this question proceeds in steps. The first step is to find the median level of income (incmon, monthly gross pay). To do this you could type:

sum incmon, detail

                     monthly gross pay
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            8              0
10%          120              0       Obs                 636
25%          320              0       Sum of Wgt.         636

50%        879.5                      Mean           1585.805
                        Largest       Std. Dev.      2106.602
75%         1950          12000
90%         3939          15000       Variance        4437771
95%         6000          15000       Skewness       2.976268
99%        10000          16400       Kurtosis       14.93181

to learn that the median level of income is 879.5 Rand.
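
Rather than copying 879.5 by hand, the median can also be picked up from the results that summarize leaves behind. The lines below are a minimal sketch of that idea, not part of the original answer: med_inc is a hypothetical scalar name chosen for illustration, and r(p50) is the median stored by summarize when the detail option is used.

```stata
* Sketch only: med_inc is a hypothetical name, not from the exercise
quietly summarize incmon, detail
* r(p50) holds the 50th percentile (the median) after summarize, detail
scalar med_inc = r(p50)
display "Median of incmon: " med_inc

* Count observations at or above the median, leaving out missing incmon
count if incmon >= scalar(med_inc) & !missing(incmon)
```

Writing scalar(med_inc) rather than med_inc avoids confusion if a variable with a similar name exists in the data.
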
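Then we can investigate the mean of satisfie. However, we have to be careful, because satisfie is a household-level variable: its value is repeated for every member of a household, so larger households would count more heavily in a simple mean. To eliminate this household-size bias, we need to create a new variable that keeps only one value per household. We can create it using the explicit subscript [_n], which assumes that the observations are sorted by hhid: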

gen satisfi2=satisfie if hhid[_n] ~= hhid[_n-1]
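
The same idea can be written with by-group processing, which makes the "first observation in each household" logic explicit. This is only a sketch under the assumption that hhid identifies households; hhfirst and satisfi2_alt are hypothetical names introduced here so that satisfi2 above is left untouched.

```stata
* Sketch only: hhfirst and satisfi2_alt are hypothetical names
* stable keeps the existing order of members within each household
sort hhid, stable

* Flag the first observation of every household
by hhid: gen byte hhfirst = (_n == 1)

* Keep satisfie for that one observation, missing for the others
gen satisfi2_alt = satisfie if hhfirst

* The number of flagged observations is the number of households
count if hhfirst
```

Sorting with the stable option matters here because a plain sort may reorder members within a household, which would change which person's incmon is paired with the household's satisfaction value.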

Now, let's find out what the means are.

 

means satisfi2 if incmon >= 879.5 & incmon ~= . 
Variable |    Type        Obs        Mean       [95% Conf. Interval]
---------+----------------------------------------------------------
satisfi2 | Arithmetic     266    2.879699        2.725533   3.033866 
         |  Geometric     266    2.586001        2.440213   2.740498 
         |   Harmonic     266    2.294422        2.157769   2.449554 
---------+---------------------------------------------------------- 
  
means satisfi2 if incmon < 879.5 & incmon ~= . 
Variable |    Type        Obs        Mean       [95% Conf. Interval]
---------+----------------------------------------------------------
satisfi2 | Arithmetic     234    3.222222        3.037661   3.406783 
         |  Geometric     228    3.035702         2.85658   3.226055 
         |   Harmonic     228    2.686567        2.499823   2.903465 
---------+----------------------------------------------------------

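We see that the average level of satisfaction in the top half of the income distribution is 2.88, while that in the bottom half is 3.22, a difference of about 0.34. What do these numbers imply? To interpret them we need to know how satisfie is coded; for example, tabulate satisfie will show the values the variable takes. If lower values correspond to greater satisfaction, then the richer half are more satisfied, by about 0.34 points on the scale.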
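
The comparison above can also be laid out side by side. The following is a hedged sketch rather than the exercise's own method: highinc is a hypothetical indicator created for illustration, the cutoff 879.5 is the median found earlier, and the t test is an addition that treats the satisfaction scale as if it were a continuous measure.

```stata
* Look at the coding of the satisfaction variable first
tabulate satisfie

* Sketch only: highinc is a hypothetical name
* 1 = at or above the median, 0 = below, missing if incmon is missing
gen byte highinc = (incmon >= 879.5) if !missing(incmon)

* Mean and number of observations of satisfi2 in each half
tabstat satisfi2, by(highinc) statistics(mean n)

* Optional: test whether the two means differ
ttest satisfi2, by(highinc)
```

Building the indicator with the if !missing(incmon) qualifier keeps observations with missing income out of both groups.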

Lastly, note that it would be natural to start by typing:

means satisfie if incmon >= 879.5
means satisfie if incmon < 879.5

If you did this, Stata would count observations with a missing value of incmon as being above the median level, because Stata treats missing values as larger than any number, and the result would also be biased by household size. It is always a good idea to use list to check whether a variable we are using is at the household level or the individual level.
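
Both pitfalls can be checked directly. The commands below are a small sketch of such a check; the range 1/15 is an arbitrary choice for illustration.

```stata
* Missing values are treated as larger than any number,
* so the first count includes observations with missing incmon
count if incmon >= 879.5
count if incmon >= 879.5 & !missing(incmon)

* The gap between the two counts equals the number of missing values
count if missing(incmon)

* List a few households to see which variables repeat within hhid
sort hhid, stable
list hhid incmon satisfie in 1/15, sepby(hhid)
```

If satisfie shows the same value on every line within a household while incmon varies, that suggests satisfie is recorded at the household level and incmon at the individual level.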

 
