The variables of interest here are rooms_to (again the number of rooms in a residence) and hhsizem (number of people in a household). Again, these variables are measured at the household level, so it is important to reduce the dataset down to one observation per household. The best way to do this is to drop everyone who isn't the respondent to the survey by using the command:
keep if pers_res==1
Now use the following regression command to generate this table:
regress rooms_to hhsizem
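  Source |       SS       df       MS              Number of obs =     990
---------+------------------------------           F(  1,   988) =   59.39
   Model |  345.241841     1  345.241841           Prob > F      =  0.0000
Residual |  5743.46624   988  5.81322494           R-squared     =  0.0567
---------+------------------------------           Adj R-squared =  0.0557
   Total |  6088.70808   989   6.1564288           Root MSE      =  2.4111

------------------------------------------------------------------------------
rooms_to |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
 hhsizem |  0.2011084  0.0260962      7.706   0.000      0.1498981   0.2523188
   _cons |   3.196684  0.1431271     22.335   0.000       2.915815    3.477552
------------------------------------------------------------------------------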
Because the t-statistic for hhsizem (7.706) is greater than 1.96, the coefficient is statistically significant at the 5 percent level: the number of people in a household is significantly related to the number of rooms the family chooses for its dwelling. The coefficient of .2011 for hhsizem suggests that, on average, an additional person is associated with a family renting or purchasing about one-fifth of another room. What is a fifth of a room, you ask? Said differently, with five more people, an average family would acquire roughly one more room. Note, however, that the R-squared of 0.0567 means household size explains only about 6 percent of the variation in the number of rooms.
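To check this arithmetic yourself, you can have Stata do the calculations from the stored regression results. The following is only a minimal sketch, assuming it is run right after the regress command above, so that _b[] and _se[] refer to that regression. The four-person household in the last line is just an illustrative value, not something taken from the data.

```stata
* t-statistic for hhsizem: coefficient divided by its standard error (about 7.706)
display _b[hhsizem] / _se[hhsizem]

* number of additional people associated with one more room: 1/.2011, roughly 5
display 1 / _b[hhsizem]

* fitted number of rooms for a hypothetical four-person household (about 4.0)
display _b[_cons] + _b[hhsizem] * 4
```

The last line uses the same fitted-value formula, constant plus coefficient times household size, evaluated at a single household size.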
To graph this relationship, you must first "predict" the values of the regression. We called our predicted variable "roomshat" but any name would have sufficed.
predict roomshat
With the following graphing command, you should get this resulting graph:
graph rooms_to roomshat hhsizem, connect(.s) symbol(oi) ylabel xlabel
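The graph command above is written in the syntax of Stata 7 and earlier. If you are working in Stata 8 or later, a rough equivalent, offered here as a sketch rather than an exact reproduction, is to overlay a scatter of the actual values with a line through the fitted values. It assumes roomshat has already been created with predict as above, and the marker symbols and axis labels will not match the original graph exactly.

```stata
* actual rooms plotted as points, fitted values drawn as a line
* sort orders the observations by hhsizem so the line is drawn left to right
twoway (scatter rooms_to hhsizem) (line roomshat hhsizem, sort)
```

Newer versions of Stata also keep the old command available under the name graph7, so replacing graph with graph7 in the original line may work as well.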