It is always a good idea to investigate the distribution of a variable with the summarize command before using it. Look at the values the variables take. You should find that the value of a house (sale_val) is sometimes coded as a negative number. A negative number cannot be a genuine house value, so we set these values to missing, which drops those observations from the regression. The number of rooms in a house (rooms_to) varies between 1 and 16. Hence:
gen hhval = sale_val
replace hhval = . if sale_val<0
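The inspection step described above might look like the following sketch. The detail option, the count line, and the final check on hhval are illustrative additions, not part of the original exercise.

```stata
* Summary statistics for the house value and the number of rooms
summarize sale_val rooms_to
* Percentiles and the smallest/largest values, useful for spotting odd codes
summarize sale_val, detail
* Number of observations with a negative house value
count if sale_val < 0
* After the recode, hhval should have no negative values left
summarize hhval
```

Comparing the observation counts reported for sale_val and hhval shows how many observations the recode set to missing.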
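We now run the regression using one observation per household. After sorting by hhid, the condition hhid~=hhid[_n-1] is true only for the first observation of each household: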
sort hhid
reg hhval rooms_to if hhid~=hhid[_n-1]
The results are:
  Source |       SS       df       MS                  Number of obs =     598
---------+------------------------------               F(  1,   596) =  214.61
   Model |  1.7534e+12     1  1.7534e+12               Prob > F      =  0.0000
Residual |  4.8695e+12   596  8.1702e+09               R-squared     =  0.2648
---------+------------------------------               Adj R-squared =  0.2635
   Total |  6.6229e+12   597  1.1094e+10               Root MSE      =   90389
------------------------------------------------------------------------------
   hhval |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
rooms_to |   22486.78   1534.965     14.650   0.000       19472.18    25501.38
   _cons |  -50973.19   8127.605     -6.272   0.000      -66935.41   -35010.96
------------------------------------------------------------------------------
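On average, each additional room is associated with a house value that is about 22,487 Rand higher. The coefficient is statistically significant (t = 14.65), and the number of rooms alone explains about 26 percent of the variation in house values (R-squared = 0.2648).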
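To see what the estimates imply, the fitted line can be evaluated by hand. The sketch below uses the coefficients from the output above; the choice of a 5-room house is purely illustrative.

```stata
* Predicted value of a 5-room house: _cons + coefficient * rooms
display -50973.19 + 22486.78*5
* The same calculation from the stored estimates (run directly after reg)
display _b[_cons] + _b[rooms_to]*5
```

The arithmetic works out to about 61,461 Rand. Because the intercept is negative, the same line predicts a negative value for a one-room house (about -28,486 Rand), a reminder that the linear fit is only an approximation.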
Note: Remember to reload the saldru12.dta file after this exercise.