new; /* erases everything in memory and closes files and windows 1 */ /* This is an annotated Gauss batch program for Bayesian ** estimation of a normal linear regression, based on part of ** Example 23.3.1 in Mittelhammer et al. (2000). ** ** The purpose of this program is to demonstrate the use of Monte ** Carlo methods for calculating properties of the posterior ** distribution of the parameters of a linear model. The likelihood ** function is based on a normal distribution, and the prior ** distribution is on regression coefficients is general--i.e., any ** pdf the user specifies. The standard weakly informative prior on ** sigma2 is used. ** ** For present purposes, the Bayes analysis is based on outcomes ** of a linear regression model of the form, ** ** y = b1 + b2*x2 + b3*x3 + e ** ** The Y variable is specified to be distributed as ** N(x*Beta, sigma2*eye(n)), i = 1,..., n. ** ** The observations for this example are the textile data provided ** by Theil (Principles of Econometrics, John Wiley and Sons, 1971, ** p. 102). The data relate to textile consumption in the ** Netherlands. The variables consist of indices measuring ** the per-capita consumption of textiles, the per-capita level ** of real disposable income, and the real price of textiles. All ** of the variables in the model are converted, following ** Theil's example, to natural logarithms. Thus the model takes the ** form of a constant elasticity demand function. The data are ** found loaded internally to the code of this program. */ /* Set some graphics and program parameters */ gausset; /* Resets global control variables declared in gauss.dec */ /* The next 3 lines set up an output file */ format /rd 11,5; /* right justified numbers of form xxxxxx.xxxxx */ outwidth 110; /* 110 columns in output */ output file = BayesGen.out reset; /* Overwrite previous version of file */ /* Normal model, Y ~ Normal(x*Beta, sigma2*eye(n)) */ /* Load part of Theil's data set and transform the data to natural ** logarithms. The data represent indices of per-capita textile ** consumption, real per-capita disposable income, and the real ** price of textiles. We use only the first five observations to ** illustrate a case in which the data are not sufficiently ** informative to yield good estimates of all parameters of interest ** unless supplemented by theoretically grounded prior information. */ data = { 99.2 96.7 101.0, 99.0 98.1 100.1, 100.0 100.0 100.0, 111.6 104.9 90.6, 122.2 104.9 86.5}; /* 117.6 109.5 89.7, ** 121.1 110.8 90.6, ** 136.0 112.3 82.8, ** 154.2 109.3 70.1, ** 153.6 105.3 65.4, ** 158.5 101.7 61.3, ** 140.6 95.4 62.5, ** 136.2 96.4 63.6, ** 168.0 97.6 52.6, ** 154.3 102.4 59.7, ** 149.0 101.6 59.5, ** 165.5 103.8 61.3 }; */ /* Generate the other model components */ n = rows(data); /* number of observations */ k = 3; /* number of regressors, including constant term */ v = n - k; /* degrees of freedom */ y = ln(data[.,1]); /* take natural logs of all entries in first column ** of data matrix */ x = ones(n,1)~ln(data[.,2 3]); /* Create nx3 design matrix by concatenation */ /* Compute the LS estimates */ {bhat, ehat, yhat} = olsqr2(y, x); /* Compute OLS estimates using QR ** decompostion */ s2 = ehat'ehat/(n - k); /* Calculate LS estimate of sigma^2 */ Covbhat = s2*invpd(x'x); /* LS estimate of cov. matrix via inverse of ** symmetric pos. def. matrix */ /* Calculate precision h for use later in the normal - MultT case */ h = x'x/s2; /* Set the number of Monte Carlo trials to a postivie interger multiple of 100 */ m = 100000; /* Set print format for matrices. ** /m1 means print 1 carriage return/line feed pair before each row of a ** matrix. /rd 16,4 means numbers should be right-justified with field ** width 16 and 4.0 digits right of the decimal place. */ format /m1 /rd 16,4; /* We now generate m independent outcomes from the appropriate multivariate ** T distribution that characterizes the posterior distribution based on an ** UNINFORMATIVE prior for the betas. ** ** As a numerical measure of the accuracy with which the posterior means are ** calculated, we also report the value of the sample standard deviation of the ** sample mean for each of the three posterior means (intercept, income parameter, ** price parameter) calculated. ** ** Initialize some variables used in the simulation loop */ psum = 0; pssum = 0; Bpsum = zeros(3,1); Bspsum = zeros(3,1); /* Create procedure to draw from a multivariate t distribution */ proc rndmultT(u, h, v, n); local t, Zsqr, w, k; k = rows(u); /* count the rows in u */ /* multiply the transpose of the Cholesky factor of the covariance matrix by ** a matrix of standard normal draws */ w = chol(invpd(h))'rndn(k,n); Zsqr = 2*rndgam(1,n,v/2); /* twice a 1xn vector of draws from gamma ** distribution with parameter v/2. This is the same as a Chi-square(v) ** distribution */ t = u + w.*sqrt(v./Zsqr); /* 1xk vector of variables with t-distribution */ retp(t'); /* Return from the Procedure the row vector t' */ endp; /* Start the simulation loop */ numloop = m/100; /* set format for matrix to right-justified fields of width 6 and no decimal */ format /mat /rd 6,0; /* Begin a loop with counter variable j with initial value 1, final value numloop, ** and increments of size 1. */ for j (1, numloop, 1); /* locate(4,1); ** print /flush " Trial" j*100 "of" m; */ /* Draw from a multivariate t distribution with mean bhat, precision h, degrees of ** freedom v, and dimension 100. Weight each draw by its prior probability. In ** this example, the prior is uninformative for beta1 (the intercept) and normal for ** for beta2 and beta3 (income and price elasticities). */ mu2 = .8; /* prior means */ mu3 = -.8; sig2 = .2; /* prior standard deveiations */ sig3 = .2; rndseed 1; /* set seed for random number generator for replicability */ Btry = rndmultT(bhat, h, v, 100); Bsqrs = Btry.*Btry; prior = (2*pi*sig2*sig3)^(-1)*exp(-.5*(((Btry[.,2]-mu2)/sig2)^2+((Btry[.,3]-mu3)/sig3)^2)); ps = prior^2; Bp = Btry.*prior; Bsp = Bsqrs.*prior; /* Sum the elements in each column */ psum = psum + sumc(prior); pssum = pssum + sumc(ps); Bpsum = Bpsum + sumc(Bp); Bspsum = Bspsum + sumc(Bsp); endfor; cls; /* Calculate the posterior results and the accuracy measures */ Bmeans = Bpsum/psum; /* Weighted mean of draws of betas */ Bsmeans = Bspsum/psum; /* Weighted mean of squared betas */ PostVars = Bsmeans - Bmeans^2; /*posterior var(beta) = post E(beta^2)-(post E(beta))^2 */ StdMean = sqrt(PostVars*pssum/psum^2); /* Standard deviation of posterior mean */ MLest = bhat; MLstd = sqrt(diag(covbhat)); lowbound = Bmeans - 2*StdMean; upbound = Bmeans + 2*StdMean; /* Report the results */ print; format /mat /rd 7,4; print "Comparison of ML and Bayes posterior means estimates with normal prior"; print; print "Maximum Likelihood (unconstrained):"; print "Parameter Estimate Std Error"; print; print "Constant " MLest[1,1] " " MLstd[1,1]; print "Income " MLest[2,1] " " MLstd[2,1]; print "Price " MLest[3,1] " " MLstd[3,1]; print; format /mat /rd 6,0; print "BAYES: (Attempted MC Repetitions =" m ")"; print "Parameter Estimate Posterior StdDev StdDev of Mean"; print; format /mat /rd 10,4; print "Constant " Bmeans[1,1] " " sqrt(PostVars[1,1]) " " Stdmean[1,1]; print "Income " Bmeans[2,1] " " sqrt(PostVars[2,1]) " " Stdmean[2,1]; print "Price " Bmeans[3,1] " " sqrt(PostVars[3,1]) " " Stdmean[3,1]; /* ndpclex; clears math coprocessor exception flags (under DOS only) */ /* system; quit Gauss and return to the operating system */ end; /* Terminate program, revert to command mode, close open files, turn screen on */