
Results from the regression of forest cover on the independent variables described above are summarized in table 3, columns 1-6. With the exception of a few variables relating to soil-quality limitations, all of the coefficients are significant and generally have the expected signs.[9] Furthermore, the predictive power of the regression is high. Using a 50% probability as the cut-off point, 76% of the plots under forest and 70% of the plots without forest cover are predicted correctly. To assess the quantitative importance of individual variables, we compute the predicted effect of a one-standard-deviation change in each independent variable, holding all other variables at their means. Four main conclusions emerge.
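The two diagnostics used here, the share of plots classified correctly at a 50% cut-off and the effect of a one-standard-deviation change evaluated at the means, can be sketched as follows. This is a minimal illustration under stated assumptions: the logistic link, the function names, and the toy numbers are hypothetical and are not taken from the estimation behind table 3.

```python
import numpy as np

def logistic(z):
    """Logistic link; assumed here purely for illustration."""
    return 1.0 / (1.0 + np.exp(-z))

def classification_rates(p_hat, forested, cutoff=0.5):
    """Return (share of forested plots predicted as forest,
    share of non-forested plots predicted as non-forest)."""
    p_hat = np.asarray(p_hat, dtype=float)
    forested = np.asarray(forested, dtype=bool)
    predicted_forest = p_hat >= cutoff
    hit_forest = predicted_forest[forested].mean()
    hit_nonforest = (~predicted_forest[~forested]).mean()
    return hit_forest, hit_nonforest

def one_sd_effect(beta, X, j):
    """Change in predicted probability when column j of X rises by one
    standard deviation, with every other column held at its mean."""
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    x_mean = X.mean(axis=0)
    x_shift = x_mean.copy()
    x_shift[j] += X[:, j].std()
    return logistic(x_shift @ beta) - logistic(x_mean @ beta)

# Toy inputs (hypothetical): five plots, a constant plus one regressor.
rates = classification_rates([0.9, 0.6, 0.4, 0.2, 0.7], [1, 1, 1, 0, 0])
effect = one_sd_effect([0.0, 1.0], [[1.0, 0.0], [1.0, 2.0]], j=1)
print(rates, effect)
```

Reporting the two hit rates separately, as the text does, guards against a model that looks accurate overall only because one class dominates the sample.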
First, distance from infrastructure, elevation, rainfall, and slope emerge as the quantitatively most important determinants of forest cover. All of them have the predicted positive sign, in line with results obtained in the literature (Chomitz and Gray 1996), which suggest that better access to infrastructure (or higher agricultural potential) is associated with a lower probability of forest cover or, equivalently, a higher probability of deforestation.
Second, high levels of poverty in any given municipio are significantly associated with a lower probability of a plot being under forest, even after other factors are controlled for. This means that higher levels of poverty are indeed associated with more deforestation. Two aspects are of interest. First, if this finding carries over to transient increases in poverty, safety nets that augment individuals' income levels during periods of crisis (e.g. through food-for-work programs in times of drought) could be associated with important environmental benefits, in addition to any other benefits they would provide. Second, a one-percent change in poverty has a larger effect on the probability of forest cover than a one-percent change in population density, suggesting that the importance attributed to population density in earlier studies may have been exaggerated.
Third, contrary to simplistic arguments regarding a "tragedy of the commons", we find a positive effect of ejido area and of the share of indigenous population on forest cover, even once other factors are controlled for. This leads us to reject the hypothesis that such arrangements would, after accounting for all other factors, be associated with higher levels of deforestation.
Fourth, government policies in the form of credit subsidies (measured by the number of people who actually had access to subsidized credit) are indeed associated with lower levels of forest cover. This suggests that the cheap credit thus made available has contributed to deforestation rather than to capital formation that would have favored more intensive cultivation of existing agricultural land over the cutting down of forest.
One interesting aspect of these findings concerns the relationship between distance from infrastructure and poverty. From the negative sign of the poverty-distance interaction one concludes that, far from infrastructure, high levels of poverty have a particularly large environmentally damaging effect. This could reflect the use of land-intensive forms of shifting cultivation for subsistence by households whose distance from markets renders transport costs prohibitively high.
The predicted probability of finding forest cover on any given plot, as a function of its road distance and the poverty level of the municipio, is illustrated in figure 1. One notes that, at high levels of poverty, additional infrastructure that would reduce road distance would be associated with only a slight reduction in the probability of finding forest cover. This increase in the propensity to deforest could easily be compensated for if improved access to infrastructure led to a modest decrease in poverty levels. In that case provision of infrastructure could even result in an increased overall probability of forest cover. Recalling that on average about 59% of the population in our study area are poor (see table 2 above), this may be of empirical relevance. Future research on the potential interactions between the provision of infrastructure and poverty reduction would be warranted to clarify to what degree, in Mexico as well as in other settings, the poverty-reducing effect of infrastructure may outweigh the deforestation-increasing effect normally associated with the provision of better road access.
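The role of the poverty-distance interaction can be made explicit with a stylized index function. The notation is an assumption introduced for illustration (a generic binary-response model with link F and density f, road distance d_i, poverty rate p_i, and coefficients β); it is not the exact specification estimated in table 3.

```latex
\Pr(\text{forest}_i = 1) = F\bigl(\beta_0 + \beta_d\, d_i + \beta_p\, p_i + \beta_{dp}\, d_i\, p_i + \dots\bigr)

\frac{\partial \Pr(\text{forest}_i = 1)}{\partial p_i} = f(\cdot)\,\bigl(\beta_p + \beta_{dp}\, d_i\bigr),
\qquad
\frac{\partial \Pr(\text{forest}_i = 1)}{\partial d_i} = f(\cdot)\,\bigl(\beta_d + \beta_{dp}\, p_i\bigr)
```

With β_p < 0 and β_dp < 0, the marginal effect of poverty becomes more negative as distance grows, which is the first observation above. With β_d > 0, the same negative interaction shrinks the marginal effect of distance as poverty rises, which is why bringing roads closer is predicted to cost little forest cover where poverty is high.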
We first test the robustness of our results by re-estimating the basic equation for subsets of the data characterized by variations in distance to infrastructure, the incidence of poverty, and altitude (the latter to represent physio-geographic variables in a broader sense). Results, summarized in table 4, suggest that, except for irrigation and extension, neither the significance nor the signs of the respective coefficients change in a substantively important way.
An important test of the robustness of our results, which could also provide insights regarding the interpretation of earlier results from the literature, is to use only subsets of the independent variables included in the base regression. This is of interest for two reasons. First, it enables us to mimic regressions performed in previous studies and to obtain information on the potential biases, through aggregation or through the omission of certain variables, inherent in these results. Second, if we can identify a subset of independent variables that are available not only for 1990 but also for 1980, we can investigate to what degree a low predicted probability of forest cover for areas that were still under forest in 1980 corresponds to actual deforestation during the following decade. This would facilitate a preliminary assessment of whether cross-sectional approaches can legitimately be used to make inferences about future deforestation.
Starting by using only plot-specific information to re-estimate the equation discussed above, we find that omission of socio-economic variables affects neither the overall quality of the regression nor its predictive power, with coefficients very close to the ones in the base case (table 3, columns 7 and 8). Altitude, distance to infrastructure, slope, and protection continue to be the quantitatively most important determinants of forest cover, with 74% of the area under forest cover and 71% of the area without forest cover predicted correctly. We conclude that, even where data on socio-economic variables are limited, plot-specific information on physio-geographic characteristics would allow quite accurate predictions regarding forest cover.
Of course, the case more prevalent in the empirical literature is that physio-geographic rather than socio-economic variables are omitted. In many cases this is combined with various degrees of aggregation bias, as models of deforestation are estimated using data from political aggregates such as provinces or municipalities. Dropping all plot-specific variables and estimating the model using only socio-economic and non-instrumented variables, we find that, compared to the base regression where all variables are included, such an approach can give rise to seriously erroneous conclusions (table 3, columns 9 and 10). In particular, the coefficient on poverty reverses sign and becomes highly significant.[10]
Our study area comprises the two poorest states of Mexico, with considerably less variation in income and levels of socio-economic well-being (e.g. education and access to government services and infrastructure) than is found at the national level. We would therefore like to find out whether the results can be generalized and considered in some sense representative for Mexico as a whole. Lack of adequate data on infrastructure[11] prevented us from simply re-estimating the plot-level model for the whole nation and forced us to average all of the plot-specific variables within municipios and to estimate area-weighted heteroskedasticity-consistent tobit regressions. At the national level, we restrict ourselves to the 2267 municipios with an average population density of less than 500 per square kilometer. Results for the percentage of area covered with forest in any given municipio (table 5; see Deininger and Minten 1996 for a more elaborate discussion) are, again, not very different from the ones obtained here.
A final question that remains to be addressed is the legitimacy of using cross-sectional estimates to make predictions regarding the threat of deforestation facing individual plots. Availability of data on actual deforestation between 1980 and 1990 allows us to investigate this issue by examining to what degree forested plots with a low predicted probability of forest cover in the initial period (1980) were actually subject to deforestation in the following decade. In other words, after running the base regression with only point-specific information (based on 1980 data) as independent variables, we count how many of the plots that were forested in 1980 and had a predicted probability of being under forest cover below 0.5 were no longer covered with forest in 1990. We find that about two thirds (64.4%) of the plots deforested and about three quarters of the forested plots that were not subsequently deforested during the 1980-90 period are predicted correctly by our regressions. This leads to the conclusion that, at least for the case at hand, estimates derived from cross-sectional data do provide a basis for making inferences on subsequent deforestation. This conclusion is also supported by regressions that were run with actual deforestation between 1980 and 1990 as the dependent variable.
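The validation step described above can be sketched in a few lines. The function name and the toy arrays are hypothetical; the inputs stand for the 1980 predicted probabilities of forest cover and the observed 1990 status of plots that were forested in 1980.

```python
import numpy as np

def deforestation_hit_rates(p_forest_1980, forested_1990, cutoff=0.5):
    """Among plots forested in 1980, return:
    - share of plots deforested by 1990 that had been flagged
      (predicted probability of forest cover below the cutoff);
    - share of plots still forested in 1990 that had not been flagged."""
    p = np.asarray(p_forest_1980, dtype=float)
    still_forest = np.asarray(forested_1990, dtype=bool)
    flagged = p < cutoff
    hit_deforested = flagged[~still_forest].mean()
    hit_retained = (~flagged[still_forest]).mean()
    return hit_deforested, hit_retained

# Toy data (hypothetical): six plots that were all forested in 1980.
p_1980 = [0.3, 0.4, 0.7, 0.8, 0.9, 0.2]
status_1990 = [0, 0, 0, 1, 1, 1]  # 1 = still forested in 1990
print(deforestation_hit_rates(p_1980, status_1990))
```

The first return value corresponds to the "about two thirds" figure in the text and the second to the "about three quarters" figure, each computed on its own subsample.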
As indicated in table 3, we use a protection dummy to investigate whether location of a plot in a protected area (in 1980) had any measurable effect on its probability of being covered with forest in 1990. The positive and significant coefficient on this variable suggests that protection did reduce the threat of deforestation, even in cases where it failed to eliminate deforestation altogether. Comparing actual deforestation in protected areas during the 1980-90 period with the predicted probability of deforestation without protection, one finds that, without protection, 43% of the plots within protected areas would have had a predicted probability of deforestation above fifty percent, whereas with protection only 9% of the area was actually deforested during the period. In what follows we expand slightly on these estimates, investigating whether (and to what degree) they could be used to rationalize the delineation of protected areas.
We first note that aggregating individual points' predicted probabilities of forest cover from the regression enables us to compute a quantitative measure of the threat of deforestation faced by any subset of points in the data. This would, in particular, facilitate the derivation of a measure of the deforestation threat faced by certain bio-regions or types of forest. To illustrate, one can identify considerable differences in the deforestation-reducing effect of protection across different types of forest (table 6, column 4). Column 3 of the same table illustrates that, for example, protecting closed montane forest, which even without protection has only a 10.5% probability of being deforested, would result in a much smaller reduction in the probability of deforestation than protecting low tropical forest, which without protection has a 54% probability of not being covered with forest even under present conditions.
More precise inferences could be made if, in addition to the information on any given point's probability of being deforested (with and without protection) that can be derived from the regressions, information on the plot-specific benefits and costs of protection were available. Let every point be associated with a specific biodiversity value (Bi) and define protection costs (Fi), which comprise the foregone profits from agricultural production, net of any value that may be derived from sustainable use of the standing forest, plus the actual cost of protection (fencing, patrolling, etc.).[12] Assuming that the environmental planner or agency faces a budget constraint (I) out of which the costs of physical protection and possible compensation to landowners for the loss of production arising from protection have to be paid, the set of points to be protected (PiP) solves the problem:

where PiP and PiN are the probabilities of point i being deforested with (P) and without (N) protection, respectively, and X is the vector of determinants of the probability of deforestation that are included in the regression. A simplified example can be constructed by assuming that the costs and benefits of protection are equal for all points currently covered with forest. In this case, the maximand simplifies to the difference between the predicted deforestation probabilities without and with protection, an expression that can easily be evaluated with the information available to us at this point.
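One formulation consistent with the definitions above, offered as a sketch rather than as the exact expression intended, writes Bi, Fi, PiN and PiP with explicit sub- and superscripts and introduces a hypothetical indicator s_i that equals 1 if point i is protected and 0 otherwise:

```latex
\max_{s_i \in \{0,1\}} \; \sum_i s_i \, B_i \, \bigl[ P_i^{N}(X_i) - P_i^{P}(X_i) \bigr]
\quad \text{subject to} \quad \sum_i s_i \, F_i \le I
```

If B_i = B and F_i = F for all i, the budget fixes the number of points that can be protected, and the objective is maximized by selecting the points with the largest difference P_i^N - P_i^P, which is the simplification used in the text.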
Evaluating this difference for the 82,195 km² of the study area that were still covered with forest in 1990, we obtain the results summarized in table 6. One notes that, in the aggregate, protection would on average result in an expected increase of the probability of forest cover by 14.6% (column 4), but also that the impact of protection varies considerably depending on the type of forest. Of more interest is the finding that the effect of protection varies considerably with distance to infrastructure. Interacting the two variables,[13] we find that the deforestation-reducing effect of protection would be comparatively modest in areas close to infrastructure, where the pressure on forests would be high irrespective of protection, as well as very far from infrastructure, where deforestation would not be very attractive even without protection. As illustrated in figure 2, protection of forest cover between 60 and 120 km from the road would be predicted to increase the probability of forest cover by 19.7 percentage points, compared to an increase of 3.4 points for forest more than 240 km from the road. Hence, if the costs of and benefits from protection were equal for all points, protecting plots far away from infrastructure, where the threat of deforestation is low even without protection, would not appear to be a particularly efficient use of public resources. Obtaining more precise information on biodiversity values and protection costs, which would allow a more careful evaluation of the maximization problem, might be a promising way to rationalize the allocation of public resources to environmental protection and thus provide a very tangible and policy-relevant use of the type of regressions performed above.
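Should point-specific biodiversity values and protection costs become available, the selection of points to protect could be approximated along the following lines. This is a hedged sketch, not a procedure used in the paper: the function name, the greedy ranking by expected gain per unit of cost, and the toy numbers are all assumptions. Greedy ranking is a common heuristic for this kind of budget-constrained selection and is exact only in special cases, such as equal costs across points.

```python
import numpy as np

def select_points(p_without, p_with, benefit, cost, budget):
    """Greedy selection of points to protect under a budget.

    p_without, p_with: predicted deforestation probabilities without
    and with protection; benefit: biodiversity value per point;
    cost: protection cost per point. Returns (chosen indices, spending)."""
    p_without = np.asarray(p_without, dtype=float)
    p_with = np.asarray(p_with, dtype=float)
    benefit = np.asarray(benefit, dtype=float)
    cost = np.asarray(cost, dtype=float)

    gain = benefit * (p_without - p_with)      # expected value preserved
    order = np.argsort(-(gain / cost), kind="stable")

    chosen, spent = [], 0.0
    for i in order:
        if gain[i] <= 0:
            continue                           # protection adds nothing here
        if spent + cost[i] <= budget:
            chosen.append(int(i))
            spent += float(cost[i])
    return chosen, spent

# Toy example (hypothetical): equal benefits and costs, budget for two points.
chosen, spent = select_points(
    p_without=[0.6, 0.3, 0.1, 0.5],
    p_with=[0.3, 0.2, 0.05, 0.1],
    benefit=[1, 1, 1, 1],
    cost=[1, 1, 1, 1],
    budget=2,
)
print(chosen, spent)
```

With equal benefits and costs, the ranking reduces to ordering points by the difference in predicted deforestation probability, so points where protection changes little (very remote or very accessible plots) fall to the bottom of the list.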
