There are numerous ways to adjust for nonresponse bias in surveys; two such methods are calibration weighting and propensity score models. Calibration is a viable technique when good external benchmarks exist, but such benchmarks are not always available. An alternative is to use propensity scores to adjust for nonresponse. At least three main modeling techniques are used to create propensity scores, but little if any research has examined which of them yields the best propensity scores for nonresponse adjustment. This paper compares calibration weights with three propensity score adjustment methods. One propensity weight is based on logistic regression models; the other two are based on classification trees (using either a single tree or an ensemble of trees). The research focuses on the Agricultural Resource Management Survey Phase III (ARMS III), which adjusts for potential bias resulting from unit nonresponse by calibrating weights so that estimates equal published benchmarks from other sources. Using Census of Agriculture (COA) data, we compared the effectiveness of calibration weights and propensity score weights in reducing unit nonresponse bias. Bias comparisons used COA data as a proxy for the 2000-2008 ARMS III samples, since the COA includes items surveyed on ARMS III as well as a number of items pertaining to operational characteristics. Nonresponse bias of the mean was compared across 30 production and demographic items. The results indicate that tree weights outperform logistic regression weights and that calibration weighting reduces nonresponse bias of the mean to the lowest levels. The results also suggest that tree weighting is the next best option when calibration targets are not available.
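As a rough illustration of the kind of comparison described above, the following sketch computes nonresponse bias of the mean for a toy sample in which the item value is known for every sampled unit (the role the COA proxy data plays here). It is not the paper's procedure. The propensities are estimated simply as the response rate within each cell, which is what the leaves of a single classification tree would supply; the cell labels, data values, equal base weights, and function names are all assumptions made for illustration.

```python
from collections import defaultdict


def cell_propensities(cells, responded):
    """Estimate response propensity as the response rate within each cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [respondents, sampled units]
    for cell, resp in zip(cells, responded):
        counts[cell][0] += int(resp)
        counts[cell][1] += 1
    return {cell: n_resp / n_all for cell, (n_resp, n_all) in counts.items()}


def weighted_respondent_mean(values, responded, weights):
    """Weighted mean of the item over respondents only."""
    num = sum(w * v for v, r, w in zip(values, responded, weights) if r)
    den = sum(w for r, w in zip(responded, weights) if r)
    return num / den


# Hypothetical toy sample: proxy values are known for all sampled units.
cells = ["small"] * 4 + ["large"] * 4
values = [10, 12, 8, 10, 100, 90, 110, 100]
responded = [True, True, True, False, True, True, False, False]

# Benchmark: mean over the full sample (assumes equal base weights).
full_mean = sum(values) / len(values)

# Unadjusted estimate: every respondent keeps a weight of 1.
unadjusted = weighted_respondent_mean(values, responded, [1.0] * len(values))

# Propensity-adjusted estimate: weight each unit by 1 / estimated propensity.
propensity = cell_propensities(cells, responded)
adj_weights = [1.0 / propensity[c] for c in cells]
adjusted = weighted_respondent_mean(values, responded, adj_weights)

bias_unadjusted = unadjusted - full_mean
bias_adjusted = adjusted - full_mean
print(f"bias without adjustment: {bias_unadjusted:.2f}")
print(f"bias with propensity weights: {bias_adjusted:.2f}")
```

In this toy sample the adjustment shrinks the bias but does not remove it, because respondents in the "large" cell still differ from that cell's nonrespondents. A logistic regression or tree ensemble would replace `cell_propensities` with model-based scores; the bias calculation would stay the same.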