Abstract: Model benchmarking allows us to separate uncertainty in model predictions caused 1 by model inputs from uncertainty due to model structural error. We extend this method with a large-sample approach (using data from multiple field sites) to measure prediction uncertainty caused by errors in (i) forcing data, (ii) model parameters, and (iii) model structure, and use it to compare the efficiency of soil moisture state and evapotranspiration flux predictions made by the four land surface models i…