U.S. Geological Survey (USGS) UCODE_2005(1) NOTE: Any use of trade, product or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government. NAME UCODE_2005 and six post-processors for universal inverse modeling, including sensitivity analysis, data needs assessment, calibration, prediction, and uncertainty analysis ABSTRACT UCODE_2005 and six post-processors are included in this distribution. These programs can be used with existing process models to perform sensitivity analysis, data needs assessment, calibration, prediction, and uncertainty analysis. Any process model or set of models can be used; the only requirements are that models have numerical (ASCII or text only) input and output files, that the numbers in these files have sufficient significant digits, that all required models can be run from a single batch file or script, and that simulated values are continuous functions of the parameter values. Process models can include pre-processors and post-processors as well as one or more models related to the processes of interest (physical, chemical, and so on), making UCODE_2005 extremely powerful. An estimated parameter can be a quantity that appears in the input files of the process model(s), or a quantity used in an equation that produces a value that appears in the input files. In the latter situation, the equation is user-defined. UCODE_2005 can compare observations and simulated equivalents. The simulated equivalents can be any simulated value written in the process-model output files or can be calculated from simulated values with user-defined equations. The quantities can be model results, or dependent variables. For example, for ground-water models they can be heads, flows, concentrations, and so on. Prior, or direct, information on estimated parameters also can be considered. Statistics are calculated to quantify the comparison of observations and simulated equivalents, including a weighted least-squares objective function. In addition, data-exchange files are produced that facilitate graphical analysis. UCODE_2005 can be used fruitfully in model calibration through its sensitivity analysis capabilities and its ability to estimate parameter values that result in the best possible fit to the observations. Parameters are estimated using nonlinear regression: a weighted least-squares objective function is minimized with respect to the parameter values using a modified Gauss-Newton method or a double-dogleg technique. Sensitivities needed for the method can be read from files produced by process models that can calculate sensitivities, such as MODFLOW-2000, or can be calculated by UCODE_2005 using a more general, but less accurate, forward- or central-difference perturbation technique. Problems resulting from inaccurate sensitivities and solutions related to the perturbation techniques are discussed in the report. Statistics are calculated and printed for use in (1) diagnosing inadequate data and identifying parameters that probably cannot be estimated; (2) evaluating estimated parameter values; and (3) evaluating how well the model represents the simulated processes. Results from UCODE_2005 and codes RESIDUAL_ANALYSIS and RESIDUAL_ANALYSIS_ADV can be used to evaluate how accurately the model represents the processes it simulates. Results from LINEAR_UNCERTAINTY can be used to quantify the uncertainty of model simulated values if the model is sufficiently linear. Results from MODEL_LINEARITY and MODEL_LINEARITY_ADV can be used to evaluate model linearity and, thereby, the accuracy of the LINEAR_UNCERTAINTY results. UCODE_2005 can also be used to calculate nonlinear confidence and predictions intervals, which quantify the uncertainty of model simulated values when the model is not linear. CORFAC_PLUS can be used to produce factors that allow intervals to account for model intrinsic nonlinearity and small-scale variations in system characteristics that are not explicitly accounted for in the model or the observation weighting. The six post-processing programs are independent of UCODE_2005 and can use the results of other programs that produce the required data-exchange files. UCODE_2005 and the other six codes are intended for use on any computer operating system. The programs consist of algorithms programmed in Fortran 90/95, which efficiently performs numerical calculations. The model runs required to obtain perturbation sensitivities can be performed using multiple processors. The programs are constructed in a modular fashion using JUPITER API conventions and modules. For example, the data-exchange files and input blocks are JUPITER API conventions and many of those used by UCODE_2005 are read or written by JUPITER API modules. UCODE-2005 includes capabilities likely to be required by many applications (programs) constructed using the JUPITER API, and can be used as a starting point for such programs. HISTORY UCODE_2005 Version 1.000 02/06/2006 - Initial release. UCODE_2005 Version 1.001 02/13/2006 - minor adjustment to sta.f90 to accommodate UNIX, distribution files are formatted for UNIX, an example make file is included for UNIX. Minor changes to documentation. UCODE_2005 Version 1.002 03/26/2006 - correction to nonlinear confidence limit feature to: avoid an unnecessary final calculation of sensitivities, print the correct set of parameter values, search for and print the values from the iteration closest to the goal, print warnings and suggestions when the result is far from the goal, and to provide a more readable output. - correction to correct error when derived observations were used. UCODE_2005 would fail if this error occurred. - updated API modules, notable is improvement in the parallel module - echo printing was added to repeat what is read when _dm is read and verbose>4 - a format was changed in linear_uncertainty to add a missing " changes were made to ucode_2005.f90. ucode_mod.f90 reg_gn_ucode.f90 reg_gn_mod.f90 utlucode.f90 and sta.f90 pll.f90 utl.f90 and linear_uncertainty UCODE_2005 Version 1.003 08/13/2006 - updated version # to 1.003 - correction to use the absolute value of 1% of the starting parameter value as the default for ScalePval when it is not specified, because the value must be positive - replaced STA_UEV_DX_READ_DM in ucode_mod.f90 and reg_gn_ucode.f90 with UTLUCODE_DX_READ_DM which is a modified version of STA_UEV_DX_READ_DM that echoes items of the _dm file as they are read - corrected error for which prior information plot symbol was always printed as 1 in some underscore files - added check to see if the simulated equivalents for subsequent Beale parameter sets are identical to the previous set, and write a warning to the screen, #umodlin, and #modlin if that is the case (occasionally there is a good reason for these being the same, but usually it is a failure of the process model for a given set of parameter values - corrected error that arose when derived predictions were used. UCODE_2005 would fail if this error occurred. - nullified some pointers in all files which were identified as problems by some compilers and deleted some unused variables - corrected error that selected the second to the last iteration as the one with the lowest sum of squared residuals if model calculated sensitivities were being used. Note this results in slightly different results for the test problems. - corrected error that printed sum of squared residuals offset by one iteration in the penultimate summaries when intermediate printing was on. Also this output was slightly reformatted. - corrected error for which maximum fractional parameter change in regression space was used for convergence as well as for limiting change when maxchangerealm=regression - added the hookstep option to the trust region method. In order to use this the user needs to specify trustregion=hookstep in the REG_GN_CONTROLS input block. Descriptions of the hookstep method is provided by Dennis and Schnabel, in their book: "Numerical Methods for Nonlinear Optimization and Nonlinear Equations" - updated API modules, updating the modules corrected a bug that limited equations to 40 characters (the limit is now 2000) changes were made to all source files for ucode and associated codes UCODE_2005 Version 1.004 - updated version # to 1.004 - version numbers were incremented on all auxiliary codes that had been modified since first release in Feb 2006 - added write statements to indicate which version of the Jupiter API is used - updated the API. Items known to be specific to ucode_2005 functionality include: - UTL_SSVD2 was revised so that when ITER equals 30, an error message is printed and UTL_STOP is called to stop execution - write statements reformatted to print SCALEPVAL rather than BSCAL which is the name of SCALEPVAL in the code. This does not change the results or the numbers that are printed. - code was made consistent with input manual to allow SOSSurface to be not only yes or no, but also file. Error reporting related to that file was improved. - corrections were made to the trust region algorithm so that MODVALPREV is passed as an argument so its value is retained. Also a check was added to the hookstep to prevent an infinite loop. - a new feature was added to the trust region algorithm adding the option to not scale the least squares matrix. The keyword is: SCALING = none (false, f, no, or n will work as well) Omitting scaling can, in some cases, improve performance of the regression. - example files included a few prerelease parameters that are not part of ucode, these were altered or deleted. (TOL was changed to TOLPAR, TOLSOSR was changed to TOLSOSC, and FletcherReeves was deleted). - A couple of errors were found in the examples files. 1) The ex1a\02.in and 03.in file was modified to use the template file for the forward model. 2) For some cases, having only one outer pcg iteration was not enough to obtain convergence. So, the number of outer iterations was increased in test-data-win\data-used-by-all\tc1.pcg. These changes alter the details of the results but not their essence. - the MODFLOW executable for the example problems was updated to version 1.17 UCODE_2005 Version 1.005 - updated version # to 1.005 - added version checking for compatibility of UCODE source and Jupiter API - compiled with JUPITER API 1.1.0 - using JUPITER API 1.100 allows up to 2 billion observations previous limit was 44721 - corrected error related to SOSSURFACE when only one parameter was evaluated, parameter value now varies over the indicated range - corrected error that led to printing of zero for derived predictions and variance in _p and _pv - corrected error that at times led to problems with damping when parameters were constrained - included a check for conditions that at times led to continual estimation of the same values when parameters were constrained (this condition is likely to occur if parameters are over constrained) - added checking and adjustment of input that controls the value of the Marquardt parameter so as to avoid the potential for entering an infinite loop in the Gauss-Newton solution - extended message written to *.#uout describing tasks accomplished when prediction=yes - Label STANDARD ERROR OF THE REGRESSION in #uout was changed to STANDARD ERROR so as to be applicable when calculated for a forward model execution. - Maximum Likelihood Objective Function is now reported as MLOFO = NOBS*ln(SWSR/NOBS) for Observations Only and MLOFOP = (NOBS+NPR)*ln(SWSRwPri/(NOBS+NPR)) for Obs & Prior where: SWSR = sum_weighted_squared_residuals NOBS = #observations SWSRwPri = sum_weighted_squared_residuals including prior NPR = number of prior observations This change in the calculation of MLOF changes the values of model evaluation criteria. All criteria were affected by the same amount, so their relative rankings are not affected. - calculation of the ln determinant of the Fisher Information Matrix was corrected from ln|XTwX| to: NPE * ln(SWSR / NOBS) + ln |XTwX| NPE = # estimated_parameters When prior are included, the SWSR includes the residuals on prior information items and NOBS includes the number of prior information items. This change in the calculation of the Fisher Information Matrix changes the value of Kashyap's measure. - Formula for calculating Kashyap's measure was corrected to use the normalized Fisher information matrix. - printing of model evaluation criteria was changed to include values only relative to MLOF, both without prior, and with prior if prior information equations are included in the parameter estimation. - changes were made to the _dm file: 1)Maximum Likelihood Objective Function is now reported as describe above for the #uout file 2)Values of the ln determinant of the Fisher Information Matrix and model evaluation criteria were corrected as described above for the #uout file 3)Three lines were added. These contain: 1) the value of ln determinant of the Kashyap's measure (KIC) for observations only, 2) the Fisher Information Matrix for observations only. and 3) an indication of whether any of the final statistics were calculated with forward difference perturbation. - correction was made to ensure the final sum of squared weighted residuals is printed and used in calculation of the model measures UCODE_2005 Version 1.006 was a temporary version that was not released UCODE_2005 Version 1.007 - updated version # to 1.007 - k of Kashyap's measure is now NPE, the number of parameters estimated for the process model rather than NPE+1 as is used for other model criteria, where the addition of one reflects the estimation of sigma-squared - subroutine UTLUCODE_INVERT was modified to scale the matrix before inverting it and unscale the inverted matrix before returning it. This makes matrix inversion possible for difficult problem in which elements of the matrix differ by many orders of magnitude and in most cases prevents the error Error: Failed in UTLUCODE_INVERT - added test to determine if upperconstraint>lowerconstraint for parameters with constrain=yes, program terminates and indicates the parameters for which this is not met - extension of the range of DOF in UTL_CHISQ was accomplished by calling UTLUCODE_CHISQ which allows for appropriate intervals to be calculated on cev when many observations are included in the regression - Model_Linearity was corrected to only print the prior weight matrix if prior information are included in the model being evaluated. In addition this print was altered so it will only occur when verbose>4. Version of Model_Linearity was updated to 1.002 - Printing of Residual_analysis was modified to accomodate cases with more than 99 parameters by allowing 4 spaces for each parameter and printing strips of 10 parameters at a time. Version of Residual_Analysis was updated to 1.002 - Summary of the progress of the regression is now printed at the end of *.#uout when the regression does not converge even if Stats_on_nonconverge=no. - The same summary information is printed after each iteration to a file named *._summary. This information is available to quickly evaluate the progress of the regression should it terminate prematurely. UCODE_2005 Version 1.008 was a temporary version that was not released UCODE_2005 Version 1.009 07/12/2007 - Updated version # to 1.009 - Compiled with JUPITER API 1.2.1 - Updated to accommodate the use of parameters in predictive mode that were not included in the calibration. This allows the user to specify uncertainty for a parameter that influences predictive uncertainty but cannot be estimate in the calibration mode. For example the addition of effective porosity for an advective travel time prediction when the calibration considers only head and flow observations which do not contain information about effective porosity. This change includes a number of new input blocks which are described in the supplemental pdf: "Additional_Input_Blocks_ucode_2005.pdf". - Updated input files for ex1b to use the capability to add parameters for prediction. - Corrected maximum record length on data-exchange files with long lines in ucode and associated codes. In some cases the length was not sufficient to write the entire line and the execution would terminate. - Modified format of header lines in _b* files so parameter and observation names line up with their values - Heading banners for warnings were changed from !'s to *'s - Added a feature to edit prior information equations in the case where a log-transformed parameter appears as the only term of a prior equation but is not included in the parentheses of a log10 function. If such a parameter occurs in any form other than log10(param-name) or param-name alone, an error message is written and the code terminates. - Updated Corfac_Plus to version 1.003. This version considers parameters that are included for the predictive mode. The level of verbose required to echo input was increased from 3 to 4 in order to avoid echoing at the default value of verbose. In addition, a trailer was added indicating the time of completion and the time required to execute. - Updated Linear_Uncertainty to version 1.005. This version considers parameters that are included for the predictive mode. In addition, a trailer was added indicating the time of completion and the time required to execute. - Updated Model_Linearity to version 1.005. Increased maximum record length for reading _b1 and _b2. Modified output format for final table to improve readability. In addition, a trailer was added indicating the time of completion and the time required to execute. - Corrected Model_Linearity_Adv and updated to version 1.003 to properly consider cases in which some parameters are not adjustable in the regression. This version considers parameters that are included for the predictive mode. In addition, a trailer was added indicating the time of completion and the time required to execute. - Corrected Residual_Analysis and updated to version 1.004 to use new api to read supri. In addition, a trailer was added indicating the time of completion and the time required to execute. - Corrected Residual_Analysis_Adv and updated to version 1.002 to use new api to read supri. In addition, a trailer was added indicating the time of completion and the time required to execute. UCODE_2005 Version 1.010 08/16/2007 - Updated version # to 1.010 - Compiled with JUPITER API 1.2.1 - Modified such to provide normal termination for cases in which there are fewer observations than parameters but only sensitivitity calculations (not optimization) are requested. In this situation parameter variance cannot be calculated and related statistics are not printed. If this occurs when optimization is requested, the message: CONSIDER WHETHER TOO MANY OBSERVATIONS HAVE BEEN OMITTED has been changed to TOO FEW OBSERVATIONS GIVEN THE NUMBER OF PARAMETERS UCODE_2005 Version 1.011 10/10/2007 - Updated version # to 1.011 - Adjusted formats for _sc because labels and some numbers ran together - Corrected error where parameters were not substituted properly for the base case of a prediction=yes analysis. - Adjusted job definition so that runs with Forward&Der run the application code only once. - Corrected error in which _paoptp was not read properly for prediction=yes mode. - Corrected error in which _mvp was not written properly for prediction=yes mode. - Clarified some of the statements printed to the screen during execution of ucode - Modified Residual_Analysis and updated to version 1.005 to print a message and terminate when the number of observations exceeds 10,000 because the NxN matrix is too large. UCODE_2005 Version 1.012 02/05/2008 - Updated version # to 1.012 - Compiled with JUPITER API 1.2.3 - Added option to run prediction=yes with sensitivities=no to facilitate testing of the predictive mode setup - The above change required that sensitivities be set to yes in the file 05.in of folder ex1a - Corrected format error for writing large numbers of predictions - Corrected further error in which _mvp was not written properly for prediction=yes mode. - In the calculation of the hookstep Marquardt parameter, the algorithm now uses a scaled Hessian and unscaled Marquardt parameter. This improves robustness on ill-conditioned Hessians. - Corrected calculation of ln|F|, log determinant of the Fisher Information Matrix. Now ln(|XTwX|/(SOSWR/n)) is printed. This results in a change in the order of variables in the subroutine REG_GNMOD_EVA_MLOFSTATS. This only changes values printed for "LN DETERMINANT of Fisher Information Matrix" with and without prior. The value printed for KIC is not affected. - Corrected condition in which underscore files for residuals were not written for sensitivities=no optimize=no dataexchange=yes. - Corrected condition for which _xyzt file was not written because unused observation names were not included in the .xyzt file. Now only used observations need to be listed in the .xyzt file in order fo _xyzt to be written. - Cleaned up coding in UTLUCODE_CHECK_PRI. This did not affect functionality. - Improved error checking and messaging for definition of prior. - Corfac_plus was edited. The version number was updated to 1.004. - there is now a default for Read_cov. The default is no. - the default for RegressionUsedTrueCov was changed to yes - Residual Analysis was edited so that one line of the code that was longer than 80 characters is now wrapped and thus does not exceed 80 characters. - the documentation pdf was updated to match the version distributed on the USGS Publication Warehouse web site. Differences between this pdf and the previously distributed pdf are: Page(s) Change from previous to current version xiv Missing page number for table 38 added 30-39 Different pagination caused by moving text from after to before table 3 44 Sentence added at end of last paragraph. 63-64 MAXSTEP description revised 154 Definition for NP added to header DATA REQUIREMENTS In order to use UCODE_2005, the process model needs to be constructed and the main UCODE_2005 input file needs to be created. Input data are read from files. SYSTEM REQUIREMENTS UCODE_2005 is written primarily in Fortran 90. The code has been used on UNIX-based computers and personal computers running various forms of the Microsoft Windows operating system. CONTACTS Eileen Poeter International Ground Water Modeling Center and the Colorado School of Mines Golden, Colorado 80401 USA epoeter@mines.edu Mary C. Hill U.S. Geological Survey National Research Program 3215 Marine St Boulder, CO 80302 USA mchill@usgs.gov See http://water.usgs.gov/software/ordering_documentation.html for information on ordering printed copies of USGS publications.