Previous research has argued that preliminary data analysis is necessary for software cost estimation. In this paper, a framework for such analysis is applied to a substantial corpus of historical project data (ISBSG R9 data), selected without explicit bias. The analysis yields sets of dominant variables, which are then used to construct project effort estimation models. The performance of the predictors on the raw variables and on the extracted sets of variables is then measured in terms of Mean Magnitude of Relative Error (MMRE), Median Magnitude of Relative Error (MdMRE), and prediction at levels 0.05, 0.1, and 0.25. The comparative evaluation suggests that more accurate prediction models can be constructed from the framework-processed variables for the selected prediction techniques. The framework-processed predictor variables are statistically significant at the 95% confidence level for both parametric techniques and for one non-parametric technique. The results are also compared with the latest published results obtained by other researchers on the same data set. The comparison indicates that the models constructed using framework-processed data are generally more accurate.
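As an illustration of the accuracy measures named above, the following minimal sketch computes MMRE, MdMRE, and prediction at a given level for a set of effort estimates. It assumes the usual definitions: the magnitude of relative error (MRE) of a project is |actual - predicted| / actual, and prediction at level l is the fraction of projects whose MRE is at most l. The function names and the effort values are hypothetical and are not taken from the paper or from the ISBSG data.

```python
from statistics import mean, median


def mre(actual, predicted):
    """Magnitude of relative error for one project (actual effort must be > 0)."""
    return abs(actual - predicted) / actual


def evaluate(actuals, predictions, levels=(0.05, 0.1, 0.25)):
    """Return MMRE, MdMRE and Pred(l) for each level l, assuming the usual definitions."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    result = {"MMRE": mean(errors), "MdMRE": median(errors)}
    for level in levels:
        # Pred(l): share of projects whose MRE does not exceed l
        result[f"Pred({level})"] = sum(e <= level for e in errors) / len(errors)
    return result


# Made-up effort values (e.g. person-hours), for illustration only.
actuals = [100.0, 200.0, 400.0, 800.0]
predictions = [104.0, 184.0, 480.0, 400.0]

scores = evaluate(actuals, predictions)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Lower MMRE and MdMRE and higher Pred(l) indicate a more accurate model. MdMRE is less sensitive than MMRE to a few badly estimated projects, which is one reason the two are commonly reported together.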
Liu, Q., Qin, W. Z., Mintram, R., & Ross, M. (2008). Evaluation of Preliminary Data Analysis Framework in Software Cost Estimation Based on ISBSG R9 Data. Software Quality Journal, 16(3), 411-458. https://doi.org/10.1007/s11219-007-9041-4