Abstract
One of the largest challenges with soil information around the world is how to harmonize archived soil data from different sources and how to make it usable for extracting knowledge. In Ecuador, two major projects have provided soil information; their methodologies, although comparable, did not coincide, especially in how the information was structured and reported. Here, we present a new soil database for Ecuador, comprising 13 542 soil profiles with over 51 713 measured soil horizons and including 92 different edaphic variables. The original data were in a non-editable format (i.e., PDF), making it difficult to access and process the information. Our study provides an integrated framework that combines multiple data analytic tools for the automatic conversion of legacy soil information from analog format into usable digital soil mapping inputs across Ecuador. This framework allowed us to incorporate quantitative information on a broad set of soil properties and to retrieve qualitative information on soil morphological properties collected in the profile description phase, which is rarely included in soil databases. A new harmonized national database was generated using a specific methodology to rescue relevant information. The national representativeness of soil information has been enhanced compared to other international databases, and this new database contributes to filling the gaps in publicly available soil information across the country. The database is freely available to registered users at https://doi.org/doi:10.6073/pasta/1560e803953c839e7aedef78ff7d3f6c (Armas et al., 2022).