Data Type and Data Sources for Agricultural Big Data and Machine Learning
Ania Cravero,
Sebastián Pardo,
Patricio Galeas,
Julio López Fenner and
Mónica Caniupán
Additional contact information
Ania Cravero: Centre of Excellence for Modelling and Scientific Computing, Computer Science and Informatics Department, Universidad de La Frontera, Temuco 4811230, Chile
Sebastián Pardo: Centre of Excellence for Modelling and Scientific Computing, Computer Science and Informatics Department, Universidad de La Frontera, Temuco 4811230, Chile
Patricio Galeas: Centre of Excellence for Modelling and Scientific Computing, Computer Science and Informatics Department, Universidad de La Frontera, Temuco 4811230, Chile
Julio López Fenner: Centre of Excellence for Modelling and Scientific Computing, Computer Science and Informatics Department, Universidad de La Frontera, Temuco 4811230, Chile
Mónica Caniupán: Information Systems Department, Universidad del Bío-Bío, Concepción 4030000, Chile
Sustainability, 2022, vol. 14, issue 23, 1-37
Abstract:
Sustainable agriculture is being challenged under climate change scenarios, since extreme environmental processes disrupt and diminish global food production. For example, rainfall and drought-induced increases in plant diseases have decreased food production. Machine Learning (ML) and Agricultural Big Data are high-performance computing technologies that allow the processing and analysis of large amounts of heterogeneous data to understand agricultural production, for which intelligent IT and high-resolution remote sensing techniques are required. However, the selection of ML algorithms depends on the types of data to be used. Therefore, agricultural scientists need to understand the data and the sources from which they are derived. These data can be structured, such as temperature and humidity data, which are usually numerical (e.g., float); semi-structured, such as those from spreadsheets and information repositories, since their data types are not defined in advance and they are stored in NoSQL databases; and unstructured, such as PDF files, TIFF files, and satellite images, since they have not been processed and are therefore stored not in a database but in repositories (e.g., Hadoop). This study provides insight into the data types used in Agricultural Big Data, along with their main challenges and trends. It analyzes 43 papers selected through the protocol proposed by Kitchenham and Charters and validated with the PRISMA criteria. The primary data sources were found to be databases, sensors, cameras, GPS, and remote sensing, which capture data stored in platforms such as Hadoop, Cloud Computing, and Google Earth Engine. In the future, Data Lakes will allow for data integration across different platforms, as they provide representation models of other data types and the relationships between them, improving the quality of the data to be integrated.
Keywords: agriculture; big data; machine learning; data type; data source
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2022
Citations: 4 (EconPapers)
Downloads: (external link)
https://www.mdpi.com/2071-1050/14/23/16131/pdf (application/pdf)
https://www.mdpi.com/2071-1050/14/23/16131/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:14:y:2022:i:23:p:16131-:d:991934
Sustainability is currently edited by Ms. Alexandra Wu
Bibliographic data for series maintained by MDPI Indexing Manager.