Scalable Generation of Synthetic IoT Network Datasets: A Case Study with Cooja
Hrant Khachatrian,
Aram Dovlatyan,
Greta Grigoryan and
Theofanis P. Raptis
Additional contact information
Hrant Khachatrian: Machine Learning Group, Center for Mathematical and Applied Research, Yerevan State University, Yerevan 0025, Armenia
Aram Dovlatyan: YerevaNN, Yerevan 0025, Armenia
Greta Grigoryan: Machine Learning Group, Center for Mathematical and Applied Research, Yerevan State University, Yerevan 0025, Armenia
Theofanis P. Raptis: Institute of Informatics and Telematics, National Research Council, 56124 Pisa, Italy
Future Internet, 2025, vol. 17, issue 11, 1-17
Abstract:
Predicting the behavior of Internet of Things (IoT) networks under irregular topologies and heterogeneous battery conditions remains a significant challenge. Simulation tools can capture these effects but can require high manual effort and computational capacity, motivating the use of machine learning surrogates. This work introduces an automated pipeline for generating large-scale IoT network datasets by bringing together the Contiki-NG firmware, parameterized topology generation, and Slurm-based orchestration of Cooja simulations. The system supports a variety of network structures, scalable node counts, randomized battery allocations, and routing protocols to reproduce diverse failure modes. As a case study, we conduct over 10,000 Cooja simulations with 15–75 battery-powered motes arranged in sparse grid topologies and operating the RPL routing protocol, consuming 1300 CPU-hours in total. The simulations capture realistic failure modes, including unjoined nodes despite physical connectivity and cascading disconnects caused by battery depletion. The resulting graph-structured datasets are used for two prediction tasks: (1) estimating the last successful message delivery time for each node and (2) predicting network-wide spatial coverage. Graph neural network models trained on these datasets outperform baseline regression models and topology-aware heuristics while evaluating substantially faster than full simulations. The proposed framework provides a reproducible foundation for data-driven analysis of energy-limited IoT networks.
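The parameterized topology generation and randomized battery allocation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the grid spacing, the battery range, and the unit-disk radio model are all assumptions made for illustration only, and no Cooja or Contiki-NG interface is used.

```python
import math
import random


def sparse_grid_topology(n_motes, grid_side, spacing=10.0, seed=0):
    """Place n_motes on distinct cells of a grid_side x grid_side grid.

    Each mote also gets a random battery level. The range [0.2, 1.0] is a
    hypothetical, unitless choice, not a value taken from the paper.
    """
    if n_motes > grid_side * grid_side:
        raise ValueError("more motes than grid cells")
    rng = random.Random(seed)  # seeded so a topology can be regenerated
    # Sampling cells without replacement leaves most cells empty (sparse grid).
    cells = rng.sample(range(grid_side * grid_side), n_motes)
    motes = []
    for mote_id, cell in enumerate(cells):
        row, col = divmod(cell, grid_side)
        motes.append({
            "id": mote_id,
            "x": col * spacing,
            "y": row * spacing,
            "battery": rng.uniform(0.2, 1.0),
        })
    return motes


def radio_links(motes, tx_range=15.0):
    """Assumed unit-disk model: two motes are linked if within tx_range."""
    links = []
    for i, a in enumerate(motes):
        for b in motes[i + 1:]:
            if math.hypot(a["x"] - b["x"], a["y"] - b["y"]) <= tx_range:
                links.append((a["id"], b["id"]))
    return links


if __name__ == "__main__":
    motes = sparse_grid_topology(n_motes=15, grid_side=10, seed=1)
    print(len(motes), "motes,", len(radio_links(motes)), "candidate links")
```

Varying `n_motes` between 15 and 75 and changing `seed` would yield a family of topologies of the sizes the abstract reports. Each such topology would still have to be translated into a simulation configuration, a step this sketch does not cover.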
Keywords: Cooja; network simulation; energy-constrained networks; synthetic data; surrogate modeling
JEL-codes: O3
Date: 2025
Downloads: (external link)
https://www.mdpi.com/1999-5903/17/11/518/pdf (application/pdf)
https://www.mdpi.com/1999-5903/17/11/518/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:17:y:2025:i:11:p:518-:d:1793408
Future Internet is currently edited by Ms. Grace You
Bibliographic data for series maintained by MDPI Indexing Manager.