A New Tidy Data Structure to Support Exploration and Modeling of Temporal Data
Earo Wang, Dianne Cook and Rob Hyndman
No 12/19, Monash Econometrics and Business Statistics Working Papers from Monash University, Department of Econometrics and Business Statistics
Abstract:
Mining temporal data for information is often inhibited by a multitude of formats: irregular or multiple time intervals, point events that need aggregating, multiple observational units or repeated measurements on multiple individuals, and heterogeneous data types. By contrast, the software that supports time series modeling and forecasting makes strict assumptions about the data it is given, typically requiring a matrix of numeric data with implicit time indexes. Going from raw data to model-ready data is painful. This work presents a cohesive conceptual framework for organizing and manipulating temporal data, which in turn flows into visualization, modeling and forecasting routines. Tidy data principles are extended to temporal data by: (1) mapping the semantics of a dataset into its physical layout; (2) including an explicitly declared index variable representing time; (3) incorporating a "key" comprising one or more variables that uniquely identify units over time. This tidy data representation naturally supports thinking of operations on the data as building blocks that form part of a "data pipeline" in time-based contexts. A sound data pipeline facilitates a fluent workflow for analyzing temporal data. The infrastructure for tidy temporal data has been implemented in the R package tsibble.
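The three principles in the abstract can be illustrated with a small, language-agnostic sketch. The Python below is not the tsibble API. The class name `TemporalTable` and its methods `filter` and `total_by_time` are hypothetical names introduced only for illustration. It assumes each row is a dictionary, that one named variable is the time index, and that one or more named variables form the key. The only rule it enforces is that each key and index combination appears at most once.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Any, Callable


@dataclass
class TemporalTable:
    """Hypothetical tidy temporal table: rows plus a declared index and key."""

    rows: list[dict[str, Any]]
    index: str             # name of the variable holding time
    key: tuple[str, ...]   # variables that identify each unit over time

    def _ident(self, row: dict[str, Any]) -> tuple:
        # Key values followed by the time value identify one observation.
        return tuple(row[k] for k in self.key) + (row[self.index],)

    def __post_init__(self) -> None:
        seen = set()
        for row in self.rows:
            ident = self._ident(row)
            if ident in seen:
                raise ValueError(f"duplicate key/index combination: {ident}")
            seen.add(ident)
        # Store a sorted copy: grouped by key, ordered by time within each key.
        self.rows = sorted(self.rows, key=self._ident)

    def filter(self, predicate: Callable[[dict[str, Any]], bool]) -> TemporalTable:
        # A pipeline step: returns a new table with the same index and key.
        kept = [r for r in self.rows if predicate(r)]
        return TemporalTable(kept, self.index, self.key)

    def total_by_time(self, value: str) -> dict[Any, Any]:
        # Aggregate one measured variable across all units at each time point.
        totals: dict[Any, Any] = {}
        for row in self.rows:
            t = row[self.index]
            totals[t] = totals.get(t, 0) + row[value]
        return totals


if __name__ == "__main__":
    # Made-up illustrative data, not taken from the paper.
    data = [
        {"city": "Melbourne", "day": date(2019, 1, 2), "count": 5},
        {"city": "Melbourne", "day": date(2019, 1, 1), "count": 3},
        {"city": "Sydney", "day": date(2019, 1, 1), "count": 4},
    ]
    table = TemporalTable(data, index="day", key=("city",))
    print(table.filter(lambda r: r["city"] == "Melbourne").total_by_time("count"))
```

Because every step returns a table with the same declared index and key, steps can be chained, which is the "building blocks" idea the abstract describes. The paper's actual implementation in tsibble may differ in every detail from this sketch.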
Keywords: time series; data wrangling; tidy data; R; forecasting; data science; exploratory data analysis; data pipelines
JEL-codes: C22 C32 C81 C82 C88
Pages: 28
Date: 2019
New Economics Papers: this item is included in nep-ecm, nep-ets, nep-for and nep-ore
Downloads:
https://www.monash.edu/business/ebs/research/publications/ebs/wp12-2019.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:msh:ebswps:2019-12
Ordering information: This working paper can be ordered from
http://business.mona ... -business-statistics
Series: Monash Econometrics and Business Statistics Working Papers, Monash University, Department of Econometrics and Business Statistics, PO Box 11E, Monash University, Victoria 3800, Australia.
Bibliographic data for series maintained by Professor Xibin Zhang.