Efficiently handling Parquet files in Stata
Rute Costa
Portugal Stata Conference 2026 from Stata Users Group
Abstract:
As datasets continue to grow and “big data” become a practical reality, Stata users increasingly face challenges related to storage and performance. A modern solution to these issues is Parquet—an open-source, columnar file format designed for efficient storage and capable of improving performance when used correctly. Some tools for handling Parquet files in Stata have emerged, but this presentation will focus on the community-contributed stata_parquet_io package by Jon Rothbaum. I will discuss the advantages of Parquet and demonstrate how to read, write, merge, and append Parquet files in Stata, drawing on practical experience to highlight efficient workflows and common pitfalls. The goal is to show how Parquet can offer substantial benefits over native Stata files when working with large datasets—and how to use it effectively.
Persistent link: https://EconPapers.repec.org/RePEc:boc:pcon26:8