Parquet to CSV converter
Trusted by over 20,000 users every month
Convert Parquet to CSV online
With our online Parquet to CSV converter, you can convert your files without downloading any software or writing code. Unlike other services, you can also make graphs from your converted data or analyze it. Just use the navigation on the left-hand side.
Works with large Parquet files that have millions of rows
View your converted CSV data before downloading it
Parquet
Apache Parquet (.parquet) is a columnar file format designed for storing tabular data on disk. Its design is based on the format described in Google's Dremel paper (Dremel is the query engine behind Google BigQuery).
Parquet files store data in a binary format, which means that they can be efficiently read by computers but are difficult for people to read.
Parquet files have a schema, which means that every value in a column must have the same type. The schema makes Parquet files easier to analyze than CSV files and also helps them compress better, so they are smaller on disk.
CSV
CSV (Comma Separated Values) files are the most common format for storing tabular data. Values in a row are separated by commas and rows are separated by newlines.
CSV files often start with a header row that has column names, but this is not required.
Each row in a CSV file must have the same number of values as the header row.
CSV files do not enforce types or a schema. This means that a single column can contain values of different types, which can make analysis difficult and compression inefficient.
Parquet files can be easier to analyze and compress better than CSV files.
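The lack of types in CSV can be illustrated with a short sketch using only Python's standard library. The sample data and the infer_type helper are hypothetical and exist only for this example.

```python
import csv
import io

# Hypothetical sample data: the "age" column mixes a number and free text.
raw = "name,age\nAda,36\nGrace,unknown\n"

# The csv module returns every value as a string; no types are enforced.
rows = list(csv.DictReader(io.StringIO(raw)))


def infer_type(value):
    """Return a rough type label for a single CSV cell (illustrative helper)."""
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        float(value)
        return "float"
    except ValueError:
        return "str"


# Collect the distinct types that appear in the "age" column.
age_types = {infer_type(row["age"]) for row in rows}
print(sorted(age_types))
```

Here the same column holds something that looks like an integer and something that is plain text, so any tool reading the file has to decide for itself how to treat the column.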
How to convert Parquet to CSV
- Upload your Parquet file
- Your Parquet file will be converted to CSV
- Download your CSV file
- Click the view button to view your file
How to convert Parquet to CSV in Python
We can convert Parquet to CSV in Python using pandas or DuckDB.
How to convert Parquet to CSV using pandas
First, we need to install pandas, along with a Parquet engine such as pyarrow, which pandas needs to read Parquet files
pip install pandas pyarrow
Then we can import pandas and load the Parquet file into a dataframe
import pandas as pd
df = pd.read_parquet('path/to/file.parquet')
Finally, we can export the dataframe to the CSV format. Passing index=False leaves the dataframe's row index out of the CSV file
df.to_csv('path/to/file.csv', index=False)
How to convert Parquet to CSV using DuckDB
First, we need to install duckdb for Python
pip install duckdb
After importing duckdb, the following query will copy the contents of a single Parquet file to a CSV file
import duckdb
duckdb.sql("""COPY (select * from 'path/to/file.parquet') TO 'path/to/file.csv' (HEADER, FORMAT 'csv')""")
If you have more than one Parquet file with the same schema (e.g. your Parquet files are partitioned), then you can pass a list of files to read_parquet
duckdb.sql("""COPY (select * from read_parquet(['path/to/file1.parquet', 'path/to/file2.parquet'])) TO 'path/to/file.csv' (HEADER, FORMAT 'csv')""")
MT cars
Motor Trend Car Road Tests dataset.
filename: mtcars.csv
rows: 32
Flights 1m
1 million flights, including arrival and departure delays.
filename: flights-1m.csv
rows: 1000000
Iris
Iris plant species dataset.
filename: iris.csv
rows: 50
House price
Housing price dataset.
filename: house-price.csv
rows: 545
Weather
Weather dataset with temperature, rainfall, sunshine and wind measurements.
filename: weather.csv
rows: 366