Parquet viewer
View, filter and sort Parquet data in seconds
Trusted by over 10,000 users every month
View and filter Parquet files online
This Parquet viewer enables you to upload and view Parquet files online. It is optimized for big data handling and provides a simple interface for browsing large-scale datasets.
This tool supports massive Parquet files and ensures smooth performance without the need for complex software installations.
Upload your Parquet file and access your data instantly.
Easily view large-scale Parquet files online in seconds.
Optimized for big data handling with fast processing.
No need for complex software installations.
Simple and intuitive interface for data browsing.
Supports massive datasets with smooth performance.
Perfect for professionals working with Parquet files.
Parquet format
Apache Parquet (.parquet) is a columnar file format designed for storing tabular data on disk. Its design is based on the format described in Google's Dremel paper (Dremel later became BigQuery).
Parquet files store data in a binary format, which means that they can be efficiently read by computers but are difficult for people to read.
Parquet files have a schema, which means that every value in a column must have the same type. The schema makes Parquet files easier to analyse than CSV files and also helps them compress better, so they are smaller on disk.
How to view and filter Parquet files online
- Upload your Parquet file
- Your file will be loaded and then you can view your Parquet data
- Sort data by clicking a column name
- Filter a column by clicking the three dots
- Export your Parquet file in CSV or Excel format by clicking the export button
How to view and filter Parquet files in Python with Pandas
First, we need to install pandas, along with a Parquet engine such as pyarrow, which pandas uses to read the file.
pip install pandas pyarrow
Then we can load the Parquet file into a dataframe.
import pandas as pd
df = pd.read_parquet('path/to/file.parquet')
We can view the first few rows of the dataframe using the head method.
print(df.head(n=5))
The n parameter controls how many rows are returned. Increase it to show more rows.
We can view the last few rows of the dataframe using the tail method.
print(df.tail(n=5))
We can sort the dataframe using the sort_values method.
df = df.sort_values('column_name', ascending=True)
Just replace 'column_name' with the name of the column you want to sort by. The 'ascending' parameter controls the sort order: True sorts in ascending order and False sorts in descending order.
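As a quick illustration of the sort direction, here is a minimal sketch on a small in-memory dataframe that stands in for one loaded from a Parquet file. The column names and values are made up for the example.

```python
import pandas as pd

# Hypothetical stand-in for a dataframe loaded with pd.read_parquet.
df = pd.DataFrame({"model": ["A", "B", "C"], "mpg": [21.0, 33.9, 15.2]})

# ascending=False puts the largest values first.
by_mpg_desc = df.sort_values("mpg", ascending=False)
print(by_mpg_desc)

# sort_values returns a new dataframe; the original row order is unchanged.
print(df)
```

Because sort_values does not modify the dataframe in place by default, the result has to be assigned to a variable to be kept.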
We can filter the dataframe using comparison operators. The following statement will filter a dataframe to rows where the value of the 'column_name' column is greater than 5.
df = df[df['column_name'] > 5]
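Filters can also be combined. The sketch below uses a small made-up dataframe (hypothetical column names and values) to show two common patterns: joining conditions with & and matching against a list of values with isin.

```python
import pandas as pd

# Hypothetical stand-in for a dataframe loaded with pd.read_parquet.
df = pd.DataFrame(
    {
        "model": ["A", "B", "C", "D"],
        "mpg": [21.0, 33.9, 15.2, 27.3],
        "cyl": [6, 4, 8, 4],
    }
)

# Each condition needs its own parentheses; & means AND, | means OR.
efficient_four_cyl = df[(df["mpg"] > 25) & (df["cyl"] == 4)]
print(efficient_four_cyl)

# isin keeps rows whose value appears in the given list.
six_or_eight = df[df["cyl"].isin([6, 8])]
print(six_or_eight)
```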
How to view and filter Parquet files in Python with DuckDB
First, we need to install DuckDB for Python.
pip install duckdb
The following DuckDB query reads every row from the input Parquet file. Note that the file path must be wrapped in single quotes.
import duckdb
duckdb.sql("""SELECT * FROM 'path/to/file.parquet'""")
Sometimes we have a large file and it's impractical to read the whole thing. We can read just the first 5 rows using the LIMIT clause.
duckdb.sql("""SELECT * FROM 'path/to/file.parquet' LIMIT 5""")
We can sort rows using the ORDER BY clause.
duckdb.sql("""SELECT * FROM 'path/to/file.parquet' ORDER BY column_name ASC LIMIT 5""")
Just replace column_name with the column you want to sort by. Column names are written without quotes, or in double quotes if they contain spaces; single quotes are only for strings such as the file path. Use ASC to sort ascending or DESC to sort descending.
We can also filter using SQL comparison operators and the WHERE clause.
duckdb.sql("""SELECT * FROM 'path/to/file.parquet' WHERE column_name > 5 LIMIT 5""")
You can change the 'column_name' to change the column you want to filter by. The operator (>) and value (5) control how the filtering is applied to 'column_name'.
MT cars
Motor Trend Car Road Tests dataset.
filename: mtcars.parquet
rows: 32
Flights 1m
1 million flights including arrival and departure delays.
filename: flights-1m.parquet
rows: 1000000
Iris
Iris plant species dataset.
filename: iris.parquet
rows: 50
House price
Housing price dataset.
filename: house-price.parquet
rows: 545
Weather
Weather dataset with temperature, rainfall, sunshine and wind measurements.
filename: weather.parquet
rows: 366