Parquet viewer

View, filter and sort Parquet data in seconds

Trusted by over 10,000 every month

View and filter Parquet files online

This Parquet viewer enables you to upload and view Parquet files online. It is optimized for big data handling and provides a simple interface for browsing large-scale datasets.

This tool supports massive Parquet files and ensures smooth performance without the need for complex software installations.

Upload your Parquet file and access your data instantly.

Easily view large-scale Parquet files online in seconds.

Optimized for big data handling with fast processing.

No need for complex software installations.

Simple and intuitive interface for data browsing.

Supports massive datasets with smooth performance.

Perfect for professionals working with Parquet files.

Parquet format

Apache Parquet (.parquet) is a columnar format designed for storing tabular data on disk. Its design is based on the format described in Google's Dremel paper (Dremel later became the engine behind BigQuery).

Parquet files store data in a binary format, which means that they can be efficiently read by computers but are difficult for people to read.
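To see the binary layout for ourselves, we can check for the 4-byte marker PAR1 that the Parquet specification places at the start and end of a file. The sketch below is only a quick sanity check, not a full validation. The function name looks_like_parquet is made up for this example, and files written with an encrypted footer use a different marker.

```python
import os

PARQUET_MAGIC = b"PAR1"  # marker at the start and end of a plain Parquet file


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes."""
    # The smallest plausible file holds the leading magic, a 4-byte footer
    # length and the trailing magic, so anything under 12 bytes is rejected.
    if os.path.getsize(path) < 12:
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)  # jump to the last 4 bytes of the file
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

Calling looks_like_parquet('path/to/file.parquet') returns True or False without reading the data itself.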

Parquet files have a schema, which means that every value in a column must have the same type. The schema makes Parquet files easier to analyse than CSV files and also helps them compress better, so they are smaller on disk.

How to view and filter Parquet files online

  1. Upload your Parquet file
  2. Your file will be loaded and then you can view your Parquet data
  3. Sort data by clicking a column name
  4. Filter a column by clicking the three dots
  5. Export your Parquet file in CSV or Excel format by clicking the export button

How to view and filter Parquet files in Python with Pandas

First, we need to install pandas, along with pyarrow, which pandas can use as its engine for reading Parquet files.

pip install pandas pyarrow

Then we can load the Parquet file into a dataframe.

import pandas as pd

df = pd.read_parquet('path/to/file.parquet')

We can view the first few rows of the dataframe using the head method.

print(df.head(n=5))

The n parameter controls how many rows are returned. Increase it to show more rows.

We can view the last few rows of the dataframe using the tail method.

print(df.tail(n=5))

We can sort the dataframe using the sort_values method.

df = df.sort_values('column_name', ascending=True)

Just replace 'column_name' with the name of the column you want to sort by. The ascending parameter controls the sort order: use ascending=True for ascending order or ascending=False for descending order.
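sort_values also accepts a list of columns, with one ascending flag per column. The example below uses a small made-up dataframe (the carrier and delay columns are purely illustrative) to show a sort by one column ascending and another descending.

```python
import pandas as pd

# Hypothetical example data; replace with your own dataframe.
df = pd.DataFrame(
    {"carrier": ["AA", "UA", "AA", "UA"], "delay": [5, 30, 12, 7]}
)

# Sort by carrier from A to Z, then by delay from largest to smallest
# within each carrier.
sorted_df = df.sort_values(["carrier", "delay"], ascending=[True, False])
print(sorted_df)
```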

We can filter the dataframe using comparison operators. The following statement will filter a dataframe to rows where the value of the 'column_name' column is greater than 5.

df = df[df['column_name'] > 5]
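Several conditions can be combined in one filter. The sketch below uses a small made-up dataframe (the species and petal_length columns are illustrative only) to show an "and" filter and a membership filter.

```python
import pandas as pd

# Hypothetical example data; replace with your own dataframe.
df = pd.DataFrame(
    {
        "species": ["setosa", "virginica", "setosa", "versicolor"],
        "petal_length": [1.4, 6.0, 1.7, 4.5],
    }
)

# Wrap each condition in parentheses; use & for "and" and | for "or".
long_setosa = df[(df["species"] == "setosa") & (df["petal_length"] > 1.5)]

# isin keeps the rows whose value appears in the given list.
selected = df[df["species"].isin(["virginica", "versicolor"])]
```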

How to view and filter Parquet files in Python with DuckDB

First, we need to install duckdb for Python

pip install duckdb

The following DuckDB query reads the input Parquet file. Note that the file path must be wrapped in single quotes.

import duckdb

duckdb.sql("""SELECT * FROM 'path/to/file.parquet'""")

Sometimes we have a large file and it's impractical to read the whole thing. We can read just 5 rows using the following.

duckdb.sql("""SELECT * FROM 'path/to/file.parquet' LIMIT 5""")

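We can sort rows using the ORDER BY clause.

duckdb.sql("""SELECT * FROM 'path/to/file.parquet' ORDER BY column_name ASC LIMIT 5""")

Just replace column_name with the name of the column you want to sort by. Do not wrap it in single quotes, because SQL treats single-quoted text as a string value rather than a column; use double quotes if the name contains spaces. Use ASC to sort ascending or DESC to sort descending.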

We can also filter using the WHERE clause and SQL comparison operators.

duckdb.sql("""SELECT * FROM 'path/to/file.parquet' WHERE column_name > 5 LIMIT 5""")

Replace column_name with the name of the column you want to filter by. The operator (>) and value (5) control how the filter is applied to that column.

MT cars

Motor Trend Car Road Tests dataset.

filename: mtcars.parquet

rows: 32

Flights 1m

1 million flights, including arrival and departure delays.

filename: flights-1m.parquet

rows: 1000000

Iris

Iris plant species dataset.

filename: iris.parquet

rows: 50

House price

Housing price dataset.

filename: house-price.parquet

rows: 545

Weather

Weather dataset with temperature, rainfall, sunshine and wind measurements.

filename: weather.parquet

rows: 366