Sort Parquet files online

Sort your Parquet files by any column. Upload, sort, and download your data in seconds.

Click anywhere to select a file or drag and drop a file here
Accepts Parquet file (.parquet)

Trusted by over 40,000 users every month

Parquet Sort Features

Simple Column Sorting
Sort your data by any column in ascending or descending order
Lightning-Fast Performance
Sort even large files instantly with optimized processing
SQL-Powered Advanced Sorting
Use SQL queries for complex multi-column sorting needs
Combined Filtering
Apply filters before sorting to organize specific data subsets
AI-Powered Sorting
Describe your sorting needs in plain English for complex scenarios
Export Sorted Data
Save your sorted results in various formats

How to sort Parquet files

  1. Upload your Parquet file using the upload button
  2. View your data in the interactive viewer
  3. Click on column headers to sort by that column
  4. Toggle between ascending and descending order
  5. Download the sorted Parquet file

How to view and sort Parquet files in Python

We can view and sort Parquet files in Python using pandas or DuckDB.

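How to view and sort Parquet files using pandas

First, we need to install pandas along with a Parquet engine. pandas relies on pyarrow or fastparquet to read Parquet files.

pip install pandas pyarrow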

Then we can load the parquet file into a dataframe.

import pandas as pd
df = pd.read_parquet('path/to/file.parquet')

We can view the first few rows of the dataframe using the head method.

print(df.head(n=5))

The n parameter controls how many rows are returned. Increase it to show more rows.

We can view the last few rows of the dataframe using the tail method.

print(df.tail(n=5))

We can sort the dataframe using the sort_values method.

df = df.sort_values('column_name', ascending=True)

Replace 'column_name' with the name of the column you want to sort by. The ascending parameter controls the direction: True sorts in ascending order and False sorts in descending order.
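
Sorting by more than one column follows the same pattern: sort_values also accepts a list of column names and a matching list of directions. The sketch below uses a small in-memory dataframe in place of a loaded Parquet file; the column names city and sales are hypothetical.

```python
import pandas as pd

# Hypothetical sample data standing in for a dataframe loaded from Parquet.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Oslo", "Lima"],
        "sales": [10, 30, 20, 5],
    }
)

# Sort by city A-Z, then by sales from highest to lowest within each city.
sorted_df = df.sort_values(["city", "sales"], ascending=[True, False])

# reset_index(drop=True) renumbers the rows 0..n-1 in their new order.
sorted_df = sorted_df.reset_index(drop=True)
print(sorted_df)
```

Each entry in the ascending list applies to the column at the same position, so the two lists must have the same length.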

We can filter the dataframe using comparison operators. The following statement will filter a dataframe to rows where the value of the 'column_name' column is greater than 5.

df = df[df['column_name'] > 5]
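
Filtering and sorting are often combined: filter first so that only the rows of interest are sorted. The following is a minimal sketch of that idea; the helper name filter_then_sort and the sample values are assumptions made for illustration.

```python
import pandas as pd


def filter_then_sort(df, column, minimum):
    """Keep rows where `column` is greater than `minimum`, then sort ascending."""
    kept = df[df[column] > minimum]
    return kept.sort_values(column).reset_index(drop=True)


# Hypothetical sample data standing in for a dataframe loaded from Parquet.
df = pd.DataFrame({"column_name": [7, 2, 9, 5, 6]})

result = filter_then_sort(df, "column_name", 5)
print(result)
```

The result could then be written back to disk with result.to_parquet('path/to/sorted.parquet'), which needs a Parquet engine such as pyarrow to be installed.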

How to view and sort Parquet files using DuckDB

First, we need to install DuckDB for Python.

pip install duckdb

The following DuckDB query selects every row of the input Parquet file. The file path must be wrapped in single quotes inside the query.

import duckdb
duckdb.sql("""SELECT * FROM 'path/to/file.parquet'""")

Sometimes we have a large file and it's impractical to read the whole thing. We can read just the first 5 rows using a LIMIT clause.

duckdb.sql("""SELECT * FROM 'path/to/file.parquet' LIMIT 5""")

We can sort rows using the ORDER BY clause.

duckdb.sql("""SELECT * FROM 'path/to/file.parquet' ORDER BY "column_name" ASC LIMIT 5""")

Replace column_name with the column you want to sort by. Use ASC to sort ascending or DESC to sort descending. Wrap the column name in double quotes (or leave it unquoted), not single quotes: single quotes create a string literal, so the rows would not be sorted by the column.

We can also filter using the WHERE clause and SQL comparison operators.

duckdb.sql("""SELECT * FROM 'path/to/file.parquet' WHERE "column_name" > 5 LIMIT 5""")

Replace column_name with the column you want to filter by. The operator (>) and value (5) control how the filtering is applied to that column.