Sort Parquet files online
Sort your Parquet files by any column. Upload, sort, and download your data in seconds.
Trusted by over 40,000 users every month
Parquet Sort Features
How to sort Parquet files
- Upload your Parquet file using the upload button
- View your data in the interactive viewer
- Click on column headers to sort by that column
- Toggle between ascending and descending order
- Download the sorted Parquet file
How to view and sort Parquet files in Python
We can view and sort Parquet files in Python using pandas or DuckDB.
How to view and sort Parquet files using pandas
First, we need to install pandas, along with a Parquet engine such as pyarrow.
Then we can load the parquet file into a dataframe.
We can view the first few rows of the dataframe using the head method.
The n parameter controls how many rows are returned. Increase it to show more rows.
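As an illustration of `head`, the sketch below uses a small in-memory dataframe as a stand-in for one loaded from a Parquet file; the column names are made up for the example.

```python
import pandas as pd

# Hypothetical stand-in for a dataframe loaded with pd.read_parquet.
df = pd.DataFrame({"id": range(10), "value": [i * i for i in range(10)]})

first_rows = df.head()       # default: the first 5 rows
first_three = df.head(n=3)   # only the first 3 rows
print(first_three)
```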
We can view the last few rows of the dataframe using the tail method.
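A matching sketch for `tail`, again with a hypothetical in-memory dataframe standing in for one loaded from a Parquet file.

```python
import pandas as pd

# Hypothetical stand-in for a dataframe loaded with pd.read_parquet.
df = pd.DataFrame({"id": range(10), "value": [i * i for i in range(10)]})

last_rows = df.tail()       # default: the last 5 rows
last_three = df.tail(n=3)   # only the last 3 rows
print(last_three)
```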
We can sort the dataframe using the sort_values method.
Replace 'column_name' with the name of the column you want to sort by. The 'ascending' parameter controls the sort order: True sorts in ascending order and False sorts in descending order.
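The sorting step might look like the sketch below. The dataframe contents are invented for illustration, and 'column_name' is the placeholder used in the text.

```python
import pandas as pd

# Hypothetical stand-in for a dataframe loaded with pd.read_parquet.
df = pd.DataFrame({"column_name": [3, 1, 2], "label": ["c", "a", "b"]})

# sort_values returns a new dataframe; the original is left unchanged.
sorted_asc = df.sort_values(by="column_name", ascending=True)
sorted_desc = df.sort_values(by="column_name", ascending=False)
print(sorted_desc)
```

Because `sort_values` does not modify `df` in place by default, assign the result to a variable if you want to keep the sorted order.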
We can filter the dataframe using comparison operators. The following statement will filter a dataframe to rows where the value of the 'column_name' column is greater than 5.
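A small sketch of that filter, using made-up values in a hypothetical dataframe.

```python
import pandas as pd

# Hypothetical stand-in for a dataframe loaded with pd.read_parquet.
df = pd.DataFrame({"column_name": [2, 6, 5, 9]})

# The comparison builds a boolean mask; indexing with it keeps matching rows.
filtered = df[df["column_name"] > 5]
print(filtered)
```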
How to view and sort Parquet files using DuckDB
First, we need to install duckdb for Python.
The following duckdb query will create a view from the input parquet file.
Sometimes we have a large file and it's impractical to read the whole file. We can read just the first 5 rows using a LIMIT clause.
We can sort rows using the ORDER BY clause.
Replace 'column_name' with the column you want to sort by. Use ASC to sort ascending or DESC to sort descending.
We can also filter using SQL comparison operators and the WHERE clause.
Change 'column_name' to the column you want to filter by. The operator (>) and value (5) control how the filtering is applied to 'column_name'.