Reading Parquet Files
Convert audience downloads to CSV or open in Excel
What Are Parquet Files?
Parquet is a compressed, columnar data format designed for large datasets. Delivr.ai audience downloads use Parquet because it handles millions of rows efficiently. Unlike CSV files, Parquet files can't be opened directly by most spreadsheet tools, so you'll usually need a conversion step first.
Option 1: Convert to CSV (No Coding Required)
Using an Online Converter
- Go to parquet-viewer.com or search "parquet to csv online"
- Upload your .parquet file
- Download the converted CSV
- Open the CSV in Excel or Google Sheets
Using DuckDB (Free Command-Line Tool)
DuckDB is a free tool that can convert Parquet to CSV with one command.
- Download DuckDB from duckdb.org/docs/installation
- Open a terminal (Mac: Terminal app, Windows: Command Prompt)
- Run:
duckdb -c "COPY (SELECT * FROM 'your_file.parquet') TO 'output.csv' (HEADER)"
- Open output.csv in Excel or Google Sheets
To preview the data first:
duckdb -c "SELECT * FROM 'your_file.parquet' LIMIT 10"
To see just the column names:
duckdb -c "DESCRIBE SELECT * FROM 'your_file.parquet'"
Option 2: Open in Python
If you're already using Python to call the API, pandas can read Parquet files directly.
Install
pip install pandas pyarrow
Convert to CSV
import pandas as pd
df = pd.read_parquet("your_file.parquet")
df.to_csv("output.csv", index=False)
print(f"Converted {len(df)} rows to output.csv")
Preview in Terminal
import pandas as pd
df = pd.read_parquet("your_file.parquet")
print(f"{len(df)} rows, {len(df.columns)} columns")
print(df.head(10))
Combine Multiple Files
Audience downloads may contain multiple Parquet files across partitions. To combine them into one CSV:
import glob
import pandas as pd
files = glob.glob("*.parquet")
dfs = [pd.read_parquet(f) for f in files]
combined = pd.concat(dfs, ignore_index=True)
combined.to_csv("all_audience_data.csv", index=False)
print(f"Combined {len(files)} files into {len(combined)} rows")
Option 3: Query Without Converting
DuckDB can query Parquet files directly using SQL, without converting to CSV first. This is useful for large files that would be slow in Excel.
# Count rows
duckdb -c "SELECT COUNT(*) FROM 'your_file.parquet'"
# Filter to high-intent contacts at large companies
duckdb -c "SELECT first_name, last_name, company_name, job_title, score
FROM 'your_file.parquet'
WHERE score = 'high'
AND company_employee_count_range IN ('1001 to 5000', '5000+')
LIMIT 20"
# Export filtered results to CSV
duckdb -c "COPY (
SELECT first_name, last_name, current_business_email, company_name, job_title, score
FROM 'your_file.parquet'
WHERE score = 'high'
) TO 'high_intent_contacts.csv' (HEADER)"
Common Fields in Audience Files
| Field | Example |
|---|---|
| first_name | jane |
| last_name | smith |
| current_business_email | [email protected] |
| job_title | product analyst |
| company_name | acme corp |
| company_domain | acmecorp.com |
| company_industry | insurance |
| company_employee_count_range | 1001 to 5000 |
| score | high |
| topic_name | Cloud Computing |
| linkedin_url | https://linkedin.com/in/jane-smith |
The full file contains ~88 fields per row. See the Audiences API schema endpoint for the complete list.
Troubleshooting
| Issue | Solution |
|---|---|
| Excel says "file format not recognized" | Parquet can't be opened directly in Excel. Convert to CSV first. |
| Online converter says file is too large | Use DuckDB or Python instead -- they handle large files. |
| CSV opens but columns are misaligned | Open Excel, use File > Import > CSV, and set delimiter to comma. |
| Some columns are empty | Fields that don't have data for a contact will be blank. This is normal. |
Next Steps
- Create an Intent Audience -- How to create and download audience files
- Intent Audiences API -- Full API reference