Reading Parquet Files

What Are Parquet Files?

Parquet is a compressed, columnar file format designed for large datasets. Delivr.ai audience downloads use Parquet because it handles millions of rows efficiently. Unlike CSV files, Parquet files are binary, so most spreadsheet tools cannot open them without a conversion step.

Option 1: Convert to CSV (No Coding Required)

Using an Online Converter

Go to parquet-viewer.com or search "parquet to csv online"
Upload your .parquet file
Download the converted CSV
Open the CSV in Excel or Google Sheets

Using DuckDB (Free Command-Line Tool)

DuckDB is a free command-line tool that can convert Parquet to CSV with a single command.

Download DuckDB from duckdb.org/docs/installation
Open a terminal (Mac: Terminal app, Windows: Command Prompt) and change to the folder that contains your .parquet file
Run:

duckdb -c "COPY (SELECT * FROM 'your_file.parquet') TO 'output.csv' (HEADER)"

Open output.csv in Excel or Google Sheets

To preview the data first:

duckdb -c "SELECT * FROM 'your_file.parquet' LIMIT 10"

To see just the column names:

duckdb -c "DESCRIBE SELECT * FROM 'your_file.parquet'"

Option 2: Open in Python

If you're already using Python to call the API, pandas can read Parquet files directly.

Install

pip install pandas pyarrow

Convert to CSV

import pandas as pd

df = pd.read_parquet("your_file.parquet")
df.to_csv("output.csv", index=False)
print(f"Converted {len(df)} rows to output.csv")

Preview in Terminal

import pandas as pd

df = pd.read_parquet("your_file.parquet")
print(f"{len(df)} rows, {len(df.columns)} columns")
print(df.head(10))

Combine Multiple Files

Audience downloads may contain multiple Parquet files across partitions. To combine them into one CSV:

import glob
import pandas as pd

# Sort the matches so files are combined in a predictable order
files = sorted(glob.glob("*.parquet"))
dfs = [pd.read_parquet(f) for f in files]
combined = pd.concat(dfs, ignore_index=True)
combined.to_csv("all_audience_data.csv", index=False)
print(f"Combined {len(files)} files into {len(combined)} rows")

Option 3: Query Without Converting

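DuckDB can query Parquet files directly using SQL, without converting to CSV first. This is useful for large files that would be slow to open in Excel or that exceed its limit of 1,048,576 rows per sheet. The multi-line commands below run as written in Mac and Linux terminals; in Windows Command Prompt, enter each command on a single line.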

# Count rows
duckdb -c "SELECT COUNT(*) FROM 'your_file.parquet'"

# Filter to high-intent contacts at large companies
duckdb -c "SELECT first_name, last_name, company_name, job_title, score
            FROM 'your_file.parquet'
            WHERE score = 'high'
            AND company_employee_count_range IN ('1001 to 5000', '5001 to 10000', '10000+')
            LIMIT 20"

# Export filtered results to CSV
duckdb -c "COPY (
    SELECT first_name, last_name, current_business_email, company_name, job_title, score
    FROM 'your_file.parquet'
    WHERE score = 'high'
) TO 'high_intent_contacts.csv' (HEADER)"

Common Fields in Audience Files

Field	Example
`first_name`	jane
`last_name`	smith
`current_business_email`	jane.smith@acmecorp.com
`job_title`	product analyst
`company_name`	acme corp
`company_domain`	acmecorp.com
`company_industry`	insurance
`company_employee_count_range`	1001 to 5000
`score`	high
`topic_name`	Cloud Computing
`linkedin_url`	https://linkedin.com/in/jane-smith

The full file contains ~88 fields per row. See the Audiences API schema endpoint for the complete list.

Troubleshooting

Issue	Solution
Excel says "file format not recognized"	Parquet can't be opened directly in Excel. Convert to CSV first.
Online converter says file is too large	Use DuckDB or Python instead -- they handle large files.
CSV opens but columns are misaligned	Open Excel, use File > Import > CSV, and set delimiter to comma.
Some columns are empty	Fields that don't have data for a contact will be blank. This is normal.
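To see how sparse each field is, the share of blank values per column can be computed in pandas. This is an illustrative sketch only; blank_rate_by_column is a hypothetical helper, and it treats nulls and empty or whitespace-only strings as blank.

```python
import pandas as pd


def blank_rate_by_column(df):
    """Return the share of blank values per column, highest first.

    Hypothetical helper: "blank" means null, or for text columns an
    empty or whitespace-only string.
    """

    def blank_share(column):
        blank = column.isna()
        is_text = pd.api.types.is_object_dtype(column) or pd.api.types.is_string_dtype(column)
        if is_text:
            # Also count empty or whitespace-only strings as blank.
            text = column.astype("string").str.strip()
            blank = blank | text.eq("").fillna(False)
        return float(blank.mean())

    return df.apply(blank_share).sort_values(ascending=False)
```

For example, blank_rate_by_column(pd.read_parquet("your_file.parquet")).head(10) lists the ten emptiest fields.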

Next Steps

Create an Intent Audience -- How to create and download audience files
Intent Audiences API -- Full API reference