Reading Parquet Files

Convert audience downloads to CSV so you can open them in Excel or Google Sheets

What Are Parquet Files?

Parquet is a compressed, columnar data format designed for large datasets. Delivr.ai audience downloads use Parquet because it handles millions of rows efficiently. Unlike CSV files, Parquet files can't be opened directly in most tools (including Excel), so they need a conversion step first.


Option 1: Convert to CSV (No Coding Required)

Using an Online Converter

  1. Go to parquet-viewer.com or search "parquet to csv online"
  2. Upload your .parquet file
  3. Download the converted CSV
  4. Open the CSV in Excel or Google Sheets

Using DuckDB (Free Command-Line Tool)

DuckDB is a free tool that can convert Parquet to CSV with one command.

  1. Download DuckDB from duckdb.org/docs/installation
  2. Open a terminal (Mac: Terminal app, Windows: Command Prompt)
  3. Run:
duckdb -c "COPY (SELECT * FROM 'your_file.parquet') TO 'output.csv' (HEADER)"
  4. Open output.csv in Excel or Google Sheets

To preview the data first:

duckdb -c "SELECT * FROM 'your_file.parquet' LIMIT 10"

To see just the column names:

duckdb -c "DESCRIBE SELECT * FROM 'your_file.parquet'"

Option 2: Open in Python

If you're already using Python to call the API, pandas can read Parquet files directly.

Install

pip install pandas pyarrow

Convert to CSV

import pandas as pd

df = pd.read_parquet("your_file.parquet")
df.to_csv("output.csv", index=False)
print(f"Converted {len(df)} rows to output.csv")

Preview in Terminal

import pandas as pd

df = pd.read_parquet("your_file.parquet")
print(f"{len(df)} rows, {len(df.columns)} columns")
print(df.head(10))

Combine Multiple Files

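Audience downloads may contain multiple Parquet files across partitions. To combine them into one CSV, run this from the folder that contains the files:

import glob
import pandas as pd

# Sort so the files are combined in a predictable order
files = sorted(glob.glob("*.parquet"))
if not files:
    raise SystemExit("No .parquet files found in the current folder")

dfs = [pd.read_parquet(f) for f in files]
combined = pd.concat(dfs, ignore_index=True)
combined.to_csv("all_audience_data.csv", index=False)
print(f"Combined {len(files)} files into {len(combined)} rows")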

Option 3: Query Without Converting

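DuckDB can query Parquet files directly using SQL, without converting to CSV first. This is useful for large files that would be slow in Excel. The multi-line commands below run as written in a Mac or Linux terminal; in Windows Command Prompt, enter each command on a single line.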

# Count rows
duckdb -c "SELECT COUNT(*) FROM 'your_file.parquet'"

# Filter to high-intent contacts at large companies
duckdb -c "SELECT first_name, last_name, company_name, job_title, score
            FROM 'your_file.parquet'
            WHERE score = 'high'
            AND company_employee_count_range IN ('1001 to 5000', '5000+')
            LIMIT 20"

# Export filtered results to CSV
duckdb -c "COPY (
    SELECT first_name, last_name, current_business_email, company_name, job_title, score
    FROM 'your_file.parquet'
    WHERE score = 'high'
) TO 'high_intent_contacts.csv' (HEADER)"

Common Fields in Audience Files

Field                          Example
first_name                     jane
last_name                      smith
current_business_email         [email protected]
job_title                      product analyst
company_name                   acme corp
company_domain                 acmecorp.com
company_industry               insurance
company_employee_count_range   1001 to 5000
score                          high
topic_name                     Cloud Computing
linkedin_url                   https://linkedin.com/in/jane-smith

The full file contains ~88 fields per row. See the Audiences API schema endpoint for the complete list.


Troubleshooting

Issue                                      Solution
Excel says "file format not recognized"    Parquet can't be opened directly in Excel. Convert to CSV first.
Online converter says file is too large    Use DuckDB or Python instead; they handle large files.
CSV opens but columns are misaligned       Open Excel, use File > Import > CSV, and set delimiter to comma.
Some columns are empty                     Fields that don't have data for a contact will be blank. This is normal.
