psql COPY: The Three Forms a Real Pipeline Actually Uses

Sean

Platform Writer

Jun 10, 2026
6 min read

psql COPY is the meta-command and SQL statement that move bulk data in and out of Postgres: \\copy is the psql client-side command, COPY is the SQL statement the server executes, and COPY ... PROGRAM is the server-side command that runs an external program. The three are not interchangeable, and the three are the answer to most “how do I load a million rows” questions. The interesting part is the difference between \\copy and COPY (the file location), the seven flags (delimiter, header, null, format, encoding, quote, escape), and the gotchas (permissions, encoding, transaction wrapping, free-space-map bloat) that quietly break a bulk load that was supposed to be fast.

The reason “psql copy command” is its own question and not just “import csv to postgres” is that the COPY command is the lever for the fastest possible bulk load, and the COPY command is the one most teams reach for when the data set is too big for INSERT. The command has two forms (COPY and \\copy) that look similar but behave differently, and the form is the part the developer has to get right.

psql COPY: the three forms a real pipeline actually uses

Table of contents

The short version

The COPY SQL statement moves data between a table and a file (or a program), and the \\copy psql meta-command moves data between a table and a file (or stdin/stdout) on the client side. The two are similar, but the two have a critical difference: COPY reads/writes the file on the server, and \\copy reads/writes the file on the client. The difference is the part the developer has to know to choose the right form. The third form, COPY ... PROGRAM, runs an external program on the server and reads/writes the program’s stdout/stdin. The three forms cover 99% of real use cases.

The three forms a real pipeline actually uses

A short, opinionated list of the three forms. The three are not interchangeable, and the developer should know which one they want before they write the command.

COPY table FROM 'file.csv' (SQL form). The server reads the file. The file must be on the server’s filesystem, and the Postgres user must have read permission on the file. The form is the right answer for a server-side bulk load (a backup restore, a data import from a server-mounted volume, a pipeline that writes the CSV to a known location on the server). The form is the fastest of the three, because there is no client-server round trip for the data.

COPY products FROM '/var/lib/postgres/imports/products.csv'
  WITH (FORMAT csv, HEADER true);

\\copy table FROM 'file.csv' (psql meta-command). The client reads the file. The file must be on the client’s filesystem, and the file is streamed over the connection to the server. The form is the right answer for a developer running psql on their laptop, for a script that has the file locally, or for a CI job that is connecting to a remote database. The form is slightly slower than the SQL form (because of the network round trip), but the form is the one most developers use because the file is usually on the developer’s machine.

\copy products FROM '/home/me/products.csv' WITH (FORMAT csv, HEADER true)

COPY table FROM PROGRAM 'gunzip -c /path/to/file.csv.gz' (PROGRAM form). The server runs the program and reads its stdout. The form is the right answer for a server-side import of a compressed file, a file fetched from a URL, or a file generated by another tool. The form requires the pg_read_server_files or pg_execute_server_program privilege (in modern Postgres) or superuser (in older versions), and the form is the one the developer should reach for when the file is too big to fit on the server and the file is too compressed to send over the network.

COPY products FROM PROGRAM 'gunzip -c /var/lib/postgres/imports/products.csv.gz'
  WITH (FORMAT csv, HEADER true);

The three forms are the floor. The developer can also use COPY ... TO STDOUT and COPY ... FROM STDIN to pipe data through stdin/stdout, which is the form a Python or Node script uses to stream data into Postgres. The three are the ones the developer should learn first.

The seven flags that decide what the command does

The COPY and \\copy commands accept a set of flags that control the format of the data. The seven flags are the ones the developer will see most often, and the seven are the floor for any real bulk load.

FORMAT. The data format. The values are csv (comma-separated values, the most common), text (Postgres’s native text format, with tab delimiters and backslash escapes), and binary (Postgres’s native binary format, the fastest but not human-readable). The default is text for COPY and csv for some clients. The csv format is the right answer for almost every real use case.

HEADER. A boolean that says whether the first line of the file is a header row. The flag only applies to csv format. The flag is the right answer for any file that has a header row, and the flag is the right answer for any file the developer does not want to manually skip the header on.

DELIMITER. The field delimiter. The default for csv is ,. The default for text is a tab. The flag is the right answer for files that use a different delimiter (;, |, \t), and the flag is the right answer for files that are TSV (tab-separated values).

NULL. The string that represents a null value. The default for csv is an empty string (an unquoted empty field). The default for text is \N. The flag is the right answer for files that use a different null representation (NULL, \\N, null).

QUOTE. The character used to quote fields. The default for csv is ". The flag is the right answer for files that use a different quote character, and the flag is the right answer for files that contain the default quote character in field values.

ESCAPE. The character used to escape special characters. The default for csv is " (a doubled quote escapes a quote). The flag is the right answer for files that use a different escape character, and the flag is the right answer for files that contain the default escape character in field values.

ENCODING. The character encoding of the file. The default is the database’s encoding (usually UTF8). The flag is the right answer for files that are in a different encoding (Latin1, Windows-1252, etc.), and the flag is the right answer for files that come from a system that does not use UTF8.

The seven flags are the floor. There is also FORCE_NOT_NULL (which forces a column to be not-null even if the value is the null string), FORCE_NULL (which forces a column to be null if the value is the null string), and FREEZE (which is the equivalent of VACUUM FREEZE on the loaded rows — only safe for tables that are being loaded for the first time). The seven are the ones the developer should learn first.

The five places COPY shows up in a real pipeline

A short, opinionated list of the places a real team uses COPY. The places are not the only ones, but they are the ones the developer will see most often.

The “loading a CSV from a customer’s export” place. A customer sends a CSV export of their data, the team loads it into Postgres, and the team does the import in a single COPY command. The pattern is the right answer for any data onboarding workflow, and the pattern is the lever that turns a multi-hour import into a multi-minute import.

The “dumping a table for a backup” place. The team needs to dump a table to a file, and the team uses COPY table TO '/path/to/file.csv'. The pattern is the right answer for a one-off backup, and the pattern is the right answer for a script that needs to snapshot a table for testing.

The “loading JSON Lines into a JSONB column” place. The team has a JSON Lines file (one JSON object per line) and needs to load it into a JSONB column. The pattern is to use COPY table (jsonb_column) FROM '/path/to/file.jsonl' with FORMAT text (because each line is a valid JSON value), and then INSERT INTO target SELECT * FROM staging to convert. The pattern is the right answer for a team that needs to load a large JSON dataset.

The “dumping a query to a file for downstream processing” place. The team has a query that produces a CSV, and the team needs the CSV on disk for downstream processing (a Spark job, a data warehouse load, a partner’s API). The pattern is COPY (SELECT * FROM products WHERE created_at > '2026-01-01') TO '/path/to/file.csv' WITH (FORMAT csv, HEADER true). The pattern is the right answer for any data export workflow.

The “loading a million rows in a CI test” place. The team needs to load a million rows into a test database for a performance test, and the team uses \\copy from a generated CSV. The pattern is the right answer for any test that needs a realistic data set, and the pattern is the lever that turns a 30-minute test setup into a 30-second test setup.

The five places are the floor. There are also patterns for loading a parquet file (convert to CSV first, or use a tool like pg_parquet), for loading a gzip-compressed file (use PROGRAM with gunzip -c), and for loading a streaming source (use COPY ... FROM STDIN and pipe from a process). The five are the ones the developer should learn first.

The seven gotchas that quietly break a bulk load

A short, opinionated list of gotchas that have actually broken real bulk loads. None of them are dramatic. They are the boring ones.

The file is on the wrong machine. A developer who writes COPY products FROM '/home/me/products.csv' and gets ERROR: could not open file ... for reading: No such file or directory is a developer whose file is on the client, not the server. The fix is to use \\copy (psql meta-command) instead of COPY (SQL statement). The two look similar, but the two read from different places.

The encoding does not match. A developer who loads a Latin1-encoded file into a UTF8 database gets either a successful load with mangled characters, or a ERROR: invalid byte sequence for encoding "UTF8". The fix is to set the ENCODING flag on the COPY command, or to convert the file to UTF8 before the load.

The CSV uses CRLF line endings. A developer who loads a CSV that was saved on Windows gets an extra \r at the end of each line, which becomes part of the last column’s value. The fix is to convert the file to LF line endings before the load, or to use a csv parser that strips the \r (most modern parsers do, but Postgres’s COPY does not).

The column count does not match. A COPY command that has 10 columns in the table but 9 columns in the CSV gets a ERROR: extra data after last expected column. The fix is to specify the column list in the COPY command: COPY products (name, price, sku) FROM 'file.csv'. The list is the right answer for any CSV that does not have all the columns, and the list is the right answer for any table that has a generated column or a serial primary key.

The data has a NULL represented as the empty string. A developer who has an empty string in the CSV and the column is nullable gets an empty string, not a NULL. The fix is to use the NULL '' flag to tell COPY that the empty string is a null. The pattern is the right answer for any CSV that uses the empty string for nulls, and the pattern is the right answer for any table where the empty string and null are different.

The load triggers a constraint check on every row. A COPY command that loads a million rows into a table with a foreign key, a unique index, and a check constraint is going to be slow, because Postgres checks every constraint on every row. The fix is to drop the indexes and constraints before the load, load the data, and recreate the indexes and constraints. The pattern is the right answer for any large bulk load, and the pattern is the lever that turns a 30-minute load into a 3-minute load.

The free space map is not updated. A COPY command that loads a lot of rows into a table does not update the free space map, which means the table is going to be bloated until the next VACUUM. The fix is to run VACUUM (ANALYZE) on the table after the load. The pattern is the right answer for any large bulk load, and the pattern is the lever that keeps the table from becoming a performance problem.

The way COPY compares to alternatives

The COPY command is the fastest way to load data into Postgres, but it is not the only way. The developer should know the alternatives, and the developer should know when COPY is the right answer and when an alternative is the right answer.

COPY vs. INSERT with multi-row VALUES. A COPY command is on the order of 10x faster than an INSERT ... VALUES (...), (...), (...) with the same data, because COPY is a single statement with a single parse-and-plan cycle, and INSERT is many statements. The INSERT form is the right answer for a developer who needs to load a small number of rows in a script, and the INSERT form is the right answer for a developer who needs to generate the SQL dynamically.

COPY vs. \\COPY in psql. The two are nearly identical, with the only difference being the file location (server vs. client). The COPY form is faster for large files (no network round trip), and the \\copy form is more convenient for a developer on their laptop. The form is the right answer for the developer’s workflow, not a single global answer.

COPY vs. \copy from stdin. A COPY ... FROM STDIN is a server-side import that reads from the connection’s stdin. The pattern is the right answer for a script that is generating the data on the client (a Python loop, a Node script) and streaming it to the server. The pattern is the lever that turns a million-row script into a streaming import.

COPY vs. pg_dump / pg_restore. A COPY command is the right answer for a CSV or text file. A pg_dump / pg_restore is the right answer for a Postgres-native dump (a full database, a full schema, a table with all its indexes and constraints). The two are not interchangeable, and the developer should use the right tool for the format.

COPY vs. COPY ... PROGRAM. The PROGRAM form is the right answer for a server-side import of a compressed or generated file. The form is faster than piping the file over the network, and the form is the right answer for a file that is too big to send over the network.

The five comparisons are the floor. There are also tools like pgloader (for migrating from MySQL), pg_import (for cloud-native imports), and pg_partman (for partitioning large tables). The five are the ones the developer should know first.

How this fits the rest of the stack

A COPY command rarely lives in isolation. The command is usually part of a pipeline (a backup, a data import, a data export) that runs on a server, in a CI job, or in a scheduled task. The platform that handles the pipeline should make the rest of the stack feel like part of the same conversation.

The services layer is the part of the platform that runs the long-lived API the loaded data serves. The database layer is the part that holds the data the COPY is loading. The static layer is the part that hosts the dashboard the developer uses to see the load’s status. The environment variables are the part that holds the connection string the COPY runs over.

A COPY command on a platform where the service, the database, the storage, and the secrets are all in the same place is a command the team is going to be able to debug. A COPY command on a platform where each piece is in a different console is a command the team is going to spend the first hour just opening the right tab.

For a team that wants to see the full cost of the project before it commits, the RunxBuild hosting calculator shows the line items together. The API, the database, the storage, the worker, the bandwidth — each one is a separate number, and the team’s mental model for the platform is the sum of those numbers.

FAQ

What is the psql COPY command?

\\copy is the psql meta-command that moves data between a table and a file (or stdin/stdout) on the client side. COPY is the SQL statement that moves data between a table and a file (or program) on the server side. The two look similar, but the two read from different places. \\copy is the right answer for a developer on their laptop, and COPY is the right answer for a server-side load.

How do I import a CSV into Postgres with psql?

Use \\copy table FROM 'file.csv' WITH (FORMAT csv, HEADER true). The command runs in psql, reads the file on the client, and streams the data to the server. The pattern is the right answer for a developer who has a CSV on their laptop and wants to load it into a remote database.

How do I export a table to CSV with psql?

Use \\copy table TO 'file.csv' WITH (FORMAT csv, HEADER true). The command runs in psql, reads the data from the server, and writes the file on the client. The pattern is the right answer for a developer who wants to export a table to a CSV file on their laptop.

What is the difference between COPY and \copy?

COPY is the SQL statement that runs on the server and reads/writes files on the server’s filesystem. \\copy is the psql meta-command that runs on the client and reads/writes files on the client’s filesystem. The \\copy form is slightly slower (because of the network round trip), but the \\copy form is the right answer for a developer who has the file locally and the database is remote.

How do I load a compressed CSV into Postgres?

Use COPY table FROM PROGRAM 'gunzip -c /path/to/file.csv.gz' WITH (FORMAT csv, HEADER true). The command runs on the server, executes the gunzip command, and reads the decompressed data. The form requires the pg_execute_server_program privilege, and the form is the right answer for a file that is too large to send over the network uncompressed.

How fast is COPY compared to INSERT?

COPY is on the order of 10x faster than INSERT with multi-row VALUES for the same data, because COPY is a single statement with a single parse-and-plan cycle. The exact speedup depends on the data, the indexes, the constraints, and the server’s hardware, but the speedup is real and consistent.

Can COPY trigger a trigger or a generated column?

Yes. COPY fires the standard BEFORE and AFTER triggers for the operation (insert, update), and COPY respects generated columns. The behavior is the same as INSERT — the only difference is the speed. The pattern is the right answer for a developer who needs the triggers to fire, and the pattern is the right answer for a developer who needs to load data into a table with a generated column.