We've seen the INSERT statement in earlier courses so we know by now that it is used to add new rows of data to a table.
In this course, we'll dive a little deeper into the nuances of INSERT. For example, what happens if you attempt to add a value that violates a constraint? We'll also try to place INSERT within the broader context of application processing.
INSERT is an example of what's referred to as Data Manipulation Language (DML), since it allows you to add data to existing tables. UPDATE and DELETE are other types of DML statements. (CREATE TABLE, on the other hand, is considered Data Definition Language, aka DDL.)
In this scenario, we'll log you in to a database called
workshop where a few
tables have already been created and data has been loaded in. You'll be logged
in as a user with all the necessary privileges to add new data to a table, as
well as query existing tables. We'll also use the command line
to interact with PostgreSQL.
We're structuring this course around the example of a career counseling agency we've called Extra Mile Counseling. If you did the Advanced Data Types course, you'll be familiar with the example data model. Otherwise, if you'd like to get quickly caught up, feel free to run through that course before continuing with this one.
INSERT statements add new rows to data tables. Knowing how to write them to handle conflicts as well as provide return values can prove very useful for application developers.
For more details, refer to the official PostgreSQL documentation on INSERT.
An INSERT isn't the only way to get data into the database - sometimes you need to load existing data stored in a file. Our next scenario will cover how to import data from a file into Postgres.
Enjoy learning about PostgreSQL? Sign up for our newsletter and get the latest tips from us each month.
Add Data with the INSERT statement
First, we'll take a quick look at our tables that have been populated with data about Extra Mile's clients as well as events:
SELECT * FROM client; SELECT * FROM event;
Let's review the basic syntax for adding a single row using INSERT:
INSERT INTO event (name, event_dt, mode) VALUES ('Test Event', tstzrange('2019-12-14 12:00:00 EST', '2019-12-14 17:00:00 EST', '[)'), 'Virtual' );
The basic syntax includes:
- The name of the table you are inserting data into
- The names of columns to which you're adding data (in parentheses following the table name)
- The values that should go in each respective column (in parentheses following the VALUES keyword)
When there are columns in the table which are not populated as part of the
INSERT command, Postgres will fill those columns using the default value for
the column if there is one (such as in the case of serial and identity columns),
or it will insert a
NULL (which means that there is no value for the column).
We'll see this in action later in this step. Generally speaking, when there is
a serial or identity column you should not include them in the INSERT statement
and instead allow Postgres to populate them.
(Our Create Table course also discusses what happens if you include an insert value for serial and identity columns. Don't do it, folks!)
When specifying the column names, it doesn't have to go in the same order as in the table, so this works the exact same way as the earlier statement:
INSERT INTO event (event_dt, name, mode) VALUES (tstzrange('2019-12-14 12:00:00 EST', '2019-12-14 17:00:00 EST', '[)'), 'Test Event', 'Virtual' );
The order of the values must always match the order of the column names.
What happens when a column is omitted?
As mentioned above, it depends on whether a default value has been set for that column.
In this scenario, we also have an attendance table (that we did not include as an example in the previous scenario, Advanced Data Types):
You'll see that this table has a relationship to both the client and event tables. This table tracks what events were attended by which clients, so the foreign key references indicate the record on the client table as well as the record on the event table.
You'll also see that the
attend_status column has been set with a default. If
we add a new row to the table and not specify an
INSERT INTO attendance (client_id, event_id) VALUES (3, 5) ; SELECT * FROM attendance;
The default value
'Registered' is added for the new row.
If no default value is explicitly defined for the column, it defaults to
What about when no list of column names is provided?
In Postgres, it is possible to skip the column list entirely (but all columns must be filled), like so :
INSERT INTO attendance VALUES (5, 1, DEFAULT); SELECT * FROM attendance;
That said, it is still good practice to explicitly specify the column names. Someone else could add new columns to the table, so specifying column names helps prevent unexpected behavior, like values being added to the wrong column. This practice also keeps your database queries clear.