Intro to PostgreSQL

Installation

Yum is your friend. Important packages:

Assuming the server is installed, this should get it running:

# service postgresql initdb
# service postgresql start

Test login, remembering that the default authentication is 'ident'. The PostgreSQL superuser is called 'postgres'.

# su - postgres
$ psql template1
Welcome to psql 8.3.3, the PostgreSQL interactive terminal.

The latest version is 8.3. RHEL 5 comes with PostgreSQL 8.1, which is OK. RHEL 4 comes with 7.4, which is nasty old. Never start a new PostgreSQL user with 7.4.

Fortunately an excellent YUM repository is available for RHEL 4 and 5, supporting all recent PostgreSQL versions.

http://yum.pgsqlrpms.org/

Click on "Yum Howto". Couldn't be much easier.

Configuration

Main directory: /var/lib/pgsql/data

Main configuration file is postgresql.conf

Authentication configuration is in pg_hba.conf. Default configuration is "ident", meaning the system user is assumed to be the same as the database user. More on this later.

PG_VERSION contains the database major version number that must be used on this data store.

Connecting and Creating Users and Databases

On a new PostgreSQL install, you need to su - postgres before doing anything with it, because it checks that the database user is the same as the system user. System users do not automatically exist in the database. For example, as user micah:

$ psql
psql: FATAL:  Ident authentication failed for user "micah"

This means the user doesn't exist. Let's create it, as user postgres:

$ createuser micah
Shall the new role be a superuser? (y/n) y

Now, as micah:

$ psql
psql: FATAL:  database "micah" does not exist
$ createdb
$ psql -l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+----------
 postgres  | postgres | UTF8
 template0 | postgres | UTF8
 template1 | postgres | UTF8
 micah     | micah    | UTF8

$ psql
Welcome to psql 8.3.3, the PostgreSQL interactive terminal.

The client utilities assume the DB name is the same as the user name, unless another DB name is presented on the command line. First, the 'micah' database did not exist, but the createdb command created it. Now we can connect.

Often it is a good idea to create users with fewer privileges. Again as the postgres user:

$ createuser bozo
Shall the new role be a superuser? (y/n) n
Shall the new role be allowed to create databases? (y/n) n
Shall the new role be allowed to create more new roles? (y/n) n

Demonstration of Use

Creating tables works as expected. Let's create some to play with, as user micah:

create table names (
  id serial primary key,
  first varchar not null,
  last varchar not null
);

Table names is straightforward, except for the serial type. This is sort of like auto_increment in MySQL but with a difference. Instead of automatically choosing a number higher than the maximum, it uses a sequence. Sequences are separate entities in PostgreSQL and can be independently manipulated. Of course, if you set a sequence to a value such that it will choose a number which already exists in a primary key column, you will hit an error! Let's demonstrate inserts and sequences:

micah=# insert into names (first, last) values ('Lanham', 'Napier');
INSERT 0 1
micah=# select * from names;
 id | first  |  last
----+--------+--------
  1 | Lanham | Napier
(1 row)

micah=# select currval('names_id_seq');
 currval
---------
       1
(1 row)

micah=# insert into names (first, last) values ('Graham', 'Weston');
INSERT 0 1
micah=# select currval('names_id_seq');
 currval
---------
       2
(1 row)

micah=# select setval('names_id_seq', 1);
 setval
--------
      1
(1 row)

micah=# insert into names (first, last) values ('Mickey', 'Mouse');
ERROR:  duplicate key value violates unique constraint "names_pkey"
micah=# select currval('names_id_seq');
 currval
---------
       2
(1 row)

micah=# insert into names (first, last) values ('Mickey', 'Mouse');
INSERT 0 1
micah=# select * from names;
 id | first  |  last
----+--------+--------
  1 | Lanham | Napier
  2 | Graham | Weston
  3 | Mickey | Mouse
(3 rows)

micah=# delete from names where id=3;
DELETE 1
micah=# insert into names (first, last) values ('Frederick', 'Mendler');
INSERT 0 1
micah=# select * from names;
 id |   first   |  last
----+-----------+---------
  1 | Lanham    | Napier
  2 | Graham    | Weston
  4 | Frederick | Mendler
(3 rows)

Let's add another table to play with:

create table kudos (
  id int not null references names on delete cascade,
  kudo varchar
);

Here we see that the id field is a foreign key. Its value must exist in the names table. Furthermore, if the corresponding row in the names table is deleted, all corresponding kudos will also be deleted!

micah=# insert into kudos values (1, 'Great job!!!');
INSERT 0 1
micah=# insert into kudos values (2, 'Yeah!!!');
INSERT 0 1
micah=# insert into kudos values (3, 'Way to be Fanatical');
ERROR:  insert or update on table "kudos" violates foreign key constraint "kudos_id_fkey"
DETAIL:  Key (id)=(3) is not present in table "names".
micah=# insert into kudos values (4, 'Way to be Fanatical');
INSERT 0 1
micah=# insert into names (first, last) values ('Bozo', 'Clown');
INSERT 0 1
micah=# insert into kudos values (5, 'Nice!');
INSERT 0 1
micah=# insert into kudos values (5, 'Sweet!!!');
INSERT 0 1
micah=# select * from kudos;
 id |        kudo
----+---------------------
  1 | Great job!!!
  2 | Yeah!!!
  4 | Way to be Fanatical
  5 | Nice!
  5 | Sweet!!!
(5 rows)

micah=# delete from names where id=5;
DELETE 1
micah=# select * from kudos;
 id |        kudo
----+---------------------
  1 | Great job!!!
  2 | Yeah!!!
  4 | Way to be Fanatical
(3 rows)

And yet another table ...

create table salaries (
  id int not null primary key references names on delete cascade,
  salary int not null,
  check ( salary > 25000 )
);

This introduces constraints.

micah=# insert into salaries values (1, 10000);
ERROR:  new row for relation "salaries" violates check constraint "salaries_salary_check"
micah=# insert into salaries values (1, 1000000);
INSERT 0 1

Access and Authentication

Before we can continue, we need to discuss the various ways of authentication for users. Done in pg_hba.conf.

There are two types of rows (besides comments and blanks): host and local rows. Rows that start with "local" relate to UNIX socket connections and those that start with "host" relate to TCP/IP connections. Local rows have four fields and host rows have five:

The authentication method tells how it authenticates.

For non-production systems with non-critical data, this seems reasonable:

# "local" is for Unix domain socket connections only
local   all         all                               trust
# IPv4 local connections:
host    all         all         127.0.0.1/32          md5

If system has sensitive data or untrusted local users, local connections should probably use md5 also, just in case of a system compromise. Apply with:

# service postgresql reload

Roles

"Users" are actually called roles in PostgreSQL, and they can be inherited. Some simple examples:

micah=# grant select on names to bozo;
micah=# grant select,insert on kudos to bozo;

Switching to user 'bozo':

$ psql -U bozo micah
micah=> insert into names (first, last) values ('Test', 'User');
ERROR:  permission denied for relation names
micah=> insert into kudos values (2, 'Is there a rating higher than Fanatical?');
INSERT 0 1

You can change the password for a role, as any superuser:

micah=> ALTER USER bozo WITH ENCRYPTED PASSWORD 'pgrocks';
$ psql -h 127.0.0.1 -U bozo micah

With that it should let you connect as bozo only if you type the right password. It will prompt here because of the 'md5' authentication in pg_hba.conf for 'host' connections. If you connect without the -h, it will still not prompt because it would be using UNIX domain socket connections, which we have configured to 'trust'.

Upgrading PostgreSQL

Minor point releases, such as 8.3.2 to 8.3.3:

# yum update
# service postgresql restart

Major releases, such as 8.2 to 8.3 (the second digit is considered major). From the aforementioned Yum repository, replace the currently installed pgdg-redhat RPM with that of the version you want to install. Then:

# su - postgres
$ mkdir pre-upgrade
$ cp data/postgresql.conf pre-upgrade
$ cp data/pg_hba.conf pre-upgrade
$ pg_dumpall > pre-upgrade/dump.sql
$ exit
# service postgresql stop
# rm -rf /var/lib/pgsql/data
# yum update
# service postgresql initdb
# service postgresql start
$ su - postgresql
$ psql template1 < pre-upgrade/dump.sql
$ cp pre-upgrade/postgresql.conf data
$ cp pre-upgrade/pg_hba.conf data
$ exit
# service postgresql restart

That should minimally do it. If the database is being updated continuously, steps should be taken to prevent writes to the DB before the dump. Edit pg_hba.conf for this.

Analysis

Showing queries in progress:

micah=# select * from pg_stat_activity;
 datid | datname | procpid | usesysid | usename |          current_query          | waiting |          xact_start           |          query_start          |         backend_start         | client_addr | client_port
-------+---------+---------+----------+---------+---------------------------------+---------+-------------------------------+-------------------------------+-------------------------------+-------------+-------------
 16386 | micah   |   29598 |    16384 | micah   | select * from pg_stat_activity; | f       | 2008-09-19 03:24:23.858877-05 | 2008-09-19 03:24:23.858877-05 | 2008-09-19 03:24:22.290114-05 |             |          -1
(1 row)
micah=# \x
Expanded display is on.
micah=# select * from pg_stat_activity;
-[ RECORD 1 ]-+--------------------------------
datid         | 16386
datname       | micah
procpid       | 29598
usesysid      | 16384
usename       | micah
current_query | select * from pg_stat_activity;
waiting       | f
xact_start    | 2008-09-19 03:25:08.762913-05
query_start   | 2008-09-19 03:25:08.762913-05
backend_start | 2008-09-19 03:24:22.290114-05
client_addr   |
client_port   | -1

We can analyze queries and increase performance by adding queries:

micah=# \x
Expanded display is off.
micah=# select names.first, kudos.kudo from names, kudos where names.id = kudos.id;
   first   |                   kudo
-----------+------------------------------------------
 Lanham    | Great job!!!
 Graham    | Yeah!!!
 Frederick | Way to be Fanatical
 Graham    | Is there a rating higher than Fanatical?
(4 rows)

micah=# explain select names.first, kudos.kudo from names, kudos where names.id = kudos.id;
                             QUERY PLAN
---------------------------------------------------------------------
 Hash Join  (cost=28.68..67.89 rows=1230 width=64)
   Hash Cond: (kudos.id = names.id)
   ->  Seq Scan on kudos  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=18.30..18.30 rows=830 width=36)
         ->  Seq Scan on names  (cost=0.00..18.30 rows=830 width=36)
(5 rows)

micah=# CREATE INDEX kud_idx ON kudos (id);
CREATE INDEX

micah=# explain select names.first, kudos.kudo from names, kudos where names.id = kudos.id;
                            QUERY PLAN
------------------------------------------------------------------
 Hash Join  (cost=1.09..22.54 rows=4 width=64)
   Hash Cond: (names.id = kudos.id)
   ->  Seq Scan on names  (cost=0.00..18.30 rows=830 width=36)
   ->  Hash  (cost=1.04..1.04 rows=4 width=36)
         ->  Seq Scan on kudos  (cost=0.00..1.04 rows=4 width=36)
(5 rows)