PostgreSQL Partitioning

 Partitioning: Divide large into small pieces to manage  it effectively.


Where Partitioning will help to improve

- Useful where we have to periodically load and remove or purge the subsets of data

- Maintenance activities will complete quickly as compared to large tables.

- Performance will increase because partitions relevant to queries stores in mem so it reduces the fetch time.

- complex analytic queries can easily operates to individual partitions


History of partitioning in postgres

V8.1 (2005)


V9 :

Partitioning work with table inheritance, check constraints and trigger functions to redirect the data

Range and List Partitioning Supported.


V10

Partitioning Syntax added in this version.

No need to use a trigger function to redirect the data , internal function will work for this

Range and List Partitioning Supported.


v11

Support default partitions : where there is a no partition condition it will add in default partition

Hash partitions introduce

parallel partition scans

Optimizer partition eliminations


v12

Foreign keys can now reference partitioned tables

handling of tablespace assigned to partitioning.

psql enhancements




Range Partitioning : divide data according tpo the range

---Date/Time range

--- Price Range

--Year /Month



List Partitioning : any discrete list( no range)

--States/Cities


Hash Partitioning:

A table is partitioned by specifying a modulus and remainder for each partition.


Declarative Partitioning

Used a proper declarative syntax for creating a partitioning.

like Partition by range, Partition by list, Partition by hash


test=# create table item ( itemid int not null, logdate date not null, sales int ) partition by range (logdate);

CREATE TABLE


test=# create table item1 partition of item for values from ('2006-02-01') to ('2006-03-31');

CREATE TABLE


test=# create table item2 partition of item for values from ('2006-04-01') to ('2006-05-31');

CREATE TABLE


test=# \d+ item

                              Partitioned table "public.item"

 Column  |  Type   | Collation | Nullable | Default | Storage | Stats target | Description

---------+---------+-----------+----------+---------+---------+--------------+-------------

 itemid  | integer |           | not null |         | plain   |              |

 logdate | date    |           | not null |         | plain   |              |

 sales   | integer |           |          |         | plain   |              |

Partition key: RANGE (logdate)

Partitions: item1 FOR VALUES FROM ('2006-02-01') TO ('2006-03-31'),

            item2 FOR VALUES FROM ('2006-04-01') TO ('2006-05-31')


test=#



INSERTED SOME DATA ON PARENT TABLE

insert into item values(1,'2006-02-02',20);


But as you can see, the data size of the parent table is zero, because data resides in the child table.


test=# \dt+ item*

                                   List of relations

 Schema | Name  |       Type        |  Owner  | Persistence |    Size    | Description

--------+-------+-------------------+---------+-------------+------------+-------------

 public | item  | partitioned table | pawan | permanent   | 0 bytes    |

 public | item1 | table             | pawan | permanent   | 8192 bytes |

 public | item2 | table             | pawan | permanent   | 8192 bytes |

(3 rows)



Index is automatically created on the child table whenever I am creating an index on the parent table.

test=# create index item_idx on item(logdate);

CREATE INDEX

test=#

test=# \d+ item

                              Partitioned table "public.item"

 Column  |  Type   | Collation | Nullable | Default | Storage | Stats target | Description

---------+---------+-----------+----------+---------+---------+--------------+-------------

 itemid  | integer |           | not null |         | plain   |              |

 logdate | date    |           | not null |         | plain   |              |

 sales   | integer |           |          |         | plain   |              |

Partition key: RANGE (logdate)

Indexes:

    "item_idx" btree (logdate)

Partitions: item1 FOR VALUES FROM ('2006-02-01') TO ('2006-03-31'),

            item2 FOR VALUES FROM ('2006-04-01') TO ('2006-05-31')


test=# \d+ item1

                                   Table "public.item1"

 Column  |  Type   | Collation | Nullable | Default | Storage | Stats target | Description

---------+---------+-----------+----------+---------+---------+--------------+-------------

 itemid  | integer |           | not null |         | plain   |              |

 logdate | date    |           | not null |         | plain   |              |

 sales   | integer |           |          |         | plain   |              |

Partition of: item FOR VALUES FROM ('2006-02-01') TO ('2006-03-31')

Partition constraint: ((logdate IS NOT NULL) AND (logdate >= '2006-02-01'::date) AND (logdate < '2006-03-31'::date))

Indexes:

    "item1_logdate_idx" btree (logdate)

Access method: heap


test=# \d+ item2

                                   Table "public.item2"

 Column  |  Type   | Collation | Nullable | Default | Storage | Stats target | Description

---------+---------+-----------+----------+---------+---------+--------------+-------------

 itemid  | integer |           | not null |         | plain   |              |

 logdate | date    |           | not null |         | plain   |              |

 sales   | integer |           |          |         | plain   |              |

Partition of: item FOR VALUES FROM ('2006-04-01') TO ('2006-05-31')

Partition constraint: ((logdate IS NOT NULL) AND (logdate >= '2006-04-01'::date) AND (logdate < '2006-05-31'::date))

Indexes:

    "item2_logdate_idx" btree (logdate)

Access method: heap


test=#


Partition maintenance.


if you want to drop a partition table(example item3) then It will completely drop from the database.


test=# create table item3 partition of item for values from ('2006-06-01') to ('2006-07-31')

;

CREATE TABLE

test=#

test=#

test=# \d+ item

                              Partitioned table "public.item"

 Column  |  Type   | Collation | Nullable | Default | Storage | Stats target | Description

---------+---------+-----------+----------+---------+---------+--------------+-------------

 itemid  | integer |           | not null |         | plain   |              |

 logdate | date    |           | not null |         | plain   |              |

 sales   | integer |           |          |         | plain   |              |

Partition key: RANGE (logdate)

Indexes:

    "item_idx" btree (logdate)

Partitions: item1 FOR VALUES FROM ('2006-02-01') TO ('2006-03-31'),

            item2 FOR VALUES FROM ('2006-04-01') TO ('2006-05-31'),

            item3 FOR VALUES FROM ('2006-06-01') TO ('2006-07-31')


test=# drop table item3;

DROP TABLE

test=#

test=#

test=# \d+ item

                              Partitioned table "public.item"

 Column  |  Type   | Collation | Nullable | Default | Storage | Stats target | Description

---------+---------+-----------+----------+---------+---------+--------------+-------------

 itemid  | integer |           | not null |         | plain   |              |

 logdate | date    |           | not null |         | plain   |              |

 sales   | integer |           |          |         | plain   |              |

Partition key: RANGE (logdate)

Indexes:

    "item_idx" btree (logdate)

Partitions: item1 FOR VALUES FROM ('2006-02-01') TO ('2006-03-31'),

            item2 FOR VALUES FROM ('2006-04-01') TO ('2006-05-31')


test=#


ALTER TABLE ITEM DETACH PARTITION item3;  we can also remove or detach the child partition from parent partitions.



LIMITATIONS  of Declarative Partitions


We have to manually create exclusion constraints  on each partition. There is no command to create all partitions at onces

We have manually created unique constraints to maintain the uniqueness in each partition.



Partitions Pruning -Performance features;;;


your table or partition tables is only scanned when it's only required during query execution. system know if this data does n't is this partition so it won't scan it


 We can enable and disable the feature, but by default it's ON.


set enable_partition_pruning='off'


test=# explain analyze select * from item where sales >= 20 and logdate > '2006-04-01';

                                             QUERY PLAN

-----------------------------------------------------------------------------------------------------

 Seq Scan on item2 item  (cost=0.00..1.04 rows=1 width=12) (actual time=0.012..0.013 rows=1 loops=1)

   Filter: ((sales >= 20) AND (logdate > '2006-04-01'::date))

   Rows Removed by Filter: 2

 Planning Time: 0.509 ms

 Execution Time: 0.027 ms

(5 rows)


test=# set enable_partition_pruning ='off';

SET

test=# explain analyze select * from item where sales >= 20 and logdate > '2006-04-01';

                                                 QUERY PLAN

-------------------------------------------------------------------------------------------------------------

 Append  (cost=0.00..2.10 rows=2 width=12) (actual time=0.041..0.043 rows=1 loops=1)

   ->  Seq Scan on item1 item_1  (cost=0.00..1.04 rows=1 width=12) (actual time=0.029..0.030 rows=0 loops=1)

         Filter: ((sales >= 20) AND (logdate > '2006-04-01'::date))

         Rows Removed by Filter: 3

   ->  Seq Scan on item2 item_2  (cost=0.00..1.04 rows=1 width=12) (actual time=0.011..0.011 rows=1 loops=1)

         Filter: ((sales >= 20) AND (logdate > '2006-04-01'::date))

         Rows Removed by Filter: 2

 Planning Time: 0.524 ms

 Execution Time: 0.059 ms

(9 rows)


test=#


Declarative partitioning best practices:

  • Selection on primary key column or column which can divide logically
  • Used column used mostly used in where clause
  • Too less partition means index remain large
  • Too many partitions can increase the planning time.
  • carefully choose subpartition
  • verify nature of data


Constraints Exclusion:

(Scan the Constraint of all child tables and exclude child table which doesn't match the where clause it query)

similar to partition pruning , only it will work during query planning where partition puring works at execution time.