Talk about the mapping relationship between PostgreSql table and disk files

Whether the Drop table will release disk space in postgresql, today we will witness it with practical instructions

--2019-01-11 09:49:21 drop table Will it free up space
 create table tab_todrop(id int,cname varchar(50),remark text);
 insert into tab_todrop select generate_series(1,10000000),'wx good boy',md5('wx good boy');
 insert into tab_todrop select generate_series(1,10000000),'wx good boy',md5('wx good boy');
--Check table size
qmstst=# select pg_size_pretty(pg_relation_size('tab_todrop'));
 pg_size_pretty 
----------------
 2410 MB
(1 row)
qmstst=#

Each data table is placed under datap*. The postgresql cluster is distributed to dataap* files through hash.

[root@P1QMSTST01 ~]# df -h
Filesystem   Size Used Avail Use% Mounted on
/dev/mapper/rootvg-rootlv
      35G 4.8G 28G 15% /
tmpfs     63G  0 63G 0% /dev/shm
/dev/sda2    477M 33M 419M 8% /boot
/dev/sda1    500M 272K 500M 1% /boot/efi
/dev/mapper/rootvg-homelv
      4.8G 1.6G 3.1G 34% /home
/dev/mapper/rootvg-optlv
      20G 8.4G 11G 46% /opt
/dev/mapper/rootvg-tmplv
      4.8G 402M 4.2G 9% /tmp
/dev/mapper/rootvg-usrlv
      9.8G 3.6G 5.8G 39% /usr
/dev/mapper/rootvg-locallv
      52G 25G 25G 51% /usr/local
/dev/mapper/rootvg-varlv
      15G 5.2G 8.8G 37% /var
/dev/mapper/datavg-gpmasterlv
      100G 50G 51G 50% /gpmaster
/dev/mapper/datavg-datap1lv
      150G 43G 108G 29% /datap1
/dev/mapper/datavg-datap2lv
      150G 42G 109G 28% /datap2
/dev/mapper/datavg-datap3lv
      150G 42G 109G 28% /datap3
/dev/mapper/datavg-datap4lv
      150G 42G 109G 28% /datap4
/dev/mapper/datavg-datap5lv
      150G 43G 108G 29% /datap5
/dev/mapper/datavg-datap6lv
      150G 42G 108G 28% /datap6
/dev/mapper/rootvg-redislv

Data tables are stored in the base directory under dataap*

select 
relname, --surface/view/Names of indexes, etc.
relowner, --Relationship owner
relfilenode --The name of the file on disk，If not,0
from pg_class
where relname = 'tab_todrop';
qmstst=# select relname, relowner, relfilenode from pg_class where relname = 'tab_todrop';
 relname | relowner | relfilenode 
------------+----------+-------------
 tab_todrop | 17088 | 15997062
(1 row)
 
ls -lh 17089/15997006*
-rw------- 1 gpadmin gpadmin 268M Jan 11 13:56 17089/15997006
[root@P1QMSTST01 base]# pwd
/datap2/gpseg1/base
[root@P1QMSTST01 base]#

After drop table, the file in the base directory is deleted, so it can be concluded that "drop table in postgresql will free up space"

Supplement: Postgresql's table and index physical storage

postgresql uses a file system to store data. Sometimes you need to find the corresponding disk files of tables and indexes, so you must understand the following knowledge points.

Non-toast situation

zabbix=# create table tmp_t0(c0 varchar(100),c1 varchar(100), c2 varchar(100));
CREATE TABLE
zabbix=# 
zabbix=# create index idx_tmp_t0 on tmp_t0(c0);
CREATE INDEX
zabbix=# 
zabbix=# insert into tmp_t0 select id::varchar,md5(id::varchar),md5(md5(id::varchar)) from generate_series(1,100000) as id ;
INSERT 0 100000
zabbix=# 
zabbix=# delete from tmp_t0 where c0 > '1';
DELETE 99999

View the operating system files corresponding to the table.

zabbix=# select pg_relation_filenode('tmp_t0'),pg_relation_filepath('tmp_t0');
 pg_relation_filenode | pg_relation_filepath 
----------------------+----------------------
  24583 | base/24579/24583
(1 row)

View the operating system file corresponding to the index.

zabbix=# select pg_relation_filenode('idx_tmp_t0'),pg_relation_filepath('idx_tmp_t0');
 pg_relation_filenode | pg_relation_filepath 
----------------------+----------------------
  24588 | base/24579/24588
(1 row)

Use pg_class to view

zabbix=# select ,, from pg_class pc where 1=1 and  in ('tmp_t0','idx_tmp_t0');
 oid | relname | relfilenode 
-------+------------+-------------
 24583 | tmp_t0 | 24583
 24588 | idx_tmp_t0 | 24588
(2 rows)

Operating system view

$ ls -l |grep -i 24583; ls -l |grep -i 24588;
-rw------- 1 postgres postgres 10117120 Sep 19 11:18 24583
-rw------- 1 postgres postgres 24576 Sep 19 11:18 24583_fsm
-rw------- 1 postgres postgres 8192 Sep 19 11:20 24583_vm
-rw------- 1 postgres postgres 2260992 Sep 19 11:25 24588

Do a truncate operation at this time

zabbix=# truncate table tmp_t0;
TRUNCATE table
zabbix=#
zabbix=# insert into tmp_t0 select id::varchar,md5(id::varchar),md5(md5(id::varchar)) from generate_series(1,100000) as id ;
INSERT 0 100000

View in turn

zabbix=# select pg_relation_filenode('tmp_t0'),pg_relation_filepath('tmp_t0');
 pg_relation_filenode | pg_relation_filepath 
----------------------+----------------------
  24589 | base/24579/24589
(1 row)
 
zabbix=# select pg_relation_filenode('idx_tmp_t0'),pg_relation_filepath('idx_tmp_t0');
 pg_relation_filenode | pg_relation_filepath 
----------------------+----------------------
  24590 | base/24579/24590
(1 row)
zabbix=# select ,, from pg_class pc where 1=1 and  in ('tmp_t0','idx_tmp_t0');
 oid | relname | relfilenode 
-------+------------+-------------
 24583 | tmp_t0 | 24589
 24588 | idx_tmp_t0 | 24590
(2 rows)
 
$ ls -l |grep -i 24583; ls -l |grep -i 24588;
-rw------- 1 postgres postgres 0 Sep 19 11:33 24583
-rw------- 1 postgres postgres 0 Sep 19 11:33 24588
$ ls -l |grep -i 24589; ls -l |grep -i 24590;
-rw------- 1 postgres postgres 10117120 Sep 19 11:35 24589
-rw------- 1 postgres postgres 24576 Sep 19 11:35 24589_fsm
-rw------- 1 postgres postgres 3932160 Sep 19 11:35 24590

When you check the old relfilenode later, it disappeared

$ ls -l |grep -i 24583; ls -l |grep -i 24588;
$

The summary is as follows:

1. When creating table and create index, the oid of pg_class is the same as relfilenode.

1. After truncate the table, the oids of the table and index have not changed, but the relfilenode has changed.

toast situation

Insert 40 million rows of data, so that the size of tmp_t0 on the disk is greater than 1G

zabbix=# insert into tmp_t0 select id::varchar,md5(id::varchar),md5(md5(id::varchar)) from generate_series(1,40000000) as id ;
INSERT 0 40000000
zabbix=#
zabbix=# create index idx_tmp_t0_1 on tmp_t0(c1);
CREATE index
zabbix=# select ,, from pg_class pc where 1=1 and  in ('tmp_t0','idx_tmp_t0','idx_tmp_t0_1');
 oid | relname | relfilenode 
-------+--------------+-------------
 24583 | tmp_t0 | 24589
 24588 | idx_tmp_t0 | 24590
 24599 | idx_tmp_t0_1 | 24599
(3 rows)
$ ls -l |grep -i 24589; ls -l |grep -i 24590;ls -l|grep -i 24599;
-rw------- 1 postgres postgres 1073741824 Sep 19 12:15 24589
-rw------- 1 postgres postgres 1073741824 Sep 19 12:17 24589.1
-rw------- 1 postgres postgres 1073741824 Sep 19 12:19 24589.2
-rw------- 1 postgres postgres 1073741824 Sep 19 12:23 24589.3
-rw------- 1 postgres postgres 81788928 Sep 19 12:25 24589.4
-rw------- 1 postgres postgres 1097728 Sep 19 12:14 24589_fsm
-rw------- 1 postgres postgres 1073741824 Sep 19 12:14 24590
-rw------- 1 postgres postgres 332496896 Sep 19 12:14 24590.1
-rw------- 1 postgres postgres 1073741824 Sep 19 12:24 24599
-rw------- 1 postgres postgres 1073741824 Sep 19 12:24 24599.1
-rw------- 1 postgres postgres 220487680 Sep 19 12:24 24599.2

Below is the storage file corresponding to the table and index

select ||'.'||,pg_relation_filepath(||'.'||),
 pg_table_size(||'.'||),
 pg_relation_size(||'.'||),
 pg_total_relation_size(||'.'||),
 ||'.'||,pg_relation_filepath(||'.'||),
 pg_relation_size(||'.'||),--The specified single index
 pg_indexes_size(||'.'||) --All indexes on the table
 from pg_tables pt
 left outer join pg_indexes pi 
   on ||'.'|| = ||'.'||
where 1=1
 and ='public'
 and ='tmp_t0'
;
 
 ?column? | pg_relation_filepath | pg_table_size | pg_relation_size | pg_total_relation_size | ?column? | pg_relation_filepath | pg_relation_size | pg_indexes_size 
---------------+----------------------+---------------+------------------+------------------------+---------------------+----------------------+------------------+-----------------
 public.tmp_t0 | base/24579/24589 | 4377853952 | 4376756224 |  8152064000 | public.idx_tmp_t0 | base/24579/24590 | 1406238720 | 3774210048
 public.tmp_t0 | base/24579/24589 | 4377853952 | 4376756224 |  8152064000 | public.idx_tmp_t0_1 | base/24579/24599 | 2367971328 | 3774210048
(2 rows)

Reference documentation:

Table 9-83. Database Object Size Function

name      Return type  describe 
pg_column_size(any)  int Store a specific value（Probably compressed）The number of bytes required 
pg_database_size(oid)    bigint Specify OID The disk space used by the database 
pg_database_size(name)    bigint SpecifynameThe disk space used by the database 
pg_indexes_size(regclass)   bigint 附加到Specify表的索引所占的总磁盘空间 
pg_relation_size(relation regclass, fork text) bigint Specify表or索引的Specify分叉（'main'、'fsm'、'vm'or'init'）Disk space used 
pg_relation_size(relation regclass)  bigint pg_relation_size(..., 'main')Abbreviation of 
pg_size_bytes(text)    bigint Converts the size with units in human-readable format to bytes 
pg_size_pretty(bigint)    text  Will be represented as one 64Convert byte size of bit integers to human-readable format with size units 
pg_size_pretty(numeric)   text  Will be represented as one数字值的字节尺寸转换为带尺寸单位的人类可读格式 
pg_table_size(regclass)   bigint 被Specify表Disk space used，Exclude index（But including TOAST、Free space mapping and visibility mapping） 
pg_tablespace_size(oid)   bigint Specify OID 的表空间Disk space used 
pg_tablespace_size(name)   bigint Specifyname的表空间Disk space used 
pg_total_relation_size(regclass)  bigint Specify表所用的总磁盘空间，Includes all indexes andTOASTdata

pg_column_size Displays the space used to store any independent data values.

pg_total_relation_size accepts the OID or name of a table or TOAST table and returns the total disk space used by the table, including all relevant indexes. This function is equivalent to pg_table_size + pg_indexes_size.

pg_table_sizeAccepts the OID or name of a table and returns the disk space required for that table, but excludes the index (TOAST space, free space map, and visibility map are included)

pg_indexes_sizeAccepts the OID or name of a table and returns all disk space used by all indexes attached to the table.

pg_database_size and pg_tablespace_size accept the OID or name of a database or tablespace and return the entire disk space they use. To use pg_database_size, you must have CONNECT permissions on the specified database (the default will be granted). To use pg_tablespace_size, you must have CREATE permissions on the specified tablespace unless it is the default tablespace for the current database.

pg_relation_sizeAccepts the OID or name of a table, index, or TOAST table, and returns the byte size of disk space occupied by a fork of that relationship (note that for most purposes it is more convenient to use the higher-level functions pg_total_relation_size or pg_table_size, which sums up all fork sizes). If you get only one parameter, it returns the size of the main data fork for the relationship. Provide a second parameter to specify which fork to check:

'main'Returns the size of the relationship's primary data fork.

'fsm'Returns the dimensions of the free space map associated with this relationship (see Section 65.3).

'vm'Returns the dimensions of the visibility map associated with this relationship (see Section 65.4).

'init'Returns the dimensions of the initialization fork (if any) associated with the relationship.

pg_size_prettyIt can be used to format the results of one of the other functions into a human-readable format, and KB, MB, GB, or TB can be used depending on the situation.

pg_size_bytesIt can be used to obtain the number of bytes represented therein from a human-readable format string. The units whose input may have include bytes, kB, MB, GB, or TB, and are case sensitive when parsing the input. If no unit is specified, the unit is assumed to be bytes.

Notice:

The units kB, MB, GB, and TB used by the functions pg_size_pretty and pg_size_bytes are defined by powers of 2 instead of powers of 10, so 1kB is 1024 bytes, 1MB is 1024 = 1048576 bytes, and so on

The above-mentioned function that operates on the table and index accepts a regclass parameter, which is the OID of the table or index in the pg_class system directory. You don't have to search for the OID manually, because the input converter for the regclass data type will do it for you. Write only the table name surrounded by single quotes so that it looks like a literal constant. For compatibility with the processing of normal SQL names, the string will be converted to lowercase unless double quotes are included around the table name.

If an OID does not represent an existing object and is passed as an argument to the above function, NULL will be returned.

The above is personal experience. I hope you can give you a reference and I hope you can support me more. If there are any mistakes or no complete considerations, I would like to give you advice.