Cleaning Up PostgreSQL Large Objects

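旧时模样, posted 2019-11-30 09:42:38
The system uses an open-source CAS single sign-on server that stores its large objects as lo, which generally performs a little better than bytea. The developers said user data would be purged periodically, but in practice the large objects associated with the deleted user data were never removed. So a script is needed to clean them up on a schedule.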

I. Background
DB: PostgreSQL 9.3.0
cas=# select oid,rolname from pg_authid where oid in (10,327299);
  oid   | rolname  
--------+----------
     10 | postgres
 327299 | usr_cas
(2 rows)

cas=# select lomowner,count(1) from pg_largeobject_metadata group by 1;
 lomowner | count  
----------+--------
       10 | 292408
   327299 | 382123
(2 rows)
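II. Cleanup
Two sets of large objects need cleaning: those owned by postgres and those owned by usr_cas. The former were created while connecting as postgres and must all be deleted. The latter include objects whose user data has been deleted while the large objects themselves were left behind, and those must be deleted too.
1. Deleting with lo_unlink
Deletion is normally done with the built-in lo_unlink() function, so I ran the command below, but it failed with "out of shared memory":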
cas=# select lo_unlink(oid) from pg_largeobject_metadata where lomowner = 10;
WARNING:  out of shared memory
ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.

cas=# show max_locks_per_transaction ;
 max_locks_per_transaction 
---------------------------
 64
(1 row)
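The hint is fairly clear: a single SQL statement unlinks every large object in one transaction, and each unlink takes a lock, so the shared lock table (sized from max_locks_per_transaction, default 64) runs out of room and the statement fails. One fix is to raise max_locks_per_transaction, which requires a server restart. The other is to avoid deleting everything in one transaction and work in batches instead. Since there is not that much data to delete, I chose the latter.
-- Run the following command repeatedly, 20,000 objects per run; about 15 runs cover the 292,408 objects. It can also go in a script and be run in one go.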
cas=# select lo_unlink(oid) from pg_largeobject_metadata where lomowner = 10 limit 20000;
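The "put it in a script" option might look like the sketch below. It is only an illustration: the function names unlink_batch and unlink_all and the overridable PSQL variable are made up here, and the count(...) wrapper is an assumption added so that each batch reports how many objects it unlinked. Each psql call is its own transaction, which is what keeps the lock count per transaction bounded.

```shell
#!/bin/bash
# Sketch: unlink large objects owned by one role in batches,
# one transaction per batch. Names here are illustrative, not from the post.
PSQL=${PSQL:-psql}   # overridable so the loop can be tried without a server

# Unlink at most $3 large objects owned by role OID $2 in database $1
# and print how many were unlinked.
unlink_batch() {
    local db=$1 owner=$2 batch=$3
    "$PSQL" -X -At -d "$db" -c \
        "select count(lo_unlink(oid)) from (select oid from pg_largeobject_metadata where lomowner = $owner limit $batch) t;"
}

# Repeat batches until a batch unlinks nothing; print the running total.
unlink_all() {
    local db=$1 owner=$2 batch=${3:-20000} n total=0
    while :; do
        n=$(unlink_batch "$db" "$owner" "$batch") || return 1
        # Stop on 0 (nothing left) or on non-numeric output.
        [ "$n" -gt 0 ] 2>/dev/null || break
        total=$((total + n))
    done
    echo "$total"
}
```

For example, `unlink_all cas 10` would repeat the 20,000-object batch for owner 10 until nothing is left, then print the total number unlinked.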
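2. Deleting with vacuumlo
With the postgres-owned objects gone, next come the large objects owned by usr_cas. Writing a script to compare them one by one is tedious and not necessarily efficient. The bundled vacuumlo utility handles this. It matches large object OIDs against the oid-typed columns of user tables and removes every large object that no such column references. So when designing a table that references large objects, an int column can hold the OID value, but vacuumlo will not recognize it as a reference, which makes later maintenance harder; the oid type is recommended. If the tool is not installed, build it in contrib/vacuumlo with make && make install.
Its help output: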
[postgres@kenyon-primary ~]$ vacuumlo --help
vacuumlo removes unreferenced large objects from databases.

Usage:
  vacuumlo [OPTION]... DBNAME...

Options:
  -l LIMIT       commit after removing each LIMIT large objects
  -n             don't remove large objects, just show what would be done
  -v             write a lot of progress messages
  -V, --version  output version information, then exit
  -?, --help     show this help, then exit

Connection options:
  -h HOSTNAME    database server host or socket directory
  -p PORT        database server port
  -U USERNAME    user name to connect as
  -w             never prompt for password
  -W             force password prompt

Report bugs to .
Usage:
-- Dry run: show what would be cleaned up without removing anything
[postgres@kenyon-primary ~]$ vacuumlo -n cas -v
Connected to database "cas"
Test run: no large objects will be removed!
Checking expiration_policy in public.serviceticket
Checking service in public.serviceticket
Checking expiration_policy in public.ticketgrantingticket
Checking authentication in public.ticketgrantingticket
Checking services_granted_access_to in public.ticketgrantingticket
Would remove 382143 large objects from database "cas".
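The "Checking ... in ..." lines above are the columns vacuumlo inspects. A rough way to list such columns ahead of time is sketched below, assuming vacuumlo considers only columns of type oid (or the lo domain from the lo extension) in ordinary user tables. The function name list_lo_columns and the overridable PSQL variable are made up for illustration, and the query approximates vacuumlo's behavior rather than reproducing its internal query.

```shell
#!/bin/bash
# Sketch: list columns that can hold large object references,
# i.e. columns typed oid or lo in non-system schemas.
PSQL=${PSQL:-psql}   # overridable; an assumption for illustration

# Print "schema table column" for each candidate column in database $1.
list_lo_columns() {
    local db=$1
    "$PSQL" -X -At -F ' ' -d "$db" <<'SQL'
SELECT n.nspname, c.relname, a.attname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = c.oid
JOIN pg_type t ON t.oid = a.atttypid
WHERE a.attnum > 0
  AND NOT a.attisdropped
  AND c.relkind = 'r'
  AND t.typname IN ('oid', 'lo')
  AND n.nspname !~ '^pg_'
ORDER BY 1, 2, 3;
SQL
}
```

Running `list_lo_columns cas` before the dry run makes it easy to spot a reference column that was declared as int by mistake, since it would be missing from the list.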

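-- Actual cleanup; the -l option makes vacuumlo commit after every LIMIT removed objects (1000 here)
-- Note: the output pasted below repeats the dry run's; without -n, vacuumlo does not print the "Test run" line and reports the objects it removed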
[postgres@kenyon-primary ~]$ vacuumlo cas -v -l 1000
Connected to database "cas"
Test run: no large objects will be removed!
Checking expiration_policy in public.serviceticket
Checking service in public.serviceticket
Checking expiration_policy in public.ticketgrantingticket
Checking authentication in public.ticketgrantingticket
Checking services_granted_access_to in public.ticketgrantingticket
Would remove 382143 large objects from database "cas".
After the cleanup, check the size again:
cas=# select pg_size_pretty(pg_database_size('cas'));
 pg_size_pretty 
----------------
 1.3 GB
(1 row)

-- The space has not been released yet; run vacuum full analyze
cas=# vacuum full analyze verbose pg_largeobject;
INFO:  vacuuming "pg_catalog.pg_largeobject"
INFO:  scanned index "pg_largeobject_loid_pn_index" to remove 88928 row versions
DETAIL:  CPU 0.01s/0.24u sec elapsed 0.26 sec.
INFO:  "pg_largeobject": removed 88928 row versions in 6833 pages
DETAIL:  CPU 0.00s/0.02u sec elapsed 0.02 sec.
INFO:  index "pg_largeobject_loid_pn_index" now contains 948117 row versions in 4120 pages
DETAIL:  88928 index row versions were removed.
1516 index pages have been deleted, 1269 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_largeobject": found 88928 removable, 52 nonremovable row versions in 6891 out of 109226 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 2329 unused item pointers.
0 pages are entirely empty.
CPU 0.03s/0.32u sec elapsed 0.35 sec.
INFO:  analyzing "pg_catalog.pg_largeobject"
INFO:  "pg_largeobject": scanned 30000 of 109226 pages, containing 260529 live rows and 0 dead rows; 30000 rows in sample, 947568 estimated total rows
VACUUM

cas=# select pg_size_pretty(pg_relation_size('pg_largeobject'));
pg_size_pretty
----------------
8192 KB
(1 row)
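Peace at last. Wrap the steps in a script and run it periodically. VACUUM FULL holds an exclusive lock on the table while it runs, so schedule the script for a quiet period: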
[postgres@kenyon-primary ~]$ more cas_rm_lo.sh
#!/bin/bash

######################################################
##
##  purpose:Rm the cas's large object and free space
##  
##  author :Kenyon
##   
##  created:2014-01-22
##  
#####################################################


source /home/postgres/.bash_profile

vacuumlo cas -l 1000 -v

psql -d cas -c "vacuum full analyze verbose pg_largeobject;"
psql -d cas -c "vacuum full analyze verbose pg_largeobject_metadata;"
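Since VACUUM FULL rewrites the whole table, a periodic job may want to skip it when there is little to reclaim. The variant below is a sketch under that assumption, not part of the original script: it parses the "Would remove N large objects" line from a dry run and only proceeds when N reaches a threshold. The function names and the threshold parameter are hypothetical.

```shell
#!/bin/bash
# Sketch: run the real cleanup only when the dry run reports enough orphans.

# Read vacuumlo -n -v output on stdin and print the reported count.
would_remove_count() {
    sed -n 's/.*Would remove \([0-9][0-9]*\) large objects.*/\1/p' | tail -n 1
}

# $1 = database, $2 = minimum number of orphans worth cleaning (default 1).
cleanup_if_needed() {
    local db=$1 threshold=${2:-1} n
    n=$(vacuumlo -n -v "$db" | would_remove_count)
    if [ "${n:-0}" -ge "$threshold" ]; then
        vacuumlo -l 1000 -v "$db" &&
        psql -d "$db" -c "vacuum full analyze verbose pg_largeobject;" &&
        psql -d "$db" -c "vacuum full analyze verbose pg_largeobject_metadata;"
    else
        echo "only ${n:-0} unreferenced large objects in $db; skipping"
    fi
}
```

A call such as `cleanup_if_needed cas 10000` would leave the tables alone until at least 10,000 orphaned objects have accumulated.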
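III. Summary
When using open-source tools that store large objects, check whether the application also removes the associated large objects when it purges user data.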