Hierarchical data in postgres
Published: 2019-06-12

-------------------------------------------------------------------------------

This tip will try to answer the following questions:

  • How can we represent a tree of data in Postgres?
  • How can we efficiently query for a single node and all of its children (and its children's children)?

The test data

Since we want to keep this simple, we will assume our data is just a bunch of sections. A section just has a name, and each section has at most one parent section.

Section A
|--- Section A.1
Section B
|--- Section B.1
|--- Section B.2
     |--- Section B.2.1

We'll use this simple data for examples below.

Simple self-referencing

When designing a self-referential table (something that joins itself to itself) the most obvious choice is to have some kind of parent_id column on our table that references itself.

CREATE TABLE section (
    id INTEGER PRIMARY KEY,
    name TEXT,
    -- Self-reference: NULL for top-level sections.
    parent_id INTEGER REFERENCES section
);

CREATE INDEX section_parent_id_idx ON section (parent_id);

Now insert our example data, using the parent_id to relate the nodes together:

INSERT INTO section (id, name, parent_id) VALUES (1, 'Section A', NULL);
INSERT INTO section (id, name, parent_id) VALUES (2, 'Section A.1', 1);
INSERT INTO section (id, name, parent_id) VALUES (3, 'Section B', NULL);
INSERT INTO section (id, name, parent_id) VALUES (4, 'Section B.1', 3);
INSERT INTO section (id, name, parent_id) VALUES (5, 'Section B.2', 3);
INSERT INTO section (id, name, parent_id) VALUES (6, 'Section B.2.1', 5);

This works great for simple queries, such as fetching the direct children of Section B:

SELECT * FROM section WHERE parent_id = 3;

but it requires a more complex, recursive query for questions like "fetch me all the children and children's children of Section B":

WITH RECURSIVE nodes (id, name, parent_id) AS (
    -- Anchor: the direct children of Section B (id 3).
    SELECT s.id, s.name, s.parent_id
    FROM section s
    WHERE s.parent_id = 3
    UNION
    -- Recursive step: children of any node found so far.
    SELECT s.id, s.name, s.parent_id
    FROM section s
    JOIN nodes n ON s.parent_id = n.id
)
SELECT * FROM nodes;
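The recursive query above starts from the children of Section B, so Section B itself is not part of the result. Since our original question asked for a node together with everything beneath it, here is a small variant, offered as a sketch rather than the one right way to do it. It starts from the node itself and tracks how deep each row sits. The CTE name subtree and the depth column are made up for this example.

```sql
-- Assumes the section table and the six sample rows defined above.
WITH RECURSIVE subtree (id, name, parent_id, depth) AS (
    -- Anchor: the starting node itself (Section B, id 3) at depth 0.
    SELECT s.id, s.name, s.parent_id, 0
    FROM section s
    WHERE s.id = 3
    UNION ALL
    -- Recursive step: each child sits one level below its parent.
    SELECT s.id, s.name, s.parent_id, t.depth + 1
    FROM section s
    JOIN subtree t ON s.parent_id = t.id
)
SELECT id, name, depth
FROM subtree
ORDER BY depth, id;
```

UNION ALL is fine here because a tree has no cycles, so no row can be reached twice. If your data could contain a loop, the plain UNION from the original query is the safer choice.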

So we have answered the "how to build a tree" part of the question, but are not happy with the "how to query for a node and all its children" part.

Enter ltree. (Short for "label tree" - I think?).

The ltree extension

The ltree extension is a great choice for querying hierarchical data. This is especially true for self-referential relationships.

Let's rebuild the above example using ltree. We'll use the section's primary keys as the "labels" within our ltree paths and a special "root" label to denote the top of the tree.

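CREATE EXTENSION IF NOT EXISTS ltree;

-- Drop the parent_id version of the table before rebuilding it.
DROP TABLE IF EXISTS section;

CREATE TABLE section (
    id INTEGER PRIMARY KEY,
    name TEXT,
    parent_path LTREE
);

CREATE INDEX section_parent_path_idx ON section USING GIST (parent_path);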

We'll add in our data again, this time rather than using the id for the parent, we will construct an ltree path that represents the parent node.

INSERT INTO section (id, name, parent_path) VALUES (1, 'Section A', 'root');
INSERT INTO section (id, name, parent_path) VALUES (2, 'Section A.1', 'root.1');
INSERT INTO section (id, name, parent_path) VALUES (3, 'Section B', 'root');
INSERT INTO section (id, name, parent_path) VALUES (4, 'Section B.1', 'root.3');
INSERT INTO section (id, name, parent_path) VALUES (5, 'Section B.2', 'root.3');
INSERT INTO section (id, name, parent_path) VALUES (6, 'Section B.2.1', 'root.3.5');

Cool. Now we can use ltree's @> (is an ancestor of, or equal to) and <@ (is a descendant of, or equal to) operators to answer our original question:

SELECT * FROM section WHERE parent_path <@ 'root.3';
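One thing to watch: because each row stores its parent's path, the query above matches everything below section 3 but not section 3 itself. The sketch below shows one way to include the node, plus the reverse question (who are the ancestors of a given row?) using @>. It assumes the ltree version of the section table from this part of the article. The aliases a and s and the depth column are just illustrative names.

```sql
-- Section 3 itself plus everything beneath it.
-- nlevel() counts the labels in a path, so top-level rows ('root') report 1.
SELECT id, name, nlevel(parent_path) AS depth
FROM section
WHERE id = 3 OR parent_path <@ 'root.3'
ORDER BY parent_path, id;

-- Ancestors of the row with id 4: a row is an ancestor when its own path
-- (its parent_path plus its id) is a prefix of the target's parent_path.
SELECT a.id, a.name
FROM section a
JOIN section s ON (a.parent_path || a.id::text) @> s.parent_path
WHERE s.id = 4;
```

The expression parent_path || id::text rebuilds a row's own full path on the fly. If you query ancestors often, storing the full path in the column instead might be worth considering, but that is a different design from the one used here.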

However, we have introduced a few issues.

  • Our simple parent_id version ensured referential consistency by making use of the REFERENCES constraint. We lost that by switching to ltree paths.
  • Ensuring that the ltree paths are valid can be a bit of a pain, and if paths become stale for some reason your queries may return unexpected results or you may "orphan" nodes.
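To make the second point concrete, here is a rough consistency check for the ltree-only table, written as a sketch under the assumption that every non-root path should equal some existing section's parent_path with that section's id appended. Any row it returns points at a parent that does not exist, or at a parent that has since moved.

```sql
-- Rows whose parent_path does not match any existing section's own path.
-- Top-level rows (parent_path = 'root') have no parent row and are skipped.
SELECT s.id, s.name, s.parent_path
FROM section s
WHERE s.parent_path <> 'root'
  AND NOT EXISTS (
      SELECT 1
      FROM section p
      WHERE p.parent_path || p.id::text = s.parent_path
  );
```

On consistent data this should come back empty. Nothing in the schema enforces that, though, which is exactly the problem.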

The final solution

To fix these issues we want a hybrid of our original parent_id (for the referential consistency and simplicity of the child/parent relationship) and our ltree paths (for improved querying power/indexing). To achieve this we will hide the management of the ltree path behind a trigger and only ever update the parent_id column.

CREATE EXTENSION IF NOT EXISTS ltree;

-- Drop the earlier version of the table before rebuilding it.
DROP TABLE IF EXISTS section;

CREATE TABLE section (
    id INTEGER PRIMARY KEY,
    name TEXT,
    parent_id INTEGER REFERENCES section,
    parent_path LTREE
);

CREATE INDEX section_parent_path_idx ON section USING GIST (parent_path);
CREATE INDEX section_parent_id_idx ON section (parent_id);

CREATE OR REPLACE FUNCTION update_section_parent_path() RETURNS TRIGGER AS $$
DECLARE
    path ltree;
BEGIN
    IF NEW.parent_id IS NULL THEN
        -- Top-level sections hang directly off the special "root" label.
        NEW.parent_path = 'root'::ltree;
    ELSEIF TG_OP = 'INSERT' OR OLD.parent_id IS NULL OR OLD.parent_id != NEW.parent_id THEN
        -- New row or changed parent: the path is the parent's path plus the parent's id.
        SELECT s.parent_path || s.id::text INTO path
        FROM section s
        WHERE s.id = NEW.parent_id;
        IF path IS NULL THEN
            RAISE EXCEPTION 'Invalid parent_id %', NEW.parent_id;
        END IF;
        NEW.parent_path = path;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER parent_path_tgr
    BEFORE INSERT OR UPDATE ON section
    FOR EACH ROW EXECUTE PROCEDURE update_section_parent_path();
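As a quick illustration of how this is meant to be used, the sketch below loads the sample sections by supplying only parent_id and lets the trigger fill in parent_path. It assumes the table, function, and trigger above have just been created and the table is empty.

```sql
-- Only parent_id is supplied; parent_path is filled in by the trigger.
INSERT INTO section (id, name, parent_id) VALUES (1, 'Section A', NULL);
INSERT INTO section (id, name, parent_id) VALUES (2, 'Section A.1', 1);
INSERT INTO section (id, name, parent_id) VALUES (3, 'Section B', NULL);
INSERT INTO section (id, name, parent_id) VALUES (4, 'Section B.1', 3);
INSERT INTO section (id, name, parent_id) VALUES (5, 'Section B.2', 3);
INSERT INTO section (id, name, parent_id) VALUES (6, 'Section B.2.1', 5);

-- Inspect the paths the trigger generated.
SELECT id, name, parent_id, parent_path FROM section ORDER BY id;

-- The ltree query from earlier works unchanged: everything beneath Section B.
SELECT id, name FROM section WHERE parent_path <@ 'root.3' ORDER BY id;
```

The generated paths should come out as root, root.1, root, root.3, root.3 and root.3.5 for ids 1 through 6. An insert with a parent_id that does not exist is rejected by the trigger's 'Invalid parent_id' exception before the foreign key check gets a chance to run.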

Much better.
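One case the trigger above does not appear to cover is moving a section that already has children: the moved row gets a fresh parent_path, but its descendants keep paths built from the old location. A possible way to handle that is a second trigger that rewrites descendant paths after a move. This is only a sketch. It assumes the table and trigger from the final solution are in place and that ids never change. The names update_section_descendant_paths and descendant_path_tgr are made up for this example.

```sql
-- Hypothetical companion trigger, not part of the original solution.
CREATE OR REPLACE FUNCTION update_section_descendant_paths() RETURNS TRIGGER AS $$
DECLARE
    old_prefix ltree;
    new_prefix ltree;
BEGIN
    -- The path that children of the moved row carried before and after the move.
    old_prefix := OLD.parent_path || OLD.id::text;
    new_prefix := NEW.parent_path || NEW.id::text;

    UPDATE section
    SET parent_path = CASE
            -- Direct children: their parent_path is exactly the old prefix.
            WHEN nlevel(parent_path) = nlevel(old_prefix) THEN new_prefix
            -- Deeper rows: keep the tail of the path that follows the old prefix.
            ELSE new_prefix || subpath(parent_path, nlevel(old_prefix))
        END
    WHERE parent_path <@ old_prefix;

    RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;
$$ LANGUAGE plpgsql;

-- Fires only when parent_id really changes, so the descendant updates
-- issued inside the function do not re-trigger it.
CREATE TRIGGER descendant_path_tgr
    AFTER UPDATE OF parent_id ON section
    FOR EACH ROW
    WHEN (OLD.parent_id IS DISTINCT FROM NEW.parent_id)
    EXECUTE PROCEDURE update_section_descendant_paths();
```

With the sample sections loaded, running UPDATE section SET parent_id = 1 WHERE id = 5 should leave Section B.2.1 (id 6) with the path root.1.5 instead of the stale root.3.5. The sketch does nothing to stop a section from being moved underneath one of its own descendants, so a real implementation would want a guard for that.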


Reposted from: https://www.cnblogs.com/oxspirt/p/8961183.html
