我正在考虑为 AGE 开发一个函数,返回图的邻接矩阵。邻接矩阵存储两个节点之间的边数。检查邻接矩阵中是否存在边的复杂度为 O(1),而空间复杂度为 O(v^2),其中 v = 顶点数。 AGE 将每个图的顶点存储在
graph_name._ag_label_vertex
中,而边存储在 graph_name._ag_label_edge
中。
顶点表包含两列:id和properties。 id 列是整数类型,而properties 列是类似 JSON 的格式。边表包含边的 id、它来自的顶点的 id、它去的顶点的 id 以及边的属性。
示例:
SELECT * FROM "graph_name"."Vertex";
id | properties
------------------+-------------
844424930131971 | {}
1407374883553281 | {}
SELECT * FROM "graph_name"."Edges";
id | start_id | end_id | properties
------------------+-----------------+------------------+------------
1125899906842625 | 844424930131971 | 1407374883553281 | {}
(1 row)
我正在考虑首先遍历顶点列,以便最初可以存储顶点ID:
[0, 1, 2, 3]
[1, 0, 0, 0]
[2, 0, 0, 0]
[3, 0, 0, 0]
之后,循环遍历edges表,如果两个节点之间存在边,则存储1(如果一对节点之间有多条边,则为每条边在已经设置的值上加1)。但我不知道如何在 C 语言中为 PostgreSQL 函数执行此操作。 我应该调用哪个函数来遍历表并从 id 列获取值?我想过使用 SPI 两次,第一次用于顶点,第二次用于边缘,但我不知道这是否是最好的方法。
在age--1.3.0.sql文件中,我将函数声明为
CREATE FUNCTION ag_catalog.get_adjacency_matrix(graph_name name)
RETURNS integer[][]
LANGUAGE c
AS 'MODULE_PATHNAME';
其定义在src/backend/commands/graph_commands.c:
PG_FUNCTION_INFO_V1(get_adjacency_matrix);
/*
* Function for returning the graph's adjacency matrix. This adjacency matrix
* stores the number of edges between two nodes. The complexity to check if an
* edge exists within an adjacency matrix is O(1), while the space complexity
* is O(v^2) with v = number of vertices.
*/
Datum get_adjacency_matrix(PG_FUNCTION_ARGS)
{
PG_RETURN_VOID();
}
显然,使用 PostgreSQL 的 C 函数遍历表的最佳方法是使用服务器编程接口 (SPI)。这是一个代码示例:
SPI_connect();
char *query = psprintf("SELECT id FROM \"GraphName\".\"_ag_label_vertex\" ORDER BY id");
// Execute the query and check for errors.
int ret = SPI_execute(query, true, 0);
if (ret != SPI_OK_SELECT)
{
elog(ERROR, "Error when fetching the vertices");
SPI_finish();
PG_RETURN_NULL();
}
// And then here is how to traverse the resulting table.
for (i = 0; i < SPI_processed; i++)
{
char *result_string = SPI_getvalue(SPI_tuptable->vals[i], SPI_tuptable->tupdesc, 1);
int vertex_id = DatumGetInt64(DirectFunctionCall1(int8in, CStringGetDatum(result_string)));
}
来自 PostgreSQL 的文档:
当我们调用
SPI_execute(const char * command, bool read_only, long count)
时,它将针对count
行执行指定的SQL命令,并且,如果read_only
为true,则该命令必须是只读的。如果 count
为零,则对所有行执行该命令。该函数根据执行的查询返回一个值。在这种情况下,如果执行成功,它会返回 SPI_OK_SELECT
。
执行(最后一个)命令的实际行数在全局变量
SPI_processed
中返回。如果函数的返回值确实是SPI_OK_SELECT
,那么我们可以使用全局指针SPITupleTable *SPI_tuptable
来访问结果行。
SPITupleTable
结构声明为:
typedef struct SPITupleTable
{
/* Public members */
TupleDesc tupdesc; /* tuple descriptor */
HeapTuple *vals; /* array of tuples */
uint64 numvals; /* number of valid tuples */
/* Private members, not intended for external callers */
uint64 alloced; /* allocated length of vals array */
MemoryContext tuptabcxt; /* memory context of result table */
slist_node next; /* link for internal bookkeeping */
SubTransactionId subid; /* subxact in which tuptable was created */
} SPITupleTable;
我们可以通过执行
SPITupleTable
来从 SPI_tuptable->vals[i]
访问相应的行,其中 i
是行号。要从行中获取特定列的值,在这种情况下,我们需要调用SPI_getvalue(HeapTuple row, TupleDesc rowdesc, int colnumber)
。该函数将返回一个char*
到相应的数据。
如果我们想将其转换为
int64
,我们调用 DatumGetInt64(DirectFunctionCall1(int8in, CStringGetDatum(result_string)))
。
这不是答案,但它可以被视为来自另一个函数的指南:
通过查看
get_all_edge_labels_per_graph
在某些时候,我们可以使用这些函数来获取其中的元组,如下所示:
功能:
/*
* Retrieves a list of all the names of a graph.
*
* XXX: We may want to use the cache system for this function,
* however the cache system currently requires us to know the
* name of the label we want.
*/
List *get_all_edge_labels_per_graph(EState *estate, Oid graph_oid)
{
List *labels = NIL;
ScanKeyData scan_keys[2];
Relation ag_label;
TableScanDesc scan_desc;
HeapTuple tuple;
TupleTableSlot *slot;
ResultRelInfo *resultRelInfo;
// setup scan keys to get all edges for the given graph oid
ScanKeyInit(&scan_keys[1], Anum_ag_label_graph, BTEqualStrategyNumber,
F_OIDEQ, ObjectIdGetDatum(graph_oid));
ScanKeyInit(&scan_keys[0], Anum_ag_label_kind, BTEqualStrategyNumber,
F_CHAREQ, CharGetDatum(LABEL_TYPE_EDGE));
// setup the table to be scanned
ag_label = table_open(ag_label_relation_id(), RowExclusiveLock);
scan_desc = table_beginscan(ag_label, estate->es_snapshot, 2, scan_keys);
resultRelInfo = create_entity_result_rel_info(estate, "ag_catalog",
"ag_label");
slot = ExecInitExtraTupleSlot(
estate, RelationGetDescr(resultRelInfo->ri_RelationDesc),
&TTSOpsHeapTuple);
// scan through the results and get all the label names.
while(true)
{
Name label;
bool isNull;
Datum datum;
tuple = heap_getnext(scan_desc, ForwardScanDirection);
// no more labels to process
if (!HeapTupleIsValid(tuple))
break;
ExecStoreHeapTuple(tuple, slot, false);
datum = slot_getattr(slot, Anum_ag_label_name, &isNull);
label = DatumGetName(datum);
labels = lappend(labels, label);
}
table_endscan(scan_desc);
destroy_entity_result_rel_info(resultRelInfo);
table_close(resultRelInfo->ri_RelationDesc, RowExclusiveLock);
return labels;
}
来源: