Create a copy of a PostgreSQL internal C function and load it as a user-defined function

Posted by 天大地大妈咪最大 on 2019-12-12 14:06:05

Question


I would like to create a modified version of the C function int2_avg_accum, located in src/backend/utils/adt/numeric.c. I thought I could (as a start) just compile numeric.c and load int2_avg_accum as a user-defined function.

What I did:

  1. Added PG_MODULE_MAGIC to numeric.c (as described here)
  2. Renamed int2_avg_accum to int2_avg_accum2 in numeric.c
  3. Compiled numeric.c as described in the documentation, with no errors and no warnings (cc -fpic -I`pg_config --includedir-server` -c numeric.c and then cc -shared -o numeric.so numeric.o)
  4. Created a function in PostgreSQL:


create or replace function int2_avg_accum2(bigint[], smallint)
  returns bigint[] as
'/usr/lib/postgresql/9.1/lib/numeric', 'int2_avg_accum2'
  language c
  cost 1;
alter function int2_avg_accum2(bigint[], smallint)
  owner to postgres;

When I try to run select int2_avg_accum2(array[1::bigint,1],1::smallint); the only message I get (in pgAdmin) is "Do you want to attempt to reconnect to the database?". There are no other messages or errors.

When I call the function I see the following in /var/log/postgresql/postgresql-9.1-main.log:

2013-12-03 09:52:02 CET LOG:  server process (PID 3366) was terminated by signal 11: Segmentation fault
2013-12-03 09:52:02 CET LOG:  terminating any other active server processes
2013-12-03 09:52:02 CET WARNING:  terminating connection because of crash of another server process
2013-12-03 09:52:02 CET DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2013-12-03 09:52:02 CET HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2013-12-03 09:52:02 CET LOG:  all server processes terminated; reinitializing
2013-12-03 09:52:02 CET LOG:  database system was interrupted; last known up at 2013-12-03 09:50:53 CET
2013-12-03 09:52:02 CET LOG:  database system was not properly shut down; automatic recovery in progress
2013-12-03 09:52:02 CET LOG:  record with zero length at 0/B483EA0
2013-12-03 09:52:02 CET LOG:  redo is not required
2013-12-03 09:52:03 CET LOG:  autovacuum launcher started
2013-12-03 09:52:03 CET LOG:  database system is ready to accept connections

What do I have to do differently to get a working copy of int2_avg_accum?


Answer 1:


Your client is asking whether you want to reconnect because the backend process is segfaulting, as noted in the comments.

You could collect a core dump from such a crash and examine it with a debugger (e.g. gdb) to find out exactly where it crashes. However, my best guess is that it crashes because you have taken a big file written to be a core component of PostgreSQL, compiled it separately, and attempted to load it as an extension module.

The file numeric.c contains a huge number of functions, static variables and data structures, of which you are trying to duplicate only one. All of these functions and variables already exist in the running PostgreSQL server. When you compile your own version of numeric.c and load it, the function you are adding references the copies in your library instead of those in the main PostgreSQL program. It is probably referencing data structures that were never properly initialized, which causes the crash.

I recommend you start with a blank file and copy in only the int2_avg_accum function from numeric.c (renamed as you have done). If that function calls other PostgreSQL functions or references variables, it will use the ones in the main PostgreSQL binary, which is what you want. You can #include the original numeric.h to get the declarations of all of the external functions.

There are some other differences between how the function is defined as an internal function and how it needs to be defined when loaded as a dynamically loaded module:

  • You need to specify that you are using the version 1 (V1) calling convention by adding the macro:

    PG_FUNCTION_INFO_V1(int2_avg_accum2);

    If the macro is missing, this will also cause segfaults, because PostgreSQL will assume the version 0 calling convention, which does not match the function definition.

  • As you indicated, you have to include PG_MODULE_MAGIC.
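
To make the calling-convention point more concrete, here is a simplified, standalone model of what PG_FUNCTION_INFO_V1 provides. It is only a sketch: DemoFinfoRecord, DEMO_FUNCTION_INFO_V1 and demo_api_version are hypothetical stand-ins written for this illustration, not PostgreSQL's real definitions. The assumption is that the macro defines a small companion function reporting the API version, and that a loader which finds no such function falls back to version 0, as described for the 9.x servers discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the record the real macro exposes. */
typedef struct
{
    int api_version;    /* calling convention version number */
} DemoFinfoRecord;

/*
 * Hypothetical stand-in for PG_FUNCTION_INFO_V1: defines a companion
 * function demo_finfo_<funcname>() that reports version 1.
 * The trailing extern declaration just absorbs the caller's semicolon.
 */
#define DEMO_FUNCTION_INFO_V1(funcname) \
    const DemoFinfoRecord *demo_finfo_##funcname(void) \
    { \
        static const DemoFinfoRecord my_finfo = { 1 }; \
        return &my_finfo; \
    } \
    extern int demo_no_such_variable

DEMO_FUNCTION_INFO_V1(int2_avg_accum2);

/*
 * Model of the loader's decision: if no companion function was found
 * (NULL here), assume the old version 0 convention; otherwise ask it.
 */
int
demo_api_version(const DemoFinfoRecord *(*info_fn)(void))
{
    if (info_fn == NULL)
        return 0;

    const DemoFinfoRecord *rec = info_fn();

    assert(rec != NULL);
    return rec->api_version;
}
```

In this model, demo_api_version(demo_finfo_int2_avg_accum2) yields 1, while passing NULL (no macro, so no companion function) yields 0. That second case corresponds to the mismatch described above: the function body is written for one convention but would be called with the other.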

The complete file, which worked for me, is:

#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

typedef struct Int8TransTypeData
{
    int64       count;
    int64       sum;
} Int8TransTypeData;

PG_FUNCTION_INFO_V1(int2_avg_accum2);

Datum
int2_avg_accum2(PG_FUNCTION_ARGS)
{
    ArrayType  *transarray;
    int16       newval = PG_GETARG_INT16(1);
    Int8TransTypeData *transdata;

    /*
     * If we're invoked as an aggregate, we can cheat and modify our first
     * parameter in-place to reduce palloc overhead. Otherwise we need to make
     * a copy of it before scribbling on it.
     */
    if (AggCheckCallContext(fcinfo, NULL))
        transarray = PG_GETARG_ARRAYTYPE_P(0);
    else
        transarray = PG_GETARG_ARRAYTYPE_P_COPY(0);

    if (ARR_HASNULL(transarray) ||
        ARR_SIZE(transarray) != ARR_OVERHEAD_NONULLS(1) + sizeof(Int8TransTypeData))
        elog(ERROR, "expected 2-element int8 array");

    transdata = (Int8TransTypeData *) ARR_DATA_PTR(transarray);
    transdata->count++;
    transdata->sum += newval;

    PG_RETURN_ARRAYTYPE_P(transarray);
}

Compiled with:

gcc -I/usr/pgsql-9.2/include/server -fPIC -c my_avg_accum.c
gcc -shared -o my_avg_accum.so my_avg_accum.o

I was using PostgreSQL 9.2 on CentOS 6. You may need to adjust the paths to match your setup.



Source: https://stackoverflow.com/questions/20329959/create-copy-of-postgresql-internal-c-function-and-load-it-as-user-defined-functi
