Using Postgres domains to simplify function input validation

霸气de小男生 submitted on 2021-02-10 20:45:06

Question


Using Postgres 11.5, I've been looking at CREATE DOMAIN since yesterday, and would like to clarify how domains can or can't help with function parameters. Ideally, I'd like to use a domain to screen parameter inputs easily, but with a helpful error response. As a simple first case, I'm using a domain that blocks null and empty strings:

CREATE DOMAIN text_not_empty AS
    text
    NOT NULL
    CHECK (value <> '');

I tried this out as a field type for a table, and it's great. When we don't allow empty strings, this seems like a simple way to implement a constraint without an individual rule or a trigger. I'm hoping to get similar benefits on function parameters. However, with functions we really need clear error messages, as the caller may be Node or some other environment.
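For readers who want to reproduce the table case, here is a minimal sketch. The table name search_term and its columns are hypothetical and only serve as an illustration; the domain is the text_not_empty defined above.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE search_term (
    id   integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    term text_not_empty   -- the domain carries the NOT NULL and CHECK rules
);

INSERT INTO search_term (term) VALUES ('postgres');  -- accepted
INSERT INTO search_term (term) VALUES ('');          -- rejected by the domain's CHECK
INSERT INTO search_term (term) VALUES (NULL);        -- rejected by the domain's NOT NULL
```

No column-level constraint or trigger is declared on the table itself; the rules travel with the type.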

As an example, here's a dummy function that requires a string, using an explicit check on the input:

CREATE OR REPLACE FUNCTION api.test_this_function(in_text text)
 RETURNS int
 LANGUAGE plpgsql
AS $function$

BEGIN

IF in_text = '' THEN
    RAISE EXCEPTION USING
        message = 'Input text must be supplied',
        detail  = 'Deets',
        hint    = 'Supply a search string',
        errcode = 'KC123'; -- Custom code
END IF;

    RETURN 1;

END;
$function$;

This works fine, but I'm hoping a domain could simplify such cases. As far as I can tell, a domain won't improve things as the function never gets a chance to grab the error. If I've read the docs and understood my experiments correctly, you can only RAISE within a BEGIN...END block. If so, how do people recommend vetting inputs? And am I missing an opportunity with domains?

To flesh out what I'm basing my impressions on, here's a function that uses a domain-based check as well as a (silly) custom check:

CREATE OR REPLACE FUNCTION api.domain_test(in_text text_not_empty)
     RETURNS timestamptz

AS $BODY$

BEGIN

IF in_text = 'foo' THEN
    RAISE EXCEPTION USING
        message = 'Invalid search string',
        hint = 'Supply a search string other than ''foo''.',
        errcode = 'KC123'; -- Custom code
END IF;

    RETURN now();

END;

$BODY$
 LANGUAGE plpgsql;

So, it should fail on no parameter, a null parameter, an empty string, or a string of 'foo'. Otherwise, it should return a timestamp. I tried out these five cases, shown here:

select * from domain_test();      -- 1 : Fails to reach the method.
select * from domain_test(null);  -- 2 : Fails before entering method.
select * from domain_test('');    -- 3 : Fails before entering method.
select * from domain_test('foo'); -- 4 : Fails on custom exception.
select * from domain_test('a');   -- 5 : Succeeds.

I've diagrammed out how far each of these statements makes it through the function. The diagram is no clearer than the code, but sometimes I find it helpful to try to make a diagram, just to see if I've got all of the pieces.

I'm not asking a specific code question, but it would be a big help if someone could confirm that my model of how the errors are being caught and handled is correct and complete. Once I've understood how Postgres "thinks" about this stuff, it will be easier for me to reason about it too.

The null and empty string cases never get to the BEGIN block, so there doesn't seem to be a way to use RAISE to customize the message, hint, etc. Is there a global error handler, or a broader catch system that I'm overlooking?

Regarding ORMs

wildplasser offered some comments about ORMs and strategies, which make it clear I didn't explain the background here. I didn't want to bog the question down with more detail, but I figure I'll add in some clarification.

We're not going with an ORM. It seems like that adds another model/abstraction layer to help people used to some other language. For me, in this case, it's just more complexity that I gain nothing from. I'd prefer to write the queries in straight SQL without a lot of scaffolding. The idea is to push the query logic/SQL into Postgres. Then there's one place for the logic.

The plan is to make our PG query API a series of functions with defined inputs/outputs. I can capture or store those details using pg_proc, information_schema.parameters, and a custom table to define parameter rules (allowed/excluded values, series, or ranges). That much scaffolding should help, as it's pretty easy to mechanize. With the input/output data, I can automatically generate input/output declarations, check code (what I'm working on here), documentation, and test cases. The actual query body? I'll write that by hand. Writing a smart query builder that figures out all of my joins etc.? Postgres is better at that than I'll ever be. It's a huge task, and I'd do a crap job. So, I can hand-write the query body, give it to the PG planner/optimizer, and tweak as needed. It's in a black box, so outside clients aren't harmed by internal modifications.
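As a rough sketch of the catalog lookup described above, the query below lists the parameters of every function in a schema by joining two standard information_schema views. The schema name api comes from the examples in this question; the column selection is only one possible choice.

```sql
-- List each function in the api schema together with its parameters.
SELECT r.routine_name,
       p.ordinal_position,
       p.parameter_mode,   -- IN, OUT or INOUT
       p.parameter_name,
       p.data_type,
       p.udt_name          -- name of the underlying type, e.g. a domain
FROM information_schema.routines AS r
JOIN information_schema.parameters AS p
  ON  p.specific_schema = r.specific_schema
  AND p.specific_name   = r.specific_name
WHERE r.routine_schema = 'api'
ORDER BY r.routine_name, p.ordinal_position;
```

The result could be joined against the custom rules table mentioned above to drive the generated check code.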

The HTTP API layer will be written in at least two languages to start with, with possibly more languages, and likely more dialects, to follow. Then the Node, etc. tools can handle the routing and function calls in their own idiom. The last thing I want to do is push the query logic out to redundant implementations in different languages. Nightmare on so many levels. For the record, the functions will mostly RETURN TABLE, defined inline or via CREATE TYPE.


Answer 1:


Your insights are accurate. The error

select domain_test('');

ERROR:  value for domain text_not_empty violates check constraint "text_not_empty_check"

is raised at the stage of resolving the function argument types, hence the function is never executed. If your aim is to customize the error message, the custom domain does not help you.

There is no global error handler. The only option is to call the function inside another code block:

do $$
begin
    -- perform discards the result; a bare select has no destination in PL/pgSQL
    perform domain_test('');
exception when others then
    raise exception 'my error message';
end $$;

ERROR:  my error message
CONTEXT:  PL/pgSQL function inline_code_block line 5 at RAISE
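The same idea can be packaged as a wrapper function that traps only the two domain errors rather than everything. This is a minimal sketch, not something from the question: the name api.domain_test_safe is hypothetical, and it assumes the text_not_empty domain and api.domain_test function shown earlier already exist. It relies on domain violations being reported under the standard condition names not_null_violation and check_violation.

```sql
-- Hypothetical wrapper; takes plain text so the domain cast happens inside the block.
CREATE OR REPLACE FUNCTION api.domain_test_safe(in_text text)
    RETURNS timestamptz
    LANGUAGE plpgsql
AS $BODY$
BEGIN
    -- The cast to text_not_empty is evaluated here, at run time,
    -- so a domain violation can be trapped by the handler below.
    RETURN api.domain_test(in_text::text_not_empty);
EXCEPTION
    WHEN not_null_violation OR check_violation THEN
        RAISE EXCEPTION USING
            message = 'Input text must be supplied',
            hint    = 'Supply a non-empty search string',
            errcode = 'KC123'; -- Custom code, as in the question
END;
$BODY$;
```

With this in place, a call such as SELECT api.domain_test_safe(''); should surface the custom message, while the 'foo' exception raised inside api.domain_test passes through untouched. One caveat: the handler would also trap a not-null or check violation raised by a table constraint inside the wrapped function, so it fits best around functions that only read data.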

It seems though that your original approach without a custom domain makes more sense.




Answer 2:


I'd say you got that right.

If the supplied value fails the tests for the domain, the function is not even called. That's a feature: it centralizes such tests in the domain definition, so that you don't have to repeat them all over the place, and it saves the expense of actually calling the function.

I find at least the second error message pretty helpful:

ERROR:  value for domain text_not_empty violates check constraint "text_not_empty_check"

ERROR:  domain text_not_empty does not allow null values

If that is not clear enough for you, and you don't mind writing C, you could write your own data type and have fancy error messages in the type input function.
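A lighter-weight variation that stays in SQL, offered here only as a sketch and not as something proposed in the answers: let the domain's CHECK call a function that raises the custom error itself. The names assert_text_not_empty and text_not_empty_verbose are hypothetical. The domain deliberately omits NOT NULL so that the NULL case also reaches the function.

```sql
-- Hypothetical helper: raises a custom error instead of returning false.
CREATE OR REPLACE FUNCTION assert_text_not_empty(val text)
    RETURNS boolean
    LANGUAGE plpgsql
    IMMUTABLE
AS $fn$
BEGIN
    IF val IS NULL OR val = '' THEN
        RAISE EXCEPTION USING
            message = 'Input text must be supplied',
            hint    = 'Supply a non-empty search string',
            errcode = 'KC123'; -- Custom code, as in the question
    END IF;
    RETURN true;
END;
$fn$;

-- Hypothetical domain whose only constraint delegates to the helper.
CREATE DOMAIN text_not_empty_verbose AS text
    CHECK (assert_text_not_empty(value));
```

A function parameter declared as text_not_empty_verbose is still rejected before the function body runs, but the caller should then see the custom message, hint, and error code rather than the generic domain violation text. The trade-off is that the wording is fixed per domain, not per function.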



Source: https://stackoverflow.com/questions/60048259/using-postgres-domains-to-simplify-function-input-validation
