问题
As of NiFi 1.7.1, the new DBCPConnectionPoolLookup enables dynamic selection of database connections: set an attribute database.name
on a FlowFile and when a consuming processor accesses a configured DBCPConnectionPoolLookup controller service, the content of that attribute will be used to get a connection through this lookup's configured properties, which contain a mapping of potential values to DBCPConnectionPool controller service.
I'd like to list the tables in each database that I've configured in the lookup, but the ListDatabaseTables processor does not accept incoming FlowFiles. This seems to mean that it's not usable for listing tables in a dynamic set of databases.
What is the best way to accomplish this?
回答1:
ListDatabaseTables uses the JDBC API for getting table info from the metadata of an established JDBC connection. This hides the underlying method of how to actually get tables from a particular database.
If all your databases are of the same ilk, then if you have a list of databases, you could generate flow files with one per database, filling in the database.name
attribute, then using ExecuteSQL with the DBCPConnectionPoolLookup to execute the corresponding SQL statement to get the tables for that database, such as SHOW TABLES
. You can parse the records using any of the record-aware processors such as QueryRecord, UpdateRecord, ConvertRecord, etc. and if you need one table per flow file you can use SplitRecord. If the output is JSON or CSV or XML, you could use EvaluateJsonPath, ExtractText, or EvaluateXPath respectively to get the table name into an attribute, and continue on from there.
I wrote up NIFI-5519 to cover the proposal for ListDatabaseTables to optionally accept incoming connections, in the meantime you'd need 1 ListDatabaseTables instance to correspond to each of your DBCPConnectionPool instances.
来源:https://stackoverflow.com/questions/51848326/how-can-i-list-database-tables-in-a-set-of-databases-using-the-new-dbcpconnectio