Vertica

Optimizing a join in Vertica

别来无恙 submitted on 2019-12-11 10:59:17
Question: I have a query like this:

    SELECT a.column, b.column
    FROM table_a a
    INNER JOIN table_b b ON a.id = b.id
    WHERE a.anotherid = 'some condition'

It is supposed to be very fast, because with the predicate a.anotherid = 'some condition' the query plan should filter out much of the data from table_b. However, according to the Vertica documentation, "The WHERE clause is evaluated after the join is performed. It filters records returned by the FROM clause, eliminating any records that do not satisfy the WHERE clause."
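In practice, that documented logical evaluation order does not dictate the physical plan: Vertica's optimizer normally pushes such a predicate down into the table scan before the join. One way to check is EXPLAIN, and a rewrite that makes the early filtering explicit is sketched below (table and column names are the question's own placeholders):

    EXPLAIN SELECT a.column, b.column
    FROM table_a a
    INNER JOIN table_b b ON a.id = b.id
    WHERE a.anotherid = 'some condition';

    -- Equivalent rewrite that applies the filter before the join explicitly:
    SELECT f.column, b.column
    FROM (SELECT id, column FROM table_a
          WHERE anotherid = 'some condition') f
    INNER JOIN table_b b ON f.id = b.id;

If the EXPLAIN output shows the filter attached to the scan of table_a, the original form is already being evaluated early.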

Get Multi-Column Counts in a Single Query

心不动则不痛 submitted on 2019-12-11 05:00:46
Question: I am working on an application where I need to write a query on a table that returns counts for multiple columns in a single query. After some research I was able to develop a query for a single sourceId, but what happens if I want results for multiple sourceIds?

    select '3' as sourceId,
      (select count(*) from event where sourceId = 3 and plateCategoryId = 3) as TotalNewCount,
      (select count(*) from event where sourceId = 3 and plateCategoryId = 4) as TotalOldCount;

I need to get TotalNewCount and TotalOldCount for several sourceIds in one result set.
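A common way to do this in a single scan (a sketch, using the event table from the question) is conditional aggregation with GROUP BY, which yields one row per sourceId:

    SELECT sourceId,
           SUM(CASE WHEN plateCategoryId = 3 THEN 1 ELSE 0 END) AS TotalNewCount,
           SUM(CASE WHEN plateCategoryId = 4 THEN 1 ELSE 0 END) AS TotalOldCount
    FROM event
    WHERE sourceId IN (3, 4, 5)  -- assumed list of sourceIds of interest; drop to cover all
    GROUP BY sourceId;

The plateCategoryId codes are carried over from the single-sourceId query above; the IN list is an assumption for illustration.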

How to create external procedures in Vertica

余生长醉 submitted on 2019-12-10 21:56:37
Question: How do I create functions / procedures in Vertica that make use of SQL clauses such as FROM, WHERE, GROUP BY, ORDER BY, LIMIT, etc.?

Answer 1: Vertica's CREATE FUNCTION syntax prohibits the use of such clauses in the expression:

    CREATE [ OR REPLACE ] FUNCTION
        [[db-name.]schema.]function-name ( [ argname argtype [, ...] ] )
        RETURN rettype
        AS
        BEGIN
            RETURN expression;
        END;

Note: only one RETURN expression is allowed in the CREATE FUNCTION definition.
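A minimal sketch of what that restriction permits: the body must be a single scalar expression, so logic that needs FROM, WHERE, or GROUP BY has to live in a view or a user-defined extension (UDx) instead. The function name and formula below are made up for illustration:

    CREATE OR REPLACE FUNCTION add_tax(amount NUMERIC, rate NUMERIC)
    RETURN NUMERIC
    AS
    BEGIN
        RETURN amount * (1 + rate);  -- a single expression; no FROM/WHERE allowed
    END;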

GROUP_CONCAT in Vertica

不羁的心 submitted on 2019-12-10 21:41:56
Question: Suppose we have data something like this:

    date       | campaign | raw | unq
    -----------+----------+-----+-----
    2016-06-01 | camp1    |   5 |   1
    2016-06-01 | camp2    |  10 |   1
    2016-06-01 | camp3    |  15 |   2
    2016-06-02 | camp4    |   5 |   3
    2016-06-02 | camp1    |   5 |   1

I need to group it in such a way as to obtain the following result:

    date       | campaigns           | raw | unq
    -----------+---------------------+-----+-----
    2016-06-01 | camp1, camp2, camp3 |  30 |   4
    2016-06-02 | camp4, camp1        |  10 |   4

MySQL has a GROUP_CONCAT function for these purposes.
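For reference: recent Vertica releases ship a LISTAGG aggregate that covers this (check availability on your version; older installs typically use the community GROUP_CONCAT UDx from Vertica's open-source UDx examples). A sketch against the question's data, with the table name assumed:

    SELECT date,
           LISTAGG(campaign) AS campaigns,  -- comma-separated by default
           SUM(raw) AS raw,
           SUM(unq) AS unq
    FROM campaign_stats                     -- assumed table name
    GROUP BY date;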

Transfer data from Vertica to Redshift using Apache NiFi

不问归期 submitted on 2019-12-08 19:30:34
Question: I want to transfer data from Vertica to Redshift using Apache NiFi. Which processors and configuration do I need to set?

Answer 1: If Vertica and Redshift have "well-behaved" JDBC drivers, you can set up a DBCPConnectionPool for each, then a SQL processor such as ExecuteSQL, QueryDatabaseTable, or GenerateTableFetch (the latter of which generates SQL for use in ExecuteSQL). These will get your records into Avro format; then (prior to NiFi 1.2.0) you can use ConvertAvroToJSON -> ConvertJSONToSQL -> PutSQL to get your records inserted into Redshift. In NiFi 1.2.0, you can set up an AvroReader
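For orientation, GenerateTableFetch emits paged SELECT statements that a downstream ExecuteSQL runs against the source. They are roughly of this shape (the table, key column, and page size here are assumptions, and the exact SQL depends on the configured database adapter):

    SELECT * FROM source_table
    WHERE id > 0 AND id <= 10000
    ORDER BY id
    LIMIT 10000;

Each generated flow file covers the next key range, so a large table is extracted in restartable chunks.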

Driver for pyodbc: how to specify its location on macOS?

喜欢而已 submitted on 2019-12-08 13:16:53
Question: I want to find a way to specify the path of a driver, or otherwise resolve the problem that pyodbc cannot find a Vertica driver, for the following Python 3 call with the pyodbc package: pyodbc.connect(...). I keep getting the error that a Vertica 9.0.x driver cannot be found. I used the installer here and the installer here on macOS. I currently use an alias "Vertica" in the command, but it is unknown to the pyodbc connect call, so apparently there is some driver file problem that I now need to find a way to resolve.

Vertica - Is there LATERAL VIEW functionality?

我的梦境 submitted on 2019-12-08 10:40:01
Question: I need to rotate a matrix to do TIMESERIES interpolation / gap filling, and would like to avoid the messy and inefficient UNION ALL approach. Is there anything like Hive's LATERAL VIEW EXPLODE functionality available in Vertica?

EDIT: @marcothesane -- thanks for your interesting scenario -- I like your approach to interpolation. I will play around with it more and see how it goes. Looks promising. FYI, here is the solution that I came up with. My scenario is that I am trying to view memory
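For reference, Vertica's native gap filling does not need a lateral view: the TIMESERIES clause together with TS_FIRST_VALUE / TS_LAST_VALUE interpolates values onto a regular time grid. A minimal sketch, with the table and column names assumed for illustration:

    SELECT slice_time,
           host,
           TS_FIRST_VALUE(memory_used, 'LINEAR') AS memory_used_interpolated
    FROM memory_readings
    TIMESERIES slice_time AS '1 minute' OVER (PARTITION BY host ORDER BY reading_time);

'LINEAR' interpolates between the surrounding points; 'CONST' carries the last observed value forward.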

Using an ODBC application with a JDBC driver

戏子无情 submitted on 2019-12-07 12:58:04
Question: My company uses Vertica. We have Python applications that connect to it with pyodbc. I do most of my development on a Mac (Snow Leopard), and unfortunately Vertica has not released ODBC drivers for Mac. They do have JDBC drivers, though. I don't think developing in Jython is a good compromise. Is there any way to use JDBC drivers with an ODBC application? Some kind of ODBC connector?

Answer 1: Edit: an update for Vertica 5/6 can be found here: https://github.com/serbaut/psycopg2. Here is a patch to make

Unable to write data to a Vertica database using Python SQLAlchemy - Type "TEXT" does not exist

廉价感情. submitted on 2019-12-07 03:22:28
I am trying to upload a pandas DataFrame into a Vertica database. I was able to set up the engine and query the database using SQLAlchemy, but when I try to upload data from the DataFrame I get the error message: Type "TEXT" does not exist. I am using Windows 10 and created an ODBC connection.

    import urllib
    import pandas as pd
    import sqlalchemy as sa

    engine = sa.create_engine('vertica+pyodbc:///?odbc_connect=%s' % (urllib.parse.quote('DSN=TESTDB'),))
    sql_query = "select * from sample_table"
    df = pd.read_sql_query(sql_query, con=engine)  # this works, get the data as required in the dataframe
    # [do various data transformations on df as ...]
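The error usually means pandas' to_sql is emitting a generic TEXT column type, which Vertica does not have (use VARCHAR or LONG VARCHAR). One workaround, sketched here with assumed table and column names, is to pre-create the target table with Vertica-native types so that pandas never issues its own CREATE TABLE:

    CREATE TABLE sample_table_out (
        id   INT,
        name VARCHAR(255),   -- VARCHAR instead of TEXT; Vertica has no TEXT type
        ts   TIMESTAMP
    );

With the table in place, df.to_sql('sample_table_out', engine, if_exists='append', index=False) should only insert rows; alternatively, passing a dtype mapping such as dtype={'name': sa.types.VARCHAR(255)} to to_sql avoids the TEXT default as well.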