database-partitioning

How to partition when ranking on a particular column?

こ雲淡風輕ζ posted on 2019-11-30 03:59:42
All: I have a data frame like the following. I know I can do a global rank order like this:

    dt <- data.frame(
      ID = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
      Value = c(4,3,1,3,4,6,6,1,8,4)
    );
    > dt
       ID Value
    1  A1     4
    2  A2     3
    3  A4     1
    4  A2     3
    5  A1     4
    6  A4     6
    7  A3     6
    8  A2     1
    9  A1     8
    10 A3     4
    dt$Order <- rank(dt$Value, ties.method = "first")
    > dt
       ID Value Order
    1  A1     4     5
    2  A2     3     3
    3  A4     1     1
    4  A2     3     4
    5  A1     4     6
    6  A4     6     8
    7  A3     6     9
    8  A2     1     2
    9  A1     8    10
    10 A3     4     7

But how can I set a rank order for a particular ID instead of a global rank order? How can I get this done? In T-SQL, we can get this done as the following
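
For illustration only, here is a minimal Python sketch of what a per-ID rank means for the data above. It is not the R or T-SQL solution the question asks for. The function name `rank_within_groups` is hypothetical, and ties are broken by original row order, mirroring `ties.method = "first"`.

```python
from collections import defaultdict


def rank_within_groups(ids, values):
    """Rank each value among the rows sharing its ID; ties keep row order."""
    rows_by_group = defaultdict(list)
    for row, group in enumerate(ids):
        rows_by_group[group].append(row)

    ranks = [0] * len(values)
    for rows in rows_by_group.values():
        # sorted() is stable, so equal values stay in their original row order.
        ordered = sorted(rows, key=lambda r: values[r])
        for position, row in enumerate(ordered, start=1):
            ranks[row] = position
    return ranks


ids = ['A1', 'A2', 'A4', 'A2', 'A1', 'A4', 'A3', 'A2', 'A1', 'A3']
values = [4, 3, 1, 3, 4, 6, 6, 1, 8, 4]
print(rank_within_groups(ids, values))  # [1, 2, 1, 3, 2, 2, 2, 1, 3, 1]
```

Each rank now restarts at 1 inside every ID, so the two A1 rows with value 4 get ranks 1 and 2 and the A1 row with value 8 gets rank 3.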

ORA-14763: Unable to resolve FOR VALUES clause to a partition number

左心房为你撑大大i posted on 2019-11-29 23:24:12
Question: I have partitioned my table on a daily basis.

    TABLE NAME   MY_TABLE
    COLUMN NAME  IN_TIME TIMESTAMP

I want to fetch the rows from the partitions for the last 2 days. I am using the query below.

    SELECT * FROM MY_TABLE PARTITION FOR (TO_DATE('17-DEC-2017','DD-MON-YYYY'))
    UNION
    SELECT * FROM MY_TABLE PARTITION FOR (TO_DATE('18-DEC-2017','DD-MON-YYYY'))

I am trying to set the dates using a prepared statement:

    preparedStatement.setString(1, "17-DEC-2017");
    preparedStatement.setString(2, "18-DEC-2017");

But I get the

MySQL improve SELECT speed

蹲街弑〆低调 posted on 2019-11-29 18:28:42
Question: I'm currently trying to improve the speed of SELECT queries on a MySQL table and would appreciate any suggestions. We have over 300 million records in the table, which has the structure tag, date, value. The primary key is a composite key of tag and date. The table contains information for about 600 unique tags, most with an average of about 400,000 rows, though they range from 2,000 to over 11 million rows. The queries run against the table are: SELECT date, value FROM

What is the algorithm used by the ORA_HASH function?

巧了我就是萌 posted on 2019-11-29 16:59:04
Question: I've come across some code in the application I'm working on that makes a database call merely to call the ORA_HASH function (documentation) on a UUID string. The reason it's doing this is that it needs the value to make a service call to another system that appears to use ORA_HASH for partitioning. I would like to know the algorithm ORA_HASH uses so that I can re-implement it to make a similar service call for an application that won't have access to a real database, let alone Oracle. I've

Partition Hive table by existing field?

有些话、适合烂在心里 posted on 2019-11-29 11:04:21
Can I partition a Hive table upon insert by an existing field? I have a 10 GB file with a date field and an hour of day field. Can I load this file into a table, then insert-overwrite into another partitioned table that uses those fields as a partition? Would something like the following work?

    INSERT OVERWRITE TABLE tealeaf_event PARTITION(dt=evt.datestring,hour=evt.hour)
    SELECT * FROM staging_event evt;

Thanks! Travis

I just ran across this trying to answer the same question and it was helpful but not quite complete. The short answer is yes, something like the query in the question will work
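
As a rough illustration of what such an insert asks for, here is a minimal Python sketch that routes staged rows into one bucket per (dt, hour) combination, keyed by a Hive-style `dt=.../hour=...` path. It models only the routing idea, not Hive itself; the function name `route_to_partitions`, the `event` column, and the sample rows are hypothetical.

```python
from collections import defaultdict


def route_to_partitions(rows, partition_cols=("dt", "hour")):
    """Group rows by their partition-column values under a Hive-style path."""
    partitions = defaultdict(list)
    for row in rows:
        path = "/".join(f"{col}={row[col]}" for col in partition_cols)
        # The partition values are carried by the path, so drop them from the row.
        data = {k: v for k, v in row.items() if k not in partition_cols}
        partitions[path].append(data)
    return dict(partitions)


# Hypothetical staging rows standing in for staging_event.
sample_rows = [
    {"event": "click", "dt": "2019-11-28", "hour": 3},
    {"event": "view", "dt": "2019-11-28", "hour": 3},
    {"event": "click", "dt": "2019-11-28", "hour": 4},
]

for path, data in route_to_partitions(sample_rows).items():
    print(path, len(data))
```

The point of the sketch is that the partition is chosen per row from the row's own field values, rather than being fixed once for the whole statement.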

SQL Error: ORA-14006: invalid partition name

雨燕双飞 posted on 2019-11-29 05:20:19
I am trying to partition an existing table in Oracle 12c R1 using the SQL statement below.

    ALTER TABLE TABLE_NAME MODIFY
      PARTITION BY RANGE (DATE_COLUMN_NAME)
      INTERVAL (NUMTOYMINTERVAL(1,'MONTH'))
      ( PARTITION part_01 VALUES LESS THAN (TO_DATE('01-SEP-2017', 'DD-MON-RRRR')) )
      ONLINE;

Getting error:

    Error report -
    SQL Error: ORA-14006: invalid partition name
    14006. 00000 - "invalid partition name"
    *Cause:    a partition name of the form <identifier> is expected but not present.
    *Action:   enter an appropriate partition name.

Partition needs to be done on the basis of a date datatype column with the

How to partition when ranking on a particular column?

最后都变了- posted on 2019-11-29 00:39:59
Question: All: I have a data frame like the following. I know I can do a global rank order like this:

    dt <- data.frame(
      ID = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
      Value = c(4,3,1,3,4,6,6,1,8,4)
    );
    > dt
       ID Value
    1  A1     4
    2  A2     3
    3  A4     1
    4  A2     3
    5  A1     4
    6  A4     6
    7  A3     6
    8  A2     1
    9  A1     8
    10 A3     4
    dt$Order <- rank(dt$Value, ties.method = "first")
    > dt
       ID Value Order
    1  A1     4     5
    2  A2     3     3
    3  A4     1     1
    4  A2     3     4
    5  A1     4     6
    6  A4     6     8
    7  A3     6     9
    8  A2     1     2
    9  A1     8    10
    10 A3     4     7

But how can I set a rank order for a particular ID

Cassandra: choosing a Partition Key

♀尐吖头ヾ posted on 2019-11-28 04:33:01
I'm undecided whether it's better, performance-wise, to use a very commonly shared column value (like Country) as the partition key for a compound primary key or a rather unique column value (like Last_Name). Looking at Cassandra 1.2's documentation about indexes I get this: "When to use an index: Cassandra's built-in indexes are best on a table having many rows that contain the indexed value. The more unique values that exist in a particular column, the more overhead you will have, on average, to query and maintain the index. For example, suppose you had a user table with a billion users and

Partition Hive table by existing field?

南笙酒味 posted on 2019-11-28 03:55:23
Question: Can I partition a Hive table upon insert by an existing field? I have a 10 GB file with a date field and an hour of day field. Can I load this file into a table, then insert-overwrite into another partitioned table that uses those fields as a partition? Would something like the following work?

    INSERT OVERWRITE TABLE tealeaf_event PARTITION(dt=evt.datestring,hour=evt.hour)
    SELECT * FROM staging_event evt;

Thanks! Travis

Answer 1: I just ran across this trying to answer the same question and it was

how to partition a table by datetime column?

岁酱吖の posted on 2019-11-27 18:37:21
I want to partition a MySQL table by a datetime column, one partition per day. The CREATE TABLE script is like this:

    CREATE TABLE raw_log_2011_4 (
      id bigint(20) NOT NULL AUTO_INCREMENT,
      logid char(16) NOT NULL,
      tid char(16) NOT NULL,
      reporterip char(46) DEFAULT NULL,
      ftime datetime DEFAULT NULL,
      KEY id (id)
    ) ENGINE=InnoDB AUTO_INCREMENT=286802795 DEFAULT CHARSET=utf8
    PARTITION BY hash (day(ftime)) partitions 31;

But when I select the data for a given day, it could not locate the partition. The select statement is like this:

    explain partitions select * from raw_log_2011_4 where day(ftime) = 30;

When I use
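
To make the layout concrete, here is a minimal Python sketch of where a row would land under this scheme, assuming MySQL's documented rule that a HASH partition is chosen as the partitioning expression modulo the number of partitions. The function name `hash_partition` is hypothetical. The sketch says nothing about whether the optimizer can prune partitions for a predicate written as `day(ftime) = 30`.

```python
from datetime import datetime

NUM_PARTITIONS = 31  # matches "partitions 31" in the CREATE TABLE script


def hash_partition(ftime, num_partitions=NUM_PARTITIONS):
    """Partition index for HASH(DAY(ftime)): day of month modulo the count."""
    return ftime.day % num_partitions


print(hash_partition(datetime(2011, 4, 30)))  # 30
print(hash_partition(datetime(2011, 3, 30)))  # 30
print(hash_partition(datetime(2011, 5, 31)))  # 0
```

Under this assumption, partition 30 holds the 30th of every month rather than one calendar day, and the 31st wraps around to partition 0.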