query-performance

Performance of nested select

只愿长相守 posted on 2019-12-07 05:49:47
Question: I know this is a common question, and I have read several other posts and papers, but I could not find one that takes into account indexed fields and the volume of records that both queries could return. My question is simple: which of the following two queries, written in an SQL-like syntax, is recommended in terms of performance? First query: Select * from someTable s where s.someTable_id in (Select someTable_id from otherTable o where o.indexedField = 123) Second query: Select * from someTable
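For a question like this, one hedged approach is to ask the engine for its plan instead of guessing. The sketch below uses SQLite through Python's `sqlite3` module only because it is self-contained; plans on other engines will differ. Table and column names follow the question, but the sample rows and the index name `idx_other_indexed` are made up, and the `EXISTS` form is an assumed alternative, since the second query is cut off in the excerpt above.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE someTable (someTable_id INTEGER PRIMARY KEY, payload TEXT);
        CREATE TABLE otherTable (
            otherTable_id INTEGER PRIMARY KEY,
            someTable_id INTEGER,
            indexedField INTEGER
        );
        -- Hypothetical index name; the question only says the field is indexed.
        CREATE INDEX idx_other_indexed ON otherTable (indexedField);
    """)
    conn.executemany(
        "INSERT INTO someTable VALUES (?, ?)",
        [(i, f"row {i}") for i in range(1, 101)],
    )
    # Every tenth row gets indexedField = 123, the rest get 7.
    conn.executemany(
        "INSERT INTO otherTable (someTable_id, indexedField) VALUES (?, ?)",
        [(i, 123 if i % 10 == 0 else 7) for i in range(1, 101)],
    )
    return conn


# First query, as written in the question.
IN_QUERY = """
    SELECT * FROM someTable s
    WHERE s.someTable_id IN (
        SELECT someTable_id FROM otherTable o WHERE o.indexedField = 123)
"""

# Assumed alternative for comparison (not taken from the question).
EXISTS_QUERY = """
    SELECT * FROM someTable s
    WHERE EXISTS (
        SELECT 1 FROM otherTable o
        WHERE o.someTable_id = s.someTable_id AND o.indexedField = 123)
"""


def plan(conn, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = build_demo_db()
in_rows = sorted(conn.execute(IN_QUERY).fetchall())
exists_rows = sorted(conn.execute(EXISTS_QUERY).fetchall())

for label, sql in (("IN", IN_QUERY), ("EXISTS", EXISTS_QUERY)):
    print(label, plan(conn, sql))
```

Both forms return the same rows here; which one is cheaper depends on the engine, the indexes, and how many rows each side returns, which is why comparing the plans on realistic data volumes tends to be more reliable than a general rule.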

How to measure RU in DocumentDB?

℡╲_俬逩灬. posted on 2019-12-06 14:49:47
Given that Azure DocumentDB uses Request Units (RUs) as its measure of throughput, I would like to make sure my queries consume as few RUs as possible in order to increase my throughput. Is there a tool that will tell me how many RUs a query will take and whether the query is actually using an index? As you discovered, certain tools will report the RU charge upon completion of a query. This is also available programmatically: the x-ms-request-charge header is returned in the response and is easily retrievable via the DocumentDB SDKs. For example, here's a snippet showing RU retrieval using JS/node:
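The JS/node snippet is cut off in this excerpt. As a rough illustration of the same idea in Python, the sketch below reads the `x-ms-request-charge` header from a plain mapping of response headers. It does not call any DocumentDB SDK; how the headers are obtained from a particular SDK is left out, the helper name `request_charge` is hypothetical, and the sample header values are invented.

```python
def request_charge(headers):
    """Return the RU charge from a response-header mapping, or None if absent.

    Header names are matched case-insensitively, since HTTP header names
    are not case-sensitive.
    """
    for name, value in headers.items():
        if name.lower() == "x-ms-request-charge":
            return float(value)
    return None


# Invented sample headers, standing in for two query responses.
responses = [
    {"Content-Type": "application/json", "x-ms-request-charge": "2.86"},
    {"X-MS-Request-Charge": "10.5"},
    {"Content-Type": "application/json"},  # no charge reported
]

# Sum the charges, treating a missing header as zero.
total_rus = sum(request_charge(h) or 0.0 for h in responses)
print(f"total request charge: {total_rus:.2f} RUs")
```

Logging this value per query is a simple way to compare alternative query shapes against each other.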

SQL Performance: Using OR is slower than IN when using order by

安稳与你 posted on 2019-12-06 11:58:29
I am using MariaDB 10.0.21 and running a query similar to the following query against 12 Million Rows: SELECT `primary_key` FROM `texas_parcels` WHERE `zip_code` IN ('28461', '48227', '60411', '65802', '75215', '75440', '75773', '75783', '76501', '76502', '76504', '76511', '76513', '76518', '76519', '76520', '76522', '76525', '76527', '76528', '76530', '76537', '76539', '76541', '76542', '76543', '76548', '76549', '76550', '76556', '76567', '76571', '76574', '76577', '76578', '76642', '76704', '76853', '77418', '77434', '77474', '77833', '77835', '77836', '77845', '77853', '77879', '77964',

Oracle “Total” plan cost is really less than some of its elements

江枫思渺然 posted on 2019-12-06 11:53:00
Question: I cannot figure out why the total cost of a plan can sometimes be a very small number when, looking inside the plan, we find huge costs (and indeed the query is very slow). Can somebody explain that to me? Here is an example. Apparently the costly part comes from a field in the main select that does a listagg on a subview, and the join condition with this subview is complex (we can join on one field or another). | Id | Operation | Name | Rows | Bytes | Cost | ---------------

Why do different MongoDB query plans show different nReturned values?

我的未来我决定 posted on 2019-12-06 09:39:21
I have a collection faults in my MongoDB database in which every document has these fields: rack_name, timestamp. Just for the sake of testing and comparing performance, I have created these two indexes: rack -> {'rack_name': 1} and time -> {'timestamp': 1}. I then executed the following query with explain(): db.faults.find({ 'rack_name': { $in: [ 'providence1', 'helena2' ] }, 'timestamp': { $gt: 1501548359000 } }) .explain('allPlansExecution') and here is the result: { "queryPlanner" : { "plannerVersion" : 1, "namespace" : "quicktester_clone.faults", "indexFilterSet" : false, "parsedQuery" : { "$and"

Query performance issue for large nested data in mongodb

假如想象 posted on 2019-12-06 08:07:33
I'm trying to query results from a large dataset called 'tasks' containing 187297 documents, which are nested in another dataset called 'workers', which is in turn nested in a collection called 'production_units'. production_units -> workers -> tasks (BTW this is a simplified version of production_units): [{ "_id": ObjectId("5aca27b926974863ed9f01ab"), "name": "Z", "workers": [{ "name": "X Y", "worker_number": 655, "employed": false, "_id": ObjectId("5aca27bd26974863ed9f0425"), "tasks": [{ "_id": ObjectId("5ac9f6c2e1a668d6d39c1fd1"), "inbound_order_number": 3296, "task_number": 90,

Optimize MySQL Full outer join for massive amount of data

眉间皱痕 posted on 2019-12-05 23:59:07
We have the following MySQL tables (simplified to get straight to the point): CREATE TABLE `MONTH_RAW_EVENTS` ( `idEvent` int(11) unsigned NOT NULL, `city` varchar(45) NOT NULL, `country` varchar(45) NOT NULL, `ts` datetime NOT NULL, `idClient` varchar(45) NOT NULL, `event_category` varchar(45) NOT NULL, ... bunch of other fields PRIMARY KEY (`idEvent`), KEY `idx_city` (`city`), KEY `idx_country` (`country`), KEY `idClient` (`idClient`), ) ENGINE=InnoDB; CREATE TABLE `compilation_table` ( `idClient` int(11) unsigned DEFAULT NULL, `city` varchar(200) DEFAULT NULL, `month` int(2) DEFAULT NULL,

How to obtain the most recent row per type and perform calculations, depending on the row type?

谁说胖子不能爱 posted on 2019-12-05 21:30:45
I need some help writing and optimizing a query that retrieves the latest version of each row by type and performs some calculations depending on the type. I think it would be best to illustrate it with an example. Given the following dataset: +-------+-------------------+---------------------+-------------+---------------------+--------+----------+ | id | event_type | event_timestamp | message_id | sent_at | status | rate | +-------+-------------------+---------------------+-------------+---------------------+--------+----------+ | 1 | create | 2016-11-25 09:17:48 | 1 | 2016-11-25 09:17:48 | 0 | 0
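The dataset and the per-type calculations are cut off in this excerpt, so only the "latest row per type" part can be sketched. The example below is one common pattern, a join against a grouped `MAX`, run on SQLite through Python's `sqlite3` so that it is self-contained. The table name `events` is hypothetical, the column names follow the header shown above, all sample rows are invented, and reading "type" as `event_type` is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        event_type TEXT,
        event_timestamp TEXT,
        message_id INTEGER,
        sent_at TEXT,
        status INTEGER,
        rate REAL
    )
""")

# Invented rows; only the column layout comes from the question.
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "create", "2016-11-25 09:17:48", 1, "2016-11-25 09:17:48", 0, 0.0),
        (2, "create", "2016-11-25 09:20:00", 1, "2016-11-25 09:17:48", 1, 0.0),
        (3, "update", "2016-11-25 09:18:00", 1, "2016-11-25 09:17:48", 1, 0.5),
        (4, "update", "2016-11-25 09:30:00", 1, "2016-11-25 09:17:48", 2, 0.75),
    ],
)

# Find the newest timestamp per event_type, then join back to fetch the
# full row. ISO-formatted timestamps sort correctly as text.
LATEST_PER_TYPE = """
    SELECT e.*
    FROM events e
    JOIN (
        SELECT event_type, MAX(event_timestamp) AS latest
        FROM events
        GROUP BY event_type
    ) m ON m.event_type = e.event_type AND m.latest = e.event_timestamp
    ORDER BY e.event_type
"""

latest_rows = conn.execute(LATEST_PER_TYPE).fetchall()
latest_ids = [row[0] for row in latest_rows]
for row in latest_rows:
    print(row)
```

If two rows of the same type share the newest timestamp, this pattern returns both, so a tie-breaker (for example the highest `id`) may be needed depending on the data.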

Reuse mysql Subquery in InnerJoin

一笑奈何 posted on 2019-12-05 20:42:50
I'm trying to optimize a query by avoiding repetition of the subquery marked "COMPLEX QUERY", which is used twice and returns the same results both times. The original query: SELECT news.* FROM news INNER JOIN((SELECT myposter FROM (SELECT **COMPLEX QUERY**)) UNION (SELECT myposter FROM `profiles_old` prof2 WHERE prof2.profile_id NOT IN (SELECT **COMPLEX QUERY**))) r ON news.profile = r.p I was wondering if something like this was possible: SELECT news.* FROM (SELECT **COMPLEX QUERY**) complexQuery, news INNER JOIN ((SELECT myposter FROM complexquery) UNION (SELECT myposter FROM `profiles_old`

How to tell when a Postgres table was clustered and what indexes were used

假如想象 posted on 2019-12-05 14:28:00
I've been impressed by the performance improvements achieved with clustering, but not with how long it takes. I know clustering needs to be redone if a table or partition is changed after the clustering, but unless I've made a note of when I last clustered a table, how can I tell when I need to do it again? I can use this query to tell me which table(s) have one or more clustered indexes: SELECT * FROM pg_class c JOIN pg_index i ON i.indrelid = c.oid WHERE relkind = 'r' AND relhasindex AND i.indisclustered My questions are: How can I tell which indexes have been clustered? Is there any way of