query-optimization

SQL: How to select one record per day, assuming that each day contains more than one value (MySQL)

北战南征 submitted on 2019-12-21 04:26:14
Question: I want to select records from '2013-04-01 00:00:00' to today. Each day has many values, because one value is saved every 15 minutes, so I want only the first or last value of each day. Table schema: CREATE TABLE IF NOT EXISTS `value_magnitudes` ( `id` int(11) NOT NULL AUTO_INCREMENT, `value` float DEFAULT NULL, `magnitude_id` int(11) DEFAULT NULL, `sdi_belongs_id` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL, `reading_date` datetime DEFAULT NULL, `created_at` datetime
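One common way to get the first reading of each day is to group by the calendar date, take MIN(reading_date) per group, and join back to the table. The following is a minimal sketch, not the accepted answer: it uses Python's sqlite3 module as a stand-in for MySQL, keeps only a few columns from the schema above, and the sample rows are invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; only some columns of value_magnitudes are kept.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE value_magnitudes ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " value REAL, magnitude_id INTEGER, reading_date TEXT)"
)
# Invented sample readings (not from the question).
SAMPLE_ROWS = [
    (1.0, 1, "2013-04-01 00:00:00"),
    (2.0, 1, "2013-04-01 00:15:00"),
    (3.0, 1, "2013-04-01 23:45:00"),
    (4.0, 1, "2013-04-02 00:00:00"),
    (5.0, 1, "2013-04-02 00:15:00"),
]
conn.executemany(
    "INSERT INTO value_magnitudes (value, magnitude_id, reading_date)"
    " VALUES (?, ?, ?)",
    SAMPLE_ROWS,
)

# The subquery finds the earliest timestamp of each day; the join fetches that row.
FIRST_PER_DAY = """
SELECT v.reading_date, v.value
FROM value_magnitudes AS v
JOIN (
    SELECT MIN(reading_date) AS first_reading
    FROM value_magnitudes
    WHERE reading_date >= '2013-04-01 00:00:00'
    GROUP BY DATE(reading_date)
) AS d ON v.reading_date = d.first_reading
ORDER BY v.reading_date
"""
first_per_day = conn.execute(FIRST_PER_DAY).fetchall()
```

Swapping MIN for MAX returns the last reading of each day instead. If several magnitudes share the same timestamp, the subquery would also need to group by (and join on) magnitude_id to avoid duplicate rows.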

Where might I find a method to convert an arbitrary boolean expression into conjunctive or disjunctive normal form?

独自空忆成欢 submitted on 2019-12-21 03:56:17
Question: I've written a little app that parses expressions into abstract syntax trees. Right now, I use a bunch of heuristics against the expression to decide how best to evaluate the query. Unfortunately, there are examples that make the query plan extremely bad. I've found a way to provably make better guesses as to how queries should be evaluated, but I need to put my expression into CNF or DNF first in order to get provably correct answers. I know this could result in potentially
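The textbook route to CNF is to push negations down to the variables (negation normal form) and then distribute OR over AND. The sketch below assumes a hypothetical AST encoding as nested tuples, ("var", name), ("not", e), ("and", l, r), ("or", l, r); it is not the asker's parser. Note that distribution can enlarge the expression exponentially in the worst case.

```python
from itertools import product

def to_nnf(e):
    """Push negations down to variables (negation normal form)."""
    op = e[0]
    if op == "var":
        return e
    if op in ("and", "or"):
        return (op, to_nnf(e[1]), to_nnf(e[2]))
    inner = e[1]                      # op == "not"
    if inner[0] == "var":
        return e                      # negated variable is already a literal
    if inner[0] == "not":
        return to_nnf(inner[1])       # double negation
    flipped = "or" if inner[0] == "and" else "and"   # De Morgan
    return (flipped, to_nnf(("not", inner[1])), to_nnf(("not", inner[2])))

def distribute(a, b):
    """CNF of (a OR b), given that a and b are already in CNF."""
    if a[0] == "and":
        return ("and", distribute(a[1], b), distribute(a[2], b))
    if b[0] == "and":
        return ("and", distribute(a, b[1]), distribute(a, b[2]))
    return ("or", a, b)

def to_cnf(e):
    """Convert an arbitrary and/or/not expression to conjunctive normal form."""
    def cnf(n):
        if n[0] == "and":
            return ("and", cnf(n[1]), cnf(n[2]))
        if n[0] == "or":
            return distribute(cnf(n[1]), cnf(n[2]))
        return n                      # literal
    return cnf(to_nnf(e))

def evaluate(e, env):
    """Evaluate an expression under a dict of variable -> bool."""
    op = e[0]
    if op == "var":
        return env[e[1]]
    if op == "not":
        return not evaluate(e[1], env)
    if op == "and":
        return evaluate(e[1], env) and evaluate(e[2], env)
    return evaluate(e[1], env) or evaluate(e[2], env)

def equivalent(e1, e2, names):
    """Brute-force truth-table comparison, for checking small examples."""
    return all(
        evaluate(e1, dict(zip(names, bits))) == evaluate(e2, dict(zip(names, bits)))
        for bits in product([False, True], repeat=len(names))
    )

A, B, C = ("var", "a"), ("var", "b"), ("var", "c")
example = ("or", A, ("and", B, C))    # a OR (b AND c)
example_cnf = to_cnf(example)         # (a OR b) AND (a OR c)
```

DNF follows by symmetry: swap the roles of "and" and "or" in distribute and to_cnf.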

Ridiculously slow MongoDB query on small collection in simple but big database

不问归期 submitted on 2019-12-21 02:28:08
Question: So I have a super simple database in MongoDB with a few collections:
> show collections
Aggregates <-- count: 92
Users <-- count: 68222
Pages <-- count: 1728288847, about 1.1TB
system.indexes
The Aggregates collection is an aggregate of the Pages collection, and each document looks like this:
> db.Aggregates.findOne()
{ "_id" : ObjectId("50f237126ba71610eab3aaa5"), "daily_total_pages" : 16929799, "day" : 21, "month" : 9, "year" : 2011 }
Very simple. However, let's try and get the total page

When designing databases, what is the preferred way to store multiple true / false values?

柔情痞子 submitted on 2019-12-20 18:26:44
Question: As stated in the title, when designing databases, what is the preferred way to handle tables that have multiple columns storing only true / false values as a single either-or value (e.g. "Y/N" or "0/1")? Likewise, are there issues that might arise between different databases (e.g. Oracle and SQL Server) that might affect how the columns are handled? Answer 1: In SQL Server, there is a BIT data type. You can store 0 or 1 there and compare the values, but you cannot run MIN or MAX on them. In Oracle

Which provides better performance: one big join or multiple queries?

丶灬走出姿态 submitted on 2019-12-20 11:08:06
Question: I have a table called orders. One column on orders is customer_id. I have a table called customers with 10 fields. I want to build up an array of order objects, with a customer object embedded in each order object, and I have two choices. Option 1: a. First query the orders table. b. Loop through the records and query the persons table to get the record for each person. This would be something like: Select * from Applications Select * from Customer where id = 1 Select * from Customer
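To make the trade-off concrete, the sketch below builds the same order objects both ways and counts the queries issued. It assumes an in-memory SQLite database in place of the asker's server, and the name and total columns plus the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1, 9.5), (11, 2, 20.0), (12, 1, 3.25);
""")

def orders_query_per_row(conn):
    """Option 1: one query for the orders, then one more query per order."""
    orders, queries = [], 1
    rows = conn.execute(
        "SELECT id, customer_id, total FROM orders ORDER BY id").fetchall()
    for order_id, customer_id, total in rows:
        name = conn.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()[0]
        queries += 1
        orders.append({"id": order_id, "total": total,
                       "customer": {"id": customer_id, "name": name}})
    return orders, queries

def orders_single_join(conn):
    """One join: every order arrives with its customer columns attached."""
    rows = conn.execute(
        "SELECT o.id, o.total, c.id, c.name"
        " FROM orders AS o JOIN customers AS c ON c.id = o.customer_id"
        " ORDER BY o.id"
    ).fetchall()
    orders = [{"id": order_id, "total": total,
               "customer": {"id": customer_id, "name": name}}
              for order_id, total, customer_id, name in rows]
    return orders, 1

per_row_orders, per_row_queries = orders_query_per_row(conn)
join_orders, join_queries = orders_single_join(conn)
```

With N orders, the per-row approach issues N + 1 queries, and on a networked database each one pays a round trip. The join issues one query regardless of N, at the cost of repeating customer columns on every order row.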

Selecting COUNT from different criteria on a table

主宰稳场 submitted on 2019-12-20 10:28:35
Question: I have a table named 'jobs'. For a particular user, a job can be active, archived, overdue, pending, or closed. Right now every page request generates 5 COUNT queries, and in an attempt at optimization I'm trying to reduce this to a single query. This is what I have so far, but it is barely faster than the 5 individual queries. Note that I've simplified the conditions for each subquery to make it easier to understand; the full query behaves the same way, however. Is there a way to get these 5 counts
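A common way to get several counts in one pass over the table is conditional aggregation: one SUM(CASE ...) per category instead of one subquery per category. The sketch below assumes a single hypothetical status column standing in for the asker's real (and more complex) conditions, and uses sqlite3 for a self-contained demo.

```python
import sqlite3

STATUSES = ("active", "archived", "overdue", "pending", "closed")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT)")
# Invented sample rows.
conn.executemany(
    "INSERT INTO jobs (user_id, status) VALUES (?, ?)",
    [(1, "active"), (1, "active"), (1, "archived"), (1, "overdue"),
     (1, "closed"), (2, "active"), (2, "pending")],
)

# One SUM(CASE ...) column per status; STATUSES is a fixed constant, so
# building the select list with string formatting is safe here.
# COALESCE covers users with no jobs, where SUM would return NULL.
SELECT_LIST = ", ".join(
    f"COALESCE(SUM(CASE WHEN status = '{s}' THEN 1 ELSE 0 END), 0) AS {s}"
    for s in STATUSES
)
COUNTS_QUERY = f"SELECT {SELECT_LIST} FROM jobs WHERE user_id = ?"

def job_counts(conn, user_id):
    """Return all five counts for one user from a single table scan."""
    row = conn.execute(COUNTS_QUERY, (user_id,)).fetchone()
    return dict(zip(STATUSES, row))

counts_user_1 = job_counts(conn, 1)
```

Each real condition from the original subqueries would replace the status = '...' test inside its CASE expression.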

Spark SQL: how to cache sql query result without using rdd.cache()

百般思念 submitted on 2019-12-20 09:38:47
Question: Is there any way to cache a SQL query result without using rdd.cache()? For example: output = sqlContext.sql("SELECT * From people") We can use output.cache() to cache the result, but then we cannot use a SQL query to deal with it. So I want to ask: is there anything like sqlContext.cacheTable() to cache the result? Answer 1: You should use sqlContext.cacheTable("table_name") to cache it, or alternatively use the CACHE TABLE table_name SQL query. Here's an example. I've got this file on

What is the difference between Seq Scan and Bitmap Heap Scan in PostgreSQL?

有些话、适合烂在心里 submitted on 2019-12-20 08:38:54
Question: In the output of the EXPLAIN command I found two terms, 'Seq Scan' and 'Bitmap Heap Scan'. Can somebody tell me what the difference is between these two types of scan? (I am using PostgreSQL.) Answer 1: http://www.postgresql.org/docs/8.2/static/using-explain.html Basically, a sequential scan goes to the actual rows, starts reading from row 1, and continues until the query is satisfied (this may not be the entire table, e.g., in the case of LIMIT). Bitmap heap scan means that PostgreSQL has found a
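The two scan types can be pictured with a toy model. This is only an illustration, not PostgreSQL's implementation: the "heap" is a list of pages, the "index" is a plain dict from value to page numbers, and every name here is made up. The point it shows is that a sequential scan reads every page, while a bitmap heap scan first collects the pages that may hold matches and then reads only those, in physical order, rechecking the condition on each row.

```python
# A toy heap: each page holds a few (row_id, value) tuples.
PAGES = [
    [(0, 5), (1, 42)],
    [(2, 7), (3, 8)],
    [(4, 42), (5, 1)],
    [(6, 3), (7, 42)],
]

def seq_scan(pages, wanted):
    """Read every page in order and test every row."""
    pages_read, hits = 0, []
    for page in pages:
        pages_read += 1
        hits.extend(row for row in page if row[1] == wanted)
    return hits, pages_read

def build_index(pages):
    """Map each value to the page numbers containing it (stand-in for an index)."""
    index = {}
    for page_no, page in enumerate(pages):
        for _, value in page:
            index.setdefault(value, []).append(page_no)
    return index

def bitmap_heap_scan(pages, index, wanted):
    """Collect candidate page numbers first, then visit them in physical order."""
    bitmap = sorted(set(index.get(wanted, [])))
    pages_read, hits = 0, []
    for page_no in bitmap:
        pages_read += 1
        # Recheck the condition, since the bitmap only identifies pages.
        hits.extend(row for row in pages[page_no] if row[1] == wanted)
    return hits, pages_read

INDEX = build_index(PAGES)
seq_hits, seq_pages = seq_scan(PAGES, 42)
bitmap_hits, bitmap_pages = bitmap_heap_scan(PAGES, INDEX, 42)
```

Both scans return the same rows; the difference is how many pages are touched, which is why the planner tends to prefer the bitmap approach when a condition matches a moderate fraction of the table.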

How do I implement threaded comments?

和自甴很熟 submitted on 2019-12-20 08:27:50
Question: I am developing a web application that supports threaded comments. I need the ability to rearrange the comments based on the number of votes received (identical to how threaded comments work on reddit). I would love to hear input from the SO community on how to do it. How should I design the comments table? Here is the structure I am using now: Comment id parent_post parent_comment author points What changes should be made to this structure? How should I get the details from this
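With the structure above (an adjacency list via parent_comment), one workable approach is to load all comments for a post in a single query and assemble the tree in application code, sorting siblings by points. The sketch below assumes the rows arrive as dicts with the question's column names and that top-level comments have parent_comment set to None; the sample data is invented.

```python
from collections import defaultdict

def build_thread(comments):
    """Nest flat comment rows under their parents, highest points first."""
    children = defaultdict(list)
    for comment in comments:
        children[comment["parent_comment"]].append(comment)

    def attach(parent_id):
        ordered = sorted(children[parent_id],
                         key=lambda c: c["points"], reverse=True)
        return [{**c, "replies": attach(c["id"])} for c in ordered]

    return attach(None)   # top-level comments have no parent

def flatten(thread, depth=0):
    """Yield (depth, comment id) in display order, e.g. for indented rendering."""
    for comment in thread:
        yield depth, comment["id"]
        yield from flatten(comment["replies"], depth + 1)

# Invented rows for one post.
COMMENTS = [
    {"id": 1, "parent_post": 100, "parent_comment": None, "author": "a", "points": 3},
    {"id": 2, "parent_post": 100, "parent_comment": None, "author": "b", "points": 10},
    {"id": 3, "parent_post": 100, "parent_comment": 1, "author": "c", "points": 1},
    {"id": 4, "parent_post": 100, "parent_comment": 1, "author": "d", "points": 7},
    {"id": 5, "parent_post": 100, "parent_comment": 2, "author": "e", "points": 2},
]
display_order = list(flatten(build_thread(COMMENTS)))
```

Sorting happens per sibling group, so a highly voted reply never jumps above its parent; it only moves ahead of the other replies to the same comment.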

MongoDB: Performance impact of $HINT

元气小坏坏 submitted on 2019-12-20 07:18:22
Question: I have a query that uses a compound index with a sort on "_id". The compound index has "_id" at the end of the index, and it works fine until I add a $gt clause to my query, i.e., initial query: db.collection.find({"field1": "blabla", "field2":"blabla"}).sort({_id:1}) Subsequent queries: db.collection.find({"field1": "blabla", "field2":"blabla", _id:{$gt:ObjectId('...')}}).sort({_id:1}) What I am noticing is that there are times when my compound index is not used. Instead, Mongo uses the default