query-optimization | 易学教程

SQL: How to select one record per day, assuming that each day contains more than one value (MySQL)

Question: I want to select records from '2013-04-01 00:00:00' to today, but each day has many values because one is saved every 15 minutes, so I want only the first or last value from each day. Table schema: CREATE TABLE IF NOT EXISTS `value_magnitudes` ( `id` int(11) NOT NULL AUTO_INCREMENT, `value` float DEFAULT NULL, `magnitude_id` int(11) DEFAULT NULL, `sdi_belongs_id` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL, `reading_date` datetime DEFAULT NULL, `created_at` datetime
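One common way to approach this is to find the earliest `reading_date` within each calendar day and join back to the table to fetch the full row. The sketch below is only an illustration: it uses Python's built-in `sqlite3` as a stand-in for MySQL, keeps just four of the columns from the schema above, and the helper names `first_reading_per_day` and `make_demo_db` are hypothetical. It also assumes that no two rows share the same `reading_date`; with several `magnitude_id` values per timestamp, the grouping would need to include `magnitude_id` as well.

```python
import sqlite3


def first_reading_per_day(conn, start):
    """Return (id, value, reading_date) for the earliest reading of each day."""
    query = """
        SELECT v.id, v.value, v.reading_date
        FROM value_magnitudes AS v
        JOIN (
            -- one row per calendar day: the earliest timestamp of that day
            SELECT MIN(reading_date) AS first_date
            FROM value_magnitudes
            WHERE reading_date >= ?
            GROUP BY DATE(reading_date)
        ) AS firsts ON v.reading_date = firsts.first_date
        ORDER BY v.reading_date
    """
    return conn.execute(query, (start,)).fetchall()


def make_demo_db():
    """Build a small in-memory table with made-up sample readings."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE value_magnitudes ("
        "id INTEGER PRIMARY KEY, value REAL, "
        "magnitude_id INTEGER, reading_date TEXT)"
    )
    rows = [
        (1, 10.0, 1, "2013-04-01 00:00:00"),
        (2, 11.0, 1, "2013-04-01 00:15:00"),
        (3, 12.0, 1, "2013-04-02 00:00:00"),
        (4, 13.0, 1, "2013-04-02 00:15:00"),
        (5, 9.0, 1, "2013-03-31 23:45:00"),
    ]
    conn.executemany("INSERT INTO value_magnitudes VALUES (?, ?, ?, ?)", rows)
    return conn
```

Swapping `MIN` for `MAX` in the subquery would return the last reading of each day instead. An index on `reading_date` would likely help both the grouping subquery and the join, though the actual plan depends on the engine and the data.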

Where might I find a method to convert an arbitrary boolean expression into conjunctive or disjunctive normal form?

Question: I've written a little app that parses expressions into abstract syntax trees. Right now, I use a bunch of heuristics against the expression to decide how best to evaluate the query. Unfortunately, there are examples which make the query plan extremely bad. I've found a way to provably make better guesses as to how queries should be evaluated, but I need to put my expression into CNF or DNF first in order to get provably correct answers. I know this could result in potentially
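A textbook way to reach CNF is to push negations down to the variables (negation normal form, via De Morgan's laws and double-negation elimination) and then distribute OR over AND. The sketch below assumes a hypothetical tuple-based AST, `("var", name)`, `("not", e)`, `("and", l, r)`, `("or", l, r)`, which is not the asker's actual representation. Distribution can blow up exponentially on some inputs, so this is a sketch for small expressions rather than a production approach.

```python
def to_nnf(e):
    """Push negations down to the variables (negation normal form)."""
    op = e[0]
    if op == "var":
        return e
    if op == "not":
        inner = e[1]
        if inner[0] == "var":
            return e  # already a literal
        if inner[0] == "not":
            return to_nnf(inner[1])  # double negation cancels
        # De Morgan: not (a and b) == (not a) or (not b), and vice versa
        flipped = "or" if inner[0] == "and" else "and"
        return (flipped, to_nnf(("not", inner[1])), to_nnf(("not", inner[2])))
    return (op, to_nnf(e[1]), to_nnf(e[2]))


def distribute(a, b):
    """Return CNF for (a OR b), where a and b are already in CNF."""
    if a[0] == "and":
        return ("and", distribute(a[1], b), distribute(a[2], b))
    if b[0] == "and":
        return ("and", distribute(a, b[1]), distribute(a, b[2]))
    return ("or", a, b)


def _cnf(e):
    """Convert an expression that is already in NNF to CNF."""
    if e[0] == "and":
        return ("and", _cnf(e[1]), _cnf(e[2]))
    if e[0] == "or":
        return distribute(_cnf(e[1]), _cnf(e[2]))
    return e  # a literal: a variable or a negated variable


def to_cnf(e):
    """Convert an arbitrary expression over var/not/and/or to CNF."""
    return _cnf(to_nnf(e))
```

For DNF the same structure applies with the roles of "and" and "or" exchanged in `distribute` and `_cnf`.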

Ridiculously slow MongoDB query on small collection in simple but big database

Question: So I have a super simple database in MongoDB with a few collections: > show collections Aggregates <-- count: 92 Users <-- count: 68222 Pages <-- count: 1728288847, about 1.1TB system.indexes The Aggregates collection is an aggregate of the Pages collection, and each document looks like this: > db.Aggregates.findOne() { "_id" : ObjectId("50f237126ba71610eab3aaa5"), "daily_total_pages" : 16929799, "day" : 21, "month" : 9, "year" : 2011 } Very simple. However, let's try and get the total page

When designing databases, what is the preferred way to store multiple true / false values?

Question: As stated in the title, when designing databases, what is the preferred way to handle tables that have multiple columns that each store a true / false value as a single either-or value (e.g. "Y/N" or "0/1")? Likewise, are there issues that might arise between different databases (e.g. Oracle and SQL Server) that might affect how the columns are handled? Answer 1: In SQL Server, there is the BIT data type. You can store 0 or 1 there and compare the values, but not run MIN or MAX. In Oracle

Which provides better performance: one big join or multiple queries?

Question: I have a table called orders. One column on orders is customer_id. I have a table called customers with 10 fields. If I want to build up an array of order objects, with a customer object embedded in each order object, I have two choices. Option 1: a. First query the orders table. b. Loop through the records and query the persons table to get the record for each person. This would be something like: Select * from APplications Select * from Customer where id = 1 Select * from Customer
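The two shapes being compared can be sketched side by side. The example below is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` rather than the asker's database, the `name` column on `customers` is hypothetical (the question does not list the 10 fields), and the function names are made up. It shows that both approaches build the same nested structure, with the join needing one query and the loop needing one query per order plus one.

```python
import sqlite3


def orders_with_join(conn):
    """One JOIN returns every order together with its customer."""
    rows = conn.execute(
        """
        SELECT o.id, o.customer_id, c.name
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        ORDER BY o.id
        """
    ).fetchall()
    return [
        {"id": oid, "customer": {"id": cid, "name": name}}
        for oid, cid, name in rows
    ]


def orders_with_loop(conn):
    """One query for the orders, then one customer query per order (N+1)."""
    result = []
    orders = conn.execute(
        "SELECT id, customer_id FROM orders ORDER BY id"
    ).fetchall()
    for oid, cid in orders:
        name = conn.execute(
            "SELECT name FROM customers WHERE id = ?", (cid,)
        ).fetchone()[0]
        result.append({"id": oid, "customer": {"id": cid, "name": name}})
    return result


def make_orders_db():
    """Build a small in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")]
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 2), (12, 1)]
    )
    return conn
```

With a networked database, each extra query usually costs a round trip, so the per-order loop tends to get slower as the number of orders grows. Measuring both against real data is the only reliable way to decide.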

Selecting COUNT from different criteria on a table

Question: I have a table named 'jobs'. For a particular user, a job can be active, archived, overdue, pending, or closed. Right now every page request generates 5 COUNT queries, and in an attempt at optimization I'm trying to reduce this to a single query. This is what I have so far, but it is barely faster than the 5 individual queries. Note that I've simplified the conditions for each subquery to make it easier to understand; the full query behaves the same, however. Is there a way to get these 5 counts
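One widely used technique for this kind of problem is conditional aggregation: scan the user's rows once and sum a `CASE` expression per category, instead of running one subquery per count. The sketch below is an assumption-laden illustration, not the asker's query: it uses Python's built-in `sqlite3`, and it assumes a hypothetical `status` column holding one of the five states plus a `user_id` column, whereas the real conditions in the question are more involved.

```python
import sqlite3

STATUSES = ("active", "archived", "overdue", "pending", "closed")


def count_jobs_by_status(conn, user_id):
    """Return a {status: count} dict for one user using a single table scan."""
    # One SUM(CASE ...) per status; each placeholder is bound to a status name.
    cases = ", ".join(
        "SUM(CASE WHEN status = ? THEN 1 ELSE 0 END)" for _ in STATUSES
    )
    row = conn.execute(
        f"SELECT {cases} FROM jobs WHERE user_id = ?", (*STATUSES, user_id)
    ).fetchone()
    # SUM over zero rows yields NULL (None), so fall back to 0.
    return dict(zip(STATUSES, (n or 0 for n in row)))


def make_jobs_db():
    """Build a small in-memory jobs table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT)"
    )
    rows = [
        (1, "active"), (1, "active"), (1, "closed"), (1, "overdue"),
        (2, "active"), (2, "pending"),
    ]
    conn.executemany("INSERT INTO jobs (user_id, status) VALUES (?, ?)", rows)
    return conn
```

If each category is defined by a more complex predicate, that predicate can replace `status = ?` inside the corresponding `CASE`. Whether this beats five indexed COUNT queries depends on how selective the shared `WHERE` clause is.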

Spark SQL: how to cache sql query result without using rdd.cache()

Question: Is there any way to cache a SQL query result without using rdd.cache()? For example: output = sqlContext.sql("SELECT * From people") We can use output.cache() to cache the result, but then we cannot run SQL queries against it. So I want to ask: is there anything like sqlContext.cacheTable() to cache the result? Answer 1: You should use sqlContext.cacheTable("table_name") to cache it, or alternatively use the CACHE TABLE table_name SQL query. Here's an example. I've got this file on

What is the difference between Seq Scan and Bitmap Heap Scan in PostgreSQL?

Question: In the output of the EXPLAIN command I found two terms, 'Seq Scan' and 'Bitmap Heap Scan'. Can somebody tell me what the difference is between these two types of scan? (I am using PostgreSQL.) Answer 1: http://www.postgresql.org/docs/8.2/static/using-explain.html Basically, a sequential scan goes to the actual rows, starts reading from row 1, and continues until the query is satisfied (this may not be the entire table, e.g., in the case of LIMIT). Bitmap heap scan means that PostgreSQL has found a

How do I implement threaded comments?

Question: I am developing a web application that can support threaded comments. I need the ability to rearrange the comments based on the number of votes received (identical to how threaded comments work on Reddit). I would love to hear input from the SO community on how to do it. How should I design the comments table? Here is the structure I am using now: Comment: id, parent_post, parent_comment, author, points. What changes should be made to this structure? How should I get the details from this
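With the adjacency-list structure described above (each comment pointing at its `parent_comment`), one simple option is to load all comments for a post and assemble the tree in application code, sorting siblings by `points`. The sketch below assumes the rows have already been fetched as dictionaries with the `id`, `parent_comment`, and `points` fields from the question, and that top-level comments have `parent_comment` set to `None`; the function name `build_thread` is hypothetical.

```python
from collections import defaultdict


def build_thread(comments):
    """Nest comments under their parents, highest-voted siblings first."""
    # Group comments by the id of their parent (None for top-level comments).
    children = defaultdict(list)
    for comment in comments:
        children[comment["parent_comment"]].append(comment)

    def subtree(parent_id):
        ranked = sorted(
            children[parent_id], key=lambda c: c["points"], reverse=True
        )
        return [
            {"id": c["id"], "points": c["points"], "replies": subtree(c["id"])}
            for c in ranked
        ]

    return subtree(None)
```

This keeps the table design unchanged and does the ordering in memory, which is reasonable when a single post's comments fit comfortably in one query result. Very deep threads would hit Python's recursion limit, so an iterative traversal may be preferable there.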

MongoDB: Performance impact of $HINT

Question: I have a query that uses a compound index with a sort on "_id". The compound index has "_id" at the end of the index, and it works fine until I add a $gt clause to my query, i.e.: Initial query: db.collection.find({"field1": "blabla", "field2":"blabla"}).sort({_id:1}) Subsequent queries: db.collection.find({"field1": "blabla", "field2":"blabla", _id:{$gt:ObjectId('...')}}).sort({_id:1}) What I am noticing is that there are times when my compound index is not used. Instead, Mongo uses the default