database-performance

Load a large CSV file into Neo4j

Submitted by 北城余情 on 2019-12-23 04:40:10
Question: I want to load a CSV file, rels.csv, that contains relationships between Wikipedia categories (4 million relations between categories). I tried modifying the settings file by changing the following parameter values: dbms.memory.heap.initial_size=8G, dbms.memory.heap.max_size=8G, dbms.memory.pagecache.size=9G. My query is as follows: USING PERIODIC COMMIT 10000 LOAD CSV FROM "https://github.com/jbarrasa/datasets/blob/master/wikipedia/data/rels.csv?raw=true" AS row MATCH (from:Category { catId: row[0]} …
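The excerpt cuts off mid-query, so the following is only a sketch of the usual shape such a load takes: index catId first so each MATCH is an index seek rather than a label scan. The second MATCH line and the SUBCAT_OF relationship type are assumptions, not from the source (Neo4j 3.x index syntax).

```cypher
// Assumed completion sketch; the second MATCH and the relationship
// type SUBCAT_OF are illustrative, since the original query is truncated.
CREATE INDEX ON :Category(catId);

USING PERIODIC COMMIT 10000
LOAD CSV FROM "https://github.com/jbarrasa/datasets/blob/master/wikipedia/data/rels.csv?raw=true" AS row
MATCH (from:Category { catId: row[0] })
MATCH (to:Category { catId: row[1] })
MERGE (from)-[:SUBCAT_OF]->(to);
```

Without the index, every MATCH scans all Category nodes, which is usually why loads of this size crawl regardless of heap or page-cache settings.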

What is the fastest way to look for a duplicate uniqueidentifier in SQL Server?

Submitted by 青春壹個敷衍的年華 on 2019-12-23 03:24:57
Question: We use a uniqueidentifier for every record within a very large database. For business reasons we need to ensure that a uniqueidentifier is never used more than once, but for performance reasons we have a bigint as the primary key. What is the fastest way to test the existence of a uniqueidentifier in a SQL Server table? Answer 1: Under 0.05 ms to validate a uniqueidentifier across 100,000,000 rows on a single Standard S0 SQL Azure instance. DISCLAIMER: The following approach may require tweaking to …
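The answer's code is cut off; a minimal sketch of the standard approach, with illustrative table and column names (dbo.Records and RecordGuid are assumptions): a unique nonclustered index both enforces the business rule and turns the existence check into a single index seek.

```sql
-- Sketch: table dbo.Records(Id bigint PK, RecordGuid uniqueidentifier)
-- is hypothetical. The unique index makes the EXISTS check one seek.
CREATE UNIQUE NONCLUSTERED INDEX IX_Records_RecordGuid
    ON dbo.Records (RecordGuid);

DECLARE @candidate uniqueidentifier = NEWID();

IF EXISTS (SELECT 1 FROM dbo.Records WHERE RecordGuid = @candidate)
    PRINT 'uniqueidentifier already in use';
```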

MySQL: the effect of converting data types and collations on stored data

Submitted by ╄→гoц情女王★ on 2019-12-23 03:06:33
Question: I have a general question about this. We often want to change the data types or collations of fields after lots of data has already been inserted. Consider these situations: converting a varchar collation from utf8_general_ci to latin1_swedish_ci: as far as I know, the first uses multi-byte characters and the second single-byte ones. Does this conversion handle the stored records correctly? And does it reduce the volume of the existing data (maybe by 50%)? Conversion of int(10) to …
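For illustration, what such a conversion looks like on a hypothetical table t: MySQL recodes the stored text during the ALTER, so the conversion is lossless only for characters that exist in latin1; anything else is replaced with '?'. Text that took 2-3 bytes per character in utf8 does shrink to 1 byte per character.

```sql
-- Illustrative only: table t and column name are hypothetical.
-- MySQL converts the stored bytes; characters with no latin1
-- equivalent are silently replaced with '?'.
ALTER TABLE t
    MODIFY name VARCHAR(100)
    CHARACTER SET latin1 COLLATE latin1_swedish_ci;
```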

Implementing inheritance in MySQL: alternatives and a table with only surrogate keys

Submitted by 主宰稳场 on 2019-12-22 18:56:14
Question: This is a question that has probably been asked before, but I'm having some difficulty finding a case exactly like mine, so I'll explain my situation in search of feedback: I have an application that will register locations. I have several types of locations, and each location type has a different set of attributes, but I need to associate notes with locations regardless of their type, and also attach other kinds of content (mostly multimedia entries and comments) to those notes. With this in mind, I …
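A common answer to this situation is class-table inheritance: a supertype table holds the surrogate key, each location type gets its own subtype table, and notes reference the supertype so the location's type is irrelevant to them. A minimal sketch; all table and column names are illustrative.

```sql
-- Class-table-inheritance sketch (names invented for illustration).
CREATE TABLE location (
    location_id   BIGINT AUTO_INCREMENT PRIMARY KEY,
    location_type ENUM('warehouse', 'store') NOT NULL
);

-- One subtype table per location type, sharing the supertype's key.
CREATE TABLE warehouse (
    location_id BIGINT PRIMARY KEY,
    dock_count  INT NOT NULL,
    FOREIGN KEY (location_id) REFERENCES location (location_id)
);

-- Notes attach to the supertype, so they work for every location type.
CREATE TABLE note (
    note_id     BIGINT AUTO_INCREMENT PRIMARY KEY,
    location_id BIGINT NOT NULL,
    body        TEXT,
    FOREIGN KEY (location_id) REFERENCES location (location_id)
);
```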

Reuse a MySQL subquery in an INNER JOIN

Submitted by 不想你离开。 on 2019-12-22 11:21:11
Question: I'm trying to optimize a query, avoiding repetition of the query indicated by **COMPLEX QUERY**, which is used twice and returns the same results both times. The original query: SELECT news.* FROM news INNER JOIN((SELECT myposter FROM (SELECT **COMPLEX QUERY**)) UNION (SELECT myposter FROM `profiles_old` prof2 WHERE prof2.profile_id NOT IN (SELECT **COMPLEX QUERY**))) r ON news.profile = r.p I was wondering if something like this was possible: SELECT news.* FROM (SELECT **COMPLEX QUERY**) …
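On MySQL 8.0+ the usual technique is a common table expression, which names the complex query once and references it twice. A sketch: the CTE body is a placeholder standing in for **COMPLEX QUERY**, and r.myposter is assumed as the join column, since the excerpt's "r.p" is truncated.

```sql
-- Requires MySQL 8.0+. The SELECT inside the CTE is a placeholder
-- for the original **COMPLEX QUERY**; `profiles` is an assumed name.
WITH complex_q AS (
    SELECT profile_id, myposter FROM profiles
)
SELECT news.*
FROM news
INNER JOIN (
    SELECT myposter FROM complex_q
    UNION
    SELECT myposter
    FROM `profiles_old` prof2
    WHERE prof2.profile_id NOT IN (SELECT profile_id FROM complex_q)
) r ON news.profile = r.myposter;
```

On pre-8.0 servers the same effect is usually achieved by materializing the complex query into a temporary table first and joining against it twice.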

MongoDB Index definition strategy

Submitted by 允我心安 on 2019-12-22 10:48:57
Question: I have a MongoDB-based database with somewhere between 100K and 500K text documents, and the collection keeps growing. The system should support queries by different fields of the documents, e.g. title, category, importance, etc. It is a near-real-time system that receives new documents every 5-10 minutes. My question: in order to boost query performance, is it a good idea to define a separate index for each frequently queried field (field types: small text, numeric, date) …
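For illustration, what that strategy looks like in the mongo shell; the collection name and field set are assumptions based on the question text.

```javascript
// One single-field index per frequently queried field
// (collection name `documents` is illustrative).
db.documents.createIndex({ title: 1 });
db.documents.createIndex({ category: 1 });
db.documents.createIndex({ importance: -1 });

// If two fields are always filtered together, one compound index
// serves that query shape better than two separate indexes.
db.documents.createIndex({ category: 1, importance: -1 });
```

The trade-off for a write-heavy, near-real-time collection is that every extra index adds work to each insert, so indexes are usually limited to fields that actually appear in query predicates or sorts.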

Improve performance of first query

Submitted by 我的未来我决定 on 2019-12-22 05:16:19
Question: When the following database (Postgres) queries are executed, the second call is much faster. I guess the first query is slow because the operating system (Linux) needs to fetch the data from disk, while the second call benefits from caching at the filesystem level and in Postgres. Is there a way to optimize the database so that results come back fast on the first call? First call (slow): foo3_bar_p@BAR-FOO3-Test:~$ psql foo3_bar_p=# explain analyze SELECT "foo3_beleg"."id", ... FROM "foo3_beleg" WHERE foo3_bar_p-# …
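One standard answer is to warm the cache explicitly after a restart. A sketch using the pg_prewarm contrib module, assuming it is available on the server; foo3_beleg is the table named in the question.

```sql
-- pg_prewarm ships as a contrib module (PostgreSQL 9.4+).
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Loads the table's pages into shared buffers so the first real
-- query no longer pays the cold-disk penalty.
SELECT pg_prewarm('foo3_beleg');
```

Indexes used by the query can be prewarmed the same way by passing their names; beyond that, the first-call cost is largely a function of disk speed and shared_buffers sizing.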

How would using an ORDER BY clause both increase and decrease performance?

Submitted by 一世执手 on 2019-12-21 23:05:28
Question: I have a MySQL table called devicelog with its PK on id, plus separate indices on device_id (INT), field_id (INT), and unixtime (BIGINT); they are just default InnoDB secondary indices. I'm trying to get the ID nearest to a certain time, and I get wildly different performance with different values and different ORDER BYs. IDs and unixtimes are positively associated, since both increase as more data is inserted, so it seems like it would be okay to safely omit ordering on …
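A composite index covering the equality filters plus the time column is the usual fix, because InnoDB can then read rows already in unixtime order from one end of the range instead of filesorting. A sketch; the WHERE values are invented for illustration.

```sql
-- One composite index replaces three single-column ones for this query shape.
ALTER TABLE devicelog
    ADD INDEX idx_device_field_time (device_id, field_id, unixtime);

-- With the index above, ORDER BY ... LIMIT 1 becomes a single probe
-- at the end of the matching index range; no filesort is needed.
SELECT id
FROM devicelog
WHERE device_id = 1
  AND field_id = 2
  AND unixtime <= 1576900000000
ORDER BY unixtime DESC
LIMIT 1;
```

This also explains the "both increase and decrease" behavior in the title: with only single-column indices, the optimizer sometimes walks an index that matches the ORDER BY but not the filter, which is fast or catastrophic depending on where the matching rows sit.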

Need recommendations on pushing the envelope with SqlBulkCopy on SQL Server

Submitted by  ̄綄美尐妖づ on 2019-12-21 20:05:20
Question: I am designing an application, one aspect of which is that it must be able to receive massive amounts of data into a SQL database. I designed the database structure as a single table with a bigint identity, something like this one: CREATE TABLE MainTable ( _id bigint IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED, field1, field2, ... ) I will omit how I intend to perform queries, since that is irrelevant to the question I have. I have written a prototype, which inserts data into this …
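For context, a minimal sketch of the kind of SqlBulkCopy call such a prototype makes (C#, since SqlBulkCopy is a .NET API); the option values are illustrative assumptions, and building the DataTable and connection string is omitted.

```csharp
using System.Data;
using System.Data.SqlClient;

class BulkLoader
{
    static void Load(DataTable rows, string connectionString)
    {
        // TableLock plus a large batch size lets SQL Server take
        // bulk-update locks and minimally log the inserts, usually the
        // biggest single win when pushing SqlBulkCopy hard.
        using (var bulk = new SqlBulkCopy(connectionString,
                                          SqlBulkCopyOptions.TableLock))
        {
            bulk.DestinationTableName = "MainTable"; // table from the question
            bulk.BatchSize = 100000;                 // illustrative value
            bulk.BulkCopyTimeout = 0;                // no timeout for huge loads
            bulk.WriteToServer(rows);
        }
    }
}
```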

What is faster: a big joined query with more PHP, or multiple small selects with less PHP?

Submitted by 淺唱寂寞╮ on 2019-12-21 17:40:51
Question: I'm running a cron task which makes lots of queries against a MySQL server. The big issue is that the server sometimes runs extremely slowly. I've got one relatively big query that LEFT JOINs 4 tables, plus 4 smaller queries with NATURAL JOINs that also hit the first table. After running those queries, I process the results and group them using PHP. What I'm planning is to somehow merge those 5 queries into just one big query, and then let PHP do some quick sort()s when I …
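A hedged sketch of the merged-query idea, with table and column names invented since the excerpt names none: the four small queries become additional LEFT JOINs onto the big query's driving table, so PHP receives one result set to group and sort.

```sql
-- Illustrative only: t1..t5 stand in for the five real tables.
SELECT t1.id, t1.title, t2.a, t3.b, t4.c, t5.d
FROM t1
LEFT JOIN t2 ON t2.t1_id = t1.id
LEFT JOIN t3 ON t3.t1_id = t1.id
LEFT JOIN t4 ON t4.t1_id = t1.id
LEFT JOIN t5 ON t5.t1_id = t1.id;
```

One caveat worth noting: if several of the joined tables are one-to-many relative to t1, this single query multiplies rows (a partial cross product per t1 row), which can make the one big query slower and heavier than the five small ones it replaces.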