sharding

Can AUTO_INCREMENT be safely used in a BEFORE TRIGGER in MySQL

丶灬走出姿态 posted on 2019-12-04 02:04:09
Instagram's Postgres method of implementing custom IDs for sharding is great, but I need the implementation in MySQL. So I converted the method found at the bottom of this blog post: http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
MySQL version:
    CREATE TRIGGER shard_insert BEFORE INSERT ON tablename
    FOR EACH ROW BEGIN
      DECLARE seq_id BIGINT;
      DECLARE now_millis BIGINT;
      DECLARE our_epoch BIGINT DEFAULT 1314220021721;
      DECLARE shard_id INT DEFAULT 1;
      SET now_millis = (SELECT UNIX_TIMESTAMP(NOW(3)) * 1000);
      SET seq_id = (SELECT AUTO_INCREMENT FROM information
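For reference, the bit layout that the linked Instagram post describes (41 bits of milliseconds since a custom epoch, 13 bits of shard ID, 10 bits of sequence) can be sketched in Python. This only illustrates the ID arithmetic, not the MySQL trigger itself. `make_id` and `split_id` are made-up names, and the sequence value is passed in as an argument rather than read from AUTO_INCREMENT.

```python
import time

OUR_EPOCH_MS = 1314220021721  # custom epoch, same value as in the trigger above
SHARD_BITS = 13               # bits reserved for the logical shard ID
SEQ_BITS = 10                 # bits reserved for the per-shard sequence


def make_id(shard_id, seq_id, now_millis=None):
    """Pack (time, shard, sequence) into one 64-bit integer."""
    if now_millis is None:
        now_millis = int(time.time() * 1000)
    result = (now_millis - OUR_EPOCH_MS) << (SHARD_BITS + SEQ_BITS)
    result |= (shard_id % (1 << SHARD_BITS)) << SEQ_BITS
    result |= seq_id % (1 << SEQ_BITS)  # sequence wraps at 1024
    return result


def split_id(value):
    """Recover (now_millis, shard_id, seq_id) from a packed ID."""
    seq_id = value & ((1 << SEQ_BITS) - 1)
    shard_id = (value >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    now_millis = (value >> (SHARD_BITS + SEQ_BITS)) + OUR_EPOCH_MS
    return now_millis, shard_id, seq_id
```

Because the timestamp occupies the high bits, IDs generated later sort after earlier ones, whichever shard produced them.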

MongoDB: Sharding on single machine. Does it make sense?

自古美人都是妖i posted on 2019-12-03 19:31:05
Question: I created a collection in MongoDB consisting of 11446615 documents. Each document has the following form: { "_id" : ObjectId("4e03dec7c3c365f574820835"), "httpReferer" : "http://www.somewebsite.pl/art.php?id=13321&b=1", "words" : ["SEX", "DRUGS", "ROCKNROLL", "WHATEVER"], "howMany" : 3 } httpReferer: just a URL. words: words parsed from the URL above. The size of the list is between 15 and 90. I am planning to use this database to obtain a list of webpages which have similar content. I'll by

Unable to launch mongos

十年热恋 posted on 2019-12-03 16:07:48
I am attempting a simple sharding setup (on a single host without any replica set). However, I am unable to go any further because this is what happens when I try to start mongos: C:\>mongos --configdb localhost:27010 --port 27011 I get: BadValue: configdb supports only replica set connection string try 'mongos --help' for more information I am failing to see what is lacking. I tried mongos --help, but according to that, valid arguments for --configdb are <config replset name>/<host1:port>, <host2:port>, etc. But this is what I've done. I have not done anything other than starting the config

Entity Framework and sharded database

耗尽温柔 posted on 2019-12-03 15:58:16
I have a WCF Data Service running on top of an Entity Framework code first 4.1 provider. The database is quite large (one key table has 77+ million records and grows by ~10% per month) and has presented quite a performance problem. While sharding a database that large is a pain, it seems inevitable. My question is: has anybody successfully implemented EF with a sharded database and, if so, do you have any guidance? Have you investigated the following options: Clustering your DB (I assume it's SQL Server you are using)? Extracting some of your data (archived records, for example) into another

MongoDB replica heartbeat request time exceeded

梦想的初衷 posted on 2019-12-03 14:46:49
I have a replica set (hosted on Amazon) which has a primary, a secondary, and an arbiter. All of them are version 3.2.6, and this replica set forms one shard in my sharded cluster (if that is important, although I think it is not). When I type rs.status() on the primary, it says that it cannot reach the secondary (the arbiter reports the same): { "_id" : 1, "name" : "secondary-ip:27017", "health" : 0, "state" : 8, "stateStr" : "(not reachable/healthy)", "uptime" : 0, "optime" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) }, "optimeDate" : ISODate("1970-01-01T00:00:00Z"), "lastHeartbeat" : ISODate("2016-07-20T15:40:50

Database sharding and JPA

给你一囗甜甜゛ posted on 2019-12-03 11:19:33
I am working on a Java application that requires horizontal partitioning of data across different PostgreSQL servers. I would like to use a JPA framework and Spring for transaction management. The most popular frameworks for sharding data with JPA seem to be Hibernate Shards, which appears to be no longer in development, and OpenJPA Slice, which does not support virtual shards (one of my requirements). Are there any other options that I'm missing, or a way to get around the OpenJPA limitation? Thanks in advance for your input! You can have a look at Sharding-JDBC; it is a JDBC driver for shard

How does MySQL Cluster determine which data nodes to search for a SELECT query?

 ̄綄美尐妖づ posted on 2019-12-03 10:08:11
I'm researching how to resolve a situation where a client needs all data for a particular customer (and only the data for that customer) to be stored on a geographically disparate database server. For example, all data should be stored in database servers on the main cloud, except for all data relating to customer ID 92, which should be stored in servers on a different cloud in another location. There are a couple of constraints I am working with that are making this a little tricky, but so far, MySQL Cluster seems like the best approach. However, it is unclear to me how it selects data nodes

Does Cassandra support sharding?

余生长醉 posted on 2019-12-03 10:02:53
Does Apache Cassandra support sharding? Apologies if this question seems trivial, but I cannot seem to find the answer. I have read that Cassandra was partially modeled after GAE's Big Table, which shards on a massive scale. But most of the documentation I'm currently finding on Cassandra seems to imply that Cassandra does not partition data horizontally across multiple machines, but rather supports many duplicate machines. This would imply that Cassandra is a good fit for high-availability reads, but would eventually break down if the write volume became very high. Cassandra does
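As background for this question: Cassandra distributes rows across nodes by hashing each partition key onto a token ring, so it does partition data horizontally. The sketch below shows the general consistent-hashing idea only. It uses MD5 as a stand-in hash and one token per node, which is an assumption for illustration and not how a real Cassandra partitioner is configured; `Ring`, `token`, and the node names are hypothetical.

```python
import bisect
import hashlib


def token(key):
    """Map a string to a 64-bit position on the ring (MD5 is a stand-in hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Toy token ring: each node owns the arc ending at its own token."""

    def __init__(self, nodes):
        pairs = sorted((token(node), node) for node in nodes)
        self._tokens = [t for t, _ in pairs]
        self._nodes = [n for _, n in pairs]

    def owner(self, partition_key):
        """Return the first node clockwise from the key's token."""
        idx = bisect.bisect_right(self._tokens, token(partition_key))
        return self._nodes[idx % len(self._nodes)]  # wrap around the ring


ring = Ring(["node-a", "node-b", "node-c"])
owners = [ring.owner("user:%d" % i) for i in range(1000)]
```

Each key always maps to the same node, and different keys land on different nodes, which is what spreads write volume across machines.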

Searching across shards?

杀马特。学长 韩版系。学妹 posted on 2019-12-03 08:04:21
Short version: If I split my users into shards, how do I offer a "user search"? Obviously, I don't want every search to hit every shard. Long version: By shard, I mean having multiple databases where each contains a fraction of the total data. For (a naive) example, the databases UserA, UserB, etc. might contain users whose names begin with "A", "B", etc. When a new user signs up, I simply examine his name and put him into the correct database. When a returning user signs in, I again look at his name to determine the correct database to pull his information from. The advantage of sharding vs read
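The naive routing rule from the question, plus one possible way to avoid querying every shard on search, can be sketched as follows. The global index here is an assumption for illustration and does not come from the original post: it is a small lookup kept outside the shards that records which shards hold a given name token. `shard_for` and `SearchIndex` are hypothetical names.

```python
from collections import defaultdict


def shard_for(username):
    """Naive rule from the question: database name is 'User' + first letter."""
    if not username:
        raise ValueError("username must be non-empty")
    return "User" + username[0].upper()


class SearchIndex:
    """Hypothetical global index: name token -> shards holding matching users."""

    def __init__(self):
        self._shards_by_token = defaultdict(set)

    def add_user(self, username):
        """Register a user and return the shard the user is stored on."""
        shard = shard_for(username)
        for tok in username.lower().split():
            self._shards_by_token[tok].add(shard)
        return shard

    def shards_to_query(self, term):
        """Return only the shards that can contain a match for the term."""
        return sorted(self._shards_by_token.get(term.lower(), set()))


index = SearchIndex()
index.add_user("Alice Smith")
index.add_user("Bob Smith")
```

A search for "smith" then touches only UserA and UserB rather than all shards. The trade-off is that the index must be kept in sync with every sign-up and rename.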

Database sharding and Rails

帅比萌擦擦* posted on 2019-12-03 07:30:07
Question: What's the best way to deal with a sharded database in Rails? Should the sharding be handled at the application layer, the Active Record layer, the database driver layer, a proxy layer, or something else altogether? What are the pros and cons of each?
Answer 1: FiveRuns have a gem named DataFabric that does application-level sharding and master/slave replication. It might be worth checking out.
Answer 2: I assume with shards we're talking about horizontal partitioning and not vertical partitioning