sharding | 易学教程

Can AUTO_INCREMENT be safely used in a BEFORE TRIGGER in MySQL


Instagram's Postgres method of implementing custom IDs for sharding is great, but I need the implementation in MySQL. So I converted the method found at the bottom of this blog post: http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram

MySQL version:

    CREATE TRIGGER shard_insert BEFORE INSERT ON tablename
    FOR EACH ROW BEGIN
      DECLARE seq_id BIGINT;
      DECLARE now_millis BIGINT;
      DECLARE our_epoch BIGINT DEFAULT 1314220021721;
      DECLARE shard_id INT DEFAULT 1;
      SET now_millis = (SELECT UNIX_TIMESTAMP(NOW(3)) * 1000);
      SET seq_id = (SELECT AUTO_INCREMENT FROM information
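To make the goal of the trigger concrete, here is a minimal Python sketch of the ID layout described in the linked Instagram post: milliseconds since a custom epoch in the high bits, then 13 bits of shard ID, then 10 bits of sequence. The function names `make_id` and `split_id` are hypothetical, and the sketch only illustrates the bit arithmetic; it says nothing about whether reading AUTO_INCREMENT inside a BEFORE trigger is safe under concurrent inserts.

```python
OUR_EPOCH_MS = 1314220021721  # custom epoch, same value as our_epoch in the trigger
SHARD_BITS = 13               # bits reserved for the shard ID
SEQ_BITS = 10                 # bits reserved for the per-shard sequence


def make_id(now_ms, shard_id, seq_id):
    """Pack time, shard, and sequence into one integer (hypothetical helper)."""
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id does not fit in 13 bits")
    seq = seq_id % (1 << SEQ_BITS)  # wrap the sequence into 10 bits
    elapsed = now_ms - OUR_EPOCH_MS
    return (elapsed << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | seq


def split_id(value):
    """Recover (timestamp in ms, shard ID, sequence) from a packed ID."""
    seq = value & ((1 << SEQ_BITS) - 1)
    shard = (value >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    now_ms = (value >> (SHARD_BITS + SEQ_BITS)) + OUR_EPOCH_MS
    return now_ms, shard, seq
```

Because the timestamp occupies the high bits, IDs generated later sort after earlier ones, and the shard can be recovered from any ID without a lookup.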

MongoDB: Sharding on single machine. Does it make sense?


Question: I created a collection in MongoDB consisting of 11446615 documents. Each document has the following form: { "_id" : ObjectId("4e03dec7c3c365f574820835"), "httpReferer" : "http://www.somewebsite.pl/art.php?id=13321&b=1", "words" : ["SEX", "DRUGS", "ROCKNROLL", "WHATEVER"], "howMany" : 3 } httpReferer: just a URL. words: words parsed from the URL above; the size of the list is between 15 and 90. I am planning to use this database to obtain a list of webpages which have similar content. I'll by

Unable to launch mongos


I am attempting a simple sharding setup (on a single host, without any replica set). However, I am unable to go any further because this is what happens when I try to start mongos: C:\>mongos --configdb localhost:27010 --port 27011 I get: BadValue: configdb supports only replica set connection string try 'mongos --help' for more information I am failing to see what is lacking. I tried mongos --help, but according to that, valid arguments for --configdb are <config replset name>/<host1:port>,<host2:port>, etc. But this is what I've done. I have not done anything else than starting the config

Entity Framework and sharded database


I have a WCF Data Service running on top of an Entity Framework code first 4.1 provider. The database is quite large (one key table has 77+ million records and grows by ~10% per month) and has presented quite a performance problem. While sharding a database that large is a pain, it seems inevitable. My question is, has anybody successfully implemented EF with a sharded database and, if so, do you have any guidance? Have you investigated the following options: Clustering your DB (I assume it's SQL Server you are using)? Extracting some of your data (archived records, for example) into another

MongoDB replica heartbeat request time exceeded


I have a replica set (hosted on Amazon) which has a primary, a secondary, and an arbiter. All of them are version 3.2.6, and this replica set makes up one shard in my sharded cluster (if that is important, although I think it is not). When I type rs.status() on the primary, it says that it cannot reach the secondary (the same thing happens on the arbiter): { "_id" : 1, "name" : "secondary-ip:27017", "health" : 0, "state" : 8, "stateStr" : "(not reachable/healthy)", "uptime" : 0, "optime" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) }, "optimeDate" : ISODate("1970-01-01T00:00:00Z"), "lastHeartbeat" : ISODate("2016-07-20T15:40:50

Database sharding and JPA


I am working on a Java application that requires horizontal partitioning of data across different PostgreSQL servers. I would like to use a JPA framework and Spring for transaction management. The most popular frameworks for sharding data with JPA seem to be Hibernate Shards, which appears to be no longer in development, and OpenJPA Slice, which does not support virtual shards (one of my requirements). Are there any other options that I'm missing, or a way to get around the OpenJPA limitation? Thanks in advance for your input! You can have a look at Sharding-JDBC; it is a JDBC driver for shard

How does MySQL Cluster determine which data nodes to search for a SELECT query?


I'm researching how to resolve a situation where a client needs all data for a particular customer (and only the data for that customer) to be stored on a geographically disparate database server. For example, all data should be stored in database servers on the main cloud, except for all data relating to customer ID 92, which should be stored in servers on a different cloud in another location. There are a couple of constraints I am working with that are making this a little tricky, but so far, MySQL Cluster seems like the best approach. However, it is unclear to me how it selects data nodes

Does Cassandra support sharding?


Does Apache Cassandra support sharding? I apologize if this question seems trivial, but I cannot seem to find the answer. I have read that Cassandra was partially modeled after GAE's Big Table, which shards on a massive scale. But most of the documentation I'm currently finding on Cassandra seems to imply that Cassandra does not partition data horizontally across multiple machines, but rather supports many duplicate machines. This would imply that Cassandra is a good fit for high-availability reads, but would eventually break down if the write volume became very high. Cassandra does

Searching across shards?


Short version: If I split my users into shards, how do I offer a "user search"? Obviously, I don't want every search to hit every shard. Long version: By shard, I mean having multiple databases where each contains a fraction of the total data. For (a naive) example, the databases UserA, UserB, etc. might contain users whose names begin with "A", "B", etc. When a new user signs up, I simply examine his name and put him into the correct database. When a returning user signs in, I again look at his name to determine the correct database to pull his information from. The advantage of sharding vs read
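The naive first-letter scheme in the question can be sketched in Python as follows. The database names UserA, UserB, etc. come from the question; the catch-all `UserOther` shard and the function names are assumptions made for illustration. The sketch also shows why search is the hard part: a prefix search can be routed to one shard, but any other kind of query has no routing key and must fan out to all of them.

```python
import string

# One shard per initial letter, plus a hypothetical catch-all for other names.
ALL_SHARDS = ["User" + c for c in string.ascii_uppercase] + ["UserOther"]


def shard_for(username):
    """Pick the database for a user from the first letter of the name."""
    first = username.strip()[:1].upper()
    if len(first) == 1 and first in string.ascii_uppercase:
        return "User" + first
    return "UserOther"  # digits, non-ASCII letters, or an empty name


def shards_for_prefix_search(prefix):
    """A prefix search touches one shard; an empty prefix must scan them all."""
    if prefix.strip():
        return [shard_for(prefix)]
    return list(ALL_SHARDS)
```

Sign-up and sign-in both call `shard_for` with the full name, so they always hit exactly one database. A substring or full-text search cannot use this function at all, which is the situation the question is asking about.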

Database sharding and Rails


Question: What's the best way to deal with a sharded database in Rails? Should the sharding be handled at the application layer, the Active Record layer, the database driver layer, a proxy layer, or something else altogether? What are the pros and cons of each? Answer 1: FiveRuns have a gem named DataFabric that does application-level sharding and master/slave replication. It might be worth checking out. Answer 2: I assume with shards we're talking about horizontal partitioning and not vertical partitioning