data-modeling | 易学教程

Is there a difference between Surrogate key, Synthetic Key, and Artificial Key?

Question: Are there any differences among a surrogate key, a synthetic key, and an artificial key? I'm not clear on the exact difference.

Answer 1: Surrogate key, synthetic key and artificial key are synonyms. Technical key is another one. They all mean "primary key which doesn't have a business meaning". They are distinct from natural or business keys, which have a meaning beyond the system at hand. For instance, consider the SO user account. We are identified by two keys. The natural key is the identifier we
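To make the distinction in the answer above concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `users` table, its columns, and the choice of an email address as the natural key are assumptions made for illustration; the excerpt does not say which identifier the answer had in mind.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions, not from the answer.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,   -- surrogate key: no business meaning
        email   TEXT NOT NULL UNIQUE,  -- natural key: meaningful outside the system
        name    TEXT NOT NULL
    )
    """
)

# The database assigns the surrogate key; the application supplies the natural key.
cur = conn.execute(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    ("ada@example.com", "Ada"),
)
surrogate_id = cur.lastrowid

# The natural key can change without touching anything that references user_id.
conn.execute(
    "UPDATE users SET email = ? WHERE user_id = ?",
    ("ada@example.org", surrogate_id),
)
row = conn.execute(
    "SELECT user_id, email FROM users WHERE user_id = ?", (surrogate_id,)
).fetchone()
```

The point of the sketch is only that the surrogate value stays stable while the business-facing identifier changes.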

Model design: Users have friends which are users

I'm looking to make sure my methods are correct before pursuing this association. The implementation sounds too complicated, so I think there must be something wrong with my plan. I am using structured (SQL) data storage, conforming to the Rails storage conventions. What I have is a User model; it has an email address, password_digest, and name in the schema.

class User < ActiveRecord::Base
  has_many :posts
end

I'd like to implement a has_many association to a friends collection, so that Users can belong_to Users (as friends). I'm hoping to be able to have User.last.friends.last return a User

Choosing a partition key for a Cassandra table — how many is too many partitions?

I have an application where the 'natural' partition key for a Cassandra table seems like it would be 'customer'. This is the primary way we want to query the data, we would get good data distribution, etc. But if there were well over 1 million customers, would that be too many different partitions? Should I choose a partition key that results in a smaller number of partition keys? I've looked at a number of the related questions on this topic but none seem to address this particular point.

But if there were well over 1 million customers, would that be too many different partitions? No. The

Using categorical data as features in sklearn LogisticRegression

I'm trying to understand how to use categorical data as features in sklearn.linear_model's LogisticRegression. I understand, of course, that I need to encode it. What I don't understand is how to pass the encoded feature to the logistic regression so that it's processed as a categorical feature, rather than having the int value it got when encoding interpreted as a standard quantifiable feature. (Less important) Can somebody explain the difference between using preprocessing.LabelEncoder(), DictVectorizer.vocabulary, or just encoding the categorical data yourself with a simple dict? Alex A.'s comment here touches
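The concern in the question, that an integer code would be read as a quantity, is usually addressed with one-hot encoding. The sketch below shows the idea in plain Python with no libraries; the `one_hot_encode` helper and the colour values are hypothetical. In scikit-learn itself, `sklearn.preprocessing.OneHotEncoder` is the class generally used for this, but the sketch does not depend on it.

```python
def one_hot_encode(values):
    """Map each category to its own 0/1 column (hypothetical helper).

    Unlike an integer code such as blue=0, green=1, red=2, no column
    implies that one category is "larger" than another.
    """
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


colors = ["red", "green", "blue", "green"]  # made-up sample data
categories, encoded = one_hot_encode(colors)
```

Each row of `encoded` could then be passed to a linear model as ordinary numeric features, with one coefficient learned per category.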

Tools to visualize a database and understand the datamodel quickly [closed]

I have several SQL Server 2005 databases, ranging from 20 to 600 tables, in an application with no documentation. I am looking for a database diagramming tool that is smart enough to pick tables that seem to be related to one entity (e.g., tables related to Patient, tables related to Orders) or one functionality (e.g., Patient Management, Order Management) and show them separately instead of drawing the entire database. In the past, I have seen tables related to one piece of functionality represented in one color in the ER diagrams. In a well-designed database, perhaps there will be multiple

Working with nested single queries in Firestore

Question: Recently I moved my data model from Firebase to Firestore. All my code is working, but I'm having some ugly trouble with the nested queries I use to retrieve some data. Here is the point: right now my data model for this part looks like this (yes, another followers/feed example):

{
  "Users": { // Collection
    "UserId1": { // Document
      "Feed": { // Subcollection of IDs of posts from users this user follows
        "PostId1": { // Document
          "timeStamp": "SomeDate"
        },
        "PostId2": {
          "timeStamp": "SomeDate"
        },

How to model many blobs for an object?

Question: I want to enable something like a one-to-many relation between a text object and blobs, so that a text object (an "article" or the like) has many images and/or videos. I see two ways to do this. The first is to use a list of blobs as an instance variable. Will it work?

class A(search.SearchableModel):
    blobs = db.ListProperty(blobstore.BlobReferenceProperty())

Advantages: Just one class. Readable and easy to get and set data. Disadvantages: Lacks extra info for blobs e.g. if I

Django multi-table inheritance alternatives for basic data model pattern

tl;dr: Is there a simple alternative to multi-table inheritance for implementing the basic data-model pattern depicted below, in Django? Premise: Please consider the very basic data-model pattern in the image below, based on e.g. Hay, 1996. Simply put: Organizations and Persons are Parties, and all Parties have Addresses. A similar pattern may apply to many other situations. The important point here is that the Address has an explicit relation with Party, rather than explicit relations with the individual sub-models Organization and Person. Note that each sub-model introduces additional

What is the best way to store a historical price list in a MySQL table?

Basically, my question is this: I have a list of prices, some of which are historical (i.e. I want to be able to look up that product X was $0.99 on March 11, $1.99 on April 1, etc.). What is the best way to store this information? I assumed I would probably have a Product table that has a foreign key to a price table. I initially thought that storing only the current price would probably be the best bet, but I think I want to be able to store historical price data, so would the better route be to store a table like the following for the price list: CREATE TABLE prices ( id BIGINT auto
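One common way to model the price history described above is to give each price row the date from which it applies, then look up the latest row on or before the date of interest. The following is a sketch under stated assumptions: it uses Python's built-in `sqlite3` rather than MySQL, the column names (`price_cents`, `effective_from`) and the `price_on` helper are hypothetical, and the year 2024 is invented because the question gives only month and day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- One row per price change; the price holds until the next row's date.
    CREATE TABLE prices (
        id             INTEGER PRIMARY KEY,
        product_id     INTEGER NOT NULL REFERENCES products(id),
        price_cents    INTEGER NOT NULL,   -- integer cents avoid float rounding
        effective_from TEXT NOT NULL       -- ISO date, so text order = date order
    );
    """
)
conn.execute("INSERT INTO products (id, name) VALUES (1, 'Product X')")
conn.executemany(
    "INSERT INTO prices (product_id, price_cents, effective_from) VALUES (?, ?, ?)",
    [(1, 99, "2024-03-11"), (1, 199, "2024-04-01")],  # assumed year
)


def price_on(conn, product_id, day):
    """Return the price in cents in effect on `day`, or None if none applies."""
    row = conn.execute(
        """
        SELECT price_cents FROM prices
        WHERE product_id = ? AND effective_from <= ?
        ORDER BY effective_from DESC
        LIMIT 1
        """,
        (product_id, day),
    ).fetchone()
    return row[0] if row else None
```

With this layout the current price is simply `price_on` evaluated for today's date, so no separate "current price" column is strictly needed, although one could be kept for convenience.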

When two tables are very similar, when should they be combined?

I have events and photos, and then comments for both. Right now, I have two comments tables, one for comments related to the events, and another for photo comments. The schema is similar to this:

CREATE TABLE EventComments (
    CommentId int,
    EventId int,
    Comment NVarChar(250),
    DateSubmitted datetime
)

CREATE TABLE PhotoComments (
    CommentId int,
    PhotoId int,
    Comment NVarChar(250),
    DateSubmitted datetime
)

My question is whether or not I should combine them and add a separate cross-reference table, but I can't think of a way to do it properly. I think this should be OK, what are your thoughts? Edit
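The combined layout the question asks about could look roughly like the sketch below: one shared `Comments` table plus one cross-reference table per owner type. This is only one possible reading of "combine them, and add a separate cross reference table". It uses Python's built-in `sqlite3` instead of SQL Server, and the link-table names and both helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Comments (
        CommentId     INTEGER PRIMARY KEY,
        Comment       TEXT NOT NULL,
        DateSubmitted TEXT NOT NULL
    );
    -- Cross-reference tables: one per kind of thing that can be commented on.
    CREATE TABLE EventCommentLinks (
        EventId   INTEGER NOT NULL,
        CommentId INTEGER NOT NULL REFERENCES Comments(CommentId),
        PRIMARY KEY (EventId, CommentId)
    );
    CREATE TABLE PhotoCommentLinks (
        PhotoId   INTEGER NOT NULL,
        CommentId INTEGER NOT NULL REFERENCES Comments(CommentId),
        PRIMARY KEY (PhotoId, CommentId)
    );
    """
)

# Fixed whitelist, so interpolating these names into SQL below is safe.
LINKS = {
    "event": ("EventCommentLinks", "EventId"),
    "photo": ("PhotoCommentLinks", "PhotoId"),
}


def add_comment(conn, kind, owner_id, text, submitted):
    """Store the comment once, then link it to its event or photo."""
    table, column = LINKS[kind]
    cur = conn.execute(
        "INSERT INTO Comments (Comment, DateSubmitted) VALUES (?, ?)",
        (text, submitted),
    )
    conn.execute(
        f"INSERT INTO {table} ({column}, CommentId) VALUES (?, ?)",
        (owner_id, cur.lastrowid),
    )
    return cur.lastrowid


def comments_for(conn, kind, owner_id):
    """Return comment texts for one event or photo, oldest first."""
    table, column = LINKS[kind]
    rows = conn.execute(
        f"SELECT c.Comment FROM Comments c "
        f"JOIN {table} l ON l.CommentId = c.CommentId "
        f"WHERE l.{column} = ? ORDER BY c.CommentId",
        (owner_id,),
    ).fetchall()
    return [r[0] for r in rows]


add_comment(conn, "event", 7, "Great event", "2024-01-01")
add_comment(conn, "photo", 3, "Nice shot", "2024-01-02")
add_comment(conn, "event", 7, "See you next year", "2024-01-03")
```

The trade-off is an extra join on every read in exchange for a single place to change the comment columns; whether that is worth it depends on how similar the two comment types are expected to stay.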