data-modeling

Counter a better choice for uniqueness?

╄→尐↘猪︶ㄣ posted on 2019-12-25 04:13:35
Question: I currently have the following table layout for a basic user event table: CREATE TABLE IF NOT EXISTS events.events_by_user (user text, added_week int, added_timestamp timestamp, event text, uuid uuid, PRIMARY KEY ((user, added_week), added_timestamp, event, uuid)) WITH CLUSTERING ORDER BY (added_timestamp DESC). Thus uniqueness is basically guaranteed by the uuid as the last column of the primary key. There is a chance that several identical events for the same user occur in the same millisecond
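A minimal Python sketch of how the primary key above keeps identical events apart. The `week_bucket` scheme (ISO year times 100 plus ISO week) and the helper names are assumptions for illustration; the question does not say how `added_week` is computed.

```python
import uuid
from datetime import datetime, timezone


def week_bucket(ts):
    """Hypothetical bucketing: ISO year * 100 + ISO week number."""
    iso = ts.isocalendar()
    return iso[0] * 100 + iso[1]


def event_key(user, ts, event):
    """Build (partition key, clustering key) mirroring the table's PRIMARY KEY."""
    partition = (user, week_bucket(ts))
    # A random UUID as the last clustering column keeps otherwise identical
    # rows (same user, same millisecond, same event) distinct.
    clustering = (ts, event, uuid.uuid4())
    return partition, clustering


ts = datetime(2019, 12, 25, 4, 13, 35, tzinfo=timezone.utc)
first = event_key("alice", ts, "login")
second = event_key("alice", ts, "login")
```

Both keys land in the same partition but differ in the clustering part, so neither write would overwrite the other.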

What's the best way to store versioning when writing a Wiki application?

[亡魂溺海] posted on 2019-12-25 01:18:46
Question: I'm writing a wiki application which needs searchable version control. What's the best data model for this? I'm writing it in Django, not that that matters much. Answer 1: Let me suggest that you not implement version control yourself, but make use of one of the existing implementations. Version control is a lot of work to implement well, and a lot of bother to the user if not implemented well. See, for example, how ikiwiki does it: it has plugins to abstract away the version control system and supports several,

MongoDB: upsert with different values for update and insert

旧时模样 posted on 2019-12-24 19:08:59
Question: A little context: I have a document for each user that contains an array with the latest 20 events related to that user. As MongoDB does not have this feature (capping arrays inside a document), I will push my event and pop the latest one. My problem: initializing the document (i.e. filling the array with nulls). I want to atomically: create a document containing an array with 20 null values and push one value, if the document does not exist; or update the document (push one value into the array), if the document exists. Do you
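One possible alternative, sketched in Python under stated assumptions: since MongoDB 2.4, `$push` accepts the `$each` and `$slice` modifiers, which cap an array in a single atomic update, and an upsert creates the document if it is missing. This differs from the approach in the question in that the array is not pre-filled with nulls. The `_id` filter, the `events` field name, and the `capped_push` helper are hypothetical.

```python
def capped_push(user_id, event, limit=20):
    """Return (filter, update) documents for a capped-array upsert.

    Using `user_id` as `_id` and `events` as the array field are assumptions.
    """
    query = {"_id": user_id}
    update = {
        "$push": {
            "events": {
                "$each": [event],  # append the new event
                "$slice": -limit,  # then keep only the last `limit` elements
            }
        }
    }
    return query, update


query, update = capped_push("alice", {"type": "login"})
# With pymongo, this pair would be applied as:
#   collection.update_one(query, update, upsert=True)
```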

How to prove the reliability of a predictive model to executives?

白昼怎懂夜的黑 posted on 2019-12-24 16:40:18
Question: I trained a model on data from 500 devices to predict their performance. Then I applied my trained model to a test data set from another 500 devices, and it showed pretty good prediction results. Now my executives want me to prove that this model will work well on one million devices, not only on 500. Obviously we don't have data for one million devices. And if the model is not reliable, they want me to determine the amount of training data required to make a reliable prediction on one million devices. How

Generally, are string (or varchar) fields used as join fields?

瘦欲@ posted on 2019-12-24 16:03:35
Question: We have two tables. The first contains a name (varchar) field. The second contains a field that references the name field from the first table. This foreign key in the second table will be repeated for every row associated with that name. Is it generally discouraged to use a varchar/string field as a join between two tables? When is a string field best used as a join field? Answer 1: It's certainly possible to use a varchar as a key field (or simply something to join on). The
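To make the setup concrete, here is a small sketch using Python's built-in `sqlite3` module as a stand-in database. The `person` and `visit` tables and their columns are hypothetical; the question does not name its tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE person (name VARCHAR(50) PRIMARY KEY);
    CREATE TABLE visit (
        id INTEGER PRIMARY KEY,
        person_name VARCHAR(50) NOT NULL REFERENCES person(name),
        place TEXT NOT NULL
    );
    INSERT INTO person VALUES ('alice'), ('bob');
    INSERT INTO visit (person_name, place) VALUES
        ('alice', 'Paris'), ('alice', 'Rome'), ('bob', 'Oslo');
""")

# Join the two tables directly on the string column.
rows = conn.execute("""
    SELECT p.name, v.place
    FROM person AS p
    JOIN visit AS v ON v.person_name = p.name
    ORDER BY p.name, v.place
""").fetchall()
```

The join works, but the name string is stored again in every `visit` row, which is the repetition the question describes.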

maximum secondary indexes on a columnfamily

拈花ヽ惹草 posted on 2019-12-24 13:42:07
Question: Is it a performance issue if we have two or more secondary indexes on a column family? I have orderid, city and shipmenttype. So I thought I would create a primary key on orderid and secondary indexes on city and shipmenttype, and use a combination of the secondary index columns while querying. Is that bad modelling? Answer 1: Consider the data that will be placed in the secondary index. Looking at the docs, you want to avoid columns with high cardinality. If your city and shipment type values vary greatly (or

Database model to manage documents

偶尔善良 posted on 2019-12-24 09:35:51
Question: I need to build tables to manage documents such as jpg, doc, msg and pdf files using SQL Server 2008. As far as I know, SQL Server supports .jpg images, so my question is whether it's possible to upload other kinds of files into a database. This is an example of the table (it could be redefined if needed). Document: document_id int(10), name varchar(10), type image (I don't know how it might work). Those are the initial values for a table, but I don't know how to make it useful for any type. P.S.: do I need to
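A rough sketch of one way to store arbitrary file types, using Python's built-in `sqlite3` module as a stand-in for SQL Server. The idea is a generic binary column plus a column recording the file type; in SQL Server the binary column would typically be `varbinary(max)`. The `mime_type` and `content` column names and the sample bytes are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE document (
        document_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        mime_type TEXT NOT NULL,  -- records what kind of file the bytes are
        content BLOB NOT NULL     -- raw file bytes, independent of file type
    )
""")

payload = b"%PDF-1.4 example bytes"  # placeholder bytes, not a real PDF
conn.execute(
    "INSERT INTO document (name, mime_type, content) VALUES (?, ?, ?)",
    ("report.pdf", "application/pdf", payload),
)
name, mime_type, content = conn.execute(
    "SELECT name, mime_type, content FROM document"
).fetchone()
```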

Column Nullability/Optionality: NULL vs NOT NULL

一曲冷凌霜 posted on 2019-12-24 04:33:07
Question: Is there a reason for or against setting some fields as NULL or NOT NULL in a MySQL table, apart from primary/foreign key fields? Answer 1: That completely depends on your domain, to be honest. Functionally it makes little difference to the database engine, but if you're looking to have a well-defined domain, it is often best to have both the database and the application layer mirror the requirements you are placing on the user. If it's moot to you whether or not the user enters their "Display Name",
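A short sketch of the point about mirroring requirements in the database, using Python's built-in `sqlite3` module as a stand-in for MySQL. The `account` table and its columns are hypothetical: `email` is treated as required and `display_name` as optional.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL,  -- required by the domain
        display_name TEXT     -- optional, so NULL is allowed
    )
""")

# A missing display name is accepted.
conn.execute(
    "INSERT INTO account (email, display_name) VALUES (?, ?)",
    ("a@example.com", None),
)

# A missing email violates the NOT NULL constraint and is rejected.
try:
    conn.execute("INSERT INTO account (email) VALUES (?)", (None,))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```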

Use a ListProperty or custom tuple property in App Engine?

血红的双手。 posted on 2019-12-24 04:06:28
Question: I'm developing an application with Google App Engine and stumbled across the following scenario, which can perhaps be described as "MVP-lite". When modeling many-to-many relationships, the standard property to use is the ListProperty. Most likely, your list is composed of the foreign keys of another model. However, in most practical applications, you'll usually want at least one more detail when you get a list of keys, the object's name, so you can construct a nice hyperlink to that object

Integer vs char for DB record property

∥☆過路亽.° posted on 2019-12-24 00:59:17
Question: Say I have a table with real estate listings. Every listing can be either 'For sale' or 'For rent'. Therefore, I can map 'For sale' to 0 and 'For rent' to 1 and store it as an INT in the database. However, it would be much more descriptive if I stored it as 'sale' / 'rent' in a field of type CHAR. Or I can map 0 and 1 to two constants FOR_SALE and FOR_RENT in my program. Or use the chars 'S' and 'R'. What are the best practices for storing such properties in a database with a condition that the total
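One of the options listed above, the chars 'S' and 'R' paired with named constants in the program, might look like the following sketch. It uses Python's built-in `sqlite3` and `enum` modules; the `listing` table, the `ListingType` enum, and the CHECK constraint are assumptions for illustration, not a recommendation taken from the question.

```python
import sqlite3
from enum import Enum


class ListingType(Enum):
    """Program-side constants mapped to the single-char codes stored in the DB."""
    FOR_SALE = "S"
    FOR_RENT = "R"


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE listing (
        id INTEGER PRIMARY KEY,
        listing_type CHAR(1) NOT NULL CHECK (listing_type IN ('S', 'R'))
    )
""")
conn.execute(
    "INSERT INTO listing (listing_type) VALUES (?)",
    (ListingType.FOR_RENT.value,),
)

# Reading the code back restores the named constant.
stored = conn.execute("SELECT listing_type FROM listing").fetchone()[0]
loaded = ListingType(stored)

# The CHECK constraint rejects codes outside the allowed set.
try:
    conn.execute("INSERT INTO listing (listing_type) VALUES ('X')")
    invalid_rejected = False
except sqlite3.IntegrityError:
    invalid_rejected = True
```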