join

Align two vectors with common elements [duplicate]

a 夏天, posted on 2020-07-23 04:29:33
Question: This question already has answers here: How to join (merge) data frames (inner, outer, left, right) (13 answers). Closed 8 days ago.

I have two vectors where some elements are common:

    v1 = c('a', 'b', 'c')
    v2 = c('b', 'c', 'd')

I want to combine the vectors into two data.frames. In the first, I want all elements from both vectors, and non-matching positions in either vector should be replaced by NA:

    v1   v2
    a    NA
    b    b
    c    c
    NA   d

In the second data frame, I want the elements from the first
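The first result described above is a full outer join of the two vectors on their values. A minimal sketch of that alignment, written in plain Python rather than R: the helper name `align_outer` is hypothetical, `None` stands in for `NA`, and each vector is assumed to hold unique elements.

```python
def align_outer(v1, v2):
    """Pair up equal elements of v1 and v2; unmatched slots become None."""
    s1, s2 = set(v1), set(v2)
    # Keep first-seen order: everything in v1, then elements found only in v2.
    keys = list(v1) + [x for x in v2 if x not in s1]
    return [(k if k in s1 else None, k if k in s2 else None) for k in keys]


rows = align_outer(["a", "b", "c"], ["b", "c", "d"])
for left, right in rows:
    print(left, right)
```

Each tuple corresponds to one row of the first data frame shown in the question.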

Spark: Prevent shuffle/exchange when joining two identically partitioned dataframes

陌路散爱, posted on 2020-07-17 05:50:10
Question: I have two dataframes, df1 and df2, and I want to join them many times on a high-cardinality field called visitor_id. I would like to perform only one initial shuffle and have all the joins take place without shuffling/exchanging data between Spark executors. To do so, I have created another column called visitor_partition that consistently assigns each visitor_id a random value in [0, 1000). I have used a custom partitioner to ensure that df1 and df2 are exactly partitioned

BigQuery: Lookup array of ids type RECORD and join data from secondary table using SQL

♀尐吖头ヾ, posted on 2020-07-10 10:26:47
Question: I have a data structure like the one below:

Products

    | name | region_ids    |
    ------------------------
    | shoe | c32, a43, x53 |
    | hat  | c32, f42      |

    # Schema
    name              STRING  NULLABLE
    region_ids        RECORD  REPEATED
    region_ids.value  STRING  NULLABLE

Regions

    | _id | name       |
    --------------------
    | c32 | london     |
    | a43 | manchester |
    | x53 | bristol    |
    | f42 | liverpool  |

    # Schema
    _id   STRING  NULLABLE
    name  STRING  NULLABLE

I want to look up the array of "region_ids" and replace them by the region name to result
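The lookup described above amounts to mapping every element of the repeated `region_ids` record through the Regions table. A minimal sketch of that logic in plain Python (not BigQuery SQL), using the sample rows from the question; the output shape and the key name `regions` are assumptions, since the desired result is cut off in the snippet.

```python
# Sample rows copied from the question; region_ids is a repeated record
# whose single field is "value".
products = [
    {"name": "shoe", "region_ids": [{"value": "c32"}, {"value": "a43"}, {"value": "x53"}]},
    {"name": "hat", "region_ids": [{"value": "c32"}, {"value": "f42"}]},
]
regions = [
    {"_id": "c32", "name": "london"},
    {"_id": "a43", "name": "manchester"},
    {"_id": "x53", "name": "bristol"},
    {"_id": "f42", "name": "liverpool"},
]

# Build the id -> name index once, then resolve each array element in order.
region_name = {r["_id"]: r["name"] for r in regions}
resolved = [
    {
        "name": p["name"],
        # Ids missing from Regions resolve to None (an assumption).
        "regions": [region_name.get(rid["value"]) for rid in p["region_ids"]],
    }
    for p in products
]

for row in resolved:
    print(row["name"], row["regions"])
```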

MySQL get ordered list of contacts by last message sent/received

有些话、适合烂在心里, posted on 2020-07-08 00:30:32
Question: This is my situation: I have 2 tables, one for friends and another for messages.

The Friends table looks like this:

    user_id | friend_id | accepted
    12      | 1         | 1
    13      | 1         | 1
    1       | 3         | 1

accepted can be 0 or 1 (1 = accepted, 0 = not accepted).

Messages table:

    message | time    | user_id | receiver_id
    hi!     | 1328688 | 1       | 12
    hey     | 1343409 | 12      | 1

time is a timestamp, so I need to order by the highest timestamp for each friend. I need to list all contacts (those with accepted = 1) in order of last message (sent/received). In
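The ordering described above can be read as: collect the accepted contacts of one user, find the latest message timestamp exchanged with each, and sort by it in descending order. A minimal sketch of that logic in plain Python (not MySQL), using the sample rows from the question; the function name `contacts_by_last_message` is hypothetical, and placing contacts without any messages at the end is an assumption.

```python
friends = [(12, 1, 1), (13, 1, 1), (1, 3, 1)]  # (user_id, friend_id, accepted)
messages = [("hi!", 1328688, 1, 12), ("hey", 1343409, 12, 1)]  # (message, time, user_id, receiver_id)


def contacts_by_last_message(me, friends, messages):
    """Return accepted contacts of `me`, most recently messaged first."""
    # A friendship row can name `me` on either side.
    contacts = set()
    for user_id, friend_id, accepted in friends:
        if accepted == 1 and me in (user_id, friend_id):
            contacts.add(friend_id if user_id == me else user_id)

    # Latest timestamp per contact, counting sent and received messages.
    last = {}
    for _text, time, sender, receiver in messages:
        if me in (sender, receiver):
            other = receiver if sender == me else sender
            if other in contacts:
                last[other] = max(time, last.get(other, time))

    # Newest first; contacts with no messages go last, ordered by id.
    # Timestamps are assumed to be non-negative.
    return sorted(contacts, key=lambda c: (-last.get(c, -1), c))


ordered = contacts_by_last_message(1, friends, messages)
print(ordered)
```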

Mongo db c# driver - how to join by id in collection?

无人久伴, posted on 2020-07-04 19:55:51
Question: I'm using MongoDB C# driver 2 and I'm trying to join 2 collections by ID (many to many):

    class A
    {
        public string id;
        public string name;
        public List<string> classBReferenceid; // <--- I want to use this as the "keys" for the join
        public List<B> myBs;
    }

    class B
    {
        public string id; // <--- I use this as the "key" for the join
        public string name;
    }

In my DB, class "A" is saved without the data of "myBs", and I want to pull it from Mongo in one call. I tried to use the Lookup function: IMongoCollection<A

Mongo db c# driver - how to join by id in collection?

拥有回忆, posted on 2020-07-04 19:53:39
Question: I'm using MongoDB C# driver 2 and I'm trying to join 2 collections by ID (many to many):

    class A
    {
        public string id;
        public string name;
        public List<string> classBReferenceid; // <--- I want to use this as the "keys" for the join
        public List<B> myBs;
    }

    class B
    {
        public string id; // <--- I use this as the "key" for the join
        public string name;
    }

In my DB, class "A" is saved without the data of "myBs", and I want to pull it from Mongo in one call. I tried to use the Lookup function: IMongoCollection<A

How to avoid n+1 select in django?

安稳与你, posted on 2020-07-04 08:56:52
Question: I have a very simple data model with a one-to-many relationship between videos and comments:

    class Video(models.Model):
        url = models.URLField(unique=True)
        .....

    class Comment(models.Model):
        title = models.CharField(max_length=128)
        video = models.ForeignKey('Video')
        .....

I want to query for videos and grab the whole object graph (videos with all their comments). Looking at the SQL, I see it does two selects, one for the videos and one for the comments. How do I avoid that? I want to do a join and

Creating Column based on WHERE condition

孤街醉人, posted on 2020-06-29 06:43:09
Question: I have the following query:

    SELECT tn.Date,
           b1.Name DrBook, b2.Name CrBook,
           c1.Name DrControl, c2.Name CrControl,
           l1.Name DrLedger, l2.Name CrLedger,
           s1.Name DrSubLedger, s2.Name CrSubLedger,
           p1.Name DrParty, p2.Name CrParty,
           m1.Name DrMember, m2.Name CrMember,
           tn.Amount, tn.Narration
    FROM Transactions tn
    LEFT JOIN Books b1 ON b1.Id = tn.DrBook
    LEFT JOIN Books b2 ON b2.Id = tn.CrBook
    LEFT JOIN ControlLedgers c1 ON c1.Id = tn.DrControl
    LEFT JOIN ControlLedgers c2 ON c2.Id = tn.CrControl
    LEFT

Condition on count of associated records in SQL

独自空忆成欢, posted on 2020-06-28 04:01:18
Question: I have the following tables (with the given columns):

    houses (id)
    users (id, house_id, active)
    custom_values (name, house_id, type)

I want to get all the (distinct) houses and the count of associated users that:

    - have at least 1 associated custom_value whose name column contains the string 'red' (case insensitive) AND whose type column is 'mandatory'
    - have at least 100 associated users whose status column is 'active'

How can I run this query in PostgreSQL? Right now I have this
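The two conditions above can be read as a filter on houses: keep a house if some custom_value matches, and if its count of active users reaches the threshold. A minimal sketch of that logic in plain Python (not PostgreSQL); the function name `qualifying_houses` and all sample rows are hypothetical, `active` is assumed to be a boolean column, and the threshold is lowered from 100 to 3 only to keep the sample data small.

```python
from collections import Counter


def qualifying_houses(houses, users, custom_values, min_active=100):
    """Map each qualifying house id to its number of active users."""
    # Houses with at least one mandatory custom_value whose name contains 'red'.
    flagged = {
        cv["house_id"]
        for cv in custom_values
        if "red" in cv["name"].lower() and cv["type"] == "mandatory"
    }
    # Active users per house.
    active = Counter(u["house_id"] for u in users if u["active"])
    return {h: active[h] for h in houses if h in flagged and active[h] >= min_active}


# Hypothetical sample data, not taken from the question.
houses = [1, 2, 3]
users = (
    [{"id": i, "house_id": 1, "active": True} for i in range(3)]
    + [{"id": 10 + i, "house_id": 2, "active": i != 0} for i in range(3)]
    + [{"id": 20 + i, "house_id": 3, "active": True} for i in range(3)]
)
custom_values = [
    {"name": "Dark RED door", "house_id": 1, "type": "mandatory"},
    {"name": "red roof", "house_id": 2, "type": "mandatory"},
    {"name": "red fence", "house_id": 3, "type": "optional"},
]

result = qualifying_houses(houses, users, custom_values, min_active=3)
print(result)
```

Here house 2 fails the user-count condition and house 3 fails the custom_value condition, so only house 1 remains.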

Translate SQL SERVER SELECT to LINQ with multiple join Method Syntax

我的未来我决定, posted on 2020-06-28 03:58:51
Question: How can I write the joins from the following SQL SELECT statement in LINQ, using method syntax?

    SELECT distinct [LAB_RESULTS].ORDER_ID
          ,LAB_RESULTS.patient_no
          ,Patients.PATIENT_NAME
          ,labtests.TestId
          ,labtests.TestName
          ,[RESULT_NUMBER]
          ,TestsRanges.LowerLimit
          ,TestsRanges.UpperLimit
          ,TestsUnits.UnitName
    FROM [dbo].[LAB_RESULTS]
    inner join LabTests on LabTests.testid=LAB_RESULTS.TESTID
    inner join TestsRanges on TestsRanges.TestId = LAB_RESULTS.TESTID
    inner join patients on Patients.Patient_No