LINQ - Full Outer Join

既然无缘 2020-11-21 22:45

I have a list of people's ID and their first name, and a list of people's ID and their surname. Some people don't have a first name and some don't have a surname; I'd l

16 Answers
  •  佛祖请我去吃肉
    2020-11-21 23:07

    Yet another full outer join

    As I was not that happy with the simplicity and readability of the other proposals, I ended up with this:

    It makes no claim to be fast (about 800 ms to join 1000 * 1000 on a 2020m CPU: 2.4 GHz / 2 cores). To me, it is just a compact and casual full outer join.

    It works the same as a SQL FULL OUTER JOIN (duplicates are preserved).

    Cheers ;-)

    using System;
    using System.Collections.Generic;
    using System.Linq;
    namespace NS
    {
    public static class DataReunion
    {
        // TKey must be comparable (e.g. int, string, Tuple<...>) because the keys are sorted with OrderBy.
        public static List<Tuple<T1, T2>> FullJoin<T1, T2, TKey>(List<T1> List1, Func<T1, TKey> KeyFunc1, List<T2> List2, Func<T2, TKey> KeyFunc2)
        {
            List<Tuple<T1, T2>> result = new List<Tuple<T1, T2>>();
    
            // 1. Pair every item with its key
            Tuple<TKey, T1>[] identifiedList1 = List1.Select(_ => Tuple.Create(KeyFunc1(_), _)).OrderBy(_ => _.Item1).ToArray();
            Tuple<TKey, T2>[] identifiedList2 = List2.Select(_ => Tuple.Create(KeyFunc2(_), _)).OrderBy(_ => _.Item1).ToArray();
    
            // 2. Left-only items: key not present on the right
            identifiedList1.Where(_ => !identifiedList2.Select(__ => __.Item1).Contains(_.Item1)).ToList().ForEach(_ => {
                result.Add(Tuple.Create(_.Item2, default(T2)));
            });
    
            // 3. Inner join: every left/right pair sharing a key
            result.AddRange(
                identifiedList1.Join(identifiedList2, left => left.Item1, right => right.Item1, (left, right) => Tuple.Create(left.Item2, right.Item2)).ToList()
            );
    
            // 4. Right-only items: key not present on the left
            identifiedList2.Where(_ => !identifiedList1.Select(__ => __.Item1).Contains(_.Item1)).ToList().ForEach(_ => {
                result.Add(Tuple.Create(default(T1), _.Item2));
            });
    
            return result;
        }
    }
    }
    

    The idea is to:

    1. Build IDs for both lists using the provided key functions
    2. Process left-only items
    3. Process the inner join
    4. Process right-only items
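
    The four steps above can also be sketched with `ToLookup`, so that the other list is not re-scanned for every item. This is not the code from this answer, only an illustrative variant under a few assumptions: the names `LookupJoinSketch` and `FullJoinWithLookup` are made up here, the inputs are taken as `IEnumerable<T>`, and the result is not sorted by key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupJoinSketch
{
    // Hypothetical helper, not part of the original answer.
    public static List<Tuple<T1, T2>> FullJoinWithLookup<T1, T2, TKey>(
        IEnumerable<T1> left, Func<T1, TKey> leftKey,
        IEnumerable<T2> right, Func<T2, TKey> rightKey)
    {
        // Step 1: index both sides by key (a lookup keeps duplicates per key).
        ILookup<TKey, T1> leftLookup = left.ToLookup(leftKey);
        ILookup<TKey, T2> rightLookup = right.ToLookup(rightKey);

        var result = new List<Tuple<T1, T2>>();

        // Step 2: left-only items, paired with default(T2).
        foreach (IGrouping<TKey, T1> group in leftLookup)
            if (!rightLookup.Contains(group.Key))
                foreach (T1 l in group)
                    result.Add(Tuple.Create(l, default(T2)));

        // Step 3: inner join, one row per left/right pair sharing a key.
        foreach (IGrouping<TKey, T1> group in leftLookup)
            foreach (T1 l in group)
                foreach (T2 r in rightLookup[group.Key]) // empty when the key is missing
                    result.Add(Tuple.Create(l, r));

        // Step 4: right-only items, paired with default(T1).
        foreach (IGrouping<TKey, T2> group in rightLookup)
            if (!leftLookup.Contains(group.Key))
                foreach (T2 r in group)
                    result.Add(Tuple.Create(default(T1), r));

        return result;
    }
}
```

    Keys are compared with the default equality comparer here, so the key type only needs sensible `Equals`/`GetHashCode` (as `int`, `string` and `Tuple` have); it does not need to be sortable.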

    Here is a short test for FullJoin:

    Place a breakpoint at the end to verify manually that it behaves as expected.

    using System;
    using System.Collections.Generic;
    using Microsoft.VisualStudio.TestTools.UnitTesting;
    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;
    using NS;
    
    namespace Tests
    {
    [TestClass]
    public class DataReunionTest
    {
        [TestMethod]
        public void Test()
        {
            List<Tuple<int, int, string>> A = new List<Tuple<int, int, string>>();
            List<Tuple<int, int, string>> B = new List<Tuple<int, int, string>>();
    
            Random rnd = new Random();
    
            /* Comment out the testing block you do not want to run */
            /* Solution to test a wide range of keys */
    
            for (int i = 0; i < 500; i += 1)
            {
                A.Add(Tuple.Create(rnd.Next(1, 101), rnd.Next(1, 101), "A"));
                B.Add(Tuple.Create(rnd.Next(1, 101), rnd.Next(1, 101), "B"));
            }
    
            /* Solution for essential testing*/
    
            A.Add(Tuple.Create(1, 2, "B11"));
            A.Add(Tuple.Create(1, 2, "B12"));
            A.Add(Tuple.Create(1, 3, "C11"));
            A.Add(Tuple.Create(1, 3, "C12"));
            A.Add(Tuple.Create(1, 3, "C13"));
            A.Add(Tuple.Create(1, 4, "D1"));
    
            B.Add(Tuple.Create(1, 1, "A21"));
            B.Add(Tuple.Create(1, 1, "A22"));
            B.Add(Tuple.Create(1, 1, "A23"));
            B.Add(Tuple.Create(1, 2, "B21"));
            B.Add(Tuple.Create(1, 2, "B22"));
            B.Add(Tuple.Create(1, 2, "B23"));
            B.Add(Tuple.Create(1, 3, "C2"));
            B.Add(Tuple.Create(1, 5, "E2"));
    
            Func<Tuple<int, int, string>, Tuple<int, int>> key = (_) => Tuple.Create(_.Item1, _.Item2);
    
            var watch = System.Diagnostics.Stopwatch.StartNew();
            var res = DataReunion.FullJoin(A, key, B, key);
            watch.Stop();
            var elapsedMs = watch.ElapsedMilliseconds;
            String aser = JToken.FromObject(res).ToString(Formatting.Indented);
            Console.Write(elapsedMs);
        }
    }
    

    }
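
    Instead of inspecting the result at a breakpoint, the three parts of the join can be counted and asserted. The helper below is only a sketch: `JoinResultChecks` and `CountParts` are hypothetical names, and it assumes both element types are reference types, so that a missing side shows up as `null` (as it does for the `Tuple<int, int, string>` items in the test above).

```csharp
using System;
using System.Collections.Generic;

public static class JoinResultChecks
{
    // Hypothetical helper: classifies each row of a full-outer-join result
    // by which side (if any) is missing.
    public static (int LeftOnly, int Matched, int RightOnly) CountParts<T1, T2>(
        IEnumerable<Tuple<T1, T2>> rows)
        where T1 : class
        where T2 : class
    {
        int leftOnly = 0, matched = 0, rightOnly = 0;
        foreach (Tuple<T1, T2> row in rows)
        {
            if (row.Item2 == null) leftOnly++;        // no partner on the right
            else if (row.Item1 == null) rightOnly++;  // no partner on the left
            else matched++;                           // both sides present
        }
        return (leftOnly, matched, rightOnly);
    }
}
```

    With the random block commented out, so that only the essential data is loaded, `JoinResultChecks.CountParts(res)` should give 1 left-only row (key 1,4), 9 matched rows (2 × 3 for key 1,2 plus 3 × 1 for key 1,3) and 4 right-only rows (three for key 1,1 and one for key 1,5), which can be checked with `Assert.AreEqual((1, 9, 4), JoinResultChecks.CountParts(res))`.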
