hashset

HashSets don't keep the elements unique if you mutate their identity

亡梦爱人 submitted on 2019-12-06 01:27:15
Question: When working with HashSets in C#, I recently came across an annoying problem: HashSets don't guarantee uniqueness of their elements; they are not true sets. What they do guarantee is that when Add(T item) is called, the item is not added if item.Equals(that) is true for any item already in the set. This no longer holds if you mutate items that are already in the set. A small program that demonstrates the problem (copied from my LINQPad):

    void Main()
    {
        HashSet<Tester> testset = new HashSet<Tester>();
        testset.Add(new Tester(1
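The same failure mode is easy to reproduce in Java, since java.util.HashSet caches the bucket chosen at insertion time. A minimal sketch, with a hypothetical Tester class standing in for the C# one:

```java
import java.util.HashSet;
import java.util.Objects;

public class MutableKeyDemo {
    // Hypothetical stand-in for the Tester above: equality follows a mutable field.
    static class Tester {
        int id;
        Tester(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof Tester && ((Tester) o).id == id;
        }
        @Override public int hashCode() { return Objects.hash(id); }
    }

    public static int brokenSize() {
        HashSet<Tester> set = new HashSet<>();
        Tester t = new Tester(1);
        set.add(t);                 // stored in the bucket for hash(1)
        t.id = 2;                   // mutate identity while inside the set
        set.add(new Tester(2));     // probes the bucket for hash(2); t is never compared
        return set.size();          // 2: two "equal" elements in one set
    }

    public static void main(String[] args) {
        System.out.println(brokenSize()); // 2
    }
}
```

The practical rule in both languages is the same: never mutate any field that participates in equals/hashCode while the object is inside a hash-based collection.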

Faster way to count number of sets an item appears in?

会有一股神秘感。 submitted on 2019-12-05 16:24:55
I've got a list of bookmarks. Each bookmark has a list of keywords (stored as a HashSet). I also have a set of all possible keywords ("universe"). I want to find the keyword that appears in the most bookmarks. I have 1,356 bookmarks with a combined total of 698,539 keywords, 187,358 of them unique. If I iterate through every keyword in the universe and count the number of bookmarks it appears in, I'm doing 254,057,448 checks. This takes 35 seconds on my machine. The algorithm is pretty simple:

    var biggest = universe.MaxBy(kw => bookmarks.Count(bm => bm.Keywords.Contains(kw)));

Using Jon Skeet's
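Inverting the loop removes the dependence on the universe size entirely: one pass over the bookmarks, tallying each keyword into a map, costs O(total keywords) ≈ 698,539 operations instead of 254 million membership checks. A Java sketch of that idea (names are illustrative):

```java
import java.util.*;

public class KeywordCount {
    // One pass over all bookmark keyword sets; each keyword is counted
    // once per bookmark it appears in.
    public static String mostCommon(List<Set<String>> bookmarks) {
        Map<String, Integer> counts = new HashMap<>();
        for (Set<String> keywords : bookmarks)
            for (String kw : keywords)
                counts.merge(kw, 1, Integer::sum);
        return counts.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }

    public static void main(String[] args) {
        List<Set<String>> bookmarks = List.of(
                Set.of("java", "set"), Set.of("java"), Set.of("set", "java"));
        System.out.println(mostCommon(bookmarks)); // java
    }
}
```

With counts precomputed, the most frequent keyword falls out of a single max over the map, so the universe set is never consulted at all.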

HashSet storing equal objects

痴心易碎 submitted on 2019-12-05 15:07:38
Below is code for finding duplicate objects in a list of objects. But for some reason the HashSet is storing even the equal objects. I am certainly missing something here, because when I check the size of the HashSet it comes out as 5.

    import java.util.ArrayList;
    import java.util.HashSet;

    public class DuplicateTest {
        public static void main(String args[]) {
            ArrayList<Dog> dogList = new ArrayList<Dog>();
            ArrayList<Dog> duplicatesList = new ArrayList<Dog>();
            HashSet<Dog> uniqueSet = new HashSet<Dog>();
            Dog a = new Dog();
            Dog b = new Dog();
            Dog c = new Dog();
            Dog d = new Dog();
            Dog e = new Dog();
            a
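The size comes out as 5 because a class that overrides neither equals nor hashCode falls back to reference identity, so every new Dog() is distinct to the HashSet. Overriding both (consistently with each other) restores value-based deduplication. A sketch with a hypothetical name field as the identity:

```java
import java.util.HashSet;

public class DogDedup {
    // Hypothetical Dog whose identity is its name; without these two
    // overrides, HashSet treats every instance as unique.
    static class Dog {
        final String name;
        Dog(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof Dog && ((Dog) o).name.equals(name);
        }
        @Override public int hashCode() { return name.hashCode(); }
    }

    public static int uniqueCount() {
        HashSet<Dog> uniqueSet = new HashSet<>();
        for (String n : new String[]{"rex", "rex", "fido", "rex", "fido"})
            uniqueSet.add(new Dog(n));
        return uniqueSet.size(); // 2, not 5
    }

    public static void main(String[] args) {
        System.out.println(uniqueCount()); // 2
    }
}
```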

Internal System.Linq.Set<T> vs public System.Collections.Generic.HashSet<T>

旧时模样 submitted on 2019-12-05 10:58:38
Question: Check out this piece of code from the Linq.Enumerable class:

    static IEnumerable<TSource> DistinctIterator<TSource>(IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)
    {
        Set<TSource> set = new Set<TSource>(comparer);
        foreach (TSource element in source)
            if (set.Add(element))
                yield return element;
    }

Why did the guys at Microsoft decide to use this internal implementation of Set and not the regular HashSet? If it's better in some way, why not expose it to the public?

Answer 1: The
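The pattern itself (lazily yield an element only the first time the set accepts it) is independent of which set type backs it. A rough Java analogue using a stateful stream filter, since Set.add returns false for duplicates:

```java
import java.util.*;
import java.util.stream.*;

public class DistinctSketch {
    // Sketch of LINQ's DistinctIterator in Java: elements pass through
    // only the first time seen.add() accepts them, lazily.
    public static <T> Stream<T> distinct(Stream<T> source) {
        Set<T> seen = new HashSet<>();
        return source.filter(seen::add); // add() is false for duplicates
    }

    public static List<Integer> demo() {
        return distinct(Stream.of(1, 2, 1, 3, 2)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [1, 2, 3]
    }
}
```

Note the stateful predicate makes this correct only for sequential streams; a parallel-safe version would need a concurrent set.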

Remove a key in hashmap when the value's hashset is Empty

风流意气都作罢 submitted on 2019-12-05 07:28:25
I have a HashMap that maps string keys to HashSet values, and I want to remove a key from the map when its HashSet value is empty. I'm having trouble approaching this. Here's what I've tried, but I'm very stuck:

    for (Map.Entry<String, HashSet<Integer>> entr : stringIDMap.entrySet()) {
        String key = entr.getKey();
        if (stringIDMap.get(key).isEmpty()) {
            stringIDMap.remove(key);
            continue;
        }
        // few print statements...
    }

In order to avoid a ConcurrentModificationException, you need to use the Iterator interface directly:

    Iterator<Map.Entry<String, HashSet<Integer>>> it = stringIDMap
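On Java 8 and later there is a shorter alternative to the explicit iterator: removing from the map's values() view also removes the corresponding keys, and Collection.removeIf handles the safe iteration internally. A minimal sketch:

```java
import java.util.*;

public class PruneEmpty {
    // Dropping every entry whose value set is empty; values() is a live
    // view, so removals propagate to the map's keys as well.
    public static void prune(Map<String, HashSet<Integer>> map) {
        map.values().removeIf(Set::isEmpty);
    }

    public static void main(String[] args) {
        Map<String, HashSet<Integer>> m = new HashMap<>();
        m.put("a", new HashSet<>(List.of(1)));
        m.put("b", new HashSet<>());
        prune(m);
        System.out.println(m.keySet()); // [a]
    }
}
```

This only works through the view collections (values(), keySet(), entrySet()); calling map.remove() inside a for-each over entrySet() still throws ConcurrentModificationException.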

What load factor should be used when you know the maximum possible number of elements in a HashSet?

巧了我就是萌 submitted on 2019-12-05 04:17:36
What load factor should I use when I know the maximum possible number of elements in a HashSet? I had heard that the default load factor of 0.75 is recommended because it offers a good trade-off between speed and space. Is this correct? However, a larger HashSet also takes more time to create and more space. I am using a HashSet just to remove duplicate integers from a list of integers.

I spent some time playing around with load factors once, and it is shocking how little difference that setting really makes in practice. Even setting it to something high like 2.0
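When the maximum count is known, the bigger win is usually presizing the initial capacity rather than tuning the load factor: if n / capacity stays below the load factor, the table never rehashes. A sketch using the default 0.75:

```java
import java.util.*;

public class PresizedSet {
    // Sizing the set so that input.size() elements fit under the default
    // 0.75 load factor, avoiding all rehashing during addAll.
    public static HashSet<Integer> dedupe(List<Integer> input) {
        int capacity = (int) (input.size() / 0.75f) + 1;
        HashSet<Integer> set = new HashSet<>(capacity);
        set.addAll(input);
        return set;
    }

    public static void main(String[] args) {
        System.out.println(dedupe(List.of(1, 2, 2, 3, 3, 3)).size()); // 3
    }
}
```

For the stated use case (deduplicating a list of integers once), construction cost dominates and the load factor setting is mostly noise, which matches the observation above.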

nhibernate Iesi ISet fails to Remove()

非 Y 不嫁゛ submitted on 2019-12-05 02:46:23
Question: I have two classes that are handled by NHibernate: AssetGroup and Asset. AssetGroup has an ISet _assets collection, and its constructor does _assets = new HashSet<Asset>();. I have operations to add and remove an Asset in an AssetGroup:

    public abstract class Entity<Tid>
    {
        public virtual Tid Id { get; protected set; }

        public override bool Equals(object obj)
        {
            return Equals(obj as Entity<Tid>);
        }

        public static bool IsTransient(Entity<Tid> obj)
        {
            return obj != null && Equals(obj.Id,
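A common cause of Remove() failing with this id-based equality pattern is that the entity's hash code changes when the ORM assigns an Id on save, while the set still holds the object under its old (transient) hash. The pitfall can be sketched in Java with a hypothetical entity:

```java
import java.util.*;

public class EntityHashPitfall {
    // Hypothetical entity whose equality and hash follow a mutable
    // database id, mirroring the Entity<Tid> pattern above.
    static class Entity {
        Long id; // null while transient, assigned on save
        @Override public boolean equals(Object o) {
            return o instanceof Entity && Objects.equals(((Entity) o).id, id);
        }
        @Override public int hashCode() { return Objects.hashCode(id); }
    }

    public static boolean removeAfterSave() {
        Set<Entity> assets = new HashSet<>();
        Entity a = new Entity();
        assets.add(a);           // hashed while id == null
        a.id = 42L;              // "saved": identity changes inside the set
        return assets.remove(a); // false: lookup probes the bucket for hash(42)
    }

    public static void main(String[] args) {
        System.out.println(removeAfterSave()); // false
    }
}
```

The usual fix is a hash code that is stable for the object's lifetime, e.g. caching the first computed value or falling back to a per-instance token while transient.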

Is the .Net HashSet uniqueness calculation completely based on Hash Codes?

痞子三分冷 submitted on 2019-12-05 00:15:14
I was wondering whether the .NET HashSet<T> is based entirely on hash codes, or whether it uses equality as well. I have a class that I may potentially instantiate millions of instances of, and there is a reasonable chance that some hash codes will collide at that point. I'm considering using HashSets to store instances of this class and am wondering if it's actually worth doing: if the uniqueness of an element is determined only by its hash code, then the structure is of no use to me for real applications. The MSDN documentation seems rather vague on this topic; any enlightenment
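Hash-based sets in both .NET and Java use the hash code only to pick a bucket; within a bucket, equality decides uniqueness, so colliding hash codes cost lookup time but never merge distinct elements. A Java sketch with a deliberately constant hash:

```java
import java.util.*;

public class CollisionDemo {
    // Hypothetical key where every instance collides on hash code;
    // uniqueness is still decided by equals.
    static class Key {
        final String name;
        Key(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }
        @Override public int hashCode() { return 1; } // everything collides
    }

    public static int size() {
        Set<Key> set = new HashSet<>();
        set.add(new Key("a"));
        set.add(new Key("b")); // same hash, not equal -> still added
        set.add(new Key("a")); // equal to the first -> rejected
        return set.size();     // 2
    }

    public static void main(String[] args) {
        System.out.println(size()); // 2
    }
}
```

So for millions of instances, occasional collisions only degrade the affected buckets toward linear scans; correctness is unaffected.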

Subtract HashSets (and return a copy)?

北战南征 submitted on 2019-12-04 22:51:29
I've got a HashSet,

    var universe = new HashSet<int>();

and a bunch of subsets,

    var sets = new List<HashSet<int>>(numSets);

I want to subtract a chunk, which I can do like this:

    universe.ExceptWith(sets[0]);

But ExceptWith works in place (and returns void). I don't want to modify the universe. Should I clone it first, or is there a better way?

I guess I should clone it first? How do I do that?

    var universe = new HashSet<int>();
    var subset = new HashSet<int>();
    ...

    // clone the universe
    var remaining = new HashSet<int>(universe);
    remaining.ExceptWith(subset);

Not as simple as with the Except
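The copy-then-subtract idiom looks the same in Java, where the copy constructor plays the role of new HashSet<int>(universe) and removeAll the role of ExceptWith. A minimal sketch:

```java
import java.util.*;

public class SetDifference {
    // Non-destructive set difference: copy first, then subtract in place,
    // leaving the original universe untouched.
    public static <T> Set<T> except(Set<T> universe, Set<T> subset) {
        Set<T> remaining = new HashSet<>(universe);
        remaining.removeAll(subset);
        return remaining;
    }

    public static void main(String[] args) {
        Set<Integer> universe = new HashSet<>(List.of(1, 2, 3, 4));
        System.out.println(except(universe, Set.of(2, 4))); // [1, 3]
        System.out.println(universe.size()); // 4, unchanged
    }
}
```

The copy is O(n), which is also the lower bound for any non-destructive difference, so there is little to gain over this straightforward form.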

Why is Python set intersection faster than Rust HashSet intersection?

不羁岁月 submitted on 2019-12-04 22:24:09
Here is my Python code:

    len_sums = 0
    for i in xrange(100000):
        set_1 = set(xrange(1000))
        set_2 = set(xrange(500, 1500))
        intersection_len = len(set_1.intersection(set_2))
        len_sums += intersection_len
    print len_sums

Here is my Rust code:

    use std::collections::HashSet;

    fn main() {
        let mut len_sums = 0;
        for _ in 0..100000 {
            let set_1: HashSet<i32> = (0..1000).collect();
            let set_2: HashSet<i32> = (500..1500).collect();
            let intersection_len = set_1.intersection(&set_2).count();
            len_sums += intersection_len;
        }
        println!("{}", len_sums);
    }

I believe these are roughly equivalent. I get the following
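Both implementations do essentially the same work: probe the larger set with each element of the smaller one, so the gap usually comes down to per-probe hashing cost (Rust's default hasher is the DoS-resistant SipHash, while CPython hashes small ints almost for free). The core loop can be sketched in Java for reference:

```java
import java.util.*;

public class IntersectionCount {
    // The standard intersection-size algorithm: iterate the smaller set
    // and probe the larger one, one hash lookup per element.
    public static int intersectionLen(Set<Integer> a, Set<Integer> b) {
        Set<Integer> small = a.size() <= b.size() ? a : b;
        Set<Integer> large = (small == a) ? b : a;
        int count = 0;
        for (Integer x : small)
            if (large.contains(x)) count++;
        return count;
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>();
        for (int i = 0; i < 1000; i++) a.add(i);
        Set<Integer> b = new HashSet<>();
        for (int i = 500; i < 1500; i++) b.add(i);
        System.out.println(intersectionLen(a, b)); // 500
    }
}
```

In Rust, swapping in a faster non-cryptographic hasher for the benchmark's HashSets is the usual way to close most of the gap; counting directly (as the .count() in the code above already does) also avoids materializing the intersection.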