Python intersection and difference sum doesn't give me the actual number of the original set

血红的双手。 submitted on 2019-12-18 09:31:05

Question


I have two lists, one with old IDs and one with new IDs.

I want to get items in common and items not common.

The new_Items list holds all the new IDs and the old_Items list holds the old ones.

I expected that the number of items in common, plus the number of items that are in new_Items but not in old_Items, would equal the total number of items in new_Items.

Here is the code and the output.

print(old_Items)
print(new_Items)

common = set(new_Items) & set(old_Items)
not_common = set(new_Items) - set(old_Items)
print(len(old_Items))
print(len(new_Items))
print(len(common))
print(len(not_common))

output

['312064913440', '312062038159', '382373644951', '312061362147', '312063436815', '382376480677', '382376472268', '382377376960', '382377376948', '312064169607', '312064914150', '312064169620', '312064169613', '382376480674', '382376472280', '382378338388', '312061362154', '312063426996', '382377376961', '312064912982', '312064912973', '312063426974', '312063427017', '312063427025', '312063436813', '312064913415', '382378337435', '382378337746', '382378337752', '382378338374', '382378338378', '382378338385', '382378338387', '382378338389', '382378338392', '312063436814', '312064169626', '312064912968', '312064912971', '312064912972', '312064912981', '312064913414', '312064913435', '312064914151', '312064914158', '382376480665', '382378337434', '382378337437', '382378337449', '382378337456', '382378337737', '382378337757', '312063426962', '382376480681', '382376472292', '382376480675', '382377376955', '312064914146', '382378337735', '312064912964', '312064913436', '312064914160', '382376472265', '382378337443', '382378337738', '382378337740', '312063436819', '382376472311', '382376480678', '382376480667', '312063426963', '312063426969', '312063426988', '312063426991', '312063427011', '312063427027', '312063436817', '312064169618', '312064169622', '312064169623', '312064912959', '312064912966', '312064912974', '312064912975', '312064912976', '312064912979', '312064912980', '312064912985', '312064913416', '312064913417', '312064913420', '312064913424', '312064913427', '312064913437', '312064913439', '312064913442', '312064914148', '312064914155', '312064914162', '312064914163', '312064914164', '312064914166', '382376472307', '382376480658', '382376480679', '382377376950', '382378337438', '382378337442', '382378337444', '382378337445', '382378337446', '382378337448', '382378337455', '382378337458', '382378337460', '382378337739', '382378337742', '382378337745', '382378337748', '382378337749', '382378337750', '382378337756', '382378337758', '382378337759', 
'382378337765', '382378338361', '382378338363', '382378338372', '382378338373', '382378338377', '382378338379', '382378338382', '382378338383', '382378338384', '312062038160', '312063426970', '312063427014', '312063427022', '312063436820', '312063436821', '312063436822', '312064169625', '312064169630', '312064912962', '312064912963', '312064912969', '312064912978', '312064912983', '312064912984', '312064912986', '312064912987', '312064912988', '312064913419', '312064913425', '312064913432', '312064913438', '312064914147', '312064914154', '312064914159', '312064914161', '382376472276', '382376472282', '382376472297', '382376472308', '382376480659', '382376480663', '382376480670', '382376480673', '382376480676', '382376480684', '382376480686', '382376480687', '382377376951', '382378337433', '382378337436', '382378337439', '382378337447', '382378337450', '382378337451', '382378337452', '382378337454', '382378337457', '382378337736', '382378337741', '382378337743', '382378337747', '382378337751', '382378337754', '382378337760', '382378337761', '382378337763', '382378337764', '382378338362', '382378338365', '382378338366', '382378338367', '382378338368', '382378338369', '382378338370', '382378338371', '382378338381', '382378338386', '382378338390', '312063426985', '312064169612', '382376480671', '312063427019', '312064169608', '312064169610', '312063436828', '312064169619', '382378337755', '312062714117', '312063436833', '312064169611', '382373643627', '382376472281', '382376472287', '382376472301', '382376472302', '382376480661', '382377376952', '382377376954', '382377376956', '382377376957', '382377376959', '382378337459', '312063426973', '312063427005', '312063436826', '312064169606', '312064169624', '312064169628', '382373643615', '382376472288', '382376480666', '382376480669', '382376480682', '312063427002', '312063436831', '312064169614', '312064169615', '382376480662', '382377376947', '312063426998', '382376480664', '382376480668', '382377376958', '312063426992', 
'312063436810', '312064169605', '312064912970', '312064913418', '312064913429', '312064913431', '382376480660', '382378337753', '382378338364', '382378338380', '312063426964', '312063426957', '312063436809', '312063436812', '382376472298', '382378338393', '382376480680', '312064169629', '312064913423', '312064914152', '312064914157', '312064914165', '382378338375', '382378338376', '312063426977', '312063426978', '382376472279', '312063436827', '382376472275', '382377376949', '312063427001', '312063436825', '312063436829', '312063436830', '312063426989', '312063426993', '312064169609', '382375693533', '382376472267', '382376472299', '382376480685', '312063436832']
['312065926243', '382376472268', '312067111164', '382378338380', '312064913415', '382380706562', '382380706577', '382380706899', '382379331671', '382376480673', '382376480674', '312067111153', '382380706584', '382378337450', '382378337454', '382376472301', '312067111663', '382378337459', '382379835966', '382379835959', '382379835961', '382380706907', '382378337444', '382380706580', '382378337436', '312066454641', '312063426992', '312067111152', '382379335272', '382378337752', '382378337449', '382378337437', '312067111167', '312066454623', '312067111471', '382379835965', '382380706919', '312066454621', '312067111158', '312067111163', '312067111468', '312067111647', '382380706718', '382380706732', '312067111150', '312067111446', '382379331513', '382379835967', '312067111436', '312067111462', '312067111464', '312067111466', '312067111468', '312067111647', '312067111652', '382380706583', '382380706718', '382380706723', '382380706732', '382380706894', '382380706897', '382380706912', '382379331513', '382379835967', '382378337435', '312064912968', '382378337456', '312064912971', '312064912972', '312064914151', '312066454616', '312066454639', '382378338378', '312064912981', '312067111435', '382376472292', '382378337434', '312064912973', '312064914158', '312067111169', '312067111443', '312067111646', '312067111676', '382380706567', '382380706559', '382380706572', '382380706719', '312064914160', '382378337443', '312064914146', '312067111442', '312067111441', '312067111463', '382378337735', '382376472265', '312063436819', '312067111441', '382376472311', '312064914155', '312063427014', '312063436822', '312064912984', '312066454628', '312063436817', '382378337756', '382376480670', '312064912962', '312064913438', '312066454629', '312066454634', '312066454635', '312066454645', '312067111143', '312067111451', '312067111452', '312067111454', '312067111467', '312067111470', '312067111650', '312067111653', '312067111654', '312067111662', '312067111665', '312067111671', 
'382379835960', '382379835962', '382379835968', '382379835971', '382380706573', '382380706727', '382380706728', '382380706915', '382380706917', '382380706920', '312065919161', '312066454625', '312067111147', '312067111156', '312067111159', '312067111457', '312067111458', '312067111460', '312067111461', '312067111651', '312067111667', '312067111672', '382379835958', '382380706574', '382380706722', '382380706901', '312064913432', '382378337433', '312067111154', '312067111165', '382380706892', '382378338379', '382378338365', '312064912988', '312067111455', '312067111465', '312067111657', '312067111660', '312067111664', '382378337447', '382380706729', '312063436828', '382378338377', '312064913427', '382378337438', '312064913442', '312064912987', '382378337452', '382378338362', '382378337455', '312064912979', '312067111168', '382380706717', '312063427011', '382378337750', '382378337458', '382378337743', '382378338373', '312067111140', '382379835974', '382380706565', '382380706734', '312064912975', '382378337446', '312064914162', '382378338382', '312064914166', '312063426998', '312064914166', '312063426998', '312063427019', '382378337754', '312064912963', '382378338369', '382379835964', '382376472282', '312064914148', '312066454618', '312066454619', '312066454626', '312066454631', '312067111141', '312067111166', '312067111447', '312067111453', '312067111456', '382379835972', '382380706716', '382380706724', '382380706736', '382380706913', '312066454630', '312066454633', '312066454636', '312066454643', '312067111151', '312067111157', '312067111449', '312067111469', '312067111656', '312067111658', '312067111669', '312067111670', '312067111675', '382379835970', '382380706566', '382380706575', '382380706582', '382380706725', '382380706726', '382380706730', '382380706733', '382380706898', '382380706903', '382380706905', '382380706906', '382378338372', '312066454620', '312066454637', '312067111162', '312067111666', '382379835953', '382380706570', '382380706578', '382380706896', 
'382380706916', '312066454617', '312066454622', '312066454632', '312067111145', '312067111146', '382379835954', '382379835963', '382380706576', '382378337765', '312063426969', '382379835969', '382378337451', '382378338368', '382378337448', '382378337442', '382378338371', '382378337439', '382378338386', '312064912986', '382376472307', '382376480687', '312064912976', '312064912983', '382378337457', '312065916615', '382379835952', '312066454615', '312066454627', '382379835955', '382380706561', '382380706571', '382380706714', '382378338366', '382380706564', '312064912974', '382378337460', '382380706581', '382376480660', '312063427002', '312064912978', '312067111439', '382380706900', '312067111160', '382379835951', '382380706721', '382380706908', '312067111438', '312067111649', '382380706560', '382380706895', '382380706918', '382378337445', '312064912959', '312064912966', '382376480680', '312063436809', '382376472298', '382379835957', '382379835973', '312063427001', '312063426977', '382378338393', '312063426957', '312063436830', '312063436812', '312063436829', '382376472275', '312063436825', '312064913423', '382376472299', '382376472267', '312063436832', '312064914157', '382378338375', '312064914165', '382378338376', '312064914152']
291 # number of items in old_Items
327 # number of items in new_Items
122 # intersection result
196 # result of set(new_Items) - set(old_Items)

Answer 1:


What you're looking for is called the "symmetric difference".

set(new_Items) ^ set(old_Items)

Or,

set(new_Items).symmetric_difference(old_Items)

This gives you items that belong to either set, but not both. You are currently computing only those items that belong to new_Items, but not the other way round, hence the discrepancy.

Refer to the set.symmetric_difference docs.
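
As a minimal sketch of the two spellings above, the following uses short made-up ID lists (old_items and new_items here are hypothetical sample data, not the lists from the question) and shows that the operator and the method give the same result:

```python
# Hypothetical sample IDs, much shorter than the lists in the question.
old_items = ["a", "b", "c", "d"]
new_items = ["c", "d", "e"]

old_set = set(old_items)
new_set = set(new_items)

# Operator form: both operands must be sets.
only_in_one = new_set ^ old_set

# Method form: the argument may be any iterable, such as a list.
same_result = new_set.symmetric_difference(old_items)

print(sorted(only_in_one))         # ['a', 'b', 'e']
print(only_in_one == same_result)  # True
```

Note that the result mixes items found only in the old list ("a", "b") with items found only in the new list ("e"), so it answers "which items are not shared", not "which items are new".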




Answer 2:


A - B gives the items of A that are not in B. B - A gives the items of B that are not in A. Neither of these is what you are looking for.

For the items not in common you need (A union B) minus (A intersection B), which is the symmetric difference of the two sets.

You can also get it with "(A - B) union (B - A)".
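
The equivalence described above can be checked on a small example. This is only an illustrative sketch; A and B are arbitrary sample sets chosen for the demonstration:

```python
# Arbitrary sample sets for illustration.
A = {1, 2, 3, 4}
B = {3, 4, 5}

# (A union B) minus (A intersection B)
via_union = (A | B) - (A & B)

# (A - B) union (B - A)
via_differences = (A - B) | (B - A)

# Both formulations match the built-in symmetric difference.
print(via_union == via_differences == A ^ B)  # True
print(sorted(via_union))                      # [1, 2, 5]
```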




Answer 3:


The list had some repeated items; that was the problem.

Converting the list to a set drops these repeated items, which is why the printed counts are lower than I expected.
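
In the question's output, 122 + 196 = 318, which is 9 fewer than the 327 entries in new_Items, so 9 entries of that list must be repeats of IDs that already appear in it. A small sketch of how such repeats could be spotted, using hypothetical sample lists and collections.Counter from the standard library:

```python
from collections import Counter

# Hypothetical sample data; "x2" and "x3" are deliberately repeated.
new_items = ["x1", "x2", "x2", "x3", "x3", "x3"]
old_items = ["x1", "x9"]

new_set = set(new_items)
common = new_set & set(old_items)
not_common = new_set - set(old_items)

# The two parts add up to the size of the set, not of the list.
print(len(common) + len(not_common) == len(new_set))  # True

# IDs that occur more than once in the list, with their counts.
duplicates = {item: n for item, n in Counter(new_items).items() if n > 1}

# Number of surplus entries that the set conversion removes.
extra = len(new_items) - len(new_set)

print(duplicates)  # {'x2': 2, 'x3': 3}
print(extra)       # 3
```

Comparing len(new_items) with len(set(new_items)) before doing any set arithmetic is a quick way to see whether the list contains repeats.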



Source: https://stackoverflow.com/questions/48807504/python-intersection-and-difference-sum-doesnt-give-me-the-actual-number-of-the
