Partial Intersection of Specific Columns in Large CSV Files
Question

I'm working on a script to find the intersection between large CSV files based on the contents of only two specific columns in each file: Query ID and Subject ID. The files come in Left/Right pairs, one pair per species, and each file looks something like this:

Similarity (%)  Query ID            Subject ID
100.000000      BRADI5G01462.1_1    BRADI5G16060.1_36
90.000000       BRADI5G02480.1_5    NCRNA_11838_6689
100.000000      BRADI5G06067.1_8    NCRNA_32597_1525
90.000000       BRADI5G08380.1_12   NCRNA_32405_1776
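To make the intended operation concrete, here is a minimal Python sketch of intersecting two files on the (Query ID, Subject ID) pair while ignoring the Similarity column. It rests on assumptions the post does not confirm: the files are tab-delimited, the header row uses exactly the column names shown above, and the set of pairs from one file fits in memory. The function names `load_pairs` and `intersect_pairs` are hypothetical, chosen only for illustration.

```python
import csv


def load_pairs(path, delimiter="\t"):
    """Return the set of (Query ID, Subject ID) pairs found in one file.

    Assumes a header row with the column names "Query ID" and "Subject ID"
    and a tab delimiter; pass a different delimiter if the files differ.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        return {(row["Query ID"], row["Subject ID"]) for row in reader}


def intersect_pairs(left_path, right_path, delimiter="\t"):
    """Return the pairs present in both files, ignoring Similarity (%)."""
    left_pairs = load_pairs(left_path, delimiter)
    common = set()
    # Stream the second file row by row so only one file's pairs
    # (plus the matches) are held in memory at a time.
    with open(right_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter=delimiter):
            key = (row["Query ID"], row["Subject ID"])
            if key in left_pairs:
                common.add(key)
    return common
```

Calling `intersect_pairs("left.tsv", "right.tsv")` (hypothetical file names) would return a set of tuples. Keying on a tuple of both IDs means a row counts as shared only when the Query ID and the Subject ID both match.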