Removing duplicate strings from a comma separated list, in a cell

Posted by 橙三吉 on 2019-12-24 06:45:11

Question


I'm using Google Sheets, and this is way beyond my basic scripting skills.

I have numerous cells containing comma-separated values:

AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB

BB, ZZ, ZZ, AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB

I'm trying to return:

AA, BB, CC, BBB, CCC, CCCCC etc.

BB, ZZ, AA, CC, BBB, CCC, CCCCC etc.

In other words, remove the duplicates within each cell.

I can't get my head around a solution. I've tried every online tool I could find that removes duplicates, but they all remove duplicates across the whole document rather than per cell.

Part of the problem is that I can't sort the values in each cell alphabetically (which would make things simple). They have to stay in the order in which they originally appear.

I also have OpenRefine at my disposal, which I believe is a clever tool, but using it is beyond my skill.


Answer 1:


Here is how to do that in OpenRefine.

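The formula I used is:

value.split(', ').uniques().join(', ')

It means: split the cell's value on the comma-and-space separator, remove the duplicates, then join the remaining items back together with the same separator. Splitting on ', ' rather than ',' matters here, because otherwise ' AA' (with a leading space) and 'AA' would be treated as different values.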

EDIT:

Another solution in OpenRefine, using Python instead of GREL. This one does a better job of preserving the original order.

Python/Jython script:

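from collections import OrderedDict
# Split on commas and trim the whitespace around each item
items = [item.strip() for item in value.split(',')]
# OrderedDict.fromkeys keeps only the first occurrence of each item, in order
dedup = list(OrderedDict.fromkeys(items))
return ", ".join(dedup)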
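
To check the logic outside OpenRefine, here is a minimal standalone sketch of the same idea in plain Python 3. The function name `dedupe_cell` and the `sep` parameter are only illustrative, not part of OpenRefine. It assumes Python 3.7 or later, where a regular `dict` preserves insertion order, and it also assumes that empty items (for example from a trailing comma) should be dropped.

```python
def dedupe_cell(value, sep=","):
    """Remove repeated items from a delimited string, keeping first occurrences in order."""
    # Split on the separator and trim the whitespace around each item
    items = [item.strip() for item in value.split(sep)]
    # dict.fromkeys keeps one key per distinct item, in first-seen order (Python 3.7+);
    # empty items are skipped (an assumption, see the lead-in)
    unique_items = dict.fromkeys(item for item in items if item)
    return ", ".join(unique_items)


# The two sample cells from the question
rows = [
    "AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB",
    "BB, ZZ, ZZ, AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB",
]
for row in rows:
    print(dedupe_cell(row))
```

Each cell is handled on its own, so a value repeated in another cell is left alone, which is the per-cell behaviour the question asks for.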


Source: https://stackoverflow.com/questions/50937289/removing-duplicate-strings-from-a-comma-separated-list-in-a-cell
