How does clustering (especially String clustering) work?
I have heard that clustering can be used to group similar data, and I want to know how it works for strings specifically. I have a table with more than 100,000 different words. I want to identify variants of the same word (e.g. house, house!!, hooouse, HoUse, @house, "house", etc.). What is needed to measure the similarity and group each word into a cluster? Which algorithm is recommended for this?

To understand what clustering is, imagine a geographical map. You can see many distinct objects (such as houses). Some of them are close to each other, and others are far apart. Based on this, you can