homoglyph

Homoglyph attack detection in email phishing

不问归期 posted on 2020-12-30 06:51:38
Question: Main Question: I am working on an API in Java that needs to detect the use of brand names (e.g. PayPal, Mastercard) in phishing emails. Attackers use a variety of strategies to disguise these brands so that they are harder to detect. For instance, "rnastercard" looks very similar to "mastercard" and can fool an unsuspecting user. At present I can easily detect misspellings of these brands using a form of fuzzy string search. However, the problem I am facing is
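One common way to approach the "rnastercard" case is to fold look-alike sequences into a canonical form before comparing against the brand list. The sketch below is only an illustration of that idea, written in Python for brevity (the question targets Java, where the same logic carries over). The `CONFUSABLES` table, the `BRANDS` set, and the function names `skeleton` and `spoofed_brand` are all assumptions made for this example, not part of the original post or of any library.

```python
# Hypothetical table: maps a look-alike sequence to the form it imitates.
CONFUSABLES = {"rn": "m", "vv": "w", "0": "o", "1": "l"}

# Hypothetical list of protected brand names, stored in lower case.
BRANDS = {"paypal", "mastercard"}


def skeleton(text):
    """Lower-case text and fold each look-alike sequence into its canonical form."""
    folded = text.lower()
    # Replace longer sequences first so multi-character look-alikes win.
    for lookalike in sorted(CONFUSABLES, key=len, reverse=True):
        folded = folded.replace(lookalike, CONFUSABLES[lookalike])
    return folded


def spoofed_brand(token):
    """Return the brand a token imitates, or None if it is genuine or unrelated."""
    if token.lower() in BRANDS:
        return None  # exact match: the real brand name, not a spoof
    folded = skeleton(token)
    return folded if folded in BRANDS else None
```

Folding is lossy (an ordinary word such as "modern" folds to "modem"), so the folded form should only be used for comparison against the brand list, never shown to users or stored in place of the original text.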

Efficient algorithm to find all “character-equal” strings?

╄→尐↘猪︶ㄣ posted on 2019-12-21 05:11:18
Question: How can we write an efficient function that outputs the "homoglyph equivalents" of an input string?

Example 1 (pseudo-code):

    homoglyphs_list = [
        ["o", "0"],       // "o" and "0" are homoglyphs
        ["i", "l", "1"]   // "i", "l" and "1" are homoglyphs
    ]
    input_string = "someinput"
    output = [
        "someinput", "s0meinput", "somelnput",
        "s0melnput", "some1nput", "s0me1nput"
    ]

Example 2:

    homoglyphs_list = [
        ["rn", "m", "nn"],
    ]
    input_string = "rnn"
    output = ["rnn", "rm", "mn", "rrn", "nnn", "nm", "nrn"]

Example 3:
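When every homoglyph is a single character, as in Example 1, the output is just the Cartesian product of the alternatives at each position. The following is a minimal sketch under that assumption; the function name `single_char_variants` is made up for illustration, and the approach does not handle multi-character groups such as "rn" and "m".

```python
from itertools import product


def single_char_variants(text, groups):
    """Enumerate variants of text when every homoglyph is exactly one character."""
    # Map each character to the full group it belongs to.
    alternatives = {ch: group for group in groups for ch in group}
    # Characters outside every group can only stand for themselves.
    choices = [alternatives.get(ch, [ch]) for ch in text]
    return ["".join(combo) for combo in product(*choices)]
```

The number of results is the product of the group sizes over all positions, so long inputs with many substitutable characters grow exponentially; `product` could be iterated lazily instead of collected into a list if that matters.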

Efficient algorithm to find all “character-equal” strings?

时光毁灭记忆、已成空白 posted on 2019-12-03 15:42:50
How can we write an efficient function that outputs the "homoglyph equivalents" of an input string?

Example 1 (pseudo-code):

    homoglyphs_list = [
        ["o", "0"],       // "o" and "0" are homoglyphs
        ["i", "l", "1"]   // "i", "l" and "1" are homoglyphs
    ]
    input_string = "someinput"
    output = [
        "someinput", "s0meinput", "somelnput",
        "s0melnput", "some1nput", "s0me1nput"
    ]

Example 2:

    homoglyphs_list = [
        ["rn", "m", "nn"],
    ]
    input_string = "rnn"
    output = ["rnn", "rm", "mn", "rrn", "nnn", "nm", "nrn"]

Example 3:

    homoglyphs_list = [
        ["d", "ci", "a"],   // "d", "ci" and "a" are homoglyphs
        ["i", "l", "1"]     // "i" and "l"
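Example 2 suggests that substitutions chain: "nm" is only reachable from "rnn" by way of "nnn". One way to read the requirement, then, is as a reachability search over single substitutions. The sketch below takes that reading as an assumption; the function name `homoglyph_closure` is hypothetical, and this is a straightforward breadth-first search rather than a tuned algorithm.

```python
from collections import deque


def homoglyph_closure(text, groups):
    """Collect every string reachable from text by swapping one homoglyph at a time."""
    seen = {text}
    queue = deque([text])
    while queue:
        current = queue.popleft()
        for group in groups:
            for source in group:
                # Visit every (possibly overlapping) occurrence of source.
                start = current.find(source)
                while start != -1:
                    for target in group:
                        if target != source:
                            candidate = (
                                current[:start] + target + current[start + len(source):]
                            )
                            if candidate not in seen:
                                seen.add(candidate)
                                queue.append(candidate)
                    start = current.find(source, start + 1)
    return seen
```

Called with the inputs of Example 2, this returns the seven strings listed there (as a set, so order is not preserved). The search assumes non-empty group members, and it only terminates when the closure is finite: a group such as `["n", "nn"]`, where one member contains another, would let strings grow without bound.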