Swift - Replacing emojis in a string with whitespace

Anonymous (unverified), posted 2019-12-03 01:18:02

Question:

I have a method that detects URLs in a string and returns both the URLs and the ranges where they can be found. Everything works perfectly until there are emojis in the string. For example:

Because of the emojis, the URL extracted from the text is http://youtu.be/SW_d3fGz1 instead of http://youtu.be/SW_d3fGz1hk. I figured that the easiest solution was to just replace the emojis in the string with whitespace characters (because I need the ranges to stay correct for some text styling). The problem is that this is extremely hard to accomplish with Swift (most likely my command of the Swift String API is lacking).

I've been trying to do it like this, but it seems that I cannot create a string from an array of Unicode code points:

var emojilessStringWithSubstitution: String {
    let emojiRanges = [0x1F601...0x1F64F, 0x2702...0x27B0]
    let emojiSet = Set(emojiRanges.flatten())
    let codePoints: [UnicodeScalar] = self.unicodeScalars.map {
        if emojiSet.contains(Int($0.value)) {
            return UnicodeScalar(32)
        }
        return $0
    }
    return String(codePoints)
}

Am I approaching this problem the wrong way? Is replacing emojis the best solution here? If so, how can I do it?

Answer 1:

You can use pattern matching (for emoji patterns) to filter out emoji characters from your String.
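A minimal sketch of such a filter, assuming only the two intervals from the question. The property name emojilessString is hypothetical, and the ranges are written over UInt32 scalar values so that the code compiles on recent Swift versions:

```swift
extension String {
    /// Hypothetical name: returns a copy with scalars in the listed intervals removed.
    var emojilessString: String {
        // Assumption: only the two intervals from the question are checked.
        let emojiPatterns: [ClosedRange<UInt32>] = [0x1F601...0x1F64F, 0x2702...0x27B0]

        var kept = String.UnicodeScalarView()
        for scalar in unicodeScalars {
            // Pattern-match the scalar value against each interval.
            let isEmoji = emojiPatterns.contains(where: { $0 ~= scalar.value })
            if !isEmoji {
                kept.append(scalar)
            }
        }
        return String(kept)
    }
}
```

Because the emoji scalars are dropped rather than replaced, every character after a removed emoji shifts left, so ranges computed on the result no longer line up with the original string.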

Note that the above only makes use of the emoji intervals as presented in your question and is in no way representative of all emojis, but the method is general and can swiftly be extended by adding further emoji intervals to the emojiPatterns array.


Reading your question again, I realize that you'd prefer substituting emojis with whitespace characters rather than removing them (which the filtering solution above does). We can achieve this by replacing the .filter operation above with a conditional .map operation instead, much like in your question:

extension String {

    var emojilessStringWithSubstitution: String {
        let emojiPatterns = [UnicodeScalar(0x1F600)...UnicodeScalar(0x1F64F),
                             UnicodeScalar(0x1F300)...UnicodeScalar(0x1F5FF),
                             UnicodeScalar(0x1F680)...UnicodeScalar(0x1F6FF),
                             UnicodeScalar(0x2600)...UnicodeScalar(0x26FF),
                             UnicodeScalar(0x2700)...UnicodeScalar(0x27BF),
                             UnicodeScalar(0xFE00)...UnicodeScalar(0xFE0F)]

        return self.unicodeScalars
            .map { ucScalar in
                emojiPatterns.contains{ $0 ~= ucScalar } ? UnicodeScalar(32) : ucScalar }
            .reduce("") { $0 + String($1) }
    }
}

In the above, the emoji intervals have been extended, as per your comment on this post (listing these intervals), so the check now covers considerably more emojis, although it may still not be exhaustive.
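The extension above was written for an older Swift release. On recent versions, UnicodeScalar(_:) called with a UInt32 or Int is failable, so the scalar ranges no longer compile as written. One way to express the same idea, as a sketch rather than a definitive port, is to compare scalar values against UInt32 ranges. The property name emojisReplacedWithSpaces is hypothetical:

```swift
extension String {
    /// Hypothetical name: replaces each scalar in the listed blocks with one space.
    var emojisReplacedWithSpaces: String {
        // Same six intervals as in the answer, expressed over UInt32 scalar values.
        let emojiPatterns: [ClosedRange<UInt32>] = [
            0x1F600...0x1F64F,
            0x1F300...0x1F5FF,
            0x1F680...0x1F6FF,
            0x2600...0x26FF,
            0x2700...0x27BF,
            0xFE00...0xFE0F
        ]
        let space: Unicode.Scalar = " "

        var result = String.UnicodeScalarView()
        for scalar in unicodeScalars {
            let isEmoji = emojiPatterns.contains(where: { $0 ~= scalar.value })
            result.append(isEmoji ? space : scalar)
        }
        return String(result)
    }
}
```

The number of Unicode scalars is preserved, but a scalar above U+FFFF occupies two UTF-16 code units while a space occupies one. If the ranges are NSRange values (an assumption about the asker's setup), offsets after each such emoji still shift by one unless two spaces are appended for those scalars.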



Answer 2:

Swift 4:

extension String {
  func stringByRemovingEmoji() -> String {
    return String(self.filter { !$0.isEmoji() })
  }
}

extension Character {
  fileprivate func isEmoji() -> Bool {
    return Character(UnicodeScalar(UInt32(0x1d000))!) <= self && self <= Character(UnicodeScalar(UInt32(0x1f77f))!)
      || Character(UnicodeScalar(UInt32(0x2100))!) <= self && self <= Character(UnicodeScalar(UInt32(0x26ff))!)
  }
}


Answer 3:

Emojis are classified as symbols by Unicode. Character sets are typically used in searching operations, so we will use the symbols property of CharacterSet.
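A minimal sketch of this idea, assuming a sample input with a single emoji. The input text and the scalar-by-scalar loop are illustrative assumptions, not the answer's original code:

```swift
import Foundation

// Hypothetical sample input; U+1F600 is the grinning-face emoji.
var emojiString = "Hey there \u{1F600}, welcome"

// Replace every scalar that belongs to CharacterSet.symbols with a space.
let space: Unicode.Scalar = " "
var scalars = String.UnicodeScalarView()
for scalar in emojiString.unicodeScalars {
    scalars.append(CharacterSet.symbols.contains(scalar) ? space : scalar)
}
emojiString = String(scalars)
print(emojiString)
```

CharacterSet.symbols also contains non-emoji symbols such as + and =, which can appear in URLs, so this set is broader than "emoji only".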

Output is

Hey there , welcome 

Notice that the emoji has been replaced by a whitespace character, so there are now two consecutive spaces. We can collapse them as follows:

emojiString.replacingOccurrences(of: "  ", with: " ")  

The call above replaces every occurrence of two consecutive spaces (the of: parameter) with a single space (the with: parameter).


