Extract links from string optimization

予麋鹿 2021-01-03 05:21

I get data (an HTML string) from a website and want to extract all the links from it. I wrote a function that works, but it is very slow.

Can you help me optimize it? What standard

7 answers
  •  感动是毒
    2021-01-03 05:31

    Details

    • Swift 5.2, Xcode 11.4 (11E146)

    Solution

    import Foundation
    
    // MARK: DataDetector
    
    class DataDetector {
    
        private class func _find(all type: NSTextCheckingResult.CheckingType,
                                 in string: String, iterationClosure: (String) -> Bool) {
            guard let detector = try? NSDataDetector(types: type.rawValue) else { return }
            let range = NSRange(string.startIndex ..< string.endIndex, in: string)
            let matches = detector.matches(in: string, options: [], range: range)
            loop: for match in matches {
                for i in 0 ..< match.numberOfRanges {
                    let nsrange = match.range(at: i)
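                    // NSRange counts UTF-16 units, so convert it with Range(_:in:) instead of Character offsets.
                    guard let range = Range(nsrange, in: string) else { continue }
                    // Stop iterating as soon as the closure returns false.
                    guard iterationClosure(String(string[range])) else { break loop }
                }
            }
        }
    
        class func find(all type: NSTextCheckingResult.CheckingType, in string: String) -> [String] {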
            var results = [String]()
            _find(all: type, in: string) {
                results.append($0)
                return true
            }
            return results
        }
    
        class func first(type: NSTextCheckingResult.CheckingType, in string: String) -> String? {
            var result: String?
            _find(all: type, in: string) {
                result = $0
                return false
            }
            return result
        }
    }
    
    // MARK: String extension
    
    extension String {
        var detectedLinks: [String] { DataDetector.find(all: .link, in: self) }
        var detectedFirstLink: String? { DataDetector.first(type: .link, in: self) }
        var detectedURLs: [URL] { detectedLinks.compactMap { URL(string: $0) } }
        var detectedFirstURL: URL? {
            guard let urlString = detectedFirstLink else { return nil }
            return URL(string: urlString)
        }
    }
    

    Usage

    let text = """
    Lorem Ipsum is simply dummy text of the printing and typesetting industry. apple.com/ Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. http://gooogle.com. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. yahoo.com It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
    """
    
    print(text.detectedLinks)
    print(text.detectedFirstLink)
    print(text.detectedURLs)
    print(text.detectedFirstURL)
    

    Console output

    ["apple.com/", "http://gooogle.com", "yahoo.com"]
    Optional("apple.com/")
    [apple.com/, http://gooogle.com, yahoo.com]
    Optional(apple.com/)
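
Since the question is about speed, one possible variation (a sketch, not part of the answer above): the `first` helper still calls `matches(in:options:range:)`, which collects every match before the first one is returned. `NSDataDetector` inherits `enumerateMatches(in:options:range:using:)` from `NSRegularExpression`, and its block can stop the scan early. The function name `firstDetectedURL(in:)` and the sample string are hypothetical and used only for illustration. The sketch also reads the match's `url` property rather than rebuilding a `URL` from the matched substring.

```swift
import Foundation

// Hypothetical helper (not from the original answer): returns the first link
// that NSDataDetector reports and stops scanning right after it.
func firstDetectedURL(in string: String) -> URL? {
    let linkType = NSTextCheckingResult.CheckingType.link.rawValue
    guard let detector = try? NSDataDetector(types: linkType) else { return nil }

    // NSRange is measured in UTF-16 code units; this initializer converts safely.
    let range = NSRange(string.startIndex..<string.endIndex, in: string)

    var found: URL?
    detector.enumerateMatches(in: string, options: [], range: range) { match, _, stop in
        // For .link results the detector exposes the parsed URL directly.
        if let url = match?.url {
            found = url
            stop.pointee = true  // ask the detector to stop enumerating
        }
    }
    return found
}

// Illustrative input only.
let sample = "Docs live at http://example.com/docs and mirrors elsewhere."
if let url = firstDetectedURL(in: sample) {
    print(url.absoluteString)
}
```

Whether this is measurably faster depends on the input size and on where the first link sits, so it is worth profiling against real data before adopting it.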
    
