Detect URLs in text with JavaScript

孤城傲影 2020-11-22 06:23

Does anyone have suggestions for detecting URLs in a set of strings?

arrayOfStrings.forEach(function(string){
  // detect URLs in strings and do something sw
});

13 answers
  •  无人共我
    2020-11-22 07:07

    Generic Object Oriented Solution

    For people like me who use frameworks such as Angular, which discourage manipulating the DOM directly, I created a function that takes a string and returns an array of url/plainText objects. You can use that array to build whatever UI representation you want.

    URL regex

    For URL matching I used a slightly adapted version of h0mayun's regex: /(?:(?:https?:\/\/)|(?:www\.))[^\s]+/g
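
    As a rough illustration of what this pattern picks up, here is a minimal sketch run against a made-up sentence. The sample string and the variable names `sample` and `found` are assumptions for the example, not part of the original answer. Note that the matches still carry their trailing punctuation.

```typescript
// Same pattern as in the answer: a URL starts with http://, https:// or www.
// and runs until the next whitespace character.
const findUrlRegex = /(?:(?:https?:\/\/)|(?:www\.))[^\s]+/g;

// Hypothetical sample input, chosen only for illustration.
const sample = "Docs at https://example.com/docs, mirror at www.example.org.";

// With a global regex, String.prototype.match returns every match, or null if there is none.
const found: string[] = sample.match(findUrlRegex) ?? [];

console.log(found);
// → [ 'https://example.com/docs,', 'www.example.org.' ]
```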

    My function also strips punctuation characters such as . and , from the end of a URL, since I believe they are more often actual punctuation than a legitimate URL ending (though they could be; this is not rigorous science, as other answers explain well). To do that, I apply the following regex to each matched URL: /^(.+?)([.,?!'"]*)$/.
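
    The cleanup step can be sketched on its own as follows. The helper name `splitTrailingPunctuation` and its return shape are hypothetical, introduced only for this example. The lazy first group keeps as little as possible, so the second group collects any run of punctuation at the very end.

```typescript
// Same cleanup pattern as in the answer.
const cleanUrlRegex = /^(.+?)([.,?!'"]*)$/;

// Hypothetical helper (not in the original answer): separates a raw match
// into the URL proper and whatever punctuation trails it.
function splitTrailingPunctuation(rawUrl: string): { url: string; trailing: string } {
    const m = cleanUrlRegex.exec(rawUrl);
    // No match (for example an empty string): return the input unchanged.
    if (!m) return { url: rawUrl, trailing: "" };
    return { url: m[1], trailing: m[2] };
}

console.log(splitTrailingPunctuation("https://example.com/docs,"));
// → { url: 'https://example.com/docs', trailing: ',' }
console.log(splitTrailingPunctuation("https://example.com/a.b"));
// → { url: 'https://example.com/a.b', trailing: '' }
```

    Punctuation inside the URL is left alone; only the run at the very end is split off, which means a URL that genuinely ends in ? or . will lose that character.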

    TypeScript code

        export function urlMatcherInText(inputString: string): UrlMatcherResult[] {
            if (! inputString) return [];
    
            const results: UrlMatcherResult[] = [];
    
            function addText(text: string) {
                if (! text) return;
    
                const result = new UrlMatcherResult();
                result.type = 'text';
                result.value = text;
                results.push(result);
            }
    
            function addUrl(url: string) {
                if (! url) return;
    
                const result = new UrlMatcherResult();
                result.type = 'url';
                result.value = url;
                results.push(result);
            }
    
            const findUrlRegex = /(?:(?:https?:\/\/)|(?:www\.))[^\s]+/g;
            const cleanUrlRegex = /^(.+?)([.,?!'"]*)$/;
    
            let match: RegExpExecArray | null; // exec() returns null once there are no more matches
            let indexOfStartOfString = 0;
    
            do {
                match = findUrlRegex.exec(inputString);
    
                if (match) {
                    // Plain text between the previous URL (or the start) and this match.
                    const text = inputString.substring(indexOfStartOfString, match.index);
                    addText(text);
    
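                    // Split trailing punctuation off the raw match; fall back to the raw match if the cleanup regex fails.
                    const dirtyUrl = match[0];
                    const urlDirtyMatch = cleanUrlRegex.exec(dirtyUrl);
                    addUrl(urlDirtyMatch ? urlDirtyMatch[1] : dirtyUrl);
                    addText(urlDirtyMatch ? urlDirtyMatch[2] : '');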
    
                    indexOfStartOfString = match.index + dirtyUrl.length;
                }
            }
            while (match);
    
            // Whatever follows the last URL is plain text.
            const remainingText = inputString.substring(indexOfStartOfString);
            addText(remainingText);
    
            return results;
        }
    
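        export class UrlMatcherResult {
            // Both fields are assigned right after construction, hence the definite-assignment assertions.
            public type!: 'url' | 'text';
            public value!: string;
        }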
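
    To show how the returned array might be consumed, here is a minimal sketch that turns url/text segments into an HTML string. The `Segment` interface, `escapeHtml`, and `renderSegments` are hypothetical names for this example, and the segments are written by hand rather than produced by the function above. In a framework such as Angular you would more likely loop over the array in a template instead.

```typescript
// Shape of one segment; mirrors the fields of UrlMatcherResult in the answer.
interface Segment {
    type: 'url' | 'text';
    value: string;
}

// Escape characters that are special in HTML text and attribute values.
function escapeHtml(s: string): string {
    return s
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

// Hypothetical renderer: text segments are escaped, url segments become links.
function renderSegments(segments: Segment[]): string {
    return segments
        .map((seg) => {
            const safe = escapeHtml(seg.value);
            if (seg.type !== 'url') return safe;
            // The answer's regex also matches bare "www." URLs, which need a scheme to work as links.
            const href = /^https?:\/\//.test(seg.value) ? safe : 'http://' + safe;
            return `<a href="${href}">${safe}</a>`;
        })
        .join('');
}

// Hand-written segments of the kind the function above is meant to return.
const segments: Segment[] = [
    { type: 'text', value: 'See ' },
    { type: 'url', value: 'www.example.org' },
    { type: 'text', value: '.' },
];

const html = renderSegments(segments);
console.log(html);
// → See <a href="http://www.example.org">www.example.org</a>.
```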
    
