Regular Expression to mask any string matching 10 digits only in golang

梦想的初衷 submitted on 2021-01-28 11:52:59

Question


Since Go's regexp package does not support lookaheads, I was wondering whether there is any way to create a regex that masks any 10-digit number in a string.

func main() {
    s := "arandomsensitive information: 1234567890 this is not senstive: 1234567890000000"
    re := regexp.MustCompile(`\d{10}`)
    s = re.ReplaceAllString(s, "$1**********$2")
    fmt.Println(s)
}

Is it possible to get output like this: "arandomsensitive information: 1234****** this is not senstive: 1234567890000000"?

Also, is there any regex without lookaheads that I can try?


Answer 1:


If you know the 10-digit number can only appear in between word boundaries - that is, between characters other than letters, digits or underscores - you may use a simple word boundary approach with ReplaceAllString:

\b(\d{4})\d{6}\b

Replace with $1******. See the regex demo online.

The \b(\d{4})\d{6}\b pattern matches a word boundary first, then captures four digits into Group 1, then matches any six digits, and finally requires a word boundary position.

See the Go demo:

package main

import (
    "fmt"
    "regexp"
)

func main() {
    s := "arandomsensitive information: 1234567890 this is not senstive: 1234567890000000"
    re := regexp.MustCompile(`\b(\d{4})\d{6}\b`)
    s = re.ReplaceAllString(s, "$1******")
    fmt.Println(s)
}
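
The word boundary condition above is easy to overlook, so here is a small sketch of where it holds and where it does not. The helper name maskWordBounded and the sample inputs are assumptions made up for illustration; only the pattern and the replacement string come from the answer.

```go
package main

import (
    "fmt"
    "regexp"
)

// wordBounded is the pattern from the answer: keep the first four digits
// (Group 1) and require a word boundary on both sides of the 10 digits.
var wordBounded = regexp.MustCompile(`\b(\d{4})\d{6}\b`)

// maskWordBounded is a hypothetical helper name used only in this sketch.
func maskWordBounded(s string) string {
    return wordBounded.ReplaceAllString(s, "$1******")
}

func main() {
    inputs := []string{
        "id: 1234567890",     // space before, end of string after: masked
        "data=1234567890%2C", // '=' and '%' are non-word characters: masked
        "user1234567890x",    // letters on both sides, no boundary: unchanged
        "ref_1234567890",     // '_' is a word character, no boundary: unchanged
        "1234567890000000",   // 16 digits, no boundary after the 10th: unchanged
    }
    for _, in := range inputs {
        fmt.Printf("%q -> %q\n", in, maskWordBounded(in))
    }
}
```

If the numbers can be glued to letters or underscores, as in the third and fourth inputs, the word boundary pattern is not enough.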

If you need to match the 10-digit number between any non-digit characters, you may use a pattern that matches the neighboring non-digit characters (or the string boundaries) explicitly:

package main

import (
    "fmt"
    "regexp"
)

func main() {
    s := "aspacestrippedstring1234567890buttrailingonehouldnotbematchedastitis20characters12345678901234567890"
    re := regexp.MustCompile(`((?:\D|^)\d{4})\d{6}(\D|$)`)
    fmt.Println(re.ReplaceAllString(s, "$1******$2"))
}

See the Go demo

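NOTE: Since Go's regexp package does not support lookarounds, this pattern cannot mask two 10-digit numbers separated by a single non-digit character in one pass. In a string like 1234567890 1234567891, the separator is consumed as Group 2 of the first match, so it is no longer available as the leading non-digit of the second match. A (?!\d) lookahead would avoid that by checking the next character without consuming it. So, there is no pure single-pass regex way of handling such adjacent matches. However, you may run the regex replacement twice to solve it: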

result := re.ReplaceAllString(re.ReplaceAllString(s, "$1******$2"), "$1******$2")
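
The two-pass idea can be tried with a minimal sketch like the one below. The helper name maskTwice and the variable name tenDigits are assumptions for illustration; the pattern and the replacement string are the ones used in this answer.

```go
package main

import (
    "fmt"
    "regexp"
)

// tenDigits is the pattern from the answer above.
var tenDigits = regexp.MustCompile(`((?:\D|^)\d{4})\d{6}(\D|$)`)

// maskTwice is a hypothetical helper: it applies the same replacement twice,
// so a number skipped in the first pass (because its leading separator was
// consumed by the previous match) is picked up in the second pass.
func maskTwice(s string) string {
    once := tenDigits.ReplaceAllString(s, "$1******$2")
    return tenDigits.ReplaceAllString(once, "$1******$2")
}

func main() {
    s := "1234567890 1234567891"

    // Single pass: the space is consumed by the first match,
    // so the second number is left untouched.
    fmt.Println(tenDigits.ReplaceAllString(s, "$1******$2")) // 1234****** 1234567891

    // Two passes: both numbers are masked.
    fmt.Println(maskTwice(s)) // 1234****** 1234******
}
```

Two passes should be enough even for longer chains of numbers: the first pass masks every other number in the chain, and the second pass masks the ones in between.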

Regex details:

  • ((?:\D|^)\d{4}) - Group 1: any non-digit char or start of string and then any 4 digits
  • \d{6} - any six digits
  • (\D|$) - Group 2: any non-digit or end of string.
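
If the length check does not have to live inside the regex, a different technique from the one in this answer is to match every run of digits with \d+ and decide in Go code whether to mask it, using ReplaceAllStringFunc from the standard regexp package. This is only a sketch under that assumption; the names digitRun and maskExactlyTen are made up for illustration.

```go
package main

import (
    "fmt"
    "regexp"
    "strings"
)

// digitRun matches each maximal run of ASCII digits. The "exactly 10 digits"
// condition is checked in the callback instead of in the pattern.
var digitRun = regexp.MustCompile(`\d+`)

// maskExactlyTen is a hypothetical helper: it keeps the first four digits of
// every run that is exactly 10 digits long and replaces the other six with '*'.
func maskExactlyTen(s string) string {
    return digitRun.ReplaceAllStringFunc(s, func(run string) string {
        if len(run) != 10 {
            return run // shorter or longer runs are returned unchanged
        }
        return run[:4] + strings.Repeat("*", 6)
    })
}

func main() {
    s := "arandomsensitive information: 1234567890 this is not senstive: 1234567890000000"
    fmt.Println(maskExactlyTen(s))
    fmt.Println(maskExactlyTen("1234567890 1234567891"))
}
```

Because no neighboring character is consumed by a match, adjacent numbers are handled in a single pass, at the cost of moving part of the logic out of the pattern.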



Answer 2:


Extending the solution provided by Wiktor: if you need to match the 10-digit number between any non-digit characters, you may use the following regex:

 ((\b|\D)\d{4})\d{6}(\b|\D)

Regex Demo Link

package main

import (
    "fmt"
    "regexp"
)

func main() {
    s := "arandomsensitive information: 1234567890 this is not senstive: 1234567890000000 and 2 sensitiveinfo in url https://someurl?data=1234567890%2C0987654321"
    re := regexp.MustCompile(`((\b|\D)\d{4})\d{6}(\b|\D)`)
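    // $3 restores the character consumed by the trailing (\b|\D), if any;
    // without it, a letter or underscore right after the number would be dropped.
    s = re.ReplaceAllString(s, "$1******$3")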
    fmt.Println(s)
}

The output of the code above:

arandomsensitive information: 1234****** this is not senstive: 1234567890000000 and 2 sensitiveinfo in url https://someurl?data=1234******%2C0987******

Demo Golang Link



Source: https://stackoverflow.com/questions/61852533/regular-expression-to-mask-any-string-matching-10-digits-only-in-golang
