How to ensure replaceAll will replace a whole word and not a subString

执笔经年 2020-12-19 08:16

I have a dictionary as input. I iterate over the dictionary and replace each of its keys wherever it occurs in the text. But replaceAll function replaces the

2 Answers
  • 2020-12-19 09:03

    "\bword\b" is working for me.

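    Sample code:

    // Each row is assumed to hold a (key, value) pair, e.g. ("empid", "5").
    for (row <- df.rdd.collect) {
      val fields = row.mkString(",").split(",")
      val config_key = fields(0)
      val config_value = fields(1)
      val rc_applied_hiveQuery = "select * from emp_details_Spark2 where empid_details= 'empid' limit 10"
      // \b leaves "empid" inside "empid_details" untouched, because '_' is a word character.
      val str_row = rc_applied_hiveQuery.replaceAll("\\b" + config_key + "\\b", config_value)
      println(str_row)
    }
    

    Output (for a row with key empid and value 5): select * from emp_details_Spark2 where empid_details= '5' limit 10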

  • 2020-12-19 09:18

    replaceAll takes a regular expression as its first parameter.

    In regular expressions, word boundaries are written \b (use \\b in a string literal). They're the best way to ensure you're matching a whole word and not part of a word: "\\bword\\b"

    But in your case, you can't use word boundaries as you're not looking for a word ([69-3] isn't a word).
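
    To see the difference, here is a minimal sketch. The class and method names are made up for illustration, and the sample strings are assumptions, not taken from the question.

    ```java
    import java.util.regex.Pattern;

    class BoundaryDemo {
        // Wraps a literal key in \b...\b; Pattern.quote keeps characters like '[' literal.
        static String withWordBoundaries(String text, String key, String replacement) {
            return text.replaceAll("\\b" + Pattern.quote(key) + "\\b", replacement);
        }

        public static void main(String[] args) {
            // A plain word: only the standalone "cat" is replaced, "concatenate" is left alone.
            System.out.println(withWordBoundaries("cat concatenate", "cat", "dog"));
            // "[69-3]" between spaces: there is no word boundary between ' ' and '[',
            // so nothing matches and the text comes back unchanged.
            System.out.println(withWordBoundaries("see [69-3] here", "[69-3]", "X"));
        }
    }
    ```

    \b only matches between a word character and a non-word character, which is why it never fires next to a key that starts and ends with brackets.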

    I suggest this:

    text = text.replaceAll("(?<=\\W|^)" + Pattern.quote("[69-3]") + "(?=\\W|$)", ...
    

    The idea is that the key must be preceded and followed by either a non-word character or a string boundary (start or end). I can't guarantee this is the right solution for you though: such a pattern must be tuned with the exact, complete use case in mind.
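
    Applied to a whole dictionary, that idea could look roughly like the sketch below. It uses negative lookarounds, (?<!\\w) and (?!\\w), which is a variant I'm swapping in here: it means "no word character on either side" and also covers the start and end of the string. KeyReplacer and replaceKeys are hypothetical names, and Matcher.quoteReplacement is there so that a '$' or '\' in a value is not treated as a group reference.

    ```java
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class KeyReplacer {
        // Replaces each key only where it is not glued to a word character on either side.
        static String replaceKeys(String text, Map<String, String> dictionary) {
            String result = text;
            for (Map.Entry<String, String> entry : dictionary.entrySet()) {
                // Pattern.quote makes the key literal, so "[69-3]" is not read as a character class.
                String regex = "(?<!\\w)" + Pattern.quote(entry.getKey()) + "(?!\\w)";
                result = result.replaceAll(regex, Matcher.quoteReplacement(entry.getValue()));
            }
            return result;
        }

        public static void main(String[] args) {
            Map<String, String> dictionary = new LinkedHashMap<>();
            dictionary.put("[69-3]", "X"); // sample entry, assumed for illustration
            // The second occurrence is followed by a digit, so it is left alone.
            System.out.println(replaceKeys("a [69-3] b [69-3]5", dictionary));
        }
    }
    ```

    Whether "followed by a digit" should block the replacement depends on your data, so adjust the lookarounds to your real keys.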

    Note that if all your keys follow a similar pattern, there might be a better solution than iterating through the dictionary: you could for example use a single pattern like "(?<=\\W|^)\\[\\d+\\-\\d+\\](?=\\W|$)" and look up each match in the dictionary.
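
    A rough sketch of that single-pattern approach, assuming every key has the shape [digits-digits]. The class name, the method name and the sample dictionary are all hypothetical, and the delimiters are written as negative lookarounds (no word character on either side). Matches that are not in the dictionary are left unchanged.

    ```java
    import java.util.HashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class SinglePassReplacer {
        // One pattern for every key shaped like [digits-digits], not glued to a word character.
        private static final Pattern KEY = Pattern.compile("(?<!\\w)\\[\\d+-\\d+\\](?!\\w)");

        static String replaceAllKeys(String text, Map<String, String> dictionary) {
            Matcher matcher = KEY.matcher(text);
            StringBuffer out = new StringBuffer();
            while (matcher.find()) {
                // Look the matched key up; fall back to the key itself if it is unknown.
                String value = dictionary.getOrDefault(matcher.group(), matcher.group());
                matcher.appendReplacement(out, Matcher.quoteReplacement(value));
            }
            matcher.appendTail(out);
            return out.toString();
        }

        public static void main(String[] args) {
            Map<String, String> dictionary = new HashMap<>();
            dictionary.put("[69-3]", "A");
            dictionary.put("[7-12]", "B");
            System.out.println(replaceAllKeys("x [69-3] y [7-12] z [1-1]", dictionary));
        }
    }
    ```

    This scans the text once instead of once per key, which may matter if the dictionary is large.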
