Remove umlauts or special characters in a JavaScript string

Submitted by 一笑奈何 on 2019-12-10 09:54:52

Question


I have never worked with umlauts or special characters in JavaScript strings before. My problem is: how do I remove them?

For example, I have this in JavaScript:

var oldstr = "Bayern München";
var str = oldstr.split(' ').join('-');

The result is Bayern-München, which was easy. But now I also want to remove the umlaut or special character, as in:

Real Sporting de Gijón.

How can I achieve this?

Kind regards,

Frank


Answer 1:


replace should be able to do it for you, e.g.:

var str = str.replace(/ü/g, 'u');

...of course ü and u are not the same letter. :-)

If you're trying to replace all characters outside a given range with something (like a -), you can do that by specifying a range:

var str = str.replace(/[^A-Za-z0-9\-_]/g, '-');

That replaces all characters that aren't English letters, digits, -, or _ with -. (The character range is the [...] bit, the ^ at the beginning means "not".)
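
To make the effect of that character class concrete, here is a minimal sketch that wraps the same replace call in a function and runs it on the two names from the question. The function name replaceOutsideRange is made up for illustration, and the accented letters are written as Unicode escapes only to sidestep file-encoding issues.

```javascript
// Hypothetical helper name; it only wraps the replace call shown above.
function replaceOutsideRange(input) {
  // Anything that is not A-Z, a-z, 0-9, '-' or '_' becomes '-'.
  return input.replace(/[^A-Za-z0-9\-_]/g, '-');
}

console.log(replaceOutsideRange("Bayern M\u00fcnchen"));         // Bayern-M-nchen
console.log(replaceOutsideRange("Real Sporting de Gij\u00f3n")); // Real-Sporting-de-Gij-n
```

Both the space and the accented letter fall outside the allowed range, so both end up as a hyphen.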

But that ("Bayern-M-nchen") may be a bit unpleasant for Mr. München to look at. :-) You could use a function passed into replace to try to just drop diacriticals:

var str = str.replace(/[^A-Za-z0-9\-_]/g, function(ch) {
  // Characters that look a bit like 'a'
  if ("áàâä".indexOf(ch) >= 0) { // There are a lot more than this
    return 'a';
  }
  // Characters that look a bit like 'u'
  if ("úùûü".indexOf(ch) >= 0) { // There are a lot more than this
    return 'u';
  }
  /* ...long list of others...*/
  // Default
  return '-';
});
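
One way the "long list of others" might be organized is as a lookup table instead of a chain of if statements. The following is only a sketch: the table covers a handful of lowercase vowels, the names BASE_LETTERS, LOOKUP and slugify are invented for this example, and a real table would need many more entries (uppercase letters, ñ, ç, and so on).

```javascript
// Partial, illustrative mapping; a real table would need many more entries.
const BASE_LETTERS = {
  a: "\u00e1\u00e0\u00e2\u00e4", // á à â ä
  e: "\u00e9\u00e8\u00ea\u00eb", // é è ê ë
  i: "\u00ed\u00ec\u00ee\u00ef", // í ì î ï
  o: "\u00f3\u00f2\u00f4\u00f6", // ó ò ô ö
  u: "\u00fa\u00f9\u00fb\u00fc"  // ú ù û ü
};

// Invert the table once: accented character -> base letter.
const LOOKUP = {};
for (const base of Object.keys(BASE_LETTERS)) {
  for (const ch of BASE_LETTERS[base]) {
    LOOKUP[ch] = base;
  }
}

// Hypothetical helper name.
function slugify(input) {
  return input.replace(/[^A-Za-z0-9\-_]/g, function (ch) {
    // Known accented letter: use its base letter; anything else becomes '-'.
    return Object.prototype.hasOwnProperty.call(LOOKUP, ch) ? LOOKUP[ch] : '-';
  });
}

console.log(slugify("Bayern M\u00fcnchen"));         // Bayern-Munchen
console.log(slugify("Real Sporting de Gij\u00f3n")); // Real-Sporting-de-Gijon
```

Characters that are missing from the table, such as ß in this sketch, still fall through to the default and become a hyphen.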

The above is optimized for long strings. If the string itself is short, you may be better off with repeated regexps:

var str = str.replace(/[áàâä]/g, 'a')
             .replace(/[úùûü]/g, 'u')
             .replace(/[^A-Za-z0-9\-_]/g, '-');

...but that's speculative.

Note that literal characters in JavaScript strings are totally fine, but you can run into fun with the encoding of files. I tend to stick to Unicode escapes. So for instance, the above would be:

var str = str.replace(/[\u00e4\u00e2\u00e0\u00e1]/g, 'a')
             .replace(/[\u00fc\u00fb\u00f9\u00fa]/g, 'u')
             .replace(/[^A-Za-z0-9\-_]/g, '-');

...but again, there are a lot more to do...
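
A different technique, which is not the approach shown in this answer, may avoid maintaining that list by hand. It assumes an environment that supports String.prototype.normalize (ES2015 and later): decompose the string with NFD, then strip the combining marks. The function names stripDiacritics and toSlug are made up for this sketch.

```javascript
// Hypothetical helper names; this swaps the hand-written list for
// Unicode normalization, which is not what the answer above uses.
function stripDiacritics(input) {
  // NFD splits "ü" into "u" + U+0308 (combining diaeresis);
  // the regex then removes combining marks in U+0300..U+036F.
  return input.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

function toSlug(input) {
  // After stripping marks, replace whatever is still outside the range.
  return stripDiacritics(input).replace(/[^A-Za-z0-9\-_]/g, '-');
}

console.log(toSlug("Bayern M\u00fcnchen"));         // Bayern-Munchen
console.log(toSlug("Real Sporting de Gij\u00f3n")); // Real-Sporting-de-Gijon
```

Letters that have no decomposition into a base letter plus a combining mark, such as ß or ø, are left unchanged by normalize and therefore still end up as a hyphen here, so some explicit replacements may be needed alongside it.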



Source: https://stackoverflow.com/questions/4804885/remove-umlauts-or-specialchars-in-javascript-string
