MongoDB - strip non numeric characters in field

坚强是说给别人听的谎言 提交于 2020-05-11 06:11:08

问题


I have a field of phone numbers where a random variety of separators have been used, such as:

932-555-1515
951.555.1255
(952) 555-1414

I would like to go through each field that already exists and remove the non numeric characters.

Is that possible?

Whether or not it gets stored as an integer or as a string of numbers, I don't care either way. It will only be used for display purposes.


回答1:


You'll have to iterate over all your docs in code and use a regex replace to clean up the strings.

Here's how you'd do it in the mongo shell for a test collection with a phone field that needs to be cleaned up.

db.test.find().forEach(function(doc) {
  doc.phone = doc.phone.replace(/[^0-9]/g, ''); 
  db.test.save(doc);
});



回答2:


Based on the previous example by @JohnnyHK, I added regex also to the find query:

/*
MongoDB: Find by regular expression and run regex replace on results
*/
db.test.find({"url": { $regex: 'http:\/\/' }}).forEach(function(doc) {
  doc.url = doc.url.replace(/http:\/\/www\.url\.com/g, 'http://another.url.com'); 
  db.test.save(doc);
});



回答3:


Starting in Mongo 4.4, the $function aggregation operator allows applying a custom javascript function to implement behaviour not supported by the MongoDB Query Language.

And coupled with improvements made to db.collection.update() in Mongo 4.2 that can accept an aggregation pipeline, allowing the update of a field based on its own value,

We can manipulate and update a field in ways the language doesn't easily permit and avoid an inefficient find/foreach pattern:

// { "x" : "932-555-1515",   "y" : 3 }
// { "x" : "951.555.1255",   "y" : 7 }
// { "x" : "(952) 555-1414", "y" : 6 }
db.collection.updateMany(
  { "x": { $regex: /[^0-9]/g } },
  [{ $set:
    { "x":
      { $function: {
          body: function(x) { return x.replace(/[^0-9]/g, ''); },
          args: ["$x"],
          lang: "js"
      }}
    }
  }
])
// { "x" : "9325551515", "y" : 3 }
// { "x" : "9515551255", "y" : 7 }
// { "x" : "9525551414", "y" : 6 }

The update consist of:

  • a match query { "x": { $regex: /[^0-9]/g } }, filtering documents to update (in our case any document that contains non-numeric characters in the field we're interested on updating).

  • an update aggreation pipeline [ { $set: { active: { $eq: [ "$a", "Hello" ] } } } ] (note the squared brackets signifying the use of an aggregation pipeline). $set is a new aggregation operator and an alias for $addFields.

$function takes 3 parameters:

  • body, which is the function to apply, whose parameter is the string to modify. The function here simply consists in replacing characters matching the regex with empty characters.
  • args, which contains the fields from the record that the body function takes as parameter. In our case, "$x".
  • lang, which is the language in which the body function is written. Only js is currently available.


来源:https://stackoverflow.com/questions/13005269/mongodb-strip-non-numeric-characters-in-field

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!