Question
I'm trying to build a simple method to look at about 100 entries in a database for a last name and pull out all the ones that match above a specific percentage of letters. My current approach is:
- Pull all 100 entries from the database into an array
- Iterate through them while performing the following action
- Split the last name into an array of letters
- Subtract that array from another array that contains the letters for the name I am trying to match which leaves only the letters that weren't matched.
- Take the size of the result and divide by the original size of the array from step 3 to get a percentage.
- If the percentage is above a predefined threshold, push that database object into a results array.
This works, but I feel like there must be some cool Ruby/regex/ActiveRecord method of doing this more efficiently. I have googled quite a bit but can't find anything.
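The steps listed above can be sketched roughly as follows. This is only one reading of the description: it assumes the leftover letters are divided by the target's letter count and that a name is kept when the matched share clears the threshold. `Person`, `letter_match_share` and the threshold value are hypothetical stand-ins, and a plain array replaces the database query.

```ruby
# Hypothetical stand-in for database rows; a real app would load records instead.
Person = Struct.new(:last_name)

# Share of the target's letters that also occur in last_name.
# Array#- removes every occurrence of a matched letter, so repeats are not counted.
def letter_match_share(last_name, target)
  name_letters   = last_name.downcase.chars
  target_letters = target.downcase.chars
  unmatched      = target_letters - name_letters
  1.0 - unmatched.size / target_letters.size.to_f
end

people    = %w[Bond Bind Smith].map { |n| Person.new(n) }
threshold = 0.7 # assumed cutoff

results = people.select { |p| letter_match_share(p.last_name, "Bond") >= threshold }
results.map(&:last_name) #=> ["Bond", "Bind"]
```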
Answer 1:
To comment on the merit of the measure you suggested would require speculation, which is out-of-bounds at SO. I therefore will merely demonstrate how you might implement your proposed approach.
Code
First define a helper method:
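class Array
  # Multiset difference: removes one occurrence from self for each matching
  # occurrence in other. Note that this overrides the built-in
  # Array#difference of Ruby 2.6+, which behaves like Array#-.
  def difference(other)
    # Count how many times each element occurs in other.
    h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
    # Reject an element only while its count lasts, decrementing as we go.
    reject { |e| h[e] > 0 && h[e] -= 1 }
  end
end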
In short, if
a = [3,1,2,3,4,3,2,2,4]
b = [2,3,4,4,3,4]
then
a - b #=> [1]
whereas
a.difference(b) #=> [1, 3, 2, 2]
This method is elaborated in my answer to this SO question. I've found so many uses for it that I've proposed it be added to the Ruby Core.
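For readers who would rather not reopen Array, the same behaviour can be sketched as a standalone method. The name `multiset_difference` is hypothetical and not part of Ruby; the logic mirrors the helper above, and the letter example shows why repeats matter when comparing names.

```ruby
# Hypothetical standalone version of the helper; does not modify Array.
def multiset_difference(a, b)
  # Count how many times each element occurs in b.
  counts = b.each_with_object(Hash.new(0)) { |e, h| h[e] += 1 }
  a.reject do |e|
    next false if counts[e].zero? # nothing left in b to cancel this element
    counts[e] -= 1                # use up one occurrence from b
    true
  end
end

a = [3, 1, 2, 3, 4, 3, 2, 2, 4]
b = [2, 3, 4, 4, 3, 4]
multiset_difference(a, b)                      #=> [1, 3, 2, 2]

# With letters: Array#- drops both m's, the multiset version keeps one.
"jimmy".chars - "jim".chars                    #=> ["y"]
multiset_difference("jimmy".chars, "jim".chars) #=> ["m", "y"]
```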
The following method returns a hash whose keys are the elements of names (strings) and whose values are the fractions of the letters in target, counting repeated letters separately, that are also present in each name.
def target_fractions(names, target)
  target_arr = target.downcase.scan(/[a-z]/)
  target_size = target_arr.size
  names.each_with_object({}) do |s,h|
    s_arr = s.downcase.scan(/[a-z]/)
    target_remaining = target_arr.difference(s_arr)
    h[s] = (target_size-target_remaining.size)/target_size.to_f
  end
end
Example
If
target = "Jimmy S. Bond"
and the names you are comparing are given by
names = ["Jill Dandy", "Boomer Asad", "Josefine Simbad"]
then
target_fractions(names, target)
#=> {"Jill Dandy"=>0.5, "Boomer Asad"=>0.5, "Josefine Simbad"=>0.8}
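To get from these fractions to the list of matches asked for in the question, the hash can be filtered against a cutoff. A minimal sketch, with the hash copied from the result above and `threshold` as an assumed value:

```ruby
# Result of target_fractions(names, target) from the example above.
fractions = { "Jill Dandy" => 0.5, "Boomer Asad" => 0.5, "Josefine Simbad" => 0.8 }

threshold = 0.6 # assumed cutoff; pick whatever suits the data

# Keep only the names whose fraction meets the cutoff.
matches = fractions.select { |_name, fraction| fraction >= threshold }.keys
#=> ["Josefine Simbad"]
```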
Explanation
For the above values of names and target,
target_arr = target.downcase.scan(/[a-z]/)
#=> ["j", "i", "m", "m", "y", "s", "b", "o", "n", "d"]
target_size = target_arr.size
#=> 10
Now consider
s = "Jill Dandy"
h = {}
then
s_arr = s.downcase.scan(/[a-z]/)
#=> ["j", "i", "l", "l", "d", "a", "n", "d", "y"]
target_remaining = target_arr.difference(s_arr)
#=> ["m", "m", "s", "b", "o"]
h[s] = (target_size-target_remaining.size)/target_size.to_f
#=> (10-5)/10.0 => 0.5
h #=> {"Jill Dandy"=>0.5}
The calculations are similar for Boomer and Josefine.
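One way to check all three figures without the Array helper is to compare letter counts directly: for each letter of the target, the number matched is the smaller of its count in the target and its count in the name. This is a different formulation from the method above, not the same code, and `matched_fraction` is a hypothetical name. It relies on `tally`, which requires Ruby 2.7 or later.

```ruby
# Hypothetical cross-check using letter counts (Enumerable#tally, Ruby 2.7+).
def matched_fraction(name, target)
  target_counts = target.downcase.scan(/[a-z]/).tally
  name_counts   = name.downcase.scan(/[a-z]/).tally
  # For each target letter, at most as many copies match as the name contains.
  matched = target_counts.sum { |letter, count| [count, name_counts.fetch(letter, 0)].min }
  matched / target_counts.values.sum.to_f
end

target = "Jimmy S. Bond"
matched_fraction("Jill Dandy", target)      #=> 0.5
matched_fraction("Boomer Asad", target)     #=> 0.5
matched_fraction("Josefine Simbad", target) #=> 0.8
```

Note that a target containing no letters makes the divisor zero, so the result is NaN; a guard would be needed if that can occur.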
Source: https://stackoverflow.com/questions/40078385/how-can-i-generate-a-percentage-for-a-regex-string-match-in-ruby