问题
So what I am doing is iterating over various versions of snippet of code (for e.g. Associations.rb in Rails).
What I want to do is just extract one snippet of the code, for example the has_many method:
def has_many(name, scope = nil, options = {}, &extension)
reflection = Builder::HasMany.build(self, name, scope, options, &extension)
Reflection.add_reflection self, name, reflection
end
At first I was thinking of just searching this entire file for the string def has_many
and then saving everything between that string and end
. The obvious issue with this, is that different versions of this file can have multiple end
strings within the method.
For instance, whatever I come up with for the above snippet, should also work for this one too:
def has_many(association_id, options = {})
validate_options([ :foreign_key, :class_name, :exclusively_dependent, :dependent, :conditions, :order, :finder_sql ], options.keys)
association_name, association_class_name, association_class_primary_key_name =
associate_identification(association_id, options[:class_name], options[:foreign_key])
require_association_class(association_class_name)
if options[:dependent] and options[:exclusively_dependent]
raise ArgumentError, ':dependent and :exclusively_dependent are mutually exclusive options. You may specify one or the other.' # ' ruby-mode
elsif options[:dependent]
module_eval "before_destroy '#{association_name}.each { |o| o.destroy }'"
elsif options[:exclusively_dependent]
module_eval "before_destroy { |record| #{association_class_name}.delete_all(%(#{association_class_primary_key_name} = '\#{record.id}')) }"
end
define_method(association_name) do |*params|
force_reload = params.first unless params.empty?
association = instance_variable_get("@#{association_name}")
if association.nil?
association = HasManyAssociation.new(self,
association_name, association_class_name,
association_class_primary_key_name, options)
instance_variable_set("@#{association_name}", association)
end
association.reload if force_reload
association
end
# deprecated api
deprecated_collection_count_method(association_name)
deprecated_add_association_relation(association_name)
deprecated_remove_association_relation(association_name)
deprecated_has_collection_method(association_name)
deprecated_find_in_collection_method(association_name)
deprecated_find_all_in_collection_method(association_name)
deprecated_create_method(association_name)
deprecated_build_method(association_name)
end
Assuming that each value is stored as text
in some column in my db.
How do I approach this, using Ruby's string methods or should I be approaching this another way?
Edit 1
Please note that this question relates specifically to string manipulation via using a Regex, without a parser.
回答1:
As discussed, this should be done with a parser like Ripper.
However, to answer if it can be done with string methods, I will match the syntax with a regex, provided:
- You can rely on indentation i.e. the string has the exact same characters before
"def"
and before"end"
. - There are no multiline strings in between that could simulate an
"end"
with the same indentation. That includes multine strings, HEREDOC, %{ }, etc.
Code
regex = /^
(\s*) # matches the indentation (we'll backreference later)
def\ +has_many\b # literal "def has_many" with a word boundary
(?:.*+\n)*? # match whole lines - as few as possible
\1 # matches the same indentation as the def line
end\b # literal "end"
/x
subject = %q|
def has_many(name, scope = nil, options = {}, &extension)
if association.nil?
instance_variable_set("@#{association_name}", association)
end
end|
#Print matched text
puts subject.to_enum(:scan,regex).map {$&}
ideone demo
The regex relies on:
- Capturing the whitespace (indentation) with the group
(\s*)
, - followed by the literal
def has_many
. - It then consumes as few lines as it can with
(?:.*+\n)*?
.
Notice that.*+\n
matches a whole line
and(?:
..)*?
repeats it 0 or more times. Also, the last?
makes the repetition lazy (as few as possible).
It will consume lines until it matches the following condition... \1
is a backreference, storing the text matched in (1), i.e. the exact same indentation as the first line.- Followed by
end
obviously.
Test in Rubular
来源:https://stackoverflow.com/questions/38628521/how-do-i-extract-just-a-specific-portion-of-a-code-snippet-from-multiple-files