Question
I'm currently building a file system crawler with the following code:
require 'find'
require 'spreadsheet'
Spreadsheet.client_encoding = 'UTF-8'
count = 0
Find.find('/Users/Anconia/crawler/') do |file|
  if file =~ /\b.xls$/ # check if filename ends in desired format
    contents = Spreadsheet.open(file).worksheets
    contents.each do |row|
      if row =~ /regex/
        puts file
        count += 1
      end
    end
  end
end
puts "#{count} files were found"
And am receiving the following output:
0 files were found
The regex is tested and correct - I currently use it in another crawler that works.
The output of row.inspect is
#<Spreadsheet::Excel::Worksheet:0x003ffa5d418538 @row_addresses= @default_format= @selected= @dimensions= @name=Sheet1 @workbook=#<Spreadsheet::Excel::Workbook:0x007ff4bb147140> @rows=[] @columns=[] @links={} @merged_cells=[] @protected=false @password_hash=0 @changes={} @offsets={} @reader=#<Spreadsheet::Excel::Reader:0x007ff4bb1f3b98> @ole=#<Ole::Storage::RangesIOMigrateable:0x007ff4bb126fa8> @offset=15341 @guts={} @rows[3]>
Certainly nothing to iterate over.
Answer 1:
Try this:
content = Spreadsheet.open(file)
sheet = content.worksheet 0
sheet.each do |row|
  ...
Answer 2:
As Diego mentioned, I should have been iterating over the rows of a worksheet rather than over the worksheets themselves. I really appreciate the clarification! Note also that each row must be converted to a string before it is matched against the regex.
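For anyone landing here later, here is a minimal sketch of the row-to-string step. To keep it runnable without the spreadsheet gem, it uses plain nested arrays as stand-ins for worksheets and rows; with the gem, the sheets would come from Spreadsheet.open(file).worksheets instead. The names SAMPLE_SHEETS, row_to_text and count_matching_rows, the sample data, and the /invoice/ pattern are all made up for illustration.

```ruby
# Stand-in data (hypothetical): each sheet is a list of rows,
# and each row is a list of cells. Cells may be nil.
SAMPLE_SHEETS = [
  [["id", "name"], [1, "invoice 2012"], [2, nil]],
  [["notes"], ["see invoice 2013"]]
].freeze

# Hypothetical helper: join a row's cells into one string so the
# whole row can be matched against a regex. nil becomes "".
def row_to_text(row)
  row.map { |cell| cell.to_s }.join(' ')
end

# Hypothetical helper: count the rows, across all sheets, whose
# joined text matches the given pattern.
def count_matching_rows(sheets, pattern)
  sheets.sum do |sheet|
    sheet.count { |row| row_to_text(row).match?(pattern) }
  end
end

puts count_matching_rows(SAMPLE_SHEETS, /invoice/)
```

Matching on the joined string is the simplest option, but it can also match across cell boundaries. If that matters, testing each cell separately with row.any? { |cell| cell.to_s.match?(pattern) } may be a better fit.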
Source: https://stackoverflow.com/questions/14044357/file-system-crawler-iteration-bugs