Screen scraping: Automating a vim script

Submitted by 筅森魡賤 on 2019-12-10 22:54:15

Question


In vim, I loaded a series of web pages, one at a time, into a buffer (using the netrw plugin) and then parsed the HTML (using the elinks plugin). All good. I then wrote a series of vim scripts using regexes; the final result was a few thousand lines, each formatted correctly as CSV for uploading into a database.

To do that, I had to use vim's marks so that I could loop over specific points in the document and reassemble them into one CSV line. Now I am considering automating this with Perl's "Mechanize" library of classes (UserAgent, etc.).

Questions:

  1. Can vim's ability to "mark" sections of a document (in order to perform substitutions on them) be reproduced in Perl?
  2. It was suggested that I use "elinks" directly, which I take to mean loading the page into a headless browser with elinks and running Perl scripts on the content from there(?)
  3. If that's correct, would elinks become a deployment problem when I migrate the site from my localhost LAMP stack to a hosting company like Bluehost?

Thanks

Edit 1:

TRYING TO MIGRATE KNOWLEDGE FROM VIM TO PERL:

If @flesk (below) is right, then how would I go about performing this routine (written in vim) that "marks" lines in a text file ("i" and "j") and then uses that as a range ('i,'j) to perform the last two substitutions?

:g/^\s*\h/d|let@"=substitute(@"[:-2],'\s\+and\s\+',',','')|ki|/\n\s*\h\|\%$/kj|
\   'i,'js/^\s*\(\d\+\)\s\+-\s\+The/\=@".','.submatch(1).','/|'i,'js/\s\+//g

I am not seeing this capability in the perldoc perlre manual. Am I missing a module, or some basic Perl understanding of m// or qr//?
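For illustration, a vim mark is only a remembered line position, so in Perl it can be modeled as an array index rather than as anything in perlre. The sketch below is one possible reading of the routine above, not a verified translation: it treats each line starting with a letter or underscore (vim's \h) as a header, uses the indices $i and $j in place of the marks 'i and 'j, and applies the same two substitutions to that range. The sub name block_to_csv and the sample lines are hypothetical, since the real input is not shown.

```perl
use strict;
use warnings;

# Hypothetical helper: mirrors the vim routine with array indices as "marks".
sub block_to_csv {
    my @lines = @_;
    my $is_header = qr/^\s*[A-Za-z_]/;    # vim: ^\s*\h
    my @csv;
    my $n = 0;
    while ($n < @lines) {
        # Skip anything before the first header line.
        if ($lines[$n] !~ $is_header) { $n++; next; }

        # Like substitute(@", '\s\+and\s\+', ',', ''): first match only.
        (my $header = $lines[$n]) =~ s/\s+and\s+/,/;

        # "Mark i": first line after the header.
        my $i = $n + 1;
        # "Mark j": last line before the next header, or the last line of input.
        my $j = $n;
        $j++ while $j < $#lines && $lines[$j + 1] !~ $is_header;

        # The range 'i,'j becomes an array slice (empty if $i > $j).
        for my $line (@lines[$i .. $j]) {
            (my $row = $line) =~ s/^\s*(\d+)\s+-\s+The/$header,$1,/;
            $row =~ s/\s+//g;             # strips ALL whitespace, as the vim version does
            push @csv, $row;
        }
        $n = $j + 1;                      # continue at the next header
    }
    return @csv;
}

# Assumed sample data, invented for this sketch.
my @sample = (
    "Smith and Jones",
    "  12 - The First Item",
    "  13 - The Second Item",
    "Brown and White",
    "  7 - The Third Item",
);
print "$_\n" for block_to_csv(@sample);
```

Because the header is carried in a variable, the same result could also be had in a single pass without computing $j at all; the explicit indices are kept here only to show the correspondence with the marks.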


Answer 1:


I'm sure all you need is some kind of HTML parser. For example, I use HTML::TreeBuilder::XPath.
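A minimal sketch of that approach, assuming the data of interest sits in table rows. The markup, the "items" id, and the sub name table_to_csv are all made up for illustration, since the real pages are not shown. HTML::TreeBuilder::XPath is a CPAN module and has to be installed separately.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;    # CPAN module, not part of core Perl

# Hypothetical helper: turn each <tr> of an assumed table into one CSV line.
sub table_to_csv {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse_content($html);

    my @rows;
    for my $tr ($tree->findnodes('//table[@id="items"]//tr')) {
        # Text of each cell, with surrounding whitespace trimmed.
        my @cells = map { $_->as_trimmed_text } $tr->findnodes('./td');
        push @rows, join(',', @cells) if @cells;
    }
    $tree->delete;               # release the parse tree
    return @rows;
}

# Assumed sample markup, invented for this sketch.
my $html = <<'HTML';
<table id="items">
  <tr><td>12</td><td>First Item</td></tr>
  <tr><td>13</td><td>Second Item</td></tr>
</table>
HTML

print "$_\n" for table_to_csv($html);
```

Selecting nodes by XPath replaces the mark-and-substitute step entirely, because each row is already a separate node. Note that the plain join does no quoting, so cells that themselves contain commas would need proper CSV escaping.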



Source: https://stackoverflow.com/questions/8887207/screen-scraping-automating-a-vim-script
