html-parsing | 易学教程

Parse HTML source after login with Java

Question: I've been trying to access a website to parse data for an Android application I am developing, but I am having no luck logging in. The website is https://giffgaff.com/mobile/login. Below is a stripped-down version of the form from that page (HTML): <form action="/mobile/login" method="post"> <input type="hidden" name="login_security_token" value="b22155c7259f402f8e005a771c460670"> <input type="hidden" name="redirect" value="/mobile"> <input type="hidden" name="p_next_page"

Extract URL from HTML using Javascript and use it in a function

Question: I need to extract a URL from HTML. <a rel="nofollow" href="link" class="1"> There is only one such link on the whole page. Then I need to use it as a.href in this function: function changespan() { var spans = document.querySelectorAll('span.image'); for (var i = spans.length; i--; ) { var a = document.createElement('a'); a.href = "http://domain.com"; spans[i].appendChild(a).appendChild(a.previousSibling); } } Answer 1: Try this: function changespan() { var spans = document.querySelectorAll(

Why is this tag empty when parsed with beautiful soup?

Question: I am parsing this page with Beautiful Soup: https://au.finance.yahoo.com/q/is?s=AAPL I am attempting to get the total revenue for 27/09/2014 (42,123,000), which is one of the first values near the top of the statement. I inspected the element in Chrome's developer tools and found that the value is in a table with class name yfnc_tabledata1 . My Python code is as follows: import requests import bs4 #get webpage page = requests.get("https://au.finance.yahoo.com/q/is?s=AAPL") #put into beautiful soup soup =

Web scraping using Selenium and BeautifulSoup: trouble parsing and selecting a button

Question: I am trying to scrape the following website: url='https://angel.co/life-sciences'. The website contains more than 8000 records. From this page I need information such as company name and link, joined date, and followers. Before that, I need to sort the followers column by clicking the button, then load more information by clicking the hidden "more" button. The "more" button can be clicked at most 20 times; after that it doesn't load more information. But I can take only top follower

R Read & Parse HTML to List

Question: I have been trying to read and parse a bit of HTML to obtain a list of conditions for animals at an animal shelter. I'm sure my inexperience with HTML parsing isn't helping, but I seem to be getting nowhere fast. Here's a snippet of the HTML: <select multiple="true" name="asilomarCondition" id="asilomarCondition"> <option value="101"> Behavior- Aggression, Confrontational-Toward People (mild) - TM</option> .... </select> There's only one tag with <select...> and the rest are all <option value

How to parse an HTML file using Jsoup with multiple class-name elements?

Question: The Java code below works fine for an HTML file with a single class, e.g. css-sched-table-title. However, I have multiple class names to find in the HTML file, e.g. css-sched-waypoints and css-sched-times. How do I combine the search using the getElementsByClass method in Jsoup? I don't want to run the search multiple times because I want to preserve the order. What I want is something like doc.getElementsByClass("css-sched-table-title") || doc.getElementsByClass("css-sched-waypoints"); Document doc =

How do I parse HTML with Perl?

Question: I'm new to programming and learning Perl as well. Here is my question: How can I parse the data below in Perl using Perl modules? <h4>This is the line</h4> abc : 130.65 TB<br> dif : 74.52 TB<br> asw : 56.13 TB<br> qwe : 57<br> This is sample data from a webpage, and I want output like abc = 130.65 TB dif = 74.52 TB asw = 56.13 TB qwe = 57 Can anyone please help me? Answer 1: Use an HTML parsing module like HTML::Parser or HTML::TreeBuilder. If you are just trying to extract the text and
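
To make the requested transformation concrete, here is a minimal sketch of the same logic in Python rather than Perl. It is not the answer's method: instead of a parser module it simply splits the fragment on <br>, which is only reasonable for flat fragments like the sample. The function name parse_pairs and the SAMPLE constant are hypothetical names introduced for illustration.

```python
import re
from html import unescape

# Sample fragment copied from the question.
SAMPLE = (
    "<h4>This is the line</h4> abc : 130.65 TB<br> dif : 74.52 TB<br> "
    "asw : 56.13 TB<br> qwe : 57<br>"
)


def parse_pairs(html):
    """Return (key, value) tuples from 'key : value<br>' runs (hypothetical helper)."""
    # Keep only what follows the closing </h4>; if it is absent, keep everything.
    _, _, body = html.rpartition("</h4>")
    pairs = []
    # Each record ends with a <br> tag (also accept <br/> and <BR>).
    for chunk in re.split(r"<br\s*/?>", body, flags=re.IGNORECASE):
        text = unescape(chunk).strip()
        if not text:
            continue
        key, sep, value = text.partition(":")
        if sep:  # skip chunks that contain no colon
            pairs.append((key.strip(), value.strip()))
    return pairs


if __name__ == "__main__":
    for key, value in parse_pairs(SAMPLE):
        print(f"{key} = {value}")
```

For nested or irregular markup, a real HTML parser such as the modules named in the answer remains the safer choice.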

Regex for HTML tags

Question: I'm doing the following: <? $text = preg_replace ("/<p>(.*?)<\/p>/", "$1<br>", "$text"); ?> so that I can get rid of <p> tags and place a space at the end of the string (this is for styling of the page). This works perfectly for "<p>Something</p>". However, with text like: <h3>Section 1.10.32 of "de Finibus Bonorum et Malorum", written by Cicero in 45 BC</h3> <p>"Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab

iOS - How to parse HTML content in iOS?

Question: How do I parse an HTML file? I'm getting an HTML file in the code below, and I just want to get the data inside the BinarySecurityToken XML node. - (void)connectionDidFinishLoading:(NSURLConnection *)connection { if(_data) { // Here I am getting the HTML content shown below NSString* content = [[NSString alloc] initWithData:_data encoding:NSUTF8StringEncoding]; } } <input type="hidden" name="wa" value="wsignin1.0" /> <input type="hidden" name="wresult" value="<t:RequestSecurityTokenResponse xmlns:t="http:/

Scraping different tables with the same classes with BeautifulSoup, Python

Question: I'm trying to extract, using Beautiful Soup and Python, all the odds from this website: http://www.sportstats.com/soccer/italy/serie-a-2013-2014/sampdoria-napoli-zZAT2c14/#odds/1X2/s3 They are divided into different tables depending on which type they are. For example, the first table under the div id="betType_1_2" represents odds of type 1X2 for "full time". I tried to search for all class="odds", but that also returns odds from other tables. Does anyone have an idea how to extract and then scrape only one table
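
One way to read the problem above is that the search for class="odds" needs to be scoped to a single container. The sketch below shows that idea using only Python's standard-library html.parser rather than Beautiful Soup. Only the id betType_1_2 and the class odds come from the question; the sample markup, the second id betType_1_3, and the names ScopedOddsParser and extract_odds are assumptions made for illustration, and the real page structure may differ.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Hypothetical markup: two containers that both hold class="odds" cells.
SAMPLE_HTML = """
<div id="betType_1_2"><table><tr>
  <td class="odds">2.10</td><td class="odds">3.40</td>
</tr></table></div>
<div id="betType_1_3"><table><tr>
  <td class="odds">1.85</td>
</tr></table></div>
"""


class ScopedOddsParser(HTMLParser):
    """Collect text of class="odds" elements inside one container id."""

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0          # nesting depth inside the container (0 = outside)
        self.odds_depth = None  # depth at which the current odds element opened
        self.odds = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth == 0:
            # Outside the container: only the container itself is interesting.
            if attrs.get("id") == self.container_id:
                self.depth = 1
            return
        self.depth += 1
        classes = (attrs.get("class") or "").split()
        if "odds" in classes and self.odds_depth is None:
            self.odds_depth = self.depth
            self.odds.append("")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        if self.odds_depth == self.depth:
            self.odds_depth = None  # the odds element just closed
        self.depth -= 1

    def handle_data(self, data):
        if self.odds_depth is not None:
            self.odds[-1] += data.strip()


def extract_odds(html, container_id):
    parser = ScopedOddsParser(container_id)
    parser.feed(html)
    parser.close()
    return parser.odds


if __name__ == "__main__":
    print(extract_odds(SAMPLE_HTML, "betType_1_2"))  # ['2.10', '3.40']
```

With Beautiful Soup, the same scoping is usually written by first locating the container and then searching within it, for example soup.find(id="betType_1_2").find_all(class_="odds").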