Parsing HTML table data with XPath and Selenium in Java

I want to take the data and organize it without the tags. It looks something like this

4 Answers

面向向阳花 (2020-12-10 10:18)

This will probably suit your needs:

String text = driver.findElement(By.cssSelector("table.SpecTable")).getText();


The String text will contain the text of every node in the table with class SpecTable.
I prefer CSS selectors because they are supported by IE and are faster than XPath. As for XPath tutorials, try this and this.
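
The single string returned by getText() usually needs to be split back into rows before it can be organized. The following is a minimal sketch of that step, assuming the browser renders each table row on its own line (common, but not guaranteed). The class TableText, the method rowsOf, and the sample text are hypothetical and are not part of Selenium.

```java
import java.util.ArrayList;
import java.util.List;

public class TableText {
    // Splits the text returned by WebElement.getText() into one entry per non-blank line.
    // Assumption: each table row is rendered on its own line.
    public static List<String> rowsOf(String tableText) {
        List<String> rows = new ArrayList<>();
        for (String line : tableText.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                rows.add(trimmed);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for
        // driver.findElement(By.cssSelector("table.SpecTable")).getText()
        String text = "Weight 2 kg\n\nColor Black\n";
        for (String row : rowsOf(text)) {
            System.out.println(row);
        }
    }
}
```

Splitting each row further into cells is less reliable with this approach, because the cell boundaries are not preserved in the rendered text.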
梦如初夏 (2020-12-10 10:22)

The spec is, surprisingly, a very good read on XPath.

You might also try CSS selectors.

Anyway, one way to get the data from a table is as follows:

// get all rows of the table
List<WebElement> rows = driver.findElements(By.xpath("//table[@class='SpecTable']//tr"));
// for every row, read both columns
// (a header row that uses <th> has no <td>, so findElement would throw NoSuchElementException)
for (WebElement row : rows) {
    WebElement key = row.findElement(By.xpath("./td[1]"));
    doAnythingWithText(key.getText());
    WebElement val = row.findElement(By.xpath("./td[2]"));
    doAnythingWithText(val.getText());
}
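
To experiment with these XPath expressions without starting a browser, the same paths can be evaluated with the JDK's built-in javax.xml.xpath package. This is only a sketch: it swaps Selenium for the JDK XPath engine, it requires well-formed markup (real-world HTML often is not), and the class SpecTableXPath, the method readPairs, and the sample table are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SpecTableXPath {
    // Returns one {key, value} pair per table row, using the same
    // XPath expressions as the Selenium loop.
    public static List<String[]> readPairs(String xhtml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String[]> pairs = new ArrayList<>();
        try {
            NodeList rows = (NodeList) xpath.evaluate(
                    "//table[@class='SpecTable']//tr",
                    new InputSource(new StringReader(xhtml)),
                    XPathConstants.NODESET);
            for (int i = 0; i < rows.getLength(); i++) {
                Node row = rows.item(i);
                // Relative paths are evaluated against the current row node.
                String key = xpath.evaluate("./td[1]", row).trim();
                String val = xpath.evaluate("./td[2]", row).trim();
                pairs.add(new String[] {key, val});
            }
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hypothetical, well-formed sample markup.
        String xhtml = "<html><body><table class='SpecTable'>"
                + "<tr><td>Weight</td><td>2 kg</td></tr>"
                + "<tr><td>Color</td><td>Black</td></tr>"
                + "</table></body></html>";
        for (String[] pair : readPairs(xhtml)) {
            System.out.println(pair[0] + " = " + pair[1]);
        }
    }
}
```

Unlike Selenium's findElement, the string form of evaluate returns an empty string when a path matches nothing, so a row without td cells yields an empty pair instead of an exception.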

自闭症患者 (2020-12-10 10:33)

A C# method to extract any table into a two-dimensional array:

private string[,] getYourSpecTable(){
    return getArrayBy(By.CssSelector("table.SpecTable tr"), By.CssSelector("td"));
}

// Requires: using System.Collections.ObjectModel; using OpenQA.Selenium;
private string[,] getArrayBy(By rowsBy, By columnsBy){
    int nbRow=0, nbCol=0;
    string[,] ret = null;
    ReadOnlyCollection<OpenQA.Selenium.IWebElement> rows = this.webDriver.FindElements(rowsBy);
    nbRow = rows.Count;
    for(int r=0;r<nbRow;r++) {
        ReadOnlyCollection<OpenQA.Selenium.IWebElement> cols = rows[r].FindElements(columnsBy);
        // Size the array from the first row that has matching cells
        // (a header row that uses <th> has none).
        if(ret == null && cols.Count > 0) {
            nbCol = cols.Count;
            ret = new string[nbRow, nbCol];
        }
        // Guard against rows with fewer cells than the first data row.
        for(int c=0;c<nbCol && c<cols.Count;c++) {
            ret[r, c] = cols[c].Text;
        }
    }
    // null when no row contains a matching cell
    return ret;
}
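
Since the question asks about Java, the array-filling step can be sketched in Java as well. The sketch below assumes the cell texts have already been read (for example with WebElement.getText()) into a list of rows; it sizes the array by the widest row so that ragged rows do not cause an out-of-range access. The class TableArray, the method toArray, and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class TableArray {
    // Copies ragged rows of cell text into a rectangular array sized by the widest row.
    // Cells missing from shorter rows are left as null.
    public static String[][] toArray(List<List<String>> rows) {
        int nbCol = 0;
        for (List<String> row : rows) {
            nbCol = Math.max(nbCol, row.size());
        }
        String[][] ret = new String[rows.size()][nbCol];
        for (int r = 0; r < rows.size(); r++) {
            List<String> row = rows.get(r);
            for (int c = 0; c < row.size(); c++) {
                ret[r][c] = row.get(c);
            }
        }
        return ret;
    }

    public static void main(String[] args) {
        // Hypothetical cell texts; with Selenium they would come from WebElement.getText().
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Weight", "2 kg"),
                Arrays.asList("Color"));
        System.out.println(Arrays.deepToString(toArray(rows)));
    }
}
```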

野趣味 (2020-12-10 10:36)

As another option, you could grab all the cells of the table into one collection and access them that way.
For example (C#):

ReadOnlyCollection<IWebElement> Cells = driver.FindElements(By.XPath("//table[@class='SpecTable']//tr//td"));


This gets you all the cells in that table as a single flat collection, which you can then index or loop over to read the text.

string forOutput = Cells[i].Text;
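
A flat list of cells loses the row boundaries, so some index arithmetic is needed to get back to rows and columns. Here is a minimal Java sketch of that arithmetic, assuming the cells arrive in row-major order and every row has the same number of cells (no colspan or rowspan). The class FlatCells, its methods, and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlatCells {
    // Returns the text at (row, col) in a flat, row-major list of cell texts.
    // Assumption: every row has exactly `columns` cells.
    public static String cellAt(List<String> cells, int columns, int row, int col) {
        return cells.get(row * columns + col);
    }

    // Groups a flat list of cell texts into rows of `columns` cells each.
    public static List<List<String>> toRows(List<String> cells, int columns) {
        if (columns <= 0) {
            throw new IllegalArgumentException("columns must be positive");
        }
        List<List<String>> rows = new ArrayList<>();
        for (int start = 0; start < cells.size(); start += columns) {
            int end = Math.min(start + columns, cells.size());
            rows.add(new ArrayList<>(cells.subList(start, end)));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical cell texts, as they might be read from each cell in turn.
        List<String> cells = Arrays.asList("Weight", "2 kg", "Color", "Black");
        System.out.println(cellAt(cells, 2, 1, 0));
        System.out.println(toRows(cells, 2));
    }
}
```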
