google-scholar | 易学教程

Google scholar Captcha verification problem

Question: I'm working on a project that requires extracting data from Google Scholar. My PHP program takes a string from my local machine, submits it to Google Scholar as a query, takes the first result from the search results page, and saves it to the database. I have to do this for almost 90 thousand strings/queries. The problem is that after a few hundred entries the program stops because Google Scholar asks for CAPTCHA verification. What can I do about that? Answer 1: Because Google Scholar does

Python: How to access the elements in a generator object and put them in a Pandas dataframe or in a dictionary?

Question: I am using the scholarly module in Python to search for a keyword. I get back a generator object, as follows: import pandas as pd import numpy as np import scholarly search_query = scholarly.search_keyword('Python') print(next(search_query)) {'_filled': False, 'affiliation': 'Juelich Center for Neutron Science', 'citedby': 75900, 'email': '@fz-juelich.de', 'id': 'zWxqzzAAAAAJ', 'interests': ['Physics', 'C++', 'Python'], 'name': 'Gennady Pospelov', 'url_picture': 'https://scholar.google
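The question above asks how to move items out of a generator into a dictionary or a DataFrame. A minimal sketch, assuming each yielded item is a plain dict shaped like the printed output: `fake_search` is a hypothetical stand-in for the generator returned by `scholarly.search_keyword`, used so the example runs without network access, and `take_records` is an illustrative helper name, not part of any library.

```python
from itertools import islice


def fake_search():
    """Hypothetical stand-in for the generator returned by scholarly.search_keyword."""
    yield {"id": "a1", "name": "Author One", "citedby": 10, "interests": ["Python"]}
    yield {"id": "b2", "name": "Author Two", "citedby": 25, "interests": ["C++"]}
    yield {"id": "c3", "name": "Author Three", "citedby": 5, "interests": ["Physics"]}


def take_records(generator, limit):
    """Pull at most `limit` items from the generator into a list of dicts."""
    return list(islice(generator, limit))


# Consume only the first two results; the generator is never fully exhausted.
records = take_records(fake_search(), 2)

# Dictionary keyed by author id, one entry per consumed record.
by_id = {record["id"]: record for record in records}

print(sorted(by_id))
```

A list of dicts in this shape can be passed directly to `pandas.DataFrame(records)`, which yields one row per author and one column per key. Capping the number of items with `islice` also limits how many requests a real search generator would issue.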

Retrieve citations of a journal paper using R

Question: Using R, I want to obtain the list of articles that cite a scientific journal paper. The only information I have is the title of the article, e.g. "Protein measurement with the folin phenol reagent". Can anyone help by producing a reproducible example that I can use? Here is what I have tried so far. The R package fulltext seems useful, because it can retrieve a list of IDs linked to an article. For instance, I can get the article's DOI: library(fulltext) res1 <- ft_search

Scraping large amount of Google Scholar pages with url

Question: I'm trying to get the full author list of all publications by an author on Google Scholar using BeautifulSoup. Since the author's home page only shows a truncated list of authors for each paper, I have to open each paper's link to get the full list. As a result, I run into a CAPTCHA every few attempts. Is there a way to avoid the CAPTCHA (e.g. pausing for 3 seconds after every request)? Or to make the original Google Scholar profile page show the full author list? Answer 1: Recently I faced a similar issue. I at
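The idea of pausing between requests, raised in the question above, can be sketched as follows. This is only an illustration under stated assumptions: `fetch_all`, its parameters, and the 3 to 6 second range are hypothetical choices, and the source does not confirm that any particular delay avoids the CAPTCHA. The `fetch` argument is whatever function actually downloads a page; here a fake one is used so nothing is requested.

```python
import random
import time


def fetch_all(urls, fetch, min_delay=3.0, max_delay=6.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing a random interval between calls."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Randomised pause so requests are not evenly spaced.
            sleep(random.uniform(min_delay, max_delay))
        results.append(fetch(url))
    return results


# Demo with a fake fetcher and a recording "sleep", so the example runs instantly.
pauses = []
pages = fetch_all(
    ["url-1", "url-2", "url-3"],
    fetch=lambda url: f"<html>{url}</html>",
    sleep=pauses.append,
)
print(len(pages), len(pauses))
```

Passing `sleep` as a parameter keeps the throttling logic testable without real waiting; in actual use the default `time.sleep` applies.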

extract text from google scholar

Question: I am trying to extract the text from the text snippet that Google Scholar gives for a particular query. By text snippet I mean the text below the title (in black letters). Currently I am trying to extract it from the HTML file using Python, but it contains a lot of extra text such as /div><div class="gs_fl" ...etc. Is there an easy way, or some code, that can help me get the text without this redundant markup? Answer 1: You need an HTML parser: import lxml.html doc = lxml.html.fromstring(html) text =
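The answer above points to an HTML parser (lxml in the original, which is truncated). As a hedged alternative sketch, the same idea can be shown with Python's standard-library `html.parser` instead of lxml. The class name `SnippetText`, the helper `extract_text`, and the sample markup are illustrative assumptions; only the `gs_fl` class comes from the question, and `gs_rs` is assumed here to mark the snippet container.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class SnippetText(HTMLParser):
    """Collect text, skipping anything inside an element with class `skip_class`."""

    def __init__(self, skip_class="gs_fl"):
        super().__init__()
        self.skip_class = skip_class
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.skip_depth or self.skip_class in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def extract_text(html):
    """Return the visible text with tags removed and whitespace collapsed."""
    parser = SnippetText()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


# Assumed sample markup: a snippet block followed by a footer-links block.
sample = (
    '<div class="gs_rs">A <b>sample</b> snippet<br>about parsing.</div>'
    '<div class="gs_fl"><a href="#">Cited by 12</a></div>'
)
print(extract_text(sample))
```

The depth counter assumes well-formed nesting inside the skipped element; badly broken markup is where a more forgiving parser such as lxml, as suggested in the answer, is likely the better choice.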

Can anybody share a simple example of using Mathematica and Google scholar to extract academic research information

Question: How can I use Mathematica and Google Scholar to find the number of papers a person published in 2011? Answer 1: Google Scholar is not well suited to this goal, as it doesn't have a formal API AFAIK. It also doesn't provide results in a structured (e.g. XML) format. So we have to resort to a quick (and very, very fragile!) text pattern matching hack like: searchGoogleScholarAuthor[author_String] := First[StringCases[ Import["http://scholar.google.com/scholar?start=0&num=1&q=" <> StringDrop[

Get all publications by an author from Google Scholar using scholar.py

Question: I am trying to get all the publications by an author using scholar.py https://github.com/ckreibich/scholar.py But whenever I run the script, the results contain only a fraction of the publications associated with the author. So running: ./scholar.py --author "albert einstein" retrieves only a subset of the 1000+ publications associated with Einstein in Google Scholar. How can I get all of the publications for an author? Source: https://stackoverflow.com/questions/39257172/get-all

Google Scholar: get links for cited papers(not cited by) [closed]

Question: Closed. This question is off-topic. It is not currently accepting answers. Want to improve this question? Update the question so it's on-topic for Stack Overflow. Closed 5 years ago. This may seem like a stupid question, but I have been looking for this for quite some time and haven't found anything helpful. I want to download all papers cited within a given paper. Is there such a feature in Google Scholar? Or even just a page listing links to all the cited papers? Source: https:/