Python package dependency tree

こ雲淡風輕ζ submitted on 2019-12-03 03:45:38

Question


I would like to analyze the dependency tree of Python packages. How can I obtain this data?

Things I already know

  1. setup.py sometimes contains a requires field that lists package dependencies
  2. PyPI is an online repository of Python packages
  3. PyPI has an API

Things that I don't know

  1. Very few projects (around 10%) on PyPI explicitly list dependencies in the requires field, but pip/easy_install still manage to download the correct packages. What am I missing? For example, pandas, the popular library for statistical computing, doesn't list requires but still manages to install numpy, pytz, etc. Is there a better way to automatically collect the full list of dependencies?
  2. Is there a pre-existing database somewhere? Am I repeating existing work?
  3. Do similar, easily accessible databases exist for other languages with distribution systems (R, Clojure, etc.)?

Answer 1:


You should be looking at the install_requires field instead; see "New and changed setup keywords" in the setuptools documentation.

requires is deemed too vague a field to rely on for dependency installation. In addition, there are setup_requires and tests_require fields, which list the dependencies needed to run setup.py itself and to run the tests, respectively.

Certainly, the dependency graph has been analyzed before; a blog article by Olivier Girardot includes a fantastic image of the graph, which is linked to an interactive version.




Answer 2:


Using a tool like pip, you can list all the requirements of each package.

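The command is:

pip install --no-install package_name

Note that the --no-install option was removed in pip 7. On newer versions, pip download package_name fetches a package together with its dependencies without installing them.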

You can reuse part of pip in your script. The part responsible for parsing requirements is the pip.req module (moved to pip._internal.req in pip 10, where it is no longer a supported public API).
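
To illustrate what parsing a requirement involves without depending on pip's internals, here is a minimal sketch using only the standard library. The regular expression and the parse_requirement helper are assumptions made for this example; they handle common strings such as "numpy>=1.7.0" but are not a complete PEP 508 parser.

```python
import re

# Simplified pattern: project name, optional [extras], a version spec, and an
# optional ";" environment marker. This is an illustrative subset only.
_REQUIREMENT = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)"
    r"\s*(?:\[(?P<extras>[^\]]*)\])?"
    r"\s*(?P<spec>[^;]*?)\s*(?:;\s*(?P<marker>.*))?$"
)


def parse_requirement(line):
    """Split a requirement string into name, extras, version spec and marker."""
    match = _REQUIREMENT.match(line)
    if match is None:
        raise ValueError("not a recognisable requirement: %r" % line)
    extras = match.group("extras")
    return {
        "name": match.group("name"),
        "extras": [e.strip() for e in extras.split(",")] if extras else [],
        # Older metadata wraps the spec in parentheses, e.g. "pytz (>=2011k)"
        "spec": match.group("spec").strip("() "),
        "marker": (match.group("marker") or "").strip(),
    }


print(parse_requirement("numpy>=1.7.0"))
print(parse_requirement("requests[security] (>=2.0) ; python_version < '3'"))
```

For anything beyond a quick analysis script, a dedicated requirement parser is likely a safer choice than a hand-written pattern like this one.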




Answer 3:


Here is how you can do it programmatically using the pip Python package:

from pip._vendor import pkg_resources  # the copy of pkg_resources bundled with pip

# Get the dependencies of a package installed in the current environment
package_name = 'Django'
try:
    # by_key is indexed by lower-cased project name; raises KeyError if not installed
    package_resources = pkg_resources.working_set.by_key[package_name.lower()]
    # requires() returns Requirement objects; str() turns each into e.g. 'pytz'
    dependencies = sorted({str(r) for r in package_resources.requires()})
except KeyError:
    dependencies = []
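
On Python 3.8 and later, the standard library can answer the same question without importing from pip. The following is a minimal sketch, assuming importlib.metadata is available; installed_requirements is a hypothetical helper name chosen for this example.

```python
from importlib.metadata import PackageNotFoundError, requires


def installed_requirements(package_name):
    """Return the Requires-Dist strings of an installed distribution.

    Returns an empty list when the package is not installed or declares
    no dependencies.
    """
    try:
        declared = requires(package_name)  # list of strings, or None
    except PackageNotFoundError:
        return []
    return list(declared or [])


print(installed_requirements("Django"))
```

Like the pkg_resources version, this only sees packages installed in the current environment.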

And here is how you can get dependencies from the PyPI API:

import requests

package_name = 'Django'
# Package info URL
PYPI_API_URL = 'https://pypi.python.org/pypi/{package_name}/json'
package_details_url = PYPI_API_URL.format(package_name=package_name)
response = requests.get(package_details_url)
dependencies = []
if response.status_code == 200:
    info = response.json()['info']
    # Collect every dependency-related field; absent or null fields are skipped
    for field in ('requires_dist', 'requires', 'setup_requires',
                  'test_requires', 'install_requires'):
        dependencies.extend(info.get(field) or [])
    dependencies = list(set(dependencies))

You can use recursion to fetch the dependencies of each dependency and build the full tree. Cheers!
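
The recursion could look roughly like the sketch below. It uses only the standard library, and the names fetch_pypi_dependencies and build_tree are hypothetical helpers introduced for this example. It assumes the JSON endpoint at https://pypi.org/pypi/<name>/json, reads only the requires_dist field, skips requirements that apply only to optional extras, and guards against dependency cycles.

```python
import json
import re
import urllib.request


def fetch_pypi_dependencies(package_name):
    """Return the names of a package's direct dependencies from PyPI."""
    url = "https://pypi.org/pypi/%s/json" % package_name
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            info = json.load(response).get("info", {})
    except (OSError, ValueError):  # network/HTTP errors or invalid JSON
        return []
    names = []
    for requirement in info.get("requires_dist") or []:
        if re.search(r"\bextra\s*==", requirement):
            continue  # only needed when an optional extra is requested
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement)
        if match:
            names.append(match.group(0))
    return names


def build_tree(package_name, get_dependencies=fetch_pypi_dependencies, _path=()):
    """Return a nested dict mapping each dependency to its own subtree."""
    key = package_name.lower()
    if key in _path:
        return {}  # already on the current branch: stop to avoid a cycle
    return {
        dep: build_tree(dep, get_dependencies, _path + (key,))
        for dep in get_dependencies(package_name)
    }


if __name__ == "__main__":
    print(json.dumps(build_tree("Django"), indent=2))
```

Passing the lookup function as a parameter makes it easy to swap in a local data source or to wrap the PyPI call in functools.lru_cache, which avoids fetching the same shared dependency repeatedly.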



Source: https://stackoverflow.com/questions/15708723/python-package-dependency-tree
