I'm trying to audit a Python project with a large number of dependencies and while I can manually look up each project's homepage/license terms, it seems like most OSS pac
A slightly better version for those running Jupyter: it uses only packages included in the Anaconda defaults, so nothing needs to be installed.
import pkg_resources
import pandas as pd

def get_pkg_license(pkg):
    # dist-info installs keep their metadata in METADATA; egg-info installs use PKG-INFO.
    try:
        lines = pkg.get_metadata_lines('METADATA')
    except Exception:
        lines = pkg.get_metadata_lines('PKG-INFO')
    for line in lines:
        if line.startswith('License:'):
            return line[len('License:'):].strip()
    return '(Licence not found)'

def print_packages_and_licenses():
    table = []
    for pkg in sorted(pkg_resources.working_set, key=lambda x: str(x).lower()):
        table.append([pkg.project_name, pkg.version, get_pkg_license(pkg)])
    df = pd.DataFrame(table, columns=['Package', 'Version', 'License'])
    return df

# As the last expression in a Jupyter cell, this displays the DataFrame.
print_packages_and_licenses()
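Since pkg_resources is deprecated in recent setuptools releases, here is a rough sketch of the same lookup using the standard library's importlib.metadata (Python 3.8+). The names license_of and installed_licenses are my own, and falling back to the "License ::" trove classifiers when the License field is empty or UNKNOWN is an assumption about what is useful, not something the snippet above does.

```python
from importlib import metadata


def license_of(dist):
    """Return a short license string for one distribution (hypothetical helper)."""
    meta = dist.metadata
    lic = meta.get("License")
    if lic and lic.strip().upper() != "UNKNOWN":
        # Some packages paste the full license text here; keep the first line only.
        return lic.strip().splitlines()[0]
    # Assumed fallback: use the last segment of any "License ::" classifiers.
    classifiers = [
        c.split(" :: ")[-1]
        for c in (meta.get_all("Classifier") or [])
        if c.startswith("License ::")
    ]
    return "; ".join(classifiers) or "(Licence not found)"


def installed_licenses():
    """Return sorted (name, version, license) tuples for installed distributions."""
    rows = []
    for dist in metadata.distributions():
        meta = dist.metadata
        if meta is None:  # skip installs with unreadable metadata
            continue
        name = meta.get("Name") or "(unknown)"
        version = meta.get("Version") or ""
        rows.append((name, version, license_of(dist)))
    return sorted(rows, key=lambda row: row[0].lower())


if __name__ == "__main__":
    for name, version, lic in installed_licenses():
        print(name, version, lic, sep="\t")
```

The rows can be passed to pd.DataFrame(rows, columns=['Package', 'Version', 'License']) if you want the same table as above.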
Here is a way to do this with yolk3k, a command-line tool for querying PyPI and Python packages installed on your system:
pip install yolk3k
yolk -l -f license
# -l lists all installed packages
# -f shows specific metadata fields (in this case, License)
I found several ideas from the answers and comments for this question to be relevant and wrote a short script for generating the license information for the applicable virtualenv:
import pkg_resources
import copy

def get_packages_info():
    # Map metadata field names to the keys used in the output.
    KEY_MAP = {
        "Name": 'name',
        "Version": 'version',
        "License": 'license',
    }
    empty_info = {}
    for key, name in KEY_MAP.items():
        empty_info[name] = ""

    packages = pkg_resources.working_set.by_key
    infos = []
    for pkg_name, pkg in packages.items():
        info = copy.deepcopy(empty_info)
        try:
            lines = pkg.get_metadata_lines('METADATA')
        except (KeyError, IOError):
            lines = pkg.get_metadata_lines('PKG-INFO')
        for line in lines:
            try:
                key, value = line.split(': ', 1)
                if key in KEY_MAP:
                    info[KEY_MAP[key]] = value
            except ValueError:
                # Not a "Key: value" line; skip it.
                pass
        infos += [info]

    # Build a CSV string, sorted case-insensitively by package name.
    return "name,version,license\n%s" % "\n".join(['"%s","%s","%s"' % (info['name'], info['version'], info['license']) for info in sorted(infos, key=(lambda item: item['name'].lower()))])
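One caveat with building the CSV by hand as above: a license value that itself contains a double quote would produce a malformed row. A possible alternative, sketched here with the standard csv module, lets the writer handle the quoting. The function name infos_to_csv is hypothetical, and it assumes infos is the same list of name/version/license dicts collected in the loop above.

```python
import csv
import io


def infos_to_csv(infos):
    """Serialize a list of {'name', 'version', 'license'} dicts to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["name", "version", "license"],
        quoting=csv.QUOTE_ALL,   # quote every field, like the hand-built rows
        lineterminator="\n",
    )
    writer.writeheader()
    # Same ordering as the original: case-insensitive sort by package name.
    for info in sorted(infos, key=lambda item: item["name"].lower()):
        writer.writerow(info)
    return buf.getvalue()


if __name__ == "__main__":
    sample = [
        {"name": "requests", "version": "2.31.0", "license": "Apache 2.0"},
        {"name": "Flask", "version": "3.0.0", "license": "BSD-3-Clause"},
    ]
    print(infos_to_csv(sample), end="")
```

Note that with QUOTE_ALL the header row is quoted too, which differs slightly from the original output.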
The answer didn't work for me; a lot of those libraries raised an exception.
So I used a little brute force:
import re
import subprocess

def get_pkg_license_use_show(pkgname):
    """
    Given a package reference (as from requirements.txt),
    return license listed in package metadata.
    NOTE: This function does no error checking and is for
    demonstration purposes only.
    """
    # universal_newlines=True makes check_output return str instead of bytes on Python 3.
    out = subprocess.check_output(["pip", "show", pkgname], universal_newlines=True)
    pattern = re.compile(r"License: (.*)")
    # Match 'License:' exactly so other fields that merely start with 'License' are ignored.
    license_line = [i for i in out.split("\n") if i.startswith('License:')]
    match = pattern.match(license_line[0])
    license = match.group(1)
    return license
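Since the function above does no error checking, here is a slightly more defensive sketch of the same pip show approach. The helper names parse_pip_show and get_license_or_none are made up for illustration, and it assumes pip show exits with a non-zero status when the package is not installed.

```python
import subprocess
import sys


def parse_pip_show(text):
    """Parse the 'Key: value' lines of `pip show` output into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line or line[0].isspace():
            continue  # skip blank lines and indented continuation lines
        key, sep, value = line.partition(": ")
        if sep and key not in fields:
            fields[key] = value.strip()
    return fields


def get_license_or_none(pkgname):
    """Return the License field for pkgname, or None if it cannot be found."""
    try:
        out = subprocess.check_output(
            [sys.executable, "-m", "pip", "show", pkgname],
            universal_newlines=True,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.CalledProcessError:
        # Assumption: a non-zero exit status means the package is not installed.
        return None
    return parse_pip_show(out).get("License")


if __name__ == "__main__":
    print(get_license_or_none("pip"))
```

Running pip via sys.executable -m pip keeps the lookup tied to the interpreter (and virtualenv) that is executing the script, rather than whichever pip happens to be first on PATH.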
Another option is to use Brian Dailey's Python Package License Checker.
git clone https://github.com/briandailey/python-packages-license-check.git
cd python-packages-license-check
... activate your chosen virtualenv ...
./check.py