Here's a Python library that tries to infer the logo image from a URL:
https://github.com/dcollien/urlimage
It parses the HTML at the URL and tries a number of things, including:
- meta tag with itemprop="image" or property="image"
- meta tag with property="og:image:secure_url" or property="og:image"
- meta tag with name="twitter:image"
- meta tag for Microsoft tiles, with: name="msapplication-wide310x150logo", name="msapplication-square310x310logo", name="msapplication-square150x150logo", name="msapplication-square70x70logo"
- link tag with rel="apple-touch-icon"
- link tag with rel="icon"
- checks whether "{scheme}://{domain}/favicon.ico" exists
- otherwise, falls back to the first img tag (next to an h1)
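
To make the lookup order concrete, here is a rough standard-library sketch of the same idea. It is not urlimage's actual code or API: the names `find_logo_candidates`, `META_KEYS`, and `LINK_RELS` are made up for illustration, and the priority order is assumed from the list above. It also simplifies two steps: it does not make a network request to check that favicon.ico exists, and it takes the first img tag on the page without looking for a neighbouring h1.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Assumed priority order, mirroring the list above (not taken from urlimage's source).
META_KEYS = [
    ("itemprop", "image"),
    ("property", "image"),
    ("property", "og:image:secure_url"),
    ("property", "og:image"),
    ("name", "twitter:image"),
    ("name", "msapplication-wide310x150logo"),
    ("name", "msapplication-square310x310logo"),
    ("name", "msapplication-square150x150logo"),
    ("name", "msapplication-square70x70logo"),
]
LINK_RELS = ["apple-touch-icon", "icon"]


class _TagCollector(HTMLParser):
    """Collects meta/link/img attributes in a single pass over the HTML."""

    def __init__(self):
        super().__init__()
        self.meta = {}         # (attribute name, attribute value) -> content
        self.links = {}        # rel token -> href
        self.first_img = None  # src of the first img tag seen

    def handle_starttag(self, tag, attrs):
        a = {k: v for k, v in attrs if v is not None}
        if tag == "meta" and a.get("content"):
            for attr in ("itemprop", "property", "name"):
                if attr in a:
                    self.meta.setdefault((attr, a[attr].lower()), a["content"])
        elif tag == "link" and a.get("href"):
            # rel may hold several tokens, e.g. rel="shortcut icon"
            for rel in a.get("rel", "").lower().split():
                self.links.setdefault(rel, a["href"])
        elif tag == "img" and a.get("src") and self.first_img is None:
            self.first_img = a["src"]


def find_logo_candidates(html, page_url):
    """Return candidate image URLs, best guess first, as absolute URLs."""
    parser = _TagCollector()
    parser.feed(html)
    found = [parser.meta[k] for k in META_KEYS if k in parser.meta]
    found += [parser.links[r] for r in LINK_RELS if r in parser.links]
    parts = urlsplit(page_url)
    # Conventional favicon location; its existence is NOT verified here.
    found.append(f"{parts.scheme}://{parts.netloc}/favicon.ico")
    if parser.first_img:
        found.append(parser.first_img)
    return [urljoin(page_url, u) for u in found]
```

You would fetch the page yourself (e.g. with urllib.request or requests), pass the HTML text and the page URL to `find_logo_candidates`, and then try the returned URLs in order until one actually resolves to an image.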