I have two sets of data: the first is a list of PDF URLs, together with data on broken links within them (from our own internal website data); the second is an export f