Question
I've been using Gepsio to work my way through standard SEC XBRL filings from the EDGAR system, and despite numerous attempts to figure this out, I'm at a loss.
When you extract the facts from a document and want to retrieve "revenues" per the US-GAAP standards, there might be as many as 200 facts with the revenues tag associated with them. While the ID of each one is unique, figuring out which ID corresponds to the particular type of revenue you want isn't straightforward. The revenue I'm interested in is the one that appears in the consolidated statement of operations, i.e. Net Revenue, and not some other obscure type of revenue in the document. XBRL viewers like Arelle get it right every time, but despite trawling through Arelle's source code I can't figure out the logic they use either.
If anyone can point me in the right direction to understand this, it would be greatly appreciated.
Answer 1:
You need to find the fact that has the right concept, period and entity.
- Finding the concept is not as trivial as it sounds. While in theory it should be us-gaap:Revenues, many filers do not use that concept and either "abuse" another US GAAP concept or (in the worst case) make up their own concept in their own namespace. Charles Hoffman has spent considerable time investigating this and designed report frames to solve the issue and allow comparison across filers. Report frames include mappings, such as this one, where you can see that no fewer than 77 different concepts (us-gaap:Revenues, us-gaap:SalesRevenueNet, ...) are used to report revenues. Charlie's approach is to pick the first one in the list that gets reported. For some concepts (I think this doesn't happen with revenues), facts may not even be explicitly reported, so calculations are needed. Some XBRL vendors have worked with Charlie and integrated this report frame feature into their products.
- The entity is the easiest, because in the vast majority of cases (it may even be mandatory, but I couldn't find instructions on this), all facts within a filing share the same entity. For SEC filings, it is the CIK of the company (with the CIK scheme). Having said that, SEC filings have an additional dimension (dei:LegalEntityAxis), and you must check that it is absent or set to its default value in order to filter out any subsidiaries.
- The period is a bit more complicated. For this, you need to find another fact, reported against dei:DocumentPeriodEndDate, that gives you the date on which the report ends, which is the balance sheet date. With this date, you can filter the candidate facts and pick the ones that end on that date and that have a duration of one year, or of one or several quarters.
- Revenues may also be reported for specific scenarios or branches. In this case, simply filter out facts that have any further dimensions.
Normally, if you filter facts based on all the above (concept, entity, period, extra dimensions), you should only have one left, because collisions are very rare and are often mistakes.
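To make the filtering above concrete, here is a minimal Python sketch. It is not tied to Gepsio or Arelle: the `Fact` record, the `find_revenue_facts` function, the sample CIK and the `ex:` names are all hypothetical, and it assumes you have already copied each fact's concept, entity identifier, period and explicit dimensions out of whatever XBRL library you use. The concept list contains only the two concepts named above; the real mapping is much longer. The duration window (in days) is also an assumption you would adjust for quarterly filings.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Fact:
    """Hypothetical, simplified fact record (not Gepsio's or Arelle's model)."""
    concept: str                  # e.g. "us-gaap:Revenues"
    entity: str                   # entity identifier, the CIK for SEC filings
    start: Optional[date]         # None for instant (point-in-time) facts
    end: date
    value: str
    dimensions: Dict[str, str] = field(default_factory=dict)  # axis -> member


# Only the two concepts named in the answer; the real mapping lists many more.
REVENUE_CONCEPTS = ["us-gaap:Revenues", "us-gaap:SalesRevenueNet"]


def find_revenue_facts(facts: List[Fact], cik: str,
                       min_days: int = 350, max_days: int = 380) -> List[Fact]:
    # Period: read the report end date from dei:DocumentPeriodEndDate
    # (assumed here to be an ISO date string such as "2016-12-31").
    period_end = None
    for f in facts:
        if f.concept == "dei:DocumentPeriodEndDate":
            period_end = date.fromisoformat(f.value)
            break
    if period_end is None:
        return []

    def is_candidate(f: Fact) -> bool:
        if f.entity != cik or f.start is None or f.end != period_end:
            return False
        # Extra dimensions: keep only facts with no explicit dimension at all.
        # A default member is never written into the context, so this also
        # covers "dei:LegalEntityAxis absent or set to its default".
        if f.dimensions:
            return False
        return min_days <= (f.end - f.start).days <= max_days

    # Concept: take the first concept in the list that is actually reported.
    for concept in REVENUE_CONCEPTS:
        matches = [f for f in facts if f.concept == concept and is_candidate(f)]
        if matches:
            return matches
    return []


cik = "0000123456"  # made-up CIK
sample = [
    Fact("dei:DocumentPeriodEndDate", cik, date(2016, 1, 1), date(2016, 12, 31), "2016-12-31"),
    Fact("us-gaap:SalesRevenueNet", cik, date(2016, 1, 1), date(2016, 12, 31), "1000"),
    Fact("us-gaap:SalesRevenueNet", cik, date(2016, 1, 1), date(2016, 12, 31), "400",
         {"us-gaap:StatementBusinessSegmentsAxis": "ex:SegmentA"}),
    Fact("us-gaap:SalesRevenueNet", cik, date(2015, 1, 1), date(2015, 12, 31), "900"),
]
result = find_revenue_facts(sample, cik)
```

The function returns a list rather than a single fact so that a collision (more than one match) stays visible to the caller instead of being silently resolved.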
A different approach for finding the concept is to look at the networks in each filing and programmatically find the consolidated statement of operations that you mention, but this is also not trivial, as labels may vary. Then you may be able to infer the revenues concept, and the (absence of) definition network will make sure the dimensions (if any) are right. This may be the way Arelle finds it.
Source: https://stackoverflow.com/questions/44356106/get-specific-value-from-xbrl-document