Are there some recent and reliable statistics about "Web use" (webpages using one standard or another) of these standards?
Or an specific s
Now I see that there are some statistics (!!). The Wikipedia link was lost, so I corrected it. The data is not up to date: it is from "Winter 2013" (collected ~1.5 to 2 years ago), but it shows the reality and the tendencies.
http://webdatacommons.org/structureddata/index.html#toc2
This is the chart in the report (with RDFa+HTML dominance!):
Interpreting:
Section 5, "Extraction Process", says that "on each page, we run our RDF extractor based on the Anything To Triples (Any23) library", so everything (RDF and Microformats) ended up as "triples" (not only RDF).
The idea behind "per domain" statistics is that domains apply a uniform policy to all their pages... But I think this uniformity is false: only a few pages per domain adopt "semantic markup"... It is no less biased than counting URLs, it is only another picture. Anyway, the outcome was a dead heat, ~57% vs 43%.
Only 21% of the "semantic markup URLs" of 2013 were Microformats; all the others were RDFa-HTML (Microdata is counted here as a kind of RDFa).
Using the average of the percentages for domains (Ds) and URLs (Us), (Ds+Us)/2, the outcome is ~60% for RDFa-HTML and ~40% for Microformats.
Before 2013 Microformats dominated, so the strong growth of "RDFa-HTML" since 2011 is evident... The tendency is clear.
If we adopt the arithmetic mean of the "per domain" and "per URL" counts, Microformats and RDFa-HTML are close to each other, with a little less Microformats (and a strong tendency for RDFa-HTML to grow in 2014).
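The averaging above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable and function names are mine, and it assumes that the ~57% "per domain" share belongs to Microformats (the text does not say which side is which, but this is the assignment that reproduces the ~60/40 outcome).

```python
# Shares in percent, taken from the 2013 figures quoted above.
# ASSUMPTION: the ~57% per-domain share is Microformats and ~43% is RDFa-HTML;
# the source only says "~57% vs 43%".
per_domain = {"Microformats": 57.0, "RDFa-HTML": 43.0}
per_url = {"Microformats": 21.0, "RDFa-HTML": 79.0}


def mean_share(ds, us):
    """Arithmetic mean (Ds + Us) / 2 of the two shares, per format."""
    return {fmt: (ds[fmt] + us[fmt]) / 2 for fmt in ds}


averaged = mean_share(per_domain, per_url)
print(averaged)  # {'Microformats': 39.0, 'RDFa-HTML': 61.0}
```

That is roughly 40% vs 60%, matching the rounded figures in the interpretation.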
Here is a table for the discussion with @sashoalm, showing the percentages and totals:
NOTE1: HTML5 was released only on 2014-10-28, so only around 2015-10 will we be able to check the real (definitive) impact of the new standard on the Web. An important expected impact is that Microdata was not blessed by HTML5, so the only standard is HTML+RDFa (which recommends RDFa Lite)... In the future there will perhaps be less Microdata and more schema.org.
NOTE2: there is a methodological problem in counting web pages: boilerplate text carrying heavily cloned "semantic markup". I think the "next generation" of statistics could use some "per domain analysis" to build URL substatistics (sampling) of the diversity of semantically marked pages. Ideally the boilerplate would be weighted (e.g. count the non-clones once and use 1+SQRT(count) for the clones).
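A minimal sketch of that weighting, under my own reading of the formula: pages within one domain are grouped by an identical-markup "signature" (a hypothetical key, e.g. a hash of the extracted triples), a group of one page counts as 1, and a group of `count` clones counts as 1 + sqrt(count). The function names and the sample data are illustrative only.

```python
import math
from collections import Counter


def clone_weight(count):
    """Weight of a group of `count` pages that share identical markup."""
    # A unique page counts once; a cloned group is damped to 1 + sqrt(count).
    return 1.0 if count == 1 else 1.0 + math.sqrt(count)


def weighted_page_total(markup_signatures):
    """Sum of group weights for the pages of one domain."""
    groups = Counter(markup_signatures)
    return sum(clone_weight(n) for n in groups.values())


# Hypothetical domain: two unique pages plus a footer snippet cloned on 4 pages.
signatures = ["product-a", "footer", "footer", "footer", "footer", "product-b"]
total = weighted_page_total(signatures)
print(total)  # 5.0 (1 + (1 + sqrt(4)) + 1), instead of a raw count of 6
```

The square root keeps heavily cloned boilerplate from dominating the "per URL" totals while still letting large sites count for more than a single page.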
Today some people perhaps still use Microformats, but there are more pages on the Web using RDFa-HTML (Microdata, RDFa, RDFa Lite, etc.), and the tendency is for that share to grow.
If your project is for the coming years, the statistics say to use RDFa.
Another interesting count for RDFa is not the use, but the reuse of vocabularies (!). See Linked Open Vocabularies (LOV).
The latest statistics from WebDataCommons are as follows:
Source: http://webdatacommons.org/structureddata/2016-10/stats/stats.html
Number of domains parsed: 34 million pay-level-domains
Number of domains with RDFa, Microdata and Microformats: 5.63 million (16.5%)
Popularity of different formats: