How do you identify the file content as being in ASCII or binary using C++?
GitHub's Linguist uses the charlock_holmes library to detect binary files, which in turn relies on ICU's charset detection.
ICU itself ships as a C/C++ library (ICU4C) and a Java library (ICU4J), so its charset detector can be called directly from C++.
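If pulling in ICU is more than you need, a rough byte-level check often suffices. The sketch below is not what ICU or charlock_holmes does. It is a simpler heuristic that assumes "ASCII text" means every sampled byte is either printable 7-bit ASCII or a tab, line feed, or carriage return. The function names `looks_like_ascii_text` and `file_looks_like_ascii_text`, and the 8192-byte sample size, are my own choices for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Heuristic (assumption): a buffer is "ASCII text" if every byte is
// printable 7-bit ASCII (0x20..0x7E) or one of tab, LF, CR.
// Anything else (NUL, other control bytes, bytes >= 0x7F) counts as binary.
bool looks_like_ascii_text(const std::string& data) {
    for (unsigned char c : data) {
        if (c == '\t' || c == '\n' || c == '\r') {
            continue;  // common whitespace is fine
        }
        if (c < 0x20 || c >= 0x7F) {
            return false;  // control byte, DEL, or non-ASCII byte
        }
    }
    return true;  // note: an empty buffer is treated as text
}

// Reads at most max_bytes from the start of the file and applies the
// heuristic above. The file is opened in binary mode so that no newline
// translation hides or alters bytes.
bool file_looks_like_ascii_text(const std::string& path,
                                std::size_t max_bytes = 8192) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open file: " + path);
    }
    std::string buffer(max_bytes, '\0');
    in.read(&buffer[0], static_cast<std::streamsize>(max_bytes));
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return looks_like_ascii_text(buffer);
}
```

Call `file_looks_like_ascii_text("some/path")` and treat a `false` result as "binary or non-ASCII". Note that this classifies UTF-8 text containing non-ASCII characters as binary, so if you need to tell encodings apart, a real charset detector such as ICU's is the better tool.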