Pi = 3.14159 26 5358979323846 26 433... so the first 2-digit substring to repeat is 26.
What is an efficient way
Trie
RBarryYoung has pointed out that this will exceed the memory limits.
A trie data structure might be appropriate. In a single pass you can insert every substring of length n (e.g., n = 20) into the trie, one digit per level, so that all of its prefixes are stored along the way. As you continue to process, if you ever walk all the way down to level n without having to create a new node, that substring was already present and you've just found a duplicate.
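As a rough illustration of the trie idea, here is a minimal sketch in C. The function name `first_repeat_trie`, the `TrieNode` layout, and the doubling node pool are all assumptions made for this example, and the input is assumed to contain only the characters '0' to '9' (no decimal point). It does O(n * k) work and, as noted above, the node count can grow far too large for long inputs with k = 20.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One trie node: child[d] is the pool index of the node for digit d.
   Index 0 is the root, which is never a child, so 0 doubles as "absent". */
typedef struct { size_t child[10]; } TrieNode;

/* Returns the start index of the first k-digit window that already occurred
   earlier in digits, or -1 if there is none (or if allocation fails).
   Assumes digits contains only '0'..'9'. */
long first_repeat_trie(const char *digits, size_t k)
{
    assert(k > 0);
    size_t len = strlen(digits);
    if (len < k) return -1;

    size_t cap = 1024, used = 1;              /* node 0 is the root */
    TrieNode *nodes = calloc(cap, sizeof *nodes);
    if (!nodes) return -1;

    long result = -1;
    for (size_t i = 0; i + k <= len && result < 0; ++i) {
        size_t cur = 0;
        int created = 0;                      /* did this window add a node? */
        for (size_t j = 0; j < k; ++j) {
            int d = digits[i + j] - '0';
            if (nodes[cur].child[d] == 0) {
                if (used == cap) {            /* grow the pool, zero the new half */
                    TrieNode *grown = realloc(nodes, 2 * cap * sizeof *nodes);
                    if (!grown) { free(nodes); return -1; }
                    memset(grown + cap, 0, cap * sizeof *grown);
                    nodes = grown;
                    cap *= 2;
                }
                nodes[cur].child[d] = used++;
                created = 1;
            }
            cur = nodes[cur].child[d];
        }
        /* No new node means the whole path existed: a repeated substring. */
        if (!created) result = (long)i;
    }
    free(nodes);
    return result;
}
```

For the digits after the decimal point in the question, `first_repeat_trie("14159265358979323846264338327950", 2)` returns 20, the offset of the second "26".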
Suffix Matching
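Another approach involves treating the expansion as a character string. This approach finds common suffixes, but you want common prefixes, so start by reversing the string. Create an array of pointers, one per digit, so that entry i points at digit i of the string. Then sort the pointers lexicographically by the strings they point to. Note that qsort passes the comparator pointers to the array elements (pointers to pointers), so strcmp cannot be passed directly and needs a small wrapper. In C, this would be something like:
static int cmp(const void *a, const void *b) { return strcmp(*(const char *const *)a, *(const char *const *)b); }
qsort(array, number_of_digits, sizeof(array[0]), cmp);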
When the qsort finishes, similar substrings will be adjacent in the pointer array. So for every pointer, you can do a limited string comparison with that string and the one pointed to by the next pointer. Again, in C:
for (int i = 1; i < number_of_digits; ++i) {
    if (strncmp(array[i - 1], array[i], 20) == 0) {
        // found two substrings that match for at least 20 digits
        // the pointers point to the last digits in the common substrings
    }
}
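The sort (typically) takes O(n log n) comparisons, and the search afterwards is O(n). Each comparison is a strcmp, which for digits that behave like random data usually stops after a few characters, but can be slow on highly repetitive input.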
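Putting the sort and the scan together, a self-contained sketch might look like the following. The names `compare_suffixes` and `find_repeat_sorted` are made up for this example. As a simplification, it skips the reversal step and sorts pointers into the original string, so the reported offsets mark the first digits of the matching substrings rather than the last. It reports some repeated k-digit substring, not necessarily the earliest one.

```c
#include <stdlib.h>
#include <string.h>

/* qsort hands the comparator pointers to the array elements, i.e. pointers
   to the stored char pointers, so they are dereferenced before strcmp. */
static int compare_suffixes(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Returns 1 and stores two distinct start offsets (first < second) of some
   repeated k-digit substring, or returns 0 if there is none or if
   allocation fails. */
int find_repeat_sorted(const char *digits, size_t k, size_t *first, size_t *second)
{
    size_t n = strlen(digits);
    if (k == 0 || n < k) return 0;

    const char **suffix = malloc(n * sizeof *suffix);
    if (!suffix) return 0;
    for (size_t i = 0; i < n; ++i)
        suffix[i] = digits + i;               /* entry i points at digit i */

    qsort(suffix, n, sizeof suffix[0], compare_suffixes);

    int found = 0;
    for (size_t i = 1; i < n && !found; ++i) {
        /* Suffixes sharing their first k digits end up adjacent after sorting.
           Two suffixes of one string never have the same length, so a zero
           result here means both really contain k matching digits. */
        if (strncmp(suffix[i - 1], suffix[i], k) == 0) {
            size_t a = (size_t)(suffix[i - 1] - digits);
            size_t b = (size_t)(suffix[i] - digits);
            *first = a < b ? a : b;
            *second = a < b ? b : a;
            found = 1;
        }
    }
    free(suffix);
    return found;
}
```

A caller would pass the digit string, the match length (20 in the discussion above), and two `size_t` variables to receive the offsets. Finding the first repeat in reading order would need an extra pass over each group of matching neighbours.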
This approach was inspired by this article.