The output size of a pooling layer along each spatial dimension is
output = floor((input size - window size) / stride) + 1
In the above case the input size is 13. Some implementations pad the input so that the boundary pixels are kept in the calculation; assuming one extra row and column of padding, the input size becomes 14.
The most common window size and stride are W = 2 and S = 2, so put them in the formula:
output = floor((14 - 2) / 2) + 1
output = 12 / 2 + 1
output = 7
So there will be 256 feature maps of size 7x7. Flatten that out and you get
flatten = 7 x 7 x 256
flatten = 12544
A 4x4 map, which flattens to 4 x 4 x 256 = 4096, needs a different setting, for example W = 3 and S = 3 on the unpadded 13x13 input: floor((13 - 3) / 3) + 1 = 4.
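Here is a minimal Python sketch for checking this kind of calculation. It assumes the usual floor-mode formula floor((n + p - w) / s) + 1, where p is the total number of rows or columns added by padding. The helper names pool_output_size and flattened_size are made up for illustration and do not come from any library.

```python
def pool_output_size(input_size: int, window: int, stride: int, total_padding: int = 0) -> int:
    """Output length along one spatial dimension (floor mode).

    total_padding is the total number of rows/columns added to that
    dimension (an assumption of this sketch, e.g. 1 turns 13 into 14).
    """
    padded = input_size + total_padding
    return (padded - window) // stride + 1


def flattened_size(side: int, channels: int) -> int:
    """Length of the vector obtained by flattening channels maps of side x side."""
    return side * side * channels


# 13x13 input padded to 14, window 2, stride 2
side = pool_output_size(13, window=2, stride=2, total_padding=1)
print(side, flattened_size(side, 256))  # 7 12544

# 13x13 input, no padding, window 3, stride 3
side = pool_output_size(13, window=3, stride=3)
print(side, flattened_size(side, 256))  # 4 4096
```

Integer division (//) does the flooring, so window and stride combinations that do not divide evenly are handled the same way as in floor-mode pooling.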
Hope this answers your question.