I have an open source iOS application that uses custom OpenGL ES 2.0 shaders to display 3-D representations of molecular structures. It does this by using procedurally gene
On the desktop, many early programmable devices could process 8, 16, or some other number of fragments simultaneously, but effectively had only one program counter for the whole group (which also implies a single fetch/decode unit, and a single copy of everything else, as long as they work in units of 8 or 16 pixels). Hence the initial prohibition on conditionals and, for a while after that, the situation where pixels that would otherwise be processed together were split into smaller groups whenever their conditional evaluations returned different values.
Although PowerVR aren't explicit, their application development recommendations include a section on flow control that repeatedly advises using dynamic branches only where the result is reasonably predictable, which makes me think they're getting at the same sort of thing. I'd therefore suggest that the speed disparity may be because you've included a conditional.
As a first test, what happens if you try the following?
void main()
{
    float distanceFromCenter = length(impostorSpaceCoordinate);

    // the step function doesn't count as a conditional;
    // this is 1.0 inside (or on) the unit circle and 0.0 outside it
    float inCircleMultiplier = step(distanceFromCenter, 1.0);

    // outside the circle the multiplier zeroes the squared term,
    // so the sqrt argument is 1.0 there and never negative
    float calculatedDepth = sqrt(1.0 - distanceFromCenter * distanceFromCenter * inCircleMultiplier);
    mediump float currentDepthValue = normalizedDepth - adjustedSphereRadius * calculatedDepth;

    // Inlined color encoding for the depth values
    float ceiledValue = ceil(currentDepthValue * 765.0) * inCircleMultiplier;
    vec3 intDepthValue = (vec3(ceiledValue) * scaleDownFactor) - (stepValues * inCircleMultiplier);

    // use the result of the step to combine results:
    // vec4(1.0) outside the circle, vec4(intDepthValue, 1.0) inside it
    gl_FragColor = vec4(1.0 - inCircleMultiplier) + vec4(intDepthValue, inCircleMultiplier);
}
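
If it helps to see the trick in isolation, here is a minimal sketch of the same select-by-multiplier idea written with mix() instead of the manual add. The constants insideColor and outsideColor are made up for illustration and are not part of your shader, and I'm assuming a default mediump float precision and the same impostorSpaceCoordinate varying. Whether this is actually faster on a given device is something you'd have to profile.

```glsl
// Sketch only: a branch-free select in GLSL ES 1.00.
// insideColor and outsideColor are hypothetical constants for illustration.
precision mediump float;

varying mediump vec2 impostorSpaceCoordinate;

const vec4 insideColor = vec4(0.2, 0.4, 0.8, 1.0);  // hypothetical
const vec4 outsideColor = vec4(1.0);                // same all-ones result as above

void main()
{
    float distanceFromCenter = length(impostorSpaceCoordinate);

    // step(edge, x) returns 0.0 when x < edge and 1.0 otherwise, so this is
    // 1.0 inside (or on) the unit circle and 0.0 outside it.
    float inCircle = step(distanceFromCenter, 1.0);

    // mix(a, b, t) computes a * (1.0 - t) + b * t. With t restricted to
    // 0.0 or 1.0 it picks a or b, with no if statement in the source.
    gl_FragColor = mix(outsideColor, insideColor, inCircle);
}
```

Both results are computed for every fragment and one is discarded, so this only pays off when the work on each side is cheap compared with the cost of a divergent branch.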