A delay will always occur between a user action and an application response.
It is well known that the lower the response delay, the more responsive the application feels.
Persistence of vision is around 100 ms, so that should be a reasonable visual feedback delay; 110 ms should make no difference, since 100 ms is only an approximate value. In practice you won't notice a delay below 200 ms.
From memory, studies have shown that users lose patience and retry an operation after around 2 s of inactivity (in the absence of feedback), e.g. after clicking a confirm or action button. So plan on showing some kind of animation if the action takes longer than 1 s.
I have no solid evidence, but for our own application we allow a maximum of one second between a user action and feedback. If it takes longer than that, a "waiting box" is shown.
A user should see "something" happening within a second of causing an action.
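A minimal sketch of that rule in a browser context (TypeScript; the one-second budget, the element id, and the helper name are illustrative assumptions on my part, not part of the answer above):

```typescript
// Sketch: show a "waiting box" only when an action exceeds a ~1 s budget.
// The budget, element id, and helper name are illustrative assumptions.
const FEEDBACK_BUDGET_MS = 1000;

async function withWaitingBox<T>(action: () => Promise<T>): Promise<T> {
  const box = document.getElementById("waiting-box")!;
  // Arm a timer: the box only appears if the budget is exceeded.
  const timer = setTimeout(() => { box.hidden = false; }, FEEDBACK_BUDGET_MS);
  try {
    return await action();
  } finally {
    clearTimeout(timer); // finished in time: the box never shows
    box.hidden = true;   // otherwise: hide it now that we're done
  }
}

// Usage, e.g. wrapping a form confirmation request:
// withWaitingBox(() => fetch("/confirm", { method: "POST" }));
```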
I am a cognitive neuroscientist who studies visual perception and cognition.
The paper by Mary Potter mentioned above concerns the minimum time required to categorize a visual stimulus. Bear in mind, though, that this was measured under laboratory conditions in the absence of any other visual stimuli, which is certainly not the case in a real-world user experience.
The typical benchmark for a stimulus-response / input-stimulus interaction, i.e. the average minimum time for an individual to detect an input-response pairing, is ~200 ms. To be certain there is no detectable difference, this threshold can be lowered to around 100 ms. Below that, the temporal dynamics of your cognitive processes take longer to compute the event than the event itself lasts, so there is almost no chance of detecting or differentiating it. You could go lower still, say to 50 ms, but it really wouldn't be necessary; at 10 ms you are well into overkill territory.
Use the temporal dual of the test for visual spatial resolution (two parallel black bars of equal width, with an equal gap between them; reduce the angular subtense until they appear to be one line, i.e. scale the image down or simply move away; the point at which they merge into one line is the threshold).
Use a function generator to blink an LED on for an interval, then off, then on, then off, with the same time delay for each interval. Repeat the pattern while gradually decreasing that delay: the same test as above, but with time in place of space. Imagine an oscilloscope trace like so:
_________/^d^\_d_/^d^\_________
I note that at a 41 ms interval I perceive only one longer blink, but at 42 ms I perceive an extremely rapid double blink. Thus the threshold is ~42 ms, for me at least; it probably varies with person, age, condition, etc. (A runnable sketch of this experiment follows below.)
This is close to 24 fps, which is probably why cinema works at that presentation rate.
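If you don't have a function generator handy, here is a rough browser analogue of the experiment (TypeScript; the element id and the sweep values are my illustrative choices, and setTimeout resolution plus the display's refresh rate limit precision, so treat any threshold you find as approximate):

```typescript
// Flash a patch twice: on for d ms, off for d ms, on for d ms,
// then rest and repeat with a smaller d. Note the d at which the
// two blinks fuse into a single longer blink.
const patch = document.getElementById("patch")!; // a plain black square

const sleep = (ms: number) => new Promise<void>(r => setTimeout(r, ms));

async function doubleBlink(d: number): Promise<void> {
  for (const lit of [true, false, true]) {
    patch.style.visibility = lit ? "visible" : "hidden";
    await sleep(d);
  }
  patch.style.visibility = "hidden";
}

async function sweep(): Promise<void> {
  for (let d = 80; d >= 20; d -= 2) { // ms, swept downwards
    console.log(`interval d = ${d} ms`);
    await doubleBlink(d);
    await sleep(1000); // rest between trials
  }
}

sweep();
```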
Reaction time, i.e. seeing something and then deciding to react, say by clicking a mouse, is much longer again. Thus it's not surprising that experiments requiring a reaction response as the measurement yield a longer time, but that longer delay wasn't what you were asking about, and the experiment above is easy and illuminating!
But note also: smoothly moving animations require the visual cortex to work harder, delaying visual comprehension. This delay is 'hidden' from perception, so longer delays (several hundred ms) can be 'hidden' by simply providing something that is difficult to see properly because it is moving.
The effect that hides it is called chronostasis. Basically, glancing somewhere 'new' requires the visual cortex to work harder to 'de-render' / 'recognise' the scene. This takes a remarkably long time, during which your consciousness is essentially 'paused'.
Once you are looking at a mostly-constant scene, only the changes need this processing, so your perceptual experience resumes and smaller, faster changes and movements become detectable.
The detection of visual changes is processed essentially on your retina. Your eyes also have a natural 'bandpass' response: stare unblinkingly at anything for long enough, and from far enough away that saccades cannot change the image much, and you will find your visual feed fading out to 'grey'. This is what gives us our 'white balance', and it is somewhat similar to the automatic gain control in analogue radio/TV.
The point is that your eyes themselves have a time constant of response, but it depends on the strength of the stimulus (the brightness of the LED, in our case).
Too bright, and the ability of your retinal cells to 'relax' back from the brightness, i.e. to respond to the 'sudden dark', is compromised.
The effect that keeps you seeing bright things after the light has stopped is called 'persistence of vision', and old cathode-ray picture tubes depend heavily on it to work at all.
This is the one that's usually 100 ms or so, but it's not a 'sharp' interval; it's more like an exponential roll-off, and again its duration changes depending on how bright the stimulus is relative to how dark-adjusted (i.e. sensitive) the eye is at that moment.
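As a rough way to picture that roll-off (a first-order approximation on my part, not from any particular study): the perceived intensity after the light goes off decays something like $I(t) = I_0 \, e^{-t/\tau}$, with the time constant $\tau$ on the order of 100 ms, and effectively larger when the stimulus was bright relative to the eye's current dark adaptation.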
For duller, faster changes, especially changes outside your fovea, you will easily perceive even higher rates, e.g. flickering lights. The outer parts of your retina (most of its area, actually) are adapted to detecting movement and bringing it to your attention. So it makes sense that, although they lack spatial resolution, they have greater temporal resolution / a shorter response time.
But this also means that animating things usually requires even finer time steps, otherwise 'jumpiness' is perceptible, mostly due to that faster peripheral response.
Note all the scaling/sliding full-screen animations iOS uses: these essentially exploit chronostasis to hide technically unavoidable loading delays, giving the perception that those products respond instantly and smoothly at all times.
So: show something different within 42 ms -> instant response. Otherwise, keep animating otherwise-useless, hard-to-see-properly visuals continuously at high frame rates, then stop suddenly when done -> this hides the delay, so long as enough of the screen is visually busy and the delay isn't too long (250 ms is probably pushing the friendship).
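As a sketch of that recipe (TypeScript; the element ids and the use of a CSS-animated spinner are my illustrative choices):

```typescript
// Change something visible immediately (well under 42 ms), keep a busy
// animation running while the real work happens, then stop abruptly.
async function respond(action: () => Promise<void>): Promise<void> {
  const button = document.getElementById("go")!;
  const spinner = document.getElementById("spinner")!; // CSS-animated

  button.classList.add("pressed"); // instant visual change, same frame
  spinner.hidden = false;          // continuous high-frame-rate motion

  try {
    await action();                // the real (slow) work
  } finally {
    spinner.hidden = true;         // stop suddenly when done
    button.classList.remove("pressed");
  }
}
```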
This also seems to tie in with others' measurements of input lag, for example: http://danluu.com/input-lag/
I worked on an application that had an explicit business goal of being blindingly fast, and we had a maximum allowed server time of 150 ms for processing a full web page.
I don't think anecdotes or opinions are really valid answers here. This question touches on the psychology of user experience and the subconscious mind. The human brain is powerful and fast, and mere milliseconds do count and are registered. I am no expert, but I know there is much science behind e.g. what Matt Jacobsen mentioned. Check out Google's study here http://services.google.com/fh/files/blogs/google_delayexp.pdf for an idea of how much delay can affect site traffic.
Here's another study, by Akamai, on a 2-second response time: http://www.akamai.com/html/about/press/releases/2009/press_091409.html (from https://ux.stackexchange.com/questions/5529/once-apon-a-time-there-was-a-10-seconds-to-load-a-page-rule-what-is-it-nowa )
Does anyone have any other studies to share?