Consider the following contenteditable div.
bold textbold text
Browsers are inconsistent on this. Firefox will let you place the caret in more positions than most browsers, but WebKit and IE both have definite ideas about which caret positions are valid and will amend a range you add to the selection so that it conforms to the nearest valid position. This does make sense: having different document positions, and hence different behaviours, for the same visual caret location is confusing for the user. However, it comes at the cost of inflexibility for the developer.
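If you want to see what your browser does, here is a rough sketch that asks for a caret at a given position and then reads back where the selection actually ended up. The helper names `requestCaret` and `samePosition` are made up for illustration; the DOM calls are the standard Range and Selection ones.

```javascript
// Hypothetical helper: true when two boundary points are the same DOM position.
function samePosition(a, b) {
  return a.node === b.node && a.offset === b.offset;
}

// Hypothetical helper: asks for a collapsed caret at (node, offset) and
// reports where the browser actually put it.
function requestCaret(node, offset, doc = document) {
  const range = doc.createRange();
  range.setStart(node, offset);
  range.collapse(true);

  const sel = doc.getSelection();
  sel.removeAllRanges();
  sel.addRange(range);

  // The browser may have adjusted the range, or refused it entirely.
  const actual = sel.rangeCount > 0 ? sel.getRangeAt(0) : null;
  return {
    requested: { node, offset },
    actual: actual
      ? { node: actual.startContainer, offset: actual.startOffset }
      : null,
  };
}
```

Calling `requestCaret` on a node inside the editable div and passing `requested` and `actual` to `samePosition` tells you whether the position was moved. Expect the answer to vary between browsers.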
This is not documented anywhere. The current Selection spec says nothing about it, principally because no spec existed when browsers implemented their selection APIs and there is no uniform behaviour for the current spec to document.
One option would be to intercept the keypress event as you suggest, although this will not help when the user pastes in content using the edit or context menus. Another would be to keep track of the selection using mouse and key events, create elements containing, say, a zero-width space character for the caret to be placed in, and place the caret in one of those elements when necessary. As you say, ugly.
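For what it's worth, the zero-width space approach might look something like the sketch below. It assumes you already have a reference to the node the caret should follow; `placeCaretAfter` and `stripPlaceholders` are hypothetical names, and whether a given browser leaves the caret inside the placeholder is not guaranteed.

```javascript
const ZWSP = "\u200B"; // zero-width space

// Hypothetical helper: removes placeholder characters, e.g. before saving
// the editable content.
function stripPlaceholders(text) {
  return text.split(ZWSP).join("");
}

// Hypothetical helper: inserts a span holding a zero-width space right after
// `node` and puts the caret inside that span, after the space.
function placeCaretAfter(node, doc = document) {
  const holder = doc.createElement("span");
  holder.appendChild(doc.createTextNode(ZWSP));
  node.parentNode.insertBefore(holder, node.nextSibling);

  const range = doc.createRange();
  range.setStart(holder.firstChild, 1); // offset 1 = after the ZWSP
  range.collapse(true);

  const sel = doc.getSelection();
  sel.removeAllRanges();
  sel.addRange(range);
  return holder;
}
```

You would still need to clean up the placeholder elements when the caret moves elsewhere and strip the extra characters from anything you read back out of the div, which is a large part of why this is ugly.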