This pattern is meant simply to grab everything in a string up until the first potential sentence boundary in the data:
[^\\.?!\\r\\n]*
The * quantifier allows the pattern to match a substring of length zero. In the original version of your code (without the ^ anchor in front), the additional matches are:
hard and the first !!!! and the end of the text
You can slice/dice this further if you like here.
Adding that ^ anchor to the front now ensures that only a single substring can match the pattern, since the beginning of the input text occurs exactly once.
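To see the difference between the anchored and unanchored forms, here is a minimal sketch in Python. It assumes the doubled backslashes in the pattern above are string-literal escaping, so the Python raw strings use single backslashes. The sample text is made up for illustration and is not the data from the question.

```python
import re

# Hypothetical sample text, not the original input from the question.
text = "This is hard. Is it? Yes!"

# Same character class as above, written as Python raw strings.
unanchored = re.compile(r"[^.?!\r\n]*")
anchored = re.compile(r"^[^.?!\r\n]*")

# Without the anchor, the search restarts after every boundary character,
# so several substrings match, including zero-length ones (because of *).
all_unanchored = unanchored.findall(text)
non_empty = [m for m in all_unanchored if m]

# With the anchor, a match can only begin at the start of the input,
# so there is exactly one result.
all_anchored = anchored.findall(text)

print(non_empty)
print(all_anchored)  # ['This is hard']
```

The exact number of empty strings in the unanchored result depends on how the regex engine handles zero-length matches, which is why the sketch filters them out rather than relying on a count.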