Here's what worked for us in a very similar situation:
Please apply the following mitigation steps (one node at a time) on all nodes:
- Go to the data root of the cluster. By default it is “D:\SvcFab\”, or “C:\ProgramData\Microsoft\SF” if you didn’t specify the datapath attribute in the deployment. Look for folder by name . Inside that folder you will see the file Fabric.Package.current.xml. Create a backup of it.
- Open the Fabric.Package.current.xml file and look for the Fabric.Config version:
- Go to the config folder for that version and look for the Settings.xml file.
- Take a backup of the Settings.xml file, then open it.
- Look for the section named “EseStore”.
- If the section is present, add this parameter to the section (type the text into Settings.xml by hand rather than copy-pasting, as copy-pasting sometimes appends extra invalid characters at the end).
- If the section named “EseStore” is not present, add the section name with parameter as and save the file.
- Kill FileStoreService.exe on the node.
- Wait for FileStoreService.exe to come back up, then proceed to the next node.
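
If you would rather script the backup and the Settings.xml edit than do it by hand, something like the Python sketch below might work. It is only a rough illustration under a few assumptions: that Settings.xml uses `Section` elements with a `Name` attribute containing `Parameter` elements with `Name` and `Value` attributes, and that you pass in the real parameter name and value yourself (`param_name` and `param_value` are placeholders here, as are the helper names `backup_file` and `ensure_parameter`). Note that ElementTree drops XML comments when it rewrites the file, which is another reason to keep the backup.

```python
import shutil
import time
import xml.etree.ElementTree as ET
from pathlib import Path


def backup_file(path: Path) -> Path:
    """Copy `path` to a timestamped sibling file and return the backup path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = path.with_name(f"{path.name}.{stamp}.bak")
    shutil.copy2(path, backup)
    return backup


def local_name(tag: str) -> str:
    """Strip an XML namespace, e.g. '{uri}Section' -> 'Section'."""
    return tag.rsplit("}", 1)[-1]


def ensure_parameter(settings_path: Path, section_name: str,
                     param_name: str, param_value: str) -> bool:
    """Make sure <Section Name=section_name> holds the given Parameter.

    Returns True if the section already existed, False if it was created.
    Assumes the Section/Parameter layout described above (an assumption).
    """
    tree = ET.parse(settings_path)
    root = tree.getroot()

    # Reuse the root's namespace (if any) so new elements match the file.
    ns = root.tag[:root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    if ns:
        # Keep the default namespace unprefixed when the file is written back.
        ET.register_namespace("", ns[1:-1])

    section = next((el for el in root
                    if local_name(el.tag) == "Section"
                    and el.get("Name") == section_name), None)
    existed = section is not None
    if section is None:
        section = ET.SubElement(root, f"{ns}Section", {"Name": section_name})

    param = next((p for p in section
                  if local_name(p.tag) == "Parameter"
                  and p.get("Name") == param_name), None)
    if param is None:
        param = ET.SubElement(section, f"{ns}Parameter", {"Name": param_name})
    param.set("Value", param_value)

    tree.write(settings_path, encoding="utf-8", xml_declaration=True)
    return existed
```

You would call `backup_file(settings)` first and then `ensure_parameter(settings, "EseStore", name, value)`, where `settings` is the `Path` to the Settings.xml you located in the steps above. Killing FileStoreService.exe and waiting for it to return is still done per node as described.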
After you have applied the mitigation to all nodes and the cluster is back up and healthy, please retry deploying the package. Please test this on a test machine first before using it in production.