Why does the Mongo Spark connector return different and incorrect counts for a query?
I'm evaluating the Mongo Spark connector for a project and I'm getting inconsistent results. I'm running MongoDB server 3.4.5, Spark 2.2.0 (via PySpark), and Mongo Spark Connector 2.2.0 (Scala 2.11 build) locally on my laptop. For my test DB I use the Enron dataset: http://mongodb-enron-email.s3-website-us-east-1.amazonaws.com/

I'm interested in Spark SQL queries, and when I started running simple count queries as a test, I received a different count on each run.

Here is the output from my mongo shell:

    > db.messages.count({'headers.To': 'eric.bass@enron.com'})
    203

Here is some output from my PySpark