HDFS is not necessary but recommendations appear in some places.
To help evaluate the effort spent in getting HDFS running:
What are the benefits of
The shortest answer is:"No, you don't need it". You can analyse data even without HDFS, but off course you need to replicate the data on all your nodes.
The long answer is quite counterintuitive and i'm still tryng to understand it with the help stackoverflow community.
Spark local vs hdfs permormance