This is a follow-up question to this one:
Python DictReader - Skipping rows with missing columns?
Turns out I was being silly, and using the wrong ID field.
My Python skills are poor, so I can't write out what I have in mind in any reasonable amount of time. But I do know how to do OO decomposition.
Why does the Employees class do all the work? Your monolithic Employees class does several different kinds of things.
I suggest that you create a class to handle each group of tasks.
Define an Employee class to keep track of employee data and handle field processing/tidying tasks.
Use the Employees class as a container for employee objects. It can handle tasks like tracking down an Employee's supervisor.
Define a virtual base class EmployeeLoader to define an interface (load, store, ?? ). Then implement a subclass for CSV file serialization. (The virtual base class is optional--I'm not sure how Python handles virtual classes, so this may not even make sense.)
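The three classes above can be sketched roughly as follows. Python's standard `abc` module provides abstract base classes, which play the role of the virtual base class here. This is only a minimal sketch under stated assumptions: the column names `id`, `name`, and `supervisor_id`, the `tidy` and `supervisor_of` methods, and the exact `load`/`store` signatures are all hypothetical, not taken from the original code.

```python
import csv
from abc import ABC, abstractmethod


class Employee:
    """One employee record; owns its own field tidying."""

    def __init__(self, emp_id, name, supervisor_id=None):
        self.emp_id = emp_id
        self.name = name
        self.supervisor_id = supervisor_id

    def tidy(self):
        # Hypothetical cleanup: strip whitespace, map an empty supervisor to None.
        self.emp_id = self.emp_id.strip()
        self.name = self.name.strip()
        self.supervisor_id = (self.supervisor_id or "").strip() or None


class Employees:
    """Container for Employee objects, keyed by ID."""

    def __init__(self):
        self._by_id = {}

    def add(self, employee):
        self._by_id[employee.emp_id] = employee

    def __iter__(self):
        return iter(self._by_id.values())

    def supervisor_of(self, employee):
        # Returns None when there is no supervisor or the ID is unknown.
        return self._by_id.get(employee.supervisor_id)


class EmployeeLoader(ABC):
    """Interface that every serialization format implements."""

    @abstractmethod
    def load(self):
        """Return an Employees container."""

    @abstractmethod
    def store(self, employees):
        """Write an Employees container back out."""


class EmployeeCSVLoader(EmployeeLoader):
    # Assumed column names; adjust to the real file.
    FIELDS = ["id", "name", "supervisor_id"]

    def __init__(self, filename):
        self.filename = filename

    def load(self):
        employees = Employees()
        with open(self.filename, newline="") as f:
            for row in csv.DictReader(f):
                emp = Employee(row["id"], row["name"], row.get("supervisor_id"))
                emp.tidy()
                employees.add(emp)
        return employees

    def store(self, employees):
        with open(self.filename, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=self.FIELDS)
            writer.writeheader()
            for emp in employees:
                writer.writerow({
                    "id": emp.emp_id,
                    "name": emp.name,
                    "supervisor_id": emp.supervisor_id or "",
                })
```

Each class here has one job: Employee cleans its own fields, Employees answers questions about the collection, and the loader is the only place that knows about CSV.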
So:
Create an EmployeeCSVLoader with a file name to work with.
Have the loader build an Employees object and parse the file.
Why is this design worth the effort?
It makes things easier to understand. It is easier to create clean, consistent APIs for smaller, task-focused objects.
If you find that you need an XML serialization format, it becomes trivial to add the new format. Subclass your virtual loader class to handle the XML parsing/generation. Now you can seamlessly move between CSV and XML formats.
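To illustrate how a second format slots in, here is a minimal sketch of two loaders sharing one interface. To keep it short, it assumes the loaders work on text rather than file names and pass records around as plain dicts; the field names, the `<employees>`/`<employee>` XML layout, and the `convert` helper are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod

FIELDS = ["id", "name", "supervisor_id"]  # hypothetical field names


class EmployeeLoader(ABC):
    """Interface shared by every serialization format."""

    @abstractmethod
    def load(self, text):
        """Parse text into a list of record dicts."""

    @abstractmethod
    def store(self, records):
        """Serialize a list of record dicts to text."""


class CSVLoader(EmployeeLoader):
    def load(self, text):
        return [dict(row) for row in csv.DictReader(io.StringIO(text))]

    def store(self, records):
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
        writer.writeheader()
        writer.writerows(records)
        return out.getvalue()


class XMLLoader(EmployeeLoader):
    def load(self, text):
        root = ET.fromstring(text)
        # findtext returns "" for an element that exists but has no text.
        return [
            {field: node.findtext(field, default="") for field in FIELDS}
            for node in root.iter("employee")
        ]

    def store(self, records):
        root = ET.Element("employees")
        for rec in records:
            node = ET.SubElement(root, "employee")
            for field in FIELDS:
                ET.SubElement(node, field).text = rec.get(field, "")
        return ET.tostring(root, encoding="unicode")


def convert(text, source, target):
    """Move data between formats using only the shared interface."""
    return target.store(source.load(text))
```

Because `convert` only relies on `load` and `store`, adding another format means writing one more subclass; nothing else has to change.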
In summary, use objects to simplify and structure your data. Section off common data and behaviors into separate classes. Keep each class tightly focused on a single type of ability. If your class is a collection, accessor, factory, and kitchen sink all at once, the API can never be usable: it will be too big and loaded with dissimilar groups of methods. But if your classes stay on topic, they will be easy to test, maintain, use, reuse, and extend.