I am building a library for working with very specific structured data and I am building my infrastructure on top of Pandas. Currently I am writing a bunch of different data
Because I ran into similar issues, and following Matti John's answer, I wrote a _pandas_wrapper class for a project of mine, since I also wanted to inherit from the pandas DataFrame.
https://github.com/mcocdawc/chemcoord/blob/bdfc186f54926ef356d0b4830959c51bb92d5583/src/chemcoord/_generic_classes/_pandas_wrapper.py
The only purpose of this class is to provide a pandas DataFrame lookalike that is safe to inherit from.
If your project is LGPL licensed, you can reuse it without problems.
I would avoid subclassing DataFrame, because many of the DataFrame methods will return a new DataFrame and not another instance of your CTMatrix object.
There are a few open issues on GitHub around this, e.g.:
https://github.com/pydata/pandas/issues/60
https://github.com/pydata/pandas/issues/2485
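To make the subclassing problem concrete, here is a minimal sketch. The CTMatrix body and its total() method are made up for illustration (the question does not show them), and the behaviour shown is what I would expect from a plain subclass that does not override any of pandas' subclassing hooks:

```python
import pandas as pd


class CTMatrix(pd.DataFrame):
    """Hypothetical subclass that adds one domain-specific method."""

    def total(self):
        # Sum of every cell in the frame.
        return self.to_numpy().sum()


m = CTMatrix({"a": [1, 2], "b": [3, 4]})
head = m.head(1)  # an ordinary DataFrame method

print(type(m).__name__)        # the subclass
print(type(head).__name__)     # the type pandas handed back
print(hasattr(head, "total"))  # whether the custom method survived
```

The result of head() is a plain DataFrame, so total() is no longer available on it. Any chain of pandas operations silently drops you back out of your own class.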
More generally, this is a question of composition vs. inheritance. I would be especially wary of benefit #2. It might seem great now, but unless you keep a close eye on updates to pandas (and it is a fast-moving target), you can easily end up with unexpected consequences, and your code will become intertwined with pandas.
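For comparison, a composition-based version could look roughly like the sketch below. This is only an assumption about how CTMatrix might be structured; the _frame attribute, the to_frame() method and the choice of which methods to expose are all hypothetical:

```python
import pandas as pd


class CTMatrix:
    """Hypothetical wrapper that holds a DataFrame instead of inheriting from it."""

    def __init__(self, data):
        # The DataFrame is an implementation detail, not part of the public type.
        self._frame = pd.DataFrame(data)

    @property
    def shape(self):
        return self._frame.shape

    def head(self, n=5):
        # Re-wrap the result so callers keep getting a CTMatrix back.
        return CTMatrix(self._frame.head(n))

    def to_frame(self):
        # Explicit escape hatch for callers that need the raw DataFrame.
        return self._frame.copy()


m = CTMatrix({"a": [1, 2], "b": [3, 4]})
first_row = m.head(1)
print(type(first_row).__name__, first_row.shape)
```

The cost is that every method you want has to be forwarded by hand. The benefit is that the public surface of CTMatrix is exactly what you chose to expose, so a pandas upgrade can only affect you through those few forwarding methods.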