I've got four unsigned 32-bit integers representing an unsigned 128-bit integer, in little-endian order:
typedef struct { unsigned int part[4]; } bigint;
The most immediate speedup will come from inlining the conversion rather than calling functions; it could be as simple as marking bigint_divmod10() inline, or using profile-guided optimisation as offered by your compiler.
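As a rough illustration of the inlining suggestion, here is a minimal sketch. The signature of bigint_divmod10() is not shown in the question, so the one below (divide in place, return the remainder) is an assumption, as are the helper names bigint_is_zero() and bigint_to_decimal(). It also uses uint32_t instead of unsigned int to make the 32-bit width explicit.

```c
#include <stdint.h>

/* Assumed layout: part[0] is the least significant 32-bit word. */
typedef struct { uint32_t part[4]; } bigint;

/* Hypothetical signature: divides *n by 10 in place, returns the remainder.
 * "static inline" lets the compiler expand it inside the conversion loop. */
static inline unsigned bigint_divmod10(bigint *n)
{
    uint64_t rem = 0;
    /* Schoolbook long division, most significant word first. */
    for (int i = 3; i >= 0; --i) {
        uint64_t cur = (rem << 32) | n->part[i];
        n->part[i] = (uint32_t)(cur / 10);
        rem = cur % 10;
    }
    return (unsigned)rem;
}

/* Hypothetical helper: true when every word is zero. */
static inline int bigint_is_zero(const bigint *n)
{
    return (n->part[0] | n->part[1] | n->part[2] | n->part[3]) == 0;
}

/* Hypothetical helper: writes the decimal form of n into buf and returns buf.
 * buf must hold at least 40 bytes (39 digits for 2^128 - 1, plus the NUL). */
static char *bigint_to_decimal(bigint n, char *buf)
{
    char tmp[40];
    int len = 0;

    /* Digits come out least significant first. */
    do {
        tmp[len++] = (char)('0' + bigint_divmod10(&n));
    } while (!bigint_is_zero(&n));

    /* Reverse them into the caller's buffer. */
    for (int i = 0; i < len; ++i)
        buf[i] = tmp[len - 1 - i];
    buf[len] = '\0';
    return buf;
}
```

Whether the compiler actually inlines the call is still its decision; inline is only a hint, so it is worth checking the generated code or measuring before and after.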