I often profile running Java applications with VisualVM, but it needs X to run on the machine.
I know I can connect through the management port, but that will be an offlin
Can you collect 10 or 20 stack samples with jstack? Then if Foo is a method, its overall time usage is the fraction of samples containing it. Its CPU usage is the fraction of those samples that don't terminate in I/O or a system call. Its "self time" is the fraction of samples in which it itself is the terminus.
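To make the arithmetic concrete, here is a rough sketch in Java. It assumes you have already turned each jstack dump into a list of method names for the thread you care about, innermost frame first (the order jstack prints them). The class name, method names, sample data, and the set of "blocking" frames are all made up for illustration. The CPU figure is taken as a share of all samples, which is one reading of the description above.

```java
import java.util.List;
import java.util.Set;

public class StackSampleStats {

    // Fraction of samples in which `method` appears anywhere on the stack.
    public static double inclusiveFraction(List<List<String>> samples, String method) {
        if (samples.isEmpty()) return 0.0;
        long hits = samples.stream().filter(s -> s.contains(method)).count();
        return (double) hits / samples.size();
    }

    // Fraction of samples in which `method` is the innermost frame (the terminus).
    public static double selfFraction(List<List<String>> samples, String method) {
        if (samples.isEmpty()) return 0.0;
        long hits = samples.stream()
                .filter(s -> !s.isEmpty() && s.get(0).equals(method))
                .count();
        return (double) hits / samples.size();
    }

    // Fraction of samples that contain `method` and do NOT terminate in a frame
    // listed in `blocking` (assumption: the caller decides what counts as I/O
    // or a system call, e.g. a socket read).
    public static double cpuFraction(List<List<String>> samples, String method,
                                     Set<String> blocking) {
        if (samples.isEmpty()) return 0.0;
        long hits = samples.stream()
                .filter(s -> s.contains(method) && !blocking.contains(s.get(0)))
                .count();
        return (double) hits / samples.size();
    }

    public static void main(String[] args) {
        // Four hypothetical samples, innermost frame first.
        List<List<String>> samples = List.of(
                List.of("Foo", "main"),
                List.of("read", "Foo", "main"),
                List.of("Bar", "main"),
                List.of("Foo", "Bar", "main"));
        Set<String> blocking = Set.of("read");

        System.out.println(inclusiveFraction(samples, "Foo"));     // 0.75
        System.out.println(selfFraction(samples, "Foo"));          // 0.5
        System.out.println(cpuFraction(samples, "Foo", blocking)); // 0.5
    }
}
```

With only 10 or 20 samples the fractions are coarse, but anything that shows up on a large share of them is worth looking at.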
I don't need anything pretty. I either run it under the IDE and collect them that way, or use something like jstack that snapshots the stack of a running app.
That's the random-pause technique.