I need help converting PDF to XML using PHP. Some sites claim to do this, but they charge for it, so I have to write my own PHP code for it. Being a novic
PDFX does PDF-to-XML conversion and it's free to use. It might be helpful in your case as it can extract things like images and captions separately.
Example input/output can be found here.
The usage page includes a simple PHP client example.
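For orientation, here is a rough sketch of what such a client could look like using PHP's cURL and SimpleXML extensions. It is not the example from the usage page. The endpoint constant is a hypothetical placeholder, and the request shape (raw PDF bytes in a POST body with a `Content-Type: application/pdf` header, XML in the response) is an assumption, so check both against the usage page. The function names `pdfxCurlOptions` and `pdfToXml` are made up for illustration.

```php
<?php
// Hypothetical placeholder: replace with the endpoint given on the usage page.
const PDFX_ENDPOINT = 'https://pdfx.example.invalid/';

/**
 * Build the cURL options for uploading raw PDF bytes.
 * Assumption: the service accepts the PDF as the POST body.
 */
function pdfxCurlOptions(string $pdfBytes): array
{
    return [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $pdfBytes,                        // raw body, not form fields
        CURLOPT_HTTPHEADER     => ['Content-Type: application/pdf'],
        CURLOPT_FOLLOWLOCATION => true,                             // follow redirects, if any
        CURLOPT_RETURNTRANSFER => true,                             // return the body as a string
        CURLOPT_TIMEOUT        => 120,                              // conversion may be slow
    ];
}

/**
 * Upload a PDF file and parse the response as XML.
 */
function pdfToXml(string $pdfPath, string $endpoint = PDFX_ENDPOINT): SimpleXMLElement
{
    $pdfBytes = file_get_contents($pdfPath);
    if ($pdfBytes === false) {
        throw new RuntimeException("Cannot read $pdfPath");
    }

    $ch = curl_init($endpoint);
    curl_setopt_array($ch, pdfxCurlOptions($pdfBytes));
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    $error  = curl_error($ch);

    if ($body === false || $status !== 200) {
        throw new RuntimeException("Conversion failed (HTTP $status): $error");
    }

    $xml = simplexml_load_string($body);
    if ($xml === false) {
        throw new RuntimeException('Response was not well-formed XML');
    }
    return $xml;
}
```

With the real endpoint filled in, something like `echo pdfToXml('paper.pdf')->asXML();` would print the converted document. Keeping the option-building in its own function makes it easy to adjust headers or timeouts without touching the request logic.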
(Disclosure: It is my system.)