Short answer:
A* uses heuristics to optimize the search. That is, you are able to define a function that to some degree can estimate the cost from one node to the target. This is particulary useful when you are searching for a path on a geographical representation (map) where you can, for instance, guess the distance to the target from a given graph node. Hence, typically A* is used for path finding in games etc. Where Djikstra is used in more generic cases.
No, A* won't always give the best path.
If heuristic is the "geographical" distance, the following example might give the non optimal path.
[airport] - [road] - [start] -> [road] -> [road] -> [road] -> [road] -> [target] - [airport]
|----------------------------------------------------------------|