I've implemented a program able to solve the n-puzzle problem with A*. Since the space of the states is too big I cannot precompile it and I have to calculate the possible states at runtime. In this way A* works fine for a 3-puzzle, but for a 4-puzzle can take too long. Using Manhattan distance adjusted with linear conflicts, if the optimal solution requires around 25 moves is still fast, around 35 takes 10 seconds, for 40 takes 180 seconds. I haven't tried more yet.
I think that's because I must keep all visited states, since I'm using functions admissible but (I think) not consistent (i tried also with Hamming and Gaschnig distances and a few more). Since the space of the solution is a graph the heuristic must also be consistent, otherwise the algorithm can loop or be not optimal. That's why I keep all visited nodes (it's also written in the book "AI: A modern approach"). But anyway, this storage does not slow at all. What slows is keeping the queue of nodes to be visited ordered.
So I decided to try IDA* that, as I saw, does not require this storage (but still I have to keep all visited states to avoid loops). It's faster for solutions that require 35 or less moves, but for 40 it's much slower.
Here is my code. Am I doing something wrong?
public static State solveIDAStar(State initialState) {
    int limit = initialState.getManhattanDistance() + 2 * initialState.getLinearConflicts();
    State result = null;
    while(result == null) {
        visitedStates.add(initialState); // It's a global variable
        result = limitedSearch(initialState, limit);
        limit = newLimit;
        visitedStates.clear();
    }
    return result;
}
public static State limitedSearch(State current, int limit) {
    for(State s : current.findNext()) {
        if(s.equals(GOAL)) {
            s.setParent(current);
            return s;
        }
        if(!visitedStates.contains(s)) {
            s.setPathCost(current.getPathCost() + 1);
            s.setParent(current);
            int currentCost = s.getManhattanDistance() + 2 * s.getLinearConflicts() + s.getPathCost();
            if(currentCost <= limit) {
                visitedStates.add(s);
                State solution = limitedSearch(s, limit);
                if(solution != null)
                    return solution;
            } else {
                if(currentCost < newLimit)
                    newLimit = currentCost;
            }
        }
    }
    return null;
}
Old stuff moved down.
Changes so that newLimit can skip steps (the bestSolution stuff):
State bestSolution; // add this global
public static State solveIDAStar(State initialState) {
    int limit = initialState.getManhattanDistance() + 2 * initialState.getLinearConflicts();
    bestSolution = null; // reset global to null
    State result = null;
    while(result == null) {
        visitedStates.add(initialState); // It's a global variable
        newLimit = INFINITY;
        result = limitedSearch(initialState, limit);
        limit = newLimit;
        visitedStates.clear();
    }
    return result;
}
public static State limitedSearch(State current, int limit) {
    for(State s : current.findNext()) {
        if(s.equals(GOAL)) {
            s.setParent(current);
            return s;
        }
        if(!visitedStates.contains(s)) {
            s.setPathCost(current.getPathCost() + 1);
            s.setParent(current);
            int currentCost = s.getManhattanDistance() + 2 * s.getLinearConflicts() + s.getPathCost();
            if(currentCost <= limit) {
                visitedStates.add(s);
                State solution = limitedSearch(s, limit);
                if(solution != null &&
                   (bestSolution == null || solution.getPathCost() < bestSolution.getPathCost()))
                        bestSolution = solution; // cache solution so far
            } else {
                if(currentCost < newLimit)
                    newLimit = currentCost;
            }
        }
    }
    return null;
}
Original answer
So I found an open source implementation. Miraculously, it is also in java.
The application can be tested here: http://n-puzzle-solver.appspot.com/
And the source code specifically relevant is: http://code.google.com/p/julien-labs/source/browse/trunk/SlidingPuzzle/src/be/dramaix/ai/slidingpuzzle/server/search/IDAStar.java
Not sure how much the 1st change suggested below might change the time taken, but I am quite sure that you need to make the 2nd change.
First change
By comparing the code, you will find that this function
private Node depthFirstSearch(Node current, int currentCostBound, State goal)
is basically your function here
public static State limitedSearch(State current, int limit)
and Julien Dramaix's implementation doesn't have:
if(!visitedStates.contains(s)) {
    ...
    visitedStates.add(s);
So take those two lines out to test.
2nd change
Your function public static State solveIDAStar(State initialState) does something weird in the while loop.
After you fail once, you set the maximum depth (limit) to infinity. Basically, 1st iteration, you try find a solution as good as your heuristic. Then you try to find any solution. This is not iterative deepening.
Iterative deepening means every time you try, go a little bit deeper.
Indeed, looking at the while loop in public PuzzleSolution resolve(State start, State goal), you will find nextCostBound+=2;. That means, every time you try, try find solutions with up to 2 more moves.
Otherwise, everything else looks similar (although your exact implementation of the State class might be slightly different).
If it works better, you might also want to try some of the other heuristics at http://code.google.com/p/julien-labs/source/browse/#svn%2Ftrunk%2FSlidingPuzzle%2Fsrc%2Fbe%2Fdramaix%2Fai%2Fslidingpuzzle%2Fclient.
The heuristics are found in the server/search/heuristic folder.
A small issue: you said "What slows is keeping the queue of nodes to be visited ordered.". In this case you can use a "Priority Heap". This is a partial ordered queue that always return the min (or max) item in e queue, insertions, retrieves - removes are O(log n), so this can make a bit fast your initial A* algorithm. Here a send you a simple implementation, but is made in C#, you need to translate it to Java...
public class PriorityHeap<T>
{
    private int count;
    private int defaultLength = 10;
    private PriorityHeapNode[] array;
    private bool isMin;
    /// <summary>
    /// 
    /// </summary>
    /// <param name="isMin">true si quiere que la ColaHeap devuelva el elemento de menor Priority, falso si quiere que devuelva el de mayor</param>
    public PriorityHeap(bool isMin)
    {
        this.count = 0;
        this.isMin = isMin;
        this.array = new PriorityHeapNode[defaultLength];
    }
    public PriorityHeap(bool isMin, int iniLength)
    {
        this.count = 0;
        this.isMin = isMin;
        this.defaultLength = iniLength;
        this.array = new PriorityHeapNode[defaultLength];
    }
    public class PriorityHeapNode
    {
        T valor;
        int _priority;
        public PriorityHeapNode(T valor, int _priority)
        {
            this.valor = valor;
            this._priority = _priority;
        }
        public T Valor
        {
            get
            { return this.valor; }
        }
        public double Priority
        {
            get
            { return this._priority; }
        }
    }
    public int Count
    { get { return this.count; } }
    /// <summary>
    /// Devuelve true si la cola devuelve el valor de menor Priority, falso si el de mayor
    /// </summary>
    public bool IsMin
    { get { return isMin; } }
    /// <summary>
    /// Devuelve y quita el Valor Minimo si la cola lo permite,si no, retorna null
    /// </summary>
    /// <returns></returns>
    public PriorityHeapNode GetTopAndDelete()
    {
        PriorityHeapNode toRet;
        if (count > 0)
        {
            if (count == 1)
            {
                toRet = array[0];
                array[0] = null;
                count--;
                return toRet;
            }
            else
            {
                toRet = array[0];
                array[0] = array[count - 1];
                array[count - 1] = null;
                HeapyfiToDown(0);
                count--;
                return toRet;
            }
        }
        else return null;
    }
    /// <summary>
    /// Devuelve el tope pero no lo borra
    /// </summary>
    /// <returns></returns>
    public PriorityHeapNode GetTop()
    {
        return array[0];
    }
    public void Insert(PriorityHeapNode p)
    {
        if (array.Length == count)
            Add(p);
        else array[count] = p;
        count++;
        HeapyfiToUp(count - 1);
    }
    public void Clear()
    {
        count = 0;
    }
    #region Private Functions
    private int GetFather(int i)
    {
        return ((i + 1) / 2) - 1;
    }
    private int GetRightSon(int i)
    { return 2 * i + 2; }
    private int GetLeftSon(int i)
    { return 2 * i + 1; }
    private void Add(PriorityHeapNode p)
    {
        if (array.Length == count)
        {
            PriorityHeapNode[] t = new PriorityHeapNode[array.Length * 2];
            for (int i = 0; i < array.Length; i++)
            {
                t[i] = array[i];
            }
            t[count] = p;
            array = t;
        }
    }
    private void HeapyfiToUp(int i)
    {
        if (isMin)
        {
            int father = GetFather(i);
            if (father > -1 && array[father].Priority > array[i].Priority)
            {
                PriorityHeapNode t = array[father];
                array[father] = array[i];
                array[i] = t;
                HeapyfiToUp(father);
            }
        }
        else
        {
            int father = GetFather(i);
            if (father > -1 && array[father].Priority < array[i].Priority)
            {
                PriorityHeapNode t = array[father];
                array[father] = array[i];
                array[i] = t;
                HeapyfiToUp(father);
            }
        }
    }
    private void HeapyfiToDown(int i)
    {
        if (isMin)
        {
            #region HeapyFi To down Min
            int l = GetLeftSon(i);
            int r = GetRightSon(i);
            if (r < count)
            {
                PriorityHeapNode right = array[r];
                PriorityHeapNode left = array[l];
                int t;
                if (right != null && left != null)
                {
                    t = left.Priority < right.Priority ? l : r;
                }
                else if (right != null)
                    t = r;
                else if (left != null)
                    t = l;
                else return;
                if (array[t].Priority < array[i].Priority)
                {
                    PriorityHeapNode temp = array[t];
                    array[t] = array[i];
                    array[i] = temp;
                    HeapyfiToDown(t);
                }
            }
            else if (l < count)
            {
                PriorityHeapNode left = array[l];
                int t;
                if (left != null)
                    t = l;
                else return;
                if (array[t].Priority < array[i].Priority)
                {
                    PriorityHeapNode temp = array[t];
                    array[t] = array[i];
                    array[i] = temp;
                    HeapyfiToDown(t);
                }
            }
            #endregion
        }
        else
        {
            #region HeapyFi To down NOT Min
            int l = GetLeftSon(i);
            int r = GetRightSon(i);
            if (r < count)
            {
                PriorityHeapNode right = array[r];
                PriorityHeapNode left = array[l];
                int t;
                if (right != null && left != null)
                {
                    t = left.Priority > right.Priority ? l : r;
                }
                else if (right != null)
                    t = r;
                else if (left != null)
                    t = l;
                else return;
                if (array[t].Priority > array[i].Priority)
                {
                    PriorityHeapNode temp = array[t];
                    array[t] = array[i];
                    array[i] = temp;
                    HeapyfiToDown(t);
                }
            }
            else if (l < count)
            {
                PriorityHeapNode left = array[l];
                int t;
                if (left != null)
                    t = l;
                else return;
                if (array[t].Priority > array[i].Priority)
                {
                    PriorityHeapNode temp = array[t];
                    array[t] = array[i];
                    array[i] = temp;
                    HeapyfiToDown(t);
                }
            }
            #endregion
        }
    }
    #endregion
}      
Hope this helps...
来源:https://stackoverflow.com/questions/12033252/iterative-deepening-a-star-ida-to-solve-n-puzzle-sliding-puzzle-in-java