Let's say I have a table with millions of rows. Using JPA, what's the proper way to iterate over a query against that table, such that I don't end up with an in-memory List containing millions of objects?
You can use another "trick": load only a collection of identifiers of the entities you're interested in. Say the identifier is of type long (8 bytes); a list of 10^6 such identifiers then takes around 8 MB. If it is a batch process (one instance at a time), that's bearable. Then just iterate and do the work.
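A minimal sketch of the identifier-only query, assuming a hypothetical entity MyEntity with a Long primary key named id (imports use jakarta.persistence; older JPA setups would use javax.persistence instead):

    import jakarta.persistence.EntityManager;
    import java.util.List;

    public class IdLoader {

        // Loads only the primary keys: 10^6 Long values is roughly 8 MB of heap,
        // which is usually bearable for a single batch process.
        public List<Long> loadIds(EntityManager em) {
            return em.createQuery("select e.id from MyEntity e", Long.class)
                     .getResultList();
        }
    }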
One more remark: you should do this in chunks anyway, especially if you modify records; otherwise the rollback segment in the database will grow.
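Here is one way chunked processing could look, again against the hypothetical MyEntity. The chunk size, the setProcessed modification, and the use of resource-local transactions are all assumptions; committing per chunk is what keeps the rollback segment bounded:

    import jakarta.persistence.EntityManager;
    import jakarta.persistence.EntityManagerFactory;
    import java.util.List;

    public class ChunkedUpdate {

        private static final int CHUNK_SIZE = 1000; // assumption; tune for your setup

        public void updateAll(EntityManagerFactory emf, List<Long> ids) {
            for (int from = 0; from < ids.size(); from += CHUNK_SIZE) {
                List<Long> chunk = ids.subList(from, Math.min(from + CHUNK_SIZE, ids.size()));

                EntityManager em = emf.createEntityManager();
                em.getTransaction().begin();
                try {
                    for (Long id : chunk) {
                        MyEntity entity = em.find(MyEntity.class, id);
                        if (entity != null) {
                            entity.setProcessed(true); // hypothetical modification
                        }
                    }
                    // Committing per chunk keeps the undo/rollback segment from growing unbounded.
                    em.getTransaction().commit();
                } catch (RuntimeException e) {
                    if (em.getTransaction().isActive()) {
                        em.getTransaction().rollback();
                    }
                    throw e;
                } finally {
                    em.close(); // also discards the persistence context between chunks
                }
            }
        }
    }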
As for the setFirstResult/setMaxResults strategy: it will be VERY, VERY slow for results far from the top.
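For reference, this is the offset-based pattern being warned about (against the same hypothetical MyEntity). The database still has to walk past every skipped row, so each page deeper into the result set gets slower:

    import jakarta.persistence.EntityManager;
    import java.util.List;

    public class OffsetPagination {

        // Offset-based paging: the offset grows with every page,
        // and the database must skip all preceding rows each time.
        public List<MyEntity> page(EntityManager em, int pageNumber, int pageSize) {
            return em.createQuery("select e from MyEntity e order by e.id", MyEntity.class)
                     .setFirstResult(pageNumber * pageSize)
                     .setMaxResults(pageSize)
                     .getResultList();
        }
    }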
Also take into consideration that the database is probably operating at read committed isolation, so to avoid phantom reads, load the identifiers first and then load the entities one by one (or 10 by 10, or whatever).
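A rough sketch of that batch loading, reusing the previously loaded ID list and the hypothetical MyEntity; the batch size of 10 is just the figure mentioned above:

    import jakarta.persistence.EntityManager;
    import java.util.List;

    public class BatchLoader {

        private static final int BATCH_SIZE = 10; // "10 by 10"; tune as needed

        // Re-reads entities in small batches from the previously loaded ID list,
        // so each batch reflects the committed state of those rows at read time.
        public void processInBatches(EntityManager em, List<Long> ids) {
            for (int from = 0; from < ids.size(); from += BATCH_SIZE) {
                List<Long> batch = ids.subList(from, Math.min(from + BATCH_SIZE, ids.size()));

                List<MyEntity> entities = em
                        .createQuery("select e from MyEntity e where e.id in :ids", MyEntity.class)
                        .setParameter("ids", batch)
                        .getResultList();

                for (MyEntity entity : entities) {
                    // do the actual work here
                }

                em.clear(); // detach processed entities so the persistence context stays small
            }
        }
    }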