Question
I have the following SQL:
SELECT id, url
FROM link
WHERE visited = false
ORDER BY id
LIMIT 500;
-- 500 is only an example
I'm writing a web crawler, and there is a table of links. This query returns the links to visit, but not all of them, only the number set in the LIMIT clause.
I will use threads. If the first thread executes this query, it should get the first 500 links; if a second thread executes the same query, it should get the next 500. In other words, the first thread gets links 1 to 500, the second gets 501 to 1000, the third gets 1001 to 1500, and so on.
It may not even be threads, but different computers running the same application. I don't know whether I need to add a column to the table to mark a row as in use by another thread/application, or whether I can do this with SQL/the DBMS alone. I'm using PostgreSQL.
Put another way, I need to lock each row a query returns so that it does not appear in another query's results.
Answer 1:
Have you tried FOR UPDATE with RETURNING? The statement below assumes you add a boolean visiting column to the link table:
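-- Claim a batch: flag the rows as in progress and return them in one statement.
update link
set visiting = true
from (
    select id
    from link
    where visiting = false
      and visited = false
    order by id   -- same ordering as the query in the question
    limit 500
    for update    -- lock the selected rows until the transaction ends
) as batch
where batch.id = link.id
returning *;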
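To see the claim-and-flag idea in isolation, here is a minimal Python sketch. It is not the PostgreSQL statement itself: it uses an in-memory SQLite database as a stand-in, and the helper names make_db and claim_batch and the example.com URLs are made up for illustration. SQLite has no FOR UPDATE, so the sketch runs the select and the update inside one transaction on a single connection. It only shows how a visiting flag makes successive calls return disjoint batches.

```python
import sqlite3


def make_db(n_links):
    """Create an in-memory link table with n_links unvisited rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE link (id INTEGER PRIMARY KEY, url TEXT, "
        "visited INTEGER DEFAULT 0, visiting INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO link (id, url) VALUES (?, ?)",
        [(i, f"http://example.com/{i}") for i in range(1, n_links + 1)],
    )
    conn.commit()
    return conn


def claim_batch(conn, batch_size):
    """Pick the next unclaimed rows in id order and flag them as visiting."""
    with conn:  # commits on success, rolls back if an exception is raised
        rows = conn.execute(
            "SELECT id, url FROM link "
            "WHERE visiting = 0 AND visited = 0 "
            "ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        conn.executemany(
            "UPDATE link SET visiting = 1 WHERE id = ?",
            [(row[0],) for row in rows],
        )
    return rows


conn = make_db(1200)
first = claim_batch(conn, 500)   # rows with the lowest 500 ids
second = claim_batch(conn, 500)  # the next 500, since the first are flagged
```

On PostgreSQL 9.5 or later, adding SKIP LOCKED after FOR UPDATE in the subquery should let concurrent workers pass over rows that another transaction has already locked instead of waiting for them.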
Answer 2:
Skip 1500 rows and take the next 500:
SELECT id, url
FROM link
WHERE visited = false
ORDER BY id
LIMIT 500 OFFSET 1500;
http://www.postgresql.org/docs/8.3/interactive/queries-limit.html
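A rough sketch of how each worker might derive its own OFFSET, again using an in-memory SQLite table as a stand-in for PostgreSQL. The fetch_page helper, the zero-based worker_index parameter, and the example.com URLs are assumptions for illustration, not part of the answer. The second call also illustrates a caveat: because the query filters on visited = false, the offsets shift as soon as any rows are marked visited.

```python
import sqlite3


def fetch_page(conn, worker_index, batch_size):
    """Return the ids one worker would see, using OFFSET = worker_index * batch_size."""
    offset = worker_index * batch_size
    cursor = conn.execute(
        "SELECT id FROM link WHERE visited = 0 ORDER BY id LIMIT ? OFFSET ?",
        (batch_size, offset),
    )
    return [row[0] for row in cursor]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE link (id INTEGER PRIMARY KEY, url TEXT, visited INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO link (id, url) VALUES (?, ?)",
    [(i, f"http://example.com/{i}") for i in range(1, 2001)],
)
conn.commit()

# Worker 3 (zero-based) skips 1500 rows, as in the query above.
page3 = fetch_page(conn, 3, 500)

# Once the first worker marks its 500 rows as visited, the same OFFSET
# for worker 1 no longer starts at id 501: the filtered row set has shrunk.
conn.execute("UPDATE link SET visited = 1 WHERE id <= 500")
conn.commit()
shifted = fetch_page(conn, 1, 500)
```

So fixed offsets only partition the table cleanly while the set of unvisited rows stays unchanged between the workers' queries.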
Source: https://stackoverflow.com/questions/5981459/how-i-make-result-of-sql-querys-with-limit-different-in-each-query