Compound OR SQL query (query #1 with WITH) and (query #2) with COUNT() excusions

问题

EDIT

Yes, you are correct. What I am looking for in pseudo code is a list that

(st_area(geom)>0.1) OR (COUNT(*) > 1)

and in words:

return a list, that only has only states that have an area greater than 0.1, but don't exclude that state if it is the only one in the country (usually and island country, which has plenty room for labels next to it). The excluded states are places like Slovenia that has 100 provinces, but a tiny land area (Great Britain is also and offender).

I have a table for the entire world listing all states and provinces (I call the table states, but it also can mean province).

StateName, ContryName, Pop, geometry

The table is on PostGreSQL 9.2 PostGIS 2.0

I need to remove small states (area too small) to label. But if it is an island (one country, one state) then I want to leave it in.

My Naive query is like this, but there is a syntax error:

SELECT s.name,s.admin, st_area(geom)
FROM vector.states s
INNER JOIN (
SELECT ss.admin
FROM vector.states ss
GROUP BY ss.admin
HAVING (COUNT(*) > 1) AND (st_area(ss.geom) > 0.01)
) a ON a.admin = s.admin
ORDER BY s.admin ASC;

this is the syntax error (and I expected this to happen).

ERROR:  column "ss.geom" must appear in the GROUP BY clause or be used in an aggregate function
LINE 7: HAVING (COUNT(*) > 1) AND (st_area(ss.geom) > 0.01)

回答1:

Two problems:

Just like the error message tells you, geom needs to be wrapped in an aggregate function if it is not listed in GROUP BY. You could just use min() ...
You got your logic backwards. It needs to be COUNT(*) = 1 OR ..

But this can be solved more elegantly with an anti-semi-join using NOT EXISTS:

SELECT s.name, s.admin, st_area(geom)
FROM   vector.states s
WHERE  st_area(s.geom) > 0.01           -- state big enough ...
   OR NOT EXISTS (                      -- ... or there are no other counties
         SELECT 1 FROM vector.states s2
         WHERE  s2.admin = s.admin
         AND    s2.pk_column <> s.pk_column  -- exclude self-join
         )
ORDER BY s.admin;

Replace pk_column with your actual primary key column(s).

回答2:

Editing the answer, I didn't understood the problem the 1st time sorry for that

Try this:

SELECT s.name,s.admin, st_area(geom)
FROM vector.states s
WHERE s.admin in (
SELECT ss.admin
FROM vector.states ss
GROUP BY ss.admin
HAVING (COUNT(*) > 1)
)
AND (st_area(s.geom) > 0.01)
ORDER BY s.admin ASC;

Hope it helps!

回答3:

This part: (st_area(ss.geom) > 0.01) doesn't belong in HAVING clause. HAVING is to limit results based on aggregate functions, like that COUNT, but for the rest, you have to use WHERE.

Then, i would use two separate queries to get those different values:

SELECT s.NAME, s.admin, st_area(geom)
FROM vector.states s
INNER JOIN (
  (SELECT ss.admin
    FROM vector.states ss
    WHERE st_area(ss.geom) > 0.01
    GROUP BY ss.admin
    HAVING COUNT(ss.admin) > 1)  
  UNION ALL  
  (SELECT ss.admin
    FROM vector.states ss
    GROUP BY ss.admin
    HAVING COUNT(ss.admin) = 1)
  ) a
ON a.admin = s.admin
ORDER BY s.admin ASC;

来源：https://stackoverflow.com/questions/20151418/compound-or-sql-query-query-1-with-with-and-query-2-with-count-excusions

标签

sql

postgresql

PostGIS