Required to join 2 tables with their FKs in a 3rd table

こ雲淡風輕ζ 提交于 2019-11-26 12:33:24

Foreign keys are not needed to join tables!

So the answer to how to join tables when no FK is between them is to just join them.

The real question is how do we choose to join what tables (or combine in any other way)?

Statements & tables

Every base table comes with a predicate--statement template parameterized by column names. The table value is the rows that make its predicate into a true proposition--statement.

// species [name] yields [woodtype] and has max height [maxht]
Species(name,woodtype,maxht)
// forest [name] has area [size] in area [loc] and owner [comp]
Forest(name,size,loc,comp)
// tree group [numb] is of species [species] in forest [forest] and was planted in [planted] in quadrant [loc] on date [date] with parent tree group [parent]
Tree(species,forest,numb,planted,loc,parent)

A query also has a predicate. Its value also is the rows that make its predicate true. Its predicate is built up according to its FROM, WHERE and other clauses. A table alias names a table value like its base table but with columns prefixed by the alias. So its predicate is its base table's predicate using alias-prefixed columns.

Species s

holds rows satisfying

species [s.name] yields [s.woodtype] and has max height [s.maxht]

CROSS & INNER JOIN put AND between predicates; UNION puts OR between them; EXCEPT inserts AND NOT and ON & WHERE AND in a condition; SELECT renames, adds & drops columns. (Etc for other operators.) So

Species s CROSS JOIN Forest f

holds rows where

    species [s.name] yields [s.woodtype] and has max height [s.maxht]
AND forest [f.name] has area [f.size] in area [f.loc] and owner [f.comp]

(No matter what the constraints are!) If you wanted only the rows above having forests named after their wood type then you would just add a condition via ... WHERE f.name=s.woodtype because that makes the value be rows satisfying ... AND f.name=s.woodtype.

For forests found in the regions coded as "ARCTIC" list the forest name & species name and species wood type found therein. Eliminate any duplicates and order the output by forest name and species name.

That is a big informal predicate that the rows returned are to satisfy. If we try to rewrite it using only the predicates that we have been given plus AND, OR, AND NOT (etc) then we can only do it by ANDing all three given predicates (hence, JOIN of the base table names) and adding AND Forest.loc='ARCTIC' (hence, ON or WHERE that condition).

FKs (etc) and querying (not)

PKs and FKs are special cases of integrity constraints. Given the predicates and what situations can arise, only some database values can arise. That's what integrity constraints describe. They let the DBMS keep out database values that shouldn't arise. (Also, optimize query execution.) Because a name is unique in Species we declare it a key. Ditto for Forest name and Tree numb. Because a species in Tree is a name in Species and name is a key of Species we declare FK Tree.species->Species.name. Ditto for forest and parent. Nothing to do with enabling joins. (Although they imply that a query result satisfies certain constraints, too.)

It does not matter to querying what the constraints are. If there were Tree species values that did not appear as any Species name value because the business rules or Tree or Species predicate(s) were different then there would be no FK Tree.species->Species.name. But each query would continue to return the rows satisfying its predicate as expressed in terms of base table predicates. (Since the possible business situations or predicate(s) would be different, the input rows could be different so output rows could be different.)

What determines query SQL

So the answer to how we choose what tables to join (or combine in any other way) is that we arrange base table names, JOIN, UNION, EXCEPT and WHERE (etc) as appropriate to give a query expression whose predicate is the one we want our rows to satisfy. This is usually taught as something informal to be done by feel but now you know what ties the SQL to the natural language. And constraints are irrelevant.

NOTE: The preceding assumes that we return no duplicates from queries. The reason why there are no duplicates in tables in the relational model is so that the above correspondence between table operators and logic connectives holds. However SQL tables can have duplicates. Where SQL differs from the relational model (in its many ways), there querying becomes less (literally) logical.

Is there any rule of thumb to construct SQL query from a human-readable description?

You could do multiple joins. Link the tree table to your main table forest, then link the species table:

SELECT 
Forest.Fo_name, 
Species.Sp_name, 
Species.Sp_woodtype

FROM 
Forest
INNER JOIN Tree ON Tree.Tr_forest=Forest.Fo_name
INNER JOIN Species ON Tree.Tr_species = Species.sp_name

WHERE 
Fo_loc='ARTIC'

ORDER BY Fo_name, Sp_name

The Tree table is the connection between the Forest table and the Species table. Think of it as two steps:

1) Starting from the Forest table, join to the Tree table (from Forest.Fo_name to Tree.Tr_forest)

2) Now that the Tree is known, join to the Species table (from Tree.species to Species.sp_name)

I'd write the final query like this:

SELECT Forest.Fo_name, Species.Sp_name, Species.Sp_woodtype
FROM Forest
JOIN Tree ON Forest.Fo_name=Tree.Tr_forest
JOIN Species ON Tree.species=Species.sp_name
WHERE Fo_loc='ARTIC'
ORDER BY Fo_name, Sp_name

The ON condition should compare columns from different tables.

Then you just join each table step by step.

SELECT DISTINCT Fo_name, Sp_name, Sp_woodtype
FROM Forest AS f
INNER JOIN Tree AS t ON t.Tr_forest = f.Fo_name
INNER JOIN Species AS s ON t.Tr_speecies = s.Sp_name
WHERE f.Fo_loc = 'ARCTIC'
ORDER BY Fo_name, Sp_name
SELECT Forest.Fo_name, Species.Sp_name, Species.Sp_woodtype
    FROM Forest
        INNER JOIN Tree 
            INNER JOIN Species ON Species.sp_name = Tree.Tr_species
        ON Forest.Fo_name=Tree.Tr_forest
WHERE Fo_loc='ARTIC'
ORDER BY Fo_name, Sp_name

Try the SQL 99 Method

      SELECT DISTINCT F.Fo_name, S.Sp_name, Sp_woodtype
      FROM Forest F, Species S, Tree T
      WHERE F.Fo_name = T.Tr_Forest 
      AND S.Sp_name = Tr_species
      AND f.Fo_loc = 'ARCTIC';

A F S are aliases used to make the SQL shorter and neater.

DISTINCT will remove duplicits.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!