Regular expression to find all table names in a query

一个人的身影 2020-12-03 18:09

I am not that hot at regular expressions, and this has made my little mind melt somewhat.

I am trying to find all the table names in a query. So say I have the query

12 Answers
  • 2020-12-03 18:42

    I tried all of the above, but none worked because I use a wide variety of queries. I'm working with PHP, so I used a PEAR library called SQL_Parser; I hope my solution helps. I was also having trouble with apostrophes and MySQL reserved words, so I decided to strip the whole field list from the query before parsing it.

    function getQueryTable($query) {
        require_once "SQL/Parser.php";
        $parser = new SQL_Parser();
        $parser->setDialect('MySQL');

        // Strip the field list: replace everything before FROM with "SELECT *"
        $queryType = substr(strtoupper(ltrim($query)), 0, 6);
        if ($queryType == 'SELECT') { $query = "SELECT * " . stristr($query, "FROM"); }
        // Drop everything from HAVING onward (compare with false: position 0 is falsy)
        $havingPos = stripos($query, 'HAVING');
        if ($havingPos !== false) { $query = substr($query, 0, $havingPos); }

        $struct = $parser->parse($query);

        $tableReferences = $struct[0]['from']['table_references']['table_factors'];

        // Initialise the result so a query without tables returns an empty array
        $tables = array();
        foreach ((array) $tableReferences as $ref) {
            // Prefix the table with its database name when one is given
            $tables[] = (!empty($ref['database']) ? $ref['database'] . '.' : '') . $ref['table'];
        }

        return $tables;
    }
    
  • 2020-12-03 18:44

    One workaround is to implement a naming convention on tables and views. Then the SQL statement can be parsed on the naming prefix.

    For example:

    SELECT tbltable1.one, tbltable1.two, tbltable2.three
    FROM tbltable1
        INNER JOIN  tbltable2
            ON tbltable1.one = tbltable2.three
    

    Split on whitespace into an array:

    ("SELECT","tbltable1.one,","tbltable1.two,","tbltable2.three","FROM","tbltable1","INNER","JOIN","tbltable2","ON","tbltable1.one","=","tbltable2.three")

    Keep the part of each element to the left of the period:

    ("SELECT","tbltable1","tbltable1","tbltable2","FROM","tbltable1","INNER","JOIN","tbltable2","ON","tbltable1","=","tbltable2")

    Remove elements with symbols:

    ("SELECT","tbltable1","tbltable1","tbltable2","FROM","tbltable1","INNER","JOIN","tbltable2","ON","tbltable1","tbltable2")

    Reduce to unique values:

    ("SELECT","tbltable1","tbltable2","FROM","INNER","JOIN","ON")

    Keep the elements whose first three characters are "tbl":

    ("tbltable1","tbltable2")
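
    The steps above can be sketched in Python roughly as follows. This is only an illustration of the naming-convention idea: the function name `tables_by_prefix` and the default prefix are assumptions, and the sketch additionally strips trailing commas, semicolons, and parentheses so that unqualified names such as `tbltable1,` survive the "remove elements with symbols" step.

```python
import re


def tables_by_prefix(sql, prefix="tbl"):
    """Return the names in `sql` that start with `prefix`, in first-seen order."""
    # 1. Split on whitespace.
    tokens = sql.split()
    # 2. Keep the part left of the first period, then trim list punctuation
    #    (the trimming is an addition to the steps described above).
    tokens = [tok.split(".", 1)[0].strip(",;()") for tok in tokens]
    # 3. Remove elements that still contain symbols (for example "=" or "*").
    tokens = [tok for tok in tokens if re.fullmatch(r"\w+", tok)]
    # 4. Reduce to unique values while preserving order.
    unique = list(dict.fromkeys(tokens))
    # 5. Keep only the names that carry the table prefix.
    return [tok for tok in unique if tok.lower().startswith(prefix.lower())]


query = """
SELECT tbltable1.one, tbltable1.two, tbltable2.three
FROM tbltable1
    INNER JOIN  tbltable2
        ON tbltable1.one = tbltable2.three
"""
print(tables_by_prefix(query))  # ['tbltable1', 'tbltable2']
```

    The approach only works if no column, alias, or string literal happens to start with the same prefix.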

  • 2020-12-03 18:46

    Everything has already been said about how useful such a regex is in an SQL context. If you insist on a regex, and your SQL statements always look like the one you showed (that means no subqueries, joins, and so on), you could use:

    FROM\s+([^ ,]+)(?:\s*,\s*([^ ,]+))*\s+ 
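
    One caveat with a repeated group such as `(?:\s*,\s*([^ ,]+))*`: in most regex engines, including Python's `re`, a capture group inside a repetition keeps only its last match, so a list of three or more tables loses the middle names. A possible workaround, sketched here under the same "no subqueries, joins, or aliases" assumption, is to capture the whole comma-separated list and split it afterwards. The names `FROM_LIST` and `tables_in_from` are made up for this example.

```python
import re

# Capture the entire comma-separated list that follows FROM as one group.
FROM_LIST = re.compile(r"\bFROM\s+([^\s,]+(?:\s*,\s*[^\s,]+)*)", re.IGNORECASE)


def tables_in_from(sql):
    """Return the table names listed after the first FROM, or [] if none."""
    match = FROM_LIST.search(sql)
    if match is None:
        return []
    # Split the captured list on commas and trim the surrounding whitespace.
    return [name.strip() for name in match.group(1).split(",")]


print(tables_in_from("SELECT a, b FROM t1, t2 , t3 WHERE a = 1"))  # ['t1', 't2', 't3']
```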
    
  • 2020-12-03 18:46

    This will pull out the table name from an INSERT INTO query:

    (?<=(INTO)\s)[^\s]*(?=\(())
    

    The following will do the same for a SELECT, including joins:

    (?<=(from|join)\s)[^\s]*(?=\s(on|join|where))
    

    Finally, going back to the insert: if you want to return just the values held in an INSERT query, use the following regex:

    (?i)(?<=VALUES[ ]*\().*(?=\))
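
    Not every engine accepts these patterns as written. Python's `re`, for instance, rejects the variable-length lookbehind `(?<=VALUES[ ]*\()`. A rough equivalent that uses capture groups instead of lookarounds might look like the sketch below; the pattern and function names are assumptions made for illustration, and the patterns are deliberately naive (no subqueries, quoted identifiers, or multi-row inserts).

```python
import re

# Table name after INTO, up to the first whitespace or opening parenthesis.
INSERT_TABLE = re.compile(r"\bINTO\s+([^\s(]+)", re.IGNORECASE)
# Table names after FROM or JOIN; derived tables in parentheses are skipped.
SELECT_TABLES = re.compile(r"\b(?:FROM|JOIN)\s+([^\s(),;]+)", re.IGNORECASE)
# Everything between the parentheses that follow VALUES.
INSERT_VALUES = re.compile(r"\bVALUES\s*\((.*)\)", re.IGNORECASE | re.DOTALL)


def insert_table(sql):
    match = INSERT_TABLE.search(sql)
    return match.group(1) if match else None


def select_tables(sql):
    return SELECT_TABLES.findall(sql)


def insert_values(sql):
    match = INSERT_VALUES.search(sql)
    return match.group(1) if match else None


insert_sql = "INSERT INTO orders (id, total) VALUES (1, 9.5)"
print(insert_table(insert_sql))   # orders
print(insert_values(insert_sql))  # 1, 9.5
print(select_tables("SELECT * FROM a JOIN b ON a.id = b.id WHERE a.x = 1"))  # ['a', 'b']
```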
    

    I know this is an old thread, but it may help someone else looking around.

    Enjoy

  • 2020-12-03 18:50

    It's definitely not easy.

    Consider subqueries.

    select
      *
    from
      A
      join (
        select
           top 5 *
        from
          B)
        on B.ID = A.ID
    where
      A.ID in (
        select
          ID
        from
          C
        where C.DOB = A.DOB)
    

    There are three tables used in this query.

  • 2020-12-03 18:50

    I think it would be easier to tokenize the string and look for SQL keywords that could bound the table names. You know the names will follow FROM, but they could be followed by WHERE, GROUP BY, HAVING, or no keyword at all if they're at the end of the query.
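
    A minimal sketch of that tokenizing idea in Python is shown below. Everything here is an assumption made for illustration: the token pattern, the keyword sets (which are far from complete), and the function name `find_tables`. It does not understand string literals, comments, quoted identifiers, or constructs such as `EXTRACT(YEAR FROM col)`, so treat it as a starting point rather than a parser.

```python
import re

# Tokens: identifiers (optionally qualified with dots), parentheses, commas.
TOKEN = re.compile(r"[A-Za-z_][\w.]*|[(),]")
# Keywords after which a table name is expected.
STARTERS = {"FROM", "JOIN"}
# Keywords that close a table list (an assumed, incomplete set).
ENDERS = {"SELECT", "WHERE", "GROUP", "HAVING", "ORDER", "LIMIT", "UNION",
          "ON", "USING", "INNER", "LEFT", "RIGHT", "FULL", "CROSS"}


def find_tables(sql):
    """Return table names that follow FROM or JOIN, in first-seen order."""
    tables = []
    expecting = False  # the next identifier should be a table name
    in_list = False    # a comma may introduce another table
    for tok in TOKEN.findall(sql):
        word = tok.upper()
        if word in STARTERS:
            expecting, in_list = True, False
        elif word in ENDERS or tok in ("(", ")"):
            # A bounding keyword or a parenthesis ends the current table list;
            # a subquery's own FROM is picked up as scanning continues.
            expecting, in_list = False, False
        elif tok == ",":
            expecting = in_list
        elif expecting:
            if tok not in tables:
                tables.append(tok)
            expecting, in_list = False, True
        # Any other identifier (alias, column, function name) is ignored.
    return tables


nested = """
select * from A
  join (select top 5 * from B) on B.ID = A.ID
where A.ID in (select ID from C where C.DOB = A.DOB)
"""
print(find_tables(nested))  # ['A', 'B', 'C']
```

    Because parentheses simply reset the state, tables inside subqueries are found without any recursion, but a table listed after a derived table (for example `FROM (SELECT ...) t, other`) is missed.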
