问题
OK, this might seem too tough to be posted here so I beg your pardon. Been working on this for almost a week.
I need to extract all selected columns in a given Oracle SQL String. It should pass the following test cases:
// single column test
select col1 from dual
// ^ should match "col1"
// multiple column test
select col1,col2 from dual
// ^ should match "col1", "col2"
// multiple space test
select col1 , col2 from dual
// ^ should match "col1", "col2"
// "distinct" tests
select distinct col1 from dual
// ^ should match "col1"
select distinct col1, col2 from dual
// ^ should match "col1", "col2"
// "distinct" with whitespaces tests
select distinct col1 from dual
// ^ should match "col1"
select distinct col1 , col2 from dual
// ^ should match "col1", "col2"
// "as" tests
select col1 from dual
// ^ should match "col1"
select colA as col1 from dual
// ^ should match "col1"
select colA as col1, col2, col3 from dual
// ^ should match "col1", "col2", "col3"
select col1, colB as col2, col3 from dual
// ^ should match "col1", "col2", "col3"
select col1, col2, colC as col3 from dual
// ^ should match "col1", "col2", "col3"
// "as" tests with whitespaces tests
select colA as col1, colB as col2, colC as col3 from dual
// ^ should match "col1", "col2", "col3"
// "distinct" with "as" tests
select distinct colA as col1 from dual
// ^ should match "col1"
select distinct colA as col1, colB as col2, col3 from dual
// ^ should match "col1", "col2", "col3"
select distinct colA as col1, col2, colC as col3 from dual
// ^ should match "col1", "col2", "col3"
// function test
select funct('1','2') as col1 from dual
// ^ should match "col1"
select col1, funct('1','2') as col2 from dual
// ^ should match "col1", "col2"
select col1, colB as col2, funct('1','2') as col3 from dual
// ^ should match "col1", "col2", "col3"
I tried the following RegEx in Java
((?<=select\ )(?!distinct\ ).*?(?=,|from))
((?<=select\ distinct\ ).*?(?=,|from))
((?<=as\ ).*?(?=,|from))
((?<=,\ ).*?(?=,|from))(?!.*\ as\ ) // <- Right, I'm guessing here
OR-ed them together but I can't simply pass all the test cases above. (I'm using this tool to validate my Regex).
I tried searching for SQL evaluator but can't find any that extracts all columns without executing it against a real database and that assumes all referenced tables and functions exist.
A Java ReGex, a free SQL Evaluator (that doesn't need a real database) that can pass the tests, or anything better that those two are the acceptable answers. The assumption is that the SQL is always in Oracle 11g format.
回答1:
Given that Oracle SELECT lists can get pretty complex (having to account for all of the cases you mention, plus subqueries, tablename.columnname constructs, quoted aliases, etc), you probably want to go beyond Regular Expressions and actually parse the SQL query then pull the tokens out of the parsed output.
To that end, you have a couple of different options, none of which are all that easy, but may be able to solve your problem
- If you're willing to use Perl, you can probably make SQL::Parser do what you want.
- You can get a 90 free trial download of gsqlparser if you want a java-based solution, which would be helpful if this is a one-time project.
- There is this - SQL92 parser, which is a free download but of unknown license, and I'm not totally sure if it can handle any Oracle-specific weirdness.
- you can use Antlr to generate a SQL parser with a java interface based on this guy's work, which is based on CREATE TABLE syntax but can be adapted readily to handle SELECT syntax (or you can search for
antlr sql grammar
and find a premade one pretty easily)
来源:https://stackoverflow.com/questions/14331263/how-do-i-extract-selected-columns-given-an-oracle-sql-string