[ACCEPTED]-Regular expression to match common SQL syntax?-unit-testing

Accepted answer
Score: 39

Regular expressions can match only the languages that a finite state automaton can parse, which is a very limited class, whereas SQL has a recursive syntax (nested subqueries, parenthesized expressions) that needs a more powerful parser. It can be demonstrated that you can't validate SQL with a regex. So you can stop trying.
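
To make the nesting argument concrete, here is a small sketch using Python's standard `re` module, which has no recursion support. The pattern and the sample strings are made up for illustration: a fixed pattern can be written for any chosen nesting depth, but it breaks as soon as the input nests one level deeper, while a plain counter (which is already more than a finite automaton) copes with any depth.

```python
import re

# Illustrative pattern: accepts a parenthesized group containing at most
# one further level of parentheses. Deeper nesting needs a longer pattern.
TWO_LEVELS = re.compile(r"\((?:[^()]|\([^()]*\))*\)")

def balanced(text: str) -> bool:
    """Check parenthesis balance with a depth counter (any nesting depth)."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ")" appeared before its "("
                return False
    return depth == 0

shallow = "(SELECT a FROM (SELECT a FROM t))"
deep = "(SELECT a FROM (SELECT a FROM (SELECT a FROM t)))"

print(bool(TWO_LEVELS.fullmatch(shallow)), balanced(shallow))
print(bool(TWO_LEVELS.fullmatch(deep)), balanced(deep))
```

Balanced parentheses are only one small part of SQL, so the counter is not a validator either; it just shows where the regex runs out.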

Score: 15

SQL is a type-2 (context-free) grammar, so it is too powerful to be described by regular expressions. It's the same as if you decided to generate C# code and then validate it without invoking a compiler. A database engine is in general too complex to be easily stubbed.

That said, you may try ANTLR's SQL grammars.

Score: 2

As far as I know this is beyond regex, and you're getting close to the dark arts of BNF and compilers.

The same thing happens to people who want to do correct syntax highlighting. You start cramming things into regex and then you end up writing a compiler...

Score: 2

I had the same problem. An approach that would work for the more standard SQL statements would be to spin up an in-memory SQLite database and issue the query against it. If you get back a "table does not exist" error, then your query parsed properly.
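
A minimal sketch of this idea with Python's standard `sqlite3` module. The function name `parses_in_sqlite` is made up for illustration. SQLite words the missing-table error as "no such table: ...", so the sketch treats that message (and "no such column") as "parsed fine, names did not resolve". It only tells you whether SQLite's dialect accepts the statement, and `execute` handles one statement at a time.

```python
import sqlite3

def parses_in_sqlite(sql: str) -> bool:
    """Return True if SQLite got past parsing for a single statement."""
    conn = sqlite3.connect(":memory:")  # empty throwaway database
    try:
        conn.execute(sql)
        return True
    except sqlite3.OperationalError as exc:
        # Name-resolution failures happen after a successful parse.
        return str(exc).startswith(("no such table", "no such column"))
    finally:
        conn.close()

print(parses_in_sqlite("select * from non_existent_table"))
print(parses_in_sqlite("select * frm non_existent_table"))
```

Matching on the message text is fragile, so treat the prefixes as an assumption to check against the SQLite version you actually run.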

Score: 1

Off the top of my head: couldn't you pass the generated SQL to a database, run EXPLAIN on it, and catch any exceptions, which would indicate poorly formed SQL?
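
Here is a rough sketch of the EXPLAIN idea using SQLite from Python, since it needs no server. The `users` table and the `explain_error` helper are hypothetical. In SQLite, prefixing a statement with EXPLAIN makes it return the compiled program instead of running it, so the statement is parsed and its names are resolved without touching any data. Other engines have their own EXPLAIN syntax and behavior.

```python
import sqlite3

# Hypothetical schema, only here so that names can be resolved.
SCHEMA = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"

def explain_error(conn, sql):
    """Return None if EXPLAIN accepts the statement, else the error text."""
    try:
        conn.execute("EXPLAIN " + sql).fetchall()
        return None
    except sqlite3.Error as exc:
        return str(exc)

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

print(explain_error(conn, "SELECT name FROM users WHERE id = 1"))
print(explain_error(conn, "SELECT name FORM users"))
```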

Score: 0

Have you tried lazy quantifiers? Rather than match as much as possible, they match as little as possible, which is probably what you need for quotes.
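
A quick illustration of the difference, using Python's `re` module and a made-up SQL fragment. The greedy `.*` runs from the first quote to the last one, while the lazy `.*?` stops at the nearest closing quote.

```python
import re

text = "WHERE a = 'x' AND b = 'y'"  # made-up fragment for illustration

greedy = re.findall(r"'.*'", text)   # one match spanning both literals
lazy = re.findall(r"'.*?'", text)    # one match per quoted literal

print(greedy)
print(lazy)
```

This still does not handle SQL's doubled-quote escape: a literal such as `'it''s'` is split into two matches by the lazy pattern.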

Score: 0

To validate the queries, just run them with SET NOEXEC ON. That is how Enterprise Manager does it when you parse a query without executing it.

Besides, if you are using regex to validate SQL queries, you can be almost certain that you will miss some corner cases, or that the query is invalid for other reasons even if it's syntactically correct.

Score: 0

I suggest creating a database with the same schema, possibly using an embedded SQL engine, and passing the SQL to that.

Score: 0

I don't think that you even need to have the schema created to be able to validate the statement, because the system will not try to resolve object names etc. until it has successfully parsed the statement.

With Oracle as an example, you would certainly get an error if you did:

select * from non_existant_table;

In this case, "ORA-00942: table or view does not exist".

However if you execute:

select * frm non_existant_table;

Then you'll get a syntax error, "ORA-00923: FROM keyword not found where expected".

It ought to be possible to classify errors into parsing errors that indicate incorrect syntax and errors relating to table names, permissions, etc.
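
As a sketch of that classification, assuming you have the error message text from the driver: the function below pulls out the ORA code and looks it up in two small sets. ORA-00942 and ORA-00923 are the codes quoted in this answer. The other codes are my additions from Oracle's error list and the sets are far from exhaustive, so treat them as assumptions to verify. The function name and category labels are made up.

```python
import re

# ORA-00923 and ORA-00942 come from the answer; the rest are assumed
# additions (invalid statement, command not properly ended, missing
# expression, invalid identifier, insufficient privileges).
SYNTAX_CODES = {"ORA-00900", "ORA-00923", "ORA-00933", "ORA-00936"}
NAME_OR_PERMISSION_CODES = {"ORA-00904", "ORA-00942", "ORA-01031"}

def classify_oracle_error(message: str) -> str:
    """Map an Oracle error message to a coarse category."""
    match = re.search(r"ORA-\d{5}", message)
    if match is None:
        return "unknown"
    code = match.group(0)
    if code in SYNTAX_CODES:
        return "syntax"
    if code in NAME_OR_PERMISSION_CODES:
        return "name-or-permission"
    return "unknown"

print(classify_oracle_error("ORA-00923: FROM keyword not found where expected"))
print(classify_oracle_error("ORA-00942: table or view does not exist"))
```

A test could then fail only on the "syntax" category and ignore the name and permission errors that come from running without a schema.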

Add to that the problem of different RDBMSs, and even different versions, allowing different syntaxes, and I think you really have to go to the DB engine for this task.

Score: 0

There are ANTLR grammars to parse SQL. It's really a better idea to use an in-memory database or a very lightweight database such as SQLite. It seems wasteful to me to test whether the SQL is valid only from a parsing standpoint, and much more useful to check the table and column names and the specifics of your query.

Score: 0

The best way is to validate the parameters used to create the query, rather than the query itself. A function that receives the variables can check the length of the strings, valid numbers, valid emails, or whatever. You can use regular expressions to do these validations.
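
Something along these lines, as a rough sketch. The parameter names, the length limits, and the deliberately loose email pattern are all made up for illustration; adjust them to whatever your query builder actually receives.

```python
import re

# Deliberately loose pattern: something@something.something, no spaces.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
AGE_RE = re.compile(r"[0-9]{1,3}")

def validate_params(name: str, age: str, email: str) -> list:
    """Return a list of problems; an empty list means the inputs look sane."""
    errors = []
    if not 1 <= len(name) <= 50:
        errors.append("name must be 1 to 50 characters")
    if AGE_RE.fullmatch(age) is None:
        errors.append("age must be a number of up to 3 digits")
    if EMAIL_RE.fullmatch(email) is None:
        errors.append("email does not look valid")
    return errors

print(validate_params("Ann", "34", "ann@example.com"))
print(validate_params("", "abc", "not-an-email"))
```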
