Regular expression to detect semi-colon terminated C++ for & while loops
You could write a small, very simple routine that does it without using a regular expression:
- Set a position counter pos so that it points to just before the opening bracket after your for or while.
- Set an open-brackets counter openBr to 0.
- Now keep incrementing pos, reading the character at each position; increment openBr when you see an opening bracket and decrement it when you see a closing bracket. That will increment it once at the beginning, for the first opening bracket in "for (", increment and decrement it some more for any brackets in between, and set it back to 0 when your for bracket closes.
- So, stop when openBr is 0 again.
The stopping position is the closing bracket of for(...). Now you can check whether a semicolon follows it or not.
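The steps above could be sketched in C++ roughly as follows. The function names findClosingBracket and loopEndsWithSemicolon are made up for this illustration, and the sketch assumes the caller already knows the index of the opening bracket. It also does not skip brackets inside string literals, character literals, or comments, which a real tool would have to handle.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns the index of the ')' that closes the '(' at openPos,
// or std::string::npos if the brackets never balance.
// Simplification: brackets inside literals and comments are counted too.
std::size_t findClosingBracket(const std::string& src, std::size_t openPos) {
    assert(openPos < src.size() && src[openPos] == '(');
    int openBr = 0;  // number of currently open brackets
    for (std::size_t pos = openPos; pos < src.size(); ++pos) {
        if (src[pos] == '(') {
            ++openBr;
        } else if (src[pos] == ')') {
            --openBr;
            if (openBr == 0) {
                return pos;  // the bracket opened at openPos has closed
            }
        }
    }
    return std::string::npos;
}

// True if the first non-whitespace character after the closing
// bracket of the loop header is a semicolon.
bool loopEndsWithSemicolon(const std::string& src, std::size_t openPos) {
    std::size_t pos = findClosingBracket(src, openPos);
    if (pos == std::string::npos) {
        return false;
    }
    ++pos;
    while (pos < src.size() &&
           std::isspace(static_cast<unsigned char>(src[pos]))) {
        ++pos;
    }
    return pos < src.size() && src[pos] == ';';
}
```

For the text "for (int i = 0; i < f(n); ++i) ;" and the index of its first opening bracket, loopEndsWithSemicolon returns true; for "while (g(x)) { h(); }" it returns false.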
In C/C++ why does the do while(expression); need a semi colon?
Because you're ending the statement. A statement ends either with a block (delimited by curly braces), or with a semicolon. "do this while this" is a single statement, and can't end with a block (because it ends with the "while"), so it needs a semicolon just like any other statement.
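As a small illustration (countDigits is a made-up example function, not something from the question), the do statement below ends with the while condition rather than with a block, so a semicolon is needed to close it:

```cpp
// Counts the decimal digits of n; the body runs at least once,
// so 0 is reported as having one digit.
int countDigits(int n) {
    int digits = 0;
    do {
        ++digits;
        n /= 10;
    } while (n != 0);  // this semicolon ends the whole do-while statement
    return digits;
}
```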
why there is semicolon after loop while();
The ; is just a null statement: it is a no-op, but it is the body of the while loop. From the draft C99 standard, section 6.8.3 Expression and null statements:
A null statement (consisting of just a semicolon) performs no operations.
and a while statement is defined as follows in section 6.8.5 Iteration statements:
while ( expression ) statement
So in this case the statement of the while loop is ;.
The main effect of the while loop is here:
string1[i++] == string2[j++]
        ^^^             ^^^
So each iteration of the loop increments i and j until the whole condition:
string1[i++] == string2[j++] && string1[i-1] != 0 && string2[j-1] != 0
evaluates to false.
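To see the null statement at work, the loop can be wrapped in a small function. This is only a sketch: the name commonPrefixLength and the choice to return i - 1 are assumptions made for the example, since the question's surrounding code is not shown here.

```cpp
#include <cassert>
#include <cstddef>

// Returns how many leading characters two C strings have in common.
// All the work happens in the loop condition; the body is a null statement.
std::size_t commonPrefixLength(const char* string1, const char* string2) {
    assert(string1 != nullptr && string2 != nullptr);
    std::size_t i = 0, j = 0;
    while (string1[i++] == string2[j++] &&
           string1[i - 1] != 0 && string2[j - 1] != 0)
        ;  // null statement: nothing to do, the condition has the side effects
    // i was incremented once more by the comparison that ended the loop,
    // so the last matching position is i - 1.
    return i - 1;
}
```

Because the increments happen inside the comparison, i and j have already moved past the position that ended the loop, which is why the function subtracts one.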
Why do { } while(condition); needs semicolon at the end of it but while(condition) {} doesn't?
You put a semicolon after all statements except the block statement. This is the reason that you place it after the while in do while, but not after the block in while {...}.
You also use it to terminate almost all declarations. The only exceptions I can think of at the moment are function bodies, and namespace bodies in C++.
semicolon after the for loop block
The semicolon is an empty expression statement.
From section 6.2 of the C++ standard:
The expression is a discarded-value expression (Clause 5). All side
effects from an expression statement are completed before the next
statement is executed. An expression statement with the expression
missing is called a null statement. [ Note: Most statements are
expression statements — usually assignments or function calls. A null
statement is useful to carry a label just before the } of a compound
statement and to supply a null body to an iteration statement such as
a while statement (6.5.1). —end note ]
This will be more clear with some reformatting:
#include <iostream>
int main(){
    for(int i=0; i<5; ++i){
        std::cout <<"Hello"<<std::endl;
    }
    ;
}
The presence of this null statement has no effect on the program.
What kind of statements don't require semicolon termination in C++?
Yes, it's covered in section 6, "Statement" of the C++ standard (section 6 of C++03, it may have changed in C++11 but I don't have access to that one at the moment).
There are a large number of statement types and not all of them need to be terminated. For example, the following if is a selection statement:
if (i == 1) {
doSomething();
}
and there is no requirement to terminate that with a semi-colon.
Of the different statements covered, the requirements are:
Statement type           Termination required?
==============           =====================
labelled statement       N (a)
expression               Y
compound statements      N (a)
selection statements     N (a)
iteration statements     N (a) (b)
jump statements          Y
declaration statement    Y
(a) Although it may sometimes appear that these are terminated with a semi-colon, that's not the case. The statement:
if (i == 1) doSomething();
has the semi-colon terminating the inner expression statement, not the selection statement, something that should be obvious when you examine the first code segment above, which has it inside the {} braces.
(b) do requires the semi-colon after the while expression.
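The table can be read against a short function such as the one below. sumOfEvens is a hypothetical example written for this purpose; the comments mark which kind of statement each semicolon (or missing semicolon) belongs to.

```cpp
// Adds up the even numbers from 0 to limit (inclusive).
int sumOfEvens(int limit) {
    int total = 0;                      // declaration statement: ';' required
    for (int i = 0; i <= limit; ++i) {  // iteration statement: no ';' after its block
        if (i % 2 == 0)                 // selection statement with a single-statement body
            total += i;                 // this ';' ends the inner expression statement
    }                                   // compound statement: closed by '}', no ';'
    return total;                       // jump statement: ';' required
}
```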
RegEx split string on a delimiter (semi-colon ;) except those that appear inside a string
The regular expression pattern ((?:(?:'[^']*')|[^;])*); should give you what you need. Use a while loop and Matcher.find() to extract all the SQL statements. Something like:
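import java.util.regex.Matcher;
import java.util.regex.Pattern;

// s holds the SQL text to split
Pattern p = Pattern.compile("((?:(?:'[^']*')|[^;])*);");
Matcher m = p.matcher(s);
int cnt = 0;
while (m.find()) {
    // group(1) is the statement without its terminating ';'
    System.out.println(++cnt + ": " + m.group(1));
}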
Using the sample SQL you provided, this will output:
1: CREATE OR REPLACE PROCEDURE Proc
AS
b NUMBER:=3
2:
c VARCHAR2(2000)
3:
begin
c := 'BEGIN ' || ' :1 := :1 + :2; ' || 'END;'
4:
end Proc
If you want to get the terminating ; as well, use m.group(0) instead of m.group(1).
For more information on regular expressions, see the Pattern JavaDoc and this great reference. Here's a synopsis of the pattern:
(              Start capturing group
  (?:          Start non-capturing group
    (?:        Start non-capturing group
      '        Match the literal character '
      [^']     Match a single character that is not '
      *        Greedily match the previous atom zero or more times
      '        Match the literal character '
    )          End non-capturing group
    |          Match either the previous or the next atom
    [^;]       Match a single character that is not ;
  )            End non-capturing group
  *            Greedily match the previous atom zero or more times
)              End capturing group
;              Match the literal character ;
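The same pattern also appears to work with C++'s std::regex in its default ECMAScript grammar, which supports non-capturing groups. The sketch below is not part of the original answer; splitStatements is a made-up helper name, and it assumes every statement in the input ends with a semicolon.

```cpp
#include <regex>
#include <string>
#include <vector>

// Splits text on ';' while keeping semicolons inside single-quoted
// strings intact. Each returned entry excludes its terminating ';'.
std::vector<std::string> splitStatements(const std::string& sql) {
    // Same pattern as above, written as a raw string literal.
    static const std::regex pattern(R"(((?:(?:'[^']*')|[^;])*);)");
    std::vector<std::string> statements;
    for (std::sregex_iterator it(sql.begin(), sql.end(), pattern), end;
         it != end; ++it) {
        statements.push_back((*it)[1].str());  // group 1: text before the ';'
    }
    return statements;
}
```

Leading whitespace and line breaks after a semicolon stay part of the next statement, just as in the Java output shown earlier.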