c# regex-match backslashes in strings

My suggestion: first find a “safe” character that’s guaranteed not to show up in the original string, like “_”. Replace all backslashes with it, then proceed.

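The problem with backslashes is the unnecessary complication. Here I want to match “one-or-more backslashes”. In the end I need to put 4 backslashes in the pattern to represent that “one”: the C# string literal halves them to 2, and the regex engine reads those 2 as one literal backslash.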

var ret = Regex.Replace(@"any number of\\\backslashes", "(.+\\\\+)?(.+)", "$1 - $2");

Alternatively, I could use @ to reduce the complexity: @"(.+\\+)?(.+)"

Disappointingly, the @ does only a partial job. We still need 2 backslashes, because the regex engine itself wants a literal backslash escaped. Confusing! I’d rather just remember one simple rule and avoid the @ altogether.
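The same four-backslash rule shows up in Java, whose ordinary string literals escape backslashes the same way non-verbatim C# literals do. Below is a minimal Java sketch mirroring the Regex.Replace call above. The class and method names are made up for illustration.

```java
class BackslashDemo {
    // Hypothetical helper: inserts " - " after the last run of backslashes.
    static String splitAtBackslashes(String text) {
        // Source literal "\\\\" -> string \\ -> regex "one literal backslash".
        return text.replaceAll("(.+\\\\+)?(.+)", "$1 - $2");
    }

    public static void main(String[] args) {
        // Six backslashes in the source are three in the actual string.
        String input = "any number of\\\\\\backslashes";
        System.out.println(splitAtBackslashes(input));
        // prints: any number of\\\ - backslashes
    }
}
```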

java regex — replace with captured substring but modified

Any time you have a string with lots of x.xx000001 or x.xx99999, it’s probably noise you want to get rid of. Here’s a java solution. Perl can do this in 1 line (at most 2).

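    // Requires java.util.regex.Matcher and java.util.regex.Pattern.
    // Declared at class level: Java does not allow "static" on a local variable.
    private static final Pattern PATTERN9999 = Pattern
            .compile("(\\d\\.\\d*)([0-8]9999+)(\\d\\s)");

    public static String cleanUp999or000(String orig) {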
        Matcher m = PATTERN9999.matcher(orig);
        StringBuffer sb = new StringBuffer();
        String without999 = orig;
        String without999_or_000 = orig;
        try {
            while (m.find()) {
                final long intEndingIn999 = Long.parseLong(m.group(2));
                final long intEndingIn000 = intEndingIn999 + 1;
                System.out.println(intEndingIn000);
                m.appendReplacement(sb, m.group(1) + intEndingIn000 + m.group(3));
            }
            m.appendTail(sb);
            without999 = sb.toString();
        } catch (NumberFormatException e) {
            e.printStackTrace();
            without999 = orig;
        } finally {
            without999_or_000 = without999.replaceAll(
                "(\\d\\.\\d+?)0000+\\d(\\s)", "$1$2");
        }
        return without999_or_000;
    }
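For comparison, here is a shorter sketch of the same two-step idea, assuming Java 9 or later, where Matcher.replaceAll accepts a function that computes each replacement from the match. The class and method names are hypothetical. Unlike the method above, it does not catch NumberFormatException, so an extremely long digit run would propagate the error.

```java
import java.util.regex.Pattern;

class NoiseCleanup {
    private static final Pattern NINES =
            Pattern.compile("(\\d\\.\\d*)([0-8]9999+)(\\d\\s)");

    static String cleanUp(String orig) {
        // Step 1: bump "x9999" to "(x+1)0000", keeping the groups around it.
        String bumped = NINES.matcher(orig).replaceAll(mr ->
                mr.group(1) + (Long.parseLong(mr.group(2)) + 1) + mr.group(3));
        // Step 2: drop the run of zeros and the single noise digit after it.
        return bumped.replaceAll("(\\d\\.\\d+?)0000+\\d(\\s)", "$1$2");
    }

    public static void main(String[] args) {
        System.out.println(cleanUp("price 1.2399999 end"));  // price 1.24 end
        System.out.println(cleanUp("ratio 3.14000001 end")); // ratio 3.14 end
    }
}
```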

lookaround assertions across languages

/(?<!->)\bparentDataStore/ ## is a perl regex with a negative lookbehind. It disqualifies "->parentDataStore"

---
(class|struct)\s+MyClass\b(?!;) ## is a perl regex for a class definition. The trailing negative lookahead (?!;) ensures we don’t match a forward class declaration.
---
Below is a Nov 2010 java example:

replaceAll("(?<=</?)MTSMessage", "SIG_Notification")

The positive lookbehind assertion above, with its optional “/”, says to match (and replace) the “MTSMessage” string provided it’s preceded by “<” or “</”
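The three lookaround patterns above can be tried side by side in Java. This is only a sketch: the two Perl patterns are carried over with Java string escaping, and the class, field and method names are invented for the demo.

```java
import java.util.regex.Pattern;

class LookaroundDemo {
    // Negative lookbehind: "parentDataStore" not preceded by "->".
    static final Pattern NOT_ARROW =
            Pattern.compile("(?<!->)\\bparentDataStore");
    // Negative lookahead: a definition, not a forward declaration ending in ";".
    static final Pattern CLASS_DEF =
            Pattern.compile("(class|struct)\\s+MyClass\\b(?!;)");

    // Positive lookbehind: rename the tag only right after "<" or "</".
    static String renameTag(String xml) {
        return xml.replaceAll("(?<=</?)MTSMessage", "SIG_Notification");
    }

    public static void main(String[] args) {
        System.out.println(NOT_ARROW.matcher("obj->parentDataStore").find()); // false
        System.out.println(NOT_ARROW.matcher("obj.parentDataStore").find());  // true
        System.out.println(CLASS_DEF.matcher("class MyClass;").find());       // false
        System.out.println(CLASS_DEF.matcher("class MyClass {").find());      // true
        System.out.println(renameTag("<MTSMessage>x</MTSMessage>"));
        // prints: <SIG_Notification>x</SIG_Notification>
    }
}
```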

LookAhead is simpler than LookBehind — Compare the syntax. Some languages only support lookAhead.

I feel negative lookAhead is more useful than positive lookAhead. I feel these zero-width assertions are useful in progressive matches, but I seldom need complex progressive matches.

If you use lookaround you may want to start with sample code and make incremental changes. Plausible but incorrect lookaround patterns abound. This is a time you need to understand how regex engines work.

http://www.regular-expressions.info/lookaround.html
* explains how (not) to capture a back-reference in a lookAhead.

group() – progressive match in java

google interview, 1st whiteboard coding question

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Truncate a sentence to at most `size` characters without cutting a word.
public class Main {
    static final int size = 20;
    static Pattern p = Pattern.compile("[a-zA-Z]+");
    static Matcher m;
    static StringBuffer c; // candidate without extra space
    static String good;    // longest candidate that still fits

    public static void main(String[] args) {
        test("we all love apples more");
    }

    static void test(String s) {
        m = p.matcher(s);
        c = new StringBuffer();
        good = "";
        while (m.find()) { // each find() resumes where the last match ended
            c.append(m.group());
            if (c.length() > size) break;
            good = c.toString();
            c.append(" ");
        }
        System.out.println(good + "____");
        System.out.println("12345678901234567890123"); // ruler for counting columns
    }
}

matching a stuck-record^nopainnogain

a stuck-record is something like “i love fish i love fish i love fish “.

Q: How do you match a stuck record?
A: /(i love fish )\1+/ seems to be the standard solution in literature
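A quick Java check of that back-reference pattern. The isStuckRecord helper is my own generalization (any unit repeated two or more times across the whole string), not something from the literature, so treat it as an assumption.

```java
class StuckRecord {
    // Hypothetical generalization: the whole string is one unit repeated 2+ times.
    static boolean isStuckRecord(String s) {
        // (.+?) captures a candidate unit; \1+ demands the same text again.
        return s.matches("(.+?)\\1+");
    }

    public static void main(String[] args) {
        String stuck = "i love fish i love fish i love fish ";
        System.out.println(stuck.matches("(i love fish )\\1+")); // true
        System.out.println(isStuckRecord(stuck));                 // true
        System.out.println(isStuckRecord("no pain no gain"));     // false
    }
}
```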

On the other hand, “no pain no gain” are not stuck-records but a …. excel column list — AB-AC-AD-…?

Q: Will \d{3} match 287 ?
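A: Yes. See p 177 [[ programming perl ]]. \d is a character class that matches any one digit, so each of the 3 repetitions can match a different digit, just as each repetition of the dot in .{3} can match a different character.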

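Q: Will (\d\d){2} match 7193 ?
A: Yes. The quantifier repeats the pattern, not the text it captured, so the second pair need not equal the first. To demand the same pair twice you need a back-reference, as in (\d\d)\1.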
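The two answers above can be confirmed with a small Java sketch that contrasts a quantifier with a back-reference. The class and the lastPair helper are hypothetical names used only here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Returns what group 1 holds after (\d\d){2} matches, or null on no match.
    static String lastPair(String s) {
        Matcher m = Pattern.compile("(\\d\\d){2}").matcher(s);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println("287".matches("\\d{3}"));         // true
        System.out.println("7193".matches("(\\d\\d){2}"));   // true
        // A repeated group keeps only its last iteration.
        System.out.println(lastPair("7193"));                // 93
        // A back-reference insists on the same text again.
        System.out.println("7193".matches("(\\d\\d)\\1"));   // false
        System.out.println("7171".matches("(\\d\\d)\\1"));   // true
    }
}
```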

java regex to remove trailing letters (from portNum)

        Pattern p = Pattern.compile("[a-zA-Z]*$");
        Matcher m = p.matcher(this.portNum);

        this.aid =
                "au-" +
                opp.cardSlot + "-" +
                opp.portNum + "-" +
                this.ontSlot + "-" +
                m.replaceAll("");
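Since the fragment above depends on fields that are not shown, here is a self-contained sketch of just the stripping step. The class name, method name and sample values are made up for illustration.

```java
import java.util.regex.Pattern;

class PortNumDemo {
    // Zero or more ASCII letters anchored at the end of the input.
    private static final Pattern TRAILING_LETTERS = Pattern.compile("[a-zA-Z]*$");

    // Hypothetical helper: removes letters only at the end of portNum.
    static String stripTrailingLetters(String portNum) {
        return TRAILING_LETTERS.matcher(portNum).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingLetters("12ab")); // 12
        System.out.println(stripTrailingLetters("7"));    // 7
        System.out.println(stripTrailingLetters("a1b"));  // a1
    }
}
```

Letters that are not at the end, like the leading “a” in the last case, are left alone because of the $ anchor.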