FB 2questions #regex..

—Q1: write a stateless utility function to take in any string consisting of a-z, and return a string with all the original characters appearing once only. All 2nd and subsequent occurrences should be removed. eg: abcbdc → abcd

They don’t mind c/c++/java, but I feel they prefer c-string.

I don’t want to new up a string in my utility, so I decided to take in an output char* buffer which should be large enough.

I guess this is all about robust and fast coding.
—Q2: Suppose there’s a regex grammar that can accept exactly 3 constructs
1) literals — any combination of abcdefghijkljmnopqrstuvwxyz,
2) dot (.) that matches any one character
3) asterisk — a standard quantifier

For example,

“a*” can match empty string, a single a, or aa, or aaa etc.
“.*” matches any string including the empty string.
“a*b*a” should match “a” and also match “aa”, or “ba”, or “aaa”

Now, write a utility function bool match(char const * regex, char const * S);

Note entire string S must match entire regex. We don’t want to get a false positive when only a substring of S matches the regex.
—– some tips—–

  1. Easy part — if regex ends in literals, then they must “eat up” the tail of S exactly.
  2. After this step, if regex ends with a dot, it will just eat up the last char of S. After this step, regex definitely ends in a star.
  3. If 2nd char in regex is not a star, then first char must eat up head of S.
  4. Repeat previous step until we see star as 2nd char.
  5. Now we can basically split the regex into segments. Each segment is either a single literal, a single dot, or a non-star followed by a star.

Interviewer said — First assume dot is not allowed in regex. We will deal with the dot later.

—– analysis
EPI Question 6.23 is very similar to my Facebook interview. I told FB interviewer that we don’t NEED such problem solving at work. He said this is one method (standard method?) they assess a candidate’s coding abilities.

I feel it (and google questions) is like IQ test or probability test or abstract reasoning test — irrelevant to work, but used to assess learning capacity. They don’t want a dumb person on their team.

The truth is, practice is key. There are patterns. A very very smart person will still fail if without enough practice. A mediocre guy can speed up significantly if practised enough. I said the same thing to My Hwa Chong math teacher Tony Tan. This factor is also the reason I became so strong in physics.

Q: A simple regex (SRE) can use alphanumerics, dot and star. It is enclosed between implicit ^ and $. Write a match() function taking a SRE and a word. No white space allowed.

https://github.com/tiger40490/repo1/blob/py1/py/regexFB.py is my implementation. Here are a few test cases:

assert not match(”, ‘.’)
assert match(‘xyab-abc’, ‘.*ab.*c.*x*’)
assert match(‘xyab-abc’, ‘.*ab.*c.*.*’)
assert match(‘xyab-abc’, ‘.*ab.*c’)
assert match(‘aaaaaaab-abc’, ‘a*aab.*c’)
assert match(‘aab-abc’, ‘a*aab.*c’)
assert match(‘aabc’, ‘a*aab.*..*x*’)
assert not match(‘abc’, ‘a*aab.*c’)
assert match(‘a-aab’, ‘a*.*a*aab’)
assert not match(‘a-aab’, ‘a*a*aab’)


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s