I think single-string coding questions are evergreen. The insights and techniques might be …reusable?
Brute force search — shift by one, then attempt to match the entire substring. My code below is more efficient than brute-force. Mostly it shifts by one each time, but checks just one char.
Q: write a python function match(string haystack, string nonRegexNeedle), following the simplified Boyer-Moore algo:
Try matching at the left end. If mismatch, then identify the LEFT-most offending char (J) in haystack. Now locate the RIGHT-most position of J in needle (if not found then simply shift all the way past the J.) and right-shift the needle to align the two J.
Q: So when would we right-shift by one? Not sure. So I don’t know if my code below is 100% correct in all cases.
def visualize(haystack, substring, offset): print '\t--' + '0123456789 123456789 12345-----' print '\t--' + haystack print '\t--' + ' '*offset + substring print '\t--' + '-------------------------------' def match(haystack, substring): offset = 0 #offset from start of haystack, where substring is now while(True): #now locate first offender in haytack offenderOffset, offender = -9999, '' for i, char in enumerate(substring): offenderOffset = offset+i if substring[i] != haystack[offenderOffset]: offender = haystack[offenderOffset] print offender + ' <-- spotted at offenderOffset = ', offenderOffset visualize(haystack, substring, offset) break; if offender == '': return True # now back-scan substring to locate offender, and then shift to align # not really forward-scan haystack while True: offset += 1 if offset + len(substring) > len(haystack): print 'Gave up complately:' visualize(haystack, substring, offset) return False if offenderOffset - offset == -1: print 'gave up on aligning: ' visualize(haystack, substring, offset) break if substring[offenderOffset - offset] == offender: print 'aligned: ' visualize(haystack, substring, offset) break # go back to start of outer while loop print match('aborb bdb', 'abors')