Matches Unicode characters that are word boundaries.
Characters with the following General_Category (gc) property values are treated as word boundaries. This does not fully conform to the Word Boundaries algorithm described in http://unicode.org/reports/tr29: PCRE does not include the Word_Break property table, so this simpler approximation is used instead.
Cc, Cf, Cn, Co, Cs: Other.
Pc, Pd, Pe, Pf, Pi, Po, Ps: Punctuation.
Sc, Sk, Sm, So: Symbols.
Zl, Zp, Zs: Separators.
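The classification above can be sketched in a few lines. This is an illustration only, not the matcher's actual implementation: it uses Python's standard unicodedata module rather than PCRE, and the names BOUNDARY_CATEGORIES, is_boundary_char and split_words are hypothetical helpers introduced for this example.

```python
import unicodedata
from itertools import groupby

# General_Category values listed above as word boundaries
# (hypothetical constant name, chosen for this sketch).
BOUNDARY_CATEGORIES = frozenset({
    "Cc", "Cf", "Cn", "Co", "Cs",              # Other
    "Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps",  # Punctuation
    "Sc", "Sk", "Sm", "So",                    # Symbols
    "Zl", "Zp", "Zs",                          # Separators
})


def is_boundary_char(ch: str) -> bool:
    """Return True if the single character ch falls in a boundary category."""
    return unicodedata.category(ch) in BOUNDARY_CATEGORIES


def split_words(text: str) -> list[str]:
    """Split text into runs of non-boundary characters."""
    return [
        "".join(group)
        for is_boundary, group in groupby(text, key=is_boundary_char)
        if not is_boundary
    ]


print(split_words("Hello, world_42!"))  # ['Hello', 'world', '42']
```

Note that under this scheme the underscore (Pc, connector punctuation) counts as a boundary, which differs from the usual regex \w class, where it is a word character.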
Non-boundary characters