memos/plugin/markdown/parser
Steven 64e9d82d67 fix(parser): support Unicode characters in tags
Fixes #5264

Chinese, Japanese, Korean, and other Unicode characters are now
properly recognized in hashtags, following the standard hashtag
parsing conventions used by Twitter, Instagram, and GitHub.

Changes:
- Updated tag parser to allow Unicode letters and digits
- Tags stop at whitespace and punctuation (both ASCII and CJK)
- Allow dash, underscore, forward slash in tags
- Added comprehensive tests for CJK characters and emoji

Examples:
- #测试 → recognized as tag '测试'
- #日本語 → recognized as tag '日本語'
- #한국어 → recognized as tag '한국어'
- #测试。→ recognized as tag '测试' (stops at punctuation)
- #work/测试/项目 → hierarchical tag with Unicode
2025-11-19 22:06:11 +08:00
..
tag.go fix(parser): support Unicode characters in tags 2025-11-19 22:06:11 +08:00
tag_test.go fix(parser): support Unicode characters in tags 2025-11-19 22:06:11 +08:00