- vendor/chrono: fork of wanasit/chrono as git submodule on feat/vi-locale branch; contains full VI locale (11 parsers, 3 refiners, 41 tests)
- package.json: switch chrono-node to file:./vendor/chrono so VI locale is available at runtime
- tsconfig.json: exclude vendor/ from root tsc to prevent test-file errors
- ChronoParser.ts: add setupCustomChronoVi() and vi case in getParserForLanguage(); vie already mapped in FRANC_TO_LOCALE

PR to upstream: wanasit/chrono#641
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
wanasit
left a comment
Thanks for the change. Could you remove some of the changes I mentioned in the comment?
I don't think this test and its fixtures suit the project.
- It doesn't test specific patterns we want to support.
- Ensuring 85%+ accuracy on a corpus is not useful, and would be difficult to debug if it failed.
- Embedding a corpus also makes the project too large.
Thank you for the feedback — completely agree on all three points.
Removed test/vi/vi_corpus.test.ts and the entire test/vi/fixtures/ directory (19 Wikipedia articles + curated JSON). The corpus added ~13k lines and wasn't testing specific parser patterns.
The VI locale now has 92 targeted unit tests across 16 suites, covering all 11 parsers and 3 refiners — including strict mode, forwardDate, isCertain() assertions, and negative cases. All tests follow the same testSingleCase/testUnexpectedResult conventions used by the EN locale.
  "outDir": "dist/cjs",
- "module": "commonjs"
+ "module": "commonjs",
+ "typeRoots": ["./node_modules/@types"]
Please do not modify the compiler option and dependencies in the same commit. If you think this necessary or good for the project, please create a separate CL.
Done — reverted tsconfig.build.json to match upstream. The typeRoots change was needed to resolve a build issue on my end, but I understand the preference to keep compiler/dependency changes in a separate CL. I'll submit that separately if needed.
Implements full Vietnamese date/time parsing for chrono-node with EN-locale parity. Parsers cover all major Vietnamese temporal patterns extracted from 19 Wikipedia war/history articles (1132 annotated fixtures).

Parsers:
- VIStandardParser: ngày D tháng M năm YYYY / D tháng M năm YYYY
- VIMonthYearParser: tháng M năm YYYY
- VIYearParser: năm YYYY, năm N TCN (BC)
- VICasualDateParser: hôm nay, hôm qua, ngày mai, ngày kia, bây giờ
- VICasualTimeParser: buổi sáng/trưa/chiều/tối/đêm, nửa đêm
- VIWeekdayParser: thứ Hai–CN, t2–t7/cn abbreviations
- VITimeExpressionParser: X giờ Y phút, HH:MM, lúc/vào prefixes, meridiem
- VITimeUnitAgoFormatParser: N ngày/tuần/tháng/năm trước/qua
- VITimeUnitLaterFormatParser: N ngày/tuần/tháng/năm sau/nữa/tới
- VITimeUnitWithinFormatParser: trong (vòng) N ngày/tuần/tháng
- VITimeUnitCasualRelativeFormatParser: tuần này/trước/tới, tháng sau

Refiners:
- VIMergeDateTimeRefiner, VIMergeDateRangeRefiner, VIMergeWeekdayComponentRefiner

Common parsers inherited: ISOFormatParser, SlashDateFormatParser (DD/MM/YYYY)

Tests: 10 test files, 41 cases, all passing.
Wikipedia corpus: 19 articles, 1132 annotated date fixtures in test/vi/fixtures/wikiwars_vi_curated.json
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- VIStandardParser: improve boundary handling
- VICasualTimeParser: fix meridiem mapping and implied hours
- VITimeExpressionParser: correct colon-format minute group
- VIWeekdayParser: fix modifier detection
- vi_standard.test.ts: remove unreliable ngày-32 partial-match assertion
- vi_weekday.test.ts: align modifier expectations
- wikiwars_vi_curated.json: minor annotation correction
… typo Non-capturing (?:...) on modifier group meant match[MODIFIER_GROUP] was always undefined, making next/last weekday logic dead code. Changed to a capturing group so modifier text is available. Also corrected the typo 'quả' (fruit) → 'qua' (past) in the last-weekday branch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
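The dead-code effect described in this commit can be reproduced with a minimal sketch. The two patterns and the `modifierOf` helper below are hypothetical simplifications for illustration, not the actual VIWeekdayParser source; only the `MODIFIER_GROUP` name and the `tới`/`sau`/`qua` modifiers come from the commit messages.

```typescript
// Hypothetical, simplified weekday patterns (not the real VIWeekdayParser source).
const MODIFIER_GROUP = 2;

// Non-capturing modifier: the text is consumed but never exposed as a group,
// so match[MODIFIER_GROUP] is always undefined.
const brokenPattern = /(thứ\s+(?:hai|ba|tư|năm|sáu|bảy)|chủ\s+nhật)(?:\s+(?:tới|sau|qua))?/i;

// Capturing modifier: match[MODIFIER_GROUP] holds "tới", "sau", or "qua" when present.
const fixedPattern = /(thứ\s+(?:hai|ba|tư|năm|sáu|bảy)|chủ\s+nhật)(?:\s+(tới|sau|qua))?/i;

// Returns the modifier text, or undefined when the pattern does not expose one.
function modifierOf(pattern: RegExp, text: string): string | undefined {
    const match = pattern.exec(text);
    return match ? match[MODIFIER_GROUP] : undefined;
}
```

With the non-capturing form, any branch that checks the modifier can never run, which is why the next/last weekday logic had no effect until the group was made capturing.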
…er group index
TIME_UNITS_PATTERN contains zero capturing groups (repeatedTimeunitPattern strips them all to non-capturing). Group layout is:
1 = prefix modifier, 2 = unit (prefix form)
3 = unit (suffix form), 4 = suffix modifier
match[5] was off by one, so suffix-form 'tuần trước' / 'tháng qua' always fell through to an undefined modifier and produced a future date instead of past. Fixed to match[4].
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
In chrono's 12-hour convention, AM = hours 0–11, PM = hours 12–23. Noon (12:00) is PM. The AM assignment caused noon to be interpreted as midnight in downstream meridiem-aware code. Added inline comment to explain the convention for future readers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
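The convention stated in this commit can be written down as a small sketch. The `Meridiem` enum below is a local stand-in defined for this example, and `meridiemForHour` is a hypothetical helper, not a function from the PR.

```typescript
// Local stand-in enum for illustration; values are an assumption of this sketch.
enum Meridiem {
    AM = 0,
    PM = 1,
}

// Maps a 24-hour clock value to its meridiem: 0-11 is AM, 12-23 is PM.
// Noon (12) therefore falls on the PM side, midnight (0) on the AM side.
function meridiemForHour(hour: number): Meridiem {
    if (!Number.isInteger(hour) || hour < 0 || hour > 23) {
        throw new RangeError(`hour out of range: ${hour}`);
    }
    return hour < 12 ? Meridiem.AM : Meridiem.PM;
}
```

Tagging noon as AM is what let downstream meridiem-aware code read 12:00 as midnight.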
… never used All three relative-unit parsers accepted a strictMode constructor param but innerPattern() always returned the casual PATTERN, making the parameter dead code. Added STRICT_PATTERN (aliased to PATTERN — VI has no unit abbreviations so both modes are identical) and switched innerPattern() to return the correct variant. Matches the API contract established by EN/IT locales. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
STRICT_PATTERN was assigned as an alias for PATTERN (since VI has no unit abbreviations) and then used in a ternary that always evaluated the same branch. Remove the alias and dead conditional; move the explanatory comment onto innerPattern() where it belongs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
vi_corpus.test.ts: exercises all 1132 WikiWars-VI fixtures covering full_date, month_year, year_only, slash_date, and bc_year expression types. Accuracy: 1132/1132 (100%). VIYearParser: extend pattern to also match bare 'YYYY TCN' without the 'năm' prefix (e.g. '179 TCN'). Previously only 'năm YYYY TCN' was supported, leaving bare BC year expressions unparsed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
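A minimal sketch of the extended year matching, assuming a simplified pattern: `VI_YEAR_SKETCH` and `parseViYear` are hypothetical names, and returning BC years as negative numbers is an assumption of this example rather than something the commit states.

```typescript
// Hypothetical pattern (not the actual VIYearParser source). Two alternatives:
//   "năm YYYY" with an optional "TCN" suffix, or bare "YYYY TCN" without "năm".
// A bare number with neither "năm" nor "TCN" is deliberately not matched.
const VI_YEAR_SKETCH = /(?:năm\s+(\d{1,4})(?:\s+(TCN))?|(\d{1,4})\s+(TCN))/i;

// Returns the parsed year, negative for BC (an assumption of this sketch),
// or null when the text contains no year expression.
function parseViYear(text: string): number | null {
    const match = VI_YEAR_SKETCH.exec(text);
    if (!match) return null;
    const digits = match[1] || match[3];
    const isBC = Boolean(match[2] || match[4]);
    const year = parseInt(digits, 10);
    return isBC ? -year : year;
}
```

Requiring either the `năm` prefix or the `TCN` suffix keeps bare numbers from being read as years.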
…s, number-words, weekday modifiers

New test files:
- vi_month_year.test.ts: VIMonthYearParser (previously zero coverage): tháng M năm YYYY, tháng M/YYYY slash form, month-only implies year, month > 12 rejected
- vi_casual_time.test.ts: VICasualTimeParser: trưa=PM, bình minh/sáng sớm=AM, chiều/tối/đêm=PM, nửa đêm=AM, buổi prefix, date+time merge
- vi_negative_cases.test.ts: invalid day/month, invalid slash date, bare 4-digit number, phone number false positive

Expanded existing files:
- vi_weekday.test.ts: next (tới/sau) and last (qua) modifier assertions with concrete expected dates
- vi_time_units_ago.test.ts: number-word durations (hai/ba/một)
- vi_time_units_later.test.ts: number-word durations (ba/hai)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- VITimeExpressionParser: move trưa to PM branch (1 giờ trưa → 13:00); add hour=0 guard for 12 giờ sáng → midnight
- VIWeekdayParser: fix capturing group for modifier; correct quả→qua typo; add negative lookahead for 'sau khi' conjunction
- VITimeUnitCasualRelativeFormatParser: make number optional so bare 'tuần này'/'tháng trước' match; use references helpers in casual date
- VIYearParser: support bare 'YYYY TCN' without năm prefix (e.g. 179 TCN)
- VIMonthYearParser: wire MONTH_DICTIONARY for word-form months (tháng ba, tháng giêng, tháng chạp); use references.yesterday/tomorrow
- VICasualDateParser: add hôm kia (-2 days); use references helpers
- README: add vi to supported locales list (lines 40 and 215)
- Tests: add vi_date_range, vi_casual_time, vi_month_year, vi_negative_cases; expand vi_time_exp, vi_weekday, vi_time_units_* (15 test files, 77 tests total)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ayParser Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Maintainer feedback: corpus benchmark tests don't fit the project's testing philosophy, and embedding the corpus makes the project too large. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Maintainer wants compiler/dependency changes in a separate CL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds expect(r.index).toBe(...) checks to sentence-embedded test cases across casual, standard, time expression, weekday, slash, ago, and later test files, matching the pattern used in EN/DE test suites.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Closes test coverage gaps identified by comparing EN (154 tests) with VI:
- vi_strict.test.ts: verify strict mode rejects casual/weekday-only expressions while accepting standard dates and explicit time units
- vi_forward_date.test.ts: verify forwardDate option rolls time-only, weekday, slash date, and month expressions to future dates
- vi_casual.test.ts: add isCertain assertions for casual date components
- vi_time_exp.test.ts: add isCertain assertions for hour and meridiem
- vi_standard.test.ts: add isCertain assertions for full and partial dates
- vi_negative_cases.test.ts: add bare numbers, currency, version numbers, hyphenated ranges, and URL-encoded string rejection tests

VI locale: 75 → 92 tests across 14 → 16 suites.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hi @wanasit, thank you for the review. I've addressed both comments:
Additionally:
All 92 tests pass. Will mark as ready for review once I've done a final check.
… drift
- VITimeExpressionParser: separate "trưa" handling from "chiều/tối/đêm"; "11 giờ trưa" is 11 AM (approaching noon), not 23:00
- VICasualTimeParser: remove redundant \b from PATTERN; JS \b fails on Vietnamese đ (non-ASCII), breaking standalone "đêm" parsing. AbstractParserWithWordBoundaryChecking already provides left-boundary
- Revert package-lock.json to upstream (unrelated lockfile drift)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
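The `\b` failure described in this commit is easy to reproduce in isolation. In JavaScript regular expressions, `\b` treats only `[A-Za-z0-9_]` as word characters, so `đ` counts as a non-word character and no boundary exists between the start of the string (or a space) and `đ`. The helper names below are hypothetical; the `(\W|^)` form mirrors the left-boundary check the PR attributes to AbstractParserWithWordBoundaryChecking.

```typescript
// \b needs a word character on exactly one side. "đ" is not in [A-Za-z0-9_],
// so there is no boundary before it at the start of a string or after a space.
function matchesWithWordBoundary(text: string): boolean {
    return /\bđêm/.test(text);
}

// An explicit left boundary, (\W|^), does not depend on "đ" being a word character.
function matchesWithLeftBoundary(text: string): boolean {
    return /(\W|^)đêm/.test(text);
}
```

Both `"đêm"` and `"vào đêm"` fail the `\b` version and pass the explicit left-boundary version, which is why dropping the redundant `\b` restores standalone `đêm` parsing.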
…edge cases
- VITimeExpressionParser: reject minute >= 60 (matches AbstractTimeExpressionParser)
- vi_casual: add toBeDate() assertions for today/yesterday/tomorrow/now
- vi_forward_date: add same-weekday edge case (stays on same day)
- vi_strict: add slash dates acceptance test (30/4/1975, 15/3)
- vi_negative_cases: add impossible minute test (61, 99)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
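The time-expression rules from these two commits (the `trưa` split, the `12 giờ sáng` guard, and the minute check) can be sketched as one small function. `resolveViTime` and its return shape are hypothetical and cover only `sáng` and `trưa`; the real VITimeExpressionParser also handles `chiều`/`tối`/`đêm`, which this sketch omits.

```typescript
type ViMeridiemWord = "sáng" | "trưa";

interface ResolvedTime {
    hour: number; // 24-hour clock
    minute: number;
}

// Hypothetical helper: applies the rules described in the commits above.
// Returns null when the expression should be rejected.
function resolveViTime(hour: number, minute: number, word?: ViMeridiemWord): ResolvedTime | null {
    // Impossible minutes (e.g. 61, 99) are rejected outright.
    if (minute >= 60) return null;

    let resolved = hour;
    if (word === "sáng" && hour === 12) {
        // "12 giờ sáng" means midnight, not noon.
        resolved = 0;
    } else if (word === "trưa" && hour < 10) {
        // "1 giờ trưa" is early afternoon: shift into PM.
        resolved = hour + 12;
    }
    // "10 giờ trưa" and "11 giờ trưa" stay as-is (late morning); "12 giờ trưa" stays 12 (noon).
    return { hour: resolved, minute };
}
```

Keeping `trưa` separate from the evening words is what prevents `11 giờ trưa` from becoming 23:00.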
Pushed two additional commits addressing issues found during self-review:
Test count: 95 tests across 16 suites, all passing. Full suite (610 tests / 130 suites) also clean.
Summary
This PR adds a full Vietnamese (vi) locale to chrono-node, bringing it to parity with the English locale in terms of parser and refiner coverage.
What's included
11 parsers:
- VIStandardParser: ngày D tháng M năm YYYY (with/without ngày prefix, optional year)
- VIMonthYearParser: tháng M năm YYYY / tháng M/YYYY
- VIYearParser: standalone năm YYYY and bare YYYY TCN (BC years, no năm prefix required)
- VICasualDateParser: hôm nay, hôm qua, ngày mai, ngày kia, bây giờ
- VICasualTimeParser: sáng, trưa, chiều, tối, đêm, nửa đêm, bình minh, sáng sớm
- VITimeExpressionParser: lúc/vào 7 giờ 30 phút chiều, 15:30
- VIWeekdayParser: thứ hai–thứ bảy, chủ nhật with tới/sau/qua modifiers
- VITimeUnitAgoFormatParser: 3 ngày trước, 1 tháng qua
- VITimeUnitLaterFormatParser: 2 tuần sau, 3 ngày nữa
- VITimeUnitWithinFormatParser: trong vòng 2 giờ
- VITimeUnitCasualRelativeFormatParser: tuần này/trước/sau, tháng này/trước/sau (number optional; bare unit words supported)
3 refiners:
- VIMergeDateTimeRefiner: merges date + time results separated by lúc/vào
- VIMergeDateRangeRefiner: merges date ranges separated by –/đến/tới/và
- VIMergeWeekdayComponentRefiner: standard weekday merge
16 test files, 95 tests covering standard dates, casual dates/times, time expressions, weekdays, slash dates, date ranges, month/year, year, time units (ago/later/within/casual-relative), strict mode, forward date, negative cases, and isCertain/toBeDate assertions.
Design notes
- YEAR_PATTERN supports BC years via TCN suffix (trước Công nguyên); VIYearParser also handles bare YYYY TCN without a năm prefix (e.g. "179 TCN")
- parseYear() calls findMostLikelyADYear() for short years (e.g. 1945 stays 1945, 75→1975)
- VITimeUnitCasualRelativeFormatParser uses an optional-number pattern so bare unit words (tuần này, tháng trước) match without a numeric prefix; defaults to quantity 1
- strictMode constructor parameter on all relative unit parsers (consistent with DE/FR)
- src/index.ts
Bug fixes included (found during self-review)
- VITimeExpressionParser: trưa meridiem was incorrectly grouped with chiều/tối/đêm, so 11 giờ trưa returned 23:00 instead of 11:00 AM. Fixed by separating trưa handling: hour < 10 → PM (+12), hour 10-11 → AM (keep as-is), hour 12 → PM (noon)
- VITimeExpressionParser: 12 giờ sáng returned noon instead of midnight; added if (hour === 12) hour = 0 guard matching EN convention
- VITimeExpressionParser: added minute validation; reject minute >= 60 (matching AbstractTimeExpressionParser behavior)
- VICasualTimeParser: removed redundant \b from PATTERN; JS \b fails on Vietnamese đ (non-ASCII), silently breaking standalone đêm parsing. AbstractParserWithWordBoundaryChecking already provides left-boundary via (\W|^)
- VIWeekdayParser: modifier group was non-capturing, making qua (last) undetectable; fixed to capturing group; also corrected quả typo → qua
- VITimeUnitCasualRelativeFormatParser: bare tuần này/tháng trước never matched due to required numeric prefix in TIME_UNITS_PATTERN; replaced with optional-number pattern
- package-lock.json: reverted to upstream; removed unrelated lockfile drift (dayjs removal, peer dependency changes)
🤖 Generated with Claude Code