Java: Meridiem (am/pm) date parsing surprise
Our team was perplexed by date parsing logic that, depending on the machine, couldn't parse 11/19/2024 11:25 am or 11/19/2024 11:25 AM.
I eventually decided to ask on Reddit's r/javahelp. It turns out that the culprit is the locale: en_GB (British locale) fails to parse the uppercase AM/PM, but en_US (US locale) fails to parse the lowercase am/pm.
Here's a proof of concept: https://onecompiler.com/java/42z2nenrb
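For a version you can run locally, here is a minimal sketch of the same check. The class name MeridiemLocaleDemo and the helper canParse are made up for this post. What it prints depends on the locale data shipped with your JDK, so I'm deliberately not listing expected output.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

class MeridiemLocaleDemo {
    // Hypothetical helper: true if `text` parses with the pattern from this
    // post under the given locale, false if parsing throws.
    static boolean canParse(String text, Locale locale) {
        DateTimeFormatter formatter =
                DateTimeFormatter.ofPattern("M/d/yyyy h:mm a", locale);
        try {
            LocalDateTime.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] inputs = {"11/19/2024 11:25 am", "11/19/2024 11:25 AM"};
        Locale[] locales = {Locale.UK, Locale.US}; // en_GB and en_US
        for (Locale locale : locales) {
            for (String input : inputs) {
                // Results vary with the JDK's locale data, so none are hard-coded here.
                System.out.println(locale + " / " + input + " -> " + canParse(input, locale));
            }
        }
    }
}
```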
Since users or OpenAI might send either the uppercase or the lowercase version, we need to handle both. In our Scala code, we do the following:
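import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.Try

// `s` is the incoming date string, e.g. "11/19/2024 11:25 am"
val formatter = DateTimeFormatter.ofPattern("M/d/yyyy h:mm a")
// Try the lowercase form first, then fall back to the uppercase form
val time = LocalDateTime
  .from(
    Try(formatter.parse(s.toLowerCase))
      .orElse(Try(formatter.parse(s.toUpperCase)))
      .get
  )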
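An alternative we are not using, sketched here in plain Java, is to make the formatter itself case-insensitive and pin its locale. The class name CaseInsensitiveMeridiem is made up, and choosing Locale.US is my assumption; any locale whose AM/PM text matches your input ignoring case should behave the same way.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Locale;

class CaseInsensitiveMeridiem {
    // parseCaseInsensitive() only affects what is appended after it, so it
    // has to come before the pattern. Locale.US is an assumption made so the
    // result no longer depends on the machine's default locale.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .parseCaseInsensitive()
            .appendPattern("M/d/yyyy h:mm a")
            .toFormatter(Locale.US);

    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(parse("11/19/2024 11:25 am")); // 2024-11-19T11:25
        System.out.println(parse("11/19/2024 11:25 PM")); // 2024-11-19T23:25
    }
}
```

Compared with the Try / orElse approach, this needs a single parse attempt and also accepts mixed case such as "Am".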
Java's date / time libraries are full of surprises.
When I worked at Twitter, there was a huge outage in 2014 / 2015 caused by Twitter using YYYY (week year) instead of yyyy (the normal year).
Or at least I once thought it meant "normal year". It turns out yyyy is "year of era"; uuuu is the most correct one, even though uuuu and yyyy are effectively identical unless you process years before the current era.
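To make the three year letters concrete, here is a small sketch. The class name YearPatternDemo, its helpers, and the sample dates are all picked for illustration; in particular, 2014-12-29 is just a date that falls in a week belonging to the following week year, not a claim about the exact day of the outage. The strict-resolver check at the end is an extra observation of mine: with ResolverStyle.STRICT, yyyy without an era letter does not resolve to a date, while uuuu does.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Locale;

class YearPatternDemo {
    // Formats `date` with `pattern`; Locale.US is pinned because the
    // week-based year (YYYY) follows the locale's week definition.
    static String format(String pattern, LocalDate date) {
        return DateTimeFormatter.ofPattern(pattern, Locale.US).format(date);
    }

    // True if `text` parses to a LocalDate under the STRICT resolver style.
    static boolean parsesStrictly(String pattern, String text) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern, Locale.US)
                .withResolverStyle(ResolverStyle.STRICT);
        try {
            LocalDate.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // YYYY vs yyyy: the last days of December can already belong to
        // week 1 of the next week-based year.
        LocalDate lateDecember = LocalDate.of(2014, 12, 29);
        System.out.println(format("yyyy-MM-dd", lateDecember)); // 2014-12-29
        System.out.println(format("YYYY-MM-dd", lateDecember)); // 2015-12-29

        // uuuu vs yyyy: they only differ before the current era.
        LocalDate beforeCurrentEra = LocalDate.of(-5, 1, 1); // proleptic year -5 is 6 BCE
        System.out.println(format("uuuu", beforeCurrentEra)); // -0005
        System.out.println(format("yyyy", beforeCurrentEra)); // 0006

        // Under STRICT resolution, year-of-era needs an era to become a date.
        System.out.println(parsesStrictly("uuuu-MM-dd", "2024-11-19")); // true
        System.out.println(parsesStrictly("yyyy-MM-dd", "2024-11-19")); // false
    }
}
```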