Parsing a date from arbitrary text.

Posts: 9
Is there any robust implementation of a parser for extracting a date from arbitrary text?
By arbitrary, I mean anything a user may type into a text field on an HTML form, for example.
The DateFormat class included with Java parses dates from text, but you must specify the exact format of the text. If I knew the exact format, and the user followed it perfectly, I wouldn't need a parser library!
I'm looking for a more robust parser library that can make sense of varied input, especially locale-aware (i18n) input.
--Basil Bourque
Ranch Hand
Posts: 624
"Holy unvalidated input, Batman!"
Parsing an arbitrary "anything goes" string into a Date object would be pretty hard even if you limited it to a single locale. Throw in I18N and it becomes a pretty tall order indeed. Just think of all the possibilities:
  • order of month, day & year
  • abbreviated vs. fully spelled-out month and day names
  • 4-digit vs. 2-digit year
  • 1-digit vs. 2-digit month and day
  • different separators ('/' '-' '.' ' ')
  • whether the string represents month & year; month, day & year; or month, day, year & time
  • etc., etc.

    (If any math gurus want to calculate the total number of permutations, knock yourself out. Let us know the result.)
    Take, for example, the string "200310". Is that:
  • The month of October 2003
  • Oct 1, 2003
  • March 10, 2020
  • March 10, 0020
  • Oct 3, 2020
  • today at 8:03:10pm
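
To make the ambiguity concrete, here is a minimal sketch that feeds "200310" through four different patterns. It uses the java.time API, which is newer than the DateFormat class discussed in this thread, and the class and method names (AmbiguousInput, asYearMonth and so on) are made up for illustration. Every pattern accepts the string without complaint, and each one gives a different answer:

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: the same six digits read under four different patterns.
class AmbiguousInput {

    // Four-digit year followed by a two-digit month.
    static YearMonth asYearMonth(String s) {
        return YearMonth.parse(s, DateTimeFormatter.ofPattern("uuuuMM"));
    }

    // Two-digit year (java.time maps it into 2000-2099), then month, then day.
    static LocalDate asYearMonthDay(String s) {
        return LocalDate.parse(s, DateTimeFormatter.ofPattern("uuMMdd"));
    }

    // Two-digit year, then day, then month.
    static LocalDate asYearDayMonth(String s) {
        return LocalDate.parse(s, DateTimeFormatter.ofPattern("uuddMM"));
    }

    // Hour (24-hour clock), minute, second.
    static LocalTime asTimeOfDay(String s) {
        return LocalTime.parse(s, DateTimeFormatter.ofPattern("HHmmss"));
    }

    public static void main(String[] args) {
        String input = "200310";
        System.out.println(asYearMonth(input));     // 2003-10
        System.out.println(asYearMonthDay(input));  // 2020-03-10
        System.out.println(asYearDayMonth(input));  // 2020-10-03
        System.out.println(asTimeOfDay(input));     // 20:03:10
    }
}
```

None of these parses is "wrong" as far as the library is concerned, which is why the format has to come from somewhere other than the string itself.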

    Toss in the ongoing argument between my mother and me as to whether September is abbreviated "Sept" or "Sep" (let alone "Sept.") and the craziness just goes on and on.
    IMHO, you really need some kind of structure to the string in order to make parsing it a reasonably surmountable task. And if you are designing the input form, you have the opportunity to impose one. It's when you have a raw collection of preexisting data with no common structure or format that things can get very hard and tricky. And even then, it is usually a case of having multiple formats present, not truly arbitrary data.
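
For the "multiple formats present" case, one common approach is to try a short list of known patterns in order and take the first that fits. The sketch below assumes four English-language formats chosen purely for illustration; the class name MultiFormatDateParser and the pattern list are not from any library, and it uses java.time rather than DateFormat:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper: tries a fixed list of known formats, first match wins.
class MultiFormatDateParser {

    // Assumed set of formats; a real list would come from inspecting the data.
    private static final List<DateTimeFormatter> CANDIDATES = List.of(
            strict("uuuu-MM-dd"),    // 2003-10-03
            strict("M/d/uuuu"),      // 10/3/2003 (US field order assumed)
            strict("MMM d, uuuu"),   // Oct 3, 2003
            strict("d MMMM uuuu"));  // 3 October 2003

    // STRICT resolution rejects impossible dates such as February 30.
    private static DateTimeFormatter strict(String pattern) {
        return DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH)
                .withResolverStyle(ResolverStyle.STRICT);
    }

    // Returns the parsed date, or an empty Optional if no candidate matches.
    static Optional<LocalDate> parse(String text) {
        String trimmed = text.trim();
        for (DateTimeFormatter candidate : CANDIDATES) {
            try {
                return Optional.of(LocalDate.parse(trimmed, candidate));
            } catch (DateTimeParseException e) {
                // Not this format; fall through to the next candidate.
            }
        }
        return Optional.empty();
    }
}
```

The order of the list matters: if both "M/d/uuuu" and "d/M/uuuu" were present, an input like 3/10/2003 would always go to whichever came first, so this only helps when the formats are mutually unambiguous.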
    Take a look at the JavaAlmanac.com examples "e320. Formatting a Date Using a Custom Format" and "e323. Formatting and Parsing a Date for a Locale" for some guidance and sample code. Also look at the setLenient() method of the DateFormat class in combination with those examples; date parsing is lenient by default. You may be pleasantly surprised at how lenient the parser can be. It's not all-knowing, but it does a pretty good job.
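
As a rough illustration of what setLenient() changes, the sketch below parses the same text with leniency on and off. The pattern "MM/dd/yyyy", the class name LenientDemo and the helper parseOrNull are assumptions made for this example, not code from the JavaAlmanac pages:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper comparing lenient and strict parsing with SimpleDateFormat.
class LenientDemo {

    // Returns the parsed Date, or null if the text is rejected.
    static Date parseOrNull(String text, boolean lenient) {
        SimpleDateFormat format = new SimpleDateFormat("MM/dd/yyyy", Locale.US);
        format.setLenient(lenient); // DateFormat is lenient unless told otherwise
        try {
            return format.parse(text);
        } catch (ParseException e) {
            return null;
        }
    }
}
```

With leniency on, parseOrNull("02/30/2003", true) quietly rolls the impossible day forward and returns March 2, 2003. With leniency off, the same call returns null because the parser refuses February 30. Lenient parsing is forgiving about out-of-range numbers, but it still needs the pattern to match.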
    Personally, I try to avoid using a single "Date" input field/parameter on a user input form. I always use multiple input fields/parameters for Month, Day and Year (and hour & minute if needed), and even then I often use drop-down selections rather than text boxes. Using separate parameters makes validation a lot easier. It also allows for easier I18N, in that I can change the order of the fields on the input form as needed for a particular locale. If you do use a single "date" input field, you simply must tell the user what format to use, and then validate that String before passing it on to your DateFormat. (Who among us hasn't entered Feb 30th into a web form just to see if the programmer is using proper validation? And if I am the only one, then maybe I do need to get a life, like my dog keeps telling me :roll: )
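
Validating separate fields can be sketched as follows, assuming the form submits year, month and day as three numeric strings. The class name DateFields and the method fromFields are hypothetical, and the check relies on java.time's LocalDate.of, which throws for impossible dates:

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.Optional;

// Hypothetical helper: validates separate year/month/day form parameters.
class DateFields {

    // Returns the date if all three fields are numeric and form a real
    // calendar date; otherwise returns an empty Optional.
    static Optional<LocalDate> fromFields(String year, String month, String day) {
        try {
            return Optional.of(LocalDate.of(
                    Integer.parseInt(year.trim()),
                    Integer.parseInt(month.trim()),
                    Integer.parseInt(day.trim())));
        } catch (NumberFormatException | DateTimeException e) {
            // Non-numeric input, or a combination like February 30.
            return Optional.empty();
        }
    }
}
```

Because the fields arrive already separated, the order in which they appear on the form can change per locale without touching this check: fromFields("2003", "2", "30") comes back empty, while fromFields("2004", "2", "29") yields a valid leap-day date.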
    Those are my thoughts on the subject. Others may have additional comments, or know of something that I am not aware of. I look forward to their opinions as well...
    Posts: 7023
    Welcome to JavaRanch, Basil!
    I'm moving this to the Other Java APIs forum...