If you’ve spent much time working with Excel, you know that things can get really frustrating when you are (or Excel thinks you are) working with dates. In its efforts to be helpful, it often causes huge headaches, and worse still, it doesn’t give you a clear way to see what it changed and how to change it back. Many, if not most, data projects involve a time dimension, and so as Excel-using data crunchers we’re particularly sensitive to this frustration.
Let’s look at a couple examples where users complain that Excel comes up short with dates and timestamps.
Autoformatting Is Not Very Smart
A frequent complaint about Excel is that it autoformats entries or important data incorrectly, but then it doesn’t provide a way to reverse the autoformatting.
Quick example: let’s say you want to create some column headers with ranges, like 1-5, 6-10, 11-15, and 16-20. Well, first of all, Excel will automatically reformat the first three columns as dates -- in our case, “01-May”, “06-Oct”, and “15-Nov”. Ah, but it’s just a formatting issue, right? Manually reformat the cells as “General,” and suddenly you have three brand new numbers: 42009, 42165, and 42323.
Uh… what?? How did that happen?
You see, Excel uses a date system where Day 1 is January 1, 1900. That means that January 1, 2000 -- one hundred years later -- is represented as 36526. Hours, minutes, and seconds are represented as fractions of days.
Now, that’s not really unique -- all computer programs use some variation of this methodology. Where Excel goes wrong is that in its haste to figure out what you were trying to say, it applies the date without giving you an option or a way back.
A Single-Format Program for a Multi-Format World
Another area of frustration is that Excel can really only understand one date format at a time, and so if you’re working with data from, say, the US and the UK, you’re going to have a real mess on your hands.
In Windows, Excel is actually tied to how you have your date configured in your operating system, meaning you have to change your Windows settings just to change your Excel settings! So if you have a UK computer, and you upload a file with the dates “06/10/2016” and “06/30/2016”, Excel will interpret the first one to mean October 6, 2016… and it won’t recognize the second one as a date at all!
It’s not an impossible problem to solve, but you wind up needing to write a bunch of formulas just to get your data formatted correctly, and then you always have that risk of missing something.
An Easier Way: Dataiku DSS
In Dataiku DSS, parsing dates is as easy as two clicks. Furthermore, Dataiku will intelligently interpret the format of dates, but it will always give you the final say on which format you want to use.