I think I’ve mentioned that I have a “challenging” codebase that I deal with in my day job. It’s a classic Delphi legacy app, worked on by numerous different people over the years with multiple different styles and techniques. Part of the product includes the use of an Address Engine – a third-party utility that “scrubs” addresses and ensures that they exist and that they are in a Post Office acceptable form.
Recently, we’ve had to change Address Engine providers. It should have been easy – I had created an interface to isolate the engine before the switch. So I got the new engine all plugged in, ready to go, and it all should have worked. But try as I might, I couldn’t get anything but garbage data out of the engine. Argh. It simply wouldn’t return any data. Now mind you, the DLL in question was written in C, and I had a translation of the *.h file that I was pretty confident in. It all should have worked.
But the problem was much simpler than that – the `*.ini` file that had an entry for the path to the postal data was missing an ending backslash. That was it. I put the trailing backslash on the entry (why didn’t the installer do that?) and it all worked.
Which brings me to the point of this blog post: Always assume that your paths don’t have that trailing delimiter, and that it is your responsibility to put it there.
In fact, the RTL provides a routine that allows you to do that very thing:
function IncludeTrailingPathDelimiter(const S: string): string; overload;
`IncludeTrailingPathDelimeter` will place either a backslash or a forward slash on the end of a path, depending on the operating system’s definition of a path delimiter if there isn’t one there. If there is, it will do nothing. You should use it to ensure that every path string you use has a delimiter on the end of it. Your code should thus assume that all paths have that trailing backslash on it. If you have code that looks like this:
MyFilenameWithPath := SomePath + ‘\’ + ‘somefile.txt’;
this should be changed to
MyFilenameWithPath := IncludingTrailingPathDelimiter(SomePath) + ‘somefile.txt’;
The former code can be buggy – what if `SomePath` already has a trailing delimiter? You’ll end up with an invalid path. The latter code has no such problem. It is safe both ways – with and without the trailing delimiter. This is especially necessary when your paths come as a result of user input, such as that `*.ini` file above.
The moral of the story: Never assume that your path variables have a trailing delimiter, and always wrap them up with `IncludeTrailingPathDelimiter` to ensure that they do.
Would this “Be conservative in what you send, be liberal in what you accept.” apply in this case?
Rick —
Yeah, exactly. Be generous in accepting paths with and without backslashes on the end.
My convention is that all functions I write that return a directory, ALWAYS return it including the trailing backslash. And all functions I write that accepts a directory as a parameter, ALWAYS assumes it can be both with or without a trailing backslash. Therefore, I haven’t (in many, many years) run into this particular problem…
By sticking to this convention, you should be safeguarded in all cases…
What about user input? What about API calls like SelectDirectory? Do they have the trailing backslash or not?
User input is always “sent somewhere”. I don’t expect a trailing backslasg (“all functions I write that accepts a directory as a parameter, ALWAYS
assumes it can be both with or without a trailing backslash”), and when passing it along to SelectDirectory and other external functions, I adhere to the specification for the respective functions as to trailing backslasg or not (if not specified, I generally supply it without trailing backslash).
The other reason why “SomePath + ”” should be replaced with “IncludeTrailingPathDelimiter(SomePath)” is platform compatibility. Windows uses backslashes, other platforms use forward, some esoteric ones use other symbols… The more you can abstract away platform dependencies into RTL or isolated library code, the better.
Nowadays, I use this instead:
MyFilenameWithPath := TPath.Combine(SomePath,‘somefile.txt’);
It takes care of optionally inserting a proper path delimter, and it’s easier to read and write for those who also work with C#, where you’d write this:
MyFilenameWithPath = Path.Combine(SomePath,”somefile.txt”);
Cool — didn’t see that one. Good solution.
And this leads to cargo-cult-throwing-IncludeTrailingPathDelimiter-calls-all-over-the-place.
No, it leads to putting them where they are needed – which is pretty much everywhere you are dealing with paths, especially paths input by users. Can you be sure that your path is properly formed without doing the checks yourself?
Your point was not making sure that a user input is formatted correctly but you said “always wrap them up with …”.
I have seen code where strings run through dozens of ITPD calls every time across several routines because ya know the backslash could be missing! I rather go with the convention when something is named path it contains a delimiter and when something is named dir it does not.
But again, user input can vary.
If a “path” has no ending delimiter it is not a path. It can be anything like volume, directory, … but it is never a path. Best case is to guard this with a record called TPathName with an implicit string conversion and check, if the string has an ending delimiter. If not, just raise an exception.
EnsureTrailingPathDelimiter is so much easier and friendlier than an exception
In some cases it might be the solution but it does not follow the “fail early” principle. That is because this function just makes sure that whatever string you are passing ends with a path delimiter.
I really like Sir Rufos suggestion. A TPathName type gives more clarity (a routine taking a TPathName is clearer than a string argument) and better error control (TPathName can contain all necessary validation so you are unable to pass anything that is not a path in it)
Noone keeps you away from creating a record TPathName that will also implicit call the EnsureTrailingPathDelimiter. And you can trust in procedure foo( APath : TPathName ); that you have a string with an ending delimiter without repeating calls. You can also implement a complete check for a valid path in any way and you will have this problem solved for ever :o)
Well. Umm.. No. Open your explorer, go to any directory. Click on the adress field. Voila, you see the Microsoft definition of a path. And this as no ” at the end. But then again, your argumentation seems legit. Or hmmm. maybe not…. get the point?
The AddressEngine is in a black box DLL, hence the error I couldn’t fix short of merely making the INI file correct.
There is a good method: if you read something from user or external environment (configs, IPCs, etc.) – put it into variable which name ends with “Unsafe”. E.g. SomePathUnsafe := ReadConfig(…);
Never pass “unsafe” variables into your code, pass to external environment only. You must validate this data before using in your code. E.g. SomePath := IncludeTrailingPathDelimiter(SomePathUnsafe);
(IncludeTrailingPathDelimiter is not the only one needed! You MUST also expand relative file paths; other possible (optional) actions are: expanding environment variables; canonization (e.g. remove ‘.’ and ‘..’); converting short names into long names)
This is a common principle in security: filter your inputs.
Also, there is a rule for paths: use “Path” to indicate variables/functions which store/return path with trailing delimiter, use “Dir” to indicate ones which store/return path without trailing delimiter. Doing so you would instantly know that GetSomeDir + ‘input.dat’ is wrong.