Dealing with a long path name on Windows in Python

2 Nov 2016

Windows strangely imposes a limit on the length of a path. The limit is 260 characters.

Now, many answers on Stack Overflow suggest that we can prefix a path with \\?\ in order to bypass the limit.
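To make the prefix trick concrete, here is a minimal sketch of a helper that adds it. The name add_long_path_prefix is made up for illustration. It assumes the input is already an absolute path, because Windows skips its usual path normalization for prefixed paths, and it uses the \\?\UNC\ form for network shares.

```python
import ntpath

LONG_PREFIX = '\\\\?\\'  # the literal four characters \\?\


def add_long_path_prefix(path):
    """Return `path` with the extended-length prefix (hypothetical helper).

    Assumes `path` is absolute: relative paths are not supported once the
    prefix is present.
    """
    if path.startswith(LONG_PREFIX):
        return path  # already prefixed
    # Normalize ourselves (forward slashes, '..'), since Windows won't
    # do it for a prefixed path.
    path = ntpath.normpath(path)
    if path.startswith('\\\\'):
        # UNC share: \\server\share\... becomes \\?\UNC\server\share\...
        return LONG_PREFIX + 'UNC\\' + path[2:]
    return LONG_PREFIX + path
```

Calling it on c:/some_long_path should give the same string that the code later in this post hard-codes as '\\\\?\\c:\\some_long_path'.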

It works well until you use os.listdir(..). It turns out os.listdir(..) checks whether the length exceeds the limit regardless of whether the \\?\ prefix is present, and it raises an exception: TypeError: encoded string too long (264, maximum length 259).
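The 259 in that error message and the 260-character limit are the same limit: MAX_PATH is 260 and includes the terminating null character, so 259 characters are usable. A small sketch of a pre-check follows. The function name exceeds_max_path is hypothetical, and it assumes one Python character per UTF-16 code unit, which doesn't hold for characters outside the Basic Multilingual Plane.

```python
MAX_PATH = 260  # Windows constant; includes the terminating null character


def exceeds_max_path(path):
    """Return True if `path` is too long for MAX_PATH-limited APIs.

    Hypothetical helper: compares against 259 usable characters.
    """
    return len(path) > MAX_PATH - 1
```

A check like this can decide up front whether a path needs special handling before it is passed to os.listdir(..).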

The right way to handle this is to get the corresponding short path name for a long path. For example, the short version of c:\aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\test.txt is c:\AAAAAA~1\test.txt. Windows seems to assign a unique short path name to every long path name, so we don't need to worry about conflicts.

Here's the Windows API function that retrieves the short path name for a given long path name: GetShortPathName.

This Stack Overflow answer shows how to call the Windows API from Python. But the answer handles the TOCTTOU (time-of-check to time-of-use) problem and uses an infinite loop. I'm not so sure about that version. So, here's my version:

    import ctypes
    from ctypes import wintypes

    path = '\\\\?\\c:\\some_long_path'

    short_path_fn = ctypes.windll.kernel32.GetShortPathNameW
    short_path_fn.argtypes = [wintypes.LPCWSTR, wintypes.LPWSTR, wintypes.DWORD]
    short_path_fn.restype = wintypes.DWORD

    # According to https://msdn.microsoft.com/en-us/library/aa364989.aspx, when passing (path, None, 0),
    # the return value is the size of the buffer that will contain the short path and the terminating null character.
    buffer_length = short_path_fn(path, None, 0)
    output_buffer = ctypes.create_unicode_buffer(buffer_length)
    expected_length = buffer_length - 1  # excluding the terminating null character.

    # According to https://msdn.microsoft.com/en-us/library/aa364989.aspx, when passing (path, output_buffer, buffer_length),
    # the short path is written to output_buffer, and the return value is the length of the short path (excluding the terminating null character).
    actual_length = short_path_fn(path, output_buffer, buffer_length)
    assert expected_length == actual_length, (
        "The short-path length %d doesn't equal the expected length %d." % (actual_length, expected_length))

    output_buffer.value  # Here's the short path name.
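For reuse, the same two calls can be wrapped in a function. The sketch below is one possible shape, not a tested production version. The names get_short_path, api and max_attempts are made up for illustration. Instead of an assert or an infinite loop, it retries a bounded number of times if the required size changes between the two calls, and it treats a return value of 0 as failure. The api parameter exists only so the logic can be exercised without Windows.

```python
import ctypes


def _load_get_short_path_name():
    """Bind kernel32.GetShortPathNameW; this only works on Windows."""
    from ctypes import wintypes
    fn = ctypes.windll.kernel32.GetShortPathNameW
    fn.argtypes = [wintypes.LPCWSTR, wintypes.LPWSTR, wintypes.DWORD]
    fn.restype = wintypes.DWORD
    return fn


def get_short_path(path, api=None, max_attempts=3):
    """Return the short (8.3) form of `path` (hypothetical wrapper).

    `api` defaults to GetShortPathNameW and is injectable for testing.
    """
    if api is None:
        api = _load_get_short_path_name()
    # First call with no buffer: returns the required size, null included.
    size = api(path, None, 0)
    for _ in range(max_attempts):
        if size == 0:
            raise OSError("GetShortPathNameW failed for %r" % (path,))
        buf = ctypes.create_unicode_buffer(size)
        result = api(path, buf, size)
        if 0 < result < size:
            # Success: result is the length without the terminating null.
            return buf.value
        # Buffer too small (the path changed in between): result is the
        # newly required size, so try again with a bigger buffer.
        size = result
    raise OSError("Short path of %r kept changing size" % (path,))
```

On Windows, get_short_path('\\\\?\\c:\\some_long_path') would then return the short path as a string, or raise OSError if the call fails.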

The real question here is: what would we do if the short path contains more than 260 characters?
