Sunday, August 17, 2008

Copying files from Linux to Windows

If you ever have occasion to copy files from Linux to Windows, you may discover that this process is not as simple as it may seem. It's fairly easy to take a USB drive formatted with the FAT or NTFS filesystem and physically transport the files from one environment to the other, but there are other complexities you must manage:

1. File Naming Rules

On ext3 filesystems, one of the most common filesystems in the Linux world, there are many legal filenames that are illegal on Windows:
  • Filenames with double quotes (")
  • Filenames with colons (:)
  • Filenames with backslashes (\)
  • Filenames with any of the characters < > | ? *
  • Filenames that end in a space or a period
This script will, when executed from the path you wish to examine, rename the files so that they have legal names for Windows environments.
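
As a rough illustration of what such a renaming pass involves (this is a minimal sketch, not the script referred to above), the Python below replaces the characters Windows rejects with an underscore. The names sanitize_name and rename_tree and the choice of "_" as the replacement are assumptions made for this example. It does not handle reserved device names such as CON or NUL, and it skips any rename that would overwrite an existing file.

```python
import os

# Characters Windows rejects in file and directory names.
# "/" is omitted: it is the path separator on Linux and cannot occur in a name.
WINDOWS_ILLEGAL = '<>:"\\|?*'


def sanitize_name(name, replacement="_"):
    """Return name with Windows-illegal characters replaced."""
    cleaned = "".join(
        replacement if (ch in WINDOWS_ILLEGAL or ord(ch) < 32) else ch
        for ch in name
    )
    # Windows also has trouble with names ending in a dot or a space.
    stripped = cleaned.rstrip(". ")
    return stripped or replacement


def rename_tree(root, dry_run=True):
    """Walk root bottom-up and rename entries; returns (old, new) pairs."""
    changes = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new_name = sanitize_name(name)
            if new_name == name:
                continue
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new_name)
            if os.path.exists(new_path):
                # Never overwrite: leave collisions for a human to resolve.
                continue
            changes.append((old_path, new_path))
            if not dry_run:
                os.rename(old_path, new_path)
    return changes
```

Calling rename_tree(".") from the folder you wish to examine only reports what would change; pass dry_run=False once the list looks right.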

2. Path Length

While ext3 places no limit on total path length and caps each filename at 255 bytes, Windows' NTFS restricts each path component (directory or filename) to 255 characters (from Wikipedia). Many Windows programs are additionally bound by the Win32 MAX_PATH limit of 260 characters for the full path. You have to examine the source folders to confirm that none of your paths or path components are too long to copy. DO NOT TRUST THE SUCCESS OF THE FILE COPY TO VALIDATE THIS. It seems that the Linux NTFS implementation (at least on Ubuntu 8.04) allows paths that are legal for ext3, and that work while the Linux machine is using the NTFS-formatted drive, but that are NOT legal when consumed by Windows.
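
One way to examine the source folders before copying is sketched below in Python. The function name find_long_paths, the "D:\" destination prefix, and the 260-character full-path check (the classic Win32 MAX_PATH value) are assumptions added for this example; only the 255-character component limit comes from the text above. Python counts characters slightly differently than NTFS does for some Unicode text, so treat the result as an approximation.

```python
import os

MAX_COMPONENT = 255   # per-component limit cited above for NTFS
MAX_FULL_PATH = 260   # classic Win32 MAX_PATH; an assumption added here


def find_long_paths(root, dest_prefix="D:\\",
                    max_component=MAX_COMPONENT,
                    max_full_path=MAX_FULL_PATH):
    """Yield (relative_path, reason) for entries likely to break on Windows."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            if len(name) > max_component:
                yield rel, "component longer than %d characters" % max_component
            # Length of the path as it would appear under the destination.
            elif len(dest_prefix) + len(rel) >= max_full_path:
                yield rel, "full path reaches %d characters" % max_full_path


if __name__ == "__main__":
    for rel, reason in find_long_paths("."):
        print(rel, "-", reason)
```

Set dest_prefix to the folder the tree will live in on the Windows side, since a long destination folder eats into the full-path budget.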

3. Case Sensitivity

Windows filenames are not case-sensitive, but Linux filenames are. This means that if you have two files in one folder on the Linux source system as follows:
  • myDocument.doc
  • MyDocument.doc
Only one of these will copy to the destination, or odd errors will result when you attempt to copy them. To identify these issues, and to confirm you haven't been bitten by any of the above problems...
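
Case collisions can also be caught on the Linux side before the copy starts. The following is a minimal Python sketch; the function names are made up for this example, and lowercasing with str.lower() is only an approximation of how Windows folds case.

```python
import os
from collections import defaultdict


def case_collisions(names):
    """Group names that differ only by case; return groups of 2 or more."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


def find_case_collisions(root):
    """Map each directory under root to its groups of colliding names."""
    report = {}
    for dirpath, dirnames, filenames in os.walk(root):
        clashes = case_collisions(dirnames + filenames)
        if clashes:
            report[dirpath] = clashes
    return report


if __name__ == "__main__":
    for folder, clashes in find_case_collisions(".").items():
        print(folder, clashes)
```

A pre-check like this only covers case collisions; the comparison step in the next section is still needed after the copy.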

4. Compare, compare, compare

Always use a file and folder compare tool like WinMerge to make sure that everything did copy over properly. Make sure you investigate any differences it identifies, because those are likely issues with case-sensitivity, path lengths, or other problems. This prevents you from thinking you have a good copy of the file tree when you don't. For large copy operations, I recommend editing WinMerge's compare options so that only the file size and modification time are considered; otherwise the comparison can take a very long time.
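
For readers who want to script the same kind of size-and-time check, here is a minimal Python sketch. It is not how WinMerge works internally; the function names and the 2-second timestamp tolerance (chosen because FAT stores modification times at 2-second resolution) are assumptions for this example.

```python
import os


def snapshot(root):
    """Map each relative file path under root to (size, mtime)."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # lstat so a dangling symlink does not abort the walk.
            info = os.lstat(full)
            result[os.path.relpath(full, root)] = (info.st_size, info.st_mtime)
    return result


def compare_trees(source, dest, mtime_tolerance=2.0):
    """Return (missing, extra, different) lists of relative paths."""
    src, dst = snapshot(source), snapshot(dest)
    missing = sorted(set(src) - set(dst))   # in source, not in destination
    extra = sorted(set(dst) - set(src))     # in destination only
    different = sorted(
        path
        for path in set(src) & set(dst)
        if src[path][0] != dst[path][0]
        or abs(src[path][1] - dst[path][1]) > mtime_tolerance
    )
    return missing, extra, different
```

Anything in the missing or different lists deserves the same investigation as a difference reported by a compare tool.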

5. Text File Handling

Lastly, be aware that text files use different end-of-line delimiters on Linux (a line feed, LF) and Windows (a carriage return followed by a line feed, CRLF). At a minimum, Windows programs that don't understand this (Notepad) will show the text all run together rather than on separate lines. Cross-platform applications using the files (e.g. the instant messenger application Pidgin) can fail on Windows because they expect Windows text file formats.

To avoid these issues, after getting a successful copy, use a Unix-to-Windows text file converter application. I have had good luck with EOL Converter (as an added bonus, it is also free. :) )
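
If you would rather script the conversion, the idea fits in a few lines of Python. This is a minimal sketch, unrelated to how EOL Converter is implemented; the function names and the NUL-byte test for skipping binary files are assumptions for this example.

```python
import re


def to_crlf(data):
    """Convert LF line endings in bytes to CRLF, leaving existing CRLF alone."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


def convert_file(path):
    """Rewrite a text file in place with CRLF endings; True if it changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    if b"\0" in original:
        # Crude binary check (an assumption): skip files containing NUL bytes.
        return False
    converted = to_crlf(original)
    if converted == original:
        return False
    with open(path, "wb") as handle:
        handle.write(converted)
    return True
```

Run it only on files you know are text, and only on the copy, so the Linux originals keep their native line endings.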
