How to Convert DOS/Windows Newline (CRLF) to Unix Newline (LF)

How to detect and convert DOS/Windows end of line to UNIX end of line in Ruby

The CR char (^M, as you call it) is "\r" in Ruby (and many other languages). If you're sure your line endings also contain the LF char (Windows uses CRLF as the line ending), you can simply remove the CR at the end of each line ($ matches at the end of every line, just before its "\n"):

uploadData.read.gsub /\r$/, ''

If you're not sure the LF will be there (e.g. classic Mac OS, up to version 9, used a plain CR at the end of the line), replace any CR, optionally followed by an LF, with a single LF:

uploadData.read.gsub /\r\n?/, "\n"
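For comparison, the same two substitutions can be sketched in Python, the language used further down this page. The function names are made up for illustration, and the sketch assumes the text was read without newline translation (for example with open(path, newline="")), so the "\r" characters are still present.

```python
import re

def strip_cr_before_lf(text: str) -> str:
    # With re.MULTILINE, $ matches just before each "\n" and at the end of
    # the string, so only a CR that ends a line is removed.
    return re.sub(r"\r$", "", text, flags=re.MULTILINE)

def normalize_to_lf(text: str) -> str:
    # Replace CRLF or a bare CR with a single LF.
    return re.sub(r"\r\n?", "\n", text)
```

The first function leaves a bare CR in the middle of a line untouched, while the second turns every CR-based line ending into LF.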

Convert a string's UNIX line endings to DOS line endings

What are you trying to do, exactly? Inside a program, regardless of the system, line endings are '\n'. On a Windows system, std::ifstream and std::ofstream convert them correctly, so you don't have to worry about it (provided you open the files in text mode). A std::ifstream will also read a file written under Unix without problems. The only time you might have to pay attention to this issue is when a file is written on Windows and read on Unix; then you will probably find an extra '\r' immediately in front of the '\n'. Normally, this is not a problem either, because '\r' counts as legal white space, and you want to ignore trailing white space anyway.

Within a C++ program, you should never see the '\r'.

Convert DOS line endings to Linux line endings in Vim

dos2unix is a command-line utility that will do this. Alternatively, :%s/^M//g will work if you use Ctrl-v Ctrl-m to input the ^M, or you can :set ff=unix and Vim will do the conversion for you when you write the file.

There is documentation on the fileformat setting, and the Vim wiki has a comprehensive page on line ending conversions.

Alternatively, if you move files back and forth a lot, you might not want to convert them, but rather do :set ff=dos, so Vim will know it's a DOS file and use DOS conventions for line endings.

How to convert CRLF to LF on a Windows machine in Python

Convert line endings in-place (with Python 3)

Line endings:

  • Windows - \r\n, called CRLF
  • Linux/Unix/MacOS - \n, called LF

Windows to Linux/Unix/MacOS (CRLF ➡ LF)

Here is a short Python script for directly converting Windows line endings to Linux/Unix/MacOS line endings. The script works in-place, i.e., without creating an extra output file.

# replacement strings
WINDOWS_LINE_ENDING = b'\r\n'
UNIX_LINE_ENDING = b'\n'

# relative or absolute file path, e.g.:
file_path = r"c:\Users\Username\Desktop\file.txt"

with open(file_path, 'rb') as open_file:
    content = open_file.read()

# Windows ➡ Unix
content = content.replace(WINDOWS_LINE_ENDING, UNIX_LINE_ENDING)

# Unix ➡ Windows
# content = content.replace(UNIX_LINE_ENDING, WINDOWS_LINE_ENDING)

with open(file_path, 'wb') as open_file:
    open_file.write(content)

Linux/Unix/MacOS to Windows (LF ➡ CRLF)

To convert in the other direction, from Linux/Unix/MacOS to Windows, simply uncomment the Unix ➡ Windows replacement (remove the # in front of the line).

DO NOT comment out the command for the Windows ➡ Unix replacement, as it ensures a correct conversion. When converting from LF to CRLF, it is important that there are no CRLF line endings already present in the file. Otherwise, those lines would be converted to CRCRLF. Converting lines from CRLF to LF first and then doing the desired conversion from LF to CRLF avoids this issue (thanks @neuralmer for pointing that out).
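The warning above can be illustrated on an in-memory byte string. This is a minimal sketch; to_lf and to_crlf are hypothetical helper names, not part of the script above.

```python
def to_lf(data: bytes) -> bytes:
    # Windows -> Unix: collapse every CRLF into a single LF.
    return data.replace(b"\r\n", b"\n")

def to_crlf(data: bytes) -> bytes:
    # Unix -> Windows: normalize to LF first so that lines which already
    # end in CRLF do not end up as CRCRLF.
    return to_lf(data).replace(b"\n", b"\r\n")

mixed = b"one\r\ntwo\nthree\r\n"

# Replacing LF directly doubles the CR on lines that already had CRLF.
naive = mixed.replace(b"\n", b"\r\n")

# Normalizing first gives exactly one CRLF per line.
safe = to_crlf(mixed)
```

Because of the normalizing step, to_crlf can be applied repeatedly to the same data without changing the result.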



Code Explanation

Binary Mode

Important: We need to make sure that we open the file both times in binary mode (mode='rb' and mode='wb') for the conversion to work.

When a file is opened in text mode (mode='r' or mode='w' without b), Python translates line endings automatically: on reading, \r\n (and a bare \r, as used on old Mac OS versions) becomes Python's Unix-style \n, and on writing, \n becomes the platform's native line ending. So the call to content.replace() would never find any \r\n line endings to replace.

In binary mode, no such conversion is done. Therefore the call to content.replace() (here bytes.replace()) can do its work.
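The difference between the two modes can be checked with a small experiment. This is only a sketch: read_both_ways is a hypothetical helper that writes some bytes to a temporary file and reads them back once in text mode and once in binary mode.

```python
import os
import tempfile

def read_both_ways(raw: bytes):
    # Write the bytes unchanged to a temporary file.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        # Text mode: newline translation is applied while reading.
        with open(path, "r", encoding="utf-8") as f:
            as_text = f.read()
        # Binary mode: the bytes come back exactly as stored.
        with open(path, "rb") as f:
            as_bytes = f.read()
    finally:
        os.remove(path)
    return as_text, as_bytes

text, data = read_both_ways(b"a\r\nb\r\n")
```

Here text no longer contains any "\r", while data still holds the original CRLF pairs.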

Binary Strings

In Python 3, ordinary string literals are Unicode text (type str). But we open our files in binary mode, so read() returns bytes. Therefore we need to add b in front of our replacement strings to make them bytes literals, too.
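A short sketch of what happens when the b prefix is forgotten. The helper name mixing_str_and_bytes_fails is made up for this example.

```python
content = b"line one\r\nline two\r\n"

# bytes.replace() with bytes arguments works as expected.
converted = content.replace(b"\r\n", b"\n")

def mixing_str_and_bytes_fails(data: bytes) -> bool:
    # Passing str arguments to bytes.replace() raises a TypeError.
    try:
        data.replace("\r\n", "\n")
    except TypeError:
        return True
    return False
```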

Raw Strings

On Windows the path separator is a backslash \, which we would need to escape in a normal Python string as \\. By adding r in front of the string we create a so-called "raw string", which doesn't need any escaping. So you can directly copy/paste the path from Windows Explorer into your script.
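As a small illustration, using the example path from the script above, both spellings produce the same string:

```python
# Normal string: every backslash has to be doubled.
escaped = "c:\\Users\\Username\\Desktop\\file.txt"

# Raw string: backslashes are taken literally.
raw = r"c:\Users\Username\Desktop\file.txt"

same = (escaped == raw)
```

Writing this path in a normal string without doubling the backslashes does not work: \U starts a Unicode escape, so Python rejects the literal with a SyntaxError.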

(Hint: Inside Windows Explorer press CTRL+L to automatically select the path from the address bar.)

Alternative solution

We open the file twice to avoid having to reposition the file pointer. We could also have opened the file once with mode='rb+', but then we would have needed to move the pointer back to the start after reading its content (open_file.seek(0)) and truncate its original content before writing the new one (open_file.truncate(0)).

Simply opening the file again in write mode does that automatically for us.
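For completeness, the single-open variant described above might look like the following sketch. The function name convert_in_place is hypothetical, and only the Windows ➡ Unix direction is shown.

```python
def convert_in_place(file_path: str) -> None:
    # Open once for reading and writing in binary mode.
    with open(file_path, "rb+") as open_file:
        content = open_file.read()
        content = content.replace(b"\r\n", b"\n")
        # Move the pointer back to the start and drop the old content
        # before writing the converted bytes.
        open_file.seek(0)
        open_file.truncate(0)
        open_file.write(content)
```

The truncation matters here because the converted content is shorter than the original; without it, leftover bytes from the old content would remain at the end of the file.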

Cheers and happy programming,

winklerrr

How can I make all line endings (EOLs) in all files in Visual Studio Code, Unix-like?

In your project (workspace) settings, add/edit the following configuration option:

"files.eol": "\n"

This was added as of commit 639a3cb, so you would obviously need to be using a version after that commit.

Note: Even if you have a single CRLF in the file, the above setting will be ignored and the whole file will be converted to CRLF. You first need to convert all CRLF into LF before you can open it in Visual Studio Code.
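One way to do that conversion up front is a small script that walks the project folder. The following is a rough sketch, not a VS Code feature: convert_tree_to_lf and its suffix filter are assumptions made for illustration, and the filter should be adjusted so that binary files are never touched.

```python
from pathlib import Path

def convert_tree_to_lf(root: str, suffixes=(".txt", ".py")) -> list:
    # Rewrite CRLF as LF in every matching file below root and
    # return the paths of the files that were actually changed.
    changed = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        data = path.read_bytes()
        converted = data.replace(b"\r\n", b"\n")
        if converted != data:
            path.write_bytes(converted)
            changed.append(path)
    return changed
```

Calling convert_tree_to_lf(".") from the project root would process the whole project; files that already use LF are left alone.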

See also: https://github.com/Microsoft/vscode/issues/2957


