How to Run Dos2Unix on an Entire Directory

How can I run dos2unix on an entire directory?

find . -type f -print0 | xargs -0 dos2unix

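This recursively finds all regular files under the current directory and runs dos2unix on them. The -print0 and -0 options pass the file names null-delimited, so names containing spaces or newlines are handled correctly.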
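Before converting in place, it can help to see which files actually contain carriage returns. The following is a minimal sketch, assuming a grep that supports -I (skip binary files), as GNU and BSD grep do. The demo directory and file names are made up for illustration.

```shell
# Build a throwaway tree; the file names are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'one\r\ntwo\r\n' > "$demo/sub dir/dos file.txt"   # CRLF endings
printf 'one\ntwo\n'     > "$demo/unix.txt"               # LF endings

cr=$(printf '\r')   # a literal carriage return, portable across shells

# Same null-delimited pipeline as above, with grep -l standing in for
# dos2unix: it prints the names of files containing a CR and changes nothing.
crlf_files=$(cd "$demo" && find . -type f -print0 | xargs -0 grep -lI "$cr")
printf '%s\n' "$crlf_files"
```

Only the file with CRLF endings is listed, and the space in its path survives intact because of the null delimiters.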

How to run dos2unix on all files with all extensions in a directory and its sub-directories?

First solution (general)

The standard find program is designed precisely for that kind of task. Something along the lines of:

find Folder/ -type f -exec dos2unix '{}' '+'

This command

  1. explores Folder/ recursively,
  2. selects all files which are of type f (regular files, by contrast with directories, symbolic links and other types of special files),
  3. and executes dos2unix with all selected filenames.

'{}' is a placeholder that marks where the filename(s) are inserted into the command, and '+' terminates the command. You can also run dos2unix once per file (by replacing '+' with ';'), but since dos2unix accepts an arbitrary number of input arguments, '+' is the better choice because it avoids spawning many processes.
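The difference between the two terminators can be observed without touching any file. This is a small sketch in which echo stands in for dos2unix; the temporary directory and file names are assumptions made for the demonstration.

```shell
# Throwaway directory with three files; names are illustrative only.
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt" "$demo/c.txt"

# echo stands in for dos2unix so that nothing is modified.
# With '+', find appends all names to a single invocation: one output line.
batched=$(find "$demo" -type f -exec echo '{}' '+' | wc -l)

# With ';', find runs the command once per file: three output lines.
per_file=$(find "$demo" -type f -exec echo '{}' ';' | wc -l)

echo "output lines with '+': $batched, with ';': $per_file"
```

Each output line corresponds to one process started by find, which is why '+' scales better on large trees.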

find has many more options; they are all described in its manual page.

Second solution (specific to your problem)

If your shell is Bash, Bash supports recursive wildcards (other shells such as zsh probably have a similar feature). It is disabled by default, so you have to change a shell option:

shopt -s globstar

Then, ** is a wildcard that selects everything recursively (including directories and other special files, so you may need to filter it). After that, you can try:

dos2unix Folder/**

If you want the wildcard to actually select absolutely everything, including filenames starting with a dot (hidden files), you’ll also need to set another option: shopt -s dotglob.
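Since ** also matches directories and other non-regular files, one way to filter the result is a short loop. This is a sketch that requires Bash 4 or later (for globstar and arrays); the tree it builds is made up, and the final dos2unix call is left as a comment so the example changes nothing.

```shell
#!/usr/bin/env bash
# Throwaway tree; "Folder" mirrors the name used above, the rest is made up.
cd "$(mktemp -d)" || exit 1
mkdir -p Folder/sub
printf 'a\r\n' > Folder/top.txt
printf 'b\r\n' > Folder/sub/.hidden.txt

# Recursive **, include dotfiles, and expand to nothing when no file matches.
shopt -s globstar dotglob nullglob

# Keep only regular files that are not symbolic links.
files=()
for path in Folder/**; do
  if [[ -f $path && ! -L $path ]]; then
    files+=("$path")
  fi
done

printf '%s\n' "${files[@]}"
# To convert them, pass the array to dos2unix:
#   dos2unix -- "${files[@]}"
```

The directories Folder/ and Folder/sub are matched by ** but dropped by the -f test, so only the two files remain in the array.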

How to run dos2unix for all the files in subfolders in bash?

You could use find:

find . -type f -name "*.sh" -exec dos2unix {} +

This locates all *.sh files (-name "*.sh") that are regular files (-type f) in the current directory, recursing into subdirectories as well, and runs the dos2unix utility on all of them.

Convert dos2unix line endings for all files in a directory

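# Strip the trailing CR from every line of each .c and .h file (GNU sed); -exec also copes with spaces in file names.
find . -type f \( -name "*.c" -o -name "*.h" \) -exec sed -i 's/\r$//' {} +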

How to execute dos2unix command on multiple files in the same directory?

Ansible's command module simply executes a command, without passing it through a shell. But you are using globbing (the *), which is performed by a shell such as bash. So you should use the shell module, and keep the * outside the quotes so that the shell can expand it:

- name: Execute dos2unix on several files
  shell: dos2unix "{{ scripts_dir }}"/*.sh
  args:
    chdir: "/home/yourdir/"

Unable to check in Linux-format file from Windows

You don't need to do anything.

The explanation is a little tricky though. You need to be aware that when you use Git, there are always three versions of each active file.

The reason that two of these versions exist is obvious enough with a bit of thought. The third one is a bit odd; we'll get to that in a moment.

Start with the fact that each commit, as identified by its unique hash ID, stores a full snapshot of all of your files. This inside-a-commit snapshot stores the files in a special, read-only, Git-only format. Only Git can actually use these files. The special Git format causes de-duplication, so if you have the same versions of some file(s) in multiple commits, there's really only one copy of that file. That's why it's not a big deal that every commit has a full copy of every file: they're all shared whenever possible. Since each file is frozen for all time, it's easy to share it.

But because these copies of files literally can't be changed, and can't be used by any other non-Git program on your computer, they are no good for doing work. They are only useful as archived commits. So when you use git checkout (or git switch in Git 2.23 or later), you pick one commit that you'd like to have "checked out". Git copies all of the files from that commit, turning them from special, read-only, Git-only frozen files into regular everyday read/write files, in your computer's normal file format.

The copies that Git puts into normal everyday format, that you can see and work with, are in a work-space area. Git calls this your working tree or work-tree. Since these are ordinary files, you can use them, and even change them whenever you like.

So that's why there are two copies of each file in use: there's the frozen one in the current commit, and there's the normal-format one in your work-tree. But what about that third copy? This copy is in what Git calls its index, or the staging area (two terms for the same thing). This one sort of sits in between the frozen HEAD-commit copy, and the work-tree copy.

Let's draw a simple illustration of the three copies, assuming you have just two files named README.md and main.py:

   HEAD         index       work-tree
---------     ---------     ---------
README.md     README.md     README.md
main.py       main.py       main.py

All three copies of, say, main.py match—well, sort of—at the beginning, right after your initial git checkout. The HEAD one, the one in the current commit, is frozen: it literally cannot be changed. The work-tree copy is yours to do with as you like.

The one in between is the one that Git will put in the next commit you make. Right now, it matches the other two. But what if you change main.py? Let's add a version-number to each file:

    HEAD           index          work-tree
------------    ------------    ------------
README.md(1)    README.md(1)    README.md(1)
main.py(1)      main.py(1)      main.py(2)

You changed the work-tree copy, so we bumped the version number. (It's not actually in the file, we're just drawing it to keep track of what each copy looks like.)

If you want your changed main.py to go into your next commit, you must now run git add main.py. This copies the main.py file into Git's index, replacing the existing one. The new copy is in Git's frozen format, but isn't actually frozen yet:

    HEAD           index          work-tree
------------    ------------    ------------
README.md(1)    README.md(1)    README.md(1)
main.py(1)      main.py(2)      main.py(2)

... but now, after git add, the copy in the index is different from the HEAD copy (and is the same as the work-tree copy). If you run git commit now, Git will make a new frozen commit from the index copies of each file.

Notice that the copies are index → work-tree and work-tree → index

When Git extracts the commit initially, it needs the frozen-format files in its index. That's easy and is a straight copy.[1] Git needs to copy that frozen-format file to your work-tree, though, and this involves de-compressing and de-Git-izing it. Git makes this copy by extracting the index version (frozen-format) to your work-tree (a regular everyday file):

  • index → work-tree: de-compress

Meanwhile, your later git add has to copy your work-tree file into the index:

  • work-tree → index: compress into frozen format

What if, while Git is doing these copies, we gave Git the ability to turn Unix / Linux style LF-only line endings into Windows-style CRLF line endings? Then we just need this:

  • index → work-tree: de-compress and Windows-ify
  • work-tree → index: de-Windows-ify and re-compress

and that's what Git does, when you tell it to manipulate line endings.
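The work-tree → index direction can be observed in a scratch repository. This is a sketch assuming Git 2.8 or later (for git ls-files --eol); the repository location and file content are made up for the demonstration.

```shell
# Throwaway repository created only for this demonstration.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config core.autocrlf true        # ask Git to translate line endings

printf 'print("hi")\r\n' > main.py   # work-tree copy with CRLF endings
git add main.py                      # work-tree -> index: CRLF becomes LF

# In the report, i/ describes the index copy and w/ the work-tree copy.
eol_info=$(git ls-files --eol main.py)
echo "$eol_info"
```

The report shows i/lf next to w/crlf: the index copy was de-Windows-ified on the way in, while the work-tree file kept its CRLF endings.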


[1] Actually, it's even easier, because the index doesn't hold a true copy of the file, but rather a reference to an internal Git blob object. But you don't need to worry about this unless you get down into the details of using git ls-files --stage and git update-index.


What you did

You started out, and ran for some number of commits so far, with core.autocrlf set to true. This tells Git: do mess with my line endings. The files Git modifies are, by default, the ones it guesses are the right ones to have this done to them. (It's usually wiser to use .gitattributes to tell Git which files should be manipulated like this, rather than letting Git guess, but Git's guesses are pretty good, most of the time.)

Since Git already does the same de-Windows-line-ending work, the earlier files, committed on Windows, already have Linux-style LF-only line endings. The index copies, which always literally match the committed copies initially, also have Linux-style LF-only line endings.

Only your work-tree copies have other kinds of line endings and even then, they only have those line endings if you told Git to manipulate line endings (which you did).

When you tell Git don't mess with my line endings at all, then it matters whether your work-tree files have CRLF line endings, or LF-only line endings, because then git add will copy whatever you have in your work-tree, into Git's index, without messing with any line endings. Setting core.autocrlf to false, and not having any more-explicit settings in .gitattributes, does this, so now it becomes important to make sure your work-tree files have the endings you want to have in new copies you git add and then git commit.

You ran dos2unix on two work-tree files. This takes their CRLF Windows-style line endings, if they had those line endings in the first place (they probably did), and turns them into LF-only line endings. Then you ran git add. The git add step this time didn't de-Windows-ify the line endings, but did re-compress the files. The result was ... the same file that was already in the index, for each file, because the index copies have been Unix / Linux style all along.
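The sequence described above can be replayed in a scratch repository. In this sketch tr -d '\r' stands in for dos2unix, and the file name, commit identity, and commit message are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Commit a CRLF file while core.autocrlf is true: the stored copy gets LF.
git config core.autocrlf true
printf 'line one\r\nline two\r\n' > script.sh
git add script.sh
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'

# Turn conversion off, then strip the CRs; tr stands in for dos2unix here.
git config core.autocrlf false
tr -d '\r' < script.sh > script.tmp && mv script.tmp script.sh
git add script.sh

# Nothing is staged: the LF-only content matches what was already committed.
changes=$(git status --porcelain)
echo "pending changes: '${changes}'"
```

The empty status output is the reason there was nothing to check in: the converted file is byte-for-byte what the commit already holds.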

Note that it's now very important what line endings you put in each file, because with core.autocrlf turned off, and no .gitattributes entries, you've told Git: hands off all file contents: do not mess with line endings.

If you'd like Git to mess with line endings in a really predictable fashion, rather than guessing, you should create a .gitattributes file and list each file name or file-name-pattern and the correct treatment for that file. This is a bit painful to set up initially, but after that, tends to work well—it's what the Git folks do with the Git project.
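As an illustration, a .gitattributes file can be written and then queried with git check-attr to confirm which rule applies to a given path. The patterns and file names below are examples only, not a recommendation for any particular project.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Example rules only; adjust the patterns to the files in your project.
cat > .gitattributes <<'EOF'
*       text=auto
*.sh    text eol=lf
*.bat   text eol=crlf
*.png   binary
EOF

# Ask Git which treatment a (hypothetical) path would get.
sh_eol=$(git check-attr eol -- build.sh)
png_text=$(git check-attr text -- logo.png)
echo "$sh_eol"
echo "$png_text"
```

git check-attr does not require the paths to exist, so the rules can be checked before any matching file is added.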

Convert line-endings for whole directory tree (Git)

dos2unix does that for you. It is a fairly straightforward process.

dos2unix filename

Thanks to toolbear, here is a one-liner that recursively replaces line endings and properly handles whitespace, quotes, and shell meta chars.

find . -type f -exec dos2unix {} \;

If you're using dos2unix 6.0, binary files will be ignored.


