Remove Rows in File - Ruby

How to delete specific lines in text file?

Deleting lines cleanly and efficiently from a text file is "difficult" in the general case, but can be simple if you can constrain the problem somewhat.

Here are some SO questions that ask something similar:

  • How do I remove lines of data in the middle of a text file with Ruby
  • Deleting a specific line in a text file?
  • Deleting a line in a text file
  • Delete a line of information from a text file

There are numerous others, as well.

In your case, if your input file is relatively small, you can easily afford to use the approach that you're using. Really, the only thing that would need to change to meet your criteria is to modify your input file loop and condition to this:

File.open('output.txt', 'w') do |out_file|
  File.foreach('input.txt').with_index do |line, line_number|
    out_file.puts line if line_number.even? # <== line numbers start at 0
  end
end

The changes are to capture the line number, using the with_index method, which can be used because File.foreach returns an Enumerator when called without a block; the block now applies to with_index, and gains the line number as a second block argument. Using the line number in your comparison then gives you the criteria that you specified.

This approach will scale, even for somewhat large files, whereas solutions that read the entire file into memory have a fairly low upper limit on file size. With this solution, you're more constrained by available disk space and speed at which you can read/write the file; for instance, doing this to space-limited online storage may not work as well as you'd like. Writing to local disk or thumb drive, assuming that you have space available, should be no problem at all.
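
The same streaming pattern extends to deleting an arbitrary set of lines. The following is only a minimal sketch: the helper name copy_without_lines and its arguments are hypothetical (they do not come from the original answer), and line numbers are assumed to be zero-based, as with with_index.

```ruby
require 'set'

# Hypothetical helper: copy src to dst, skipping the given zero-based line numbers.
def copy_without_lines(src, dst, line_numbers)
  skip = line_numbers.to_set
  File.open(dst, 'w') do |out_file|
    File.foreach(src).with_index do |line, line_number|
      # write (not puts) so each line is copied exactly as it was read
      out_file.write(line) unless skip.include?(line_number)
    end
  end
end
```

For example, copy_without_lines('input.txt', 'output.txt', [1, 3]) would leave out the second and fourth lines. Only one line is held in memory at a time, so the same size considerations as above should apply.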

How do I remove lines of data in the middle of a text file with Ruby

You can delete a line in several ways:

  • Simulate deletion. That is, just overwrite the line's content with spaces. Later, when you read and process the file, just ignore such blank lines.

    Pros: this is easy and fast. Cons: it's not real deletion of data (file doesn't shrink) and you need to do more work when reading/processing the file.

    Code:

    f = File.new(filename, 'r+')
    f.each do |line|
      if should_be_deleted(line)
        # seek back to the beginning of the line (seek offsets are in bytes)
        f.seek(-line.bytesize, IO::SEEK_CUR)

        # overwrite the line with spaces and add a newline char
        f.write(' ' * (line.bytesize - 1))
        f.write("\n")
      end
    end
    f.close

    File.new(filename).each {|line| p line }

    # >> "Person1,will,23\n"
    # >> " \n"
    # >> "Person3,Mike,44\n"
  • Do real deletion. This means that the line will no longer exist. To do this in place, you would have to read the next line and overwrite the current line with it, then repeat this for all following lines until the end of the file is reached. This is an error-prone task (lines of different lengths, etc.), so here's a more robust alternative: open a temp file, write to it the lines up to (but not including) the line you want to delete, skip the line you want to delete, then write the rest to the temp file. Delete the original file and rename the temporary one to use its name. Done.

    While this is technically a total rewrite of the file, it does differ from what you asked. The file doesn't need to be loaded fully into memory. You need only one line at a time. Ruby provides a method for this: IO#each_line.

    Pros: No assumptions. Lines get deleted. Reading code need not be altered. Cons: a lot more work when deleting the line (not only the code, but also IO/CPU time).

    There is a snippet that illustrates this approach in @azgult's answer.
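
Since the referenced snippet is not reproduced here, the temp-file approach described above can be sketched roughly as follows. This is not @azgult's code: the helper name delete_lines! and its block interface are hypothetical, and the sketch assumes the temp file can be created in the same directory as the original.

```ruby
require 'tempfile'
require 'fileutils'

# Hypothetical helper: rewrite the file at path, dropping every line
# for which the given block returns true.
def delete_lines!(path)
  # Create the temp file next to the original so the final move stays on one filesystem.
  temp = Tempfile.new(File.basename(path), File.dirname(path))
  begin
    # Stream one line at a time; the whole file is never held in memory.
    File.foreach(path) { |line| temp.write(line) unless yield(line) }
    temp.close
    # Tempfile creates files with restrictive permissions; copy the original's.
    File.chmod(File.stat(path).mode & 0o7777, temp.path)
    FileUtils.mv(temp.path, path)
  ensure
    # Close and remove the temp file if it is still around (for example after an error).
    temp.close!
  end
end
```

A call such as delete_lines!(filename) { |line| should_be_deleted(line) } would reuse the predicate from the first snippet. If the program stops before the move, the original file is left untouched, which is the main reason to prefer this over overwriting in place.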

How to remove a row from a CSV with Ruby

You should be able to use CSV::Table#delete_if, but you need to use CSV.table instead of CSV.read, because the former gives you a CSV::Table object, whereas the latter returns an Array of Arrays by default. Be aware that CSV.table also converts the headers to symbols and converts numeric-looking fields to numbers.

require 'csv'

table = CSV.table(@csvfile)

# drop every row whose :foo column holds the string 'true'
table.delete_if do |row|
  row[:foo] == 'true'
end

File.open(@csvfile, 'w') do |f|
  f.write(table.to_csv)
end
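
To try delete_if without touching a file, the same steps can be run on a CSV string. This is only a sketch: the helper name remove_true_rows and the sample data are made up for illustration, and the :foo column simply mirrors the snippet above.

```ruby
require 'csv'

# Hypothetical helper: parse CSV text, drop rows whose :foo column is 'true',
# and return the remaining table as CSV text.
def remove_true_rows(csv_text)
  # headers: true makes CSV.parse return a CSV::Table;
  # header_converters: :symbol gives symbol headers, as CSV.table does.
  table = CSV.parse(csv_text, headers: true, header_converters: :symbol)
  table.delete_if { |row| row[:foo] == 'true' }
  table.to_csv
end

puts remove_true_rows("name,foo\nalice,true\nbob,false\n")
```

Only the header line and the bob row should remain in the printed result.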

Deleting a specific line in a text file?

I think you can't do that safely because of file system limitations.

If you really want to do in-place editing, you could try to read the file into memory, edit it, and then replace the old file. But beware that there are at least two problems with this approach. First, if your program stops in the middle of rewriting, you will get an incomplete file. Second, if your file is too big, it will eat your memory.

file_lines = ''

IO.readlines(your_file).each do |line|
  file_lines += line unless <put here your condition for removing the line>
end

<extra string manipulation to file_lines if you wanted>

File.open(your_file, 'w') do |file|
  file.puts file_lines
end

Something along those lines should work, but using a temporary file is much safer and is the standard approach:

require 'fileutils'

File.open(output_file, "w") do |out_file|
  File.foreach(input_file) do |line|
    out_file.puts line unless <put here your condition for removing the line>
  end
end

FileUtils.mv(output_file, input_file)

Your condition could be anything that identifies the unwanted line. For example, file_lines += line unless line.chomp == "aaab" would remove the line "aaab".

What's the best way to remove the last n lines of a file in a ruby script?

This is about the simplest way I can think of to do this in pure Ruby that also works for large files, since it processes one line at a time instead of reading the whole file into memory:

INFILE = "input.txt"
OUTFILE = "output.txt"

# first pass: count the lines without loading the file into memory
total_lines = File.foreach(INFILE).inject(0) { |c, _| c+1 }
# keep everything except the last 4 lines
desired_lines = total_lines - 4

# open output file for writing
File.open(OUTFILE, 'w') do |outfile|
  # open input file for reading
  File.foreach(INFILE).with_index do |line, index|
    # stop after reaching the desired line number
    break if index == desired_lines

    # copy lines from infile to outfile
    outfile << line
  end
end

However, this is about twice as slow as what you posted on a 160 MB file I created. You can shave off about a third by using wc to get the total lines, and using pure Ruby for the rest:

total_lines = `wc -l < #{INFILE}`.strip.to_i
# rest of the Ruby File code

Another caveat is that your CSV must not have its own line breaks within any cell content. If it does, you would need a CSV parser, and CSV.foreach(INFILE) do |row| could be used instead, although it is quite a bit slower in my limited testing. You mentioned above, though, that your cells should be OK to be processed line by line.

That said, what you posted using wc and dd is much faster, so maybe you should keep using that.
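
If you do stay in pure Ruby, one possible variation avoids the separate counting pass by holding back the most recent n lines in a small buffer. This is a different technique from the two-pass code above, offered only as an untimed sketch; the helper name copy_without_last_lines is hypothetical, and it still assumes that no cell contains an embedded line break.

```ruby
# Hypothetical helper: copy src to dst in a single pass, leaving out the last n lines.
def copy_without_last_lines(src, dst, n)
  buffer = []
  File.open(dst, 'w') do |outfile|
    File.foreach(src) do |line|
      buffer << line
      # Once more than n lines are held back, the oldest one cannot be
      # among the last n lines, so it is safe to write it out.
      outfile << buffer.shift if buffer.size > n
    end
  end
  # Whatever is still in the buffer is the last n lines, which are dropped.
end
```

Calling copy_without_last_lines(INFILE, OUTFILE, 4) would correspond to the example above. Memory use is bounded by n lines rather than by the file size.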

remove rows in file - Ruby

You can use this to get the unique lines of a CSV file as an array:

File.readlines("file.csv").uniq
=> ["350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 767 lbs., 300-2080\n", "350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 817 lbs., 300-2580\n", "350 lbs., Outrigger Footprint, 69\" x 61\", Weight, 867 lbs., 300-3080\n"]

To write the result to a new file, open that file in write mode and write the unique lines to it:

File.open("new_csv", "w+") { |file| file.puts File.readlines("csv").uniq }

For comparing values, you can split each line on "," to access each column, like this:

rows = File.readlines("csv").map(&:chomp) # equivalent to File.readlines("csv").map { |f| f.chomp }
mapped_columns = rows.map { |r| r.split(",").map(&:strip) }
=> [["350 lbs.", "Outrigger Footprint", "61\" x 53\"", "Weight", "767 lbs.", "300-2080"], ["350 lbs.", "Outrigger Footprint", "61\" x 53\"", "Weight", "817 lbs.", "300-2580"], .....]
mapped_columns[0][5]
=> "300-2080"

If you want more functionality, you are better off using Ruby's standard CSV library (require 'csv'), which replaced the FasterCSV gem as of Ruby 1.9.
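
Building on the split approach, rows can also be de-duplicated by a single column rather than by the whole line. This is a rough sketch: the helper name unique_by_column is hypothetical, and the naive split assumes that no field contains a comma.

```ruby
# Hypothetical helper: keep the first line seen for each distinct value
# in the given zero-based column.
def unique_by_column(lines, column)
  lines.uniq { |line| line.chomp.split(",").map(&:strip)[column] }
end
```

For instance, unique_by_column(File.readlines("file.csv"), 5) would keep one line per distinct value in the sixth column.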

Remove row of a CSV file based on the index by iterating over the rows

The thing you thought of is not complicated: just rewrite the file.

File.open(@filepath, 'w') { |f| f.puts(@file) }
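
For completeness, removing a row by its index and rewriting the file could look something like the sketch below. The helper name delete_csv_row_at is hypothetical, the index is assumed to be zero-based with any header line counted as row 0, and the whole file is read into memory, so this only suits files of modest size.

```ruby
require 'csv'

# Hypothetical helper: remove the row at a zero-based index and rewrite the file.
def delete_csv_row_at(path, index)
  rows = CSV.read(path)   # Array of Arrays, one inner array per row
  rows.delete_at(index)   # does nothing if the index is out of range
  CSV.open(path, 'w') { |csv| rows.each { |row| csv << row } }
end
```

A call such as delete_csv_row_at(@filepath, 2) would drop the third line of the file.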

