How to delete specific lines in text file?
Deleting lines cleanly and efficiently from a text file is "difficult" in the general case, but can be simple if you can constrain the problem somewhat.
Here are some similar questions from SO:
- How do I remove lines of data in the middle of a text file with Ruby
- Deleting a specific line in a text file?
- Deleting a line in a text file
- Delete a line of information from a text file
There are numerous others, as well.
In your case, if your input file is relatively small, you can easily afford to use the approach that you're using. Really, the only thing that would need to change to meet your criteria is to modify your input file loop and condition to this:
File.open('output.txt', 'w') do |out_file|
  File.foreach('input.txt').with_index do |line, line_number|
    out_file.puts line if line_number.even? # <== line numbers start at 0
  end
end
The changes are to capture the line number, using the with_index method, which can be used because File.foreach returns an Enumerator when called without a block; the block now applies to with_index, and gains the line number as a second block argument. Simply using the line number in your comparison gives you the criteria that you specified.
This approach will scale, even for somewhat large files, whereas solutions that read the entire file into memory have a fairly low upper limit on file size. With this solution, you're more constrained by available disk space and speed at which you can read/write the file; for instance, doing this to space-limited online storage may not work as well as you'd like. Writing to local disk or thumb drive, assuming that you have space available, should be no problem at all.
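The same streaming loop can be generalized from "every other line" to an arbitrary set of line numbers. The following is only a sketch under that assumption; copy_without_lines and its parameters are hypothetical names, not part of Ruby's API.

```ruby
require 'set'

# Copy input_path to output_path, skipping the zero-based line numbers
# listed in unwanted. Hypothetical helper: only one line is held in
# memory at a time, as in the loop above.
def copy_without_lines(input_path, output_path, unwanted)
  skip = unwanted.to_set
  File.open(output_path, 'w') do |out_file|
    File.foreach(input_path).with_index do |line, line_number|
      # write (not puts) copies the line exactly, including its line ending
      out_file.write(line) unless skip.include?(line_number)
    end
  end
end
```

For instance, copy_without_lines('input.txt', 'output.txt', [1, 3]) would drop the second and fourth lines.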
How do I remove lines of data in the middle of a text file with Ruby
You can delete a line in several ways:
Simulate deletion. That is, just overwrite the line's content with spaces. Later, when you read and process the file, just ignore such blank lines.
Pros: this is easy and fast. Cons: it's not real deletion of data (the file doesn't shrink) and you need to do more work when reading/processing the file.
Code:
f = File.new(filename, 'r+')
f.each do |line|
  if should_be_deleted(line)
    # seek back to the beginning of the line (offsets are in bytes,
    # so use bytesize rather than length for multibyte text)
    f.seek(-line.bytesize, IO::SEEK_CUR)
    # overwrite line with spaces and add a newline char
    f.write(' ' * (line.bytesize - 1))
    f.write("\n")
  end
end
f.close
File.new(filename).each {|line| p line }
# >> "Person1,will,23\n"
# >> " \n"
# >> "Person3,Mike,44\n"
Do real deletion. This means that the line will no longer exist. So you will have to read the next line and overwrite the current line with it, then repeat this for all following lines until the end of the file is reached. This seems to be an error-prone task (lines of different lengths, etc.), so here's a safer alternative: open a temp file, write to it the lines up to (but not including) the line you want to delete, skip the line you want to delete, and write the rest to the temp file. Delete the original file and rename the temporary one to use its name. Done.
While this is technically a total rewrite of the file, it does differ from what you asked. The file doesn't need to be loaded fully into memory. You need only one line at a time. Ruby provides a method for this: IO#each_line.
Pros: No assumptions. Lines get deleted. The reading code does not need to be altered. Cons: a lot more work when deleting the line (not only the code, but also IO/CPU time).
There is a snippet that illustrates this approach in @azgult's answer.
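As a rough sketch of the temp-file approach described above (not the snippet from the answer mentioned): delete_lines_in_place is a hypothetical helper name, and the ".tmp" suffix is an assumption that no file with that name already exists.

```ruby
require 'fileutils'

# Remove every line for which the block returns true by streaming the
# file into a temporary file and then renaming it over the original.
# Hypothetical helper; only one line is in memory at a time.
def delete_lines_in_place(path)
  tmp_path = "#{path}.tmp" # assumed not to collide with an existing file
  File.open(tmp_path, 'w') do |tmp|
    File.foreach(path) do |line|
      tmp.write(line) unless yield(line)
    end
  end
  # Replace the original only after the temp file is completely written
  FileUtils.mv(tmp_path, path)
end
```

A call such as delete_lines_in_place(filename) { |line| line.start_with?('Person2,') } would then drop the matching line, and the file shrinks accordingly.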
How to remove a row from a CSV with Ruby
You should be able to use CSV::Table#delete_if, but you need to use CSV::table instead of CSV::read, because the former will give you a CSV::Table object, whereas the latter results in an Array of Arrays. Be aware that CSV::table will also convert the headers to symbols.
require 'csv'

table = CSV.table(@csvfile)
# drop every row whose foo column holds the string 'true'
table.delete_if do |row|
  row[:foo] == 'true'
end
File.open(@csvfile, 'w') do |f|
  f.write(table.to_csv)
end
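To see the effect without touching a file, here is a small in-memory sketch. The sample data and column names are made up for illustration, and CSV.parse is called with headers: true and header_converters: :symbol so that it returns a CSV::Table with symbol headers, similar to what CSV.table does for a file.

```ruby
require 'csv'

# Made-up sample data; the column names are assumptions for illustration.
data = <<~DATA
  name,foo
  alice,true
  bob,false
DATA

# headers: true makes CSV.parse return a CSV::Table instead of an Array
# of Arrays; header_converters: :symbol turns "foo" into :foo.
table = CSV.parse(data, headers: true, header_converters: :symbol)
table.delete_if { |row| row[:foo] == 'true' }

result = table.to_csv
puts result
```

The remaining table keeps its header row, so writing table.to_csv back out yields a valid CSV with only the rows that did not match.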
Deleting a specific line in a text file?
I think you can't do that safely because of file system limitations.
If you really want to do in-place editing, you could try reading the file into memory, editing it, and then replacing the old file. But beware that there are at least two problems with this approach. First, if your program stops in the middle of rewriting, you will get an incomplete file. Second, if your file is too big, it will eat your memory.
file_lines = ''
IO.readlines(your_file).each do |line|
  # replace the example condition with your own test for unwanted lines
  file_lines += line unless line.chomp == "aaab"
end
# (optional) extra string manipulation on file_lines goes here
File.open(your_file, 'w') do |file|
  file.puts file_lines
end
Something along those lines should work, but using a temporary file is much safer and is the standard approach:
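require 'fileutils'
File.open(output_file, "w") do |out_file|
  File.foreach(input_file) do |line|
    # replace the example condition with your own test for unwanted lines
    out_file.puts line unless line.chomp == "aaab"
  end
end
# replace the original only after the new file is completely written
FileUtils.mv(output_file, input_file)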
Your condition could be anything that identifies the unwanted line; for example, file_lines += line unless line.chomp == "aaab" would remove the line "aaab".
What's the best way to remove the last n lines of a file in a ruby script?
This is about the simplest way I can think of to do this in pure Ruby that also works for large files, since it processes one line at a time instead of reading the whole file into memory:
INFILE = "input.txt"
OUTFILE = "output.txt"
total_lines = File.foreach(INFILE).inject(0) { |c, _| c+1 }
desired_lines = total_lines - 4
# open output file for writing
File.open(OUTFILE, 'w') do |outfile|
  # open input file for reading
  File.foreach(INFILE).with_index do |line, index|
    # stop after reaching the desired line number
    break if index == desired_lines
    # copy lines from infile to outfile
    outfile << line
  end
end
However, this is about twice as slow as what you posted on a 160 MB file I created. You can shave off about a third by using wc to get the total lines, and using pure Ruby for the rest:
total_lines = `wc -l < #{INFILE}`.strip.to_i
# rest of the Ruby File code
Another caveat is that your CSV must not have its own line breaks within any cell content. If it does, you would need a CSV parser, and CSV.foreach(INFILE) do |row| could be used instead, but it is quite a bit slower in my limited testing. You mentioned above, though, that your cells should be OK to process line by line.
That said, what you posted using wc and dd is much faster, so maybe you should keep using that.
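If you stay in pure Ruby, the separate line-counting pass can be avoided by holding back the most recent n lines while copying. This is only a sketch of that idea, not a benchmarked replacement; copy_without_last_lines is a hypothetical helper name.

```ruby
# Copy input_path to output_path without its last n lines, in one pass.
# Hypothetical helper: at most n lines are buffered at any time.
def copy_without_last_lines(input_path, output_path, n)
  pending = [] # lines that might still turn out to be in the last n
  File.open(output_path, 'w') do |outfile|
    File.foreach(input_path) do |line|
      pending << line
      # once more than n lines are held, the oldest cannot be in the tail
      outfile << pending.shift if pending.size > n
    end
  end
  # whatever is left in pending is the tail, which is simply not written
end
```

With n set to 4, copy_without_last_lines(INFILE, OUTFILE, 4) should produce the same output file as the two-pass version above.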
remove rows in file - Ruby
You can use this to get the unique lines of a CSV file as an array:
File.readlines("file.csv").uniq
=> ["350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 767 lbs., 300-2080\n", "350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 817 lbs., 300-2580\n", "350 lbs., Outrigger Footprint, 69\" x 61\", Weight, 867 lbs., 300-3080\n"]
To write it to a new file, open a file in write mode and write the unique lines into it:
File.open("new_csv", "w+") { |file| file.puts File.readlines("csv").uniq }
For comparing values, you can use the split method on "," to access each column like this:
rows = File.readlines("csv").map(&:chomp) # equivalent to File.readlines("csv").map { |line| line.chomp }
mapped_columns = rows.map { |r| r.split(",").map(&:strip) }
=> [["350 lbs.", "Outrigger Footprint", "61\" x 53\"", "Weight", "767 lbs.", "300-2080"], ["350 lbs.", "Outrigger Footprint", "61\" x 53\"", "Weight", "817 lbs.", "300-2580"], .....]
mapped_columns[0][5]
=> "300-2080"
If you want more functionality, you are better off using the FasterCSV gem (on Ruby 1.9 and later it is the standard library's CSV, loaded with require 'csv').
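One case where a real parser matters is a field that itself contains a comma. The sketch below uses a made-up line (not taken from the file above) to compare a plain split with CSV.parse_line from the standard library. Note that the inch marks in the sample rows above are bare quote characters, which a strict CSV parser may reject, so that data would need cleaning first.

```ruby
require 'csv'

# Made-up line with a quoted field that contains a comma.
line = '"Smith, John",42,Weight'

# Splitting on "," cuts the quoted field in two and keeps the quote marks.
naive = line.split(',').map(&:strip)

# CSV.parse_line understands the quoting and returns three fields.
parsed = CSV.parse_line(line)

p naive
p parsed
```

Here naive ends up with four elements while parsed has three, with "Smith, John" kept as a single field.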
Remove row of a CSV file based on the index by iterating over the rows
What you thought of is not complicated: just rewrite the file.
File.open(@filepath, 'w') { |f| f.puts(@file) }
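Spelled out a little more, removing a row by its index and rewriting the file could look like the sketch below. delete_row_at is a hypothetical helper name, and it assumes the file is small enough to read into memory and that no field contains an embedded line break.

```ruby
# Remove the row at a zero-based index by rewriting the whole file.
# Hypothetical helper; reads every line into memory first.
def delete_row_at(path, index)
  lines = File.readlines(path)
  lines.delete_at(index) # no effect if index is out of range
  File.write(path, lines.join)
end
```

Calling delete_row_at(@filepath, 1) would remove the second line of the file, which is the first data row if the file starts with a header.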