How to write Rake task to import data to Rails app?
I wouldn't delete the products and vendors tables on every cycle. Is this a rails app? If so there are some really nice ActiveRecord helpers that would come in handy for you.
If you have a Product ActiveRecord model, you can do:
# On Rails 4 and later; older versions spelled this find_or_initialize_by_identifier(...)
p = Product.find_or_initialize_by(identifier: <id you get from file>)
p.name = <name from file>
p.size = <size from file>
# etc...
p.save!
The find_or_initialize call looks up the product in the database by the identifier you specify, and if it can't find it, it builds a new, unsaved one; the later save! then inserts or updates the row. The really handy thing about doing it this way is that ActiveRecord only writes to the database if some of the data has changed, and it automatically updates any timestamp fields you have in the table (updated_at) accordingly. One more thing: since you would be looking up records by the identifier (the id from the file), I would make sure to add an index on that field in the database.
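The lookup-then-update pattern can be illustrated without a database. This is a minimal plain-Ruby sketch, not ActiveRecord itself: STORE, upsert, and the sample attributes are hypothetical stand-ins, with a Hash playing the role of the products table.

```ruby
# Plain-Ruby sketch of "find or initialize, then save only when changed".
# STORE stands in for the products table; it is not ActiveRecord.
STORE = {}

# Returns :created, :updated, or :unchanged so the caller can see what happened.
def upsert(identifier, attrs)
  existing = STORE[identifier]
  if existing.nil?
    STORE[identifier] = attrs.merge(updated_at: Time.now)
    :created
  elsif existing.reject { |key, _| key == :updated_at } == attrs
    :unchanged # nothing differs, so no write and no timestamp bump
  else
    STORE[identifier] = existing.merge(attrs).merge(updated_at: Time.now)
    :updated
  end
end

first  = upsert("A1", name: "Widget", size: "M")
second = upsert("A1", name: "Widget", size: "M")
third  = upsert("A1", name: "Widget", size: "L")
puts [first, second, third].inspect
```

Running the same import twice leaves unchanged records alone, which is the property that makes re-running the task cheap.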
To make a rake task to accomplish this, I would add a rake file to the lib/tasks directory of your rails app. We'll call it data.rake.
Inside data.rake, it would look something like this:
namespace :data do
  desc "import data from files to database"
  task :import => :environment do
    # File.foreach reads line by line and closes the file when done
    File.foreach(<file to import>) do |line|
      attrs = line.chomp.split(":")
      p = Product.find_or_initialize_by(identifier: attrs[0])
      p.name = attrs[1]
      # etc...
      p.save!
    end
  end
end
Then, to call the rake task, use "rake data:import" from the command line.
How to write a Rake task that imports data and handles deletions?
You definitely should not delete all the records and then recreate them all from the data. This will create all sorts of problems, e.g. breaking any foreign key fields in other tables that used to point to the object before it was deleted. It's like knocking a house down and rebuilding it in order to have a different coloured door. So "see if it's there; if it is, update it (if it's different); if it's not, create it" is the right strategy to use.
You don't say what your criteria for deletion are, but if it is "any record which isn't mentioned in the import data should be deleted" then you just need to keep track of some unique field from your input data and then delete all records whose own unique field isn't in that list.
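The "keep track of a unique field, then delete whatever is not in the list" idea is just a set difference. Here is a minimal plain-Ruby sketch of that step; stale_identifiers and the sample data are hypothetical, and plain arrays stand in for the database table.

```ruby
require 'set'

# Hypothetical helper: given identifiers already stored and the lines of an
# import file ("identifier:name:..."), return the identifiers to delete.
def stale_identifiers(existing_identifiers, import_lines)
  seen = Set.new
  import_lines.each do |line|
    identifier = line.chomp.split(":").first
    seen << identifier unless identifier.nil? || identifier.empty?
  end
  existing_identifiers.reject { |identifier| seen.include?(identifier) }
end

existing = %w[A1 B2 C3]
lines    = ["A1:Widget:M\n", "C3:Gadget:L\n", "D4:Gizmo:S\n"]
puts stale_identifiers(existing, lines).inspect
```

Note that an empty import file would mark every existing record as stale, so a real task may want to abort in that case rather than delete everything.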
So, your code to do the import could look something like this (copying the code from the other question; this code sets the data in a horribly clunky way, but I'm not going to address that here):
namespace :data do
  desc "import data from files to database"
  task :import => :environment do
    identifiers = []
    File.foreach(<file to import>) do |line|
      # disclaimer: this way of setting the data from attrs[0], attrs[1] etc is crappy and fragile and is not how I would do it
      attrs = line.chomp.split(":")
      identifier = attrs[0]
      identifiers << identifier
      # find_or_initialize_by always returns a record, so no "if" is needed
      p = Product.find_or_initialize_by(identifier: identifier)
      p.name = attrs[1]
      # etc...
      p.save!
    end
    # destroy any which didn't appear in the import data
    Product.where("identifier NOT IN (?)", identifiers).destroy_all
  end
end
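The disclaimer in the code above notes that indexing into attrs by position is fragile. One possible alternative, sketched here under assumptions, is to name the fields once and reject malformed lines. FIELDS and parse_line are hypothetical, and the column order is a guess at the file layout, not something the question specifies.

```ruby
# Hypothetical column order for the colon-separated file; adjust to the real layout.
FIELDS = %i[identifier name size].freeze

# Turns "A1:Widget:M" into a hash keyed by field name.
# Raises on lines with the wrong number of fields instead of importing bad data.
def parse_line(line)
  values = line.chomp.split(":", -1) # -1 keeps trailing empty fields
  unless values.length == FIELDS.length
    raise ArgumentError, "expected #{FIELDS.length} fields, got #{values.length}: #{line.inspect}"
  end
  FIELDS.zip(values).to_h
end

puts parse_line("A1:Widget:M\n").inspect
```

The resulting hash could then be split into the lookup key and the attributes to assign, so adding a column means changing FIELDS rather than renumbering indexes.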
How can I import a CSV file via a rake task?
Under your project folder, in lib/tasks, create a rake file, say "import_incidents_csv.rake".
Follow the answer in:
Ruby on Rails - Import Data from a CSV file
In the rake file, put the following code:
require 'csv'
namespace :import_incidents_csv do
  task :create_incidents => :environment do
    # code from the link
  end
end
You can run this task with "rake import_incidents_csv:create_incidents".
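The linked answer is not reproduced here. As a rough illustration only of what reading CSV rows with Ruby's standard csv library can look like, here is a sketch; the column names, SAMPLE_CSV, and incident_attributes are all hypothetical and not taken from the link.

```ruby
require 'csv'

# Hypothetical sample data; a real task would read a file with
# CSV.foreach(path, headers: true) instead of parsing a string.
SAMPLE_CSV = <<~CSV_DATA
  title,severity
  Disk full,high
  Login slow,low
CSV_DATA

# Converts CSV text with a header row into an array of attribute hashes,
# keyed by the header names.
def incident_attributes(csv_text)
  CSV.parse(csv_text, headers: true).map(&:to_h)
end

incident_attributes(SAMPLE_CSV).each { |attrs| puts attrs.inspect }
```

Inside the rake task, each hash could be handed to the model's create! call, provided the headers match the model's attribute names.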
creating rake task for importing data from csv file
I had the same issue while writing a rake task to populate data in the database.
In my case the error was the same, and the cause was simply that I was running the rake task the wrong way.
Judging by the error, I guess you are doing the same.
You are running rake tech:temp
which treats tech as the namespace and temp as the task. That is the wrong way round: the namespace comes first, then the task name,
so the right command is
rake temp:tech
I hope this works. It is a silly mistake, I know.
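To see why the order matters, here is a small sketch using Rake's own API. It assumes, as the right command above implies, a namespace named temp containing a task named tech; the task body is a placeholder.

```ruby
require 'rake'

# Assumed names: namespace "temp", task "tech" (the task body is a placeholder).
namespace :temp do
  task :tech do
    puts "importing..."
  end
end

# Tasks are registered as "namespace:task", so only the first lookup succeeds.
puts Rake::Task.task_defined?("temp:tech") # namespace first, then task
puts Rake::Task.task_defined?("tech:temp") # reversed order is unknown to Rake
Rake::Task["temp:tech"].invoke
```

Running "rake -T" in the project lists the full names of every task that has a description, which is a quick way to check the spelling before invoking one.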
Writing TestCase for CSV import rake task
I haven't worked with engines, but is there a way to just put the CSV importing logic into its own class?
namespace :web_import do
  desc 'Import users from csv'
  task users: :environment do
    WebImport.new(url: 'http://blablabla.com/content/people.csv').call
  end
end

class WebImport # (or whatever name you want)
  # keyword argument, to match WebImport.new(url: ...) in the task
  def initialize(url:)
    @url = url
  end

  def call
    # counter, CSV parse, etc...
  end
end
That way you can jump into the Rails console and run WebImport by hand, and you can also write a test that exercises WebImport in isolation. When you write Rake tasks and jobs (Sidekiq etc.), you want the Rake task to be as thin a wrapper as possible around the actual meat of the code (in this case, the CSV parsing). Separate the "trigger the CSV parse" code from the "actually parse the CSV" code into their own classes or files.
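One way to make such a class easy to test is to pass in both the CSV text and the code that saves a row. The following is only a sketch under assumptions: UserCsvImport, its create_user: argument, and the sample columns are hypothetical, and fetching the file from the URL is left out so that the example needs neither a network nor a database.

```ruby
require 'csv'

# Hypothetical importer: takes CSV text plus a callable that persists one row,
# so it can be exercised without a network connection or a database.
class UserCsvImport
  attr_reader :imported_count

  def initialize(csv_text, create_user:)
    @csv_text = csv_text
    @create_user = create_user
    @imported_count = 0
  end

  # Parses the CSV, hands each row to the callable, and counts the rows.
  def call
    CSV.parse(@csv_text, headers: true) do |row|
      @create_user.call(row.to_h)
      @imported_count += 1
    end
    self
  end
end

saved = []
import = UserCsvImport.new("name,email\nAda,ada@example.com\n",
                           create_user: ->(attrs) { saved << attrs }).call
puts "imported #{import.imported_count} user(s)"
```

In the rake task the callable could wrap the model's create! call, while a test passes a lambda that collects rows into an array, as above.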
Rails 5 - Rake task to import data from CSV file
There might be two problems here.
1. Encoding in your CSV file:
ArgumentError: invalid byte sequence in UTF-8
2. Undefined local variable:
NameError: undefined local variable or method 'randd_fields' for main:Object
I guess you are trying to count created/imported records:
for_code = Randd::Field.create(anz_reference: row["anz_reference"], title: title)
counter += 1 if randd_field.persisted?
The record you created is for_code, however you are checking against randd_field.
This should fix it:
counter += 1 if for_code.persisted?
Updated:
$ bundle exec rake import:randd_fields --trace
** Invoke import:randd_fields (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute import:randd_fields
nil
rake aborted!
NameError: undefined local variable or method `title' for main:Object
This is because the title variable is not defined. I guess you want to use row[] here.
for_code = Randd::Field.create(anz_reference: row["anz_reference"], title: row['title'])
Updated 2:
You have a typo in your rake task name
Updated 3:
I think you are calling bundle exec rake import:randd_fields inside the Rails console.
Running it directly in the terminal should fix it.
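For the first problem (ArgumentError: invalid byte sequence in UTF-8), one possible approach is to re-encode the file contents before parsing. This is a sketch under a strong assumption: to_utf8 is a hypothetical helper, and it guesses that any input that is not valid UTF-8 is really ISO-8859-1 (Latin-1), which must be checked against the real file.

```ruby
# Hypothetical helper: returns UTF-8 text, assuming any invalid input is really
# ISO-8859-1 (Latin-1). That assumption must be checked against the real file.
def to_utf8(raw_bytes)
  text = raw_bytes.dup.force_encoding(Encoding::UTF_8)
  return text if text.valid_encoding?

  # Reinterpret the same bytes as Latin-1 and transcode them to UTF-8.
  raw_bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

latin1_bytes = "caf\xE9".b # "café" encoded as Latin-1, which is invalid as UTF-8
puts to_utf8(latin1_bytes)
```

The converted string can then be passed to the CSV parser; input that is already valid UTF-8 is returned unchanged.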
Rails Rake Task drop table and import from csv
Try truncating the table by running a custom SQL command:
namespace :csvimportproducts do
  desc "Import Products CSV Data."
  task :import_products_csv_data => :environment do
    ActiveRecord::Base.connection.execute("TRUNCATE TABLE products")
    require 'csv'
    csv_file_path = '/home/jay/workspace/db/import_tables/products.csv'
    CSV.foreach(csv_file_path) do |row|
      Product.create!(
        :product_id => row[0],
        :product_name => row[1]
      )
    end
  end
end
Rails (rake) Data Import Concurrency
For the question,
is it possible to use the parallel gem with find_each? I cannot find anything in their documentation or examples online doing such. Is there another solution I can do to for iterating over the Customers concurrently?
I would recommend using find_in_batches from ActiveRecord. You can query for a batch of records and then iterate over each element in the batch using Parallel. For example, it can be something like:
User.find_in_batches do |batch|
  Parallel.each(batch, in_processes: 8) do |user|
    ...
  end
end
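The shape of that loop can be tried out without Rails or the parallel gem. The sketch below is a standard-library analogue, not the same technique: each_slice stands in for find_in_batches, threads stand in for Parallel's worker processes, and process_customer is a hypothetical placeholder for the real per-record work.

```ruby
# Stdlib-only analogue: each_slice stands in for find_in_batches and threads
# stand in for Parallel.each. process_customer is a hypothetical placeholder.
def process_customer(id)
  id * 2
end

def process_in_batches(ids, batch_size:)
  results = []
  ids.each_slice(batch_size) do |batch|
    # One thread per record in the batch; value waits for and returns each result.
    threads = batch.map { |id| Thread.new { process_customer(id) } }
    results.concat(threads.map(&:value))
  end
  results
end

puts process_in_batches((1..10).to_a, batch_size: 4).inspect
```

Only one batch is in flight at a time, so the batch size bounds how many records are processed concurrently.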