Is It Faster to Access Data from Files or a Database Server

Is it faster to access data from files or a database server?

I'll add to the "it depends" crowd.

This is the kind of question that has no generic answer but is heavily dependent on the situation at hand. I even recently moved some data from a SQL database to a flat file system because the overhead of the DB, combined with some DB connection reliability issues, made using flat files a better choice.

Some questions I would ask myself when making the choice include:

  1. How am I consuming the data? For example, will I just be reading rows from beginning to end in the order they were entered? Or will I be searching for rows that match multiple criteria?

  2. How often will I be accessing the data during one program execution? Will I go once to get all books with Salinger as the author or will I go several times to get several different authors? Will I go more than once for several different criteria?

  3. How will I be adding data? Can I just append a row to the end and have that work for my retrieval, or will the data need to be re-sorted?

  4. How logical will the code look in six months? I emphasize this because I think it is too often forgotten when designing things (not just code; this hobby horse actually dates from my days as a Navy mechanic cursing mechanical engineers). In six months, when I have to maintain your code (or you do, after working on another project), which way of storing and retrieving data will make more sense? If going from flat files to a DB yields a 1% efficiency improvement but adds a week of figuring things out when you have to update the code, have you really improved things?

File access speed vs database access speed

If you're doing read-heavy access (looking up filenames, etc) you might benefit from memcached. You could store the "hottest" (most recently created, recently used, depending on your app) data in memory, then only query the DB (and possibly files) when the cache misses. Memory access is far, far faster than database or files.

If you need write-heavy access, a database is the way to go. If you're using MySQL, use InnoDB tables, or another engine that supports row-level locking. That will avoid people blocking while someone else writes (or worse, writing anyway).

But ultimately, it depends on the data.

Database vs File system storage

A database is generally used for storing related, structured data, with well defined data formats, in an efficient manner for insert, update and/or retrieval (depending on application).

On the other hand, a file system is a more unstructured data store for storing arbitrary, probably unrelated data. The file system is more general, and databases are built on top of the general data storage services provided by file systems. [Quora]

The file system is useful if you are looking for a particular file, as operating systems maintain a sort of index. However, the contents of a txt file won't be indexed, which is one of the main advantages of a database.

For very complex operations, the filesystem is likely to be very slow.

Main RDBMS advantages:

  • Tables are related to each other

  • SQL query/data processing language

  • Transaction processing, plus vendor extensions to SQL (such as Transact-SQL)

  • Server-client implementation with server-side objects like stored procedures, functions, triggers, views, etc.

Advantages of the file system over a database management system:

When handling small data sets with arbitrary, probably unrelated data, files are more efficient than a database.
For simple read and write operations, file access is faster and simpler.

You can find any number of other differences listed on the internet.

Which is faster, interacting with a database or using a file system for input/output?

It depends and you probably should consider other factors as well.

If you use a database, there is overhead for transactions, security, index management, etc. on the one hand. On the other hand, you can get caching (which could significantly speed up your application) and better performance for random access if you have a lot of data. In a multithreaded environment I suggest using a database because of its properly implemented locking mechanism.

Flat files are OK for really simple and small data. Do you really need to open and close them so often?

Query from database or from memory? Which is faster?

Those parameters change annually

Yes, do cache them in memory. Especially if they are large or complex.

You should take care to invalidate them at the right time once a year, depending on how accurate that has to be.

Simply caching them for an hour or even for a few minutes might be a good compromise.
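A time-based cache like the one suggested above might look roughly like this. It is only a sketch: `TTLCache`, `load_parameters`, and the parameter values are hypothetical, and the loader stands in for whatever database query fetches the real parameters.

```python
import time


class TTLCache:
    """Keeps one loaded value in memory and reloads it after ttl seconds."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None  # None means nothing is cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value


# Hypothetical loader standing in for the real database query.
load_count = 0


def load_parameters():
    global load_count
    load_count += 1
    return {"rate": 0.05, "version": load_count}


# A fake clock makes the expiry behaviour visible without sleeping.
fake_now = [0.0]
params = TTLCache(load_parameters, ttl_seconds=3600, clock=lambda: fake_now[0])

a = params.get()      # first call loads from the "database"
b = params.get()      # within the hour: served from memory
fake_now[0] = 3601.0
c = params.get()      # after expiry: reloaded
```

The one-hour TTL mirrors the compromise mentioned above: the data may be stale for up to an hour after the yearly change, in exchange for not having to schedule an exact invalidation.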

In PHP, which is faster: reading a file or a database call?

If you're using APC (or similar), your fastest result is probably going to be coding the word list directly into a PHP source file and then just require_once()'ing it.

What is faster, flat files or a MySQL RAM database?

Flat files? Nooooooo...

Use a good DB engine (MySQL, SQLite, etc). Then, for maximum performance, use memcached to cache content.


This way, you get the ease and reliability of sharing data between processes through proven server software that handles concurrency and the like, plus the speed of having your data cached.

Keep in mind a couple things:

  1. MySQL has a query cache (in versions before 8.0, which removed it). If you are issuing the same queries repeatedly, you can gain a lot of performance without adding a caching layer.
  2. MySQL is really fast anyway. Have you load-tested to demonstrate it is not fast enough?

Is reading from MySQL or reading from a file faster?

As long as your tables are properly indexed and your queries actually use those indexes, a relational DB (like MySQL) is going to be much faster, more robust, more flexible (insert many buzzwords here), etc.

To examine why your queries' performance does not match your expectations, you can prefix your SELECT statements with EXPLAIN (http://dev.mysql.com/doc/refman/5.1/en/explain.html).
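The same kind of check can be tried without a MySQL server. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in for MySQL's `EXPLAIN` (the output format differs between the two), with a hypothetical `books` table, to show how the reported plan changes once an index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, author TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO books (author, title) VALUES (?, ?)",
    [("Salinger", "Franny and Zooey"), ("Heller", "Catch-22")],
)

QUERY = "SELECT title FROM books WHERE author = ?"


def plan(connection):
    """Return the human-readable plan lines SQLite reports for QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("Salinger",)).fetchall()
    return [row[-1] for row in rows]  # the last column holds the detail text


plan_without_index = plan(conn)  # expected to describe a full table scan
conn.execute("CREATE INDEX idx_books_author ON books (author)")
plan_with_index = plan(conn)     # expected to mention idx_books_author
```

Printing the two lists side by side shows whether the query planner is using the index you think it is, which is the first thing to check when an indexed table still feels slow.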

Reading from the file system or running a database query

Reading a static file is nearly always faster than running an SQL query that returns the same information.

The reason that developers use a database is to support data that isn't static. That is, data changes in complex ways. Sometimes it's only individual records changing.

In many web sites, if you try to replace the whole static html file every time one thing in it changes, you will find that you can't do that fast enough to keep up with the rate of changes.

But it may be that a given web page view doesn't need all the data in that static file. It only needs a small subset, related to the current user who is viewing the web page. Or related to recent changes. Reading a limited subset of data using SQL is much faster than reading a huge static file.

For example: Suppose your website is about concert tickets. It records every concert, and every attendee, and every ticket sale. Should that go into one huge static file to be included by every PHP request? What if a user only wants to view the information about their ticket purchase for one upcoming concert? It would be wasteful to read the whole file that contains both upcoming concerts as well as past records of hundreds of concerts and millions of ticket sales.
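The concert-ticket scenario can be sketched as follows. Everything here is invented for illustration: the `sales` records, the JSON file layout, and the table and index names. The point is only that the file version has to read and parse every record per request, while the indexed query returns just the matching rows.

```python
import json
import os
import sqlite3
import tempfile

# Hypothetical data: (sale_id, user, concert) records for a ticket site.
sales = [(i, f"user{i % 100}", f"concert{i % 10}") for i in range(1000)]

workdir = tempfile.mkdtemp()

# Option 1: one big static file; every request parses all of it.
json_path = os.path.join(workdir, "sales.json")
with open(json_path, "w") as f:
    json.dump(sales, f)


def sales_from_file(user, concert):
    with open(json_path) as f:
        everything = json.load(f)  # reads and parses all 1000 records
    return [s for s in everything if s[1] == user and s[2] == concert]


# Option 2: a table with an index; the query fetches only matching rows.
db = sqlite3.connect(os.path.join(workdir, "sales.db"))
db.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, user TEXT, concert TEXT)")
db.executemany("INSERT INTO sales VALUES (?, ?, ?)", sales)
db.execute("CREATE INDEX idx_sales_user_concert ON sales (user, concert)")
db.commit()


def sales_from_db(user, concert):
    rows = db.execute(
        "SELECT id, user, concert FROM sales WHERE user = ? AND concert = ?",
        (user, concert),
    ).fetchall()
    return [list(r) for r in rows]


from_file = sales_from_file("user7", "concert7")
from_db = sales_from_db("user7", "concert7")
```

Both functions return the same records. With a thousand rows the difference is negligible, but the file approach scales with the total size of the data, whereas the indexed query scales with the size of the result.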


