Why Use SQL Database

Why use SQL database?

If all you need to do is store some application data somewhere, then a general purpose RDBMS or even SQLite might be overkill. Serializing your objects and writing them to a file might be simpler in some cases. An advantage to SQLite is that if you have a lot of this kind of information, it is all contained in one file. A disadvantage is that it is more difficult to read it. For example, if you serialize you data to YAML, you can read the file with any text editor or shell.

Personally, I would have used some
'compiled db query' bytecode, that
would be assembled once inside a
client application and passed to the
database.

This is how some database APIs work. Check out static SQL and prepared statements.

Is there any reason for me not to
write it myself and to use SQL
database instead?

If you need a lot of features, at some point it will be easier to use an existing RDMBS then to write your own database from scratch. If you don't need many features, a simpler solution may be wiser.

The whole point of database products is to avoid writing the database layer for every new program. Yes, a modern RDMBS might not always be a perfect fit for every project. This is because they were designed to be very general, so in practice, you will always get additional features you don't need. That doesn't mean it is better to have a custom solution. The glove doesn't always need to be a perfect fit.

UPDATE:

But why use SQL language for
interaction with such a database?

Good question.

The answer to that may be found in the original paper describing the relational model A Relational Model of Data for Large Shared Data Banks, by E. F. Codd, published by IBM in 1970. This paper describes the problems with the existing database technologies of the time, and explains why the relational model is superior.

The reason for using the relational model, and thus a logical query language like SQL, is data independence.

Data independence is defined in the paper as:

"... the independence of application programs and terminal activities from the growth in data types and changes in data representations."

Before the relational model, the dominate technology for databases was referred to as the network model. In this model, the programmer had to know the on-disk structure of the data and traverse the tree or graph manually. The relational model allows one to write a query against the conceptual or logical scheme that is independent of the physical representation of the data on disk. This separation of logical scheme from the physical schema is why we use the relational model. For a more on this issue, here are some slides from a database class. In the relational model, we use logic based query languages like SQL to retrieve data.
Codd's paper goes into more detail about the benefits of the relational model. Give it a read.

SQL is a query language that is easy to type into a computer in contrast with the query languages typically used in a research papers. Research papers generally use relation algebra or relational calculus to write queries.

In summary, we use SQL because we happen to use the relational model for our databases.

If you understand the relational model, it is not hard to see why SQL is the way it is. So basically, you need to study the relation model and database internals more in-depth to really understand why we use SQL. It may be a bit of a mystery otherwise.

UPDATE 2:

SQL is an interface between a human
and a database. The question is why do
we have to use it for
application/database interaction? I
still ask for examples of human beings
writing/debugging SQL.

Because the database is a relational database, it only understands relational query languages. Internally it uses a relational algebra like language for specifying queries which it then turns into a query plan. So, we write our query in a form we can understand (SQL), the DB takes our SQL query and turns it into its internal query language. Then it takes the query and tries to find a "query plan" for executing the query. Then it executes the query plan and returns the result.

At some point, we must encode our query in a format that the database understands. The database only knows how to convert SQL to its internal representation, that is why there is always SQL at some point in the chain. It cannot be avoided.

When you use ORM, your just adding a layer on top of the SQL. The SQL is still there, its just hidden. If you have a higher-level layer for translating your request into SQL, then you don't need to write SQL directly which is beneficial in some cases. Some times we do not have such a layer that is capable of doing the kinds of queries we need, so we must use SQL.

Why to use NoSQL DB when we could scale out SQL DB by sharing?

NOSQL databases tend to relax the ACID properties of databases. For instance, instead of "immediate" consistency, they allow "eventual" consistency.

That is, if you update a row in a table, then for some period of time after the transaction is committed other queries might see the old value. Eventually, everyone will see the same value.

Because this is relaxed, NOSQL datbases have more flexibility for optimizations -- and particularly not waiting for all nodes in a cluster to agree to something.

Relational databases have not stood still for the past few decades, so they often offer variations on locking and transactions that come close.

In my experience, this can make a difference in very high volume transaction environments. However for many purposes, the guarantees of transactional integrity on a SQL database are worth the overhead.

Note: NOSQL really refers to a class of "alternative" databases. The "NO" in NOSQL stands for "not only". In reality, I think of these as all being a part of an ecosystem of functionality. NOSQL can also refer to document stores, key-value pair databases, graph databases, GIS databases -- and sophisticated databases of any type often have significant overlapping functionality.

What is a good reason to use SQL views?

Another use that none of the previous answers seem to have mentioned is easier deployment of table structure changes.

Say, you wish to retire a table (T_OLD) containing data for active users, and instead use a new table with similar data (named T_NEW) but one that has data for both active and inactive users, with one extra column active.

If your system(s) have gazillion queries that do SELECT whatever FROM T_OLD WHERE whatever, you have two choices for the roll-out:

1) Cold Turkey - Change the DB, and at the same time, change, test and release numerous pieces of code which contained said query. VERY hard to do (or even coordinate), very risky. Bad.

2) Gradual - change the DB by creating the T_NEW table, dropping the T_OLD table and instead creating a VIEW called T_OLD that mimics the T_OLD table 100% (e.g the view query is SELECT all_fields_except_active FROM T_NEW WHERE active=1).

That would allow you to avoid releasing ANY code that currently selects from T_OLD, and do the changes to migrate code from T_OLD to T_NEW at leisure.

This is a simple example, there are others a lot more involved.

P.S. On the other hand, you probably should have had a stored procedure API instead of direct queries from T_OLD, but that's not always the case.

Good reasons NOT to use a relational database?

Plain text files in a filesystem

  • Very simple to create and edit
  • Easy for users to manipulate with simple tools (i.e. text editors, grep etc)
  • Efficient storage of binary documents

XML or JSON files on disk

  • As above, but with a bit more ability to validate the structure.

Spreadsheet / CSV file

  • Very easy model for business users to understand

Subversion (or similar disk based version control system)

  • Very good support for versioning of data

Berkeley DB (Basically, a disk based hashtable)

  • Very simple conceptually (just un-typed key/value)
  • Quite fast
  • No administration overhead
  • Supports transactions I believe

Amazon's Simple DB

  • Much like Berkeley DB I believe, but hosted

Google's App Engine Datastore

  • Hosted and highly scalable
  • Per document key-value storage (i.e. flexible data model)

CouchDB

  • Document focus
  • Simple storage of semi-structured / document based data

Native language collections (stored in memory or serialised on disk)

  • Very tight language integration

Custom (hand-written) storage engine

  • Potentially very high performance in required uses cases

I can't claim to know anything much about them, but you might also like to look into object database systems.

Database vs File system storage

A database is generally used for storing related, structured data, with well defined data formats, in an efficient manner for insert, update and/or retrieval (depending on application).

On the other hand, a file system is a more unstructured data store for storing arbitrary, probably unrelated data. The file system is more general, and databases are built on top of the general data storage services provided by file systems. [Quora]

The file system is useful if you are looking for a particular file, as operating systems maintain a sort of index. However, the contents of a txt file won't be indexed, which is one of the main advantages of a database.

For very complex operations, the filesystem is likely to be very slow.

Main RDBMS advantages:

  • Tables are related to each other

  • SQL query/data processing language

  • Transaction processing addition to SQL (Transact-SQL)

  • Server-client implementation with server-side objects like stored procedures, functions, triggers, views, etc.

Advantage of the File System over Data base Management System is:

When handling small data sets with arbitrary, probably unrelated data, file is more efficient than database.
For simple operations, read, write, file operations are faster and simple.

You can find n number of difference over internet.

Can you use SQL to make a database for a HTML website?

Short answer: No, not only with SQL.

SQL is a language used to perform queries in a database (inserting data, deleting, searching, etc.). To use it to display data on your website you would also need to learn some other programming language, so you can write code that will serve as an interface between your website and the database. Two most popular choices are Python (with Flask or Django) or NodeJS, I recommend using Python, since it's known to be somewhat beginner-friendly. I suggest finding a tutorial online to get you started.

Happy coding!



Related Topics



Leave a reply



Submit