What Exactly Is a BLOB in a DBMS Context

What exactly is a BLOB in a DBMS context?

BLOB:

BLOB (Binary Large Object) is a data type for large objects in a database system. A BLOB column can store a large chunk of data, such as documents or media files like audio and video. BLOB fields typically consume storage only for the content actually stored in them, and depending on the DBMS a single value can be as large as several gigabytes.

USAGE OF BLOB:

You can write a binary large object (BLOB) to a database as either binary or character data, depending on the type of field at your data source. To write a BLOB value to your database, issue the appropriate INSERT or UPDATE statement and pass the BLOB value as an input parameter. If your BLOB is stored as text, such as a SQL Server text field, you can pass the BLOB as a string parameter. If the BLOB is stored in binary format, such as a SQL Server image field, you can pass an array of type byte as a binary parameter.
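The parameter-passing approach described above can be sketched as follows. This is a minimal illustration that assumes SQLite through Python's built-in sqlite3 module rather than SQL Server; the table name `attachments` and its columns are hypothetical and chosen only for this example.

```python
import sqlite3

# Hypothetical table and column names, used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attachments (id INTEGER PRIMARY KEY, name TEXT, content BLOB)"
)

payload = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])  # arbitrary binary data

# INSERT: the BLOB value is passed as a bound input parameter (a bytes object).
conn.execute(
    "INSERT INTO attachments (name, content) VALUES (?, ?)",
    ("logo.png", payload),
)

# UPDATE: the new BLOB value is bound in exactly the same way.
updated = payload + b"\x00"
conn.execute(
    "UPDATE attachments SET content = ? WHERE name = ?",
    (updated, "logo.png"),
)
conn.commit()

# Reading the column back yields the same bytes that were written.
stored = conn.execute(
    "SELECT content FROM attachments WHERE name = ?", ("logo.png",)
).fetchone()[0]
```

Binding the value as a parameter, instead of building the SQL string by hand, avoids any need to escape or encode the binary data yourself.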

A useful link: Storing Documents as Blobs in a Database - Any disadvantages?

Explanation of a BLOB and a CLOB

BLOBs (Binary Large OBjects) store binary files: pictures, audio files, Word documents, etc. In short, anything you can't read with the human eye. SQL*Plus cannot usefully display them in a SELECT result.

CLOBs (Character Large OBjects) store character data. They are often used to store XML documents, JSON documents, or just large blocks of formatted or unformatted text.

Why does git store file contents as a blob?

“Blob” just means a sequence of bytes. A blob in Git will contain the same exact data as a file, it’s just that a blob is stored in the Git object database, and a file is stored on the filesystem.

So there is no difference in the format, the only difference is how they are stored.

For example, if you add an image hello.jpg to your repository, and then commit it, you will have two copies of the same data:

  • You will have a file on disk, named hello.jpg, which contains the JPEG data,

  • You will have a blob in your Git object database, named with the hash of its contents, which contains the exact same JPEG data in the same format.

The database can use some fancy tricks to store data efficiently, including compression and using deltas, but in the end it is still storing the exact same data that was in the original file.

A text file is no different. “Text” is just a particular type of data that you can store in a binary file.
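The naming-by-hash and compression described above can be sketched in a few lines. This assumes Git's default SHA-1 object format and the loose-object layout (a `blob <size>` header, a NUL byte, then the raw content, all zlib-compressed); the function names are made up for this illustration and are not part of Git.

```python
import hashlib
import zlib


def git_blob_header(content: bytes) -> bytes:
    """Build the header Git prepends to blob content: b'blob <size>\\0'."""
    return b"blob " + str(len(content)).encode("ascii") + b"\0"


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID of a blob holding this content."""
    return hashlib.sha1(git_blob_header(content) + content).hexdigest()


def git_loose_object(content: bytes) -> bytes:
    """Return the zlib-compressed bytes of a loose blob object."""
    return zlib.compress(git_blob_header(content) + content)


data = b"hello\n"
blob_id = git_blob_id(data)      # same value `git hash-object` reports for this file
stored = git_loose_object(data)  # what would be written under .git/objects/

# Decompressing and dropping the header gives back the original file bytes.
recovered = zlib.decompress(stored).split(b"\0", 1)[1]
```

The round trip shows the point made above: the object database may compress the data, but what comes back out is byte-for-byte the original file.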

Storing Documents as Blobs in a Database - Any disadvantages?

As your DB grows bigger and bigger, it becomes harder to back up.
Restoring a backup of a table with over 100 GB of data is not something that makes you happy.

Another thing you will notice is that all the table management functions get slower and slower as the dataset grows.

But this can be overcome by making your data table contain just 2 fields:
ID and BLOB.

Retrieving data (by primary key) will likely only become a problem long after you hit a wall with backing up the dataset.
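A minimal sketch of that two-field layout, assuming SQLite via Python's sqlite3 module. The separate `documents` metadata table is an assumption added for the illustration (the answer only specifies the ID/BLOB table), and all table, column, and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Small, frequently queried metadata lives in its own table (assumed).
    CREATE TABLE documents (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    -- The bulky table holds nothing but the key and the BLOB.
    CREATE TABLE document_blobs (
        id      INTEGER PRIMARY KEY REFERENCES documents(id),
        content BLOB NOT NULL
    );
    """
)


def add_document(title: str, content: bytes) -> int:
    """Insert the metadata row and the BLOB row, returning the shared ID."""
    cur = conn.execute("INSERT INTO documents (title) VALUES (?)", (title,))
    doc_id = cur.lastrowid
    conn.execute(
        "INSERT INTO document_blobs (id, content) VALUES (?, ?)",
        (doc_id, content),
    )
    conn.commit()
    return doc_id


def load_content(doc_id: int):
    """Fetch the BLOB by primary key; returns None if the ID is unknown."""
    row = conn.execute(
        "SELECT content FROM document_blobs WHERE id = ?", (doc_id,)
    ).fetchone()
    return row[0] if row else None


report_id = add_document("Quarterly report", b"%PDF-1.7 placeholder bytes")
```

With this split, listing or searching documents only touches the small table, and the large table is read by primary key alone.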

Where does the term blob come from in the context of git?

The git man page seems to be surprisingly bereft of an official definition, other than this (emphasis mine):

The object database contains objects of three main types: blobs, which hold file data; trees, which point to blobs and other trees to build up directory hierarchies; and commits, which each reference a single tree and some number of parent commits.

The repeated use of the term "object database" across git documentation suggests a borrowing of "blob" specifically from DBMSs.

In its article on Binary large objects Wikipedia defines the term as "a collection of binary data stored as a single entity in a database management system", further offering the following:

Blobs were originally just amorphous chunks of data invented by Jim Starkey at DEC, who describes them as "the thing that ate Cincinnati, Cleveland, or whatever" from "the 1958 Steve McQueen movie", referring to The Blob. Later, Terry McKiever, a marketing person for Apollo, felt that it needed to be an acronym and invented the backronym Basic Large Object. Then Informix invented an alternative backronym, Binary Large Object.

So, though it's not a definitive answer, the term "blob" has a conventional and well-defined usage across computer science as an opaque string of binary data, and git adheres to that definition without further specifying it.

Using BLOB to save a path and retrieve it

It sounds like you're slightly confused about database blobs, images, and paths. Let's step through this.

When your users upload images of significant size, you'll want to store them in the filesystem, and record the path in the database. The path is just a string - you can use a plain old VARCHAR type of column for it.

If you wanted to store the image itself in the database, you'd use a BLOB type of column. A blob is a "binary large object" - it's how you'd store data that's not some flavor of string or number. If your users' profile pictures are small (e.g. Twitter-sized avatars), then you might consider storing them in the database as BLOB data.

However, your main question of "okay, how do I implement this?" can't be answered without knowing what programming language and database you're using. You might want to provide more information so that you can get a better answer to that question.

EDIT: Responding to your clarification: yes, you should definitely store paths in the database and images in the filesystem. That's the standard way to do it. One of the major reasons for this is that you can operate on the images separately from operating on the database. It's much, much, much easier to back up a folder full of pictures than it is to back up a large database! You can also, for example, automatically create thumbnail versions of the images, or compare them to one another, and so on, using other applications that won't have to touch your database.
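As a rough sketch of the "path in the database, image in the filesystem" approach, here is one way it could look in Python with SQLite. The language and database are assumptions, since the question did not name them, and the upload folder, table, and function names are all hypothetical.

```python
import sqlite3
import tempfile
import uuid
from pathlib import Path

# Stand-in for a real upload folder; a temporary directory keeps the sketch runnable.
UPLOAD_DIR = Path(tempfile.mkdtemp(prefix="uploads_"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images ("
    " id INTEGER PRIMARY KEY,"
    " original_name VARCHAR(255),"
    " path VARCHAR(1024))"
)


def save_image(original_name: str, data: bytes) -> int:
    """Write the image to disk and record only its path in the database."""
    suffix = Path(original_name).suffix
    # A generated file name avoids collisions and unsafe user-supplied names.
    target = UPLOAD_DIR / f"{uuid.uuid4().hex}{suffix}"
    target.write_bytes(data)
    cur = conn.execute(
        "INSERT INTO images (original_name, path) VALUES (?, ?)",
        (original_name, str(target)),
    )
    conn.commit()
    return cur.lastrowid


def load_image(image_id: int) -> bytes:
    """Look up the stored path and read the image back from the filesystem."""
    row = conn.execute(
        "SELECT path FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        raise KeyError(image_id)
    return Path(row[0]).read_bytes()


avatar_id = save_image("avatar.jpg", b"\xff\xd8\xff\xe0 fake JPEG bytes")
```

Because the database holds only a short string per image, the folder of pictures can be backed up, thumbnailed, or processed by other tools without touching the database.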

Here's a tutorial from Tizag about how to upload files with PHP, and here's a tutorial from W3Schools about it. On Stack Overflow, you might find the questions "PHP Image Upload" and "Image Upload Script in PHP" helpful.

dll MATLAB, dll DBMS, blob?

I am sure you can transfer data, at least from Oracle, to any program outside the database using a BLOB. That program has to be able to interpret the BLOB, and most likely it has to be able to build some memory structure from it. Most memory structures rely on pointers to memory locations, and these are hard to store in a database because memory locations vary while the database storage stays the same. To me this sounds like going back to the '70s and '80s.

Apart from that, you most probably want to do some analysis on that data. To transfer the data to your client, the database has to dig it all up. That costs time, and so does the transfer to the client, especially when large volumes are involved. By the time the client has the data and can start the analysis, the database (at least Oracle) could already have completed that same analysis. The disadvantage of that solution is that it is database-dependent; the advantage is that it is the best-performing solution.

What is the difference between i, d, s, b in PDOStatement::bindParam

"Double" is a float with double precision. "blob" is a "Binary Large OBject" (for example a file).

What are the ways to insert & retrieve BLOB data from Oracle database using SQL?

First of all, you should expect storing BLOBs in a database to be (sometimes a bit, often significantly) slower, and definitely not faster, than storing them in a file system. The reasons to store them in a DB do not center on performance, but on things such as:

  • Unavailability of a (shared) file system in a clustered or load-balanced scenario
  • Ease of backup: a single process, as opposed to 2 processes when files and DB are used
  • Transaction safety: A BLOB is either there and complete or not, but not in a half-baked stage
  • others I can't think of right now.

The general rule of thumb is that if none of these concern you, you should store your files as ... files. Storing the metadata and pathname in a DB is IMHO good and common practice.
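The transaction-safety point from the list above can be illustrated with a small sketch. It uses SQLite through Python's sqlite3 module instead of Oracle, purely so that it runs without a server; the `files` table and the `store_all` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, content BLOB NOT NULL)")


def store_all(files) -> bool:
    """Store every (name, content) pair in one transaction, or none of them."""
    try:
        # The connection context manager commits on success and rolls back
        # if an exception escapes the block.
        with conn:
            for name, content in files:
                conn.execute(
                    "INSERT INTO files (name, content) VALUES (?, ?)",
                    (name, content),
                )
        return True
    except sqlite3.IntegrityError:
        return False


first = store_all([("a.bin", b"\x01"), ("b.bin", b"\x02")])

# The second batch fails on the duplicate key "a.bin", so "c.bin" is rolled
# back as well and never becomes visible.
second = store_all([("c.bin", b"\x03"), ("a.bin", b"\x04")])

names = [row[0] for row in conn.execute("SELECT name FROM files ORDER BY name")]
```

A plain file copy offers no such guarantee: a crash halfway through can leave a truncated file behind, whereas here the BLOB is either fully stored or absent.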

Concerning Oracle tuning: There are books written about that. I suspect they total well over a ton in dead-tree paperback format. You might first of all look at the Oracle process's memory consumption. Rule of thumb: if it is less than a gig and you use BLOBs, you are in trouble. Read up on the different memory pools and how to increase them. Some limits might apply for the Express Edition.


