MySQL - Creating Rows VS. Columns Performance

I think the advantage to storing as more rows (i.e. normalized) depends on design and maintenance considerations in the face of change.

Also consider whether the 140 columns have the same meaning or whether the meaning differs per experiment. Either way, model the data properly according to normalization rules, i.e. by asking how each piece of data relates to a candidate key.

As far as performance goes, if all the columns are used it makes very little difference. A pivot/unpivot operation can be expensive over a large amount of data, but it makes little difference with a single-key access pattern. Sometimes a pivot in the database can make your frontend code a lot simpler and your backend code more flexible in the face of change.
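To illustrate what a pivot over a normalized (row-per-attribute) table looks like, here is a minimal sketch. It uses SQLite through Python's `sqlite3` module as a stand-in for MySQL, and the `measurement` table and its columns are made up for the example. The conditional-aggregation pattern (`MAX(CASE WHEN ...)`) is plain SQL and should carry over to MySQL.

```python
import sqlite3

# SQLite stands in for MySQL here; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurement ("
    " experiment_id INTEGER NOT NULL,"
    " attr TEXT NOT NULL,"
    " value REAL,"
    " PRIMARY KEY (experiment_id, attr))"
)
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)",
    [(1, "temp", 20.5), (1, "ph", 7.1), (2, "temp", 22.0)],
)

# Pivot: one output row per experiment, one output column per attribute.
# An attribute with no stored row simply comes back as NULL (None).
pivot_sql = """
    SELECT experiment_id,
           MAX(CASE WHEN attr = 'temp' THEN value END) AS temp,
           MAX(CASE WHEN attr = 'ph' THEN value END) AS ph
    FROM measurement
    GROUP BY experiment_id
    ORDER BY experiment_id
"""
rows = conn.execute(pivot_sql).fetchall()
print(rows)
```

With 140 attributes the `SELECT` list would need 140 such expressions, which is where the cost and the verbosity come from on large scans.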

If you have a lot of NULLs, it might be possible to eliminate rows in a normalized design and this would save space. I don't know if MySQL has support for a sparse table concept, which could come into play there.
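As a rough sketch of how NULLs disappear in a normalized design, the helper below unpivots one wide row into (entity, attribute, value) rows and skips the empty columns. The function name, column names and values are hypothetical.

```python
def unpivot(entity_id, wide_row):
    """Turn one wide row (column -> value) into (entity_id, attr, value) rows.

    Columns whose value is None (SQL NULL) produce no row at all.
    """
    return [
        (entity_id, attr, value)
        for attr, value in wide_row.items()
        if value is not None
    ]


# Hypothetical wide row in which half the columns are NULL.
wide_row = {"temp": 20.5, "ph": None, "pressure": None, "salinity": 3.2}
long_rows = unpivot(7, wide_row)
print(long_rows)
```

Four columns become two rows here. Whether that is a net saving depends on how much key data each normalized row has to repeat.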

In MySQL, is it better to have many rows or many columns?

Obviously the second one. Each of those attributes should go in a separate column, otherwise there is no point in storing them in an RDBMS; you might as well store them in a text file.

VARCHAR vs TEXT performance when data fits on row

With respect to storage, InnoDB will handle VARCHAR and TEXT much the
same when both are stored inline. However, when fetching the data from
InnoDB, the server will allocate space for all VARCHAR columns before
query execution, while space for TEXT columns will only be allocated
if they are actually read, and that dynamic memory allocation takes time.

https://forums.mysql.com/read.php?24,645115,645164#msg-645164

Better structure of MySQL table for MySQL performance

MySQL's InnoDB storage engine (the default) stores rows in pages of fixed size (default is 16KB per page). Some number of rows fit on a single page, depending on the row size. I.e. if rows are smaller, more rows fit per page.

Pages are the increment of data loaded from storage to RAM. So if your query references one row on that page, the whole page is loaded into RAM, and then all rows on the same page are faster to access.

A single row will not be split. It will be stored in the same page (except for very long varchar or text/blob columns, those can expand to other pages).
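For a feel of the numbers, here is a back-of-the-envelope estimate of how many rows fit on one page. It is only an upper bound under a simplifying assumption: it ignores page headers, per-row overhead and the free space InnoDB leaves in pages. The row sizes are made up.

```python
PAGE_SIZE = 16 * 1024  # InnoDB default page size, in bytes


def rows_per_page(row_bytes, page_size=PAGE_SIZE):
    """Upper-bound estimate of rows per page; real pages hold fewer."""
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    return page_size // row_bytes


narrow = rows_per_page(100)   # hypothetical 100-byte row
wide = rows_per_page(1000)    # hypothetical 1000-byte row
print(narrow, wide)
```

So a query that touches many narrow rows stored next to each other needs far fewer page reads than one that touches the same number of wide rows.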

Assuming rows for the same entity_id are probably grouped together, the difference in storage and performance between your two table designs is really very close. It's true that in the first design, you have extra rows so there will be extra instances of id and entity_id. But those are only one bigint and one int, so not much overhead. The other columns will use identical storage.
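The overhead can be estimated with simple arithmetic. The sketch below assumes the row-per-reference design repeats a BIGINT `id` (8 bytes) and an INT `entity_id` (4 bytes) on every row, and that the wide design stores them once per entity. Index and row-header overhead are left out, and the count of 4 references per entity is an assumption.

```python
BIGINT_BYTES = 8  # MySQL BIGINT storage size
INT_BYTES = 4     # MySQL INT storage size


def extra_key_bytes(reference_count):
    """Extra bytes per entity spent on repeated id + entity_id values
    in a row-per-reference design, compared with one wide row."""
    per_row = BIGINT_BYTES + INT_BYTES
    return (reference_count - 1) * per_row


extra = extra_key_bytes(4)  # assumed: 4 references per entity
print(extra)
```

That is a few dozen bytes per entity, which is small next to the reference data itself.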

Other considerations:

Do you ever expect to extend the reference types to 5 or more? The second design would require you to add a column with ALTER TABLE.

Do you need a varchar for the reference type? Could you encode it as a tinyint or an ENUM? That would save space.

On the other hand, using your second design saves even more space because the reference types are only part of the metadata. Therefore they take space only once, not on every row.
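To put a rough number on the varchar versus tinyint question, the sketch below compares the bytes needed per row. It assumes a short VARCHAR costs one length byte plus the encoded string, and a TINYINT costs one byte. The reference type names and their codes are hypothetical.

```python
# Hypothetical reference type names mapped to small integer codes.
REFERENCE_TYPE_CODES = {"invoice": 1, "order": 2, "ticket": 3, "contract": 4}

TINYINT_BYTES = 1  # MySQL TINYINT storage size


def varchar_bytes(value):
    """Approximate storage for a short VARCHAR value:
    one length byte plus the UTF-8 encoded string."""
    return 1 + len(value.encode("utf-8"))


# Bytes saved per row by storing the code instead of the name.
saved = {
    name: varchar_bytes(name) - TINYINT_BYTES
    for name in REFERENCE_TYPE_CODES
}
print(saved)
```

The saving per row is small, but in the row-per-reference design it is paid on every row, so it adds up on a large table.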


