Designing a SQL schema for a combination of many-to-many relationship (variations of products)
Applying normalization to your problem, the solution is as given. Run and see it on SQL Fiddle.
CREATE TABLE products (
product_id int AUTO_INCREMENT PRIMARY KEY,
name varchar(20),
description varchar(30)
);
INSERT INTO products
(name, description)
VALUES
('Rug', 'A cool rug' ),
('Cup', 'A coffee cup');
-- ========================================
CREATE TABLE variants (
variant_id int AUTO_INCREMENT PRIMARY KEY,
variant varchar(50)
);
INSERT INTO variants
(variant)
VALUES
('color'),
('material'),
('size');
-- ========================================
CREATE TABLE variant_value (
value_id int AUTO_INCREMENT PRIMARY KEY,
variant_id int,
value varchar(50)
);
INSERT INTO variant_value
(variant_id, value)
VALUES
(1, 'red'),
(1, 'blue'),
(1, 'green'),
(2, 'wool'),
(2, 'polyester'),
(3, 'small'),
(3, 'medium'),
(3, 'large');
-- ========================================
CREATE TABLE product_variants (
product_variants_id int AUTO_INCREMENT PRIMARY KEY,
product_id int,
productvariantname varchar(50),
sku varchar(50),
price float
);
INSERT INTO product_variants
(product_id, productvariantname, sku, price)
VALUES
(1, 'red-wool', 'a121', 50),
(1, 'red-polyester', 'a122', 50);
-- ========================================
CREATE TABLE product_details (
product_detail_id int AUTO_INCREMENT PRIMARY KEY,
product_variants_id int,
value_id int
);
INSERT INTO product_details
(product_variants_id, value_id)
VALUES
(1, 1),
(1, 4),
(2, 1),
(2, 5);
Database schema design for products depending on the variants and prices
Start with entities and relationships. Each entity will get its own table when you're done. Be hardnosed about understanding your real-world entities and your data-world representation of them.
- Your
product
is definitely an entity. - A product
variant
is also an entity. - A
product
has one or morevariant
s. That's a one-to-many relationship. (If the product is something without sizes or colors, thevariant
is generic. - Each
variant
has stock-keeping attributes like sku_number and number_in_stock.
You can't use the product
to keep stock. Consider this: your product is a T-shirt. But you don't run out of T-shirts, you run out of small red T-shirts and you have to order more of those from your supplier.
Two entities so far:
product
:1 to N:variant
.
Let's consider the variant to have a collection of product attributes. Let's make attribute
an entity. Each attribute has a type: 'size', 'color', 'accent_color' for example. So your attribute
table contains these columns.
attribute_id PK
attribute_type VARCHAR(63) 'size', 'color', ...
attribute_value VARCHAR(63) '30', 'red', ...
You have a many-to-many join table variant_attribute
. The presence of a row in this table means the variant has the attribute.
variant_id PK
attribute_id PK
So, we're up to three entities.
product
:1 to N:variant
:0-N to 0-N:attribute
.
That's done with four tables, product
, variant
, variant_attribute
, attribute
(If you want to treat color and size completely separately, you can do that too. But the scheme I propose is field-extensible: if somebody tells you to sell 'cold', 'warm', and 'hot' variants of something next year you won't have to reinvent the wheel.)
These are the important entities and relationships: they describe the stuff your customers order, and the stuff in your warehouse you have to sell. They need to be simple enough to troubleshoot in production at midnight, and expressive enough to represent your real-world entities. Get that part of the schema right.
Only then add on the categories and sections. I cannot tell from your last diagram what you want there. You probably should revisit your entities for that part of your app.
If you want to everything for sale that's red you can do this:
SELECT p.product_id,
p.product_name_ar,
s.attribute_value size,
c.attribute_value color,
v.sku_number,
v.price
FROM product p -- one to many with ...
JOIN variant v ON p.product_id = v.product_id -- many to many with ...
JOIN variant_attribute cva ON v.variant_id = cva.variant_id
JOIN attribute c ON cva.attribute_id = c.attribute_id -- ... color
AND c.attribute_type = 'color'
JOIN variant_attribute sva ON v.variant_id = sva.variant_id -- ... size
JOIN attribute s ON sva.attribute_id = s.attribute_id
AND s.attribute_type = 'size'
WHERE c.attribute_value = 'red'; -- we want the red ones.
Database design for site with many product categories
This is a known (anti) pattern called "Entity Attribute Value" (you can search for that name in the internet if you want to find out more).
Nowadays (and especially with Postgres) I would go for a JSONB
column that stores the category specific attributes of each product rather than an additional fields
table.
You can even go so far to validate the dynamic attributes in the product
table based on the meta-information in the category
table.
So something like this:
create table category
(
id integer primary key,
name varchar(50) not null,
allowed_attributes jsonb not null
);
create table product
(
id integer primary key,
name varchar(100) not null,
brand varchar(100) not null, -- that should probably be a foreign key
... other common columns ...
);
create table product_category
(
product_id integer not null references product,
category_id integer not null references category,
attributes jsonb not null, -- category specific attributes
primary key (product_id, category_id)
);
Now with the list of "allowed attributes" in the category table we can write a trigger that validates them.
First I create a little helper function that makes sure that all keys from one JSON value are present in another:
create function validate_attributes(p_allowed jsonb, p_to_check jsonb)
returns boolean
as
$$
select p_allowed ?& (select array_agg(k) from jsonb_object_keys(p_to_check) as t(k));
$$
language sql;
This function is then used in the trigger for the category table:
create function validate_category_trg()
returns trigger
as
$$
declare
l_allowed jsonb;
l_valid boolean;
begin
select allowed_attributes
into l_allowed
from category
where id = new.category_id;
l_valid := validate_attributes(l_allowed, new.attributes);
if l_valid = false then
raise 'some attributes are not allowed for that category';
end if;
return new;
end;
$$
language plpgsql;
Now let's insert some sample data:
insert into category (id, name, allowed_attributes)
values
(1, 'TV Set', '{"display_size": "number", "color": "string"}'::jsonb),
(2, 'Laptop', '{"ram_gb": "number", "display_size": "number"}');
insert into product (id, name)
values
(1, 'Big TV'),
(2, 'Small TV'),
(3, 'High-End Laptop');
And now let's insert the category information:
insert into product_category (product_id, category_id, attributes)
values
(1, 1, '{"display_size": 60}'), -- Big TV
(2, 1, '{"display_size": 32}'), -- Small TV
(3, 2, '{"ram_gb": 128}'); -- Laptop
This works as all attributes are defined in the category. If we tried to insert the following:
insert into product_category (product_id, category_id, attributes)
values
(3, 2, '{"usb_ports": 5}');
Then the trigger will throw an exception preventing use from inserting the row.
This can be extended to actually use the data type information stored in the allowed_attributes
.
To find products based on attributes, we can use the JSON functions provided by Postgres, e.g. all products that have a display_size:
select p.*
from product p
where exists (select *
from product_category pc
where pc.product_id = p.id
and pc.attributes ? 'display_size');
Finding products that contain multiple attributes is just as easy (and a lot more complicated with the "traditional" EAV model).
The following query finds only products that have the attributes display_size
and ram_gb
select p.*
from product p
where exists (select *
from product_category pc
where pc.product_id = p.id
and pc.attributes ?& '{display_size, ram_gb}');
This can be indexed quite efficiently to make searching faster.
I am not entirely sure you do want to store the attributes in the product_category
table. Maybe they should be stored directly in the product
table - but that depends on your requirements and how you want to manage them.
With the above approach you could e.g. have a category "Computer HW" that would store information like number of CPUs, RAM and clock speed. That category (and its attributes) could be used e.g. Smartphones and Laptops at the same time.
However you would need more than one row in product_category
to fully describe a product if you do that.
The most common approach is probably to store the attributes directly on the product and skip all the dynamic JSONB validation.
So something like this:
create table category
(
id integer primary key,
name varchar(50) not null
);
create table product
(
id integer primary key,
name varchar(100) not null,
brand varchar(100) not null, -- that should probably be a foreign key
attributes jsonb not null,
... other common columns ...
);
create table product_category
(
product_id integer not null references product,
category_id integer not null references category,
primary key (product_id, category_id)
);
Or even a combination of both if you need category specific dynamic attributes and product specific attributes regardless of the category.
Best way to create a table with each possible combination of product options and variations
I have found a way.It's called Cartesian Product.
Here is how it's done.
This is a function that i found online,from this repo https://github.com/schwarmco/go-cartesian-product
func Iter(params ...[]interface{}) chan []interface{} {
// create channel
c := make(chan []interface{})
// create waitgroup
var wg sync.WaitGroup
// call iterator
wg.Add(1)
iterate(&wg, c, []interface{}{}, params...)
// call channel-closing go-func
go func() { wg.Wait(); close(c) }()
// return channel
return c
}
// private, recursive Iteration-Function
func iterate(wg *sync.WaitGroup, channel chan []interface{}, result []interface{}, params ...[]interface{}) {
// dec WaitGroup when finished
defer wg.Done()
// no more params left?
if len(params) == 0 {
// send result to channel
channel <- result
return
}
// shift first param
p, params := params[0], params[1:]
// iterate over it
for i := 0; i < len(p); i++ {
// inc WaitGroup
wg.Add(1)
// create copy of result
resultCopy := append([]interface{}{}, result...)
// call self with remaining params
go iterate(wg, channel, append(resultCopy, p[i]), params...)
}
}
Here's how you can use it
a := []interface{}{"Small", "Medium", "Large"}
b := []interface{}{"White", "Black", "Yellow"}
var d [][]interface{}
d = append(d,a)
d = append(d,b)
c := Iter(d...)
for product := range c {
fmt.Println(product)
}
I then store each product as a JSON string to the database
Related Topics
How to Copy a Record in a SQL Table But Swap Out the Unique Id of the New Row
How to Select Only the First Rows for Each Unique Value of a Column
Sum of Grouped Count in SQL Query
Rails Find Record with Zero Has_Many Records Associated
Doesn't Linq to SQL Miss the Point? Aren't Orm-Mappers (Subsonic, etc.) Sub-Optimal Solutions
The Object 'Df_*' Is Dependent on Column '*' - Changing Int to Double
SQL for Ordering by Number - 1,2,3,4 etc Instead of 1,10,11,12
SQL Update Fields of One Table from Fields of Another One
Find Duplicate Rows with Postgresql
Update Query Using Subquery in SQL Server
Oracle 'Partition By' and 'Row_Number' Keyword
Using SQL Server as a Db Queue with Multiple Clients
What's the Purpose of SQL Keyword "As"
Postgresql: How to Convert from Unix Epoch to Date