MySQL PHP Select Count of Distinct Values from Comma Separated Data (Tags)

MySql PHP select count of distinct values from comma separated data (tags)

Solution

I don't really know how to transform an horizontal list of comma-separated values to a list of rows without creating a table containing numbers, as many numbers as you may have comma-separated values. If you can create this table, here is my answer:

SELECT 
SUBSTRING_INDEX(SUBSTRING_INDEX(all_tags, ',', num), ',', -1) AS one_tag,
COUNT(*) AS cnt
FROM (
SELECT
GROUP_CONCAT(tags separator ',') AS all_tags,
LENGTH(GROUP_CONCAT(tags SEPARATOR ',')) - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags
FROM test
) t
JOIN numbers n
ON n.num <= t.count_tags
GROUP BY one_tag
ORDER BY cnt DESC;

Returns:

+---------------------+-----+
| one_tag | cnt |
+---------------------+-----+
| chicken | 5 |
| pork | 4 |
| spaghetti | 3 |
| fried-rice | 2 |
| manchurain | 2 |
| pho | 1 |
| chicken-calzone | 1 |
| fettuccine | 1 |
| chorizo | 1 |
| meat-balls | 1 |
| miso-soup | 1 |
| chanko-nabe | 1 |
| chicken-manchurian | 1 |
| pork-manchurian | 1 |
| sweet-and-sour-pork | 1 |
| peking-duck | 1 |
| duck | 1 |
+---------------------+-----+
17 rows in set (0.01 sec)

See sqlfiddle


Explaination

Scenario

  1. We concatenate all tags using a comma to create only one list of tags instead of one per row
  2. We count how many tags we have in our list
  3. We find how we can get one value in this list
  4. We find how we can get all values as distinct rows
  5. We count tags grouped by their value

Context

Let's build your schema:

CREATE TABLE test (
id INT PRIMARY KEY,
tags VARCHAR(255)
);

INSERT INTO test VALUES
("1", "pho,pork"),
("2", "fried-rice,chicken"),
("3", "fried-rice,pork"),
("4", "chicken-calzone,chicken"),
("5", "fettuccine,chicken"),
("6", "spaghetti,chicken"),
("7", "spaghetti,chorizo"),
("8", "spaghetti,meat-balls"),
("9", "miso-soup"),
("10", "chanko-nabe"),
("11", "chicken-manchurian,chicken,manchurain"),
("12", "pork-manchurian,pork,manchurain"),
("13", "sweet-and-sour-pork,pork"),
("14", "peking-duck,duck");

Concatenate all list of tags

We will work with all tags in a single line, so we use GROUP_CONCAT to do the job:

SELECT GROUP_CONCAT(tags SEPARATOR ',') FROM test;

Returns all tags separated by a comma:

pho,pork,fried-rice,chicken,fried-rice,pork,chicken-calzone,chicken,fettuccine,chicken,spaghetti,chicken,spaghetti,chorizo,spaghetti,meat-balls,miso-soup,chanko-nabe,chicken-manchurian,chicken,manchurain,pork-manchurian,pork,manchurain,sweet-and-sour-pork,pork,peking-duck,duck

Count all tags

To count all tags, we get the length of the full list of tags, and we remove the length of the full list of tags after replacing the , by nothing. We add 1, as the separator is between two values.

SELECT LENGTH(GROUP_CONCAT(tags SEPARATOR ',')) - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags
FROM test;

Returns:

+------------+
| count_tags |
+------------+
| 28 |
+------------+
1 row in set (0.00 sec)

Get the Nth tag in the tag list

We use the SUBSTRING_INDEX function to get

-- returns the string until the 2nd delimiter\'s occurrence from left to right: a,b
SELECT SUBSTRING_INDEX('a,b,c', ',', 2);

-- return the string until the 1st delimiter, from right to left: c
SELECT SUBSTRING_INDEX('a,b,c', ',', -1);

-- we need both to get: b (with 2 being the tag number)
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('a,b,c', ',', 2), ',', -1);

With such logic, to get the 3rd tag in our list, we use:

SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(tags SEPARATOR ','), ',', 3), ',', -1)
FROM test;

Returns:

+-------------------------------------------------------------------------------------+
| SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(tags SEPARATOR ','), ',', 3), ',', -1) |
+-------------------------------------------------------------------------------------+
| fried-rice |
+-------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Get all values as distinct rows

My idea is a little tricky:

  1. I know we can create rows by joining tables
  2. I need to get the Nth tag in the list using the request above

So we will create a table containing all numbers from 1 to the maximum number of tags you may have in your list. If you can have 1M values, create 1M entries from 1 to 1,000,000. For 100 tags, this will be:

CREATE TABLE numbers (
num INT PRIMARY KEY
);

INSERT INTO numbers VALUES
( 1 ), ( 2 ), ( 3 ), ( 4 ), ( 5 ), ( 6 ), ( 7 ), ( 8 ), ( 9 ), ( 10 ),
( 11 ), ( 12 ), ( 13 ), ( 14 ), ( 15 ), ( 16 ), ( 17 ), ( 18 ), ( 19 ), ( 20 ),
( 21 ), ( 22 ), ( 23 ), ( 24 ), ( 25 ), ( 26 ), ( 27 ), ( 28 ), ( 29 ), ( 30 ),
( 31 ), ( 32 ), ( 33 ), ( 34 ), ( 35 ), ( 36 ), ( 37 ), ( 38 ), ( 39 ), ( 40 ),
( 41 ), ( 42 ), ( 43 ), ( 44 ), ( 45 ), ( 46 ), ( 47 ), ( 48 ), ( 49 ), ( 50 ),
( 51 ), ( 52 ), ( 53 ), ( 54 ), ( 55 ), ( 56 ), ( 57 ), ( 58 ), ( 59 ), ( 60 ),
( 61 ), ( 62 ), ( 63 ), ( 64 ), ( 65 ), ( 66 ), ( 67 ), ( 68 ), ( 69 ), ( 70 ),
( 71 ), ( 72 ), ( 73 ), ( 74 ), ( 75 ), ( 76 ), ( 77 ), ( 78 ), ( 79 ), ( 80 ),
( 81 ), ( 82 ), ( 83 ), ( 84 ), ( 85 ), ( 86 ), ( 87 ), ( 88 ), ( 89 ), ( 90 ),
( 91 ), ( 92 ), ( 93 ), ( 94 ), ( 95 ), ( 96 ), ( 97 ), ( 98 ), ( 99 ), ( 100 );

Now, we get the numth (num being a row in number) using the following query:

SELECT n.num, SUBSTRING_INDEX(SUBSTRING_INDEX(all_tags, ',', num), ',', -1) as one_tag
FROM (
SELECT
GROUP_CONCAT(tags SEPARATOR ',') AS all_tags,
LENGTH(GROUP_CONCAT(tags SEPARATOR ',')) - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags
FROM test
) t
JOIN numbers n
ON n.num <= t.count_tags

Returns:

+-----+---------------------+
| num | one_tag |
+-----+---------------------+
| 1 | pho |
| 2 | pork |
| 3 | fried-rice |
| 4 | chicken |
| 5 | fried-rice |
| 6 | pork |
| 7 | chicken-calzone |
| 8 | chicken |
| 9 | fettuccine |
| 10 | chicken |
| 11 | spaghetti |
| 12 | chicken |
| 13 | spaghetti |
| 14 | chorizo |
| 15 | spaghetti |
| 16 | meat-balls |
| 17 | miso-soup |
| 18 | chanko-nabe |
| 19 | chicken-manchurian |
| 20 | chicken |
| 21 | manchurain |
| 22 | pork-manchurian |
| 23 | pork |
| 24 | manchurain |
| 25 | sweet-and-sour-pork |
| 26 | pork |
| 27 | peking-duck |
| 28 | duck |
+-----+---------------------+
28 rows in set (0.01 sec)

Count tags occurrences

As soon as we now have classic rows, we can easily count occurrences of each tags.

See the top of this answer to see the request.

MySQL: Get count of data in comma separated column

The following approach builds a result of 1000 integers, then uses those integers (n) to locate segments within the comma seperated string, and for each segment it creates a new row so that the derived table looks like this:


userid | book
:----- | :---------
ym0001 | dictionary
ym0002 | textbooks
ym0001 | textbooks
ym0002 | dictionary
ym0001 | notebooks

Once that exists it is a simple matter of grouping by book to arrive at the counts.

select
book, count(*) Counts
from (
select
t.userid
, SUBSTRING_INDEX(SUBSTRING_INDEX(t.books, ',', numbers.n), ',', -1) book
from (
select @rownum:=@rownum+1 AS n
from
(
select 0 union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
) a
cross join (
select 0 union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
) b
cross join (
select 0 union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
) c
cross join (select @rownum:=0) r
) numbers
inner join mytable t
on CHAR_LENGTH(t.books)
-CHAR_LENGTH(REPLACE(t.books, ',', '')) >= numbers.n-1
) d
group by
book
order by
book

book | Counts
:--------- | -----:
dictionary | 2
notebooks | 1
textbooks | 2
  1. If you already have a table of numbers, use that instead.
  2. the cross joins of a b and c dynamically produce 1000 rows, if you need more add further cross joins similar to c. i.e. the number of numbers should exceed the maximum length of your comma seperated data

db<>fiddle here

Get Count of different values in comma separated row in mysql

SELECT 
SUBSTRING_INDEX(SUBSTRING_INDEX(all_city, ',', num), ',', -1) AS one_city,
COUNT(*) AS cnt
FROM (
SELECT
GROUP_CONCAT(city separator ',') AS all_city,
LENGTH(GROUP_CONCAT(citySEPARATOR ',')) - LENGTH(REPLACE(GROUP_CONCAT(citySEPARATOR ','), ',', '')) + 1 AS count_city
FROM table_name
) t
JOIN numbers n
ON n.num <= t.count_city
GROUP BY one_city
ORDER BY cnt DESC;

for getting count of comma separated distinct value run above query but getting correct resulr you should use one more table **numbers** which have only one column num integer type and insert some values.
if you getting error during GROUP_CONCAT(city separator ',') AS all_city in this condition set a global variable " SET group_concat_max_len = 18446744073709547520; "

Unique value count of comma separated field (PHP - MySQL)

As mysql is not able to hold arrays, its better to build a new table like this:

interest_id interest_name 
1 a
2 b

and another one to keep the relations:

pk id   interest_id
1 1 1
2 1 2

which this id is the id of the records in your main table.

With having this, you can easily use:

select count(*) from THIRD_TABLE where id = YOUR_ID

Mysql - count values from comma-separated field

Clever solution here on SO: How to count items in comma separated list MySQL

LENGTH(textfield) - LENGTH(REPLACE(textfield, ',', '')) + 1

EDIT

Yes you can select it as an additional column: and correcting with the CHAR_LENGTH from @HamletHakobyan's answer:

SELECT 
ID,
textfield,
(CHAR_LENGTH(textfield) - CHAR_LENGTH(REPLACE(textfield, ',', '')) + 1) as total
FROM table


Related Topics



Leave a reply



Submit