How to Count the Number of Words in a String in Oracle

How can I count the number of words in a string in Oracle?

You can use something similar to this. This gets the length of the string, then substracts the length of the string with the spaces removed. By then adding the number one to that should give you the number of words:

Select length(yourCol) - length(replace(yourcol, ' ', '')) + 1 NumbofWords
from yourtable

See SQL Fiddle with Demo

If you use the following data:

CREATE TABLE yourtable
    (yourCol varchar2(15))
;

INSERT ALL 
    INTO yourtable (yourCol)
         VALUES ('Hello To Oracle')
    INTO yourtable (yourCol)
         VALUES ('oneword')
    INTO yourtable (yourCol)
         VALUES ('two words')
SELECT * FROM dual
;

And the query:

Select yourcol,
  length(yourCol) - length(replace(yourcol, ' ', '')) + 1 NumbofWords
from yourtable

The result is:

|         YOURCOL | NUMBOFWORDS |
---------------------------------
| Hello To Oracle |           3 |
|         oneword |           1 |
|       two words |           2 |

How to count number of words in delimited string in Oracle SQL

Firstly, it is a bad design to store multiple values in a single column as delimited string. You should consider normalizing the data as a permanent solution.

With the denormalized data, you could do it in a single SQL using REGEXP_SUBSTR:

SELECT COUNT(DISTINCT(regexp_substr(country, '[^ ]+', 1, LEVEL))) as "COUNT"
FROM table_name
  CONNECT BY LEVEL <= regexp_count(country, ' ')+1 
/

Demo:

SQL> WITH sample_data AS
  2    ( SELECT 'japan singapore japan chinese chinese chinese' str FROM dual
  3    )
  4  -- end of sample_data mocking real table
  5  SELECT COUNT(DISTINCT(regexp_substr(str, '[^ ]+', 1, LEVEL))) as "COUNT"
  6  FROM sample_data
  7    CONNECT BY LEVEL <= regexp_count(str, ' ')+1
  8  /

     COUNT
----------
         3

See Split single comma delimited string into rows in Oracle to understand how the query works.

UPDATE

For multiple delimited string rows you need to take care of the number of rows formed by the CONNECT BY clause.

See Split comma delimited strings in a table in Oracle for more ways of doing the same task.

Setup

Let's say you have a table with 3 rows like this:

SQL> CREATE TABLE t(country VARCHAR2(200));

Table created.

SQL> INSERT INTO t VALUES('japan singapore japan chinese chinese chinese');

1 row created.

SQL> INSERT INTO t VALUES('singapore indian malaysia');

1 row created.

SQL> INSERT INTO t VALUES('french french french');

1 row created.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM t;

COUNTRY
---------------------------------------------------------------------------
japan singapore japan chinese chinese chinese
singapore indian malaysia
french french french

Using REGEXP_SUBSTR and REGEXP_COUNT:

We expect the output as 6 since there are 6 unique strings.

SQL> SELECT COUNT(DISTINCT(regexp_substr(t.country, '[^ ]+', 1, lines.column_value))) count
  2    FROM t,
  3      TABLE (CAST (MULTISET
  4      (SELECT LEVEL FROM dual
  5              CONNECT BY LEVEL <= regexp_count(t.country, ' ')+1
  6      ) AS sys.odciNumberList ) ) lines
  7    ORDER BY lines.column_value
  8  /

     COUNT
----------
         6

There are many other methods to achieve the desired output. Let's see how:

Using XMLTABLE


SQL> SELECT COUNT(DISTINCT(country)) COUNT
  2  FROM
  3    (SELECT trim(COLUMN_VALUE) country
  4    FROM t,
  5      xmltable(('"'
  6      || REPLACE(country, ' ', '","')
  7      || '"'))
  8    )
  9  /

     COUNT
----------
         6

Using MODEL clause


SQL> WITH
  2       model_param AS
  3       (
  4        SELECT country AS orig_str ,
  5               ' '
  6               || country
  7               || ' '                                 AS mod_str ,
  8               1                                      AS start_pos ,
  9              Length(country)                           AS end_pos ,
 10              (LENGTH(country) -
 11              LENGTH(REPLACE(country, ' '))) + 1        AS element_count ,
 12              0                                      AS element_no ,
 13              ROWNUM                                 AS rn
 14        FROM   t )
 15        SELECT COUNT(DISTINCT(Substr(mod_str, start_pos, end_pos-start_pos))) count
 16        FROM (
 17              SELECT *
 18              FROM   model_param
 19              MODEL PARTITION BY (rn, orig_str, mod_str)
 20              DIMENSION BY (element_no)
 21              MEASURES (start_pos, end_pos, element_count)
 22              RULES ITERATE (2000)
 23              UNTIL (ITERATION_NUMBER+1 = element_count[0])
 24            ( start_pos[ITERATION_NUMBER+1] =
 25                      instr(cv(mod_str), ' ', 1, cv(element_no)) + 1,
 26              end_pos[ITERATION_NUMBER+1] =
 27                      instr(cv(mod_str), ' ', 1, cv(element_no) + 1) )
 28            )
 29         WHERE    element_no != 0
 30    ORDER BY      mod_str , element_no
 31   /

     COUNT
----------
         6

PL / SQL word count from string

One another option would be the query below :

select word, count(1) as repeating
  from
  (
    with t(str) as
    (
     select 'Hello, I like ham pizza more than mozzarella pizza' from dual   
    )    
    select regexp_replace(regexp_substr(str,  '[^\ ]+', 1, level),'[^a-zA-Z]','')
        as word
      from t
     cross join dual
    connect by level <= regexp_count(str,  '[^\ ]+') 
   )   
 group by word
 order by repeating desc, word;

WORD       REPEATING
---------- ---------
pizza          2
ham            1
Hello          1
I              1
like           1
more           1
mozzarella     1
than           1

How to count the number of occurrences of a character in an Oracle varchar value?

Here you go:

select length('123-345-566') - length(replace('123-345-566','-',null)) 
from dual;

Technically, if the string you want to check contains only the character you want to count, the above query will return NULL; the following query will give the correct answer in all cases:

select coalesce(length('123-345-566') - length(replace('123-345-566','-',null)), length('123-345-566'), 0) 
from dual;

The final 0 in coalesce catches the case where you're counting in an empty string (i.e. NULL, because length(NULL) = NULL in ORACLE).

Counting the number of words in a Column in Oracle SQL

The answer to this data set is below:

`select * from(
 select x,count(*) as coun from ( 
 select substr(names,
            INSTR(names, ' ', -1, 1)+1) as x
            from abc

  union all

  SELECT SUBSTR(names,
          INSTR(names, ' ', 1, 1) + 1,
          INSTR(names, ' ', 1, 2) - INSTR(names, ' ', 1, 1) - 1) as x
  FROM abc

  union all 

   SELECT SUBSTR(names,1,
          INSTR(names, ' ',1 , 1)-1) as x
   FROM abc
   )
   where x is not null and x not in ('1','2','3','4','5','6','7')
  group by x
  order by coun desc)
  where rownum < 4800;'

Answer:

Sample Image

How do I Count the words in a string using regex

INSTR is also a viable option. By looking for the second occurrence of a space, that will indicate that the string has at least 3 words.

WITH
    books
    AS
        (SELECT 'Tom Sawyer' title FROM DUAL
         UNION ALL
         SELECT 'A tale of two cities' FROM DUAL
         UNION ALL
         SELECT 'The Little Prince' FROM DUAL
         UNION ALL
         SELECT 'Don Quixote' FROM DUAL)
SELECT title
  FROM books
 WHERE instr(title, ' ', 1, 2) > 0;

If you do with to stick with regex, the regex expression below can be used to find books that have 3 or more words.

WITH
    books
    AS
        (SELECT 'Tom Sawyer' title FROM DUAL
         UNION ALL
         SELECT 'A tale of two cities' FROM DUAL
         UNION ALL
         SELECT 'The Little Prince' FROM DUAL
         UNION ALL
         SELECT 'Don Quixote' FROM DUAL)
SELECT title
  FROM books
 WHERE REGEXP_LIKE (title, '(\S+\s){2,}');

(Thanks @Littlefoot for the books!)