PHP: Sort and Count Instances of Words in a Given String

Use a combination of str_word_count() and array_count_values():

$str = 'happy beautiful happy lines pear gin happy lines rock happy lines pear ';
$words = array_count_values(str_word_count($str, 1));
print_r($words);

gives

Array
(
[happy] => 4
[beautiful] => 1
[lines] => 3
[pear] => 2
[gin] => 1
[rock] => 1
)

The 1 in str_word_count() makes the function return an array of all the found words.
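For reference, here is a small sketch of the three return formats of str_word_count(). The sample string and variable names are made up for illustration:

```php
<?php
$str = 'happy lines pear';

// Format 0 (the default): just the number of words.
$count = str_word_count($str);

// Format 1: a list of the words found.
$list = str_word_count($str, 1);

// Format 2: the words keyed by their byte position in the string.
$positions = str_word_count($str, 2);

var_dump($count);
print_r($list);
print_r($positions);
```

Format 1 is the one that feeds directly into array_count_values().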

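To sort the entries by count, highest first, use arsort() (it preserves keys). The order of words with equal counts depends on the PHP version: sorting is stable as of PHP 8.0, so there the tied words keep their original order rather than the order shown here: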

arsort($words);
print_r($words);

Array
(
[happy] => 4
[lines] => 3
[pear] => 2
[rock] => 1
[gin] => 1
[beautiful] => 1
)
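If the order of tied words matters, one possible approach (a sketch, not part of the original answer) is to sort with uksort() and an explicit tie-breaker: count descending, then word ascending. This assumes PHP 7 or later for the <=> operator:

```php
<?php
$str = 'happy beautiful happy lines pear gin happy lines rock happy lines pear';
$words = array_count_values(str_word_count($str, 1));

// Compare [count, word] pairs: higher counts first, then alphabetical order.
uksort($words, function ($a, $b) use ($words) {
    return [$words[$b], $a] <=> [$words[$a], $b];
});

print_r($words);
```

With this comparator the result no longer depends on how the PHP version handles equal elements.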

Sort and count instances of words in a database

I don't have your database on hand, so I'll demonstrate by stepping through an array:

[ghoti@pc ~]$ cat doit.php
#!/usr/local/bin/php
<?php

$a = array(
    '1' => "happy beautiful happy lines pear gin happy lines rock happy lines pear",
    '2' => "happy lines pear gin happy lines rock happy lines pear",
    '3' => "happy rock pear happy happy happy",
);

$wordlist = array();

foreach ($a as $index => $line) {
    foreach (explode(" ", $line) as $word) {
        // Start missing keys at 0 so the first increment does not hit an undefined key.
        $wordlist[$word] = ($wordlist[$word] ?? 0) + 1;
    }
}

print_r($wordlist);

[ghoti@pc ~]$ ./doit.php
Array
(
[happy] => 11
[beautiful] => 1
[lines] => 6
[pear] => 5
[gin] => 2
[rock] => 3
)
[ghoti@pc ~]$

To adapt this to your use case, replace the outer foreach() with a while loop that steps through your table (db_query() and db_fetch_row() stand in for whatever functions your database API provides):

$sql = "SELECT id,wordlist FROM yadda";
$result = db_query($sql);
while ($row = db_fetch_row($result)) {
...
}

I don't know what database server you're using, so I can't provide a specific example that I know will be applicable to you.
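One way to keep the counting independent of the database layer is to put it in a function that accepts any iterable of rows. The sketch below assumes each row is an array with a 'wordlist' key, as in the query above; countWordsInRows() is a hypothetical helper name, and the $rows array only stands in for a real result set:

```php
<?php
// Hypothetical helper: counts words across rows that each carry a 'wordlist' column.
function countWordsInRows(iterable $rows): array
{
    $counts = [];
    foreach ($rows as $row) {
        // PREG_SPLIT_NO_EMPTY skips empty strings caused by repeated or trailing spaces.
        foreach (preg_split('/\s+/', $row['wordlist'], -1, PREG_SPLIT_NO_EMPTY) as $word) {
            $counts[$word] = ($counts[$word] ?? 0) + 1;
        }
    }
    arsort($counts); // highest count first, keys preserved
    return $counts;
}

// Stand-in for a database result set.
$rows = [
    ['id' => 1, 'wordlist' => 'happy beautiful happy lines pear gin happy lines rock happy lines pear'],
    ['id' => 2, 'wordlist' => 'happy lines pear gin happy lines rock happy lines pear'],
    ['id' => 3, 'wordlist' => 'happy rock pear happy happy happy'],
];

$counts = countWordsInRows($rows);
print_r($counts);
```

Anything that can be iterated with foreach and yields associative rows can be passed in the same way.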

How to sort by word count and in alphabetical order?

This should work for you:

usort() is the way to go here. I first compare the number of words, which I get by counting the spaces with substr_count() and adding 1. If the word counts are equal, I use strcasecmp() to compare the two strings; otherwise I compare the counts.

<?php

$arr = ["Cube Pro Duo", "Cube Pro", "Cube Pro Trio"];

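usort($arr, function($a, $b){
    // Word count = number of spaces + 1.
    $countA = substr_count($a, " ") + 1;
    $countB = substr_count($b, " ") + 1;

    if ($countA == $countB) {
        // Same number of words: fall back to a case-insensitive comparison.
        return strcasecmp($a, $b);
    } else {
        return $countA > $countB ? 1 : -1;
    }
});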

print_r($arr);

?>

output:

Array
(
[0] => Cube Pro
[1] => Cube Pro Duo
[2] => Cube Pro Trio
)
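As a variation on the comparator above, the same ordering can be written more compactly with the <=> operator (PHP 7 or later). This is only a sketch: wordCount() is a hypothetical helper that splits on runs of whitespace, so repeated spaces do not inflate the count, and the sample values are made up:

```php
<?php
// Hypothetical helper: number of whitespace-separated words in a string.
function wordCount(string $s): int
{
    return count(preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY));
}

$arr = ['Cube Pro Trio', 'Zed', 'Cube Pro Duo', 'Cube Pro', 'Alpha Beta'];

usort($arr, function ($a, $b) {
    // Compare by word count first; if equal (0), fall back to strcasecmp().
    return (wordCount($a) <=> wordCount($b)) ?: strcasecmp($a, $b);
});

print_r($arr);
```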

count the occurrences of all the letters in a string PHP

You don't have to convert the string into an array; you can use substr_count() to achieve the same result.

substr_count — Count the number of substring occurrences

<?php
$str = "cdcdcdcdeeeef";
echo substr_count($str, 'c');
?>

PHP Manual

substr_count() returns the number of times the needle substring occurs in the haystack string. Please note that needle is case sensitive.

EDIT:

Sorry for the misunderstanding. To get a count for each character in a string, use count_chars(). An example:

<?php
$str = "cdcdcdcdeeeef";

// Mode 1 returns only the byte values that occur, as byte value => count.
foreach (count_chars($str, 1) as $byte => $count) {
    echo chr($byte) . " occurred $count times in the string.<br>";
}
?>

PHP Manual: count_chars

count_chars — Return information about characters used in a string
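Note that count_chars() works on bytes, so a multi-byte UTF-8 character is counted as several separate bytes. A minimal sketch for UTF-8 input, assuming the string is valid UTF-8 (utf8CountChars() is a hypothetical helper name), splits the string into characters with preg_split() in Unicode mode and counts them with array_count_values():

```php
<?php
// Hypothetical helper: counts each character of a UTF-8 string.
function utf8CountChars(string $str): array
{
    // With the /u modifier an empty pattern splits between characters, not bytes.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return []; // the input was not valid UTF-8
    }
    return array_count_values($chars);
}

$counts = utf8CountChars('cdcdcdcdeeeef');
foreach ($counts as $char => $n) {
    echo "$char occurred $n times" . PHP_EOL;
}
```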

php - count number of instances of a word in an array supporting UTF8

It is possible to make a UTF-8 (only!) version using the Unicode mode of PHP's PCRE functions.

function utf8_str_word_count($string, $format = 0, $charlist = null) {
    if ($charlist === null) {
        $regex = '/\\pL[\\pL\\p{Mn}\'-]*/u';
    }
    else {
        // Quote each extra character, including the "/" regex delimiter.
        $split = array_map(
            function ($char) { return preg_quote($char, '/'); },
            preg_split('//u', $charlist, -1, PREG_SPLIT_NO_EMPTY));
        $regex = sprintf('/(\\pL|%1$s)([\\pL\\p{Mn}\'-]|%1$s)*/u',
            implode('|', $split));
    }

    switch ($format) {
        default:
        case 0:
            // For PHP >= 5.4.0 this is fine:
            return preg_match_all($regex, $string);

            // For PHP < 5.4 it's necessary to do this:
            // $results = null;
            // return preg_match_all($regex, $string, $results);
        case 1:
            $results = null;
            preg_match_all($regex, $string, $results);
            return $results[0];
        case 2:
            $results = null;
            preg_match_all($regex, $string, $results, PREG_OFFSET_CAPTURE);
            // Each match is [word, byte offset]; index the words by offset
            // (array_column() needs PHP >= 5.5).
            return array_column($results[0], 0, 1);
    }
}

This function follows the semantics of str_word_count as closely as possible; in particular, if you replace "locale dependent" with "UTF-8" in the following note for str_word_count, the result holds true for this function:

For the purpose of this function, 'word' is defined as a locale
dependent string containing alphabetic characters, which also may
contain, but not start with "'" and "-" characters.

Additionally, the characters ' and - are considered part of a word but cannot start one; however, any characters specified in the $charlist parameter can start a word which means that specifying ' and/or - slightly changes the way the function works. This behavior also matches the original str_word_count.

It is also interesting to note that you could make the function recognize only some subset of Unicode scripts by appropriately replacing \pL with character properties such as \p{Greek} -- see the PCRE Unicode reference.
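To illustrate, here is a short sketch that applies the default pattern from the function above directly with preg_match_all() and then counts the matches, followed by a variant restricted to Greek letters via \p{Greek}. The sample strings and variable names are made up for illustration:

```php
<?php
// Same pattern as the default branch of utf8_str_word_count().
$wordRegex = '/\pL[\pL\p{Mn}\'-]*/u';
$text = 'café naïve café Ünïcödé';
preg_match_all($wordRegex, $text, $matches);
$counts = array_count_values($matches[0]);
print_r($counts);

// Restricting the pattern to a single script, here Greek.
$greekRegex = '/\p{Greek}[\p{Greek}\p{Mn}\'-]*/u';
$mixed = 'αλφα beta αλφα gamma';
preg_match_all($greekRegex, $mixed, $greekMatches);
$greekCounts = array_count_values($greekMatches[0]);
print_r($greekCounts);
```

In the second call the Latin words are simply skipped, because no character in them matches \p{Greek}.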

Find how many times each word appeared in string in php

The code you posted in the comments is ok, but it considers words written with different casing as different words (like "Comments" and "comments"). So don't forget to use strtolower:

<?php  
$comments = "Comments? I like comments.";

$commentsArray = array_count_values(str_word_count(strtolower($comments), 1));

echo "<p>How many words were input: " . count($commentsArray) . "</p>";
?>
<table>
<tr>
<th>Word</th>
<th>Count</th>
</tr>
<?php foreach($commentsArray as $word=>$count): ?>
<tr>
<td><?php echo $word; ?></td>
<td><?php echo $count; ?></td>
</tr>
<?php endforeach; ?>
</table>

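This script echoes the following. Note that count($commentsArray) is the number of distinct words (3), not the total number of words in the input (4):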

How many words were input: 3

Word Count
comments 2
i 1
like 1
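If you need both figures, a small sketch like the following keeps the word list around before counting, so the total and the distinct count can be reported separately (the variable names are illustrative):

```php
<?php
$comments = "Comments? I like comments.";

// Lowercase first so "Comments" and "comments" are the same word.
$words = str_word_count(strtolower($comments), 1);

$totalWords = count($words);                        // every occurrence
$distinctWords = count(array_count_values($words)); // unique words only

echo "Total words: $totalWords, distinct words: $distinctWords" . PHP_EOL;
```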

How to sort data ASC by total count of word in MYSQL

Try this:

#1st Query
SELECT *,LENGTH(fname)-LENGTH(replace(fname,' ','')) as word_count FROM mytable
ORDER BY word_count ASC;

#2nd Query
SELECT A.fname, LENGTH(A.fname)-LENGTH(replace(A.fname,' ','')) as word_count FROM mytable A LEFT JOIN
(SELECT *,LENGTH(fname)-LENGTH(replace(fname,' ','')) as w_count FROM mytable WHERE fname='Banana Cakes') B
ON A.fname=B.fname
ORDER BY CASE WHEN B.fname IS NOT NULL THEN B.w_count END DESC,
CASE WHEN B.fname IS NOT NULL THEN LENGTH(B.fname) END DESC,
word_count ASC;

Fiddle here: https://www.db-fiddle.com/f/3gwZj9yMp43dhmzaETKghd/4

The first query is simple: it just needs to order by length ascending and fname descending. (Your example shows Banana Split returned before Banana Cakes even though both have the same length and word count; alphabetically, 'C' comes first, so in ascending order Banana Cakes would be returned first.)

For the second query, I turned the 'Banana Cakes' condition into a sub-query and LEFT JOINed it with the main table. The ORDER BY uses CASE expressions: if the row from the LEFT JOIN (the sub-query result) is not NULL, order by that value first; the remaining order condition is the same as in the 1st query.

Edit: Added a word count condition to the ordering. Note that this word count expression counts spaces rather than words: 0 means a single word with no spaces, and 1 means two words separated by one space.
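For comparison, the same space-counting expression can be sketched in PHP, where strlen() and str_replace() play the roles of LENGTH() and REPLACE(). spaceCount() is a hypothetical helper and the names are sample values, not data from the question:

```php
<?php
// Mirrors LENGTH(fname) - LENGTH(REPLACE(fname, ' ', '')): counts spaces, not words.
function spaceCount(string $s): int
{
    return strlen($s) - strlen(str_replace(' ', '', $s));
}

// Sample values, made up for illustration.
$names = ['Banana Split Sundae', 'Banana', 'Banana Cakes'];

// Equivalent of ORDER BY word_count ASC.
usort($names, function ($a, $b) {
    return spaceCount($a) <=> spaceCount($b);
});

foreach ($names as $name) {
    echo spaceCount($name) . "  " . $name . PHP_EOL;
}
```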


