Sorting Characters in a String First by Frequency and Then Alphabetically

Sorting characters in a string first by frequency and then alphabetically

If you want highest frequency first and then lowest letter, an easy way is to store negated frequencies and negate them back after you sort. A more efficient way is to change the comparison function used for sorting, but that is a touch trickier:

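// Assumes vec is a std::vector<std::pair<int,char>> of (frequency, letter) pairs.
struct sort_helper {
    bool operator()(std::pair<int,char> lhs, std::pair<int,char> rhs) const {
        // Negating the frequency puts higher counts first; ties fall back to the letter.
        return std::make_pair(-lhs.first, lhs.second) < std::make_pair(-rhs.first, rhs.second);
    }
};
std::sort(vec.begin(), vec.end(), sort_helper());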
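
The first approach mentioned above (store negated frequencies, sort, then negate back) is not shown in the C++ snippet. A minimal Python sketch of it might look like this; the function name and the sample word are made up for illustration:

```python
from collections import Counter

def by_frequency_then_letter(text):
    """Return (count, letter) pairs: highest count first, ties alphabetical."""
    counts = Counter(text)
    # Store the negated count so a plain ascending sort puts the most
    # frequent letters first and breaks ties by the smaller letter.
    negated = sorted((-count, letter) for letter, count in counts.items())
    # Negate back to recover the real counts.
    return [(-neg, letter) for neg, letter in negated]

print(by_frequency_then_letter("mississippi"))
```

For "mississippi" this puts 'i' ahead of 's' (both occur four times), followed by 'p' and 'm'.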

Sort Counter by frequency, then alphabetically in Python

It sounds like your question is how to sort the entire list by frequency, then break ties alphabetically. You can sort the entire list like this:

>>> a = sorted(letter_count.items(), key=lambda item: (-item[1], item[0]))
>>> print(a)
# [('a', 2), ('b', 1), ('e', 1), ('h', 1), ('l', 1), ('p', 1), ('t', 1)]

If you want the output to be a dict still, you can convert it into a collections.OrderedDict:

>>> collections.OrderedDict(a)
# OrderedDict([('a', 2),
# ('b', 1),
# ('e', 1),
# ('h', 1),
# ('l', 1),
# ('p', 1),
# ('t', 1)])

This preserves the ordering, as you can see. 'a' is first because it's most frequent. Everything else is sorted alphabetically.
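
For reference, here is a self-contained sketch of the same idea. The input word "alphabet" is an assumption (it happens to reproduce the counts shown above; the original question's input is not given here). On Python 3.7 and later a plain dict also keeps insertion order, so OrderedDict is optional:

```python
from collections import Counter

# "alphabet" is an assumed input that reproduces the counts shown above.
letter_count = Counter("alphabet")

# Sort by descending count, then by letter.
ordered = sorted(letter_count.items(), key=lambda item: (-item[1], item[0]))

# On Python 3.7+ a regular dict preserves insertion order.
as_dict = dict(ordered)

print(ordered)
print(list(as_dict))
```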

How to sort Strings by frequency then by first occurrence

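Try this. Note that it prints each distinct word with its count in alphabetical order (because of the TreeSet) rather than in frequency order: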

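import java.io.File;
import java.io.IOException;
import java.util.*;

public static void main(String[] args) throws IOException {
    // Read every line of the input file.
    List<String> lines = new ArrayList<String>();
    try (Scanner sc = new Scanner(new File("file.txt"))) {
        while (sc.hasNextLine()) {
            lines.add(sc.nextLine());
        }
    }

    // Join the lines into one string and strip punctuation.
    String[] arr = lines.toArray(new String[0]);
    String text = Arrays.toString(arr);
    String test = text.replaceAll("\\p{P}", "");

    // Split into words; the TreeSet keeps the distinct words in alphabetical order.
    List<String> list = Arrays.asList(test.split(" "));
    SortedSet<String> uniq = new TreeSet<String>(list);

    // Print each distinct word with its number of occurrences.
    for (String w : uniq) {
        System.out.printf("%n%d %s", Collections.frequency(list, w), w);
    }
}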
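
The Java snippet above counts words but does not order them by frequency or by first occurrence. As a rough sketch of that ordering, written in Python for brevity, one could record the index at which each word first appears and use it as the tie-breaker. The function name and the sample sentence are hypothetical:

```python
from collections import Counter

def by_frequency_then_first_seen(words):
    """Distinct words, most frequent first; ties keep first-occurrence order."""
    counts = Counter(words)
    # Remember the index at which each word first appears.
    first_seen = {}
    for index, word in enumerate(words):
        first_seen.setdefault(word, index)
    return sorted(counts, key=lambda w: (-counts[w], first_seen[w]))

words = "the cat and the dog and the bird".split()
print(by_frequency_then_first_seen(words))
```

Here "the" (three times) comes first, then "and" (twice), then the single-occurrence words in the order they were first read.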

C# sort string alphabetically followed by frequency of occurrence

The thing you're missing is adding an orderby to your LINQ statement:

var frequency = from f in "trreill"
                group f by f into letterfrequency
                orderby letterfrequency.Key
                select new
                {
                    Letter = letterfrequency.Key,
                    Frequency = letterfrequency.Count()
                };

foreach (var f in frequency)
{
    Console.WriteLine($"{f.Letter}{f.Frequency}");
}

The letterfrequency has a property called Key which contains the letter for each group, so adding orderby letterfrequency.Key sorts the results to give you the output you're after:

e1
i1
l2
r2
t1

I've also tweaked the result of the query slightly (simply to show that it's possible) to generate a new anonymous type that contains the Letter and the Frequency as named properties. This makes the code in the Console.WriteLine a little clearer, as the properties being used are called Letter and Frequency rather than Key and the Count() method.
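
For comparison only, the same group-then-order-by-key idea can be sketched in Python. This is not a translation of the LINQ query, just an assumed equivalent that uses collections.Counter on the same input string:

```python
from collections import Counter

# Count each letter, then sort the (letter, count) pairs by letter.
counts = Counter("trreill")
lines = [f"{letter}{count}" for letter, count in sorted(counts.items())]

print("\n".join(lines))
```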

Turning it up to 11, using C# 7.0 Tuples

If you're using C# 7.0, you could replace the anonymous type with Tuples, which means the code would now look like this:

var frequency = from f in "trreill"
                group f by f into letterfrequency
                orderby letterfrequency.Key
                select
                (
                    Letter: letterfrequency.Key,
                    Frequency: letterfrequency.Count()
                );

foreach (var (Letter, Frequency) in frequency)
{
    Console.WriteLine($"{Letter}{Frequency}");
}

I've blogged about this, and there are plenty of other resources that describe Tuples, including a Stack Overflow question that asks 'Are C# anonymous types redundant in C# 7', if you want a deeper dive into what they are, how they work, and why or when you might prefer them to anonymous types.

how to sort letters per their frequency in a word?

Use the most_common() method of counters.

from collections import Counter
string = 'ddddaacccbb'
n = 3
count = Counter(string)
print([letter for letter, _ in count.most_common(n)])

Output will be

['d', 'c', 'a']

If you want alphabetical order on the output, you can sort the result.

print(sorted(letter for letter, _ in count.most_common(n)))

Output:

['a', 'c', 'd']
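
One caveat: most_common() orders elements with equal counts by first occurrence, so which of two tied letters makes the top n depends on the input order. If ties should instead be broken alphabetically before truncating, a sketch along these lines may help (the function name and the reordered sample string are assumptions for illustration):

```python
from collections import Counter

def top_letters(text, n):
    """The n most frequent letters; equal counts are broken alphabetically."""
    counts = Counter(text)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [letter for letter, _ in ranked[:n]]

# 'a' and 'b' both occur twice, and 'b' appears first in this string.
sample = "ddddbbcccaa"
print(top_letters(sample, 3))
print([letter for letter, _ in Counter(sample).most_common(3)])
```

With this input, top_letters picks 'a' as the third letter, while most_common(3) picks 'b' because it was seen first.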

Sorting letters in an array by frequency within a struct

One solution I thought of is to create a structure that holds both the character and its frequency, and sort an array of those using qsort with a custom comparison function. This way the frequency limit is INT_MAX.

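A hackier way is to sort an array of integers: for each position i, store freq*128 + ('a' + i), sort the array in descending order with an ordinary integer comparison, and then recover the characters with most_freq_chars[i] = freq_array[i] % 128. Note that with this encoding, letters with equal frequency come out in reverse alphabetical order.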
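
The integer-packing trick can be sketched in Python as follows. This is a variation rather than the exact encoding described above: it stores 127 - ord(ch) in the low 7 bits so that a descending sort breaks ties alphabetically. The function name is hypothetical:

```python
from collections import Counter

def rank_letters_packed(text):
    """Lowercase letters of text, most frequent first, ties alphabetical."""
    counts = Counter(ch for ch in text if "a" <= ch <= "z")
    # Pack (frequency, letter) into a single integer. The low 7 bits hold
    # 127 - ord(ch), so among equal frequencies the smaller letter has the
    # larger packed value and sorts first in descending order.
    packed = [freq * 128 + (127 - ord(ch)) for ch, freq in counts.items()]
    packed.sort(reverse=True)
    # Unpack the letter from the low 7 bits.
    return "".join(chr(127 - value % 128) for value in packed)

print(rank_letters_packed("banana"))
```

Python integers do not overflow, but in C the same packing limits the frequency to roughly INT_MAX / 128.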

I hope it helps you =]

#include <stdio.h>      /* printf */
#include <stdlib.h> /* qsort */
#include <string.h> /* strlen */

typedef struct freq_pair {
    char c;
    int freq;
} freq_pair_t;

/* char_counts_t is assumed to be defined in the question's code; it is not shown here. */
typedef struct statistics {
    char_counts_t char_info;
    int sentences;
    int words;
    int freq[26];
    int max_freq;
    char most_freq_chars[26]; // You don't need 27 spaces here
} statistics_t;

void get_letter_frequencies(const char* text, size_t len, int freq[26], int* max_freq) {
    for (int i = 0; i < 26; i++) {
        freq[i] = 0;
    }

    /* Count lowercase letters only; size_t matches the type of len. */
    for (size_t i = 0; i < len; i++) {
        if ((text[i] >= 'a') && (text[i] <= 'z')) {
            freq[text[i] - 'a']++;
        }
    }

    *max_freq = 0;
    for (int i = 0; i < 26; i++) {
        if (*max_freq < freq[i]) {
            *max_freq = freq[i];
        }
    }
}

int compare(const void* a, const void* b) {
    const freq_pair_t* pa = (const freq_pair_t*)a;
    const freq_pair_t* pb = (const freq_pair_t*)b;

    /* Higher frequency sorts first. */
    if (pa->freq != pb->freq) {
        return (pa->freq > pb->freq) ? -1 : 1;
    }

    /* Equal frequency: fall back to alphabetical order. */
    if (pa->c != pb->c) {
        return (pa->c < pb->c) ? -1 : 1;
    }

    return 0;
}

void get_text_statistics(const char* text, size_t len, statistics_t* data) {
    /* count_sentences() and count_words() are assumed to come from the question's code. */
    *data = (statistics_t){
        .sentences = count_sentences(text, len),
        .words = count_words(text, len),
        /* Do not init most_freq_chars here; fill it in after all the frequencies are computed */
    };

    get_letter_frequencies(text, len, &data->freq[0], &data->max_freq);
    freq_pair_t freq_pairs[26];

    for (int i = 0; i < 26; i++) {
        freq_pairs[i].freq = data->freq[i];
        freq_pairs[i].c = 'a' + i;
    }

    /* Sort by descending frequency, then alphabetically. */
    qsort(freq_pairs, 26, sizeof(freq_pair_t), compare);
    for (int i = 0; i < 26; i++) {
        data->most_freq_chars[i] = freq_pairs[i].c;
    }
}

int main(void) {
    const char* s = "quero mudar o mundo, cruzar os ceus sem nada temer";
    statistics_t data;
    get_text_statistics(s, strlen(s), &data);

    /* Letters from most to least frequent. */
    for (int i = 0; i < 26; i++) {
        printf("%c ", data.most_freq_chars[i]);
    }
    printf("\n");

    /* Raw count for each letter, a to z. */
    for (int i = 0; i < 26; i++) {
        printf("%c-%d ", 'a' + i, data.freq[i]);
    }
    printf("\n");

    return 0;
}

