Testing Whether a String Has Repeated Characters

Testing whether a string has repeated characters

You can use collections.Counter :

>>> from collections import Counter
>>> [i for i,j in Counter(a).items() if j>1]
['4', '8']

Or you can use a custom function :

>>> def finder(s):
... seen,yields=set(),set()
... for i in s:
... if i in seen:
... if i not in yields:
... yield i
... yields.add(i)
... else :
... yields.add(i)
... else:
... seen.add(i)
...
>>> list(finder(a))
['4', '8']

Or use str.count method in a set comprehension :

>>> set(i for i in a if a.count(i)>1)
set(['8', '4'])

A benchmark on all approaches, which shows that the last 2 way (custom function and set comprehensions are much faster than Counter):

from timeit import timeit


s1="""
a = "12348546478"
[i for i,j in Counter(a).items() if j>1]

"""
s2="""
def finder(s):
seen,yields=set(),set()
for i in s:
if i in seen:
if i not in yields:
yield i
yields.add(i)
else :
yields.add(i)
else:
seen.add(i)

a = "12348546478"
list(finder(a))

"""

s3="""
a = "12348546478"
set(i for i in a if a.count(i)>1)
"""

print '1st: ' ,timeit(stmt=s1, number=100000,setup="from collections import Counter")
print '2nd : ',timeit(stmt=s2, number=100000)
print '3rd : ',timeit(stmt=s2, number=100000)

result :

1st:  0.726881027222
2nd : 0.265578985214
3rd : 0.26243185997

I also tried this for long string (a = "12348546478"*10000) and still got the same result:

1st:  25.5780302721341
2nd : 11.8482989001177
3rd : 11.926538944245

Any way my suggestion is using the set comprehension which is more pythonic :

set(i for i in a if a.count(i)>1)

How can I check if a string contains repeated characters or not?

Create a Set from the string, and check if the Set's size is less than the string's length. A Set can only hold unique values, so if there are repeated characters the Set's size would be less than the string's length.

const hasRepeatedCharacters = str => new Set(str).size < str.length;
console.log(hasRepeatedCharacters("abadan"));console.log(hasRepeatedCharacters("abc"));

How to find whether a string has repeated characters?

You are printing True and False for each character in the string. Running your code with input "hello" will give 5 lines of output, two True, two False and one more True.

def is_isogram(string):
for i in string:
if string.count(i) > 1:
return False
return True

This code will return either True or False. If any character in the string appears more than once, the function returns False and exits. If none of the characters appear more than once, it returns True after exiting the loop.

Explanation of the code

You can be sure that the word is not an isogram if any letter appears more than once, ie, you don't have to traverse the whole string, just until you find a repeated character. But to be sure that you have an isogram, you need to check the whole string.

for i in string:

This statement iterates through the loop character by character.

    if string.count(i) > 1:
return False

The if statement checks if the current character, i, appears more than once in the string. If it does, the function returns False. If not, it continues to the next iteration.

    return True

If the control has reached this statement, it means that none of the characters appear twice, as the function would have returned False before reaching this statement. So, the function returns True

Example

Consider the input "hello".

Iteration 1: i = 'h' and string.count('h') = 1

So, do nothing

Iteration 2: i = 'e' and string.count('e') = 1

So, do nothing

Iteration 3: i = 'l' and string.count('l') = 2

So, return False.

If you called the function like, print(is_isogram("hello")), False will be printed.

Consider another input "abcd"
Iteration 1: i = 'a' and string.count('a') = 1

So, do nothing

Iteration 2: i = 'b' and string.count('b') = 1

So, do nothing

Iteration 3: i = 'c' and string.count('c') = 1

So, do nothing

Iteration 4: i = 'd' and string.count('d') = 1

So, do nothing

At this point, the loop is exhausted and the return True statement is executed

Check for repeated characters in a string Javascript

(A recursive solution can be found at the end, of this answer.)

You could just use the builtin javascript Array functions some MDN some reference

 var text = "test".split("");
text.some(function(v,i,a){
return a.lastIndexOf(v)!=i;
});

callback parameters:

v ... current value of the iteration

i ... current index of the iteration

a ... array being iterated

.split("") create an array from a string

.some(function(v,i,a){ ... }) goes through an array until the function returns true, and ends than right away. (it doesn't loop through the whole array, which is good for performance)

Details to the some function here in the documentation

Tests, with several different strings:

var texts = ["test", "rest", "why", "puss"];

for(var idx in texts){
var text = texts[idx].split("");
document.write(text + " -> " + text.some(function(v,i,a){return a.lastIndexOf(v)!=i;}) +"<br/>");

}
//tested on win7 in chrome 46+

Testing for repeated characters in a string

If the string is short, then just looping and testing may well be the simplest and most efficient way. I mean you could create a hash set (in whatever platform you're using) and iterate through the characters, failing if the character is already in the set and adding it to the set otherwise - but that's only likely to provide any benefit when the strings are longer.

EDIT: Now that we know it's sorted, mquander's answer is the best one IMO. Here's an implementation:

public static bool IsSortedNoRepeats(string text)
{
if (text.Length == 0)
{
return true;
}
char current = text[0];
for (int i=1; i < text.Length; i++)
{
char next = text[i];
if (next <= current)
{
return false;
}
current = next;
}
return true;
}

A shorter alternative if you don't mind repeating the indexer use:

public static bool IsSortedNoRepeats(string text)
{
for (int i=1; i < text.Length; i++)
{
if (text[i] <= text[i-1])
{
return false;
}
}
return true;
}

EDIT: Okay, with the "frequency" side, I'll turn the problem round a bit. I'm still going to assume that the string is sorted, so what we want to know is the length of the longest run. When there are no repeats, the longest run length will be 0 (for an empty string) or 1 (for a non-empty string). Otherwise, it'll be 2 or more.

First a string-specific version:

public static int LongestRun(string text)
{
if (text.Length == 0)
{
return 0;
}
char current = text[0];
int currentRun = 1;
int bestRun = 0;

for (int i=1; i < text.Length; i++)
{
if (current != text[i])
{
bestRun = Math.Max(currentRun, bestRun);
currentRun = 0;
current = text[i];
}
currentRun++;
}
// It's possible that the final run is the best one
return Math.Max(currentRun, bestRun);
}

Now we can also do this as a general extension method on IEnumerable<T>:

public static int LongestRun(this IEnumerable<T> source)
{
bool first = true;
T current = default(T);
int currentRun = 0;
int bestRun = 0;

foreach (T element in source)
{
if (first || !EqualityComparer<T>.Default(element, current))
{
first = false;
bestRun = Math.Max(currentRun, bestRun);
currentRun = 0;
current = element;
}
}
// It's possible that the final run is the best one
return Math.Max(currentRun, bestRun);
}

Then you can call "AABCD".LongestRun() for example.

What is the fastest way to check if a string contains repeating characters in Python 3?

Let me start off by saying that I suspect that you are optimizing when you don't need to. Python is a high-level language that supports thinking about computation in a high-level manner. A solution that is readable, elegant, and reusable is often going to be better than one that is blazingly fast, but hard to understand.

When, and only when, you determine that speed is an issue, then you should proceed with the optimizations. Perhaps even write a C extension for the computationally intense parts.

That being said, here's a comparison of a few techniques:

def unique_chars_set(s):
return len(s) == len(set(s))

def unique_chars_frozenset(s):
return len(s) == len(frozenset(s))

def unique_chars_counter(s):
return Counter(s).most_common(1)[0][1] > 1

def unique_chars_sort(s):
ss = ''.join(sorted(s))
prev = ''
for c in ss:
if c == prev:
return False
prev = c
return True

def unique_chars_bucket(s):
buckets = 255 * [False]
for c in s:
o = ord(c)
if buckets[o]:
return False
buckets[o] = True
return True

And here is the performance comparisons (in IPython):

In [0]: %timeit -r10 [unique_chars_set(s) for s in candidate_strings]
100000 loops, best of 10: 6.63 us per loop

In [1]: %timeit -r10 [unique_chars_frozenset(s) for s in candidate_strings]
100000 loops, best of 10: 6.81 us per loop

In [2]: %timeit -r10 [unique_chars_counter(s) for s in candidate_strings]
10000 loops, best of 10: 83.1 us per loop

In [3]: %timeit -r10 [unique_chars_sort(s) for s in candidate_strings]
100000 loops, best of 10: 13.1 us per loop

In [4]: %timeit -r10 [unique_chars_bucket(s) for s in candidate_strings]
100000 loops, best of 10: 15 us per loop

Conclusion: set is elegant and faster than many other obvious methods. But the differences are so small, it doesn't matter anyway.

For more benchmarks, see @FrancisAvila's answer.

How to Check if String Has the same characters in Python

An option is to check whether the set of its characters has length 1:

>>> len(set("aaaa")) == 1
True

Or with all(), this could be faster if the strings are very long and it's rare that they are all the same character (but then the regex is good too):

>>> s = "aaaaa"
>>> s0 = s[0]
>>> all(c == s0 for c in s[1:])
True

How to determine if a string contains a sequence of repeated letters

You can use this function:

function hasRepeatedLetters(str) {
var patt = /^([a-z])\1+$/;
var result = patt.test(str);
return result;
}


Related Topics



Leave a reply



Submit