How to Get the Count of a Type Conforming to `Sequence`

How to get the count of a type conforming to `Sequence`?

I do not think that there is a better method for an arbitrary type conforming to
SequenceType. The only thing that is known about a sequence is that
it has a generate() method returning a GeneratorType, which in turn
has a next() method. The next() method advances to the next
element of the sequence and returns it, or returns nil if there
is no next element.

Note that it is not required at all that next() eventually returns
nil: a sequence may have infinitely many elements.

Therefore enumerating the sequence is the only method to count its
elements. But this need not terminate. Therefore the answer could
also be: A function taking a sequence argument should not need to know
the total number of elements.
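
Counting by enumeration can be sketched as follows. This is only an illustration, written with the current protocol names (Sequence and its iterator) rather than the older SequenceType and generate() used above; the function name countByIterating is hypothetical.

```swift
// Counts the elements of any sequence by iterating over it.
// Caution: this consumes single-pass sequences and never returns
// for an infinite sequence.
func countByIterating<S: Sequence>(_ elements: S) -> Int {
    var total = 0
    for _ in elements {
        total += 1
    }
    return total
}

// A finite sequence: the loop ends when next() returns nil.
let finiteCount = countByIterating(stride(from: 0, to: 10, by: 3))  // elements 0, 3, 6, 9

// An infinite sequence: next() never returns nil, so it has to be
// truncated (here with prefix) before it can be counted.
let powersOfTwo = sequence(first: 1) { $0 * 2 }
let truncatedCount = countByIterating(powersOfTwo.prefix(5))
```

Calling countByIterating(powersOfTwo) without the prefix would loop forever, which is the reason a function taking an arbitrary sequence should avoid depending on its length.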

For types conforming to CollectionType you can use the
countElements() function (renamed to count() in Swift 1.2, and
available as the count property since Swift 2).

There is also underestimateCount():

/// Return an underestimate of the number of elements in the given
/// sequence, without consuming the sequence. For Sequences that are
/// actually Collections, this will return countElements(x)
func underestimateCount<T : SequenceType>(x: T) -> Int

but that does not necessarily return the exact number of elements.

Type Int does not conform to protocol sequence

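An Int is a single value, not a sequence, so it cannot be used directly as the source of a for-in loop.
Iterate over a range of integers instead: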

for number in 0..<(numbers.count-1)
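
A minimal sketch of the difference, assuming numbers is an array (the original question's declaration is not shown, so the values here are made up). Note that the range 0..<(numbers.count-1) above stops before the last index; the sketch uses ranges that visit every element.

```swift
let numbers = [10, 20, 30]  // hypothetical data

// for number in numbers.count { }   // error: Int does not conform to Sequence

// A half-open range of Int is a sequence, so this compiles.
var viaRange: [Int] = []
for index in 0..<numbers.count {
    viaRange.append(numbers[index])
}

// indices yields exactly the valid subscripts of the collection.
var viaIndices: [Int] = []
for index in numbers.indices {
    viaIndices.append(numbers[index])
}
```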

Swift Collection underestimatedCount usage

underestimatedCount is actually a requirement of the Sequence protocol, and has a default implementation that just returns 0:

public var underestimatedCount: Int {
    return 0
}

However, for sequences that provide their own implementation of underestimatedCount, this can be useful for logic that needs a lower bound of how long the sequence is, without having to iterate through it (remember that Sequence gives no guarantee of non-destructive iteration).
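
As an illustration, here is a minimal sketch of a sequence that supplies its own underestimatedCount. The Countdown type is hypothetical and not part of the standard library.

```swift
// A single-pass sequence that counts down from `remaining` to 1.
struct Countdown: Sequence, IteratorProtocol {
    var remaining: Int

    mutating func next() -> Int? {
        guard remaining > 0 else { return nil }
        defer { remaining -= 1 }
        return remaining
    }

    // Happens to be exact for this type, but callers may only
    // rely on it as a lower bound.
    var underestimatedCount: Int {
        return remaining > 0 ? remaining : 0
    }
}

let countdown = Countdown(remaining: 3)
let lowerBound = countdown.underestimatedCount  // reading this does not iterate
let elements = Array(countdown)
```

Reading underestimatedCount does not advance the iteration, so the full sequence is still available afterwards.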

For example, the map(_:) method on Sequence uses underestimatedCount in order to reserve an initial capacity for the resultant array:

public func map<T>(
    _ transform: (Iterator.Element) throws -> T
) rethrows -> [T] {

    let initialCapacity = underestimatedCount
    var result = ContiguousArray<T>()
    result.reserveCapacity(initialCapacity)

    // ...

This allows map(_:) to minimise the cost of repeatedly appending to the result, as an initial block of memory has (possibly) already been allocated for it (although it's worth noting in any case that ContiguousArray has an exponential growth strategy that amortises the cost of appending).
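
The same pattern can be used in your own code. The following is a sketch only; descriptions() is a hypothetical helper, not a standard library method.

```swift
extension Sequence {
    // Hypothetical helper: the textual description of every element.
    func descriptions() -> [String] {
        var result: [String] = []
        // Reserve at least the lower bound up front; the array still
        // grows on demand if the sequence turns out to be longer.
        result.reserveCapacity(underestimatedCount)
        for element in self {
            result.append(String(describing: element))
        }
        return result
    }
}

let labels = (1...3).descriptions()
```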

However, in the case of a Collection, the default implementation of underestimatedCount actually just returns the collection's count:

public var underestimatedCount: Int {
    // TODO: swift-3-indexing-model - review the following
    return numericCast(count)
}

This is an O(1) operation for collections that conform to RandomAccessCollection, and O(n) otherwise.

Therefore, because of this default implementation, using a Collection's underestimatedCount directly is definitely less common than using a Sequence's, as Collection guarantees non-destructive iteration, and in most cases underestimatedCount will just return the count.
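
The two defaults can be compared directly. In this sketch EvenNumbers is a hypothetical infinite sequence that does not override underestimatedCount, so it falls back to the Sequence default.

```swift
// An infinite sequence that relies on the default underestimatedCount.
struct EvenNumbers: Sequence, IteratorProtocol {
    var current = 0

    mutating func next() -> Int? {
        defer { current += 2 }
        return current
    }
}

let array = [10, 20, 30]
let arrayLowerBound = array.underestimatedCount             // same value as array.count
let sequenceLowerBound = EvenNumbers().underestimatedCount  // Sequence default
```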

Although, of course, custom collection types could provide their own implementation of underestimatedCount – giving a lower bound of how many elements they contain, in a possibly more efficient way than their count implementation, which could potentially be useful.

Why does count return different types for Collection vs. Array?

From Associated Types in the Swift Programming Language:

When defining a protocol, it’s sometimes useful to declare one or more associated types as part of the protocol’s definition. An associated type gives a placeholder name to a type that is used as part of the protocol. The actual type to use for that associated type isn’t specified until the protocol is adopted. Associated types are specified with the associatedtype keyword.

In Swift 3/4.0, the Collection protocol defines five associated types
(from What’s in a Collection?):

protocol Collection: Indexable, Sequence {
    associatedtype Iterator: IteratorProtocol = IndexingIterator<Self>
    associatedtype SubSequence: IndexableBase, Sequence = Slice<Self>
    associatedtype Index: Comparable // declared in IndexableBase
    associatedtype IndexDistance: SignedInteger = Int
    associatedtype Indices: IndexableBase, Sequence = DefaultIndices<Self>
    ...
}

Here

    associatedtype IndexDistance: SignedInteger = Int

is an associated type declaration with a type constraint (: SignedInteger) and a default type (= Int).

If a type T adopts the protocol and does not define T.IndexDistance otherwise, then T.IndexDistance becomes a type alias for Int.
This is the case for many of the standard collection types
(such as Array or String), but not for all. For example

public struct AnyCollection<Element> : Collection

from the Swift standard library defines

    public typealias IndexDistance = IntMax

which you can verify with

let ac = AnyCollection([1, 2, 3])
let cnt = ac.count
print(type(of: cnt)) // Int64

You can also define your own collection type with a non-Int index distance if you like:

struct MyCollection : Collection {

    typealias IndexDistance = Int16
    var startIndex: Int { return 0 }
    var endIndex: Int { return 3 }

    subscript(position: Int) -> String {
        return "\(position)"
    }

    func index(after i: Int) -> Int {
        return i + 1
    }
}

Therefore, if you extend the concrete type Array then count
is an Int:

extension Array {
    func whatever() {
        let cnt = count // type is `Int`
    }
}

But in a protocol extension method

extension Collection {
    func whatever() {
        let cnt = count // some `SignedInteger`
    }
}

all you know is that the type of cnt is some type adopting the
SignedInteger protocol, but that need not be Int. One can still
work with the count, of course. Actually the compiler error in

    for index in 0...count { //  binary operator '...' cannot be applied to operands of type 'Int' and 'Self.IndexDistance'

is misleading. The integer literal 0 could be inferred as a
Collection.IndexDistance from the context (because SignedInteger
conforms to ExpressibleByIntegerLiteral). But a range of SignedInteger is not a Sequence, and that's why it fails to compile.

So this would work, for example:

extension Collection {
    func whatever() {
        for i in stride(from: 0, to: count, by: 1) {
            // ...
        }
    }
}

As of Swift 4.1, IndexDistance is no longer used, and
the distance between collection indices is now always expressed as an Int, see

  • SE-0191 Eliminate IndexDistance from Collection

In particular the return type of count is Int. There is a type alias

typealias IndexDistance = Int

to make older code compile, but that is marked as deprecated and will be removed
in a future version of Swift.
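
Assuming Swift 4.1 or later, the plain range loop therefore compiles inside a protocol extension. The sketch below uses a hypothetical helper named offsets() to show this.

```swift
extension Collection {
    // Hypothetical helper: the offsets 0, 1, 2, ... of the elements.
    func offsets() -> [Int] {
        var result: [Int] = []
        // count is an Int here, so 0..<count is a Range<Int>,
        // which can be iterated with for-in.
        for offset in 0..<count {
            result.append(offset)
        }
        return result
    }
}

let stringOffsets = "abc".offsets()
```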

How to count occurrences of an element in a Swift array?

Swift 3 and Swift 2:

You can use a dictionary of type [String: Int] to build up counts for each of the items in your [String]:

let arr = ["FOO", "FOO", "BAR", "FOOBAR"]
var counts: [String: Int] = [:]

for item in arr {
    counts[item] = (counts[item] ?? 0) + 1
}

print(counts) // "[BAR: 1, FOOBAR: 1, FOO: 2]"

for (key, value) in counts {
    print("\(key) occurs \(value) time(s)")
}

output:

BAR occurs 1 time(s)
FOOBAR occurs 1 time(s)
FOO occurs 2 time(s)

Swift 4:

Swift 4 introduces (SE-0165) the ability to include a default value with a dictionary lookup, and the resulting value can be mutated with operations such as += and -=, so:

counts[item] = (counts[item] ?? 0) + 1

becomes:

counts[item, default: 0] += 1

That makes it easy to do the counting operation in one concise line using forEach:

let arr = ["FOO", "FOO", "BAR", "FOOBAR"]
var counts: [String: Int] = [:]

arr.forEach { counts[$0, default: 0] += 1 }

print(counts) // "["FOOBAR": 1, "FOO": 2, "BAR": 1]"

Swift 4: reduce(into:_:)

Swift 4 introduces a new version of reduce that uses an inout variable to accumulate the results. Using that, the creation of the counts truly becomes a single line:

let arr = ["FOO", "FOO", "BAR", "FOOBAR"]
let counts = arr.reduce(into: [:]) { counts, word in counts[word, default: 0] += 1 }

print(counts) // ["BAR": 1, "FOOBAR": 1, "FOO": 2]

Or using the shorthand argument names:

let counts = arr.reduce(into: [:]) { $0[$1, default: 0] += 1 }

Finally you can make this an extension of Sequence so that it can be called on any Sequence containing Hashable items including Array, ArraySlice, String, and String.SubSequence:

extension Sequence where Element: Hashable {
    var histogram: [Element: Int] {
        return self.reduce(into: [:]) { counts, elem in counts[elem, default: 0] += 1 }
    }
}

This idea was borrowed from this question although I changed it to a computed property. Thanks to @LeoDabus for the suggestion of extending Sequence instead of Array to pick up additional types.

Examples:

print("abacab".histogram)
// ["a": 3, "b": 2, "c": 1]
print("Hello World!".suffix(6).histogram)
// ["l": 1, "!": 1, "d": 1, "o": 1, "W": 1, "r": 1]
print([1,2,3,2,1].histogram)
// [2: 2, 3: 1, 1: 2]
print([1,2,3,2,1,2,1,3,4,5].prefix(8).histogram)
// [1: 3, 2: 3, 3: 2]
print(stride(from: 1, through: 10, by: 2).histogram)
// [1: 1, 3: 1, 5: 1, 7: 1, 9: 1]
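
The printed dictionaries above appear in varying orders because Dictionary does not guarantee any ordering. If a stable order matters, one option is to sort the key-value pairs before printing. This is a sketch; the tie-breaking rule is an arbitrary choice made for the example.

```swift
let words = ["FOO", "FOO", "BAR", "FOOBAR"]
let counts = words.reduce(into: [String: Int]()) { $0[$1, default: 0] += 1 }

// Highest count first; ties are broken alphabetically by key.
let ordered = counts.sorted { lhs, rhs in
    lhs.value != rhs.value ? lhs.value > rhs.value : lhs.key < rhs.key
}

for (word, count) in ordered {
    print("\(word): \(count)")
}
```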

What type do I use for a Sequence of Strings

func doSomething<S: Sequence>(with seq: S) where S.Element == String

Or, as suggested in the comments by @LeoDabus, you could also accept substrings and other types conforming to StringProtocol by constraining the sequence’s element:

func doSomething<S: Sequence>(with seq: S) where S.Element: StringProtocol
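
A short usage sketch of the StringProtocol variant. The function totalLength(of:) and its body are hypothetical, chosen only to show that both String and Substring sequences are accepted.

```swift
// Hypothetical function: the total number of characters in all elements.
func totalLength<S: Sequence>(of seq: S) -> Int where S.Element: StringProtocol {
    return seq.reduce(0) { $0 + $1.count }
}

let strings: [String] = ["ab", "cde"]
let fromStrings = totalLength(of: strings)

// split(separator:) returns [Substring], which also satisfies the constraint.
let substrings = "a bc def".split(separator: " ")
let fromSubstrings = totalLength(of: substrings)
```

With the stricter constraint S.Element == String, the second call would not compile without first converting each Substring to a String.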

Swift: Does Array conform to the Sequence protocol?

You can find what protocols built-in types conform to in their documentation. There is a Relationships section at the bottom of their documentation pages.

If you check the docs of Array, you can see that it conforms to MutableCollection, which inherits from Collection, which inherits from Sequence. So yes, Array does conform to Sequence via its MutableCollection conformance.
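
You can also let the compiler confirm this: a generic function constrained to Sequence accepts an array argument. The function firstTwo is a hypothetical example.

```swift
// Accepts any Sequence and returns at most its first two elements.
func firstTwo<S: Sequence>(_ elements: S) -> [S.Element] {
    return Array(elements.prefix(2))
}

let fromArray = firstTwo([1, 2, 3])  // compiles because Array conforms to Sequence
let fromRange = firstTwo(5...9)      // other sequences work the same way
```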


