swift-collections-performance

Mission
Idea
Environment
- Optimizations
Installation
Collection Types
- Standard vs clever collection types
Measurement
Results

Mission

Answer to the question: what collection type is the best to store enormous number of data?

A comprehensive answer to the question can be found in this blogpost.

Idea

Data layer for Scrabble dictionary application based on my experience and performance tests.

Polish dictionary has almost 3 millions of words and should allow in reasonable time:

validating words
returning words possible to create with given letters
simple regex

Comparison of:

different arrays (incl. optimized), sets and tries
compiler optimizations
syntax optimizations

Environment

Xcode 9.2
Swift 4.0
iPhone 8
iOS 11.2.5

Optimizations

Fast, Whole Module Optimization for the application's target (Release) and all dependencies
Run and Test set to Release build configuration

Installation

Clone the repository or download ZIP.
Install CocoaPods (if you don't have): sudo gem install cocoapods.
Run pod install in the project's root directory.

Collection Types

The repository presents four main types of collection:

Arrays:
- Standard Array,
- Array with reserved capacity that equals a count of strings,
- ContiguousArray,
- ContiguousArray with reserved capacity that equals a count of strings,
- CleverArray that is standard two-dimensional Array where arrays inside contains words with letters count that equals an index of array.
Sets:
- Standard Set,
- CleverSet that is standard Array of standard Sets where sets inside contains words with letters count that equals an index of set.
Tries:
- Modified Trie from Buckets-Swift by Mauricio Santos
Realm database:
- Standard Realm database with List of StringObjects,
- CleverRealm database with multiple Lists of StringObjects where each list contains words with different letters count.

Standard vs clever collection types

Measurement

All results, but RAM usage, were collected via measurement unit tests. RAM usage results were collected from Xcode / Debug navigator / Memory Report during running the sample app.

Results

Initialization time

A time needed to initialize an object.

Conclusions

Sets are much slower than arrays, because of object uniqueness.
CleverSet and Trie initialization time is unacceptable.
Both Realm and CleverRealm initialization time is unnoticeably, because of Realm laziness.

A time needed to validate correctness of these words one by one: "rękodzieło", "porównywać", "rodzicielski", "się", "powierzchnia", "substancja", "jeden", "dzień", "w", "kobieta", "narzędzie", "fałszywy", "pizza", "medycyna", "niematakiegoslowa".

Conclusions

Again, there is no big difference between the first four arrays.
Sets and trie are the fastest.
Non-optimized Realm database is faster than arrays, but CleverArray and CleverRealm seem to be golden mean (a time needed to initialized sets and Trie makes them unacceptable).

Words made from given letters

A time needed to find words that can be made from given letters:

a, p, s, t for four letters test,
a, d, p, s, t for five letters test.

Conclusions

There is a huge difference between 4 and 5 letters for standard Realm database.
Sets, trie, CleverArray and CleverRealm are really fast in this test, so I decided to run more tests for the fastest four of them.

_{^{Logarithmic scale, power trendline}}

a, d, e, p, s, t for six letters test,
a, d, e, p, r, s, t for seven letters test,
a, d, e, p, r, s, t, z for eight letters test.

Conclusions

Sets and trie growth when more letters is reasonable, but CleverRealm is useless for more than six letters.

Simple regex for blanks

A time time needed to find words that fit these simple blank regex one by one: "p?z?a", "???", "po??r", "pap????", "???????", "tele???".

In example: po??r should return pobór, poder, poker, pokór, polar, poler, polor, pomór, ponor, ponur, potir, power, pozer, pozór, pożar.

Conclusions

This is possible in reasonable time only for CleverArray, CleverSet, Trie, Realm and CleverRealm.
CleverRealm is the fastest, when Realm is the slowest.
Trie is faster than standard Swift collection types.

RAM Usage

RAM usage measured by Xcode profiler on real device running example app:

RAM Usage is RAM usage just after initialization.
Highest RAM Usage is the highest RAM usage noticed during initialization.

Conclusions

Trie and sets RAM usage is unacceptable high (it may cause crashes on older devices).
Arrays RAM usage is high, but acceptable. Unfortunately their performance is not satisfactory.
Realm RAM usage is close to null.

repoforks / swift-collections-performance Goto Github PK

swift-collections-performance's Introduction

swift-collections-performance's People

Watchers

Forkers

Recommend Projects

Recommend Topics

Recommend Org