Comments (2)
I rewrote your script to make very simple benchmarks. I'm posting it here in case we need it later:
```php
<?php

declare(strict_types=1);

namespace App;

use Closure;
use loophp\collection\Collection;

include __DIR__ . '/vendor/autoload.php';

// Benchmark 1: merge two ranges and deduplicate with the Collection library.
$bench1 = function (int $end, int $count) {
    $collection1 = Collection::range(start: 1, end: $end + 1);
    $collection2 = Collection::range(start: 1, end: (5 * $end) + 1);
    $result = $collection1->merge($collection2)->distinct()->all();
    assert($count === count($result));
};

// Benchmark 2: the same work with the native array functions.
$bench2 = function (int $end, int $count) {
    $collection1 = range(start: 1, end: $end);
    $collection2 = range(start: 1, end: 5 * $end);
    $result = array_unique(array_merge($collection1, $collection2));
    assert($count === count($result));
};

// Runs each benchmark with the same arguments and prints time and memory deltas.
$benchmark = function (array $benchmarks, mixed ...$arguments): void {
    $bench = function (Closure $closure, mixed ...$arguments): void {
        $start = microtime(true);
        $memoryStart = memory_get_usage();
        $closure(...$arguments);
        $memory = memory_get_usage() - $memoryStart;
        $total = microtime(true) - $start;

        echo 'Time: ' . $total . PHP_EOL;
        echo 'Memory: ' . $memory . PHP_EOL;
    };

    foreach ($benchmarks as $benchmark) {
        $bench($benchmark, ...$arguments);
    }
};

$benchmark([$bench1, $bench2], 5000, 25000);
```
And yes, the `Distinct` operation is slow, and there are several reasons for this.
Firstly, implementing the distinct operation in userland requires maintaining a history of all the items processed, which inherently adds overhead. Secondly, since the `Collection` library works with both keys and values, the data storage requirement is effectively doubled.
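To make the "history" point concrete, here is a minimal sketch of a lazy distinct written in plain PHP. It is not the library's actual implementation: `lazyDistinct` is a hypothetical helper, and the linear `in_array` lookup is an assumption chosen so that any value type can be compared.

```php
<?php

/**
 * Hypothetical helper, not the library's code: lazily yields each
 * key/value pair whose value has not been seen before.
 */
function lazyDistinct(iterable $items): Generator
{
    $seen = []; // history of every distinct value met so far

    foreach ($items as $key => $value) {
        // Strict linear scan, so objects and arrays can be compared too.
        if (in_array($value, $seen, true)) {
            continue;
        }

        $seen[] = $value;

        yield $key => $value;
    }
}

$result = [];
foreach (lazyDistinct([1, 2, 2, 3, 1]) as $value) {
    $result[] = $value;
}
// $result is [1, 2, 3]
```

In this sketch the history grows with the number of distinct items, and every new item is checked against all of it, which is where the overhead comes from.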
Moreover, the `Collection` library itself is inherently slow. My primary goal was not to optimize for performance but to create a functional library that meets my needs. I aimed for a lazy evaluation library with well-defined functions in their respective files to facilitate algorithm improvements and encourage easy collaboration from contributors.
A significant reason for the slowness is that `Collection` can handle any type of iterables (arrays, iterators, objects) with any type of keys (objects, booleans, arrays) and any type of values. This flexibility incurs a performance cost because every operation must account for both the keys and the values.
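As an illustration of why keys cannot simply be ignored, the sketch below uses a plain PHP generator (the function name `pairsWithExoticKeys` is made up for this example) that yields keys a PHP array could not store, plus a duplicate key.

```php
<?php

// Hypothetical generator: iterators may yield keys of any type.
function pairsWithExoticKeys(): Generator
{
    yield new stdClass() => 'object key'; // not a valid array key
    yield ['a', 'b'] => 'array key';      // not a valid array key
    yield 'same' => 'first';
    yield 'same' => 'second';             // duplicate key, legal in an iterator
}

$keyTypes = [];
$values = [];
foreach (pairsWithExoticKeys() as $key => $value) {
    $keyTypes[] = get_debug_type($key);
    $values[] = $value;
}
// $keyTypes is ['stdClass', 'array', 'string', 'string']
// $values holds all four items, including both 'same' entries
```

Any code that wants to support such input has to carry each key alongside its value instead of relying on array indexing.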
Comparing `Collection` (a PHP implementation) to `array_*` (a C implementation) is somewhat nonsensical. `array_*` will always be faster, but it is limited to handling only arrays.
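The "only arrays" limitation can be seen with a small sketch, assuming PHP 8 (which the benchmark script above already requires). The generator `numbers` is a made-up example input.

```php
<?php

// Hypothetical lazy source of values.
function numbers(): Generator
{
    yield from [1, 2, 2, 3];
}

// array_unique() rejects anything that is not an array.
$typeError = false;
try {
    array_unique(numbers());
} catch (TypeError $e) {
    $typeError = true;
}

// The iterator has to be fully materialized into an array first.
$unique = array_values(array_unique(iterator_to_array(numbers(), false)));
// $unique is [1, 2, 3]
```

The conversion step loads every item into memory at once, which is the trade-off against a lazy pipeline.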
The project is open-source, and you're free to improve the algorithm if you think there is room for improvement. Contributions are very welcome.
Since this issue has not had any activity within the last 5 days, I have marked it as stale.
I will close it if no further activity occurs within the next 5 days.