On 14.12.20 at 17:41, Allin Cottrell wrote:
On Mon, 14 Dec 2020, Artur Tarassow wrote:
[...]
> On 13.12.20 at 15:35, Allin Cottrell wrote:
>
>> Sorting a dataset: This was not optimized for a huge number of
>> observations. We were allocating more temporary storage than strictly
>> necessary and moreover, at some points, calling malloc() per
>> observation when it was possible to substitute a single call to get a
>> big chunk of memory. Neither of these points was much of an issue
>> with a normal-size dataset but they became a serious problem in the
>> huge case. That's now addressed in git.
>
> As I already wrote to you privately: this change is a real boost, as
> sorting time dropped from 14 to 7.5 seconds. Thanks for this!
>
> By the way, does this increased speed in sorting also affect the
> aggregate() function?
Right now it's specific to the case of sorting an entire dataset, but it
would be worth taking a look at the aggregate case too.
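
For readers who want to see what the malloc() point in the quoted message
amounts to, here is a minimal sketch in C. It is not gretl's actual code:
the sort_item struct and the function names items_alloc_each,
items_free_each, items_alloc_block and sort_items are all hypothetical,
made up for illustration. The first allocator calls malloc() once per
observation; the second gets one big chunk with a single call.

```c
#include <stdlib.h>

/* Hypothetical per-observation record: a sort key plus the original row. */
typedef struct {
    double key;
    int row;
} sort_item;

/* Strategy 1: one malloc() per observation, n + 1 calls in total. */
sort_item **items_alloc_each(size_t n)
{
    sort_item **items = malloc(n * sizeof *items);
    if (items == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        items[i] = malloc(sizeof **items);
        if (items[i] == NULL) {
            /* Roll back everything allocated so far. */
            while (i-- > 0) {
                free(items[i]);
            }
            free(items);
            return NULL;
        }
        items[i]->key = 0.0;
        items[i]->row = (int) i;
    }
    return items;
}

void items_free_each(sort_item **items, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        free(items[i]);
    }
    free(items);
}

/* Strategy 2: a single malloc() for all n records; release with free(). */
sort_item *items_alloc_block(size_t n)
{
    sort_item *items = malloc(n * sizeof *items);
    if (items == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        items[i].key = 0.0;
        items[i].row = (int) i;
    }
    return items;
}

/* Ascending comparison on the key, for use with qsort(). */
static int compare_items(const void *a, const void *b)
{
    const sort_item *ia = a;
    const sort_item *ib = b;
    return (ia->key > ib->key) - (ia->key < ib->key);
}

/* Sort a contiguous block of records by key. */
void sort_items(sort_item *items, size_t n)
{
    qsort(items, n, sizeof *items, compare_items);
}
```

With the single-block layout the records are contiguous, so they can be
passed straight to qsort() and released with one free(). With the
per-observation layout, allocator bookkeeping is repeated for every row,
which presumably only becomes noticeable once the number of observations
is very large.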
Hi all,
I've finalized the short documentation for this small project, including
Allin's response to the first version:
https://github.com/atecon/gretl_pandas_pypolars
Best,
Artur