On Wed, 17 Apr 2019, Allin Cottrell wrote:
On Wed, 17 Apr 2019, Logan Kelly wrote:
> I have students who are working with very big datasets--around 9
> million observations. I had one student try to load a 4 GB CSV
> file into gretl, and gretl loaded it! But with some errors.
A general comment: In gretl, every data value is stored as a
"double" (a double-precision floating-point value, which occupies
64 bits or 8 bytes).
Let me be a bit more specific. Suppose we have a data file that is 4
GB in CSV form. In CSV every digit takes 1 byte, or 8 bits. Suppose
that a quarter of the series are actually floating-point values that
require 8 bytes apiece for accurate representation, and the other
three quarters are either 0/1 dummy variables or other small
integers, less than 256.
Then when you load the data into gretl, the small-integer values --
roughly one byte apiece in CSV -- each expand to an 8-byte double, a
factor of 8. So (1 + 3) GB -> (1 + 3*8) GB = 25 GB.
It's then quite possible that a computer with RAM sufficient to
handle the CSV file as such cannot handle gretl's in-memory
representation of the data.
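The back-of-envelope arithmetic above can be sketched as follows. This
is just an illustration of the stated assumptions (a 4 GB CSV, one
quarter genuine floating-point data, three quarters one-byte integers);
the share and size figures are not measurements.

```python
# Rough estimate of gretl's in-memory footprint for a CSV file,
# under the assumptions stated above (hypothetical figures).

CSV_TOTAL_GB = 4.0
FLOAT_SHARE = 0.25       # quarter of the data is genuine floating point
BYTES_PER_DOUBLE = 8     # gretl stores every data value as a C double

float_gb = CSV_TOTAL_GB * FLOAT_SHARE       # ~1 GB, roughly unchanged
int_gb = CSV_TOTAL_GB * (1 - FLOAT_SHARE)   # ~3 GB of 1-byte integers

# Each 1-byte integer in the CSV becomes an 8-byte double in memory.
in_memory_gb = float_gb + int_gb * BYTES_PER_DOUBLE

print(in_memory_gb)  # 25.0
```

The point of the sketch is that the expansion factor is driven almost
entirely by the small-integer columns, not by the floating-point ones.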
Allin