Hi all,
in a recent exercise I needed to store gzip-compressed CSV files.
Unfortunately, the "--gzipped" option of the "store" command is only
supported for the native format. As a workaround, I wrote a function that
runs a shell command to compress the files. However, I am wondering whether
compression could also be supported for CSV files.
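
A workaround of this kind can be sketched roughly as follows. This is only an
illustration in Python, using the standard gzip module instead of a shell
command; the function name gzip_csv and its parameters are hypothetical and
not part of the "store" command.

```python
import gzip
import shutil
from pathlib import Path


def gzip_csv(path, keep_original=False):
    """Compress one CSV file to <name>.gz and return the new path.

    Hypothetical helper for illustration only.
    """
    src = Path(path)
    dst = src.with_name(src.name + ".gz")  # data.csv -> data.csv.gz
    # Stream the file through gzip so large files are not loaded into memory.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    if not keep_original:
        src.unlink()  # remove the uncompressed file, as the gzip CLI does
    return dst
```

Called in a loop over the exported files, this yields one .csv.gz file per
CSV file.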
Some more background: I had to store _many_ CSV files and upload
them to an HDFS (Hadoop) file system. The data is queried with
Apache Hive or Impala. These tools work with either plain or gzipped CSV
(among other formats, such as the binary Apache Parquet format).
Unfortunately, files stored in the native binary format are not supported.
Best,
Artur