Hi,
On 26.02.19 at 18:45, Riccardo (Jack) Lucchetti wrote:
I'm working with a panel dataset that includes a few string-valued
series, and I'm finding a few things annoying, so I'm sending this
message to the list with a two-fold purpose: (a) to hear you guys'
opinions and see if there's anything I'm missing and (b) as a reminder
to myself to work on these things as soon as possible.
- when you compute lags of a discrete variable, the "discrete" flag is
not propagated (I think it should); can anyone think of similar cases?
- similar to the above: lagging a string-valued variable does not
preserve labels. Example:
of course you can do "stringify(x_1, l)", but it's sort of annoying.
I stumbled over the same annoyance just recently.
- printing out an array of strings requires a loop if it has more than 9
elements. In some cases, this is VERY inconvenient. I see two ways to get
around this: either we introduce a "set" variable, which replaces the
hard-wired limit at 10 we have now (something like "set arrayprint 20")
or introduce an option to the print command ("--full" or similar). What
do you guys prefer?
What about introducing another function named head() as in Python's
pandas package? The user would call <head(x,n)> where
- x is either a series, list, string array or matrix
- n is an integer specifying to show the first 'n' entries (default: 5 or)
The output is printed in column-format.
Similarly, pandas has a tail() function that shows the last n
entries.
This would also be helpful when working with huge datasets where
printing output by <print x -o> already takes quite a while...
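To make the proposal concrete, here is a minimal sketch of the head()/tail() semantics described above. It is written in plain Python purely for illustration and is not gretl code; the function names, the default of 5, and the one-based position labels are all assumptions, and a real implementation would also have to handle series, lists and matrices.

```python
from typing import List, Sequence


def head(x: Sequence, n: int = 5) -> List[str]:
    """Return the first n entries of x, one labelled line per entry."""
    n = max(n, 0)  # treat negative n as "show nothing" (an assumption)
    return [f"[{i + 1}] {v}" for i, v in enumerate(x[:n])]


def tail(x: Sequence, n: int = 5) -> List[str]:
    """Return the last n entries of x, labelled with their original positions."""
    n = max(n, 0)
    start = max(len(x) - n, 0)  # clamp so short inputs are shown in full
    return [f"[{i + 1}] {v}" for i, v in enumerate(x[start:], start=start)]


# A string array with more than 9 elements, as in the case discussed above
names = [f"s{i}" for i in range(1, 13)]

print("\n".join(head(names, 3)))  # first three entries, one per line
print("\n".join(tail(names, 2)))  # last two entries, one per line
```

Because only n entries are formatted, the cost does not grow with the size of the dataset, which is the point for huge datasets.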
Best,
Artur