On 22.07.21 at 18:42, Sven Schreiber wrote:
Hi,
the replace() function takes as 2nd and 3rd arguments the mapping pairs
of values. What about enabling a little bit of syntactic sugar for the
case of a matrix with exactly 2 columns, standing in for the respective
vectors, making the 3rd argument optional in cases like the following
example:
matrix m = {3, 0.5; 2, 0.7} # arbitrary stuff for the example
series y = replace(x, m) # not working yet
Usually I like the idea of having a minimum number of arguments to pass
to a function. But I am not sure in this case.
Currently, the signature of replace() is:
replace(x, find, subst)
This is a very intuitive and hence easy-to-remember signature.
I find it disturbing to introduce an alternative signature for _the_ _same_
function, of the kind
replace(x, {find, subst})
where {find, subst} acts like a set. I guess the "Zen of Python" would say
"There should be one-- and preferably only one --obvious way to do it." :-D
It adds an additional layer to the function (the cost) for a very
limited benefit (having two instead of three arguments).
series y = replace(x, m) # not working yet
which would be equivalent to:
series y = replace(x, m[,1], m[,2])
Ok, this already works according to the doc as find and subst can be
vectors.
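For what it is worth, the two-argument form can already be emulated at the
script level. A minimal sketch, assuming a hypothetical wrapper named
replace2() (not part of gretl) that simply forwards the two matrix columns to
the existing replace():
<hansl>
# replace2() is a hypothetical helper, not a built-in gretl function
function series replace2 (series x, matrix m)
    if cols(m) != 2
        funcerr "m must have exactly two columns"
    endif
    # column 1 holds the values to find, column 2 their substitutes
    return replace(x, m[,1], m[,2])
end function

nulldata 4
series x = {3; 2; 3; 1}
matrix m = {3, 0.5; 2, 0.7}    # arbitrary stuff for the example
series y = replace2(x, m)      # expected values of y: 0.5, 0.7, 0.5, 1
print x y -o
</hansl>
This keeps the built-in signature untouched and leaves the shorthand to users
who want it.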
BTW, thinking about this, there seems to be some similarity to the
strsub() function. What I mean is that replace() could be overloaded to
nest the usage of strsub(), based on the type of the first argument. And
strings arrays might be supported as arguments as well. This is just a
general observation, I don't have a concrete need right now, but I guess
there would be use cases, replacing some handwritten loops.
True, strsub() has similarities to replace(). Having two functions that do
similar things is the kind of "historical cost" every piece of software
carries, I guess ;-)
Some more words on this, even though it drifts off topic a bit (sorry
for that):
--------------------------------------
This reminds me of a thought I had a couple of weeks ago: having an
apply() function which executes a function "element-wise". An element
could be a row, a column or an item in some array.
Thus, one might write:
<hansl>
strings input = defarray("A", "B", "C")
strings find = defarray("A", "B", "C")
strings subst = defarray("D", "E", "F")
apply(strsub(input, find, subst))  # proposed syntax, apply() does not exist yet
</hansl>
The apply function "knows" that it has to run through each element of
the string-array and execute "element-wise" the strsub() function.
This may be useful for the following reasons:
1. Calling apply() may be a hint for gretl that the separate steps are
independent, so that they can be parallelized (I am pretty sure this does
not always hold).
2. This may make it easier to add support for array-like data to
functions which currently do not support arrays, such as strsub() or
meanc() (for an array of matrices). Thus, at least as I imagine it, one
does not need to touch the C code of each separate function; instead there
is one abstract function apply() which makes sure that a function is applied
element-wise. For instance, take the strsplit() function, for which Allin
added support for string arrays a couple of weeks ago. For strsub()
Allin did not do so, I guess because there was no need for it yet, and of
course(!) this involves a lot of additional work.
In principle, this is similar to the apply function in pandas:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFra...
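To make the element-wise idea concrete, here is the kind of hand-written loop
that such an apply() would replace. This is only a sketch: it assumes that
"element-wise" means pairing the i-th entries of input, find and subst, and
the array name "out" is made up for the illustration:
<hansl>
strings input = defarray("A", "B", "C")
strings find = defarray("A", "B", "C")
strings subst = defarray("D", "E", "F")
strings out = array(nelem(input))
loop i=1..nelem(input)
    # pass the i-th element of each array to the plain-string strsub()
    out[i] = strsub(input[i], find[i], subst[i])
endloop
print out
</hansl>
With these inputs, out should hold "D", "E" and "F".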
Does replace() actually work already on a string-valued series? If not,
I guess it should?
In a string-valued series, each string is mapped to an integer identifier.
Changing the identifier values (integers) requires running stringify() again.
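<hansl>
clear
nulldata 3
strings S = defarray("A", "B", "C")
series y = seq(1,3)'
stringify(y, S)                # attach the strings in S to the codes 1, 2, 3
print y -o
series x = replace(y, 1, 2)    # replace code 1 by code 2
print y x -o
stringify(x, S)                # run stringify() again for the new series
print y x -o
</hansl>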
Artur