On Sat, 17 Oct 2020, Riccardo (Jack) Lucchetti wrote:
 On Sat, 17 Oct 2020, Artur Tarassow wrote:
> On 16.10.20 at 23:53, Allin Cottrell wrote:
>> I'm not documenting this yet because it needs more testing, but in current 
>> git I've enabled a new operation on string-valued series: logical product. 
>> (Snapshots to follow.)
>> 
>> If sv1 and sv2 are string-valued series, then
>>
>>    series sv3 = sv1 * sv2
>> 
>> now yields another string-valued series with value si_sj at observations 
>> where sv1 has value si and sv2 has value sj.
>> 
>> A simple example is afforded by the R dataset warpbreaks (csv version
>> attached). This includes two "factor" series, wool (with values "A" and
>> "B") and tension (with values "L", "M" and "H"). If you multiply them
>> together you get a series with 6 distinct values, "A_L" to "B_H".
> 
> This is pretty useful, Allin! I often face the "problem" that I have two or
> more string-valued series which need to be concatenated for creating a 
> unique identifier before one can set a panel structure.
 I concur: this is a very nice idea, very useful, and handled in a very 
 elegant way. I only have two remarks: a suggestion and a proposal.
 (a) we already use the "^" operator for performing what I see as a very 
 similar operation on lists. It would seem to me more consistent if we used 
 "^" instead of "*". 
Good point. I wasn't quite sure which operator symbol to borrow for 
this purpose but consistency with lists makes a good argument for 
'^'. I'll make that change.
 (b) why not extend this syntax to string arrays, or even vectors? Imagine
 how cool it would be to do something like
 <pseudo-hansl>
 s = defarray("a", "b")
 ss = s ^ seq(1,3)
 </pseudo-hansl>
 and have ss be a 6-element string array containing "a_1", "a_2" and so on.
Would be cool, yes. But to my mind the next priority for work on
string-valued data would be arranging for string values to be used 
in plots. For example, in a plot with the --dummy option we should 
clearly be setting x-tics with strings, not numbers, if the discrete 
x variable is string-valued.
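
For reference, the array behaviour proposed in (b) amounts to a cross
product of the two inputs. The following is only a Python sketch of those
semantics, not gretl code; the name array_product and its bridge parameter
are hypothetical, and the ordering (first array varying slowest) is an
assumption based on the "a_1", "a_2" example above.

```python
def array_product(strings, values, bridge="_"):
    """Pair every string with every value, first argument varying slowest."""
    return [f"{s}{bridge}{v}" for s in strings for v in values]

# Mirrors the pseudo-hansl above: defarray("a", "b") and seq(1,3)
ss = array_product(["a", "b"], range(1, 4))
print(ss)  # ['a_1', 'a_2', 'a_3', 'b_1', 'b_2', 'b_3']
```
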
> Also, let me go even a step further. A more flexible version may be a
> function with three arguments: "sv1", "bridge" and "sv2" where "bridge"
> (not the most ideal parameter name) may be optional allowing for a 
> user-defined bridging string such as underscore ("_") in the current 
> implementation.
 Or perhaps, the "bridge" character could be a libset variable. 
Yes, could be. To make the bridge an argument we'd have to implement 
this via a function rather than an operator.
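
To make the function-based variant concrete, here is a minimal Python
sketch of what such a function would compute, observation by observation.
It is an illustration of the semantics only, not gretl's implementation:
the name string_product is hypothetical, and plain lists stand in for
string-valued series. The default bridge is the underscore used in the
current implementation.

```python
def string_product(sv1, sv2, bridge="_"):
    """Join the values of two equal-length string sequences, pairwise."""
    if len(sv1) != len(sv2):
        raise ValueError("both series must have the same number of observations")
    return [f"{a}{bridge}{b}" for a, b in zip(sv1, sv2)]

# Toy values in the spirit of the warpbreaks factors (not the real data)
wool = ["A", "A", "B", "B"]
tension = ["L", "M", "H", "L"]

print(string_product(wool, tension))       # ['A_L', 'A_M', 'B_H', 'B_L']
print(string_product(wool, tension, "."))  # ['A.L', 'A.M', 'B.H', 'B.L']
```

With an operator there is no place to pass the third argument, which is
why the bridge would have to be either fixed or set globally.
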
And/or, a fairly simple generalization that could help here is to 
make strsub() and regsub() apply to string-valued series (and arrays 
of strings) as well as plain strings.
One other thought: R uses "." as bridge. At first I thought we 
couldn't do that since "." can't be used in a gretl identifier. But 
that was a confusion: the result is not supposed to be an 
identifier! So in the first instance I'm inclined to switch to dot 
as default bridge character, unless anyone sees a good reason not 
to.
Allin