On Sun, 11 Dec 2016, Henrique Andrade wrote:
On 12 November 2016, Jack wrote:
>
> Of course some rough edges will have to be dealt with. For example, the
> output to
>
> a = strscrape("1.2.3")
>
> is, as of now,
>
> [1.2, 3]
>
> which of course makes sense but is only one of the possible conventions.
To avoid that kind of situation I suggest returning the numbers as an
array of strings, so the user can explicitly choose what to do with them.
Furthermore, using an array of strings as the return value of the function
can help users handle the "problem" of the decimal comma.
I don't really favor that. The decimal comma could be handled via an
optional second argument (boolean "comma", defaulting to false).
As for Jack's example, I think this can be improved by adding the
decimal separator to the list of candidate characters for starting a
number (provided it's immediately followed by a digit). Then
strscrape("1.2.3")
would return {1.2, 0.3}, which is, I think, fully defensible as a
left-to-right parsing.
One more thing: it would be great if we could have a function like
this to count the number of occurrences of a given string.
Something like this:
string S = "I like Gretl because Gretl is great and Gretl is complete"
scalar N = strscrapec(S, "Gretl") # strscrapec: string scrape counter
# result: N = 3
That sort of thing is easily done with existing string functions:
<hansl>
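# count the non-overlapping occurrences of p in s (p must be non-empty)
function scalar strcount (string s, string p)
    scalar n = 0
    scalar plen = strlen(p)
    loop while strlen(s) > 0 -q
        # jump to the next occurrence of p (empty string if none)
        s = strstr(s, p)
        if strlen(s) > 0
            n++
            # skip past the match just counted
            s += plen
        endif
    endloop
    return n
end function

string s = "I like Gretl because Gretl is great and Gretl is complete"
eval strcount(s, "Gretl")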
</hansl>
The scraping of numerical values, on the other hand, could not be done
(or at least, not at all conveniently) in hansl without additional
low-level functions -- equivalents of the C functions strcspn, strtod
and isdigit. We _could_ add those if we thought it worthwhile.
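For what it's worth, here is a rough sketch in C of the sort of
left-to-right scan described above, built on strcspn, strtod and isdigit.
This is not gretl's actual code: the name scrape_numbers and its signature
are made up for illustration. It treats a decimal point as a possible
start of a number only when a digit follows immediately, ignores signs,
and assumes the default "C" locale. Since strtod does the parsing, things
like exponents (1e5) and hex (0x1A) are consumed too.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of gretl: scan s from left to right and
   store up to max numeric values in out. Returns the number stored. */
size_t scrape_numbers(const char *s, double *out, size_t max)
{
    size_t n = 0;

    while (*s != '\0' && n < max) {
        /* skip ahead to the next character that could start a number */
        s += strcspn(s, "0123456789.");
        if (*s == '\0') {
            break;
        }
        /* a decimal point only counts if a digit follows immediately */
        if (*s == '.' && !isdigit((unsigned char) s[1])) {
            s++;
            continue;
        }
        /* strtod reads as much as it can and reports where it stopped */
        char *end;
        out[n++] = strtod(s, &end);
        s = end;
    }

    return n;
}
```

Called on "1.2.3" with room for at least two values, this stores 1.2 and
then 0.3 and returns 2, which is the convention suggested above.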
Allin