On Fri, 12 Jul 2019, Riccardo (Jack) Lucchetti wrote:
On Fri, 12 Jul 2019, Allin Cottrell wrote:
> On Fri, 12 Jul 2019, Sven Schreiber wrote:
>
> > Something else: BTW, the guide mentions that --send-data is not
> > available with Ox, but is silent for the Python case. Actually Artur and
> > I are working (not too hard) on more tools for passing stuff to Python,
> > but enabling --send-data would also be nice. What Python/numpy functions
> > would you need to make this work?
>
> Basically just a CSV reading function -- and presumably a target
> structure that handles variable names, to make a distinction with just
> sending data in matrix form.
In fact, I was thinking about this some time ago: our --send-data apparatus
goes back to a time when we didn't have all the data types we have now,
particularly bundles. It would be really cool if we could pass bundles to
languages that support some variant of associative arrays, e.g. R, where they
call them lists, or Python, where they call them dictionaries (IIRC). That
would give us enormous flexibility. Of course, (a) we'd have to write
import-export functions for those languages and (b) we'd introduce some
dependencies in the target languages, but I reckon that, given the mechanism
we have for serialising a bundle as an xml file, that would be doable.
Proof-of-concept:
<hansl>
bwrite(defbundle("x", 42, "s", "foo"), "b.xml")
foreign language=R
library(XML);
b = xmlToList(xmlParse("b.xml"));
l <- length(b);
for (i in (1:l)) {
  type <- b[[i]]$.attrs["type"];
  payload <- b[[i]]$text;
  print(type);
  switch(type,
    scalar={print(as.numeric(payload))},
    string={print(payload)})
}
end foreign
</hansl>
Cool! Here's the same sort of thing for python, though I'm sure
there's a more compact way of doing it:
<hansl>
bwrite(defbundle("x", 42, "s", "foo"), "b.xml")
foreign language=python
from lxml import etree
from collections import defaultdict

# Recursively convert an element tree to nested dicts:
# attributes get an '@' prefix, element text goes under '#text'.
def etree_to_dict(t):
    d = {t.tag: {} if t.attrib else None}
    children = list(t)
    if children:
        dd = defaultdict(list)
        for dc in map(etree_to_dict, children):
            for k, v in dc.items():
                dd[k].append(v)
        d = {t.tag: {k: v[0] if len(v) == 1 else v
                     for k, v in dd.items()}}
    if t.attrib:
        d[t.tag].update(('@' + k, v)
                        for k, v in t.attrib.items())
    if t.text:
        text = t.text.strip()
        if children or t.attrib:
            if text:
                d[t.tag]['#text'] = text
        else:
            d[t.tag] = text
    return d

with open('b.xml', 'r') as f:
    xml = f.read()
# Skip the first line of the file (the XML declaration) before parsing.
body = '\n'.join(xml.split('\n')[1:])
tree = etree.fromstring(body)
d = etree_to_dict(tree)
print(d)
end foreign
</hansl>
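For what it's worth, a more compact variant might look something like the sketch below, using only the standard library's xml.etree.ElementTree instead of lxml. It handles just the two member types in the example (scalar and string). The "type" attribute and the text payload are what the R snippet above reads; the assumption that each member's name sits in a "key" attribute, the sample XML, and the function name bundle_to_dict are all guesses made for illustration and should be checked against a real b.xml.

```python
import xml.etree.ElementTree as ET


def bundle_to_dict(root):
    """Map a parsed bundle element to a plain dict (scalars and strings only)."""
    out = {}
    for item in root:
        # ASSUMPTION: the member name is stored in a "key" attribute.
        key = item.get("key")
        kind = item.get("type")
        text = (item.text or "").strip()
        if kind == "scalar":
            out[key] = float(text)
        else:
            # Strings, and any type this sketch does not handle, stay as text.
            out[key] = text
    return out


# Hypothetical sample standing in for the contents of b.xml.
sample = """<gretl-bundle name="b">
<bundled-item key="x" type="scalar">42</bundled-item>
<bundled-item key="s" type="string">foo</bundled-item>
</gretl-bundle>"""

d = bundle_to_dict(ET.fromstring(sample))
print(d)  # {'x': 42.0, 's': 'foo'}
```

With an actual file one would presumably call bundle_to_dict(ET.parse("b.xml").getroot()); parsing from the file avoids having to strip the XML declaration by hand.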
Allin