On 02.03.2021 at 17:04, Sven Schreiber wrote:
<hansl>
# now the real thing
web = readfile("https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/monthly/kl/historical/")
print web # very long string! (HTML source)
eval xmlget(web, "//a") # error: xmlParseMemory returned NULL
</hansl>
Does the error mean that the page's source is simply too long?
OK, it turns out the main problem has nothing to do with gretl's
xmlget function. The web page above uses the (valid) HTML 4 element <hr>,
which is not valid XML, so libxml chokes on it (and by
implication so does xmlget).
However, there's still something strange with the printout. If I run the
following short script:
<hansl>
clear
string web = readfile("https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/monthly/kl/historical/")
web = strsub(web, "<hr>", "")
# print web
eval xmlget(web, "//a")
</hansl>
then there's no error, but the "eval" line only produces about 40 lines,
with the last printed line truncated like this ("_hist.zip" is
missing):
monatswerte_KL_00183_19360101_20191231
If instead I uncomment the "print web" line above, then I get everything,
which is several hundred lines. Not sure why I need to print the
string first! This is with a recent snapshot.
But note: if I don't use "eval" but instead assign the xmlget result to a
string variable, everything seems fine, so no deep problem here.
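For completeness, a minimal sketch of the assignment variant just mentioned; the final printf/strlen line is only an illustrative check that the result is not truncated, not part of the original scripts:

<hansl>
# fetch the directory listing as a raw string
string web = readfile("https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/monthly/kl/historical/")
# drop the HTML-only <hr> element first, otherwise libxml refuses to parse
web = strsub(web, "<hr>", "")
# assign the xmlget result instead of using "eval" -- this avoids the
# truncated printout described above
string links = xmlget(web, "//a")
printf "extracted %d characters\n", strlen(links)
</hansl>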
thanks
sven