On Sat, 1 Jul 2017, Marcin Błażejowski wrote:
On 30.06.2017 at 22:54, Allin Cottrell wrote:
> Hmm. So the public function in BMA requires a dataset, but the private
> functions you're calling, under MPI, don't have such a requirement?
Yes, that is exactly my case.
>
> Then I suppose it's a feasible hack to drop the dataset requirement
> from the package specification, but that doesn't really seem right to
> me: a package should, ideally, honestly state whether it needs a
> dataset or not.
Yeah, I agree 100%.
> I've just added in git, an option --send-data for the mpi-block
> command, which works in the same sort of way as under "foreign"
> (except that in this case we can use gretl's native gdt/gdtb formats
> to send the data to gretlmpi). However, if you don't really need to
> send the dataset this may just waste CPU cycles in your case.
I see.
> Seems to me that you're trying to use gretl's "mpi" block in a way
> that was not intended. That leaves open the question: even if it was
> not originally intended, should it nonetheless be supported, somehow?
> Possibly, but I think it would be cleaner (if you want to support both
> MPI-enabled gretl and basic gretl) if you were to branch the package
> code early on, conditional on presence of MPI support: if so, do this
> (go into MPI mode right away); if not, do the other.
Let me explain my idea of incorporating MPI into BMA (and possibly into
other packages).
I start BMA in a standard gretl instance and let the user set options via
the GUI (or a script, it doesn't matter). Then I start MC3, which is a
random-walk Monte Carlo, so I can't compute it in parallel (OK, there is
a possibility to trigger two MC3 chains in parallel, but we (I mean me and
Jacek) have to work on this theoretically). That is why MC3 runs on just
one core. But if I have to employ the Gibbs sampler (inside the MC3 chain),
which can be computed in parallel (the draws are independent), I then use MPI.
After the Gibbs sampler job is finished I go back to MC3 with the results
and do the next MC3 iteration.
So the public function needs a dataset in place; the private functions used
in the Gibbs sampler do not (data is mpibroadcast).
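The control flow described above (a sequential outer chain that hands independent inner draws to parallel workers and then resumes) can be sketched roughly as follows. This is only an illustration in Python, not BMA's hansl code: gibbs_draws, run_chain, the toy update rule and all parameters are made up for the example, and a thread pool merely stands in for the MPI ranks.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def gibbs_draws(seed, n_draws, state):
    """Hypothetical stand-in for the Gibbs step: draws that are
    independent of each other given the current outer-chain state."""
    rng = random.Random(seed)  # private generator per worker
    return [rng.gauss(state, 1.0) for _ in range(n_draws)]


def run_chain(n_iter=5, n_workers=4, draws_per_worker=250, seed=42):
    """Sequential outer loop (like MC3) with a parallel inner step."""
    rng = random.Random(seed)
    state = 0.0
    path = []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(n_iter):
            # Sequential part: a random-walk proposal depends on the
            # previous state, so it cannot be parallelised.
            proposal = state + rng.gauss(0.0, 1.0)
            # Parallel part: each worker gets its own seed and the
            # (small) current state, not a whole dataset.
            seeds = [rng.randrange(2**32) for _ in range(n_workers)]
            chunks = pool.map(
                gibbs_draws,
                seeds,
                [draws_per_worker] * n_workers,
                [proposal] * n_workers,
            )
            draws = [d for chunk in chunks for d in chunk]
            # Back in the outer chain: toy update, NOT the real MC3 rule.
            state = 0.5 * state + 0.5 * (sum(draws) / len(draws))
            path.append(state)
    return path


if __name__ == "__main__":
    print(run_chain())
```

The point of the sketch is only that the workers receive what they need as arguments, which is why the inner step has no need for a dataset of its own.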
Allin, maybe to avoid sending the dataset over the MPI world we could drop
the data requirement after the MPI block is called and give the programmer
the opportunity to decide whether to send the data (via the --send-data
flag) or not.
In other words:
1. when we start the package, gretl respects the dataset requirement,
2. when we use "--send-functions" in the MPI block, gretl drops this
requirement and lets the programmer use (or not) "--send-data".
What do you think?
OK, if a package has been loaded in gretl/gretlcli (and therefore
its data requirement, if any, has been met initially), then we'll
waive the data-requirement check if/when its functions are called in
a gretlmpi instance spawned via an "mpi" block. That's now in git.
Allin