Thanks for the comments, Sven.
I did as you suggested and the results are not the same. The principal
components calculated using the original dataset with the sample restricted
and the ones calculated using a new dataset with missing values are
different.
So back to square one. Clearly one could use matrix operations to do the PCA
calculation directly, and then it would be easy to pick only the
observations that I want for my analysis -- but I am not too familiar with
gretl script syntax.
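Roughly, the matrix route I have in mind would look like the sketch below. It is untested: x1, x2 and x3 are placeholder series names, and it assumes there are no missing values inside the 2000:01 to 2011:12 range, since otherwise the matrix built from the list may have fewer rows than the sample. Eigenvector signs are arbitrary, and the scaling need not match what "pca" reports.

```
# Untested sketch; x1, x2, x3 are hypothetical series names
list X = x1 x2 x3
smpl 2000:01 2011:12                  # restrict to the estimation subsample
matrix M = {X}                        # data matrix built from the current sample only
matrix V = {}
matrix lam = eigensym(mcorr(M), &V)   # eigenvalues are returned in ascending order
matrix Z = cdemean(M) ./ sdc(M)       # standardize each column
matrix P = Z * V                      # component scores, one column per component
series PC1 = P[,cols(P)]              # largest eigenvalue is last, so take the last column
smpl full                             # PC1 should be missing outside the subsample
```

Since the series is created while the sample is restricted, it should simply be missing for 2012 once the full sample is restored.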
Any idea how I can handle that using simple gretl commands?
Thanks
-Paulo
On Wed, Dec 4, 2013 at 4:31 PM, Sven Schreiber <svetosch(a)gmx.net> wrote:
On 04.12.2013 19:16, Paulo Grahl wrote:
> Hello,
> I am struggling with a simple issue:
> I have a data set of monthly time series that spans 2000:01 to 2012:12.
> I want to run "pca" on a subset, from 2000:01 to 2011:12,
> so I change the sample using the "smpl ; 2011:12" command before
> running the "pca" command and saving all the principal components.
> But when I change back to the full sample I can see the principal
> components running through my whole sample -- so I assume "pca" runs
> on the full dataset and does not respect the "smpl" command.
>
> Is this the case? If so, how do I run "pca" in a subset of data?
>
I don't know if that's the case (it would look like a bug), but a
clunky workaround would be to save copies of the original data and set
all the post-2011m12 values to missing there. You can then also compare
the saved components with the first ones to see if there is indeed a
difference (there shouldn't be one if the sample selection is honored).
cheers,
sven
--
Dr. Paulo Gustavo Grahl, CFA