Data Analytics Corp. wrote:
> Hi,
> Suppose I have a data set of 200 cars where the observation labels are
> the make (Chevy, Honda, BMW, etc.). Say there are 10 makes. I want to
> create a variable, say ID, that identifies the make so I don't have to
> rely on the observation label. The purpose is to later create dummies
> for the makes to include in a regression model and create boxplots (one
> for each make's price, all shown in one plot window) of price. How can
> I do this? Or is it better to just use the observation labels and
> laboriously write if statements?

AFAIK, in gretl terminology "observation labels" are meant to identify
individual data points, not to hold a defining characteristic such as
the make, which is effectively a variable. So, starting from a
spreadsheet file, if you put the "make" variable in its own column (with
character string values), then gretl should automatically code it with
numerical values and notify you when you import that file (in .csv
form). Then it seems to me that you will have what you need: you can
restrict the sample to those observations where the ID equals a certain
numerical code. If you want, you can also create dummies like so:
series hondadummy = (ID == 5) # example with code 5 for Honda
This can also be done for all makes in one step in the GUI, I think:
select the variable and try Add -> create dummies for selected discrete
variables (the series has to be marked as discrete for this).
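For the scripting route, a minimal sketch along these lines might cover
both the dummies and the boxplots. It assumes the file is called
cars.csv (a made-up name), that the numerically coded make series ended
up being called ID, and that the price series is called price; adjust
the names to your data.

```hansl
open cars.csv                  # hypothetical file name
discrete ID                    # mark the coded make series as discrete
list MAKES = dummify(ID)       # one dummy per make, lowest code omitted
ols price const MAKES          # regression including the make dummies
boxplot price ID --factorized  # one boxplot of price per make, one plot
```

Since dummify() leaves out one category by default, the constant can
stay in the regression without running into the dummy trap.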
HTH,
sven