On Wed, Sep 28, 2022 at 7:38 AM Alison Loddick <Alison.Loddick@northampton.ac.uk> wrote:

>

> I have data with variables and years as columns, and my rows are companies, so my data variables look like

>

> Company name,ROE_year 1, ROE_year 2, ROE_year 3, ROE_year 4, Var2_year 1, Var2_year 2, Var2_year 3, Var2_year 4, Var3_year 1, Var3_year 2, Var3_year 3, Var3_year 4, Etc.

>

> I thought I could read this as cross sectional panel data, but it doesn’t work.

As Sven has said, this case is more awkward than the (more common) case for which gretl's stack() function is designed.

You may or may not like this, but I'd say that the best way of handling the data you describe is to convert the dataset to a matrix, use gretl's matrix manipulation facilities to rearrange the data in stacked time-series form, then convert back to a dataset.

Here's hansl code that does the job, taking as input a toy version of the setup you describe (attached as loddick.csv). You'll have to adjust a few things for your real case: the names of the variables and the dimensions T and nvars (N is picked up automatically from $nobs). You can of course delete the "printf" statements, which only give verbose feedback on the script's progress.

<hansl>

open loddick.csv

N = $nobs  # number of companies
T = 3      # number of years
nvars = 4  # number of variables (excluding company names)

strings conames = strvals(coname) # the company names series
delete coname

# convert the dataset into a matrix
matrix X0 = {dataset}

# add a matrix to hold the rearranged data
NT = N * T
matrix X1 = zeros(NT, nvars)

r0 = 0
loop i=1..N
    printf "original row %d\n", i
    col = 1
    loop j=1..nvars
        printf " variable %d\n", j
        loop t=1..T
            printf " observation %d\n", t
            X1[r0+t,j] = X0[i,col]
            col++
        endloop
    endloop
    r0 += T
endloop

# add the names of the variables to X1
cnameset(X1, "ROE V2 V3 V4")

# construct an "empty" panel dataset
nulldata NT --preserve
setobs T 1:1 --stacked-time-series

# populate with the content of matrix X1
list L = mat2list(X1)

# turn the "index" variable into company names
rename index company
company = $unit
stringify(company, conames)

# take a look at the data
print -o

# save the modified data
store loddick.gdt

</hansl>
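
If the explicit triple loop feels heavy, the same rearrangement can probably be done one variable at a time with matrix slicing. What follows is only a sketch under some assumptions: it uses a small made-up matrix in place of {dataset}, it assumes the columns are ordered as in your description (all T years of the first variable, then all T years of the second, and so on), and the names B, c0 and c1 are just illustrative. For each variable the N x T block of columns is transposed and passed to vec(), which stacks company 1's T values first, then company 2's, and so on.

```hansl
# toy input standing in for {dataset}: N = 2 companies,
# 2 variables, T = 3 years; columns are A_1 A_2 A_3 B_1 B_2 B_3
matrix X0 = {1, 2, 3, 10, 20, 30; 4, 5, 6, 40, 50, 60}

scalar T = 3
scalar N = rows(X0)
scalar nvars = cols(X0) / T   # assumes cols(X0) is an exact multiple of T

matrix X1 = zeros(N*T, nvars)

loop j=1..nvars
    scalar c0 = (j-1)*T + 1   # first column holding variable j
    scalar c1 = j*T           # last column holding variable j
    matrix B = X0[,c0:c1]     # N x T block for variable j
    X1[,j] = vec(B')          # stack by company: T rows per company
endloop

print X1
```

With the toy input, the first column of X1 should come out as 1, 2, ..., 6 and the second as 10, 20, ..., 60, that is, three consecutive rows per company. The remaining steps (cnameset, nulldata, setobs, mat2list and so on) would be the same as in the script above.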

Allin