Panel data - 1 - an Intro

What is Panel data?

Panel data are also called longitudinal data or cross-sectional time-series data. They consist of “observations on the same units in several different time periods” (Kennedy, 2008: 281).

What constitutes Panel data?

A panel data set has multiple entities, each of which has repeated measurements at different time periods. Panel data may exhibit individual (group) effects, time effects, or both, which are analyzed with fixed effect and/or random effect models. U.S. Census Bureau’s Census 2000 data at the state or county level are cross-sectional but not time-series, while annual sales figures of Apple Computer Inc. for the past 20 years are time series but not cross-sectional.

The cumulative Census data at the state level for the past 20 years are longitudinal. If annual sales data of Apple, IBM, LG, Siemens, Microsoft, Sony, and AT&T for the past 10 years are available, they are panel data. The National Longitudinal Survey of Labor Market Experience (NLS) and the Michigan Panel Study of Income Dynamics (PSID) data are both cross-sectional and time-series, because they follow the same respondents over time. The cumulative General Social Survey (GSS) and American National Election Studies (ANES) data are not, in the sense that the individual respondents vary across survey years.
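The distinctions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`classify`, `is_balanced`), the rule that a panel needs every entity observed in at least two periods, and the entity labels and years in the sample lists are all made up for this example and are not taken from the data sets mentioned above.

```python
from collections import defaultdict


def periods_by_entity(observations):
    """Map each entity to the set of periods in which it is observed."""
    seen = defaultdict(set)
    for entity, period in observations:
        seen[entity].add(period)
    return dict(seen)


def classify(observations):
    """Label a list of (entity, period) pairs by its data structure."""
    if not observations:
        raise ValueError("no observations given")
    by_entity = periods_by_entity(observations)
    all_periods = set().union(*by_entity.values())
    if len(all_periods) == 1:
        return "cross-section"        # many units, one point in time
    if len(by_entity) == 1:
        return "time series"          # one unit, many points in time
    # Assumed rule: every entity must be observed more than once.
    if all(len(periods) >= 2 for periods in by_entity.values()):
        return "panel"
    return "repeated cross-sections"  # different units in each period


def is_balanced(observations):
    """True if every entity is observed in exactly the same periods."""
    period_sets = list(periods_by_entity(observations).values())
    return all(periods == period_sets[0] for periods in period_sets)


# Hypothetical identifiers, chosen to mirror the examples in the text.
census_like = [("AL", 2000), ("AK", 2000), ("AZ", 2000)]
one_firm = [("Apple", year) for year in range(2001, 2021)]
firms = [(f, y) for f in ("Apple", "IBM", "Sony") for y in range(2011, 2021)]
survey_like = [("r1", 2018), ("r2", 2018), ("r3", 2020), ("r4", 2020)]

for name, data in [("census_like", census_like), ("one_firm", one_firm),
                   ("firms", firms), ("survey_like", survey_like)]:
    print(name, "->", classify(data))
```

The key check is on the identity of the units: the survey-like sample spans several years, but because no respondent appears twice it is treated as repeated cross-sections rather than a panel.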

The benefit of Panel data

As more and more panel data become available, many scholars, practitioners, and students have become interested in panel data modeling because these longitudinal data have more variability and allow exploring more issues than cross-sectional or time-series data alone (Kennedy, 2008: 282). As Baltagi (2001) puts it, “Panel data give more informative data, more variability, less collinearity among the variables, more degrees of freedom and more efficiency” (p.6). Given well-organized panel data, panel data models are attractive because they provide ways of dealing with heterogeneity and of examining fixed and/or random effects in the longitudinal data.
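To make "dealing with heterogeneity" concrete, here is a minimal sketch of one common approach, the within (entity-demeaning) transformation that underlies fixed effect estimation. Everything in it is an assumption made for illustration: the two entities "A" and "B", their numbers, and the helper names are hypothetical. The toy data are built so that y rises by 2 for each unit of x within an entity, while the entities have different baseline levels.

```python
from collections import defaultdict


def within_transform(rows):
    """Subtract each entity's own mean from its y and x values."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # sum of y, sum of x, count
    for entity, y, x in rows:
        t = totals[entity]
        t[0] += y
        t[1] += x
        t[2] += 1
    return [(entity, y - totals[entity][0] / totals[entity][2],
             x - totals[entity][1] / totals[entity][2])
            for entity, y, x in rows]


def slope(rows):
    """Least-squares slope of y on x for already-demeaned rows."""
    numerator = sum(y * x for _, y, x in rows)
    denominator = sum(x * x for _, y, x in rows)
    return numerator / denominator


# Made-up data: y = baseline + 2 * x, with baseline 10 for A and 0 for B.
rows = [("A", 12.0, 1.0), ("A", 14.0, 2.0), ("A", 16.0, 3.0),
        ("B", 8.0, 4.0), ("B", 10.0, 5.0), ("B", 12.0, 6.0)]

# Pooled estimate: ignore entities and demean by the overall mean only.
pooled_rows = within_transform([("all", y, x) for _, y, x in rows])
pooled_slope = slope(pooled_rows)

# Within estimate: demean by entity, which removes each baseline.
within_slope = slope(within_transform(rows))

print("pooled slope:", round(pooled_slope, 3))
print("within slope:", round(within_slope, 3))
```

In this constructed example the pooled slope comes out negative even though the relationship inside each entity is positive, because the entity baselines are mixed into the comparison. Demeaning by entity recovers the within-entity slope of 2. Real data will rarely be this clean, and whether such a model is appropriate still has to be judged case by case.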

How to prepare panel data?

Panel data modeling, however, is not as easy as it sounds. A common misunderstanding is that fixed and/or random effect models should always be employed whenever your data are arranged in the panel data format. The problems of panel data modeling, by and large, come from 1) the panel data themselves, 2) the modeling process, and 3) the interpretation and presentation of the results. Some studies analyze poorly organized panel data (in fact, the data are not longitudinal in a strict econometric sense), and others mechanically apply fixed and/or random effect models in haste without considering whether such models are relevant. Careless researchers often fail to interpret the results correctly and to present them appropriately.

