Panel data - 2 - simple example of pooled cross section across years

I will try to explore panel data in a millennial way, and believe me, statistics and mathematics combined into econometrics are not as scary as they sound. Before we prepare your data in panel format, it is worth reading the relevant chapter of Wooldridge's book. Don't be scared!

Intro

Many surveys of individuals, families, and firms are repeated at regular intervals, often each year. An example is the Current Population Survey (or CPS), which randomly samples households each year. If a random sample is drawn at each time period, pooling the resulting random samples gives us an independently pooled cross-section.

One reason for using independently pooled cross-sections is to increase the sample size. By pooling random samples drawn from the same population, but at different points in time, we can get more precise estimators and test statistics with more power. Pooling is helpful only insofar as the relationship between the dependent variable and at least some of the independent variables remains constant over time.

Using pooled cross-sections raises only minor statistical complications. Typically, to reflect that the population may have different distributions in different time periods, we allow the intercept to differ across periods, usually years. This is easily accomplished by including dummy variables for all but one year, where the earliest year in the sample is usually chosen as the base year. It is also possible that the error variance changes over time, something we discuss later.
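
In symbols, a sketch of the model described above for T periods, with period 1 as the base. The notation d2, ..., dT for the period dummies and delta for their coefficients is my own and is not taken from any data set:

```latex
y_{it} = \beta_0 + \delta_2\, d2_t + \dots + \delta_T\, dT_t
       + \beta_1 x_{it1} + \dots + \beta_k x_{itk} + u_{it}
```

Here d2_t equals one when the observation comes from period 2 and zero otherwise, so beta_0 is the intercept in the base period and beta_0 + delta_t is the intercept in period t.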

Sometimes, the pattern of coefficients on the year dummy variables is itself of interest.

Example: a simple pooled cross-section over time that resembles panel data

First, Stata offers a great resource when you want to practice the code. You can find the source here:

https://www.stata.com/links/examples-and-datasets/

See also: let's say we want to use some of the data from Wooldridge's book. For chapter 10, see https://stats.idre.ucla.edu/other/examples/eacspd/

The fertility example from Wooldridge is in chapter 13 on this site:

http://fmwww.bc.edu/gstat/examples/wooldridge/wooldridge13.html

Try this
Simple panel data regression with one subject

For example, a demographer may be interested in the following question: After controlling for education, has the pattern of fertility among women over age 35 changed between 1972 and 1984? The following example illustrates how this question is answered using multiple regression analysis with year dummy variables.

The data set in the link below, which is similar to Sander (1992), comes from the National Opinion Research Center’s General Social Survey for the even years from 1972 to 1984, inclusive. We use these data to estimate a model explaining the total number of kids born to a woman (kids).

use http://fmwww.bc.edu/ec-p/data/wooldridge/fertil1
reg kids educ age agesq black east northcen west farm othrural town smcity y74 y76 y78 y80 y82 y84
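
The year dummies y74 to y84 are already in the file. As a sketch of an alternative, assuming the data set also carries a variable named year coded 72, 74, ..., 84 (worth confirming with tabulate year first), Stata's factor-variable notation builds the same dummies on the fly and omits the lowest value as the base:

```stata
* Assumption: fertil1 contains a variable "year" coded 72, 74, ..., 84
use http://fmwww.bc.edu/ec-p/data/wooldridge/fertil1, clear
tabulate year

* i.year expands into one dummy per year and drops the lowest value (the base year)
regress kids educ age agesq black east northcen west farm othrural town smcity i.year
```

If the assumption holds, the coefficients on the year levels should match those on y74 to y84 from the regression above.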

One question of interest is:

After controlling for other observable factors, what has happened to fertility rates over time? The factors we control for are years of education, age, race, region of the country where the woman was living at age 16, and living environment at age 16.

The base year is 1972. The coefficients on the year dummy variables show a sharp drop in fertility in the early 1980s. For example, the coefficient on y82 implies that, holding education, age, and other factors fixed, a woman had on average .52 fewer children, or about one-half a child, in 1982 than in 1972. This is a substantial drop: holding educ, age, and the other factors fixed, 100 women in 1982 are predicted to have about 52 fewer children than 100 comparable women in 1972.

Since we are controlling for education, this drop is separate from the decline in fertility due to the increase in average education levels. (The average years of education are 12.2 for 1972 and 13.3 for 1984.) The coefficients on y82 and y84 represent drops in fertility for reasons not captured in the explanatory variables. Given that the 1982 and 1984 year dummies are individually quite significant, it is not surprising that, as a group, the year dummies are jointly very significant: the R-squared for the regression without the year dummies is .1019, and this leads to F(6, 1111) = 5.87 and a p-value of essentially zero.
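
The joint F statistic can be recovered from the two R-squareds with the usual R-squared form of the F test. A worked sketch, using R-squared = .1295 from the regression with the year dummies, .1019 without them, q = 6 restrictions, and 1,111 residual degrees of freedom (the subscripts ur and r for "unrestricted" and "restricted" are my labels):

```latex
F = \frac{(R^2_{ur} - R^2_{r})/q}{(1 - R^2_{ur})/(n - k - 1)}
  = \frac{(.1295 - .1019)/6}{(1 - .1295)/1111}
  \approx \frac{.0046}{.000784}
  \approx 5.87
```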

  • Women with more education have fewer children, and the estimate is very statistically significant.
  • Other things being equal, 100 women with a college education will have about 51 fewer children on average than 100 women with only a high school education: .128(4) = .512.
  • Age has a diminishing effect on fertility. (The turning point in the quadratic is at about age = 46, by which time most women have finished having children.)
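
Both numbers above come straight from the coefficient estimates. As a worked sketch using the reported coefficients on educ, age, and agesq, where age* is just my label for the turning point of the quadratic:

```latex
\begin{aligned}
\Delta\widehat{kids} &= -.128 \times 4 = -.512 \\
age^{*} &= \frac{\hat\beta_{age}}{2\,|\hat\beta_{agesq}|}
         = \frac{.5321346}{2(.005804)} \approx 45.8
\end{aligned}
```

The first line is the predicted difference for four more years of education; the second rounds to the age of about 46 quoted above.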

The model estimated assumes that the effect of each explanatory variable, particularly education, has remained constant over time. This may or may not be true; Wooldridge leaves this issue to Computer Exercise C1. Finally, there may be heteroskedasticity in the error term underlying the estimated equation.

There is one interesting difference here: now, the error variance may change over time even if it does not change with the values of educ, age, black, etc.

The heteroskedasticity-robust standard errors and test statistics are nevertheless valid. The Breusch-Pagan test would be obtained by regressing the squared OLS residuals on all of the independent variables in the table below, including the year dummies. (For the special case of the White statistic, the fitted values of kids and the squared fitted values are used as the independent variables, as always.) A weighted least-squares procedure should account for variances that possibly change over time. In the procedure discussed in Section 8.4 of Wooldridge, year dummies would be included in equation (8.32).
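
A sketch of how these checks might be run in Stata. This is one possible workflow rather than the book's own code; it relies on the built-in vce(robust) option and the estat hettest postestimation command, and the variable names yhat, uhat, uhat2, and yhat2 are my own:

```stata
use http://fmwww.bc.edu/ec-p/data/wooldridge/fertil1, clear

* Plain OLS fit, used as the starting point for both tests
regress kids educ age agesq black east northcen west farm othrural town smcity y74 y76 y78 y80 y82 y84

* Breusch-Pagan: squared residuals on all regressors, year dummies included
* (rhs = use the right-hand-side variables, fstat = report the F version)
estat hettest, rhs fstat

* Special case of the White test: squared residuals on fitted values and their squares
predict double yhat, xb
predict double uhat, residuals
generate double uhat2 = uhat^2
generate double yhat2 = yhat^2
regress uhat2 yhat yhat2

* Same coefficients as OLS, with heteroskedasticity-robust standard errors
regress kids educ age agesq black east northcen west farm othrural town smcity y74 y76 y78 y80 y82 y84, vce(robust)
```

In the auxiliary regression on yhat and yhat2, the overall F statistic is the one to read for the special-case White test.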

      Source |       SS       df       MS              Number of obs =    1129
-------------+------------------------------           F( 17,  1111) =    9.72
       Model |  399.610888    17  23.5065228           Prob > F      =  0.0000
    Residual |  2685.89841  1111  2.41755033           R-squared     =  0.1295
-------------+------------------------------           Adj R-squared =  0.1162
       Total |   3085.5093  1128  2.73538059           Root MSE      =  1.5548

------------------------------------------------------------------------------
        kids |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |  -.1284268   .0183486    -7.00   0.000    -.1644286    -.092425
         age |   .5321346   .1383863     3.85   0.000     .2606065    .8036626
       agesq |   -.005804   .0015643    -3.71   0.000    -.0088733   -.0027347
       black |   1.075658   .1735356     6.20   0.000     .7351631    1.416152
        east |    .217324   .1327878     1.64   0.102    -.0432192    .4778672
    northcen |    .363114   .1208969     3.00   0.003      .125902    .6003261
        west |   .1976032   .1669134     1.18   0.237    -.1298978    .5251041
        farm |  -.0525575     .14719    -0.36   0.721    -.3413592    .2362443
    othrural |  -.1628537    .175442    -0.93   0.353    -.5070887    .1813814
        town |   .0843532    .124531     0.68   0.498    -.1599893    .3286957
      smcity |   .2118791    .160296     1.32   0.187    -.1026379    .5263961
         y74 |   .2681825    .172716     1.55   0.121    -.0707039    .6070689
         y76 |  -.0973795   .1790456    -0.54   0.587     -.448685    .2539261
         y78 |  -.0686665   .1816837    -0.38   0.706    -.4251483    .2878154
         y80 |  -.0713053   .1827707    -0.39   0.697      -.42992    .2873093
         y82 |  -.5224842   .1724361    -3.03   0.003    -.8608214    -.184147
         y84 |  -.5451661   .1745162    -3.12   0.002    -.8875846   -.2027477
       _cons |  -7.742457   3.051767    -2.54   0.011    -13.73033   -1.754579
------------------------------------------------------------------------------
test y74 y76 y78 y80 y82 y84

 ( 1)  y74 = 0.0
 ( 2)  y76 = 0.0
 ( 3)  y78 = 0.0
 ( 4)  y80 = 0.0
 ( 5)  y82 = 0.0
 ( 6)  y84 = 0.0

       F(  6,  1111) =    5.87
            Prob > F =    0.0000


Thanks for following along!
