185 million people are waiting for your regression

One for researchers (like me) who get very excited at the thought of large microdata sets.

The Minnesota Population Center is pleased to announce the latest expansion of the IPUMS-International data series. In December 2006 we added 16 new samples for Belarus, Cambodia, Greece, the Philippines, Romania, Spain, and Uganda.  The data series now contains 185 million person records from 63 samples and 20 countries.  The new URL is http://international.ipums.org.

As the database grows, we constantly try to make it easier for researchers to use.  With this release we have improved the data system in several ways.  In addition to the 400 harmonized IPUMS variables, users can now view and extract over 5,000 unharmonized variables specific to the individual samples.  These have undergone differing degrees of processing, but they generally contain all of the useful information in the original samples that does not pose a confidentiality concern.  A second enhancement is the ability to filter all variable documentation based on user sample selections, limiting the information on the screen to what is relevant to the researcher.  Finally, some complex variables now have additional “general” versions utilizing only their first digit or two, making them easier to use.
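To make the last point concrete, here is a minimal sketch of how a "general" version might be derived from a detailed code by keeping only its leading digits. The function name, the three-digit code width, and the example values are all hypothetical; the announcement does not say how IPUMS actually builds these variables.

```python
def general_code(detailed_code: int, total_digits: int, keep_digits: int = 1) -> int:
    """Collapse a fixed-width numeric code to its leading digit(s).

    Assumption: detailed codes are non-negative integers of a known fixed
    width (total_digits), with implicit leading zeros for smaller values.
    """
    if not 0 < keep_digits <= total_digits:
        raise ValueError("keep_digits must be between 1 and total_digits")
    # Integer-divide away the trailing digits we do not want to keep.
    return detailed_code // 10 ** (total_digits - keep_digits)


# Hypothetical three-digit detailed codes collapsed to one and two digits.
detailed = [110, 234, 45]
one_digit = [general_code(c, total_digits=3) for c in detailed]
two_digit = [general_code(c, total_digits=3, keep_digits=2) for c in detailed]
```

A researcher who only needs broad categories could then tabulate or regress on the collapsed code rather than the full detailed one.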

IPUMS have been trying for years to get samples of the Australian census added to their collection, but the Australian Bureau of Statistics has consistently refused to hand the data over, on the grounds that it might threaten national security, or fade the curtains, or something. The result is that foreign academics, who would happily do research on Australia for free, won’t.


2 Responses to 185 million people are waiting for your regression

  1. derrida derider says:

    Very, very typical of the ABS.

  2. Sinclair Davidson says:

    When running regressions with 185 million observations, remember Lindley’s paradox.
