Aggregate dataset of open data without identifying information
Chris Hartgerink, Richard Klein, Jelte Wicherts
This module contains a principal dataset collated from various open datasets that we previously identified as not containing identifying information. The principal dataset serves as a pseudo-population from which smaller sample datasets, likewise free of identifying information, can be drawn. In a next step, these sample datasets will be used to generate precision estimates (α and 1-α) for algorithms that check for identifying information in open data. The principal dataset shared here contains 30,251 rows and a maximum of 23 columns.
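
To illustrate how a pseudo-population can yield smaller sample datasets, the following is a minimal Python sketch, not the procedure used by the authors. It assumes simple random sampling of rows without replacement within each sample. The function name `draw_samples`, the sample sizes, the seed, and the placeholder population are all hypothetical; the placeholder only mirrors the stated dimensions (30,251 rows, and for simplicity all 23 columns in every row) and contains no real data.

```python
import random


def draw_samples(population, n_samples, sample_size, seed=0):
    """Draw n_samples simple random samples from population.

    Rows are drawn without replacement within each sample; separate
    samples are drawn independently, so they may share rows.
    """
    rng = random.Random(seed)  # seeded generator for reproducible draws
    return [rng.sample(population, sample_size) for _ in range(n_samples)]


# Hypothetical stand-in for the principal dataset: placeholder strings only.
# The real dataset would be loaded from the shared file instead.
N_ROWS, N_COLS = 30_251, 23
population = [[f"r{i}c{j}" for j in range(N_COLS)] for i in range(N_ROWS)]

# Assumed settings for illustration: 5 samples of 100 rows each.
samples = draw_samples(population, n_samples=5, sample_size=100, seed=42)
print(len(samples), len(samples[0]), len(samples[0][0]))  # prints: 5 100 23
```

Because every row in the pseudo-population is free of identifying information, any sample drawn this way is as well, which is what makes such samples usable as known-clean inputs when evaluating a detection algorithm.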