Data Core

The data component of the IFH Survey/Data Core provides researchers with access to experienced, specialized analysts, to many large medical and administrative databases, and to a powerful HIPAA-compliant computing environment that facilitates productive, efficient, and sensitive research and training activities.

The IFH Survey/Data Core provides a range of research support services, including intellectual project support and analytical support. Staff are available for epidemiological and biostatistical analysis and programming of on-site, large healthcare and administrative databases, analysis of a user’s own databases, protocol and statistical analysis plan development, preliminary support of studies, power analysis calculations, data management activities including data cleaning and maintenance, and investigator support on project manuscripts and reports.

The Core boasts an extensive portfolio of data resources, including local, state, national, and international healthcare and administrative records for research. Among these data resources are Medicare, Medicaid, CPRD Gold and Aurum, and MarketScan. Each dataset is governed by its own Data Use Agreement (DUA) or contract, which dictates how the data can be used. Additionally, certain datasets require a fee for use in a new study (a data reuse fee), which is a pass-through cost imposed by the licensor of the data. Investigators need to inquire with the Core to determine whether a particular dataset is available and whether its DUA or contract permits use for their project. Typically, the licensor of the data will also need to agree to the use of the data. The Core will provide guidance and direction for processing any reuse paperwork with the licensors.

The HIPAA-compliant system is available for any study and is well suited to studies that use sensitive data resources. The system boasts 100 TB of high-speed storage space shared across all users, with an expansion currently planned, and can run high-performance analyses for major studies. Investigators can access the system remotely. The environment permits uploading and downloading of materials, including select data resources, after vetting by our IT staff.

IFH Survey / Data Core – Data Holdings

| Data Set | Years | Size (N/Year) | Location | Description | Additional Licensor Fee | Additional Approval Required? |
| --- | --- | --- | --- | --- | --- | --- |
| MarketScan – Q1 Commercial Claims and Encounters | 1996-2017 | Varies | USA | Patient-level claims data including drug and medical claims. | $30K (non-RU funding); $60K (commercial funding) | N/A |
| MarketScan – Commercial Claims and Encounters + Medicare Supplemental and Coordination of Benefits | 2008-2017 | Varies | USA | Patient-level claims data including drug and medical claims. | $30K (non-RU funding); $60K (commercial funding) | N/A |
| MarketScan – Multi-State Medicaid Database | 2013-2017- | Varies | USA | Patient-level claims data including drug and medical claims. | $30K (non-RU funding); $60K (commercial funding) | N/A |
| First Databank MedKnowledge | 2017-2018 | N/A | USA | Database that provides individual drug-level data on prescription medications. | N/A | N/A |
| CPRD Gold | 1986- | ~15M/yr | UK | Fully coded patient electronic health records from GP practices in the United Kingdom. Contains data contributed by practices using Vision® software. | Varies for non-RU users | X |
| CPRD Aurum | 1986- | ~7M/yr | UK | Fully coded patient electronic health records from GP practices in the United Kingdom. Contains data contributed by practices using the EMIS Web® electronic patient record software system. | Varies for non-RU users | X |
| Medicaid Analytic eXtract (MAX) | 1999-2000; 2001-2010; 2008-2014 | ~60M/yr; ~60M/yr; ~8M/yr | USA | National Medicaid data: 1999-2000 (5 states), 2001-2010 (45 states), 2008-2014 (select states). Includes personal summary file, inpatient file, prescription drug file, other services file, and long-term care file. | $2K | X |
| Medicaid Analytic eXtract (MAX) | 2014- | ~10M/yr | USA | National Medicaid data for 17 states with a 100% cohort. Includes personal summary file, inpatient file, prescription drug file, other services file, and long-term care file. | $2K | X |
| Medicare 100% | 2013-2016 | ~50M/yr | USA | National Medicare data with 100% coverage (sample with all NJ residents). Includes Part D event data, Master Beneficiary Summary File (MBSF), utilization claims data, EDB, Medicare Provider and Analysis Review (MedPAR), Home Health Outcome and Assessment Information Set (OASIS), Long-term Care Minimum Data Set (MDS), CASPER, Home Health Agency (HHA) claims, Hospice claims, and relevant crosswalks. | $2K | X |
| Medicare 20% | 2007-2016 | ~8M/yr; ~12M/yr | USA | National Medicare cohort equal to a 20% sample, with all NJ residents. Includes Part D event data, Master Beneficiary Summary File (MBSF), utilization claims data, EDB, and relevant crosswalks. | $2K | X |
| NJ Based Data | Varies | Varies | USA, NJ | Variety of NJ-based healthcare, insurance, claims, and administrative datasets. | Varies | X |