IFH Data Core Databases

The IFH Data Core provides researchers with access to experienced, specialized data analysts, multiple large national and New Jersey-based medical and administrative databases (see list below), and a powerful HIPAA-compliant computing environment that supports productive and efficient research and training activities, including work with sensitive data.

IFH Data Core staff are available to provide support with data management and analysis, including work with Core databases (see list below) and survey resources, work with a user’s own databases, and generation of preliminary data.

The IFH Data Core manages an extensive portfolio of database resources, including New Jersey-based, national, and international healthcare and administrative records for research (see list below). Some datasets come with their own regulatory requirements, such as Data Use Agreements (DUAs) or contracts with the data vendor, which dictate how the data can be used. Certain datasets also carry a data reuse fee for each new study, a pass-through cost imposed by the data licensor. The IFH Data Core will work with investigators to determine the feasibility and requirements for working with specific datasets, including the need for DUAs, contracts, or pass-through fees. The Core will also provide guidance for working with data vendors to secure data access.

The HIPAA-compliant Core computing system is available for any study and is particularly well suited to studies that use sensitive data resources. The high-performance computing platform includes ample high-speed storage space and accommodates complex data management and data analysis tasks for large studies. Investigators and staff can access the system remotely. Materials, including selected data resources, can be uploaded to and downloaded from the environment after vetting by IT staff.

IFH Data Core – Data Holdings

| Data Set | Years | Size (N/Year) | Location | Description | Additional Licensor Fee | Additional Approval Required? |
|---|---|---|---|---|---|---|
| MarketScan – Q1 Commercial Claims and Encounters | 1996-2023 | Varies | USA | Patient-level claims data including drug and medical claims. | $30K – non-RU funding; $60k – commercial funding | N/A |
| MarketScan – Commercial Claims and Encounters + Medicare Supplemental and Coordination of Benefits | 2008-2023 | Varies | USA | Patient-level claims data including drug and medical claims. | $30K – non-RU funding; $60k – commercial funding | N/A |
| MarketScan – Multi-State Medicaid Database | 2013-2023 | Varies | USA | Patient-level claims data including drug and medical claims. | $30K – non-RU funding; $60k – commercial funding | N/A |
| First Databank MedKnowledge | 2024-2025 | N/A | USA | Database that provides individual drug-level data on prescription medications. | N/A | N/A |
| CPRD Gold | 1986- | ~15M/yr | UK | Fully-coded patient electronic health records from GP Practices in the United Kingdom. Contains data contributed by practices using Vision® software. | Varies for non-RU-users | X |
| CPRD Aurum | 1986- | ~7M/yr | UK | Fully-coded patient electronic health records from GP Practices in the United Kingdom. Contains data contributed by practices using EMIS Web® electronic patient record software system. | Varies for non-RU-users | X |
| Medicaid Analytic eXtract (MAX) | 1999-2000; 2001-2010; 2008-2014 | ~60M/yr; ~60M/yr; ~8M/yr | USA | National Medicaid data: 1999-2000 (5 states), 2001-2010 (45 states), 2008-2014 (select states). Includes personal summary file, inpatient file, prescription drug file, other services file, and long term care file. | $2k | X |
| Medicaid Analytic eXtract (MAX)/ Transformed Medicaid Statistical Information System (TAF) | 2014-2020 | ~10M/yr | USA | National Medicaid data for 17 states with a 100% cohort. Includes personal summary file, inpatient file, prescription drug file, other services file, and long term care file. | $2k | X |
| Medicare 20% | 2007-2021 | ~8M/yr; ~12M/yr | USA | National Medicare cohort equal to 20% sample with all of NJ residents. Includes Part D event data, Master Beneficiary Summary File (MBSF), utilization claims data, EDB and relevant crosswalks. | $2k | X |
| NJ Based Data | Varies | Varies | USA, NJ | Variety of NJ based healthcare, insurance, claims, and administrative datasets. | Varies | X |