Challenge #2 README: Follow the Narcotics

In this challenge, your team will look for unusual patterns of narcotic prescribing in Medicare data.  Medicare is the US federal health insurance program.  It covers Americans 65 and older who paid taxes into the system while they were employed.  It also covers people under 65 with disabilities, people with kidney disease who need dialysis or a transplant, and people with the neurologic disease amyotrophic lateral sclerosis.  In this competition you will be given several files derived from a publicly available file describing which providers (doctors, nurses, and other medical professionals) prescribed which medications during 2013 that were paid for by Medicare.

You may use any clustering or pattern recognition methods of your choosing to identify providers who prescribe large quantities of narcotics (morphine, Dilaudid, oxycodone, etc.) and to identify their features (state, geographic location, specialty).  To complete this challenge, you will also need to describe the basis of your findings and display your results using innovative graphics.  The team with the strongest drug, provider, and prescribing identification methods and zero-day visualizations will win a $100 prize.


  1. Prescriber File:  The Challenge_2_dataset_row_annotations_providers.txt file contains the Medicare Part D providers who prescribed narcotic drugs. Values less than 11 are shown as 0.
  2. Medication Code Lookup File:  The Challenge_2_dataset_column_annotations_narcotics.txt file contains the drugs categorized as “narcotics”.
  3. Provider by Drug/Cost Files:  These files show which of the drugs identified in data set 2 were prescribed by the providers in data set 1. The Challenge_2_dataset_number_of_claims_matrix.txt file is a matrix giving the number of claims made by a particular provider (row) for a particular narcotic drug (column). The Challenge_2_dataset_cost_matrix.txt file is a matrix giving the total cost of claims made by a particular provider (row) for a particular narcotic drug (column).
    1. The file descriptions are in the Challenge_2_READ_ME.doc file.
  4. National Provider Identifier File:  The Centers for Medicare & Medicaid Services (CMS) has developed the National Plan and Provider Enumeration System (NPPES) to assign unique identifiers to health care providers. These National Provider Identifiers, or NPI numbers, are required for healthcare providers to be reimbursed by CMS for services. The NPPES files contain all of the FOIA-disclosable data for active and deactivated healthcare providers in the US.
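
As a starting point, the provider-by-drug claims matrix can be reduced to one total per provider and screened for outliers. The sketch below is only one possible approach, not the judges' method. It uses a modified z-score based on the median and the median absolute deviation (MAD), which is less distorted by a few extreme prescribers than a mean-based score. The NPI values, the toy matrix, the function name, and the 3.5 cutoff are all hypothetical. Loading the real files is left out because their exact layout is described in Challenge_2_READ_ME.doc rather than here.

```python
from statistics import median


def flag_high_prescribers(npis, claims, threshold=3.5):
    """Return the NPIs whose total claim count is a robust high outlier.

    npis   -- list of provider identifiers, one per matrix row
    claims -- list of rows; claims[i][j] is the number of claims by
              provider i for narcotic drug j
    """
    # Counts below 11 appear as 0 in the data, so each total is a lower bound.
    totals = [sum(row) for row in claims]
    med = median(totals)
    mad = median(abs(t - med) for t in totals)
    if mad == 0:
        return []  # no spread to measure outliers against

    flagged = []
    for npi, total in zip(npis, totals):
        # Modified z-score; 0.6745 rescales the MAD to match a standard
        # deviation when the data are roughly normal.
        score = 0.6745 * (total - med) / mad
        if score > threshold:
            flagged.append(npi)
    return flagged


# Hypothetical toy data: five providers (rows) by three drugs (columns).
npis = ["1000000001", "1000000002", "1000000003", "1000000004", "1000000005"]
claims = [
    [12, 0, 15],
    [0, 11, 14],
    [20, 0, 0],
    [13, 12, 0],
    [400, 350, 600],
]

flagged = flag_high_prescribers(npis, claims)
print(flagged)
```

The same function could be applied to the cost matrix, or to totals computed within a single specialty or state, so that providers are compared with their peers rather than with all prescribers.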

 Judging Criteria:

Teams need to upload their results by 12:00 PM to the link you will be given.

 Teams will be judged on three criteria: (1) Identification, (2) Presentation, and (3) Deep Magic.

Criterion #1:  Identification –    At the end of the competition, you will submit a list of NPI numbers for providers who prescribe a large number of narcotics in an unusual pattern.  We will compare this list to the results of a secret sauce algorithm that the Rochester Center for Health Informatics has developed.
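
"Unusual pattern" is not defined in this README. One possible reading is a drug mix that differs from what providers prescribe overall. The sketch below assumes that reading and scores each provider by the cosine distance between their row of the claims matrix and the column totals across all providers. The function name and the toy matrix are hypothetical, and this is not the Rochester Center for Health Informatics algorithm.

```python
import math


def drug_mix_distance(claims):
    """Cosine distance between each provider's drug mix and the overall mix.

    claims[i][j] is the number of claims by provider i for drug j.
    Returns one distance per provider, or None for a provider with no claims.
    """
    n_drugs = len(claims[0])
    # Overall mix: total claims per drug across all providers.
    overall = [sum(row[j] for row in claims) for j in range(n_drugs)]

    def cosine_distance(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        if norm_a == 0 or norm_b == 0:
            return None  # direction is undefined for an all-zero row
        return 1.0 - dot / (norm_a * norm_b)

    return [cosine_distance(row, overall) for row in claims]


# Hypothetical toy data: four providers (rows) by three drugs (columns).
toy_claims = [
    [50, 10, 0],
    [45, 12, 0],
    [0, 0, 60],   # prescribes only the third drug
    [0, 0, 0],    # no reported claims
]

distances = drug_mix_distance(toy_claims)
print(distances)
```

A distance near 0 means the provider's mix resembles the overall mix, and larger values mean a more distinctive mix. Because the overall mix is dominated by high-volume prescribers, comparing within a specialty may give a fairer baseline.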

Criterion #2:  Presentation –   You will put together a 1 + 5 slide presentation using the RocHackHealth Competition Template, and be judged on the clarity of your presentation of your method, findings, and conclusions.

Criterion #3:  Deep Magic –   How cuspy, wizardly, and beautiful your solution is.

Data Download Links:

NPPES files

Presentation Template:

The final presentation uses a 1 + 5 PowerPoint template. Only the first 5 data slides will be included in the judging.