International HapMap Project
 

Home | About the Project | Data | Publications | Tutorial

Guidelines for the Responsible Use and Publication of HapMap data are available here.

Browse data graphically

Use the Generic Genome Browser to view HapMap Project data in the context of other genomic features, as well as retrieve genotypes & frequencies for specific genomic regions.

Generate reports and extracts of data using HapMart.

Jump directly to a chromosome in the dataset.

Downloads

The following directories contain HapMap project-related data, software, and documentation that have been made publicly available (see the HapMap Data Access Policy for more information). More details about each dataset can be found in the READMEs in the respective directories:

ENCODE regions

Ten ENCODE regions are being studied by HapMap centers. The work includes genotyping all dbSNP SNPs in each region, as well as resequencing in several samples and genotyping the additional SNPs found. These regions were chosen by the Analysis Group and cover a range of chromosomes, recombination rates, gene densities, and values of non-transcribed conservation with mouse. For more information about the ENCODE Project, see the HapMap ENCODE Page; special genotype data dumps are available here.

Release notes

-- HapMap data release #23a, March 2008, on NCBI B36 assembly, dbSNP b126 --

This release contains genotypes and frequencies from the HapMap project, and
includes all data from phases I+II of the project. In addition, it contains
data from commercially available arrays, including the Affymetrix 6.0, 100k,
500k and nsSNP products and the Illumina Infinium 100k and 300k genotyping
arrays. For more details on those, please see the release notes for previous
HapMap releases.

Mapping errors for merged SNPs that were detected in rel23 have now been corrected
using BLAT software. The change in position affected 1,708 SNPs in CEU,
2,636 in JPT+CHB, and 4,669 in YRI. Complete lists of these SNPs can be found in:

http://ftp.hapmap.org/genotypes/2008-03/rs_strand/unfiltered/Mapping_error_*_r23a.txt

Additional inconsistencies that were detected and fixed for this release are grouped 
under the following categories:

- 2,990 SNPs with multiple alleles were removed from the QC+ filtered sets. For a
complete list of SNPs and featured allele (dbSNP b126), see 
Multiple_allele_errorlist_r23a.txt in the above ftp directory.

- 232 SNPs with strand orientation inconsistencies were removed from the QC+
filtered sets. For a complete list, see strand_flipped_error.txt in the above ftp 
directory.

- 39 SNP pairs with identical positions in dbSNP b126. These fall into special 
categories and may have been removed from the QC+ filtered sets as explained
below. For a complete list, see special_redundancies_QC+_r23a.txt in the above 
ftp directory. 

NOT REMOVED:
24 SNP pairs later merged into a single refSNP cluster (or rsid) in dbSNP b128
7 SNP pairs not yet merged into a single rsid as of dbSNP b128

REMOVED:
8 SNP pairs inconsistently mapped (i.e., one SNP maps to chrN_random)
4 SNP pairs with inconsistent allele entries (i.e., one SNP with multiple alleles)

- 108 SNPs on chrX, chrY or chrMT failed HapMap QC criteria implemented using PLINK 
software. These SNPs were kept in the current release but will be removed from the 
QC+ filtered sets in future HapMap releases. For a complete list of SNPs, see 
PLINK_errors_chrX_chrY_chrMT.txt in the above ftp directory. 
HapMap 'QC-' fail flags used in this list:

p =>  passrate < 80%
d =>  >1 duplicate discrepancy
h =>  Hardy-Weinberg p-value < 0.001 (calculated separately for JPT and CHB)
m =>  >1 Mendel inconsistencies
s =>  fail-flagged by submitter (but only if gt-set didn't fail any of the DCC-imposed thresholds above)
x =>  >= 1 heterozygous haplotype (male for chrX; either sex for chrMT)
y =>  >= 1 female genotype present on chrY
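
The flag letters above can be expanded into their descriptions with a small lookup. The following is a minimal sketch, not part of the HapMap tooling: it assumes the flags for a SNP arrive as a string of concatenated single letters (for example "ph"), which is a guess about the layout of the error list rather than a documented format. The function name describe_qc_flags is hypothetical.

```python
# Flag letters and meanings are taken from the release notes above.
QC_FAIL_FLAGS = {
    "p": "pass rate < 80%",
    "d": ">1 duplicate discrepancy",
    "h": "Hardy-Weinberg p-value < 0.001",
    "m": ">1 Mendel inconsistencies",
    "s": "fail-flagged by submitter",
    "x": ">= 1 heterozygous haplotype (male for chrX; either sex for chrMT)",
    "y": ">= 1 female genotype present on chrY",
}


def describe_qc_flags(flags: str) -> list[str]:
    """Expand a flag string into descriptions, one per letter.

    Assumption: `flags` is a run of single-letter codes such as "ph".
    The real file layout may differ, so adapt the parsing as needed.
    """
    descriptions = []
    for letter in flags.strip():
        if letter not in QC_FAIL_FLAGS:
            # Fail loudly rather than silently dropping an unknown code.
            raise ValueError(f"unknown QC fail flag: {letter!r}")
        descriptions.append(QC_FAIL_FLAGS[letter])
    return descriptions


if __name__ == "__main__":
    for text in describe_qc_flags("ph"):
        print(text)
```

Raising on an unknown letter is a deliberate choice here, so that a change in the flag vocabulary is noticed instead of being ignored.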


Genotyped SNPs (non-redundant set):

Chrom         CEU    JPT+CHB        YRI
---------------------------------------
chr1      307,691    311,854    305,929
chr10     209,342    211,862    204,146
chr11     204,228    205,538    195,110
chr12     191,979    193,071    187,294
chr13     155,905    158,406    152,674
chr14     123,071    123,764    118,518
chr15     106,814    107,363    102,431
chr16     109,692    109,734    104,530
chr17      89,701     89,576     85,541
chr18     119,118    120,025    115,768
chr19      56,607     56,687     53,766
chr2      326,231    327,180    318,602
chr20     119,921    119,989    115,921
chr21      50,165     51,900     49,154
chr22      54,786     56,716     54,840
chr3      255,391    255,618    250,155
chr4      244,849    245,102    238,922
chr5      247,632    248,154    242,186
chr6      268,348    272,814    265,955
chr7      213,023    213,891    208,708
chr8      213,095    216,811    212,014
chr9      181,445    183,433    180,147
chrMT         216        209        215
chrX      118,086    118,740    117,313
chrY          672        667        668
---------------------------------------
Total   3,968,008  3,999,104  3,880,507
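
To re-check the per-panel totals, the table can be parsed as whitespace-separated text. The sketch below is illustrative only: it assumes the table has been saved or pasted as plain text in the column order shown above (chromosome, CEU, JPT+CHB, YRI), and the function name sum_snp_counts is hypothetical. Only three rows are used here to keep the example short.

```python
def sum_snp_counts(table_text: str) -> dict[str, int]:
    """Sum the SNP counts per analysis panel from the plain-text table.

    Only rows whose first field starts with "chr" are counted, so the
    header, the ruler lines and the "Total" row are skipped.
    """
    totals = {"CEU": 0, "JPT+CHB": 0, "YRI": 0}
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not fields[0].startswith("chr"):
            continue
        # Counts use comma thousands separators, e.g. "307,691".
        for panel, value in zip(totals, fields[1:]):
            totals[panel] += int(value.replace(",", ""))
    return totals


# Three-row excerpt of the table above, not the full dataset.
excerpt = """\
Chrom       CEU   JPT+CHB   YRI
chr1     307,691  311,854  305,929
chr10    209,342  211,862  204,146
chr11    204,228  205,538  195,110
"""

if __name__ == "__main__":
    print(sum_snp_counts(excerpt))
```

Passing the full table text instead of the excerpt gives sums that can be compared with the Total row.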

--------------------------------------- 
help@hapmap.org


About

The HapMap Data Coordination Center (DCC) coordinates and manages project data flow, data storage, data release and presentation to the community. This includes managing the genotype database and this website. The DCC is operated by Lincoln Stein's group at Cold Spring Harbor Laboratory.

Last updated : index.html.en,v 1.19 2007/04/12 20:18:09 tellorui Exp


Please send questions and comments about this website to help@hapmap.org