Assembly: Human Dec. 2013 (GRCh38/hg38)
Data last updated at UCSC: 2022-05-25 03:04:17
UniBind: maps of high-confidence direct TF-DNA interactions across species
Description
This track hub provides the map of permissive direct TF-DNA
interactions (i.e., TFBSs) stored in the UniBind 2021
database.
UniBind is a comprehensive map of direct transcription factor (TF)-DNA
interactions across species. These interactions were obtained by uniformly
processing ~10,000 public ChIP-seq data sets using the ChIP-eat software. The
uniform processing up to ChIP-seq peak calling was performed by ReMap and GTRD, and the
entire collection of ChIP-seq peaks is also available on their respective websites. An entropy-based algorithm was used to automatically
delineate an enrichment zone containing direct TF-DNA interactions supported
by both strong computational evidence and strong experimental evidence. In addition, a quality control step was applied to each set of TF-DNA interactions to identify high-quality transcription factor binding sites (TFBSs), yielding two collections of TFBSs:
Permissive TFBSs: a collection containing all TFBSs available in UniBind.
Robust TFBSs: a collection containing only those TFBSs that passed the quality control. More details on the quality control metrics and the selected thresholds are described in our UniBind 2021 publication.
The UniBind database hosts the complete set of TFBS predictions, the prediction
model itself, and cis-regulatory modules (CRMs) derived from these direct
TF-DNA interactions. All data are publicly available. For further details,
please refer to the associated publications:
Individual BED files for specific TFs or datasets can be found and
downloaded on the UniBind website at http://unibind2021.uio.no.
Display Conventions and Configuration
Each set of TFBSs, derived from a specific ChIP-seq experiment and a specific
TF binding profile from JASPAR, is
named using the following format:
<GEO/ArrayExpress/ENCODE/GTRD identifier>_<cell type/tissue>-<condition>_<TF name>_<JASPAR ID>.<JASPAR version>
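The naming format above can be split back into its fields. The following is a minimal sketch, not part of UniBind: the function name `parse_track_name`, the `TrackName` type, and the example accession are hypothetical. It assumes that the data set identifier, the TF name, and the JASPAR profile contain no underscores, and it keeps the cell type/tissue and condition together as one field, because cell line names may themselves contain hyphens.

```python
from typing import NamedTuple


class TrackName(NamedTuple):
    identifier: str      # GEO/ArrayExpress/ENCODE/GTRD identifier
    sample: str          # "<cell type/tissue>-<condition>", left unsplit
    tf_name: str         # transcription factor name
    jaspar_id: str       # JASPAR matrix ID, e.g. "MA0139"
    jaspar_version: str  # JASPAR matrix version, e.g. "1"


def parse_track_name(name: str) -> TrackName:
    """Split a track name into its fields (hypothetical helper).

    Assumption: identifier, TF name and JASPAR profile contain no "_".
    """
    try:
        # The last two "_"-separated fields are the TF name and the profile.
        prefix, tf_name, profile = name.rsplit("_", 2)
        # The first "_" separates the identifier from the sample description.
        identifier, sample = prefix.split("_", 1)
        # The profile is "<JASPAR ID>.<JASPAR version>".
        jaspar_id, jaspar_version = profile.rsplit(".", 1)
    except ValueError as exc:
        raise ValueError(f"unexpected track name: {name!r}") from exc
    return TrackName(identifier, sample, tf_name, jaspar_id, jaspar_version)


# Made-up example name; the accession does not refer to a real data set.
parsed = parse_track_name("GSE00000_HepG2-untreated_CTCF_MA0139.1")
print(parsed.tf_name, parsed.jaspar_id, parsed.jaspar_version)
```

Parsing from the right is a deliberate choice here: the sample description is the least constrained field, so it is safer to peel off the well-structured fields at both ends first.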
Each transcription factor is assigned a specific RGB color according to the following table:
AHR
AR
ARID3A
ARID3B
ARNT
ARNTL
ASCL1
ASCL2
ATF2
ATF3
ATF4
ATF7
BACH1
BACH2
BATF
BATF3
BCL6
BCL6B
BHLHE22
BHLHE40
CDX2
CEBPA
CEBPB
CEBPD
CEBPG
CLOCK
CREB1
CREB3L1
CREM
CTCF
CTCFL
CUX1
DUX4
E2F1
E2F4
E2F6
E2F7
E2F8
EBF1
EBF3
EGR1
EGR2
EGR3
EHF
ELF1
ELF3
ELF4
ELF5
ELK1
ELK3
ELK4
EOMES
ERF
ERG
ESR1
ESR2
ESRRA
ETS1
ETS2
ETV1
ETV2
ETV4
ETV5
ETV6
FLI1
FOS
FOSL1
FOSL2
FOXA1
FOXA2
FOXA3
FOXD3
FOXE1
FOXJ2
FOXK1
FOXK2
FOXN3
FOXO1
FOXO3
FOXP1
FOXP2
FOXP3
GABPA
GATA1
GATA2
GATA3
GATA4
GATA6
GFI1
GFI1B
GLI2
GLIS1
GLIS2
GLIS3
GMEB1
GMEB2
GRHL2
HAND2
HES1
HES2
HEY1
HEY2
HIC1
HIF1A
HINFP
HLF
HMBOX1
HNF1A
HNF1B
HNF4A
HNF4G
HOXA9
HOXB13
HOXB7
HSF1
HSF2
IRF1
IRF2
IRF3
IRF4
IRF5
ISL1
JDP2
JUN
JUNB
JUND
KLF1
KLF10
KLF11
KLF12
KLF13
KLF14
KLF15
KLF16
KLF17
KLF3
KLF4
KLF5
KLF6
KLF9
LEF1
LHX2
LHX9
MAF
MAFB
MAFF
MAFK
MAX
MECOM
MEF2A
MEF2B
MEF2C
MEF2D
MEIS1
MEIS2
MGA
MITF
MIXL1
MLX
MNT
MXI1
MYB
MYBL2
MYC
MYCN
MYF5
MYOD1
MYOG
NANOG
NEUROD1
NEUROG2
NFATC1
NFATC3
NFE2
NFE2L1
NFE2L2
NFIA
NFIB
NFIC
NFIL3
NFKB1
NFKB2
NFYA
NFYB
NFYC
NKX2-5
NKX3-1
NR1H2
NR1H3
NR1H4
NR2C1
NR2C2
NR2F1
NR2F2
NR2F6
NR3C1
NR4A1
NR5A1
NR5A2
NRF1
OCT4
ONECUT1
ONECUT2
OSR2
OTX2
OVOL2
PAX5
PAX6
PAX7
PBX1
PBX2
PBX3
PDX1
PHOX2B
PITX3
PKNOX1
PLAG1
POU2F1
POU2F2
POU3F2
POU4F2
POU5F1
PPARG
PRDM1
PRDM4
PROX1
RARA
RARB
RARG
RBPJ
REL
RELA
RELB
REST
RFX1
RFX2
RFX3
RFX5
RUNX1
RUNX2
RUNX3
RXRA
RXRB
SCRT1
SCRT2
SIX1
SIX2
SMAD2
SMAD3
SMAD4
SMAD5
SNAI1
SNAI2
SOX10
SOX11
SOX13
SOX15
SOX17
SOX2
SOX4
SOX5
SOX6
SOX9
SP1
SP2
SP3
SP4
SPDEF
SPI1
SPIB
SREBF1
SREBF2
SRF
STAT1
STAT2
STAT3
STAT4
STAT5A
STAT5B
STAT6
T
TAL1
TBP
TBX1
TBX2
TBX21
TBX3
TBX5
TBXT
TCF12
TCF3
TCF4
TCF7
TCF7L1
TCF7L2
TEAD1
TEAD3
TEAD4
TFAP2A
TFAP2C
TFAP4
TFCP2
TFDP1
TFE3
TFEB
THAP1
THAP11
THRB
TP53
TP63
TP73
TWIST1
USF1
USF2
VDR
WT1
XBP1
YY1
YY2
ZEB1
ZFX
ZNF143
ZNF263
ZNF740
Methods
The entire collection of ChIP-seq data sets was uniformly processed in ReMap and GTRD up
to ChIP-seq peak calling. The resulting ChIP-seq peaks are also
available in the ReMap and GTRD databases. These peaks served as input for the ChIP-eat
data processing pipeline. The complete pipeline is designed to uniformly
process ChIP-seq data sets, from raw reads to the identification of direct
TF-DNA binding events, and it was implemented in the ChIP-eat software with
source code freely available at https://bitbucket.org/CBGR/chip-eat/. Only the
ChIP-seq data sets for which a TF binding profile for the targeted TF was
available in JASPAR were used for TFBS predictions. The enrichment zone
containing high-confidence direct TF-DNA interactions was automatically defined
for each data set using an entropy-based algorithm. The diagram below
illustrates the processing steps.
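To give a feel for what entropy-based thresholding means in general, here is a minimal sketch of one well-known technique, Kapur's maximum-entropy method, applied to a one-dimensional histogram. This is an illustration only: it is not taken from the ChIP-eat source code, the function name `max_entropy_threshold` and the toy histogram are hypothetical, and the algorithm actually used to delineate enrichment zones may differ (the ChIP-eat repository is the authoritative reference).

```python
import math
from typing import Sequence


def max_entropy_threshold(hist: Sequence[float]) -> int:
    """Return the bin index t that maximizes the summed Shannon entropy of
    the two classes hist[:t + 1] and hist[t + 1:] (Kapur's criterion)."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    probs = [h / total for h in hist]

    best_t, best_entropy = 0, -math.inf
    for t in range(len(probs) - 1):
        low, high = probs[: t + 1], probs[t + 1:]
        p_low, p_high = sum(low), sum(high)
        if p_low <= 0 or p_high <= 0:
            continue  # one class is empty, so this split is not meaningful
        # Entropy of each class after renormalizing it to sum to 1.
        h_low = -sum((p / p_low) * math.log(p / p_low) for p in low if p > 0)
        h_high = -sum((p / p_high) * math.log(p / p_high) for p in high if p > 0)
        if h_low + h_high > best_entropy:
            best_t, best_entropy = t, h_low + h_high
    return best_t


# A uniform toy histogram is split in the middle: prints 1
print(max_entropy_threshold([1, 1, 1, 1]))
```

The method is non-parametric: it needs no preset cutoff and derives the split from the data alone, which is the property that makes entropy-based approaches suited to processing many data sets automatically.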
Data Availability
Individual BED files for specific TFs or datasets can be found and
downloaded on the UniBind website at http://unibind2021.uio.no.
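Once a BED file has been downloaded, it can be read with a few lines of code. The sketch below is an assumption-laden example rather than an official UniBind tool: the names `BedRecord` and `read_bed` are hypothetical, and it assumes only that each line carries the six standard leading BED columns (chrom, start, end, name, score, strand) separated by tabs. Any additional columns are ignored, and the score is kept as text because its meaning is not specified here.

```python
import io
from typing import Iterable, Iterator, NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive (BED convention)
    name: str
    score: str   # kept as text; interpretation depends on the file
    strand: str


def read_bed(lines: Iterable[str]) -> Iterator[BedRecord]:
    """Yield one BedRecord per data line (hypothetical helper)."""
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines, comments, and optional track/browser headers.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 6:
            raise ValueError(f"expected at least 6 columns, got {len(fields)}")
        yield BedRecord(fields[0], int(fields[1]), int(fields[2]),
                        fields[3], fields[4], fields[5])


# Made-up in-memory example; with a real file, pass an open file handle.
example = io.StringIO("chr1\t100\t119\texample_site\t0\t+\n")
for record in read_bed(example):
    print(record.chrom, record.end - record.start, record.strand)
```

Because BED coordinates are half-open, the length of a site is simply `end - start`.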
Reference
If you use UniBind or ChIP-eat in your work, please cite: