Project Details


DESCRIPTION (provided by applicant): Understanding regulatory networks controlling gene expression is one of the fundamental problems of modern biology. The proposed research focuses on the methods of locating regulatory elements in DNA. We have developed a new maximum likelihood method based on the physical DNA-dependent binding probability of a transcription factor (TF) that correctly incorporates the protein-concentration-dependent saturation effect. The advantage of keeping the saturation effect is that the method automatically provides a score threshold for classifying candidate sites into binders and non-binders. Most conventional methods, based on the information score, merely provide a relative ordering of candidate sequences. The principled choice of a threshold is extremely useful for dealing with the highly variable sites typical of global regulatory factors. The simplest of our algorithms reduces to a one-class support vector machine. This classifier will be applied to detect large regulons in E. coli, as well as in phages, with special attention to targets of sigma factors. We also develop classifiers for regulatory targets that go beyond pure sequence analysis and combine it with information from additional sources, like microarray expression data or sequence similarity between phylogenetically closely related species. The proposed computational effort will be complemented by experiments verifying the predictions as well as providing in vitro and in vivo data needed to make predictions. Experimental efforts will involve a high-throughput, low-stringency SELEX method applied to global transcriptional regulators from E. coli. They will also involve chromatin immunoprecipitation and beta-galactosidase assays performed in S. cerevisiae that test the ability of bioinformatic algorithms to predict the functionality of TF binding sites.
A special feature of this proposal is the analysis of the effect of the rest of the promoter on the regulatory potential of a site. We apply the lessons learnt in simple organisms to the elucidation of distinctive specificity of different NFkB proteins involved in immunity, inflammation and cancer. Mutation, over-expression and amplification of genes encoding transcription factors play an important role in many diseases from diabetes to cancer. Understanding how a factor targets genes is crucial for discovering the pathways whose malfunction leads to the symptoms. This is an achievable goal, given the right way to analyze the plethora of genome-wide data available to us.
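The description does not give the binding-probability formula itself. As a rough illustration only, the sketch below assumes the commonly used saturating (Fermi-Dirac-like) form p = 1 / (1 + exp((E - mu) / kT)), where E is an additive per-position binding energy and mu is a chemical potential set by the protein concentration. Under that assumption the natural score threshold is E = mu, where p = 0.5. The energy matrix, the value of mu, and all function names are hypothetical and are not taken from the project.

```python
import math

# Hypothetical additive energy matrix for a 4-bp site (units of kT).
# The consensus ACGT costs 0; every mismatch costs 1.0. Illustrative only.
CONSENSUS = "ACGT"
ENERGY_MATRIX = [
    {base: (0.0 if base == consensus_base else 1.0) for base in "ACGT"}
    for consensus_base in CONSENSUS
]


def binding_energy(site, energy_matrix):
    """Sum the per-position energy contributions for a candidate site."""
    return sum(energy_matrix[i][base] for i, base in enumerate(site))


def binding_probability(energy, mu, kT=1.0):
    """Saturating occupancy: close to 1 when energy << mu, close to 0 when energy >> mu."""
    return 1.0 / (1.0 + math.exp((energy - mu) / kT))


def is_binder(site, energy_matrix, mu):
    """Classify a site as a binder when its energy lies below the threshold mu."""
    return binding_energy(site, energy_matrix) < mu


if __name__ == "__main__":
    mu = 1.5  # assumed chemical potential, in units of kT
    for site in ("ACGT", "ACGA", "TTTT"):
        energy = binding_energy(site, ENERGY_MATRIX)
        prob = binding_probability(energy, mu)
        print(site, energy, round(prob, 3), is_binder(site, ENERGY_MATRIX, mu))
```

In this toy setting a site with one mismatch still counts as a binder while a site with three mismatches does not. A pure information-score ranking would order the same sites but would not say where to cut.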
Effective start/end date: 4/13/06 to 3/31/10


  • National Human Genome Research Institute: $294,338.00
  • National Human Genome Research Institute: $299,796.00
  • National Human Genome Research Institute: $302,050.00


  • Genetics
  • Molecular Biology

