- Jonathan D. Wren, Ph.D.
- Research Assistant Professor
- University of Oklahoma
- Advanced Center for Genome Technology
- Department of Botany and Microbiology
- 101 David L. Boren Blvd., Rm. 2025
- Norman, OK 73019
- Phone: (405)325-3415
- Fax: (405)325-3442

- Research
- I have recently taken a position at the Oklahoma Medical Research Foundation, and am not sure how much longer this webpage will stay up. Since I haven't set one up there yet, I will continue to update this one. My OU email should remain active until October 2007, and I will continue to check it regularly in case anyone needs to get in touch with me.
- My area of research is bioinformatics, which, briefly defined, is the application of information-related sciences to problems in biology and medicine, and which is becoming more prevalent across research fields (36). My research largely focuses on developing methods of obtaining, integrating and modeling biomedical data within an informational framework, so that computers can play a greater role in the automated discovery of new knowledge.
It used to be that simply getting sufficient data was the biggest challenge, but modern high-throughput technologies now give us such floods of data that the biggest challenge is getting the most out of it. Informatics methods enable us to integrate more observations into the scientific questions we are trying to address experimentally - observations we might not be aware of were we to rely solely upon our own knowledge, which becomes increasingly specialized and limited as the total amount of available information grows. The scientific literature is probably the richest and most diverse source of this information, and I envision that much of the work done so far in text-mining is paving the way towards more sophisticated artificial intelligence methods (13).
- Professional Affiliations
- Currently, I'm an Associate Editor for Bioinformatics, the field's leading journal, and have been a member of the International Society for Computational Biology (ISCB) since 1999. Despite the success of this large international society, there are a number of advantages to having regional ones (16). As such, I've been on the Board of Directors for the Midsouth Computational Biology and Bioinformatics Society (MCBIOS) since 2003, and have served as editor/co-editor for their 2004 and 2006 conference proceedings (22, 33). I'm also president of the Oklahoma Bioinformatics Society (OKBIOS). We held our first symposium, OKBIOS 2004, in the Stephenson Center in Norman on November 12, 2004. OKBIOS 2004 was pretty successful, with approximately 150 people attending, and was held back-to-back with another symposium on the exciting and emerging field of Synthetic Biology. OKBIOS 2005 also went well. Our latest plans are to merge OKBIOS 2008 with MCBIOS 2008 in Oklahoma City.
- Past Projects
- My interests have been somewhat diverse. I've worked on a variety of projects, beginning with image enhancement for sequence data (1) and moving on to methods of predicting polymorphic sequence repeats within mammalian genomes (2). Our predictive methods turned out to be surprisingly accurate, so we refined our rule set and applied it more specifically to polymorphisms in transcribed regions (P1, 3) and studied somatically acquired polymorphisms in cancer cells (4). A colleague and co-author of mine recently discovered that these repetitive regions in genes enable rapid evolutionary "tweaking" of quantitative parameters such as the length of a dog's snout, providing an important new chapter in the book on evolution. I've developed a freeware package for basic sequence analysis and manipulation (5), conducted a study on microarray cross-hybridization (6), aided in the design of microarray analysis tools (7) and designed a set of acronym recognition heuristics to aid in literature-based text analysis (8), which led to the creation of a near-comprehensive biomedical acronym database (the top hit in a Google search, and recently featured in the Netwatch section of Science (8a)). This ability to map terms to their symbols within the literature (e.g., acronyms to definitions) is important not only in text mining but for nomenclature standardization as well, so in collaboration with other groups that have tackled the same problem, we've recently compared and highlighted our online efforts (17).
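To give a flavor of what mapping acronyms to definitions involves, here is a minimal Python sketch. It is not the set of heuristics published in (8); it assumes only the simplest possible rule, namely that a parenthesized token is an acronym when its letters match the initials of the words immediately before it. The function name `find_acronym_definitions` and the sample sentence are hypothetical.

```python
import re


def find_acronym_definitions(text):
    """Return (acronym, definition) pairs for 'long form (ACRONYM)' patterns.

    Illustrative first-letter rule only: the N words preceding an
    N-letter parenthesized token must start with the token's letters.
    """
    pairs = []
    # A candidate acronym: an uppercase letter followed by 1-9 letters or
    # digits, enclosed in parentheses.
    for match in re.finditer(r"\(([A-Z][A-Za-z0-9]{1,9})\)", text):
        acronym = match.group(1)
        preceding_words = text[:match.start()].split()
        candidate = preceding_words[-len(acronym):]
        if len(candidate) != len(acronym):
            continue  # not enough words before the parenthesis
        initials = "".join(word[0] for word in candidate)
        if initials.upper() == acronym.upper():
            pairs.append((acronym, " ".join(candidate)))
    return pairs


sample = ("We used the polymerase chain reaction (PCR) and "
          "magnetic resonance imaging (MRI).")
pairs = find_acronym_definitions(sample)
```

Real text is messier than this (acronyms that skip words, use inner letters, or include digits), which is why a single first-letter rule would miss many definitions.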
Recent and Ongoing Projects
Text Mining/Knowledge Discovery
A fair amount of my current research revolves around methods of discovering new knowledge using large-scale literature analysis (20). I began this research while at UTSW, and much of my previous work is embodied in a software package called IRIDESCENT (9,11). IRIDESCENT was written for the assimilation and analysis of unstructured literature domains for the purpose of discovering new knowledge that is implicit in existing information. IRIDESCENT has been patented by UTSW (P2) and has proven quite useful so far in a number of different areas including knowledge discovery (21,31), ontology construction (10) and microarray analysis (18,24). I am a founding member and board member of a company called eTexx Biopharmaceuticals, which has licensed this technology. My ongoing research in this area involves improving, refining and extending knowledge discovery technology by developing better methods to rank the many inferences found in the "open-discovery" model (14) and identifying means of automatically detecting and prioritizing conceptual relationships (27). As part of an effort to expand analysis into non-literature based sources, I've developed a framework for sequence-based information integration and a prototype method to automate data-mining of genomic sequence data (23). The idea here is to create an "automated observer". I am also working on information retrieval and data classification methods for fully or semi-automated database construction (25, P4), currently with regard to short sequence related databases. These databases currently require a lot of manual labor, which can be time-consuming and expensive, not to mention generally tedious and sometimes inconsistent.
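The general idea of finding relationships that are implicit in existing information can be sketched in a few lines of Python. This is not IRIDESCENT and not the ranking method of (14); it assumes a toy table of directly related terms and ranks each indirectly connected term simply by how many intermediate terms it shares with the starting term. The function name `implicit_relationships` and the term labels A, B1, C1, etc. are hypothetical.

```python
from collections import defaultdict


def implicit_relationships(direct, start):
    """Find terms linked to `start` only through shared intermediates.

    `direct` maps each term to the set of terms it is directly related to.
    Returns (term, intermediates) pairs, most shared intermediates first.
    Counting intermediates is an illustrative stand-in for a real score.
    """
    known = direct.get(start, set())
    shared = defaultdict(set)
    for intermediate in known:
        for candidate in direct.get(intermediate, set()):
            # Skip the start term itself and anything already known.
            if candidate != start and candidate not in known:
                shared[candidate].add(intermediate)
    return sorted(shared.items(), key=lambda item: (-len(item[1]), item[0]))


# Toy relationship table (made-up labels, not real literature data).
direct = {
    "A": {"B1", "B2"},
    "B1": {"A", "C1", "C2"},
    "B2": {"A", "C1"},
    "C1": {"B1", "B2"},
    "C2": {"B1"},
}
ranked = implicit_relationships(direct, "A")
```

Here C1 ranks above C2 because it is reachable from A through two intermediates rather than one. With a real literature network the number of such inferences grows very quickly, which is why ranking them well matters.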
- Networked Information Resources
Clearly, the Internet age is changing the way we conduct research via the increasing availability of literature, software and analysis tools online. Unfortunately, the Internet is also dynamic, meaning that the continued availability of electronic resources is not necessarily guaranteed. I conducted a 2003 study of URLs published in MEDLINE abstracts and found that, just as in other fields, their availability decays over time. For example, 93% of 2002 URLs were still up while only 42% of 1995 URLs were (12). Because this URL decay is constant and the rate of growth in URL citation is so high (~40%/year), if something isn't done now, this could become a very big problem down the road (12a). A summary of this work was recently reported in the journal Nature (12b) as well as in the Chronicle of Higher Education (12c). For many authors, URL decay is out of their control (35). In a similar vein, I studied the decay rates of corresponding author emails published in MEDLINE and found a similar trend, except that emails decay much faster (28). I have also studied the online availability of information and found (using an API from Google) that the probability a journal article can be found online at a non-journal website rises with the impact factor of the publishing journal and recency of the paper (19). Peter Suber framed this study with a great editorial on the changing dynamics of open access and self-archiving (19a). As this study is very recent and BMJ is high-impact, I just couldn't resist the irony of posting it here... but I did ask first ;)
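The kind of tally behind figures such as "93% of 2002 URLs were still up" can be sketched as follows. This is only an illustration, not the procedure used in (12): it assumes each URL has already been checked and recorded as a (publication year, reachable) pair, and the function name `availability_by_year` and the sample records are hypothetical toy data.

```python
from collections import Counter


def availability_by_year(records):
    """Return {year: fraction of URLs still reachable} from
    (publication_year, is_reachable) pairs."""
    total = Counter()
    alive = Counter()
    for year, reachable in records:
        total[year] += 1
        if reachable:
            alive[year] += 1
    return {year: alive[year] / total[year] for year in sorted(total)}


# Toy records, not data from the study.
records = [
    (1995, True), (1995, False),
    (2002, True), (2002, True),
]
fractions = availability_by_year(records)
```

Deciding what counts as "reachable" (timeouts, redirects, pages that load but no longer hold the original content) is the harder part and is left out of this sketch.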
- Microarray Analysis
- Ah, microarrays. High-throughput technologies aren't changing the way we do science so much as the scale at which we do it. I'm working on methods to compare separate high-throughput experiments such as microarrays, which is difficult due to noise and lack of quantitative rigor (15, P3). Arguably the hardest part of microarray analysis is interpreting the biological significance of the results. Normalization, clustering and other post-processing steps are important but at the same time pretty well established. The real question is: what does it all mean? The first part of this question entails knowing what portion of the biological picture you are capable of viewing with the technology (29), and the second part is tying the responders in to what is known and figuring out what is not known (e.g. via IRIDESCENT).
Other
- I would be remiss if I didn't mention my most successful genetics project to date: My beautiful daughter Karen (*) who was born in January 1999. The success of this project and the procurement of sufficient funding (i.e., a real job) led to another fruitful collaboration with my wife to initiate a second project (**).
Click here for non-research related links (e.g. humor, philosophical thoughts, etc.)
Patents Received & Pending
(P1) Garner HR, Wren JD, Fondon JW 3rd, Minna JD. US Patent# 6,472,154, "Polymorphic Repeats in Human Genes", Filed Dec 31, 1999, Issued Oct 29, 2002 [USPTO]
(P2) Wren JD, Garner HR "Computer program products, systems and methods for information discovery and relational analyses (IRIDESCENT)" Patent pending [USPTO]
(P3) Conway T, Grissom JE, Yao M, Han J, Wren JD, Langer M, Traxler M "A secure, internet-enabled microarray database" Patent pending
(P4) Wren JD "Method and system for recognizing biological sequence data within text" Patent pending
Selected Peer-Reviewed Publications
(1) O'Brien KM, Wren JD, Dave VK, Bai D, Anderson RD, Rayner S, Evans G, Dabiri AE, Garner HR, "ASTRAL, a hyperspectral imaging DNA sequencer", Review of Scientific Instruments 1998 May; 69(5): 2141-6 [link]
(2) Fondon JW 3rd, Mele GM, Cummings D, Pande A, Wren J, O'Brien KM, Kupfer KC, Lerman M, Minna JD and Garner HR, "Computerized polymorphic marker identification: experimental validation and a predicted human polymorphism catalog", Proceedings of the National Academy of Sciences 1998 Jun 23; 95(13):7514-9 [PDF]
(3) Wren JD, Forgacs E, Fondon JW 3rd, Pertsemlidis A, Cheng S, Gallardo T, Williams RS, Shohet RV, Minna JD, and Garner HR "Repeat polymorphisms within gene regions: Phenotypic and evolutionary implications", American Journal of Human Genetics 2000 Aug; 67(2): 345-56 [PDF]
(4) Forgacs E, Wren JD, Kamibayashi C, Kondo M, Xu XL, Markowitz S, Tomlinson GE, Muller CY, Gazdar AF, Garner HR and Minna JD "Searching for microsatellite mutations in coding regions in lung, breast, ovarian and colorectal cancers", Oncogene 2001 Feb 22; 20(8): 1005-9 [PubMed]
(5) Wren JD, Mittleman D, Garner HR "SIGNAL - Sequence Information and GeNomic AnaLysis" Computer Methods and Programs in Biomedicine 2002 May; 68(2): 177-81 [PubMed]
(6) Wren JD, Kulkarni A, Joslin J, Butow R and Garner HR "Cross-hybridization on PCR spotted microarrays" IEEE Engineering in Medicine and Biology 2002 Mar-Apr; 21(2): 71-5 [link]
(7) Kulkarni AV, Williams NS, Lian Y, Wren JD, Mittleman D, Pertsemlidis A, Garner HR "ARROGANT: An application to manipulate large gene collections" Bioinformatics 2002 Nov; 18(11): 1410-7 [PDF]
(8) Wren JD, Garner HR "Heuristics for identification of acronym-definition patterns within text: Towards an automated construction of comprehensive acronym-definition dictionaries" Methods of Information in Medicine 2002; 41(5): 426-34 [PubMed]
    (8a) Leslie M, "TOOLS: Acronym Soup" Science 2004 Jul 9; Vol 305: p. 157 [link]
(9) Wren JD "The IRIDESCENT System: An Automated Data-Mining Method to Identify, Evaluate, and Analyze Sets of Relationships Within Textual Databases" Ph.D. Dissertation, University of Texas Southwestern Medical Center, January 2003 [link]
(10) Wren JD, Garner HR "Shared Relationship Analysis: Ranking set cohesion and commonalities within a literature-derived relationship network" Bioinformatics 2004 Jan; 20(2): 191-8 [PDF]
(11) Wren JD, Bekeredjian R, Stewart JA, Shohet RV, Garner HR "Knowledge discovery by automated identification and ranking of implicit relationships" Bioinformatics 2004 Feb; 20(3): 389-98 [PDF]
(12) Wren JD "404 Not Found: The Stability and Persistence of URLs Published in MEDLINE" Bioinformatics 2004 Mar; 20(5): 668-72 [PDF]
    (12a) Schilling LM, Wren JD, Dellavalle RP "Bioinformatics leads charge by publishing more Internet addresses in abstracts than any other journal" Bioinformatics 2004 Nov 22; 20(17):2903 (Letter) [PubMed]
    (12b) Whitfield J "Web links leave abstracts going nowhere" Nature 2004 Apr 8; Vol. 428: p. 592 [PubMed]
    (12c) Carlson S "Here today, gone tomorrow: Studying how online footnotes vanish" Chronicle of Higher Education 2004 Apr 30; 50(34): p. A33 [link]
(13) Wren JD "The emerging In-Silico scientist: How text-based bioinformatics is bridging biology and artificial intelligence" IEEE Engineering in Medicine and Biology 2004 Mar-Apr; 23(2): 87-93 [link]
(14) Wren JD "Extending the mutual information measure to rank inferred literature relationships" BMC Bioinformatics 2004 Oct 7; 5(1): 145 [Open Access]
(15) Wren JD, Yao M, Langer M, Conway T "Simulated annealing of microarray data reduces noise and enables cross-experimental comparisons" DNA and Cell Biology 2004 Oct; 23(10): p. 695-700 [PubMed]
(16) Jennings SF, Ptitsyn AA, Wilkins D, Bruhn RE, Slikker W, Wren JD "Regional societies: Fostering competitive research through virtual infrastructures" PLoS Biology 2004 Dec; 2(12): e372-3 [Open Access]
(17) Wren JD, Chang JT, Pustejovsky J, Adar E, Garner HR, Altman RB "Biomedical term mapping databases" Nucleic Acids Research 2005 Jan; 33(Database Issue): D289-293 [Open Access]
(18) Mizumoto N, Hui F, Edelbaum D, Weil MR, Wren JD, Shalhevet D, Matsue H, Liu L, Garner HR, Takashima A "Differential activation profiles of multiple transcription factors during dendritic cell maturation" Journal of Investigative Dermatology 2005 Apr; 124(4): 718-24 [PubMed]
(19) Wren JD "Open access and openly accessible: A study of scientific publications shared via the Internet" British Medical Journal 2005 May 14; 330(7500):1128-31 [PDF (long version)] - posted with permission from BMJ ;)
    (19a) Suber P "Open access, impact and demand" BMJ 2005 May 14; 330(7500):1104 [PubMed]
(20) Wren JD "Automating literature-based lead discovery", in Frontiers in Drug Design and Discovery (eds. GW Caldwell, A Rahman and BA Springer), Bentham Science Publishers, 2005; Volume 1, p.267-86. [link]
(21) Wren JD, Garner HR "Data-mining analysis suggests an epigenetic pathogenesis for Type II Diabetes" Journal of Biomedicine and Biotechnology 2005 Jun; 2005(2): 104-12 [Open Access]
(22) Wren JD and Slikker W "Proceedings of the Midsouth Computational Biology and Bioinformatics Society 2004 Conference" BMC Bioinformatics 2005 Jul 15; 6(Suppl 2):S1 [Open Access] (editorial)
(23) Wren JD, Johnston DK, Gruenwald L "Automating genomic data mining via a sequence-based matrix format and associative rule set" BMC Bioinformatics 2005 Jul 15; 6(Suppl 2):S2 [Open Access]
(24) Xu Z, Patterson TA, Wren JD, Han T, Shi L, Duhart H, Ali SF, Slikker W "A microarray study of MPP+-treated PC12 cells: Mechanisms of toxicity (MOT) analysis using bioinformatics tools" BMC Bioinformatics 2005 Jul 15; 6(Suppl 2):S8 [Open Access]
(25) Wren JD, Hildebrand WH, Chandrasekaran S, Melcher UK "Markov model recognition and classification of DNA/Protein sequences within large text databases" Bioinformatics 2005 Nov 1; 21(21): 4046-53 [PubMed]
(26) Wren JD "Truth, probability, and frameworks" PLoS Medicine 2005 Nov; 2(11):e361 [Open Access] (letter)
(27) Wren JD "Using Fuzzy Set Theory and scale-free network properties to relate MEDLINE terms" Soft Computing 2006 Feb; 10(4): 374-81 [link] [Supplementary_Info]
(28) Wren JD, Grissom JE, Conway T "Email decay rates among MEDLINE corresponding authors" EMBO Reports 2006 Feb; 7(2): 122-7 [Open Access]
(29) Wren JD, Conway T "Meta-analysis of published transcriptional and translational fold-changes reveals a preference for low-fold inductions" OMICS 2006 Feb; 10(1): 15-27 [PubMed]
(30) Wren JD, Roossinck MJ, Nelson RS, Scheets K, Palmer MW, Melcher U "Plant virus biodiversity and ecology" PLoS Biology 2006 Mar;4(3):e80 [Open Access]
(31) Flood EM, Kumar RS, Shah R, Amos Q, Wren JD, Shohet RV, Garner HR "Melatonin administration does not affect isoproterenol-induced LVH" IEEE Engineering in Medicine and Biology 2006 May-Jun; 25(3):84-7 [PubMed]
(32) Wren JD "Theory and reality for software patents: Good in concept, not so good in practice" Bioinformatics 2006 Jul 1; 22(13):1543-5 [PubMed] (editorial)
(33) Wren JD, Ptitsyn AA, Gusev Y, Winters-Hilt S "Proceedings of the 2006 Midsouth Computational Biology and Bioinformatics Society Conference" BMC Bioinformatics 2006 Sep 26; 7(Suppl 2):S1 [Open Access] (editorial)
(34) Wren JD "A scalable machine-learning approach to recognize chemical names within large text databases" BMC Bioinformatics 2006 Sep 26; 7(Suppl 2):S3 [Open Access]
(35) Wren JD, Johnson KR, Crockett DM, Heilig LF, Schilling LM, Dellavalle RP "URL decay in dermatology journals: Author attitudes and preservation practices" Archives of Dermatology 2006 Sep; 142(9):1147-52 [PubMed]
(36) Perez-Iratxeta C, Andrade MA, Wren JD "Evolving research trends in bioinformatics" Briefings in Bioinformatics 2006 Oct 31 [PubMed]
---------------------------------------------- from OMRF ----------------------------------------------
(37) Wren JD, Wu Y, Guo S "A Near-System-Wide Analysis of Differentially Expressed Genes in Ectopic and Eutopic Endometrium" Human Reproduction 2007 (in press)
(38) Wren JD "The 'Open Discovery' Challenge" in Literature-Based Discovery (eds. Peter Bruza & Marc Weeber), Springer Publishing 2007 (in press)
(39) Errami M, Wren JD, Hicks JM, Garner HR "eTBLAST: A web server to identify expert reviewers, appropriate journals and similar publications" Nucleic Acids Research 2007 (in press)
(*) Wren JD, Wren TS, "Creation of Karen Nicole Wren", Products of Marriage and Mating, Number 1, January 19, 1999
(**) Wren JD, Wren TS, "Creation of Ethan Daniel Wren", Products of Marriage and Mating, Number 2, January 30, 2004
Last updated: 4/17/2007