Egon Willighagen

Postdoctoral researcher at the Karolinska Institute. I study the role of machine representation of knowledge and hypotheses in the life sciences, and in toxicology in particular, using chemometrics and semantic web technologies. In the past, I have applied this research to QSAR, crystallography, and metabolomics. Open source programming is my main hobby, resulting in participation in, amongst many others, MetWare, Bioclipse, the Chemistry Development Kit, Jmol, and Oscar. See also my biography and below.

My Group

Well, it is more like I want to establish a group.

Past Members

Research

The research in my group is about reducing the error introduced in chemical data analysis by the analysis itself, a field termed Molecular Chemometrics. This involves the study of errors introduced by improper handling of chemical knowledge, improper representation of the problem, and the statistical analysis method used. The approach my group takes to reduce that error involves explicit markup of knowledge using semantic technologies, development of new representation methods for molecular information, and the use of statistics to find, visualize, and validate new patterns.

Practically, however, this also means that to prove my point, I need others to adopt such methods too. As a result, I am also heavily involved in Translational Cheminformatics. This research field is about getting sound cheminformatics used in practice, and raising awareness of the problems studied in Molecular Chemometrics among people in other research fields, such as bioinformatics, metabolomics, and QSAR.

Each part of my research is exemplified by one or two key or recent papers.

Semantic Technologies

Technologies studied here include markup languages, like Chemical Markup Language, and the family of semantic approaches around the Resource Description Framework. These methods are being studied for use in cheminformatics and exchange of molecular data. This work has resulted in the development of CMLRSS, QSAR-ML, the Blue Obelisk Descriptor Ontology (BODO), and the CHEMINF ontology.

Molecular Representation

This branch of the research focuses on the representation of molecular data, such that statistical methods can extract the most information from the data or generate the best prediction models. This research resulted in the development of a new descriptor for molecular crystal structures, the debunking of NMR spectra for some modeling approaches, and the development of translational tools, like the Chemistry Development Kit.

Statistics

Statistical methods help us find, understand, and visualize complex patterns in our molecular data. This research focuses on improving the expert validation of our statistical models by linking the models to external data. Doing that linking accurately brings us back to the semantic technologies. This work resulted in the application of supervised Self-Organizing Maps to classification problems with multiple endpoints.

Translational Cheminformatics

This part of the research is about getting the above methods used in other scientific fields. This involves the development of tools that (unfortunately) hide much of the research from the above three fields, so that they can be easily used elsewhere. This research, in particular, is well accepted by the scientific community. Solutions here include the Chemistry Development Kit, Jmol, Bioclipse, and Oscar, which all make more fundamental research in Molecular Chemometrics more accessible.

Research Topics

The topics that this research covers all involve small molecules and nanomaterials, and are currently directed at toxicology and the life sciences in general. Research is ongoing in the fields of QSAR/QSPR in toxicology and nanotoxicology (and property prediction in general), and metabolite identification. Past topics include crystallography and polymorph prediction.

Research Collaborations

Collaboration is ongoing with: Dr. Christoph Steinbeck (EBI/UK) on the CDK, Bioclipse, and JChemPaint; Dr. Steffen Neumann (Halle/Germany) and Dr. Roeland van Ham (Wageningen/NL) on MetWare; Prof. Jarl Wikberg (Uppsala University) on Bioclipse; Dr. Nina Jeliazkova and others of the OpenTox.org community; many people in the CDK community; various people in the HCLS interest group; and others.

EU Projects

At the moment I am involved in the ToxBank project (not as PI), which is embedded in the larger Safety Evaluation Ultimately Replacing Animal Testing cluster (SEURAT-1).

Publications

For now, please find my publication list at Google Scholar, CiteULike, Mendeley, and the less populated researchid:C-6136-2008.

Addresses

Post Address

Karolinska Institutet
Institutet för miljömedicin (IMM)
dr E.L. Willighagen
Box 210
SE-171 77 Stockholm

Visiting Address

Institutet för miljömedicin (IMM)
Nobels väg 13
Solna

Web Addresses

Blog: chem-bla-ics. Social Networking: egonw@Delicious, egonw@SourceForge, egonw@LinkedIn, egonw@FriendFeed, egonw@GitHub, egonwillighagen@Twitter, egonw@CiteULike, egon-willighagen@Mendeley, chemblaics@Identi.ca, egonwillighagen@Lanyrd