I am interested in developing fast algorithms for problems specific to bioinformatics and metagenomics. As a bioinformatician, I want to bridge the gap between biologists, who interpret data and draw links between ecology, evolution, and observations, and computer engineers, who provide the tools to process that data.
To give an example, a lake sample might contain about 1,500 species of plants, animals, and fungi. Identifying all of them under the microscope is not feasible. Instead, a specific genome region is targeted (and amplified via PCR) that allows sufficiently accurate identification. Currently, I focus on how to support this identification process. As part of this, I work on the ambiguity problem in read mapping, i.e. determining the true origin of short DNA sequences (e.g. from shotgun sequencing) that match at multiple locations in the reference sequence. Handling read ambiguity is a hard and ubiquitous problem: up to 80% of a genome may consist of repetitive DNA sequences.
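To make the ambiguity problem concrete, here is a minimal Python sketch. It is only an illustration, not my actual mapping method: the toy reference, the read names, and the helper `find_occurrences` are hypothetical, and it uses exact string matching, whereas real read mappers use index structures and tolerate mismatches.

```python
def find_occurrences(reference: str, read: str) -> list[int]:
    """Return all 0-based start positions where `read` matches `reference` exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Continue one position further so overlapping matches are found too.
        start = reference.find(read, start + 1)
    return positions


# Hypothetical toy reference that contains the repeat "ACGTAC" twice.
reference = "TTACGTACGGGACGTACCA"
reads = {"read_in_unique_region": "CGGGA", "read_in_repeat": "ACGTAC"}

for name, read in reads.items():
    hits = find_occurrences(reference, read)
    if not hits:
        status = "unmapped"
    elif len(hits) == 1:
        status = "unique"
    else:
        status = "ambiguous"
    print(name, hits, status)
```

The read taken from the repeat matches at two positions, so its origin cannot be decided from the sequence alone; resolving such cases requires additional evidence beyond the match itself.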