Research Areas

Medical Search Engines and Topical Ranking

A central problem in biomedical informatics is how to contextually retrieve and rank medical and healthcare information. In this research we introduced a Medical Information Retrieval model (MIR) for extracting the semantic relations among medical documents. The goal is to maximize contextual retrieval and ranking performance with minimal input from users. The model is implemented in our search engine, Medicoport. Experiments show that the model achieves higher average precision on medical documents than leading information retrieval techniques, and its improvement in effectiveness over the other methods is statistically significant. Moreover, both medical and nonmedical users rated the ranking and relevance of the results as highly successful.

When presenting search results, ranking and categorization play an important role in usability. For this reason we have investigated categorizing the results. Since medical documents on the web come in many different formats, we have focused on the categorization of free text. We developed an efficient rule-based method, ROLEX-SP, for categorizing free-text documents. The contributions of this research are the use of lexico-syntactic patterns as basic classification features, a categorization framework that addresses the problem of classifying free text from minimal label descriptions, and a learning algorithm that is efficient in both time complexity and F-measure. The ROLEX-SP framework concentrates on capturing the correct classes of a text while reducing classification errors. We performed experiments to evaluate the proposed method and compared it with state-of-the-art methods using a domain-specific source of knowledge. The results indicate that ROLEX-SP outperforms the other methods in terms of standard F-measure in the medical domain, owing to the strong MeSH descriptions of medical categories.
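The pattern-matching core of such a rule-based categorizer can be sketched as follows; the categories and patterns below are illustrative inventions, not ROLEX-SP's learned rules:

```python
import re

# Illustrative lexico-syntactic patterns per category. ROLEX-SP's actual
# rules are derived from MeSH category descriptions; these are made up.
PATTERNS = {
    "cardiology": [r"\bheart (attack|failure|rate)\b", r"\bcardiac\b"],
    "oncology":   [r"\btumou?r\b", r"\bchemotherapy\b", r"\bmalignant \w+\b"],
}

def classify(text):
    """Return every category whose patterns match the free text."""
    text = text.lower()
    return sorted(
        cat for cat, pats in PATTERNS.items()
        if any(re.search(p, text) for p in pats)
    )

print(classify("The patient received chemotherapy for a malignant tumor."))
# → ['oncology']
```

A real system would learn such patterns automatically and score competing matches rather than firing every rule.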

Static Detection of Shared Objects in Multithreaded Java Programs

Implementing and verifying concurrent programs is quite challenging. For dynamic verification techniques, errors in such programs are hard to reproduce; for static verification techniques, the number of possible executions is effectively infinite. Many researchers are therefore working on this problem. The danger in concurrent programs stems from conflicts on objects that are accessed by multiple threads without a proper locking mechanism, so it is very useful to identify these objects before they cause harm at runtime.

This project aims to help programmers see all potentially shared objects that may cause complications at runtime. It resulted in a simple and efficient automated tool, DoSSO, that detects shared objects in multithreaded Java programs. With DoSSO, programmers can implement concurrent software without considering synchronization issues up front and then introduce appropriate locking mechanisms based on the tool's results. To demonstrate its effectiveness, we performed experiments on a multithreaded system with graphical user interfaces and remote method invocation and obtained promising results.
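The underlying idea, though not DoSSO's actual algorithm (which analyzes Java programs with a real reachability analysis), can be illustrated with a toy analysis: an object is flagged as potentially shared when it is accessible from more than one thread.

```python
# Toy static "analysis": given, for each thread entry point, the set of
# objects its code may access, flag every object reachable from two or
# more threads. The thread and object names below are invented.

def shared_objects(accesses_by_thread):
    owners = {}
    for thread, objs in accesses_by_thread.items():
        for obj in objs:
            owners.setdefault(obj, set()).add(thread)
    return sorted(o for o, ts in owners.items() if len(ts) > 1)

accesses = {
    "main":   {"gui", "jobQueue"},
    "worker": {"jobQueue", "tempBuffer"},
    "rmi":    {"jobQueue", "gui"},
}
print(shared_objects(accesses))  # → ['gui', 'jobQueue']
```

In this view, `tempBuffer` needs no lock because only one thread ever touches it; the value of the analysis is exactly in separating such thread-local objects from the genuinely shared ones.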

Automated Verification of Aspect Oriented Programs

Aspect-oriented programming (AOP) has been gaining popularity over the past decade. As with any new programming paradigm, AOP programmers need verification and testing support comparable to that available for established paradigms. The challenge in this line of work is that aspects have powerful features whose effects can drastically change software behavior; such an impact on behavior calls for automated verification support. This project aims to develop a modular verification methodology for aspect-oriented programs.

In this project we focus on AspectJ programs. So far we have developed two tools and a static verification technique: 1) an automated tool that isolates an aspect from its environment so that independent tests and static verification techniques can be run on it; 2) a tool that infers an aspect's invariants from its current usage and extracts its usage rules to help programmers reuse the aspect correctly; 3) a modular static verification technique based on model checking that reasons about the behavior of an aspect and enforces its correct usage based on its interface. In other words, the technique asks two questions: "does the aspect have the provisioned effect?" and "does the base program satisfy the assumptions of the aspect?"

Design for Verification

The widespread use of software systems in modern technology has heightened the need for techniques that assure software dependability, which includes safety, continuous operation and reliability. Dependability is a major concern especially for safety- and mission-critical systems. There has been significant progress in automated verification techniques based on model checking for reasoning about the dependability of software systems. Typically, however, fully automated verification techniques are not scalable, and scalable verification techniques require substantial human guidance. A promising way of attacking these problems is to construct software in ways that facilitate automated verification; we call this approach design for verification. It aims to bridge the gap between software development and automated software verification, and it contributes to the automated verification of large-scale software. Furthermore, applying this approach results in frameworks that help programmers who are not trained in formal methods to develop reliable and dependable software.
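At its simplest, the model-checking core mentioned above is an exhaustive reachability search over a system's state graph. A minimal sketch, using an invented two-traffic-light model as the system under check:

```python
from collections import deque

def reachable_error(initial, successors, is_error):
    """Breadth-first search over the state graph: return an error state
    reachable from `initial`, or None if the property always holds."""
    seen, queue = {initial}, deque([initial])
    while queue:
        s = queue.popleft()
        if is_error(s):
            return s
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return None

# Toy model: two traffic lights, each toggling red <-> green independently.
# The safety property to check is "never both green at once".
def successors(state):
    a, b = state
    nxt = {"red": "green", "green": "red"}
    return [(nxt[a], b), (a, nxt[b])]

bad = reachable_error(("red", "red"), successors,
                      lambda s: s == ("green", "green"))
print(bad)  # → ('green', 'green'): the naive model violates the property
```

Real model checkers add property specification languages, counterexample traces, and state-space reduction; the scalability problem discussed above arises because `seen` can grow exponentially with the number of components.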

The design for verification approach has been applied to concurrent programs and to web service design, resulting in two different frameworks. The results have been published in several respected journals and conferences, one of which received an ACM SIGSOFT Distinguished Paper Award.

Future Directions

Web technologies are starting to dominate the software industry. Their ease of use and their ability to reach clients effectively have made them popular and widespread, and new frameworks and tools make the area attractive to programmers.

I am interested in investigating formal reasoning, testing and verification of such software systems. I am also interested in applying model checking in bioinformatics, on biological networks. Model checking techniques operate on graphs and have been used successfully to reason about program graphs, answering questions such as reachability. This area has attracted many researchers over the past few years, and the initial results seem promising.

The syntax-semantics interface

Natural language comprehension and production by humans has received interdisciplinary interest from theoretical and computational perspectives, as well as for its potential in applied language research. A major issue in this research domain is the investigation of the mechanisms underlying the interface between syntax and semantics. In particular, syntactic diagnostic environments for certain parts of speech, such as intransitive verbs, are not always aligned with their aspectual properties, presenting a puzzle at the syntax-semantics interface. For instance, split intransitivity (i.e. the unaccusative/unergative distinction) remains an unsolved problem in language research. The research on the syntax-semantics interface at the Informatics Institute focuses both on constructing computational models at the theoretical level and on conducting empirical investigations that measure offline and online language comprehension and production processes. In experimental studies conducted in the Research and Application Laboratory of Human Computer Interaction, a set of empirical methodologies is employed, such as eliciting acceptability judgments and recording the eye movements of language users during their comprehension and production of written and spoken language.

Discourse annotation

The term discourse refers to a coherent group of written or spoken sentences. How coherence is established in discourse can be studied at a global level (e.g. frames, schemes, goal and intention hierarchies) as well as at the local level. The discourse research group at the Informatics Institute focuses on the local level of discourse and takes discourse connectives (e.g. coordinating connectives, subordinating connectives and discourse adverbials) as the simplest building blocks of local discourse. Discourse connectives are treated as discourse-level predicates with an argument structure, albeit a very simple one: every discourse connective has exactly two arguments, that is, it relates exactly two text spans. A discourse resource sharing the goals and principles of the Penn Discourse TreeBank is being built by annotating discourse connectives and their two arguments in a 400,000-word written Turkish corpus. At the theoretical level, this initiative shows that discourse is unlike syntax in that it lacks complex dependencies, corroborating findings from similar efforts in other languages. It also reveals that the set of discourse connectives and their senses are comparable across languages. The research continues with an eye to further cross-linguistic similarities and differences, and it aims to reveal the role discourse connectives play in discourse comprehension and production.
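The two-argument structure of connectives can be made concrete with a minimal annotation record; the field names here are illustrative, not the actual annotation schema of the corpus:

```python
from dataclasses import dataclass

@dataclass
class ConnectiveAnnotation:
    # A discourse connective as a two-place predicate over text spans.
    connective: str   # the connective token, e.g. a coordinating connective
    arg1: tuple       # (start, end) character offsets of the first argument
    arg2: tuple       # (start, end) character offsets of the second argument

text = "It was raining, so the match was cancelled."
ann = ConnectiveAnnotation("so", (0, 14), (19, 42))
print(text[ann.arg1[0]:ann.arg1[1]])  # → It was raining
print(text[ann.arg2[0]:ann.arg2[1]])  # → the match was cancelled
```

Real annotation efforts additionally record the sense of the connective (e.g. cause, contrast) and allow discontinuous argument spans.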

Multimodal comprehension

Humans use multiple representational modalities (such as language, diagrams, statistical graphs, etc.) both in problem-solving tasks and in communication. The research in the Informatics Institute on multimodal comprehension covers investigations at both theoretical and empirical levels within the interdisciplinary framework of linguistics, psychology, computer science and educational sciences.

From a theoretical perspective, cognitive architectures and human-computer interaction models of multimodal comprehension are developed to analyze how humans integrate (or fuse) information from different modalities. These models provide the framework for a complementary empirical perspective.

In experimental studies with human participants, conducted in the Human Computer Interaction Research and Application Laboratory, several methodologies are employed, such as eye tracking, judgment reports, and posttest analyses (recall, retention and transfer measurement). This domain of research is connected to research on multimedia learning, multisensory perception, multimedia generation systems, and multimodal discourse analysis.

Verbal Aid Systems for Accessibility

Depictive representations (e.g., diagrammatical illustrations, statistical graphs) are usually accompanied by verbal annotations to facilitate understanding. Sighted persons access the information provided by a verbal annotation and the corresponding diagrammatic entity in a verbally annotated diagram instantaneously, mostly with a single gaze fixation.

Visually impaired persons, however, due to the intrinsic characteristics of accessing information through alternative sensory modalities such as touch, use different patterns of investigation than sighted persons to integrate the information provided by linguistic and diagrammatic entities. This difference has significant implications for the design of verbally annotated diagrams for visually impaired persons. In contrast to sensory substitution for graphs, a well-studied subdomain of accessibility research, research on verbal aid systems has been limited.

The research on verbal aid systems in the Informatics Institute focuses on the analysis of comprehension of verbal annotations by humans for the development of design principles and guidelines for language support systems in diagrammatic representations. This research has implications both for accessibility research and learning by visually impaired persons.


Development of image analysis tools for automatic airway and vessel measurement on CT lung images for Cystic Fibrosis patients

Cystic Fibrosis (CF) is the most common lethal genetic disorder in the Caucasian population, affecting about 30,000 people in the United States. Airway (AW) inflammation begins early in life, producing structural damage that can progress insidiously in patients who are relatively asymptomatic. High-resolution computed tomographic (CT) imaging has shown that the AWs of infants and young children with CF have thicker walls and are more dilated than those of normal children. The purpose of this study was to develop computerized methods that allow rapid, efficient and accurate assessment of AW and vessel (V) dimensions from axial CT lung images. Threshold-based and model-based automatic AW and V size measurement methods were developed; the only user input required is an approximate center marking of the AW and V. The methods were evaluated using chest CT images from 16 patients (8 infants and 8 children) at different stages of mild CF-related lung disease. Both approaches correlated well with measurements made by experienced observers using electronic calipers, as well as with spirometric measurements of lung function. However, the model-based approach correlated slightly better with the human measurements than the threshold-based method, and since it requires adjustment of fewer parameters, it has a definite advantage over the threshold-based scheme. Averaging the estimates from the two methods (a hybrid method) was also investigated and yielded additional improvements.
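A threshold-based size measurement of the kind described can be sketched as a full-width-at-half-maximum estimate on a 1D intensity profile through the structure; this is a simplified illustration of the thresholding idea, not necessarily the exact method used in the study:

```python
import numpy as np

def fwhm_width(profile, spacing_mm=1.0):
    """Full-width-at-half-maximum of a 1D intensity profile, a common
    threshold-based size estimate. Simplified: assumes a single bright
    structure on a darker background, with linear interpolation for
    sub-pixel crossings at the half-maximum level."""
    p = np.asarray(profile, float)
    half = (p.max() + p.min()) / 2.0
    above = np.where(p >= half)[0]
    left, right = above[0], above[-1]

    def cross(i, j):
        # linear interpolation of the half-level crossing between i and j
        return i + (half - p[i]) / (p[j] - p[i]) * (j - i)

    lo = cross(left - 1, left) if left > 0 else float(left)
    hi = cross(right + 1, right) if right < len(p) - 1 else float(right)
    return (hi - lo) * spacing_mm

print(fwhm_width([0, 0.5, 1, 0.5, 0]))  # → 2.0
```

In practice such a profile would be sampled along rays from the user-marked center, and the per-ray widths averaged into a diameter estimate.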

Mitochondria detection and segmentation on electron microscopy images

Mitochondrial function plays an important role in the regulation of apoptosis, and disturbances in mitochondrial function are accompanied by significant morphological alterations. Electron microscopy tomography (EMT) is a powerful technique for studying the 3D structure of mitochondria, but EMT mitochondrial images are quite noisy due to the presence of various sub-cellular structures/cristae and imaging artifacts. Interpretation, measurement and analysis of these images are therefore very challenging, and the development of specialized software tools to automatically detect and segment mitochondria is very important. Typically, mitochondrial EMT images are segmented manually using special software tools; automatic contour extraction on large images with multiple mitochondria and many other sub-cellular structures is still an unaddressed problem. The purpose of this work was to develop image analysis tools that automatically detect and segment mitochondria in EMT images. The automated algorithm has two main parts: mitochondria detection and mitochondria segmentation. The methods rely on the facts that mitochondria are nearly elliptical in shape and that their boundary consists mostly of a double membrane. Our detection method is based on ellipse detection, and the detection results are first refined using active contours. Then our seed-point selection method automatically selects reliable seed points, and segmentation is finalized by a live-wire graph-search algorithm run between these seed points.
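The elliptical-shape assumption can be illustrated with a minimal least-squares conic fit that recovers an ellipse center from boundary points; the project's actual detector is more elaborate (ellipse detection refined by active contours), so this is only a sketch:

```python
import numpy as np

def fit_ellipse_center(xs, ys):
    """Least-squares fit of a conic a*x^2 + b*x*y + c*y^2 + d*x + e*y = 1
    to boundary points, returning the ellipse center. A full detector
    would also recover axes and orientation and score candidate fits."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    A = np.column_stack([xs**2, xs*ys, ys**2, xs, ys])
    a, b, c, d, e = np.linalg.lstsq(A, np.ones_like(xs), rcond=None)[0]
    # The center is where the gradient of the quadratic form vanishes.
    cx, cy = np.linalg.solve([[2*a, b], [b, 2*c]], [-d, -e])
    return cx, cy

# Synthetic boundary points on an ellipse centered at (3, -1).
t = np.linspace(0, 2*np.pi, 50, endpoint=False)
xs, ys = 3 + 2*np.cos(t), -1 + np.sin(t)
cx, cy = fit_ellipse_center(xs, ys)
print(round(cx, 3), round(cy, 3))  # → 3.0 -1.0
```

On real EMT data the boundary points would come from edge detection and the fit would need robust weighting against cristae and neighbouring structures.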

Fusion of MR and CT imaging modalities for 3D human head modeling

We are developing automatic image segmentation tools for classifying pixels into muscle, fat, bone and air classes on thin-section low-dose CT and MR images. Our Bayesian scheme models correlated noise and system resolution; partial volume effects are also modeled, and using directional priors further improved performance. This work is now leading to two publications.
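The Bayesian classification step can be sketched as a per-pixel maximum a posteriori decision with Gaussian class likelihoods; the class parameters below are invented, and the actual scheme additionally models correlated noise, system resolution, partial volume and directional priors:

```python
import numpy as np

# Illustrative (made-up) class models for a CT-like image:
# mean intensity in Hounsfield units, standard deviation, prior.
CLASSES = {
    "air":    (-950.0,  50.0, 0.25),
    "fat":    (-100.0,  40.0, 0.25),
    "muscle": (  40.0,  30.0, 0.25),
    "bone":   ( 700.0, 150.0, 0.25),
}

def classify_pixel(hu):
    """Maximum a posteriori class for a single intensity value."""
    def log_post(mu, sigma, prior):
        # Gaussian log-likelihood plus log-prior (constants dropped).
        return -0.5 * ((hu - mu) / sigma) ** 2 - np.log(sigma) + np.log(prior)
    return max(CLASSES, key=lambda c: log_post(*CLASSES[c]))

print([classify_pixel(v) for v in (-900, -80, 50, 800)])
# → ['air', 'fat', 'muscle', 'bone']
```

Modeling correlated noise and partial volume replaces the independent per-pixel likelihood above with a spatial model, which is where most of the difficulty lies.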

Microphone array processing

Microphone array processing refers to a diverse set of techniques for extracting information from multiple channels of audio data obtained from arrays of acoustic probes positioned in a variety of geometries. The research in the Informatics Institute focuses on two problems: sound source localisation and source separation. Sound source localisation techniques aim to find the direction of a sound source, typically under adverse conditions such as highly reverberant enclosures or strong acoustical background noise including a variety of interferers. The source separation problem refers to situations where individual sources must be extracted from mixtures of sources, such as the conversation between two individuals in a marketplace full of different sounds, or in a cocktail-party setting. As an outcome of this research, a computationally efficient method has been developed that utilises coincident microphone arrays composed of only four channels. This work has led to several publications as well as an international patent application.
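As background on localisation, the classic GCC-PHAT method estimates the arrival-time difference between two spaced microphones; note this is a textbook technique shown for context, not the coincident-array method developed in this research (coincident arrays rely on directional responses rather than inter-channel delays):

```python
import numpy as np

def gcc_phat_delay(x, y):
    """Estimate how many samples y lags behind x using the classic
    GCC-PHAT (phase-transform) cross-correlation."""
    n = len(x) + len(y)
    X, Y = np.fft.rfft(x, n), np.fft.rfft(y, n)
    R = Y * np.conj(X)
    R /= np.maximum(np.abs(R), 1e-12)       # phase transform (whitening)
    cc = np.fft.irfft(R, n)
    # Rearrange so that lag 0 sits in the middle of the correlation.
    cc = np.concatenate((cc[-(n // 2):], cc[:n // 2 + 1]))
    return int(np.argmax(cc)) - n // 2

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
y = np.concatenate((np.zeros(5), x))[:1024]  # y lags x by 5 samples
print(gcc_phat_delay(x, y))  # → 5
```

Converting such a delay into a direction requires the microphone spacing and the speed of sound, and reverberation makes the correlation peak far less clean than in this synthetic example.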

Virtual acoustics

Acoustical modelling and simulation have important application areas in architecture, city planning, environmental management, and security, among others. This practical importance also makes the topic an interesting academic research subject. The virtual acoustics research in the Informatics Institute is on the modelling and simulation of secondary effects, such as source directivity and time-variant aspects, in digital waveguide mesh models. In addition, work is being carried out on high-performance computing techniques such as vector processors, multicore architectures and parallel processing. The results are combined with the work on spatial audio systems to provide realistic audio environments.
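The basic update rule of a rectilinear digital waveguide mesh, on top of which such secondary effects are modelled, can be sketched as follows (a toy version using periodic boundaries for brevity; real simulations use proper boundary and excitation models):

```python
import numpy as np

def dwm_step(p_prev, p_prev2):
    """One update of a rectilinear digital waveguide mesh: each junction
    pressure is half the sum of its four neighbours at the previous step
    minus its own value two steps earlier. np.roll gives periodic
    boundaries, a simplification for this sketch."""
    neighbours = (np.roll(p_prev, 1, 0) + np.roll(p_prev, -1, 0) +
                  np.roll(p_prev, 1, 1) + np.roll(p_prev, -1, 1))
    return 0.5 * neighbours - p_prev2

# Excite the mesh with an impulse and record at a receiver junction.
p1, p2 = np.zeros((32, 32)), np.zeros((32, 32))
p1[16, 16] = 1.0                      # impulse excitation
out = []
for _ in range(50):
    p1, p2 = dwm_step(p1, p2), p1
    out.append(p1[10, 16])            # receiver six junctions away
print(any(abs(v) > 0 for v in out))   # → True: the wavefront reaches it
```

Source directivity and time-variant behaviour, the focus of the research above, are layered on such a scheme by shaping how energy is injected and how the mesh parameters evolve over time.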

Spatial audio processing

Spatial audio processing refers to the plethora of signal processing and audio reproduction techniques that allow the perception of audio content in space and thus in context. The spatial audio research in the Informatics Institute concentrates on binaural audio, ambisonics and wave field synthesis. The research is mainly concerned with recording, synthesis and reproduction techniques and with the perception of sound fields. A particular area of interest is audio systems that combine acoustical models of enclosures with spatial audio reproduction. The effect of visual reproduction on the perception of spatial audio is also being investigated. An effective spatial audio coding technique has been developed, which has led to several publications and a patent application.
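As background, first-order ambisonic (B-format) encoding of a mono source can be written down directly from the source direction; these are the textbook encoding equations, shown for context, not the coding technique developed here:

```python
import numpy as np

def encode_bformat(signal, azimuth_deg, elevation_deg=0.0):
    """Encode a mono signal into first-order B-format (W, X, Y, Z)
    for a source at the given direction."""
    az, el = np.radians(azimuth_deg), np.radians(elevation_deg)
    s = np.asarray(signal, float)
    w = s / np.sqrt(2.0)               # omnidirectional component
    x = s * np.cos(az) * np.cos(el)    # front-back figure-of-eight
    y = s * np.sin(az) * np.cos(el)    # left-right figure-of-eight
    z = s * np.sin(el)                 # up-down figure-of-eight
    return w, x, y, z

w, x, y, z = encode_bformat([1.0], azimuth_deg=90.0)   # source to the left
print(round(x[0], 6), round(y[0], 6))  # → 0.0 1.0
```

Reproduction then decodes these four channels to a loudspeaker layout or to binaural signals, which is where the perceptual questions studied above arise.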

Audio synthesis and sonification

Sonic feedback is an integral part of how we interact with computing systems, and it is well known that auditory feedback associated with common tasks improves task efficiency. Indeed, product sound design has become an important area of research in recent years. The term sonification (the auditory counterpart of visualisation) refers to making the patterns and features in otherwise unstructured data audible through synthetic sounds, or snippets of real sounds, at a higher level of abstraction. Current work in the Informatics Institute focuses on the analysis-based synthesis and morphing/interpolation of common impact sounds such as bouncing balls, keyboard sounds, clapping, closing doors, etc. The objective is a simple yet effective sound synthesis model that can be utilised in virtual reality applications, computer games, auditory user interfaces, etc.
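A common analysis-based model for such impact sounds is modal synthesis: a sum of exponentially decaying sinusoids whose frequencies, decay rates and amplitudes are estimated from recordings. A minimal sketch with invented mode parameters:

```python
import numpy as np

def impact_sound(modes, dur=0.5, fs=44100):
    """Classic modal model for impact sounds: a sum of exponentially
    decaying sinusoids, one per (freq_hz, decay_per_s, amplitude) mode."""
    t = np.arange(int(dur * fs)) / fs
    return sum(a * np.exp(-d * t) * np.sin(2 * np.pi * f * t)
               for f, d, a in modes)

# A made-up "struck metal bar": a few inharmonic, slowly decaying modes.
s = impact_sound([(440.0, 8.0, 1.0), (1247.0, 12.0, 0.5), (2610.0, 20.0, 0.25)])
print(abs(s[:100]).max() > abs(s[-100:]).max())  # → True: the sound decays
```

Morphing between two sounds then amounts to interpolating the mode parameters rather than the waveforms, which is what makes the modal representation attractive for interactive applications.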

Perception and psychoacoustics

To overcome the difficulty of dealing with the huge amount of information in the environment, the auditory system generally discards redundant information. Understanding how humans perceive the auditory environment surrounding them is essential for developing environment-aware agents. The research on perception and psychoacoustics in the Informatics Institute focuses on attention modeling as well as spatial auditory perception. Specifically, subjective tests are designed whose results are used to develop objective models, for example for determining the Quality of Experience with 3D content and displays. Subjective testing methods are also applied to testing speech intelligibility in the presence of noise, in collaboration with charities for the hard of hearing.