PRIS 2010 Abstracts


Full Papers
Paper Nr: 3
Title:

Towards a multi-camera mouse-replacement interface

Authors:

John Magee, Zheng Wu, Harshith Chennamaneni, Samuel Epstein, Diane E. H. Theriault and Margrit Betke

Abstract: We present our efforts towards a multi-camera mouse-replacement system for computer users with severe motion impairments. We have worked with individuals with cerebral palsy or multiple sclerosis who use a publicly available interface that tracks the user's head movements with a single video camera and translates them into mouse pointer coordinates on the screen. To address the problem that the interface can lose track of the user's facial features due to occlusion or spastic movements, we have started to develop a multi-camera interface. Our multi-camera capture system can record synchronized images from multiple cameras and automatically analyze the camera arrangement. We recorded 15 subjects while they conducted a hands-free interaction experiment, and reconstructed via stereoscopy the three-dimensional movement trajectories of various facial features. Our analysis shows that single-camera interfaces based on two-dimensional feature tracking fail to account for substantial feature movement in the third dimension.

Paper Nr: 4
Title:

COARSE IMAGE EDGE DETECTION USING SELF-ADJUSTING RESISTIVE-FUSE NETWORKS

Authors:

Haichao Liang and Takashi Morie

Abstract: We propose a model of coarse edge detection using self-adjusting resistive-fuse networks. The resistive-fuse network is a well-known non-linear image processing model that can detect coarse edges in images by smoothing out noise and small regions. However, it is rarely used in real environments because of its sensitive dependence on parameters and the complexity of the annealing process. In this paper, we first introduce self-adjusting parameters to reduce the number of parameters that must be controlled. We then propose a heating-and-cooling sequence for fast and robust edge detection. The proposed model detects edges more accurately than the original one, even when the input image contains a gradation.

Paper Nr: 5
Title:

Classification of Datasets with Missing Values: Two Level Approach

Authors:

Ivan Bruha

Abstract: One of the problems in pattern recognition (PR) is handling datasets with missing attribute values. PR algorithms should therefore include routines for processing these missing values, and several such routines exist for each PR paradigm. Quite a few experiments have revealed that each dataset has more or less its own 'favourite' routine for processing missing attribute values. In this paper, we use the machine learning algorithm CN4, a substantial extension of the well-known CN2, which contains six routines for processing missing attribute values. Our system runs these routines independently (at the base level), and afterwards a meta-combiner (at the second level) generates a meta-classifier that makes the overall decision about the class of input objects.
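As a loose illustration of the two-level scheme described above (not CN4's actual missing-value routines), base-level routines each emit a class prediction and a meta-level combiner decides from those predictions; here the base routines are hypothetical stand-ins and the learned meta-classifier is replaced by a simple majority vote:

```python
# Sketch of a two-level (base + meta) decision scheme. The base routines
# below are hypothetical stand-ins, and majority voting replaces the
# learned meta-classifier of the paper.

def base_predictions(x, routines):
    """Run each base-level routine (classifier) on the input object."""
    return [r(x) for r in routines]

def meta_combiner(preds):
    """Meta-level decision from base predictions; here, a majority vote."""
    return max(set(preds), key=preds.count)

# Three toy base-level classifiers over a scalar input.
routines = [lambda x: "A" if x > 0 else "B",
            lambda x: "A" if x > 1 else "B",
            lambda x: "A"]

label = meta_combiner(base_predictions(2, routines))  # all three agree on "A"
```

The point of the second level is that the combiner can learn which base routine to trust for which kind of input, rather than weighting them uniformly as a vote does.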

Paper Nr: 6
Title:

Interactive text generation for information retrieval

Authors:

Luis Rodríguez, Alejandro Revuelta, Ismael García-Varea and Enrique Vidal

Abstract: Interactive text generation aims to facilitate text generation in situations where typing is somehow constrained, and achieves a significant reduction in typing effort in most tasks. Natural-language interfaces for information retrieval are a good scenario in which to include this kind of assistance, improving system usability and providing considerable help with constrained input interfaces. An initial proposal is presented here, along with an experimental framework to assess its appropriateness.

Paper Nr: 7
Title:

Feature Transformation and Reduction for Text Classification

Authors:

Artur J. Ferreira and Mario Figueiredo

Abstract: Text classification is an important tool for many applications, in supervised, semi-supervised, and unsupervised scenarios. In order to be processed by machine learning methods, a text (document) is usually represented as a bag-of-words (BoW): a large feature vector (usually stored as floating-point values) whose entries represent the relative frequency of occurrence of each word/term in the document. Typically there are a large number of features, many of which may be uninformative for classification, hence the need for feature transformation, reduction, and selection. In this paper, we propose two efficient algorithms for feature transformation and reduction for BoW-like representations. The proposed algorithms rely on simple statistical analysis of the input patterns, exploiting the BoW and its binary version. The algorithms are evaluated with support vector machine (SVM) and AdaBoost classifiers on standard benchmark datasets. The experimental results show the adequacy of the reduced/transformed binary features for text classification, as well as an improvement in test-set error rate with the proposed methods.
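As a hedged sketch of the two representations the abstract mentions (the fixed vocabulary and relative-frequency normalization here are illustrative assumptions, not the paper's algorithms), a BoW vector and its binary version can be formed as:

```python
# Minimal sketch of a bag-of-words (BoW) vector and its binary version.
# Vocabulary handling and normalization are illustrative assumptions.
from collections import Counter

def bow_vector(doc, vocab):
    """Relative term frequencies of `doc` over a fixed vocabulary."""
    counts = Counter(doc.lower().split())
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in vocab]

def binary_bow(vec):
    """Binary BoW: 1 if the term occurs at all, 0 otherwise."""
    return [1 if x > 0 else 0 for x in vec]

vocab = ["text", "classification", "feature", "vector"]
v = bow_vector("Text classification maps each text to a feature vector", vocab)
b = binary_bow(v)  # [1, 1, 1, 1]: every vocabulary term occurs
```

The binary version discards frequency magnitudes but keeps occurrence patterns, which is the information the paper's reduced/transformed features build on.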

Paper Nr: 8
Title:

The GiDOC prototype

Authors:

Nicolás Serrano Martínez Santos, Lionel Tarazon, Daniel Perez, Oriol Ramos Terrades and Alfons Juan

Abstract: Transcription of handwritten text in (old) documents is an important, time-consuming task for digital libraries. In this paper, an efficient interactive-predictive transcription prototype called GIDOC (Gimp-based Interactive transcription of old text DOCuments) is presented. GIDOC is a first attempt to provide integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. It is based on GIMP and uses advanced techniques and tools for language and handwritten text modelling. Results are given on a real transcription task on a 764-page Spanish manuscript from 1891.

Paper Nr: 9
Title:

The Impact of Pre-Processing on the Classification of MEDLINE Documents

Authors:

Rui Camacho, Eugénio Oliveira, Carlos Adriano Oliveira Gonçalves and Célia Talma Gonçalves

Abstract: The amount of information available in the MEDLINE database makes it very hard for a researcher to retrieve a reasonable number of relevant documents using a simple query language interface. Automatic classification of documents may be a valuable technology for reducing the number of documents retrieved for each query. For this process, it is of capital importance to use appropriate pre-processing techniques on the data. The main goal of this study is to analyse the impact of pre-processing techniques on the text classification of MEDLINE documents. We have assessed the effect of combining different pre-processing techniques with several classification algorithms available in the WEKA tool, and we provide a numerical evaluation of the impact of the pre-processing techniques. Our experiments show that applying pruning, stemming and WordNet significantly reduces the number of attributes and improves the accuracy of the results.
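As an illustration of the kind of pre-processing the abstract evaluates, the sketch below combines a toy suffix-stripping stemmer with document-frequency pruning; both the stemmer and the pruning threshold are illustrative assumptions (the paper relies on established tools such as WEKA and WordNet):

```python
# Hedged sketch of pre-processing by stemming plus pruning of rare terms.
# The crude stemmer and the min_df threshold are illustrative assumptions.
from collections import Counter

def crude_stem(word):
    """A toy stemmer: strip a few common English suffixes."""
    for suf in ("ing", "ed", "es", "s"):
        if word.endswith(suf) and len(word) > len(suf) + 2:
            return word[: -len(suf)]
    return word

def preprocess(docs, min_df=2):
    """Stem every token, then prune terms below a document frequency."""
    stemmed = [[crude_stem(w) for w in d.lower().split()] for d in docs]
    df = Counter(t for d in stemmed for t in set(d))
    vocab = sorted(t for t, n in df.items() if n >= min_df)
    return stemmed, vocab

docs = ["parsing parsed texts", "parsing more texts", "unrelated words"]
_, vocab = preprocess(docs)  # "parsing"/"parsed" collapse to one stem
```

Stemming merges inflected variants into one attribute and pruning drops terms seen in too few documents, which is how these steps shrink the attribute set as the abstract reports.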

Paper Nr: 10
Title:

Personal Identification and Authentication Based on One-Lead ECG by using Ziv-Merhav Cross Parsing

Authors:

David Pereira Coutinho, Ana Fred and Mario Figueiredo

Abstract: In this paper, we propose a new data-compression-based ECG biometric method for personal identification and authentication. The ECG is an emerging biometric that does not need liveness verification. There is strong evidence that ECG signals contain sufficient discriminative information to allow the identification of individuals from a large population. Most approaches rely on ECG data and on fiducial points of different parts of the heartbeat waveform. However, non-fiducial approaches have recently proved to be effective as well, and have the advantage of not relying critically on the accurate extraction of fiducial points. We propose a non-fiducial method based on the Ziv-Merhav cross parsing algorithm for symbol sequences (strings). Our method uses a string similarity measure obtained with a data compression algorithm. We present results on real data (one-lead ECG acquired during a concentration task from 19 healthy individuals), on which our approach achieves a 100% subject identification rate and an average equal error rate of 1.1% on the authentication task.
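For illustration, a greedy sequential cross parsing in the spirit of Ziv-Merhav can be sketched as follows; the paper's exact similarity measure is not reproduced here, and the normalization shown is an assumption:

```python
# Hedged sketch of sequential cross parsing in the spirit of Ziv-Merhav:
# x is parsed into phrases, each the longest substring found anywhere in y
# (plus one extra symbol). The normalization in cross_parse_dissimilarity
# is an illustrative assumption, not the paper's measure.

def cross_parse_count(x, y):
    """Number of phrases when x is greedily parsed with respect to y."""
    i, phrases = 0, 0
    while i < len(x):
        j = i + 1
        while j <= len(x) and x[i:j] in y:  # extend the current phrase
            j += 1
        i = j  # consume the longest match plus one unmatched symbol
        phrases += 1
    return phrases

def cross_parse_dissimilarity(x, y):
    """Fewer phrases -> x is well described by y -> lower dissimilarity."""
    return cross_parse_count(x, y) / len(x)
```

The intuition is compression-based: if heartbeat sequence x comes from the same subject as reference y, long stretches of x reappear in y, so x parses into few phrases and the dissimilarity is small.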

Paper Nr: 11
Title:

Pertinent Parameters Selection for Processing of Short Amino Acid Sequences

Authors:

Zbigniew Szymanski, Stanislaw Jankowski, Marek Dwulit, Joanna Chodzynska and Lucjan Wyrwicz

Abstract: The paper describes a Least Squares Support Vector Machine (LS-SVM) classifier of short amino acid sequences for the recognition of kinase-specific phosphorylation sites. The sequences are represented by strings of 17 characters, each character denoting one amino acid. The data contain sequences reacting with 6 enzymes: PKA, PKB, PKC, CDK, CK2 and MAPK. To enable classification of such data by the LS-SVM classifier, it is necessary to map the symbolic data into the real-number domain and to perform pertinent feature selection. The presented method uses the AAindex (amino acid index), a set of values representing various physicochemical and biological properties of amino acids; each symbol of the sequence is substituted by 193 values. A feature selection procedure is then applied, which uses a correlation ranking formula and Gram-Schmidt orthogonalization. Selecting the 3-17 most pertinent features out of 3281 enabled successful classification by the LS-SVM.
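The selection scheme described above can be sketched as an orthogonal forward selection: rank features by a correlation-style score against the target, pick the best, then Gram-Schmidt-orthogonalize the remaining features against it so already-captured information is not counted twice. The ranking formula below is an illustrative assumption, not necessarily the paper's exact one:

```python
# Hedged sketch of correlation-ranking feature selection with Gram-Schmidt
# deflation. The squared-projection score is an illustrative assumption.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rank_and_deflate(features, target, k):
    """Select k feature indices: at each step pick the feature most
    correlated with the target, then remove its component (Gram-Schmidt)
    from all remaining features."""
    feats = {i: f[:] for i, f in enumerate(features)}
    selected = []
    for _ in range(k):
        # correlation-style ranking: normalized squared projection
        best = max(feats, key=lambda i: dot(feats[i], target) ** 2
                   / (dot(feats[i], feats[i]) or 1e-12))
        w = feats.pop(best)
        selected.append(best)
        ww = dot(w, w) or 1e-12
        for i, f in feats.items():  # deflate remaining features against w
            c = dot(f, w) / ww
            feats[i] = [a - c * b for a, b in zip(f, w)]
    return selected

# The third feature matches the target best and is selected first.
picked = rank_and_deflate([[1, 0, 0], [0, 1, 0], [1, 1, 0]], [1, 1, 0], 1)
```

Deflating rather than re-ranking raw features is what keeps the selected set non-redundant, which matters when 3281 features must be cut down to 3-17.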