ICEIS 2020 Abstracts


Area 1 - Databases and Information Systems Integration

Full Papers
Paper Nr: 115
Title:

Anonymisation and Compliance to Protection Data: Impacts and Challenges into Big Data

Authors:

Artur P. Carvalho, Edna D. Canedo, Fernanda P. Carvalho and Pedro P. Carvalho

Abstract: Nowadays, in the age of Big Data, we see a growing concern about privacy. Different countries have enacted laws and guidelines to ensure better use of data, especially personal data. Both the General Data Protection Regulation (GDPR) in the EU and the Brazilian General Data Protection Law (LGPD) outline anonymisation techniques as a tool to ensure the safe use of such data. However, the expectations placed on this tool must be reconsidered according to the risks and limits of its use. We discuss whether anonymisation, used exclusively, can meet the demands of Big Data and, at the same time, those of privacy and security. We conclude that, even when anonymised, the massive use of data must respect good governance practices to preserve personal privacy. In this sense, we point out some guidelines for the use of anonymised data in the context of Big Data.

Paper Nr: 134
Title:

OntoDIVE: An Ontology for Representing Data Science Initiatives upon Big Data Technologies

Authors:

Vitor A. Pinto and Fernando S. Parreiras

Abstract: Intending to become more and more data-driven, companies are leveraging data science upon big data initiatives. However, to reach a better cost-benefit ratio, it is important for companies to understand all aspects involved in such initiatives. The main goal of this research is to provide an ontology that accurately describes data science upon big data. The following research question was addressed: "How can we represent an initiative of data science upon big data?" To answer this question, we followed the Knowledge Meta Process guidelines from the Ontology Engineering Methodology to build an artifact capable of explaining the aspects involved in such initiatives. As a result, this study presents OntoDIVE, an ontology that explains the interactions between people, processes and technologies in a data science initiative upon big data. This study contributes to leveraging data science upon big data initiatives by integrating people, processes and technologies. It confirms the interdisciplinary nature of data science initiatives and enables organizations to draw parallels between data science results in a particular domain and their own domain. It also helps organizations to choose frameworks and technologies based on technical considerations alone.

Paper Nr: 148
Title:

SSTR: Set Similarity Join over Stream Data

Authors:

Lucas Pacífico and Leonardo A. Ribeiro

Abstract: In modern application scenarios, large volumes of data are continuously generated over time at high speeds. Delivering timely analysis results from such massive streams of data imposes challenging requirements on current systems. Even worse, similarity matching, which is computationally much more expensive than simple equality comparisons, can be needed owing to data inconsistencies. In this context, this paper presents SSTR, a novel similarity join algorithm for streams of sets. We adopt the concept of temporal similarity and exploit its properties to improve efficiency and reduce memory usage. We provide an extensive experimental study on several synthetic as well as real-world datasets. Our results show that the proposed techniques significantly improve scalability and lead to substantial performance gains in most settings.

Paper Nr: 267
Title:

Discovering of a Conceptual Model from a NoSQL Database

Authors:

Fatma Abdelhedi, Amal A. Brahim, Rabah T. Ferhat and Gilles Zurfluh

Abstract: NoSQL systems have proven effective to handle Big Data. Most of these systems are schema-less, which means that the database does not have a fixed data structure. This property offers an undeniable flexibility, allowing the user to add new data without making any changes to the data model. However, the lack of an explicit data model makes it difficult to express queries on the database. Therefore, users (developers and decision-makers) still need the database data model to know how data are stored and related, and then to write their queries. In a previous work, we proposed a process to extract a physical model from a NoSQL database. In this article, we extend this process to the extraction of a conceptual model that provides an element of semantic knowledge close to human understanding. To do this, we use the Model Driven Architecture (MDA), which provides a formal framework for automatic model transformation. From a NoSQL physical model, we propose formal transformation rules to generate a conceptual model in the form of a UML class diagram. An experimental evaluation of the extraction process was carried out on a medical application.

Short Papers
Paper Nr: 27
Title:

Objective Measures Ensemble in Associative Classifiers

Authors:

Maicon Dall’Agnol and Veronica Oliveira de Carvalho

Abstract: Associative classifiers (ACs) are predictive models built based on association rules (ARs). Model construction occurs in steps, one of them aimed at sorting and pruning a set of rules. Regarding ordering, objective measures (OMs) are usually used to rank the rules. This work focuses precisely on the sorting step. In the proposals found in the literature, the OMs are generally explored separately. The only work that explores the aggregation of measures in the context of ACs is (Silva and Carvalho, 2018), where multiple OMs are considered at the same time. To do so, (Silva and Carvalho, 2018) use the aggregation solution proposed by (Bouker et al., 2014). However, although there are many works in the context of ARs that investigate the aggregate use of OMs, all of them have some bias. Thus, this work aims to evaluate the aggregation of measures in the context of ACs from another perspective, that of an ensemble of classifiers.

Paper Nr: 86
Title:

An FCA-based Approach to Direct Edges in a Causal Bayesian Network: A Pilot Study using a Surgery Data Set

Authors:

Walisson Ferreira, Mark Song and Luis Zarate

Abstract: One of the problems during the construction of a Causal Bayesian Network based on constraint algorithms occurs when it is not possible to orient edges between nodes due to Markov equivalence. In this scenario, this article presents the use of Formal Concept Analysis (FCA), especially attribute implication, as an alternative to support the definition of the direction of the edges. To do this, a Bayesian structure-learning algorithm (PC) and FCA were applied to a data set containing 12 attributes and 5,473 records of surgeries performed in Belo Horizonte - Brazil. According to the results, although attribute implication does not necessarily mean causality, the implication rules were useful in defining edge orientation in the Bayesian network learned by the PC algorithm. The results of FCA were validated through intervention using do-calculus and by an expert in the domain. Therefore, as a result, this paper presents a heuristic to direct edges between nodes when the direction is unknown.

Paper Nr: 94
Title:

Quality Evaluation for Documental Big Data

Authors:

Mariagrazia Fugini and Jacopo Finocchi

Abstract: This paper presents the analysis of quality regarding a textual Big Data Analytics approach developed within a project dealing with a platform for Big Data shared among three companies. In particular, the paper focuses on documental Big Data. In the context of the project, the work presented here deals with the extraction of knowledge from document and process data in a Big Data environment, and focuses on the quality of processed data. Performance indexes, such as correctness, precision, and efficiency parameters, are used to evaluate the quality of the extraction and classification process. The novelty of the approach is that no document types are predefined; rather, after manual processing of new types, datasets are continuously set up as training sets for a Machine Learning step that learns the new document types. The paper presents the document management architecture and discusses the main results.

Paper Nr: 114
Title:

Empirical Study about Class Change Proneness Prediction using Software Metrics and Code Smells

Authors:

Antonio F. Martins, Cristiano Melo, José M. Monteiro and Javam C. Machado

Abstract: During the lifecycle of software, maintenance has been considered one of the most complex and costly phases in terms of resources and costs. In addition, software evolves in response to the needs and demands of the ever-changing world and thus becomes increasingly complex. In this scenario, an approach that has been widely used to rationalize resources and costs during the evolution of object-oriented software is to predict change-prone classes. A change-prone class may indicate a poor-quality part of the software that needs to be refactored. Recently, some strategies for predicting change-prone classes, based on the use of software metrics and code smells, have been proposed. In this paper, we present an empirical study on the performance of eight machine learning techniques used to predict change-prone classes. Three different training scenarios were investigated: object-oriented metrics, code smells, and object-oriented metrics and code smells combined. To perform the experiments, we built a data set containing eight object-oriented metrics and 32 types of code smells, extracted from the source code of a web application developed between 2013 and 2018 over eight releases. The machine learning algorithms that presented the best results were RF, LGBM, and LR. The training scenario that presented the best results was the combination of code smells and object-oriented metrics.

Paper Nr: 124
Title:

Open Data Analytic Querying using a Relation-Free API

Authors:

Lucas F. de Oliveira, Alessandro Elias, Fabiola Santore, Diego Pasqualin, Luis E. Bona, Marcos Sunyé and Marcos Didonet Del Fabro

Abstract: The large availability of tabular Open Data sources with hundreds of attributes and relations makes query development a difficult task, where analytic queries are common. When writing such queries, often called SPJG (Select-Project-Join-GroupBy), it is necessary to understand a data model and to write JOIN operations. The most common approach is to use business intelligence frameworks, or recent solutions based on keywords or examples. However, they require the use of specific applications, and there is a lack of support for web-based APIs. We present a solution that eases the task of query development for tabular Open Data analytics through an API, using a simplified query representation, called Relation-Free Query, in which it is not allowed to specify the data relations and, consequently, the joins over them. We define a single virtual schema that captures the database structure, which allows the use of relation-free queries on existing DBMSs. The concrete queries are exposed by a RESTful API and then translated into a database query language using known query generation solutions. The API is available as a microservice. We present a case study describing the solution, using a real-world scenario to query an integrated database built from several Brazilian open databases with hundreds of attributes.

Paper Nr: 135
Title:

MedBlock: A Healthcare Application based on Blockchain and Smart Contracts

Authors:

Maria F. Ribeiro and André Vasconcelos

Abstract: Nowadays, healthcare data is generated every day by both medical institutions and individuals. Storing and sharing such large amounts of data is expensive and challenging, as well as critical. This challenge leads to a lack of interoperability between health institutions and, consequently, to a patient's health data being scattered across numerous systems. Blockchain emerges as a solution to these problems. It consists of a distributed database where records are saved with cryptographic encryption, making them immutable, transparent and decentralized. There are multiple blockchain-based healthcare applications being actively developed to solve the problem of interoperability between different health providers. The main objective of this work is to analyse and survey blockchain technology and to study smart contract development for healthcare applications. Since this research is developed in the context of MedClick – a web platform that aims to let patients store their health data and interact with all the medical institutions they choose, in one single site – an additional goal of this paper is to define the architecture of MedBlock, the MedClick platform based on blockchain and smart contracts.

Paper Nr: 149
Title:

Risk Analysis Techniques for ERP Projects based on Seasonal Uncertainty Events

Authors:

Paulo Mannini, Edmir V. Prado, Alexandre Grotta and Leandro Z. Rezende

Abstract: Risk management is fundamental to increasing the success rate of Enterprise Resource Planning (ERP) projects, in order to plan for, prevent, and react to risks and uncertainties. However, based on the literature review, we identified only a few studies relating seasonal uncertainty events (SUE) to ERP projects. Given this context, the objective of this research is to analyse the most appropriate risk assessment techniques for ERP projects based on SUE. To achieve this goal, we performed a Systematic Literature Review (SLR) and applied the Delphi technique with project management professionals and enterprise directors. According to the SLR results, we identified 16 techniques that are more suitable to deal with SUE in ERP projects. Based on the Delphi panels' perspective, six techniques were pointed out as the most suitable for these projects. In addition, we identified that not all techniques described in the literature converged with the reality of the researched context. These findings are relevant for both academia and industry in addressing SUE in ERP projects.

Paper Nr: 153
Title:

Publishing and Consuming Semantic Views for Construction of Knowledge Graphs

Authors:

Narciso Arruda, Amanda P. Venceslau, Matheus Mayron, V. P. Vidal and V. M. Pequeno

Abstract: The main goal of semantic integration is to provide a virtual semantic view that is semantically connected to data so that applications can have integrated access to data sources through the virtual Knowledge Graph. A semantic view can be published on a semantic portal to make it reusable for building Knowledge Graphs for different applications. This paper takes the first step towards publishing a semantic view on a semantic portal. It makes three main contributions. First, we introduce a vocabulary for specifying semantic views. Second, we introduce a vocabulary for the specification and quality assessment of Knowledge Graphs. Third, we describe an approach to automate the construction of a high-quality Knowledge Graph by reusing a semantic view.

Paper Nr: 155
Title:

A Linked Data-based Service for Integrating Heterogeneous Data Sources in Smart Cities

Authors:

João G. Almeida, Jorge Silva, Thais Batista and Everton Cavalcante

Abstract: The evolution and development of new technological solutions for smart cities have grown significantly in recent years. The smart city scenario encompasses large amounts of data from several devices and applications. This raises challenges related to data interoperability, including information sharing, receiving data from multiple sources (Web services, files, systems, etc.), and making them available to underlying smart city application development platforms. This paper presents Aquedücte, a service that converts data from external sources and files to the NGSI-LD protocol, enabling their use by applications relying on an NGSI-LD-based middleware. This paper describes the Aquedücte methodology used to: (i) extract data from heterogeneous data sources, (ii) enrich them according to the NGSI-LD data format using Linked Data along with ontologies, and (iii) publish them into an NGSI-LD-based middleware. The use of Aquedücte is also described in a real-world smart city scenario.

Paper Nr: 157
Title:

Towards Test-Driven Model Development in Production Systems Engineering

Authors:

Felix Rinker, Laura Waltersdorfer and Stefan Biffl

Abstract: The correct representation of discipline-specific and cross-disciplinary knowledge in manufacturing contexts is becoming more important due to inter-disciplinary dependencies and overall higher system complexity. However, domain experts seldom have sufficient technical and theoretical knowledge, or adequate tool support, for productive and effective model engineering and validation. Furthermore, increasing competition and faster product lifecycles require parallel collaborative engineering efforts from different workgroups. Thus, test-driven modeling, similar to test-driven software engineering, can support the model engineering process to produce high-quality meta and instance models by incorporating consistency and semantic checks during model engineering. We present a conceptual framework for model transformation with testing and debugging capabilities for production systems engineering use cases, supporting the modeling of discipline-specific AutomationML instance models. An exemplary workflow is presented and discussed. Debug output for the models is generated to support non-technical engineers in the error detection of discipline-specific models. User-friendly test definition is planned as future work.

Paper Nr: 162
Title:

When to Collect What? Optimizing Data Load via Process-driven Data Collection

Authors:

Johannes Lipp, Maximilian Rudack, Uwe Vroomen and Andreas Bührig-Polaczek

Abstract: Industry 4.0 and the Internet of Production lead to interconnected machines and an ever-increasing amount of available data. Due to resource limitations, mainly in network bandwidth, data scientists need to reduce the data collected from machines. The amount of data can currently be reduced in breadth (number of values) or depth (frequency/precision of values), both of which reduce the quality of subsequent analysis. In this paper, we propose an optimized data load via process-driven data collection. With our method, data providers can (i) split their production process into phases, (ii) for each phase precisely define what data to collect and how, and (iii) model transitions between phases via a data-driven method. This approach allows a complete focus on a certain part of the available machine data during one process phase, and a completely different focus in phases with different characteristics. Our preliminary results show a 39% reduction of the data load compared to less flexible interval- or event-based methods.

Paper Nr: 168
Title:

Web Services-based Report Generation System from Big Data in the Manufacturing Industry based on Agile Software Development

Authors:

Plueksaporn Kattiyawong and Sakchai Tangprasert

Abstract: In the manufacturing industry, there are many information technology (IT) systems and machines connected to different databases with complex big data. In this study, a web service system is developed using different programming languages, consisting of C#, JavaScript, HTML, CSS, Ext JS and structured query language (SQL). In addition to these programming languages, model-view-controller (MVC), a software design pattern, is also used for developing a database interface. The development of this system allows the web service system to search for reports that meet user needs and also provides a user interface (UI) for convenience and speed. In this study, the agile software development method is used in accordance with the Scrum framework, which consists of four steps: 1) product backlog creation, 2) sprint backlog creation, 3) sprint, i.e., system development, testing and test case writing, and 4) daily scrum. A sprint review is held to report the results of the unit test, functional test, integration test and user acceptance test (UAT). This sprint review enables the development of an appropriate and comprehensive system and results in collaboration and understanding among stakeholders, including technology adoption among users.

Paper Nr: 182
Title:

An ICT Platform for the Understanding of the User Behaviours towards EL-Vs

Authors:

Maria Krommyda, Richardos Drakoulis, Fay Misichroni, Nikolaos Tousert, Anna Antonakopoulou, Evangelia Portouli, Mandimby R. Rakotondravelona, Marwane El-Bekri, Djibrilla A. Kountche and Angelos Amditis

Abstract: Demonstrations of Electrified Light Vehicles (EL-Vs) are organised in six European cities to collect trip data and users' perceptions and experiences, in order to boost the market uptake of such vehicles by the EU-funded ELVITEN project. Aiming to handle the data flow from the various ICT tools operating in each city and to allow data analysis and data visualisation, a fully integrated ICT platform was designed, developed and deployed. The architecture and design of the platform, the modules of the system, as well as their relations and interactions, are presented in this paper. The platform has been designed to collect data generated by the services and tools in the demonstration cities, consolidate the information, and facilitate the retrieval, processing and visualisation of the collected information in a uniform and consistent way across cities and tools.

Paper Nr: 191
Title:

Understanding Query Interfaces: Automatic Extraction of Data from Domain-specific Deep Web based on Ontology

Authors:

Li Dong, Zhang Huan and Yu Zitong

Abstract: The resources of many Web-accessible databases, which constitute a very large portion of the structured data on the Web, are only available through query interfaces and are invisible to traditional search engines. Many methods that discover these resources automatically rely on the different structures of Web pages and various database design patterns. However, some semantic meanings and relations are ignored. Here we introduce a Web information retrieval system that obtains knowledge from multiple databases automatically by using the common ontology WordNet. Also, deep Web query results are post-processed based on domain ontology. That is, given an integrated interface, after inputting a query, our system offers an ordered list of data records to users. We have conducted an extensive experimental evaluation of the Web information retrieval system on real documents. Also, we test our system with hundreds of databases on different topics. Experiments show that our system has low cost and achieves high discovery accuracy across multiple databases.

Paper Nr: 218
Title:

Process Management Enhancement by using Image Mining Techniques: A Position Paper

Authors:

Myriel Fichtner, Stefan Schönig and Stefan Jablonski

Abstract: Business process modeling is a well-established method to define and visualize business processes. In complex processes, related process models may become large and hard to trace. To keep process models readable, process details are omitted. In other cases, process designers are not aware of which process steps should be modelled in detail. However, the input specification of some process steps or the order of internal sub-steps could have an impact on the success of the overall process. The most straightforward solution is to identify the cause of reduced process success in order to improve the process results. This can be challenging, especially in flexible process environments with multiple process participants. In this paper, we tackle this problem by recording image data of process executions and analyzing it with image mining techniques. We propose to redesign business process models considering the analysis results, to reach more effective and efficient process executions.

Paper Nr: 241
Title:

Adopting Formal Verification and Model-Based Testing Techniques for Validating a Blockchain-based Healthcare Records Sharing System

Authors:

Rateb Jabbar, Moez Krichen, Noora Fetais and Kamel Barkaoui

Abstract: The Electronic Health Records (EHR) sharing system is the modern tool for delivering efficient healthcare to patients. Its functions include tracking of therapies, monitoring of treatment effectiveness, prediction of outcomes throughout the patient’s lifespan, and detection of human errors. For all the stakeholders, integrity and interoperability of the care continuum are paramount. Yet, its implementation is challenging due to the heterogeneity of healthcare information systems, security threats, and the enormousness of EHR data. To overcome these challenges, this work proposes BiiMED: a Blockchain framework for Enhancing Data Interoperability and Integrity regarding EHR sharing. This solution is innovative as it contains an access management system allowing the exchange of EHRs between different medical providers and a decentralized Trusted Third Party Auditor (TTPA) for ensuring data integrity. This paper also discusses two validation techniques for enhancing the quality and correctness of the proposed solution: Formal Verification and Model-Based Testing. The first checks the correctness of a mathematical model describing the behavior of the given system prior to implementation. The second derives test suites from the adopted model, executes them, and assesses the correctness of the results.

Paper Nr: 263
Title:

Modern Federated Database Systems: An Overview

Authors:

Leonardo G. Azevedo, Elton S. Soares, Renan Souza and Marcio F. Moreno

Abstract: Usually, modern applications manipulate datasets with diverse models, usages, and storages. “One size fits all” approaches are not sufficient for heterogeneous data, storages, and schemas. The rise of new kinds of data stores and processing, like NoSQL data stores, distributed file systems, and new data processing frameworks, brought new possibilities to meet this scenario’s requirements. However, semantic, schema and storage heterogeneity, autonomy, and distributed processing are still among the main concerns when building data-driven applications. This work surveys the literature, aiming to give an overview of the state of the art of modern federated database systems. It presents the background, characterizes existing tools, depicts guidelines one should follow when creating solutions, and points out research challenges to consider in future work. This work provides fundamentals for researchers and practitioners in the area.

Paper Nr: 264
Title:

“That Sweet Spot, Where Technology Is Just Mature Enough to Be Stable”: A Case Study on the Right Timing for Cloud ERP and Blockchain Adoption

Authors:

Marc A. Schmick and Anke Schüll

Abstract: The adoption of disruptive technologies like Cloud ERP and Blockchain has been a topic of ongoing research. Oriented on the TOE framework and informed by recent literature on the adoption of these technologies, we conducted a qualitative analysis within an international company on the threshold of integrating Cloud ERP and on the point of discussing the potential of Blockchain. Interviews revealed most of the TOE factors as equally important for both technologies, thus underlining the importance of the TOE framework and its explanatory power in describing the contextual factors of technology adoption. At the same time, they gave evidence of a deficiency in the explanatory power of this model: the right timing for technology adoption must be explained by more dynamic aspects. The interviews pointed to the maturity of the technology and the market demand as paramount for the timing decision. This paper contributes to the body of knowledge by expanding research on the timing decision of technology adoption and recommends improving the explanatory power of the TOE framework by taking the maturity level of the technology into account. As the evidence given by a single case is limited, further research is required to confirm the results.

Paper Nr: 2
Title:

Situation-aware Building Information Models for Next Generation Building Management Systems

Authors:

Ovidiu Noran, Peter Bernus and Sorin Caluianu

Abstract: Technical advances in Information and Communication Technology have enabled the collection and storage of large amounts of data, raising hopes of improving asset decision-making and related building management support systems. However, the gap between the required decision-making knowledge and the actual useful information provided by current technologies appears to increase, rather than contract. Thus, often the multitude of patterns afforded by current data analytics techniques does not deliver a set of scenarios prone to effective decision-making. This paper advocates a decision analytics solution featuring the use of Situated Logic to create ‘narratives’ providing adequate meaning to data analytics results, and the use of Channel Theory to support adequate situational awareness. This approach is also analysed in the context of a Building Management System-of-Systems paradigm, highly relevant to the emerging complex Clusters of Intelligent Buildings within Smart Cities, featuring collaborative decision-making centres and their associated decision support systems.

Paper Nr: 39
Title:

Public Key Infrastructure Issues for Enterprise Level Security

Authors:

Kevin Foltz and William R. Simpson

Abstract: A public key infrastructure (PKI) provides a way to manage identities within an enterprise. Users are provided public/private key pairs, and trusted certification authorities issue credentials binding a user name to the associated public key for that user. This enables security functions by users within the enterprise, such as authentication, signature creation and validation, encryption, and decryption. However, the enterprise often interacts with partner enterprises and the open web, which may use different PKIs. Mobile devices do not easily operate with hardware-based PKI tokens such as smartcards. Standard digital signatures lack timing information such as validity or expiration. This paper examines some of the security challenges related to PKI deployment in the context of Enterprise Level Security (ELS), an enterprise solution for a high security environment.

Paper Nr: 61
Title:

A Data Analytics Approach to Online Tourists’ Reviews Evaluation

Authors:

Evripides Christodoulou, Andreas Gregoriades and Savvas Papapanayides

Abstract: This paper utilizes online data from tourists’ reviews on TripAdvisor to identify patterns in the sentiment and topics discussed by tourists who visit Cyprus, along with an investigation of the effect of tourist culture and purchasing power on review polarity, using logistic regression. The analysis applies natural language processing via the LDA technique and Naïve Bayes sentiment analysis. For the data collection, custom-made Python scripts were used. Ordinal logistic regression is used to identify differences among the types of tourists visiting Cyprus, according to culture and purchasing power.

Paper Nr: 79
Title:

Efficient Representation of Very Large Linked Datasets as Graphs

Authors:

Maria Krommyda, Verena Kantere and Yannis Vassiliou

Abstract: Large linked datasets are nowadays available on many scientific topics of interest and offer invaluable knowledge. These datasets are of interest to a wide audience, including people with limited or no knowledge of the Semantic Web, who want to explore and analyse this information in a user-friendly way. Aiming to support such usage, systems have been developed to support this exploration; however, they impose many limitations, as they provide users access to only a limited part of the input dataset, either by aggregating information or by exploiting data formats such as hierarchies. As more linked datasets become available and more people are interested in exploring them, it is imperative to provide a user-friendly way to access and explore diverse and very large datasets in an intuitive way, as graphs. We present here an off-line pre-processing technique, divided into three phases, that can transform any linked dataset, independently of its size and characteristics, into one continuous graph in two-dimensional space. We store the spatial information of the graph, add the needed indices, and provide the graphical information through a dedicated API to support the exploration of the information. Finally, we conduct an experimental analysis to show that our technique can process and represent as one continuous graph large and diverse datasets.

Paper Nr: 89
Title:

Mixing Heterogeneous Authentication and Authorization Infrastructures through Proxies

Authors:

Angelo Furfaro and Giuseppe de Marco

Abstract: The ever-increasing diffusion of digital services offered by institutional organizations and the need for interoperability among them have made the role of Authentication and Authorization Infrastructures (AAIs) crucial. Numerous formats and technologies for data exchange have been developed in recent years, and some of them have become very popular. This paper discusses the main challenges an organization has to face in making its services seamlessly available to end users and client systems across multiple AAIs. An effective solution, relying on Authentication and Authorization Proxies, like SATOSA, which allows the interoperability of hybrid types of service providers and consumers, is described. In particular, a scenario is considered which envisages the support of heterogeneous (public) digital identity technologies for access to digital services on a university campus.

Paper Nr: 136
Title:

Hybrid Shallow Learning and Deep Learning for Feature Extraction and Image Retrieval

Authors:

Hanen Karamti, Hadil Shaiba and Abeer M. Mahmoud

Abstract: In the last decade, several works have been developed to extract global and/or local features from images. However, the performance of image retrieval still suffers from the problem of the semantic interpretation of the visual content of images (the semantic gap). Recently, deep convolutional neural networks (DCNNs) have shown excellent performance in feature extraction for different fields, like image retrieval, compared to traditional techniques. Although the Fuzzy C-Means (FCM) clustering algorithm is a shallow learning method, it has competitive performance in the clustering field. In this paper, we present a new method for feature extraction combining a DCNN and Fuzzy C-Means, where the DCNN gives a compact representation of images and FCM clusters the features and speeds up real-time search. The proposed method is evaluated against other methods in the literature on two benchmark datasets, Oxford5K and Inria Holidays, where it achieves 83% and 86%, respectively.

Paper Nr: 151
Title:

LEDAC: Optimizing the Performance of the Automatic Classification of Legal Documents through the Use of Word Embeddings

Authors:

Víctor Labrador, Álvaro Peiró, Ángel L. Garrido and Eduardo Mena

Abstract: Nowadays, the number of legal documents processed daily prevents the work from being done manually. One of the most relevant processes is the classification of these documents, not only because of the importance of the task itself, but also because it is the starting point for other important tasks such as data search or information extraction. In spite of technological advances, the task of automatic classification is still performed by specialized staff, which is expensive, time-consuming, and subject to human error. In the best case, it is possible to find systems with statistical approaches whose benefits in terms of efficacy and efficiency are limited. Moreover, the presence of overlapping elements in legal documents, such as stamps or signatures, distorts the text and hinders these automatic tasks. In this work, we present an approach for performing automatic classification tasks over these legal documents which exploits the semantic properties of word embeddings. We have implemented our approach so that it is simple to address different types of documents with little effort. Experiments with real data show promising results, greatly increasing the productivity of systems based on other approaches.

Paper Nr: 152
Title:

uAIS: An Experience of Increasing Performance of NLP Information Extraction Tasks from Legal Documents in an Electronic Document Management System

Authors:

Marcos Ruiz, Cristian Román, Ángel L. Garrido and Eduardo Mena

Abstract: Nowadays, the huge number of documents managed through document management systems makes their manual processing practically impossible. That is why the use of natural language processing subsystems that help to perform certain tasks is becoming essential for many commercial systems. Although their use is gradually extending to all levels, this type of subsystem has high CPU and memory requirements that can harm the entire system it intends to assist. In this work, we propose and study an architecture based on microservices and message brokers which improves the performance of these NLP subsystems. We have implemented our approach on a real document management system, which performs intensive language analysis processes on large legal documents. Experiments show promising results, greatly increasing the productivity of systems based on other approaches.

Paper Nr: 229
Title:

Comparison of Search Servers for Use with Digital Repositories

Authors:

Aluísio S. Gonçalves and Marcos S. Sunye

Abstract: Search is a fundamental operation of storage systems, such as digital repositories, and its performance and quality directly impact the user’s opinion of a system. This paper evaluates two different search engines, Apache Solr and Elasticsearch, for the same repository and reports the pros and cons of each for that specific use case. In particular, we identify that although Elasticsearch consumes fewer resources and responds to most queries more quickly, it may also take longer to respond in some scenarios.

Paper Nr: 253
Title:

Investigating the Performance of Moodle Database Queries in Cloud Environments

Authors:

Karina Wiechork and Andrea S. Charão

Abstract: Several computing services are being migrated to cloud environments, where resources are available on demand and billing is based on usage. Databases in the cloud are increasingly popular; however, their performance is a key indicator that must be known before deciding to migrate a system to a cloud environment. In this article, we present a preliminary investigation of the performance of database queries in Moodle, a popular Learning Management System, installed on cloud environments from Amazon Web Services and Google Cloud Platform. Experiments and performance analysis were based on the Pgbench and Sysbench benchmarks and a Moodle benchmark plugin. We collected data and compared it with the results obtained on a local computer. In the configurations we tested, the results show that the Moodle database performed better at Amazon’s provider than at Google’s. We made our data and scripts available to favour reproducibility and to support decision makers on the migration of a Moodle instance to a cloud service provider.

Area 2 - Artificial Intelligence and Decision Support Systems

Full Papers
Paper Nr: 12
Title:

A Flexible Model for Enterprise Document Capturing Automation

Authors:

Juris Rāts, Inguna Pede, Tatjana Rubina and Gatis Vītols

Abstract: The aim of this research is to create and evaluate a flexible model for document capturing that employs machine learning to classify documents, feeding them with values for one or more metadata items. Documents and classification metadata fields typical of Enterprise Content Management (ECM) systems are used in the research. The model comprises the selection of classification methods, the configuration of the methods’ hyperparameters, and the configuration of a number of other learning-related parameters. The model provides users with visual means to analyse the classification outcomes and to tune the further steps of the learning. A couple of challenges are addressed along the way, such as informal and eventually changing criteria for document classification, and imbalanced data sets.

Paper Nr: 50
Title:

Predicting e-Mail Response Time in Corporate Customer Support

Authors:

Anton Borg, Jim Ahlstrand and Martin Boldt

Abstract: Maintaining a high degree of customer satisfaction is important for any corporation, and it involves the customer support process. One important factor in this work is to keep customers’ wait time for a reply at levels that are acceptable to them. In this study we investigate to what extent models trained by the Random Forest learning algorithm can be used to predict e-mail time-to-respond for both customer support agents and customers. The data set includes 51,682 customer support e-mails on various topics from a large telecom operator. The results indicate that it is possible to predict the time-to-respond for both customer support agents (AUC of 0.90) and customers (AUC of 0.85). These results indicate that the approach can be used to improve communication efficiency, e.g. by anticipating staffing needs in customer support, but also by indicating when a response is expected to take longer than usual.

Paper Nr: 54
Title:

OPTIC: A Deep Neural Network Approach for Entity Linking using Word and Knowledge Embeddings

Authors:

Italo L. Oliveira, Diego Moussallem, Luís F. Garcia and Renato Fileto

Abstract: Entity Linking (EL) for microblog posts is still a challenge because of their usually informal language and limited textual context. Most current EL approaches for microblog posts expand each post’s context by considering related posts, user interest information, spatial data, and temporal data. Thus, these approaches can be too invasive, compromising user privacy, which hinders data sharing and experimental reproducibility. Moreover, most of these approaches employ graph-based methods instead of state-of-the-art embedding-based ones. This paper proposes a knowledge-intensive EL approach for microblog posts called OPTIC. It relies on jointly trained word and knowledge embeddings to represent the contexts given by the semantics of words and entity candidates for mentions found in the posts. These embedded semantic contexts feed a deep neural network that exploits semantic coherence along with the popularity of the entity candidates for their disambiguation. Experiments using the benchmark system GERBIL show that OPTIC outperforms most of the approaches on the NEEL challenge 2016 dataset.

Paper Nr: 55
Title:

Solution of a Practical Pallet Building Problem with Visibility and Contiguity Constraints

Authors:

Manuel Iori, Marco Locatelli, Mayron O. Moreira and Tiago Silveira

Abstract: We study a pallet building problem that originates from a case study at a company that produces robotized pallet building systems. The problem takes into account well-known constraints, such as rotation and stackability, and we introduce two practical constraints, named visibility and contiguity, between items of the same type. We formalize the problem and propose heuristic algorithms to solve it, using a strategy that first creates 2D layers and then creates the final 3D pallets. The proposed heuristic is based mainly on the Extreme Points heuristic, which is tailored to choose feasible positions to pack items during the construction of the solution. Besides that, we adapt our proposed heuristic using other basic heuristics from the literature, considering different constraints. The performance of the algorithms is assessed through extensive computational tests on real-world instances, and the obtained results show that the proposed heuristics are able to create compact packings in a very short time.

Paper Nr: 62
Title:

A Decision Support System for a Multi-trip Vehicle Routing Problem with Trucks and Drivers Scheduling

Authors:

Nilson M. Mendes and Manuel Iori

Abstract: Many real-world transportation problems can be modeled as variants of the well-known vehicle routing problem (VRP), where a fleet of vehicles based at a central depot is used to deliver freight to clients at a minimum cost. Frequently, the problems defined in the VRP literature and the corresponding solution algorithms do not capture all the problem features faced by companies in their everyday activity, and further flexibility is needed during the decision process to make adjustments on the fly. In this paper, we present a decision support system developed for an Italian pharmaceutical distribution company to deal with a multi-trip VRP characterized by additional constraints and truck and driver scheduling. The problem is solved in the software with a two-phase algorithm: the first phase consists of an Iterated Local Search metaheuristic that defines the vehicle routes, whereas the second phase invokes a mathematical model to assign trucks and drivers to the routes. Between the two phases, the software allows changes in the solution to better fit the company requirements. Computational results prove the effectiveness of the proposed method.

Paper Nr: 68
Title:

Globo Face Stream: A System for Video Meta-data Generation in an Entertainment Industry Setting

Authors:

Rafael Pena, Felipe A. Ferreira, Frederico Caroli, Luiz Schirmer and Hélio Lopes

Abstract: The amount of recorded video in the world is increasing rapidly, due not only to human interests and habits regarding this kind of media, but also to the diversity of devices used to create it. However, there is a lack of information about video content, because generating video meta-data is complex: it requires too much time to be performed by humans, and, from the technology perspective, it is not easy to overcome the obstacles posed by the huge amount and diversity of video frames. This manuscript proposes an automated face recognition system, called Globo Face Stream, to detect soap opera characters within videos. It was developed to recognize characters in order to enrich video meta-data. It combines standard computer vision techniques to improve accuracy by processing the output data of existing models in a complementary manner. The model performed accurately on a real-life dataset from a large media company.

Paper Nr: 82
Title:

Estimating Problem Instance Difficulty

Authors:

Hermann Kaindl, Ralph Hoch and Roman Popp

Abstract: Even though estimating the difficulty of concrete problem instances really matters when solving them, e.g., through case-based reasoning (CBR) or heuristic search, there is not much theory available. We have faced this problem in a prototypical real-world application of CBR for the reuse of hardware/software interfaces (HSIs) in automotive systems, where problem adaptation is done through heuristic search. Hence, this work compares different approaches to estimating problem instance difficulty (similarity metrics, heuristic functions). It also shows that even measuring problem instance difficulty depends on the ground truth available and used. A few different approaches are investigated regarding how they statistically correlate. Overall, this paper compares different approaches to both estimating and measuring problem instance difficulty with respect to CBR and heuristic search. In addition to the given real-world domain, experiments were made using sliding-tile puzzles. As a consequence, this paper points out that admissible heuristic functions h guiding search (normally used for estimating the minimal cost to a given goal state or condition) may be used for retrieving cases for CBR as well.

Paper Nr: 100
Title:

InfraSmart: A Decision Guidance System for Investment in Infrastructure Service Networks

Authors:

Bedor Alyahya and Alexander Brodsky

Abstract: Current approaches to infrastructure investment either (1) model the problem in high-level financial terms, but do not accurately express the underlying system behavior and non-financial performance indicators, or (2) are hard-wired to infrastructure silos, and do not take into account the complex interaction across these silos. This paper proposes to bridge the gap by modeling interrelated infrastructures as a hierarchical service network operating over a time horizon, as well as an extensible repository of infrastructure-specific component models. The paper reports on formal modeling, the development and an initial experimental study of InfraSmart, a decision guidance system for investment in interdependent infrastructure service networks.

Paper Nr: 102
Title:

Decision Guidance on Software Feature Selection to Maximize the Benefit to Organizational Processes

Authors:

Fernando Boccanera and Alexander Brodsky

Abstract: Many software development projects fail because they do not deliver sufficient business benefit to justify the investment. Existing approaches to estimating the business benefit of software adopt unrealistic assumptions, which produce imprecise results. This paper focuses on removing this limitation for software projects that automate business workflow processes. For this class of projects, the paper proposes a new approach and a decision-guidance framework to select and schedule software features over a sequence of software releases so as to maximize the net present value of the combined cash flow of software development as well as the improved organizational business workflow. The uniqueness of the proposed approach is in the precise modelling of the business workflow and of the savings achieved by deploying new software functionality.

Paper Nr: 118
Title:

State Validation in Automated Planning

Authors:

Caio G. Rodrigues da Cruz, Mauricio V. Ferreira and Rodrigo R. Silva

Abstract: The growing number of automated systems in satellites raises several security and reliability concerns, which worsen over time. Plan validation techniques were created to validate automatically generated flight operation plans. The execution of automatically generated plans in satellite flight operations can result in degraded or invalid states. The objective of this paper is to verify the possibility of removing these states from a plan through a state validation technique. Analyzing the action that generated the invalid states and removing them from the plan steps at planning time enables the planner to reach the final state without any invalid state. Therefore, implementing a state validator in the automated planner prevents the plan from containing any invalid state.

Paper Nr: 122
Title:

Hybrid Approach based on SARIMA and Artificial Neural Networks for Knowledge Discovery Applied to Crime Rates Prediction

Authors:

Felipe L. Soares, Tiago B. Silveira and Henrique C. Freitas

Abstract: The fight against crime in Brazilian cities is an extremely important issue and has become a priority agenda in public, state or municipal discussions. Even so, reducing cases of violence is a complex task in large Brazilian cities, such as Rio de Janeiro and São Paulo, as these large cities have numerous crime hotspots. Therefore, this paper presents the steps followed in a process of knowledge discovery applied to the prediction of crime rates in different regions of the city of São Paulo, in order to better understand them and distribute the security forces more efficiently. A hybrid model composed of an Artificial Neural Network and the SARIMA mathematical model was applied to databases related to different areas of the city. The average results showed assertiveness rates of 83.12% and 76.78% and root mean square deviations of 1.75 and 2.16 for two different tests.

Paper Nr: 132
Title:

Intelligent Regulation System to Optimize the Service Performance of the Public Transport

Authors:

Nabil Morri, Sameh Hadouaj and Lamjed Ben Said

Abstract: Urban public transport systems deal with dynamic environments and evolve over time. Frequently, a lot of correlated information is available but not well exploited to improve public transport service quality, especially in perturbation cases, where a regulation system should be used in order to maintain the scheduled public transport timetable. Service quality should be measured in terms of public transport key performance indicators (KPIs) for the wider urban transport system, covering criteria such as regularity, punctuality and correspondence. In fact, in the absence of a set of widely accepted performance measures and transferable methodologies, it is very difficult for public transport operators to objectively assess the effects of a specific regulation system and to make use of lessons learned from other public transport systems. Unfortunately, most of the existing traffic regulation systems do not take into consideration part or most of the performance criteria when they propose a regulation maneuver. Therefore, the applicability of these models is restricted to specific contexts only. This paper sets the context of performance measurement in the field of public traffic management and presents the Regulation Support System of Public Transportation (RSSPT). The aim of this regulation support system is (i) to detect traffic perturbations by identifying discrepancies between the scheduled and the current timetable of vehicle passages at the stations, and (ii) to find the regulation action by optimizing the service quality performance of the public transportation. We adopt a multi-agent approach to model the system. The validation of our model is done by simulating two scenarios on the Abu Dhabi transport system, and it shows the efficiency of our system when many performance indicators are used to regulate a disturbance situation.

Paper Nr: 150
Title:

A Mixed Linear Integer Programming Formulation and a Simulated Annealing Algorithm for the Mammography Unit Location Problem

Authors:

Marcos A. de Campos, Manoel S. Moreira de Sá, Patrick M. Rosa, Puca V. Penna, Sérgio R. de Souza and Marcone F. Souza

Abstract: Breast cancer is the most commonly occurring cancer in the female population. Early diagnosis of this disease, through mammography screening, can increase the chances of cure to 95%. Studies show that Brazil has a relatively satisfactory number of mammography units, but this equipment is poorly distributed geographically. This paper focuses on the Mammography Unit Location Problem (MULP), which aims at an efficient distribution of mammography units in order to increase the covered demand. Focusing on the State of Minas Gerais, Brazil, an analysis is made considering that, in the real world, it is difficult to relocate equipment that is already installed. Therefore, it would be interesting to optimize the location of new equipment purchases. Since MULP is NP-hard, an algorithm based on the Simulated Annealing meta-heuristic is also developed to handle large instances of the problem.

Paper Nr: 159
Title:

Comparing Supervised Classification Methods for Financial Domain Problems

Authors:

Victor U. Pugliese, Celso M. Hirata and Renato D. Costa

Abstract: Classification is key to the success of the financial business. Classification is used to analyze risk, the occurrence of fraud, and credit-granting problems. Supervised classification methods support these analyses by ’learning’ patterns in data to predict an associated class. The most common methods include Naive Bayes, Logistic Regression, K-Nearest Neighbors, Decision Tree, Random Forest, Gradient Boosting, XGBoost, and Multilayer Perceptron. We conduct a comparative study to identify which methods perform best on problems of analyzing risk, the occurrence of fraud, and credit granting. Our motivation is to identify whether there is a method that systematically outperforms the others for the aforementioned problems. We also consider the application of Optuna, a next-generation hyperparameter optimization framework, to the methods in order to achieve better results. We applied the non-parametric Friedman test to infer hypotheses and performed the Nemenyi post-hoc test to validate the results obtained on five datasets in the finance domain. We adopted the performance metrics F1 Score and AUROC. We achieved better results by applying Optuna in most of the evaluations, and XGBoost was the best method. We conclude that XGBoost is the recommended machine learning classification method to beat when proposing new methods for problems of analyzing risk, fraud, and credit.

Paper Nr: 201
Title:

Tax Crime Prediction with Machine Learning: A Case Study in the Municipality of São Paulo

Authors:

André Ippolito and Augusto G. Lozano

Abstract: With the advent of Big Data, several industries utilize data for analytical and competitive purposes. The government sector is following this trend, aiming to accelerate the decision-making process and improve the efficiency of operations. The predictive capabilities of Machine Learning strengthen the decision-making process. The main motivation of this work is to use Machine Learning to aid decision-making in fiscal audit plans related to service taxes in the municipality of São Paulo. In this work, we applied Machine Learning to predict crimes against the service tax system of São Paulo. In our methods, we structured a process comprising the following steps: feature selection; data extraction from our databases; data partitioning; model training and testing; model evaluation; and model validation. Our results demonstrated that Random Forests prevailed over other learning algorithms in terms of tax crime prediction performance. Our results also showed Random Forests’ capability to generalize to new data. We believe that the supremacy of Random Forests is due to the synergy of its ensemble of trees, which contributed to improved tax crime prediction performance. With better predictions, our audit plans became more assertive. Consequently, this raises taxpayers’ compliance with tax laws and increases tax revenue.

Paper Nr: 233
Title:

Last Mile Delivery with Lockers: Formulation and Heuristic

Authors:

Willian P. Oliveira and André D. Santos

Abstract: The creation of efficient routes is essential for different areas, with several practical applications, mainly in the transport of goods. With the growth of e-commerce and, consequently, the increase in demand for delivery to end users, minimizing costs in the delivery process has gained more importance, especially in the last mile stage. It is in this context that the use of lockers emerges to optimize last mile deliveries. Lockers have compartments of different sizes with a self-service interface, and they can be positioned in supermarkets, parks and other areas that are of interest to customers. The problem addressed in this work is to determine the positioning of the lockers and the routes necessary to supply them and to serve the remaining customers. We present a mathematical model to define the problem but, due to the complexity of the problem, obtaining a solution can be very expensive and require a lot of computational effort; therefore, we present a heuristic based on Variable Neighborhood Descent (VND), using a greedy constructive procedure inspired by the Clarke & Wright savings method. By comparing the results of the heuristic with the Gurobi optimizer, we conclude that the heuristic is capable of obtaining competitive solutions in less time than the exact methods.

Paper Nr: 270
Title:

Functions Approximation using Multi Library Wavelets and Least Trimmed Square (LTS) Method

Authors:

Abdesselem Dakhli, Maher Jbeli and Chokri Ben Amar

Abstract: Wavelet neural networks have recently aroused great interest because of their advantages compared to networks with radial basis functions, as they are universal approximators. In this paper, we propose a robust wavelet neural network based on the Least Trimmed Square (LTS) method and a Multi Library Wavelet Function (MLWF). We use a novel Beta wavelet neural network (BWNN). A constructive neural network learning algorithm is used to add and train additional neurons. The general goal of this algorithm is to minimize the number of neurons in the network during the learning phase. This phase is empowered by the use of the Multi Library Wavelet Function (MLWF). The Least Trimmed Square (LTS) method is applied to select the wavelet candidates from the MLWF to construct the BWNN. A numerical experiment is given to validate the application of this wavelet neural network in multivariable functional approximation. The experimental results show that the proposed approach is very effective and accurate.

Short Papers
Paper Nr: 15
Title:

A Fuzzy Cognitive Map Approach to Investigate the Sustainability of the Social Security System in Jordan

Authors:

George Sammour, Ahmad Alghzawi and Koen Vanhoof

Abstract: Fuzzy Cognitive Maps are emerging as an important new tool in economic modelling. The aim of this study is to investigate the use of fuzzy cognitive maps, with their learning algorithms based on genetic algorithms, for the purpose of predicting economic sustainability. Case study data are extracted from the Jordanian Social Security system for the last 120 months. The Real-Code genetic algorithm and a structure optimization algorithm were chosen for their ability to select the most significant relationships between the concepts and to predict the future development of the Jordanian social security revenues and expenses. The study shows that fuzzy cognitive map models clearly predict the future of a complex financial system with incoming and outgoing flows. Therefore, this research confirms the benefits of fuzzy cognitive map applications as a tool for scholarly researchers, economists and policy makers.

Paper Nr: 36
Title:

Personal Documents Classification using a Hybrid Framework at a Mobile Insurance Company: A Case Study

Authors:

Raissa Barcellos and Rodrigo Salvador

Abstract: In the information age, coupled with the full range and speed of data, the ease of access to new disruptive technologies raises the relevant problem of document classification. Identifying and categorizing documents is still a very challenging initiative addressed in the literature. This paper analyzes the construction of a hybrid document classification framework in a real business context. The research is based on a case study addressing the construction of a hybrid framework that uses text and images in document classification, and how this framework can be useful in the authentic context of a mobile insurance company. Excellent accuracy and precision results were found with both approaches, even considering a possible fraudulent circumstance. From these results we can conclude that using the hybrid framework, with the visual approach as a filter (which is more efficient in verifying the authenticity of documents) and consolidating the results with the textual approach, is a convincing option for deployment in the company in question.

Paper Nr: 51
Title:

A Meta-heuristic based Multi-Agent Approach for Last Mile Delivery Problem

Authors:

Maram Hasan and Rajdeep Niyogi

Abstract: e-Commerce has become a primary part of any country’s economy, and seeking maximum efficiency and level of service is an essential concern for any corporation in order to stay in business. Logistics has a significant impact on the efficiency of online transactions, especially in an increasingly competitive domain with minimal profit margins left. Thus, collaboration between logistics service providers (LSPs) at different levels has become a desirable approach to reduce overall costs and increase the utilization level of their resources. In this work, we propose a domain-independent multi-agent framework that allows different LSPs to plan their operations jointly. The system considers the individual satisfaction of LSPs and their profits in an egalitarian manner while trying to achieve an overall benefit. We use different search strategies for each agent as the underlying solving method, and investigate to what level taking the personal interest of participants into account affects the overall shared common goal.

Paper Nr: 57
Title:

Using Reinforcement Learning for Optimization of a Workpiece Clamping Position in a Machine Tool

Authors:

Vladimir Samsonov, Chrismarie Enslin, Hans-Georg Köpken, Schirin Baer and Daniel Lütticke

Abstract: Modern manufacturing is increasingly data-driven. Yet there are a number of applications traditionally performed by humans because of their capability to think analytically, learn from previous experience and adapt. With the appearance of Deep Reinforcement Learning (RL), many of these applications can be partly or completely automated. In this paper we aim at finding an optimal clamping position for a workpiece (WP) with the help of deep RL. Traditionally, a human expert chooses a clamping position that leads to efficient, high-quality machining without axis limit violations or collisions. This decision is hard to automate because of the variety of WP geometries and possible ways to manufacture them. We investigate whether the use of RL can aid in finding a near-optimal WP clamping position, even for WPs unseen during training. We develop a use case representing a simplified problem of clamping position optimisation, formalise it as a Markov Decision Process (MDP) and conduct a number of RL experiments to demonstrate the applicability of the approach in terms of training stability and quality of the solutions. First evaluations of the concept demonstrate the capability of a trained RL agent to find a near-optimal clamping position for an unseen WP within a small number of iterations.

Paper Nr: 66
Title:

A Reference Process Model for Machine Learning Aided Production Quality Management

Authors:

Alexander Gerling, Ulf Schreier, Andreas Hess, Alaa Saleh, Holger Ziekow and Djaffar O. Abdeslam

Abstract: The importance of machine learning (ML) methods has been increasing in recent years, and ML processes are consequently becoming more and more widespread in production. Our objective is to develop an ML-aided approach supporting production quality. To give an overview, we describe the manufacturing domain and use a visualization to explain the typical structure of a production line. We then illustrate and explain the as-is process for eliminating an error in the production line. Afterwards, we present a careful analysis of requirements and challenges for an ML system in this context. A basic idea of the system is the definition of product testing metadata and the exploitation of this knowledge inside the ML system. We also define a to-be process with ML system assistance for checking production errors, and describe the associated actors and tasks.

Paper Nr: 85
Title:

Predicting Crime by Exploiting Supervised Learning on Heterogeneous Data

Authors:

Úrsula M. Castro, Marcos W. Rodrigues and Wladmir C. Brandão

Abstract: Crime analysis supports law enforcement agencies in preventing and resolving crimes faster and more efficiently by providing methods and techniques to understand criminal behavior patterns. Strategies for crime reduction rely on preventive actions, e.g., deciding where to install street lighting and deploy police patrols. The evaluation of these actions is paramount to establish new security strategies and to ensure their effectiveness. In this article, we propose a supervised learning approach that exploits heterogeneous criminal data sources, aiming to understand criminal behavior patterns and predict crimes. We extract crime features from these data to predict the tendency of increase or decrease, as well as the number of occurrences, of crime types by geographic region. To predict crimes, we exploit four learning techniques: k-NN, SVM, Random Forest, and XGBoost. Experimental results show that the proposed approach achieves up to 89% accuracy and 98% precision for crime tendency, and up to 70% accuracy and 79% precision for crime occurrence. The results also show that Random Forest and XGBoost usually perform better when trained with a short time window, while k-NN and SVM perform better with a longer time window. Moreover, heterogeneous data sources can be effectively exploited by supervised techniques to improve forecast performance.
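To illustrate the kind of supervised prediction described above, here is a minimal k-NN classifier for crime tendency on invented data. The feature vectors (prior occurrence counts of three crime types per region), the labels and the distance choice are all hypothetical, not taken from the paper.

```python
from collections import Counter

# Hypothetical (features, label) pairs: counts of prior occurrences of
# three crime types in a region, labelled with next-period tendency.
train = [
    ([12, 3, 7], "increase"), ([14, 4, 8], "increase"),
    ([13, 5, 9], "increase"), ([3, 1, 2], "decrease"),
    ([2, 0, 1], "decrease"), ([4, 2, 2], "decrease"),
]

def knn_predict(x, k=3):
    """Plain k-NN with Euclidean distance over the toy training set."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    nearest = sorted(train, key=lambda t: dist(t[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

print(knn_predict([11, 4, 6]))   # near the high-count regions
print(knn_predict([3, 1, 1]))    # near the low-count regions
```

The other three techniques the paper compares (SVM, Random Forest, XGBoost) would consume the same feature vectors but learn decision boundaries rather than memorising neighbours.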

Paper Nr: 99
Title:

An Effective Sparse Autoencoders based Deep Learning Framework for fMRI Scans Classification

Authors:

Abeer M. Mahmoud, Hanen Karamti and Fadwa Alrowais

Abstract: Deep Learning (DL) identifies features of medical scans automatically, in a way very close to that of expert doctors, and sometimes even outperforms them in treatment procedures. It increases model generalization, as it does not focus on low-level features, and reduces the difficulties (e.g., overfitting) of training on high-dimensional data. Therefore, DL has become a prioritized choice in building the most recent Computer-Aided Diagnosis (CAD) systems. From another perspective, Autism Spectrum Disorder (ASD) is a brain disorder characterized by social miscommunication and confusing repetitive behaviours. The accurate diagnosis of ASD by analysing brain scans of patients is considered a research challenge. Some appreciable efforts have been reported in the literature; however, the problem still needs enhancement and the examination of different models. A multi-phase learning algorithm combining supervised and unsupervised approaches is proposed in this paper to classify brain scans of individuals as ASD or typical controls (TC). First, unsupervised learning is adopted using two sparse autoencoders for feature extraction, refining the optimal network weights through back-propagation error minimization. Then, a third autoencoder acts as a supervised classifier. The Autism Brain fMRI (ABIDE-I) dataset is used for evaluation, and cross-validation is performed. The proposed model recorded effective and promising results compared to the literature.
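The core building block here, a sparse autoencoder, can be sketched in miniature. The code below trains a single tied-weight autoencoder with an L1 sparsity penalty on invented 4-dimensional "scans", using finite-difference gradient descent purely for brevity; the sizes, data and optimiser are hypothetical stand-ins, since the paper's real model stacks much larger autoencoders trained by back-propagation.

```python
import math
import random

random.seed(1)

IN, HID, LR, L1 = 4, 2, 0.05, 1e-3        # toy sizes and hyper-parameters
W = [[random.uniform(-0.5, 0.5) for _ in range(IN)] for _ in range(HID)]
data = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0]]   # invented inputs

def forward(x):
    """Encode h = sigmoid(Wx), decode x' = W^T h (tied weights)."""
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    h = [sig(sum(W[j][i] * x[i] for i in range(IN))) for j in range(HID)]
    xr = [sum(W[j][i] * h[j] for j in range(HID)) for i in range(IN)]
    return h, xr

def objective():
    """Reconstruction error plus L1 sparsity penalty on the hidden code."""
    total = 0.0
    for x in data:
        h, xr = forward(x)
        total += sum((a - b) ** 2 for a, b in zip(x, xr))
        total += L1 * sum(abs(v) for v in h)
    return total

before = objective()
for _ in range(200):                        # finite-difference gradient descent
    base = objective()
    grad = [[0.0] * IN for _ in range(HID)]
    for j in range(HID):
        for i in range(IN):
            W[j][i] += 1e-5
            grad[j][i] = (objective() - base) / 1e-5
            W[j][i] -= 1e-5
    for j in range(HID):
        for i in range(IN):
            W[j][i] -= LR * grad[j][i]
after = objective()
print(after < before)   # training reduced the objective
```

In the paper's pipeline, the hidden codes `h` of one trained autoencoder become the inputs of the next, and a final autoencoder layer is fine-tuned with labels as the classifier.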

Paper Nr: 116
Title:

Predicting the Tear Strength of Woven Fabrics Via Automated Machine Learning: An Application of the CRISP-DM Methodology

Authors:

Rui Ribeiro, André Pilastri, Carla Moura, Filipe Rodrigues, Rita Rocha and Paulo Cortez

Abstract: Textile and clothing is an important industry that is currently being transformed by the adoption of the Industry 4.0 concept. In this paper, we use the CRoss-Industry Standard Process for Data Mining (CRISP-DM) methodology to model the textile testing process. Real-world data were collected from a Portuguese textile company. Predicting the outcome of a given textile test is beneficial to the company because it can reduce the number of physical samples that need to be produced when designing new fabrics. In particular, we target two important textile regression tasks: the tear strength in the warp and weft directions. To better focus on feature engineering and data transformations, we adopt an Automated Machine Learning (AutoML) approach during the modeling stage of CRISP-DM. Several iterations of the CRISP-DM methodology were employed, using different data preprocessing procedures (e.g., removal of outliers). The best predictive models were achieved after 2 (for warp) and 3 (for weft) CRISP-DM iterations.

Paper Nr: 142
Title:

An Approach for Acquiring Knowledge in Complex Domains Involving Different Data Sources and Uncertainty in Label Information: A Case Study on Cementation Quality Evaluation

Authors:

Flavia Bernardini, Rodrigo S. Monteiro, Inhauma Ferraz, Jose Viterbo and Adriel Araujo

Abstract: The Oil and Gas area presents many problems in which experts need to analyze different data sources and must be highly specialized in the domain to analyze each case correctly. Approaches that use artificial intelligence techniques to help experts make their knowledge explicit and analyze cases are therefore very important. Analysing cementation quality in oil wells is one such case. The primary cementation operation of an oil well creates a hydraulic seal in the annular space formed between the casing pipe and the open well wall, preventing flow between different geological zones bearing water or hydrocarbons. To evaluate the quality of this seal at given depths, acoustic tools are used to collect sonic and ultrasonic signals. Verifying the quality of the available data for cementation quality evaluation is a task that consumes the time and effort of domain experts, mainly due to data dispersion across different data sources and missing labels in the data. This work presents a machine learning approach to help acquire knowledge in domains where these problems arise. Interactive labeling and multiple data sources for acquiring knowledge from experts can help to construct better systems in complex scenarios, such as cementation quality evaluation. We obtained promising results in our case study scenario.

Paper Nr: 181
Title:

A Layered Quality Framework for Machine Learning-driven Data and Information Models

Authors:

Shelernaz Azimi and Claus Pahl

Abstract: Data quality is an important factor that determines the value of information in organisations. Data, when given meaning, results in information. This then creates financial value that can be monetised or provides value by supporting strategic and operational decision processes in organisations. In recent times, data is not accessed directly by consumers, but is provided ’as-a-service’. Moreover, machine-learning techniques are now widely applied to data, helping to convert raw, monitored source data into valuable information. In this context, we introduce a framework that presents a range of quality factors for data and the resulting machine-learning generated information models. Our specific aim is to link the quality of these machine-learned information models to the quality of the underlying source data. This takes into account the different types of machine learning information models as well as the value types that these models provide. We look at this specifically in the context of numeric data, where we use an IoT application that exhibits a range of typical machine learning functions to validate our framework.

Paper Nr: 184
Title:

Experiment Workbench: A Co-agent for Assisting Data Scientists

Authors:

Leonardo G. Azevedo, Raphael M. Thiago, Marcelo D. Santos and Renato Cerqueira

Abstract: The analysis of large volumes of data is a field of study with ever-increasing relevance. Data scientist is the moniker given to those in charge of extracting knowledge from Big Data: high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. The exploration done by data scientists relies heavily on the practitioner's experience. These activities are hard to plan and can change during execution – a type of process named Knowledge Intensive Process (KiP). The knowledge about how a data scientist performs her tasks could be invaluable for her and for the enterprise she works for. This work proposes Experiment Workbench (EW), a system that assists data scientists in performing their tasks by learning how a data scientist works in-situ and acting as a co-agent during task execution. It learns by capturing user actions and using process mining techniques to discover the process the user executes. Then, when the user or her colleagues work in the learned process, EW suggests actions and/or presents existing results according to what it has learned, in order to speed up and improve the user's work. This paper presents the foundations for EW development (e.g., the main concepts, its components, how it works) and discusses the challenges EW is going to address.

Paper Nr: 196
Title:

Emotional Factor Forecasting based on Driver Modelling in Electric Vehicle Fleets

Authors:

J. I. Guerrero, M. C. Romero-Ternero, E. Personal, D. F. Larios, J. A. Guerra and C. León

Abstract: Until recently, the automotive industry's focus has been safety, comfort, and user experience. Now, the focus is shifting towards human emotion for driver-car interactions, autonomy and sustainability, all of which are growing concerns in the recent scientific literature. On the one hand, the growing role of emotion in automotive driving is empowering human-centred design coupled with affective computing in the driving context to improve future automotive design, with the result that emotional analysis is now present in automotive systems. This requires real-time data processing, which involves energy consumption in the vehicle. On the other hand, electric vehicle fleets and smart grids are technologies that have provided new possibilities to reduce pollution and increase energy efficiency in pursuit of sustainability. This paper proposes forecasting the emotional factor from data gathered from an electric vehicle fleet, based on the application of the K-means algorithm. The results show that it is possible to forecast emotional states that negatively affect driving. Additionally, Cronbach's alpha variation analysis provides an interesting tool for selecting features from samples.
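As a sketch of the clustering step, the snippet below runs plain K-means (Lloyd's algorithm) on invented per-trip driver features; the two features, the data values and the deterministic initialisation are all hypothetical, not the paper's real fleet telemetry.

```python
# Hypothetical per-trip driver features: (speed variance, harsh-brake
# rate), both normalised. Real inputs would come from fleet telemetry.
trips = [(0.10, 0.05), (0.15, 0.10), (0.12, 0.08),   # calm driving
         (0.80, 0.90), (0.85, 0.95), (0.90, 0.85)]   # stressed driving

def kmeans(points, k=2, iters=20):
    """Plain Lloyd's algorithm with deterministic first/last-point init
    (k=2 only) to keep the sketch reproducible; k-means++ is better in
    practice."""
    centers = [points[0], points[-1]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            best = min(range(k),
                       key=lambda c: sum((a - b) ** 2
                                         for a, b in zip(p, centers[c])))
            groups[best].append(p)
        centers = [tuple(sum(vals) / len(g) for vals in zip(*g)) if g
                   else centers[i] for i, g in enumerate(groups)]
    return centers

centers = kmeans(trips)
print(centers)   # one centroid per driving style
```

Clusters like these can then be interpreted as emotional/driving states, and new trips are scored by their nearest centroid.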

Paper Nr: 198
Title:

User-adaptable Natural Language Generation for Regression Testing within the Finance Domain

Authors:

Daniel Braun, Anupama Sajwan and Florian Matthes

Abstract: Reporting duties and regression testing within the financial industry produce huge amounts of data which have to be reviewed and analyzed by experts. This time-consuming and expensive process does not fit modern, agile software development practices with fast update cycles. In this paper, we present a user-adaptable natural language generation system that supports financial experts from the insurance industry in analysing the results of regression tests for Solvency II risk calculations, and evaluate it with a group of experts.

Paper Nr: 205
Title:

Designing a Decision Support System for Predicting Innovation Activity

Authors:

Olga N. Korableva, Viktoriya N. Mityakova and Olga V. Kalimullina

Abstract: Decision support systems for predicting innovation activity at the macro level are not yet widely used, and the authors have not been able to find direct analogues of such a system. The relevance of creating the system is due to the need to take into account heterogeneous structured and unstructured information, including in natural language, when predicting innovation activity. The article describes the process of designing a decision support system for predicting innovation activity, based on the system for integrating macroeconomic and statistical data (described by the authors in previous articles) by adding a module of decision-making methods. The UML diagram of use cases and the UML diagram of the components of this module, the general architecture of the prototype of the decision support system, are presented. It also describes an algorithm for predicting innovation activity and its impact on the potential for economic growth using DSS.

Paper Nr: 206
Title:

Radial Basis Function Neural Network Receiver Trained by Kalman Filter Including Evolutionary Techniques

Authors:

Pedro G. Coelho, J. F. M. Do Amaral and A. S. Tome

Abstract: Artificial Neural Networks have been broadly used in several domains of engineering and in typical applications involving signal processing. In this paper, a channel equalizer using radial basis function neural networks, operating on a symbol-by-symbol basis, is proposed. The radial basis function neural network is trained by an extended Kalman filter combined with evolutionary techniques. The key motivation for the equalizer application is the neural network's capability to establish the complex decision regions that are important for estimating the transmitted symbols appropriately. The training process, using evolutionary techniques together with an extended Kalman filter, enables fast training of the radial basis function neural network. Simulation results are included comparing the proposed method with traditional ones, indicating the suitability of the application.

Paper Nr: 215
Title:

Methods, Models and Techniques to Improve Information System’s Security in Large Organizations

Authors:

Vladislavs Minkevics and Janis Kampars

Abstract: This paper presents the architecture of a modular, big-data based IS security management system (ISMS) and elaborates one of its modules – the module for detecting domains generated by domain generation algorithms (DGAs). The presented methods, models and techniques are used at Riga Technical University and can be used in any other large organization to stand against IS security challenges. The paper describes how an organization can construct an IS security management system using mostly free and open-source tools and reach its IS security goals by preventing or minimizing the consequences of malware, with little impact on employees' privacy. The presented DGA detection module detects malicious DNS requests by extracting features from domain names and feeding them into a random forest classifier. The ISMS does not rely solely on DGA detection; instead, it uses an ensemble of modules and algorithms to increase the accuracy of the overall system. The presented IS security management system can be employed in a real-time environment, and its DGA detection module allows an infected device to be identified as soon as it starts to communicate with the botnet command and control centre to obtain new commands. The presented model has been validated in the production environment and has identified infected devices that were detected neither by antivirus software nor by a firewall or Intrusion Detection System.
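The feature-extraction step of such a DGA detector can be sketched as below. The specific lexical features (length, character entropy, digit and vowel ratios) are a common illustrative choice for this problem, not necessarily the module's exact feature set, and the example domains are invented.

```python
import math
from collections import Counter

def dga_features(domain):
    """Lexical features computed from the second-level label of a
    domain name, of the kind typically fed to a classifier."""
    name = domain.split(".")[0].lower()
    n = len(name)
    counts = Counter(name)
    # Shannon entropy of the character distribution: algorithmically
    # generated names tend towards near-uniform (high) entropy.
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
    return {
        "length": n,
        "entropy": entropy,
        "digit_ratio": sum(ch.isdigit() for ch in name) / n,
        "vowel_ratio": sum(ch in "aeiou" for ch in name) / n,
    }

# A legitimate-looking domain vs. a DGA-looking one: the latter is
# longer, with near-maximal entropy and a high digit ratio.
print(dga_features("riga.lv"))
print(dga_features("xj4k9q2zp7w1.com"))
```

In the module described above, vectors like these would be fed to a random forest classifier trained on labelled benign and DGA-generated domains.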

Paper Nr: 222
Title:

Early Dyslexia Evidences using Speech Features

Authors:

Fernanda M. Ribeiro, Alvaro R. Pereira Jr., Débora B. Paiva, Luciana M. Alves and Andrea C. Bianchi

Abstract: Language pathologies are alterations in the reading of a text caused by trauma. Many people go untreated due to the lack of specific tools and the high cost of proprietary software; however, new audio signal processing technologies can aid in the process of identifying genetic pathologies. In the past, medical specialists developed a methodology that extracts characteristics from the reading of a text aloud and returns evidence of dyslexia. In this work, a new computational approach is described to automate this methodology, serving as an efficient tool for indicating dyslexia. The analysis is done on recordings of school-age children reading pre-defined texts, from which characteristics are extracted using specific methodologies. The probability of dyslexia is then indicated using a machine learning algorithm. The tests compared the results with the classification performed by the specialist, obtaining high accuracy on the evidence of dyslexia. The difference between the values of the automatically collected characteristics and the manually assigned ones was below 20% for most of the characteristics. Finally, the results show a very promising role for audio signal processing in aiding specialists in decision making related to language pathologies.

Paper Nr: 262
Title:

An Approach to Assess the Existence of a Proposed Intervention in Essay-argumentative Texts

Authors:

Jonathan Nau, Aluizio H. Filho, Fernando Concatto, Hercules Antonio do Prado, Edilson Ferneda and Rudimar S. Dazzi

Abstract: This paper presents an approach for grading essays based on the presence of one or more theses, arguments, and intervention proposals. The research was developed by means of the following steps: (i) corpus delimitation and annotation; (ii) feature selection; (iii) extraction of the training corpus; and (iv) class balancing, training and testing. Our study shows that features related to argumentation mining can improve automatic essay scoring performance compared to the usual feature set. The main contribution of this paper is to demonstrate that argument marking procedures can produce better score prediction in essay classification. Moreover, it became clear that essay classification does not depend on the number of features but rather on the ability to create meaningful features for a given domain.

Paper Nr: 87
Title:

Ontology-based Question Answering Systems over Knowledge Bases: A Survey

Authors:

Wellington Franco, Caio Viktor, Artur Oliveira, Gilvan Maia, Angelo Brayner, V. P. Vidal, Fernando Carvalho and V. M. Pequeno

Abstract: Searching for relevant, specific information in big data volumes is quite a challenging task. Despite the numerous strategies in the literature to tackle this problem, this task is usually carried out by resorting to a Question Answering (QA) system. There are many ways to build a QA system, such as heuristic approaches, machine learning, and ontologies. Recent research has focused on ontology-based methods, since the resulting QA systems can benefit from knowledge modeling. In this paper, we present a systematic literature survey on ontology-based QA systems. We also detail the evaluation process carried out in these systems and discuss how each approach differs from the others in terms of the challenges faced and strategies employed. Finally, we present the most prominent research issues still open in the field.

Paper Nr: 140
Title:

EvaTalk: A Chatbot System for the Brazilian Government Virtual School

Authors:

Guilherme D. Guy Andrade, Geovana S. Silva, Francisco M. Duarte Júnior, Giovanni A. Santos, Fábio L. Lopes de Mendonça and Rafael D. Sousa Júnior

Abstract: EvaTalk is a complete chatbot system developed to serve users of the Escola Virtual de Governo (EV.G), a Brazilian virtual school maintained by the federal government. The proposed architecture was based on a framework for building chatbots, but it was necessary to replace and adapt services to meet EV.G's needs. The architecture is composed of the following modules: Interface, for direct interaction; Artificial Intelligence, to comprehend and process messages; Development, to deal with the knowledge base; and Business Intelligence, to analyze messages. The first version responded to questions related to institutional membership and chitchat. Still, it became clear that Eva needed more training data, given that the developers could not predict user behavior well. It was therefore necessary to change the conversational data examples and flows to match the user behavior observed after release, which increased the chatbot's response confidence. The system relies mostly on the data collected through the data analysis tools to evolve.

Paper Nr: 166
Title:

Prototype Proposal for Profiling and Identification of TV Viewers using Watching Patterns

Authors:

Aldis Erglis, Gundars Berzins, Irina Arhipova, Artis Alksnis and Evija Ansonska

Abstract: Content-based recommendation systems have long been widely used for recommendations in e-commerce and TV content. Such recommendation systems could help multimedia content providers tailor content to individual TV viewers and offer better advertising options for media agencies and advertisers. One of the greatest challenges in providing individual TV content is identifying distinct TV viewers in a household and linking them individually with socioeconomic and demographic metrics. From a technical point of view, a Machine Learning ensemble model should be created, with a separate model for each need. In this study, a prototype content-based recommendation system was created that supports content targeting and measures watched-content efficiency using real-time viewing data. The solution prototype covers all important parts of the model, including data filtering, cleaning and transformation. The technical prototype allows testing the efficiency of the Machine Learning techniques used for predicting household composition and the social profiles assigned to individual inhabitants of the household.

Paper Nr: 193
Title:

Exploring Sentiments of Voters through Social Media Content: A Case Study of 2017 Assembly Elections of Three States in India

Authors:

Aman Agarwal and Veena Bansal

Abstract: Purpose: Winning an election requires more than a good and appealing manifesto. The purpose of this paper is to establish that content from social media provides useful insights and can be used to manage an election campaign, provided the right content is properly analyzed. Information such as the frequency of mentions, the sentiments of the mentions and demography is obtained through this analysis. It provides insights into the demography of supporters, the topics that are most talked about (revealing their importance to the voters), and the sentiments of voters. Design/Methodology/Approach: We analyzed 25000 documents from Twitter, forums, reviews, Facebook pages, blogs, etc. over a period of 12 months in three states of India using IBM's Watson Analytics for Social Media (WASM). We used the ETL (extract, transform and load) utility of WASM to fetch the documents for our chosen themes, topics, dates and sources. WASM deploys deep learning to perform sentiment analysis. Findings: We found that social media content analysis provides useful insight that goes beyond general perception and can be used for managing a campaign. Originality/Value: There have been many efforts in which researchers try to predict election results based on social media analysis. However, these efforts have been criticized, as predicting election results is a very complex problem. In this work, we have shown that social media content can definitely help in gaining a clear understanding of the sentiments of voters.

Paper Nr: 242
Title:

Automatic Text Summarization: A State-of-the-Art Review

Authors:

Oleksandra Klymenko, Daniel Braun and Florian Matthes

Abstract: Despite the progress that has been achieved in over 50 years of research, automatic text summarization systems are still far from perfect, posing many challenges to researchers in the field. This paper provides an overview of the most prominent algorithms for automatic text summarization proposed in recent years, and describes the automatic and manual evaluation methods that are currently widely adopted.
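One of the oldest families of algorithms such a survey covers is frequency-based extractive summarisation, which can be sketched in a few lines; the stopword list, scoring rule and sample text below are illustrative choices, not taken from any particular surveyed system.

```python
import re
from collections import Counter

def summarize(text, n_sentences=1):
    """Score each sentence by the mean corpus frequency of its
    non-stopword tokens; keep the top sentences in original order."""
    stop = {"the", "a", "an", "of", "in", "is", "are", "to", "and", "that"}
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text)
                 if s.strip()]
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if w not in stop]
    freq = Counter(words)

    def score(s):
        toks = [w for w in re.findall(r"[a-z']+", s.lower())
                if w not in stop]
        return sum(freq[w] for w in toks) / max(len(toks), 1)

    ranked = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return " ".join(s for s in sentences if s in ranked)

text = ("Summarization systems compress text. Extractive systems pick "
        "whole sentences from the text. The weather was nice that day.")
print(summarize(text))
```

Modern abstractive systems replace this hand-crafted scoring with learned sentence representations and generation, which is where most of the open challenges the paper discusses arise.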

Paper Nr: 248
Title:

Root Cause Analysis and Remediation for Quality and Value Improvement in Machine Learning Driven Information Models

Authors:

Shelernaz Azimi and Claus Pahl

Abstract: Data quality is an important factor that determines the value of information in organisations. Information creates financial value, but depends largely on the quality of the underlying data. Today, data is increasingly processed using machine-learning techniques in order to convert raw source data into valuable information. Furthermore, data and information are not accessed directly by their users, but are provided in the form of ’as-a-service’ offerings. We introduce here a framework based on a number of quality factors for machine-learning generated information models. Our aim is to link the quality of these machine-learned information models back to the quality of the underlying source data. This enables us to (i) determine the cause, in the data space, of information quality deficiencies arising in machine-learned information models and (ii) rectify problems by proposing remedial actions at the data level, increasing overall value. We investigate this for data in the Internet-of-Things context.

Area 3 - Information Systems Analysis and Specification

Full Papers
Paper Nr: 24
Title:

Impact of Developers Sentiments on Practices and Artifacts in Open Source Software Projects: A Systematic Literature Review

Authors:

Rui S. Carige Junior and Glauco F. Carneiro

Abstract: Context: Sentiment Analysis proposes techniques for the automated identification of human behavior, and there is growing interest in its use in topics related to Computing, more specifically in Software Engineering itself. Objective: To analyze the impact of developers' sentiments on software practices and artifacts in open source software projects. Methods: We conducted a Systematic Review to collect evidence from the literature regarding the impacts of developers' sentiments on software practices and artifacts. Results: We found that the growing number of studies in this area provides greater visibility of the direct influence of developers' sentiments on software practices. Practices associated with developer productivity and collaboration, along with source code, are the most vulnerable to sentiment variation. Conclusions: With the results presented, we hope to contribute to the discussion about the potential for improving the quality of the social environment of software projects, as the sentiments of developers can positively or negatively impact software practices and artifacts.

Paper Nr: 33
Title:

Mitigating Difficulties in Use-Case Modeling

Authors:

Cristiana P. Bispo, Ana P. Magalhães, Sergio Fernandes and Ivan Machado

Abstract: The specification of software requirements in an enterprise system is crucial for software quality. The use-case (UC) approach is often used to describe software requirements because, among other benefits, its simplicity and ability to convey detailed information favor communication between business analysts, requirements analysts and, crucially, end users, who can easily understand and validate requirements. However, UC models are easier to understand than to specify, and difficulties in use-case modeling (UCM) may negatively affect the quality of UC models, and thus their usefulness. The quality of UC models can be enhanced by several modeling strategies mapped in the literature. However, no studies were found showing which of these strategies can be used to mitigate specific difficulties: there is a gap between UCM difficulties and UCM strategies. This paper presents a difficulty-strategy correlation proposal based on quality attributes of the UC model. This correlation was initially evaluated in a controlled experiment with students of an undergraduate program in computer science.

Paper Nr: 43
Title:

Systematic Risk Assessment of Cloud Computing Systems using a Combined Model-based Approach

Authors:

Nazila G. Mohammadi, Ludger Goeke, Maritta Heisel and Mike Surridge

Abstract: Data protection and proper risk assessment are success factors for providing high-quality cloud computing systems. Currently, the identification of the relevant context and of possible threats and controls requires high expertise in the security engineering domain. However, the consideration of experts' opinions during the development life-cycle often lacks a systematic approach, which may result in overlooking relevant assets, missing relevant domain knowledge, etc. Our aim is to bring context analysis and risk assessment together in a systematic way. In this paper, we propose a systematic, tool-assisted, model-based methodology to scope the context and risk assessment for a specific cloud system. Our methodology consists of two parts: first, we enhance the initial context analysis necessary for defining the scope of the risk assessment; second, we identify relevant threats and controls during design and deployment time. Using the context model and the design-time system model, we further refine the gathered information into a deployment model. All steps of our methodology are tool-supported and performed in a semi-automatic manner.

Paper Nr: 64
Title:

(L)earning by Doing – »Blockchainifying« Life-long Volunteer Engagement

Authors:

Elisabeth Kapsammer, Birgit Pröll, Werner Retschitzegger, Wieland Schwinger, Philipp Starzer, Markus Weißenbek, Johannes Schönböck and Josef Altmann

Abstract: Volunteering is a vital cornerstone of our society, ranging from social care to disaster relief, and is supported by a plethora of web-based volunteer management systems (VMS). These VMS primarily focus on centralized task management within non-profit organizations (NPOs), lacking means for volunteers to privately digitize and exploit their engagement assets, e.g., task accomplishments or earned competences. This may decrease engagement, since appreciation of volunteer work is the only reward available, and hinders the exploitation of engagement assets between NPOs and beyond, e.g., in the education or labor market. We put volunteers at the center of concern by investigating how engagement can be digitized and exploited in a life-long way. First, based on a systematic identification of requirements for the trustful digitization and exploitation of engagement assets, a web-based volunteer ecosystem relying on emerging blockchain technology is proposed. Second, for representing various kinds of engagement assets, a generic and adaptable ontology is put forward. Third, for establishing trust across all stakeholders, a prototypical web application is presented that allows »blockchainifying« life-long volunteer engagement. Finally, the prototype applies semantic web technologies to offer a machine-readable form of engagement assets in terms of Linked Data (LD).

Paper Nr: 67
Title:

Developing Model Transformations: A Systematic Literature Review

Authors:

Ana P. Magalhães, Rita P. Maciel and Aline Andrade

Abstract: Model-Driven Development is an approach that uses models instead of code in software development. At its core, there is a transformation chain responsible for the (semi-)automation of the development process, converting models into further models and, ultimately, code. The development of transformations has been a challenge, as the inherent complexity of the transformation domain adds to the complexity of the software being constructed with these transformations. In order to assist this development and improve transformation quality, it is important to adopt software engineering facilities such as processes, languages and other techniques. This paper presents a systematic literature review of strategies currently proposed to develop model transformations. We investigate the development processes or other strategies used to guide transformation development, the phases of the software development life cycle considered, the modeling languages adopted for specification, and the level of automation provided. The study selected and analyzed 23 papers to identify which aspects are addressed by research and any gaps in this area. We identified four different strategies for guiding transformation development and perceived the lack of a modeling language standard.

Paper Nr: 78
Title:

How to Better Form Software Development Teams? An Analysis of Different Formation Criteria

Authors:

Sérgio Cavalcante, Bruno Gadelha, Edson César de Oliveira and Tayana Conte

Abstract: In the competitive world of the on-demand software development market, practices that increase companies' chances of delivering better results turn out to be an essential differentiator. Several studies in the literature discuss numerous criteria used by companies in the formation of teams. This research aims to analyze the criteria and factors in the formation of software teams and their impact on the value of deliveries as perceived by the customer. We collected the result scores of 31 projects of an R&D organization and performed a quantitative analysis comparing teams formed using two selection criteria: self-selection versus leader selection. We observed statistical significance in the comparison between the selection criteria when tested with the longevity factor. Our results indicated that the self-selection team formation criterion had an impact on the value delivered to the customer. We also noticed this impact when, besides being self-selected, the team was also long-lived.

Paper Nr: 90
Title:

Integrating Model-Driven Development Practices into Agile Process: Analyzing and Evaluating Software Evolution Aspects

Authors:

Elton F. da Silva, Rita P. Maciel and Ana F. Magalhães

Abstract: Software use is increasing in different areas of society, and new development process proposals have been presented to support this demand, focusing on increasing productivity and reducing time to market. In this context, some software development processes emphasize source code production, such as agile processes, while others focus on modeling, such as Model-Driven Development (MDD). ScrumDDM is a hybrid metaprocess that integrates MDD practices into the Scrum method, aiming to specify software process instances whose models can be used in the agile development context. This paper presents a controlled experiment conducted to analyze the effectiveness of a ScrumDDM instance regarding its ability to support the agility and the evolutionary aspects of this software process. The results of the experiment showed that the models used in ScrumDDM gave extra support for evolution without compromising development agility, by executing a set of model transformations while keeping project code and documentation updated to support future software maintenance.

Paper Nr: 95
Title:

An Experimental Analysis of Tools for Ontology Evolution Management

Authors:

Jéssica S. Santos, Viviane T. Silva, Leonardo G. Azevedo, Elton S. Soares and Raphael M. Thiago

Abstract: The use of ontologies is widespread among software engineering groups as a way to represent, structure, share and reuse knowledge. As projects progress, the ontological understanding of the domain may change and evolve. New domain concepts may emerge and existing ones may disappear or be updated, causing changes in the ontology. Taking this into account, the management of ontology evolution and its life cycle should be addressed by those who adopt ontologies as a way of representing knowledge. This paper provides an analysis of tools related to ontology evolution management, focusing on those able to identify elementary and complex changes (by grouping elementary changes) that help externalize the intention of the user. The main contribution of this analysis is to present the state of the art of ontology evolution tools by identifying their strengths and limitations. The results are particularly useful to ontology designers who need to choose a tool to help them inspect, understand and manage ontology evolution. Besides, we point out several research issues to be addressed in different kinds of research initiatives.

Paper Nr: 97
Title:

Time-series Approaches to Change-prone Class Prediction Problem

Authors:

Cristiano S. Melo, Matheus M. Lima da Cruz, Antônio F. Martins, José M. Filho and Javam C. Machado

Abstract: During the development and maintenance of a large software project, changes can occur due to bug fixes, code refactoring, or new features. In this scenario, the prediction of change-prone classes can be very useful in guiding the development team, since it can focus its efforts on these pieces of software to improve their quality and make them more flexible for future changes. A considerable number of related works use machine learning techniques to predict change-prone classes based on different kinds of metrics. However, these works use a standard data structure, in which each instance contains the metric values for a particular class in a specific release as independent variables, thereby ignoring the temporal dependencies between instances. In this context, we propose two novel time-series approaches, called Concatenated and Recurrent, which keep the temporal dependence between instances to improve the performance of the predictive models. The Recurrent Approach works for imbalanced datasets without the need for resampling. Our results show that the Area Under the Curve (AUC) of both proposed approaches improved on all evaluated datasets, and that they can be up to 23.6% more effective than the standard approach in the state of the art.

Paper Nr: 108
Title:

SLang: A Domain-specific Language for Survey Questionnaires

Authors:

Luciane C. Araújo, Marco A. Casanova, Luiz P. Leme and Antonio L. Furtado

Abstract: The use of surveys permeates the economy, ranging from customer satisfaction measurement to tracking global economic trends. At the core of the survey process lies the codification of questionnaires, which vary from simple lists of questions to complex forms that include validations, computation of derived data, use of triggers to guarantee consistency, and dynamic creation of objects of interest. Questionnaire specification is part of what is called survey metadata and is a key factor for the quality of the data collected and of the survey itself. In this context, the paper first introduces a comprehensive complex questionnaire model. Then, based on the model, it proposes a domain-specific language (DSL) for modeling complex questionnaires, called SLang. The paper also describes a prototype implementation of SLang and an evaluation of the usefulness of the language in practical scenarios.

Paper Nr: 238
Title:

Maturity Models for Agile, Lean Startup, and User-Centered Design in Software Engineering: A Combined Systematic Literature Mapping

Authors:

Maximilian Zorzetti, Matheus Vaccaro, Cassiano Moralles, Bruna Prauchner, Ingrid Signoretti, Eliana Pereira, Larissa Salerno, Ricardo Bastos and Sabrina Marczak

Abstract: In a bid to reduce the risk accompanied by innovation, IT companies have been trying to boost their Agile development practices by combining Lean Startup and User-Centered Design (UCD) with their existing work processes. Undergoing this transformation in large enterprises can be a difficult challenge without an instrument to help in conducting the adoption and assessment of this novel development approach. In this paper we seek to identify maturity models that assess the use of Agile, Lean Startup, and UCD; characterize these maturity models; and see how they are applied and evaluated. We conducted a systematic literature mapping of maturity models published between 2001 and 2020 taking existing systematic review guidelines into account; and we analyzed the models using an adapted maturity model classification criteria. There are 35 maturity models, of which 23 are maturity models for Agile, 5 for Lean thinking, 5 for User-Centered Design, and 2 for Agile and UCD combined. We found that agile models have been published fairly consistently throughout the years (2001–2020), while Lean thinking and UCD models have mostly been published in the last decade, which might be related to the somewhat recent use of Design Thinking and Lean Startup in software engineering. However, there are no maturity models for a combined use of Agile, Lean Startup, and UCD. We believe that this is the case due to the approach’s infancy, as it is seeing success among industry practitioners.

Paper Nr: 266
Title:

Long Short-term Memory Neural Networks and Word2vec for Self-admitted Technical Debt Detection

Authors:

Rafael M. Santos, Israel M. Santos, Methanias C. Rodrigues Júnior and Manoel M. Neto

Abstract: Context: In software development, new functionalities and bug fixes are required to ensure a better user experience and to preserve software value for a longer period. Sometimes developers need to implement quick changes to meet deadlines rather than a better solution that would take longer. These easy choices, known as Technical Debt, can cause long-term negative impacts because they can bring extra effort to the team in the future. Technical debt must be managed and detected so that the team can evaluate the best way to deal with it and avoid more serious problems. One way to detect technical debt is through source code comments. Developers often insert comments in which they admit that there is a need to improve that part of the code later. This is known as Self-Admitted Technical Debt (SATD). Objective: Evaluate a Long short-term memory (LSTM) neural network model combined with Word2vec word embeddings to identify design and requirement SATD from comments in source code. Method: We performed a controlled experiment to evaluate the quality of the model on a labelled dataset, comparing it with two language models from the literature and with an LSTM without word embeddings. Results: The results showed that the LSTM model with Word2vec improved recall and f-measure. The LSTM model without word embeddings achieves greater recall, but performs worse in precision and f-measure. Conclusion: Overall, we found that the LSTM model combined with Word2vec can outperform the other models.

Short Papers
Paper Nr: 48
Title:

Software Projects Success and Informal Communication: A Brazilian Bank Case Study

Authors:

Leandro Z. Rezende, Edmir V. Prado and Alexandre Grotta

Abstract: Technology project management is challenging. However, there are few works in the literature relating informal communication to project success. Therefore, this research aims to analyse the influence of informal communication on the short- and medium-term success of software development projects in a Brazilian banking institution. This research is based on a literature review about project communication and success. The research has a qualitative and descriptive approach and used an ex post facto strategy. Ten software development project management professionals were interviewed at a large banking institution in the first half of 2019. The research found an association between informal communication and both project efficiency and contribution to the project team. No association was found between this communication line and customer contribution. In addition, informal communication contributed more in waterfall projects than in agile projects.

Paper Nr: 52
Title:

A Data-centered Usage Governance: Providing Life-long Protection to Data Exchanged in Virtual Enterprises

Authors:

Jingya Yuan, Frédérique Biennier and Nabila Benharkat

Abstract: Since the early definition of the Virtual Enterprise concept in the 90s, efficient information sharing and trust have been pointed out as major challenges to support the enactment of collaborative organisations. To date, traditional Collaborative Business support systems have been designed to interconnect corporate Business Processes and different well-known information systems, whereas trust is mostly managed thanks to interpersonal relationships. Unfortunately, this well-perimetrized vision of a Collaborative Network Organization does not fit the large-scale, open and evolving context created by the fast adoption of Industry 4.0 and sharing economy models, which rely on the large-scale adoption of Social Mobile Analytics Cloud Internet of Things technologies (called SMACIT for short) and semi-open information systems. This involves rethinking the way information, services and applications are organized, deployed, shared and protected, moving from the traditional perimetrized system protection to data and service life-long usage control. To this end, we propose a data-driven security organization which uses a multi-layer architecture to describe, on one hand, the logical organisation of the information system, i.e. the data assets and the business services needed to implement the collaborative business processes, and on the other hand, the multiple copies exchanged with different service providers. Based on this Information System meta-model, our system integrates a blockchain-based usage manager to govern the way information is exchanged and processed.

Paper Nr: 58
Title:

Navigational Distances between UX Information and User Stories in Agile Virtual Environments

Authors:

Abner C. Filho and Luciana M. Zaina

Abstract: User stories (US) are valuable artifacts to agile teams, being succinct requirement descriptions whose details are complemented by other artifacts. User eXperience (UX) is an important cross-cutting quality requirement that has gained the spotlight over the past years. Many approaches to merging agile and UX are presented in the literature, but few have analysed how UX information and US are related in agile virtual environments. This paper presents a case study which investigates the navigational distances between UX information and USs. The goal was to understand how agile practitioners linked UX information to USs. To do this, we conducted a qualitative analysis of 13 requirement documents from three different industry projects and also explored the USs derived from such documents. We propose a classification of the navigational distances found between UX information and USs, discussing the pros and cons of each distance type. Our work contributes to motivating agile teams to rethink the different forms of organizing information, artifacts and documents in virtual environments.

Paper Nr: 60
Title:

Evaluating Support for Implementing BPMN 2.0 Elements in Business Process Management Systems

Authors:

Carlos D. Santos, Marcelo Fantinato, José P. Moreira de Oliveira and Lucineia H. Thom

Abstract: Business Process Model and Notation (BPMN) provides an extensive set of notational elements, such as activities, events and gateways, which enable the representation of a wide variety of business processes. Business Process Management Systems (BPMSs) can implement BPMN to allow the execution of business processes. However, not all BPMN elements are implemented in a BPMS. The purpose of this paper is to present the results of an evaluation of the BPMN 2.0 elements conducted to identify those whose implementation is supported in BPMSs. We evaluated four BPMSs, comparing the elements implemented in such BPMSs with their respective definitions in the BPMN specification, considering the requirements to implement them. As a result, we found that only 34.18% of the BPMN 2.0 elements are implemented in the investigated BPMSs. We also identified that BPMSs implement BPMN elements only partially, adapting their original definition. In addition to the results of our evaluation, our contribution is to provide developers with an approach to evaluating the support of BPMN elements in a BPMS. Following the steps proposed in this paper, we identified the limitations and devised a method for implementing the remaining BPMN elements not yet implemented.

Paper Nr: 69
Title:

Analysis of Tools for REST Contract Specification in Swagger/OpenAPI

Authors:

Jéssica D. Santos, Leonardo G. Azevedo, Elton S. Soares, Raphael M. Thiago and Viviane T. Silva

Abstract: REST is a resource-based architectural style that has emerged as a promising way of designing Web services. A REST API exposes a service’s functionalities through a contract that allows consumption by different clients. The contract specifies the service’s request and response schemes and related rules the service and the client should comply with. The process of documenting and keeping an API consistent is a time-consuming human effort, since the documentation should reflect an implementation which may evolve. This work compares different tools for REST API specification. We focused on tools that automatically generate Swagger (OpenAPI as of version 3.0), a specification for designing REST APIs. We evaluated the tools using a set of criteria whose results may help software engineers choose the most appropriate tool, and we point out gaps for research initiatives.

Paper Nr: 74
Title:

GDPR: What’s in a Year (and a Half)?

Authors:

Ana Ferreira

Abstract: This paper aims to investigate, with a literature review, how the research community has been tackling the security and privacy requirements mandated by the General Data Protection Regulation (GDPR) over the last year and a half. We assessed what proposed solutions have been implemented since the GDPR came into force, if and where they were tested in real settings, with what technologies, and what specific GDPR requirements were targeted. No similar review has been found by the authors, as works in the literature mostly provide recommendations for GDPR compliance or assess whether current solutions are GDPR compliant. Results show that most proposed solutions focus on Consent and Privacy by Default/Design and are assessed in the IoT and healthcare domains. However, almost none is tested and used in a real setting. Although it may be still early days for this review, it is clear that: a) there is the need for more GDPR-compliant novel solutions, tests and evaluations in real settings; b) the obtained knowledge should be quickly shared so that proper feedback is given to the legal authorities and business/research organizations; and c) solutions on privacy must integrate socio-technical components that can address, in an all-inclusive way, infrastructures, activities and processes where the GDPR must apply.

Paper Nr: 76
Title:

Design Thinking Use in Agile Software Projects: Software Developers’ Perception

Authors:

Edna D. Canedo, Ana S. Pergentino, Angelica S. Calazans, Frederico V. Almeida, Pedro T. Costa and Fernanda Lima

Abstract: Only recently has the application of Design Thinking (DT) in agile development begun to be researched. Thus, this research aims to analyze information collected from agile software developers concerning their perceptions about applying DT methods and tools in agile development. An online survey was submitted to agile teams from Brazilian software organizations and a total of 59 answers were obtained. Results reveal that the most commonly used techniques during the development process are brainstorming and prototyping. In requirement elicitation, the techniques most used by the practitioners are interviews with users and prototyping. The study concludes that practitioners are already using many DT techniques, tools and methods in software development activities. Furthermore, they acknowledge that DT practice in requirement elicitation could contribute to delivering product quality to the end-user, since design thinking techniques could prevent failures in understanding requirements prior to implementation.

Paper Nr: 91
Title:

Front End Application Security: Proposal for a New Approach

Authors:

Renato C. Ribeiro, Edna D. Canedo, Bruno G. Praciano, Gabriel M. Pinheiro, Fábio L. Lopes de Mendonça and Rafael D. Sousa Jr.

Abstract: The data processing center (CPD) of the University of Brasília (UnB) needs to evolve legacy systems and enable communication between systems in an efficient and safe way. For this reason, it needs to implement a centralized control system for authentication and authorization to access services, systems and information. The technologies used focus on what is most modern in the market. In this paper we discuss the security of applications developed under the single page application (SPA) concept, focusing on security using the OAuth 2 framework, the Angular front-end framework and service-oriented architecture (SOA). We show the development of a security module that turns security complexity into programming abstractions for the new client applications developed in the CPD. The security module developed by the UnB aims to centralize, modernize, and improve the security of University applications. The advantage of this module is its flexibility, abstraction concepts, centralization, and use of one of the standard security protocols in use today, OAuth 2, which brings greater security to UnB applications.

Paper Nr: 107
Title:

Netphishing: Network and Linguistic Analysis of Phishing Email Subject Lines

Authors:

Ana Ferreira, Soraia Teles, Rui Chilro and Milaydis Sosa-Napolskij

Abstract: This study provides support on why the subject lines of predatory emails should be analysed to improve the detection of phishing attacks. Network science together with a linguistic analysis were applied to a sample of 240 phishing email subject lines from the past 12 years. Results show that even in straightforward subject lines, phishers can employ text elements to create a sense of proximity and mutual relationship, as well as a neutral and professional relation focused on present and future actions, to persuade potential victims to open phishing emails. The common words “your” and “account” form two main hubs and communities of words that integrate the main organisations and actions related to those hubs. The linguistic analysis shows that concise phrases carry such richness of language that it can potentially be used to find differential emotional and behavioural marks in the text, to be used for better detecting phishing emails. This work provides current information as well as new research questions to be tested and further pursued, to support the improvement of automated tools for identifying predatory emails.

Paper Nr: 121
Title:

Blockchain-based Traceability of Carbon Footprint: A Solidity Smart Contract for Ethereum

Authors:

António M. Rosado da Cruz, Francisco Santos, Paulo Mendes and Estrela F. Cruz

Abstract: In recent decades there has been an increasing concern about climate change. Every person is increasingly concerned about global warming and, as a consumer, with their own individual contribution to that issue, which may be measured by each one’s carbon footprint. In this sense, it is only natural that each person wants to consume products with a lower carbon footprint, meaning a lower environmental impact. For this, however, consumers need to be able to know the carbon footprint of the products they are buying. This is only possible if every company tracks and shares the carbon footprint of its own products. The blockchain is a distributed technology that allows registering and sharing information between those companies and the final consumers. The blockchain is being used in many areas as a distributed database, and has strong points such as trust, transparency, security, immutability, durability and disintermediation, among others. In this paper, blockchain technology is used to track and trace back the carbon footprint of products and organizations. More exactly, this paper proposes a smart contract-based platform for the traceability of the carbon footprint of products and organizations.

Paper Nr: 125
Title:

Modelling and Visualization of Robot Coalition Interaction through Smart Space and Blockchain

Authors:

Alexander Smirnov and Nikolay Teslya

Abstract: Nowadays the study of interaction models of intelligent agents is one of the main directions in the field of joint task solving. It includes studies of coalition formation principles, task decomposition and distribution, winnings sharing, and the implementation of the proposed techniques and models. This work focuses on ensuring the interaction of coalition members through distributed ledger technology and smart contracts using the Hyperledger Fabric platform, as well as on modeling and visualizing the interaction of intelligent robots using the open software Gazebo and the Robot Operating System. The ontology of context used to adjust robot actions is presented. It combines environmental characteristics with robot and task descriptions to provide the full situation context. The paper presents a modelling approach architecture with an example of modelling and visualization based on an obstacle-overcoming scenario.

Paper Nr: 126
Title:

Children Face Long-term Identification in Classroom: Prototype Proposal

Authors:

Nikolajs Bumanis, Gatis Vitols, Irina Arhipova and Inga Meirane

Abstract: Automated face identification of children raises additional challenges compared to automated face identification of adults. Long-term identification is used in environments in which a person must be identified over longer time spans, such as months and years. Long-term identification is present, for example, in schools, where children spend multiple years and, if an automated face identification solution is implemented, it must be resilient enough to recognise face biometrical data over a span of typically up to 9 years. In this proposal, we discuss available solutions for children's face identification which use deep learning networks, introduce the legal constraints that come with the privacy of children, and propose a prototype for the long-term identification of children's attendance in their classroom. The solution consists of a developed prototype that is architecturally separated into three layers. The layers encapsulate the necessary local and remote hardware, software and interconnectivity solutions between these entities. The prototype is intended for implementation into a school’s class attendance management system, and should provide sufficient functionality for a person’s identity management, object detection and person identification processes. The prototype’s processing is based on a model that incorporates the principle of multiple correct biometric pattern versions, providing the possibility of long-term identification. The model uses a Single Shot MultiBox Detector for object detection and a Siamese neural network for person identification.

Paper Nr: 133
Title:

Towards a Taxonomy for Big Data Technological Ecosystem

Authors:

Vitor A. Pinto and Fernando S. Parreiras

Abstract: Data is constantly created, and at an ever-increasing rate. Intending to be more and more data-driven, companies are struggling to adopt Big Data technologies. Nevertheless, choosing an appropriate technology to deal with specific business requirements is a complex task, especially because it involves different kinds of specialists. Additionally, the term Big Data is vague and ill-defined. This lack of concepts and standards creates a fuzzy environment where companies do not know exactly what they need to do and, on the other hand, consultants do not know how to help them achieve their goals. In this study the following research question was addressed: Which essential components characterize the Big Data ecosystem? To answer this question, Big Data terms and concepts were first identified. Next, all terms and concepts were related and grouped, creating a hierarchical taxonomy. Finally, this artifact was validated through a classification of tools available in the market. This work contributes to the clarification of terminologies related to Big Data, facilitating their dissemination and usage across research fields. The results of this study can contribute to reducing the time and costs of Big Data adoption in different industries, as they help establish a common ground for the parties involved.

Paper Nr: 137
Title:

How Machine Learning Has Been Applied in Software Engineering?

Authors:

Olimar T. Borges, Julia C. Couto, Duncan A. Ruiz and Rafael Prikladnicki

Abstract: Machine Learning (ML) environments are composed of a set of techniques and tools which can help in solving problems in a diversity of areas, including Software Engineering (SE). However, due to the large number of possible configurations, it is a challenge to select the ML environment to be used for a specific SE domain issue. Helping software engineers choose the most suitable ML environment according to their needs would be very helpful. For instance, it is possible to automate software tests using ML models, where the model learns software behavior and predicts possible problems in the code. In this paper, we present a mapping study that categorizes the ML techniques and tools reported as useful for solving SE domain issues. We found that the most used algorithm is Naïve Bayes and that WEKA is the tool most SE researchers use to perform ML experiments related to SE. We also identified that most papers use ML to solve problems related to SE quality. We propose a categorization of the ML techniques and tools that are applied in SE problem solving, linking them to the Software Engineering Body of Knowledge (SWEBOK) knowledge areas.

Paper Nr: 144
Title:

Software Processes Used in University, Government, and Industry: A Systematic Review

Authors:

Caroline G. Silva, Evandro F. Filho and Lisandra M. Fontoura

Abstract: Software processes are essential for software development organizations to deliver quality software. There are currently several software processes to meet different needs. However, it is difficult to find in the literature software processes focused on university projects involving other institutions, such as government and industry. This article conducts a systematic literature review to identify the characteristics and limitations of agile and plan-driven methodologies, which processes were used in software development projects, and to establish a relationship between organizational characteristics and the most successful methodologies. As a research method, we conducted a systematic study of the literature associated with a snowballing strategy, and identified and structured the literature on the use of agile and plan-driven methodologies. We selected 12 studies using the systematic review and added 5 more using the snowballing method, totaling 17 selected articles. We note that there is no single methodology to be used in software development, as each organization has its own characteristics. The lack of processes specific to university projects is evident, and the differences between this environment and industry require tailored processes. Besides, a large number of projects use practices from more than one method, called hybrid methodologies, to exploit the best of agile and plan-driven methodologies.

Paper Nr: 179
Title:

The Use of De-identification Methods for Secure and Privacy-enhancing Big Data Analytics in Cloud Environments

Authors:

Gloria Bondel, Gonzalo M. Garrido, Kevin Baumer and Florian Matthes

Abstract: Big data analytics are interlinked with distributed processing frameworks and distributed database systems, which often make use of cloud computing services providing the necessary infrastructure. However, storing sensitive data in public clouds leads to security and privacy issues, since the cloud service presents a central point of attack for external adversaries as well as for administrators and other parties which could obtain necessary privileges from the cloud service provider. To enable data security and privacy in such a setting, we argue that solutions using de-identification methods are most suitable. Thus, this position paper presents the starting point for our future work aiming at the development of a privacy-preserving tool based on de-identification methods to meet security and privacy requirements while simultaneously enabling data processing.

Paper Nr: 180
Title:

Test Oracle using Semantic Analysis from Natural Language Requirements

Authors:

Maryam I. Malik, Muddassar A. Sindhu and Rabeeh A. Abbasi

Abstract: Automation of natural language based applications is a challenging task due to the semantics of natural language. This challenge is also confronted in the software testing field. In this paper, we provide a systematic literature review related to the semantic analysis of natural language requirements in the software testing field. The literature review assisted us in identifying a substantial research gap related to semantics-based natural language test oracles. To the best of our knowledge, we have not found any technique in which the semantics of a test oracle derived from natural language requirements are resolved using Word Sense Disambiguation techniques. We discuss our proposed approach to generate semantics-based test oracles from natural language requirements. Our proposed approach can be applied to any domain.

Paper Nr: 183
Title:

Requirement Engineering and the Role of Design Thinking

Authors:

Anas Husaria and Sergio Guerreiro

Abstract: Recently, interest in Design Thinking (DT) has been on the rise as a field of study related to Human-Computer Interaction (HCI). Companies today seek innovation and user-centricity in their software development projects regardless of the field they are in. Companies like IBM and SAP, among others, have taken steps to innovate their software development processes using Design Thinking, by creating academies and innovation labs. DT is used as a method to improve the User Experience (UX) of interacting with computer software. Requirements Engineering (RE) is the process of defining, documenting and maintaining requirements in system design and software engineering, and it constitutes the initial phase of software engineering. This position paper reviews RE practices and discusses the role DT could play in mitigating the challenges RE faces.

Paper Nr: 202
Title:

Considering Legal Regulations in an Extendable Context-based Adaptive System Environment

Authors:

Mandy Goram and Dirk Veiel

Abstract: Legal regulations demand that applications consider legal aspects of the application domain. Regulations equally concern software designers, developers, legal experts, providers and users. Not only must the application be legally compliant; users must also comply with the law when using it. It is therefore important to explain to users their current situation and the legal consequences of using the system, yet providing such explanations of the law and the related actions and consequences is a major challenge. We address these challenges by developing an extendable context-based adaptive system environment that considers legal policies and generates personalized explanations for users. This paper presents an approach to integrating legal regulations into context-based systems and an excerpt of our legal domain model. We describe a process by which legal experts can configure the adaptive interaction with the domain-specific application and the generation of personalized explanations. For that, we use a sample collaboration situation in which Copyright Law and personalized explanations become relevant.

Paper Nr: 208
Title:

An Engineering Approach to Integrate Non-Functional Requirements (NFR) to Achieve High Quality Software Process

Authors:

Muhammad A. Gondal, Nauman A. Qureshi, Hamid Mukhtar and Hafiz F. Ahmed

Abstract: Software quality calls for an engineering approach that incorporates non-functional requirements (NFRs) as first-class citizens into the software specification and operationalizes them at development time. Recent research argues for modeling high-level goals capturing the intentions of users as NFRs at the early stage of requirements engineering. However, intertwining relevant NFRs into the specification at an early stage increases complexity manyfold. A straightforward approach to capturing NFRs is therefore not possible, as product-specific NFRs are usually domain dependent. In this paper, we propose a systematic approach to integrate NFRs into the specification and development artifacts to ensure high quality of the software system under development. Building on seminal approaches in the literature, we propose a textual template for specifying NFRs and provide a systematic technique to integrate relevant NFRs during the software requirements specification phase. We demonstrate our approach using a healthcare information system as a case study and report initial results.

Paper Nr: 211
Title:

A Concept & Compliance Study of Security Maturity Models with ISO 21827

Authors:

Rabii Anass, Assoul Saliha and Roudiès Ounsa

Abstract: Ever since the success of maturity models in software engineering, the creation of security maturity models has enlarged the choice pool for organizations. Yet their implementation rate has been low and their impact difficult to perceive. The choice of security maturity models grew even larger in the last decade despite the existence of the standard security maturity model ISO 21827. Among governmental approaches, CCSMM is the US national security maturity model, supported by a presidential policy for national preparedness. MMISS-SME is one of the few validated security maturity models created by academia between 2007 and 2018. Our research studies the added value and compliance of CCSMM and MMISS-SME with the ISO 21827 standard and their shared core concepts. We present each security maturity model’s main lines and model their core concepts. Our study shows that the standard encompasses all security engineering concepts while leaving room for characterization and customization by organizations. However, CCSMM and MMISS-SME provide nuances in both functions and concepts, since they were created for specific contexts such as SMEs or the US local government and their vital organisms.

Paper Nr: 219
Title:

Capability Management in Resilient ICT Supply Chain Ecosystems

Authors:

Jānis Grabis, Janis Stirna and Jelena Zdravkovic

Abstract: An ICT system consists of multiple interrelated software and hardware components as well as related services. These are often produced by a complex network of suppliers that is hard, time-consuming and in many cases almost impossible for a single company to control. Hence, it is common practice for malicious actors to target the ICT product supply chain, assuming that some members have lax security practices or lag behind in using the latest solutions and protocols. A single company cannot assure the security of complex ICT systems or evaluate the risks alone; to succeed, it needs to tap into a wider network of ICT product developers and suppliers, which in essence leads to forming an ecosystem. We propose in this study that such an ecosystem should be established and managed on the basis of its members’ capabilities, meaning their capacity to meet desired goals, i.e., security and privacy requirements, in a dynamic business context. The proposal is illustrated with the case of an ICT product called IoTool, a lightweight IoT gateway. IoTool uses various third-party components, such as sensors and actuators, supplied by different vendors.

Paper Nr: 231
Title:

On the Adaptations of the Scrum Framework Software Development Events: Literature and Practitioners Analysis using Feature Models

Authors:

Luciano A. Garcia, Edson OliveiraJr, Gislaine L. Leal and Marcelo Morandini

Abstract: Scrum is one of the most well-known and widely used agile methods for software development. Practitioners highlight the Sprint event as the heart of Scrum, and different kinds of Scrum events support the execution of Sprints. We gathered evidence of industrial adaptations of Scrum events based on the existing literature and a survey of practitioners. We grouped, represented and unified these adaptations in terms of feature models. Such a representation helps register practitioners’ knowledge, creating a structure that can be instantiated for the deployment of new software development solutions.

Paper Nr: 234
Title:

Prevalence of Bad Smells in C# Projects

Authors:

Amanda L. Sabóia, Antônio F. Martins, Cristiano S. Melo, José M. Monteiro, Cidcley Teixeira de Souza and Javam C. Machado

Abstract: Bad smells can be defined as structures in code that suggest the possibility of refactoring. In object-oriented languages such as C# and Java, bad smells are heavily exploited as a way to avoid potential software failures. A high number of bad smells in a software project makes system maintenance and evolution difficult, so identifying smells in code and refactoring them helps to improve and maintain software quality. Anti-patterns are considered inadequate programming practices, though not errors: they are bad solutions to recurring software problems. In this work, we propose an exploratory study of open source projects written in C# and published on GitHub. We empirically analyzed a total of 25 projects, studying the prevalence of bad smells, both quantitatively and qualitatively, and their relationships in order to identify possible anti-patterns. Our results show that implementation smells are the most common. Moreover, some smells occur together, such as Missing Default and Unutilized Abstraction, which are perfectly correlated, and ILS and IMN, detected by association rules. The proposed study thus aims to assist software developers in avoiding future problems during the development of C# projects.

Paper Nr: 256
Title:

Feasibility Analysis of SMartyModeling for Modeling UML-based Software Product Lines

Authors:

Leandro F. Silva, Edson OliveiraJr and Avelino F. Zorzo

Abstract: Variability modeling in UML-based Software Product Lines (SPL) has been carried out basically through the UML profiling mechanism, across a diversity of theoretical approaches. However, there is no tool supporting the UML-based SPL life cycle that takes advantage of the standard UML diagrams in a controlled environment exclusively dedicated to it. Users usually adopt general-purpose UML tools to model variability, whose drawback is the lack of control over data regarding SPL models, especially variability. With such control, one might, for instance, use different visualization techniques to show SPL/variability information, inspect/test SPL models and data, apply metrics, and configure specific products. To provide an environment with these characteristics, we developed SMartyModeling. We evaluated its feasibility in two studies: a qualitative study supported by the Technology Acceptance Model (TAM), and an experiment comparing SMartyModeling with Astah. The first study helped establish assumptions on how to improve the environment. We then stated hypotheses to be tested in a comparative experiment. Thus, we identified aspects related to the automation of SPL concepts, the number of errors, and the difficulties in modeling SPLs, and measured the effectiveness and efficiency of SMartyModeling over Astah. Overall results provide preliminary evidence that SMartyModeling is feasible for further development.

Paper Nr: 83
Title:

An Effective Parallel SVM Intrusion Detection Model for Imbalanced Training Datasets

Authors:

Jing Zhao, Jun Li, Chun Long, Jinxia Wei, Guanyao Du, Wei Wan and Yue Wang

Abstract: In the field of network security, Intrusion Detection Systems (IDSs) always require more research on detection models and algorithms to improve system performance. Meanwhile, higher-quality data is critical to the accuracy of detection models. In this paper, an effective parallel SVM intrusion detection model with feature reduction for imbalanced datasets is proposed. The model comprises three parts: (1) NKSMOTE, a modified imbalanced-data processing method; (2) feature reduction based on correlation analysis; and (3) a parallel SVM algorithm combining clustering and classification. The NSL-KDD dataset is used to evaluate the proposed method, and the empirical results show that it achieves better and more robust performance than existing methods in terms of accuracy, detection rate, false alarm rate and training speed.

Paper Nr: 98
Title:

A Model for Evaluating Requirements Elicitation Techniques in Software Development Projects

Authors:

Naiara C. Alflen, Edmir V. Prado and Alexandre Grotta

Abstract: Requirements elicitation is the understanding of the real needs of the user and is considered the most complex and critical activity in software development. Several techniques for eliciting requirements are found in the literature, and the main ones are presented in this article. Based on these techniques, a model was proposed to analyze the influence, and moderating effect, of team involvement in requirements elicitation and of the number of techniques used. The model was applied in an experiment with students of the Information Systems course at the University of São Paulo, and the results of the experiment are presented in this article.

Paper Nr: 123
Title:

Time2Play - Multi-sided Platform for Sports Facilities: A Disruptive Digital Platform

Authors:

Dinis Caldas, Estrela F. Cruz and António M. Rosado da Cruz

Abstract: Digital platforms drive disruptive businesses, supported by emerging technological improvements. These platforms enable new business models that use technology to connect people, organizations and resources in an interactive ecosystem where all affiliates of the platform can create, add and capture value. This leads to an innovative technology-driven business model called the multi-sided platform, which is transforming our lives in ways that would have been impossible to imagine a few decades ago. This new business model, platform-based and value-driven, enables direct interactions and/or transactions between two or more affiliates, enabling network effects and consequently increasing the value of the network to which the affiliates belong. Creating value through business model innovation is the engine of success behind this kind of platform. The main goal of this project is to build an innovative multi-sided platform to become a reference in the sports facility booking sector, bringing together sports facilities’ owners and players. The proposed platform is modeled and presented, after being characterized by analysing the most similar tools in the market and comparing them with a Blue Ocean Strategy tool, namely the [yellow tail] value curve.

Paper Nr: 172
Title:

Towards a Business Process Model-based Testing of Information Systems Functionality

Authors:

Anastasija Nikiforova and Janis Bicevskis

Abstract: The main idea of this solution is to improve the testing methodology of information systems (IS) by using data quality models. The approach works as follows: (a) first, a description of the data to be processed by the IS and the data quality requirements used for developing the tests are created; (b) then, an automated run of the system on the generated tests is performed. Thus, traditional software testing is complemented with new features: automated compliance checks of the data to be entered and stored in the database. Generating tests for all possible data quality conditions creates a complete set of tests that checks the operation of the IS under all possible data quality conditions. Since this paper describes the first steps towards the proposed idea, it aims to (a) define the aim of the initiated research and (b) choose the main components and propose their combination, resulting in the architecture of the idea to be implemented.

Paper Nr: 176
Title:

Blockchain-based Traceability Platforms as a Tool for Sustainability

Authors:

António M. Rosado da Cruz and Estrela F. Cruz

Abstract: Among the three pillars of sustainability (environmental, social and economic), modern society is beginning to give greater attention to the environmental and social ones at the expense of the economic. Consumers are increasingly prepared to pay more for products that are socially and environmentally responsible but, to make that decision, they need to be sure that the products they choose are really socially and environmentally friendly. This requires transparency about what happens in each stage of a product’s supply chain, and this information must be available to the consumer. Thus, it is necessary to know the entire supply chain, from the creation of raw materials to the arrival of the final products at the consumer. Storing information (where, who, how, when, etc.) on each step of the supply chain is essential, making the traceability of products more transparent and even allowing them to be withdrawn from the market if necessary for health reasons (for example, the use of a toxic paint in a clothing factory). This position paper proposes the use of blockchain technology to implement traceability in the supply chain.

Paper Nr: 187
Title:

Representing Programs with Dependency and Function Call Graphs for Learning Hierarchical Embeddings

Authors:

Vitaly Romanov, Vladimir Ivanov and Giancarlo Succi

Abstract: Any source code can be represented as a graph. This kind of representation allows capturing the interactions between the elements of a program, such as functions, variables, etc. Modeling these interactions can enable us to infer the purpose of a code snippet, a function, or even an entire program. Lately, more and more works have appeared in which source code is represented in the form of a graph. One of the difficulties in evaluating the usefulness of such representations is the lack of a proper dataset and an evaluation metric. Our contribution is a dataset that represents programs written in Python and Java as dependency and function call graphs. In this dataset, multiple projects are analyzed and united into a single graph. The nodes of the graph represent functions, variables, classes, methods, interfaces, etc. Nodes for functions carry information about how these functions are constructed internally and where they are called from. Such graphs enable training hierarchical vector representations of source code. Moreover, some functions come with textual descriptions (docstrings), which allows learning useful tasks such as API search and documentation generation.

Paper Nr: 223
Title:

OntoExper-SPL: An Ontology for Software Product Line Experiments

Authors:

Henrique Vignando, Viviane R. Furtado, Lucas O. Teixeira and Edson OliveiraJr

Abstract: Given the overall popularity of experimentation in Software Engineering (SE) in the last decades, we observe increasing research on guidelines and data standards for SE. Practically, experimentation in SE has become compulsory for sharing evidence on theories or technologies and for providing a reliable, reproducible and auditable body of knowledge. Although the existing literature discusses the documentation and quality of SE experiments, we see a lack of formalization of experimentation concepts, especially for emerging research topics such as Software Product Lines (SPL), in which specific experimental elements are essential for planning, conducting and disseminating results. Therefore, we propose an ontology for SPL experiments, named OntoExper-SPL. We designed this ontology based on guidelines found in the literature and an extensive systematic mapping study previously performed by our research group. We believe this ontology can help better document the essential elements of an SPL experiment, thus promoting the repetition, replication, and reproducibility of experiments. We evaluated OntoExper-SPL using an ontology supporting tool and by performing an empirical study. Results show that OntoExper-SPL is feasible for formalizing SPL experimental concepts.

Paper Nr: 225
Title:

A Semiotics-oriented Approach to Aid the Design of Ubiquitously Monitored Healthcare Systems

Authors:

Jasmine Tehrani and Sajeel Ahmed

Abstract: Ubiquitous computing technology, sensor networks, and ambient intelligence have initiated the birth of pervasive health. While successful in many environments, monitoring technologies in healthcare have been known to cause undesirable effects, such as increased stress in patients being observed. To date, the use of this monitoring technology and its effect on human behaviour have not been thoroughly investigated, meaning future system designs may result in (preventable) undesirable effects. Pervasive healthcare’s envisioned deep intertwining with patients’ day-to-day care makes patients’ socio-cultural values a fundamental consideration. In this paper, we present a semiotics-oriented approach for analysing factors, identified in the literature and believed to influence patients’ behaviour, from both physical and social perspectives. The aim is to aid the design of socially aware and patient-centric ubiquitous monitoring environments that are successfully adopted and used, while supporting the incorporation of the social aspects of pervasive technologies into the design.

Paper Nr: 246
Title:

A Cost based Approach for Multiservice Processing in Computational Clouds

Authors:

Jan Kwiatkowski and Mariusz Fraś

Abstract: This paper concerns the evaluation of processing in computational clouds when multiple services are run. A new approach to cloud efficiency evaluation is proposed, addressing the selection of the most suitable cloud configuration with respect to user demands on processing time and price. The basis of the proposed approach is the Relative Response Time (RRT), which is calculated for each service individually, for different loads, and for each tested configuration. The paper presents the results of experiments performed in real clouds, which enabled evaluation of processing at both the general and individual application levels. The experiments show the need for this type of metric when evaluating cloud configurations for delivering different types of services, considering their response time and price. Using RRT, the presented approach enables choosing, from the available cloud virtual machine configurations, a suitable one to run the application with regard to the considered demands.

Paper Nr: 260
Title:

Software Ecosystems and Digital Games: Understanding the Financial Sustainability Aspect

Authors:

Bruno Lopes Xavier, Rodrigo Pereira dos Santos and Davi Viana dos Santos

Abstract: The digital games industry has a deep alignment with the field of Software Ecosystems (SECO). Despite this close relationship, actors in the games industry do not apply the SECO perspective to understand the dynamics of their environment. Financial sustainability is considered a key factor for the permanence of these actors on the platforms and directly impacts the sustainability of an ecosystem. This paper presents a qualitative analysis of this challenging aspect of the digital games industry based on SECO concepts. The statements and analysis of this study are based on survey research that identified the benefits, problems and challenges reported by actors of the digital games SECO in Brazil. The focus was on exploring the reports of financial sustainability through the SECO perspective in order to help actors understand the technical, business and social elements of this global and interconnected industry in the area of Information Systems. Finally, some ideas for academic research are listed, such as the need for knowledge management to support studies on the business dimension of the digital games SECO and the need to further explore relationships among games industry actors in the Brazilian context.

Area 4 - Software Agents and Internet Computing

Full Papers
Paper Nr: 120
Title:

An Adaptive System Architecture Model for the Study of Logic and Programming with Learning Paths

Authors:

Adson S. Esteves, Aluizio H. Filho, André A. Raabe and Rudimar S. Dazzi

Abstract: Dropout and evasion rates in computing courses are among the highest in Brazilian universities. To reduce these rates, eLearning technologies are being used as part of the solution. Given this reality, this work presents an adaptive system architecture with learning paths that best fit students’ profiles and interests. To take students’ profiles and interests into account, two theories are used: constructivism and constructionism. The fundamentals of these theories were analysed to formulate a teaching structural model for the system, and the literature was searched for adaptive systems based on theories similar to constructivism and constructionism. We then designed a collaborative agent system, based on intelligent software agent techniques, to help students with their path and content choices. In this system, students’ difficulties, characteristics, and knowledge obtained from other users can be reused. An environment with a content hierarchy that allows more attractive learning path construction options may ease learning, make studying more interesting, and help reduce evasion rates in computing courses.

Paper Nr: 128
Title:

Collaborative Agents in Adaptative VLEs: Towards an Interface Agent for Interactivity and Decision-making Improvement

Authors:

Karize Viecelli, Aluizio H. Filho, Hércules Antonio do Prado, Edilson Ferneda, Jeferson M. Thalheimer and Anita R. Fernandes

Abstract: This paper presents an Interface Agent (IAg) in the context of collaborative software agents, aiming at improving interaction, interactivity and decision-making processes in Virtual Learning Environments (VLE). Working collaboratively in a multi-agent system, the IAg receives notifications about situations that require interaction with students to assist and motivate them in navigating and using the VLE. To assist decision-making, it provides dashboards that enable the human tutor and VLE coordinators to make real-time decisions about abnormal situations. In addition, it monitors students’ actions to clarify their doubts, utilizing a knowledge base of past situations. This approach is expected to provide a more attractive environment for students by reducing feelings of demotivation and isolation, helping to reduce student dropout.

Short Papers
Paper Nr: 30
Title:

Towards Interoperability of oneM2M and OPC UA

Authors:

Salvatore Cavalieri and Salvatore Mulè

Abstract: Interoperability between industrial applications is considered a cornerstone of the current fourth industrial revolution, known as Industry 4.0. Interoperability can be realised in different ways, among them interworking solutions between existing communication systems adopted in Industry 4.0, such as oneM2M and OPC UA. To the best of the authors’ knowledge, the literature does not present interworking solutions between these two communication systems. For this reason, the paper proposes and describes a novel solution to realise the interworking between OPC UA and oneM2M.

Paper Nr: 32
Title:

A Literature Review of Recommender Systems for the Cultural Sector

Authors:

Nguyen K. Dam and Thang L. Dinh

Abstract: Nowadays, organizations in the cultural sector face the problem of improving the discoverability of their products amid a tremendous amount of information. In this respect, recommender systems have proven to be a solution for enterprises, especially cultural small and medium-sized organizations and enterprises (SMOs/SMEs), to enhance the discoverability of their products. This study presents a concept-centric literature review of recommender systems for cultural SMEs/SMOs to identify the current status quo of their application in six cultural domains: heritage and libraries, live performance, visual and applied arts, written and published works, audio-visual and interactive media, and sound recording. The findings of this paper reveal that the adoption of recommender systems by cultural SMOs/SMEs is still at an early stage of maturity. The specific status quo of recommender system adoption in each cultural domain is uncovered through the literature review, and other relevant aspects, such as data sources, data mining models, and algorithms, are discussed in detail. Finally, the paper proposes future research directions to promote the application of artificial intelligence in general, and recommender systems in particular, in the cultural sector.

Paper Nr: 56
Title:

Barriers for the Advancement of an API Economy in the German Automotive Industry and Potential Measures to Overcome these Barriers

Authors:

Gloria Bondel, Sascha Nägele, Fridolin Koch and Florian Matthes

Abstract: The API Economy is a type of service ecosystem that emerged due to organizations using Web APIs to provide third parties with access to their resources, i.e., functionality or data. It is argued that participation in the API economy creates value and offers strategic advantages to API providers. However, there are sectoral differences within the API Economy, with specific sectors being more advanced than others. Since there are currently no explanations for these differences between sectors, this research aims at providing insights into barriers inhibiting the advancement of the API Economy as well as potential measures to overcome these barriers for a specific sector, the automotive industry. We apply a Grounded Theory Methodology approach based on interviews with 21 experts from OEMs, automotive suppliers, consultants, mobility start-ups, and insurance firms. As a result, we present 14 legal, economic, social, technological, and organizational barriers. Furthermore, we derive five measures to overcome these barriers.

Paper Nr: 112
Title:

Factors Influencing Reuse Intention of e-Payment in Thailand: A Case Study of PromptPay

Authors:

Kobthong Ladkoom and Bundit Thanasopon

Abstract: Thai e-commerce has grown rapidly over the past few years, driven by increased Internet and mobile phone use as well as improved e-payment and logistics. The Thai government has launched a national e-payment initiative called “PromptPay” with the aim of reducing the use of cash and catalyzing the adoption of e-payment in Thailand. However, the use of e-payment in Thailand still lags behind other countries. The objective of this research is therefore to identify factors influencing the reuse intention of e-payment in Thailand and their antecedents. For our preliminary study, we surveyed 100 PromptPay users in Bangkok and analyzed the data using the PLS-SEM technique. The results suggest that satisfaction and attitude positively impact the reuse intention of PromptPay. In addition, perceived usefulness is found to be a driver of user satisfaction and positive attitude towards PromptPay, while positive confirmation affects satisfaction. However, trust is unexpectedly found to be insignificant to the reuse intention of PromptPay. Our proposed conceptual model offers an alternative for studying e-payment adoption in other contexts, while our findings could help the Thai government plan its strategy for improving the diffusion rate of PromptPay in Thailand.

Paper Nr: 119
Title:

Using a Domain Ontology to Bridge the Gap between User Intention and Expression in Natural Language Queries

Authors:

Alysson G. de Sousa, Dalai S. Ribeiro, Rômulo C. Costa de Sousa, Ariane B. Rodrigues, Pedro T. Furtado, Simone J. Barbosa and Hélio Lopes

Abstract: Many systems try to convert a request in natural language into a structured query, but formulating a good query can be cognitively challenging for users. We propose an ontology-based approach to answer questions in natural language about facts stored in a knowledge base, and to present the answers through data visualizations. To bridge the gap between the user’s intention and the expression of their query in natural language, our approach enriches the set of answers by generating related questions, allowing the discovery of new information. We apply our approach to the Movies and TV Series domain, with queries and answers in Portuguese. To validate our natural language search engine, we built a dataset of questions in Portuguese to measure precision, recall, and f-score. To evaluate the method for enriching the answers, we conducted a questionnaire-based study to measure users’ preferences regarding the recommended questions. Finally, we conducted an experimental user study to evaluate the delivery mechanism of our proposal.

Paper Nr: 199
Title:

TermIt: A Practical Semantic Vocabulary Manager

Authors:

Martin Ledvinka, Petr Křemen, Lama Saeeda and Miroslav Blaško

Abstract: Many organizations already benefit from using semantic vocabularies, which help them systematize, search and reuse their data. However, to manage such vocabularies efficiently, appropriate and adequate tools are needed. In this paper, we present TermIt, an integrated system for managing a set of interconnected vocabularies, identifying individual concepts in source documents, and using such terms for semantic data asset annotation and subsequent search. We relate TermIt to other relevant tools and present the usage scenarios we have identified so far.

Paper Nr: 212
Title:

Constraints and Challenges in Designing Applications for Industry 4.0: A Functional Approach

Authors:

Mateus C. Silva, Frederico L. Martins de Sousa, Débora M. Barbosa and Ricardo R. Oliveira

Abstract: The Industry 4.0 concept relies on the integration of its composing elements using modern tools. These modern industrial plants must consider concepts like the Internet of Things, Cyber-Physical Systems and Smart Devices. The main features involved in these architectures are local control, machine-to-machine information exchange, and human-to-machine interfaces through virtualization. The integration of these elements to create a connected environment presents a challenge to developers and engineers. In this text, we perform a theoretical analysis of the main constraints and challenges in designing and implementing novel applications using digital twins, robots, wearable devices, and other control interfaces. To evaluate the theoretical approach, we performed a series of tests in prototype environments.

Paper Nr: 217
Title:

An Agent based Platform for Resources Recommendation in Internet of Things

Authors:

Agostino Forestiero and Giuseppe Papuzzo

Abstract: The Internet of Things (IoT) paradigm aims to bridge the gap between the physical and cyber worlds, allowing a deeper understanding of users in terms of preferences and behaviors. User devices and services interact and maintain relations, which require effective and efficient selection/suggestion approaches to better meet users’ interests. Recommendation systems provide a set of significant and useful suggestions for users and systems with given characteristics. This paper introduces the design of an agent-based platform for building a distributed recommendation system in an IoT environment. Internet of Things objects (devices, sensors, services, etc.) are represented with vectors obtained through the Doc2Vec model, a neural word embedding approach able to capture the semantic context by representing documents as dense vectors. The vectors are managed by cyber agents that, performing simple and local operations, organize themselves by exploiting the vector values. The outcome is the emergence of an organized overlay network of cyber agents that yields an efficient recommender system for IoT services. Preliminary results confirm the validity of the approach.
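The core matching step described above compares dense vectors of IoT resources. A minimal sketch of similarity-based recommendation over such vectors; the paper derives its vectors with Doc2Vec, whereas the toy embeddings and resource names below are purely illustrative stand-ins:

```python
import math

# Hypothetical sketch: ranking IoT resources by cosine similarity over
# dense vectors. Agent names and embedding values are illustrative.

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (math.sqrt(sum(a * a for a in u))
            * math.sqrt(sum(b * b for b in v)))
    return dot / norm if norm else 0.0

# Each cyber agent holds the embedding of one IoT object (toy values)
agents = {
    "temperature-sensor": [0.9, 0.1, 0.0],
    "humidity-sensor":    [0.8, 0.3, 0.1],
    "door-actuator":      [0.0, 0.2, 0.9],
}

def recommend(query_vec, k=2):
    """Return the k IoT resources most similar to a query embedding."""
    ranked = sorted(agents, key=lambda n: cosine(query_vec, agents[n]),
                    reverse=True)
    return ranked[:k]
```

In the paper's design this ranking would emerge from local operations of the self-organizing agents rather than from a single global sort.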

Paper Nr: 235
Title:

Social Micromachine: Origin and Characteristics

Authors:

Brunno W. Lemos de Souza and Sílvio L. Meira

Abstract: The incorporation of computing into society through personal devices has led to the discussion of Social Machines and social computing; that is, it has driven ever greater numbers of relationships between people and machines to solve problems. Social Machines represent information systems that establish connections under certain constraints to deal with the complexity of services and operations: with the spread of the web as a software development platform, along with increased interactivity and application connectivity, the understanding of the nature of computing has changed. Software architecture is a description of how a software system is organized, whether at large or small scale, and is currently highly interconnected, because interactions, relationships, and their constraints are considered part of software behavior, including the service granularity used to measure the depth of abstraction applied to existing services. In this research, some specific definitions of Social Machines are presented, extending the focus to different relationship visions and their restrictions. Using a methodology based on technical goals, an understanding of relationship-awareness is presented, adding it to the Social Machine and to the term Social Micromachine, highlighting the Microservice architecture as one of the types of service-oriented relationship.

Paper Nr: 59
Title:

Recommender Systems based on Scientific Publications: A Systematic Mapping

Authors:

Felipe Ciacia de Mendonça, Isabela Gasparini, Rebeca Schroeder and Avanilde Kemczinski

Abstract: Recommender Systems are intended to recommend items according to users’ preferences, resulting in greater satisfaction for them. Among the objects of study that may be recommended are scientific articles from venues such as conferences and journals. However, there are still many challenges in this area, such as effective analysis of textual data as well as improvement of the recommendations produced. In this paper we investigate the state of the art. For this purpose, we applied the systematic mapping (SM) methodology, considering 165 articles selected from the search string. Applying the inclusion criteria resulted in 78 articles, and applying the exclusion criteria resulted in 38 articles to answer the defined research questions. As a result, it is possible to know which evaluation approaches, algorithms, and metrics are being used, as well as which databases are being studied for research in the area.

Paper Nr: 226
Title:

Use of Text Mining Techniques for Recommender Systems

Authors:

Yanelys Betancourt and Sergio Ilarri

Abstract: Recommender systems help users to reduce the information overload they may suffer in the current era of Big Data, by offering them recommendations of relevant items according to their tastes/preferences and/or context (location, weather, time of the day, etc.). We argue that text mining techniques can be exploited for the development of recommender systems. Thus, they can be applied to detect user preferences (user profiling) and also to extract context data. For this purpose, text mining can be applied to user reviews, text descriptions associated with the items, and other texts written by the user (e.g., posts in social networks). In this paper, we provide an overview of works exploiting text mining techniques in the field of recommender systems, characterizing them according to their purpose and the type of textual data analyzed.

Area 5 - Human-Computer Interaction

Full Papers
Paper Nr: 46
Title:

Facial Expressions Animation in Sign Language based on Spatio-temporal Centroid

Authors:

Diego A. Gonçalves, Maria C. Baranauskas, Julio D. Reis and Eduardo Todt

Abstract: Systems that use virtual environments with avatars for information communication are of fundamental importance in contemporary life. They are even more relevant in the context of supporting sign language communication for accessibility purposes. Although facial expressions provide message context and define part of the information transmitted, e.g., irony or sarcasm, facial expressions are usually treated as a static background feature in a primarily gestural language in computational systems. This article proposes a novel parametric model of facial expression synthesis through a 3D avatar representing complex facial expressions leveraging emotion context. Our technique explores interpolation of the base expressions in the geometric animation through centroid control and spatio-temporal data. The proposed method automatically generates complex facial expressions with controllers that use region parameterization, as in manual models used for sign language representation. Our approach to the generation of facial expressions adds emotion to the representation, which is a determining factor in defining the tone of a message. This work contributes the definition of non-manual markers for a Sign Language 3D avatar and the refinement of the synthesized message in sign languages, proposing a complete model for facial parameters and synthesis using geometric centroid region interpolation. A dataset with facial expressions was generated using the proposed model and validated using machine learning algorithms. In addition, evaluations conducted with the deaf community showed a positive acceptance of the facial expressions and synthesized emotions.
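The synthesis above interpolates base expressions per facial region under centroid control. A minimal sketch of that general idea, blending two base expressions by a per-region weight and tracking the region centroid; the vertex data, region names, and blend function are illustrative assumptions, not the authors' actual parametric model:

```python
# Hypothetical sketch: per-region linear interpolation of base facial
# expressions, loosely following a centroid-controlled parametric idea.

def lerp(p, q, t):
    """Linearly interpolate between two 3D points by weight t."""
    return tuple(a + t * (b - a) for a, b in zip(p, q))

def centroid(vertices):
    """Centroid of a list of 3D vertices."""
    n = len(vertices)
    return tuple(sum(v[i] for v in vertices) / n for i in range(3))

# Toy base expressions: region name -> list of 3D vertices (illustrative)
neutral = {"brow": [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]}
angry   = {"brow": [(0.0, 0.8, 0.1), (1.0, 0.8, 0.1)]}

def blend(expr_a, expr_b, weights):
    """Interpolate each region toward expr_b by its own weight in [0, 1]."""
    return {
        region: [lerp(p, q, weights.get(region, 0.0))
                 for p, q in zip(expr_a[region], expr_b[region])]
        for region in expr_a
    }

half = blend(neutral, angry, {"brow": 0.5})
c = centroid(half["brow"])  # region centroid sits between the base centroids
```

Because interpolation is linear, the blended region's centroid moves along the segment between the two base centroids, which is what makes centroid positions usable as animation controls.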

Paper Nr: 53
Title:

Playing the Role of Co-designers on Mobile PWAs: An Investigation of End-Users Interaction

Authors:

Giulia A. Cardieri and Luciana M. Zaina

Abstract: Progressive Web App (PWA) is a new approach that combines technology resources of both web and native apps. End-User Development (EUD) is an approach in which end-users participate actively in a system’s design process. PWAs are a recent technology, and the impact of associating EUD with PWAs has been little explored. To cover this gap, we proposed the PWA-EU approach in previous work. In this paper, we present an investigation of end-users’ interactions when they act as co-designers on PWAs. We built a mobile app based on the PWA-EU approach and conducted a study with 18 participants, with eight acting as co-designers of the app and ten interacting as non-designers. We carried out a qualitative analysis of the participants’ interaction, focusing on their communication breakdowns and user experience (UX). Our gathered evidence points out that even when acting as co-designers, participants still have communication breakdowns. Moreover, those who acted as co-designers had a more satisfying experience than those who did not.

Paper Nr: 63
Title:

To Inspect or to Test? What Approach Provides Better Results When It Comes to Usability and UX?

Authors:

Walter T. Nakamura, Leonardo C. Marques, Bruna Ferreira, Simone J. Barbosa and Tayana Conte

Abstract: Companies are constantly striving to improve their products to satisfy customers. Evaluating the quality of these products concerning usability and User eXperience (UX) has become essential for obtaining an advantage over competing products. However, several evaluation methods exist, making it difficult to decide which to choose. This paper presents a comparison between usability inspection and testing methods and a UX evaluation. We investigated the extent to which each method allows identifying usability problems with efficiency and effectiveness. We also investigated whether there is a difference in UX ratings between inspectors and users. To do so, we evaluated a Web platform designed for a government traffic department. Inspectors used TUXEL to evaluate the usability and UX of the platform, while usability testing moderators employed Concurrent Think-Aloud and the User Experience Questionnaire with users. The inspection method outperformed usability testing regarding effectiveness and efficiency while covering most major problems that occurred in usability testing, even when considering only the results from novice inspectors. Finally, the UX evaluation revealed contrasting results: while inspectors rated the platform as neutral, reflecting the problems they identified, users rated it very positively, in contradiction to the problems they had during the interaction.

Paper Nr: 70
Title:

Straight to the Point - Evaluating What Matters for You: A Comparative Study on Playability Heuristic Sets

Authors:

Felipe S. Manzoni, Tayana U. Conte, Milene S. Silveira and Simone J. Barbosa

Abstract: Background: Playability is the degree to which a player can learn, control, and understand a game. There are many different playability evaluation techniques that can assess numerous game aspects. However, there is a shortage of comparative studies between these proposed evaluation techniques. Such comparative studies can assess whether the evaluated techniques identify playability problems with a better cost-benefit ratio, and can show game developers the evaluation power of one technique in comparison to others. Aim: This paper aims to present and initially evaluate CustomCheck4Play, a configurable heuristic-based evaluation technique that can be suited to different game types and genres. We evaluate CustomCheck4Play, assessing its efficiency and effectiveness in order to verify whether it performs better than the compared heuristic set. Method: We conducted an empirical study comparing a known heuristic set from the literature with CustomCheck4Play. The study had 54 participants, who identified 49 unique problems in the evaluated game. Results: Our statistical results comparing both evaluation techniques show a significant difference between groups: efficiency (p-value = 0.030) and effectiveness (p-value = 0.004) results represented a statistically significant difference in comparison to the literature heuristic set. Conclusions: Overall, the statistical results show that CustomCheck4Play is a more cost-beneficial solution for the playability evaluation of digital games. Moreover, CustomCheck4Play was better able to guide participants throughout the evaluation process and showed signs that the customization succeeded in adapting the heuristic set to suit the evaluated game.

Paper Nr: 72
Title:

OpenDesign: Analyzing Deliberation and Rationale in an Exploratory Case Study

Authors:

Fabrício M. Gonçalves, Alysson Prado and Maria C. Baranauskas

Abstract: The open phenomenon originating in the free-software movement has spread to several fields, including services and digital and physical products. Nevertheless, some authors point out the limited availability of supporting methods and online tools to face the challenges of distributed collaboration among volunteers. They claim that online collaborative platforms are still needed to support co-creation. In this paper, we investigate deliberation and design rationale in an (open) design process using the OpenDesign Platform. A case study conducted with 22 participants of a conference illustrates the use of the platform to cope with a proposed design challenge. Results, illustrated with graphical representations based on concepts of Actor-Network Theory, provide a visual representation of the network constituted by both the participants and the artifacts (boundary objects) they produce and interpret. Further studies are pointed out, suggesting new possibilities of features and platform enhancements.

Paper Nr: 139
Title:

Articulating Socially Aware Design Artifacts and User Stories in the Conception of the OpenDesign Platform

Authors:

Júlio D. Reis, Andressa D. Santos, Emanuel F. Duarte, Fabrício M. Gonçalves, Breno B. Nicolau de França, Rodrigo Bonacin and M. C. Baranauskas

Abstract: Gathering and understanding the requirements and features of a system play a central role in its acceptance and proper use. Unconventional systems involving uncertainties about their requirements demand methods that adequately support the capture of stakeholders’ needs, desires and objectives. The OpenDesign project aims at supporting the design of computational solutions to wide-ranging problems under a holistic and socially aware perspective through an open and collaborative platform. Designing this platform with a proper understanding of the desired features is certainly a hard task. In this paper, we investigate and characterize an ideation process to be used in situations involving uncertainties about requirements, as experienced in the OpenDesign project. This process involves the collaborative construction of user stories, articulated with Socially Aware Design artifacts created through participatory practices, as part of the platform design and the prospection of its potential uses. Based on the results, we organize the core concepts permeating the OpenDesign proposal, expressed in a map synthesizing its features.

Paper Nr: 220
Title:

Personalising Exergames for the Physical Rehabilitation of Children Affected by Spine Pain

Authors:

Cristian Gómez-Portes, Carmen Lacave, Ana I. Molina, David Vallejo and Santiago Sánchez-Sobrino

Abstract: Injuries or illnesses related to the lumbar spine require great clinical care, as they are among the most prevalent medical conditions worldwide. The use of exergames has become widespread in recent years, and they have been put forward as a possible solution for motivating patients to perform rehabilitation exercises. However, both customizing and creating them is still a task that requires considerable investment in time and effort. In this project we present a language with which we have designed a system for the physical rehabilitation of patients suffering from spinal injuries, which enables the customization and generation of exergames. To assess the system, we designed an experiment with an exergame based on the physical rehabilitation of the lumbar spine. The purpose was to assess its understandability and suitability; the results reveal that the tool is fun, interesting and easy to use. It is hoped that this approach can considerably reduce the complexity of creating new exergames, as well as support the physical rehabilitation process of patients with lower back pain.

Short Papers
Paper Nr: 81
Title:

Usability Heuristics for Tabletop Systems Design

Authors:

Vinícius Diehl de Franceschi, Lisandra M. Fontoura and Marcos R. Silva

Abstract: Tabletops are large interactive displays that enable many users to interact at the same time. These devices have different characteristics from other touchscreen devices, such as smartphones or tablets: they cannot easily be moved to bring the screen closer to the eyes, or have their interface elements rotated, changing the screen to horizontal or vertical, for example. In this context, this paper presents a set of heuristics to be considered in tabletop interface design, from initial planning until validation. Nielsen’s heuristics and others adapted or formalized from Nielsen, as well as research on tabletop characteristics and user tests, were identified and analyzed in order to adapt and formalize heuristics for the tabletop context. A set of twelve heuristics for the tabletop context was created, and they were used to design simulator interfaces. Observing military personnel using these interfaces, we gathered evidence that the heuristics can help designers think about essential interface characteristics that support users in perceiving and understanding the interface goal and how to interact with it.

Paper Nr: 92
Title:

Prioritization of Mobile Accessibility Guidelines for Visual Impaired Users

Authors:

Fiamma M. Quispe, Lilian P. Scatalon and Marcelo M. Eler

Abstract: m-Government involves delivering public services to citizens through mobile applications. Naturally, such apps should benefit everyone, including citizens with disabilities. However, several Brazilian government apps still present accessibility issues, given the high demand on development teams, which end up working on functional requirements to the detriment of other aspects such as usability. In this scenario, we investigate the prioritization of mobile accessibility guidelines extracted from e-MAG (the Brazilian Government Accessibility Model) to help deal with limited resources while also addressing accessibility. We surveyed people with some kind of disability, asking them to rate the relevance of each guideline; they were asked to consider the severity of the problems they face due to non-compliance. Most respondents were visually impaired users (66 out of 103 overall responses), so we considered this group of respondents in our analysis. Results show variations in the perception of relevance among subgroups with different levels of visual impairment. Moreover, the results allowed us to propose a priority list of the evaluated mobile accessibility guidelines. Ultimately, the idea is to provide developers with a strategy to remediate accessibility issues in existing apps and to avoid them during the early development phases of new ones.

Paper Nr: 186
Title:

Management System for Regional Electronic Coupon

Authors:

Hiroki Satoh, Toshiomi Moriki, Yuuichi Kurosawa, Tasuku Soga, Miho Kobayashi and Norihisa Komoda

Abstract: We propose a regional electronic coupon management system. The system is intended for events in which many local shopping streets in Japan offer special menus, helping people rediscover the local shopping streets and their appeal. It has a web application for customers that incorporates a mechanism allowing customers and store clerks to interact with each other, and a function for event organizers that can visualize the success of the event every hour. The proposed system was used in actual events and was well received.

Paper Nr: 190
Title:

Western ERP Roll-out in China: Insights from Two Case Studies and Preliminary Guidelines

Authors:

Paola Cocca, Filippo Marciano, Elena Stefana and Maurizio Pedersoli

Abstract: China is one of the countries with the highest failure rate for implementations of foreign (western) ERP systems. Western companies encounter difficulties in their roll-out projects in China mainly due to cultural issues. Based on the insights from two case studies, this paper aims to provide a preliminary analysis of the main criticalities of western ERP roll-outs in China and to suggest guidelines to increase their success rate.

Paper Nr: 254
Title:

Using Mixed Reality as a Tool to Assist Collaborative Activities in Industrial Context

Authors:

Breno Keller, Andrea C. Bianchi and Saul Delabrida

Abstract: The transition from Industry 3.0 to 4.0 results in the need to develop interconnected systems as well as new interfaces for human-computer interaction with these systems, since it is not yet possible or allowed to fully automate these processes in all industrial contexts. Therefore, new technologies should be designed for workers’ interaction with the new systems of Industry 4.0. Mixed Reality (MR) is an alternative for the inclusion of workers, as it allows their perception to be augmented with information from the industrial environment. Consequently, the use of MR resources as a tool for performing collaborative activities in an industrial context is promising. This work aims to analyze how this strategy has been applied in the industry context and to discuss its advantages, disadvantages, and the characteristics that impact the performance of workers in Industry 4.0.

Paper Nr: 272
Title:

A Wearable System for Electrodermal Activity Data Acquisition in Collective Experience Assessment

Authors:

Patrícia Bota, Chen Wang, Ana Fred and Hugo Silva

Abstract: In recent years, we have observed an increase in research work involving the use of biomedical data in affective computing applications, which is ever more dependent on data and its quality. Many physiological data acquisition devices have been developed and validated. However, there is still a need for pervasive and unobtrusive equipment for collective synchronised acquisitions. In this work, we introduce a novel system, the Electrodermal Activity (EDA) Xinhua Net Future Media Convergence Institute (FMCI) device, allowing group data acquisitions, and benchmark its performance using the established BITalino as a gold standard. We developed a methodical experimental protocol to acquire data from the two devices simultaneously and analyse their performance over a comprehensive set of criteria – Data Quality Analysis. Additionally, the FMCI data quality is assessed over five different setup scenarios towards its validation in a real-world scenario – Data Loss Analysis. The experimental results show a close similarity between the data collected by both devices, paving the way for the application of the proposed equipment in simultaneous, collective data acquisition use cases.

Paper Nr: 73
Title:

Towards a Change Management Framework for Cloud Transitions: Findings from a Case Study at a German Machine Manufacturer

Authors:

Gloria Bondel, Sophie Buchelt, Natascha Ulrich, Horst Urlberger, Christian Kabelin and Florian Matthes

Abstract: Cloud computing provides benefits such as cost savings, a high degree of flexibility, as well as advanced security opportunities. However, introducing a cloud environment into an established enterprise requires both technical and structural changes in the organization. Thus, a well-planned change management approach is essential to ensure a successful transition. The goal of this paper is to contribute to change management research by providing a change management framework for cloud transitions. Furthermore, we present measures to implement the framework consisting of a vision statement definition, communities of practice, a learning journey, a change story, and a collaboration tool. Finally, we also provide a roadmap for the sequential implementation of the identified measures. Our results are based on an extensive literature review and a case study approach, including eight expert interviews.

Paper Nr: 197
Title:

Virtual Reality, a Telepresence Technique Applied to Distance Education: A Systematic Mapping Review

Authors:

Luis Guevara, Jonathan Yépez and Graciela Guerrero

Abstract: The growing demand for innovation in education, coupled with the advancement of visualization technologies, has led a large number of people and researchers to become interested in immersive technologies such as virtual reality, which offers opportunities and challenges in the education sector. Advances in hardware and software have made it possible to incorporate this technology into teaching strategies. The present work details a systematic mapping review that gathers existing scientific documentation on virtual reality in the field of education, describes the methodology used to carry out the review, and sets out the results and conclusions obtained at its end.

Paper Nr: 207
Title:

Conception and Analysis for New Social Networks in University Community

Authors:

Alan Quimbita, Andrés Pupiales and Graciela Guerrero

Abstract: Social networks have dramatically changed the way people relate to each other. In the university context, they have also been used as communication tools that intervene in the learning process. The objective of this study is to initiate a research process for the analysis, design and development of a new social network for the university community. Thus, a systematic mapping of the literature was proposed with the research topic: social network analysis based on eye tracking. The method used establishes a research protocol that defines the guidelines for research, information extraction and analysis of the results of the mapping study. Among the results obtained, there is a growing trend in the use of eye tracking to obtain requirements prior to the design of a social network.

Area 6 - Enterprise Architecture

Full Papers
Paper Nr: 18
Title:

Project, Program, Portfolio Governance Model Reference Architecture in the Classic Approach to Project Management

Authors:

Gonçalo Cordeiro, André Vasconcelos and Bruno Fragoso

Abstract: This paper presents a reference architecture for a projects, programs and portfolios (PPP) governance model. Projects support organizations in the achievement of the planned objectives. The governance model of a project, program or portfolio has a direct relation with the PPP outcome. A PPP governance model is defined as the use of systems, structures of authority, and processes to allocate resources and coordinate or control activities in a project, program and portfolio. The required roles, responsibilities, and performance criteria should be an integral part of the governance model for projects, programs, and portfolios. This research adds to the knowledge base of PPP management a proposed reference architecture that verifies deviations between different PPP governance models at the competence and role levels. The reference architecture is organized in five layers (business governing, steering, directing, managing, and performing), identifying the roles, the roles’ concerns, and their competences. The reference architecture is modelled using ArchiMate. Finally, the proposed architecture is demonstrated and evaluated in a government-owned company.

Paper Nr: 19
Title:

Business Processes with Pre-designed Flexibility for the Control-flow

Authors:

Thomas Bauer

Abstract: In order to avoid limitations for end users, run-time deviations from the pre-defined business process have to be allowed in process-aware information systems (PAIS). Predictable flexibility should already be pre-designed at build-time. The advantage, compared to completely dynamic modifications at run-time, is that this significantly reduces the effort necessary for end users to trigger a deviation. Furthermore, it increases process safety since, for instance, it can be pre-defined which users are allowed to perform which modifications. In this paper we present the corresponding requirements for the control-flow perspective. Thereby, the main focus is to discuss which information has to be pre-designed at build-time in each case. Furthermore, examples from practice are presented in order to illustrate the necessity of the requirements.

Paper Nr: 25
Title:

Digital Value Dependency Framework for Digital Transformation

Authors:

Suthamathy Sathananthan, Dennis Gamrad and Johanna Myrzik

Abstract: Knowing there is considerable value in digitalization, enterprises have started to transform their operations using digital technologies. However, the methods currently used to estimate benefits are those typically used in capital budgeting projects, which do not take the value interdependencies or uncertainties of digitalization into account. Therefore, a standardized yet comprehensive framework and a mathematical model have been developed to estimate and measure the potential of digitalization. The framework and model together were applied in an industrial digital project, where the results show the overall value of the project based on economic and qualitatively measured impacts, and the value contribution of transformational elements such as technologies and organizational changes. The results have been used to form value networks which demonstrate shared values between multiple digital projects with respect to digital capabilities. These results bring transparency across projects for informed decision making and support data-driven business model innovation.

Paper Nr: 41
Title:

Empirical Task Analysis of Data Protection Management and Its Collaboration with Enterprise Architecture Management

Authors:

Dominik Huth, Michael Vilser, Gloria Bondel and Florian Matthes

Abstract: The General Data Protection Regulation has forced organizations worldwide to rethink their processing activities of personal data. One of the key difficulties of ensuring GDPR compliance is the scope of the regulation and its interdisciplinarity: Data protection management (DPM) has to address challenges on the legal, business and technical level over the entire organization. Enterprise architecture management (EAM) is a well-established discipline that follows a holistic approach to strategically develop the enterprise architecture, consisting of people, processes, applications, and their interrelationships. Thus, DPM can be considered a stakeholder in the EA management process. In this paper, we report on a survey with 38 data protection officers that investigates the main challenges for DPM, as well as the collaboration between DPM and EAM during the implementation of the GDPR.

Paper Nr: 88
Title:

On Enterprise Architecture Patterns: A Systematic Literature Review

Authors:

Roberto R. García-Escallón and Adina Aldea

Abstract: Organizations around the world today face many challenges, and not all of them are unique. Such repeatable problems can themselves be solved by reusable solutions. This study focuses on Enterprise Architecture Patterns, reusable solutions to repeating challenges that organizations face. Through a systematic literature review, we aim to gather an exhaustive collection of all such patterns in the literature. To be as comprehensive as possible, the sample also included related fields that research patterns. The result is a collection of 593 patterns and their respective analysis, which will help practitioners tackle their challenges.

Paper Nr: 104
Title:

Enterprise Security Architecture: Mythology or Methodology?

Authors:

Michelle McClintock, Katrina Falkner, Claudia Szabo and Yuval Yarom

Abstract: Security has never been more important. However, without a holistic security structure that secures all assets of an organisation (physical, digital or cognitive), an organisation is at critical risk. Enterprise architecture (EA) applies engineering design principles and provides a complete structure for designing and building an organisation using classification schemas and descriptive representations. Grouping security with EA, through a framework with corresponding security classifications and representations, promises a complete security solution. We evaluate security frameworks and find that grouping security with EA is not new; however, current solutions indicate a lack of research process in their development and a disjointed focus, either on technical or policy aspects, or on a single department or project. Thus, there is a need for a holistic solution. We use a Design Science Research methodology to design, develop, and demonstrate a security EA framework that provides an organisation with a complete security solution regardless of industry, budgetary constraints, or size, and survey professionals to critically analyse the framework. The results indicate the need for a complete security structure, including benefits in governance, resourcing, functional responsibilities, risk management and compliance.

Short Papers
Paper Nr: 4
Title:

e-Business Reference Modelling Framework for SMEs: An Enterprise Architecture based Approach

Authors:

Magido Mascate and André Vasconcelos

Abstract: Across the digital industry and economy, numerous studies provide conceptual models depicting digital technology adoption by different businesses. However, existing studies show that SMEs lack an e-business reference modelling framework that supports the design and adoption of digitally enabled business models. Therefore, we exploit an adaptive and technology-independent approach to propose an EA-based e-business reference modelling framework for SMEs in diverse business contexts. Our framework comprises three main interrelated building blocks, starting with 1) a situational analysis to determine the motivating factors for change and the barriers of the business environment, followed by SME profiling. The SME profiling embodies 2) a depiction of SMEs' readiness based on the existence of a digital strategy, a catalogue of digital-value drivers, and business model proposals; and culminates with 3) a description of the implementation based on the e-business architecture. In addition, a fourth building block is incorporated into the framework for e-business solution architecture, selection, and deployment into the current SMEs' business context. In this study, we take a practical approach, proposing and demonstrating the application of tools that support the conception of an SME's e-business in the business context across all facets of our framework.

Paper Nr: 26
Title:

Current Practices in the Information Collection for Enterprise Architecture Management

Authors:

Robert Ehrensperger, Clemens Sauerwein and Ruth Breu

Abstract: The digital transformation influences business models, processes, and the enterprise IT landscape as a whole. Therefore, business-IT alignment is becoming more important than ever before. Enterprise architecture management (EAM) is designed to support and improve this business-IT alignment. The success of EAM crucially depends on the information available about a company's enterprise architecture, such as infrastructure components, applications, and business processes. This paper discusses the results of a qualitative survey with 26 experts in the field of EAM. The goal of this survey was to highlight current practices in the information collection for EAM and identify relevant information from enterprise-external data sources. The results provide a comprehensive overview of collected and utilized information in the industry, including an assessment of the relevance of such information. Furthermore, the results highlight challenges in practice and point out investments that organizations plan in the field of EAM.

Paper Nr: 34
Title:

Collecting and Integrating Unstructured Information into Enterprise Architecture Management: A Systematic Literature Review

Authors:

Robert Ehrensperger, Clemens Sauerwein and Ruth Breu

Abstract: In the age of digital transformation, strategic IT alignment is becoming a primary driver for economic success. In this context, the optimization of strategic IT alignment plays a key role in enterprise architecture management (EAM). A successful EAM strategy depends on the quantity and quality of the available information within the enterprise architecture (EA) models. EA information about the functional scope of software solutions and its supported business processes is often available only in an unstructured form. Automatic acquisition of this information assists companies in the design of target architectures. In recent years, new technologies have been introduced that facilitate the use of unstructured information. The research at hand discusses these new technologies and emerging challenges. Furthermore, it provides a systematic literature review of the current state of research on collecting and integrating unstructured information into EAM.

Paper Nr: 138
Title:

Towards an Automated DEMO Action Model Implementation using Blockchain Smart Contracts

Authors:

Marta Aparício, Sérgio Guerreiro and Pedro Sousa

Abstract: Blockchain (BC) is a technology that introduces decentralized, replicated, autonomous, and secure databases. A Smart Contract (SC) is a transaction embedded in the BC that contains executable code and its own internal storage, offering immutable execution and record keeping. An SC has enormous potential for automating traditional paper contracts and encoding contract logic into program code, thereby replacing the role of a central authority and reducing the time and money spent on the enforcement of such contracts. This paper intends to determine the sufficiency or insufficiency of an ontology, in particular the DEMO Action Model, to support the automatic generation of SC code from text. A new way to capture an SC in a user-friendly manner could then be proposed. With this, it is intended to eliminate the errors associated with programming, since the SC code is automatically generated from models.

Paper Nr: 147
Title:

An Exploratory View on Risk Management Constructs for Business Process Models

Authors:

Gabriel D. Favera, Denilson S. Ebling, Vinicius Maran, Jonas B. Gassen and Alencar Machado

Abstract: Business Process Model and Notation (BPMN) is a widely used process modeling notation in both academia and industry, with a structure that is easy to understand and use. It contains more than a hundred elements referring to various concepts. However, it does not cover risk management constructs. In this paper, we seek to identify the need for, and which aspects are important in, associating process models with risk management. We applied an exploratory research method with experts in the area of risk management who also work with process models. The work resulted in various concepts that could be represented in process models, such as risk presentation, control activities and risk mitigation. We conclude that experts would like to have these disciplines better integrated and that we have a starting point to design, for instance, a BPMN extension covering such aspects.

Paper Nr: 177
Title:

Design Challenges for GDPR RegTech

Authors:

Paul Ryan, Martin Crane and Rob Brennan

Abstract: The Accountability Principle of the GDPR requires that an organisation can demonstrate compliance with the regulation. A survey of GDPR compliance software solutions shows significant gaps in their ability to demonstrate compliance. In contrast, RegTech has recently brought great success to financial compliance, resulting in reduced risk, cost savings and enhanced financial regulatory compliance. It is shown that many GDPR solutions lack interoperability features such as standard APIs, meta-data or reports, and they are not supported by published methodologies or evidence to support their validity or even utility. A proof-of-concept prototype was explored using a regulator-based self-assessment checklist to establish whether RegTech best practice could improve the demonstration of GDPR compliance. The application of a RegTech approach provides opportunities for demonstrable and validated GDPR compliance, in addition to the risk reductions and cost savings that RegTech can deliver. This paper demonstrates that a RegTech approach to GDPR compliance can facilitate an organisation meeting its accountability obligations.

Paper Nr: 188
Title:

SMARTINSUR: A Platform for Digitizing Business Transactions in the Insurance Industry

Authors:

Andreas Lux and Michael Muth

Abstract: For the digitization of business processes in insurance, the concept of a platform is presented that is based on the design principles of modern software architectures (domain-driven design, microservices). This raises the question of whether the complex application scenario can be realized with this software development approach and whether the advantages of agile development come into play. The complex interplay of the different roles involved in the processes, namely insurance companies, sales organizations and brokers, is illustrated using the example of a document service. The advantages resulting from the digitization of the example process, such as labor and cost savings, as well as quality improvements and increased customer satisfaction, are elaborated. In the near future, process events will be recorded via the integration of a process mining service, which can be of great help for further process optimization.

Paper Nr: 194
Title:

ICs Manufacturing Workflow Assessment via Multiple Logs Analysis

Authors:

Vincenza Carchiolo, Alessandro Longheu, Giuseppe Saccullo, Simona M. Sau and Renato Sortino

Abstract: Today’s complexity in ICT services, which consist of several interacting applications, requires strict control over log files to detect what exceptions/errors occurred and how they could be fixed. The current scenario is increasingly hard due to the volume, velocity, and variety of (big) data within log files; therefore, an approach to assist developers and facilitate their work is needed. In this paper an industrial application of such log analysis is presented; in particular, we consider the manufacturing of Integrated Circuits (ICs), i.e. a set of physical and chemical processes performed by production machines on silicon slices. We present a widely used set of open-source tools that together form a platform for log mining to assess manufacturing workflow processes. We show that the proposed architecture helps in discovering and removing anomalies and slowdowns in IC production.

Paper Nr: 200
Title:

Towards a Conceptual Model for Undesired Situation Detection through Process Mining

Authors:

Matheus F. Flores, Denílson Ebling, Jonas B. Gassen, Vinícius Maran and Alencar Machado

Abstract: As technology advances, recent research proposes solutions to monitor and control organizational processes, aiming to maximize efficiency and productivity, minimize the loss of the resources involved in the execution of processes, whether human or technological, and maintain a controlled environment so that the objectives of the organizations, that is, the satisfaction of their customers, are achieved. For this, historical information contained in event logs, related to the execution of processes in the organizational environment, is frequently used. This information serves as a basis for controlling the environment and preventing the occurrence of unwanted situations. In this context, this paper presents a model for detecting situations of interest in the organizational environment through event logs, making it possible to initiate proactive actions in the face of these situations, resulting in a Web application whose interfaces validate the purpose of the article. Beyond this scenario, an event log related to the execution of a real process was tested. By means of control charts, it is possible to view (using time parameters) delays in the execution of the process, which may be related to a situation of interest.

Paper Nr: 203
Title:

Exploring Blockchain Technology to Improve Multi-party Relationship in Business Process Management Systems

Authors:

Paulo H. Alves, Ronnie Paskin, Isabella Frajhof, Yang R. Miranda, João G. Jardim, Jose B. Cardoso, Eduardo H. Tress, Rogério Ferreira da Cunha, Rafael Nasser and Gustavo Robichez

Abstract: Business Process Management Systems (BPMSs) are often used to track activities, identify inefficiencies, and streamline workflows. Typically, BPMSs are used by a single organization for internal users and processes through a trusted central system. However, scenarios involving multiple parties present a new challenge: ensuring the reliability of registered information and strict adherence to business rules for all participants without a central authority. Therefore, we explored the use of blockchain technology in association with a BPMS to create a Distributed Business Process Management System (dBPMS). This integration can fulfill the aforementioned requirements, creating tamper-proof registries and allowing reliable self-execution through smart contracts, within a trusted environment for all parties, without the need for inter-party trust. The proposed solution provides a workflow encompassing all activities and parties in an efficient ecosystem.

Paper Nr: 214
Title:

Microservices Management with the Unicorn Cloud Framework

Authors:

George Feuerlicht, Marek Beranek and Vladimir Kovar

Abstract: The recent transition towards cloud computing from traditional on-premises systems and the extensive use of mobile devices has created a situation where traditional architectures and software development frameworks no longer support the requirements of modern enterprise applications. This rapidly evolving situation is creating a demand for new frameworks that support the DevOps approach and facilitate continuous delivery of cloud-based applications using microservices. In this paper, we first discuss the challenges that the microservices architecture presents and then describe the Unicorn Cloud Framework designed specifically to address the challenges of modern cloud-based applications.

Paper Nr: 224
Title:

Towards an Automatic Data Value Analysis Method for Relational Databases

Authors:

Malika Bendechache, Nihar S. Limaye and Rob Brennan

Abstract: Data is becoming one of the world’s most valuable resources and it is suggested that those who own the data will own the future. However, despite data being an important asset, data owners struggle to assess its value. Some recent pioneering works have led to an increased awareness of the necessity of measuring data value. They have also put forward some simple but engaging survey-based methods to help with first-level data assessment in an organisation. However, these methods are manual and depend on the costly input of domain experts. In this paper, we propose to extend the manual survey-based approaches with additional metrics and dimensions derived from the evolving literature on data value dimensions and tailored specifically to our case study. We also developed an automatic, metric-based data value assessment approach that (i) automatically quantifies the business value of data in Relational Databases (RDB), and (ii) provides a scoring method that facilitates the ranking and extraction of the most valuable RDB tables. We evaluate our proposed approach on a real-world RDB database from a small online retailer (MyVolts) and show in our experimental study that the data value assessments made by our automated system match those expressed by the domain expert approach.

Paper Nr: 227
Title:

National Cybersecurity Capacity Building Framework for Countries in a Transitional Phase

Authors:

Mohamed A. Ben Naseir, Huseyin Dogan, Edward Apeh and Raian Ali

Abstract: Building cybersecurity capacity has increasingly become a subject of global concern, in both stable countries and countries in a transitional phase. National and international Research & Technology Organisations (RTOs) have developed a plethora of guidelines and frameworks to help with the development of a national cybersecurity framework. The current state-of-the-art literature provides guidelines for developing national cybersecurity frameworks, but relatively little research has focused on the context of cybersecurity capacity building, especially for countries in a transitional stage. This paper proposes a National Cybersecurity Capacity Building Framework (NCCBF) that relies on a variety of existing standards, guidelines, and practices to enable countries in a transitional phase to transform their current cybersecurity posture by applying activities that reflect desired outcomes. The NCCBF provides stability against unquantifiable threats and enhances security by embedding leading and lagging security performance measures at a national level. The NCCBF is inspired by a Design Science Research (DSR) methodology and guided by enterprise architecture, business process and modelling approaches. Furthermore, the NCCBF has been evaluated by a focus group against a structured set of criteria. The evaluation demonstrated the valuable contribution of the NCCBF in representing the challenges in national cybersecurity capacity building and the complexities associated with it.

Paper Nr: 250
Title:

CogBPMN: Representing Human-computer Symbiosis in the Cognitive Era

Authors:

Juliana J. Ferreira, Viviane Torres da Silva, Raphael M. Thiago, Leonardo G. Azevedo and Renato G. Cerqueira

Abstract: The human-computer symbiosis is a core principle of Cognitive Computing, where humans and computers are coupled very tightly, and the resulting partnership presents new ways for the human brain to think and computers to process data. Business Process Management (BPM) provides methods and tools to represent, review, and discuss business domains, considering their knowledge, context, people, computer systems, and so on. Such methods and tools will be affected by advances in Cognitive Computing. Business process modeling notations need to support the discussion and representation of human-computer symbiosis in any given organizational context. We propose CogBPMN, a set of cognitive recommendation subprocess types that can be used to represent human-computer symbiosis in business process models. With CogBPMN, business stakeholders and Cognitive Computing specialists can understand how business processes can thrive by considering cognitive empowerment in organizations’ core processes. We discuss the proposed cognitive subprocesses in a medical domain use case.

Paper Nr: 10
Title:

Toward the Alignment and Traceability between Business Process and Software Models

Authors:

Aljia Bouzidi, Nahla Z. Haddar, Mounira Ben-Abdallah and Kais Haddar

Abstract: This paper presents an approach to derive static and functional software models from a business process model (BPM), including trace links between business-system and system-system artifacts. The approach is based on a set of well-defined rules that transform a source model represented in the Business Process Model and Notation (BPMN) into a UML class diagram structured according to the model-view-controller design pattern, a UML use case model, and a trace model. All artifacts, except the trace model, are represented according to the standards (BPMN and UML). To show the feasibility of our approach, we apply it to a topical case study.

Paper Nr: 21
Title:

Using Enterprise Architecture to Model a Reference Architecture for Industry 4.0

Authors:

Miguel Paiva, André Vasconcelos and Bruno Fragoso

Abstract: Enabled by the new technologies brought by the fourth industrial revolution (Industry 4.0), organizations have the possibility to address productivity challenges and consequently become more profitable. This research uses the Reference Architecture Model for Industry 4.0 (RAMI 4.0) and the Industry 4.0 Component Model as ingredients for a reference architecture modelled using Enterprise Architecture (EA). The RAMI 4.0 EA is modelled in ArchiMate by producing a mapping between ArchiMate and RAMI 4.0 concepts. To apply and evaluate the proposed solution, the EA of an industry project is modelled, including the four lower layers of RAMI 4.0 (Asset, Integration and Communication, Information, and Functional). The evaluation of the mapping is done using ontological analysis, supporting the benefits of using a common language for modelling Enterprise and Industry 4.0 concepts.

Paper Nr: 44
Title:

An Approach for Adaptive Enterprise Architecture

Authors:

Wissal Daoudi, Karim Doumi and Laila Kjiri

Abstract: Given the fast emergence of new technologies and highly changing business demands, enterprises are confronted with the need to keep up with an evolving transformation. This transformation is subject to internal and external factors, which very often make it take the form of disruptive changes. As a consequence, various parts of companies’ Enterprise Architecture are impacted. To address the new requirements of these increasingly dynamic environments, enterprises need to transition from heavy, document-centred Enterprise Architecture frameworks to more agile and continuously adaptive approaches. On the other hand, Agile Software Development (ASD) methods are commonly used for IT development. They are mainly characterized by the high involvement of the requester and rapid accommodation of development needs. This paper presents an Adaptive Enterprise Architecture model that is inspired by some ASD values. Thus, we begin with a brief summary of the criteria that we consider compulsory for an Adaptive Enterprise Architecture. Then we present the related work and the connection between agile values and our criteria. Finally, we describe our model and illustrate it via a case study.

Paper Nr: 106
Title:

A Framework for Sustainable and Data-driven Smart Campus

Authors:

Zeynep N. Kostepen, Ekin Akkol, Onur Dogan, Semih Bitim and Abdulkadir Hiziroglu

Abstract: As small cities, university campuses offer many opportunities for smart city applications to increase service quality and the efficient use of public resources. Enabling technologies for Industry 4.0 play an important role in the goal of building a smart campus. The study contributes to the digital transformation process of İzmir Bakırçay University, a newly established university in Turkey. The aim of the study is to plan a road map for establishing a smart and sustainable campus. A framework for the development of a smart campus, including an architectural structure and the application process, is presented in the study. The system application is designed in three stages. The system, which is planned to be built on the existing information systems of the university, includes data collection from sensors and data processing to support management processes. The proposed framework is expected to support value-added operations such as increasing personnel productivity, increasing the quality of classroom training, reducing energy consumption, accelerating interpersonal communication and finding the fastest solution to problems on campus. Therefore, the result is not only a smart campus but also a system designed for sustainability and maximum benefit from the facilities.

Paper Nr: 111
Title:

Citizen’s Perception of Public Services Digitization and Automation

Authors:

Edna D. Canedo, Heloise A. Tives and Anderson J. Cerqueira

Abstract: The Brazilian government, with the objective of transforming its government services into digital services, has published relevant decrees in recent years. This article characterizes public services, especially Brazilian ones, in the context of digital transformation and proposes a model/process for the digitization of public services focused on the needs of the citizen, which can be used by any government agency that wishes to digitize its services, being adaptable and flexible according to its needs. In addition, this paper presents the results of a survey within the Federal Public Administration to identify the expectations of citizens and/or public servants regarding the digitization of public services by the Brazilian government.

Paper Nr: 154
Title:

Analyzing Software Application Integration into Evolving Enterprise Business Processes using Process Architectures

Authors:

Zia Babar and Eric Yu

Abstract: Many organizations face frequent, ongoing challenges as they attempt to integrate software applications into their business processes, particularly as enterprises continuously evolve, resulting in shifting requirements for these applications. The hiBPM framework supports modelling the multiple interconnected processes involved in integrating software applications into enterprise business processes so that alternative process-level configurations can be compared and analysed. To support evolving design capabilities and the flexibility of process execution, we elaborate on “Design-Use” and “Plan-Execute” relationships between processes. Design-Use relationships represent the exchange of a tool, capability or artifact that can be used repeatedly by other enterprise business processes to attain some process or enterprise objective. Plan-Execute relationships represent the exchange of information that enables the execution of process activities to accomplish enterprise objectives while simultaneously reducing the space of possible process executions. We applied the hiBPM framework at a large retail organization to illustrate how the organization could better integrate data analytics applications into its existing business and IT processes.