Our BPI cluster meeting on 21st February will be a joint session with a DSC/e lecture by Marek Reformat.
Title: Fuzziness in Processing and Representation of Web Data
The web represents an immense repository of information. The number of sources of structured and unstructured data is growing every day. There is no doubt that our dependency on web data increases continuously. However, the increased amount of data, although recognized as a positive and beneficial fact, creates challenges regarding our ability to fully utilize that data. Such a situation increases pressure as well as expectations for providing better ways of processing data available on the web.
Every day, users search the web for things of interest to them. On many occasions they expect precise results. However, human curiosity and the need to be exposed to different and novel things are an important part of exploration processes. Existing systems supporting users in their search activities provide them with some variations, but it is not a controlled process. Diversity is accidental. In the first part of the presentation, we postulate that application of fuzziness in systems supporting users in their search will allow them to guide and control mechanisms that identify alternatives, and influence recommendations. Fuzzy-based methods can be applied to scenarios where users want to relax their requirements. Here, we concentrate on social networks. A methodology for selecting groups of individuals that satisfy linguistically described requirements regarding a degree of matching between users’ interests and collective interests of groups is presented. Additionally, we describe a simple fuzzy-based recommending approach that aims at constructing lists of suggested items. This is accomplished via explicit control of requirements regarding rigorousness of identifying users who become a reference base for generated suggestions.
A novel graph-based data representation format becomes an attractive and important way of storing data. It leads to better utilization of information stored and available on the web. High connectedness of such representation provides a means to create methods and techniques that can assimilate new data and build knowledge-like data structures. Such procedures resemble a human-like way of dealing with information. One of the most popular graph-based data formats is called the Resource Description Framework (RDF). It is a data format introduced together with the concept of Semantic Web. In the second part of the presentation, we present a process of assimilating information from multiple sources of RDF data. A newly proposed form of participatory learning using propositions provides an approximate reasoning-based approach to integrate previously unknown information with already known facts. We show how participatory learning has been adapted to integrating new information represented as relations. The approach recognizes two types of variables: conjunctive and disjunctive, that are common for knowledge graphs existing on the web.
The details of the above methodologies are presented, and multiple examples illustrating behavior of the processes are provided.
Marek Reformat (IEEE SM’05) received the M.Sc. degree (Hons.) from the Technical University of Poznan, Poznan, Poland, and the Ph.D. degree from the University of Manitoba, Winnipeg, MB, Canada. He is currently a Professor with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada. The goal of his research activities is to develop methods and techniques for intelligent data modeling and analysis leading to translation of data into knowledge, as well as to design systems that possess abilities to imitate different aspects of human behavior. In this context, the concepts of computational intelligence—with fuzzy computing and possibility theory in particular—are key elements necessary for capturing relationships between pieces of data and knowledge, and for mimicking human ways of reasoning about opinions and facts. He also works on computational intelligence-based approaches for dealing with information stored on the web. He applies elements of fuzzy sets to social networks, linked data, and Semantic Web in order to handle inherently imprecise information, and provide users with unique facts retrieved from the data. All his activities focus on introduction of human aspects to web and software systems which will lead to the development of more human-aware and human-like systems.
You are all kindly invited to this joint session!
Time: 12:30 – 13:30
Remarks: Doors open at 12:00h
Location: TU/e Luna building, Corona room (Koepelzaal)
The student who will present is Evertjan Peer. He conducts a joint master project for Operations Management & Logistics and Data Science in Engineering (which is a special track of Computer Science & Engineering). From the faculty IE&IS he is supervised by Yingqian Zhang. Vlado Menkovski is his first supervisor from the Computer Science faculty.
Solving the Train Unit Shunting Problem: A Deep Reinforcement Learning Approach.
The Train Unit Shunting Problem is a complex task which consists of planning train movements and cleaning/maintenance tasks on shunting yards. Current solution techniques fall short by either having a long runtime (linear programs), or producing non-intuitive solutions (local search). In this thesis I investigate whether recent successes of Deep Reinforcement Learning in solving the game of Go, and playing Atari games, can be brought to this real life planning problem. Could a Deep Reinforcement Learning agent build up experience about what good moves are when handling trains on the shunting yard? An iterative procedure is followed in which the problem formulation complexity is gradually increased. I will introduce Deep Reinforcement Learning (more specifically the Deep Q-Network) and discuss my experiences so far in applying these techniques to planning problems.
Miranti Rahmani will give a talk during our session. She is a master student of Prof. Uzay Kaymak.
Feature Selection for Predictive Medical Decision Models
This project is part of a master's graduation project whose end goal is to construct a new feature selection method for use in the (predictive) medical decision-making process. The presentation will largely cover a brief introduction to feature selection in general and within the medical domain, basic feature selection methods, the state of the art of feature selection for medical data, the problem description, the research question for the project, and a brief plan for the next steps. Since the project is still at an early stage, any input or suggestions on content or methodology would be appreciated.
Nick Hoffmans will give a talk during our session. He is a master student of Prof. Uzay Kaymak (1st assessor) and dr. Anna Wilbik (2nd assessor).
Title: Fuzzy compliance checking of healthcare process data on time-task matrices
Time-task matrices (TTMs) are a common means of representing a standardized healthcare process, a so-called care pathway. In this research, we propose checking to what extent the actual process execution data comply with the TTM from the perspectives of control-flow and data (timing, resources and possibly other data variables). Whereas the standard compliance checking techniques use rather crisp boundaries to measure compliance, we propose to use fuzzy sets to allow a certain amount of flexibility. The method is tested through a retrospective case study at Maastricht University Medical Center+ on patients who underwent surgery for the treatment of aortic valve stenosis.
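To illustrate the contrast between crisp and fuzzy boundaries, here is a minimal sketch in Python. It assumes a hypothetical pathway rule ("the task should be completed within 24 hours, and delays of up to 12 further hours are tolerated to a decreasing degree"). The thresholds, the linear shape of the membership function, and the function names are illustrative assumptions and are not taken from the study.

```python
def crisp_compliance(hours: float, deadline: float = 24.0) -> float:
    """Crisp check: fully compliant up to the deadline, non-compliant after it."""
    return 1.0 if hours <= deadline else 0.0


def fuzzy_compliance(hours: float, deadline: float = 24.0,
                     tolerance: float = 12.0) -> float:
    """Fuzzy check: the degree of compliance falls linearly from 1 to 0
    between the deadline and deadline + tolerance (hypothetical values)."""
    if hours <= deadline:
        return 1.0
    if hours >= deadline + tolerance:
        return 0.0
    return (deadline + tolerance - hours) / tolerance


if __name__ == "__main__":
    # Compare both checks for a few completion times (in hours).
    for h in (20.0, 27.0, 30.0, 40.0):
        print(h, crisp_compliance(h), fuzzy_compliance(h))
```

With the crisp rule, a task completed after 27 hours is simply non-compliant. With the fuzzy rule, it is still compliant to a degree of 0.75, which is the kind of flexibility the abstract refers to.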
Dian Kroes, a master student of Anna Wilbik, has the following talk:
Measuring the influence of greenwashing practices on public opinion – a Twitter sentiment analysis
After the Deepwater Horizon oil spill in 2010, BP has actively used greenwashing to try to improve its company image. Greenwashing is a common practice that provides the public with “false green information”. To see whether these greenwashing practices really improve the company’s image in the minds of the public, a Twitter sentiment analysis is conducted. The sentiment analysis is based on a Naïve Bayes algorithm and sets out a sentiment trend over the period 2009-2014. This trend is used to see whether public opinion improves after the identified greenwashing practices.
On 6th of December, between 12:30 and 13:30 in Pav. K.16, Guillaume Crognier will present his work on “Vehicle routing in Python”, which he conducted under the supervision of dr. Jérôme GALTIER at Orange Labs Product & Services (Orange Gardens, Châtillon, France). Guillaume Crognier is a student at Ecole Polytechnique Paris, which is one of the leading engineering schools in France.
Everyone is welcome to Guillaume’s talk!
This talk will present and compare two different approaches to solving the Capacitated Arc Routing Problem (CARP) exactly, and will provide computational results. The first one is named the “standard approach” and consists of writing a linear program based on flow constraints to ensure that generated paths are connected. The second one is more complex and relies on a master/slave formulation of the problem.
On 7th of May, 12:30-13:30 in Pav K 16 Guillaume Zamora will present his work on “A clinical decision support system by using wrist-worn smartphone tremor measurements”.
Everyone is welcome!
Background: Tremor-related diseases affect millions of people around the world, hindering various everyday tasks, such as holding a glass of water. Tremor severity assessment is an important element of the diagnosis and treatment decision-making process. For decades, subjective clinical rating scales were mostly used. Recently, attention to computerized tremor analysis has grown remarkably. While dedicated devices are expensive and not practical for everyday use, smartphone applications are promising. Previous studies on Parkinson’s disease or Essential Tremor mostly classified the Fahn score rating scale.
Objective: Using machine learning techniques to regress tremor severity observed by clinicians (ETRS) and patients (QUEST), and to give the research accessibility and new insights that would later lead to improvements in the decision-making process.
Methods: Five different wrist-worn tests were performed on 20 Essential Tremor patients using the open-source TREMOR12 application, which is compatible with iPhone/iWatch. Linear displacements and joint rotations are measured by the in-device accelerometer and gyroscope. From these signals, time, frequency and time-frequency domain tools are used to extract the following features: dominant frequency, dominant magnitude, signal RMS, signal period and the power growth during the test.
Results: The study demonstrates good predictive power, and its feature extraction brings improvements compared to previous studies in similar settings.
Conclusion: This study gives the research new directions and tools for further investigations into tremor severity evaluation. Improvements in smartphone sensors in the coming years, research on the best variable to predict, and larger data collection may lead to very robust models that measure tremor severity more rapidly, accurately and objectively than clinical rating scales.
Key words: movement disorder, tremor severity, smartphone, feature extraction, regression
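As a rough illustration of three of the features named in the Methods (dominant frequency, dominant magnitude and signal RMS), the following Python sketch computes them for a single one-dimensional signal. The abstract does not state how the features were actually computed, so the direct discrete Fourier transform, the function names, and the synthetic test signal below are all assumptions made for illustration. No study data is used.

```python
import cmath
import math


def signal_rms(samples):
    """Root mean square of the samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def dominant_frequency(samples, sampling_rate_hz):
    """Return (frequency_hz, magnitude) of the strongest spectral component,
    using a direct discrete Fourier transform (O(n^2), fine for short signals)."""
    n = len(samples)
    best_k, best_mag = 0, 0.0
    # Scan the bins strictly between the constant offset (k = 0) and Nyquist.
    for k in range(1, (n + 1) // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        mag = abs(coeff) * 2 / n  # scale to the amplitude of the sinusoid
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * sampling_rate_hz / n, best_mag


# Synthetic signal (assumption): a 5 Hz sine of amplitude 2, sampled at 50 Hz for 2 s.
rate = 50.0
demo = [2.0 * math.sin(2 * math.pi * 5.0 * i / rate) for i in range(100)]
freq, mag = dominant_frequency(demo, rate)
print(freq, mag, signal_rms(demo))
```

For real accelerometer or gyroscope recordings, a fast Fourier transform from a numerical library would be the usual choice. The direct transform is used here only to keep the sketch free of dependencies.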
On Wednesday, 17.05.2017, 12:30-13:30 in Pav K16
Yingqian Zhang will present her talk on
“Learning decision trees with flexible constraints and objectives using integer optimization”
We encode the problem of learning the optimal decision tree of a given depth as an integer optimization problem. We show experimentally that our method (DTIP) can be used to learn good trees up to depth 5 from data sets of size up to 1000. In addition to being efficient, our new formulation allows for a lot of flexibility. Experiments show that we can use the trees learned from any existing decision tree algorithms as starting solutions and improve the trees using DTIP. Moreover, the proposed formulation allows us to easily create decision trees with different optimization objectives instead of accuracy and error, and constraints can be added explicitly during the tree construction phase. We show how this flexibility can be used to learn discrimination-aware classification trees, to improve learning from imbalanced data, and to learn trees that minimise false positive/negative errors.