# 2020-2021

## Details

• Video-Based Gesture Interface for Touchless Tablet Operation
Supervisor: Michael Greenspan
Description: “A novel video communication system is being developed for use within primary healthcare facilities. The system will allow nurses and other healthcare professionals to communicate with patients from various locations throughout a hospital ward. This will not only reduce the need to physically enter patient rooms, thereby reducing the need for PPE, but will also improve the workflow of healthcare professionals, who can monitor patients and communicate with them more effectively. The system will involve a number of tablets, mounted at specific locations throughout a ward, such as outside patient rooms. One requirement is that the system be operated in a purely hands-free mode, as any surfaces that are touched have the potential to transmit microbes. The objective of this project is to develop an effective, video-based gesture operation mode for the system. The team will investigate a number of different public domain video-based gesture libraries, and evaluate the relative effectiveness of these systems for the task. A specific “gesture language” will then be developed based on the selected method; it will be simple, intuitive, robust, and appropriate for the task. The system will be implemented in primary healthcare facilities throughout the term of this project, and so the team will have the opportunity to evaluate the effectiveness of various alternatives with a representation of the target user group, in order to make informed decisions on the design.”
• Homomorphic Encryption Implementation
Supervisor: Selim Akl
Description: “The emergence and widespread use of cloud services in recent years represented one of the most important evolutions in information technology. By offering abundant, conveniently accessible, and relatively inexpensive storage for data, the cloud is certainly a very attractive option for businesses and citizens alike. However, this ease of access to often personal, and sometimes sensitive, information is clearly coupled with serious concerns over the privacy and security of the data. The most effective approach to mitigate threats to data safety by unscrupulous individuals is cryptography. Unfortunately, encrypting the data offers an inevitable trade-off: convenience is diminished. Users wishing to process their data must download the encrypted information from the cloud, decrypt it, process the plaintext version, re-encrypt the result, and re-upload the ciphertext to the cloud. A special kind of cryptography exists, however, called homomorphic cryptography, which allows users to operate remotely on their encrypted data, directly on the untrusted database. Three projects are available aiming to implement the use of homomorphic encryption for three types of data, namely, graphs, integers, and polynomials. Each project will provide a working homomorphic encryption system that: (i) receives inputs, (ii) encrypts the inputs, (iii) applies operations on the encrypted inputs, and (iv) decrypts the results.”
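As a sketch of those four steps, here is a toy additively homomorphic scheme (a Paillier cryptosystem with deliberately tiny primes; a real system would use keys of 2048 bits or more and a vetted library):

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier keypair -- real deployments use much larger primes."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)              # valid because we fix g = n + 1
    return (n, n * n), (lam, mu)

def encrypt(pub, m):
    n, n2 = pub
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:        # r must be invertible mod n
        r = random.randrange(2, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(pub, priv, c):
    n, n2 = pub
    lam, mu = priv
    return (pow(c, lam, n2) - 1) // n * mu % n

pub, priv = keygen()
c1, c2 = encrypt(pub, 12), encrypt(pub, 30)
c_sum = c1 * c2 % pub[1]              # multiply ciphertexts only...
print(decrypt(pub, priv, c_sum))      # ...yet the plaintexts add: 42
```

Because Paillier is additively homomorphic, multiplying ciphertexts adds the underlying plaintexts, so the untrusted host never sees 12, 30, or 42 in the clear.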
• Intelligent Supporter for the Development and Maintenance of Notebooks
Supervisor: Yuan Tian
Description: “Jupyter Notebook is an interactive computational environment where people can combine code execution, text, mathematics, plots, and rich media into a single document. It has become extremely popular as a method to develop data analytics projects in industry and academic domains. However, despite the increasing popularity and importance of Jupyter Notebooks, recent studies also reveal that many open-source notebooks are of low quality, e.g., unable to execute or replicate, hard to debug and maintain. These issues exist mainly due to the limited quality control tools provided by Project Jupyter, the organization that develops and maintains Jupyter Notebook. Moreover, the lack of software quality tools reduces the efficiency of developing data science projects/products, reduces the reusability of and potential collaboration on data science code, and ignores critical bugs that could potentially harm the system. In the literature, some tools such as ReviewerNB have been proposed to support visualization of code changes in a notebook over its previous version. However, these tools are line-based and thus fail to analyze cell-level changes in notebooks. In addition, existing tools are unable to track the full history of cells in notebooks, which is crucial for project owners and researchers to mine and investigate maintenance patterns from code changes in notebooks. To fill this gap, in this project we would like to take a first look at the cell-level evolution of notebooks in open-source software and develop a tool that can automatically visualize and track the changes in the code cells of a given notebook.”
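As a rough illustration of what cell-level (rather than line-based) change tracking involves, the sketch below parses the `.ipynb` JSON directly; the notebook contents and function names are invented for this example:

```python
import json
import difflib

def code_cells(nb_json):
    """Return the source text of each code cell in a notebook JSON string."""
    nb = json.loads(nb_json)
    return ["".join(c["source"]) for c in nb["cells"]
            if c["cell_type"] == "code"]

def cell_diff(old_nb, new_nb):
    """Pair code cells by position and report only the changed ones."""
    changes = []
    for i, (a, b) in enumerate(zip(code_cells(old_nb), code_cells(new_nb))):
        if a != b:
            delta = "\n".join(difflib.unified_diff(
                a.splitlines(), b.splitlines(), lineterm=""))
            changes.append((i, delta))
    return changes

old = json.dumps({"cells": [
    {"cell_type": "code", "source": ["import pandas as pd\n"]},
    {"cell_type": "code", "source": ["df = pd.read_csv('data.csv')\n"]}]})
new = json.dumps({"cells": [
    {"cell_type": "code", "source": ["import pandas as pd\n"]},
    {"cell_type": "code", "source": ["df = pd.read_csv('data.csv', sep=';')\n"]}]})

print([i for i, _ in cell_diff(old, new)])  # [1]: only the second cell changed
```

A real tool would also match cells that moved or were split, which is where pairing by position stops being enough.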
• Analyzing Bots Used in Open-source Software Projects
Supervisor: Yuan Tian
Description: “The use of intelligent bots is increasing in the modern software development process. For instance, bots have been proposed and adopted to help developers automatically categorize issue reports and find suitable contributors who have enough expertise to resolve the issues. There are also bots that help with the onboarding of new contributors and administration within the project development team. In this project, we would like to investigate the usage of bots in the issues and pull requests of open-source projects hosted on GitHub, the world’s largest code hosting platform. Our main research goal is to identify good/bad usage patterns of bots, as well as the limitations of bots. This research will shed light on the future improvement of automated software development.”
• Intelligent Course Planner
Supervisor: Yuan Tian
Description: “With the rise of Massive Open Online Courses (MOOCs) such as Coursera or EdX, and many other tutoring systems, distance learning creates opportunities for everyone with a broad range of knowledge needs, from recent high school graduates preparing for college to working professionals who want to upgrade their technical skill sets. With teaching materials increasingly available online, next-generation learning will inevitably take place online. In times of crisis, furthermore, the focus has urgently turned to online learning as more than just a substitute to ensure that education continues. However, due to wildly differing knowledge needs, existing online/offline learning platforms often fail to provide intelligent support for learners to better plan the courses they take to fulfill their online/offline degrees. In this project, we aim to develop a sophisticated course planner that contains a personalized learning plan (pathway) generation component. The plan generator will allow learners to record the courses they have already taken, and will recommend the remaining courses needed to achieve a degree pre-specified by the learner. Our system will also feature an intelligent solution for mining hard and soft constraints from course descriptions and external knowledge resources.”
• Label-free Action Inference
Supervisor: Christian Muise
Description: “Understanding how an environment works by observation alone is a long-standing grand challenge for Artificial Intelligence. Action model induction refers to the task of automatically or semi-automatically synthesizing action descriptions from observation data only, thus providing a description of how an environment works. Typically, partial action schemas or action labels are provided as guidance, but this project aims to remove that assumption. From discrete sequences of actions, the goal is to infer (1) which actions exist and are being executed; and (2) what are the potential preconditions and effects of those actions. Both perfect information and noisy variants of this problem will be considered, and the student is free to explore different approaches to solving the problem (an initial short-list of ideas will be provided). More info at http://mulab.ai/499.pdf (see project #1)”
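For context, the sketch below implements the standard labelled variant of action model induction that this project aims to go beyond: when action labels are given, preconditions can be approximated by intersecting observed pre-states, and effects by diffing pre/post states. The blocks-world fluents are illustrative:

```python
def induce_model(traces):
    """Intersect observed pre-states and diff post-states, per action label."""
    model = {}
    for pre, action, post in traces:
        pre, post = frozenset(pre), frozenset(post)
        if action not in model:
            model[action] = {"pre": set(pre), "add": set(), "del": set()}
        m = model[action]
        m["pre"] &= pre                 # preconditions: fluents always true before
        m["add"] |= post - pre          # effects: fluents the action adds...
        m["del"] |= pre - post          # ...and fluents it deletes
    return model

# Hypothetical observations of a blocks-world "pickup" action
traces = [
    ({"handempty", "ontable_a", "clear_a"}, "pickup_a",
     {"holding_a", "clear_a"}),
    ({"handempty", "ontable_a", "clear_a", "ontable_b"}, "pickup_a",
     {"holding_a", "clear_a", "ontable_b"}),
]
m = induce_model(traces)["pickup_a"]
print(sorted(m["pre"]))  # ['clear_a', 'handempty', 'ontable_a']
```

The label-free setting removes the `action` field from the traces, so the first task becomes clustering transitions that plausibly belong to the same action before any of this intersection/diffing can start.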
• Interactive Forward Search Planner
Supervisor: Christian Muise
Description: “(1-2 students) Modern AI planners are large and complex systems with decades of advanced engineering practices built in. This project aims to highlight the major components of modern planning systems in a visual and interactive way; integrated into the leading editor for planning problems at Planning.Domains. This project aims to accomplish two key things: 1. Fill the gap in the field of automated planning for educational experiences that explain the core concepts of modern planners. 2. Construct a flexible framework for researchers in the field to visualize and test their ideas for new heuristics, search procedures, etc. Tackling the task of visually explaining difficult concepts in Artificial Intelligence is at the heart of this project. The final product will have a life far beyond the duration of the course, and provide a valuable resource to an entire research field. More info at http://mulab.ai/499.pdf (see project #2)”
• RL Agent & Mathematical Model of the Tak Board Game
Supervisor: Christian Muise
Description: “(2-3 students) Tak is a strategic board game for 2 players that involves placing and moving stacks of tokens around a grid of cells. It is inspired by a game described in the fictional series known as The King Killer’s Chronicles and was co-designed by the book’s author. The game is designed to be simple to play, but rich in the space of game-play and strategy. The added element of (physical) depth in the game gives it a unique flavour compared to other traditional board games (such as Go, Checkers, and Chess). Exploring a variety of mathematical models for the representation of the game state will play a central role in this project. The languages considered will include both custom representations that modern reinforcement learning frameworks are capable of consuming (i.e., convertible to tensor-based representations), as well as planning models to capture the game mechanics. The techniques explored for solving the game are intentionally left open to allow for custom exploration of ideas as the project progresses. That said, as a basis for comparison the student will be expected to implement and test common RL approaches such as DQN and the ideas found in the work of AlphaZero. More info at http://mulab.ai/499.pdf (see project #3)”
• Soccer Footage Perspective Shift
Supervisor: Christian Muise
Description: “(2-3 students) To analyze the strategy of players on a field, we need accurate measures of where players and ball are located. The footage, however, is typically not in a form amenable for simple analysis. This project focuses on the key component of a sports analytics pipeline that converts raw first-person perspective to top-down 2D views of the situation on the field. There are two main components to this project: (1) annotating the players and key markers in the scene; and (2) applying a mathematical transform to these annotations so that the configuration of players (and ball) is represented in a 2D format. The first component will involve the application (and potentially re-implementation) of existing techniques for identifying human poses in a scene (multiple times per second), combined with an interface for manual (but interactive) annotation of key points. These techniques do not need to be invented from scratch, but adopting them will likely involve some independent research. The second component will involve defining, implementing, and displaying the result of a mathematical transform of the players in a scene from 3D first-person perspective to 2D top-down perspective. The mathematics required for this component are very similar to that of game engine design, and so expertise/interest in this area will be helpful. Time and resource permitting, we would further like to investigate the detection of key events in the video – passes, goals, offsides, etc. This aspect is considered a stretch goal for the project. More info at http://mulab.ai/499.pdf (see project #4)”
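For points on the (planar) field, the 3D-to-2D step in the second component is a homography; the sketch below estimates one from four hand-annotated landmark correspondences using the standard direct linear transform. The pixel and pitch coordinates are made up for illustration:

```python
import numpy as np

def homography(src, dst):
    """Direct Linear Transform: 3x3 H mapping src points onto dst points."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # Null vector of the constraint matrix = flattened H (up to scale)
    _, _, vt = np.linalg.svd(np.array(rows, float))
    return vt[-1].reshape(3, 3)

def to_pitch(H, x, y):
    """Map a frame pixel to pitch coordinates (homogeneous divide)."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w

# Hypothetical: four penalty-box corners as annotated in the frame (pixels)...
frame_pts = [(420, 610), (880, 600), (990, 820), (310, 840)]
# ...and their known positions on a 105 m x 68 m pitch (metres)
pitch_pts = [(88.5, 20.2), (105.0, 20.2), (105.0, 47.8), (88.5, 47.8)]

H = homography(frame_pts, pitch_pts)
print(to_pitch(H, 420, 610))   # ≈ (88.5, 20.2)
```

With more than four correspondences the same least-squares machinery averages out annotation noise, which is usually what the interactive annotation interface would feed it.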
• Analysis of COVID-19 data
Supervisor: Amber Simpson
Description: “The Ontario Health Data Platform (OHDP) is part of the Ontario government’s effort to fight COVID-19, which includes a plan to bring artificial intelligence innovators together with health researchers to study a vast new pool of public health information. The platform is hosted in the Centre for Advanced Computing at Queen’s and will be launched in July 2020. Dr. Simpson is looking for students interested in working with the platform to ask a variety of research questions in collaboration with physicians.”
• The Medical Segmentation Decathlon Grand Challenge
Supervisor: Amber Simpson
Description: “Our lab is endeavoring to use the “wisdom of the crowd,” or crowdsourcing, to solve fundamental problems in biomedical research. Comprehensive benchmarking through the PASCAL Visual Object Classes challenge revolutionized the computer vision field by effectively solving the object recognition problem, necessary for self-driving cars among other tasks. The resulting ImageNet, a large annotated set of natural images, is considered responsible for the current artificial intelligence boom. Inspired by this challenge, we hosted a challenge aimed at solving the semantic segmentation problem, a foundational problem in medical imaging and the central first step in imaging biomarker development. Semantic segmentation is the process of automatically associating every pixel or voxel of the image with a label, without the need for human input. Recognizing our unique access to large amounts of high-quality annotated imaging data, this multi-institutional effort sought to create a large, open-source, manually annotated medical image dataset of various anatomical sites for use by the machine learning community. Notably, these data were used in the Medical Segmentation Decathlon challenge (http://medicaldecathlon.com/) held during the 2018 MICCAI conference, which demonstrated, for the first time, the generalizability of segmentation algorithms to unseen tasks. Importantly, the challenge enables objective assessment of general-purpose segmentation methods through comprehensive benchmarking and democratizes access to medical imaging data. We wish to expand this challenge to additional anatomies. The goal of the project will be to design the next stage of the challenge, upload the challenge data to the challenge platform, and test the challenge platform.”
• Jupyter notebook extensions for teaching and learning
Supervisor: Burton Ma
Description: “”The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.” Notebooks are often used to share data and code that analyzes the data and they have great potential for teaching and learning, but they are missing easy-to-use functionality that would be useful for teaching applications. The team of students will assess and analyze the missing functionality from both a student and instructor point of view, design and implement extensions to supply the missing functionality, and demonstrate the extensions by creating a modest amount of course material for a first- or second-year computing science course. Requirements: Strong knowledge of JavaScript, CSS, and HTML. Some experience with Jupyter Notebook and the beakerx kernels and extensions might be useful.”
• Comparison of Statistical Methods in Psychology and Machine Learning
Supervisor: Catherine Stinson
Description: “Psychology has recently come under fire for the misuse of statistical tests in research (for example p-hacking), leading to a series of high profile retractions. The practice of using the same data for multiple tests without adjusting for reduced test power also seems to be widespread in Machine Learning. There has been criticism recently of how ImageNet has had an outsized influence on the development of ML systems for image recognition, for instance. Students will familiarize themselves with the literature on p-hacking in Psychology, then survey recent papers from major ML conferences and journals, analyze their statistical methods, and co-write a research article. Does the use and re-use of ImageNet in image recognition constitute p-hacking? Are other misuses of statistical tests widespread in ML? Are there lessons from the reforms recently enacted in Psychology that should be extended to ML?”
• Compact and interpretable predictive models for linear genetic programming
Supervisor: Ting Hu and Jana Dunfield
Description: “Despite being highly accurate in prediction, many trained machine learning (ML) models are often too complex to interpret. This results in a major hurdle for ML applications in areas where the consequence of a biased machine decision can be fatal, such as medicine. In order to understand how a machine decision has been made, we search for more interpretable ML representations. This research project investigates an ML and evolutionary computing technique, linear genetic programming, which represents a predictive model as a computer program, i.e., a sequence of imperative instructions. This project develops algorithms that take a linear genetic program as input and convert it to a shorter one with the same behaviour (for example, by removing instructions that do not affect the result), in order to make the model represented by the program more interpretable.”
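One behaviour-preserving shortening pass mentioned above, removing instructions that do not affect the result, can be implemented as a backward liveness sweep over the instruction list. The register-based instruction format below is illustrative:

```python
def prune(program, output_reg):
    """Backward sweep: keep only instructions whose destination feeds the output."""
    live = {output_reg}
    kept = []
    for dest, op, src1, src2 in reversed(program):
        if dest in live:
            kept.append((dest, op, src1, src2))
            live.discard(dest)          # this write satisfies the demand...
            live.update({src1, src2})   # ...and creates demand for its sources
    return list(reversed(kept))

# Hypothetical linear program: (destination, operator, source1, source2)
prog = [
    ("r1", "+", "r0", "c1"),
    ("r2", "*", "r0", "r0"),   # dead: r2 is overwritten before any use
    ("r2", "-", "r1", "c2"),
    ("r3", "*", "r2", "r2"),   # dead: r3 never reaches the output
    ("r0", "+", "r2", "r1"),
]
print(len(prune(prog, "r0")))  # 3: the two dead instructions are removed
```

This is the classic "effective instruction" analysis for linear genetic programs; the project would go further, e.g. simplifying algebraic identities, not just deleting dead code.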
• Predicting surge in Emergency Rooms
Supervisor: David Skillicorn
Description: “This project continues two projects from last year. In Ontario, those who show up at emergency rooms are categorised into 80 syndromes by the triage nurse they see first. Emergency rooms would like to predict their future loads, i.e. how busy they will be tomorrow. We are interested in how well tomorrow’s patients with a range of syndromes can be predicted from those of today and earlier days. The KFL&A Health Unit has shared five years of data about a variety of related syndromes with us: ILI (influenza-like illness), asthma, COPD, bronchitis, pneumonia, and respiratory distress. Projects would involve developing predictive models for one of these illnesses, and models that try to predict combinations of them as well. There could be up to five projects. Requires decent grades in at least one of the data analytics courses.”
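As a baseline for this kind of next-day prediction, a simple autoregressive model can be fit by least squares; the daily counts below are invented for illustration:

```python
import numpy as np

# Hypothetical daily ILI visit counts for one emergency room
counts = np.array([12, 15, 14, 18, 21, 19, 24, 26, 25, 30, 33, 31, 36, 39])

def fit_ar(series, lags=3):
    """Least-squares AR(lags): predict tomorrow from the last `lags` days."""
    X = np.column_stack([series[i:len(series) - lags + i] for i in range(lags)])
    X = np.column_stack([np.ones(len(X)), X])   # intercept column
    y = series[lags:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_next(series, coef, lags=3):
    return coef[0] + coef[1:] @ series[-lags:]

coef = fit_ar(counts)
print(predict_next(counts, coef))  # tomorrow's predicted visit count
```

Real models for this data would add weekday/seasonal features and cross-syndrome terms, but an AR baseline like this is a useful sanity check for anything fancier.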
• Geolocation of posters in White Supremacist forums
Supervisor: David Skillicorn
Description: “In two White Supremacist forums we have collected, some posters mention where they live (at different granularities: country, city, county, town). The project has two phases: first, extract these geographical markers from all of their contexts; second, look at the distribution to see if there are any geographical patterns.”
• Detecting malware tweets in real time
Supervisor: David Skillicorn
Description: “Burnap’s group at Cardiff looked at detecting tweets that contained drive-by (shortened) URLs in close to real time. More sophisticated prediction techniques have been developed since their work in 2015. This project will revisit the problem to see whether both prediction and attribute selection can be improved. Requires one of the data analytics courses.”
• JARV1S: A Cyber Threat Intake and Processing Pipeline for DND Canada
Supervisor: Steven Ding
Description: “With the introduction of ultra-fast 5G networks, companies and organizational infrastructures are facing an ever-increasing number of endpoint devices such as PCs, mobile devices, Internet-of-Things (IoT) sensors, and actuators. These endpoint devices significantly enlarge the attack surface, and attackers have been vastly shifting their targets from servers and the perimeter to endpoint devices. Malware infection is one of the major rapidly evolving threats against endpoint security. 929 million malware samples had been registered as of August 2019, already 79 million more than the total for 2018. Behind this massive amount of data, organizations and security companies rely on critical backbone binary intake and processing pipelines to gain insight and reuse the learned knowledge to identify and understand future emerging threats. Traditional binary intake pipelines rely heavily on legacy signature-based static analysis that cannot detect unseen attacks. This system instead relies on an Information Retrieval (IR) approach to decompose any binary executable into existing known information from the repository. We are building up the complete system and the CI processing pipeline, leveraging some modules from Kam1n0.”
• APP: Democratized Elderly Care Enhancement NeTwork (DECENT)
Supervisor: Steven Ding
Description: “The support required by aging Canadians ranges from help with transportation to help with household chores, both minor and major, to help with personal care. Support is required both to help prevent injury in those who may have mild or no impairments in function, and to do the work for those who are physically unable. Technology has untapped potential to enable the provision of such services. In particular, the enablement of a shared economy directed at the special needs and requirements of seniors is needed; it will improve their quality of life and provide economic and social benefit. We propose the development of a time-sharing/volunteering/affordable-payment based application that will support the delivery of all forms of help for seniors. This application would take the form of a “social network” model to be called the Democratized Elderly Care Enhancement NeTwork (DECENT). In this UBER-like application, a caregiver can provide certain supporting services that seniors need, for remuneration or on a volunteer basis, and users will be matched based on the services provided. The main challenges to the development of DECENT are the privacy and security considerations of both customers (elderly service requesters) and service providers, to ensure that neither party is exploited or endangered.”
• Real-time Rumor Detection on Social Network Stream
Supervisor: Steven Ding
Description: “Users of social media websites tend to rapidly spread breaking news and trending stories without considering their truthfulness. This facilitates the spread of rumors through social networks. A rumor is defined as “a story or a statement whose truth value is unverified”, a definition that coincides with those in major dictionaries. According to this definition, rumors do not have to be false; they can later be deemed true or false. The main characteristic of a rumor is that its truth value is unverified at the time of posting. Efficiently detecting and acting upon rumors throughout social networks is of high importance to minimizing their harmful effect. However, detecting them is not a trivial task: rumors often belong to unseen topics or events that are not covered in the training dataset. In this project, we study the problem of detecting breaking-news rumors, rather than long-lasting rumors, that spread in social media. The project explores domain-independent Natural Language Processing (NLP) solutions to automatically identify rumors.”
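A minimal, domain-independent starting point is bag-of-words similarity against labelled examples; the tweets, labels, and helper functions below are fabricated for illustration:

```python
import math
from collections import Counter

def tfidf(docs):
    """Term frequency x inverse document frequency vectors, one per document."""
    df = Counter(w for d in docs for w in set(d.split()))
    n = len(docs)
    return [{w: c * math.log((1 + n) / (1 + df[w]))
             for w, c in Counter(d.split()).items()} for d in docs]

def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Fabricated labelled tweets: 1 = rumour-like, 0 = confirmed report
tweets = ["unconfirmed reports of explosion downtown share before deleted",
          "police confirm road closure official statement released",
          "is it true the bridge collapsed nobody knows share this",
          "mayor press conference confirms new transit schedule"]
labels = [1, 0, 1, 0]
vecs = tfidf(tweets)

# Classify a new tweet by its nearest labelled neighbour
query = Counter("is this true share before it gets deleted".split())
nearest = max(range(len(tweets)), key=lambda i: cosine(vecs[i], query))
print(labels[nearest])  # 1: most similar to a rumour-like example
```

The interesting part of the project starts where this sketch breaks down: breaking-news rumours use vocabulary absent from training data, which is why domain-independent features (question marks, hedging words, propagation patterns) matter more than raw word overlap.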
• Deciphering cancer etiology using computational approaches
Supervisor: Anna Panchenko
Description: “Students will perform data analysis and design of computational methods that involve machine learning and molecular modeling – to offer insights into the basis of cancer mutagenesis, DNA repair and their contribution to cancer etiology.”
• Mining Code for Logical Puzzles
Supervisor: Yuan Tian and Juergen Dingel and Christian Muise
Description: “(2-3 students) One of the most important skills a CS student acquires from their undergraduate education is to understand logical conditions that govern the control flow of software. This project aims to provide a means for automatically crawling existing software projects; extracting the control flow of the code; and filtering for the most “complex” or “interesting” examples from the logical view. More info at http://mulab.ai/499.pdf (see project #5)”
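Extracting the logical conditions that govern control flow can be prototyped with Python's own `ast` module (Python 3.9+ for `ast.unparse`); the sample source and visitor below are illustrative:

```python
import ast

source = '''
def classify(x, y):
    if x > 0 and (y < 10 or x == y):
        return "a"
    while x > y and y != 0:
        x -= 1
    return "b"
'''

class ConditionMiner(ast.NodeVisitor):
    """Collect the test expression of every if and while statement."""
    def __init__(self):
        self.conditions = []
    def visit_If(self, node):
        self.conditions.append(ast.unparse(node.test))
        self.generic_visit(node)        # recurse into nested branches
    def visit_While(self, node):
        self.conditions.append(ast.unparse(node.test))
        self.generic_visit(node)

miner = ConditionMiner()
miner.visit(ast.parse(source))
print(miner.conditions)
```

Ranking the harvested conditions by, say, the number of boolean operators would be one simple way to filter for the “complex” or “interesting” examples the project description mentions.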
• Bone segmentation in ultrasound images using neural networks
Supervisor: Tamas Ungi
Description: “Ultrasound is an emerging medical imaging modality. It is safe and accessible, but anatomical structures are harder to see in ultrasound than in other imaging modalities. This project is aimed at enhancing certain anatomical features, e.g. bones, in ultrasound images. The project involves programming in Python, using TensorFlow 2, in the 3D Slicer application framework. Those who plan to pick this project need at least a basic level of prior experience working with these software tools.”
• Automated wifi router location adjustment
Supervisor: Hossam Hassanein & Hesham Moussa
Description: “A very common challenge often faced in public places or office environments is the placement of wifi access routers. As routers might be operating on the same frequency, having multiple routers in proximity to each other could lead to high interference, which in turn leads to a degradation in network performance. Furthermore, the location of the router impacts coverage. If a router is placed in an area with no one to serve, it is considered a waste of resources. Additionally, as people in public places, such as shopping malls, are constantly moving around, fixing the location of the routers might not be optimal. Hence, one way to solve all of these issues is to allow some sort of mobility for some of the routers, such that they are able to automatically adjust their locations depending on load as well as interference conditions. For instance, if a router senses that there are few people to serve in its current location, it might consider moving around to serve more people. Similarly, if a router is experiencing high interference, it can adjust its position so as to achieve better performance. In this project, the students are expected to consider a scenario with multiple fixed router locations and one mobile router. The mobile router can be considered a reinforcement learning agent that senses the environment and adjusts its position accordingly.”
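A toy version of the described agent can be built with tabular Q-learning: the mobile router occupies one cell of a corridor, and the reward is the (hypothetical) number of users at its new position:

```python
import numpy as np

rng = np.random.default_rng(0)
users = np.array([0, 1, 0, 3, 5])       # hypothetical load per corridor cell
n_pos, actions = len(users), (-1, 0, 1) # move left / stay / move right
Q = np.zeros((n_pos, len(actions)))

alpha, gamma, eps = 0.5, 0.9, 0.2
for episode in range(2000):
    pos = int(rng.integers(n_pos))
    for _ in range(10):
        # epsilon-greedy action selection
        a = int(rng.integers(3)) if rng.random() < eps else int(np.argmax(Q[pos]))
        nxt = int(np.clip(pos + actions[a], 0, n_pos - 1))
        r = users[nxt]                  # reward: users served at the new spot
        Q[pos, a] += alpha * (r + gamma * Q[nxt].max() - Q[pos, a])
        pos = nxt

print(int(np.argmax(Q[0])))  # 2: from the empty end, move right toward users
```

The real problem replaces this one-dimensional grid with a floor plan, adds interference to the reward, and makes the user distribution non-stationary, which is exactly what makes the RL formulation interesting.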
• Intent-based coding
Supervisor: Hossam Hassanein & Hesham Moussa
Description: “In the era of technology we live in, knowing how to write code is very important. However, there are multiple coding languages out there, and learning each one of them is time-consuming and difficult. An easier way to code would be to translate a unified language into any intended programming language. Perhaps in the future we will even be able to talk to machines and let them translate human language into code for us. This is what is known as intent-based coding. It is a platform that takes in regular English words that describe the intended objective of the program and translates the intent into coding syntax. As a start towards this objective, in this project students are expected to envision how intent-based coding can be implemented. First, it is essential to build a block diagram of the different functions that will enable this objective. Some suggested blocks are a language translation module, a code syntax generator, and a compilation tester. Second, the students should choose one of these blocks and attempt to build it using some form of ML (Hints: For the language translation module, you would need to learn natural language processing (NLP). For the code syntax generator, you can refer to this article on automatic programming.). Last, choose a simple example, for instance building a calculator, and attempt to run your model to generate the corresponding code in at least two different programming languages. It is expected that your code should take in a simple command such as “build me an adder in java” and convert it into functioning code in the specified programming language.”
• Automatic Highlights-video creation
Supervisor: Hossam Hassanein & Hesham Moussa
Description: “Many sports enthusiasts enjoy watching live games on TV in the comfort of their homes. However, as there are many games that are either aired in a different time zone or aired at the same time on different channels, it is challenging to follow everything. Hence, people resort to watching post-game highlights videos. Highlights videos are essentially 10-minute summary videos that contain the most important moments of a game. Normally, these highlights videos are created by other enthusiasts who watched the game. They often choose the scenes that capture the most popular players in action, those that capture point-scoring moments, or the ones with controversies, etc. At the end of the day, these highlights videos should be enough to give a very good idea of what happened in the game. In this project, the objective is to train a machine learning agent to automatically generate such high-quality highlights videos for sports games. A possible way and a good place to start would be to pick a specific sport, say soccer, and collect some sample highlights videos for various games (plenty of such videos can be found on Youtube). These videos can be analyzed and used to learn the basic features that make one scene more important than another. Once these features are identified, the students should be able to use this knowledge and the collected videos to train a highlights-video generating network. Hint: Divide the video into segments and train the network such that it is able to determine the average length of a highlights moment, the beginning and end of a highlights moment, and the accompanying audio. Students can choose easier sports with fewer variables as a proof of concept.”
• Deep Learning Based Image Compression
Supervisor: Hossam Hassanein & Hazem Abbas
Description: “Data compression reduces data volume so that we can store the same information using less memory. Examples of data we may want to compress include audio files, videos, images, and more. This project deals with image compression. Compression comes in two major types: lossless compression and lossy compression. In lossless compression, the compressed information can be completely restored to the original. In lossy compression, the reconstructed information is very close to, but not the same as, the original. There are several common image compression formats, one of which is JPEG, from the Joint Photographic Experts Group. In recent years, neural networks have become the most popular models in artificial intelligence. They excel at decision making: a neural network can easily identify and classify items in an image, and hence comes the desire to use these models to overcome the problems that emerge in past compression methods. In this project, we will focus on an existing method that attempts to achieve good compression results using deep generative models, namely autoencoder and GAN methods.”
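The lossless/lossy distinction can be demonstrated in a few lines, using `zlib` for lossless compression and simple uniform quantization as a stand-in for the learned lossy encoder this project targets:

```python
import zlib
import numpy as np

data = np.arange(0, 256, 3, dtype=np.uint8).tobytes()  # toy "image" bytes

# Lossless: the reconstruction is bit-for-bit identical
packed = zlib.compress(data)
print(zlib.decompress(packed) == data)  # True

# Lossy: quantize to 4 bits/sample -- smaller code, approximate reconstruction
arr = np.frombuffer(data, np.uint8)
quantized = (arr >> 4).astype(np.uint8)  # keep only the top 4 bits
restored = (quantized << 4) | 8          # mid-point reconstruction
print(int(np.max(np.abs(arr.astype(int) - restored.astype(int)))) <= 8)  # True
```

An autoencoder plays the same role as the quantizer here, except the mapping from image to compact code (and back) is learned from data rather than fixed.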
• Video-Based Human Emotion Recognition System Using Deep Learning Techniques
Supervisor: Hossam Hassanein & Hazem Abbas
Description: “Automatic facial emotion recognition (FER) is in increasingly high demand these days. More industries are trying to incorporate emotion-aware technologies into their products. Some of those industries, the automotive industry being one example, impose very tight limits on both the run-time and the memory footprint of the models used, so that they fit into small embedded devices. While there is a lot of machine learning and computer vision work on FER, most of it focuses on obtaining the best possible system accuracy without being bound by memory constraints. This project, on the other hand, will explore deep learning models for emotion recognition in videos for systems with limited memory, such as robots, cars, and embedded systems. Naturally, this comes at the expense of sacrificing some accuracy. Students are required to propose deep models that have a small number of parameters and yet provide acceptable accuracy.”
• Discover Nearby Resources
Supervisor: Hossam Hassanein
Description: “This project aims to discover resources/services offered by other mobile or non-mobile providers in close proximity. Such resources/services may be public sensing data, traffic information, navigation directions, etc. Students are required to develop a system that keeps track of services offered by providers and enables users to discover services or resources of interest, based on location. The implementation will have mobile-side and server-side components. The mobile-side component provides the user interface, whether to advertise or discover a service. In service advertisement, the provider can create a new service or choose from existing ones. Service discovery matches keywords from the user request against existing offerings in the service database (service directory or arbitrator). The discovery mechanism can simply rank matched services by matching score (i.e., how many keywords from the service request match keywords of the service description). A more desirable approach is to employ recommender systems such as collaborative filtering or context-aware methods. Collaborative filtering uses the known tastes of a group of users to produce recommendations for other users. The context-aware method provides recommendations to users based on their environment and the details of their current situation. Students may also explore other methods or combinations of several. Finally, the server-side component is a simple centralized database that manages service offerings and serves discovery requests.”
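The keyword-overlap ranking step described above can be sketched as follows; the service names and descriptions are hypothetical, and the score is simply the keyword-match count the description mentions:

```python
def rank_services(request_keywords, services):
    """Rank service descriptions by keyword overlap with the request.

    services: dict mapping service name -> description string.
    Returns (name, score) pairs sorted by descending match score.
    """
    req = set(w.lower() for w in request_keywords)
    scored = []
    for name, desc in services.items():
        words = set(desc.lower().split())
        scored.append((name, len(req & words)))   # overlap count = score
    return sorted(scored, key=lambda p: p[1], reverse=True)

# Hypothetical service directory entries.
services = {
    "traffic-feed": "real time traffic information for downtown roads",
    "air-quality": "public sensing data for air quality",
    "nav-helper": "navigation directions and traffic updates",
}
ranking = rank_services(["traffic", "information"], services)
```

A recommender-system approach would replace this score with collaborative-filtering or context-aware similarity, but the directory lookup shape stays the same.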
• Non-invasive cancer detection and grading using high-resolution ultrasound
Supervisor: Parvin Mousavi
Description: “The standard routine for cancer grading is pathological assessment of surgical specimens or biopsies, which is both invasive and subjective. High-frequency micro-ultrasound (micro-US) is a novel ultrasound-based imaging modality that allows real-time characterization of tissues at high resolution. The goal of this project is to develop a deep learning method to predict the stage of cancer from micro-US data. The workflow includes, but is not limited to, visualization, ROI selection, preprocessing, and model development.”
• Intrusion Detection for Connected & Autonomous Vehicles (Group of 2-3 students)
Supervisor: Mohammad Zulkernine and Marwa Elsayed
Description: “Vehicular ad hoc networks (VANETs) have recently gained increasing momentum, leveraging the Internet of Things (IoT) to shape the future of intelligent transportation systems (ITS). Such networks enable connected and autonomous vehicles (CAVs) to exchange real-time information through vehicle-to-everything (V2X) communications, including vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I). Advances in wireless communication technologies are the key enabler behind realizing and improving the smart, adaptive, data-driven decision-making capabilities of CAVs. Despite the benefits, these communications are vulnerable to various types of attacks such as Denial of Service (DoS), message falsification, Sybil, greyhole, blackhole, and wormhole attacks, among others. These attacks can lead to major safety concerns, including disrupting road traffic or even triggering a collision. Over the past years, researchers have developed different intrusion detection systems to identify potential intrusive activities against CAVs. These mainly rely either on a) traditional techniques, such as rule-based or time- and frequency-based analysis, that require prior knowledge of the underlying data distribution, a model of the system's normal behavior, and other empirical assumptions; or b) machine-learning-based techniques that are not explainable. Such systems are further challenged by the sophistication of attacks against CAVs, which can evolve into complex colluding activities. This project will specifically investigate hardening the security of CAVs against intrusions. The proposed solution should meet the following objectives: • Investigate different types of intrusions targeting inter- and intra-vehicle communications of CAVs. • Study and review existing state-of-the-art solutions, including flow-based, payload-based, and hybrid intrusion detection systems, detailing their strengths and limitations as well as identifying current research challenges. 
• Develop countermeasure solutions that overcome the identified challenges. Such solutions will aim to monitor and secure the internal and external networks of CAVs by leveraging advanced machine learning techniques to reveal potential intrusions. The following references [1-5] represent an important starting point for investigating this area of research and addressing the aforesaid objectives. References: [1] O.Y. Al-Jarrah, C. Maple, M. Dianati, D. Oxtoby, and A. Mouzakitis, “Intrusion detection systems for intra-vehicle networks: A review,” IEEE Access, vol. 7, pp. 21266-21289, 2019. [2] R. Rieke, M. Seidemann, E. K. Talla, D. Zelle, and B. Seeger, “Behavior analysis for safety and security in automotive systems,” Proc. of 25th Euromicro Int. Conf. Parallel, Distrib. Netw.-based Process, pp. 381-385, 2017. [3] M. Marchetti and D. Stabili, “Anomaly detection of CAN bus messages through analysis of ID sequences,” Proc. of IEEE Intell. Vehicles Symp., pp. 1577-1583, 2017. [4] C. Wang, Z. Zhao, L. Gong, L. Zhu, Z. Liu, and X. Cheng, “A distributed anomaly detection system for in-vehicle network using HTM,” IEEE Access, vol. 6, pp. 9091-9098, 2018. [5] G. Loukas, T. Vuong, R. Heartfield, G. Sakellari, Y. Yoon, and D. Gan, “Cloud-based cyber-physical intrusion detection for vehicles using deep learning,” IEEE Access, vol. 6, pp. 3491-3508, 2018. ”
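As a minimal, hedged illustration of the flow-based direction listed in the objectives, the sketch below flags message-rate spikes (e.g., a DoS flood on a V2V channel) against a trailing-window baseline; the traffic numbers are invented, and a real detector would use far richer features:

```python
import statistics

def detect_flooding(rates, window=20, threshold=3.0):
    """Flag time steps whose message rate deviates more than `threshold`
    standard deviations from the trailing-window baseline
    (a toy flow-based intrusion detector)."""
    alerts = []
    for t in range(window, len(rates)):
        base = rates[t - window:t]
        mu = statistics.fmean(base)
        sigma = statistics.pstdev(base) or 1e-9   # avoid divide-by-zero
        if (rates[t] - mu) / sigma > threshold:
            alerts.append(t)
    return alerts

# Normal V2V beaconing around 10 msg/s, then a DoS burst starting at t=30.
rates = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10] * 3 + [300, 310, 305]
alerts = detect_flooding(rates)
```

Rule-based and learning-based detectors both ultimately reduce to a decision function like this over per-flow statistics; the project's challenge is making that function robust and explainable.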
• An implementation and interactive interface for evolutionary algorithms
Supervisor: Ting Hu
Description: “Evolutionary algorithms are an approach to AI where natural evolution mechanisms, such as mutation, recombination, and selection, are implemented to search for solutions to a problem. This project will design a web application that implements and provides an interactive interface for evolutionary algorithms. It will allow users to specify how to represent a candidate solution, how to perform mutation, crossover, and selection, and how to set algorithm parameters. It will also provide visualization of results and allow real-time adjustments of the algorithm configuration. “
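A minimal sketch of the kind of evolutionary algorithm such an interface would drive, here solving the classic OneMax problem (maximize the number of 1 bits in a bit string). All parameter choices are illustrative defaults that a user could adjust through the proposed interface:

```python
import random

def evolve(fitness, length=20, pop_size=30, generations=60,
           mutation_rate=0.02, seed=1):
    """Minimal generational GA over bit strings: tournament selection,
    one-point crossover, per-bit mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        def tournament():
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, length)
            child = p1[:cut] + p2[cut:]                     # one-point crossover
            child = [bit ^ (rng.random() < mutation_rate)   # per-bit mutation
                     for bit in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = evolve(sum)   # OneMax: fitness = number of 1 bits
```

The proposed web application would expose exactly these knobs (representation, crossover, mutation, selection, parameters) and visualize fitness over generations.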
• Functions and predicates for the QL language to query time series data
Supervisor: Juergen Dingel
• Implementation of an evaluator for predicate logic
Supervisor: Juergen Dingel
• Constraint solving for the QL language to query time series data
Supervisor: Juergen Dingel
• Exploring JavaMOP for runtime monitoring
Supervisor: Juergen Dingel
• Formalization and analysis of advanced data structures in Alloy
Supervisor: Juergen Dingel
• Exploring probabilistic model checking
Supervisor: Juergen Dingel
• Formalizing Git
Supervisor: Juergen Dingel
• Preoperative margin assessment for cancer surgeries using mass spectrometry
Supervisor: Parvin Mousavi
Description: “The clinical management of cancer mostly involves surgical resection of the tumor, and the goal is to remove all affected cells to avoid recurrence of the cancer. To determine whether this goal has been achieved, the resected specimens are examined by a histopathologist post-operatively. The presence of cancer cells at the margins indicates incomplete tumor resection. Novel mass spectrometry-based technologies can provide enriched feedback to surgeons about the molecular signature of the resected specimen, helping them avoid positive margins. The goal of this project is to develop deep learning models capable of characterizing cancer signatures in recorded mass spectra.”
• Analyzing the Impact of COVID-19 on existing Lab Test Infrastructure
Supervisor: Parvin Mousavi
Description: “Tentative questions: 1. Have the frequency or latency of other types of common tests been affected? 2. In what regions and types of facilities (e.g., hospitals) has testing increased? Where has it decreased? Are these phenomena related? 3. How might we allocate existing infrastructure to better distribute and reduce the burden of testing across all labs? (scheduling/optimization)”
• Template-guided recombination: finite automaton construction
Supervisor: Kai Salomaa
Description: “Template-guided recombination (TGR) is a formal model for the gene descrambling process occurring in certain unicellular organisms called stichotrichous ciliates (M. Daley, I. McQuillan, 2005). The mechanism by which these genes are descrambled is of interest both as a biological process and as a model of natural computation. This project studies template-guided recombination as an operation on strings and languages, with the goal of better understanding its computational capabilities. A goal is to implement a TGR operation for finite automata and use this implementation in experiments to study the complexity of the operation. The TGR operation is of particular theoretical interest because the iterated version is known to preserve regularity, but the result is nonconstructive (i.e., there is no known algorithm to produce the automaton): M. Daley, I. McQuillan, Template guided DNA recombination, Theoret. Comput. Sci. 330 (2005) 237-250. In this project you will implement a simulator for the non-iterated TGR operation. The operation takes as input two nondeterministic finite automata (NFAs), along with some numerical parameters, and outputs an NFA for the resulting language. The software should use an input/output format similar to FAdo or Vaucanson, and is intended to be used in conjunction with software libraries such as • FAdo, http://fado.dcc.fc.up.pt/ or • Vaucanson, http://vaucanson-project.org/?eng. These libraries provide a collection of operations that allow us to determinize and minimize the resulting NFAs in order to study the state complexity of the operation. From a theoretical point of view, a question of particular interest is to find examples where iterating the operation significantly increases the size of the minimized DFA. This is a larger project, including both programming and theoretical work, and is suitable for a group of three students. The implementation of the TGR operation for regular languages (NFAs) requires a good theoretical background in formal language concepts, e.g. from CISC-223.”
• Transforming regular expressions to finite-state machines
Supervisor: Kai Salomaa
Description: “Given a regular expression of length n, what is the worst-case size of the minimal deterministic finite automaton (DFA) for its language? An exponential upper bound is known, but average regular expressions can be implemented more efficiently. The main goal of this project is to generate libraries of “random” regular expressions and determine their state complexity. The regular-expression-to-DFA transformation, as well as the minimization of DFAs, has been automated in various libraries such as • FAdo, http://fado.dcc.fc.up.pt/ or • Vaucanson, http://vaucanson-project.org/?eng. These libraries provide operations to convert regular expressions to finite-state machines (and vice versa) and to minimize finite-state machines. The second goal of the project is to find different types of “bad” examples: regular expressions whose equivalent minimized DFA is large. The project requires an understanding of the basics of finite automata and regular expressions. The amount of programming required is not large, but you should expect to run a significant number of simulations and other experiments. This project is suitable for 2-3 students. The different components include: • writing software for generating “random” regular expressions, • using FAdo to transform regular-expression libraries into minimized DFAs and finding individual “bad” examples (i.e., regular expressions whose equivalent minimized DFA is very large), and • theoretical work (surveying the literature) on the descriptional complexity of the regular-expression-to-DFA transformation.”
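As a small, self-contained illustration of the regular-expression-to-DFA question (independent of FAdo/Vaucanson, whose APIs are not assumed here), the sketch below builds the derivative automaton of a regular expression via Brzozowski derivatives and counts its states:

```python
# Regular expressions as nested tuples: ('sym', 'a'), ('cat', r, s),
# ('alt', r, s), ('star', r), ('eps',), ('empty',).

def nullable(r):
    op = r[0]
    if op == 'eps':
        return True
    if op in ('empty', 'sym'):
        return False
    if op == 'alt':
        return nullable(r[1]) or nullable(r[2])
    if op == 'cat':
        return nullable(r[1]) and nullable(r[2])
    return True  # star

def simp(r):
    """Light similarity rules so the set of derivatives stays small."""
    op = r[0]
    if op == 'alt':
        x, y = r[1], r[2]
        if x == ('empty',): return y
        if y == ('empty',): return x
        if x == y: return x
    if op == 'cat':
        x, y = r[1], r[2]
        if ('empty',) in (x, y): return ('empty',)
        if x == ('eps',): return y
        if y == ('eps',): return x
    return r

def deriv(r, a):
    """Brzozowski derivative of r with respect to symbol a."""
    op = r[0]
    if op in ('empty', 'eps'):
        return ('empty',)
    if op == 'sym':
        return ('eps',) if r[1] == a else ('empty',)
    if op == 'alt':
        return simp(('alt', deriv(r[1], a), deriv(r[2], a)))
    if op == 'cat':
        d = simp(('cat', deriv(r[1], a), r[2]))
        if nullable(r[1]):
            return simp(('alt', d, deriv(r[2], a)))
        return d
    return simp(('cat', deriv(r[1], a), r))  # star

def dfa_states(r, alphabet):
    """Explore all reachable derivatives; each distinct one is a DFA state."""
    seen, todo = {r}, [r]
    while todo:
        s = todo.pop()
        for a in alphabet:
            d = deriv(s, a)
            if d not in seen:
                seen.add(d)
                todo.append(d)
    return len(seen)

# (a|b)* a : strings over {a, b} ending in 'a'; its minimal DFA has 2 states.
r = ('cat', ('star', ('alt', ('sym', 'a'), ('sym', 'b'))), ('sym', 'a'))
n = dfa_states(r, 'ab')
```

Generating many random expressions and tabulating such state counts (with full minimization from a library) is essentially the project's first goal; the derivative construction here is a simplified stand-in.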
• Computational complexity of decision problems for regular languages (optionally including context-free languages)
Supervisor: Kai Salomaa
Description: “This is a theoretical topic requiring the ability to read about algorithm complexity and computational complexity, and good familiarity with finite state machines (and grammars). The topic is suitable for students with a strong record in CISC-223 and CISC-365. It is known that all natural problems, such as membership, emptiness, and equivalence, are decidable for finite state automata and regular expressions. But what is the complexity of these problems? The goal of this project is to investigate the known complexity results for the basic decision problems for deterministic and nondeterministic finite automata and for regular expressions. Often these questions are PSPACE-complete for NFAs and regular expressions, or log-space complete for DFAs. In particular, goals of the project include identifying • examples of natural problems for finite automata/regular expressions where the precise complexity is unknown, • examples of finite-automaton problems that are not known to be solvable, or that are known to be unsolvable, and • (optional part) decision problems for context-free grammars, many of which are unsolvable. The goal of the project is to present the findings in the report in a systematic way (the terminology in different articles in the literature is not always consistent). The project involves a fairly large amount of literature search, since the complexity results are not included in typical textbooks. I can provide some survey articles as a starting point. The project is for 2-3 students; if three students are working on the project, we would also include decision problems for context-free grammars (the optional part).”
• Genomic data analysis and visualization GUI
Supervisor: Kathrin Tyryshkin
Description: “Analysis of genomic data often involves pre-processing, quality control, normalization, feature selection, classification, and differential-expression analysis. Many methods exist; however, the best technique depends on the dataset, so it is often necessary to try different techniques to select the one that works best for a given dataset. This project involves further development and improvement of a user interface for a feature-selection algorithm and feature analysis. The objective is to implement new components for feature selection and visualization of data. The interface will be published online for other researchers to use.”
• Normalization of miRNA expression using calibrators
Supervisor: Kathrin Tyryshkin
Description: “Analysis of genomic data often involves pre-processing, quality control, normalization, feature selection, classification, and differential-expression analysis. Normalization is required to remove technical variability while preserving biological variability. This project will focus on developing a new algorithm for normalizing miRNA expression data using calibrator data. MicroRNAs (miRNAs) are small RNA molecules that are important in many cancers. They help regulate gene function in the cell and are informative biomarkers in several cancer types. This project is ideal for students interested in developing innovative and sophisticated algorithms and working with genomic data.”
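The calibrator idea can be sketched in a few lines: if spiked-in calibrators have the same true abundance in every sample, shifting each sample in log space so that its calibrator mean hits a common reference removes per-sample technical offsets. The data below are synthetic and the method is a deliberately simplified stand-in for the algorithm to be developed:

```python
import numpy as np

def normalize_by_calibrators(expr, calibrator_idx):
    """Shift each sample in log2 space so its calibrator miRNAs share a
    common mean level.

    expr: (n_mirnas, n_samples) matrix of raw counts.
    calibrator_idx: row indices of spiked-in calibrator miRNAs, assumed
    to have equal true abundance in every sample.
    """
    log_expr = np.log2(expr + 1.0)
    cal_means = log_expr[calibrator_idx].mean(axis=0)  # per-sample calibrator level
    target = cal_means.mean()                          # common reference level
    return log_expr - (cal_means - target)             # remove per-sample offset

rng = np.random.default_rng(0)
true = rng.uniform(4, 12, size=(50, 1))                # true log2 expression
tech = rng.uniform(-2, 2, size=(1, 6))                 # per-sample technical shift
expr = 2 ** (true + tech)                              # observed counts
norm = normalize_by_calibrators(expr, calibrator_idx=list(range(5)))
```

After normalization the across-sample spread of each miRNA reflects biology rather than the simulated technical shift, which is exactly the property the project's algorithm should preserve on real data.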
• Identifying microRNA biomarker for a recurring paraganglioma
Supervisor: Kathrin Tyryshkin
Description: “Paraganglioma is a rare tumor that arises from the adrenal gland. Unlike many other types of cancer, there is no test that can distinguish between benign and malignant tumors. MicroRNAs (miRNAs) are small RNA molecules that are important in many cancers. They help regulate gene function in the cell and are informative biomarkers in several cancer types. The Cancer Genome Atlas (TCGA) is a large cancer database that contains miRNA profiles of recurring and benign paraganglioma tumors. It also contains vital information about the patients’ demographics, treatment, and survival. In this project, the student will preprocess the TCGA dataset and apply machine learning approaches to identify possible biomarkers of paraganglioma recurrence. Time permitting, the student will also explore patterns and relationships between miRNA and tumor size, survival, and response to different treatments. The student will gain experience using a well-known and widely used real-world dataset, and familiarity with data preprocessing, clinical statistics, and supervised and unsupervised machine learning methods.”
• Hidden class mining in machine learning
Supervisor: Ting Hu
Description: “Machine learning models for biomedical data analysis often suffer from poor performance on important subsets of a population that are not identified during training or testing. This is referred to as subtype stratification in bioinformatics and as the hidden-class mining problem in machine learning. For example, a machine learning model trained for cancer detection may achieve overall high classification performance but still consistently miss a rare but aggressive cancer subtype. This project explores techniques that can detect variations and hidden classes in biomedical data.”
• Effectiveness of COVID-19 Enforcement on Community Transmission
Supervisor: Catherine Stinson
Description: “The research question for this project is how effective enforcement measures like police warnings, tickets, and snitch lines have been in reducing community transmission of COVID-19. This work will be in collaboration with Criminologists from https://www.policingthepandemic.ca/. The main tasks are to gather data about geo-locations of COVID-19 transmissions and enforcement efforts, analyze the data, and produce interactive maps and other visualizations. Data gathering may involve NLP, website scraping, or access to The Ontario Health Data Platform. Analysis tools may include GIS, causal graphs, or statistical analysis. “
• How Language Makes us Smart
Supervisor: Nancy Salay
Description: “Thesis paper on how language learning supports a capacity for representation, especially from the perspective of extended/enactive approaches to studying cognition.”
• Automatic chat summarization
Supervisor: Dr. Farhana Zulkernine, Hasan Zafari
• Real-Time Framework for Skeleton-Based Activity Recognition
Supervisor: Dr. Farhana Zulkernine, BAM Lab
Description: “Student Mentor: Isaac Hogan (Fall), another student afterwards. Objective Application of human activity recognition research is limited by the difficulty of deploying novel solutions. To this end, this project entails designing and implementing a framework for ingesting RGB video and Kinect data and converting the data into skeleton points in real time. This should enable faster and easier validation of skeleton-based action recognition models in the future. Learning Outcomes Learn about the requirements and application of real-time data processing, and learn about state-of-the-art techniques in action recognition, including pose detection and skeleton-based action recognition. Expertise Needed Ideally, some experience programming real-time applications or using tools that enable real-time applications. Otherwise, a background in command-line programming in Linux and experience with database applications will be useful. Description This is a fairly general project which will touch quite a few different problems. A brief background study covering pose detection and skeleton-based action recognition should be completed to understand the requirements of these kinds of systems. In addition, the student should complete a background study on best practices for real-time signal processing. From there, an implementation plan should be drafted and followed, defining which software and hardware are to be used to create the action recognition framework as well as the plan for development. Finally, the system should be tested with a few pose detection and skeleton-based action recognition techniques to validate functionality. Data stream sources Webcam RGB data and data from a Kinect 2.0 device. Deliverables A framework and prototype for real-time skeleton-based activity recognition, including source code or project files, and a project report. 
Support The student will work closely with Isaac Hogan for the fall semester, and a different student the following semester. Dr. Zulkernine will assist in a supervisory role and will ensure that resources (licences and hardware) are accessible for the project work. ”
• Reconstruction from stored data fragments using generative networks
Supervisor: Dr. Farhana Zulkernine, BAM Lab
Description: “Objective As part of an ongoing academic-IBM collaboration research project, we are developing a data analytics infrastructure for multilevel streaming analytics. This project will help reconstruct data from stored data fragments. Learning Outcomes Learn about cutting-edge streaming data analytic tools such as IBM Streams; design, implement, and validate a multi-stream, multi-modal data analytic system. Expertise Needed Knowledge of generative networks and programming expertise. Description Reconstruction from corrupted images is a well-known critical problem, specifically in medical domains. The same techniques also have the potential to reconstruct data from incomplete data fragments. Research is ongoing on reconstructing data from corrupted or partially stored data using different variations of generative models, such as generative adversarial networks (GANs), the recurrent neural network variational autoencoder (RNN-VAE), and the recurrent neural network autoencoder (RNN-AE). The aim is to store partial data ingested in IBM Streams and reconstruct the data from the stored data fragments. Data stream sources Twitter Streaming API, Satori live data channels, and proprietary data streams via WebSockets. Deliverables A prototype application, source code, and project report. Support The student will work closely with Dr. Zulkernine and Sazia Mahfuz, a PhD student, and will have access to one-on-one support and the various software stacks necessary for the project. ”
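As a classical (non-neural) stand-in for the reconstruction task, the sketch below fills missing entries of a low-rank matrix by repeated truncated-SVD projection; the generative models named above would replace the low-rank prior with a learned one. All data here are synthetic:

```python
import numpy as np

def lowrank_impute(x, mask, rank=2, iters=50):
    """Fill missing entries (mask == False) by repeated truncated-SVD
    projection, keeping the observed entries fixed each round.

    A simple baseline for reconstructing data from stored fragments.
    """
    filled = np.where(mask, x, x[mask].mean())        # init missing with mean
    for _ in range(iters):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        approx = (u[:, :rank] * s[:rank]) @ vt[:rank]  # rank-k projection
        filled = np.where(mask, x, approx)             # restore known entries
    return filled

rng = np.random.default_rng(1)
a = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))  # true rank-2 data
mask = rng.random(a.shape) > 0.3                         # ~30% of entries "lost"
rec = lowrank_impute(a, mask)
err = np.abs(rec - a)[~mask].mean()                      # error on missing cells
```

The interesting step for this project is swapping the SVD projection for a generative model's decode-encode pass, while the surrounding keep-known-entries loop stays the same.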
• Automatic Text Composition System for Replying to Emails
Supervisor: Dr. Farhana Zulkernine, BAM Lab
• Automatic Case Identification for Diabetes
Supervisor: Dr. Farhana Zulkernine, Hasan Zafari
Description: “Objective As part of an ongoing academic-industry collaboration research project, we are developing a data analytics project to examine existing datasets and identify key features that could be useful in diagnosing cases of diabetes in primary care settings. Description The ever-increasing volume of medical data in the form of electronic medical records (EMRs) is of great value to the primary care system. Patient medical records contain a vast amount of information regarding patient conditions that could be used to help screen patients and support automatic case identification using data mining and machine learning algorithms. However, there are many challenges associated with medical data. For example, compared to data from other disciplines, medical datasets are relatively small, and since many diseases affect only a small subset of the population, the data are usually imbalanced. They can also be affected by several sources of uncertainty, such as measurement errors, missing data, or errors in coding the information buried in textual reports. Given these challenges, the goal of this project is to choose and implement analytical methods for integrating, processing, and interpreting healthcare data for diabetes case detection with the highest predictive power. To this end, the data will be processed and converted to the desired format. Subsequently, to identify key features that are informative in predicting the disease, feature engineering and selection will be performed. At the core of this system, machine learning algorithms will be applied to create a supervised predictive model: a classification algorithm that accepts the features and produces the prediction. Finally, feature-importance analysis will be applied to provide insights into the model’s behavior. The performance of the algorithms will be evaluated in terms of sensitivity and specificity of detection. 
For this purpose, a subset of patients’ data, along with their labels (case or control), will be reserved and used in the evaluation phase. Deliverables A prototype application, source code, and project report. Support The student will work closely with the Postdoctoral Fellow Hasan Zafari and will have access to one-on-one support and the various software stacks necessary for the project. ”
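The sensitivity/specificity evaluation mentioned above reduces to counting the four confusion-matrix cells; the toy cohort below is invented for illustration:

```python
def sensitivity_specificity(y_true, y_pred):
    """Evaluate a case-detection classifier (1 = case, 0 = control):
    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Imbalanced toy cohort: 4 cases among 12 patients.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

On imbalanced data these two numbers are far more informative than plain accuracy, which is why the project names them as the evaluation criteria.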
• Driver Drowsiness Detection
Supervisor: Dr. Farhana Zulkernine, BAM Lab
Description: “Objective Develop and test a real-time driver drowsiness detector for an Advanced Driver-Assistance System (ADAS). Learning Outcomes Learn about developing deep learning models for face detection, image classification, and multi-task convolutional neural networks. Expertise Needed Expertise in programming (Python, OpenCV, TensorFlow or PyTorch), computer vision, and neural networks. Description According to the National Highway Traffic Safety Administration, every year about 100,000 police-reported crashes involve drowsy driving, resulting in injuries or deaths that cost $109 billion annually. Detecting the driver’s drowsiness state is essential to ensure that the driver is in a safe driving state, thereby decreasing the probability of traffic accidents. In this project, we are going to build a multi-task deep neural network to detect the driver’s driving status using a video camera. Data Sources Students can use open-source image data sources and available code and expand upon them. Deliverables An image object detection and classification model, validation of the model, and a project report. Support The students will work closely with PhD student Donghao Qiao and Dr. Zulkernine to extend his prior work on driver fatigue detection. ”
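One simple drowsiness signal that such a network's per-frame outputs could feed is PERCLOS (percentage of eye closure over a time window); the sketch below assumes a hypothetical per-frame eye-openness score in [0, 1] and illustrative thresholds:

```python
def perclos(eye_openness, closed_threshold=0.2):
    """PERCLOS drowsiness metric: fraction of frames with eyes mostly closed.

    eye_openness: per-frame eye-aperture values in [0, 1], e.g. a normalized
    eye-aspect ratio from a landmark or classification model (hypothetical).
    """
    closed = sum(1 for v in eye_openness if v < closed_threshold)
    return closed / len(eye_openness)

def is_drowsy(eye_openness, perclos_limit=0.4):
    """Raise a drowsiness flag when eyes are closed too large a fraction
    of the recent frames."""
    return perclos(eye_openness) > perclos_limit

alert = [0.8, 0.7, 0.75, 0.1, 0.8, 0.85]     # a normal blink
drowsy = [0.1, 0.15, 0.05, 0.6, 0.1, 0.12]   # eyes closed most of the time
```

A deployed ADAS detector would compute this over a sliding window of camera frames, with the deep model supplying the per-frame openness estimate.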
• Remote Photoplethysmography for Vital Signs Measurement
Supervisor: Dr. Farhana Zulkernine, BAM Lab
Description: “Objective Photoplethysmography (PPG) is a pulse-oximetry technique. A PPG device has an LED light source and a photodetector. The light source illuminates the tissue, light is transmitted through or reflected by the tissue, and the photodetector senses the small variations between the emitted and the reflected light. The amount of light, which changes according to blood volume, can be used to measure the pulse or heart rate. PPG is a non-intrusive method for measuring vital signs, but it requires extra light sources and sensors. The goal of our project is to build a remote PPG (rPPG) system to measure vital signs. rPPG is based on the same principle as PPG, but the difference is that it is a contactless measurement process: it can measure vital signs remotely using a video camera or a recorded video of patients or users. Learning Outcomes Learn about developing deep learning models for region-of-interest (RoI) detection and segmentation, video data processing, and analytics. Expertise Needed Expertise in programming (Python, OpenCV, TensorFlow or PyTorch), computer vision, image processing, and neural networks. Description rPPG has two main approaches: color-intensity-based and motion-based methods. Blood absorbs more light than the surrounding tissues, and variation in blood volume affects light transmission and reflection. Video cameras can capture subtle color changes on human skin that are invisible to our eyes. Similar to PPG, rPPG can estimate vital signs based on these subtle light changes. Due to our cardiovascular activities, head motions are caused by the pulse and bobbing movements are caused by respiration; these motion changes can be used to estimate heart and respiration rates. We are also aiming to estimate additional vital signs such as oxygen saturation and heart-rate variability. Data Sources Open-source rPPG datasets. Deliverables A prototype application, source code, and project report. Support Students will work closely with PhD student Donghao Qiao and Dr. Zulkernine to extend his prior work on rPPG. ”
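The color-intensity approach can be sketched end to end on synthetic data: average the green channel over the face ROI in each frame, remove the DC component, and read the pulse off the dominant FFT peak within the plausible heart-rate band. Real videos need ROI tracking and filtering that this sketch omits:

```python
import numpy as np

def estimate_heart_rate(green_means, fps):
    """Estimate pulse (BPM) from a mean-green-channel trace via its FFT peak.

    green_means: average green intensity of the face ROI in each frame.
    Only frequencies in the plausible pulse band 0.7-4 Hz (42-240 BPM) count.
    """
    x = np.asarray(green_means, dtype=float)
    x = x - x.mean()                              # remove DC component
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    power = np.abs(np.fft.rfft(x)) ** 2
    band = (freqs >= 0.7) & (freqs <= 4.0)        # restrict to pulse band
    return 60.0 * freqs[band][np.argmax(power[band])]

fps, seconds = 30, 10
t = np.arange(fps * seconds) / fps
# Synthetic trace: a 72 BPM pulse (1.2 Hz) plus slow drift and sensor noise.
trace = (0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.05 * t
         + 0.1 * np.random.default_rng(0).normal(size=t.size))
bpm = estimate_heart_rate(trace, fps)
```

Motion-based rPPG follows the same frequency-analysis pattern, but on head-trajectory signals instead of color traces.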
• Social Network Representations (a COGS 499 project)
Supervisor: Dr. Stanka Fitneva & Dr. Farhana Zulkernine
Description: “Objective Understanding how people represent the structure of (their) social networks is an important question attracting growing attention in the cognitive sciences. The goal of this project is to create a scalable research tool that allows us to study memory processes for networks in children and adults. Learning Outcomes 1. Familiarity with the current literature on the representation of social networks 2. Understand and apply key principles of experimental research design and HCI 3. Apply descriptive and inferential statistics to pilot data Expertise Needed JavaScript (jsPsych), Python, AWS familiarity, psychology background Description Emergent research on social network representations has started to identify factors that influence memory for social networks. However, little is known about the development and capacity limitations of this type of memory. The goal of this project is to create and pilot a tool that enables online data collection on this problem. Specifically, given a certain network, the tool should present participants with its edges (i.e., dyads) and then implement a recall test. Data stream sources Pilot data will be collected through the Psychology participant pool. Deliverables A prototype application, source code and manual, and project report. Support The student will work closely with Dr. S. Fitneva. ”
• Machine Learning Based Channel Quality Prediction for 5G Cellular Networks
Supervisor: Sameh Sorour
Description: “In the soon-to-come 5G networks, scheduling and resource assignment depend on many parameters, one of the most important being channel quality. The channel parameters depend on many factors, including the wireless device’s mobility and the surrounding conditions. Knowing the channel quality ahead of each transmission is very important for the network to allocate the proper resources to users. As transmission rates become increasingly high, using classical training-based methods for channel estimation is becoming troublesome. Recently, machine learning has started to emerge as an alternative solution for anticipating the channel qualities of mobile users. The potential of machine learning to provide quality estimates of the channel depends on the quality of the dataset and how well it reflects realistic signal propagation settings. The objectives of this project are thus threefold: 1) to generate a dataset that can be used to predict channel quality in 5G mobile networks using a high-quality signal tracing simulator; 2) to analyze this data to find the important parameters affecting prediction; and 3) to build a machine learning model to predict communication channel quality over different time horizons. During the course of this project, the students will learn about 5G networks and channel propagation. They will become familiar with an advanced tool, the Siradel City Explorer software, which can be used to compute the multipath channel and determine the channel quality. Additionally, students will use Matlab to generate the dataset through the Siradel APIs. Moreover, they will acquire experience in applying machine learning to real-world problems using Python and tools like TensorFlow. The project deliverables are expected to be a dataset, prediction-model code, and a report that describes how the data was generated, the data analytics, the prediction model, and the final results.”
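Objective 3 can be prototyped with the simplest possible model before anything deeper: a linear least-squares predictor on synthetic features. The feature names and the linear ground truth below are invented stand-ins for Siradel-generated data:

```python
import numpy as np

def fit_channel_predictor(features, cqi):
    """Linear least-squares channel-quality predictor (a simple baseline).

    features: (n, d) array; cqi: (n,) array of channel-quality values.
    Returns a function mapping new feature rows to predicted quality.
    """
    x = np.hstack([features, np.ones((features.shape[0], 1))])  # add bias term
    w, *_ = np.linalg.lstsq(x, cqi, rcond=None)
    return lambda f: np.hstack([f, np.ones((f.shape[0], 1))]) @ w

rng = np.random.default_rng(2)
# Hypothetical features: distance to base station, speed, obstruction count
# (all scaled to [0, 1]); quality degrades linearly in each, plus noise.
feats = rng.uniform(0, 1, size=(200, 3))
cqi = (15 - 8 * feats[:, 0] - 3 * feats[:, 1] - 2 * feats[:, 2]
       + rng.normal(0, 0.2, 200))
predict = fit_channel_predictor(feats[:150], cqi[:150])       # train split
rmse = np.sqrt(np.mean((predict(feats[150:]) - cqi[150:]) ** 2))  # test split
```

The fitted weights also give a first cut at objective 2 (which parameters matter), before moving to nonlinear models in TensorFlow.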
• IoT-Enabled Cyber-Physical Systems
Supervisor: Sameh Sorour
Description: “Scope The migration towards the era of smart cities is heavily dependent on connecting physical things to the internet (a.k.a. the internet of things, IoT) and building environments that enable users to actively interact with these physical things (e.g., monitoring/tracking, control, reservation, etc.) through these connections. Systems with this description are usually known as cyber-physical systems (CPSs). Project Description In this project, interested groups of one or two students will develop a multi-purpose IoT-enabled CPS (see examples below). Any system must consist of more than two sides, one of which is a user app/website and the others are “things”. The user side should be able to establish connections with the thing sides, read monitoring information from them, and send control commands and/or requests to them. The thing sides will be required to maintain the established connections with the user side, send monitoring information, and receive and execute a variety of controls and/or requests from the user side. Applications of the system are left open to students’ innovation, and I will specify requirements for each system based on its chosen applications. Since some CPSs cannot be fully deployed in real life for reasons beyond students’ control (e.g., the examples below), a fully functional system should still be built and tested in an emulated environment for demonstration (details to be discussed with me as per the system’s idea and applications). An example of a multi-purpose CPS is a smart home environment with multiple controls and interactions (control light/heating, open/close door locks remotely, send and receive intercom-like voice messages, etc.).
Another example is a smart bus interaction system, enabling users to monitor a bus position, send pickup requests to the bus at any given station, provide options to pay ride fees online, receive alerts to start moving toward the station when necessary, and transfer pickup requests to other buses if late. Required Expertise In addition to strong programming skills, the project requires basic knowledge of app/web development, interfacing with hardware, and connection establishment between end devices. Deliverables: 1. Code for the entire system 2. Prototype of a fully functional system in a real or emulated environment 3. Project report 4. Video recording illustrating all the functions of the prototype Learning Outcomes Learn how to select, build, and program CPS applications, and interface them with physical hardware.”
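The user/thing split described above can be sketched over plain TCP using only the standard library. The commands (“READ TEMP”, “LIGHT ON”) and the stored state are invented for illustration; a real CPS would define its own protocol (e.g. MQTT) and talk to actual hardware:

```python
# Minimal user/thing sketch: the "thing" answers monitoring reads and
# executes control commands; the "user" connects, reads, and controls.
import socket
import threading

def thing_side(server_sock):
    """The 'thing': serves monitoring info and executes controls."""
    state = {"light": "OFF", "temp": 21.5}   # illustrative device state
    conn, _ = server_sock.accept()
    with conn:
        for _ in range(2):                   # handle exactly two commands
            cmd = conn.recv(1024).decode().strip()
            if cmd == "READ TEMP":
                conn.sendall(str(state["temp"]).encode())
            elif cmd.startswith("LIGHT"):
                state["light"] = cmd.split()[1]
                conn.sendall(b"OK")

def user_side(port):
    """The user app: reads monitoring info, then sends a control."""
    with socket.create_connection(("127.0.0.1", port)) as s:
        s.sendall(b"READ TEMP")
        temp = s.recv(1024).decode()
        s.sendall(b"LIGHT ON")
        ack = s.recv(1024).decode()
    return temp, ack

server = socket.socket()
server.bind(("127.0.0.1", 0))    # OS picks a free port
server.listen(1)
port = server.getsockname()[1]

t = threading.Thread(target=thing_side, args=(server,))
t.start()
temp, ack = user_side(port)
t.join()
server.close()
```

The client waits for each reply before sending the next command, which keeps the two messages from being coalesced on the stream; a production protocol would frame messages explicitly.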
• Clinical and genomic database
Supervisor: Kathrin Tyryshkin
Description: “This project involves developing, populating, and testing a RedCap database for clinical, pathological and genomic data for different types of cancer. The RedCap database platform is currently set up at the CAC with a basic configuration and a template. The goal is to utilize the existing database template and expand it to several other cancer datasets. If necessary, improvements to the existing template will be researched and added. In addition, investigating other options for storing clinical/pathological/genomic data on a larger scale will constitute the main research component of the project.”
• Context-Anticipatory Resource Allocation for 5G-Assisted Autonomous Driving
Supervisor: Sameh Sorour
• Enhancing a toy computer
Supervisor: Richard Linley
Description: “My course makes use of a JavaScript simulation of a toy computer with a single accumulator register and a small number of assembly language instructions. All machine addresses are represented by string labels. Unfortunately, the toy has no capacity for representing array (list-like) structures, or distinguishing between an address and the data stored at that address. These are features of the assembly language and simulator that I would like to have added by a CISC 499 student.”
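The missing address/data distinction is one extra level of lookup. The sketch below is in Python rather than the course's JavaScript, and the mnemonics (LOAD, LOADI, STORE, ADD) are assumed for illustration, not taken from the actual toy:

```python
# Sketch of indirect addressing in a label-addressed toy machine:
# LOADI treats the value stored at a label as itself an address.

def run(program, memory):
    acc = 0  # the single accumulator register
    for op, arg in program:
        if op == "LOAD":        # acc <- value stored at label
            acc = memory[arg]
        elif op == "LOADI":     # acc <- value at the address stored at label
            acc = memory[memory[arg]]   # one extra lookup = indirection
        elif op == "STORE":     # label <- acc
            memory[arg] = acc
        elif op == "ADD":       # acc <- acc + value at label
            acc += memory[arg]
    return acc, memory

# An "array" becomes a base label plus stored labels pointing at cells.
mem = {"arr0": 10, "arr1": 20, "ptr": "arr1"}
acc, mem = run([("LOADI", "ptr"),   # acc = mem[mem["ptr"]] = 20
                ("ADD", "arr0"),    # acc = 30
                ("STORE", "sum")], mem)
```

Incrementing the label stored at `ptr` would then give array traversal, which is exactly the capability the current simulator lacks.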
• Digital Fabrication using E-textiles
Supervisor: Sara Nabil
Description: “Imagine future computers that are woven into your shirt sleeve, stitched to your pillow, knitted into your scarf or embroidered on your favourite jacket. Apart from software, coding and digital data, this project uniquely focuses on physical fabrication and making of novel interactive tangible objects. Hardware prototyping does not always rely on electronic sensors, motors and LEDs. Instead, we can innovate new materials that have computational properties, such as textiles, wood, paper, stone, etc. In this project, we will explore an alternative approach to creating interactive everyday things, which is to incorporate smart materials directly into the making stages of the everyday materials, e.g. sewn fabrics. Smart materials that have morphological (shape- and colour-changing) capabilities, such as thermochromic inks and shape-memory wires, can be literally stitched, knitted and woven into different materials. This research project aims to explore the different emergent materialities that can be digitally designed, fabricated and crafted from such smart materials. To achieve this, we will adopt a ‘Research through Design’ (RtD) approach, which we refer to as ‘co-designing with our materials’. We draw on the insights of RtD to frame the production of annotated portfolios as a rigorous theory and a developing form to underpin our presentation of a series of hands-on laboratory experiments. We will use specialized fabrication equipment and digital computational design in our design explorations in a creative practice to offer novel insight into the interactive potentials of the techniques we will exploit. For example, laser-cutting, 3D-printing and digital embroidery are some of the fabrication methods we can utilize. In this project, we bring the material science innovation of actuating wires to a new context and appropriated practices, as threads.
This bridging between technology and crafting enables ‘smart’ materials to have new encounters with other materials (such as fabrics and textiles), other tools (such as needles and bobbins) and other machines (such as sewing machines or embroidery machines). This approach broadens the accessibility of technology prototyping and has the potential to enable new previously unrealizable possibilities. For example, we can innovate shape-changing wearables and colour-changing garments as means of assistive technology to support marginalized groups including people with disabilities. See examples of similar design projects at: https://saranabil.com/pages/publications.html”
• Fabric Speakers as Deformable User Interfaces
Supervisor: Sara Nabil
Description: “Think of how future clothes can be wearable devices and how we can map different deformations in their fabric to an action in the software. The goal of this project is to digitally fabricate soft audio speakers made out of fabrics and study how they can be used by people in their daily lives. This project aims to propose interaction design techniques that improve the user experience of everyday clothes embedded with interactive technology using novel fabrication methods. Speakers and headphones are common means of audio output, which are mostly still rigid in nature. In this project, we will explore how we can fabricate soft speakers as a seamless part of deformable fabric interfaces and explore people’s experience with such embodied interaction. Emerging digital fabrication methods offer the opportunity of making soft speakers that are fabric-thin, malleable and deformable, sewn to clothes in fashionable and aesthetic ways. This could be used to support some patients requiring hearing aids or personal alerting soft devices potentially connected to their biosensors. When audio devices become seamlessly ubiquitous in this way, they avoid the common paraphernalia of digital technology that some might experience, supporting the psychological well-being of some patients as well. For example, machine-sewn speakers on a user’s vest can allow them to express themselves when unable to speak, without wearing wires or rigid switches, buttons or bulky electronic components. We will explore deformable soft speakers using copper and silver-plated threads and conductive yarns. For this project, we will create prototypes of soft speakers on fabric samples that respond to flex sensor input detecting hand manipulations to control the sound volume, using Arduino microcontrollers. We use digital sewing machines to automate this process. Then, we design and run user experiments to assess the functional and experiential impacts on their daily activities.”
• Interactive Smart Spaces for Living Through COVID-19
Supervisor: Sara Nabil
Description: “Make your room change its size and physical appearance (literally, not virtually) during the pandemic. Just wave your hand for your wall to transform! Smart Spaces, including Smart Homes, are not only those that collect massive data and can control lights, curtains, temperature and humidity. Smart spaces can have interactive capabilities that respond to people as actuating physical interfaces characterized by being aesthetically pleasing, intuitively manipulated and ubiquitously embedded in our daily life. In this project, we will be designing and building interactive walls, floors, ceilings, furniture or entire buildings that have the potential to finally transform the vision of smart homes and ubiquitous computing environments into reality. We can propose interactive spaces for both exterior and interior design, arguing that interaction design should be at the core of a new interdisciplinary field driving research and practice in architecture. The design concept will focus on supporting people during the pandemic to be able to tolerate the wellbeing challenges of self-isolation and lockdown. Based on this agenda, we will be innovating future technology of smart spaces and utilizing interactive smart materials (e.g. conductive paints and fabrics, shape-changing and colour-changing materials). We will also be addressing the challenges and opportunities of this novel design space. This agenda offers us new means through which to deliver a future of interactive architecture.”
• Authorship Verification on Social Network based on Stylometric Learning
Supervisor: Steven Ding
Description: “Authorship analysis (AA) is the study of unveiling the hidden properties of authors from textual data. It extracts an author’s identity and sociolinguistic characteristics based on the reflected writing styles in the text. The process is essential for various areas, such as cybercrime investigation, psycholinguistics, political socialization, etc. In this project, we will develop a new computational method for authorship verification on social network data based on stylometric representation learning, to identify compromised accounts in real-time. “
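The core idea, representing a writing style so that two texts can be compared, can be sketched with classical character n-gram profiles and cosine similarity. The three snippets below are invented examples, and a real system would replace the hand-built profile with learned stylometric representations:

```python
# Minimal stylometric comparison: character 3-gram frequency profiles
# plus cosine similarity. A learned representation would replace profile().
from collections import Counter
import math

def profile(text, n=3):
    """Character n-gram relative-frequency profile of a text."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(p[g] * q.get(g, 0.0) for g in p)
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

known = "i will be there soon, i will call you when i arrive."
same  = "i will see you there, i will text you when i leave."
other = "Per our discussion, the committee shall reconvene Thursday."

sim_same = cosine(profile(known), profile(same))
sim_other = cosine(profile(known), profile(other))
```

Verification then amounts to thresholding the similarity between an account's historical profile and a new message; a compromised account should score low against its own history.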
• Deep Reinforcement Learning
Supervisor: Francois Rivest
Description: “Deep Reinforcement Learning is a type of machine learning architecture capable of learning on its own to play games (such as Atari and Go; see the Nature papers by Google’s DeepMind group) and of solving operational research problems. While many papers focus on developing new algorithms to beat the state of the art or to solve a specific problem, fewer papers focus on studying the importance of each component and how to improve them. In this project, you will build on an existing architecture and focus on some of its components, modifying them and running experiments to see whether small changes to some of the neural network elements give better or worse results.”
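The learning rule at the heart of these architectures can be isolated in tabular form. The corridor environment, hyperparameters, and episode count below are illustrative; a deep RL agent replaces the table with a neural network but keeps the same update:

```python
# Tabular Q-learning on a 5-state corridor (move right to reach the goal).
# Isolating the update rule makes it easy to swap and study components.
import random

random.seed(0)
N_STATES = 5                 # goal is state 4
ACTIONS = [0, 1]             # 0 = left, 1 = right
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.5, 0.9, 0.1   # learning rate, discount, exploration

for _ in range(200):                # episodes
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[s][act])
        s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Core update: nudge Q(s,a) toward reward + discounted best future value.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
```

After training, the greedy policy prefers “right” in every non-terminal state; experiments of the kind the project describes would vary pieces such as the exploration schedule or the function approximator and measure the effect.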
• Identifying fire and explosion from video data
Supervisor: Farhana Zulkernine and Francois Rivest
• Detecting fraud in financial statements
Supervisor: David Skillicorn
Description: “Some years ago we looked at whether fraudulent filings by public companies could be detected based on the language used in the so-called MD&A section, which is essentially freeform. We showed (much to everyone’s surprise) that they could. The paper is here: https://onlinelibrary.wiley.com/doi/full/10.1111/1911-3846.12089 Since then, natural language prediction techniques have improved a lot, especially with deep learning via biLSTMs. Your mission, should you choose to accept it, is to repeat the analysis and see if accuracies can be increased using new techniques.”
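The shape of the task, classifying a filing from its language alone, can be sketched with a tiny multinomial Naive Bayes. The four “filings” below are invented toy sentences, not real MD&A text, and the biLSTM the project targets would replace this bag-of-words model:

```python
# Toy language-based classifier: multinomial Naive Bayes over word counts.
from collections import Counter
import math

def train(docs):
    """docs: list of (label, text). Returns counts, totals, priors, vocab."""
    counts, totals, priors = {}, Counter(), Counter()
    for label, text in docs:
        priors[label] += 1
        for w in text.lower().split():
            counts.setdefault(label, Counter())[w] += 1
            totals[label] += 1
    vocab = {w for c in counts.values() for w in c}
    return counts, totals, priors, vocab

def classify(model, text):
    counts, totals, priors, vocab = model
    n = sum(priors.values())
    best, best_lp = None, -math.inf
    for label in priors:
        lp = math.log(priors[label] / n)
        for w in text.lower().split():
            # Laplace smoothing over the shared vocabulary
            lp += math.log((counts[label][w] + 1) / (totals[label] + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

docs = [
    ("honest", "revenue grew modestly and costs were controlled"),
    ("honest", "cash flow remained stable despite market costs"),
    ("fraud",  "extraordinary unprecedented growth exceeded all expectations"),
    ("fraud",  "record breaking results exceeded unprecedented targets"),
]
model = train(docs)
```

The project's contribution would be in the representation: sequence models like biLSTMs capture word order and context that a count-based model like this discards.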
• Natural embeddings for Mohawk
Supervisor: David Skillicorn
Description: “Neural-network based embeddings of natural language have been useful in lexicon expansion, and in allowing the informal language used online to be represented in a useful way. However, the focus has been on analytic languages such as English, where word order matters, and changes in words convey only small variations of meaning. Mohawk is an agglutinative language, where megawords are formed by adding pieces that alter the meaning, so that the meaning of the whole depends in subtle ways on the pieces and how they are assembled. Nobody knows how or whether embeddings would be a useful representation for such a language. This project would be to apply embedding techniques such as Fasttext and BERT to Mohawk and see how well they work. (Obviously this would suit a student from an indigenous background.)”
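The reason Fasttext is a plausible starting point is its subword decomposition: a word's vector is the sum of its character n-gram vectors, so long agglutinated words share structure with their pieces even if never seen in training. A minimal sketch of that decomposition (the two long example "words" are illustrative strings, not real Mohawk glosses):

```python
# Fasttext-style subword decomposition in miniature: break each word
# into boundary-marked character n-grams, as the model does internally.

def subwords(word, nmin=3, nmax=5):
    """Character n-grams of a word with Fasttext's < > boundary markers."""
    w = f"<{word}>"
    return [w[i:i + n]
            for n in range(nmin, nmax + 1)
            for i in range(len(w) - n + 1)]

def shared(word_a, word_b):
    """Fraction of word_a's subwords that also occur in word_b."""
    a, b = set(subwords(word_a)), set(subwords(word_b))
    return len(a & b) / len(a)

# Two related long "megawords" share many internal n-grams -- the
# property that makes subword models promising for such languages.
overlap = shared("kenonhwarori", "tekenonhwarori")
```

The open research question the project poses is whether summing such piece-vectors actually captures the subtle compositional meaning of Mohawk megawords, or whether a different decomposition is needed.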
• Order preserving hashing
Supervisor: David Skillicorn
Description: “Clustering heterogeneous datasets requires comparing apples to oranges: one attribute might be numeric while another is categorical, and yet another is free text. One approach is to map all attributes to numbers, but in such a way that similar initial values map to close numbers. This is known as locality-sensitive hashing or order-preserving hashing. Some algorithms are known, but it would be useful to have implementations that could be used to compare their real-world performance. The project is to find and implement these algorithms.”
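One naive baseline for the comparison testbed: interpret a string's leading bytes as a base-256 fraction, which maps lexicographic order onto numeric order. This is an illustrative scheme of my own construction, not one of the published algorithms the project would survey:

```python
# Naive order-preserving hash: strings -> [0, 1) such that lexicographic
# order of inputs is preserved in the numeric outputs.

def op_hash(s, depth=8, alphabet_size=256):
    """Interpret the first `depth` UTF-8 bytes of s as a base-256 fraction."""
    x = 0.0
    for i, byte in enumerate(s.encode("utf-8")[:depth]):
        x += byte / (alphabet_size ** (i + 1))
    return x

values = ["apple", "apricot", "banana", "cherry"]
hashes = [op_hash(v) for v in values]
```

Its weaknesses (only the first `depth` bytes matter; byte order is not linguistic similarity) are exactly the kind of real-world behaviour the project's comparison would expose against proper published algorithms.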
• A simulation environment for collaborative robotics
Supervisor: Dingel, Givigi, Muise
• Automatic instrumentation of robot simulations for runtime monitoring
Supervisor: Dingel, Givigi, Muise
• Integration of the Robot Operating System with temporal logic
Supervisor: Dingel, Givigi, Muise
• Mapping police interactions for accountability research
Supervisor: Catherine Stinson
Description: “In collaboration with community groups in Kingston and Waterloo, the goal is to build an app where people can document and visualize incidents between police and community members to support research around police accountability. The app should be accessible, secure, and easy to use by the public, accommodate information about various types of incidents (stop & check, traffic stops, arrests), at varying levels of detail (location, time, badge or case numbers, written reports, audio records, video records, witness information), and several levels of trust (private, shared within the project, for public display). In addition to an interface for recording incidents as or shortly after they happen, the app should include display functions for visualizing the data in the form of an interactive map. Students will work with community members to develop specifications, and test the interface. The app can be built as an extension of an existing tool like Tello. Given the sensitive nature of policing data, care will need to be taken with data access, secure data storage, and building trust. The app should be extendable for eventual use in other contexts. All code will be produced under an open source license. Students with experience in any of the following areas, or who have an interest in developing one of these skills, are especially encouraged to apply for this project: database design and cybersecurity; interface design and human-computer interaction; community-based research.”
• Environmental risk score analysis of chronic childhood diseases
Supervisor: Qingling Duan
Description: “In recent years, there has been increasing interest in determining the impact of environmental exposures on risk of chronic diseases. However, many of these studies employ univariate methods to assess the effects of a single exposure, which does not reveal the net effects of multiple simultaneous exposures on disease risk. We propose a method where we compute a weighted aggregated score of numerous exposures, where weights are determined from the odds ratios of meta-analyses. Resultant scores will measure individual risk of developing a trait/disease based on one’s exposome, which will be combined with genetic risk scores to predict overall disease risk. The ultimate goal of this project is to improve the ability to predict the development of specific health traits from environmental exposures and multi-omics datasets. The overarching objective of my research program is to gain a better understanding of the risk factors and mechanisms of complex diseases, which will ultimately lead to improved prevention, diagnosis, and treatment. I am an active member of national networks including the Canadian Respiratory Research Network (CRRN) and the Allergy, Genes and Environment Network (AllerGEN) as well as a lead investigator in the Canadian Healthy Infant Longitudinal Development (CHILD) study and the Canadian Obstructive Lung Disease (CanCOLD) study. Trainees in my lab will benefit from my ongoing collaborations with these networks through participation in training workshops, funding opportunities and working with researchers across Canada who have expertise in diverse disciplines. Moreover, trainees will gain a wide range of skills in data analytics, hands-on experience with multiple types of high-dimensional biological datasets, and work within a multi-disciplinary team.”
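The weighted aggregated score can be sketched directly: each exposure contributes its indicator (or dose) times the log of a meta-analysis odds ratio, so harmful exposures (OR > 1) add positive weight and protective ones (OR < 1) subtract. The exposure names and odds ratios below are invented for illustration, not results from any cohort:

```python
# Sketch of a weighted environmental risk score: weight = ln(odds ratio).
import math

# Illustrative exposures and meta-analysis odds ratios (invented values).
odds_ratios = {
    "smoking_in_home": 1.8,
    "traffic_pollution": 1.3,
    "pet_exposure": 0.8,      # OR < 1: protective, gets a negative weight
}
weights = {k: math.log(v) for k, v in odds_ratios.items()}

def risk_score(exposures):
    """exposures: dict of exposure name -> 0/1 indicator (or dose)."""
    return sum(weights[k] * exposures.get(k, 0) for k in weights)

high = risk_score({"smoking_in_home": 1, "traffic_pollution": 1})
low = risk_score({"pet_exposure": 1})
```

Using ln(OR) as the weight makes the score additive on the log-odds scale, which is what allows it to be combined later with a genetic risk score built the same way.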
• Volumetric Surgical Target Reconstruction From Limited X-ray Images
Supervisor: Gabor Fichtinger
Description: “A significant problem in planning of volumetrically prescribed localized medical treatments, such as irradiation of tumors and surgical resections, is the mathematical impossibility of determining the exact three-dimensional shape and volume of a target object from its projected X-ray images. Reconstruction accuracy also varies with viewing angle, depending on the convexity and aspect ratios of the target object. In response to this problem, your task would be developing a robust and efficient technique for approximate volumetric reconstruction, which (A) uses no prior information of the shape and volume of the target, (B) does not require exact silhouettes, (C) accepts an arbitrary number of X-ray images, (D) produces a solid object and a measure of its volume, (E) provides a confidence measure of the reconstruction and drawing of silhouettes, (F) is robust, fast and easy to implement. The method is applicable to any X-ray guided volumetric treatment. Pilot applications will be planning of radiosurgery of arteriovenous malformations (AVMs) and breast cancer surgery planning on X-ray mammography images.”
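A classical starting point for silhouette-based approximate reconstruction is voxel carving: keep each voxel only if its projection lies inside every silhouette. The sketch below uses two axis-aligned binary silhouettes of a synthetic box; real X-ray views would need calibrated projection geometry, and the grid size is arbitrary:

```python
# Minimal voxel-carving sketch: intersect two orthogonal silhouettes
# of a synthetic box to recover an approximate solid and its volume.

N = 8  # voxel grid resolution (illustrative)

# Silhouette seen along z (constrains x, y) and along x (constrains y, z):
sil_xy = [[2 <= x <= 5 and 3 <= y <= 6 for y in range(N)] for x in range(N)]
sil_yz = [[3 <= y <= 6 and 1 <= z <= 4 for z in range(N)] for y in range(N)]

# Keep a voxel only if its projection falls inside every silhouette.
voxels = [(x, y, z)
          for x in range(N) for y in range(N) for z in range(N)
          if sil_xy[x][y] and sil_yz[y][z]]

volume = len(voxels)   # voxel count approximates the target volume
```

This visual-hull intersection always over-estimates concave targets, which is one reason the project asks for a confidence measure alongside the reconstructed volume.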
• Video-based Surgical Tool Tracking for Medical Skills Training
Supervisor: Gabor Fichtinger
Description: “It is well known that in medical education, feedback is regarded as an essential component of learning. Unfortunately, continuously observing and evaluating trainees places a tremendous time burden on physicians. This is valuable time that could be better spent caring for patients. There exist education systems that can automatically monitor and evaluate trainees, but most of these systems rely on tracking systems to gather information about tool motion. These tracking systems typically cost thousands of dollars, making these educational systems inaccessible to many. With recent advances in deep learning and object recognition, it’s possible to track objects using video alone. For this project, your task will be to evaluate the use of publicly available object detection networks for recognizing surgical tools. You will be responsible for training and evaluating public implementations of well-known object detection networks such as YOLO and Faster R-CNN on our dataset of simulated surgical videos. Hands-on guidance and technical support will be provided by Rebecca Hisey, doctoral student, expert in deep learning of surgical video imagery. The main application of this work is an educational system for training central venous catheterization.”
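Evaluating detectors like YOLO or Faster R-CNN comes down to intersection-over-union between predicted and ground-truth boxes. A self-contained sketch of that metric (the box coordinates are illustrative):

```python
# Intersection-over-union between two axis-aligned bounding boxes,
# the standard metric for scoring object detections.
# Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.

def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

# Two partially overlapping 10x10 boxes share a 5x5 region.
score = iou((0, 0, 10, 10), (5, 5, 15, 15))
```

A detection is conventionally counted as correct when IoU with the ground-truth box reaches a threshold such as 0.5, which is how per-tool accuracy on the surgical video dataset would be reported.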
• Optimization of camera placement for surgical tool tracking
Supervisor: Gabor Fichtinger
Description: “Video serves as an affordable alternative to the expensive tracking systems that are typically used in computer-assisted assessment systems for medical skills. One of the major challenges with using video-based tracking is keeping the surgical tools that you wish to track in the line of sight of the camera. A large field of view will give more context about the use of the tools and will mean that the surgical tools are visible to the camera more often. Unfortunately, having a large field of view makes the surgical tools smaller in comparison and makes object detection networks less reliable. Alternatively, a small field of view makes the surgical tools easy to recognize, but they will frequently be blocked from view by the physician’s hands. The main goal of this project will be to identify which pair of camera locations provides the optimal combination of context and accuracy for recognizing the surgical tools. This project will require A) training a neural network to recognize the surgical tools used in central venous catheterization and B) evaluating the network on data recorded from various camera angles and distances away from our venous access phantom. Hands-on guidance and technical support will be provided by Rebecca Hisey, doctoral student, expert in deep learning of surgical video imagery.”