## Michael E. Cotterell's Research

### Research Statement

Coming Soon!

Keywords: functional data analysis, big data, clustering, predictive analytics, domain-specific embedded languages, ontologies, semantic web, algorithms

### Current Research Areas

Predictive Analytics Techniques in Functional Data Analysis

Functional Data Analysis (FDA) is concerned with the analysis of data in continuous functional data spaces, or of data mapped into such spaces, in order to take advantage of information about the data's rates of change. Fitting data with smoothing splines (e.g., using a B-spline basis) in a reproducing kernel Hilbert space (RKHS), for example, makes it possible to use a functional data space with existing predictive analytics techniques like clustering. Traditional clustering methods usually suffer when dealing with high-dimensional data because distances concentrate: the relative difference between the nearest and farthest points approaches zero as the dimension grows. When working with a functional data space, these techniques can instead take advantage of distance calculations based on derivatives of the functions. A general form for such a functional distance metric is $$d_{n,p} (x_i, y_i) = \left( \int \vert D^n_t x_{i}(t) - D^n_t y_{i}(t) \vert^p \, dt \right)^{1/p}$$ where $$D^n_t$$ denotes the $$n$$-th derivative with respect to $$t$$, allowing for various metrics like $$d_{0,1}$$, $$d_{0,2}$$, and $$d_{2,2}$$, which are the $$L_1$$ metric, the $$L_2$$ metric, and a semi-metric on second derivatives, respectively. ($$d_{2,2}$$ is only a semi-metric because two functions that differ by a linear term have identical second derivatives and therefore a distance of zero.) In addition to the benefits reaped by traditional clustering techniques, some newer model-based approaches can take advantage of assumptions about the functional data's distribution. Currently, I am comparing various clustering techniques for a functional data analysis of gene isoform expression data. Future work in this research area includes additional applications of clustering on functional data as well as the use of functional data in other analytics areas like prediction and classification.
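The distance $$d_{n,p}$$ above can be approximated numerically. The following is a minimal sketch, not the method used in this research: it assumes both curves are sampled on the same uniform grid, and it substitutes finite differences and the trapezoidal rule for the derivatives and integral of a fitted spline. The function names (`derivative`, `trapezoid`, `functional_distance`) and the example curves are hypothetical and chosen only for illustration.

```python
import math


def derivative(values, dt, order=1):
    """Approximate the order-th derivative of uniformly sampled values.

    Uses central differences in the interior and one-sided differences
    at the two endpoints, applied `order` times.
    """
    result = list(values)
    for _ in range(order):
        last = len(result) - 1
        nxt = []
        for i in range(len(result)):
            if i == 0:
                nxt.append((result[1] - result[0]) / dt)
            elif i == last:
                nxt.append((result[last] - result[last - 1]) / dt)
            else:
                nxt.append((result[i + 1] - result[i - 1]) / (2 * dt))
        result = nxt
    return result


def trapezoid(values, dt):
    """Integrate uniformly sampled values with the trapezoidal rule."""
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))


def functional_distance(x, y, dt, n=0, p=2):
    """Approximate d_{n,p}(x, y) for two curves sampled on the same grid."""
    dx = derivative(x, dt, n)
    dy = derivative(y, dt, n)
    integrand = [abs(a - b) ** p for a, b in zip(dx, dy)]
    return trapezoid(integrand, dt) ** (1.0 / p)


# Hypothetical example: two curves on [0, 1] that differ by the linear term 3t + 1.
dt = 0.01
t = [i * dt for i in range(101)]
x = [math.sin(2 * math.pi * ti) for ti in t]
y = [math.sin(2 * math.pi * ti) + 3 * ti + 1 for ti in t]

d01 = functional_distance(x, y, dt, n=0, p=1)  # L1 distance
d02 = functional_distance(x, y, dt, n=0, p=2)  # L2 distance
d22 = functional_distance(x, y, dt, n=2, p=2)  # semi-metric on second derivatives
```

For these two curves the analytic values are $$d_{0,1} = 2.5$$ and $$d_{0,2} = \sqrt{7}$$, while $$d_{2,2}$$ is zero up to rounding error because the curves differ only by a linear term. In practice the derivatives would more likely come from the fitted spline basis than from finite differences on raw samples, which amplify noise.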

### Previous Research Areas

Domain-Specific Embedded Languages for Analytics, Simulation, and Optimization

I've contributed to the design and implementation of ScalaTion, a Domain-Specific Embedded Language (DSEL) written in Scala that serves as a testbed for exploring a modeling continuum that includes Analytics, Simulation, and Optimization. My early work on this project included the addition of Unicode operators within the ScalaTion DSEL with the goal of making source code more concise, more readable, and closer to a form familiar to domain experts. The result, in many cases, is code that looks more like textbook formulas than traditional programming code. Related to this, I also worked on SimOptDSL, a simulation optimization package that can utilize ScalaTion to easily model and solve optimization problems. More recently, I've contributed to many of the components in ScalaTion, including coroutines, the process interaction simulation package, the linear algebra package, the analytics package, and various functions used in probability and statistics.

Ontologies & Semantic Algorithms for Service and Model Suggestion

I've applied ontologies and semantic algorithms to problems in the Bioinformatics, Energy Informatics, and Big Data Predictive Analytics domains. Within Bioinformatics, I helped extend a service suggestion algorithm created by Rui Wang that utilized the Ontology for Biomedical Investigations (OBI) and made it available as the Service Suggestion Engine (SSE), a REST web service. I also worked on a plugin for Galaxy, a web application for creating and executing bioinformatics workflows, that provided an interface for Galaxy users to utilize the service suggestion algorithm. This interface helped users construct their workflows by providing suggestions based on the current state of the workflow design as well as user-provided goals. In Energy Informatics, I created the Ontology for Energy Informatics (OEI) as part of my internship at the Department of Energy's National Renewable Energy Laboratory (NREL). Like OBI, this ontology was built on top of the Basic Formal Ontology (BFO) in the hope that it would facilitate easier integration with other ontologies and systems. Within the domain of Big Data Predictive Analytics, I helped with the construction of the Analytics Ontology (AO) and its associated ScalaDash application. The ontology captured domain knowledge about different analytics modeling techniques and their underlying assumptions. The ScalaDash application utilized AO to provide modeling suggestions based on a description of a user's dataset. Users could then tweak and directly execute the models within the application using ScalaTion in order to facilitate rapid analytics.

The content and opinions expressed on this Web page do not necessarily reflect the views of nor are they endorsed by the University of Georgia or the University System of Georgia.