The movements of large numbers of tracks over geo-spaces are routinely collected by surveillance platforms. This big data is often used for in-the-small analysis, which focuses on a single track, item, or event. However, this focus ignores a profound aspect of the collected data: the ensemble of tracks and their dynamics may reveal the macroscopic qualities of a geo-space. Macroscopic qualities are essential for geospatial pattern-of-life analysis, anomaly detection, and evolutionary analysis. This work is an effort to build mathematical (dynamic network) models of these geospatial dynamics from big datasets of track locations over time. It: (i) develops algorithmic extensions for turning big sets of tracking data into dynamic networks under noisy data conditions; (ii) transforms our algorithms and tools to run in modern high-performance computing environments; and (iii) implements common and novel social network analysis algorithms to establish relationships between tracks and places in a geo-space.
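As a minimal illustration of the first step, one way to turn timestamped track observations into a dynamic network is to bucket observations into time windows and link tracks that co-occur at the same place within a window. The function name, window size, and co-occurrence rule below are illustrative assumptions, not the project's actual algorithm:

```python
from collections import defaultdict
from itertools import combinations

def dynamic_cooccurrence_networks(observations, window=3600):
    """Bucket (track_id, place, timestamp) observations into time windows
    and link tracks that visit the same place within the same window.
    Returns {window_index: set of (track_a, track_b) edges}."""
    buckets = defaultdict(lambda: defaultdict(set))  # window -> place -> tracks
    for track, place, ts in observations:
        buckets[int(ts // window)][place].add(track)
    networks = defaultdict(set)
    for w, places in buckets.items():
        for tracks in places.values():
            for a, b in combinations(sorted(tracks), 2):
                networks[w].add((a, b))
    return dict(networks)

obs = [("t1", "harbor", 10), ("t2", "harbor", 50), ("t3", "airport", 70),
       ("t1", "airport", 4000), ("t3", "airport", 4100)]
nets = dynamic_cooccurrence_networks(obs, window=3600)
# window 0 links t1-t2 (harbor); window 1 links t1-t3 (airport)
```

The per-window graphs can then be fed to standard social network analysis routines, which is where the relationships between tracks and places would be computed.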
The dramatic shift in the kind of information shared on the Web, together with the rise of the Internet of Things, has led to sophisticated Web robots or crawlers (automated programs that submit HTTP requests to Web servers) across the Internet. Unlike the robots that crawled the Web a decade ago, when most content was static and hosted on traditional Web server platforms, modern robots capture dynamic information that carries instantaneous value and are themselves hosted on server farms or clouds. This NSF-funded project at Wright State responds to this "rise of the robots" by devising mechanisms that mitigate their impact on the performance and reliability of Web systems. The research: (i) develops and compares models for predicting the functionality of a Web robot given its behavioral features; (ii) codifies traffic generators that build synthetic robot requests; and (iii) devises next-generation Web caches, scalable to entire clouds, that anticipate the resources to be requested by both robots and humans.
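To give a flavor of feature-based robot prediction, a toy rule-based score over a few behavioral features might look like the sketch below. The features, weights, and thresholds are illustrative assumptions only; the project's actual models are learned, not hand-coded:

```python
def score_robot_session(session):
    """Toy robot-likelihood score from behavioral session features.
    All weights and thresholds here are made-up illustrations."""
    score = 0.0
    if session.get("requested_robots_txt"):          # humans rarely fetch it
        score += 0.4
    if session.get("html_to_image_ratio", 0) > 5:    # robots skip embedded media
        score += 0.3
    if session.get("mean_interarrival_s", 10) < 0.5: # superhuman request rate
        score += 0.3
    return score

bot = {"requested_robots_txt": True, "html_to_image_ratio": 10,
       "mean_interarrival_s": 0.1}
score_robot_session(bot)   # high score
score_robot_session({})    # low score
```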
Metabolomics is the exhaustive characterization of metabolite concentrations in biofluids and tissues. The use of NMR and chromatography-linked mass spectrometry to assay metabolic profiles of tissue homogenates and biofluids has been increasingly recognized as a powerful tool for biological discovery. In recent years, metabolomics techniques have been applied to a wide variety of diagnostic, preclinical, systems biology, and ecological studies. Working with Dr. Nick Reo's NMR spectroscopy lab at Wright State University, we are developing standards-based tools and web services for the pre-processing, normalization/standardization, exploratory and comparative analysis, and visualization of NMR spectra from biofluids. More information on metabolomics research is available at the Bioinformatics Research Group website.
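One common pre-processing step of the kind such tools support is constant-sum (total-area) normalization, which scales each spectrum so samples of different overall concentration become comparable. The sketch below is a generic illustration, not the group's actual pipeline:

```python
def total_area_normalize(spectrum):
    """Scale intensities so they sum to 1 (constant-sum normalization),
    a common first step before comparing spectra across samples."""
    total = sum(spectrum)
    if total == 0:
        raise ValueError("spectrum has zero total intensity")
    return [x / total for x in spectrum]

total_area_normalize([2.0, 3.0, 5.0])  # fractions of total signal
```

More robust alternatives (e.g. probabilistic quotient normalization) follow the same pattern of rescaling each spectrum by a sample-specific factor.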
The National Science Foundation's EarthCube program has set out to establish concepts and approaches for creating integrated data infrastructures across the Geosciences. One of its key challenges is to enable and simplify data publishing, discovery, access, reuse, and integration in a sustainable way. Existing data repositories and networks must be linked while retaining their independent missions and services to existing disciplinary communities. Cultural, conceptual, and infrastructural heterogeneities must be respected in order to maintain different perspectives, topics, granularities, and priorities, and thus foster inclusivity in the EarthCube endeavor. In particular, the individual choices made by providers of data or repositories need to be respected in an inclusive manner, and approaches to integration must reflect this. At the same time, however, the diversity and heterogeneity of geoscience data present a significant barrier to its discovery and reuse. The NSF EarthCube Building Block project GeoLink addresses these challenges by (1) digitally publishing geoscience data and knowledge as "Linked Open Data"; (2) semantically integrating data by describing it with ontology design patterns and vocabularies shared among federated repositories; and (3) establishing Linked Data and ontology design patterns as an underlying cyberinfrastructure, extendable in both depth and breadth, that can become a central building block for EarthCube data harmonization. GeoLink focuses on the use of smart data as an alternative to smart applications and will effectively begin to establish an ontology-based metadata ecosystem consisting of flexibly alignable ontology building blocks. For more information, see the DaSeLab GeoLink website.
Polymerase chain reaction (PCR)-based amplification of short tandem repeat (STR) loci has become the method of choice for human identification in forensic investigations. With these loci, length polymorphisms associated with differences in the number of tandem repeats of a four-nucleotide (tetranucleotide) core sequence are detected after PCR amplification. A set of thirteen STR loci is typically genotyped with commercially available kits, and the length polymorphisms are identified with machines such as the Applied Biosystems 310 or 3100 capillary electrophoresis systems. In the analysis and interpretation of DNA evidence using STRs, a surprising number of technical, statistical, and computational issues emerge. Together with Forensic Bioinformatics Services, Inc., we investigate algorithmic, empirical, and statistical approaches to many of these problems. The end goal of our research is to ensure that DNA evidence is treated with due scientific objectivity in the courtroom. For more information, see the Bioinformatics Research Group website.
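The arithmetic behind STR allele designation can be sketched simply: subtract the locus's flanking sequence length from the measured amplicon length and divide by the repeat unit length (4 bp for tetranucleotide loci); a nonzero remainder flags a microvariant allele, conventionally written as "repeats.extra-bases" (e.g. the well-known 9.3 allele). The fragment and flank lengths below are illustrative, not calibrated to any real locus:

```python
def allele_designation(fragment_len, flank_len, unit_len=4):
    """Derive an STR allele designation from amplicon length (bp).
    A remainder of r extra bases yields the microvariant form 'n.r'.
    Lengths are placeholders, not values for any actual locus."""
    count, rem = divmod(fragment_len - flank_len, unit_len)
    return f"{count}.{rem}" if rem else str(count)

allele_designation(200, 160)  # 40 bp of repeats -> "10"
allele_designation(199, 160)  # 39 bp -> 9 repeats + 3 bp -> "9.3"
```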
The Semantic Web is based on describing the meaning - or semantics - of data on the Web by means of metadata - data describing other data - in the form of ontologies. The World Wide Web Consortium (W3C) has published several recommended standards for ontology languages, which differ in expressivity and ease of use. Central to these languages is that they come with a formal semantics, expressed in model-theoretic terms, which enables access to implicit knowledge through automated reasoning. Progress is being made in the practical adoption of reasoning for ontology languages, but several obstacles remain to be overcome before wide adoption on the Web. In this NSF-funded project, we investigate the use of tractable and other ontology languages from several perspectives, with a focus on extending the description-logic-based ontology languages around OWL with further capabilities while staying tractable where possible. Aspects considered include the integration of rule languages with OWL, paraconsistent and non-monotonic reasoning, ontology modeling advances (in particular related to geoscience applications), and reasoning algorithms and tools. For more information, see the DaSeLab TROn website.
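To illustrate how a formal semantics exposes implicit knowledge, the toy reasoner below forward-chains only the transitivity of subclass axioms; real OWL reasoners handle vastly richer constructs, and the class names are made up for the example:

```python
def subclass_closure(axioms):
    """Naive fixpoint computation of the transitive closure of
    SubClassOf axioms, given as a set of (subclass, superclass) pairs.
    Derives facts that are implied but never explicitly stated."""
    closure = set(axioms)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

axioms = {("Estuary", "BodyOfWater"), ("BodyOfWater", "Feature")}
subclass_closure(axioms)  # also contains ("Estuary", "Feature")
```

The stated axioms never mention that an Estuary is a Feature; the reasoner derives it, which is exactly the "access to implicit knowledge" the formal semantics licenses.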
Identifying a comprehensible, easy-to-interpret visualization for multidimensional data is challenging. Data sources such as general recognition theory can generate millions of these multidimensional data sets, all of which need to be visualized at the same time. The sample image shows a parallel coordinate plot of such a data set.
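Before drawing a parallel coordinate plot, each dimension is typically min-max normalized so that all axes share a common vertical scale; the generic sketch below shows that data transform (not the actual tool used here):

```python
def normalize_for_parallel_coordinates(rows):
    """Min-max normalize each column of a list-of-rows data set to [0, 1],
    so every axis of a parallel coordinate plot spans the same range.
    A constant column is mapped to 0.5."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h != l else 0.5
             for v, l, h in zip(row, lo, hi)] for row in rows]

normalize_for_parallel_coordinates([[0, 10], [5, 20], [10, 30]])
```

Each normalized row is then drawn as a polyline crossing one vertical axis per dimension.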
To reconstruct and study the flight characteristics of a dragonfly during take-off, the dragonfly can be filmed with high-speed cameras from different angles and the geometry of its body and wings reconstructed from the footage. Using a flow simulation with this geometry as a boundary condition, the air flow around the wings can be computed, and a suitable visualization reveals the properties that allow the dragonfly to take off.
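The multi-camera reconstruction step rests on triangulation: each camera observation defines a viewing ray in 3D, and a surface point is recovered where rays from different cameras (nearly) intersect. The midpoint method below is a simplified, generic sketch assuming already-calibrated rays, not the actual reconstruction pipeline:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(c1, d1, c2, d2):
    """Closest-point (midpoint) triangulation of two viewing rays
    p = c + t*d, one per camera. Solves the 2x2 linear system that
    minimizes the distance between the rays, then averages the two
    closest points. Assumes the rays are not parallel."""
    r = [b - a for a, b in zip(c1, c2)]            # camera-to-camera offset
    a11, a12 = dot(d1, d1), -dot(d1, d2)
    a21, a22 = dot(d1, d2), -dot(d2, d2)
    b1, b2 = dot(d1, r), dot(d2, r)
    det = a11 * a22 - a12 * a21
    t = (b1 * a22 - a12 * b2) / det
    s = (a11 * b2 - b1 * a21) / det
    p1 = [c + t * d for c, d in zip(c1, d1)]       # closest point on ray 1
    p2 = [c + s * d for c, d in zip(c2, d2)]       # closest point on ray 2
    return [(x + y) / 2 for x, y in zip(p1, p2)]

# Two cameras looking at the point (1, 1, 5):
triangulate_midpoint([0, 0, 0], [1, 1, 5], [2, 0, 0], [-1, 1, 5])
```

Repeating this for tracked markers on the body and wings, frame by frame, yields the time-varying geometry the flow simulation needs as a boundary condition.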