Theses and Dissertations - Department of Computer Science
Recent Submissions
Item Cross-Lingual Word Embeddings with Universal Concepts and their Applications (University of Alabama Libraries, 2020) Sheinidashtegol, Pezhman; Vrbsky, Susan; Musaev, Aibek; University of Alabama Tuscaloosa
Enormous amounts of data are generated in many languages every day due to our increasing global connectivity. This increases the demand for the ability to read and classify data regardless of language. Word embedding is a popular Natural Language Processing (NLP) strategy that uses language modeling and feature learning to map words to vectors of real numbers. However, these models need a significant amount of annotated data for training. Although the availability of labeled data is gradually increasing, most of these data are only available in high-resource languages, such as English. Researchers with different sets of proficient languages seek to address new problems with multilingual NLP applications. In this dissertation, I present multiple approaches to generate cross-lingual word embeddings (CWE) using universal concepts (UC) amongst languages to address the limitations of existing methods. My work consists of three approaches to build multilingual/bilingual word embeddings. The first approach includes two steps: pre-processing and processing. In the pre-processing step, we build a bilingual corpus containing both languages' knowledge in the form of sentences for the most frequent words in English and their translated pairs in the target language. In this step, knowledge of the source language is shared with the target language and vice versa by swapping one word per sentence with its corresponding translation. In the second step, we use a monolingual embeddings estimator to generate the CWE. The second approach generates multilingual word embeddings using UCs. This approach consists of three parts. For part I, we introduce and build UCs using bilingual dictionaries and graph theory by defining words as nodes and translation pairs as edges.
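The graph construction described above (words as nodes, translation pairs as edges) can be sketched as follows. This is only an illustration under a stated assumption: it treats each connected component of the translation graph as one universal concept, which is one plausible reading of the abstract and may differ from the grouping rule used in the dissertation. The function name `build_universal_concepts` and the `(language, word)` node format are hypothetical.

```python
from collections import defaultdict


def build_universal_concepts(translation_pairs):
    """Group words into connected components of the translation graph.

    Assumption (not from the source): a "universal concept" is taken to be
    a connected component. Each node is a (language, word) tuple so that
    identical spellings in different languages stay distinct.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Every translation pair is an edge: merge the two endpoints' components.
    for source_word, target_word in translation_pairs:
        root_a, root_b = find(source_word), find(target_word)
        if root_a != root_b:
            parent[root_a] = root_b

    # Collect the members of each component.
    components = defaultdict(set)
    for node in list(parent):
        components[find(node)].add(node)
    return sorted(sorted(members) for members in components.values())


if __name__ == "__main__":
    example_pairs = [
        (("en", "cat"), ("es", "gato")),
        (("fr", "chat"), ("en", "cat")),
        (("en", "dog"), ("es", "perro")),
    ]
    for concept in build_universal_concepts(example_pairs):
        print(concept)
```

Union-find keeps the grouping close to linear in the number of dictionary entries, which matters when several bilingual dictionaries are merged into one graph.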
In part II, we explain the configuration used for word2vec to generate encoded-word embeddings. Finally, part III includes decoding the generated embeddings using UCs. The final approach utilizes the supervised method of the MUSE project, but with the model trained on our UCs. Finally, we applied our last two proposed methods to some practical NLP applications: document classification, cross-lingual sentiment analysis, and code-switching sentiment analysis. Our proposed methods outperform the state-of-the-art MUSE method on the majority of applications.
Item Mining and Ranking Incidents for High Priority Intrusion Analysis (University of Alabama Libraries, 2020) Haque, Md Shariful; Atkison, Travis; University of Alabama Tuscaloosa
Threats and intrusions are increasing at an alarming rate, even though related technologies have observed rapid advancement. Hence, advanced threat analysis has become imperative to improve current technologies. These technologies are primarily designed to detect or predict threats and minimize the likelihood of damage. The goal of an efficient intrusion analysis is also to develop models that remain robust to external influences and produce optimized results. Several data mining techniques have been applied in these scenarios to detect both anomalies and misuse, predict possible attack paths, or generate attack models. Some consider determining the priority, an important criterion of alerts, using different characteristics of the attack scenarios. In this dissertation, novel priority-based alert mining techniques and a ranking model are proposed to prioritize sequences of alerts and to realize their actual effect, which is often misunderstood due to the generic taxonomies used by detection systems. This dissertation has the following contributions. First, a novel data mining-based alert sequence mining technique is proposed to discover potential attacks from intrusion alerts.
Intrusion detection systems maintain signatures of intrusions with a severity scale. This information has been leveraged predominantly in the proposed data mining-based alert association approach. This approach reduces the effort of post-processing alert sequences and calculating their severity when the relationship is established. Second, a non-redundant high priority association rules mining technique is proposed based on theories and background of non-redundant association rule mining. Such techniques are widely adopted to determine the correlation between items in sequences and to develop efficient prediction models with a reduced volume of derived data. Third, the above mining approaches facilitate the process of extracting severe incidents based on priority. However, severity levels determined by the detection system are generic; thus, their real consequences are hard to perceive. Multi-criteria decision making (MCDM) is a prominent research area for assessing different alternatives. The proposed approach is equipped with a combination of MCDM techniques to further rank the prioritized threats based on several benchmarks. The novelty of our technique is to consider the priority level of alerts at prior stages of attack analysis and later determine the overall attack scenario.
Item Designing Lightweight Mitigation Processes for DNS Flooding Attacks (University of Alabama Libraries, 2020) Mahjabin, Tasnuva; Xiao, Yang; University of Alabama Tuscaloosa
Distributed Denial of Service (DDoS) attacks are everyday threats in the current cyber world. Massive DDoS flooding attacks on October 21, 2016, were launched to attack the Internet Domain Name System (DNS) -- the phone book of the Internet domain addresses. These attacks consumed all resources of the DNS, leading to Denial of Service (DoS); as a result, hundreds of domains under the DNS became unreachable. In this dissertation, we design robust and practical mitigation techniques for DNS flooding attacks.
First, we analyze the current state of the art of DDoS attacks in a systematic review. We analyze different aspects of DDoS attacks including types, motivation, and defense mechanisms. We propose a taxonomy of the attack types to include DNS flooding attacks under the category of infrastructure attacks. Second, we propose a load distributed mitigation technique. This process utilizes existing resources of different DNS service providers and successfully distributes all attack traffic in a load-balanced way. Consequently, the service remains available for legitimate traffic. Third, we propose a benign bot-based mitigation process. This benign bot works in the local DNS cache resolver and accumulates the latest information on important domain records. Therefore, during a DNS flooding attack, the system can continually reach these important domain names even if the authoritative server becomes unreachable. Fourth, we propose the hotlist and stale content update based enhanced DNS cache. This cache maintains updated records of popular domain names of different upper-level servers. Eventually, these rich cache contents support the DNS address resolution process from the local cache, even though a flooding attack makes the authoritative servers unresponsive. Finally, to address the potential problems of our hotlist-based cache, we study cache replacement policies in DNS cache. We propose two popularity-based cache replacement policies, LAFTR and LAFUR. These methods preserve only important items and effectively mitigate the consequences of a DNS flooding attack. We simulate our proposed mitigation techniques to evaluate the performance in DNS flooding scenarios.
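The idea of a popularity-based cache replacement policy can be illustrated with a minimal sketch. This is not LAFTR or LAFUR, whose details the abstract does not give; it is a generic least-frequently-used eviction, shown only to make concrete what "preserving important items" could mean for a DNS cache. The class name `PopularityCache` and the example addresses (taken from the documentation range 192.0.2.0/24) are hypothetical.

```python
class PopularityCache:
    """Toy DNS cache that evicts the least-looked-up record when full.

    Assumption (not from the source): popularity is a plain lookup count.
    The dissertation's LAFTR and LAFUR policies may weigh records differently.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.records = {}  # domain name -> cached address
        self.hits = {}     # domain name -> number of successful lookups

    def get(self, domain):
        """Return the cached address, or None on a cache miss."""
        if domain in self.records:
            self.hits[domain] += 1
            return self.records[domain]
        return None

    def put(self, domain, address):
        """Store a record, evicting the least popular one if the cache is full."""
        if domain not in self.records and len(self.records) >= self.capacity:
            victim = min(self.hits, key=self.hits.get)
            del self.records[victim]
            del self.hits[victim]
        self.records[domain] = address
        self.hits.setdefault(domain, 0)


if __name__ == "__main__":
    cache = PopularityCache(capacity=2)
    cache.put("popular.example", "192.0.2.1")
    cache.put("rare.example", "192.0.2.2")
    cache.get("popular.example")
    cache.put("new.example", "192.0.2.3")  # evicts the record with fewest hits
    print(sorted(cache.records))
```

During a flooding attack, a cache of this kind keeps answering for the names that users request most often, even if the authoritative servers stop responding.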
Our proposed techniques are lightweight, easy to deploy, and cost-effective solutions to the ongoing DNS flooding threats.
Item Understanding Social Debt in Software Engineering (University of Alabama Libraries, 2021) Caballero Espinosa, Eduardo Anel; Carver, Jeffrey; University of Alabama Tuscaloosa
Context: Social debt describes the accumulation of costs to software projects resulting from community smells, i.e., suboptimal working environment conditions. The study of social debt is recent in the software engineering context. Thus, there is a need for a standard reference on this problem and for learning how to manage it. Objective: The goal of this article-style dissertation is to offer a comprehensive and common body of knowledge on social debt and community smells in software engineering. Method: To reach the main goal, this dissertation consists of a systematic mapping study, a systematic literature review, a survey-based empirical study, and a theoretical study. Results: The results include inventories of relevant studies on social debt and community smells, educational material on social debt and community smells for software engineering professionals, and the Community Smell Stages Framework that explains the origin and evolution of community smells. We also identified the impact of community smells on software development teams' performance by studying the connection between community smells and teamwork. Furthermore, we developed a survey-based framework to validate the community smells affecting cooperation in practice and generated useful visualization approaches. We also produced a set of hypotheses about the community smells and how their effects represent potential ethical violations in work environments. Conclusion: Social debt and community smells have the potential for becoming the sources of prolific human-centric research in software engineering.
There is a need for more real-world empirical research to validate the findings reported in this dissertation and generalize the results.
Item Intrusion Detection and Protection Systems in Smart Grids (University of Alabama Libraries, 2021) Jow, Julius; Xiao, Yang; University of Alabama Tuscaloosa
The creation of microgrids, the introduction of multiple renewable energy sources, and Internet of Things (IoT) devices have led to the increased vulnerability of Smart Grids. The existing Intrusion Detection and Protection Systems (IDPS) designed for Information and Communications Technology (ICT) are generally regarded as inadequate for the protection of Smart Grids mainly because these systems do not include aspects of the power systems and the underlying communication technologies and protocols. Thus, they are not suitable for Smart Grids. This dissertation addresses the above need by introducing and implementing an IDPS for a Smart Grid that would incorporate some aspects of communication protocols that apply to the power system. First, we have comprehensively surveyed publications that address IDPS in Smart Grids. We discovered that even though some IDPS for Smart Grid ideas have been suggested, the industry has not implemented them. Second, we propose a cooperative distributed IDPS that can detect and isolate attacked microgrids from the rest of the Grid. Furthermore, we develop an algorithm to detect intrusions that can be applied in our IDPS. The algorithm helps isolate compromised microgrids from the rest of the Grid and readmit cured microgrids back to the Grid. Third, we introduce a performance measure for IDPS, "Detectability," and implement a novel risk analysis of IDPS for Smart Grids using Ladder Logic and Fault Tree Analysis (FTA).
Method analysis and simulation results show that this method can be effective in evaluating specific failure contributors in IDPS.
Item Change Analysis Across Version Histories of Systems Models (University of Alabama Libraries, 2021) Popoola, Saheed; Gray, Jeff; University of Alabama Tuscaloosa
Model-Based Systems Engineering (MBSE) elevates models as first-class artifacts throughout the development process of a system’s lifecycle. This makes it easier to develop standard tools for automated analysis and overall management of a system process, thereby saving cost and minimizing errors. Like all systems artifacts, models are subject to continuous change and the execution of changes may significantly affect model maintenance. Existing work has already investigated processes and techniques to support, analyze and mitigate the impact of changes to models. However, most of these works focus on the analysis of changes between two sets of models and do not take a holistic approach to the entire version history of models. To support change analysis across the entire version history, we developed a Change Analyzer that can be used to query and extract change information across successive versions of a model. We then used the Change Analyzer to mine several versions of Simulink models, computed the differences across the versions, and classified the computed differences into appropriate maintenance categories in order to generate information related to understanding the rationale of the design decisions that necessitated the observed changes. To study the impact of changes on the models, we used the Change Analyzer to analyze the evolution of seven bad smells in 81 LabVIEW models across 10 open-source repositories, and four bad smells in 575 Simulink models across 31 open-source repositories. The evaluation of the Change Analyzer indicates that it can be used to construct concise queries that execute faster than a generic model-based query engine.
The results of the change analysis process also show a high similarity of the recovered design decisions with the manually identified decisions, even though the manual identification process takes much more time and often does not provide additional information about the changes executed to implement the design decisions. Furthermore, we discovered that adaptive maintenance tasks often lead to an increase in the number of smells in systems models, but corrective maintenance tasks often correlate with a decrease in the number of smells.
Item Novel Geospatial Data Science Techniques for Interdisciplinary Applications (University of Alabama Libraries, 2021) Sainju, Arpan Man; Jiang, Zhe; University of Alabama Tuscaloosa
With the advancement of GPS and remote sensing technologies, enormous amounts of geospatial data are being collected from various domains. Examples include crime locations, temporally detailed road networks, earth observation imagery, and GPS trajectories. Geospatial data science studies computational techniques to discover interesting, previously unknown, but potentially useful patterns from large spatial datasets. It is important for various applications. Crime hotspot detection helps law enforcement departments to create effective strategies to allocate police resources and to prevent crimes. Earth observation imagery classification plays a crucial role in flood extent mapping and water resource management. Big companies like UPS use truck GPS trajectory data to find efficient routes that can ultimately minimize the delivery time and reduce carbon footprint. However, geospatial data science poses several computational challenges. First, the spatial data volume is rapidly growing. For example, NASA collects around 12 TB of earth observation imagery per day. Second, spatial data exhibits spatial dependency, which implies that nearby samples are not statistically independent. Third, different spatial patterns of interest may exist in different spatial scales.
Finally, there can be limited observations. For example, sometimes it can be difficult or even impossible to get the complete observation of spatial features in an area due to the presence of obstacles (e.g., clouds). My thesis investigates novel geospatial data science techniques to address some of these challenges. I propose novel parallel spatial colocation mining algorithms on GPUs to address the challenge of large data volume. Similarly, I propose a deep learning framework to automatically map the road safety features from streetview imagery that captures spatial dependency. Next, I propose a novel approach to address the challenge of limited observation based on the physics-aware spatial structural constraint. Finally, I propose a novel spatial structured model called hidden Markov contour tree (HMCT), a contour tree structure, to capture directed spatial dependency on flow directions between all locations on a 3D surface.
Item An Investigation into Bad Smells in Model-Based Systems Engineering (University of Alabama Libraries, 2021) Zhao, Xin; Gray, Jeff; University of Alabama Tuscaloosa
Systems engineering is a multi-disciplinary approach to design, realize, manage and operate a system, which consists of hardware, software, process and personnel. Engineers and scientists from different domains often create domain-specific software artifacts, namely systems models, to describe phenomena in the process of system development. Systems models are frequently tied to external instrumentation and devices that coordinate experimentation and observation. The methodologies and tools that support systems modeling often lack the capabilities that are found in software engineering environments and practice, limiting the potential analysis capabilities that can be realized by the software adopted in the system.
Moreover, due to the different focus of interest, systems engineers may lack systematic software engineering knowledge compared with software engineers, creating a knowledge gap between systems engineers and software engineers. To assist engineers in developing systems models, this dissertation first mined the questions that systems engineers post on the discussion forum to understand the challenges and issues they face during the development of systems models. The examination results show that systems engineers have a great number of questions and problems related to bad smells in systems models. Motivated by this observation, the goal of my research is to help systems engineers better understand bad smells in systems models from three aspects: 1) the summarization of bad smells in systems models; 2) the evaluation of bad smells from systems engineers; and 3) the identification of prominent bad smells in systems models. The work presented in this dissertation has informed the systems engineering community through an empirical investigation of bad smells in systems models.
Item A novel intersection-based clustering scheme for VANET (University of Alabama Libraries, 2021) Lee, Michael Sutton; Atkison, Travis; University of Alabama Tuscaloosa
Currently, much attention is being placed on the development and deployment of vehicle communication technologies. Such technologies could revolutionize both navigation and entertainment systems available to drivers. However, there are still many challenges posed by this field that are in need of further investigation. One of these is the limitation on the throughput of networks created by vehicular devices. As such, it is necessary to resolve some of these network throughput issues so that vehicle communication technologies can increase the amount of information they exchange. One scheme to improve network throughput involves dividing the vehicles into subgroups called clusters.
Many such clustering algorithms have been proposed, but none have yet been determined to be optimal. This dissertation puts forth a new passive clustering approach that has the key advantage of a significantly reduced overhead. The reduced overhead of passive algorithms increases the share of network capacity available for normal data transmissions. The drawback to passive algorithms is their unreliable knowledge of the network, which can cause them to struggle to successfully perform cluster maintenance activities. Clusters created by passive algorithms, therefore, tend to be shorter-lived and smaller than what an active clustering algorithm can maintain. In order to maintain a cluster with a low overhead and better knowledge of the network, this dissertation introduces a new clustering algorithm intended to function at intersections. This new algorithm attempts to take advantage of the decreased overhead of passive clustering algorithms while introducing a lightweight machine learning algorithm that will assist with cluster selection.
Item Quality assurance in research software (University of Alabama Libraries, 2020) Eisty, Nasir Uddin; Carver, Jeffrey C.; University of Alabama Tuscaloosa
Breakthroughs in research increasingly depend on complex software libraries, tools, and applications aimed at supporting specific science, engineering, business, or humanities disciplines. Collectively, we call these libraries, tools, and applications research software. Research software plays an important role in solving real-life problems, enabling scientific innovations, and handling emergency situations. Therefore, the correctness and trustworthiness of research software are of absolute importance. The complexity and criticality of this software motivate the need for proper software quality assurance through different software engineering practices.
Software metrics, software development process, peer code review, and software testing are four key tools for assessing, measuring, and ensuring software quality and reliability. The goal of this dissertation is to better understand how research software developers use traditional software engineering concepts of software quality to support and evaluate both the software and the software development process. One key aspect of this goal is to identify how the four quality practices relevant to research software correspond to the practices commonly used in traditional software engineering. I used empirical software engineering research methods to study the human aspects related to using software quality practices for the development of research software. I collected information related to the four software activities through surveys, interviews, and directly working with research software developers. Research software developers appear to be interested and see value in software quality practices, but may be encountering roadblocks when trying to use them. Through this dissertation, in addition to describing current practices, I identified challenges in using those quality practices and provided guidelines to overcome the challenges and to improve the current practices.
Item Redefining privacy: case study of smart health applications (University of Alabama Libraries, 2019) Al-Zyoud, Mahran; Carver, Jeffrey; University of Alabama Tuscaloosa
Smart health utilizes the unique capabilities of smart devices to improve healthcare. The smart devices continuously collect and transfer large amounts of useful data about the users' health. As data collection and sharing are two inevitable norms in this connected world, concerns have also been growing about the privacy of health information. Any mismatch between what the user really wants to share and what the devices share could either cause a privacy breach or limit a beneficial service.
Understanding what influences information sharing can help resolve mismatches and bring protection and benefits to all stakeholders. The primary goal of this dissertation is to better understand the variability of privacy perceptions among different individuals and reflect this understanding in smart health applications. Towards this goal, this dissertation presents three studies. The first study is a systematic literature review conducted to identify the reported privacy concerns and the suggested solutions and to examine whether the context is part of any effort to describe a concern or form a solution. The study reveals 7 categories of privacy concerns and 5 categories of privacy solutions. I present a mapping between these major concerns and solutions to highlight areas in need of additional research. The results also reveal that there is a lack of both user-centric and context-aware solutions. The second study further empirically investigates the role of context and culture on the sharing decision. It describes a multicultural survey and another cross-cultural survey. The results support the intuitive view of how variable privacy perception is among different users and how understanding a user's culture could play a role in offering a smarter, dynamic set of privacy settings that reflects the user's privacy needs. Finally, the third study aims at providing a solution that helps users configure their privacy settings. The solution utilizes machine learning to predict the most suitable configuration for the user. As a proof of concept, I implemented and evaluated a prototype of a recommender system.
Usage of such recommender systems helps make changing privacy settings less burdensome in addition to better reflecting the true privacy preferences of users.
Item A state-based approach to context modeling and computing (University of Alabama Libraries, 2019) Yue, Songhui; Smith, Randy; University of Alabama Tuscaloosa
Context-aware computing is one of the most essential computing paradigms in pervasive computing. However, current context-aware computing still lacks good representation models, particularly for modeling proactive behaviors and historical context data. State diagrams have proven to be an effective method for modeling system behaviors. For context-aware computing, explicitly putting forward states of high-level context can be beneficial and open new angles for understanding and modeling activities. In this dissertation, I propose a state-based context model, and based on the model, I introduce Context State Machines (CSM) for simulating state changes of context attribute, situation, and context, which imply important behaviors related to context. This research develops and demonstrates CSMs for known context-aware problems from the literature including a smart elevator control system. First of all, the smart elevator, as a context-aware application in the literature, is introduced. Secondly, I introduce the implementation of the CSM engine. Thirdly, I describe two context-aware scenarios and show that the model can help automatically capture the contexts and reason about them without inference by the developers; this is the first time in the literature that a state-based modeling approach and the CSM engine have been applied to a real-world context-aware system. To evaluate the CSM engine as well as the CSM modeling approach, I generate high-level contextual testing data to feed the engine.
I surveyed the data quality issues regarding context-aware software and developed rubrics for data quality and dimensionality to address the challenges of applying context to context-aware systems. The rubrics are applied in the generation of synthetic data for feeding the CSM engine in this dissertation.
Item Improving intelligent analytics through guidance: analysis and refinement of patterns of use and recommendation methods for data mining and analytics systems (University of Alabama Libraries, 2019) Pate, Jeremy; Dixon, Brandon; University of Alabama Tuscaloosa
In conjunction with the proliferation of data collection applications, systems that provide functionality to analyze and mine this resource also increase in count and complexity. As a part of this growth, understanding how users navigate these systems, and how that navigation influences the resulting extracted information and subsequent decisions, becomes a critical component of their design. A central theme of improving the understanding of user behavior and tools for their support within these systems focuses the effort to gain a context-aware view of analytics system optimization. Through distinct, but interwoven, articles, this research examines the specific characteristics of usage patterns of a specific example of these types of systems, construction of an educational support system for new and existing users, and a decision-tree supported workflow optimization recommender system.
These components combine to yield a method for guided intelligent analytics that uses behavior, system knowledge, and workflow optimization to improve the user experience and promote efficiency of use for systems of this type.
Item Free/libre open source software contributors: one-time contributors, gender, and governance (University of Alabama Libraries, 2019) Lee, Amanda S.; Carver, Jeffrey C.; University of Alabama Tuscaloosa
Free/Libre Open Source Software projects are freely available on the internet for anyone to contribute to. However, fringe populations such as females, newcomers, and others struggle to interact with the projects. They face barriers and social pressures that prevent them from joining, or push them out before they can make meaningful contributions. In this dissertation, we address the issue of fringe populations in FLOSS projects by investigating two fringe populations, One-Time Contributors and female contributors. In addition, we examine project governance to determine to what extent it helps or hinders these contributors. We used surveys, interviews, and data mining techniques to gather information about these populations and factors. OTCs and females both face stringent barriers to contributing to FLOSS projects. Project governance is relatively ambivalent towards fringe populations, but does not offer them the support they need. To retain more fringe populations and encourage them to join FLOSS projects, project governance should take a strongly pro-inclusivity stance and focus on lowering the barriers they face.
Item Named data networking in vehicular environment: forwarding and mobility (University of Alabama Libraries, 2018) Kuai, Meng; Hong, Xiaoyan; University of Alabama Tuscaloosa
Vehicular networking has become an attractive scenario, where Intelligent Transportation Systems can provide tremendous benefits.
Realizing this vision needs a careful design of network architecture due to the dynamic vehicle mobility, unbalanced network density, and specific requirements of its applications. The current dominant Internet Protocol (IP) is challenged in this vehicular environment. Named Data Networking (NDN), a proposed future Internet architecture, is more suitable. In this paradigm, interest forwarding is challenging. Further, vehicle mobility plays a significant role in sustaining NDN. Therefore, this dissertation focuses on two aspects of vehicular NDN: interest forwarding and vehicle mobility. The thesis aims to solve the issues of interest forwarding in both dense and sparse networks. Interest forwarding in vehicular NDN suffers packet loss and bandwidth overuse due to broadcast storms, especially in dense networks. Our proposed Location-Based Deferred Broadcast (LBDB) scheme takes advantage of location information to set a deferred timer before rebroadcast. Intermittent connectivity in sparse networks leads to undeliverable packets and long response delay in vehicular NDN. We thus propose a Density-Aware Delay-Tolerant (DADT) interest forwarding strategy that uses directional network density to make retransmission decisions. We have fully implemented LBDB and DADT in simulations. Our evaluation results show that they outperform other protocols for the desired metrics. The sustainability of vehicular NDN is highly related to vehicle mobility, which is influenced by traffic signal operations in urban areas. Our work employs an empirical approach to study the impact of the coordination of traffic signal operations on the capacity and persistence of a vehicle-crowd (v-crowd) that is connected via vehicular NDN.
The work delivers practical guidelines for adjusting signals in terms of desired capacity of v-crowd.
Item Online topic modeling for software maintenance using a changeset-based approach (University of Alabama Libraries, 2018) Corley, Christopher Scott; Kraft, Nicholas A.; Carver, Jeffrey C.; University of Alabama Tuscaloosa
Topic modeling is a machine learning technique for discovering thematic structure within a corpus. Topic models have been applied to several areas of software engineering, including bug localization, feature location, triaging change requests, and traceability link recovery. Many of these approaches train topic models on a source code snapshot -- a revision or state of code at a particular point in time, such as a versioned release. However, source code evolution leads to model obsolescence and thus to the need to retrain the model from the latest snapshot, incurring a non-trivial computational cost of model re-learning. This work proposes and investigates an approach that can remedy the obsolescence problem. Conventional wisdom in the software maintenance research community holds that the topic model training information must be the same information that is of interest for retrieval. The primary insight for this work is that topic models can infer the topics of any information, regardless of the information used to train the model. Pairing online topic modeling with mining software repositories, I can remove the need to retrain a model and achieve model persistence. For this, I suggest training topic models on the software repository history in the form of the changeset -- a textual representation of the changes that occur between two source code snapshots. To show the feasibility of this approach, I investigate two popular applications of text retrieval in software maintenance, feature location and developer identification. Feature location is a search activity for locating the source code entity that relates to a feature of interest.
Developer identification is similar, but focuses on identifying the developer most apt for working on a feature of interest. Further, to demonstrate the usability of changeset-based topic models, I investigate whether I can coalesce topic-modeling-based maintenance tasks into using a single model, rather than needing to train a model for each task at hand. In sum, this work aims to show that training online topic models on software repositories removes retraining costs while maintaining accuracy of a traditional snapshot-based topic model for different software maintenance problems.
Item Application of human error theories in detecting and preventing software requirement errors (University of Alabama Libraries, 2017) Hu, Wenhua; Carver, Jeffrey C.; University of Alabama Tuscaloosa
Developing correct software requirements is important for overall software quality. Most existing quality improvement approaches focus on detection and removal of faults (i.e., problems recorded in a document) as opposed to identifying the underlying errors that produced those faults. Accordingly, developers are likely to make the same errors in the future and not recognize other existing faults with the same origins. The Requirement Error Taxonomy (RET) developed by Walia and Carver helps focus the developer’s attention on common errors that can occur during requirements engineering. However, because development of software requirements is a human-centric process, requirements engineers will likely make human errors during the process which may lead to undetected faults. Thus, in order to bridge the gap, the goals of my dissertation are: (1) construct a complete Human Error Taxonomy (HET) for the software requirements stage; (2) investigate the usefulness of HET as a defect detection technique; (3) investigate the effectiveness of HET as a defect prevention technique; and (4) provide specific defect prevention measurements for each error in HET. 
To address these goals, the dissertation contains three articles. The first article is a systematic literature review that uses insights from cognitive psychology research on human errors to develop a formal HET to help software engineers improve software requirements specification (SRS) documents. After building the HET, it is necessary to empirically evaluate its effectiveness. Thus, the second article describes two studies to evaluate the usefulness of the HET in the process of defect detection. Finally, the third article analyzes the usefulness of HET for defect prevention and provides strategies for preventing specific errors in the SRS.
Item The search phase of software engineering systematic literature review: barriers and solutions (University of Alabama Libraries, 2017) Al-Zubidy, Ahmed; Carver, Jeffrey C.; University of Alabama Tuscaloosa
The Systematic Literature Review (SLR) is an approach for conducting literature reviews that provides less bias and more reliability than the more typical ad hoc approach. One of the first phases of the SLR process is to conduct a comprehensive search for current literature. Conducting this search on different sources of literature (i.e., digital libraries) is a manual and exhausting task that results in an SLR process that is more costly than necessary. Therefore, the goals of this dissertation are to: (1) find empirical evidence about the SLR search problem and the current status of tool support; (2) understand the barriers and the tooling requirements to support the SLR search phase; and (3) develop and evaluate a solution to help reduce the barriers. To address these goals, this dissertation consists of three articles. Article 1 describes the results from three empirical studies that produce a list of tooling requirements across the entire SLR process. The results of these studies identify numerous gaps between the needs of SLR authors during the search phase and the current tool support. 
Article 2 consists of an SLR and a survey to identify the specific tool requirements to support the SLR search phase. The SLR produced a list of SLR search barriers that were reported in the literature. The survey of SLR authors confirmed the results of the SLR and expanded the list to include issues and requirements from the community beyond what is reported in the literature. Article 3 describes the development and evaluation of the Searching Systematic Reviews Tool (SSRT) to address the problems identified in Articles 1 and 2. SSRT provides one interface to search multiple digital libraries at once, store the results, and remove duplicates. The article also describes the evaluation of SSRT and the future extensions of SSRT.
Item Mobile interaction techniques for large displays (University of Alabama Libraries, 2017) Zeng, Yuguang; Zhang, Jingyuan; University of Alabama Tuscaloosa
Nowadays, large displays are common in our life and work. The dramatic changes in display sizes hinder users' interaction based on traditional input devices. Recently, mobile devices have been massively adopted and have been proposed as interaction devices for large displays. However, how to integrate mobile devices into real-world applications involving large displays has received limited investigation. In our research on using mobile devices, we aim to explore solutions to problems in a large display environment. In this dissertation, a new mobile interaction model is proposed to allow users to input, manage windows, and transfer information in a large display environment. The proposed model consists of three components: MobileInput, MobileWindowManager, and MobileClipboard. MobileInput enables users to input from their mobile devices to large displays. Working as a mouse, MobileInput can position the cursor on the large display. 
This is achieved by analyzing keypoint features of both a screenshot of the large display and an image captured by a camera, and calculating mapping relationships. MobileInput simulates mouse events to move the cursor and perform mouse functions. Working as a keyboard, MobileInput provides a scheme that lets users focus on entering text on their mobile devices and reproduces those text changes on the remote display. MobileWindowManager allows users to access and manage application windows on large displays through their mobile devices. MobileWindowManager provides a local interface for users to launch application windows on large displays. With the introduction of QR codes, users gain access to a target window quickly. Considering the fat thumb problem in mobile interaction, MobileWindowManager provides effective schemes to move and resize windows on the remote display through their personal devices. MobileClipboard allows users to transfer information within a computer or across computers. MobileClipboard extends mechanisms of system clipboards to support information exchange in multiple device environments. User interface gestures are designed for users to select objects on a display through their cameras. In addition, MobileClipboard designs protocols to support copy and paste procedures among mobile devices and displays.
Item Non-technical loss fraud detection in smart grid (University of Alabama Libraries, 2017) Han, Wenlin; Xiao, Yang; University of Alabama Tuscaloosa
Utility companies worldwide consistently suffer from Non-Technical Loss (NTL) frauds. In the traditional power grid, electricity theft is the main form of NTL frauds. In Smart Grid, smart meters thwart electricity theft in some ways but cause more problems, e.g., intrusions, hacking, and malicious manipulation. 
Various detectors have been proposed to detect NTL frauds, including physical methods, intrusion-detection-based methods, profile-based methods, statistical methods, and comparison-based methods. However, these methods either rely on user behavior analysis, which requires a large amount of detailed energy consumption data and raises privacy concerns, or need many extra devices, which are expensive, or have other problems. In this dissertation, we thoroughly study NTL frauds in Smart Grid. We survey the existing solutions and divide them into five categories. After studying the problems of the existing solutions, we propose three novel detectors to detect NTL frauds in Smart Grid, which can address the problems of all the existing solutions. These detectors model an adversary's behavior and detect NTL frauds based on several numerical analysis methods that are lightweight and non-traditional. The first detector, named NTL Fraud Detection (NFD), is based on the Lagrange polynomial. NFD can detect a single tampered meter as well as multiple tampered meters in a group. The second detector, named Fast NTL Fraud Detection (FNFD), is based on Recursive Least Squares (RLS). FNFD is proposed to improve the detection speed of NFD. Colluded NTL Fraud Detection (CNFD) is the third detector, which we propose to detect colluded NTL frauds. We have also studied the parameter selection and performance of these detectors.
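
The abstract names the Lagrange polynomial as the numerical tool behind NFD but does not describe how the detector applies it. As a generic illustration of that tool only, and not of the author's detector, the following minimal sketch evaluates the Lagrange interpolating polynomial through a set of sample points. The function name `lagrange_interpolate` and the sample values are hypothetical.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the unique polynomial of degree < len(xs)
    that passes through every point (xs[i], ys[i]).

    Generic textbook Lagrange interpolation; illustrative only,
    not the NFD detector described in the abstract.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if len(set(xs)) != len(xs):
        raise ValueError("xs must be pairwise distinct")

    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis polynomial L_i(x): equals 1 at xi and 0 at every other node.
        term = float(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Hypothetical sample points taken from y = x**2.
value = lagrange_interpolate([0, 1, 2], [0, 1, 4], 3)
print(value)
```

Because three points determine a quadratic exactly, evaluating at a fourth position recovers the underlying function there; with noisy real-world readings the interpolant would only approximate it.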