Researchers from across the social and computer sciences are increasingly using machine learning to study and address global development challenges. This paper examines the burgeoning field of machine learning for the developing world (ML4D). First, we present a review of prominent literature. Next, we suggest best practices drawn from the literature for ensuring that ML4D projects are relevant to the advancement of development objectives. Finally, we discuss how developing world challenges can motivate the design of novel machine learning methodologies. This paper provides insights into systematic differences between ML4D and more traditional machine learning applications. It also discusses how technical complications of ML4D can be treated as novel research questions, how ML4D can motivate new research directions, and where machine learning can be most useful.
Systems integration, connecting software systems for cross-functional work, is a significant concern in many large organizations, which continue to maintain hundreds, if not thousands, of independently evolving software systems. Current approaches in this space remain ad hoc and closely tied to technology platforms. Following a design science approach, and via multiple design-evaluate cycles, we develop the Systems Integration Requirements Engineering Modeling Language (SIRE-ML) to address this problem. SIRE-ML builds on the foundation of coordination theory and incorporates important semantic information about the systems integration domain. The paper develops constructs in SIRE-ML, along with a merge algorithm that allows both functional managers and integration professionals to contribute to building a systems integration solution. Integration models built with SIRE-ML provide benefits such as ensuring coverage and minimizing ambiguity, and can be used to drive implementation with different platforms such as middleware, services, and distributed objects. We evaluate SIRE-ML for ontological expressiveness and report findings from an applicability check with an expert panel. The paper discusses implications for future research, such as tool building and empirical evaluation, as well as implications for practice.
Serendipity has been recognized as having the potential to enhance unexpected information discovery. This study shows that decomposing the concept of serendipity into unexpectedness and interest is a useful way to implement this concept. Experts' domain knowledge helps provide serendipitous recommendations, which can be further improved by adaptively incorporating users' real-time feedback. This research also conducts an empirical user study to analyze the influence of serendipity in a health news delivery context. A personalized filtering system named MedSDFilter was developed, on top of which serendipitous recommendation was implemented using three approaches: random, static knowledge-based, and adaptive knowledge-based models. The three models were compared. The results indicate that the adaptive knowledge-based method is the most effective at helping people discover unexpected and interesting content. The insights from this research will prompt researchers and practitioners to rethink the way in which search engines and recommender systems operate to address the challenges of unexpected and valuable information discovery. The outcome will have implications for empowering ordinary people with a greater chance of encountering beneficial information.
Twitter has emerged as a major social media platform and has generated great interest from sentiment analysis researchers. Despite this attention, state-of-the-art Twitter sentiment analysis approaches perform relatively poorly, with reported classification accuracies often below 70%, adversely impacting applications of the derived sentiment information. In this research, we investigate the unique challenges presented by the Twitter sentiment analysis problem and review the literature to determine how the devised approaches have addressed these challenges. To assess the state of the art in Twitter sentiment analysis, we conduct a benchmark evaluation of 28 top academic and commercial systems in tweet sentiment classification across five distinctive data sets. We perform an error analysis to uncover the causes of commonly occurring classification errors. To further the evaluation, we apply select systems in an event detection case study. Finally, we summarize the key trends and takeaways from the review and benchmark evaluation, and provide suggestions to guide the design of the next generation of approaches.