INFORMS Annual Meeting Phoenix 2018


SB65
West Bldg 104B
Joint Session DM/Practice Curated: Big Data Science
Sponsored: Data Mining
Sponsored Session
Chair: Hongxia Yin, Minnesota State University, Mankato, Mankato, MN, USA

1 - A Simple and Efficient Hybrid Genetic Algorithm for Minimum Sum-of-squares Clustering
Thibaut Vidal, Professor, PUC-Rio, Departamento de Informatica, Rua Marques de Sao Vicente, 225, Rio de Janeiro, 22453-900, Brazil, Daniel Gribel
Minimum sum-of-squares clustering (MSSC) is a widely used clustering model. We introduce an efficient genetic algorithm that uses K-means as a local search in combination with problem-tailored variation operators. The approach is scalable and accurate, outperforming all recent state-of-the-art algorithms for MSSC in terms of solution quality, measured by the depth of local minima. This enhanced accuracy leads to classification results which are significantly closer to the ground truth for overlapping Gaussian-mixture datasets with a large number of features. Improved global optimization methods therefore appear to be essential to better exploit the MSSC model in high dimension.

2 - On the Behavior of the Expectation-maximization Algorithm for Mixture Models
Babak Barazandeh, University of Southern California, Los Angeles, CA, 90007, United States, Meisam Razaviyayn
Finite mixture models are among the most popular statistical models and are widely used in different data science disciplines. Despite their broad applicability, inference under these models typically leads to computationally challenging non-convex problems. While the Expectation-Maximization (EM) algorithm is the most popular approach for solving these non-convex problems, its behavior is not well understood for general mixture model inference problems. In this work, we study the equally weighted mixture of two one-dimensional Laplacian distributions and show that every local optimum of the population maximum likelihood estimation problem is a global optimum.

3 - Fourier Transform Inverse Regression Estimators of the Central Subspaces
Jiaying Weng, University of Kentucky, Lexington, KY, 40503, United States, Xiangrong Yin
We introduce an optimal inverse regression estimator, the Fourier transform inverse regression estimator, obtained by optimizing the quadratic discrepancy function using Fourier transforms. We further develop degenerated and robust Fourier transform inverse regression estimators for computational efficiency and robustness, as well as a partial Fourier transform inverse regression estimator for predictors consisting of both categorical and continuous variables. For sufficient variable selection, we propose shrinkage and sparse group LASSO Fourier transform inverse regression estimators. Furthermore, marginal or conditional hypothesis tests for predictors or dimensions are considered.

4 - A Smoothing Newton Method for SVM Type Models in Data Analysis
Hongxia Yin, Minnesota State University, Mankato, Department of Mathematics and Statistics, 273 Wissink Hall, Mankato, MN, 56001, United States
Hongxia Yin, University of Chinese Academy of Sciences, Beijing, 100190, China
A smoothing Newton method for a few support vector machine (SVM) models in data analysis is given by reformulating their dual problems. We prove the global convergence and local superlinear (or quadratic) convergence of the method. Numerical tests on problems from the UCI repository illustrate the efficiency and robustness of the algorithm compared to existing results in the literature.

SB66
West Bldg 105A
Joint Session AI/Practice Curated: Business Applications of Artificial Intelligence
Sponsored: Artificial Intelligence
Sponsored Session
Chair: Srikar Velichety, The University of Memphis, Memphis, TN

1 - Does Image Semantics Impact Demand on Digital Sharing Platforms? An Empirical Study Using Deep Learning
Vivek Kumar Singh, University of South Florida, Tampa, FL, United States, Utkarsh Shrivastava, Anol Bhattacherjee
Digital sharing platforms such as Airbnb encourage customers to seek cues from textual information and images to make informed decisions. Unlike prior studies that focused only on textual information, we study how the semantic scenes (e.g., indoor and outdoor) depicted by a property's images, and the images' position within the listing's webpage, affect lodging demand. Our propositions are supported by ideas from theories of signaling and information processing and are tested using deep learning and econometric modeling approaches. We found that advances in artificial intelligence can indeed provide insights from images at scale for guiding sellers on digital platforms.

2 - Seeing the Forest for the Trees: Generating Instrumental Variables with Random Forest for Bias Correction in Statistical Inferences
Mochen Yang, University of Minnesota, Carlson School of Management, 321-19th Avenue South, Minneapolis, MN, 55455, United States
The practice of combining machine learning with econometric analysis has become increasingly prevalent in empirical research. In the first stage, machine learning methods are typically used to create new variables (e.g., predict sentiment from textual data), which are then added into second-stage econometric models as covariates. Because the predictions from machine learning models are inevitably imperfect, the subsequent econometric estimations suffer from biases due to measurement error in covariates. In this paper, we discuss a novel approach that mitigates biases by leveraging instrumental variables that are generated from an ensemble machine learning model, such as a Random Forest.

3 - A Graphical Model for Topical Impact Over Time
Zhiya Zuo, University of Iowa, Iowa City, IA, United States, Kang Zhao
After being published, a document, whether it is a research paper or an online post, can make an impact when readers cite, share, or endorse it. Building on supervised topic models, we propose a graphical model to capture the topical impact over time within a corpus of documents. We conducted experiments on papers published in (i) D-Lib Magazine and (ii) The Library Quarterly from 2007 to 2017. Compared with ToT, our model produced more robust and interpretable results on topical trends over time. By enabling better understanding and modeling of topical impact over time, this model can be used for the design of social media platforms and the evaluation of scientific contributions and policies.

4 - A Modeling Framework for Bike Rebalancing Problem in Bike Sharing System
Fan Dong, University of Arizona, Tucson, AZ, 85715, United States
Bike sharing systems have been implemented in many major cities to offer a convenient mobility service in which public bicycles are deployed at stations across the city for shared use. Users can check out a bike from a nearby station, take a short ride, and check in the bike at a station near their destination. Because check-ins and check-outs at different stations are unbalanced across time periods, bike imbalance occurs constantly. Stations tend to be either full of bikes or empty because of the dynamic and asymmetric spatial and temporal usage patterns between stations. To ensure good service quality, bike sharing systems need to make sure all stations have enough available bicycles for check-out and enough empty slots for check-in. Therefore, system managers need to decide in advance to send out trucks that move bikes among stations to rebalance them. The bike rebalancing problem poses several key challenges: determining which stations need to be rebalanced, and optimizing the routing of the dispatch trucks. In this paper, we design a general framework that bike sharing system managers can use to solve the bike rebalancing problem through the following steps. The first step is to design a community detection clustering algorithm that clusters bike stations into communities based on their bike trip network. The second step is to implement a three-layer prediction model that predicts the check-ins and check-outs of bikes at each station in a future period, in order to identify the stations that will be unbalanced. The third step is to design a rebalancing routing algorithm to rebalance bike stations within and between communities.
