2015 Informs Annual Meeting


INFORMS Charlotte – 2011

2 - Comprehensive Analysis of the U.S. Army’s Global Assessment Tool
Cardy Moten, Maj, TRADOC Analysis Center-Monterey, 700 Dyer Road, Room 183, Monterey, CA, 93943, United States of America, cmoten@nps.edu
The focus of this research is to investigate new interpretations of the Global Assessment Tool in order to provide more informed feedback to the Soldier and improve prediction of Soldier outcomes. We used factor analysis, cluster analysis, and data visualization techniques to evaluate similarities and differences between the services and to provide a more comprehensive picture of the component data that Soldiers can more readily understand.

WA30
30-Room 407, Marriott
Information Systems for E-Business/Commerce
Contributed Session
Chair: Anwesha Bhattacharjee, Student, University of Texas, Dallas, 2200 Waterview Pkwy, #1836, Richardson, TX, 75080, United States of America, axb094820@utdallas.edu

1 - Investigating the Effect of Social Connections on Usefulness of Online Reviews
Pouya Khansaryan, University of Connecticut, 101 South Eagleville, Apt. 18B, Storrs, CT, 06268, United States of America, seyedamirpouya.khansaryan@business.uconn.edu
Online reviews are a form of electronic word of mouth (eWOM) that is now widely available to prospective customers. In this study, we try to answer the question: “What are the key factors that contribute to the consumer’s perception of the usefulness of online reviews?” Yelp data collected at different points in time show that star ratings, total votes, review length, the writer’s average star rating, number of fans, and elapsed time are the most significant measures of perceived usefulness.

2 - Online Activities in Virtual World and Money Spending in Real World
Gwangjae Jung, Korea Information Society Development Institute, 18, Jeongtong-ro, Deoksan-myeon, Jincheon, Korea, Republic of, indioblu@gmail.com, Youngsoo Kim
We examine the relationship between online activities in the virtual world and money spending in the real world. We collected users’ log data in an online game from Feb. to Aug. 2010. Our analyses show that virtual money spending complements real money spending in playing an online game. Another finding is that group play in an online game facilitates real money spending on avatar decorations, but not on gaming efficiency. Real money spending also decreases as users advance to the later stages of the game.

3 - Differences in Hedonic and Utilitarian Apps through Consumer Addiction, Frustration and Evaluation
Bidyut Hazarika, University of Colorado Denver, 1475 Lawrence St, Denver, CO, 80202, United States of America, bidyut.hazarika@ucdenver.edu, Madhavan Parthasarathy, Jahangir Karimi, Jiban Khuntia
Hedonic and utilitarian apps differ in addiction, frustration, and subsequent evaluation scores. This study analyzes scores on these factors for more than 18,136 apps to establish these differences using interaction models and econometric analyses.

4 - How Could We Cope with Malicious Rater? A New Detection Method for Trustworthy Reputation Systems
Yuanfeng Cai, CUNY-Baruch College, 55 Lexington Ave, New York, NY, United States of America, Yuanfeng.Cai@baruch.cuny.edu, Dan Zhu
Reputation systems are vulnerable to rating fraud. To address this, we use data from Tripadvisor, Expedia, and Amazon to empirically exploit the rating time-series features of malicious raters. We then propose a two-phase detection method. First, it examines the rating series associated with each entity and filters out those under attack. Second, a clustering method is applied to discriminate malicious raters. Experimental studies demonstrate the effectiveness of the proposed method.

5 - Searching the Global Distribution System: A Double-edged Sabre
Anwesha Bhattacharjee, Student, University of Texas, Dallas, 2200 Waterview Pkwy, #1836, Richardson, TX, 75080, United States of America, axb094820@utdallas.edu, Vijay Mookerjee, Mehmet Ayvaci, Radha Mookerjee
As the demand for travel grows, so does the need for travel agencies. Travel agencies, in turn, use a global distribution system to find the appropriate service for their clients. In this paper, we look at one such travel service market segment: hotel shopping. We identify search behaviors among agencies, and we identify the tradeoff facing the global distribution system itself, which invests millions in setting up the search and wants to increase the number of bookings with the minimum number of searches.

WA31
31-Room 408, Marriott
Data Mining for Environmental and Natural Hazard Applications

Sponsor: Data Mining
Sponsored Session

Chair: Seth Guikema, Associate Professor, Johns Hopkins University, 3400 N Charles Street, Ames Hall 313, Baltimore, MD, 21218, United States of America, sguikem1@jhu.edu

1 - Data Mining Approaches to Characterize Non-uniform Wind Farm Power Production
Andrea Staid, PhD Candidate, Johns Hopkins University, 3400 N. Charles St., 313 Ames Hall, Baltimore, MD, 21218, United States of America, astaid@gmail.com, Claire Verhulst, Seth Guikema
Power production of wind farms with non-uniform layouts is more difficult to analyze using traditional wake-decay models. We present some of the discrepancies that arise when modeling these types of farms and highlight the sources of error. We then present new methods to characterize farm production based on data mining instead of wake modeling, and we show the benefits of using these methods in conjunction with more traditional means.

2 - Analysis of Low Probability Streamflow Outcomes in the Mid-Atlantic Region
Gina Tonn, PhD Candidate, Johns Hopkins University, 115 Broadbent Road, Wilmington, DE, 19810, United States of America, gtonn2@jhu.edu, Seth Guikema
Standard flood frequency analysis methods are widely used but involve considerable uncertainty, and low-probability outcomes can occur. In this study, statistical analysis is used to identify watershed characteristics that are correlated with low-probability streamflow outcomes. Methods include a Random Forest model and clustering analysis.

3 - Data Mining for Understanding Tsunami Death Rates in Japan
Seth Guikema, Associate Professor, Johns Hopkins University, 3400 N Charles Street, Ames Hall 313, Baltimore, MD, 21218, United States of America, sguikem1@jhu.edu, Roshanak Nateghi
The 2011 tsunami in Japan caused widespread destruction and led to a large number of deaths. It was the most recent in a string of tsunamis in the Tohoku region of Japan. We use data from the 1896, 1933, 1960, and 2011 tsunamis together with modern data mining methods to better understand the factors affecting death rates during these events.

4 - Prediction of Mean Harvest Weight of Royal Gala Apples
Tom Logan, PhD Student, University of Michigan, 3700 N Charles Street, Baltimore, MD, 21218, United States of America, tom.logan@jhu.edu, Seth Guikema, Stella Mcleod
Early prediction of the mean harvest size of apples is useful for decision makers in the apple and horticultural industry. Decisions including logistics and marketing are made prior to harvest and are generally based on estimates of the crop. A random forest model was developed using data for the apple variety Royal Gala from orchards within the Hawke’s Bay Region of New Zealand. Over the eight years of data available, it has shown a mean predictive error of 2.4%.

WA32
32-Room 409, Marriott
Data Mining with Marketing Applications
Contributed Session
Chair: Elham Khabiri, IBM, 1101 Kitchawan Rd, Yorktown Heights, NY, United States of America, ekhabiri@us.ibm.com

1 - Evaluating Database Marketing Models: More than Meets the Eye
Sam Koslowsky, Senior Analytic Consultant Modeling Solutions and Delivery, Harte Hanks, 2118 Avenue T, Brooklyn, NY, 11229, United States of America, sam.koslowsky@hartehanks.com
Managers are generally pleased to use the gains table to assess their predictive models. Identifying more ‘HAVES’ at the top and fewer at the bottom is most desirable, but more needs to be examined. Some use standard statistical criteria, which may be fine. However, some common-sense features of model evaluation and the gains table are frequently ignored. These include variations in lift, unevenness in decile performance, the stability of predictions, and the interpretation of results.
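
For readers unfamiliar with the gains table mentioned in the abstract above, the following is a minimal illustrative sketch and not the presenter's method. It assumes a list of model scores and binary outcomes (1 for a 'HAVE', 0 otherwise); the function name `gains_table` and the field names are hypothetical. Records are ranked by score, split into equal-sized bins (deciles by default), and each bin reports its response rate, its lift over the overall rate, and the cumulative share of responders captured.

```python
from typing import Dict, List


def gains_table(scores: List[float], outcomes: List[int],
                n_bins: int = 10) -> List[Dict[str, float]]:
    """Rank records by score (highest first) and summarize each bin.

    Hypothetical helper for illustration only.
    """
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and equal length")

    # Highest-scored records come first, as in a standard gains table.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    total_pos = sum(outcome for _, outcome in ranked)
    overall_rate = total_pos / n

    rows: List[Dict[str, float]] = []
    cum_pos = 0
    for b in range(n_bins):
        # Integer boundaries keep bin sizes as equal as possible.
        group = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        if not group:
            continue
        pos = sum(outcome for _, outcome in group)
        cum_pos += pos
        rate = pos / len(group)
        rows.append({
            "bin": b + 1,
            "count": len(group),
            "responders": pos,
            "response_rate": rate,
            # Lift: bin response rate relative to the overall response rate.
            "lift": rate / overall_rate if overall_rate else 0.0,
            # Share of all responders captured in this bin and those above it.
            "cum_capture": cum_pos / total_pos if total_pos else 0.0,
        })
    return rows
```

Scanning the `lift` column from the top bin to the bottom is one simple way to spot the unevenness in decile performance that the abstract refers to, for example a lower bin whose lift exceeds that of a bin ranked above it.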
