Tag: data analysis

Introduction to SPSS

Audience: First-time SPSS users who will be using SPSS for Windows.  Those using SPSS for Unix or Macintosh should email the instructor at cpow@umich.edu before enrolling.

Fundamentals

This portion introduces SPSS for Windows, the menu and the help systems, the three main types of files used, and printing from within SPSS.  It then addresses defining variables, attaching labels, defining missing values, and various ways to enter data into SPSS.  Finally, it covers a brief introduction to obtaining frequency distributions, descriptive statistics, and cross tabulations of variables.

Within-Case Transformations

This portion introduces data management capabilities, including recoding variables (manual and automatic), computing new variables using formulas, and counting occurrences of values within subjects.  Attention then turns to temporary transformations, conditional processing of transformations, and repetitive transformations.  SPSS syntax is also introduced.
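
As a rough illustration of what these within-case transformations do, the sketch below recodes a variable, computes a new variable from a formula, and counts occurrences of a value within each subject. It is written in Python rather than SPSS syntax (which the workshop itself covers), and the records, variable names, and age cut-points are all invented for the example.

```python
# Hypothetical survey records: one dict per subject (case).
cases = [
    {"id": 1, "age": 23, "q1": 5, "q2": 3, "q3": 5},
    {"id": 2, "age": 47, "q1": 2, "q2": 5, "q3": 1},
    {"id": 3, "age": 68, "q1": 5, "q2": 5, "q3": 5},
]


def recode_age(age):
    """Manual recode: collapse age in years into three groups (cut-points assumed)."""
    if age < 30:
        return 1
    if age < 60:
        return 2
    return 3


for case in cases:
    items = [case["q1"], case["q2"], case["q3"]]
    # Recode an existing variable into a new grouped variable.
    case["age_group"] = recode_age(case["age"])
    # Compute a new variable from a formula (mean of the three items).
    case["q_mean"] = sum(items) / len(items)
    # Count how often the value 5 occurs within this subject's items.
    case["n_fives"] = items.count(5)
```

Each transformation works across the variables of a single case, which is what "within-case" refers to; no information from other subjects is used.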

Data Management with Multiple Files

This portion begins with a discussion of subsetting data files by drawing samples, selecting groups and excluding groups from analysis.  Then, the two main methods of merging SPSS data files are covered: adding additional variables and adding additional cases.  Next, creating aggregated data sets and applying aggregated data to individuals is covered.  Lastly, importing and exporting data between SPSS and other statistical programs (Excel, dBase, SAS) is demonstrated.
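
The difference between the two kinds of merge, and the idea of applying aggregated data back to individuals, can be sketched in a few lines of Python. This is not SPSS syntax, and the files, variable names, and values are hypothetical; it only shows the shape of each operation.

```python
from collections import defaultdict

# Hypothetical files, each a list of cases identified by "id".
wave1 = [
    {"id": 1, "dept": "A", "score": 10},
    {"id": 2, "dept": "A", "score": 14},
]
wave2 = [{"id": 3, "dept": "B", "score": 9}]
# A second file holding an extra variable for the same subjects.
extra = {1: {"gender": "F"}, 2: {"gender": "M"}, 3: {"gender": "F"}}

# Adding cases: stack files that share the same variables.
all_cases = wave1 + wave2

# Adding variables: match on the key and attach the new columns.
for case in all_cases:
    case.update(extra.get(case["id"], {}))

# Aggregating: one summary value per group (here, mean score per dept).
scores_by_dept = defaultdict(list)
for case in all_cases:
    scores_by_dept[case["dept"]].append(case["score"])
dept_mean = {dept: sum(v) / len(v) for dept, v in scores_by_dept.items()}

# Applying aggregated data to individuals: copy the group mean onto each case.
for case in all_cases:
    case["dept_mean"] = dept_mean[case["dept"]]
```

Adding cases makes the data set longer, adding variables makes it wider, and the last step gives every individual a group-level value to compare against.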

Basic Statistics and Graphics

This portion covers basic exploratory procedures, including obtaining percentiles, frequencies, descriptive statistics, and cross tabulations. Basic comparative procedures including two-sample t-tests, paired t-tests, and one-way analysis of variance are also covered.  Then, simple bivariate correlation analysis is introduced.  Participants are given a basic introduction to commonly used graphical procedures for displaying data, including scatter plots, bar graphs, histograms, and boxplots.
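
For readers who want to see what the comparative and correlation procedures compute, here is a minimal Python sketch of the test statistics using only the standard library. It assumes the equal-variance (pooled) form of the two-sample t-test, returns only the statistics (no p-values), and uses made-up numbers; SPSS's own output contains considerably more.

```python
import math
import statistics


def pooled_t(x, y):
    """Two-sample t statistic assuming equal variances in both groups."""
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        pooled_var * (1 / nx + 1 / ny)
    )


def paired_t(before, after):
    """Paired t statistic computed from the within-pair differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


def pearson_r(x, y):
    """Simple bivariate (Pearson) correlation coefficient."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up measurements for illustration only.
first = [1, 2, 3, 4, 5]
second = [2, 4, 6, 8, 10]

t_independent = pooled_t(first, second)  # treats the lists as separate groups
t_paired = paired_t(first, second)       # treats them as before/after pairs
r = pearson_r(first, second)
```

The same two lists give different t statistics depending on whether they are treated as independent groups or as paired measurements, which is why choosing the right test matters.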

#UMTweetCon2019

A Conference on the Use of Twitter Data for Research and Analytics

#UMTweetCon2019 will connect U-M scholars across a diverse set of disciplines in an interdisciplinary exchange about common challenges and lessons learned. We further seek to facilitate new connections to help U-M scholars create opportunities for future joint research, collaborative grant writing, training and other activities. Conference attendance will be open to anyone interested in learning about the wide array of Twitter data applications in current research at the University. The conference is sponsored by the Social Science and Social Media Collaborative, the Michigan Institute for Data Science, the #Parenting Rackham Interdisciplinary Group, and coordinated by the Center for Political Studies and the Institute for Social Research.

Call for Abstracts

Do you use Twitter data in your research? If so, you are invited to submit an abstract for the first university-wide conference at the University of Michigan (Ann Arbor, Dearborn, and Flint) on the use of Twitter data in research and analytics.

To reflect the wide range of ongoing research across disciplines, we invite submissions that 1) directly examine dynamics of Tweet behavior and Twitter networks, 2) explore the representativeness and validity of Twitter data for making scientific inference, 3) develop new computational methodology for obtaining, processing, or archiving Twitter data, or 4) present applications of Twitter data for studying diverse social phenomena. During the 2-day conference, research presentations will be complemented with participatory sessions to provide participants with an opportunity to plan future activities and help create a regular user community across campuses (e.g., seminar series, computational training sessions, hackathons, and regular coding meetups).

Interested U-M researchers are asked to use the form linked here to submit a short abstract of 200-300 words that describes their research project, along with information about participating co-authors. Submissions are due by Friday, April 12, 2019.

Click here to submit an abstract for a panel or poster presentation.

Attending #UMTweetCon2019 will require a small, non-refundable registration fee from presenters and attendees alike (students/post-docs: $15 pre-conference online, $20 on-site; faculty/staff/other: $30 pre-conference online, $40 on-site). Presenters and attendees from Dearborn and Flint campuses will receive a registration discount (students/post-docs: $15, faculty/staff/other: $20). We will use the revenue from registration fees to fund best paper awards.

AIM Analytics Seminar – Dan Davis, PhD Candidate, TU Delft, the Netherlands.

Improving Online Learning Outcomes Using Large-Scale Learning Analytics

Abstract: This talk will cover a holistic approach to improving learning outcomes and behavior in large-scale learning environments—namely MOOCs. I begin by sharing the results of a study exploring the extent to which learners follow (or deviate from) the designed learning path and the impact this behavior has on eventual learning outcomes. We next take a deeper dive into the design of online courses with a large-scale learning design approach, where I’ll present an automated method developed to categorize courses based on their design. With these trends in learning & teaching behavior in mind, the talk will conclude with the results of a series of randomized experiments (A/B tests) carried out in live MOOCs designed to provide additional support to learners. From these experiments we arrive at a better understanding of which instructional & design strategies are most effective for improving learning outcomes and behavior at scale.

Bio: Dan’s research uses and advances learning analytics techniques in open, online education at scale by pushing the boundaries towards personalized & adaptive learning environments. Dan develops methods to gain a deeper understanding about how the design of online learning environments affects learner success and engagement, often by implementing and testing instructional interventions at scale using randomized controlled experiments. Dan earned his BA in English, Writing & Mass Communication with a minor in Graphic Design from Assumption College in Worcester, Mass. His MA is from Georgetown University in Communication, Culture & Technology, and he is currently finishing his PhD in Computer Science, Learning Analytics from TU Delft in the Netherlands.

Lunch will be provided.

AIM Analytics is a bi-weekly seminar series for researchers across U-M who are interested in learning analytics. Learning analytics is a multi- and interdisciplinary field that brings together researchers from education, learning sciences, computational sciences and statistics, and all discipline-specific forms of educational inquiry.

Introduction to SPSS

Audience: First-time SPSS users who will be using SPSS for Windows.  Those using SPSS for Unix or Macintosh should email the instructor at cpow@umich.edu before enrolling.

Note: Topic order is subject to change.  Participants must sign up for the entire series.

Fundamentals

This portion introduces SPSS for Windows, the menu and the help systems, the three main types of files used, and printing from within SPSS.  It then addresses defining variables, attaching labels, defining missing values, and various ways to enter data into SPSS.  Finally, it covers a brief introduction to obtaining frequency distributions, descriptive statistics, and cross tabulations of variables.

Within-Case Transformations

This portion introduces data management capabilities, including recoding variables (manual and automatic), computing new variables using formulas, and counting occurrences of values within subjects.  Attention then turns to temporary transformations, conditional processing of transformations, and repetitive transformations.  SPSS syntax is also introduced.

Data Management with Multiple Files

This portion begins with a discussion of subsetting data files by drawing samples, selecting groups and excluding groups from analysis.  Then, the two main methods of merging SPSS data files are covered: adding additional variables and adding additional cases.  Next, creating aggregated data sets and applying aggregated data to individuals is covered.  Lastly, importing and exporting data between SPSS and other statistical programs (Excel, dBase, SAS) is demonstrated.

Basic Statistics and Graphics

This portion covers basic exploratory procedures, including obtaining percentiles, frequencies, descriptive statistics, and cross tabulations. Basic comparative procedures including two-sample t-tests, paired t-tests, and one-way analysis of variance are also covered.  Then, simple bivariate correlation analysis is introduced.  Participants are given a basic introduction to commonly used graphical procedures for displaying data, including scatter plots, bar graphs, histograms, and boxplots.

Registration

To register for CSCAR Workshops, call the CSCAR front desk at (734) 764-7828 or come to the office in person with cash or check or a UM department shortcode:

OFFICE HOURS

9:00 a.m. – 5:00 p.m., Monday through Friday
Closed 12:00 p.m. – 1:00 p.m. every Tuesday for staff meeting.
Voice: (734) 764-7828 (4-STAT from a campus phone)
Fax: (734) 647-2440

ADDRESS

Center for Statistical Consultation and Research (CSCAR)
The University of Michigan
3550 Rackham
915 E. Washington St.
Ann Arbor, MI 48109-1070

 

SPSS I: Introduction to SPSS

Note: Topic order is subject to change.

This workshop is designed to introduce participants to SPSS. It will cover the fundamentals of SPSS, within-case transformations, data management with multiple files, and basic statistics and graphics. Useful for any scholar engaged in quantitative research.

Fundamentals

This portion introduces SPSS, the menu and the help systems, and the three main types of files used.  It then addresses defining variables, attaching labels, defining missing values, and various ways to enter data into SPSS.  Finally, it covers a brief introduction to obtaining frequency distributions, descriptive statistics, and cross tabulations of variables.

Within-Case Transformations

This portion introduces data management capabilities, including recoding variables (manual and automatic), computing new variables using formulas, and counting occurrences of values within subjects.  Attention then turns to temporary transformations, conditional processing of transformations, and repetitive transformations.

Data Management with Multiple Files

This portion begins with a discussion of subsetting data files by drawing samples, selecting groups and excluding groups from analysis.  Then, the two main methods of merging SPSS data files are covered: adding additional variables and adding additional cases.

Basic Statistical Analysis

This portion includes a brief demonstration of a statistical analysis in SPSS. While not delving deep into statistical theory, we will cover the basics of an analysis and discuss the graphing facilities in SPSS.

Registration

To register for CSCAR Workshops, call the CSCAR front desk at (734) 764-7828 or come to the office in person with cash or check or a UM 6-digit department shortcode:

OFFICE HOURS

9:00 a.m. – 5:00 p.m., Monday through Friday
Closed 12:00 p.m. – 1:00 p.m. every Tuesday for staff meeting.
Voice: (734) 764-7828 (4-STAT from a campus phone)
Fax: (734) 647-2440

ADDRESS

Center for Statistical Consultation and Research (CSCAR)
The University of Michigan
3550 Rackham
915 E. Washington St.
Ann Arbor, MI 48109-1070

 

UM Biostatistics Seminar: Veronika Rockova, PhD, University of Chicago

Veronika Rockova, Ph.D.

Assistant Professor in Econometrics and Statistics

The University of Chicago Booth

 

‘Fast Bayesian Factor Analysis via Automatic Rotations to Sparsity’

Abstract: Rotational post hoc transformations have traditionally played a key role in enhancing the interpretability of factor analysis. Regularization methods also serve to achieve this goal by prioritizing sparse loading matrices. In this work, we bridge these two paradigms with a unifying Bayesian framework. Our approach deploys intermediate factor rotations throughout the learning process, greatly enhancing the effectiveness of sparsity inducing priors. These automatic rotations to sparsity are embedded within a PXL-EM algorithm, a Bayesian variant of parameter-expanded EM for posterior mode detection. By iterating between soft-thresholding of small factor loadings and transformations of the factor basis, we obtain (a) dramatic accelerations, (b) robustness against poor initializations, and (c) better oriented sparse solutions. To avoid the prespecification of the factor cardinality, we extend the loading matrix to have infinitely many columns with the Indian buffet process (IBP) prior. The factor dimensionality is learned from the posterior, which is shown to concentrate on sparse matrices. Our deployment of PXL-EM performs a dynamic posterior exploration, outputting a solution path indexed by a sequence of spike-and-slab priors. For accurate recovery of the factor loadings, we deploy the spike-and-slab LASSO prior, a two-component refinement of the Laplace prior. A companion criterion, motivated as an integral lower bound, is provided to effectively select the best recovery. The potential of the proposed procedure is demonstrated on both simulated and real high-dimensional data, which would render posterior simulation impractical. Supplementary materials for this article are available online.
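
To give a sense of the soft-thresholding step the abstract mentions, here is a minimal Python sketch of the standard soft-thresholding operator applied to a small loading matrix. It is not the PXL-EM algorithm and includes neither the rotation step nor any prior; the matrix values and the threshold of 0.15 are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values within the threshold become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


# Hypothetical 3x2 loading matrix (rows: variables, columns: factors).
loadings = [
    [0.90, 0.05],
    [-0.80, 0.10],
    [0.02, 0.70],
]

# Small loadings are set exactly to zero; large ones are shrunk but kept.
sparse_loadings = [[soft_threshold(v, 0.15) for v in row] for row in loadings]
```

Setting small loadings exactly to zero is what produces a sparse, more interpretable loading matrix.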

Bio: Veronika Rockova is Assistant Professor in Econometrics and Statistics at the University of Chicago Booth School of Business. Her work brings together statistical methodology, theory and computation to develop high-performance tools for analyzing large datasets. Her research interests reside at the intersection of Bayesian and frequentist statistics, and focus on: data mining, variable selection, optimization, non-parametric methods, factor models, high-dimensional decision theory and inference. She has authored a variety of published works in top statistics journals. In her applied work, she has contributed to the development of risk stratification and prediction models for public reporting in healthcare analytics.

Prior to joining Booth, Rockova held a Postdoctoral Research Associate position at the Department of Statistics of the Wharton School at the University of Pennsylvania. Rockova holds a PhD in biostatistics from Erasmus University (The Netherlands), an MSc in biostatistics from Universiteit Hasselt (Belgium) and both an MSc in mathematical statistics and a BSc in general mathematics from Charles University (Czech Republic).

Besides enjoying statistics, she is a keen piano player.

 

Light refreshments for seminar guests will be served at 3:00 p.m. in 3755.

New private insurance claims dataset and analytic support now available to health care researchers

The Institute for Healthcare Policy and Innovation (IHPI) is partnering with Advanced Research Computing (ARC) to bring two commercial claims datasets to campus researchers.

The OptumInsight and Truven Marketscan datasets contain nearly complete insurance claims and other health data on tens of millions of people representing the US private insurance population. Within each dataset, records can be linked longitudinally for over 5 years.  

To begin working with the data, researchers should submit a brief analysis plan for review by IHPI staff, who will create extracts or grant access to primary data as appropriate.

CSCAR consultants are available to provide guidance on computational and analytic methods for a variety of research aims, including use of Flux and other UM computing infrastructure for working with these large and complex repositories.

Contact Patrick Brady (pgbrady@umich.edu) at IHPI or James Henderson (jbhender@umich.edu) at CSCAR for more information.

Acquisition and availability of the data were funded by IHPI and the U-M Data Science Initiative.

ICPSR Summer Program Evening Workshop: Introduction to the R Programming Environment

Introduction to R:

How to Use R for Data Management, Data Analysis, & Graphical Display

 

Dates and Time: August 15-19, 2016, 5-8 p.m.
Location: Mason/Angell Hall, University of Michigan, Ann Arbor, Michigan
Instructor: R. Joseph Waddington, University of Notre Dame

The “R” statistical software package has become widely used to conduct statistical analyses and produce graphical displays of data across the social, behavioral, health, and other sciences. R is an open-source, code-based program that combines the ability to easily conduct analyses with a convenient facility for programming. Through the Comprehensive R Archive Network (CRAN), thousands of “add-on” packages are available for advanced quantitative analyses.

This course will introduce users to the R programming environment and its use as a data analysis package. Participants in the course will learn to use R for data management; conducting and interpreting descriptive analyses, basic hypothesis tests, and regression analyses; producing graphical displays; and other advanced topics as time permits.

The course will feature both a lecture and a lab component. During lecture, the instructor will demonstrate basic features and coding in R to manage data, conduct analyses, and produce graphical displays. During lab, the instructor will lead participants through guided examples with real social science data on topics and techniques that mirror the same ones covered in the day’s lecture. All data will be provided by the instructor.

Audience: Researchers, analysts, graduate students, and faculty who are seeking a brief and applied introduction to using R for quantitative data analysis in their own research or instruction.

Registration Fee: $600 (for U-M faculty, staff, students, and researchers)

ICPSR mailing address is:
Summer Program in Quantitative Methods of Social Research
P.O. Box 1248
Ann Arbor, MI 48106-1248

Copyright © 2016 Summer Program in Quantitative Methods of Social Research, All rights reserved