Monday, October 12, 2009

Statistical Analysis, SPSS

SPSS is a computer program used for statistical analysis, and is also the name of the company that sells it. SPSS (originally, Statistical Package for the Social Sciences) was released in its first version in 1968, and is among the most widely used programs for statistical analysis in social science. It is used by market researchers, health researchers, survey companies, government, education researchers, and others. In addition to statistical analysis, data management (case selection, file reshaping, creating derived data) and data documentation (a metadata dictionary is stored with the data) are features of the base software.

The many features of SPSS are accessible via pull-down menus or can be programmed with a proprietary 4GL command syntax language. Command syntax programming has the benefits of reproducibility and of handling complex data manipulations and analyses. The pull-down menu interface also generates command syntax, though the default settings have to be changed to make the syntax visible to the user. Programs can be run interactively, or unattended using the supplied Production Job Facility.

Additionally, a "macro" language can be used to write command language subroutines, and a Python programmability extension can access the information in the data dictionary and the data, and dynamically build command syntax programs. The Python programmability extension, introduced in SPSS 14, replaced the less functional SAX Basic "scripts" for most purposes, although SAX Basic remains available. From version 14 onwards, SPSS can be driven externally by a Python or a VB.NET program using supplied "plug-ins".

SPSS places constraints on internal file structure, data types, data processing and matching files, which together considerably simplify programming. SPSS datasets have a two-dimensional table structure where the rows typically represent cases (such as individuals or households) and the columns represent measurements (such as age, sex or household income).
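
The table structure and the idea of building command syntax from the data dictionary can be sketched in plain Python. This is only an illustration under stated assumptions: the dictionary, the cases and the build_syntax helper are hypothetical, and nothing here calls the SPSS plug-in itself. The generated text uses the DESCRIPTIVES and FREQUENCIES commands, which inside SPSS would be handed to the processor rather than printed.

```python
# Hypothetical data dictionary: variable name -> type ("numeric" or "string").
# In SPSS this information is stored with the dataset; here it is a plain dict
# so that the sketch runs without SPSS installed.
DATA_DICTIONARY = {
    "id": "numeric",
    "age": "numeric",
    "sex": "string",
    "income": "numeric",
}

# Rows are cases and columns are variables, as in an SPSS dataset.
CASES = [
    {"id": 1, "age": 34, "sex": "F", "income": 41000},
    {"id": 2, "age": 51, "sex": "M", "income": 38500},
]


def build_syntax(dictionary, skip=("id",)):
    """Return command syntax text chosen from each variable's type."""
    numeric = [name for name, kind in dictionary.items()
               if kind == "numeric" and name not in skip]
    strings = [name for name, kind in dictionary.items()
               if kind == "string" and name not in skip]
    commands = []
    if numeric:
        # Summary statistics make sense for numeric variables.
        commands.append("DESCRIPTIVES VARIABLES=" + " ".join(numeric) + ".")
    if strings:
        # Frequency tables are the natural summary for string variables.
        commands.append("FREQUENCIES VARIABLES=" + " ".join(strings) + ".")
    return "\n".join(commands)


if __name__ == "__main__":
    print(build_syntax(DATA_DICTIONARY))
```

The point of the design is that the program never hard-codes variable names: adding a variable to the dictionary changes the generated syntax without editing the code.
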
Only two data types are defined, numeric and text (or "string"). All data processing occurs sequentially case-by-case through the file. Files can be matched one-to-one and one-to-many, but not many-to-many.

SPSS can read and write data from ASCII text files (including hierarchical files), other statistics packages, spreadsheets and databases. It can also read and write to external relational database tables via ODBC and SQL.

Statistical output is to a proprietary file format (*.spo file, supporting pivot tables) for which, in addition to the in-package viewer, a stand-alone reader is provided. The proprietary output can be exported to text or Microsoft Word. Alternatively, output can be captured as data (using the OMS command) as text, tab-delimited text, HTML, XML, SPSS dataset or a variety of graphic image formats (JPEG, PNG, BMP and EMF).

Statistics included in the base software:

Descriptive statistics: Cross tabulation, Frequencies, Descriptives, Explore, Descriptive Ratio Statistics
Bivariate statistics: Means, t-test, ANOVA, Correlation (bivariate, partial, distances), Nonparametric tests
Prediction for numerical outcomes: Linear regression
Prediction for identifying groups: Factor analysis, cluster analysis (two-step, K-means, hierarchical), Discriminant
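
As a rough illustration of what the first two descriptive procedures in the list compute, the following Python sketch builds a frequency table and a two-way cross tabulation in a single pass through the cases. The sample cases and the function names are made up for the example, and this is not how SPSS implements the procedures.

```python
from collections import Counter

# Hypothetical cases: one dict per case, one key per variable.
cases = [
    {"sex": "F", "region": "North"},
    {"sex": "M", "region": "North"},
    {"sex": "F", "region": "South"},
    {"sex": "F", "region": "North"},
]


def frequencies(rows, variable):
    """Count how often each value of one variable occurs."""
    return Counter(row[variable] for row in rows)


def crosstab(rows, row_var, col_var):
    """Count cases in each (row value, column value) cell."""
    return Counter((row[row_var], row[col_var]) for row in rows)


if __name__ == "__main__":
    print(frequencies(cases, "sex"))
    print(crosstab(cases, "sex", "region"))
```

Both functions read the cases once, in order, which loosely mirrors case-by-case processing through a file.
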
Add-on modules provide additional capabilities. The available modules are:

SPSS Programmability Extension (added in version 14). Allows Python programming control of SPSS.
SPSS Data Validation (added in version 14). Allows programming of logical checks and reporting of suspicious values.
SPSS Regression Models. Logistic regression, ordinal regression, multinomial logistic regression, and mixed models (multilevel models).
SPSS Advanced Models. Multivariate GLM and repeated measures ANOVA (removed from base system in version 14).
SPSS Classification Trees. Creates classification and decision trees for identifying groups and predicting behaviour.
SPSS Tables. Allows user-defined control of output for reports.
SPSS Exact Tests. Allows statistical testing on small samples.
SPSS Categories
SPSS Trends™
SPSS Conjoint
SPSS Missing Value Analysis. Simple regression-based imputation.
SPSS Map
SPSS Complex Samples (added in version 12). Adjusts for stratification, clustering and other sample selection biases.
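
To show why adjusting for stratification matters, here is a minimal Python sketch of a design-weighted mean for a stratified sample. It is a textbook estimator and not the algorithm used by the Complex Samples module. The function name, the strata and the numbers are assumptions made for the example, and the sketch gives a point estimate only, with no clustering adjustment and no variance estimation.

```python
def stratified_mean(sample, population_sizes):
    """Estimate a population mean from a stratified sample.

    sample: list of (stratum, value) pairs.
    population_sizes: dict mapping stratum -> number of units in the population.
    Each sampled unit is weighted by N_h / n_h for its stratum h.
    """
    # Count sampled units per stratum (n_h).
    counts = {}
    for stratum, _ in sample:
        counts[stratum] = counts.get(stratum, 0) + 1

    weighted_sum = 0.0
    weight_total = 0.0
    for stratum, value in sample:
        weight = population_sizes[stratum] / counts[stratum]
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Hypothetical data: urban units are over-sampled relative to rural ones.
    sample = [("urban", 10.0), ("urban", 20.0), ("rural", 40.0)]
    population = {"urban": 100, "rural": 300}
    print(stratified_mean(sample, population))
```

With these numbers the unweighted mean of the three values would understate the estimate, because the rural stratum makes up most of the population but only a third of the sample.
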
SPSS Server is a version of SPSS with a client/server architecture. It has some features not available in the desktop version, such as scoring functions.

Versions
SPSS version 16.0 runs under Windows, Mac OS 10.4 and earlier, and Linux. The graphical user interface is written in Java. The Mac OS version is provided as a Universal binary, making it fully compatible with both PowerPC and Intel-based Mac hardware. Prior to SPSS 16.0, different versions of SPSS were available for Windows, Mac OS X and Unix. The Windows version was updated more frequently, and had more features, than the versions for other operating systems. SPSS version 13.0 for Mac OS X was not compatible with Intel-based Macintosh computers, because the Rosetta emulation software caused errors in calculations. SPSS 15.0 for Windows needed a downloadable hotfix to be installed in order to be compatible with Windows Vista.

SPSS Inc.
The program SPSS is sold by SPSS Inc., a company that sells a wide range of software for market research, survey research and statistical analysis. Its products include AMOS for structural equation modeling, SamplePower for power analysis, AnswerTree for market segmentation, SPSS Text Analysis for Surveys for coding open-ended responses, Quantum for cross-tabulation, Clementine for data mining, and mrInterview for CATI and online surveys. The company is headquartered in Chicago, Illinois.

See also
List of statistical packages
Comparison of statistical packages
External links
SPSS Inc Homepage - support page includes a searchable database of solutions (login using "guest" as User name and Password)
Raynald Levesque's SPSS Tools - library of worked solutions for SPSS programmers (FAQ, command syntax, macros, scripts, Python)
Archives of SPSSX-L Discussion - SPSS Listserv active since 1996. Discusses programming, statistics and analysis
UCLA ATS Resources to help you learn SPSS - Resources for learning SPSS
UCLA ATS Technical Reports - Report 1 compares Stata, SAS and SPSS against R (R is a language and environment for statistical computing and graphics).
Using SPSS For Data Analysis - SPSS Tutorial from Harvard
SPSS Developer Central - Support for developers of applications using SPSS, including materials and examples of the Python programmability feature
SPSS Wiki - A wiki on SPSS statistics (since December 2005)
SPSS Log - A blog posting answers to SPSS questions (since March 2006)
SPSS Experts - Profiles of six SPSS experts around the world
comp.soft-sys.stat.spss - SPSS Usenet newsgroup via Google Groups
SPSS Forum - A forum for SPSS users (since June 2007)
GNU PSPP - PSPP is a free SPSS replacement