Statistical Programming with SAS/IML Software (ebook)

404 pages, English, 2010

SAS/IML software is a powerful tool for data analysts because it enables implementation of statistical algorithms that are not available in any SAS procedure. Rick Wicklin's Statistical Programming with SAS/IML Software is the first book to provide a comprehensive description of the software and how to use it. He presents tips and techniques that enable you to use the IML procedure and the SAS/IML Studio application efficiently. In addition to providing a comprehensive introduction to the software, the book also shows how to create and modify statistical graphs, call SAS procedures and R functions from a SAS/IML program, and implement such modern statistical techniques as simulations and bootstrap methods in the SAS/IML language. Written for data analysts working in all industries, graduate students, and consultants, Statistical Programming with SAS/IML Software includes numerous code snippets and more than 100 graphs.
This book is part of the SAS Press program.
Publication date: October 22, 2010
EAN13: 9781629592558
Language: English
File size: 4 MB

Statistical Programming
with SAS/IML® Software
Rick Wicklin
THE POWER TO KNOW.
The correct bibliographic citation for this manual is as follows: Wicklin, Rick. 2010. Statistical Programming with SAS/IML® Software. Cary, NC: SAS Institute Inc.
Statistical Programming with SAS/IML® Software
Copyright © 2010, SAS Institute Inc., Cary, NC, USA
ISBN 978-1-60764-663-1 ISBN 978-1-60764-770-6 (electronic book)
All rights reserved. Produced in the United States of America.
For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.
For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.
U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial Computer Software-Restricted Rights (June 1987).
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.
1st printing, October 2010
SAS® Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at support.sas.com/publishing or call 1-800-727-3228.
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are registered trademarks or trademarks of their respective companies.
Contents
I Programming in the SAS/IML Language
Chapter 1. An Introduction to SAS/IML Software
Chapter 2. Getting Started with the SAS/IML Matrix Programming Language
Chapter 3. Programming Techniques for Data Analysis
Chapter 4. Calling SAS Procedures
II Programming in SAS/IML Studio
Chapter 5. IMLPlus: Programming in SAS/IML Studio
Chapter 6. Understanding IMLPlus Classes
Chapter 7. Creating Statistical Graphs
Chapter 8. Managing Data in IMLPlus
Chapter 9. Drawing on Graphs
Chapter 10. Marker Shapes, Colors, and Other Attributes of Data
III Applications
Chapter 11. Calling Functions in the R Language
Chapter 12. Regression Diagnostics
Chapter 13. Sampling and Simulation
Chapter 14. Bootstrap Methods
Chapter 15. Timing Computations and the Performance of Algorithms
Chapter 16. Interactive Techniques
IV Appendixes
Appendix A. Description of Data Sets
Appendix B. SAS/IML Operators, Functions, and Statements
Appendix C. IMLPlus Classes, Methods, and Statements
Appendix D. Modules for Compatibility with SAS/IML 9.22
Appendix E. ODS Statements
Index
Acknowledgments
 
I would like to thank Robert Rodriguez for suggesting that I write this book and for our discussions regarding content and order of presentation. He and Maura Stokes provided many opportunities for me to hone my writing and programming skills by inviting me to present papers and workshops at conferences.
I thank my colleagues at SAS from whom I have learned many statistical and programming techniques. Among these colleagues, Robert Cohen, Simon Smith, and Randy Tobias stand out as friends who are always willing to share their extensive knowledge.
Thanks to several colleagues who read and commented on early drafts of this book, including Rob Agnelli, Betsy Enstrom, Pushpal Mukhopadhyay, Robert Rodriguez, Randy Tobias, and Donna Watts. I also thank David Pasta, who reviewed the entire book and provided insightful comments, and my editor George McDaniel.
Finally, I would like to thank my wife Nancy Wicklin for her constant support.
Part I
Programming in the SAS/IML Language
Chapter 1
An Introduction to SAS/IML Software
Contents
1.1     Overview of the SAS/IML Language
1.2     Comparing the SAS/IML Language and the DATA Step
1.3     Overview of SAS/IML Software
1.3.1     Overview of the IML Procedure
1.3.2     Running a PROC IML Program
1.3.3     Overview of SAS/IML Studio
1.3.4     Installing and Invoking SAS/IML Studio
1.3.5     Running a Program in SAS/IML Studio
1.3.6     Using SAS/IML Studio for Exploratory Data Analysis
1.4     Who Should Read This Book?
1.5     Overview of This Book
1.6     Possible Roadmaps through This Book
1.7     How to Read the Programs in This Book
1.8     Data and Programs Used in This Book
1.8.1     Installing the Example Data on a Local SAS Server
1.8.2     Installing the Example Data on a Remote SAS Server
1.1 Overview of the SAS/IML Language
The acronym IML stands for “interactive matrix language.” The SAS/IML language enables you to read data into vectors and matrices and to manipulate these quantities by using high-level matrix-vector computations. The language enables you to formulate and solve mathematical and statistical problems by using functions and expressions that are similar to those found in textbooks and in research journals. You can write programs that analyze and visualize data or that implement custom algorithms that are not built into any SAS procedure.
The SAS/IML language contains over 300 built-in functions and subroutines. There are also hundreds of functions in Base SAS software that you can call. These functions provide the building blocks for writing statistical analyses. You can write SAS/IML programs by using either of two SAS products: the IML procedure (also called PROC IML) and the SAS/IML Studio application. These two products are discussed in Section 1.3.
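As a small illustration of calling a Base SAS function from a SAS/IML program, the following sketch passes a vector to the CDF function. The variable names z and p and the input values are chosen here for illustration and do not come from the book.

```sas
proc iml;
/* a row vector of standard normal quantiles (illustrative values) */
z = {-1.96 0 1.96};

/* CDF is a Base SAS function; called from PROC IML it is
   evaluated for each element of the vector z */
p = cdf("Normal", z);
print p;
quit;
```

The result p has the same dimensions as z, which is typical when a Base SAS function is applied to a matrix.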
As implied by the IML acronym, matrices are a fundamental part of the SAS/IML language. A matrix is a rectangular array of numbers or character strings. In the IML procedure, all variables are matrices. Matrices are used to store many kinds of information. For example, each row in a data matrix represents an observation, and each column represents a variable. In a variance-covariance matrix, the ijth entry represents the sample covariance between the ith and jth variables in a set of data.
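The two kinds of matrices mentioned above can be seen side by side in a short sketch. The data values and the names x and S are made up for illustration, and the sketch assumes a SAS/IML release that provides the COV function.

```sas
proc iml;
/* data matrix: 4 observations (rows) by 2 variables (columns);
   the values are illustrative only */
x = {1 2,
     2 4,
     3 7,
     4 9};

/* S is the 2 x 2 sample variance-covariance matrix:
   S[i,j] is the sample covariance of columns i and j of x */
S = cov(x);
print S;
quit;
```

The diagonal entries of S are the sample variances of the two columns, and S is symmetric because the covariance of columns i and j equals the covariance of columns j and i.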
As an example of the power and convenience of the SAS/IML language, the following PROC IML statements read certain numeric variables from a data set into a matrix, x. The program then computes robust estimates of location and scale for each variable. (The location parameter identifies the value of the data's center; the scale parameter tells you about the spread of the data.) Each computation requires only a single statement. The SAS/IML statements are described in Chapter 2, “Getting Started with the SAS/IML Matrix Programming Language.” The data are from a sample data set in the Sashelp library that contains age, height, and weight information for a class of students.

Figure 1.1 Robust Estimates of Location and Scale

In the PROC IML program, the location of the center of each variable is estimated by calling the MEDIAN function. The scale of each variable is estimated by calling the MAD function, which computes the median absolute deviation (MAD) from the median. (The median is a robust alternative to the mean; the MAD is a robust alternative to the standard deviation.) The data are then standardized (that is, centered and scaled) by subtracting each center from the variable and dividing the result by the scale for that variable. See Section 2.12 for details on how the SAS/IML language interprets quantities such as (x-c)/s.
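A program along the lines described above might look like the following sketch. It is not the book's listing. It assumes that the sample data set is Sashelp.Class and that all of its numeric variables are read; the names varNames and stdX are introduced here for illustration, while x, c, and s follow the notation used in the text.

```sas
proc iml;
/* read the numeric variables into the matrix x
   (assumption: the sample data set is Sashelp.Class) */
use Sashelp.Class;
read all var _NUM_ into x[colname=varNames];
close Sashelp.Class;

/* robust estimates of location and scale, one per column */
c = median(x);        /* row vector of column medians */
s = mad(x);           /* row vector of median absolute deviations */

/* standardize: center each column and divide by its scale */
stdX = (x - c) / s;

print c[colname=varNames];
print s[colname=varNames];
quit;
```

Because c and s are row vectors with one entry per column of x, the expression (x - c) / s is applied column by column without an explicit loop.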
The previous program highlights a few features of the SAS/IML language:
• You can read data from a SAS data set into a matrix.
• You can pass matrices to functions. Many functions act on the columns of a matrix by default.
• You can perform mathematical operations on matrices and vectors by using a natural syntax.
• You can analyze data and compute statistics without writing loops. Notice in the program that there is no explicit loop over observations, nor is there a loop over the variables.
In general, the SAS/IML language enables you to create compact programs by using a syntax that is natural and convenient for statistical computations. The language is described in detail in Chapter 2, “Getting Started with the SAS/IML Matrix Programming Language.”
1.2 Comparing the SAS/IML Language and the DATA Step
The statistical power of SAS procedures and the data manipulation capabilities of the DATA step are sufficient to serve the analytical needs of many data analysts. However, sometimes you need to implement a proprietary algorithm or an algorithm that has recently been published in a professional journal. Other times, you need to use matrix computations to combine and extend results from procedures. In these situations, you can write a program in the SAS/IML language.
The syntax of the SAS/IML language has much in common with the DATA step: neither language is case-sensitive, variable names can contain up to 32 characters, and statements must end with a semicolon. Furthermore, the syntax for control statements such as the IF-
