Anyway, we've just received our first "homework" and as nerdy as this sounds, I am excited about it... except for one little thing... it's done in SAS.
Now, I can't claim to be an expert in SAS or in R, and maybe someone who is will give me a little tip on why my opinion is a complete fail, but I really think R is much more user friendly, especially for someone like me who has very little programming experience. All I've done is make macros in Excel (prior to the great loss of VBA in Excel 2008 for Mac... the horror!), write a few Visual Basic programs that taunt the user with insults given a certain input, and make annoying pop-ups appear when certain coordinates on a screen are moused over. In other words, I know almost nothing about programming.
For example, I am about to begin a program in SAS for making a correlation matrix. Right, simple, a correlation matrix. Let me show you what that would look like in R.
DATANAME<-as.matrix(read.table('/R/dataname.csv', sep=","))
#possibly a few specifications about how to read the data included in that first bracket
DATACORR<-cor(DATANAME)
DATACORR
That's it!! That's the whole correlation matrix in R. It will show up immediately after you type in that second DATACORR (R likes to print automatically, oh, how I love R!)
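If you want to try this without a data file lying around, here is a minimal sketch using base R's cor(). The numbers and the column names x, y, and z are made up purely for illustration, and the CSV text stands in for whatever file you would really read.

```r
# Hypothetical stand-in for a CSV file on disk: three made-up columns.
csv_text <- "x,y,z
1,2,9
2,4,7
3,6,4
4,8,1"

# read.csv() expects a header row; text= reads from the string instead of a file.
demo <- read.csv(text = csv_text)

# cor() computes the Pearson correlation matrix of the numeric columns.
demo_corr <- cor(demo)

# Typing the name at the prompt prints the matrix.
demo_corr
```

Since y is just 2 times x here, the x/y entry of the matrix should come out as 1, which is a quick sanity check that the columns were read as numbers.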
Now let's talk about how we do THE EXACT SAME THING in SAS:
DATA DATANAME;
INPUT VAR1 VAR2 ....etc.; /* not going to input vars here- worthless for the rant */
/* here is where you input all of the relationships between the variables if you need to */
DATALINES;
/* now you type in all the data, because SAS really hates reading from files... or you can be lazy like me and use Excel's export features to make files that are suitable for SAS... and easily adaptable to the variables you want to input */
;
PROC CORR NOSIMPLE DATA=DATANAME;
VAR VAR1 VAR2 VAR3 .... etc.;
WITH VAR1 VAR2 VAR3 ... etc.;
RUN; QUIT;
Now, I think we can all see the problem here; as your data set gets bigger and bigger, and it will, incorporating all this stuff into SAS... pain in the butt. For SAS experts, there may be some shortcut to doing this, but let's think about it for a second. I am not a SAS expert, and many people who need to use statistical packages for large data set applications (biologists, etc.) probably aren't programming experts, either... So this seems like a lot of trouble to me.
With that being said, I think a fight needs to go down: SAS v. R... who will dominate the "Entry-Level Statistical Packages World"? I am obviously biased toward R. Here are some points to think about:
FOR SAS:
- more traditional package, has been around for a long time
- already incorporated into curriculums
- produces standard plots
- labels statistical tests and their solutions (for example, on the SW test it will remind you that P
- can easily run "tests" for equality of B's that help choose model form
- traditional ("old-school") programming syntax
FOR R:
- really sweet graphics
- lots of cool packages to download
- easy commands to use
- reading files is not TOO difficult (it can even work with some Excel files-- which is amazing-- it knows the data entry from the function entry!)
- calculates statistical tests
- new-school syntax
- 3D graphics that look just amazingly sweet (2 comments on the graphics, I know, but you can make just about anything on there! You can layer it with a MAP for goodness sake... I mean... hot skippy, that's amazing!)
So... it's the duel of the century in nerd world. I guess as this class progresses we will see-- will my exposure to SAS build my love for it or will my heart remain devoted to R? More of this epic saga to come.
Deathmatch # 2: R vs. Matlab!
I have no opinion yet, because I'm a scrub and have not yet learned R...
hmm... I am a scrub and don't know MATLAB... this sounds like a battle of the Gods or something, though.
Battle of the Scrubs! I'm not scared of your carriers or your tornados.
You know, I have been seriously thinking about learning R, for a couple of different reasons. I use Matlab for basically anything numerical at the moment (and Maple for anything symbolic). I was thinking about what bugs me about Matlab... I think there are four things:
1. It's proprietary. This sucks for obvious reasons.
2. Functions have to either be anonymous or declared in entirely separate .m files. This really breaks up the flow of programming for me.
3. Array indexing starts at 1. I know, it's a small thing, but it's annoying.
4. Matlab's figure quality is crap, and R (as MoF points out) produces beautiful, publication-quality figures. This alone might be reason enough to switch!