Tuesday, August 31, 2010

Regrets of the dying

From Inspiration and Chai:
I wish I'd had the courage to live a life true to myself, not the life others expected of me.

This was the most common regret of all. When people realise that their life is almost over and look back clearly on it, it is easy to see how many dreams have gone unfulfilled. Most people had not honoured even a half of their dreams and had to die knowing that it was due to choices they had made, or not made.

It is very important to try and honour at least some of your dreams along the way. From the moment that you lose your health, it is too late. Health brings a freedom very few realise, until they no longer have it.
Powerful stuff. Read the whole thing. On a (somewhat?) related note, I stumbled across a peculiar finding with regard to solar flares. Sure is an interesting topic, and I have always loved space... Oh, if only there were some way to connect what I'm doing to this fascinating topic!

Heh.

Monday, August 30, 2010

For the morbidly curious...

My first first-author paper came out today, in PNAS's early edition:

G.J. Peterson, S. Presse, and K.A. Dill. 2010. Nonuniversal power law scaling in the probability distribution of scientific citations. Proc. Natl. Acad. Sci. USA. 10.1073/pnas.1010757107.

I'm pretty excited about this! This was my first real experience with publishing a scientific paper. The main thing that was a surprise to me is how much journal-specific formatting is required from the authors. I had expected the journal itself would take care of most of the typesetting and aesthetic details, but (at least at PNAS) this is not the case. This isn't a bad thing, just a bit unexpected. (My only previous paper, from my undergraduate work, had me as third-author, and I wasn't involved with the publication process at all. For the very morbidly curious, that paper can be found here.)

Saturday, August 28, 2010

EXTREME COMEDY VIDEOS

http://www.youtube.com/watch?v=wdc344K_rPM

T Rex's CompSci Advice

I have no idea if that actually works, but... I would definitely LOL if I heard someone say that phrase about a language.

Friday, August 27, 2010

Wednesday, August 25, 2010

O great one who knows of many computational things...

I hereby summon ye, knower of all that is having to do with making computers do menial tasks I don't want to do by hand, to give thy best advice regarding a data situation:

Suppose that I have 3000 text files (saved as whatever format you want, right now in .txt) which each represent one day (for example, 01.01.2005) with a format like this inside of each:

PLOT ID    X      Y      CARBON    WATER
101        100    200    0.009     3.00
102        106    201    0.003     5.00

And each has say, 11000 lines in it.  So each text file is showing me data for 11000 locations on a specific day.

This is not what I am interested in, though. What I want to know is, how does the CARBON at a specific location, like PLOT ID 102, change over the course of the 3000 days (each of which is in a different file)? Notice that each file has two things that uniquely identify the PLOT-- the first is the PLOT ID and the second is the X and Y coordinates (together).  The X and Y are more descriptive, of course, than the PLOT ID, but they are also in two different columns.

I am not sure where to start on this. Any ideas? This is what I have thought of so far-- but I think I am still in the mindset of "think like an accountant" and not "think like a ph.d. student who needs to learn to think in complex ways." this is summarized as: "lulz i r n00b." here's my n00bish thoughts:

1. put all data from one year into a single spreadsheet. use the sort function in excel to sort that year by plot ID....  then do something (??) to separate each of those plot ID's into separate files-- maybe a command in MatLab can do this?
2. do this for all years (let's say we are using 10 of them)
3. when trying to use matlab, just find all of those files using the command similar to the one about data1.txt, data2.txt that is in the worlds most awesome tutorial for MATLAB use that you made for me (also known as "reference manual of the gods").

I like this idea, but I know I will be updating the file information a lot over the next however many years-- is there a way to develop an automated "process" that the computer can run to do this for me? It sounds to me like a task for Python, but I'm still on the "Hello WORLD" level with that one.
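
One way the automated "process" could look in Python, as a rough sketch and not a tested pipeline. It assumes all the daily files sit in one folder, are named like 01.01.2005.txt (read here as month.day.year, which is a guess), have one header line, and are whitespace-separated. The function name carbon_series and the folder name in the usage note are made up for illustration.

```python
from datetime import datetime
from pathlib import Path


def carbon_series(folder, plot_id):
    """Return [(date, carbon), ...] for one plot, sorted by date.

    Assumptions (not confirmed by the real data): files are named
    MM.DD.YYYY.txt, the first line is a header, and each data row is
    whitespace-separated as: PLOT_ID X Y CARBON WATER.
    """
    series = []
    for path in Path(folder).glob("*.txt"):
        # The file name carries the date, e.g. "01.01.2005".
        day = datetime.strptime(path.stem, "%m.%d.%Y").date()
        with path.open() as handle:
            next(handle, None)  # skip the header line
            for line in handle:
                fields = line.split()
                if len(fields) < 5:
                    continue  # skip blank or malformed rows
                if fields[0] == str(plot_id):
                    series.append((day, float(fields[3])))
                    break  # one row per plot per day, so stop early
    return sorted(series)
```

Calling carbon_series("daily_files", 102) would give one (date, CARBON) pair per file, ready to plot or write back out. To match on the coordinates instead of the PLOT ID, compare fields[1] and fields[2] against the X and Y you want. Rerunning it after new files arrive picks them up automatically.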

Neat R trick

Neat R-trick of the day.  Transform pixelated coordinates and elevation data into a 3D scatterplot-- it's a really cool way to see if the pixels are accurately depicting the landscape!!

You can also make MOVIES, but I can't figure out how to upload them onto here yet. 
I mean, look! It's a valley-- how cool is THAT? I'm excited. 
did I mention that I have a cool job? why.... yes!

Monday, August 23, 2010

Beautiful Places

To brighten your day.

1. Milford Sound, NZ
2. Qinghai-Tibetan Railway
3. Ngorongoro Crater, Tanzania
4. Aberystwyth, Wales
5. Santorini, Greece
6. Mt. Ossa, Tasmania
7. Grossglockner High Alpine Road (Hochalpenstrasse), Austria
8. Ankara, Madagascar
9. Tierra del Fuego, Argentina
10. Adam's Peak, Sri Lanka
11. Venezuela Falls, Venezuela
12. Iguazu Falls, Argentina
13. Dead Sea, Israel
14. Carpathian Railway, Romania
15. Iceland
16. Svalbard, Norway
17. Paranal Observatory, Chile

Nothing like the beautiful starkness of nature.

A resource that needs to exist for us n00bs

Someone needs to write a document on "how to open files" for every data management application. It's really nice and all to have tons of analysis toolpaks, but if you can't get your data into there, how the heck do you analyze it?

I say this because I wasted 45 minutes trying to figure out "how do I open an .shp file in ArcGIS?" before I realized that: 1) the real ArcGIS is not ArcView (what CU had) but actually a very large set of files, about seven of which are executable programs named other things that start with "Arc"-- ArcMap, ArcInfo, ArcGlobe, ArcCatalog, ArcTools, 2) You can't actually open a file directly in any of those applications and 3) Using ArcCatalog to set a working directory in a drive that can be opened in ArcMap is pretty darn confusing.

Okay, so I'm sure there are a lot of people out there who wouldn't find that confusing. But I did. Just like I found "setwd()", "importdata," and even using Excel for the first time really confusing. It's like, okay, I have all this stuff, but how the hell do I get to it...

There needs to be a middle ground in computer help manuals-- one for people who are capable of using the advanced features on a program (because they have used them on other similar programs) but need to learn the fundamentals of "where do I click for what?" You know, for example, in Excel, it would be great to have a book that wasn't just "and THIS IS A CELL!!" but also not one that was like "and if you want to program your own macro to do such and such thing you will never use this is similar to insert the name of some other programming language here you have never heard of." What I want is something that is like: do you need to use v-lookup tables? First click here. Then type this. BAM. Look up table good to go.

If only my idea wasn't so fail, I'd make "the big book of doing easy stuff on computers for those who want to pretend to be savvy" also known as "someone else has already written a program to do that: what to click to make yourself look smart." And then I'd be super rich. Right.

Where do I sign up?

For the Oregon that existed in April, May, and June... you know, the 45 and rainy one?
It was 61 yesterday! What's with this (see below):

Sunday, August 22, 2010

Special K

Depressed? Not any more!

The effect described on depressed patients is pretty jaw-dropping. Ordinarily I'd worry that this wouldn't fare well with regulators, due to its potential for abuse as a "recreational drug," but given its astonishing effectiveness, surely this will get approved.

Mathematical innuendo

searching "natural log LaTeX" gets me some VERY INTERESTING results.

note to academic publishers: you are smart! what is with the names?

Good:
MatLab
ArcGIS

Bad:
Maple
SAS

Ugly:
R
LaTeX

Saturday, August 21, 2010

But what does it mean

I was sitting on the wax-paper table at Dr. Julian's office because my shin wasn't broken.

They told me that an x-ray had shown a seven-inch vertical fracture down my shin and that if they caught me not using crutches within the next three months I would be promptly removed from the team and my scholarship taken away.  But three days later my shin didn't hurt and I wanted to walk again. We know, they said, that you just want to go to California for the meet. This is not your time. Maybe it's never your time. Maybe this isn't for you. How often do you get hurt and we come telling you that you need to take time off, and you don't do it?

How many times do I heal faster on my own than your ridiculous nonsense, I wanted to ask. I have always been too prideful for my own good. Always. I've also always been a tremendous pushover. But instead I said, okay, I'm headed to the elliptical machine.  They said, see you in three hours. The sign on the wall said, be polite to others, do not sweat. Dr. Julian came into the office to tell me what I knew all along-- my shin was not broken.  I had another problem that was causing microtears in my muscles that he could see on the MRI-- a condition called hypotonia, common in people with defunct mental states-- it meant that my body couldn't communicate dueling signals with my muscles.  When it told one muscle to relax, it automatically told another one to tense. He pulled some strings for me and got me into therapy for free. I have not had any serious sports "injuries" since. The university was very hush-hush about the whole thing. There was talk of a scheme underlying, as I was certainly not a great team member, and it took me a long time to become brainwashed to the mindset of making sacrifices for the team hierarchical hold. Eventually they cracked my brain, but my shin was nonetheless intact.

It was at that time, sitting on the table and talking to Dr. Julian, that I said to myself, I need to be an orthopaedist. After all, I had been "saved" by one.  I had paper and images to prove that I in fact had a problem that was not a broken shin. This must be the most rewarding work, "saving" athletes so that they can pursue their passions. Dr. Julian was for a while my hero.

I typed before and I should reiterate here that I have always wanted to be a "hero." I realized the implications of this sentence about two days after I wrote it. I am by training a symbolist. Language is fascinating in that it takes so much breadth of idea and theme and compresses it into a single word. Hero. Love. Life. When we use that term through a standard syntactical construction we bridge together word and idea and inherently create a set of analytical tools to use for it. We are very metonymical.  I say hero, and suddenly things that I never first imagined: Superman, Einstein, ancient Greek lore, these are all part of the sentence. What is a hero? A standard protagonist of sorts, having epic adventures filled with meaningful failures and a barely-make-it climax and then a life-is-great-and-easy conclusion? Or some kind of lone spectre summiting mountains and passing distant judgement on the world, too cunning and smart for her own good? Maybe a hero is some Jane Austen woman who manages to convince both herself and Mr. Darcy to change? Tools we use to assess these protagonists are now fair game for assessing ourselves: is it a question of what Lucy DID to save Narnia or who she was? Or both? How are heroes made and analyzed?

I title this "But what does it mean" because the question of venerating a Narnian warrior girl has something to do with how I view a job or career prospect. But what does it mean to be a hero in career fields? At the risk of being myopic, I even say, what does it mean to be a hero in the sciences? I don't know the answer to this question. Is it great achievement, like winning the Nobel Prize? Would that make me a scientific hero? Or is it better to be less prolific but more productive, maybe doing something small that changes the way that people look at a certain problem? Who invented the ANOVA? How many statisticians use ANOVA every day? Try to name all the Nobel Prize winners-- these are the top folks in science, the number ones, the "heroes"-- I can't do it.  I can't even get five. And realistically, the honor of being even in contention for the prize would be pretty big. How many Nobel finalists can I name? None. Not a single one. These are the top dogs-- they should be the heroes.  So it makes me wonder. Dimensional analysis shows that my goal and my progress are using two different units. Conversion is needed for understanding results.

I have to give a presentation in December to a field that is entirely not mine. I will be batting in the big leagues when all I have ever encountered was a bowling ball.  Spheres carry different weights, to say the least. So it is safe to say that none will ask me questions about trees or something I know and am prepared for-- nor am I scared to say "well, I don't know" to questions that are out of my grasp. Questions that I should be able to answer will be the broader, structural kind. My past experiences with this sort of thing have shown me that for the most part professors and government bigwigs aren't interested in my methods, whether or not I've used a certain statistical test or whatever, but, to use a terribly inappropriate aphorism, the "forest for the trees." I impressed my way to eight job interviews in my last year at Clemson dropping only one resume by talking to people about big pictures. Look at a forest fire and tell me who is concerned about needle length-- it's not the leaf but the landscape. What is the goal of this project? How will you get there? And why the hell is it important? Why do we care about trees growing on mountains? The fact that it interests me is irrelevant. Cutting my toenails also interests me and that's not getting me anywhere in life. I can't say, well, I just want to know everything there is to know about how mountains work and how they affect forests.  Means are not ends. Unless you are a statistician, I guess.

In the 1960's, a bunch of forest rangers in the southwest were walking through a forest ravaged by fire.  In the charred woods they found a tiny bear cub, clinging to a burned tree.  They took him back to the station and nursed him to health.  At one point, a ranger, joking around, put his hat on the bear cub and took a photo. Smokey the Bear is now a commonly recognized symbol of wildfire prevention.  What did Smokey do to become a hero? He was an icon of life in a dead wood. Maybe that's the key-- the hero isn't what is accomplished but the attitude with which it is presented. A scientific hero is new, fresh, and simple.  She describes the mountain forests with a passionate directness and creates applicable models for all people.  She isn't concerned with making a name for herself, but her system speaks louder than her words. Snow falls on a porch and no one hears it, but everyone knows it's there and they feel different. The world is quiet and sublime.

These are ways to think of it, I suppose-- quiet and sublime-- secretly grand-- the lone spectre on the mountaintop knows that she in fact created the whole landscape in her view. Grandeur belongs to the creation; to its creator, satisfaction.

Zotero

Maybe I spent too much time with Apple, but I think I like this little crispy bibliographical software better than BibTeX.  Especially if writing the .bib file seems sort of "follow this template and it's all okay" straightforward.

http://www.zotero.org/

Check this thing out-- it integrates with your firefox, so you can just ISI or JSTOR up the results, and then do all sorts of crap like take notes.

I know that one thing that sucks for me on reading papers is that I don't really keep notes I might take on them, or if I do keep them, I never reference them. Having these linked to the articles is salvation for me!

Tuesday, August 17, 2010

T-Rex's great insight

I didn't know T-rex lived in South Carolina!

Monday, August 16, 2010

A Life Story of Sorts (also known as, no! I don't want to write in LaTeX right now).

Once upon a time there was a girl who lived in a big house in a big neighborhood in a big city. The house was on a street called Red Spruce Lane and there were tricycles in the cul-de-sac.  Business was good for businessmen in the city in those days, and the plan to build a great radio station company was well underway for three young book binders from Decatur. They decided to buy a nice office on the north side of town and the girl and her family moved to another standard-grade house in another standard-grade neighborhood.

This time was before the time of computers, Game Boys, and iPods.  It was when the Sony Walkman was just coming out and wow, you could take your cassette with you? Amazing. It was too expensive for her family. They were traditionalists from South Carolina mixed with half-hearted coal-miner kids from Pennsylvania.  They were very much into structure and social pecking orders. Every day when the girl went to school her mom made her wear a dress.  She hated the dress because it meant she couldn't run on the playground with the boys.  Instead she had to sit on the slide and in the dirt with the girls. The playground didn't have a "floor" then like it does now.  It was just some monkey bars, bars, and dirt. When she came home she put on Michael Jordan high tops and biking shorts and t-shirts that she got free from the World Wildlife Fund.  She inherited her male cousin's mountain bike and she would ride it from school end until nearly six PM, when she would come inside to enjoy Kid Cuisine and the Carmen Sandiego show.

As she got older, doors, literal and figurative, closed.  The family bought their first computer, an MS-DOS machine that didn't even have windows.  It had a few black and green programs for learning arithmetic and typing.  On Saturday mornings before she went to Tae Kwon Do practice, the girl's mom would sit her down in front of the machine and make her practice typing.  She had to do it, because somehow the machine could tell her mom if she practiced or not. On Saturday afternoons, she'd get back on her bike and ride as far as she could, sometimes all the way to Cherokee County, crossing roads with reckless abandon.  She only got stopped for jaywalking once, about 10 years later. In the evenings, she would go to her friend's house for dinner. The friend was Jewish and her parents always made fresh hallah bread.  She would stay sometimes for a few days, and her friend's mom never nagged. This was just her other home that was sometimes a better home. She learned how to say a few words in Hebrew and how to bless various parts of the meal.  She often thought about the fact that she loved Judaism-- the sounds of the chants and the somber feeling of the rituals.  And also the foods. Moreover, it was a happy place where ancient rituals superseded social systems.

The times at home were sad. Her father lost his job right after her sister was born. Broadcasting business was bad and it was hard to get rehired in the age headed towards digital media. Ted Turner was entirely to blame. Didn't he own enough stuff in Montana?  The family had hard times for many years.  Money was tight. Everyone was half-alive, and many-times-removed relatives were constantly passing away. When she got in trouble she was sent outside, shut out from the inside.  It was sometimes cold-- very cold-- so she would huddle by the dog door near the basement and sometimes weasel her way through onto the cement floor. Other times, she'd walk down the street and climb into the big magnolia tree, waiting for her friend's mom to get home. They had a fescue lawn that never went brown in the winter.  She learned that green grass was really a key to happiness. The friend's mom put a key for her in the big planter on their deck.  She'd go there sometimes in the afternoons while the mom was at work and the friend at Jewish school. Then they would all eat hallah.

Throughout this time she dreamed and wrote stories. Sometimes she'd curl up with a spiral book her friend's mom bought her and just write adventure tales about girl warriors who lived in hollowed-out trees on tall mountains. They were friends with the birds and had wolf lovers. There were passionate exchanges of single kisses and glorious sacrifices for good things. They wore shirts made of falcon feathers and leather pants skinned from dragons. They fought glimmering city men who tried to tame them. She thought C.S. Lewis would love her stories, like her friends loved her stories. Her parents never read them, didn't know they existed, didn't ask. What was the most important was passing algebra, not creating things. Why was the minus different than the subtract? Or were they? Doesn't minus mean negative and parentheses mean subtract? An accountant's daughter couldn't get it through her head that basic algebra didn't work in terms of loans and equity. And why bother with scientific notation when you could just write out 100,000?  When she was in seventh grade, she started a story exchange with another girl in the school. They had a notebook that told an epic tale about a class of students fighting great enemies in a fantasy world. The notebook circulated to the extent that teachers tried to pursue it from the class front.  It was never found, and the girl found herself achieving detention by other means, none of which were serious. She learned a lot when she wasn't supposed to learn anything, and found that passing school was sort of a side job to learning how to build tree forts and dig deep holes.  One day she built an irrigation system for the blackberries in her yard, but wasn't allowed to power it because they were "weeds."  Also, putting nails in a hose was a bad idea.

There came a day that she realized that something was wrong.  She was driving around at night in a silver Volvo, looking for meaning. It was two am and she had incidentally fallen asleep at a friend's house. It wasn't her fault that the movies were boring. The town was more brick and cement than trees and grasses, and everything looked silver at night, even the sky, because city-glow was stuck in it. She was listening to angry rock music, wanting to be able to jump and run across fields and fight great enemies.  Instead she lived in the city of plastics.  Doors were shut now.  Televisions that worked had been invented and computers made interesting games. Gaming consoles drove after-school wanderers to a digital world. It wasn't cool to have a hand-me-down bike or an old silver Volvo.  At best, in her circle, it was okay to be a bit of a somber social slider, slipping in and out of groups without making attachments.  At worst, all the other groups filled their guts and brains with poisons that were unallowed.  She wondered to herself if they had ever seen a close friend be killed by a drunk motorist. She had been at her coach's funeral and she knew that an innocent choice taken too far could take the life of a friend and the lives of his family. The whole place had an aura of soft repulsion to it, like a bubble or a pile of half-molded wax.  You could either fit in and be absorbed, or you were destined to street drive at night, planning your escape. She drove and drove, coming home and going to bed in silence. Something was wrong, and always was. She lay in her bed on the side, waiting for whatever should be coming with greatness. She got the feeling that it wouldn't come.

She wanted to become a hero. A scientist? An athlete? A doctor? Someone who made discoveries and explored the world.  She cared less about what she did than how she got to do it-- if it took her to somewhere beautiful and happy, rugged and glorious, that was the goal. Yet she made choices against that goal-- choices that reflected on financial obligations and the social frenzy that one should stay close to home for familial respect.  It was duty to remain in the soft place and take care of your relationships.  Duty is also rolling Sisyphus's boulder in hell though. There was something great in the rugged places.  On the final Saturdays in that town she took her bike to Cherokee County, looking for adventure.  She ended up on creek shores or climbing on mill ruins. Some people grow out of the need to be wandering around in the woods, but she didn't.

There are soft places that people get stuck in.  They stay there forever, sometimes because they like soft places, and other times because there's nothing to pull up on to get out. I have lived in a soft place. There are some beautiful old rituals in the soft places to remind us what is real. Seizing those can help us to pretend that the place is not all bad. Or perhaps good luck can rescue us. Or tenacity to wander.  Eventually wandering far enough will probably take you to the edge of the cotton ball. Today everything is crisp. I feel air on my face and see clouds on mountain tops. My coffee tastes vindictive this morning, and I now feel motivated to get back to work.

Tuesday, August 10, 2010

200,000 reasons to dislike children

http://moneywatch.bnet.com/saving-money/blog/family-finance/kids-cost-parents-200000/846/?tag=content;col1

That's like 50 decent motorcycles or my salary for ten years.

Thank you, I'll have the motorcycles with a side of peace and quiet.

Monday, August 09, 2010

Inference

As it turns out, picking up radio signals over the computer speakers is a common thing, and all it takes is a ferrite ring bundled around the cord to stop them.

What is a ferrite ring? I think you get those at level 78.

But what do you do if your speakers only pick up oldies?  I guess, not complain, they could be picking up country!

PCA?

So, I was thinking about PCA today... I think about it a lot, I guess, because I have trouble translating what I know about how it works with very straightforward data (given that we measured these 20 things, combinations of these things form logical components that can be used to help us format new models that explain the variation in the data better than just regression type analysis on measurements alone?) I am reading through a certain physics paper at the moment, and I really think that the PCA is a good choice because it is a system designed the best for phenominological... geez, that's a hard word to try to spell... models. 

I am about to use PCA today to look at some data.  I have a ton of variables about stuff measured on the watershed-- elevation, slope, LiDAR index, height, etc. What I am looking for is something that explains "primary productivity"-- the annual growth of plants, in Mg/ha/yr. When I look at these things versus the data one by one... or if I use stepwise regression to look at every single possible combination of these things, what I end up with is a shit correlation of something like 0.10 R-squared. I mean, it's pathetic. The reason is that individually, none of these things are phenomena that describe the "primary productivity"-- most relate to it in some sense or another, but they are not the "driving forces behind it."

I am going to get to the funny part. I just started to think, well, maybe if I type my logic here (although it's not very good, I know), it can possibly spark something about PCA that might help with a certain physics paper.  Maybe not, but it could, I guess? It is worth the effort even if it doesn't, because it cleans my thoughts for my own purposes. I apologize that most of this is probably readily apparent and not helpful at all, but for me, this "baby style PCA" took literally a month or two to learn, and I would rather over-explain than not say enough.  I am one of those who must work through things from the most simple, 7th grade-style-math standpoint to even get anywhere. Sad, I know. Every day I wish my brain were faster. Anyhow...

Let's pretend I'm running PCA on my data in R....

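> output <- prcomp(data)   # run the PCA on the table of measurements
> summary(output)          # standard deviation and proportion of variance for each PC

which gives you the standard deviation of each component along with the proportion and cumulative proportion of variance it explains. The loadings (the eigenvectors) are in output$rotation, and the eigenvalues are output$sdev^2. Note that prcomp does not give P-values for usefulness of parameters.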

So, in pretend world, I have run this analysis, now what do I do?  First thing first is that I look at the eigenvalues and the variance explained.  I want to see which PC's are good to keep.  Generally the accepted criterion that I know of is "eigenvalue > 1" or "cumulative explanation of variance > 0.70."  I think the second one is the one common for forestry; at least, that is what S told me to use.
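
The two rules of thumb are easy to apply by hand, but here is a small Python sketch of them anyway. The function name and the eigenvalues are invented for illustration; with real output from R, the eigenvalues would be output$sdev^2. The "eigenvalue > 1" rule only really makes sense when the variables were standardized first.

```python
def components_to_keep(eigenvalues, cumulative_threshold=0.70):
    """Apply two rules of thumb to a list of PCA eigenvalues.

    Returns (n_by_eigenvalue, n_by_cumulative):
      - how many eigenvalues are greater than 1
      - how many leading components are needed before the cumulative
        share of variance reaches the threshold
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)

    n_by_eigenvalue = sum(1 for value in ordered if value > 1.0)

    running = 0.0
    n_by_cumulative = 0
    for value in ordered:
        running += value / total
        n_by_cumulative += 1
        if running >= cumulative_threshold:
            break
    return n_by_eigenvalue, n_by_cumulative


# Made-up eigenvalues for five standardized variables (they sum to 5).
print(components_to_keep([3.0, 0.75, 0.75, 0.25, 0.25]))
```

For these made-up numbers the two rules disagree: only one eigenvalue is above 1, but it takes two components to pass 0.70 of the variance. That is a good reason to decide on one criterion before looking at the results.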

One thing I really like to do with PCA is to plot my loadings along the principal components. Generally this is kind of annoying with more than 2 PC's, but 3 is okay, too. I will try an example here. Let's say that one of my parameters was "elevation."  The analysis showed me that elevation has a loading of 0.6 on PC1 and a loading of 0.2 on PC2.  So if PC1 is an axis that is orthogonal to PC2, then this point would be at (0.6, 0.2) just hanging out in the first quadrant.  I do this for all parameters.  Now I've got this nice visual which shows me where each parameter is on this PC graph. It's especially nice to have each parameter labelled or in colors or something so that you can identify them.  I look at each quadrant individually and see 1) who is in that quadrant, and 2) how many clusters are in that quadrant.   Let's say in the first quadrant (PC1 +, PC2 +) I have "elevation", "needle thickness", "bark thickness" and "julian days of active photosynthesis" all in a cluster.  I say to myself, what do these things all have in common, phenomenon.... shit that word is hard to spell. What do they tangibly have in common?  Well, all of those things describe conifers. Conifers grow at high elevations, have thick needles, thick bark, and long periods of active photosynthesis. So I can say that "primary productivity" is related to "conifers" in some way.  I didn't ever have "conifers" in my data, but when I am trying to make an ecophysiological model, I will include some sort of term or input reflecting "conifers" (proportion of conifers on the land or something). I use the axes of the PC graph to guide me about the shape of this interaction-- is it increasing or decreasing?
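
The "who is in which quadrant" step can be done without a plot. This is a minimal Python sketch, and the loadings in it are invented for illustration, not taken from any real analysis. With real R output they would come from the first two columns of output$rotation.

```python
def group_by_quadrant(loadings):
    """Group parameter names by the signs of their (PC1, PC2) loadings.

    `loadings` maps a parameter name to its (PC1, PC2) pair. A loading
    of exactly zero is counted as positive, which is an arbitrary choice.
    """
    quadrants = {}
    for name, (pc1, pc2) in loadings.items():
        key = ("PC1+" if pc1 >= 0 else "PC1-", "PC2+" if pc2 >= 0 else "PC2-")
        quadrants.setdefault(key, []).append(name)
    return quadrants


# Hypothetical loadings, invented for illustration only.
example = {
    "elevation": (0.6, 0.2),
    "needle thickness": (0.5, 0.3),
    "bark thickness": (0.55, 0.25),
    "soil depth": (-0.4, 0.1),
}
for quadrant, names in group_by_quadrant(example).items():
    print(quadrant, names)
```

This only sorts parameters by sign. Deciding what a cluster "means" (conifers, rainfall, whatever) is still the part a person has to do.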

Also, now let us say that we have several clusters in the first quadrant. Let's say there is a distinct cluster for "conifers" and two others that we have reasoned out, "rainfall" and "rocky soils."  If the "divide by standard deviation" thing has been used prior to data input, we can measure the distance between the clusters (and there are numerous ways to do this, like closest point to closest point, average cluster point to average, etc.) and talk about their relationships with one another.  Perhaps rainfall is very near to conifers, and rocky soil a little further away.  We can infer that rainfall and conifers both impact the overall system similarly, and that rainfall and conifers are closely related to one another, more so than rocky soil is to either. For my purposes, PCA's strength is that it allows you to see the underlying phenomena that caused the variation in your measured data.  That means when you go to write a model, instead of having some crazy crap like you would get from a stepwise regression such as:

Productivity per acre = aX1 + bX1X3 - cX4 + dX1X4X6 + eX3X7 - fX25 ...

which is pretty annoying to work with, you might have

Productivity per acre = (number of trees)*(amount of rainfall) - (amount of rocky soils)

Both models can technically "explain" the data, but the second model makes sense and tells us something about the real world.
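
For the cluster distances mentioned earlier, here is a small Python sketch of the two ways of measuring them (closest point to closest point, and average to average). The function names are made up, and the points are assumed to be (PC1, PC2) pairs.

```python
from math import hypot


def centroid(points):
    """Average position of a list of (PC1, PC2) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def centroid_distance(cluster_a, cluster_b):
    """Distance between the average points of two clusters."""
    (ax, ay), (bx, by) = centroid(cluster_a), centroid(cluster_b)
    return hypot(ax - bx, ay - by)


def closest_point_distance(cluster_a, cluster_b):
    """Smallest distance between any point in one cluster and any in the other."""
    return min(
        hypot(p[0] - q[0], p[1] - q[1]) for p in cluster_a for q in cluster_b
    )
```

The two measures can rank clusters differently when one cluster is long and stringy, so it helps to say which one was used.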

I hope that was at least slightly helpful or caused some thinking of a few things. This is how I know to use PCA for model making...
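The cluster-distance idea from a few paragraphs back can be tried out in a few lines of R. Everything here is an assumption for the sake of the sketch: the loading coordinates, the cluster labels, and the helper name `closest_dist` are all invented, and I assigned the clusters by hand rather than with any clustering routine.

```r
# Hypothetical (PC1, PC2) loadings with hand-assigned cluster labels
pts <- data.frame(
  pc1 = c(0.60, 0.55, 0.62, 0.50, 0.48, 0.20, 0.15),
  pc2 = c(0.20, 0.25, 0.18, 0.30, 0.33, 0.70, 0.75),
  cluster = c("conifers", "conifers", "conifers",
              "rainfall", "rainfall",
              "rocky soils", "rocky soils")
)

# Average point (centroid) of each cluster
centroids <- aggregate(cbind(pc1, pc2) ~ cluster, data = pts, FUN = mean)
rownames(centroids) <- centroids$cluster

# "Average to average" Euclidean distances between clusters
centroid_dist <- as.matrix(dist(centroids[, c("pc1", "pc2")]))

# "Closest point to closest point" distance between two named clusters
closest_dist <- function(a, b) {
  d <- as.matrix(dist(pts[, c("pc1", "pc2")]))
  min(d[pts$cluster == a, pts$cluster == b])
}

centroid_dist["conifers", "rainfall"]
centroid_dist["conifers", "rocky soils"]
closest_dist("conifers", "rainfall")
```

With these made-up points, rainfall sits closer to conifers than rocky soils does under either measure. The two measures will not always agree on real data, so it is worth saying which one you used.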

Sunday, August 08, 2010

Experimental design

Thinking about experimental design. 
The hard thing about nested models is that you can't just take the variance of the sample design and the variance of the shell design and compare them, because the samples are nested in the shell. Trust me, I wish you could (ANOVA is easy), but then you don't account for the variance correctly. What you need to do is run a nested analysis of variance, and you can check the Satterthwaite statistic depending on the type of effects you are dealing with. Shoot, darn education-- I can tell exactly "what" tests I want to run, but unfortunately I don't have a good reason why I am running them. I are bad scientist.

What I find myself reasoning through, statistically, is the comparison between a nested RCBD and a nested CRD (nested randomized complete block design versus nested completely randomized design). The catch is that our "blocks" in this case are not really blocks. If it were a randomized block design, each "blocking factor" would have within it all types of treatments. In the case of my work, that would mean that each type of cover index would have within it every kind of treatment, and that treatment would then determine my observational values, which in this case are Mg of litter fall. RCBD, as it is called, is pretty straightforward.

In this case, I have one big shell, which is the basin itself. The outer layer is just looking at the variance of the litter for the whole basin, regardless of which "block" it is in. The inner layer is looking at the variance within each "block" (cover type). We would make the assumption that the variance of the whole basin is greater than the variance of the individual blocks, but who knows. Maybe litter has nothing to do with cover type. Most importantly, we cannot assume that the variances are independent: the variance within the cover types is a part of the variance between the cover types.
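For what it's worth, here is a minimal sketch of a nested analysis of variance in base R, assuming a balanced design where litter traps are nested in plots and plots are nested in cover types. The data are simulated and every name (`d`, `plot`, `trap`, the effect sizes) is hypothetical. It uses `aov()` with an `Error()` term, which is one way to do this, not necessarily the way my project will end up doing it. A balanced design like this one does not need the Satterthwaite approximation; that comes in when the design is unbalanced.

```r
# Simulated, balanced layout: 3 cover types x 3 plots each x 4 traps per plot
set.seed(42)
d <- expand.grid(trap = 1:4, plot_in_cover = 1:3,
                 cover = c("conifer", "hardwood", "mixed"))
d$plot <- interaction(d$cover, d$plot_in_cover)  # unique id for every plot

# Made-up effects: a shift per cover type, a random shift per plot, trap noise
cover_effect <- c(conifer = 3.0, hardwood = 2.0, mixed = 2.5)
plot_effect  <- rnorm(nlevels(d$plot), sd = 0.4)
d$litter <- cover_effect[as.character(d$cover)] +
  plot_effect[as.integer(d$plot)] +
  rnorm(nrow(d), sd = 0.2)

# Error(plot) tells aov() that traps are nested in plots, so cover type is
# tested against plot-to-plot variation rather than trap-to-trap variation
fit <- aov(litter ~ cover + Error(plot), data = d)
summary(fit)
```

The summary comes back in strata: the "plot" stratum holds the cover-type test, and the "Within" stratum holds the leftover trap-level variance.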

So I'm going to run, and think about blocking factors. If I can run, because I think all this alternative working out with hiking and martial arts is "working" in that I am feeling more tired and hungry now...if only it would actually get me into shape! Three Finger Jack kicked my butt and it HURTS but in a "damn, you worked hard" way.

DID I MENTION THAT IT'S 55 AND RAINING!!! THIS IS WHAT I MOVED TO THE PACIFIC COAST FOR!! MAY IT RAIN FOR THE NEXT 10 MONTHS STRAIGHT, AND STAY ABOUT 50 DEGREES!

Saturday, August 07, 2010

I'm a lumberjack and that's okay!


What do shih tzus have to do with lumberjacks? Up until this morning, I thought... nothing. Until I saw a man out hiking today with full "lumberjackin'" attire (plaid shirt, boots, suspenders) and a shih tzu on a leash. I will admit... they may be small and ridiculous, but shih tzus look like small versions of Lhasa apsos, and as is known BY SCIENCE, Lhasa apsos are the best dog breed known to man. They don't bark, and they don't have bones (Yoshi is actually just made of pure gelatin, hence why he can bend in funny ways), so I give shih tzus some thumbs up.

In any case, it makes my brain joyous to picture a lumberjack walking a shih tzu, and now I have seen it "FO REELZ."

Friday, August 06, 2010

FailuRe 2

So I am trying to do a new type of test in R... it's really quite simple, theoretically, but part of the test means that I need part of my data to be assigned as a "blocking factor" while other parts are random effects.  Right, R should totally be able to do this, and I am sure it can... but try searching this baby on Google:

Which of the above actually pertains to R statistical software, eh? I have spent enough of my life learning about the R factor in insulation (thank you Energy Economics class)... give me some STATS!! <3
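In case it saves someone else the search: one way I know of to treat a blocking factor as a random effect in base R is `aov()` with an `Error()` term. The sketch below assumes a plain randomized complete block design, and the data frame `rcbd`, the treatment names, and all the numbers are invented for illustration.

```r
# Simulated RCBD: every block contains each treatment exactly once
set.seed(7)
rcbd <- expand.grid(treatment = c("A", "B", "C"),
                    block = paste0("b", 1:5))
rcbd$treatment <- factor(rcbd$treatment)
rcbd$block <- factor(rcbd$block)

# Made-up effects: a shift per treatment, a random shift per block, noise
trt_shift   <- c(A = 0, B = 0.5, C = 1)
block_shift <- rnorm(nlevels(rcbd$block), sd = 1)
rcbd$y <- 10 + trt_shift[as.character(rcbd$treatment)] +
  block_shift[as.integer(rcbd$block)] +
  rnorm(nrow(rcbd), sd = 0.3)

# Check the layout: each block/treatment cell should hold one observation
table(rcbd$block, rcbd$treatment)

# Error(block) treats block as a random effect and tests treatment
# within blocks
rcbd_fit <- aov(y ~ treatment + Error(block), data = rcbd)
summary(rcbd_fit)
```

Searching for "aov Error term" or "randomized complete block design aov" gets much closer to statistics than searching for "R" and "blocking factor" does.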

The more time I spend graphing things in R...

...the more I realize that documentation is amazing.

It has been teh uber useful to me, to put it lightly, to have the awesome "live MatLab helpdesk" also known as "Pericles v 2.0". From the information in the files he made, as well as the functional file of "logistic", I have been able to see the "patterns" in the Matlab language that allow me to write more "sophisticated" (see also: more for loops and more random data) code for other programs! Currently the ::name not provided here because it is a very common forestry term and until I'm done with the project I don't want anyone searching it:: series is on its 9th iteration, each one calculating different data than the last!

I was talking to the coworker the other day about the unhelpfulness of a lot of computer software documentation. I think that many people who write those things skip past the basics very quickly. For example, it's great to know how to make sweet looking graphs in R using someone's fancy package, and some people need to learn how to make packages like that, but I will tell you that I used R for almost six months before I was able to understand what a "working directory" was and how to set it. I laugh now because it seems kind of simple to me, but imagine you are someone, as I am, who comes from a background in the humanities. You can do everything in life in Microsoft Word, and you happen to also own a calculator.

You jump into a management program and, as I did during my first semester at CU, discover that Excel is actually a valuable part of the Office package, too. If you are given a process to do in Excel that needs to extend to, say, 2000 cells, with specific row and column references (for example, a constant in the first column is applied via a function to everything else in the row), then yes, there is a way to write this into your first row using fixed cell references and copy-paste it to the other cells. Someone, also known as me, did this by hand during Valuation Economics and suffered greatly for it. You see, tutorials online will tell you all about VLOOKUP tables (I have never used one, but they are cool) and all about "well, this is a cell! have you heard of it?" but not once does a document stop and say "You there! Excel user-- did you know you can just put a "$" in front of the column letter or row number you want to reference and it will fix your reference to that column or row?"

So fast forward to R.  Only recently did I understand the part of the CRAN-R documentation that talked about what to do with the setwd() command. Working directory? To a former English major, that sounds like "oh, it's an address book for people I work with! Why the hell do I need that in R? Wait! I LOVE APPS! I LOVE APPLE! It must be an APP for R, how cool!" No, we don't think, "that's where it keeps all the files I would use"-- even if we did, I know I would have thought, who the hell needs "files" for R-- it's just a confusing sort of calculator that does statistics, too.
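For the former English majors in the audience, here is roughly what the working directory does, as a small sketch. The folder name `litter_project` and the file `example.csv` are made up, and I am putting them under `tempdir()` so the example does not touch any real files.

```r
# Where is R currently reading and writing files?
old_dir <- getwd()

# A hypothetical project folder, created under R's temporary directory
work_dir <- file.path(tempdir(), "litter_project")
dir.create(work_dir, showWarnings = FALSE)

# Make that folder the working directory
setwd(work_dir)

# File names without a path are now resolved inside work_dir
write.csv(data.frame(x = 1:3), "example.csv", row.names = FALSE)
list.files()                   # shows the file that was just written
dat <- read.csv("example.csv")

# Go back to wherever we started
setwd(old_dir)
```

So `setwd()` is just telling R which folder to look in when you hand it a bare file name. No address book involved.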

So right now I'm working on learning nested experimental design in R. Could I do it in MatLab? Probably, but the benefit of R is that our biometricians also use it, and if I screw something up on the ecological side, then I have some people to turn to. I'm documenting it both in R and in an "answering BRORBs questions" Word document (note to self: after I finish this one, I'm graduating to LaTeX, because geez, I am not satisfied with the MS Word equation thing, and the equations don't open up in some older versions of Office). I realize as I'm writing it that although something like

par(mfcol=c(2,2))
hist(median, col="blue")
hist(mean, col="red")
boxplot(median)
title("boxplot of median")
plot(avg, trend, xlab="mean index", ylab="mean index", main="scatter plot of indices")

to me now seems a fairly obvious way to put four graphs onto one plot in a form that can be exported as PDF or JPG, for anyone else, especially people who don't really stare at R for hours on end, that just looks like a boatload of jargon. Not to mention someday when I am an ancient hoary old professor, I will say to my students, "back in the day if I wanted to plot four graphs I had to type all this crap" and they will say "geez! All I have to do is THINK of what I want and it magically appears on a 3-D hologram in the middle of the room, along with Princess Leia" and then the Star Wars theme will start playing and my favorite physicist will come flying in on the Millennium Falcon along with a Wookiee named Chewie and.... wait... I'm getting carried away. Maybe R documentation and MatLab documentation are not the Force holding the universe together... but last I heard Yoda was hanging out saying "Document the code you will."
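In the spirit of Yoda, here is what a documented version of that four-panel plot might look like, including the export-to-PDF step. This is a sketch with stand-in data: the vectors `median_index` and `mean_index`, the axis labels, and the output file name are all assumptions, not the real indices from my project.

```r
# Stand-in data: two made-up index vectors of the same length
set.seed(3)
median_index <- rnorm(50)
mean_index   <- median_index + rnorm(50, sd = 0.3)

# Open a PDF file as the plotting device (hypothetical file name)
out_file <- file.path(tempdir(), "four_panels.pdf")
pdf(out_file, width = 7, height = 7)

# Split the page into a 2 x 2 grid, filled column by column
par(mfcol = c(2, 2))

# Panels 1 and 2: histograms of each index
hist(median_index, col = "blue", main = "histogram of median index")
hist(mean_index, col = "red", main = "histogram of mean index")

# Panel 3: boxplot of the median index
boxplot(median_index, main = "boxplot of median index")

# Panel 4: scatter plot of one index against the other
plot(mean_index, median_index,
     xlab = "mean index", ylab = "median index",
     main = "scatter plot of indices")

# Close the device so the file is actually written to disk
dev.off()
```

Swapping `pdf()` for `jpeg()` gives a JPG instead, though `jpeg()` takes its width and height in pixels by default rather than inches.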