A picture isn't really worth a thousand words in this case, but here's my data. It's only 120 ish points because it's samples from our watershed on a specific time interval (hopefully it will go up soon as we do more samples).
What you are looking at is (on the x axis) an index-- pretty much an index of "how big is the canopy". On the y-axis is the biomass of the leaves. This is on a plot by plot basis.
I have reason (biologically and ecologically) to believe that this should blow up at an exponential rate. Or, should I say, BRORB has reason to believe this, and she knows far more about this than I do. If it doesn't, that's cool, but I should probably have a damn good fit to show otherwise.
It kind of just looks like a blob now. Or perhaps a whale, leaping from the water towards that outlier in the upper right-hand corner.
I'm not sure I am doing the fitting right-- I've never done anything more complex than variations of cubic or logarithmic without the great guidance of many stats people. It's a learning curve that I want to jump asap.
It's hard to make out much from that plot -- it does kinda sorta appear to be increasing, but there's an awful lot of scatter. I can see why your curve fitting has been troublesome. If you're determined to fit this data, I'd recommend using some kind of robust fitting procedure, to minimize the impact of those three extreme outliers. I'm not sure what functions R has for this, but for a description of how Matlab's robust fitting works check this out -- it has a nice description of what reweighted least squares is, in addition to specifically how it's implemented in Matlab. (My guess is that you're really going to need more data to do a good fit... in that plot, it's hard to make out if there even is a relationship between x and y.)
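For anyone wondering what reweighted least squares actually does, here's a rough sketch in plain Python (not R or Matlab) for a straight-line fit. It is a simplified illustration of the general idea, not Matlab's actual implementation: the function names, the bisquare weight function, the commonly used tuning constant 4.685, and the toy data are all assumptions picked for the example.

```python
import statistics

def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def robust_line_fit(xs, ys, tune=4.685, n_iter=50, tol=1e-10):
    """Iteratively reweighted least squares with bisquare weights (a sketch)."""
    ws = [1.0] * len(xs)  # first pass is ordinary least squares
    intercept, slope = weighted_line_fit(xs, ys, ws)
    for _ in range(n_iter):
        resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
        # Robust estimate of the residual spread: median absolute residual / 0.6745
        scale = statistics.median([abs(r) for r in resid]) / 0.6745
        if scale == 0:
            break  # at least half the points are fit exactly; nothing left to reweight
        cutoff = tune * scale
        # Bisquare weights: near 1 for small residuals, exactly 0 beyond the cutoff
        ws = [(1 - (r / cutoff) ** 2) ** 2 if abs(r) < cutoff else 0.0
              for r in resid]
        new_intercept, new_slope = weighted_line_fit(xs, ys, ws)
        change = abs(new_intercept - intercept) + abs(new_slope - slope)
        intercept, slope = new_intercept, new_slope
        if change < tol:
            break
    return intercept, slope, ws

# Made-up data: a line y = 1 + 2x with small wiggles and one extreme outlier
noise = [0.1, -0.1, 0.05, -0.05]
xs = list(range(20))
ys = [1 + 2 * x + noise[x % 4] for x in xs]
ys[-1] += 60
intercept, slope, weights = robust_line_fit(xs, ys)
print(intercept, slope, weights[-1])
```

On this toy data the outlier should end up with a weight of 0, so the fitted line follows the other 19 points. For an exponential relationship, one option would be to apply the same reweighting to log(y) against x, though that only works when every y is positive.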
I am going to ask BRORB for Matlab for the lab, I think; it would be worthwhile-- I managed to get a very good fit with an exponential and with R, but only because I used a process I learned at the Clem to get out some of the "ghost outliers" (overly influential points within the data). Still, there were parts of the process that were overly tedious in R, especially the stuff looking for the parameter guesses...
By the way THANKS... that was the problem overall-- my guesses were WAY off. When I really thought about it and tried stuff against the chart, I got some better guesses that let R converge!
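One way to take some of the tedium out of hunting for parameter guesses, sketched here in plain Python rather than R: if the model is assumed to be y = a * exp(b * x), taking logs gives log(y) = log(a) + b * x, which is a straight line, so an ordinary line fit to (x, log y) gives rough starting values for a and b. The model form, the function name `exp_start_values`, and the toy data are assumptions for illustration; this is not the process described above.

```python
import math

def exp_start_values(xs, ys):
    """Rough starting guesses (a0, b0) for y = a * exp(b * x).

    Fits a straight line to (x, log y) by ordinary least squares.
    Points with y <= 0 are skipped because their log is undefined.
    """
    pts = [(x, math.log(y)) for x, y in zip(xs, ys) if y > 0]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    ml = sum(l for _, l in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (l - ml) for x, l in pts)
    b0 = sxy / sxx                 # slope of the log-line is the rate b
    a0 = math.exp(ml - b0 * mx)    # intercept of the log-line is log(a)
    return a0, b0

# Made-up, noise-free data generated from a = 3, b = 0.5
demo_x = [0, 1, 2, 3, 4]
demo_y = [3 * math.exp(0.5 * x) for x in demo_x]
a0, b0 = exp_start_values(demo_x, demo_y)
print(a0, b0)
```

These values would then go in as the starting point for the nonlinear fit, for example through the `start` argument of R's `nls`. With real, scattered data the guesses will only be approximate, but they are usually a much better starting point than numbers picked by eye.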