
Halos Heaven

FanShot

Interesting read especially on a slow news day

67 comments

Comments

nice.
Chris Iannetta is projected to be worth 2.4 WARP, the 11th-best total for a catcher. More significantly, that would be a two-and-a-half-win upgrade over Jeff Mathis’ projection. (PECOTA gives Mathis a .206/.261/.324 projection for the Blue Jays, which is actually generous considering his past two seasons.)
BULLSHIT

Mathis will only hit .206 if he is DFA'd by May

Just like it's still kind of surreal to think that Albert Pujols is an Angel(!!!)

It’s going to be weird having a catcher that can hit who Scioscia will actually play.

Mathis

Will hit .333 on opening day.
He will then eat a bad burger and get sick for a day.
He will go on to finish the season with a .176 BA.
Lyle Spencer tells the fans Mathis suffered his illness and was never the same afterwards. He’ll bounce back, you’ll see.

Now, come on, you know that's not true.

It’ll be bad poutine.

And if he does, I'll laugh.
Pujols=7.5 WARP?!

In the immortal words of Clay Davis, “Sheeeeeeeeeeeit.”

forgot to wipe

had to say it twice

I forgot to WARP.

Sheeeeeeeeeeit.

The thing that pisses me off most about these projections is they have Wells putting up abysmal numbers AND say that Mike Trout will not find a lot of playing time.

Hopefully they’re way off on one of those two points.

Was just here to post this

you can’t have it both ways in a projection system: either Wells plays well, and Trout has little playing time, or Wells plays terribly, and Trout has a lot. It’s going to be one or the other.

Also, there’s nothing written there about Kendrys Morales? He may be the biggest question mark in 2012, but he’s also the most likely candidate to be that big bat with Pujols if he’s healthy. Does anyone have the projections for him?

There's also nothing written about Ervin Santana.

The “big three” are mentioned, but no Santana.

Santana’s one of the biggest talents we have, but seems to get no respect.

True

Santana would probably be the #1 on about half the teams in the league.

I love voodoo

but he could still go out and post an ERA in the mid 4’s, he’s done it before.

I hope something did click last season, though. Voodoo at the top of his game is a hell of a #4

Voodoo at his top

Is a hell of a number 1 for that matter.

They aren't designed for that kind of analysis

PECOTA doesn’t calculate interactions within a 25-man roster. That kind of computation isn’t even possible without massively expensive resources, and even if you did do it, it’s not obvious that it would really tell you anything more than the basic approach. Baseball projections just use simple criteria to match each player with some historical data set. It’s the kind of thing you could do on your laptop with a copy of Microsoft Office.

As you can imagine, the historical data for 20 year-old center-fielders is pretty limited, so even though Trout’s projected stats look modest, there is a huge possible variation. Wells is probably in the same category—unfortunately, there just aren’t very many players worse than him who still get to play at his age.

How does the system project plate appearances at all, then?

that must be a variable controlled by whoever is inputting the data, right?

The same way it projects everything else

PECOTA looks through a player’s past statistics and tries to fit them to a certain profile. Then it guesses what that player will do based on what similar players did in the past. Trout is being projected for so little playing time because, historically, 20 year-old players don’t get very much playing time. There might be a “sanity check” to prevent the system from projecting, like, 4000 plate appearances at third base for the same team, but generally, a player’s projection is based on the historical data.

Where are Morales' projections, then?

did they just decide not to run a projection for him?

It just seems like such an easily fixed “flaw” of sorts to adjust these things.

The Morales projection was easy.

The computer just doubled Kendrys’ 2011 stats, then divided by two.

The OC Register decided not to report it

PECOTA runs projections for just about every player, including players in the lower minors. I can’t really tell what you’re objecting to at this point. You don’t “fix” things that aren’t part of the original design.

I wasn't "objecting" to much of anything. I was posing a question I didn't have an answer for

But in the case of Trout, yes, the fix is rather easy.

If you’re going to run projections for players, it’s worthwhile to run simulations that accurately reflect likely plate appearances. And in doing that, using just similar age comps isn’t adequate enough. For getting rate stats sure, maybe, but not for counting stats.

PECOTA just seems like it’d be infinitely more interesting (and maybe even more relevant) if it took other variables into account when guessing plate appearances. I’m not asking it to change how it chooses player comps, or how it evaluates MLE stats. Everyone and their mother knows Trout isn’t just any 20-year-old kid. And everyone knows the man blocking him from playing time is a month-long slump away from riding pine, while Trout is the #1 option on the depth chart behind him. Why not just force the computer projections to extrapolate their stats across a minimum number of plate appearances? Or even tell the computer it should expect more than ‘x’ number of games from a player, and ignore what other 20-year-olds in the past got? It doesn’t break the system’s evaluation of rate stats, but it does allow it to grant “exceptions” for younger players with high probabilities of increased playing time.

Over the last 20 seasons

Only seven players ever got more than 100 PA as a teenager. So these are Mike Trout’s only comparables, along with the number of PAs they got in their second season after reaching the major leagues:

Adrian Beltre: 614
Andruw Jones: 467
Edgar Renteria: 691
Alex Rodriguez: 149
Ivan Rodriguez: 123
BJ Upton: 0
Justin Upton: 417

The average is about 352 PA, and that’s probably why PECOTA projects a “half season” for him. Of course the variation is huge; Renteria and Beltre played every day, while BJ Upton spent his entire age-20 season in the minors. But this tends to happen with a sample size of seven. That’s just the nature of a projection, though: it’s not a “prediction,” it’s a statistical average with a varying degree of confidence, depending on how much data is available.
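For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the seven second-season PA totals listed above. The variable names are made up for illustration, and this is just a summary of those seven numbers, not how PECOTA itself works.

```python
from statistics import mean, median, pstdev

# Second-season plate appearances for the seven comparables listed above.
second_season_pa = {
    "Adrian Beltre": 614,
    "Andruw Jones": 467,
    "Edgar Renteria": 691,
    "Alex Rodriguez": 149,
    "Ivan Rodriguez": 123,
    "BJ Upton": 0,
    "Justin Upton": 417,
}

values = list(second_season_pa.values())
average_pa = mean(values)    # 2461 / 7, a bit under 352
middle_pa = median(values)   # the middle value of the seven
spread_pa = pstdev(values)   # population standard deviation of the seven values

print(f"mean: {average_pa:.1f} PA")
print(f"median: {middle_pa} PA")
print(f"std dev: {spread_pa:.1f} PA")
print(f"range: {min(values)} to {max(values)} PA")
```

The spread is large relative to the mean, which is the point being made about a sample of seven.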

Yes, it would be nice if we had a more sophisticated system. Think about what you’re proposing though. If you start making forecasts for one player based on other players, then where do you stop? Is it really that obvious that Mike Trout will get whatever PAs Vernon Wells doesn’t get? You can’t factor in Mike Scioscia’s love for gritty veteran-ness, because then you have to make similar judgments for every other manager. It’s also possible that Vernon Wells has a pretty good season; that has to be accounted for too. Maybe he only gets used against left-handed pitchers, which would inflate his batting numbers. How would that reflect on Trout’s projection?

Suppose Mike Trout gets injured, or suppose DiPoto decides that he’d rather trade for an experienced outfielder mid-season rather than push the rookie. Then it gets even worse. Now you have to make decisions like this for every player in the league. Mike Trout’s playing time doesn’t just depend on Vernon Wells, it depends on every other player who might possibly fill that position, and all of their projections also depend on Mike Trout. This just isn’t a tractable problem. Even if you could do it, you’d probably get a number not very far off from the PECOTA average, because of all the uncertainties.

I’m pretty sure this is why the people who make things like PECOTA just select a method and stick with it for everything. Otherwise you end up having to justify millions and millions of arbitrary assumptions about what counts and what doesn’t. These are all proprietary systems, though, so I admit that I don’t really know what’s going on in detail. This is just based on my own experience with other kinds of modeling. Dan Szymborski seems to have radar for this kind of discussion, so maybe he’ll chime in with his thoughts.

Or

since a lot of projection systems already offer projections for different ‘percentiles,’ then they can do the same thing for players with high probabilities of large PA variations.

My specific knowledge of things in PECOTA is pretty limited (I get along well with Silver and Wyers, but I’m one of the last people they’d give detailed information to, for obvious reasons).

I know I use a very large group of comps and a different framework – I use fuzzy clustering methods to find comps, while I’m under the impression that PECOTA more heavily uses naive Bayes classifiers in its model. ZiPS has minor league data going back 45 years now, so there are a lot of 20-year-olds in ZiPS. Don’t know how PECOTA uses minor league translated data – an awful lot of youngish middle tier prospects seem to have Hall of Famers as comps.

I just don’t feel that it’s a great use of resources to do that.

Computer projections are the most useful when they’re doing what computers do best: finding patterns in macro data and putting what the numbers mean into context objectively. Humans are wired rather poorly for this task – we’re biologically designed to see patterns where none exist.

For the situation on the ground at the micro level, a human can better deal with the more intangible aspects of playing time. My policy for projections is to let computers do what computers do best and let humans do what humans do best.

In fact, I feel that having the computer pick actual major league time is not only a futile task, but possibly counterproductive as it removes information from the equation. I want a computer to give me a data-driven opinion about how good Alexi Amarista or Tyson Auer are, not that Amarista will get a little bit of playing time and Auer none.

I rather enjoy these projections for Morales, though:

http://www.fangraphs.com/fanpdetails.aspx?playerid=8610&position=1B

Bill James has him at a .296/.341/.504 mark with 24 DINGERS and 56 XBHs.

I think, assuming he is healthy, that is very reasonable and would be a very good year for him. Got to think he doesn’t play more than 140 games even if he is healthy, between days off and having someone else in the DH role at times when Pujols is still at first.

I just hope the Angels don't go 161 and 1...

that didn’t work out too well for the Packers this year…

They should go 91-71

and win their division by one game.

That was the way to do it this year.

in that case....

we should have gone after Pujols’ younger brother

But Pujols already knows how to perform in the clutch/postseason
yeah you're right.....

the analogy fails pretty quickly, unless Andy Reid is the fattest coke head ever

Morales is going to hit bombs in front of or behind Pujols. Trumbo will get enough ABs.

Not only because of the 3rd base experiment, but because I’m expecting our OF to go: Trumbo, Bourjos, Trout for a decent number of games after the All-Star break, mostly thanks to Torii and Wells’ hamstrings.

VW's hamstrings are too blessed to get pulled.
they won't be stressed either

because slowly jogging toward the ball and catching it on a hop is not very hard on the body

I'm not buying the "Only 2 players with more than 15 HR."

If Morales plays, he’ll hit more than that. If Trumbo gets more ABs, he will. Kendrick, Hunter, Wells, Iannetta all have the power to break 15 easy… especially whichever one gets the most pitches to hit via being protected by the Pujols/Other Power guy. I am almost 100% positive pitchers are going to be less likely to throw three straight low and away to Howie (his kryptonite) if Pujols hits behind him.

certainly more than two players have the pop to hit more than 15 dingers

Trumbo, Morales and Wells are good for 25-30, but none will play all season

One has to remember that projecting 2 players with more than 15 HR doesn’t necessarily mean it’s projecting that only 2 players will hit 15 HR. If, for example, a system projected all 9 members of the lineup to hit 14 HR, it wouldn’t be projecting zero 15-HR seasons; it’s probably projecting 4 or 5. Mean projections are just the midpoint of the distribution of results. A projection system isn’t saying a guy projected for 10 HR won’t hit 15 HR, just that he’s more likely to not hit 15 HR than to do so.
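A rough way to see the nine-guys-at-14-HR example in numbers: the Python sketch below assumes each player's home run total follows a Poisson distribution around his mean projection. That distribution is purely an assumption for illustration (real projection systems use their own, usually wider, distributions), and the function name is made up.

```python
import math

def prob_at_least(threshold: int, mean_hr: float) -> float:
    """P(X >= threshold) when X is Poisson with the given mean (an assumed model)."""
    # Sum the Poisson probabilities for 0 .. threshold-1, then take the complement.
    below = sum(
        math.exp(-mean_hr) * mean_hr**k / math.factorial(k)
        for k in range(threshold)
    )
    return 1.0 - below

# Hypothetical lineup from the comment: nine hitters, each projected for 14 HR.
lineup_means = [14.0] * 9

p_each = prob_at_least(15, 14.0)
expected_15_hr_seasons = sum(prob_at_least(15, m) for m in lineup_means)

print(f"chance one 14-HR hitter reaches 15: {p_each:.2f}")
print(f"expected number of 15+ HR seasons: {expected_15_hr_seasons:.1f}")
```

Under that assumption each hitter has a bit less than a coin flip's chance of reaching 15, so the expected number of 15+ HR seasons lands around four, nowhere near zero.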

CommanderNate's young outfield reminds me of a question.....

I’ve seen a lot of bad defenders stashed in left field for their bats. But is left field really easier to play than right? And in our case, with scouts saying Trout has an average to effeminate arm, wouldn’t Trumbo work better in right, since they say he can throw a ball through the backstop from out there?

I was just wondering if a more skilled observer of outfielders could tell me why it seems they hide pseudo DHs in LF more than RF when it seems the ball will get hit out there more.

I agree, that for an outfield with Trout, Bourjos and Trumbo

Trumbo belongs in right. On the few opportunities we’ve seen him throw first to third, or even first to second, you can tell the guy has a decent arm. I would venture it’s already better than Trout’s.

Also, something has to be said for having comfort in staying on the same side of the diamond. 1B to RF is probably easier than 1B to LF. Same background on flyballs and the ball tails the same way.

Don't they decide whether to put a poor outfielder in LF or RF based on throwing ability?

If he has a strong arm, put him in RF. If not, put him in LF.

That would make sense to me.

It just seems like a lot more of them end up in LF. But that could be why. Or maybe, because a righty hitting to RF is going the other way and the ball tails off strangely more often, the plays in RF are more difficult more often, while in LF the fielder has more chances, but they’re more straight-in-front-of-him singles?

I don’t know. It’s just been on my mind because we’re concerned about the 1B/DH log jam, when it seems that so many teams survive fine with a bomber parked in the outfield with his bad defense.

I'm so pumped for this year

Pujols will win the MVP, Kendrys will win comeback player of the year, and Howie will break out. Never been so excited (unless you count the 2003 season which I don’t)

Better get Howie some Proactiv

Da-da dun

Let's hear some Morales projections!
.209/.209/.750

All hits are HRs

I'd take it

But the slugging would be more like .836
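The correction checks out: if every hit is a homer, slugging is four times the batting average. A quick Python sketch, assuming no walks, HBP or sac flies (so OBP equals AVG) and using a made-up 209-for-1000 line just to get a .209 average:

```python
def slash_line_all_homers(hits: int, at_bats: int) -> tuple[float, float, float]:
    """AVG/OBP/SLG when every hit is a home run and there are no walks, HBP or sac flies."""
    avg = hits / at_bats
    obp = hits / at_bats       # nothing but hits reaches base, so OBP equals AVG
    slg = 4 * hits / at_bats   # each home run counts for four total bases
    return avg, obp, slg

# Hypothetical 209-for-1000 season, chosen only to reproduce the .209 average.
avg, obp, slg = slash_line_all_homers(hits=209, at_bats=1000)
print(f"{avg:.3f}/{obp:.3f}/{slg:.3f}")
```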

DNP

sorry for the buzzkill

DNR?
They are trying to revive

Does Not Play in ’12 is my fear…

Projections are a Cylon trick.

To mess with your heads. Don’t fall for it!

Damn One god believers

Where is Starbuck when you need her

Don't worry

Groundhog saw its shadow… winter is coming… Errr, is still here for another couple weeks!

On a side note the walking dead will hold me over until spring training.

God's gift to television returns on Feb. 19.

EB&D

Other shows are good. EB&D is the alpha and omega, the Spiritus Sanctus.

That should hold you over.

Easymeat Bangstick & Douchswab?

Huh show is you talkin bout? and can it be all that good if it doesn’t have Guy Fieri in it?

DID YOU see the new set photos at winteriscoming.net?

yes walking dead is right around the corner
but mayhem is righter
EASTBOUND AND DOWN next weekend
Fuck YA

Myrtle Beach Mermen

Best baseball team ever.

Game of Thrones rules and I love it…but EB&D is a perfect television show and possibly my favorite of all time (at the very least, it’s tied with Deadwood for that honor, if not outright #1).

My List:

1) BSG until they found earth
2) Simpsons
3) GOT
4) EB&D
5) Fringe
6) Always sunny
7) Firefly
8) South Park
9) Jersey Shore
10) Once upon a Time

And they look like us now....

…. except all their women are built hotter than ours.

Reply fail. I mean the Cylons.
Another math formula to predict the (possible/probable) future.

Based on the past. Wait, what??? Oh well, carry on…….
