Tuesday, September 29, 2015

The offense is perfectly fine, but is it? - Part 1

The Nats offense is problematic. It's hard to see it that way as it should end up as one of the Top 3 scoring offenses in the National League, but as I explained before there are a couple of points that are ignored by just looking at RS/G. One point is how dependent the offense is on Bryce Harper. Another point is that while the Nats are in the Top 3, the distance between the league average and the top teams is the smallest it's been in years. A third point is that the distribution of runs scored matters. It's this last point that I'll tackle today.

Usually when you deal with trying to figure out how many wins a team "deserved" you look at their Pythagorean Record. This takes the runs they've scored and the runs they've allowed and projects a winning percentage. But the statistic breaks down when your runs scored or allowed don't follow the usual distribution.

For example, let's say you are a team that alternates scoring between 0 runs a game and 20 runs a game. Game 1 you score 0 runs, Game 2, you score 20. Game 3, 0, and so on. Let's say it happens for a whole year. The pythagorean method will see a team that averages 10 runs a game and probably project that the team should have 145+ wins or something. But obviously the team will have something very close to 81 wins, as it can't win a game it scores 0 runs in and will almost certainly win every game where it scores 20 runs. The pythag will say the team is 60+ games "unlucky" but that's very far from the truth. The Nats "should" have 85 wins by the pythag. Were they unlucky? Or is that far from the truth?
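Here's a rough sketch of that extreme example in Python. It assumes the classic Pythagorean formula with an exponent of 2 (other versions use something closer to 1.83), and it assumes the imaginary team allows exactly 4 runs every game. That runs-allowed figure is made up for illustration, so the projected win total will move around depending on what you plug in.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Classic Pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

GAMES = 162

# Hypothetical team: alternates 0 and 20 runs scored.
scored = [0 if g % 2 == 0 else 20 for g in range(GAMES)]
# Assumption for illustration only: the team allows 4 runs in every game.
allowed = [4] * GAMES

# What the pythag projects from season totals.
pythag_wins = GAMES * pythag_win_pct(sum(scored), sum(allowed))
# What actually happens game by game.
actual_wins = sum(rs > ra for rs, ra in zip(scored, allowed))

print(f"Pythagorean wins: {pythag_wins:.1f}")
print(f"Actual wins:      {actual_wins}")
```

Under those assumptions the pythag lands around 140 wins while the game-by-game count is 81, which is the gap the example is getting at.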

Let's take a look at the distribution of runs scored for the Nats. How many times did they score 0 runs and what are the chances of winning that type of game? How many times for one run, etc. etc. It's still not perfect by any means, but it can correct for odd distributions of runs scored, as we see in that extreme example I noted above. What do I see when I do that for the Nats? I see an offense that should have won, based on the distribution of runs scored in a game, 80 games. (79.97 to be exact).  The Nats have won 80 games. We can do the same with pitching and it shows the same thing based on the runs allowed.  The Nats should have won 80 games (80.394) and they have.

The reason this works is that scoring your first run does little to help your chance of winning, and the same goes for scoring your 8th run and beyond. You are certain to lose a game you score 0 runs in. Add 1 run to go from 0 to 1 and you are still almost certain to lose that game (~91% chance). You are almost certain to win a game you score 7 runs in (like ~85%). Adding one run to go from 7 to 8 (or 8 to 9, etc.) barely improves those chances. Basically, scoring that 8th run and beyond pads your pythag win total without adding many actual wins. Which team in the NL has scored 7 or more runs the most times? The Nats. 37 times.
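The distribution method itself is simple enough to sketch in Python: look up the chance of winning for each game's run total and add them up. The win chances for 0, 1, 4, 5 and 7 runs below are the rough figures quoted in this post and its comments; the values for 2, 3, 6 and 8+ are placeholders I invented so the table is complete, so treat the totals as illustrative rather than real.

```python
# Chance of winning a game given the runs scored in it.
# 0, 1, 4, 5 and 7 use the rough figures quoted in the post and comments;
# 2, 3 and 6 are invented placeholders for illustration.
WIN_PROB_BY_RUNS = {0: 0.00, 1: 0.09, 2: 0.22, 3: 0.38,
                    4: 0.55, 5: 0.68, 6: 0.78, 7: 0.85}
WIN_PROB_8_PLUS = 0.90  # invented placeholder for 8 or more runs

def win_prob(runs):
    """Win chance for a single game with this many runs scored."""
    return WIN_PROB_BY_RUNS.get(runs, WIN_PROB_8_PLUS)

def expected_wins(runs_per_game):
    """Sum the per-game win chances over a season's worth of run totals."""
    return sum(win_prob(r) for r in runs_per_game)

steady = [4] * 162       # 4 runs every single game
boom_bust = [0, 8] * 81  # same 4.0 average, but only 0s and 8s

print(f"steady:    {expected_wins(steady):.1f} expected wins")
print(f"boom/bust: {expected_wins(boom_bust):.1f} expected wins")
```

Both hypothetical teams average 4 runs a game, so RS/G can't tell them apart, but the boom-and-bust team comes out well behind once each game is counted on its own.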

Does this really mean the Nats haven't gotten unlucky, though? Well, it depends how much you buy into luck as a factor. By ignoring the actual distribution, the pythag method of win expectation is implicitly assuming a distribution close to expected for your RS and RA, and a random distribution of RS and RA per game. In other words you don't bunch your RS/RA at the high and low ends and you don't happen to match up your high scoring games with your low allowed score games. Either of those situations skews the results. But let's look at those assumptions.

Is the distribution of RS and RA in individual games random? This has been looked at and the answer is yeah, pretty much. Let's call this the "Jack Morris" point. Teams don't seem to score more when they allow more, or score less when they give up less. Or looking at it the other way, they don't seem to allow more when they score more, or allow less when they score less. The distribution of runs does appear to be close to random. So in that the pythag has things right.
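One simple way to check this kind of thing yourself is to compute the correlation between runs scored and runs allowed across a team's games. This is only a sketch of the idea, not necessarily how the studies I'm referring to did it, and the ten-game log below is made up purely for illustration (it is not real Nats data).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up game log for illustration only: one entry per game.
runs_scored = [0, 7, 3, 1, 8, 4, 2, 5, 0, 10]
runs_allowed = [3, 4, 2, 5, 3, 6, 1, 4, 2, 4]

r = pearson(runs_scored, runs_allowed)
print(f"correlation between RS and RA: {r:+.2f}")
```

A value near zero over a full season of real game logs would be consistent with RS and RA being independent game to game; a clearly positive or negative value would suggest the pythag's assumption is off.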

Is the bunching of RS (or RA) unusual? This is less clear. If a starter is doing poorly runs can come around very quickly. It's known that HRs tend to bunch. Also teams tend to use their worst pitchers in games that get out of hand. You have a big deficit, you trot out the bad pitchers on a team, the soft underbelly. If they fail you tend to follow with even worse pitchers. It would stand to reason that there are times/teams that can bunch runs, at least on the high end.

But then there is another thought that I have and it's harder to pin down. Each batter has a chance of doing well against each pitcher and for the most part it has to do with how the pitcher is doing that day. Bryce Harper has a certain chance to hit "On Scherzer" and a certain chance to hit "Off Scherzer", as does Dan Uggla. These chances are certainly going to be overwhelmed in an individual at bat by other factors, but in a game or certainly across several games, you can imagine it stabilizing. At this point here no one can hit and you lose. At that point there everyone can and you win. Now assuming that, what if your team is made up of players all at a similar level of offensive talent? If that were the case, rather than a slow shift from where you can't hit to where you can, that shift may be dramatic. When certain pitchers cross certain thresholds the combined effect on the line-up could be like flipping a switch. That type of line-up would tend to see bunching. It's kind of like the "type of hitter" idea, where if you have the same type of hitter you may be easier to pitch to by good pitchers, but based on talent, not approach. I don't know if this is measurable. I don't know if it's measurable even if it is something. But if it is something then the Nats, as a team that bunches RS on the high end and has more games scoring 2 or fewer than expected, may have it. I don't know. I'm definitely rambling now.

Anyway this is all a way to say the offense wasn't necessarily "ok" this year, despite the Top 3 ranking it'll end up with. It didn't outscore opponents in a way you'd normally expect given its ranking because of the unusual grouping of teams this year. It also didn't score runs game by game in a way that you'd normally expect, instead scoring runs in bunches, which isn't as helpful to winning as having a lot of 3-4-5 run games. Is this something to worry about next year? Will the offense be boom and bust again? Was that a product of injuries? Good questions. Let's look at part 2 though, first: how the offense looks taking Bryce out of it.

24 comments:

  1. Bryan 8:56 AM

    The Nats have enough offensive talent to really get after a bad pitcher or one struggling that day, but not enough to overcome a decent or great day, sound about right?

    I talk about players having ranges a lot. The Nats' range appears to be good but limited height (Harper excluded), with a pretty low bottom, but very deep across. So if you catch a pitcher on an off day, especially if everyone is at the high end of their range, it's a tough, tough line up. But otherwise, it's not hitting very well because there are too many troughs in the lineup.

  2. I think you covered this a bit in June, Harper. The Nats offense is weirdly mostly boom or bust. I don't know how the Nats would fix that though. Does it require a roster move or can the current players be "fixed"? Wouldn't this require a deeper dive into the individual games to see if there are any trends in the pitching when they scored a bunch of runs vs 0 or 1?

  3. Chaz R - Definitely requires a deeper dive just to see if anything is there. And even if it is...well the thing is - if the talent is broad and moderately talented it might be good enough to win a division, maybe even easily. I think that's possibly what we saw in 2014. But it may also be more prone than the usual team to be eaten up in the playoffs. At the same time the playoffs are only 20 games long so simple luck with hot streaks, bad pitcher days, or right opponents is probably just as important. Do you spend millions on something for potential minimal playoff gain?

    I think really the "fix" is getting a healthy reliable bat (trade for Todd Frazier is a good example) to replace the healthy usually reliable Desmond, making bench no worse than last year, and being ready to spend money if Zimm/Rendon/Werth or someone new goes down. Unless you can somehow finagle a star from someone else (costly) you have to hope that say Rendon hits that spot between good and BRYCE or maybe Zimm does with a healthy year etc.

  4. Before you succumb to the same madness of some of the compilers of the Oxford English Dictionary, consider that there may be less in the Nats' offense than meets the eye.

    Countless posts this season have noted that the Nats pound bad teams but struggle against good ones. @Bryan, I believe, sums it up succinctly.

    Identify--not that difficult--the offensive albatrosses and replace them with someone better (more difficult).

  5. The Nats have a number of hitters who don't really vary their approach or appear to be guess hitters. Desmond, Espinosa, Ramos, Zimmerman to some extent. I imagine those guys would have a bigger than normal delta between performance against crummy pitchers and performance against good pitchers. Good pitchers have multiple ways to get you out, and feast on guess hitters and hitters who swing from their heels all the time.

  6. SM - probably right. The gains possible by improving the Ramos/MAT/(Moore/Uggla bench spot) are more easily obtained (but not easily obtained - if that makes sense). They are bad, and fixing bad is easier than improving good. Does that mean they still might be buzzsawed in the playoffs by a good 1-2? Sure. But unless you can find one or two mini-Bryces out of thin air what are you going to do about it? This team has missed the playoffs twice in three years, it can't be worried about how it does when/if it gets there again.

  7. Jumping the gun here on tomorrow's post, but I ran some numbers back on September 2.

    Nationals with Harper: .251/.320 (AVG/OBP), good for 21st/13th in MLB. Without Harper: .242/.302, good for 29th/29th in MLB.

  8. Harper, since you've already run the numbers, what was the W-L breakout on games where the Nats scored 4 runs (which I would call a probable win) and 5 runs (a likely win)? My impression is that the Nats lost a higher percentage of these games than they have in recent seasons, particularly in the second half as both the starters and the bullpen struggled. In the first half, it seemed more often that the hitters weren't scoring enough in well-pitched games. Both scenarios combine to produce, well, mediocrity.

  9. And on 4 runs OR MORE, and 5 runs or more. In previous years, the Nats were practically unbeatable when they hit 5 runs. MASN used to flash a stat about it. But they weren't this year, particularly in Aug.-Sept.

  10. Yes, it is. The last time I doubted I've lost 100 dollars, coz didn't listen the experts from Bet4Rate! Now they are telling the same.

  11. ric - yeah it's impactful BUT I'm gonna have to pull every team's best player to make it fair so... it's going to be some work

    KW - RS4 in the NL wins ~55% of time, RS5 is ~68%. Nats are .550 and .522. So on target for 4, but well under for 5. They were pretty much on target for most RS situations except for 5 RS. Of course that's only like 3 wins or so. Given the distributions matched up to the actual, they probably picked up those 3 wins in tiny bits and pieces here and there.

  12. You have just mapped the Nats' road ahead in one neat epigram: "Fixing bad is easier than improving good."

    I would include an addendum: How good are the Nats at distinguishing between good and bad?

  13. How about "does the budget allow for fixing bad or improving good without creating bad or diminishing good?" The answer in 2015 was no.

  14. Anonymous 10:30 AM

    Mark Lerner's band: Beyond Tapped Out.

  15. Ideally I would have a stable of 10 starting-quality players rotating through 7 starting positions (Bryce starts everyday). Then have 1-2 scrubs fill out the bench if needed. In theory starting-quality players would want to play everyday, but I don't see how you couldn't pitch it as "Our guys are ALWAYS hurt, you will definitely get 300-400 ABs". I would love the plan to be that a healthy Werth, Zim, Rendon, and Ramos play 110 games and the rest of those games/ABs go to the other rotation parts. So you're rotating (for example): Werth, Span, [insert quality starter like CarGo or whatever] in two outfield spots, with MAT as your 5th OF. Rotate Rendon, Zimmerman, Escobar, Espinosa, [quality starter TBD] among IF spots, with Turner as 6th IF (or swap Espinosa/Turner if you want). Plan to give everyone 400-ish ABs and know that injuries will probably make that happen by default. I realize this will never happen because it's what forward-thinking teams might try.

    Is there a way to calculate OPS+ (or something) based upon game situation, like the Treinen exercise (in futility)? I'm thinking about Barnwell's DVOA when the game is within one touchdown (instead of a blowout).

  16. The stats proclaiming the Nats among the top scoring teams in MLB are just a reminder that as Disraeli said, "There are lies, there are damned lies, and there are statistics." I'm not impressed or swayed by stats showing how many runs the Nationals average. I watch the games. I see how many they actually score. Team A that averages three runs a game will beat Team B that averages three and a half runs a game three out of four times if Team A's scores are 3, 3, 3, and 3 while Team B scores 2, 2, 2, and 8. Team A is like the Cardinals. Team B is like the Nationals....except that too many times the Nationals' run distribution is more like 0, 1, 2, 13.

    This team needs bats....bats with power and run-manufacturing ability. I called it last winter, and I'll still sing this chorus: Rizzo could afford to let Adam LaRoche go from a defensive standpoint, but he did absolutely nothing to replace the man's 30/90 numbers of offensive production. Now he's going to let Desi go as well. I'm happy to see the errors and strikeouts go, but his 20 dingers and 75 RBIs now also have to be replaced. Trea Turner's not going to do that, not yet anyway. Down the road perhaps, but not now.

    The outfield is pretty much set, but Bryce will have to beat the odds to have another 40+ homer season. Who knows if Werth can stay healthy and even if he does he's trending only about 20 homers. Span is aging if he returns, and Taylor is still learning how to hit a curve ball. The only place to bring in some lumber is the infield. If Rizzo doesn't make a blockbuster trade at the winter meetings to get a Todd Frazier or Matt Carpenter (or, as much as I hate to say it Manny Machado) type power hitting infielder on the roster, this team will struggle more than ever to put enough runs on the board night after night to win 90 games, I don't care if Cy Young, Sandy Koufax, and Bob Gibson are your starters.

  17. ProphetNAT 11:53 AM

    Now because you can't quantify/calculate it, readers of this blog might find this dismissive: It is all in the hitter's approach. When Tom Koehler or Matt Wisler is on the mound, you really can swing at every fastball you see and expect good results. You can also expect a handful of hanging breaking balls and a few walks scattered in there as well. All this leads to a nice day for the hitter. There is not much to think about other than "see ball-hit ball."

    But when the Nats run into a Kershaw or a Bumgarner, or even a pitcher on a good day, you won't get many of those opportunities. You're not going to tee-off and hit 5 homers, get a handful of walks and punish their mistake pitches. You're also not going to string together hits as easily. This is where situational hitting is PARAMOUNT. It's not easy, hence why these guys are Cy Young candidates. But this is what you face when battling a division rival with 3 aces on the hill coming out of the trade deadline - and what you face in October. Every pitch matters, so to speak. As evidenced by their manager, the mantra is "don't over-complicate things, hit the baseball, if not - we'll play again tomorrow." Keeping calm is important, so as not to get over anxious in the moment. But failure to recognize that all you need to do to get the runner in from third is to at least put the ball in play (Wilson!) is poor performance. Instead, they'll strike out and get to say 162 times "we'll play again tomorrow."

  18. @Prophet: There's gotta be a way to quantify situational hitting. There's definitely a way to coach it... e.g. the way Hensley Meulens had the SF Giants hitters moving progressively closer to the pitching machine in the cage, helping them work on making contact against late inning heat. You'll shorten up in a hurry if you're taking 90 mph fastballs from 40 feet away.

  19. Anonymous 4:18 PM

    How is the distribution of RS and RA for the Nats compared to other teams like, say, the Mets?

  20. Non-trolling Mets Fan 5:07 PM

    Anon 4:18, and Harper too, of course...

    If you haven't seen it, baseball-reference has some intro-level RS/RA data on each team's box score page:
    http://www.baseball-reference.com/teams/WSN/2015-schedule-scores.shtml

  21. I think part of it this year is how RH the line up was. LaRoche helped last year bc he balanced out the line up. I think the Nats need some LH bats. I think that is part of why the Mets chewed us up so badly in the second half (Harvey's 7-1 game notwithstanding). Their top pitchers are all RH. Also, the Nats have a very bad habit of making average pitchers look good. Case in point the Giants last year. Hudson and Peavy pitched a total of like 4-5 innings in the World Series bc they were pretty much done. The Nats made them look like Greinke and Kershaw.
    This again brings me back to MAT. He may end up being a very good CF, but Harper in CF and Carlos Gonzalez in RF is a whole lot better. Even Gerardo Parra in CF is helpful bc he balances out the line up. Daniel Murphy at 2b next year hurts the infield defense but again he is a guy who can hit and is LH.

    Finally, if the Nats have learned nothing else it is that big money spent should not impact other roster decisions. Having Clippard and Blevins (assuming he would have been healthy) for this year would have been worth every penny. Spending big money on Scherzer does not mean you don't have to spend on BP. So either suck it up and spend like you should or don't bother. No one asked the Lerners to buy the team. It is an ego thing. No one buys a sports franchise bc they want to just help the community. Plus the stadium was publicly financed anyway.

  22. BornInDC 7:52 AM

    My own thought about the Nats' offense is that the boom and bust quality of it may have something to do with the significant number of low OBP players that appeared in the lineup over the season. The low OBP players act like an extra pitcher in the lineup to stop runs from getting home when the high OBP players get on base. The Nats then get big innings when at least one of the low OBP players gets on base and doesn't stop a potential rally, thereby allowing the Nats' decent supply of power hitters to create RBIs.

    I have no data to support this opinion, but I think about the dramatic effect that the presence or absence of Denard Span has on run production and, I bet, on the number of innings in which runs are scored. Span being in the line up eliminates one of the low OBP players so you can have more innings with scoring.

  23. @Harper

    I was going to hold off commenting until I had a chance to gather some data. Unfortunately, I'm in the midst of home improvement chores, and even baseball must take a back seat to the higher authority - my wife.

    How are you certain the Nats' runs distribution is an outlier? I was going to spreadsheet MLB as a whole, and then the Nats, fit a distribution to the former and then determine at what confidence level the null hypothesis is rejected. If it's the variance at the tails causing the test to fail, then there are techniques for handling that variance at 2 or more standard deviations, without tossing the data points.

    Anyway, it's an interesting take on runs to wins correlations. I'm just not yet convinced the analysis is that straightforward. However, there are cabinets to be refinished, so it'll be a day or two until I can convince myself one way or the other.
