Wednesday, February 13, 2013

A Billion Reasons to Like Peg Melnik.

Peg Melnik
Peg Melnik writes a nice blog for the Press Democrat that weighs in, from time to time, on interesting wine- and tourism-related issues in Napa and Sonoma. She's a great resource for anyone visiting the area, and her book, the Explorer's Guide to Napa & Sonoma, is now in its 9th Edition (you can pick up a copy on Amazon). She writes with her husband, Tim Fish, who is an associate editor for Wine Spectator.

She put up a piece a few weeks ago about Sonoma Valley rebranding itself, a screenshot of which is at the right. It's an interesting article, but there was just one thing that really jumped out at me ... $7 million spent in Sonoma County on tourism per year? 
Peg Melnik, "Identity Crisis?"
Tasting Room Blog, The Press Democrat, Jan. 29, 2013
available at

Someone gave Peg some very bad statistics.  The actual number in 2011 was, of course, much higher - about $1.32 billion.

By way of contrast, the value of the County's entire grape crop in 2011 was $347 million.

The figures are courtesy of Sonoma County's Economic Development Board, whose graph on the issue I excerpt at the right, and the Sonoma County Agricultural Commission.  

So -- Peg -- don't underestimate the effects of books like yours on the local economy ...

"Annual Tourism Report, 2012"
Sonoma County Economic Development Board
available at
These statistics are part of the very comprehensive economic reporting done on Sonoma County on a yearly basis.  The statistics on the EDB's web site go back more than a decade, and reading through them (and watching the changing predictions) is often illuminating.  These materials, like the UCLA Anderson Forecast and the Moody's annual reports to the County, are a great resource for anyone involved in local government who needs to answer questions about the economy.

Dan Walters on Education Funding, Part 2.

I've blogged previously about Dan Walters and his views on California's budget. Dan has access to most of official Sacramento, and I generally believe that if he's thinking and writing about a certain problem, it is something that most of Sacramento already is (or soon will be) thinking about, too.

California State Assembly
The column that has my attention is about ELL.  Dan points out that Jerry Brown's latest plan for education reform "provides a 'base grant' of about $6,800 per student and then, over several years, adds as much as $5,000 to districts that have above-average concentrations of English learners and students who qualify for free or reduced-price lunches[.]" Dan Walters points to Los Angeles Unified as the potential biggest winner from this change in policy, noting that 76% of LA Unified is Latino or Hispanic.

However, Dan did miss a bit of the story; just pointing out the racial demographics of a school district isn't necessarily a good proxy for how many ELL students there are.  Those numbers are available.  27.3% of LA Unified, for example, are ELL students -- 180,495 out of 662,140.  Sonoma Valley's numbers are available, too.  31.7% of Sonoma Valley Unified students are ELL students -- 1,483 out of 4,673.
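For anyone who wants to check those percentages, here is a minimal sketch that recomputes them from the enrollment counts quoted above. The function name `ell_share` is just an illustrative choice, and the only inputs are the four numbers already in this post.

```python
# Enrollment figures exactly as quoted in the paragraph above:
# (ELL students, total students).
districts = {
    "LA Unified": (180_495, 662_140),
    "Sonoma Valley Unified": (1_483, 4_673),
}


def ell_share(ell_students: int, total_students: int) -> float:
    """Return ELL students as a percentage of total enrollment."""
    return 100.0 * ell_students / total_students


for name, (ell, total) in districts.items():
    print(f"{name}: {ell_share(ell, total):.1f}% ELL")
```

The point of doing it this way is that the share is computed from counts of ELL students directly, rather than inferred from a district's ethnic makeup.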

It's probable that Sonoma Valley Unified wouldn't receive the maximum grant under the program, because the calculation includes free and reduced-price lunch enrollment, where SVUSD is just about at the Statewide average. But if Sonoma Valley Unified got even close to the maximum proposed grant, that would push Sonoma Valley's funding to somewhere near $11,800 per student (the $6,800 base grant plus up to $5,000) -- which would add more than $10 million per year to the District's budget -- and which would bring total funding fairly close to the level enjoyed by, say, Healdsburg.
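The arithmetic behind that estimate can be sketched roughly as follows. This is a back-of-the-envelope illustration, not the State's actual formula: it assumes the supplement is paid for every enrolled student, it uses the 4,673 enrollment figure quoted earlier, and `fraction_of_max` is a hypothetical knob for how much of the $5,000 maximum the district ends up receiving. It also measures the supplement on top of the base grant, not against the district's current funding, which isn't given here.

```python
BASE_GRANT = 6_800        # dollars per student, from the Walters quote
MAX_SUPPLEMENT = 5_000    # dollars per student, the "as much as" figure
SVUSD_ENROLLMENT = 4_673  # Sonoma Valley Unified enrollment quoted earlier


def per_student_funding(fraction_of_max: float) -> float:
    """Base grant plus a chosen fraction of the maximum supplement."""
    return BASE_GRANT + fraction_of_max * MAX_SUPPLEMENT


def added_district_funding(fraction_of_max: float,
                           enrollment: int = SVUSD_ENROLLMENT) -> float:
    """Total supplement for the district.

    Assumption: the supplement applies to every enrolled student.
    """
    return fraction_of_max * MAX_SUPPLEMENT * enrollment


# Hypothetical scenarios: half, three quarters, and all of the maximum.
for fraction in (0.5, 0.75, 1.0):
    print(f"{fraction:.0%} of max: "
          f"${per_student_funding(fraction):,.0f} per student, "
          f"${added_district_funding(fraction):,.0f} for the district")
```

Under these assumptions, even half of the maximum supplement clears the $10 million mark for a district of this size.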

I doubt Jerry Brown's plan will be enacted as proposed -- too many wealthy suburban school districts are highly motivated to fight it. But the specifics of the plan are less important at this point in the budget cycle than the simple fact that the issue's been identified -- that the battlefield in Sacramento has been chosen, and it's funding for ELL-impacted schools.

I suspect the choice by the Governor was a good one.

Finally, since it's "Catch Up With Dan Walters Day" for me, I also noted that Dan took on the "shadow budget" in a recent column, pointing out that the general fund (~$91 billion) does not equal the budget (~$225 billion).  He argues that the practice of reporting only on the balance status of the general fund tends to deceive voters.  I agree, Dan, I agree.

Tuesday, February 5, 2013

The Philosophy of Data and Sonoma's SAT Scores.

David Brooks, with Mark Shields and Judy Woodruff
"Weekly Political Wrap," PBS NewsHour
 available at
David Brooks' column regularly features in my weekly reading list, and his segment on the NewsHour with Mark Shields on PBS is a Friday night favorite at our house.  According to Wikipedia, he's "the sort of conservative pundit that liberals like, someone who is 'sophisticated' and 'engages with' the liberal agenda[.]" Today, his column's interesting because it's all about data, but it's specifically interesting for the observation he makes that there are two things that data does really well -- it can illuminate patterns of behavior we haven’t yet noticed, and it's very good at exposing when our intuitive view of reality is wrong.

"Highest Average SAT Scores in Sonoma County"
California Schools Guide, Los Angeles Times
His two points resonate for me, because I've been looking over Sonoma Valley High's SAT scores over the past two weeks, in response to some data sent to me by a concerned friend.  

The table in question showed that Sonoma Valley High's SAT scores ranked 11th out of 16 public schools in Sonoma County. The presumption was that these schools were all comparable to Sonoma Valley High. 

 The individual wondered what conclusions could be drawn about the performance of Sonoma Valley High as a consequence. So, I took a look.

The data comes from the Los Angeles Times, as a part of their California Schools Guide. As you can see, Sonoma Valley High is right behind Piner High and ahead of Windsor High.  Santa Rosa High is at the top, Technology High in Rohnert Park's around the middle ... 

Wait a minute.  Piner High is above Sonoma Valley High? For someone who grew up in Sonoma County, that data point is completely implausible.  I just knew there had to be some real problems with the method used to create the database, given what I know about Piner High.

Number of Test Takers Versus Size of School.
Data from "Highest Average SAT Scores in Sonoma County"
California Schools Guide, Los Angeles Times
available at
Thankfully, the LA Times includes the number of students in the student bodies of the schools -- and also includes the number of total test takers at each school. I gathered up the data (it wasn't exactly conveniently arranged), and ranked the schools by the share of their students who took the test.  That table's on the right.  

Technology High comes in at the top, which shouldn't really surprise anyone.  It's the highest ranked high school in Sonoma County based on API scores.  The program (it's a magnet school) is designed to send its students to college (the school itself is located on the campus of Sonoma State University).  An awful lot of seniors at Tech High are taking the SAT, and the ones that aren't may very well be taking the ACT instead. 

I'd say that Sonoma Valley should be proud that it's managing to motivate so many of its seniors to take the SAT.  Only Analy and the Petaluma schools do better, and even then it's not by much.  Piner, meanwhile, comes in nearly dead last, with only 16.7% of its students taking the SAT. 

Thus, to me, it looks like there's just a very, very serious problem in trying to draw any conclusions from ranking high schools by average test scores on the SAT, when there's a large self-selection bias taking place in the pool of test takers -- you don't have to take the SAT, after all.  You have to sign up for it (and pay for it!).  At Piner High, not many students are doing so -- in stark contrast to Sonoma Valley High.
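To make the self-selection point concrete, here is a small sketch under some loudly stated assumptions: every school has the same underlying, normally distributed scores (a hypothetical mean of 1500 and SD of 300), and only the top-scoring fraction of students at each school bothers to take the test. Neither assumption is true of the real data; the 16.7% figure is Piner's rate from above, and the other participation rates are made up for illustration. The formula is the standard mean of the upper tail of a normal distribution.

```python
from statistics import NormalDist


def mean_of_top_fraction(mu: float, sigma: float, p: float) -> float:
    """Average score of the top fraction p of a Normal(mu, sigma) population."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    std = NormalDist()               # standard normal
    cutoff = std.inv_cdf(1.0 - p)    # z-score above which students take the test
    return mu + sigma * std.pdf(cutoff) / p


# Hypothetical population shared by every school in this illustration.
MU, SIGMA = 1500.0, 300.0

# 0.167 is Piner's participation rate from the text; the rest are made up.
for p in (0.167, 0.40, 0.70):
    avg = mean_of_top_fraction(MU, SIGMA, p)
    print(f"{p:.1%} of students tested -> average of test takers {avg:.0f}")
```

With identical students everywhere, the school with the lowest participation rate posts the highest average, which is exactly why a ranking by average score says so little.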

An illustration of the Normal Curve.
From "Normal Distribution," Wikipedia
available at
OK, but what about the raw scores -- can we compare the test scores on the SAT by trying to control for self selection bias?  Can we "correct" the data to try to draw conclusions? Well, if we just assume that the distribution for each school is unimodal, symmetrical, and bell-shaped -- that the distribution is normal ...

Such an effort immediately runs into a problem, which is that the score distributions at some high schools are unimodal, and at some (like Sonoma Valley) are bimodal, and that the data is anything but symmetric. The data for the bimodal schools looks like the table at the right,  where g/t is an SAT score, and t is the number of test takers that got that score.
A Bimodal, Asymmetric Distribution.
From "Unimodality," Wikipedia
available at

Given that I knew there was an oddity in the data, I deliberately focused on only those schools that are bimodal. Thus, this comparison is for high schools where no single ethnic group constitutes more than 70% of the population -- those schools where Spanish-English dual immersion (which I happen to be interested in for my kids) is generally possible.

Pursuing that idea, I took a stab at coming up with, at least theoretically, what the 50th percentile and the standard deviation for the SAT score would be for each of these schools, presuming the sample (the self-selecting students) is all on the right end of a normal distribution (that they're more-or-less the best test takers).

Making the (heroic?) assumptions outlined above, I did what I could to estimate the score for a student who was 1 SD above average, correcting for different sample sizes and, again, assuming the data is normal, which it isn't. The only reason doing something like this could make any sense at all is that these schools all have the same issue with their data -- they're all bimodal and asymmetric (admittedly to different degrees). Further, while the actual 1 SD performance -- roughly the 85th percentile of test takers -- is quite possibly higher than these estimates indicate, it bears repeating that it is the relative differences I'm more interested in here.
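For readers who want to try something similar, here is one way such an estimate could be made. It is a sketch under the same heroic assumptions, not a record of the exact calculation behind the table: it treats the test takers as the top fraction `p` of a normal distribution, takes the observed mean and SD of their scores, and backs out the whole-school mean and SD using the standard truncated-normal formulas. The function names and the sample inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def estimate_population(sample_mean: float, sample_sd: float, p: float):
    """Back out (mu, sigma) for a whole school from its test takers.

    Assumption: test takers are exactly the top fraction p of a
    normal distribution, which real schools do not satisfy.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    std = NormalDist()
    a = std.inv_cdf(1.0 - p)         # cutoff z-score for the top p
    lam = std.pdf(a) / p             # mean shift of the upper tail, in SDs
    shrink = 1.0 + a * lam - lam**2  # variance of the tail relative to sigma^2
    sigma = sample_sd / sqrt(shrink)
    mu = sample_mean - sigma * lam
    return mu, sigma


def score_one_sd_above(sample_mean: float, sample_sd: float, p: float) -> float:
    """Estimated score of a student 1 SD above the whole-school average."""
    mu, sigma = estimate_population(sample_mean, sample_sd, p)
    return mu + sigma


# Hypothetical inputs: test takers average 1550 with an SD of 200,
# and 40% of the student body took the test.
print(round(score_one_sd_above(1550.0, 200.0, 0.40)))
```

Because each school's participation rate enters through `p`, two schools with the same raw average can land in quite different places once the correction is applied, which is the relative comparison that matters here.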

And finally, I put in per-pupil spending for 2007 -- the last year before the real estate bubble made a lot of oddities hit these numbers, and the only year I had data for all of them -- for each of these schools.

Estimated SAT @~85% versus Spending Per Pupil,
Selected Sonoma County and Napa County Schools.
Data from "California Schools Guide," Los Angeles Times,
available at, and the 
"Federal Education Budget Project," New America Foundation,
The end result of this is the table on the right. Sonoma Valley does pretty well, all things considered.  Sonoma Valley has less funding per pupil than the lowest scoring school, yet still lands in the upper half of the table.

But the story is really the spending-per-pupil. To try to measure Sonoma Valley against, say, Healdsburg, when Healdsburg has 33% more money per student, is hideously unfair.  An extra $1.2 million a year (the amount necessary to match Napa) would significantly help Sonoma Valley Unified.  And what about giving Sonoma Valley Unified an extra $12 million a year -- the amount necessary to match Healdsburg? I bet SVUSD could accomplish an awful lot with that much money ...

The whole exercise of looking at this data certainly illuminated one pattern that I hadn't noticed, which was the very significant disparity inside Sonoma County concerning school funding.  I didn't have any idea that Healdsburg was funding its schools as well as it is, and frankly, it's to Healdsburg's credit.  But the really useful part is that I think it again exposes that most people's intuitive view of Sonoma Valley High is wrong -- Sonoma Valley High, and Windsor to a lesser degree, look like they're overachieving, given what they have to work with financially.  Further, Sonoma Valley's performance is better than one of the two closest high schools (Vintage) and is in striking distance of the other.

I've been speculating about why the idea that "Sonoma Valley High is a poor performer" has gotten entrenched in the community.  I was tempted to mine the Index-Tribune's archives, to perform a textual analysis to see if I could find harder evidence, in the form of a shift in the language used to describe the High School.  But I think the story here doesn't need that much data in order to grasp the narrative.

Sonoma's a fairly rural, agricultural place.  My hunch is that many such communities began to get a little bit skeptical of their high schools sometime in the late 1950s -- think of the charmingly quaint anti-authoritarianism of Grease.  Such grousing was probably mostly harmless until the near-revolution that took place in American society after 1968.  When school funding really took a hit a decade later, and the decrease in funding began to bite, the slow degradation of the physical plant probably kept the idea alive in many people's minds that Sonoma Valley High was a troubled place -- now, think Fast Times at Ridgemont High.

Meanwhile, the population of the Valley became more stratified as Sonoma gained an allure as a high-end destination as a consequence of the "Judgment of Paris," and the significant population growth between 1978 and 1986 meant the High School had to grow physically while dealing with less funding per pupil from property taxes. Fast forward to the present, when the demographic profile of the school district is changing as Sonoma continues to become ever wealthier, and I suspect the older idea in people's minds that "there's a problem at the High School" gets triggered fairly easily. Even if the evidence doesn't appear to be there to support the argument, the fear now is something along the lines of Dangerous Minds, perhaps.

But the data shows that Sonoma Valley High's doing a surprisingly good job of encouraging its students to apply to college, even though doing so makes the school look like it's underperforming. Further, the school looks like it's overachieving next to its peers as far as performance on the SAT is concerned, despite the funding situation.  If anything, this starts looking just a little bit like a case of Stand and Deliver. Again, not the conventional wisdom -- but perhaps in keeping with David Brooks' "Philosophy of Data."