The Chicago Cadet Cap


Democrats with Chicago Characteristics

To many in the US, Chicago is the quintessential Democratic city. Indeed, the last Republican mayor of Chicago was “Big Bill” Thompson in 1931. But under the surface of that solid stretch of Democratic control are stories of power struggles and corruption that make Chicago’s politics legendary.

My hope is that this hat spurs conversation amongst all of the Democrats in Chicago; i.e. the Machine, Neoliberal/business, and Progressive Democrats.  My expectation is that Progressive Democrats will wear it to mock the Neoliberal and Machine Democrats who sometimes refer to us as “COMMIES!” … and maybe some actual communists, socialists, and/or anarchists.

*Artist’s statement below

$15 | Buy it today! Find me outside this event:

7:30pm @ Co-Prosperity Sphere | 3219 S Morgan St, Chicago


$20 | Shipping anywhere local included

All proceeds will go to some charity or cause, I don’t need your $15 (and I think Pilsen needs a wrestling team).

中国灵感 (Chinese Inspiration)

This hat is inspired by the design of the Chinese Army cap with the difference being that the Chicago Cadet Cap bears the signature light blue and six-pointed red star of the Chicago flag.


I consider this an art project in that it is inspired by the parallels between Chinese and Chicago politics.  I held a post-doctoral position at Tsinghua University in Beijing, China 3 years ago.  Before leaving for China, I studied its long history.  This is the 50th anniversary of the Cultural Revolution.  I’ve had this design idea for some time, but this weekend’s Cultural Revolution Propaganda Poster Art Exhibition spurred me to action.  The history of violence, success of political propaganda, and profundity of the era may hold some lessons for us today.  Here’s a poster I picked up in Beijing of the kind being discussed tomorrow (Mao’s wearing the hat!).


Chicago Analog (芝加哥模拟)

The contradiction of Communism and for-profit enterprise that now exists in China is explained away by the all-powerful Chinese Communist Party (CCP) as “Communism with Chinese Characteristics”.  This mirrors some of the un-Democratic actions and behavior I’ve seen from the all-powerful Chicago and national Democratic party.

There’s the racism directed to me by a local “progressive” church and my previous employer who are Democratic operatives with a history of diversity problems… and then the racism directed at our current president by our future president in questioning his “American roots”; classic dog-whistle racism.  We have a pension crisis created by utilizing the right-wing “Starve the Beast” strategy.  There is the rejection of established science by Rahm’s regime regarding lead in our water in much the same way Republicans reject climate change.  There is the lack of ballot access that forces one to become a full-time politician just to run for office, and the quickness to sell out your cause to advance that career that naturally follows.  There is the endless privatization of schools and other social services that would make both the Republicans and Chinese blush.  And let’s not forget our Mayor himself who jumped on the New Democrat train back in the first Clinton presidency.

The history starts with Richard J. Daley (and even before that).  Then it evolves in the ’80s with Harold Washington, the Council Wars, and the rise of Reagan, which prompted a response by national Democrats to sell out their New Deal past to Wall Street for campaign cash.  The rise of Richard M. Daley moved us toward the New Democrat model, especially as his brother Bill Daley moved up at JP Morgan Chase and eventually did the do-si-do with Rahm and Richie as Mayor and Chief of Staff for Chicago and Obama respectively.

The last municipal election appeared to cause a split between the old-school Machine Democrats and the Neoliberal Democrats as the working class Machine Democrats started to realize that their wealth disappeared to Wall Street and nothing was going to be done about it.

So there you go.  Plenty to talk about at the bar with your awesome new hat.  Just shut up when the game comes on.

The Last Time the Cubs Won the World Series

The last time the Cubs won the World Series the year was 1908.

The last time the Cubs won the World Series they played at West Side Park “bounded by Taylor, Wood, Polk and Lincoln (now Wolcott) Streets”.  The original rooftop seating was on Taylor Street!


You could see Pilsen’s St. Procopius church all the way on 18th and Allport out towards left-center field!

The phrase “way out in left field” came from this park… partially from the fact that left-center was so deep that one guy hit 4 inside-the-park home runs in a single game, but mostly because Cook County Hospital’s mental institute was behind the left field wall.

The bottom line is, patients could be heard yelling and screaming things at fans from behind the left field wall.

The last time the Cubs won the World Series their best pitcher was Mordecai “Three-finger” Brown.  He mangled his hand in a farm-machinery accident.  This gave him a great curveball.

Brown’s most important single game effort was the pennant-deciding contest between the Cubs and the New York Giants on October 8, 1908, at New York. With Mathewson starting for the Giants, Cubs starter Jack Pfiester got off to a weak start and was quickly relieved by Brown, who held the Giants in check the rest of the way as the Cubs prevailed 4–2, to win the pennant.

Win won for Mordecai!

The last time the Cubs won the World Series they wore caps with short bills.  (See 3 Finger Brown above.) I tried to find an official one in an attempt to borrow nostalgia from the unremembered nineteen-aughts.  But mostly because at 35, I’m too old to have a modern ball cap emphasize the boyishness of my boyish good looks.  I couldn’t find one.  So I made my own.


The last time the Cubs won the World Series even these elderly Cubs fans hadn’t been born.


The last time the Cubs won the World Series was too long ago.

h/t /r/CHICubs



The Globalization of the Digital Divide

Today the science subreddit is having a discussion on racial bias in science in conjunction with Science Magazine publishing “Doing Science while Black,” by Dr. Ed Smith. This discussion has inspired me to share some research that I did a year and a half ago on the “pipeline problem” in tech, and STEM more broadly. That problem being: Tech companies can’t hire minorities if there aren’t minorities who are trained for the job.

My experience, and the research in the reddit discussion, shows that the pipeline problem isn’t the only thing keeping minorities out of tech.  I’m going to share that experience and I’m going to show that my experience is not unique.

Let’s talk about me

It’s been over 3 years since I graduated with my PhD in statistics (specializing in machine learning) from UIC. I thought things would be fairly easy given that I have the hottest degree in the tech industry. Not so.  Being half-Mexican with an identifiably Hispanic name (this is an employer’s first impression upon seeing my resume), but appearing to be a vaguely ethnic White person, my employment in industry has felt like a social science experiment for which I never volunteered. Nonetheless, I feel an obligation to use my white privilege to highlight injustice, much like the woman in this story:

The Lawsuit

I previously wrote about the lawsuit I filed against Civis Analytics. That post also covered the statistics on racial pay discrepancies in tech. In short, I was the only machine learning (PhD) or optimization (masters) degree holder at a data science company. I was treated like an idiot while training supposed peers and simultaneously getting paid less than them. From optimization to linear models, anything beyond the first month of an undergrad level course required me explaining it to members of the data science team (that I wasn’t initially allowed to join) and its management.

I’ve since withdrawn my lawsuit. It was not a good use of my mental energy. I had the CFO and a co-founder being quoted as saying racist things. It didn’t matter. I was the one under-represented minority (URM) and there are an infinite number of variables in hiring.

If you were to file a lawsuit, your employer would just need to find one thing you don’t have that someone else does and claim that as their reasoning for paying you less. In this respect, racist hiring practices actually serve to insulate companies from litigation.

Political Aside

I’ll admit it’s a little frustrating to hear that this company, whose C-level executive refers to “speaking Spanish” as “speaking poor” and whose hiring practices were consistent with this attitude, is still receiving business from the Democratic National Committee and Hillary’s Super PAC. That Hillary’s campaign is right now attacking Donald Trump for referring to Miss Universe as “Miss Housekeeping”, in addition to the effect of NAFTA on Mexican campesinos, her previous “super-predator” statements, and Obama’s failure to prosecute bankers that were actually super-predators of middle-class minority households, makes it feel like minority political power is non-existent. But that’s for another post.

Highlights since leaving Civis include: being yelled at by a subordinate repeatedly, being lied to repeatedly by my boss, providing physical proof of him lying to HR who proceeded to yell at me, moving to a new job to only have my new boss harass me by saying things like, “What kind of Mexican are you? A drug dealer or a rapist?”, proceeding to contact HR and then have him still be my boss and go right back a week later to saying inappropriate things. Thanks Vivaki and Sears!

For those URMs looking for jobs in tech, here’s my best advice: Find a place where other URMs are employed. Find a place where management doesn’t primarily consist of a population of tall, attractive, white or Asian males. This indicates that they aren’t hiring and promoting based on ability, but instead are promoting based on their gut-feeling of who looks like a “leader”… like they’re selling jeans!

Hiring Discrepancies in Tech

All that advice presumes that you have multiple offers or enough savings to remain unemployed.  I certainly didn’t.  It took me 5 months to find a job, including a 4 month interview process with Civis, and a month waiting tables.  I wonder how many other machine learning PhDs have waited tables?  I would guess very few… unless you’re a URM!  Because that’s what the data says.

In my last post on this topic I used some data from a USA Today article on pay discrepancies. That data was solid. They found that the discount for being Hispanic in tech is 16%, but only 4% for Blacks. This was consistent with my pay gap of approximately 20% relative to my comparable coworkers at Civis.

But again, there is only a pay gap if you get hired! Just 3 days after that article was posted another one came out that talked about hiring, Tech jobs: Minorities have degrees, but don’t get hired. It tries to address the defense that many tech CEOs make, that there aren’t enough minorities in the pipeline.

Unfortunately, this piece was less impressive. Their conclusion about URMs (Hispanics and Blacks) was approximately correct: the Hispanic and Black pipelines are respectively two and four times the size of Hispanic and Black employment in tech. But the methodology could be improved.  The data presented also doesn’t address what would seem to be the real headline: Asians are employed at twice the rate that they are graduating.

Below, I attempt to come up with better estimates of the tech pipeline using the available data from the article and open data sets.  I’m going to try to unpack this work using some basic assumptions, R programming, and Rmarkdown. My hope was that this would clarify these numbers so that they are a little more convincing. I think I succeeded, but let me know what you think.


Here’s the initial USA Today data:

library(xtable)  # provides xtable() and its print method

# Rows: race; columns: % of tech staff and % of CS graduates (USA Today)
usatoday.mat <- matrix(c(47.7, 60.6, 43.4, 18.8, 3.2, 6.5, 1.8, 4.5),
                       nrow = 4, byrow = TRUE)
colnames(usatoday.mat) <- c("Staff", "Graduates")
rownames(usatoday.mat) <- c("White", "Asian", "Hispanic", "Black")
print(xtable(usatoday.mat, caption = "USA Today Data"),
      type = "html",
      html.table.attributes = "align=center, border=1px solid black",
      caption.placement = "top")

USA Today Data
          Staff  Graduates
White     47.70      60.60
Asian     43.40      18.80
Hispanic   3.20       6.50
Black      1.80       4.50

barplot(t(usatoday.mat), beside = T, legend.text = T, main = "USA Today Data",
        xlab = "Race", ylab = "Percentage of Staff")
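One way to read the USA Today numbers is as a representation ratio: staff share divided by graduate share. Here is a minimal sketch in base R, re-entering the percentages from the table so it runs on its own. The names `usatoday` and `rep.ratio` are made up for this illustration and are not from the article.

```r
# USA Today percentages, re-entered so this chunk is self-contained
usatoday <- matrix(c(47.7, 60.6, 43.4, 18.8, 3.2, 6.5, 1.8, 4.5),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(c("White", "Asian", "Hispanic", "Black"),
                                   c("Staff", "Graduates")))

# Ratio > 1: over-represented on staff relative to graduates; < 1: under
rep.ratio <- usatoday[, "Staff"] / usatoday[, "Graduates"]
print(round(rep.ratio, 2))
```

On these numbers the Asian ratio comes out a bit above 2, while the Hispanic and Black ratios both land below one half.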


The employment statistics in the article come from a third party whose data I couldn’t find/access.

The biggest issues that I can address deal with the pipeline.  The authors don’t take into account the globalized workforce within the US tech industry, nor do they take into account the fact that the great majority of tech employees entering the workforce (the pipeline) have a post-graduate degree.  In the article, only the domestic bachelors degree population statistics were used for the pipeline.

Racial Demographics

The first thing to do is to unpack the racial demographics with some assumptions. We’re going to assume that the “nonresident alien” (NRAliens) race demographics follow that of the world. Comparing the national and global population demographics, we would expect that the pipeline of tech workers would naturally skew towards a more Asian demographic than the typical US population.

pop.mat <- matrix(c(45, 5.5, 28.9, 32.9, 71.5, 4.6, 15.8, 14.5, 77.7, 5.3, 17.1,
                    13.2, 16.7, 60.6, 8.5, 14.2), nrow = 4)
colnames(pop.mat) <- c("Chicago", "Illinois", "US", "World")
rownames(pop.mat) <- c("White", "Asian", "Hispanic", "Black")

Population Demographics
          Chicago  Illinois    US  World
White        45.0      71.5  77.7   16.7
Asian         5.5       4.6   5.3   60.6
Hispanic     28.9      15.8  17.1    8.5
Black        32.9      14.5  13.2   14.2

The Undergrad/Postgrad Pipeline

We’re also going to assume that the pipeline of CS graduates is proportional to the numbers reported in the CRA report. In 2013, the sampled institutions awarded 1991 PhDs, 10326 masters degrees, and 15087 bachelors degrees. There were also 9875 new masters students, so we can expect only 5212 (= 15087 - 9875) of the bachelors students to go into the workforce. There were 2728 new PhD students, so we can expect 7598 (= 10326 - 2728) masters students to go into the workforce. This is also going to give us a significantly different picture of the CS/tech pipeline.
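The subtraction above can be sketched in a few lines of base R. This is only an illustration of the arithmetic: the names `awarded`, `new.grad`, and `to.workforce` are made up here, and it leans on the simplifying assumption that every new grad student comes from the degree level just below.

```r
# CRA counts quoted above (2013, sampled institutions)
awarded  <- c(Bachelor = 15087, Master = 10326, PhD = 1991)
new.grad <- c(Master = 9875, PhD = 2728)  # new masters and PhD students

# Assumption: new grad students are drawn from the degree level just below,
# so they are subtracted from that level's workforce-bound graduates
to.workforce <- c(Bachelor = awarded[["Bachelor"]] - new.grad[["Master"]],
                  Master   = awarded[["Master"]]   - new.grad[["PhD"]],
                  PhD      = awarded[["PhD"]])

print(to.workforce)
print(round(to.workforce / sum(to.workforce), 2))
```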

degree.count <- c(5212, 7598, 1991)
degree.prop <- degree.count / sum(degree.count)
degree.tbl <- t(degree.prop)
colnames(degree.tbl) <- c("Bachelor", "Master", "PhD")

Proportion of Degrees in Tech Workforce
Bachelor  Master   PhD
    0.35    0.51  0.13

pipeline.mat <- matrix(0, nrow = 5, ncol = 3)
rownames(pipeline.mat) <- c("White", "Asian", "Hispanic", "Black", "NRAliens")
colnames(pipeline.mat) <- c("Bachelors", "Masters", "PhD")
pipeline.mat["White", ] <- c(60.6, 28.9, 29.0)
pipeline.mat["Asian", ] <- c(18.8, 9.0, 9.5)
pipeline.mat["Hispanic", ] <- c(6.5, 1.8, 1.4)
pipeline.mat["Black", ] <- c(4.5, 2.0, 1.4)
pipeline.mat["NRAliens", ] <- c(7.6, 57.1, 58.3)

Pipeline Demographics
          Bachelors  Masters   PhD
White          60.6     28.9  29.0
Asian          18.8      9.0   9.5
Hispanic        6.5      1.8   1.4
Black           4.5      2.0   1.4
NRAliens        7.6     57.1  58.3

Putting it together

So what does the current flow of CS graduates look like? First we have to allocate the NRAliens in the pipeline and then we have to allocate the students going into/getting out of in grad school.

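# Weight each degree column by its share of new entrants to the workforce
real.pipeline.mat <- t(t(pipeline.mat) * degree.prop)
# Split the weighted NRAliens row across races using world demographics
nra.pipeline <- outer(pop.mat[, "World"] / 100, real.pipeline.mat["NRAliens", ])
# Drop the NRAliens row (row 5) and add back its reallocated shares
globalized.pipeline <- real.pipeline.mat[-5, ] + nra.pipeline

Weighted Globalized Race Tech Pipeline
          Bachelors  Masters   PhD
White         21.79    19.73  5.21
Asian          8.24    22.38  6.03
Hispanic       2.52     3.42  0.85
Black          1.96     5.19  1.30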

Summing this matrix across the rows will give us the race percentage in the tech pipeline.

globalized.pipeline.race <- apply(globalized.pipeline, 1, sum)
gpr.tbl <- t(globalized.pipeline.race)
colnames(gpr.tbl) <- c("White", "Asian", "Hispanic", "Black")

Race Percentage in Tech Pipeline
White  Asian  Hispanic  Black
46.73  36.66      6.79   8.46

Globalized Digital Divide

Using the numbers calculated above, we can see a better estimate of the racial demographics for the tech graduate pipeline corrected for the presence of grad school students and international students.

new.usatoday.mat <- cbind(globalized.pipeline.race, usatoday.mat[, "Staff"])
colnames(new.usatoday.mat) <- c("Pipeline", "Staff")

Corrected USA Today Data
          Pipeline  Staff
White         46.7   47.7
Asian         36.7   43.4
Hispanic       6.8    3.2
Black          8.5    1.8

The drastic difference between the Asian pipeline and staffing that was in the USA Today data has been reduced. Unfortunately, the underrepresented minority disparity still exists. With the data at hand (i.e. not having access to the USA Today Research survey), the best we can hope for is an expectation of how the racial makeup of the technology sector will be changing if the sector hires fairly going forward. This should involve a doubling in the number of Hispanic staff (3.2% to 6.8%) and a quadrupling in the number of Black staff (1.8% to 8.5%).
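As a rough check on the "doubling" and "quadrupling" language, here is a small base R sketch that divides the corrected pipeline share by the current staff share. The values are re-entered from the Corrected USA Today Data table, and the name `multiplier` is made up for this illustration.

```r
# Shares (in %) re-entered from the Corrected USA Today Data table
pipeline <- c(White = 46.7, Asian = 36.7, Hispanic = 6.8, Black = 8.5)
staff    <- c(White = 47.7, Asian = 43.4, Hispanic = 3.2, Black = 1.8)

# How many times larger (or smaller) each staff share would need to be
# to match that group's share of the pipeline
multiplier <- pipeline / staff
print(round(multiplier, 1))
```

The Hispanic multiplier comes out just above 2 and the Black multiplier between 4 and 5, while the White and Asian multipliers sit at or below 1.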


Next we’re going to look at what an equitable globalized tech workforce should look like. Assuming college admission/graduation rates for NRAliens are stable and the distribution of students entering the workforce with varying levels of qualification remains stable, we can calculate what an equitable globalized work force would look like.

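# Share of the weighted pipeline made up of nonresident aliens
nr.alien.share <- apply(real.pipeline.mat, 1, sum)["NRAliens"] / 100
domestic.share <- 1 - nr.alien.share
# share.tbl is an assumed name for this 1 x 2 table of shares
share.tbl <- t(matrix(c(domestic.share, nr.alien.share)))
colnames(share.tbl) <- c("US", "World")

Percentage Domestic vs. International in Tech Pipeline
  US  World
0.60   0.40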

Taking a weighted average of US and World racial demographics tells us what an equitable distribution of tech college students should look like:

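# Share of the weighted pipeline made up of nonresident aliens
nr.alien.share <- apply(real.pipeline.mat, 1, sum)["NRAliens"] / 100
domestic.share <- 1 - nr.alien.share
# share.tbl is an assumed name for this 1 x 2 table of shares
share.tbl <- t(matrix(c(domestic.share, nr.alien.share)))
colnames(share.tbl) <- c("US", "World")
# Weight US and world demographics by the domestic and international shares
equitable.enrollment <- cbind(pop.mat[, "US"] * share.tbl[, "US"],
                              pop.mat[, "World"] * share.tbl[, "World"])
equitable.enrollment <- cbind(equitable.enrollment,
                              apply(equitable.enrollment, 1, sum))
colnames(equitable.enrollment) <- c("US", "World", "Total")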

Finally, we put together the equitable pipeline, the actual pipeline, and current staffing, normalized to a 4-way race model.

final.mat <- cbind(Equity = equitable.enrollment[, "Total"], new.usatoday.mat)
final.mat <- t(t(final.mat) / apply(final.mat, 2, sum))

Final Picture
          Equity  Pipeline  Staff
White       0.49      0.47   0.50
Asian       0.25      0.37   0.45
Hispanic    0.13      0.07   0.03
Black       0.13      0.09   0.02

barplot(t(final.mat), beside = T, legend.text = T,
        xlab = "Race", ylab = "Percentage of Staff")


Relative to the equitable benchmark, the White population is represented equitably on staff, though slightly above its share of the pipeline. The Asian population is over-represented twice: in the academic pipeline relative to the equitable benchmark, and again on staff relative to the pipeline. The underrepresented minorities are… underrepresented everywhere.
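The Final Picture table can also be read as two separate gaps: one between an equitable population and the pipeline, and one between the pipeline and staff. A minimal base R sketch follows, using the rounded values from that table, so the ratios are approximate. The names `final` and `gaps` are made up for this illustration.

```r
# Rounded proportions re-entered from the Final Picture table
final <- cbind(Equity   = c(0.49, 0.25, 0.13, 0.13),
               Pipeline = c(0.47, 0.37, 0.07, 0.09),
               Staff    = c(0.50, 0.45, 0.03, 0.02))
rownames(final) <- c("White", "Asian", "Hispanic", "Black")

# First column: how the pipeline compares to an equitable population.
# Second column: how staffing compares to the pipeline.
gaps <- cbind(PipelineVsEquity = final[, "Pipeline"] / final[, "Equity"],
              StaffVsPipeline  = final[, "Staff"] / final[, "Pipeline"])
print(round(gaps, 2))
```

Values below 1 in both columns for a group mean it loses ground at both stages, which is the case for the Hispanic and Black rows here.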


This more rigorous treatment of the data is not perfect, but it helps to show us two things. Not only do we have to make the pipeline of qualified tech workers more representative of the population overall, but we have a long way to go to make the workplace more representative of the pipeline as it is!!!  This was the thesis of the original article, and it was approximately correct.  Many in the tech world want to believe that their companies are meritocratic and that the lack of representation has nothing to do with their companies’ hiring practices. The cleaned data shows us that they can’t be let off the hook so easily.

Concerns have been raised that the pipeline has perhaps only recently become as diverse as it is.  The thought is that this could be the reason for the discrepancy between degrees awarded and jobs filled.  From the data available here this is unknowable, but anecdotal evidence suggests this isn’t the only problem.  I’ve already collected enough horror stories to convince my friends and family, and STEM diversity initiatives are now decades old.

This means that not only do we need affirmative action in STEM academics, but we also need non-discrimination in STEM industry! The academic pipeline of URMs is being under-utilized in industry and this hurts everyone.

Finally, this also brings to mind one of my favorite economists, Ha-Joon Chang, issuing one of the strongest cases for affirmative action that I’ve read (emphasis added):

Equality of opportunity is the starting point for a fair society. But it’s not enough. Of course, individuals should be rewarded for better performance, but the question is whether they are actually competing under the same conditions as their competitors. If a child does not perform well in school because he is hungry and cannot concentrate in class, it cannot be said that the child does not do well because he is inherently less capable. Fair competition can be achieved only when the child is given enough food – at home through family income support and at school through a free school meals programme. Unless there is some equality of outcome (i.e., the incomes of all the parents are above a certain minimum threshold, allowing their children not to go hungry), equal opportunities (i.e., free schooling) are not truly meaningful.

Re-elected to the LSC!

I was reelected to another 2-year term on the Pilsen Academy Local School Council! For the second consecutive election I received the most votes, tied with my running-mate Dolores Cortes!  Thank you to all those who came out to vote and congratulations to all those who will be serving with us.


It’s taken all summer to write this post, because like the 2 previous summers, I’ve been busy.  This one was especially busy as I got engaged, bought a house, got a new job, and held two forums with two nationally recognized experts on lead in Chicago’s water.  Additionally, I was re-elected to the local school council (LSC) at Pilsen Academy as a community representative.  I’ll keep this post to that topic.

As you may remember, I ran with 3 existing parent representatives and another community representative.  Our 3 parents won and I won.  Unfortunately, our other community representative lost.  Typically these races are uncontested and the seats remain vacant.  Why was this LSC race so hotly contested?  In a year where we actually did something, we made two enemies.

Principal Ali’s Last Stand

In one of many fruitless attempts at overturning the LSC’s decision, the previous principal convinced some people to run for parent representative.  Of the two who ended up winning, one wasn’t even a parent.  They were a non-guardian grandparent who, word on the street has it, was told he was signing up for a job where he only needed to show up once a month; i.e. the LSC meetings.  This is consistent with talk of the old principal paying off LSC members.  One rumor has it he even paid for a quinceañera for one parent.

As one would expect, neither of these people has shown up to any of the 5 meetings we’ve had so far.  With another parent moving out of the neighborhood and no teachers on the council, we have 7 members who are willing and able to show.  This is the minimum number of members required to do anything.  Unfortunately, between sickness and travel of various members, today was the first day where we had the 7 members required for quorum.  We established a regular meeting time (8:15am on the third Thursday of every month on the second floor of the annex).  I was also voted to be the secretary again!

Pilsen Alliance’s Failed Revenge

The other group that appeared to be angered by our success at getting things done was Pilsen Alliance.  They get money from the Chicago Teachers Union.  So it looked bad for them when, after teachers suffered years of abuse from the previous principal, they did nothing to help.  Not only that, it appears that two of their members switched sides and started helping the principal.

In response to me calling out their leadership, they ran two community representative candidates to get me off the council.  Pilsen Alliance’s executive director, Byron Sigcho, was simultaneously running for a community representative seat at Juarez High School, so they had an organized team electioneering outside the polls on election day. They ran no parent representative candidates at either school.

I ended up having to spend the day with the LSC moms passing out our flyers and talking to parents and neighbors… Which was fun.  It was the second-rate Chicago politicking that was not fun. First, a Pilsen Alliance rep was trying to spin it like they were responsible for ousting Principal Ali.  The LSC moms were incredulous.  Then they attacked me for being half-Mexican; e.g. “not really Mexican”.  Hilariously, their candidate: also half-Mexican.  Finally, they displayed their ability to make up anything while also being bad at math.  They were telling parents that I was doing this for political reasons and that I would quit when it came time for the aldermanic races in 3 years.  The LSC term only runs for 2 years.

I try not to take this stuff personally, but successfully defending myself from a coordinated attack by a determined opponent felt good.  Unfortunately, Sigcho exerted too much energy on having me replaced and lost his own race.  After the embarrassment of losing such a tiny election, they started grasping at straws and claimed that the smallest case of election fraud was the reason for his loss.

Cleaning House


Our new principal spent the summer cleaning out the school’s 12 years of hoarding.  I scored a “Sweet” old book.  The dumpsters outside of the school were full all summer.

The school’s counselor was let go.  I previously accused him of colluding with the old principal against the teachers.  He claimed in a public meeting that Ali had given him a great review and this was not consistent with the new principal’s review.  A partisan review from Ali was consistent with my understanding of the situation.

All of the middle school kids (6-8) are now on the first floor.  Apparently this makes them easier to manage.

It appears Pilsen Academy will be getting a Parent University!

So that’s the local politics in Pilsen as I see it.  Hopefully as our new principal settles in we keep the drama to a minimum.  As I wrote earlier, I’ve got a lot of other things going on right now.  Talking to teachers and staff it’s so far so good.  Stay tuned.

Troy Hernandez for Pilsen Academy Community Representative


I’m running for another 2-year term on the Pilsen Academy Local School Council (LSC).  As the community representative and secretary, these past 2 years have been a lot of work, but it’s been worth it.

Last week, the LSC voted to offer a contract to our new principal; Leanne Hightower. She will be starting next week!

Leanne comes to us with a JD from U of I, masters degrees in education and school administration, 5 years of teaching experience and 4 years of administrative experience. With an almost unanimous decision (9 in favor, 0 against, 1 abstention) the LSC showed that it is very excited for this fresh start.

We worked very hard and very closely with parents, members of CTU, CPS administrators, and the community to bring in this strong candidate.  It hasn’t always been easy, but knowing that I’ve been able to make a difference in the lives of teachers, staff, and students at my neighborhood school has made it worthwhile. My hope is that the kids at Pilsen Academy can get the kind of high-quality public education I got in the suburbs.  Hiring our new principal is the first step in that direction and we’re all excited for the next steps with her leadership.  I hope I will be able to work with her over these next two years.

I’m running with some friends I’ve made from the current LSC; President Dolores Cortez, parent representative Maria “Lupe” Gonzalez, along with parent representative-hopeful Kay Allen, and community representative-hopeful Teresa Gonzalez.


From left: Me, Kay, Dolores, Lupe, and Teresa

Looking back…

When I was first elected to the LSC two years ago, I thought I was signing up to practice my Spanish over bad coffee at an early morning meeting once a month.  My hope was that, as a Mexican-American with a PhD in statistics, I could serve as a role-model to the neighborhood kids.  I would give a few STEM talks/demos (science, technology, engineering, and mathematics) and I’d have done my duty.  My two years were unfortunately not so easy or pleasant.

Walking home from the first official LSC meeting a couple of council members followed me out, pulled me aside, and said that they wanted a new principal.  They said that the current principal had created a culture of fear and was driving good teachers away.  They wanted my help.

I was skeptical and wondered what their angle could be.  The principal seemed nice enough on election night.  The Pilsen Alliance “volunteer” that was helping me out, Vicky Lugo, said he was terrible.  I was apt to believe the opposite after it turned out that the organization was filled with political opportunists.

Then I started to notice some troubling behavior from the principal.  I presumed incompetence before I presumed malice.  As I became acclimated to the process, I started to ask questions.  My honest questions were repeatedly met with violent responses from the principal… like the time I respectfully questioned increasing his unsupervised spending limit from $1,000 to $10,000.

I thought that if he was comfortable yelling at one of the people responsible for approving his contract, I could only imagine what it was like to have to work for him.  That our teacher turnover rate was 50% higher than the rest of the neighborhood’s schools and twice the rate of the state’s schools only affirmed my judgement.

Bare-Knuckle Politics

I kept my politics separate from my service at the school.  I initially made no mention of my election to the LSC here on my blog.  When I was collecting signatures for my aldermanic run, none of the members on the LSC knew about it until it was over.  The principal was not so noble.

He got rid of the teacher representative on the LSC (and his brother), he repeatedly prevented the LSC from conducting its business, and held illegal meetings.  In December, the LSC decided to not renew the principal’s contract by a vote of 7 – 2 – 1.  We thought that with things settled we would be able to focus on moving forward.  Nope.

The principal decided to play games.  He proposed a $60,000 budget transfer with no advance notice to the LSC.  The LSC was confused (translations are frequently lacking) and punted with 6 parents and staff abstaining from the vote.  He then sent a letter home to the parents and teachers accusing the LSC, specifically me, of trying to ruin the school.  This was followed by more chaos, including teachers getting uncharacteristically bad reviews and posted LSC agendas disappearing.

In response, the LSC voted to send a letter to CPS CEO Forrest Claypool.  We requested that Dr. Ali be removed from the school immediately.  This was just over 45 days ago.  Last week, I was informed that Ali had started threatening undocumented parents with calls to immigration.  Those parents complained publicly to the CPS Board last week.  Monday was Dr. Ali’s last day at Pilsen Academy.

In the bricks

While it’s been great to work and serve with the families in my community, I’ve been put off by the amount of politics in a grade school.  It hasn’t just been the principal either.  Local polluters make donations for the school’s floats in parades and members of Pilsen Alliance have used it as a pawn and propaganda piece.  But I guess I shouldn’t be surprised.  This is Chicago.  Politics are in the bricks that were used to build this school over 100 years ago.

The best we can do is to show up and try to minimize the negative effects that politics has on the school.  The current LSC and the new principal are in agreement on that matter.  My hope is that we can maintain this mindset during and after the election.  Then we’ll be able to get to the more important and fun things in the school.

Analyzing 538’s Democratic Primary Analysis

I get into extended discussions about politics on Facebook.  Sorry, not sorry.  Recently, I got into an in-depth analysis of Bernie’s chances going forward.  Given his big wins in Washington, Hawaii, and Alaska, I pointed to this great project from 538 to see how much he had caught up.

Who’s on Track for the Nomination? describes itself like this:

Tracking a candidate’s progress requires more than straight delegate counts. We’ve estimated how many delegates each candidate would need in each primary contest to win the nomination. See who’s on track and who’s falling behind.

It’s great and you should check it out.  It will serve as the source of data in this post.  In short, Hillary is supposed to be up right now.  She has a bigger lead than she should at this point, but is it enough?

My initial argument was that Bernie has moved up monotonically from 81% of his projected target to 92% since late February and in that respect is doing well.

I (perhaps too frivolously?) brushed aside arguments that he isn’t polling well in New York, given that he was polling in the low teens 3 weeks before the Illinois primary.  He forced a draw here.  It’s now 3 weeks before the New York primary.

I’ve ceded the argument that Bernie is over-performing in caucus states and that those are for the most part gone.

Analogies were made to Bernie being down late in the game.  “Even though he’s gaining ground, he’s not gaining fast enough.”  We are now through 55% of the delegates.  This is equivalent to being in the top of the 6th inning.  So what does Hillary’s bullpen look like relative to her starting pitching?  I think the answer to that is, “Not good.”

Caveat emptor: The problem with all of these quick analyses is that there are thousands of variables in the world and we’re likely to find at least a few that look very predictive… just by chance.  Correlation is not causation and all that.  You’ve been warned.

I decided to plot fivethirtyeight’s target delegate counts against the delegates each candidate actually won.  Each data point represents a state (or territory).  The line running through the middle represents a candidate getting as many delegates in a state as they would need to get the nomination.  Anything over that line means they over-performed in that state.  Under the line means they under-performed in that state.  The colors correspond to liberal (blue), conservative (red), and swing states (purple).  We get this plot for Hillary’s performance:


What’s surprising here is how close 538 has been to reality (check out Michigan!) with a few notable exceptions at the top end; for Hillary that’s Florida and Texas.  This suggests to me that whatever approach the authors (Aaron Bycoffe and David Wasserman) used to construct this model was actually pretty good.
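To make the over/under idea concrete, here is a minimal sketch of the comparison behind the plot.  My own analysis was done in R; this is a Python illustration, and the names (StateResult, margin, percent_of_target) and the two example rows are hypothetical placeholders, not real primary results.

```python
from typing import NamedTuple


class StateResult(NamedTuple):
    state: str
    target: float  # delegates needed in this state to stay on track
    won: float     # delegates actually won


def margin(r: StateResult) -> float:
    """Delegates above (positive) or below (negative) the on-track line."""
    return r.won - r.target


def percent_of_target(results) -> float:
    """Cumulative delegates won as a percentage of the cumulative target."""
    total_target = sum(r.target for r in results)
    if total_target <= 0:
        raise ValueError("total target must be positive")
    return 100.0 * sum(r.won for r in results) / total_target


# Hypothetical placeholder numbers, NOT real primary results.
example = [StateResult("State A", 40, 44), StateResult("State B", 60, 48)]

for r in example:
    m = margin(r)
    label = "over" if m > 0 else "under" if m < 0 else "on target"
    print(f"{r.state}: {m:+.0f} delegates ({label})")
print(f"{percent_of_target(example):.0f}% of target overall")
```

A point above the line is a state with a positive margin, and the percent-of-target figure is the same kind of number as the 81% and 92% I quoted earlier.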

The other thing that pops out is that Hillary appears to be over-performing in purple and red states.  She isn’t doing as well in blue states.  To highlight this, I fit very simple linear models separately for the blue and the red/purple states.  Those models are represented by the red and blue lines around the black one:
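For anyone curious what "very simple linear models" means here, this is a rough sketch of an ordinary least squares fit done once per group of states.  It is a Python illustration under my own assumptions (the real fits were done in R), the function names fit_line and fit_by_group are hypothetical, and the rows are made-up placeholders rather than actual delegate counts.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_by_group(rows):
    """rows: iterable of (group, target, won) -> {group: (intercept, slope)}."""
    grouped = defaultdict(lambda: ([], []))
    for group, target, won in rows:
        grouped[group][0].append(target)
        grouped[group][1].append(won)
    return {g: fit_line(xs, ys) for g, (xs, ys) in grouped.items()}


# Hypothetical placeholder rows, NOT real delegate counts.
rows = [
    ("blue", 10, 9), ("blue", 20, 18), ("blue", 30, 27),
    ("red_purple", 10, 11), ("red_purple", 20, 22), ("red_purple", 30, 33),
]
for group, (intercept, slope) in fit_by_group(rows).items():
    print(f"{group}: intercept={intercept:.2f}, slope={slope:.2f}")
```

A slope above 1 means the candidate tends to beat the target in that group of states, and a slope below 1 means the opposite.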



Bernie’s plot shows something similar


Whether or not it holds up remains to be seen.  It’s a small sample size, but it does pass the sniff test.  More liberal states support Bernie… sounds about right.  If this trend holds up then it raises the question…

Are the remaining primary states closer to Texas (red), Florida/Ohio/Michigan (purple), or Illinois/Washington (blue)?  The biggest upcoming states are Wisconsin (purple), New York (blue), Maryland (blue), Pennsylvania (purple), Indiana (red), California (blue), and New Jersey (blue).  I wrote some R code to do all of this analysis (see below), so it was trivial to calculate the exact answer.

This is where things get crazy.  Given the criticisms of the DNC’s handling of the primary process, maybe I shouldn’t have been as surprised.

Blue Shift

57% of the pledged delegates have been voted on/pledged to their respective candidates.  Of those delegates a whopping 83% of them have come from red or purple states!  Only 17% have come from blue states.  Of the remaining 43% of delegates who have yet to be voted on, 66% come from blue states!
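The split above is just a few sums over the state list.  Here is a small sketch of that bookkeeping in Python (my actual code was R); the function name blue_shift, the tuple layout, and the toy numbers are all hypothetical and are not the real delegate totals.

```python
def blue_shift(states):
    """states: list of (color, delegates, has_voted) tuples.

    Returns the share of delegates already voted on, plus the blue-state
    share among the voted and the remaining delegates.
    """
    def total(group):
        return sum(delegates for _, delegates, _ in group)

    def blue_share(group):
        group_total = total(group)
        blue = sum(d for color, d, _ in group if color == "blue")
        return blue / group_total if group_total else 0.0

    voted = [s for s in states if s[2]]
    remaining = [s for s in states if not s[2]]
    all_delegates = total(states)
    if all_delegates <= 0:
        raise ValueError("no delegates in input")
    return {
        "voted_share": total(voted) / all_delegates,
        "blue_share_voted": blue_share(voted),
        "blue_share_remaining": blue_share(remaining),
    }


# Toy numbers for illustration only, NOT the real delegate totals.
toy = [
    ("blue", 10, True), ("red", 30, True), ("purple", 20, True),
    ("blue", 30, False), ("purple", 10, False),
]
for name, value in blue_shift(toy).items():
    print(f"{name}: {value:.0%}")
```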

The electorate that is about to vote comes from substantially more liberal states.  If the trend of Bernie over-performing in liberal states holds, then this could be good news for Bernie.  I ran the numbers to see if Bernie outperforming his blue state requirements by the estimated 11% would be sufficient to win.

It’s not.  He’s lost too much ground in the red and purple states.  He’d end up with 1945 delegates; about 80 short of the nomination.  He’d have to either do better in the red and purple states, or do 25% better than targets in the blue states.  Neither is easy.
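One way to set up that kind of projection is sketched below.  This is a Python illustration of the arithmetic as I have described it, not the R code itself; it assumes the candidate hits his targets exactly in the remaining red and purple states, and the function names and example numbers are hypothetical.

```python
def project_final(won_so_far, remaining_targets, blue_boost=0.11, other_boost=0.0):
    """Projected final delegate count.

    remaining_targets: list of (color, target) for states yet to vote.
    Blue-state targets are scaled by (1 + blue_boost); all other states
    are scaled by (1 + other_boost), which defaults to exactly on target.
    """
    projected = won_so_far
    for color, target in remaining_targets:
        boost = blue_boost if color == "blue" else other_boost
        projected += target * (1 + boost)
    return projected


def required_blue_boost(won_so_far, remaining_targets, needed):
    """Blue-state over-performance needed to reach `needed` delegates,
    assuming every other remaining state lands exactly on target."""
    blue = sum(t for color, t in remaining_targets if color == "blue")
    other = sum(t for color, t in remaining_targets if color != "blue")
    if blue <= 0:
        raise ValueError("no blue-state delegates remaining")
    return (needed - won_so_far - other) / blue - 1


# Hypothetical numbers for illustration only.
remaining = [("blue", 100), ("purple", 50)]
print(project_final(100, remaining))              # with an 11% blue-state boost
print(required_blue_boost(100, remaining, 270))   # boost needed to reach 270
```

Swapping in the real targets and the real delegate threshold gives the same kind of numbers I quoted above.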

Or maybe this is all just noise.  Who knows?  Only 5 blue states have voted so far; we have 10 more to go.  There’s a lot that could happen.  Either way, it should continue to be an exciting primary season.

The code and data used to generate the data and plots can be found here:

Statisticians come in from the cold

Two weeks ago my paper, Descriptive Statistics of the Genome, was accepted to the Journal of Computational Biology.  I think it was the best part of my thesis, so I’m excited to see it finally being published.


Last week I gave my first talk/lecture at the monthly luncheon of the Chicago Chapter of the American Statistical Association.  That was my first talk in a couple of years.  Last night I gave almost the same talk at my alma mater.  The title is the same as this blog post.  The thesis is that the popularity of data science has led to the creation of tools that allow statisticians to dominate the data science world with minimal effort.

What is a data scientist?

Rather than pontificate on this frequently discussed topic, I’ll point you to a panel of smart people.  I won’t ruin the surprise; you can listen to it over your morning coffee.

I closed this section by referencing Zoubin Ghahramani’s point that we need to be mindful of the whole pipeline of data and not just the statistics and machine learning.  This pipeline includes what is commonly known in industry as data engineering and visualization or front-end web design.  His point is that machine learning and statistics are just a small part of the data science pipeline.  Like this:


The point of my talk(s) was that with the modern tools that the popularity of data science, machine learning, and R has produced, the pipeline can look more like this:


All your data science are belong to us!

Required tools

  • BigQuery – a cloud-based SQL-like super-fast database system
  • bigrquery – an R package that allows you to pull data from
    BigQuery into the R environment
  • Shiny – “A web application framework for R. . . No HTML,
    CSS, or JavaScript knowledge required”

As an example I used data from a favorite website of mine, reddit.

Because I’m busy, I was happy to find a couple of people who had put all of reddit’s data into a public BigQuery dataset (fhoffa@reddit) and had created a couple of queries and ggplot graphs to display the data (Max Woolf).  So my work was already two-thirds of the way done!  That’s efficiency.  Thank you people!  All I had left to do was turn it into a Shiny App.  And that’s what I did.  You type in a subreddit and/or reddit username and you’ll get a word cloud and the posting time of the most popular posts matching that description in seconds, as soon as you make like Captain Picard and “Engage”:

The code is here.  Apologies, it’s not the prettiest.  You do have to get a free BigQuery account and put your account’s project id into the quotes that point to project in the server.R file (line 7).


I finally opened up an account with RStudio.  That’s where that Shiny output above is coming from.  It integrates nicely with RStudio and GitHub but, as I found out when I hit publish, not with iframe-ing into this WordPress site. (It looks like I’ll have to up my game to make that happen.)

So there you have it.  111GB (or 215m rows) of information processed in a matter of seconds and displayed to you using only R and web services.  It appears as if all your data science are belong to us.