# A (mostly) data-driven approach to helping Planned Parenthood

Economists tend to think a little differently about things. We generally process the world around us through the 3-tiered filter of trade-offs, opportunity costs, and credible counterfactual scenarios. I contend that this usually results in interesting insights into social-behavioral outcomes that would not otherwise be gleaned…but I’ll freely admit that sometimes (usually when applied poorly) it produces lines of thought that are just silly.

I’ve been thinking about this idea off-and-on for a little while and, to be totally honest, I’m only like 75% confident it has merit. So, by my subjective probability assessment, this post has like a 25% chance of being total kaka.

## Executive Summary

Before you read any further I will tell you that I have split the remainder into 2 parts. Part I: Background contains a fair bit of editorializing. I believe my arguments here are logically consistent and generally well supported by verifiable fact…but if you don’t care to hear my opinions on why rich Planned Parenthood donors should start thinking a little more strategically/politically, then feel free to skip Part I. Part II: Some Data Mining basically involves me using demographic data to identify places in the U.S. where there is a disconnect between expected support for Planned Parenthood and the voting record of the congressional representative.

As always, the punchline first:

My first reaction to the news item that Sheryl Sandberg had donated $1 million to Planned Parenthood (PP) was not the same glee that a lot of my fellow PP supporters had. It was a weird sensation of, ‘Oh man, the defunders are going to love this. It’s exactly what they want.’

Claim: If you’re rich and thinking about cutting Planned Parenthood a check for a million bucks, I think you could do more good for PP by using that money a little differently. The subtext to my claim is this: PP gets around 500 million bucks/yr in federal funds. If you’re rich and really believe in the mission of PP I think it is in your interest to get them votes not cash. Because if PP is defunded, the people who want to support those services won’t be on the hook for a cool mil here and there…they’ll need to come up with HUNDREDS of millions EVERY YEAR.

Methods: Basically this post focuses on demonstrating how to use demographic data from the Census Bureau to find places in the U.S. where you would expect more support for Planned Parenthood than currently exists. The data-mining exercise I’m presenting here is motivated by three observations:

1. Poll after poll after poll after poll has shown that a clear majority of Americans support Planned Parenthood. To be fair, support for PP has declined somewhat (I suspect this is due to the recent flood of false claims about federal money paying for abortions, selling body parts and other bullshit that’s been thrown about) but is still in the 60% of all Americans range. The problem (for PP supporters) is not numbers but location: opponents of Planned Parenthood are fewer in number but spatially distributed in such a way as to grant them over-representation in Congress.
2. The second observation is one that popped out at me after looking at a lot of demographic data: there seem to be a lot of places (congressional districts to be precise) where the demographics would appear friendly to Planned Parenthood (lots of women and educated people, two groups that according to Quinnipiac’s latest poll tend to strongly support Planned Parenthood) but where the Congressperson has voted in favor of defunding PP.
3. There are things that Planned Parenthood, as a government funded entity, cannot do. This means that some of the ‘groundgame’ strategies I will mention here involve not giving money to PP but rather giving that money to some other entity that can work on PP’s behalf. For example, PP cannot/should not be able to choose where to provide services based on political strategy…and PP cannot/should not be able to use federal funds to target outreach based on political strategy…but some entity called saveplannedparenthood.org (perhaps with a $1 million donation from Sheryl Sandberg) can do both of these things.

My basic contention here is that, if you are a rich supporter of Planned Parenthood, you are not up against a national trend. You enjoy the support of a clear majority of Americans. What you are facing is many localized pockets of opposition. To deal with diffuse, localized opposition you need a groundgame…and a million bucks can buy you A LOT of groundgame.

Assumptions: People like Sheryl Sandberg would likely prefer the positive press they get from giving money to Planned Parenthood over the potentially polarizing press they would get from giving money to a political campaign. My approach is political but is not partisan and avoids the need to go after congressional seats directly. It involves spending money strategically to strengthen the voice of people who already support Planned Parenthood.

Conclusions:

My conclusions (if you can even call them that) are basically 2-fold:

1. Outreach, education, and expansion of services seems like one pretty solid way to support the future of Planned Parenthood. This approach works on the linked theory that:
   1. like same sex marriage, Planned Parenthood would enjoy even more political support if more people had a personal connection to someone who has been helped by PP.
   2. more people would have a connection to someone helped by PP if more people were aware of available services in their area and if more areas had easier to access services.
2. Straight up lobbying is also an option.

If you like Option 1 then the first pass data mining strategy is to find places currently voting to defund that have large populations of ‘likely PP users.’ There is some obvious bad news here: expanding services in areas that are already voting to support federal funding for PP doesn’t solve the problem of needing ~25 congressional votes to ensure the future of PP. That means there may be some very underserved areas that get passed over for investment (under the ‘most bang for the Sheryl Sandberg buck’ criterion) in favor of other underserved areas in districts currently voting to defund. I KNOW this is a totally unsavory way to look at the problem BUT it should also be remembered that putting $250k into a clinic this year doesn’t help much if $500,000,000 goes away next year.

If you like Option 2 the data mining strategy is a little different. Rather than find likely Planned Parenthood users you are looking for likely pro-PP voters.

Caveats:

My main caveat here is that there is no reason this has to be a partisan exercise. The issue is unavoidably partisan in the sense that, in the last House vote to defund Planned Parenthood, only 2 Republicans (out of like 238) opposed defunding while only 2 Democrats (out of like 193) supported defunding…but what I want to point out immediately is that I AM NOT MAKING THIS A PARTISAN ISSUE. I’m simply suggesting that there are a lot of congressional districts that, demographically, look more pro-Planned Parenthood than their representative’s voting record. A little bit of money invested in the right places can:

1. help determine which of these places really are pro-Planned Parenthood
2. help those areas find their voice.

For the purposes of this piece I don’t care if all 435 House seats are red…as long as the votes on any measure to defund PP actually reflect the values of the congressional districts they represent.

## Data

I used some pretty basic and easy to obtain data for this:

1. Demographic data by Congressional District from the Census Bureau. I used the API to pull the following data directly into R:
   1. population by age and sex
   2. education level by age
   3. ethnic makeup
   4. poverty status
2. I matched the demographic data with a list of Congressmen for each district and their party affiliation.
3. I also used a list of results from the 2016 presidential election by congressional district.

### Resources

1. The R Script plannedparenthood.R is used to retrieve the Census Bureau data and do the analysis
2. A couple .csv and .txt files with names and party affiliations for members of the 114th Congress are here

## Background

### The basic bargain

Last bit of editorializing I promise:

My personal feeling on this (which you are all certainly welcome to disagree with) is that when it comes to use of public monies/tax dollars there’s a basic implied civil compromise we’ve had for a while:

I like schools, I like investing in the future (NSF, research grants to universities, etc), I like agricultural subsidies (mostly because I dig the win-win that happens when we do shit like buy surplus milk from dairy farmers to keep them from going broke when supply shocks hit and then turn around and provide that milk at very low cost to poor kids in schools).

I don’t really like that my tax dollars are used to subsidize oil companies, or that my tax dollars are used to support religious shit like Ken Ham’s weird Young Earth Creationist Playground (P.S.: yes, my tax dollars go there. I live in a donor state, KY is a ‘receiver’ state).

Here’s the thing, there’s a bunch of people in places like Alabama’s 4th Congressional District who are the exact opposite of me. They love the shit I don’t like and hate the stuff I like. But for years I feel like we had a basic compromise: I pay my taxes, they pay their taxes, they get some stuff that they like (oil subsidies and Young Earth Creation Museums), and I get some stuff I like (good public schools, poverty reduction programs, etc).

There is now a brand of voters, generally Tea Party-aligned, who want to trash the implied compromise I mentioned. That bothers me because I may not love my taxes supporting a military-industrial complex or religious subsidies but I’ve never tried to kill those things. To be clear, there have always been libertarians and libertarian-leaning folks who just want to take an axe to everything. I don’t typically agree with these folks but they don’t really bother me. There have also always been people who think we spend too much or spend inefficiently on some things. Again, I got no beef (at least no beef that can’t be discussed/debated like civilized adults) with somebody who says, “the current system of funding Planned Parenthood is a bad way to provide a safety net focused on women’s health issues. We need to rethink whether federal funding for Planned Parenthood is the best way to address the issues at hand.”

## The Data Part

My basic premise as I mentioned is this: if the ‘defunders’ get their way, Sheryl Sandberg isn’t getting a break on her taxes, she’s just going to see less of her money going to the shit she likes (assuming she likes Planned Parenthood, clean air, and clean water) and more of her money going to the shit that the ‘defunders’ like (Bible themed amusement parks, subsidies for oil and gas companies, etc.)

The ‘defunders’ aren’t saying, “Ok, let’s give everybody their money back and you pay for the stuff you want and we’ll pay for the stuff we want.” They’re saying, “Hey Sheryl Sandberg, we’d like to take all of your tax money to pay for our stuff and, oh-by-the-way, if you like Planned Parenthood you’ll have to cut a separate check for that because fuck you, that’s why.”

At this point, I feel like I should probably abstract away from Sheryl Sandberg because I don’t really know anything about her politics or the strength of her emotional connection to Planned Parenthood. So let’s suppose there is an entity (let’s call it the SSLE - Sheryl Sandberg Like Entity) that:

• is rich
• really likes Planned Parenthood and, importantly, wants PP to be able to continue providing health and wellness services to females for generations to come
• has $1 million burning a hole in their pocket
• wants to support the mission of Planned Parenthood honestly (i.e. not just looking to lessen their own tax burden through some charitable contributions)
• is looking for the biggest impact for their gift

My contention is pretty simple: votes are more important than money. Planned Parenthood gets something in the neighborhood of $500 million a year in federal funding. If Congress defunds it, rich people wanting to continue providing those services aren’t going to be on the hook for $1 million here and there…they’re gonna have to come up with 100’s of MILLIONS of dollars EVERY year…all while watching their tax money get appropriated for stuff they may not like, such as spending 30 billion on a dumbass border wall.

The last vote on defunding Planned Parenthood was 241 - 187 (basically a straight party line vote with Lipinski and Peterson the only 2 Democrats voting to defund and Dold and Hanna the two lone Republicans voting not to defund…and 6 not voting). So you need to gain 27/28 votes in the House to keep this issue off the table (I want to be crystal clear that I’m not suggesting Sheryl Sandberg or a SSLE become a total partisan hack or political animal. The votes to protect Planned Parenthood, I believe, could be gained without ‘flipping’ a district…I’ll explain later).

I’m not sure picking up these votes (or at least enough to convince Paul Ryan it’s not worth the fight to raise the issue) would be as hard as it sounds.
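For anyone wondering where the 27/28 comes from, it is just arithmetic on the roll call. Here is a quick sketch in R. It assumes the same 428 members vote again and nobody moves except the ones who switch from ‘defund’ to ‘don’t defund’ (my simplification, not a forecast). A tie fails in the House, which is why there are two numbers.

```r
# Roll-call totals quoted above (the 6 members not voting are ignored).
yes_defund <- 241
no_defund  <- 187

# Each member who switches sides moves one vote from 'yes' to 'no'.
switches  <- 0:30
yes_after <- yes_defund - switches
no_after  <- no_defund + switches

# A measure needs more yeas than nays, so a tie is enough to sink it.
min_to_block   <- min(switches[yes_after <= no_after])  # tie or better
min_to_outvote <- min(switches[no_after > yes_after])   # outright majority against

c(block = min_to_block, outvote = min_to_outvote)
```

Twenty-seven switches gets you a 214 - 214 tie and twenty-eight gets you a 215 - 213 win.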

### The Outline

If I were tasked with using 1 million bucks to try and find votes for Planned Parenthood (against measures to defund Planned Parenthood) in the House of Reps, my high-level strategy would look something like this:

1. Identify areas of high ROI - areas currently voting against PP that look like they should be voting for it.
2. Allocate a ‘first cut’ of the money to getting some ‘boots on the ground’ intel about these areas
3. Use the info from 2 to develop a more in-depth, targeted strategy for each area
4. Price out the strategies developed in 3 and figure out which combination satisfies the budget constraint and results in maximum vote pick up.

The remainder of this post is a preview of sorts. It is specifically devoted to the first pass empirical work mentioned in item 1 above. Sorry rich people, if you want more detail on 2 - 4 I’m gonna need a retainer.
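That said, item 4 is easy enough to sketch in the abstract because it is basically a small knapsack problem. The example below is a toy: the district labels, costs (in thousands of dollars), vote probabilities, and budget are all invented for illustration and none of them come from the data in this post. With only a handful of candidate plans you can just brute-force every combination.

```r
# Entirely hypothetical strategies: cost in $1,000s and a guessed probability
# that the strategy changes that district's vote. None of this is real data.
plans <- data.frame(
  district = c("A", "B", "C", "D", "E"),
  cost     = c(400, 300, 250, 200, 150),
  p_vote   = c(0.50, 0.40, 0.30, 0.30, 0.10),
  stringsAsFactors = FALSE
)
budget <- 800  # made-up amount left after the 'first cut' research money

# Every subset of plans as a row of 0/1 indicators (2^5 = 32 rows).
subsets <- as.matrix(expand.grid(rep(list(0:1), nrow(plans))))

total_cost     <- as.vector(subsets %*% plans$cost)
expected_votes <- as.vector(subsets %*% plans$p_vote)

# Keep the affordable subsets and take the one with the most expected votes.
affordable <- total_cost <= budget
best <- which(affordable & expected_votes == max(expected_votes[affordable]))[1]

chosen <- plans$district[subsets[best, ] == 1]
chosen
total_cost[best]
expected_votes[best]
```

With these made-up numbers the winner is B, C and D together (750 spent, one expected vote), which beats the more obvious move of funding the single most promising district first. Brute force stops being practical somewhere past 20 or so candidate plans, at which point a proper integer programming solver would be the way to go.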

### Idea 1: Expand/Improve services in strategic areas

This probably sounds like a ‘no-duh’ idea and there might be a few people saying, “obviously, your fictional SSLE gives a million bucks to Planned Parenthood and then Planned Parenthood uses that money to provide health and wellness services to women…great idea Captain Obvious.”

My proposal is a bit more nuanced than that. I’m suggesting that, in this expansion of services and education, Planned Parenthood (or more likely an agent working on their behalf) be somewhat strategic about where money is used.

A 2009 Gallup poll found that people are more likely to support marriage equality if they know someone who is gay. Several studies and pundits since then have attributed the rise in support for same sex marriage (to the point that it is no longer even a controversial issue) to the large and growing number of people who personally know someone who is gay.

While not exactly related to the phenomenon above, here is an interesting discussion of the evolution of public opinion on same sex marriage by Nate Silver

It’s actually more than a little unsavory to think about Planned Parenthood targeting areas for facilities upgrades, outreach services, etc. based on propensity to influence congressional votes…but I see this fact as an opportunity for our SSLE.

Solution: the SSLE gives $1 million to saveplannedparenthood.org, which uses part of the donation to commission a study of the most strategic locations for Planned Parenthood support, then uses the rest of the money to:

1. make donations to local offices (yes, you can apparently donate directly to local area offices) or
2. establish a local office or outreach/educational center in a ‘high priority’ area that doesn’t currently have services.

Quick Aside: yes, technically NARAL is a group already set up to do some of the lobby-type work I’m talking about…but they are really bad at it. Campaign contribution reports indicate that in 2016 NARAL spent 11,000 bucks to oppose Bobby Jindal’s presidential campaign and 100,000 duckets to defeat Marco Rubio in the Senate. Bobby Jindal never polled above 2 percent in any state and Marco Rubio’s re-election to the Senate was pretty much foregone in 2016. The 111 large they spent on these two opposition campaigns was totally wasted.

### How do we prioritize areas?

Presumably a consulting/strategy firm could suss this out pretty quick for $200-300k…but if the SSLE wanted to hire me to do it here is how I might start:

Planned Parenthood by the Numbers reports that services recipients are overwhelmingly:

• females over the age of 20 and
• people with incomes at or below 150% of the poverty line

Since I know the last House vote on defunding basically resulted in all Republicans voting to defund I don’t really need to mess around with determining who voted ‘defund’ versus not. I can just look for Republican Congressional Districts with populations skewed towards the biggest statistical users of PP.

The steps are carried out below but here is what I did:

1. Got the data on party affiliation of each Congressional Rep for each district

2. pulled Census Bureau data on total population and female population by age and by Congressional District

3. pulled Census Bureau data on poverty status among females ages 25 - 44 by Congressional District.

4. Pulled some other demographics that I want to use later

5. Merged the demographic data and the party affiliation data

6. Filtered for Republican districts and sorted by female population and female poverty status to find Republican districts with demographics fitting the ‘Planned Parenthood users’ demographic.
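Before the real script, here is a stripped-down sketch of those six steps on toy data, using only base R. Everything in it is invented for illustration: the three ‘Toyland’ districts, the FIPS code 99, the population and poverty counts, and the party labels. The one real detail is the shape of the response, a header row followed by one row per district, which is what the API calls further down get back.

```r
# A mocked-up, already-parsed API response: header row + one row per district.
resp <- list(
  c("NAME", "B01001_026E", "state", "congressional district"),
  c("Congressional District 1 (114th Congress), Toyland", "350000", "99", "01"),
  c("Congressional District 2 (114th Congress), Toyland", "410000", "99", "02"),
  c("Congressional District 3 (114th Congress), Toyland", "380000", "99", "03")
)

# Steps 2-4: drop the header row and stack the rest into a data frame.
rows <- do.call(rbind, resp[-1])
demo <- data.frame(
  name         = rows[, 1],
  total_female = as.numeric(rows[, 2]),
  state.code   = as.numeric(rows[, 3]),
  cd           = as.numeric(rows[, 4]),
  stringsAsFactors = FALSE
)
demo$pov.female <- c(30000, 20000, 45000)  # invented counts of women in poverty

# Step 1: party of each district's representative (also invented).
reps <- data.frame(state.code = 99, cd = 1:3, party = c("R", "D", "R"),
                   stringsAsFactors = FALSE)

# Step 5: merge demographics with party affiliation.
merged <- merge(demo, reps, by = c("state.code", "cd"))
merged$pct.pov.female <- merged$pov.female / merged$total_female

# Step 6: keep Republican districts, highest female poverty share first.
targets <- merged[merged$party == "R", ]
targets <- targets[order(-targets$pct.pov.female),
                   c("name", "party", "pct.pov.female")]
targets
```

The real code does the same thing with dplyr joins and a few more demographic columns.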

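########################################################
#packages used below
#NOTE: the post does not show which packages were loaded. RJSONIO is an
# assumption (any fromJSON() that accepts a URL and returns a list would do);
# data.table provides rbindlist(), dplyr the pipes and joins, ggplot2 the plots
library(RJSONIO)
library(data.table)
library(dplyr)
library(ggplot2)

#some data cleaning
#----------------------------------------------------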
house2016 <- strsplit(as.character(house2016[,1]),"\\s+")

#first just take out all the list elements that have exactly 5 elements because
# these will be easy to deal with
goods <- which(lapply(house2016,function(x)length(x))==5)

good.members <- data.frame(rbindlist(lapply(goods,function(i){
tmp <- house2016[[i]]
cd <- unlist(strsplit(tmp[[1]],"[.]"))
data.frame(cd,firstname=tmp[2],lastname=tmp[3],party=tmp[4],state=tmp[5])
})))

#probably have to do the others by hand

#there are only 13 bad ones at this point...
r1 <- data.frame(cd=23,firstname='Debbie', lastname='Wasserman Schultz', party='D',state='FL')
r2 <- data.frame(cd=8,firstname='Chris Van', lastname='Hollen', party='D',state='MD')
r3 <- data.frame(cd=2,firstname='Ann McLane', lastname='Kuster', party='D',state='NH')
r4 <- data.frame(cd=12,firstname='Bonnie', lastname='Watson Coleman', party='D',state='NJ')
r5 <- data.frame(cd=1,firstname='Michelle', lastname='Lujan Grisham', party='D',state='NM')
r6 <- data.frame(cd=3,firstname='Ben Ray', lastname='Lujan', party='D',state='NM')
r7 <- data.frame(cd=15,firstname='Jose E', lastname='Serrano', party='D',state='NY')
r8 <- data.frame(cd=18,firstname='Sean Patrick', lastname='Maloney', party='D',state='NY')
r9 <- data.frame(cd=18,firstname='Sheila', lastname='Jackson Lee', party='D',state='TX')
r10 <- data.frame(cd=1,firstname='G.K.', lastname='Butterfield', party='D',state='NC')
r11 <- data.frame(cd=30,firstname='Eddie Bernice', lastname='Johnson', party='D',state='TX')
r12 <- data.frame(cd=3,firstname='Jamie', lastname='Herrera Beutler', party='R',state='WA')
r13 <- data.frame(cd=5,firstname='Cathy', lastname='McMorris Rodgers', party='R',state='WA')

df <- rbind(r1,r2,r3,r4,r5,r6,r7,r8,r9,r10,r11,r12,r13)

house2016 <- rbind(good.members,df)

#last step is to get the party affiliation
party.afil <- lapply(house2016$party,function(x){
  tmp <- strsplit(as.character(x),'[())]')
  if(length(unlist(tmp))==2){
    party <- as.character(unlist(tmp)[2])
  }else{
    party <- as.character(tmp)
  }
  return(party)
})

house2016$party <- unlist(party.afil)
###############################################################################

###############################################################################
#now get demographic data by congressional district
#-------------------------------------------------------------------------------------
key <- 'getyourownAPIkey'
#Now get 3 simple demographics for 2012 and 2016

#1. percent female
#2. percent black
#3. percent with a college degree
#4. age structure...let's just use percent 25 - 50

#-------------------------------------------------------------------------------
# age and sex
series.males <- c('001E','002E','007E','008E','009E','010E','011E','012E','013E','014E','015E','016E')
series.females <- c('026E','031E','032E','033E','034E','035E','036E','037E','038E','039E','040E')
series <- c(series.males,series.females)

series <- paste('B01001_',series,sep="")
series.names<- c('total pop','total male','m18_19','m20','m21','m22_24','m25_29','m30_34',
'm35_39','m40_44','m45_49','m50_54','total female','f18_19','f20','f21','f22_24',
'f25_29','f30_34',
'f35_39','f40_44','f45_49','f50_54')

pop.fn <- function(i,yr){
resURL <- paste('http://api.census.gov/data/',yr,
'/acs1?get=NAME,',
series[i],
'&for=congressional+district:*&key=',key,
sep="")
ljson <- fromJSON(resURL)
ljson <- ljson[2:length(ljson)]
tmp <- data.frame(unlist(lapply(ljson,function(x)x[1])),
unlist(lapply(ljson,function(x)x[2])),
unlist(lapply(ljson,function(x)x[3])),
unlist(lapply(ljson,function(x)x[4])),
series.names[i])
names(tmp) <- c('name','variable','state','congressional_district','series_name')
return(tmp)
}

age.sex <- pop.fn(i=1,yr=2015)
names(age.sex) <- c('name','total_pop','state','congressional_district','series')
age.sex$total_female <- pop.fn(i=13,yr=2015)[,2]
age.sex$f25 <- pop.fn(i=18,yr=2015)[,2]
age.sex$f30 <- pop.fn(i=19,yr=2015)[,2]
age.sex$f35 <- pop.fn(i=20,yr=2015)[,2]
age.sex$f40 <- pop.fn(i=21,yr=2015)[,2]
age.sex$f45 <- pop.fn(i=22,yr=2015)[,2]
age.sex$f50 <- pop.fn(i=23,yr=2015)[,2]
#----------------------------------------------------------------------------------

#----------------------------------------------------------------------------------
#percent black

#2015 1 yr ACS estimate
#black population
resURL <- 'http://api.census.gov/data/2015/acs1?get=NAME,B02001_003E&for=congressional+district:*&key=???'
ljson <- fromJSON(resURL)
pct.black <- data.frame(rbindlist(lapply(ljson,function(x){
  tmp <- unlist(x)
  return(data.frame(name=tmp[1],pop.black=tmp[2],state=tmp[3],cd=tmp[4]))
})))

#total population
resURL <- 'http://api.census.gov/data/2015/acs1?get=NAME,B02001_001E&for=congressional+district:*&key=???'
ljson <- fromJSON(resURL)
total.pop <- data.frame(rbindlist(lapply(ljson,function(x){
  return(data.frame(name=unlist(x)[1],total.pop=unlist(x)[2]))
})))

pct.black <- tbl_df(pct.black) %>% inner_join(total.pop,by=c('name'))
#---------------------------------------------------------------------------------

#---------------------------------------------------------------------------------
#poverty percentages
resURL <- 'http://api.census.gov/data/2015/acs1?get=NAME,B17001_025E&for=congressional+district:*&key=???'
ljson <- fromJSON(resURL)
poverty.f25 <- data.frame(rbindlist(lapply(ljson,function(x){
  return(data.frame(name=unlist(x)[1],pop.poverty=unlist(x)[2]))
})))

resURL <- 'http://api.census.gov/data/2015/acs1?get=NAME,B17001_026E&for=congressional+district:*&key=???'
ljson <- fromJSON(resURL)
poverty.f35 <- data.frame(rbindlist(lapply(ljson,function(x){
  return(data.frame(name=unlist(x)[1],pop.poverty=unlist(x)[2]))
})))

#drop the header row from each and put the two age bands side by side
pov.df <- data.frame(cbind(poverty.f25[2:nrow(poverty.f25),],poverty.f35$pop.poverty[2:nrow(poverty.f35)]))
names(pov.df) <- c('name','pov25','pov35')
#---------------------------------------------------------------------------------

#-----------------------------------------------------------------------------------
#Education Level

#set up a function for this too...just for compactness
edu.fn <- function(yr){
resURL <- paste('http://api.census.gov/data/',yr,
'/acs1/subject?get=NAME,S1501_C01_006E&for=congressional+district:*&key=',key,sep="")
ljson <- fromJSON(resURL)
name <- unlist(lapply(ljson[2:length(ljson)],function(x)x[1]))
pop25 <- unlist(lapply(ljson[2:length(ljson)],function(x)x[2]))
state <- unlist(lapply(ljson[2:length(ljson)],function(x)x[3]))
congressional_district <- unlist(lapply(ljson[2:length(ljson)],function(x)x[4]))
df <- data.frame(name=name,pop25=pop25,state=state,congressional_district=congressional_district,
source=paste('ACS_1yr_',yr,sep=""))

#add in the number of people 25 and over with a bachelor's degree
resURL <- paste('http://api.census.gov/data/',yr,
'/acs1/subject?get=NAME,S1501_C01_012E&for=congressional+district:*&key=',key,sep="")
ljson <- fromJSON(resURL)
pop25_bachelors <- unlist(lapply(ljson[2:length(ljson)],function(x)x[2]))

  df$pop25_bachelors <- pop25_bachelors
  return(df)
}

edu2015 <- edu.fn(yr=2015)
#---------------------------------------------------------------------------------

#---------------------------------------------------------------------------------
#add state abbreviations to the age.sex data frame
state.codes <- read.csv('data/state_fips_codes.csv') %>% select(code,abb)
names(state.codes) <- c('state.code','abb')

names(house2016) <- c('cd','firstname','lastname','party','abb')
house2016$cd <- as.numeric(as.character(house2016$cd))
house2016$cd[is.na(house2016$cd)] <- 0

age.sex <- age.sex %>% mutate(state.code = as.numeric(as.character(state))) %>%
  inner_join(state.codes,by=c('state.code')) %>%
  mutate(cd=as.numeric(as.character(congressional_district))) %>%
  inner_join(house2016,by=c('abb','cd'))
#---------------------------------------------------------------------------------

#---------------------------------------------------------------------------------
#now bring in the % black
pct.black <- pct.black %>% filter(row_number() > 1) %>%
  mutate(cd=as.numeric(as.character(cd)),
         state.code=as.numeric(as.character(state)),
         pop.black=as.numeric(as.character(pop.black)),
         total.pop=as.numeric(as.character(total.pop)),
         pct.black=pop.black/total.pop) %>%
  select(cd,state.code,pct.black)

age.sex <- age.sex %>% inner_join(pct.black,by=c('state.code','cd'))
#---------------------------------------------------------------------------------

#---------------------------------------------------------------------------------
#bring in education
edu2015 <- edu2015 %>% mutate(state.code=as.numeric(as.character(state)),
                              cd=as.numeric(as.character(congressional_district)),
                              pop25=as.numeric(as.character(pop25)),
                              pop25_bachelors=as.numeric(as.character(pop25_bachelors)),
                              pct.bachelors=pop25_bachelors/pop25) %>%
  select(state.code,cd,pct.bachelors)

age.sex <- age.sex %>% inner_join(edu2015,by=c('state.code','cd'))
#---------------------------------------------------------------------------------

#---------------------------------------------------------------------------------
#add in the female poverty
age.sex <- age.sex %>% left_join(pov.df,by=c('name')) %>%
  mutate(pov.female=as.numeric(as.character(pov25))+
           as.numeric(as.character(pov35)),
         pct.pov.female=as.numeric(as.character(pov.female))/as.numeric(as.character(total_female)))
#---------------------------------------------------------------------------------

#filter and have a look
age.sex %>% select(name,party,pct.pov.female,pct.female) %>%
  filter(party=='R') %>% arrange(-pct.pov.female,-pct.female)

| | name | party | pct.pov.female | pct.female |
|---|---|---|---|---|
| 1 | Congressional District 21 (114th Congress), California | R | 0.09294807 | 0.11997973 |
| 2 | Congressional District 5 (114th Congress), Kentucky | R | 0.08203590 | 0.13007997 |
| 3 | Congressional District 3 (114th Congress), West Virginia | R | 0.07256008 | 0.12433657 |
| 4 | Congressional District 5 (114th Congress), Louisiana | R | 0.07208806 | 0.12840255 |
| 5 | Congressional District 4 (114th Congress), Louisiana | R | 0.06974929 | 0.13060522 |
| 6 | Congressional District 4 (114th Congress), Mississippi | R | 0.06843843 | 0.13148905 |
| 7 | Congressional District 2 (114th Congress), Alabama | R | 0.06394077 | 0.13710746 |
| 8 | Congressional District 1 (114th Congress), Georgia | R | 0.06357490 | 0.13566131 |
| 9 | Congressional District 3 (114th Congress), Louisiana | R | 0.06072020 | 0.14015800 |
| 10 | Congressional District 5 (114th Congress), Texas | R | 0.06039765 | 0.12990302 |
| 11 | Congressional District 8 (114th Congress), California | R | 0.06026953 | 0.12637231 |
| 12 | Congressional District 8 (114th Congress), Georgia | R | 0.05971084 | 0.12878745 |
| 13 | Congressional District 12 (114th Congress), Georgia | R | 0.05967383 | 0.13269793 |
| 14 | Congressional District 10 (114th Congress), California | R | 0.05909745 | 0.13341327 |
| 15 | Congressional District 4 (114th Congress), Arkansas | R | 0.05726820 | 0.12494924 |
| 16 | Congressional District 2 (114th Congress), New Mexico | R | 0.05719204 | 0.11793210 |
| 17 | Congressional District 4 (114th Congress), Alabama | R | 0.05718612 | 0.12638659 |
| 18 | Congressional District 1 (114th Congress), Arkansas | R | 0.05608055 | 0.12807827 |
| 19 | Congressional District 8 (114th Congress), North Carolina | R | 0.05586123 | 0.12860101 |
| 20 | Congressional District 23 (114th Congress), California | R | 0.05563970 | 0.13073820 |
| 21 | Congressional District 22 (114th Congress), California | R | 0.05551996 | 0.13542188 |
| 22 | Congressional District 6 (114th Congress), Kentucky | R | 0.05439300 | 0.13676195 |
| 23 | Congressional District 1 (114th Congress), Louisiana | R | 0.05365683 | 0.14032922 |
| 24 | Congressional District 4 (114th Congress), Washington | R | 0.05299454 | 0.12357510 |

If I were operating under the assumption that the more people know about Planned Parenthood and, more specifically, the more people know a person who has been helped by Planned Parenthood, the more likely they are to oppose defunding it, then 2 courses of action seem fruitful:

1. make sure that people in areas with a large ‘PP demographic’ know what Planned Parenthood services are available to them and how to access those services, and
2. if PP services are not available to populations with a large ‘PP demographic’, figure out how some of the SSLE gift money can be used to provide such services.

Admittedly, this strategy is a little more of a long game. The top district on this list (CA-21) is a pretty contested district but the rest of the names are pretty solid red districts (TX-5 with a PVI of R + 17, LA-1 with a PVI of R + 23). I know I said this wasn’t about flipping districts but it is about using leverage…and if you’re a Republican repping a district that’s R + 23 you probably aren’t living in fear of a pro-PP voter uprising (at least in the short-run). That doesn’t mean this strategy doesn’t have merit…it just means a little patience might be required.

### Idea 2: Target potential supporters rather than potential users

This is a little bit of spin on the data-mining above…here, rather than ask, “who are the primary users of PP,” I’m going to ask, “who and where are the usual supporters of PP?”

The basic strategy of this analysis is to find places where demographics look friendly to PP and identify which of these places are currently represented by someone voting to defund PP. The empirical strategy I’m going to use is pretty simple: set up a classification model to predict whether a Congressional District is Democrat or Republican based on demographics, then look at where that model fails (generates bad predictions).

#### A Classification Tree

Classification and Regression Trees are some of the most popular ‘learning’ algorithms…due in no small part, I’m sure, to their conceptual simplicity. Classification trees, like logistic regression or support vector machines, attempt to classify a response variable based on an input set. They work by recursively partitioning the data and looking to form local clusters of reasonably homogeneous observations.

One of the popular features of classification trees is that they generate local predictors. In contrast to something like logistic regression, which generates a predictive model that is supposed to hold over the entire span of the data, classification trees try to break the data down into small groups of similar observations and can apply different models on each part.

In the following I’m going to specify a model that tries to ‘learn’ the party affiliation of a Congressional District’s representative based on three demographic factors:

1. black population as a percent of total population
2. females 25 - 45 as a percent of total population
3. percent of total population holding a bachelor’s degree.

# try a classification tree application just for fun...
# here we will use a classification tree to predict whether the district is R or D based
# on demographics
library(tree)

#recode Republican districts = 1
#NOTE: as written, female.votingage sums the 25-29, 30-34, 40-44 and 50-54 bands
# (f35 and f45 are skipped), which does not match the 25 - 45 range in the text
age.sex <- age.sex %>% mutate(z = ifelse(party=='R',1,0),
                              female.votingage=as.numeric(as.character(f25))+
                                as.numeric(as.character(f30))+
                                as.numeric(as.character(f40))+
                                as.numeric(as.character(f50)),
                              total_pop=as.numeric(as.character(total_pop)),
                              pct.female=female.votingage/total_pop)

#have a quick look at some factor distributions
ggplot(age.sex,aes(x=party,y=pct.bachelors)) + geom_boxplot()
ggplot(age.sex,aes(x=party,y=pct.female)) + geom_boxplot()

#a classification tree
tree.model <- tree(factor(party) ~ pct.black + pct.bachelors + pct.female, data=age.sex)
tree.model
my.prediction <- predict(tree.model, age.sex) # gives the probability for each class
head(my.prediction)

#where does the classification tree do a bad job with republican districts
#(my.prediction has one column per class, so pull out the 'R' column)
post.est <- age.sex %>% select(name,z) %>% mutate(pR=my.prediction[,'R']) %>%
  filter(z==1 & pR<0.5) %>% arrange(pR)

plot(tree.model)
text(tree.model)

#look at number correctly classified
maxidx <- function(arr) {
  return(which(arr == max(arr)))
}
idx <- apply(my.prediction, c(1), maxidx)
prediction <- c('D', 'R')[idx]
age.sex$tree.pred <- prediction

    # compare number correctly classified to what we would get from a logit model
    # logit model
    age.sex$pr.R <- predict.glm(vote.red, newdata = age.sex, type = "response")
    age.sex <- age.sex %>%
      mutate(logit.correct = ifelse(pr.R > 0.5 & party == 'R', 1,
                                    ifelse(pr.R < 0.5 & party == 'D', 1, 0)))
    logit.bad <- nrow(age.sex) - sum(age.sex$logit.correct)
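
The 'look at where the model fails' step can be tried end to end on made-up data. What follows is only a sketch and not the analysis from this post: the `districts` data frame, its coefficients, and every number in it are simulated assumptions, and a plain logit fit stands in for whichever classifier you prefer. It builds a confusion table, counts the misclassified districts, and lists the Republican-held districts that the model expected to be Democratic.

```r
# Simulated stand-in for the district data; all values are invented for illustration.
set.seed(42)
n <- 200
districts <- data.frame(
  name          = paste("District", seq_len(n)),
  pct.black     = runif(n, 0, 0.5),
  pct.bachelors = runif(n, 0.1, 0.5),
  pct.female    = runif(n, 0.10, 0.16)
)

# Hypothetical data-generating process (an assumption, not an estimate):
# higher pct.black and pct.bachelors lower the chance of an R representative.
lin <- 2 - 6 * districts$pct.black - 4 * districts$pct.bachelors
districts$party <- ifelse(runif(n) < plogis(lin), "R", "D")
districts$z <- as.integer(districts$party == "R")  # 1 = Republican, as in the post

# Logit model for P(district is R) given demographics
fit <- glm(z ~ pct.black + pct.bachelors + pct.female,
           data = districts, family = binomial)
districts$pr.R <- predict(fit, newdata = districts, type = "response")
districts$pred <- ifelse(districts$pr.R > 0.5, "R", "D")

# Confusion table: rows are actual party, columns are predicted party
confusion <- table(
  actual    = factor(districts$party, levels = c("D", "R")),
  predicted = factor(districts$pred,  levels = c("D", "R"))
)
n.bad <- sum(districts$pred != districts$party)  # number misclassified

# R-held districts whose demographics "look" D, most surprising first
surprises <- districts[districts$z == 1 & districts$pr.R < 0.5, c("name", "pr.R")]
surprises <- surprises[order(surprises$pr.R), ]

print(confusion)
print(n.bad)
print(head(surprises))
```

The rows of `surprises` play the same role as `post.est` in the code above: Republican-held districts whose demographics point the other way. Running the same tabulation on the tree predictions and on the logit predictions gives a like-for-like count of misses.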

1. Look at voter registration statistics from the various state level Secretary of State websites (here are the stats by county for CA, here are the stats by county for CO). Could part of the ‘problem’ be coming from low voter turnout or low voter registration numbers? Cost estimate: $3,000 (~20 hours of research time)
   1. Are there a lot of young people who might view Planned Parenthood favorably who are not registered to vote?
   2. Are there lots of young people, females, poor people who don’t vote?
2. Prioritize areas where there isn’t an obvious explanation for the PP opposition and commission some surveys/interviews to find out where people in the district stand on funding PP. Cost estimate: $100k ($10k each to the 10 most ‘promising’ districts).
3. On the basis of 1 and 2 above, develop a more refined strategy for allocating remaining money:
   1. among districts, and
   2. among functions (enhancing services to better serve a population in need, voter registration, education/outreach, organization/mobilization of existing supporters).

## Final Words

I’ve rambled long enough and I think you guys get the gist, so let me wrap this up: something like 60% of Americans support Planned Parenthood, yet PP faces extinction because a relatively small group of people wielding a disproportionately large amount of political influence really, really, really don’t like PP. If supporters of Planned Parenthood want to make a difference they need to recognize that this isn’t a national fight, it’s a hyper-regional fight. I’ve presented the skeleton of a strategy that I contend could pick up 10 - 12 House votes for Planned Parenthood for less than $1 million. I think that’s a pretty decent ROI.
• 60 percent chance I can get you 10 votes for $1 million
• 75 percent chance I can get you 6 votes for $1 million, and
• 30 percent chance I can get you 15 votes for $1 million

Also, I work cheap…If you hired a real strategy firm they’d charge you $500 an hour for the same stuff I just gave you for free.
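
For anyone who wants to check the back-of-envelope numbers, here is a rough sketch of the arithmetic behind the cost estimates and the odds quoted in this post. The variable names are made up for illustration, and treating the full $1 million as spent in every scenario is a simplifying assumption.

```r
# Figures quoted in the post
budget   <- 1e6          # the $1 million gift
research <- 3000         # step 1: ~20 hours of research time
surveys  <- 10 * 10000   # step 2: $10k each for the 10 most promising districts

# Money left for step 3 (services, registration, outreach, mobilization)
remaining <- budget - research - surveys

# Hourly rate implied by the research estimate
hourly.rate <- research / 20

# Cost per House vote in each scenario, assuming the whole budget is spent
scenarios <- data.frame(
  votes  = c(6, 10, 15),
  chance = c(0.75, 0.60, 0.30)
)
scenarios$cost.per.vote <- budget / scenarios$votes

print(remaining)
print(hourly.rate)
print(scenarios)
```

Steps 1 and 2 use up only about a tenth of the money, which leaves most of it for whatever the research and surveys say is worth doing.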