Connect to Amazon RDS DB with MySQL Workbench

I’ve been wanting to beef up my AWS skills for a long time. The main thing that’s slowed me down is that we cannot store government data in the AWS ecosystem. This isn’t really a hard roadblock; it’s just that a lot of my blog content is generated from little programming or data hurdles I encounter at work.

Read More

Fuzzy joins

Been a long time... if that intro didn’t immediately make you think of Christopher Walken then I’m begging you to watch this:

Read More

Research notes visually identifying intervention effects

It will come as no surprise to people here that I think the concept of determining the impact of some event by looking at whether a line went up or down around the time of that event is farcical. I also realize that talking about the ‘correlation isn’t causation’ cliche to a bunch of statistically literate people is totally unnecessary. Yet, I feel compelled to write about this phenomenon because it’s so pervasive in my life:

Read More

A quick and dirty machine learning post with Python and scikit-learn

I found this post on the interwebs and thought it was pretty cool. I mean, I’m not enamored with the whole “don’t worry about understanding what it’s doing, just run the code and get a feel for how to do it” vibe. As a practicing empiricist, I’m well aware that anyone can run the same R/Python/whatever code that I use to run a Neural Network, Support Vector Machine, Classification-and-Regression Tree, or [insert hip new ensemble method here]. The thing that makes me worth anything at all (if I am indeed worth anything) isn’t that I know how to tell R to train a Neural Network; it’s that I know what the code is doing when I give it that command. I have a decent (a little better than most, a lot worse than a few) grasp of the technical detail and nuance (read: the math) of the popular machine learning and applied statistical algorithms used for prediction.

Read More

Machine learning and econometrics 1

Full disclosure: I’m not 100% confident that my assessment of Linear Discriminant Analysis relative to Logistic Regression is totally accurate. I’ve been thinking pretty hard about this for the past couple of days, so I’m reasonably confident that I’ve not said anything ridiculous here... but if you strongly disagree with my characterization of LDA estimated coefficients relative to Logit coefficient estimates, I’d love for you to drop some knowledge on me.

Read More

Affordable housing in Santa Cruz part 2: more thoughts on accessory dwelling units

In my last post I tried to set the stage for a discussion (one I’m pretty much just having with myself at the moment) on regional housing policy in Santa Cruz County. For whatever it’s worth, I also blogged about this a while back on The Samuelson Condition Blog. Today’s discussion on Accessory Dwelling Units will be decidedly more nuanced than what I wrote a year ago and, as any good Bayesian would when confronted with additional data points, my opposition to subsidized ADUs has softened somewhat.

Read More

Mining Facebook with Python: proof of concept

This is going to be a pretty remedial post about using Facebook’s Graph API with Python. At this point I’ve only figured out how to do some pretty basic shit... enough to share but probably not enough to be really cool. I’m planning on posting a follow-up this week where I’ll focus on shoving the dictionary and list output you’ll see here into pandas dataframes.

Read More

Connect Python to Oracle DB with pyodbc

I’m only the 3 billionth blogger to write about this but for some reason, even with the interwebs saturated with Python-Oracle connection examples, this still took me pretty much the whole day to figure out.

Read More

Automating Census data pulls with R

I had a pretty cool little model for quantitative micro-targeting of policy issues that I wanted to show you guys today. But I had to get this Census Bureau API-R relationship smoothed out for a work thing and I’m kinda thinking it will have more universal appeal. So I’m posting it first.

Read More

Working with shapefiles in R: suck it, Esri!

I have been looking at other people’s awesome R-powered geospatial analysis for what feels like years and, until now, every time I’ve sat down to try and do some spatial analytics in R I’ve been stymied by weird package load errors. I’ve been poking around this problem rather casually for several months and last night I think I finally made some tangible progress. I’m pretty stoked about this so I hope you will be too.

Read More

Can a robot learn economics

I spent a non-trivial amount of time this week trying to pick apart the code in R’s rgp package and Matlab’s GPTIPS to see if I could modify them to do coupled dynamical systems... I have no notable progress to report on this front.

Read More

Initial thoughts on genetic programming

I’ve been casually reading papers on Genetic Programming and Symbolic Regression for a little over a week now so I figured it was high time I stopped thinking about GP and started trying to DO a GP.

Read More

Sentiment analysis 4: Naive Bayes

I know I said last week’s post would be my final words on Twitter Mining/Sentiment Analysis/etc. for a while. I guess I lied. I didn’t feel great about the black box-y application of text classification…so I decided to add a little ‘under the hood’ post on Naive Bayes for text classification/sentiment analysis.

Read More

Model-free quantification of time-series predictability

I read an interesting paper recently. I wrote a half-ass review of it here. Below I provide some notes and interesting nuggets I took from my reading of “Model-free quantification of time-series predictability” by Garland, James, and Bradley, published in Physical Review E, vol. 90:

Read More

A Fun State Space Simulation Exercise

In this earlier post I tried to provide some helpful nuggets regarding the use of R’s KFAS package for modeling monthly seasonal data using a state space framework. I was going to add a little simulation experiment I cooked up just to further my understanding of how KFAS and state space models behave…but that post got a little long so I decided to include the simulation as a separate, companion post.

Read More

Average Treatment Effect Examples

Here, I’ve added a bit more substance to an earlier post on quasi-treatment-control methods for social science research. More specifically, I’ve added some content on Propensity Score Matching to the discussion. Enjoy.

Read More